diff --git a/docs.it4i/anselm-cluster-documentation/software/mpi/Running_OpenMPI.md b/docs.it4i/anselm-cluster-documentation/software/mpi/Running_OpenMPI.md
index e2376523a96ba461422fe63335aa3bf913679bd4..2560280b66329edb7966882797e90982f915d14f 100644
--- a/docs.it4i/anselm-cluster-documentation/software/mpi/Running_OpenMPI.md
+++ b/docs.it4i/anselm-cluster-documentation/software/mpi/Running_OpenMPI.md
@@ -6,7 +6,7 @@ The OpenMPI programs may be executed only via the PBS Workload manager, by enter
 
 ### Basic usage
 
-!!! Note "Note"
+!!! Note
     Use the mpiexec to run the OpenMPI code.
 
 Example:
@@ -27,7 +27,7 @@ Example:
 Hello world! from rank 3 of 4 on host cn110
 ```
 
-!!! Note "Note"
+!!! Note
     Please be aware, that in this example, the directive **-pernode** is used to run only **one task per node**, which is normally an unwanted behaviour (unless you want to run hybrid code with just one MPI and 16 OpenMP tasks per node). In normal MPI programs **omit the -pernode directive** to run up to 16 MPI tasks per each node.
 
 In this example, we allocate 4 nodes via the express queue interactively. We set up the openmpi environment and interactively run the helloworld_mpi.x program. Note that the executable helloworld_mpi.x must be available within the
@@ -48,7 +48,7 @@ You need to preload the executable, if running on the local scratch /lscratch fi
 
 In this example, we assume the executable helloworld_mpi.x is present on compute node cn17 on local scratch. We call the mpiexec whith the **--preload-binary** argument (valid for openmpi). The mpiexec will copy the executable from cn17 to the /lscratch/15210.srv11 directory on cn108, cn109 and cn110 and execute the program.
 
-!!! Note "Note"
+!!! Note
     MPI process mapping may be controlled by PBS parameters.
 
 The mpiprocs and ompthreads parameters allow for selection of number of running MPI processes per node as well as number of OpenMP threads per MPI process.
@@ -97,7 +97,7 @@ In this example, we demonstrate recommended way to run an MPI application, using
 
 ### OpenMP thread affinity
 
-!!! Note "Note"
+!!! Note
     Important! Bind every OpenMP thread to a core!
 
 In the previous two examples with one or two MPI processes per node, the operating system might still migrate OpenMP threads between cores. You might want to avoid this by setting these environment variable for GCC OpenMP:
@@ -108,16 +108,16 @@ In the previous two examples with one or two MPI processes per node, the operati
 
 or this one for Intel OpenMP:
 
-````bash
- $ export KMP_AFFINITY=granularity=fine,compact,1,0
- ``
+```bash
+$ export KMP_AFFINITY=granularity=fine,compact,1,0
+```
 
- As of OpenMP 4.0 (supported by GCC 4.9 and later and Intel 14.0 and later) the following variables may be used for Intel or GCC:
+As of OpenMP 4.0 (supported by GCC 4.9 and later and Intel 14.0 and later) the following variables may be used for Intel or GCC:
 
- ```bash
- $ export OMP_PROC_BIND=true
- $ export OMP_PLACES=cores
-````
+```bash
+$ export OMP_PROC_BIND=true
+$ export OMP_PLACES=cores
+```
 
 ## OpenMPI Process Mapping and Binding
 
@@ -152,7 +152,7 @@ In this example, we see that ranks have been mapped on nodes according to the or
 
 Exact control of MPI process placement and resource binding is provided by specifying a rankfile
 
-!!! Note "Note"
+!!! Note
     Appropriate binding may boost performance of your application.
 
 Example rankfile
diff --git a/docs.it4i/anselm-cluster-documentation/software/mpi/mpi4py-mpi-for-python.md b/docs.it4i/anselm-cluster-documentation/software/mpi/mpi4py-mpi-for-python.md
index 3738f2ec2d02bb0afd9e221be02c91fdbf24ac5b..c9237a8346d90b4e98f59ea4d9a07d473a250e3e 100644
--- a/docs.it4i/anselm-cluster-documentation/software/mpi/mpi4py-mpi-for-python.md
+++ b/docs.it4i/anselm-cluster-documentation/software/mpi/mpi4py-mpi-for-python.md
@@ -51,7 +51,7 @@ For example
 comm.Barrier() # wait for everybody to synchronize
 ```
 
-\###Collective Communication with NumPy arrays
+### Collective Communication with NumPy arrays
 
 ```cpp
 from mpi4py import MPI