diff --git a/docs.it4i/software/mpi/running_openmpi.md b/docs.it4i/software/mpi/running_openmpi.md
index 59b38e159afa07dd1de91fd9ab936933cc4c1da2..e2733f05c6f353d87e7bd2e7e61b4e13275f96a1 100644
--- a/docs.it4i/software/mpi/running_openmpi.md
+++ b/docs.it4i/software/mpi/running_openmpi.md
@@ -7,7 +7,7 @@ The OpenMPI programs may be executed only via the PBS Workload manager, by enter
 ### Basic Usage
 
 !!! note
-    Use the mpiexec to run the OpenMPI code.
+    Use `mpiexec` to run the OpenMPI code.
 
 Example (for Anselm):
 
@@ -26,10 +26,9 @@ $ mpiexec -pernode ./helloworld_mpi.x
 ```
 
 !!! note
-    In this example, the directive **-pernode** is used to run only **one task per node**, which is normally an unwanted behavior (unless you want to run hybrid code with just one MPI and 16 OpenMPI tasks per node). In normal MPI programs, **omit the -pernode directive** to run up to 16 MPI tasks per each node.
+    In this example, the directive `-pernode` is used to run only **one task per node**, which is normally an unwanted behavior (unless you want to run hybrid code with just one MPI process and 16 OpenMP threads per node). In normal MPI programs, **omit the `-pernode` directive** to run up to 16 MPI tasks per node.
 
-In this example, we allocate 4 nodes via the express queue interactively. We set up the OpenMPI environment and interactively run the helloworld_mpi.x program. Note that the executable helloworld_mpi.x must be available within the
-same path on all nodes. This is automatically fulfilled on the /home and /scratch filesystem.
+In this example, we allocate 4 nodes via the express queue interactively. We set up the OpenMPI environment and interactively run the helloworld_mpi.x program. Note that the executable helloworld_mpi.x must be available within the same path on all nodes. This is automatically fulfilled on the /home and /scratch filesystems.
 
 You need to preload the executable, if running on the local scratch /lscratch filesystem:
 
@@ -43,16 +42,16 @@ $ mpiexec -pernode --preload-binary ./helloworld_mpi.x
     Hello world! from rank 3 of 4 on host cn110
 ```
 
-In this example, we assume the executable helloworld_mpi.x is present on compute node cn17 on local scratch. We call the mpiexec with the **--preload-binary** argument (valid for OpenMPI). The mpiexec will copy the executable from cn17 to the /lscratch/15210.srv11 directory on cn108, cn109, and cn110 and execute the program.
+In this example, we assume the executable helloworld_mpi.x is present on compute node cn17 on local scratch. We call `mpiexec` with the `--preload-binary` argument (valid for OpenMPI). `mpiexec` will copy the executable from cn17 to the /lscratch/15210.srv11 directory on cn108, cn109, and cn110 and execute the program.
 
 !!! note
     MPI process mapping may be controlled by PBS parameters.
 
-The mpiprocs and ompthreads parameters allow for selection of number of running MPI processes per node as well as number of OpenMP threads per MPI process.
+The `mpiprocs` and `ompthreads` parameters select the number of MPI processes per node and the number of OpenMP threads per MPI process.
 
 ### One MPI Process Per Node
 
-Follow this example to run one MPI process per node, 16 threads per process (**on Salomon, try 24 threads in following examples**). 
+Follow this example to run one MPI process per node, 16 threads per process (**on Salomon, try 24 threads in the following examples**):
 
 ```console
 $ qsub -q qexp -l select=4:ncpus=16:mpiprocs=1:ompthreads=16 -I
@@ -60,7 +59,7 @@ $ ml OpenMPI
 $ mpiexec --bind-to-none ./helloworld_mpi.x
 ```
 
-In this example, we demonstrate tthe recommended way to run an MPI application, using 1 MPI processes per node and 16 threads per socket, on 4 nodes.
+In this example, we demonstrate the recommended way to run an MPI application, using one MPI process per node and 16 threads per socket, on 4 nodes.
 
 ### Two MPI Processes Per Node
 
@@ -112,9 +111,9 @@ $ export OMP_PLACES=cores
 
 ## OpenMPI Process Mapping and Binding
 
-The mpiexec allows for precise selection of how the MPI processes will be mapped to the computational nodes and how these processes will bind to particular processor sockets and cores.
+`mpiexec` allows for precise selection of how the MPI processes will be mapped to the compute nodes and how these processes will bind to particular processor sockets and cores.
 
-MPI process mapping may be specified by a hostfile or rankfile input to the mpiexec program. Altough all implementations of MPI provide means for process mapping and binding, the following examples are valid for the OpenMPI only.
+MPI process mapping may be specified by a hostfile or rankfile input to the `mpiexec` program. Although all MPI implementations provide means for process mapping and binding, the following examples are valid for OpenMPI only.
 
 ### Hostfile
 
@@ -179,9 +178,9 @@ $ mpiexec -n 5 -rf rankfile --report-bindings ./helloworld_mpi.x
     Hello world! from rank 2 of 5 on host cn108
 ```
 
-In this example, we run 5 MPI processes (5 ranks) on four nodes. The rankfile defines how the processes will be mapped on the nodes, sockets and cores. The **--report-bindings** option was used to print out the actual process location and bindings. Note that ranks 1 and 4 run on the same node and their core binding overlaps.
+In this example, we run 5 MPI processes (5 ranks) on four nodes. The rankfile defines how the processes will be mapped to the nodes, sockets, and cores. The `--report-bindings` option was used to print out the actual process locations and bindings. Note that ranks 1 and 4 run on the same node and their core bindings overlap.
 
-It is the user's responsibility to provide the correct number of ranks, sockets, and cores.
+The user must provide the correct number of ranks, sockets, and cores.
 
 ### Bindings Verification
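The documentation in this patch refers throughout to `helloworld_mpi.x` but the program's source is not part of the changed file. As a point of reference, a minimal MPI hello world that produces output in the format shown in the hunks above (`Hello world! from rank N of M on host cnXXX`) might look like the following sketch; the source is illustrative, not taken from the repository.

```c
/* Illustrative sketch of an MPI "hello world" similar to the helloworld_mpi.x
 * referenced in the documentation above. Not taken from the repository. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, name_len;
    char hostname[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                      /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);        /* rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);        /* total number of ranks */
    MPI_Get_processor_name(hostname, &name_len); /* node this rank runs on */

    printf("Hello world! from rank %d of %d on host %s\n", rank, size, hostname);

    MPI_Finalize();
    return 0;
}
```

Such a program would typically be built with the MPI wrapper compiler after loading the OpenMPI module shown above, e.g. `mpicc helloworld_mpi.c -o helloworld_mpi.x`.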
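The hunks above also touch the mapping and binding material (`OMP_PLACES`, `--report-bindings`, the Bindings Verification section). As a complement to `--report-bindings`, a hybrid MPI + OpenMP code can report its own placement from the inside. The sketch below is an assumption-laden illustration, not part of the documented workflow: it relies on glibc's `sched_getcpu()` and an OpenMP-capable compiler (e.g. `mpicc -fopenmp`).

```c
/* Illustrative sketch: each OpenMP thread of each MPI rank reports the core
 * it is currently scheduled on, which can be compared against the bindings
 * printed by mpiexec --report-bindings. Assumes glibc (sched_getcpu). */
#define _GNU_SOURCE
#include <mpi.h>
#include <omp.h>
#include <sched.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, name_len, provided;
    char hostname[MPI_MAX_PROCESSOR_NAME];

    /* Hybrid codes should request a threading level; threads here only print. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(hostname, &name_len);

    #pragma omp parallel
    {
        printf("rank %d thread %d of %d on host %s, core %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads(),
               hostname, sched_getcpu());
    }

    MPI_Finalize();
    return 0;
}
```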