In this example, the directive `-pernode` is used to run only **one task per node**, which is normally an unwanted behavior (unless you want to run hybrid code with just one MPI and 16 OpenMP tasks per node). In normal MPI programs, **omit the `-pernode` directive** to run up to 16 MPI tasks per node.
In this example, we allocate 4 nodes via the express queue interactively. We set up the OpenMPI environment and interactively run the helloworld_mpi.x program. Note that the executable helloworld_mpi.x must be available within the same path on all nodes. This is automatically fulfilled on the /home and /scratch filesystems.
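A minimal sketch of such a session follows; the queue name, module name, and core count are assumptions for illustration and may differ on your cluster:

```console
$ qsub -q qexp -l select=4:ncpus=16 -I    # interactive job on 4 nodes (assumed express queue and core count)
$ module load OpenMPI                     # set up the OpenMPI environment (exact module name may differ)
$ mpiexec -pernode ./helloworld_mpi.x     # one MPI task per node; drop -pernode to get up to 16 tasks per node
```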
You need to preload the executable if running on the local scratch /lscratch filesystem:
In this example, we assume the executable helloworld_mpi.x is present on compute node cn17 on local scratch. We call `mpiexec` with the `--preload-binary` argument (valid for OpenMPI), which copies the executable from cn17 to the /lscratch/15210.srv11 directory on cn108, cn109, and cn110 and then executes the program.
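The invocation might look roughly like this; it is only a sketch, reusing the node names and scratch path from the description above:

```console
$ pwd
/lscratch/15210.srv11
$ mpiexec --preload-binary ./helloworld_mpi.x   # OpenMPI copies the binary to the remote nodes before launching
```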
!!! note
    MPI process mapping may be controlled by PBS parameters.
The `mpiprocs` and `ompthreads` parameters allow for selection of the number of running MPI processes per node as well as the number of OpenMP threads per MPI process.
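For example, a PBS select statement along the following lines requests 2 MPI processes and 8 OpenMP threads per 16-core node; the queue name and resource numbers are illustrative only:

```console
$ qsub -q qprod -l select=4:ncpus=16:mpiprocs=2:ompthreads=8 -I   # 2 ranks per node, 8 threads per rank (illustrative)
```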
### One MPI Process Per Node
Follow this example to run one MPI process per node, 16 threads per process (**on Salomon, try 24 threads in following examples**):
In this example, we demonstrate the recommended way to run an MPI application, using one MPI process per node and 16 threads per process, on 4 nodes.
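A minimal sketch of this setup, assuming the qexp queue, 16-core nodes, and a placeholder hybrid program name (adjust the thread count for Salomon's 24-core nodes):

```console
$ qsub -q qexp -l select=4:ncpus=16:mpiprocs=1:ompthreads=16 -I   # 1 rank per node, 16 threads per rank
$ export OMP_NUM_THREADS=16                                       # one OpenMP thread per core
$ mpiexec ./hello_mpi_omp.x                                       # 4 ranks total, one per node (program name assumed)
```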
### Two MPI Processes Per Node
...
## OpenMPI Process Mapping and Binding
The `mpiexec` command allows for precise selection of how the MPI processes will be mapped to the computational nodes and how these processes will bind to particular processor sockets and cores.
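As a rough illustration (option spellings vary between OpenMPI releases; recent versions accept `--map-by` and `--bind-to`), mapping and binding can be requested directly on the command line:

```console
$ mpiexec --map-by socket --bind-to core --report-bindings ./helloworld_mpi.x
```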
MPI process mapping may be specified by a hostfile or rankfile input to the `mpiexec` program. Although all implementations of MPI provide means for process mapping and binding, the following examples are valid for OpenMPI only.
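A hostfile is a plain list of node names, optionally with slot counts. The sketch below reuses the node names mentioned earlier on this page; the slot counts are assumptions:

```console
$ cat hostfile
cn108 slots=16
cn109 slots=16
cn110 slots=16
$ mpiexec -hostfile hostfile ./helloworld_mpi.x   # processes are mapped according to the hostfile
```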
In this example, we run 5 MPI processes (5 ranks) on four nodes. The rankfile defines how the processes will be mapped to the nodes, sockets, and cores. The `--report-bindings` option was used to print out the actual process location and bindings. Note that ranks 1 and 4 run on the same node and their core binding overlaps.
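A rankfile along these lines might look as follows; the node names and core numbers are illustrative, chosen so that ranks 1 and 4 share cn109 with overlapping cores:

```console
$ cat rankfile
rank 0=cn110 slot=1:0-2
rank 1=cn109 slot=0:0,1
rank 2=cn108 slot=1:1-2
rank 3=cn17  slot=0:1,1:0-2
rank 4=cn109 slot=0:*,1:*
$ mpiexec -n 5 -rf rankfile --report-bindings ./helloworld_mpi.x
```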
The user must provide the correct number of ranks, sockets, and cores.