Commit fd653427 authored by Jan Siwiec

Update job-submission-and-execution.md inifiband > infiniband

parent 086f0c30
@@ -183,7 +183,7 @@ In this example, we allocate 4 nodes, 16 cores per node, selecting only the node
### Anselm - Placement by IB Switch
-Groups of computational nodes are connected to chassis integrated Infiniband switches. These switches form the leaf switch layer of the [Infiniband network][3] fat tree topology. Nodes sharing the leaf switch can communicate most efficiently. Sharing the same switch prevents hops in the network and facilitates unbiased, highly efficient network communication.
+Groups of computational nodes are connected to chassis integrated InfiniBand switches. These switches form the leaf switch layer of the [Infiniband network][3] fat tree topology. Nodes sharing the leaf switch can communicate most efficiently. Sharing the same switch prevents hops in the network and facilitates unbiased, highly efficient network communication.
Nodes sharing the same switch may be selected via the PBS resource attribute `ibswitch`. Values of this attribute are `iswXX`, where `XX` is the switch number. The node-switch mapping can be seen in the [Hardware Overview][4] section.
@@ -197,18 +197,18 @@ In this example, we request all of the 18 nodes sharing the isw11 switch for 24
### Salomon - Placement by Network Location
-The network location of allocated nodes in the [InifiBand network][3] influences efficiency of network communication between nodes of job. Nodes on the same InifiBand switch communicate faster with lower latency than distant nodes. To improve communication efficiency of jobs, PBS scheduler on Salomon is configured to allocate nodes (from currently available resources), which are as close as possible in the network topology.
+The network location of allocated nodes in the [InfiniBand network][3] influences efficiency of network communication between nodes of job. Nodes on the same InfiniBand switch communicate faster with lower latency than distant nodes. To improve communication efficiency of jobs, PBS scheduler on Salomon is configured to allocate nodes (from currently available resources), which are as close as possible in the network topology.
-For communication intensive jobs, it is possible to set stricter requirement - to require nodes directly connected to the same InifiBand switch or to require nodes located in the same dimension group of the InifiBand network.
+For communication intensive jobs, it is possible to set stricter requirement - to require nodes directly connected to the same InfiniBand switch or to require nodes located in the same dimension group of the InfiniBand network.
-### Salomon - Placement by InifiBand Switch
+### Salomon - Placement by InfiniBand Switch
-Nodes directly connected to the same InifiBand switch can communicate most efficiently. Using the same switch prevents hops in the network and provides for unbiased, most efficient network communication. There are 9 nodes directly connected to every InifiBand switch.
+Nodes directly connected to the same InfiniBand switch can communicate most efficiently. Using the same switch prevents hops in the network and provides for unbiased, most efficient network communication. There are 9 nodes directly connected to every InfiniBand switch.
!!! note
    We recommend allocating compute nodes of a single switch when the best possible computational network performance is required to run job efficiently.
-Nodes directly connected to the one InifiBand switch can be allocated using node grouping on the PBS resource attribute `switch`.
+Nodes directly connected to the one InfiniBand switch can be allocated using node grouping on the PBS resource attribute `switch`.
In this example, we request all 9 nodes directly connected to the same switch using node grouping placement.
@@ -216,12 +216,12 @@ In this example, we request all 9 nodes directly connected to the same switch us
$ qsub -A OPEN-0-0 -q qprod -l select=9:ncpus=24 -l place=group=switch ./myjob
```
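Because the text above states that 9 nodes hang off every switch, a `place=group=switch` request for more chunks than that presumably cannot be placed on one switch. The following is a minimal sketch of a guard around the command shown above, not part of the documented workflow. The function name `switch_group_cmd` and the `NODES_PER_SWITCH` variable are hypothetical; the project ID, queue, and `ncpus` value are copied from the example.

```shell
#!/bin/sh
# Sketch only: print (not submit) a qsub command that groups nodes by switch.
# NODES_PER_SWITCH=9 is taken from the text; the names here are hypothetical.
NODES_PER_SWITCH=9

switch_group_cmd() {
    nodes=$1
    # Reject empty or non-numeric input.
    case $nodes in
        ''|*[!0-9]*)
            echo "node count must be a positive integer" >&2
            return 2
            ;;
    esac
    # A single switch serves at most NODES_PER_SWITCH nodes.
    if [ "$nodes" -lt 1 ] || [ "$nodes" -gt "$NODES_PER_SWITCH" ]; then
        echo "a single switch serves at most $NODES_PER_SWITCH nodes" >&2
        return 1
    fi
    # Echo the command so it can be reviewed before it is run.
    echo "qsub -A OPEN-0-0 -q qprod -l select=${nodes}:ncpus=24 -l place=group=switch ./myjob"
}

switch_group_cmd 9
```

Printing the command instead of running it keeps the sketch safe to try on a machine without PBS installed.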
-### Salomon - Placement by Specific InifiBand Switch
+### Salomon - Placement by Specific InfiniBand Switch
!!! note
    Not useful for ordinary computing, suitable for testing and management tasks.
-Nodes directly connected to the specific InifiBand switch can be selected using the PBS resource attribute `switch`.
+Nodes directly connected to the specific InfiniBand switch can be selected using the PBS resource attribute `switch`.
In this example, we request all 9 nodes directly connected to the r4i1s0sw1 switch.
@@ -229,7 +229,7 @@ In this example, we request all 9 nodes directly connected to the r4i1s0sw1 swit
$ qsub -A OPEN-0-0 -q qprod -l select=9:ncpus=24:switch=r4i1s0sw1 ./myjob
```
-List of all InifiBand switches:
+List of all InfiniBand switches:
```console
$ qmgr -c 'print node @a' | grep switch | awk '{print $6}' | sort -u
@@ -241,7 +241,7 @@ r1i2s0sw0
...
```
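The same `qmgr` output can also be tallied per switch, for example to see how many nodes each switch reports. This is a rough sketch under one assumption: each matching line has the layout `set node <node> resources_available.switch = <switch>`, which is inferred from the awk field numbers used in these commands (`$3` for the node, `$6` for the switch). The function name `count_nodes_per_switch`, the node names, and the switch `r1i0s0sw0` in the sample are hypothetical.

```shell
#!/bin/sh
# Sketch only: count nodes per switch from qmgr-style lines on stdin.
# Assumed line layout: set node <node> resources_available.switch = <switch>
count_nodes_per_switch() {
    awk '/switch = / { n[$6]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Hypothetical sample input; on the cluster one would instead pipe in
#   qmgr -c 'print node @a'
count_nodes_per_switch <<'EOF'
set node r1i0n0 resources_available.switch = r1i0s0sw0
set node r1i0n1 resources_available.switch = r1i0s0sw0
set node r1i2n0 resources_available.switch = r1i2s0sw0
EOF
```

Each output line is a switch name followed by the number of nodes that reported it.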
-List of all nodes directly connected to the specific InifiBand switch:
+List of all nodes directly connected to the specific InfiniBand switch:
```console
$ qmgr -c 'p n @d' | grep 'switch = r36sw3' | awk '{print $3}' | sort
......