Commit c502be19 authored by Jan Siwiec

Merge branch 'gaussian-gpu-support' into 'master'

Update gaussian.md

See merge request !290
parents 464b4395 e3bffe42
......@@ -16,7 +16,7 @@ not in direct or indirect competition with the Gaussian Inc. company and have a
The license includes GPU support and Linda parallel environment for Gaussian multi-node parallel execution.
!!! note
You need to be a member of the **gaussian group**. Contact support@it4i.cz in order to get included in the gaussian group.
You need to be a member of the **gaussian group**. Contact [support\[at\]it4i.cz][b] in order to get included in the gaussian group.
Check your group membership:
......@@ -27,16 +27,19 @@ uid=1000(user) gid=1000(user) groups=1000(user),1234(open-0-0),7310(gaussian)
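The group list printed by `id` (as in the hunk context above) can also be checked from a script before a job is submitted. The following is a minimal sketch, not part of the original page: `in_group` is a hypothetical helper name, and it assumes the group is literally named `gaussian`, as stated in the note.

```bash
# Sketch: warn early if the current user is not in the gaussian group.
# in_group is a hypothetical helper, not an IT4I-provided command.
in_group() {
    # $1 = group name
    # $2 = space-separated group list (defaults to the output of `id -nG`)
    groups_list=${2:-$(id -nG)}
    case " $groups_list " in
        *" $1 "*) return 0 ;;   # exact, whole-word match
        *)        return 1 ;;
    esac
}

if ! in_group gaussian; then
    echo "Not in the gaussian group; contact support@it4i.cz" >&2
fi
```

Matching on whole words avoids a false positive from a group whose name merely contains `gaussian`.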
## Installed Version
Gaussian is available on Salomon, Barbora, and DGX-2 systems in the latest version Gaussian 16 rev. c0.
Gaussian is available on Anselm, Salomon, Barbora, and DGX-2 systems in the latest version Gaussian 16 rev. c0.
| Module | CPU support | GPU support | Parallelization | Note |
|--------------------------------------|-------------|--------------|-----------------|---------------------|
| Gaussian/16_rev_c0-binary | AVX2 | Yes | SMP | Binary distribution |
| Gaussian/16_rev_c0-binary-Linda | AVX2 | Yes | SMP + Linda | Binary distribution |
| Gaussian/16_rev_c0-CascadeLake | AVX-512 | No | SMP | IT4I compiled |
| Gaussian/16_rev_c0-CascadeLake-Linda | AVX-512 | No | SMP + Linda | IT4I compiled |
| Module | CPU support | GPU support | Parallelization | Note | Anselm | Barbora | Salomon | DGX-2 |
|--------------------------------------|-------------|--------------|-----------------|---------------------|---------|---------|---------|-------|
| Gaussian/16_rev_c0-binary | AVX2 | Yes | SMP | Binary distribution | No | Yes | Yes | Yes |
| Gaussian/16_rev_c0-binary-Linda | AVX2 | Yes | SMP + Linda | Binary distribution | No | Yes | Yes | No |
| Gaussian/16_rev_c0-CascadeLake | AVX-512 | No | SMP | IT4I compiled | No | Yes | No | No |
| Gaussian/16_rev_c0-CascadeLake-Linda | AVX-512 | No | SMP + Linda | IT4I compiled | No | Yes | No | No |
| Gaussian/16_rev_c0-GPU-Linda | AVX-512 | Yes | SMP + Linda | IT4I compiled | No | Yes | No | No |
| Gaussian/16_rev_c0-GPU | AVX-512 | Yes | SMP | IT4I compiled | No | No | No | Yes |
| Gaussian/16_rev_c0-Linda | AVX | No | SMP + Linda | IT4I compiled | Yes | No | No | No |
Speedup may be observed on Barbora supercomputer when using the `CascadeLake` module compared to the `binary` module.
Speedup may be observed on Barbora and DGX-2 systems when using the `CascadeLake` and `GPU` modules compared to the `binary` module.
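The availability columns in the table above can be turned into a small lookup so that job scripts load a module that exists on the target system. This is a sketch under stated assumptions: `pick_gaussian_module` is a hypothetical helper, the lowercase cluster names are illustrative, and where the table lists several modules for a system the choice of the IT4I-compiled one is this example's preference, not a site rule.

```bash
# Hypothetical helper: map <cluster> and <linda: yes|no> to one module name
# taken from the availability table. Preferences among several valid
# modules (e.g. CascadeLake over binary on Barbora) are assumptions.
pick_gaussian_module() {
    case "$1:$2" in
        anselm:yes)  echo "Gaussian/16_rev_c0-Linda" ;;
        barbora:no)  echo "Gaussian/16_rev_c0-CascadeLake" ;;
        barbora:yes) echo "Gaussian/16_rev_c0-CascadeLake-Linda" ;;
        salomon:no)  echo "Gaussian/16_rev_c0-binary" ;;
        salomon:yes) echo "Gaussian/16_rev_c0-binary-Linda" ;;
        dgx-2:no)    echo "Gaussian/16_rev_c0-GPU" ;;
        *)
            # Combination not listed in the table (e.g. Linda on DGX-2).
            echo "no matching module for $1 (linda=$2)" >&2
            return 1
            ;;
    esac
}
```

A job script could then call `ml "$(pick_gaussian_module barbora yes)"`; combinations that the table does not list return a non-zero status instead of guessing.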
## Running
......@@ -44,7 +47,7 @@ Gaussian is compiled for single node parallel execution as well as multi-node pa
GPU support for V100 cards is available on Barbora and DGX-2.
!!! note
By default, the execution is single-core, single node, without GPU acceleration.
By default, the execution is single-core, single-node, and without GPU acceleration.
### Shared-Memory Multiprocessor Parallel Execution (Single Node)
......@@ -71,7 +74,7 @@ $ ml Gaussian/16_rev_c0-binary-Linda
The network parallelization environment is **Linda**.
In the input file Link0 header section, set the CPU cores (24 for Salomon, 36 for Barbora, 48 for DGX-2) and memory amount.
Include the **%UseSSH keyword**, as well. This enables Linda to spawn parallel workers.
Include the `%UseSSH` keyword, as well. This enables Linda to spawn parallel workers.
```bash
%CPU=0-35
......@@ -81,8 +84,8 @@ Include the **%UseSSH keyword**, as well. This enables Linda to spawn parallel w
The number and placement of Linda workers may be controlled by %LindaWorkers keyword or by
GAUSS_WDEF environment variable. When running multi-node job via the PBS batch queue, loading
the Linda-enabled module **automaticaly sets the `GAUSS_WDEF` variable** to the correct node-list, using one worker per node.
In combination with the %CPU keyword, this enables a full scale multi-node execution.
the Linda-enabled module **automatically sets the `GAUSS_WDEF` variable** to the correct node-list, using one worker per node.
In combination with the %CPU keyword, this enables a full-scale multi-node execution.
```bash
$ echo $GAUSS_WDEF
......@@ -100,10 +103,16 @@ Load Linda-enabled binary module
$ ml Gaussian/16_rev_c0-binary-Linda
```
Or GPU-enabled module
```bash
$ ml Gaussian/16_rev_c0-GPU-Linda
```
In the input file Link0 header section, set the CPU cores (24 for Barbora GPU nodes) and memory.
To enable GPU acceleration, set the **%GPUCPU** keyword. This keyword activates GPU accelerators and dedicates CPU cores to drive the GPU accelerators.
On Barbora GPU nodes, we activate GPUs 0-3 and assign cores 0,2,12,14 (two from each CPU socket) to drive the accelerators.
If multi node computation is intended, Include the **%UseSSH keyword**, as well. This enables Linda to spawn parallel workers.
To enable GPU acceleration, set the `%GPUCPU` keyword. This keyword activates GPU accelerators and dedicates CPU cores to drive the GPU accelerators.
On Barbora GPU nodes, we activate GPUs 0-3 and assign cores 0, 2, 12, 14 (two from each CPU socket) to drive the accelerators.
If multi-node computation is intended, include the `%UseSSH` keyword, as well. This enables Linda to spawn parallel workers.
```bash
%CPU=0-23
......@@ -112,14 +121,20 @@ If multi node computation is intended, Include the **%UseSSH keyword**, as well.
%UseSSH
```
GPU accelerated caclulations on the **DGX-2** are supported
with Gaussian binary module.
GPU accelerated calculations on the **DGX-2** are supported
with Gaussian binary module
```bash
$ ml Gaussian/16_rev_c0-binary
```
In the input file Link0 header section, modify the %CPU keyword to 48 cores and the %GPUCPU keyword to 16 GPU accelerators. Omit the Linda.
Or IT4I-compiled module
```bash
$ ml Gaussian/16_rev_c0-GPU
```
In the input file Link0 header section, modify the `%CPU` keyword to 48 cores and the `%GPUCPU` keyword to 16 GPU accelerators. Omit the Linda.
```bash
%CPU=0-47
......@@ -149,8 +164,6 @@ WATER H20 optimization
R 0.96
A 109.471221
```
### Run Gaussian (All Modes)
......@@ -225,4 +238,5 @@ exit
[1]: ../../salomon/storage.md
[a]: https://gaussian.com/gaussian16/
[b]: mailto:support@it4i.cz
[c]: https://gaussian.com/man/
\ No newline at end of file