Commit a7b8b1d3 authored by Jan Siwiec's avatar Jan Siwiec

proofread

parent 7a423d58
Pipeline #33914 passed with warnings
## Introduction

[Slurm][1] workload manager is used to allocate and access Karolina cluster's resources.
This page describes Slurm settings and usage specific to the Karolina cluster.
General information about Slurm usage at IT4Innovations can be found at [Slurm Job Submission and Execution][2].
## Partition Information

Partitions/queues on the system:
```console
$ sinfo -s
PARTITION    AVAIL  TIMELIMIT   NODES(A/I/O/T) NODELIST
...
qfat            up 2-00:00:00          0/1/0/1 sdf1
qviz            up    8:00:00          0/2/0/2 viz[1-2]
```
For more information about Karolina's queues, see [this page][8].

A graphical representation of cluster usage, partitions, nodes, and jobs can be found at [https://extranet.it4i.cz/rsweb/karolina][3].
On the Karolina cluster:

* all CPU queues/partitions provide full node allocation; whole nodes (all node resources) are allocated to a job.
* other queues/partitions (gpu, fat, viz) provide partial node allocation; each job's resources (CPU, memory) are separated and dedicated to that job.
!!! important "Partial node allocation and security"
    Division of nodes means that if two users allocate a portion of the same node, they can see each other's running processes.
    If this solution is inconvenient for you, consider allocating a whole node.
## Using CPU Queues

Access [standard compute nodes][4].
Whole nodes are allocated. Use the `--nodes` option to specify the number of requested nodes.
There is no need to specify the number of cores and memory size.
```console
...
```
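For illustration, a minimal CPU job script might look like the following sketch; the `qcpu` partition name, the job name, and the project ID are placeholders/assumptions, not taken verbatim from this page:

```console
#!/usr/bin/bash
# job name, project ID (placeholder), and assumed CPU partition name
#SBATCH --job-name MyJobName
#SBATCH --account PROJECT-ID
#SBATCH --partition qcpu
# whole nodes are allocated; cores and memory need not be specified
#SBATCH --nodes 4

# hypothetical application launch
srun ./my_application
```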
## Using GPU Queues

Access [GPU accelerated nodes][5].
Every GPU accelerated node is divided into eight parts; each part contains one GPU, 16 CPU cores, and the corresponding memory.
By default, only one part, i.e. 1/8 of the node (one GPU with the corresponding CPU cores and memory), is allocated.
There is no need to specify the number of cores and memory size; on the contrary, it is undesirable.
Some restrictions are employed to ensure fair division and efficient use of node resources.
```console
#!/usr/bin/bash
...
```
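As a sketch, a single-GPU job script might look like this; the `qgpu` partition name and the project ID are placeholders/assumptions:

```console
#!/usr/bin/bash
# project ID (placeholder) and assumed GPU partition name
#SBATCH --job-name MyGpuJob
#SBATCH --account PROJECT-ID
#SBATCH --partition qgpu
# no --gpus option: one part (1/8 of the node, i.e. one GPU) is allocated by default

# hypothetical application launch
srun ./my_gpu_application
```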
To allocate more GPUs, use the `--gpus` option.
The default behavior is to allocate enough nodes to satisfy the requested resources as expressed by the `--gpus` option, without delaying the initiation of the job.
The following code requests four GPUs; the scheduler can allocate from one up to four nodes, depending on the actual cluster state (i.e. GPU availability), to fulfil the request.
```console
#SBATCH --gpus 4
```
The following code requests 16 GPUs; the scheduler can allocate from two up to sixteen nodes, depending on the actual cluster state (i.e. GPU availability), to fulfil the request.
```console
#SBATCH --gpus 16
```
To allocate GPUs within one node, you have to specify the `--nodes` option.
The following code requests four GPUs on exactly one node:
```console
#SBATCH --gpus 4
#SBATCH --nodes 1
```
The following code requests 16 GPUs on exactly two nodes.
```console
#SBATCH --gpus 16
#SBATCH --nodes 2
```
Alternatively, you can use the `--gpus-per-node` option.
Only the value 8 is allowed for multi-node allocation, to prevent fragmenting nodes.

The following code requests 16 GPUs on exactly two nodes.
```console
#SBATCH --gpus-per-node 8
#SBATCH --nodes 2
```
To allocate a whole GPU accelerated node, you can also use the `--exclusive` option:
```console
#SBATCH --exclusive
```
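For interactive work, the same full-node allocation could be requested with `salloc`; the `qgpu` partition name is an assumption:

```console
$ salloc -A PROJECT-ID -p qgpu --exclusive
```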
## Using Fat Queue

Access [data analytics aka fat node][6].
The fat node is divided into 32 parts; each part contains one socket/processor (24 cores) and the corresponding memory.
By default, only one part, i.e. 1/32 of the node (one processor and the corresponding memory), is allocated.

To allocate the requested memory, use the `--mem` option.
The corresponding CPUs will be allocated.
The fat node has about 23 TB of memory available for jobs.
```console
#!/usr/bin/bash
...
```
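For illustration, a fat-node job requesting a specific amount of memory might look like this sketch; the `qfat` partition name, the project ID, and the memory size are placeholders/assumptions:

```console
#!/usr/bin/bash
# project ID (placeholder) and assumed fat-node partition name
#SBATCH --job-name MyFatJob
#SBATCH --account PROJECT-ID
#SBATCH --partition qfat
# request memory; the corresponding CPUs are allocated automatically
#SBATCH --mem 1440G

# hypothetical application launch
srun ./my_memory_hungry_app
```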
You can also specify CPU-oriented options (such as `--cpus-per-task`); the appropriate amount of memory will then be allocated to the job.
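A CPU-oriented request might look like the following minimal sketch; the core count is arbitrary and the `qfat` partition name is an assumption:

```console
# request 48 cores; a corresponding share of the node's memory is allocated
#SBATCH --partition qfat
#SBATCH --cpus-per-task 48
```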
To allocate a whole fat node, use the `--exclusive` option:
```console
#SBATCH --exclusive
```
## Using Viz Queue

Access [visualization nodes][7].
Every visualization node is divided into eight parts.
By default, only one part, i.e. 1/8 of the node, is allocated.
```console
$ salloc -A PROJECT-ID -p qviz
```
To allocate a whole visualization node, use the `--exclusive` option:
```console
$ salloc -A PROJECT-ID -p qviz --exclusive
```