From a7b8b1d3dd8282dff0bbb0c7feb663529883c2f0 Mon Sep 17 00:00:00 2001
From: Jan Siwiec <jan.siwiec@vsb.cz>
Date: Tue, 5 Sep 2023 11:09:10 +0200
Subject: [PATCH] proofread

---
 docs.it4i/general/karolina-slurm.md | 54 +++++++++++++++--------------
 1 file changed, 28 insertions(+), 26 deletions(-)

diff --git a/docs.it4i/general/karolina-slurm.md b/docs.it4i/general/karolina-slurm.md
index 77f91e3f6..c274cb87e 100644
--- a/docs.it4i/general/karolina-slurm.md
+++ b/docs.it4i/general/karolina-slurm.md
@@ -3,12 +3,12 @@
 ## Introduction
 
 [Slurm][1] workload manager is used to allocate and access Karolina cluster's resources.
-This page describes Karolina cluster specific Slurm settings and use.
-General information about Slurm use at IT4Innovations can be found at [Slurm Job Submission and Execution][2].
+This page describes the Slurm settings and usage specific to the Karolina cluster.
+General information about Slurm usage at IT4Innovations can be found at [Slurm Job Submission and Execution][2].
 
 ## Partition Information
 
-Partitions/queues on system:
+Partitions/queues on the system:
 
 ```console
 $ sinfo -s
@@ -28,24 +28,24 @@ qfat             up 2-00:00:00          0/1/0/1 sdf1
 qviz             up    8:00:00          0/2/0/2 viz[1-2]
 ```
 
-For more information about Karolina's queues see [this page][8].
+For more information about Karolina's queues, see [this page][8].
 
 Graphical representation of cluster usage, partitions, nodes, and jobs could be found
 at [https://extranet.it4i.cz/rsweb/karolina][3]
 
 On Karolina cluster
 
-* all cpu queues/partitions provide full node allocation, whole nodes (all node resources) are allocated to a job.
+* all CPU queues/partitions provide full node allocation; whole nodes (all node resources) are allocated to a job.
 * other queues/partitions (gpu, fat, viz) provide partial node allocation. Jobs' resources (cpu, mem) are separated and dedicated for job.
 
-!!! important "Partial node allocation and Security"
+!!! important "Partial node allocation and security"
     Division of nodes means that if two users allocate a portion of the same node, they can see each other's running processes.
     If this solution is inconvenient for you, consider allocating a whole node.
 
 ## Using CPU Queues
 
 Access [standard compute nodes][4].
-Whole nodes are allocated. Use `--nodes` option to specify number of requested nodes.
+Whole nodes are allocated. Use the `--nodes` option to specify the number of requested nodes.
 There is no need to specify the number of cores and memory size.
 
 ```console
@@ -62,9 +62,9 @@ There is no need to specify the number of cores and memory size.
 
 Access [GPU accelerated nodes][5].
 Every GPU accelerated node is divided into eight parts, each part contains one GPU, 16 CPU cores and corresponding memory.
-By default only one part i.e. 1/8 of the node - one GPU and corresponding CPU cores and memory is allocated.
+By default, only one part, i.e. 1/8 of the node (one GPU and corresponding CPU cores and memory), is allocated.
 There is no need to specify the number of cores and memory size, on the contrary, it is undesirable.
-There are emloyed some restrictions which aim to provide fair division and efficient use of node resources.
+Some restrictions are employed to provide fair division and efficient use of node resources.
 
 ```console
 #!/usr/bin/bash
@@ -78,44 +78,45 @@ There are emloyed some restrictions which aim to provide fair division and effic
 
 To allocate more GPUs use `--gpus` option.
 The default behavior is to allocate enough nodes to satisfy the requested resources as expressed by `--gpus` option and without delaying the initiation of the job.
 
-Following code requests four gpus, scheduler can allocate one up to four nodes depending on actual cluster state (i.e. GPU availability) to fulfil the request.
+The following code requests four GPUs; the scheduler can allocate from one to four nodes depending on the actual cluster state (i.e. GPU availability) to fulfil the request.
 
 ```console
 #SBATCH --gpus 4
 ```
 
-Following code requests 16 gpus, scheduler can allocate two up to sixteen nodes depending on actual cluster state (i.e. GPU availability) to fulfil the request.
+The following code requests 16 GPUs; the scheduler can allocate from two to sixteen nodes depending on the actual cluster state (i.e. GPU availability) to fulfil the request.
 
 ```console
 #SBATCH --gpus 16
 ```
 
-To allocate GPUs within one node you have to specify `--nodes` option.
+To allocate GPUs within one node, you have to specify the `--nodes` option.
 
-Following code requests four gpus on exactly one node.
+The following code requests four GPUs on exactly one node.
 
 ```console
 #SBATCH --gpus 4
 #SBATCH --nodes 1
 ```
 
-Following code requests 16 gpus on exactly two nodes.
+The following code requests 16 GPUs on exactly two nodes.
 
 ```console
 #SBATCH --gpus 16
 #SBATCH --nodes 2
 ```
 
-Alternatively you can use `--gpus-per-node` option. Only value 8 is allowed for multi-node allocation to prevent fragmenting nodes.
+Alternatively, you can use the `--gpus-per-node` option.
+Only the value 8 is allowed for multi-node allocation to prevent fragmenting nodes.
 
-Following code requests 16 gpus on exactly two nodes.
+The following code requests 16 GPUs on exactly two nodes.
 
 ```console
 #SBATCH --gpus-per-node 8
 #SBATCH --nodes 2
 ```
 
-To allocate whole GPU accelerated node you can also use `--exclusive` option
+To allocate a whole GPU accelerated node, you can also use the `--exclusive` option
 
 ```console
 #SBATCH --exclusive
@@ -125,10 +126,11 @@ To allocate whole GPU accelerated node you can also use `--exclusive` option
 
 Access [data analytics aka fat node][6].
 Fat node is divided into 32 parts, each part contains one socket/processor (24 cores) and corresponding memory.
-By default only one part i.e. 1/32 of the node - one processor and corresponding memory is allocated.
+By default, only one part, i.e. 1/32 of the node (one processor and corresponding memory), is allocated.
 
-To allocate requested memory use `--mem` option.
-Corresponding CPUs wil be allocated. Fat node has about 23TB of memory available for jobs.
+To allocate the requested memory, use the `--mem` option.
+Corresponding CPUs will be allocated.
+The fat node has about 23 TB of memory available for jobs.
 
 ```console
 #!/usr/bin/bash
@@ -140,9 +142,9 @@ Corresponding CPUs wil be allocated. Fat node has about 23TB of memory available
 ...
 ```
 
-You can also specify CPU-oriented options (like `--cpus-per-task`), then appropriate memory will be allocated to job.
+You can also specify CPU-oriented options (like `--cpus-per-task`); appropriate memory will then be allocated to the job.
 
-To allocate whole fat node use `--exclusive` option
+To allocate a whole fat node, use the `--exclusive` option
 
 ```console
 #SBATCH --exclusive
@@ -150,15 +152,15 @@ To allocate whole fat node use `--exclusive` option
 
 ## Using Viz Queue
 
-Access [visualisation nodes][7].
-Every visualisation node is divided into eight parts.
-By default only one part i.e. 1/8 of the node is allocated.
+Access [visualization nodes][7].
+Every visualization node is divided into eight parts.
+By default, only one part, i.e. 1/8 of the node, is allocated.
 
 ```console
 $ salloc -A PROJECT-ID -p qviz
 ```
 
-To allocate whole visualisation node use `--exclusive` option
+To allocate a whole visualization node, use the `--exclusive` option
 
 ```console
 $ salloc -A PROJECT-ID -p qviz --exclusive
-- 
GitLab
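
For illustration only, and not part of the patch above: a complete batch script for the GPU queue, combining the options described on the patched page, might look like the sketch below. The job name is a hypothetical placeholder, and `PROJECT-ID` is the same placeholder used in the page's existing examples.

```console
#!/usr/bin/bash
#SBATCH --job-name MyGpuJobName
#SBATCH --account PROJECT-ID
#SBATCH --partition qgpu
# 16 GPUs on exactly two nodes, i.e. two whole GPU accelerated nodes
#SBATCH --gpus 16
#SBATCH --nodes 2
#SBATCH --time 12:00:00

...
```

Requesting `--gpus 16` together with `--nodes 2` keeps the job on two whole nodes; with `--gpus 16` alone, the scheduler may spread the GPUs over up to sixteen nodes, depending on availability.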