diff --git a/docs.it4i/general/slurm-job-submission-and-execution.md b/docs.it4i/general/slurm-job-submission-and-execution.md
index a05967e5718604709cd3110ae51146e11f48467d..37c1afa262dc9c1b5dae5c6505ad7a8bef2957bc 100644
--- a/docs.it4i/general/slurm-job-submission-and-execution.md
+++ b/docs.it4i/general/slurm-job-submission-and-execution.md
@@ -2,9 +2,12 @@
 
 ## Introduction
 
-[Slurm][1] workload manager is used to allocate and access Barbora cluster and Complementary systems resources. Karolina cluster coming soon...
+The [Slurm][1] workload manager is used to allocate and access Barbora's and Complementary systems' resources.
+Slurm on Karolina will be implemented later in 2023.
 
-A `man` page exists for all Slurm commands, as well as `--help` command option, which provides a brief summary of options. Slurm [documentation][c] and [man pages][d] are also available online.
+A `man` page exists for all Slurm commands, as well as the `--help` command option,
+which provides a brief summary of options.
+Slurm [documentation][c] and [man pages][d] are also available online.
 
 ## Getting Partition Information
 
@@ -29,29 +32,32 @@
 qdgx         up 2-00:00:00      0/1/0/1 cn202
 qviz         up    8:00:00      0/2/0/2 vizserv[1-2]
 ```
 
-`NODES(A/I/O/T)` column sumarizes node count per state, where the `A/I/O/T` stands for `allocated/idle/other/total`.
+The `NODES(A/I/O/T)` column summarizes node counts per state, where `A/I/O/T` stands for `allocated/idle/other/total`.
 Example output is from Barbora cluster.
 
 Graphical representation of clusters' usage, partitions, nodes, and jobs could be found
 
-* for Barbora cluster at [https://extranet.it4i.cz/rsweb/barbora][4]
+* for Barbora at [https://extranet.it4i.cz/rsweb/barbora][4]
 * for Complementary Systems at [https://extranet.it4i.cz/rsweb/compsys][6]
 
-On Barbora cluster all queues/partitions provide full node allocation, whole nodes are allocated to job.
+On Barbora, all queues/partitions provide full node allocation, i.e. whole nodes are allocated to the job.
 
-On Complementary systems only some queues/partitions provide full node allocation, see [Complementary systems documentation][2] for details.
+On Complementary systems, only some queues/partitions provide full node allocation,
+see [Complementary systems documentation][2] for details.
 
 ## Running Interactive Jobs
 
-Sometimes you may want to run your job interactively, for example for debugging, running your commands one by one from the command line.
+Sometimes you may want to run your job interactively, for example for debugging,
+running your commands one by one from the command line.
 
-Run interactive job - queue qcpu_exp, one node by default, one task by default:
+Run an interactive job - queue `qcpu_exp`, one node by default, one task by default:
 
 ```console
 $ salloc -A PROJECT-ID -p qcpu_exp
 ```
 
-Run interactive job on four nodes, 36 tasks per node (Barbora cluster, cpu partition recommended value based on node core count), two hours time limit:
+Run an interactive job on four nodes, 36 tasks per node (the recommended value for the Barbora CPU partition, based on the node core count),
+with a two-hour time limit:
 
 ```console
 $ salloc -A PROJECT-ID -p qcpu -N 4 --ntasks-per-node 36 -t 2:00:00
@@ -103,7 +109,8 @@
 * load appropriate module
 * run command, srun serves as Slurm's native way of executing MPI-enabled applications, hostname is used in the example just for sake of simplicity
 
-Submit directory will be used as working directory for submitted job, so there is no need to change directory in the job script.
+The submit directory will be used as the working directory for the submitted job,
+so there is no need to change directory in the job script.
 Alternatively you can specify job working directory using sbatch `--chdir` (or shortly `-D`) option.
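+
+For example, to submit `script.sh` with an explicitly set working directory
+(the path below is only illustrative; use your own project directory):
+
+```console
+$ sbatch --chdir /scratch/project/PROJECT-ID/my_work_dir script.sh
+```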
 
 ### Job Submit
 
@@ -115,9 +122,10 @@
 $ cd my_work_dir
 $ sbatch script.sh
 ```
 
-Path to script.sh (relative or absolute) should be given if job script is in different location than job working directory.
+A path to `script.sh` (relative or absolute) should be given
+if the job script is in a different location than the job working directory.
 
-By default, job output is stored in a file called slurm-JOBID.out and contains both job standard output and error output.
+By default, job output is stored in a file called `slurm-JOBID.out` and contains both job standard output and error output.
 This can be changed using sbatch options `--output` (shortly `-o`) and `--error` (shortly `-e`).
 
 Example output of the job:
@@ -131,7 +139,8 @@
 
 ### Job Environment Variables
 
-Slurm provides useful information to the job via environment variables. Environment variables are available on all nodes allocated to job when accessed via Slurm supported means (srun, compatible mpirun).
+Slurm provides useful information to the job via environment variables.
+Environment variables are available on all nodes allocated to the job when accessed via Slurm-supported means (srun, compatible mpirun).
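+
+For example, to get a quick overview of the Slurm variables from inside a job session
+(a simple illustration; the exact set of variables depends on your allocation):
+
+```console
+$ srun env | grep ^SLURM_
+```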
 
 See all Slurm variables
 
@@ -186,7 +195,7 @@
 $ squeue --me
      104      qcpu interact     user  R       1:48      2 cn[101-102]
 ```
 
-Show job details for specific job:
+Show job details for a specific job:
 
 ```console
 $ scontrol show job JOBID
 ```
@@ -198,7 +207,7 @@
 Show job details for executing job from job session:
 
 ```console
 $ scontrol show job $SLURM_JOBID
 ```
 
-Show my jobs using long output format which includes time limit:
+Show my jobs using a long output format which includes time limit:
 
 ```console
 $ squeue --me -l
@@ -216,7 +225,7 @@
 Show my jobs in pending state:
 
 ```console
 $ squeue --me -t pending
 ```
 
-Show jobs for given project:
+Show jobs for a given project:
 
 ```console
 $ squeue -A PROJECT-ID
@@ -226,20 +235,20 @@
 
 The most common job states are (in alphabetical order):
 
-| Code | Job State     | Explanation                                                                                                                                                      |
-| :--: | :------------ | :--------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| CA   | CANCELLED     | Job was explicitly cancelled by the user or system administrator. The job may or may not have been initiated.                                                     |
-| CD   | COMPLETED     | Job has terminated all processes on all nodes with an exit code of zero.                                                                                          |
-| CG   | COMPLETING    | Job is in the process of completing. Some processes on some nodes may still be active.                                                                            |
-| F    | FAILED        | Job terminated with non-zero exit code or other failure condition.                                                                                                |
-| NF   | NODE_FAIL     | Job terminated due to failure of one or more allocated nodes.                                                                                                     |
-| OOM  | OUT_OF_MEMORY | Job experienced out of memory error.                                                                                                                              |
-| PD   | PENDING       | Job is awaiting resource allocation.                                                                                                                              |
-| PR   | PREEMPTED     | Job terminated due to preemption.                                                                                                                                 |
-| R    | RUNNING       | Job currently has an allocation.                                                                                                                                  |
-| RQ   | REQUEUED      | Completing job is being requeued.                                                                                                                                 |
-| SI   | SIGNALING     | Job is being signaled.                                                                                                                                            |
-| TO   | TIMEOUT       | Job terminated upon reaching its time limit.                                                                                                                      |
+| Code | Job State     | Explanation                                                                                                    |
+| :--: | :------------ | :------------------------------------------------------------------------------------------------------------ |
+| CA   | CANCELLED     | Job was explicitly cancelled by the user or system administrator. The job may or may not have been initiated.  |
+| CD   | COMPLETED     | Job has terminated all processes on all nodes with an exit code of zero.                                       |
+| CG   | COMPLETING    | Job is in the process of completing. Some processes on some nodes may still be active.                         |
+| F    | FAILED        | Job terminated with non-zero exit code or other failure condition.                                             |
+| NF   | NODE_FAIL     | Job terminated due to failure of one or more allocated nodes.                                                  |
+| OOM  | OUT_OF_MEMORY | Job experienced out of memory error.                                                                           |
+| PD   | PENDING       | Job is awaiting resource allocation.                                                                           |
+| PR   | PREEMPTED     | Job terminated due to preemption.                                                                              |
+| R    | RUNNING       | Job currently has an allocation.                                                                               |
+| RQ   | REQUEUED      | Completing job is being requeued.                                                                              |
+| SI   | SIGNALING     | Job is being signaled.                                                                                         |
+| TO   | TIMEOUT       | Job terminated upon reaching its time limit.                                                                   |
 
 ### Modifying Jobs
 
@@ -263,7 +272,7 @@
 $ scontrol update JobId=JOBID Comment='The best job ever'
 ```
 
 ### Deleting Jobs
 
-Delete job by job id:
+Delete a job by job ID:
 
 ```
 $ scancel JOBID
 ```
@@ -293,7 +302,7 @@
 Delete all my pending jobs:
 
 ```
 $ scancel --me -t pending
 ```
 
-Delete all my pending jobs for project PROJECT-ID:
+Delete all my pending jobs for the project PROJECT-ID:
 
 ```
 $ scancel --me -t pending -A PROJECT-ID
 ```
@@ -307,10 +316,10 @@
 
 Possible causes:
 
-* Invalid account (i.e. project) was specified in job submission
-* User does not have access to given account/project
-* Given account/project does not have access to given partition
-* Access to given partition was retracted due to the project's allocation exhaustion
+* Invalid account (i.e. project) was specified in the job submission.
+* User does not have access to the given account/project.
+* Given account/project does not have access to the given partition.
+* Access to the given partition was retracted due to the project's allocation exhaustion.
 
 [1]: https://slurm.schedmd.com/
 [2]: /cs/job-scheduling/#partitions
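+
+For example, if the submission fails with a message such as `sbatch: error: Batch job submission failed: Invalid account ...`
+(the exact wording depends on the Slurm version), verify the project ID and resubmit with the account and partition given explicitly:
+
+```console
+$ sbatch -A PROJECT-ID -p qcpu script.sh
+```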