diff --git a/docs.it4i/general/slurm-job-submission-and-execution.md b/docs.it4i/general/slurm-job-submission-and-execution.md
index f260135ef0ecd429b0318a5b3a8b49b0a76de34d..d9de1de37024dcf440063bae8de9d726f05ef1f4
--- a/docs.it4i/general/slurm-job-submission-and-execution.md
+++ b/docs.it4i/general/slurm-job-submission-and-execution.md
@@ -27,7 +27,7 @@ A `man` page exists for all Slurm commands, as well as `--help` command option,
 ### Job Submission Options
 
 Slurm provides three different commands for job submission: `salloc`, `sbatch`, and `srun`. Each of those serves a slightly different purpose;
-`salloc` is used to run interactive job - behaviour of this command at IT4Innovation's clusters might differ from behaviour at other clusters, it is alternative to PBS qsub -I command.
+`salloc` is used to run an interactive job - the behaviour of this command on IT4Innovation's clusters might differ from its behaviour on other clusters; it is an alternative to the PBS `qsub -I` command.
 `sbatch` is used to submit a batch script to Slurm, it is alternative to PBS qsub command used without -I option.
 `srun` behaviour of the command depends on whether it was run from inside of the job or from outside, from inside of the job command serves as Slurm's native way of executing MPI-enabled applications, from outside of the job command obtains appropriate allocation first and then run a specified command in parallel.
 
@@ -395,35 +395,7 @@ Apart from job submission and execution, Slurm also provides a number of command
 
 ### Job Partition Information
 
-To view information about available job partitions, use the `sinfo` command:
-
-```console
-$ sinfo
-PARTITION    AVAIL  TIMELIMIT  NODES  STATE NODELIST
-qcpu*           up 2-00:00:00    191   idle cn[1-67,69-192]
-qcpu_biz        up 2-00:00:00      1  alloc cn68
-qcpu_biz        up 2-00:00:00    191   idle cn[1-67,69-192]
-qcpu_exp        up    1:00:00      1  alloc cn68
-qcpu_exp        up    1:00:00    191   idle cn[1-67,69-192]
-qcpu_free       up   18:00:00      1  alloc cn68
-qcpu_free       up   18:00:00    191   idle cn[1-67,69-192]
-qcpu_long       up 6-00:00:00      1  alloc cn68
-qcpu_long       up 6-00:00:00    191   idle cn[1-67,69-192]
-qcpu_preempt    up   12:00:00      1  alloc cn68
-qcpu_preempt    up   12:00:00    191   idle cn[1-67,69-192]
-qgpu            up 2-00:00:00      8   idle cn[193-200]
-qgpu_biz        up 2-00:00:00      8   idle cn[193-200]
-qgpu_exp        up    1:00:00      8   idle cn[193-200]
-qgpu_free       up   18:00:00      8   idle cn[193-200]
-qgpu_preempt    up   12:00:00      8   idle cn[193-200]
-qfat            up 2-00:00:00      1   idle cn201
-qdgx            up 2-00:00:00      1   idle cn202
-qviz            up    8:00:00      2   idle vizserv[1-2]
-```
-
-Here we can see output of the `sinfo` command ran on Barbora cluster. By default, it shows basic node and partition configurations.
-
-To view partition summary information, use `sinfo -s`, or `sinfo --summarize`:
+To view information about job partitions and nodes, use the `sinfo` command. A useful option for a quick overview of the cluster is `-s`, or `--summarize`:
 
 ```console
 $ sinfo -s
@@ -443,8 +415,9 @@ qfat            up 2-00:00:00          0/1/0/1 cn201
 qdgx            up 2-00:00:00          0/1/0/1 cn202
 qviz            up    8:00:00          0/2/0/2 vizserv[1-2]
 ```
+Here we can see the output of the `sinfo` command run on the Barbora cluster.
 
-This lists only a partition state summary with no dedicated column for partition state. Instead, it is summarized in the `NODES(A/I/O/T)` column, where the `A/I/O/T` stands for `allocated/idle/other/total`.
+The `NODES(A/I/O/T)` column summarizes node counts per state, where `A/I/O/T` stands for `allocated/idle/other/total`.
 
 `sinfo` can also report more granular information, such detailed exact node-oriented information:
 
@@ -488,11 +461,6 @@ By default, this shows the job ID, partition, name of the job, job owner's usern
 To view jobs only belonging to a particular user, you can either use `--user=<username>`, or `--me`, which serves as an alias for `--user=$USER`, to shows only your jobs:
 
 ```console
-$ squeue
-   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
-    1559   p01-arm interact  user017  R       3:37      8 p01-arm[01-08]
-    1560 p02-intel interact  user018  R       0:05      2 p02-intel[01-02]
-    1557   p03-amd interact  user017  R      10:22      1 p03-amd01
 $ squeue --me
    JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
     1559   p01-arm interact  user017  R       4:04      8 p01-arm[01-08]
@@ -508,6 +476,12 @@ $ squeue --jobs 1557,1560
     1557   p03-amd interact  user017  R      12:26      1 p03-amd01
 ```
 
+| Command              | Description          |
+| -------------------- | -------------------- |
+| `squeue -A $project` | List project's jobs  |
+| `squeue -t RUNNING`  | List running jobs    |
+| `squeue -t PENDING`  | List pending jobs    |
+
 For more information about the `squeue` command, its flags, and formatting options, see the manual, either by using the `man squeue` command or [online][f].
 
 #### Job State Codes Overview
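The filtering options added in the table above can also be combined. As a quick, minimal sketch (commands only, output omitted; `$project` is a placeholder for your project ID, as in the table):

```console
$ squeue --me -t RUNNING
$ squeue -A $project -t PENDING
```

Both `-t`/`--states` and `-A`/`--account` are standard `squeue` filters; see `man squeue` for the full list.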