| sinfo | View information about nodes and partitions. |
| squeue | View information about jobs located in the scheduling queue. |
| sacct | Display accounting data for all jobs and job steps in the job accounting log or Slurm database. |
| scontrol | View or modify jobs, nodes, partitions, reservations, and other Slurm objects. |
| salloc | Obtain a job allocation (a set of nodes), execute a command, and then release the allocation when the command is finished; typically used to run an interactive job. |
| sattach | Attach to a job step. |
| sbatch | Submit a batch script to Slurm. |
| sbcast | Transmit a file to the nodes allocated to a job. |
| srun | Run parallel tasks. |
| scancel | Signal or cancel jobs and job steps that are under the control of Slurm. |
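As a quick sketch of how these commands fit together in practice (the script name `myjob.sh` and job ID `123456` are placeholders, not values from this guide):

```console
$ sbatch myjob.sh
Submitted batch job 123456
$ squeue --me
$ sacct -j 123456
$ scancel 123456
```

`squeue --me` limits the listing to your own jobs; `sacct -j` also works after the job has finished, since it reads from the accounting records.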
### Job Submission Options
...
To define Slurm job options within the batch script, use the `SBATCH` keyword followed by the option:
```shell
#SBATCH -A OPEN-00-00
#SBATCH -p qcpu
#SBATCH -n 4
```
Here we asked for 4 tasks in total to be executed on the qcpu partition, using the OPEN-00-00 project's resources.
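As a side note (this comes from the sbatch man page, not from this guide specifically), the same options can also be passed on the command line at submission time, where they override the script's `#SBATCH` directives; the script name below is a placeholder:

```console
$ sbatch -A OPEN-00-00 -p qcpu -n 4 myscript.sh
```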
Job instructions should contain everything you'd like your job to do; that is, every single command the job is supposed to execute:
...
Combined together, the previous examples make up the following script:
```shell
#!/usr/bin/bash
#SBATCH -A OPEN-00-00
#SBATCH -p qcpu
#SBATCH -n 4
ml OpenMPI/4.1.4-GCC-11.3.0
...
we get an output file with the following contents:
```console
$ cat slurm-1511.out
1 cn1.barbora.it4i.cz
3 cn2.barbora.it4i.cz
```
Notice that Slurm spread our job across 2 different nodes; by default, Slurm selects the number of nodes to minimize wait time before job execution. However, sometimes you may want to restrict your job to only a certain minimum or maximum number of nodes (or both). You may also require more time for your calculation to finish than the default allocated time. For an overview of such job options, see the table below.
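For instance (the values here are illustrative, not recommendations), restricting the job to exactly two nodes and requesting a twelve-hour limit could look like:

```shell
#SBATCH -N 2          # node count; a single value sets both minimum and maximum
#SBATCH -t 12:00:00   # wall-clock time limit in HH:MM:SS
```

A range such as `-N 2-4` instead sets separate minimum and maximum node counts.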
...
The recommended way to run production jobs is to change to the `/scratch` directory:
#!/bin/bash
#SBATCH -J job_example
#SBATCH -A OPEN-00-00
#SBATCH -p qcpu
#SBATCH -n 4
cd $SLURM_SUBMIT_DIR
...
qdgx            up 2-00:00:00      1   idle cn202
qviz            up    8:00:00      2   idle vizserv[1-2]
```
Here we can see the output of the `sinfo` command run on the Barbora cluster. By default, it shows basic node and partition configurations.
To view partition summary information, use `sinfo -s` or `sinfo --summarize`:
```console
$ sinfo -s
PARTITION    AVAIL  TIMELIMIT   NODES(A/I/O/T) NODELIST
qcpu*           up 2-00:00:00      0/192/0/192 cn[1-192]
qcpu_biz        up 2-00:00:00      0/192/0/192 cn[1-192]
qcpu_exp        up    1:00:00      0/192/0/192 cn[1-192]
qcpu_free       up   18:00:00      0/192/0/192 cn[1-192]
qcpu_long       up 6-00:00:00      0/192/0/192 cn[1-192]
qcpu_preempt    up   12:00:00      0/192/0/192 cn[1-192]
qgpu            up 2-00:00:00          0/8/0/8 cn[193-200]
qgpu_biz        up 2-00:00:00          0/8/0/8 cn[193-200]
qgpu_exp        up    1:00:00          0/8/0/8 cn[193-200]
qgpu_free       up   18:00:00          0/8/0/8 cn[193-200]
qgpu_preempt    up   12:00:00          0/8/0/8 cn[193-200]
qfat            up 2-00:00:00          0/1/0/1 cn201
qdgx            up 2-00:00:00          0/1/0/1 cn202
qviz            up    8:00:00          0/2/0/2 vizserv[1-2]
```
This lists only a partition state summary with no dedicated column for node state. Instead, the states are summarized in the `NODES(A/I/O/T)` column, where `A/I/O/T` stands for `allocated/idle/other/total`.
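If you need different columns, `sinfo` also accepts a custom format string via `-o`/`--format`; the specifiers below come from the `sinfo` man page (`%P` partition, `%a` availability, `%l` time limit, `%F` node counts as allocated/idle/other/total), and the output shown is illustrative:

```console
$ sinfo -s -o "%P %a %l %F"
PARTITION AVAIL TIMELIMIT NODES(A/I/O/T)
qcpu* up 2-00:00:00 0/192/0/192
```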