# Job Features
Special features installed/configured on the fly on allocated nodes; features are requested in a Slurm job using a specially formatted comment.
```console
$ salloc ... --comment "use:feature=req"
```
or
```
#SBATCH --comment "use:feature=req"
```
or for multiple features
```console
$ salloc ... --comment "use:feature1=req1 use:feature2=req2 ..."
```
where `feature` is the feature name and `req` is the requested value (true, version string, etc.).
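
For illustration, the same comment can also be set from a batch jobscript with an `#SBATCH` directive. The sketch below combines two features documented further down this page (VTune and MSR-SAFE, using the example version strings from those sections); the project ID, partition, and node count are placeholders.

```
#!/bin/bash
#SBATCH -A PROJECT-ID
#SBATCH -p qcpu
#SBATCH --nodes 1
#SBATCH --comment "use:vtune=2019_update4 use:msr=1.4.0"

# ... application commands ...
```
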
## Xorg
[Xorg][2] is a free and open source implementation of the X Window System display server maintained by the X.Org Foundation. Xorg is available only for Karolina accelerated nodes Acn[01-72].
```console
$ salloc ... --comment "use:xorg=True"
```
## VTune Support
Load the VTune kernel modules.
```console
$ salloc ... --comment "use:vtune=version_string"
```
`version_string` is the VTune version, e.g. 2019_update4.
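
For example, requesting the VTune version mentioned above:

```console
$ salloc ... --comment "use:vtune=2019_update4"
```
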
## Global RAM Disk
!!! warning
    The feature has not been implemented on Slurm yet.
The Global RAM disk deploys the BeeGFS On Demand parallel filesystem,
using local (i.e. allocated nodes') RAM disks as a storage backend.
The Global RAM disk is mounted at `/mnt/global_ramdisk`.
```console
$ salloc ... --comment "use:global_ramdisk=true"
```
![Global RAM disk](../img/global_ramdisk.png)
### Example
```console
$ sbatch -A PROJECT-ID -p qcpu --nodes 4 --comment="use:global_ramdisk=true" ./jobscript
```
This command submits a 4-node job in the `qcpu` queue;
once running, a RAM disk shared across the 4 nodes will be created.
The RAM disk will be accessible at `/mnt/global_ramdisk`
and files written to this RAM disk will be visible on all 4 nodes.
The file system is private to a job and shared among the nodes,
created when the job starts and deleted at the job's end.
!!! warning
    The Global RAM disk is deleted immediately after the calculation ends.
    Users should take care to save the output data from within the jobscript, for example as in the sketch below.
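
The following jobscript sketch runs the application on the shared RAM disk and copies the results to persistent storage before the job, and with it the RAM disk, ends; the application name and the destination directory are placeholders.

```
#!/bin/bash
#SBATCH -A PROJECT-ID
#SBATCH -p qcpu
#SBATCH --nodes 4
#SBATCH --comment "use:global_ramdisk=true"

# Work on the RAM disk shared by all 4 nodes of the job.
cd /mnt/global_ramdisk
srun ./my_application

# Save the output before the job ends and the RAM disk is deleted;
# the destination directory is a placeholder.
cp -r /mnt/global_ramdisk/results /scratch/project/PROJECT-ID/
```
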
## MSR-SAFE Support

Load a kernel module that allows saving/restoring values of MSR registers.
Uses [LLNL MSR-SAFE][a].
```console
$ salloc ... --comment "use:msr=version_string"
```
`version_string` is the MSR-SAFE version, e.g. 1.4.0.
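
For example, requesting the MSR-SAFE version mentioned above:

```console
$ salloc ... --comment "use:msr=1.4.0"
```
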
!!! warning
    Available on Barbora nodes only.
## HDEEM Support
Load the HDEEM software stack. The [High Definition Energy Efficiency Monitoring][b] (HDEEM) library is a software interface used to measure power consumption of HPC clusters with bullx blades.
```console
$ salloc ... --comment "use:hdeem=version_string"
```
`version_string` is the HDEEM version, e.g. 2.2.8-1.
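
For example, requesting the HDEEM version mentioned above:

```console
$ salloc ... --comment "use:hdeem=2.2.8-1"
```
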
## NVMe Over Fabrics File System
!!! warning
    The feature has not been implemented on Slurm yet.
Attach a volume from an NVMe storage and mount it as a file system. The file system is mounted at `/mnt/nvmeof` (on the first node of the job).
The Barbora cluster provides two NVMeoF storage nodes equipped with NVMe disks. Each storage node contains seven 1.6TB NVMe disks and provides a net aggregated capacity of 10.18TiB. Storage space is provided using the NVMe over Fabrics protocol; an RDMA network, i.e. InfiniBand, is used for data transfers.
```console
$ salloc ... --comment "use:nvmeof=size"
```
`size` is the size of the requested volume; standard size-suffix conventions are used, e.g. 10t.
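
For example, requesting a 10t volume mounted on the first node of the job:

```console
$ salloc ... --comment "use:nvmeof=10t"
```
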
To create a shared file system on the attached NVMe storage and make it available on all nodes of the job, append `:shared` to the size specification. The shared file system is mounted at `/mnt/nvmeof-shared`.
```console
$ salloc ... --comment "use:nvmeof=size:shared"
```
For example:
```console
$ salloc ... --comment "use:nvmeof=10t:shared"
```
!!! warning
    Available on Barbora nodes only.
## Smart Burst Buffer
!!! warning
    The feature has not been implemented on Slurm yet.

Accelerate SCRATCH storage using the Smart Burst Buffer (SBB) technology. A specific Burst Buffer process is launched and Burst Buffer resources (CPUs, memory, flash storage) are allocated on an SBB storage node for acceleration (I/O caching) of SCRATCH data operations. The SBB profile file `/lscratch/$SLURM_JOB_ID/sbb.sh` is created on the first allocated node of the job. For SCRATCH acceleration, the SBB profile file has to be sourced into the shell environment, so that the provided environment variables are defined in the process environment. Modified data is written asynchronously to the backend (Lustre) filesystem; writes may continue after job termination.
The Barbora cluster provides two SBB storage nodes equipped with NVMe disks. Each storage node contains ten 3.2TB NVMe disks and provides a net aggregated capacity of 29.1TiB. Acceleration uses an RDMA network, i.e. InfiniBand is used for data transfers.
```console
$ salloc ... --comment "use:sbb=spec"
```
`spec` specifies the amount of resources requested for the Burst Buffer (CPUs, memory, flash storage); available values are small, medium, and large.
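
For example, requesting the largest Burst Buffer allocation:

```console
$ salloc ... --comment "use:sbb=large"
```
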
Loading the SBB profile:
```console
$ source /lscratch/$SLURM_JOB_ID/sbb.sh
```
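
In a jobscript, the profile is typically sourced before launching the application, so that the provided environment variables are present in the application's environment. A minimal sketch (application name and resource values are placeholders):

```
#!/bin/bash
#SBATCH -A PROJECT-ID
#SBATCH -p qcpu
#SBATCH --nodes 1
#SBATCH --comment "use:sbb=small"

# Export the SBB environment variables so that SCRATCH I/O is accelerated.
source /lscratch/$SLURM_JOB_ID/sbb.sh

# Run the application; its SCRATCH I/O is cached by the Burst Buffer.
srun ./my_application
```
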
!!! warning
    Available on Barbora nodes only.