
Resources Allocation Policy

Job Queue Policies

Resources are allocated to jobs in a fair-share fashion, subject to constraints set by the queue and the resources available to the project. The fair-share system ensures that individual users consume approximately equal amounts of resources per week. Detailed information can be found in the Job scheduling section.

Resources are accessible via several queues to which jobs are submitted. Queues provide prioritized and exclusive access to the computational resources.
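
A queue is selected at submission time with `qsub`'s `-q` option. A minimal sketch (PROJECT-ID and the script name are placeholders):

```console
$ # submit a two-node job to the qcpu production queue
$ qsub -A PROJECT-ID -q qcpu -l select=2:ncpus=128,walltime=24:00:00 ./myjob.sh
```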

!!! important "Queues update"
    We are introducing updated queues. These have the same parameters as the legacy queues but are divided based on resource type (`qcpu_` for non-accelerated nodes and `qgpu_` for accelerated nodes).

    Note that on Karolina's `qgpu` queue, you can now allocate 1/8 of the node, i.e. 1 GPU and 16 cores. For more information, see Allocation of vnodes on qgpu and the sketch below.

    We have also added completely new queues `qcpu_preempt` and `qgpu_preempt`. For more information, see the table below.
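
For illustration, a 1/8-node allocation on Karolina's `qgpu` queue could be requested roughly as follows. The authoritative chunk syntax is defined in Allocation of vnodes on qgpu, so treat this as a sketch (PROJECT-ID and the script name are placeholders):

```console
$ # request one vnode: 1 GPU and 16 cores (assumed chunk specification)
$ qsub -A PROJECT-ID -q qgpu -l select=1:ncpus=16:ngpus=1 ./gpu_job.sh
```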

New Queues

| Queue | Description |
| ----- | ----------- |
| `qcpu` | Production queue for non-accelerated nodes intended for standard production runs. Requires an active project with nonzero remaining resources. Full nodes are allocated. Identical to `qprod`. |
| `qgpu` | Dedicated queue for accessing the NVIDIA accelerated nodes. Requires an active project with nonzero remaining resources. It utilizes 8x NVIDIA A100 with 320GB HBM2 memory per node. The PI needs to explicitly ask support for authorization to enter the queue for all users associated with their project. On Karolina, you can allocate 1/8 of the node, i.e. 1 GPU and 16 cores; for more information, see Allocation of vnodes on qgpu. |
| `qcpu_biz`<br>`qgpu_biz` | Commercial queues, slightly higher priority. |
| `qcpu_eurohpc`<br>`qgpu_eurohpc` | EuroHPC queues, slightly higher priority, Karolina only. |
| `qcpu_exp`<br>`qgpu_exp` | Express queues for testing and running very small jobs. Do not require a project. 2 nodes (without accelerators) are always reserved; a maximum of 8 nodes is available per user. The nodes may be allocated on a per-core basis. Configured to run one job and accept five jobs in a queue per user. |
| `qcpu_free`<br>`qgpu_free` | Intended for utilization of free resources after a project has exhausted all of its allocated resources. Note that the queue is not free of charge; normal accounting applies. (Does not apply to DD projects by default; DD projects have to request permission after exhausting their computational resources.) Consumed resources are accounted to the project. Access to the queue is removed if consumed resources exceed 150% of the allocation. Full nodes are allocated. |
| `qcpu_long`<br>`qgpu_long` | Queues for long production runs. Require an active project with nonzero remaining resources. Only 200 nodes without acceleration may be accessed. Full nodes are allocated. |
| `qcpu_preempt`<br>`qgpu_preempt` | Free queues with the lowest priority (LP). Require a project with an allocation of the respective resource type. There is no limit on resource overdraft. Jobs are killed if other jobs with a higher priority (HP) request the nodes and no other nodes are available. LP jobs are automatically re-queued once HP jobs finish, so make sure your jobs are re-runnable (see the sketch after this table). |
| `qdgx` | Queue for the DGX-2, accessible from Barbora. |
| `qfat` | Queue for the fat node. The PI must request authorization to enter the queue for all users associated with their project. |
| `qviz` | Visualization queue intended for pre-/post-processing using OpenGL accelerated graphics. Each user gets 8 cores of a CPU allocated (approx. 64 GB of RAM and 1/8 of the GPU capacity, the default "chunk"). If more GPU power or RAM is required, it is recommended to allocate more chunks (with 8 cores each), up to one whole node per user; this is currently also the maximum allowed allocation per user. One hour of work is allocated by default; the user may ask for 2 hours maximum. |
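
Because jobs in the preempt queues are automatically re-queued, they should be marked as re-runnable and be able to restart cleanly, e.g. from checkpoints. A minimal sketch using the standard PBS re-runnable flag (PROJECT-ID and the script name are placeholders):

```console
$ # -r y marks the job as re-runnable after preemption
$ qsub -A PROJECT-ID -q qcpu_preempt -r y ./restartable_job.sh
```

Equivalently, `#PBS -r y` may be placed in the job script itself.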

Legacy Queues

Legacy queues stay in production until the end of 2022.

| Legacy queue | Replaced by |
| ------------ | ----------- |
| `qexp` | `qcpu_exp` & `qgpu_exp` |
| `qprod` | `qcpu` |
| `qlong` | `qcpu_long` & `qgpu_long` |
| `qnvidia` | `qgpu`. Note that unlike in the new queues, only full nodes can be allocated. |
| `qfree` | `qcpu_free` & `qgpu_free` |

The following tables provide an overview of the queue partitioning per cluster:

Karolina

| Queue | Active project | Project resources | Nodes | Min ncpus | Priority | Authorization | Walltime (default/max) |
| ----- | -------------- | ----------------- | ----- | --------- | -------- | ------------- | ---------------------- |
| `qcpu` | yes | > 0 | 756 nodes | 128 | 0 | no | 24 / 48h |
| `qcpu_biz` | yes | > 0 | 756 nodes | 128 | 50 | no | 24 / 48h |
| `qcpu_eurohpc` | yes | > 0 | 756 nodes | 128 | 50 | no | 24 / 48h |
| `qcpu_exp` | yes | none required | 756 nodes<br>max 2 nodes per user | 128 | 150 | no | 1 / 1h |
| `qcpu_free` | yes | < 150% of allocation | 756 nodes<br>max 4 nodes per job | 128 | -100 | no | 12 / 12h |
| `qcpu_long` | yes | > 0 | 200 nodes<br>max 20 nodes per job, only non-accelerated nodes allowed | 128 | 0 | no | 72 / 144h |
| `qcpu_preempt` | yes | > 0 | 756 nodes<br>max 4 nodes per job | 128 | -200 | no | 12 / 12h |
| `qgpu` | yes | > 0 | 72 nodes | 16 cpus<br>1 gpu | 0 | yes | 24 / 48h |
| `qgpu_biz` | yes | > 0 | 70 nodes | 128 | 50 | yes | 24 / 48h |
| `qgpu_eurohpc` | yes | > 0 | 70 nodes | 128 | 50 | yes | 24 / 48h |
| `qgpu_exp` | yes | none required | 4 nodes<br>max 1 node per job | 16 cpus<br>1 gpu | 0 | no | 1 / 1h |
| `qgpu_free` | yes | < 150% of allocation | 46 nodes<br>max 2 nodes per job | 16 cpus<br>1 gpu | -100 | no | 12 / 12h |
| `qgpu_preempt` | yes | > 0 | 72 nodes<br>max 2 nodes per job | 16 cpus<br>1 gpu | -200 | no | 12 / 12h |
| `qviz` | yes | none required | 2 nodes (with NVIDIA® Quadro RTX™ 6000) | 8 | 0 | no | 1 / 8h |
| `qfat` | yes | > 0 | 1 (sdf1) | 24 | 0 | yes | 24 / 48h |

Legacy Queues

| Queue | Active project | Project resources | Nodes | Min ncpus | Priority | Authorization | Walltime (default/max) |
| ----- | -------------- | ----------------- | ----- | --------- | -------- | ------------- | ---------------------- |
| `qfree` | yes | < 150% of allocation | 756 nodes<br>max 4 nodes per job | 128 | -100 | no | 12 / 12h |
| `qexp` | no | none required | 756 nodes<br>max 2 nodes per job | 128 | 150 | no | 1 / 1h |
| `qprod` | yes | > 0 | 756 nodes | 128 | 0 | no | 24 / 48h |
| `qlong` | yes | > 0 | 200 nodes<br>max 20 nodes per job, only non-accelerated nodes allowed | 128 | 0 | no | 72 / 144h |
| `qnvidia` | yes | > 0 | 72 nodes | 128 | 0 | yes | 24 / 48h |

Barbora

| Queue | Active project | Project resources | Nodes | Min ncpus | Priority | Authorization | Walltime (default/max) |
| ----- | -------------- | ----------------- | ----- | --------- | -------- | ------------- | ---------------------- |
| `qcpu` | yes | > 0 | 190 nodes | 36 | 0 | no | 24 / 48h |
| `qcpu_biz` | yes | > 0 | 190 nodes | 36 | 50 | no | 24 / 48h |
| `qcpu_exp` | yes | none required | 16 nodes | 36 | 150 | no | 1 / 1h |
| `qcpu_free` | yes | < 150% of allocation | 124 nodes<br>max 4 nodes per job | 36 | -100 | no | 12 / 18h |
| `qcpu_long` | yes | > 0 | 60 nodes<br>max 20 nodes per job | 36 | 0 | no | 72 / 144h |
| `qcpu_preempt` | yes | > 0 | 190 nodes<br>max 4 nodes per job | 36 | -200 | no | 12 / 12h |
| `qgpu` | yes | > 0 | 8 nodes | 24 | 0 | yes | 24 / 48h |
| `qgpu_biz` | yes | > 0 | 8 nodes | 24 | 50 | yes | 24 / 48h |
| `qgpu_exp` | yes | none required | 4 nodes<br>max 1 node per job | 24 | 0 | no | 1 / 1h |
| `qgpu_free` | yes | < 150% of allocation | 5 nodes<br>max 2 nodes per job | 24 | -100 | no | 12 / 18h |
| `qgpu_preempt` | yes | > 0 | 4 nodes<br>max 2 nodes per job | 24 | -200 | no | 12 / 12h |
| `qdgx` | yes | > 0 | cn202 | 96 | 0 | yes | 4 / 48h |
| `qviz` | yes | none required | 2 nodes with NVIDIA Quadro P6000 | 4 | 0 | no | 1 / 8h |
| `qfat` | yes | > 0 | 1 fat node | 128 | 0 | yes | 24 / 48h |

Legacy Queues

| Queue | Active project | Project resources | Nodes | Min ncpus | Priority | Authorization | Walltime (default/max) |
| ----- | -------------- | ----------------- | ----- | --------- | -------- | ------------- | ---------------------- |
| `qexp` | no | none required | 16 nodes<br>max 4 nodes per job | 36 | 150 | no | 1 / 1h |
| `qprod` | yes | > 0 | 190 nodes w/o accelerator | 36 | 0 | no | 24 / 48h |
| `qlong` | yes | > 0 | 60 nodes w/o accelerator<br>max 20 nodes per job | 36 | 0 | no | 72 / 144h |
| `qnvidia` | yes | > 0 | 8 NVIDIA nodes | 24 | 0 | yes | 24 / 48h |
| `qfree` | yes | < 150% of allocation | 192 nodes w/o accelerator<br>max 32 nodes per job | 36 | -100 | no | 12 / 12h |

Queue Notes

The job wall clock time defaults to half of the maximum time, see the tables above. Longer wall time limits can be set manually, see the example below.
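
For instance, a longer limit may be requested at submission time, as long as it stays within the queue's maximum (a sketch; PROJECT-ID and the script name are placeholders):

```console
$ # request the qcpu maximum of 48 hours instead of the 24-hour default
$ qsub -A PROJECT-ID -q qcpu -l select=2:ncpus=128,walltime=48:00:00 ./myjob.sh
```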

Jobs that exceed the reserved wall clock time (Req'd Time) are killed automatically. The wall clock time limit can be changed for queued jobs (state Q) using the `qalter` command; it cannot be changed for a running job (state R).
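
A sketch of adjusting the limit of a job that is still queued (the job ID is a placeholder):

```console
$ # raise the wall clock limit of queued job 123456 to 36 hours
$ qalter -l walltime=36:00:00 123456
```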

Queue Status

!!! tip
    Check the status of jobs, queues and compute nodes here.

*rspbs web interface*

Display the queue status:

```console
$ qstat -q
```
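
Details of a single queue, including its limits, may also be printed directly, for example:

```console
$ # full status of the qcpu queue
$ qstat -Qf qcpu
```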

The PBS allocation overview may also be obtained using the rspbs command:

```console
$ rspbs
Usage: rspbs [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  --get-server-details  Print server
  --get-queues          Print queues
  --get-queues-details  Print queues details
  --get-reservations    Print reservations
  --get-reservations-details
                        Print reservations details
  ...
  ..
  .
```

---8<--- "resource_accounting.md"

---8<--- "mathjax.md"