    # Resources Allocation Policy
    
    ## Job Queue Policies
    
    
    Resources are allocated to jobs in a fair-share fashion,
    subject to constraints set by the queue and the resources available to the project.
    The fair-share system ensures that individual users may consume approximately equal amounts of resources per week.
    Detailed information can be found in the [Job scheduling][1] section.
    
Resources are accessible via several job queues.
The queues provide prioritized and exclusive access to the computational resources.
    
    !!! important "Queues update"
        We are introducing updated queues.
        These have the same parameters as the legacy queues but are divided based on resource type (`qcpu_` for non-accelerated nodes and `qgpu_` for accelerated nodes).<br><br>
    Note that on Karolina's `qgpu` queue, **you can now allocate 1/8 of a node: 1 GPU and 16 cores**. For more information, see [Allocation of vnodes on qgpu][4].<br><br>
        We have also added completely new queues `qcpu_preempt` and `qgpu_preempt`. For more information, see the table below.
    
    ### New Queues
    
    | <div style="width:86px">Queue</div>| Description |
    | -------------------------------- | ----------- |
    | `qcpu`                           | Production queue for non-accelerated nodes intended for standard production runs. Requires an active project with nonzero remaining resources. Full nodes are allocated. Identical to `qprod`. |
| `qgpu`                           | Dedicated queue for accessing the NVIDIA accelerated nodes. Requires an active project with nonzero remaining resources. Each node is equipped with 8x NVIDIA A100 GPUs with 320GB of HBM2 memory in total. The PI needs to explicitly ask support for authorization to enter the queue for all users associated with their project. **On Karolina, you can allocate 1/8 of a node: 1 GPU and 16 cores** (see the submission sketch below the table). For more information, see [Allocation of vnodes on qgpu][4]. |
    | `qcpu_biz`<br>`qgpu_biz`         | Commercial queues, slightly higher priority.                   |
    | `qcpu_eurohpc`<br>`qgpu_eurohpc` | EuroHPC queues, slightly higher priority, **Karolina only**.   |
| `qcpu_exp`<br>`qgpu_exp`         | Express queues for testing and running very small jobs. They do not require an active project. There are 2 nodes (w/o accelerators) always reserved, with a maximum of 8 nodes available per user. The nodes may be allocated on a per-core basis. The queues are configured to run one job and accept five queued jobs per user. |
| `qcpu_free`<br>`qgpu_free`       | Intended for utilization of free resources after a project has exhausted all of its allocated resources. Note that the queue is **not free of charge**; [normal accounting][2] applies. (This does not apply to DD projects by default; DD projects have to request permission after exhausting their computational resources.) Consumed resources will be accounted to the project. Access to the queue is removed if consumed resources exceed 150% of the allocation. Full nodes are allocated. |
    | `qcpu_long`<br>`qgpu_long`       | Queues for long production runs. Require an active project with nonzero remaining resources. Only 200 nodes without acceleration may be accessed. Full nodes are allocated. |
| `qcpu_preempt`<br>`qgpu_preempt` | Free queues with the lowest priority (LP). The queues require a project with an allocation of the respective resource type. There is no limit on resource overdraft. Jobs are killed if other jobs with a higher priority (HP) request the nodes and no other nodes are available. LP jobs are automatically re-queued once HP jobs finish, so **make sure your jobs are re-runnable** (see the submission sketch below the table). |
    | `qdgx`                           | Queue for DGX-2, accessible from Barbora. |
| `qfat`                           | Queue for the fat node. The PI must request authorization to enter the queue for all users associated with their project. |
| `qviz`                           | Visualization queue intended for pre-/post-processing using OpenGL accelerated graphics. By default, each user is allocated one "chunk" of 8 CPU cores, approx. 64 GB of RAM, and 1/8 of the GPU capacity. If more GPU power or RAM is required, it is recommended to allocate more chunks (of 8 cores each), up to one whole node per user; this is currently also the maximum allowed allocation per user. One hour of work is allocated by default; the user may ask for 2 hours maximum. |
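
For illustration, here is a minimal submission sketch for some of these queues, assuming PBS `qsub` syntax with placeholder `select` statements and a hypothetical job script `job.sh`; see [Job submission and execution][3] and [Allocation of vnodes on qgpu][4] for the authoritative forms:

```console
$ # 1/8 of a Karolina accelerated node on qgpu: 1 GPU and 16 cores (assumed select syntax)
$ qsub -q qgpu -l select=1:ncpus=16:ngpus=1 ./job.sh

$ # a preemptible job; -r y marks the job as re-runnable so it can be re-queued
$ qsub -q qcpu_preempt -r y -l select=4:ncpus=128 ./job.sh

$ # two 8-core chunks on the visualization queue with the 2-hour maximum walltime
$ qsub -q qviz -l select=2:ncpus=8 -l walltime=02:00:00 ./job.sh
```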
    
    ### Legacy Queues
    
    Legacy queues stay in production until the end of 2022.
    
    | Legacy queue | Replaced by               |
    | ------------ | ------------------------- |
    | `qexp`       | `qcpu_exp` & `qgpu_exp`   |
    | `qprod`      | `qcpu`                    |
    | `qlong`      | `qcpu_long` & `qgpu_long` |
| `qnvidia`    | `qgpu`<br>Note that unlike the new queue, the legacy queue allocates only full nodes. |
    | `qfree`      | `qcpu_free` & `qgpu_free` |
    
The following tables provide an overview of the queue partitioning per cluster:
    
### Karolina
| Queue            | Active project | Project resources    | Nodes                                                                 | Min ncpus        | Priority | Authorization | Walltime (default/max) |
| ---------------- | -------------- | -------------------- | --------------------------------------------------------------------- | ---------------- | -------- | ------------- | ---------------------- |
| **qcpu**         | yes            | > 0                  | 756 nodes                                                             | 128              | 0        | no            | 24 / 48h               |
| **qcpu_biz**     | yes            | > 0                  | 756 nodes                                                             | 128              | 50       | no            | 24 / 48h               |
| **qcpu_eurohpc** | yes            | > 0                  | 756 nodes                                                             | 128              | 50       | no            | 24 / 48h               |
| **qcpu_exp**     | yes            | none required        | 756 nodes<br>max 2 nodes per user                                     | 128              | 150      | no            | 1 / 1h                 |
| **qcpu_free**    | yes            | < 150% of allocation | 756 nodes<br>max 4 nodes per job                                      | 128              | -100     | no            | 12 / 12h               |
| **qcpu_long**    | yes            | > 0                  | 200 nodes<br>max 20 nodes per job, only non-accelerated nodes allowed | 128              | 0        | no            | 72 / 144h              |
| **qcpu_preempt** | yes            | > 0                  | 756 nodes<br>max 4 nodes per job                                      | 128              | -200     | no            | 12 / 12h               |
| **qgpu**         | yes            | > 0                  | 72 nodes                                                              | 16 cpus<br>1 gpu | 0        | yes           | 24 / 48h               |
| **qgpu_biz**     | yes            | > 0                  | 70 nodes                                                              | 128              | 50       | yes           | 24 / 48h               |
| **qgpu_eurohpc** | yes            | > 0                  | 70 nodes                                                              | 128              | 50       | yes           | 24 / 48h               |
| **qgpu_exp**     | yes            | none required        | 4 nodes<br>max 1 node per job                                         | 16 cpus<br>1 gpu | 0        | no            | 1 / 1h                 |
| **qgpu_free**    | yes            | < 150% of allocation | 46 nodes<br>max 2 nodes per job                                       | 16 cpus<br>1 gpu | -100     | no            | 12 / 12h               |
| **qgpu_preempt** | yes            | > 0                  | 72 nodes<br>max 2 nodes per job                                       | 16 cpus<br>1 gpu | -200     | no            | 12 / 12h               |
| **qviz**         | yes            | none required        | 2 nodes (with NVIDIA® Quadro RTX™ 6000)                               | 8                | 0        | no            | 1 / 8h                 |
| **qfat**         | yes            | > 0                  | 1 (sdf1)                                                              | 24               | 0        | yes           | 24 / 48h               |
| **Legacy Queues**                 |
| **qfree**        | yes            | < 150% of allocation | 756 nodes<br>max 4 nodes per job                                      | 128              | -100     | no            | 12 / 12h               |
| **qexp**         | no             | none required        | 756 nodes<br>max 2 nodes per job                                      | 128              | 150      | no            | 1 / 1h                 |
| **qprod**        | yes            | > 0                  | 756 nodes                                                             | 128              | 0        | no            | 24 / 48h               |
| **qlong**        | yes            | > 0                  | 200 nodes<br>max 20 nodes per job, only non-accelerated nodes allowed | 128              | 0        | no            | 72 / 144h              |
| **qnvidia**      | yes            | > 0                  | 72 nodes                                                              | 128              | 0        | yes           | 24 / 48h               |
    
    ### Barbora
    
    
    | Queue            | Active project | Project resources    | Nodes                                                         | Min ncpus | Priority | Authorization | Walltime (default/max)  |
    | ---------------- | -------------- | -------------------- | -------------------------------- | --------- | -------- | ------------- | ---------------------- |
    | **qcpu**         | yes            | > 0                  | 190 nodes                        | 36        | 0        | no            | 24 / 48h               |
    | **qcpu_biz**     | yes            | > 0                  | 190 nodes                        | 36        | 50       | no            | 24 / 48h               |
    | **qcpu_exp**     | yes            | none required        | 16 nodes                         | 36        | 150      | no            | 1 / 1h                 |
    | **qcpu_free**    | yes            | < 150% of allocation | 124 nodes<br>max 4 nodes per job | 36        | -100     | no            | 12 / 18h               |
    | **qcpu_long**    | yes            | > 0                  | 60 nodes<br>max 20 nodes per job | 36        | 0        | no            | 72 / 144h              |
    | **qcpu_preempt** | yes            | > 0                  | 190 nodes<br>max 4 nodes per job | 36        | -200     | no            | 12 / 12h               |
    | **qgpu**         | yes            | > 0                  | 8 nodes                          | 24        | 0        | yes           | 24 / 48h               |
    | **qgpu_biz**     | yes            | > 0                  | 8 nodes                          | 24        | 50       | yes           | 24 / 48h               |
    | **qgpu_exp**     | yes            | none required        | 4 nodes<br>max 1 node per job    | 24        | 0        | no            | 1 / 1h                 |
    | **qgpu_free**    | yes            | < 150% of allocation | 5 nodes<br>max 2 nodes per job   | 24        | -100     | no            | 12 / 18h               |
    | **qgpu_preempt** | yes            | > 0                  | 4 nodes<br>max 2 nodes per job   | 24        | -200     | no            | 12 / 12h               |
    | **qdgx**         | yes            | > 0                  | cn202                            | 96        | 0        | yes           | 4 / 48h                |
    | **qviz**         | yes            | none required        | 2 nodes with NVIDIA Quadro P6000 | 4         | 0        | no            | 1 / 8h                 |
    | **qfat**         | yes            | > 0                  | 1 fat node                       | 128       | 0        | yes           | 24 / 48h               |
    | **Legacy Queues**                 |
    | **qexp**         | no             | none required        | 16 nodes<br>max 4 nodes per job  | 36        | 150      | no            | 1 / 1h                 |
    | **qprod**        | yes            | > 0                  | 190 nodes w/o accelerator        | 36        | 0        | no            | 24 / 48h               |
    | **qlong**        | yes            | > 0                  | 60 nodes w/o accelerator<br>max 20 nodes per job     | 36        | 0        | no            | 72 / 144h              |
    | **qnvidia**      | yes            | > 0                  | 8 NVIDIA nodes                   | 24        | 0        | yes           | 24 / 48h               |
| **qfree**        | yes            | < 150% of allocation | 192 nodes w/o accelerator<br>max 32 nodes per job    | 36        | -100     | no            | 12 / 12h               |
    
    ## Queue Notes
    
    
The job wallclock time defaults to **half the maximum time**; see the tables above. Longer walltime limits can be [set manually, see examples][3].
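
For example, a longer limit can be requested at submission time; a sketch assuming the `qcpu` queue on Karolina and a placeholder job script:

```console
$ qsub -q qcpu -l select=2:ncpus=128 -l walltime=48:00:00 ./job.sh
```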
    
Jobs that exceed the reserved wall clock time (Req'd Time) are killed automatically. The wall clock time limit can be changed for queued jobs (state Q) using the `qalter` command; however, it cannot be changed for a running job (state R).
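
For instance, assuming a queued job with the placeholder ID `123456`, its limit could be raised like this:

```console
$ qalter -l walltime=48:00:00 123456
```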
    
    ## Queue Status
    
    !!! tip
        Check the status of jobs, queues and compute nodes [here][c].
    
    ![rspbs web interface](../img/barbora_cluster_usage.png)
    
    Display the queue status:
    
    ```console
    $ qstat -q
    ```
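
To inspect the limits and settings of one particular queue in detail, `qstat` may also be queried per queue (the queue name `qcpu` is just an example):

```console
$ qstat -Q -f qcpu
```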
    
    The PBS allocation overview may also be obtained using the `rspbs` command:
    
    ```console
    $ rspbs
    Usage: rspbs [options]
    
    Options:
      --version             show program's version number and exit
      -h, --help            show this help message and exit
      --get-server-details  Print server
      --get-queues          Print queues
      --get-queues-details  Print queues details
      --get-reservations    Print reservations
      --get-reservations-details
                            Print reservations details
    
      ...
      ..
      .
    
    ```
    
    ---8<--- "resource_accounting.md"
    
    ---8<--- "mathjax.md"
    
    [1]: job-priority.md
    [2]: #resource-accounting-policy
    [3]: job-submission-and-execution.md
    
    [4]: ./vnode-allocation.md
    
    [a]: https://support.it4i.cz/rt/
    [c]: https://extranet.it4i.cz/rsweb