# Allocation of vnodes on qgpu

## Introduction

The `qgpu` queue on Karolina takes advantage of the division of nodes into vnodes.
Each accelerated node, equipped with two 64-core processors and eight GPU cards, is treated as eight vnodes, each containing 16 CPU cores and 1 GPU card.
Vnodes can be allocated to jobs individually: by precisely defining the resource list at job submission, you can allocate a varying number of resources/GPU cards according to your needs.

!!! important "Vnodes and Security"
    The division of nodes into vnodes was implemented to be as secure as possible, but it is still a "multi-user mode": if two users allocate portions of the same node, they can see each other's running processes.
    If this is a concern for you, consider allocating a whole node.

## Selection Statement and Chunks

Requested resources are specified using a selection statement:

```
-l select=[<N>:]<chunk>[+[<N>:]<chunk> ...]
```

`N` specifies the number of chunks; if not specified, then `N = 1`.<br>
`chunk` declares the value of each resource in a set of resources which are to be allocated as a unit to a job.

* A `chunk` is seen by MPI as one node.
* Multiple chunks are seen as multiple nodes.
* The maximum chunk size is equal to the size of a full physical node (8 GPU cards, 128 cores).

The default chunk for the `qgpu` queue is configured to contain 1 GPU card and 16 CPU cores, i.e. `ncpus=16:ngpus=1`.

* `ncpus` specifies the number of CPU cores
* `ngpus` specifies the number of GPU cards

### Allocating Single GPU

A single GPU can be allocated in an interactive session using

```console
qsub -q qgpu -A OPEN-00-00 -l select=1 -I
```

or simply

```console
qsub -q qgpu -A OPEN-00-00 -I
```

In this case, the `ngpus` parameter is optional, since it defaults to `1`.

You can verify your allocation either in PBS using the `qstat` command, or by checking the number of allocated GPU cards in the `CUDA_VISIBLE_DEVICES` environment variable:

```console
$ qstat -F json -f $PBS_JOBID | grep exec_vnode
    "exec_vnode":"(acn53[0]:ncpus=16:ngpus=1)"

$ echo $CUDA_VISIBLE_DEVICES
GPU-8772c06c-0e5e-9f87-8a41-30f1a70baa00
```

The output shows that you have been allocated vnode `acn53[0]`.

### Allocating Single Accelerated Node

!!! tip "Security tip"
    Allocating a whole node prevents other users from seeing your running processes.

A single accelerated node can be allocated in an interactive session using

```console
qsub -q qgpu -A OPEN-00-00 -l select=8 -I
```

Setting `select=8` automatically allocates a whole accelerated node and sets `mpiprocs`.
So for `N` full nodes, set `select` to `N x 8`.
However, note that it may take some time before your jobs are executed if the required number of full nodes isn't available.

### Allocating Multiple GPUs

!!! important "Security risk"
    If two users allocate portions of the same node, they can see each other's running processes.
    When required for security reasons, consider allocating a whole node.

Again, the following examples use only the selection statement, so no additional setting is required.

```console
qsub -q qgpu -A OPEN-00-00 -l select=2 -I
```

In this example, two chunks will be allocated on the same node, if possible.

```console
qsub -q qgpu -A OPEN-00-00 -l select=16 -I
```

This example allocates two whole accelerated nodes.

Multiple vnodes within the same chunk can be allocated using the `ngpus` parameter.
For example, to allocate 2 vnodes in interactive mode, run

```console
qsub -q qgpu -A OPEN-00-00 -l select=1:ngpus=2:mpiprocs=2 -I
```

Remember to **set the number of `mpiprocs` equal to that of `ngpus`** to spawn the corresponding number of MPI processes.

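
The two rules of thumb above (`select` equals `N x 8` for `N` full nodes, and `mpiprocs` equals `ngpus` within a chunk) can be wrapped in small shell helpers so that the resource list is always built consistently. The following is only a minimal sketch: the function names `qgpu_full_nodes` and `qgpu_select` are hypothetical and are not part of PBS or of the Karolina tooling.

```bash
#!/bin/bash
# Hypothetical helpers (not provided by PBS or the cluster):
# they only print a resource list; they do not submit anything.

# Print the selection statement for <nodes> whole accelerated nodes.
# One accelerated node consists of 8 vnodes, hence select = nodes * 8.
qgpu_full_nodes() {
    local nodes=$1
    echo "select=$(( nodes * 8 ))"
}

# Print a selection statement for <chunks> chunks with <gpus> GPU cards each,
# setting mpiprocs equal to ngpus as recommended above.
qgpu_select() {
    local chunks=$1 gpus=$2
    # A chunk cannot be larger than one physical node (8 GPU cards).
    if (( chunks < 1 || gpus < 1 || gpus > 8 )); then
        echo "usage: qgpu_select <chunks >= 1> <gpus per chunk, 1-8>" >&2
        return 1
    fi
    echo "select=${chunks}:ngpus=${gpus}:mpiprocs=${gpus}"
}

qgpu_full_nodes 2   # prints select=16
qgpu_select 1 2     # prints select=1:ngpus=2:mpiprocs=2
```

Assuming the functions are defined in your shell, the command from the example above could then be written as `qsub -q qgpu -A OPEN-00-00 -l "$(qgpu_select 1 2)" -I`.
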
To verify that both vnodes were allocated:

```console
$ qstat -F json -f $PBS_JOBID | grep exec_vnode
    "exec_vnode":"(acn53[0]:ncpus=16:ngpus=1+acn53[1]:ncpus=16:ngpus=1)"

$ echo $CUDA_VISIBLE_DEVICES | tr ',' '\n'
GPU-8772c06c-0e5e-9f87-8a41-30f1a70baa00
GPU-5e88c15c-e331-a1e4-c80c-ceb3f49c300e
```

The number of chunks to allocate is specified in the `select` parameter.
For example, to allocate 2 chunks, each with 4 GPUs, run

```console
qsub -q qgpu -A OPEN-00-00 -l select=2:ngpus=4:mpiprocs=4 -I
```

To verify that all eight GPUs (2 chunks with 4 GPUs each) were allocated:

```console
$ cat > print-cuda-devices.sh <<EOF
#!/bin/bash
echo \$CUDA_VISIBLE_DEVICES
EOF

$ chmod +x print-cuda-devices.sh
$ ml OpenMPI/4.1.4-GCC-11.3.0
$ mpirun ./print-cuda-devices.sh | tr ',' '\n' | sort | uniq
GPU-0910c544-aef7-eab8-f49e-f90d4d9b7560
GPU-1422a1c6-15b4-7b23-dd58-af3a233cda51
GPU-3dbf6187-9833-b50b-b536-a83e18688cff
GPU-3dd0ae4b-e196-7c77-146d-ae16368152d0
GPU-93edfee0-4cfa-3f82-18a1-1e5f93e614b9
GPU-9c8143a6-274d-d9fc-e793-a7833adde729
GPU-ad06ab8b-99cd-e1eb-6f40-d0f9694601c0
GPU-dc0bc3d6-e300-a80a-79d9-3e5373cb84c9
```
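
The manual checks above can also be scripted, for example at the start of a job script, so that a job stops early when it sees fewer GPUs than it requested. This is a minimal sketch under stated assumptions: `count_gpus` and `check_gpus` are hypothetical helper names, and the sketch assumes that `CUDA_VISIBLE_DEVICES` holds a comma-separated list of GPU identifiers, as in the outputs shown above.

```bash
#!/bin/bash
# Hypothetical helpers for checking a GPU allocation; not part of PBS or CUDA.

# Count the distinct, non-empty entries in a comma-separated list of GPU IDs.
count_gpus() {
    echo "$1" | tr ',' '\n' | sort -u | awk 'NF { n++ } END { print n + 0 }'
}

# Compare the GPUs visible in CUDA_VISIBLE_DEVICES with the expected count.
check_gpus() {
    local expected=$1
    local found
    found=$(count_gpus "${CUDA_VISIBLE_DEVICES:-}")
    if [ "$found" -eq "$expected" ]; then
        echo "OK: $found GPU(s) visible"
    else
        echo "MISMATCH: expected $expected GPU(s), found $found" >&2
        return 1
    fi
}

# Example with the two identifiers from the ngpus=2 allocation shown earlier.
count_gpus "GPU-8772c06c-0e5e-9f87-8a41-30f1a70baa00,GPU-5e88c15c-e331-a1e4-c80c-ceb3f49c300e"   # prints 2
```

Inside a job allocated with `ngpus=2`, calling `check_gpus 2` should then report two visible GPUs. For a multi-chunk allocation, each MPI process only sees the GPUs of its own chunk, so the per-process lists would first need to be combined, for instance by passing the joined output of `mpirun ./print-cuda-devices.sh` to `count_gpus`.
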