
Resource Allocation and Job Execution

!!! important "Barbora migrating to Slurm"
    Starting July 19, 9 AM, we are migrating Barbora's workload manager from PBS to Slurm. For more information on how to submit jobs in Slurm, see the Slurm Job Submission and Execution section.

To run a job, computational resources for this particular job must be allocated. This is done via the PBS Pro job workload manager software, which distributes workloads across the supercomputer. Extensive information about PBS Pro can be found in the PBS Pro User's Guide.

Resource Allocation Policy

Resources are allocated to jobs in a fair-share fashion, subject to constraints set by the queue and the resources available to the project. Fair-share ensures that individual users consume approximately equal amounts of resources per week. Resources are accessed via queues into which jobs are submitted. The queues provide prioritized and exclusive access to the computational resources.
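The current state of the queues can be inspected with the standard PBS Pro `qstat` command. A minimal sketch (the queue name `qprod` is an example, not a default):

```console
$ qstat -q          # list all queues with their limits and job counts
$ qstat -Q qprod    # show the settings of a single queue
```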

Resource Reservation

You can request a reservation of a specific number, range, or type of computational resources at support@it4i.cz. Note that unspent reserved node-hours count towards the total computational resources used.

!!! note
    See the queue status for Karolina or Barbora.

Read more on the Resource Allocation Policy page.

Job Submission and Execution

The qsub command creates a request to the PBS job manager for the allocation of the specified resources. The smallest allocation unit is an entire node, with the exception of the qexp queue. The resources are allocated when available, subject to allocation policies and constraints. Once the resources are allocated, the jobscript or interactive shell is executed on the first of the allocated nodes.
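As an illustration, a batch submission and an interactive session might look like this. This is a sketch: the project ID, queue names, node counts, walltime, and jobscript path are placeholders you must adapt to your allocation, not defaults.

```console
$ # batch job: 4 whole nodes for 3 hours in an assumed production queue
$ qsub -A PROJECT-ID -q qprod -l select=4,walltime=03:00:00 ./myjob.sh

$ # interactive shell on a single node in the qexp queue
$ qsub -A PROJECT-ID -q qexp -l select=1 -I
```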

Read more on the Job Submission and Execution page.

Capacity Computing

!!! note
    Use Job arrays when running a huge number of jobs.

Use GNU Parallel and/or Job arrays when running (many) single-core jobs.

In many cases, it is useful to submit a large number (100+) of computational jobs to the PBS queue system. Submitting many (small) jobs is one of the most effective ways to execute parallel calculations, achieving the best runtime, throughput, and computer utilization. In this chapter, we discuss the recommended ways to run large numbers of jobs, including large numbers of single-core jobs.
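A job array sketch using the standard PBS Pro `-J` option and the `PBS_ARRAY_INDEX` variable each sub-job receives. The jobscript name, queue, walltime, task list file, and program are illustrative placeholders:

```bash
#!/bin/bash
#PBS -q qprod
#PBS -l select=1,walltime=02:00:00

# Each sub-job gets its own index in PBS_ARRAY_INDEX;
# here it selects one line (one task) from a task list file.
TASK=$(sed -n "${PBS_ARRAY_INDEX}p" tasklist.txt)
./myprog "$TASK"
```

The array itself is submitted with a single command, e.g. `qsub -J 1-100 jobscript.sh`, which creates 100 sub-jobs indexed 1 through 100.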

Read more on the Capacity Computing page.

Vnode Allocation

The qgpu queue on Karolina takes advantage of the division of nodes into vnodes. An accelerated node equipped with two 64-core processors and eight GPU cards is treated as eight vnodes, each containing 16 CPU cores and 1 GPU card. Vnodes can be allocated to jobs individually: through a precise definition of the resource list at job submission, you may allocate a varying number of resources/GPU cards according to your needs.
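For example, allocating two vnodes (2 GPU cards with 16 CPU cores each) might be requested as follows. The project ID, jobscript path, and the exact resource keywords in the select statement are assumptions following common PBS Pro conventions; consult the Vnode Allocation page for the authoritative syntax:

```console
$ qsub -A PROJECT-ID -q qgpu -l select=2:ncpus=16:ngpus=1 ./gpu_job.sh
```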

Read more on the Vnode Allocation page.