# Job Scheduling

## Job Execution Priority

The scheduler gives each job an execution priority and then uses this job execution priority to select which job(s) to run.

Job execution priority on Anselm is determined by these job properties (in order of importance):

1. queue priority
1. fair-share priority
1. eligible time

### Queue Priority

Queue priority is the priority of the queue in which the job is waiting prior to execution.

Queue priority has the biggest impact on job execution priority. The execution priority of jobs in higher priority queues is always greater than the execution priority of jobs in lower priority queues. Other properties of jobs used for determining the job execution priority (fair-share priority, eligible time) cannot compete with queue priority.

Queue priorities can be seen [here][a].

### Fair-Share Priority

Fair-share priority is priority calculated on the basis of recent usage of resources. Fair-share priority is calculated per project, all members of a project sharing the same fair-share priority.
Projects with higher recent usage have a lower fair-share priority than projects with lower or no recent usage.

Fair-share priority is used for ranking jobs with equal queue priority.

Fair-share priority is calculated as

---8<--- "fairshare_formula.md"

where MAX_FAIRSHARE has value 1E6,
usageProject is accumulated usage by all members of a selected project,
usageTotal is total usage by all users, across all projects.

Usage counts allocated core-hours (ncpus x walltime). Usage decays, halving at intervals of 168 hours (one week).
Jobs queued in the queue qexp are not used to calculate the project's usage.

!!! note
    Calculated usage and fair-share priority can be seen [here][b].

Calculated fair-share priority can also be seen in the Resource_List.fairshare attribute of a job.

### Eligible Time

Eligible time is the amount of time (in seconds) a job accrues while it is eligible and waiting to run. Jobs with higher eligible time gain higher priority.

Eligible time has the least impact on execution priority. Eligible time is used for sorting jobs with equal queue priority and fair-share priority.
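
The ordering described so far can be illustrated with a small Python sketch. It is a simplified model, not the scheduler's implementation: the `Job` class, the job names, and all numeric values are hypothetical. Jobs are ranked strictly by queue priority, then fair-share priority, then eligible time, and `decayed_usage` assumes a smooth exponential decay that matches the stated halving at every multiple of 168 hours. The fair-share formula itself is not reproduced here.

```python
from dataclasses import dataclass

HALF_LIFE_HOURS = 168.0  # from the text: usage halves every 168 hours (one week)


def decayed_usage(core_hours: float, age_hours: float) -> float:
    """Discount core-hours (ncpus x walltime) by their age.

    Assumption: smooth exponential decay; the scheduler may instead
    apply the halving in discrete steps.
    """
    return core_hours * 0.5 ** (age_hours / HALF_LIFE_HOURS)


@dataclass(frozen=True)
class Job:
    name: str                  # hypothetical job label
    queue_priority: int        # priority of the queue the job waits in
    fairshare_priority: float  # per-project value, higher runs first
    eligible_time: int         # seconds accrued while waiting


def rank(jobs: list[Job]) -> list[Job]:
    """Order jobs by queue priority, then fair-share, then eligible time."""
    return sorted(
        jobs,
        key=lambda j: (j.queue_priority, j.fairshare_priority, j.eligible_time),
        reverse=True,
    )


jobs = [
    Job("a", queue_priority=0, fairshare_priority=900_000.0, eligible_time=7200),
    Job("b", queue_priority=150, fairshare_priority=100_000.0, eligible_time=60),
    Job("c", queue_priority=0, fairshare_priority=900_000.0, eligible_time=9000),
    Job("d", queue_priority=0, fairshare_priority=950_000.0, eligible_time=10),
]
ranked_names = [job.name for job in rank(jobs)]
print(ranked_names)                  # ['b', 'd', 'c', 'a']
print(decayed_usage(1600.0, 336.0))  # 400.0 (two half-lives)
```

Job `b` wins on queue priority alone despite its low fair-share priority, and eligible time only separates `c` from `a`, whose other two properties are equal.
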
It is very difficult for eligible time to compete with fair-share priority.

Eligible time can be seen in the eligible_time attribute of a job.

### Formula

Job execution priority (job sort formula) is calculated as:

---8<--- "job_sort_formula.md"

### Job Backfilling

The Anselm cluster uses job backfilling.

Backfilling means fitting smaller jobs around the higher-priority jobs that the scheduler is going to run next, in such a way that the higher-priority jobs are not delayed. Backfilling allows us to keep resources from becoming idle when the top job (the job with the highest execution priority) cannot run.

The scheduler makes a list of jobs to run in order of execution priority. It then looks for smaller jobs that can fit into the usage gaps around the highest-priority jobs in the list, choosing the highest-priority smaller jobs that fit. Filler jobs are run only if they will not delay the start time of top jobs.

This means that jobs with lower execution priority can be run before jobs with higher execution priority.

!!! note
    It is **very beneficial to specify the walltime** when submitting jobs.

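
The backfilling rule can be sketched as a toy model in Python. This is an illustration under simplifying assumptions, not the scheduler's actual algorithm: it considers a single pool of cores, assumes the start time of the top job is already known, and accepts a filler job only if it fits into the currently idle cores and its requested walltime ends before the top job starts. The names `WaitingJob` and `backfill` and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WaitingJob:
    name: str      # hypothetical job label
    cores: int     # requested cores
    walltime: int  # requested walltime in hours


def backfill(
    free_cores: int, top_job_start: int, waiting: list[WaitingJob]
) -> list[WaitingJob]:
    """Pick filler jobs that fit now and finish before the top job starts.

    free_cores: cores that are idle right now
    top_job_start: hours until the top job is expected to start
    waiting: lower-priority jobs, already sorted by execution priority
    """
    chosen = []
    for job in waiting:
        fits_now = job.cores <= free_cores
        ends_in_time = job.walltime <= top_job_start
        if fits_now and ends_in_time:
            chosen.append(job)
            free_cores -= job.cores  # these cores are now occupied
    return chosen


waiting = [
    WaitingJob("long", cores=16, walltime=48),
    WaitingJob("wide", cores=64, walltime=2),
    WaitingJob("short", cores=16, walltime=3),
    WaitingJob("tiny", cores=16, walltime=1),
]
fillers = backfill(free_cores=32, top_job_start=4, waiting=waiting)
filler_names = [job.name for job in fillers]
print(filler_names)  # ['short', 'tiny']
```

In this model the job `long` is skipped even though it fits into the idle cores, because its 48-hour walltime would delay the top job. A job submitted with an overly generous walltime is skipped in the same way.
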

Specifying more accurate walltime enables better scheduling, better execution times, and better resource usage. Jobs with suitable (small) walltime can be backfilled and overtake job(s) with a higher priority.

---8<--- "mathjax.md"

[a]: https://extranet.it4i.cz/anselm/queues
[b]: https://extranet.it4i.cz/anselm/projects