@@ -29,11 +29,11 @@ A job array is a compact representation of many jobs called subjobs. Subjobs sha
* job identifiers of subjobs differ only by their indices
* the state of subjobs can differ (R, Q, etc.)
All subjobs within a job array have the same scheduling priority and are scheduled as independent jobs. An entire job array is submitted through a single `qsub` command and may be managed as a single job by the `qdel`, `qalter`, `qhold`, `qrls`, and `qsig` commands.
### Shared Jobscript
All subjobs in a job array use the same jobscript. Each subjob runs its own instance of the jobscript. The instances perform different work, selected by the `$PBS_ARRAY_INDEX` variable.
In this example, the submit directory holds the 900 input files, the `myprog.x` executable, and the jobscript file. For each run, we take the name of the input file from the previously created tasklist file. We copy the input file to the local scratch directory `/lscratch/$PBS_JOBID`, execute `myprog.x`, and copy the output file back to the submit directory under the name `$TASK.out`. The `myprog.x` executable runs on one node only and must use threads to run in parallel. Be aware that if `myprog.x` **is not multithreaded**, then all the **jobs are run as single-thread programs in a sequential manner**. Because the whole node is allocated, the accounted time equals the usage of the whole node, even though only 1/16 of the node is used.
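The flow described above can be sketched roughly as follows. This is an illustrative sketch, not the jobscript from the documentation: the helper names `pick_task` and `run_task` and the overridable `SCRATCH_BASE` and `PROG` variables are hypothetical, and the sketch assumes that `myprog.x` reads its input from standard input and writes its result to standard output. Only the tasklist file, `myprog.x`, `/lscratch/$PBS_JOBID`, and the `$TASK.out` name come from the text.

```bash
#!/bin/bash
# Sketch of the per-subjob flow. Helper names and the SCRATCH_BASE/PROG
# overrides are hypothetical; myprog.x is ASSUMED to read stdin and write stdout.

# Print line number $2 (1-based) of the tasklist file $1.
pick_task() {
    sed -n "${2}p" "$1"
}

# Stage one input file to scratch, run the program on it, copy the result back.
run_task() {
    local submit_dir="$1" index="$2"
    local scratch="${SCRATCH_BASE:-/lscratch}/${PBS_JOBID:-dryrun}"
    local prog="${PROG:-$submit_dir/myprog.x}"
    local task

    # The subjob index selects which line of the tasklist this instance handles.
    task=$(pick_task "$submit_dir/tasklist" "$index")
    [ -n "$task" ] || { echo "no task for index $index" >&2; return 1; }

    mkdir -p "$scratch" || return 1
    cp "$submit_dir/$task" "$scratch/input" || return 1
    ( cd "$scratch" && "$prog" < input > output ) || return 1
    cp "$scratch/output" "$submit_dir/$task.out"
}

# Run only when PBS has set the array index, so the file can also be sourced.
if [ -n "${PBS_ARRAY_INDEX:-}" ]; then
    run_task "${PBS_O_WORKDIR:-$PWD}" "$PBS_ARRAY_INDEX"
fi
```

Keeping the task selection in a small function makes it possible to check the mapping from index to input file outside the batch system, for example by pointing `SCRATCH_BASE` at a temporary directory and `PROG` at a stand-in program.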
If you need to run a huge number of parallel multicore jobs (that is, multinode multithreaded jobs, e.g. MPI-enabled), a job array approach should be used. The main difference compared to the previous single-node examples is that the local scratch should not be used (it is not shared between nodes) and that MPI or another technique for parallel multinode processing has to be used properly.
### Submit the Job Array
To submit the job array, use the `qsub -J` command. The 900 jobs of the [example above][5] may be submitted like this:
#### Anselm
@@ -129,7 +129,7 @@ This will only choose the lower index (9 in this example) for submitting/running
### Manage the Job Array
Check the status of the job array using the `qstat` command.
```console
$ qstat -a 12345[].dm2
@@ -142,7 +142,7 @@ Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
```
When the status is B, it means that some subjobs are already running.
Check the status of the first 100 subjobs using the `qstat` command.