Commit 0f5e5e4b authored by David Hrbáč

Typo

parent b7d172e4
@@ -70,7 +70,7 @@ cp $PBS_O_WORKDIR/$TASK input ; cp $PBS_O_WORKDIR/myprog.x .
 cp output $PBS_O_WORKDIR/$TASK.out
 ```
-In this example, the submit directory holds the 900 input files, executable myprog.x and the jobscript file. As input for each run, we take the filename of input file from created tasklist file. We copy the input file to local scratch /lscratch/$PBS_JOBID, execute the myprog.x and copy the output file back to >the submit directory, under the $TASK.out name. The myprog.x runs on one node only and must use threads to run in parallel. Be aware, that if the myprog.x **is not multithreaded**, then all the **jobs are run as single thread programs in sequential** manner. Due to allocation of the whole node, the accounted time is equal to the usage of whole node\*\*, while using only 1/16 of the node!
+In this example, the submit directory holds the 900 input files, executable myprog.x and the jobscript file. As input for each run, we take the filename of input file from created tasklist file. We copy the input file to local scratch /lscratch/$PBS_JOBID, execute the myprog.x and copy the output file back to the submit directory, under the $TASK.out name. The myprog.x runs on one node only and must use threads to run in parallel. Be aware, that if the myprog.x **is not multithreaded**, then all the **jobs are run as single thread programs in sequential** manner. Due to allocation of the whole node, the accounted time is equal to the usage of whole node, while using only 1/16 of the node!
 If huge number of parallel multicore (in means of multinode multithread, e. g. MPI enabled) jobs is needed to run, then a job array approach should also be used. The main difference compared to previous example using one node is that the local scratch should not be used (as it's not shared between nodes) and MPI or other technique for parallel multinode run has to be used properly.
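The hunk above shows only the tail of the jobscript it touches. For context, here is a minimal sketch of the single-node job array jobscript that the changed paragraph describes, assuming PBS Pro's $PBS_ARRAY_INDEX variable; the #PBS header values are illustrative placeholders, while $PBS_O_WORKDIR, /lscratch/$PBS_JOBID, the tasklist file and myprog.x come from the paragraph itself:

```
#!/bin/bash
#PBS -N JOBARRAY_EXAMPLE                      # illustrative job name
#PBS -l select=1:ncpus=16,walltime=04:00:00   # illustrative resource request

# pick this subjob's input filename from the tasklist in the submit directory
TASK=$(sed -n "${PBS_ARRAY_INDEX}p" "$PBS_O_WORKDIR/tasklist")

# run in the node-local scratch directory
cd "/lscratch/$PBS_JOBID" || exit 1

# stage in the input file and the executable, run, stage the output back
cp "$PBS_O_WORKDIR/$TASK" input
cp "$PBS_O_WORKDIR/myprog.x" .
./myprog.x < input > output
cp output "$PBS_O_WORKDIR/$TASK.out"
```

Such a script would typically be submitted as a job array, e.g. `qsub -J 1-900 jobscript`, so that each subjob receives its own $PBS_ARRAY_INDEX.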
@@ -285,7 +285,7 @@ In this example, the jobscript executes in multiple instances in parallel, on al
 When deciding this values, think about following guiding rules:
-1. Let n=N/16. Inequality (n+1) \* T &lt; W should hold. The N is number of tasks per subjob, T is expected single task walltime and W is subjob walltime. Short subjob walltime improves scheduling and job throughput.
+1. Let n=N/16. Inequality (n+1) \* T < W should hold. The N is number of tasks per subjob, T is expected single task walltime and W is subjob walltime. Short subjob walltime improves scheduling and job throughput.
 2. Number of tasks should be modulo 16.
 3. These rules are valid only when all tasks have similar task walltimes T.
...
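To make rule 1 concrete, a short worked example; the values of N and T below are hypothetical figures chosen purely for illustration:

```
# Hypothetical figures: N=960 tasks per subjob, T=20 minutes per task.
N=960
T=20                        # expected single-task walltime, in minutes
n=$(( N / 16 ))             # 16 tasks run at a time on a full node, so n=60
min_W=$(( (n + 1) * T ))    # (60+1) * 20 = 1220 minutes
echo "subjob walltime W must exceed ${min_W} minutes"
```

Rule 2's "modulo 16" then keeps every 16-core node fully occupied on each pass, so no node sits partly idle.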