Commit d018d740 authored by David Hrbáč

Merge branch '30-zacistit-please-note' into 'master'

Note clean-up

Closes #30

See merge request !70
parents 09989368 0bc1b533
Pipeline #2023 passed with stages in 1 minute and 1 second
......@@ -6,12 +6,12 @@ In many cases, it is useful to submit huge (>100+) number of computational jobs
However, executing a huge number of jobs via the PBS queue may strain the system. This strain may result in slow response to commands, inefficient scheduling, and overall degradation of performance and user experience for all users. For this reason, the number of jobs is **limited to 100 per user, 1000 per job array**.
!!! Note "Note"
Please follow one of the procedures below, in case you wish to schedule more than 100 jobs at a time.
!!! note
    Please follow one of the procedures below if you wish to schedule more than 100 jobs at a time.
- Use [Job arrays](capacity-computing/#job-arrays) when running huge number of [multithread](capacity-computing/#shared-jobscript-on-one-node) (bound to one node only) or multinode (multithread across several nodes) jobs
- Use [GNU parallel](capacity-computing/#gnu-parallel) when running single core jobs
- Combine [GNU parallel with Job arrays](capacity-computing/#job-arrays-and-gnu-parallel) when running huge number of single core jobs
* Use [Job arrays](capacity-computing/#job-arrays) when running a huge number of [multithread](capacity-computing/#shared-jobscript-on-one-node) (bound to one node only) or multinode (multithread across several nodes) jobs
* Use [GNU parallel](capacity-computing/#gnu-parallel) when running single-core jobs
* Combine [GNU parallel with Job arrays](capacity-computing/#job-arrays-and-gnu-parallel) when running a huge number of single-core jobs
## Policy
......@@ -20,14 +20,14 @@ However, executing huge number of jobs via the PBS queue may strain the system.
## Job Arrays
!!! Note "Note"
Huge number of jobs may be easily submitted and managed as a job array.
!!! note
    A huge number of jobs may be easily submitted and managed as a job array.
A job array is a compact representation of many jobs, called subjobs. The subjobs share the same job script, and have the same values for all attributes and resources, with the following exceptions:
- each subjob has a unique index, $PBS_ARRAY_INDEX
- job Identifiers of subjobs only differ by their indices
- the state of subjobs can differ (R,Q,...etc.)
* each subjob has a unique index, $PBS_ARRAY_INDEX
* job identifiers of subjobs differ only by their indices
* the state of subjobs can differ (R, Q, etc.)
All subjobs within a job array have the same scheduling priority and schedule as independent jobs. The entire job array is submitted through a single qsub command and may be managed by the qdel, qalter, qhold, qrls and qsig commands as a single job.
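To make the index mechanics concrete, here is a minimal sketch of the working part of an array jobscript; the program name, input naming scheme and the 1-100 range are illustrative placeholders, not values prescribed by this documentation.

```bash
# fragment of an array jobscript: each subjob picks its own input file
# according to its unique index and writes a matching output file
cd "$PBS_O_WORKDIR"
./myprog.x "input.$PBS_ARRAY_INDEX" > "output.$PBS_ARRAY_INDEX"
```

The whole array is then submitted once, e.g. `qsub -J 1-100 jobscript`, and managed under a single job ID.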
......@@ -149,8 +149,8 @@ Read more on job arrays in the [PBSPro Users guide](../../pbspro-documentation/)
## GNU Parallel
!!! Note "Note"
Use GNU parallel to run many single core tasks on one node.
!!! note
Use GNU parallel to run many single core tasks on one node.
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. GNU parallel is most useful for running single-core jobs via the queue system on Anselm.
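As a minimal illustration (the task list and program name are placeholders), GNU parallel can map a list of input files onto the 16 cores of a node, keeping one single-core task per core busy:

```bash
# run ./myprog.x once per line of tasklist, at most 16 tasks at a time
cat tasklist | parallel --jobs 16 ./myprog.x {}
```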
......@@ -216,17 +216,18 @@ $ qsub -N JOBNAME jobscript
In this example, we submit a job of 101 tasks. 16 input files will be processed in parallel. The 101 tasks on 16 cores are assumed to complete in less than 2 hours.
Please note the #PBS directives in the beginning of the jobscript file, dont' forget to set your valid PROJECT_ID and desired queue.
!!! hint
    Use #PBS directives at the beginning of the jobscript file; don't forget to set your valid PROJECT_ID and the desired queue.
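A sketch of such a jobscript header follows; PROJECT-ID, the queue and the resource request are placeholders that must be adapted to your project.

```bash
#!/bin/bash
#PBS -A PROJECT-ID                           # your valid accounting project
#PBS -q qprod                                # desired queue
#PBS -N JOBNAME                              # job name used in the qsub examples
#PBS -l select=1:ncpus=16,walltime=02:00:00  # resources and walltime
```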
## Job Arrays and GNU Parallel
!!! Note "Note"
Combine the Job arrays and GNU parallel for best throughput of single core jobs
!!! note
    Combine job arrays and GNU parallel for the best throughput of single-core jobs
While job arrays are able to utilize all available computational nodes, GNU parallel can be used to efficiently run multiple single-core jobs on a single node. The two approaches may be combined to utilize all available (current and future) resources to execute single-core jobs.
!!! Note "Note"
Every subjob in an array runs GNU parallel to utilize all cores on the node
!!! note
Every subjob in an array runs GNU parallel to utilize all cores on the node
### GNU Parallel, Shared jobscript
......@@ -280,8 +281,8 @@ cp output $PBS_O_WORKDIR/$TASK.out
In this example, the jobscript executes in multiple instances in parallel, on all cores of a computing node. The variable $TASK expands to one of the input filenames from tasklist. We copy the input file to local scratch, execute myprog.x, and copy the output file back to the submit directory, under the $TASK.out name. The numtasks file controls how many tasks will be run per subjob. Once a task is finished, a new task starts, until the number of tasks in the numtasks file is reached.
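The per-task flow just described can be sketched roughly as follows; the local scratch path and helper names are assumptions for illustration, and the full example elided from this diff is authoritative.

```bash
# executed by GNU parallel once per task; $1 expands to one filename from tasklist
TASK="$1"

# per-task working directory on node-local scratch (path is an assumption)
SCR=/lscratch/$PBS_JOBID/$TASK
mkdir -p "$SCR" && cd "$SCR"

cp "$PBS_O_WORKDIR/$TASK" input          # stage the input in
./myprog.x < input > output              # run the single-core task
cp output "$PBS_O_WORKDIR/$TASK.out"     # stage the output back
```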
!!! Note "Note"
Select subjob walltime and number of tasks per subjob carefully
!!! note
Select subjob walltime and number of tasks per subjob carefully
When deciding these values, consider the following guiding rules:
......@@ -300,7 +301,8 @@ $ qsub -N JOBNAME -J 1-992:32 jobscript
In this example, we submit a job array of 31 subjobs. Note the -J 1-992:**32**; this must be the same as the number written to the numtasks file. Each subjob will run on a full node and process 16 input files in parallel, 32 in total per subjob. Every subjob is assumed to complete in less than 2 hours.
Please note the #PBS directives in the beginning of the jobscript file, dont' forget to set your valid PROJECT_ID and desired queue.
!!! hint
    Use #PBS directives at the beginning of the jobscript file; don't forget to set your valid PROJECT_ID and the desired queue.
## Examples
......
......@@ -6,46 +6,46 @@ Anselm is cluster of x86-64 Intel based nodes built on Bull Extreme Computing bu
### Compute Nodes Without Accelerator
- 180 nodes
- 2880 cores in total
- two Intel Sandy Bridge E5-2665, 8-core, 2.4GHz processors per node
- 64 GB of physical memory per node
- one 500GB SATA 2,5” 7,2 krpm HDD per node
- bullx B510 blade servers
- cn[1-180]
* 180 nodes
* 2880 cores in total
* two Intel Sandy Bridge E5-2665, 8-core, 2.4GHz processors per node
* 64 GB of physical memory per node
* one 500 GB SATA 2.5” 7.2 krpm HDD per node
* bullx B510 blade servers
* cn[1-180]
### Compute Nodes With GPU Accelerator
- 23 nodes
- 368 cores in total
- two Intel Sandy Bridge E5-2470, 8-core, 2.3GHz processors per node
- 96 GB of physical memory per node
- one 500GB SATA 2,5” 7,2 krpm HDD per node
- GPU accelerator 1x NVIDIA Tesla Kepler K20 per node
- bullx B515 blade servers
- cn[181-203]
* 23 nodes
* 368 cores in total
* two Intel Sandy Bridge E5-2470, 8-core, 2.3GHz processors per node
* 96 GB of physical memory per node
* one 500 GB SATA 2.5” 7.2 krpm HDD per node
* GPU accelerator 1x NVIDIA Tesla Kepler K20 per node
* bullx B515 blade servers
* cn[181-203]
### Compute Nodes With MIC Accelerator
- 4 nodes
- 64 cores in total
- two Intel Sandy Bridge E5-2470, 8-core, 2.3GHz processors per node
- 96 GB of physical memory per node
- one 500GB SATA 2,5” 7,2 krpm HDD per node
- MIC accelerator 1x Intel Phi 5110P per node
- bullx B515 blade servers
- cn[204-207]
* 4 nodes
* 64 cores in total
* two Intel Sandy Bridge E5-2470, 8-core, 2.3GHz processors per node
* 96 GB of physical memory per node
* one 500 GB SATA 2.5” 7.2 krpm HDD per node
* MIC accelerator 1x Intel Phi 5110P per node
* bullx B515 blade servers
* cn[204-207]
### Fat Compute Nodes
- 2 nodes
- 32 cores in total
- 2 Intel Sandy Bridge E5-2665, 8-core, 2.4GHz processors per node
- 512 GB of physical memory per node
- two 300GB SAS 3,5”15krpm HDD (RAID1) per node
- two 100GB SLC SSD per node
- bullx R423-E3 servers
- cn[208-209]
* 2 nodes
* 32 cores in total
* 2 Intel Sandy Bridge E5-2665, 8-core, 2.4GHz processors per node
* 512 GB of physical memory per node
* two 300 GB SAS 3.5” 15 krpm HDD (RAID1) per node
* two 100GB SLC SSD per node
* bullx R423-E3 servers
* cn[208-209]
![](../img/bullxB510.png)
**Figure Anselm bullx B510 servers**
......@@ -65,23 +65,23 @@ Anselm is equipped with Intel Sandy Bridge processors Intel Xeon E5-2665 (nodes
### Intel Sandy Bridge E5-2665 Processor
- eight-core
- speed: 2.4 GHz, up to 3.1 GHz using Turbo Boost Technology
- peak performance: 19.2 GFLOP/s per core
- caches:
- L2: 256 KB per core
- L3: 20 MB per processor
- memory bandwidth at the level of the processor: 51.2 GB/s
* eight-core
* speed: 2.4 GHz, up to 3.1 GHz using Turbo Boost Technology
* peak performance: 19.2 GFLOP/s per core
* caches:
* L2: 256 KB per core
* L3: 20 MB per processor
* memory bandwidth at the level of the processor: 51.2 GB/s
### Intel Sandy Bridge E5-2470 Processor
- eight-core
- speed: 2.3 GHz, up to 3.1 GHz using Turbo Boost Technology
- peak performance: 18.4 GFLOP/s per core
- caches:
- L2: 256 KB per core
- L3: 20 MB per processor
- memory bandwidth at the level of the processor: 38.4 GB/s
* eight-core
* speed: 2.3 GHz, up to 3.1 GHz using Turbo Boost Technology
* peak performance: 18.4 GFLOP/s per core
* caches:
* L2: 256 KB per core
* L3: 20 MB per processor
* memory bandwidth at the level of the processor: 38.4 GB/s
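For orientation, the per-core peak and memory-bandwidth figures above can be reproduced from the Sandy Bridge AVX width of 8 double-precision FLOPs per cycle and the DDR3-1600 memory channels; the channel counts (4 for the E5-2665, 3 for the E5-2470) are an assumption inferred from the DIMM population listed further below.

```latex
\begin{aligned}
2.4\,\text{GHz} \times 8\,\text{FLOP/cycle} &= 19.2\,\text{GFLOP/s} &&\text{(E5-2665)}\\
2.3\,\text{GHz} \times 8\,\text{FLOP/cycle} &= 18.4\,\text{GFLOP/s} &&\text{(E5-2470)}\\
1600\,\text{MT/s} \times 8\,\text{B} \times 4\,\text{channels} &= 51.2\,\text{GB/s} &&\text{(E5-2665)}\\
1600\,\text{MT/s} \times 8\,\text{B} \times 3\,\text{channels} &= 38.4\,\text{GB/s} &&\text{(E5-2470)}
\end{aligned}
```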
Nodes equipped with the Intel Xeon E5-2665 CPU have the PBS resource attribute cpu_freq = 24 set; nodes equipped with the Intel Xeon E5-2470 CPU have cpu_freq = 23.
......@@ -101,30 +101,30 @@ Intel Turbo Boost Technology is used by default, you can disable it for all nod
### Compute Node Without Accelerator
- 2 sockets
- Memory Controllers are integrated into processors.
- 8 DDR3 DIMMs per node
- 4 DDR3 DIMMs per CPU
- 1 DDR3 DIMMs per channel
- Data rate support: up to 1600MT/s
- Populated memory: 8 x 8 GB DDR3 DIMM 1600 MHz
* 2 sockets
* Memory Controllers are integrated into processors.
* 8 DDR3 DIMMs per node
* 4 DDR3 DIMMs per CPU
* 1 DDR3 DIMM per channel
* Data rate support: up to 1600MT/s
* Populated memory: 8 x 8 GB DDR3 DIMM 1600 MHz
### Compute Node With GPU or MIC Accelerator
- 2 sockets
- Memory Controllers are integrated into processors.
- 6 DDR3 DIMMs per node
- 3 DDR3 DIMMs per CPU
- 1 DDR3 DIMMs per channel
- Data rate support: up to 1600MT/s
- Populated memory: 6 x 16 GB DDR3 DIMM 1600 MHz
* 2 sockets
* Memory Controllers are integrated into processors.
* 6 DDR3 DIMMs per node
* 3 DDR3 DIMMs per CPU
* 1 DDR3 DIMM per channel
* Data rate support: up to 1600MT/s
* Populated memory: 6 x 16 GB DDR3 DIMM 1600 MHz
### Fat Compute Node
- 2 sockets
- Memory Controllers are integrated into processors.
- 16 DDR3 DIMMs per node
- 8 DDR3 DIMMs per CPU
- 2 DDR3 DIMMs per channel
- Data rate support: up to 1600MT/s
- Populated memory: 16 x 32 GB DDR3 DIMM 1600 MHz
* 2 sockets
* Memory Controllers are integrated into processors.
* 16 DDR3 DIMMs per node
* 8 DDR3 DIMMs per CPU
* 2 DDR3 DIMMs per channel
* Data rate support: up to 1600MT/s
* Populated memory: 16 x 32 GB DDR3 DIMM 1600 MHz
......@@ -23,15 +23,15 @@ then
fi
```
!!! Note "Note"
Do not run commands outputting to standard output (echo, module list, etc) in .bashrc for non-interactive SSH sessions. It breaks fundamental functionality (scp, PBS) of your account! Conside utilization of SSH session interactivity for such commands as stated in the previous example.
!!! note
    Do not run commands that write to standard output (echo, module list, etc.) in .bashrc for non-interactive SSH sessions. It breaks fundamental functionality (scp, PBS) of your account! Consider testing for SSH session interactivity before running such commands, as shown in the previous example.
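A minimal sketch of such a guard in ~/.bashrc follows; the exact test used in the elided example above may differ, this variant simply checks for an interactive shell.

```bash
# only produce output (echo, module list, ...) when the shell is interactive,
# so that scp, sftp and PBS are not broken by unexpected output
if [ -n "$PS1" ]; then
    echo "Welcome on Anselm"
    module list
fi
```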
### Application Modules
In order to configure your shell for running a particular application on Anselm, we use the Module package interface.
!!! Note "Note"
The modules set up the application paths, library paths and environment variables for running particular application.
!!! note
    The modules set up the application paths, library paths and environment variables for running a particular application.
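Typical module usage looks like this; the module name below is only an illustration, list the actually available modules with `module avail`.

```bash
$ module avail              # list the modules available on Anselm
$ module load intel         # set up paths and variables for an application (example name)
$ module list               # show the currently loaded modules
```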
We also have a second modules repository, created using a tool called EasyBuild. On the Salomon cluster, all modules will be built by this tool. If you want to use software from this modules repository, please follow the instructions in the section [Application Modules Path Expansion](environment-and-modules/#EasyBuild).
......
......@@ -12,10 +12,10 @@ The cluster compute nodes cn[1-207] are organized within 13 chassis.
There are four types of compute nodes:
- 180 compute nodes without the accelerator
- 23 compute nodes with GPU accelerator - equipped with NVIDIA Tesla Kepler K20
- 4 compute nodes with MIC accelerator - equipped with Intel Xeon Phi 5110P
- 2 fat nodes - equipped with 512 GB RAM and two 100 GB SSD drives
* 180 compute nodes without the accelerator
* 23 compute nodes with GPU accelerator - equipped with NVIDIA Tesla Kepler K20
* 4 compute nodes with MIC accelerator - equipped with Intel Xeon Phi 5110P
* 2 fat nodes - equipped with 512 GB RAM and two 100 GB SSD drives
[More about Compute nodes](compute-nodes/).
......
......@@ -35,8 +35,8 @@ usage<sub>Total</sub> is total usage by all users, by all projects.
Usage counts allocated core-hours (`ncpus x walltime`). Usage is decayed, or cut in half periodically, at an interval of 168 hours (one week).
Jobs queued in the qexp queue are not counted toward the project's usage.
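Read literally, the half-life rule means that usage recorded t hours ago contributes with weight 2^(-t/168); the following is an illustrative reading of that rule, not the scheduler's exact formula.

```latex
\text{weighted usage} = 1000\ \text{core-hours} \times 2^{-336/168} = 250\ \text{core-hours}
```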
!!! Note "Note"
Calculated usage and fair-share priority can be seen at <https://extranet.it4i.cz/anselm/projects>.
!!! note
Calculated usage and fair-share priority can be seen at <https://extranet.it4i.cz/anselm/projects>.
Calculated fair-share priority can also be seen as the Resource_List.fairshare attribute of a job.
......@@ -64,7 +64,7 @@ The scheduler makes a list of jobs to run in order of execution priority. Schedu
This means that jobs with lower execution priority can be run before jobs with higher execution priority.
!!! Note "Note"
It is **very beneficial to specify the walltime** when submitting jobs.
!!! note
It is **very beneficial to specify the walltime** when submitting jobs.
Specifying a more accurate walltime enables better scheduling, better execution times and better resource usage. Jobs with a suitable (small) walltime may be backfilled and overtake job(s) with higher priority.
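For example, a job known to finish within four hours can request exactly that; the project ID, node count and jobscript name are placeholders.

```bash
$ qsub -A PROJECT-ID -q qprod -l select=2:ncpus=16 -l walltime=04:00:00 ./jobscript
```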
......@@ -11,7 +11,7 @@ When allocating computational resources for the job, please specify
5. Project ID
6. Jobscript or interactive switch
!!! Note "Note"
!!! note
Use the **qsub** command to submit your job to a queue for allocation of the computational resources.
Submit the job using the qsub command:
......@@ -132,7 +132,7 @@ Although this example is somewhat artificial, it demonstrates the flexibility of
## Job Management
!!! Note "Note"
!!! note
Check status of your jobs using the **qstat** and **check-pbs-jobs** commands
```bash
......@@ -213,7 +213,7 @@ Run loop 3
In this example, we see actual output (some iteration loops) of the job 35141.dm2
!!! Note "Note"
!!! note
Manage your queued or running jobs, using the **qhold**, **qrls**, **qdel**, **qsig** or **qalter** commands
You may release your allocation at any time, using the qdel command.
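For example, reusing the job ID shown above:

```bash
$ qdel 35141.dm2
```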
......@@ -238,12 +238,12 @@ $ man pbs_professional
### Jobscript
!!! Note "Note"
!!! note
Prepare the jobscript to run batch jobs in the PBS queue system
The jobscript is a user-made script controlling the sequence of commands for executing the calculation. It is often written in bash; other scripting languages may be used as well. The jobscript is supplied to the PBS **qsub** command as an argument and executed by the PBS Professional workload manager.
!!! Note "Note"
!!! note
The jobscript or interactive shell is executed on first of the allocated nodes.
```bash
......@@ -273,7 +273,7 @@ $ pwd
In this example, 4 nodes were allocated interactively for 1 hour via the qexp queue. The interactive shell is executed in the home directory.
!!! Note "Note"
!!! note
    All nodes within the allocation may be accessed via ssh. Unallocated nodes are not accessible to the user.
The allocated nodes are accessible via ssh from login nodes. The nodes may access each other via ssh as well.
......@@ -305,7 +305,7 @@ In this example, the hostname program is executed via pdsh from the interactive
### Example Jobscript for MPI Calculation
!!! Note "Note"
!!! note
Production jobs must use the /scratch directory for I/O
The recommended way to run production jobs is to change to the /scratch directory early in the jobscript, copy all inputs to /scratch, execute the calculations, and copy the outputs to the home directory.
......@@ -337,12 +337,12 @@ exit
In this example, a directory on /home holds the input file input and the executable mympiprog.x. We create a directory myjob on the /scratch filesystem, copy the input and executable files from the /home directory where qsub was invoked ($PBS_O_WORKDIR) to /scratch, execute the MPI program mympiprog.x, and copy the output file back to the /home directory. The mympiprog.x is executed as one process per node, on all allocated nodes.
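A rough sketch of that flow follows; the scratch path, the MPI module name and the mpirun invocation are assumptions, and the full example elided from this diff is authoritative.

```bash
#!/bin/bash
# per-job working directory on the shared scratch (path is an assumption)
SCRDIR=/scratch/$USER/myjob
mkdir -p "$SCRDIR" && cd "$SCRDIR"

# stage the input and the executable in from the submit directory
cp "$PBS_O_WORKDIR/input" .
cp "$PBS_O_WORKDIR/mympiprog.x" .

# run the MPI program; one process per node is controlled by the
# mpiprocs qsub option mentioned below (module name is an assumption)
module load OpenMPI
mpirun ./mympiprog.x

# stage the output back
cp output "$PBS_O_WORKDIR/."
exit
```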
!!! Note "Note"
!!! note
Consider preloading inputs and executables onto [shared scratch](storage/) before the calculation starts.
In some cases, it may be impractical to copy the inputs to scratch and the outputs to home. This is especially true when very large input and output files are expected, or when the files should be reused by a subsequent calculation. In such a case, it is the user's responsibility to preload the input files on shared /scratch before job submission and to retrieve the outputs manually after all calculations are finished.
!!! Note "Note"
!!! note
Store the qsub options within the jobscript. Use **mpiprocs** and **ompthreads** qsub options to control the MPI job execution.
Example jobscript for an MPI job with preloaded inputs and executables; options for qsub are stored within the script:
......@@ -375,7 +375,7 @@ sections.
### Example Jobscript for Single Node Calculation
!!! Note "Note"
!!! note
    The local scratch directory is often useful for single-node jobs. Local scratch will be deleted immediately after the job ends.
Example jobscript for single node calculation, using [local scratch](storage/) on the node:
......
......@@ -8,8 +8,8 @@ All compute and login nodes of Anselm are interconnected by a high-bandwidth, lo
The compute nodes may be accessed via the InfiniBand network using the ib0 network interface, in the address range 10.2.1.1-209. MPI may be used to establish a native InfiniBand connection among the nodes.
!!! Note "Note"
The network provides **2170 MB/s** transfer rates via the TCP connection (single stream) and up to **3600 MB/s** via native InfiniBand protocol.
!!! note
The network provides **2170 MB/s** transfer rates via the TCP connection (single stream) and up to **3600 MB/s** via native InfiniBand protocol.
The Fat tree topology ensures that peak transfer rates are achieved between any two nodes, independent of network traffic exchanged among other nodes concurrently.
......
......@@ -28,11 +28,11 @@ The user will need a valid certificate and to be present in the PRACE LDAP (plea
Most of the information needed by PRACE users accessing the Anselm TIER-1 system can be found here:
- [General user's FAQ](http://www.prace-ri.eu/Users-General-FAQs)
- [Certificates FAQ](http://www.prace-ri.eu/Certificates-FAQ)
- [Interactive access using GSISSH](http://www.prace-ri.eu/Interactive-Access-Using-gsissh)
- [Data transfer with GridFTP](http://www.prace-ri.eu/Data-Transfer-with-GridFTP-Details)
- [Data transfer with gtransfer](http://www.prace-ri.eu/Data-Transfer-with-gtransfer)
* [General user's FAQ](http://www.prace-ri.eu/Users-General-FAQs)
* [Certificates FAQ](http://www.prace-ri.eu/Certificates-FAQ)
* [Interactive access using GSISSH](http://www.prace-ri.eu/Interactive-Access-Using-gsissh)
* [Data transfer with GridFTP](http://www.prace-ri.eu/Data-Transfer-with-GridFTP-Details)
* [Data transfer with gtransfer](http://www.prace-ri.eu/Data-Transfer-with-gtransfer)
Before you start to use any of the services, don't forget to create a proxy certificate from your certificate:
......@@ -233,9 +233,12 @@ The resources that are currently subject to accounting are the core hours. The c
PRACE users should check their project accounting using the [PRACE Accounting Tool (DART)](http://www.prace-ri.eu/accounting-report-tool/).
Users who have undergone the full local registration procedure (including signing the IT4Innovations Acceptable Use Policy) and who have received local password may check at any time, how many core-hours have been consumed by themselves and their projects using the command "it4ifree". Please note that you need to know your user password to use the command and that the displayed core hours are "system core hours" which differ from PRACE "standardized core hours".
Users who have undergone the full local registration procedure (including signing the IT4Innovations Acceptable Use Policy) and who have received a local password may check at any time how many core-hours have been consumed by themselves and their projects using the command "it4ifree".
!!! Note "Note"
!!! note
You need to know your user password to use the command. Displayed core hours are "system core hours" which differ from PRACE "standardized core hours".
!!! hint
The **it4ifree** command is a part of it4i.portal.clients package, located here: <https://pypi.python.org/pypi/it4i.portal.clients>
```bash
......
......@@ -41,7 +41,7 @@ Please [follow the documentation](shell-and-data-access/).
To have the OpenGL acceleration, **24 bit color depth must be used**. Otherwise only the geometry (desktop size) definition is needed.
!!! Hint
!!! hint
    On the first VNC server run, you need to define a password.
This example defines a desktop with dimensions of 1200x700 pixels and 24-bit color depth.
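With a TigerVNC/TightVNC-style vncserver, such a desktop would be requested roughly as follows; this is a sketch, the display number is arbitrary and the exact command in the elided example may differ.

```bash
$ vncserver :1 -geometry 1200x700 -depth 24
```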
......@@ -138,7 +138,7 @@ qviz**. The queue has following properties:
Currently when accessing the node, each user gets 4 cores of a CPU allocated, thus approximately 16 GB of RAM and 1/4 of the GPU capacity.
!!! Note
!!! note
    If more GPU power or RAM is required, it is recommended to allocate one whole node per user, so that all 16 cores, the whole RAM and the whole GPU are exclusive. This is currently also the maximum allowed allocation per user. One hour of work is allocated by default; the user may ask for 2 hours maximum.
To access the visualization node, follow these steps:
......@@ -192,7 +192,7 @@ $ module load virtualgl/2.4
$ vglrun glxgears
```
Please note, that if you want to run an OpenGL application which is vailable through modules, you need at first load the respective module. . g. to run the **Mentat** OpenGL application from **MARC** software ackage use:
If you want to run an OpenGL application which is available through modules, you first need to load the respective module. E.g. to run the **Mentat** OpenGL application from the **MARC** software package, use:
```bash
$ module load marc/2013.1
......
......@@ -6,21 +6,21 @@ To run a [job](../introduction/), [computational resources](../introduction/) fo
The resources are allocated to the job in a fair-share fashion, subject to constraints set by the queue and the resources available to the Project. [The Fair-share](job-priority/) at Anselm ensures that individual users may consume approximately equal amounts of resources per week. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following queues are available to Anselm users:
- **qexp**, the Express queue
- **qprod**, the Production queue
- **qlong**, the Long queue, regula
- **qnvidia**, **qmic**, **qfat**, the Dedicated queues
- **qfree**, the Free resource utilization queue
* **qexp**, the Express queue
* **qprod**, the Production queue
* **qlong**, the Long queue
* **qnvidia**, **qmic**, **qfat**, the Dedicated queues
* **qfree**, the Free resource utilization queue
!!! Note "Note"
Check the queue status at <https://extranet.it4i.cz/anselm/>
!!! note
Check the queue status at <https://extranet.it4i.cz/anselm/>
Read more on the [Resource Allocation Policy](resources-allocation-policy/) page.
## Job Submission and Execution
!!! Note "Note"
Use the **qsub** command to submit your jobs.
!!! note
Use the **qsub** command to submit your jobs.
The qsub command submits the job into the queue and creates a request to the PBS Job manager for allocation of the specified resources. The **smallest allocation unit is an entire node, 16 cores**, with the exception of the qexp queue. The resources will be allocated when available, subject to allocation policies and constraints. **After the resources are allocated, the jobscript or interactive shell is executed on the first of the allocated nodes.**
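Two hedged submission sketches follow; PROJECT-ID, the node counts and the jobscript name are placeholders.

```bash
# batch submission of a jobscript on four whole nodes
$ qsub -A PROJECT-ID -q qprod -l select=4:ncpus=16 ./jobscript

# interactive shell on one whole node (qexp does not require a project)
$ qsub -q qexp -l select=1:ncpus=16 -I
```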
......@@ -28,8 +28,8 @@ Read more on the [Job submission and execution](job-submission-and-execution/) p
## Capacity Computing
!!! Note "Note"
Use Job arrays when running huge number of jobs.
!!! note
Use Job arrays when running huge number of jobs.
Use GNU Parallel and/or Job arrays when running (many) single core jobs.
......
......@@ -4,7 +4,7 @@
The resources are allocated to the job in a fair-share fashion, subject to constraints set by the queue and the resources available to the Project. The Fair-share at Anselm ensures that individual users may consume approximately equal amounts of resources per week. Detailed information can be found in the [Job scheduling](job-priority/) section. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following table provides the queue partitioning overview:
!!! Note "Note"
!!! note
Check the queue status at <https://extranet.it4i.cz/anselm/>
| queue | active project | project resources | nodes | min ncpus | priority | authorization | walltime |
......@@ -15,16 +15,16 @@ The resources are allocated to the job in a fair-share fashion, subject to const
| qnvidia, qmic, qfat | yes | 0 | 23 total qnvidia, 4 total qmic, 2 total qfat | 16 | 200 | yes | 24/48 h |
| qfree | yes | none required | 178 w/o accelerator | 16 | -1024 | no | 12 h |
!!! Note "Note"
!!! note
    **The qfree queue is not free of charge**. [Normal accounting](#resources-accounting-policy) applies. However, it allows for utilization of free resources, once a Project has exhausted all its allocated computational resources. This does not apply to Director's Discretion projects (DD projects) by default. Usage of qfree after exhaustion of a DD project's computational resources is allowed upon request for this queue.
**The qexp queue is equipped with nodes that do not all have the same CPU clock speed.** Should you need the very same CPU speed, you have to select the proper nodes during the PBS job submission.
- **qexp**, the Express queue: This queue is dedicated for testing and running very small jobs. It is not required to specify a project to enter the qexp. There are 2 nodes always reserved for this queue (w/o accelerator), maximum 8 nodes are available via the qexp for a particular user, from a pool of nodes containing Nvidia accelerated nodes (cn181-203), MIC accelerated nodes (cn204-207) and Fat nodes with 512GB RAM (cn208-209). This enables to test and tune also accelerated code or code with higher RAM requirements. The nodes may be allocated on per core basis. No special authorization is required to use it. The maximum runtime in qexp is 1 hour.
- **qprod**, the Production queue: This queue is intended for normal production runs. It is required that active project with nonzero remaining resources is specified to enter the qprod. All nodes may be accessed via the qprod queue, except the reserved ones. 178 nodes without accelerator are included. Full nodes, 16 cores per node are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qprod is 48 hours.
- **qlong**, the Long queue: This queue is intended for long production runs. It is required that active project with nonzero remaining resources is specified to enter the qlong. Only 60 nodes without acceleration may be accessed via the qlong queue. Full nodes, 16 cores per node are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qlong is 144 hours (three times of the standard qprod time - 3 x 48 h).
- **qnvidia**, qmic, qfat, the Dedicated queues: The queue qnvidia is dedicated to access the Nvidia accelerated nodes, the qmic to access MIC nodes and qfat the Fat nodes. It is required that active project with nonzero remaining resources is specified to enter these queues. 23 nvidia, 4 mic and 2 fat nodes are included. Full nodes, 16 cores per node are allocated. The queues run with very high priority, the jobs will be scheduled before the jobs coming from the qexp queue. An PI needs explicitly ask [support](https://support.it4i.cz/rt/) for authorization to enter the dedicated queues for all users associated to her/his Project.
- **qfree**, The Free resource queue: The queue qfree is intended for utilization of free resources, after a Project exhausted all its allocated computational resources (Does not apply to DD projects by default. DD projects have to request for persmission on qfree after exhaustion of computational resources.). It is required that active project is specified to enter the queue, however no remaining resources are required. Consumed resources will be accounted to the Project. Only 178 nodes without accelerator may be accessed from this queue. Full nodes, 16 cores per node are allocated. The queue runs with very low priority and no special authorization is required to use it. The maximum runtime in qfree is 12 hours.
* **qexp**, the Express queue: This queue is dedicated to testing and running very small jobs. It is not required to specify a project to enter the qexp. There are 2 nodes always reserved for this queue (w/o accelerator), and a maximum of 8 nodes is available via the qexp for a particular user, from a pool of nodes containing Nvidia accelerated nodes (cn181-203), MIC accelerated nodes (cn204-207) and Fat nodes with 512GB RAM (cn208-209). This also enables testing and tuning of accelerated code or code with higher RAM requirements. The nodes may be allocated on a per-core basis. No special authorization is required to use it. The maximum runtime in qexp is 1 hour.
* **qprod**, the Production queue: This queue is intended for normal production runs. It is required that active project with nonzero remaining resources is specified to enter the qprod. All nodes may be accessed via the qprod queue, except the reserved ones. 178 nodes without accelerator are included. Full nodes, 16 cores per node are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qprod is 48 hours.
* **qlong**, the Long queue: This queue is intended for long production runs. It is required that active project with nonzero remaining resources is specified to enter the qlong. Only 60 nodes without acceleration may be accessed via the qlong queue. Full nodes, 16 cores per node are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qlong is 144 hours (three times of the standard qprod time - 3 x 48 h).
* **qnvidia**, qmic, qfat, the Dedicated queues: The queue qnvidia is dedicated to accessing the Nvidia accelerated nodes, the qmic to accessing MIC nodes, and qfat the Fat nodes. It is required that an active project with nonzero remaining resources is specified to enter these queues. 23 nvidia, 4 mic and 2 fat nodes are included. Full nodes, 16 cores per node, are allocated. The queues run with very high priority; the jobs will be scheduled before the jobs coming from the qexp queue. A PI needs to explicitly ask [support](https://support.it4i.cz/rt/) for authorization to enter the dedicated queues for all users associated with her/his Project.
* **qfree**, the Free resource queue: The queue qfree is intended for utilization of free resources, after a Project has exhausted all its allocated computational resources (this does not apply to DD projects by default; DD projects have to request permission to use qfree after exhaustion of computational resources). It is required that an active project is specified to enter the queue, however no remaining resources are required. Consumed resources will be accounted to the Project. Only 178 nodes without accelerator may be accessed from this queue. Full nodes, 16 cores per node, are allocated. The queue runs with very low priority and no special authorization is required to use it. The maximum runtime in qfree is 12 hours.
### Notes
......@@ -113,7 +113,7 @@ The resources that are currently subject to accounting are the core-hours. The c
### Check Consumed Resources
!!! Note "Note"
!!! note
The **it4ifree** command is a part of it4i.portal.clients package, located here: <https://pypi.python.org/pypi/it4i.portal.clients>
Users may check at any time how many core-hours have been consumed by themselves and their projects. The command is available on the clusters' login nodes.
......
......@@ -53,7 +53,7 @@ Last login: Tue Jul 9 15:57:38 2013 from your-host.example.com
Example to the cluster login:
!!! Note "Note"
!!! note
The environment is **not** shared between login nodes, except for [shared filesystems](storage/#shared-filesystems).
## Data Transfer
......@@ -69,14 +69,14 @@ Data in and out of the system may be transferred by the [scp](http://en.wikipedi
The authentication is by the [private key](../get-started-with-it4innovations/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys/)
!!! Note "Note"
!!! note
Data transfer rates up to **160MB/s** can be achieved with scp or sftp.
1TB may be transferred in 1:50h.
To achieve 160MB/s transfer rates, the end user must be connected by a 10G line all the way to IT4Innovations and use a computer with a fast processor for the transfer. Using a Gigabit Ethernet connection, up to 110MB/s may be expected. A fast cipher (aes128-ctr) should be used.
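A hedged example of forcing the fast cipher with scp; the username, hostname and target path are placeholders.

```bash
$ scp -c aes128-ctr large-datafile username@anselm.it4i.cz:/scratch/username/
```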
!!! Note "Note"
!!! note
If you experience degraded data transfer performance, consult your local network provider.
On Linux or Mac, use an scp or sftp client to transfer the data to Anselm:
......@@ -126,7 +126,7 @@ Outgoing connections, from Anselm Cluster login nodes to the outside world, are
| 443 | https |
| 9418 | git |
!!! Note "Note"
!!! note
Please use **ssh port forwarding** and proxy servers to connect from Anselm to all other remote ports.
Outgoing connections from Anselm Cluster compute nodes are restricted to the internal network. Direct connections from compute nodes to the outside world are cut.
......@@ -135,7 +135,7 @@ Outgoing connections, from Anselm Cluster compute nodes are restricted to the in
### Port Forwarding From Login Nodes
!!! Note "Note"
!!! note
    Port forwarding allows an application running on Anselm to connect to an arbitrary remote host and port.
It works by tunneling the connection from Anselm back to the user's workstation and forwarding from the workstation to the remote host.
......@@ -177,7 +177,7 @@ In this example, we assume that port forwarding from login1:6000 to remote.host.
Port forwarding is static; each single port is mapped to a particular port on the remote host. Connection to another remote host requires a new forward.
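Using the numbers from the example above (login1:6000 forwarded to remote.host.com:1234), such a static forward could be created from the workstation roughly like this; the documented procedure may use a different form, and the username and hostname are placeholders.

```bash
# run on your workstation: remote-forward port 6000 on the Anselm login node
# back through the workstation and on to remote.host.com:1234
$ ssh -R 6000:remote.host.com:1234 username@anselm.it4i.cz
```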