Commit 2172f360 authored by Lukáš Krupčík

update

parent e4dc6645
Pipeline #1882 passed with stages
in 1 minute and 7 seconds
......@@ -7,11 +7,11 @@ In many cases, it is useful to submit huge (>100+) number of computational jobs
However, executing a huge number of jobs via the PBS queue may strain the system. This strain may result in slow response to commands, inefficient scheduling, and overall degradation of performance and user experience for all users. For this reason, the number of jobs is **limited to 100 per user, 1000 per job array**.
!!! Note
Please follow one of the procedures below in case you wish to schedule more than 100 jobs at a time.
* Use [Job arrays](capacity-computing/#job-arrays) when running a huge number of [multithread](capacity-computing/#shared-jobscript-on-one-node) (bound to one node only) or multinode (multithread across several nodes) jobs
* Use [GNU parallel](capacity-computing/#gnu-parallel) when running single core jobs
* Combine [GNU parallel with Job arrays](capacity-computing/#job-arrays-and-gnu-parallel) when running a huge number of single core jobs
## Policy
......@@ -21,13 +21,13 @@ However, executing huge number of jobs via the PBS queue may strain the system.
## Job Arrays
!!! Note
A huge number of jobs may be easily submitted and managed as a job array.
A job array is a compact representation of many jobs, called subjobs. The subjobs share the same job script, and have the same values for all attributes and resources, with the following exceptions:
* each subjob has a unique index, $PBS_ARRAY_INDEX
* job identifiers of subjobs differ only in their indices
* the state of subjobs can differ (R, Q, etc.)
All subjobs within a job array have the same scheduling priority and schedule as independent jobs. The entire job array is submitted through a single qsub command and may be managed by the qdel, qalter, qhold, qrls and qsig commands as a single job.
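For illustration, a minimal job-array jobscript sketch is shown below; the project ID, queue, resource values and program name are placeholders, and the exact jobscripts used on Anselm are given later in this chapter:

```bash
#!/bin/bash
#PBS -A PROJECT-ID
#PBS -q qprod
#PBS -l select=1:ncpus=16,walltime=02:00:00

# each subjob selects its own input line from the tasklist using its unique index
cd "$PBS_O_WORKDIR" || exit
TASK=$(sed -n "${PBS_ARRAY_INDEX}p" tasklist)
./myprog.x "$TASK"
```

Such a jobscript would then be submitted as a single array of, for example, 900 subjobs with `qsub -J 1-900 jobscript`.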
......@@ -39,7 +39,7 @@ Example:
Assume we have 900 input files with names beginning with "file" (e.g. file001, ..., file900). Assume we would like to use each of these input files with the program executable myprog.x, each as a separate job.
First, we create a tasklist file (or subjobs list), listing all tasks (subjobs) - all input files in our example:
```bash
$ find . -name 'file*' > tasklist
......@@ -103,8 +103,8 @@ $ qstat -a 12345[].dm2
dm2:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -- |---|---| ------ --- --- ------ ----- - -----
12345[].dm2 user2 qprod xx 13516 1 16 -- 00:50 B 00:02
```
The status B means that some subjobs are already running.
......@@ -116,14 +116,14 @@ $ qstat -a 12345[1-100].dm2
dm2:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -- |---|---| ------ --- --- ------ ----- - -----
12345[1].dm2 user2 qprod xx 13516 1 16 -- 00:50 R 00:02
12345[2].dm2 user2 qprod xx 13516 1 16 -- 00:50 R 00:02
12345[3].dm2 user2 qprod xx 13516 1 16 -- 00:50 R 00:01
12345[4].dm2 user2 qprod xx 13516 1 16 -- 00:50 Q --
. . . . . . . . . . .
12345[100].dm2 user2 qprod xx 13516 1 16 -- 00:50 Q --
```
Delete the entire job array. Running subjobs will be killed, queued subjobs will be deleted.
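For example, assuming the job array ID from the listing above:

```bash
$ qdel 12345[].dm2
```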
......@@ -150,7 +150,7 @@ Read more on job arrays in the [PBSPro Users guide](../../pbspro-documentation/)
## GNU Parallel
!!! Note
Use GNU parallel to run many single core tasks on one node.
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. GNU parallel is most useful in running single core jobs via the queue system on Anselm.
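As a minimal sketch (assuming a tasklist file and a myprog.x executable like those in the example below), GNU parallel can keep all 16 cores of a node busy with one task per input line:

```bash
# run myprog.x once for every line of tasklist, at most 16 tasks at a time
$ parallel -j 16 ./myprog.x {} :::: tasklist
```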
......@@ -169,7 +169,7 @@ Example:
Assume we have 101 input files with names beginning with "file" (e.g. file001, ..., file101). Assume we would like to use each of these input files with the program executable myprog.x, each as a separate single core job. We call these single core jobs tasks.
First, we create a tasklist file, listing all tasks - all input files in our example:
```bash
$ find . -name 'file*' > tasklist
......@@ -237,7 +237,7 @@ Example:
Assume we have 992 input files with names beginning with "file" (e.g. file001, ..., file992). Assume we would like to use each of these input files with the program executable myprog.x, each as a separate single core job. We call these single core jobs tasks.
First, we create a tasklist file, listing all tasks - all input files in our example:
```bash
$ find . -name 'file*' > tasklist
......@@ -265,7 +265,7 @@ SCR=/lscratch/$PBS_JOBID/$PARALLEL_SEQ
mkdir -p $SCR ; cd $SCR || exit
# get individual task from tasklist with index from PBS JOB ARRAY and index from Parallel
IDX=$(($PBS_ARRAY_INDEX + $PARALLEL_SEQ - 1))
TASK=$(sed -n "${IDX}p" $PBS_O_WORKDIR/tasklist)
[ -z "$TASK" ] && exit
......
......@@ -6,46 +6,46 @@ Anselm is cluster of x86-64 Intel based nodes built on Bull Extreme Computing bu
### Compute Nodes Without Accelerator
* 180 nodes
* 2880 cores in total
* two Intel Sandy Bridge E5-2665, 8-core, 2.4 GHz processors per node
* 64 GB of physical memory per node
* one 500 GB SATA 2.5" 7.2 krpm HDD per node
* bullx B510 blade servers
* cn[1-180]
### Compute Nodes With GPU Accelerator
* 23 nodes
* 368 cores in total
* two Intel Sandy Bridge E5-2470, 8-core, 2.3 GHz processors per node
* 96 GB of physical memory per node
* one 500 GB SATA 2.5" 7.2 krpm HDD per node
* GPU accelerator: 1x NVIDIA Tesla Kepler K20 per node
* bullx B515 blade servers
* cn[181-203]
### Compute Nodes With MIC Accelerator
* 4 nodes
* 64 cores in total
* two Intel Sandy Bridge E5-2470, 8-core, 2.3 GHz processors per node
* 96 GB of physical memory per node
* one 500 GB SATA 2.5" 7.2 krpm HDD per node
* MIC accelerator: 1x Intel Xeon Phi 5110P per node
* bullx B515 blade servers
* cn[204-207]
### Fat Compute Nodes
* 2 nodes
* 32 cores in total
* two Intel Sandy Bridge E5-2665, 8-core, 2.4 GHz processors per node
* 512 GB of physical memory per node
* two 300 GB SAS 3.5" 15 krpm HDDs (RAID1) per node
* two 100 GB SLC SSDs per node
* bullx R423-E3 servers
* cn[208-209]
![](../img/bullxB510.png)
**Figure Anselm bullx B510 servers**
......@@ -53,7 +53,7 @@ Anselm is cluster of x86-64 Intel based nodes built on Bull Extreme Computing bu
### Compute Nodes Summary
| Node type | Count | Range | Memory | Cores | [Access](resources-allocation-policy/) |
| -------------------------- | ----- | ----------- | ------ | ----------- | -------------------------------------- |
| Nodes without accelerator | 180 | cn[1-180] | 64GB | 16 @ 2.4 GHz | qexp, qprod, qlong, qfree |
| Nodes with GPU accelerator | 23 | cn[181-203] | 96GB | 16 @ 2.3 GHz | qgpu, qprod |
| Nodes with MIC accelerator | 4 | cn[204-207] | 96GB | 16 @ 2.3 GHz | qmic, qprod |
......@@ -65,23 +65,23 @@ Anselm is equipped with Intel Sandy Bridge processors Intel Xeon E5-2665 (nodes
### Intel Sandy Bridge E5-2665 Processor
* eight-core
* speed: 2.4 GHz, up to 3.1 GHz using Turbo Boost Technology
* peak performance: 19.2 GFLOP/s per core
* caches:
  * L2: 256 KB per core
  * L3: 20 MB per processor
* memory bandwidth at the level of the processor: 51.2 GB/s
### Intel Sandy Bridge E5-2470 Processor
* eight-core
* speed: 2.3 GHz, up to 3.1 GHz using Turbo Boost Technology
* peak performance: 18.4 GFLOP/s per core
* caches:
  * L2: 256 KB per core
  * L3: 20 MB per processor
* memory bandwidth at the level of the processor: 38.4 GB/s
Nodes equipped with the Intel Xeon E5-2665 CPU have the PBS resource attribute cpu_freq = 24; nodes equipped with the Intel Xeon E5-2470 CPU have cpu_freq = 23.
......@@ -101,30 +101,30 @@ Intel Turbo Boost Technology is used by default, you can disable it for all nod
### Compute Node Without Accelerator
* 2 sockets
* Memory controllers are integrated into processors.
* 8 DDR3 DIMMs per node
* 4 DDR3 DIMMs per CPU
* 1 DDR3 DIMM per channel
* Data rate support: up to 1600 MT/s
* Populated memory: 8 x 8 GB DDR3 DIMM 1600 MHz
### Compute Node With GPU or MIC Accelerator
* 2 sockets
* Memory controllers are integrated into processors.
* 6 DDR3 DIMMs per node
* 3 DDR3 DIMMs per CPU
* 1 DDR3 DIMM per channel
* Data rate support: up to 1600 MT/s
* Populated memory: 6 x 16 GB DDR3 DIMM 1600 MHz
### Fat Compute Node
* 2 sockets
* Memory controllers are integrated into processors.
* 16 DDR3 DIMMs per node
* 8 DDR3 DIMMs per CPU
* 2 DDR3 DIMMs per channel
* Data rate support: up to 1600 MT/s
* Populated memory: 16 x 32 GB DDR3 DIMM 1600 MHz
......@@ -16,7 +16,7 @@ fi
alias qs='qstat -a'
module load PrgEnv-gnu
# Display information to standard output - only in interactive ssh session
if [ -n "$SSH_TTY" ]
then
module list # Display loaded modules
......@@ -24,14 +24,14 @@ fi
```
!!! Note
Do not run commands outputting to standard output (echo, module list, etc.) in .bashrc for non-interactive SSH sessions. It breaks fundamental functionality (scp, PBS) of your account! Consider utilization of SSH session interactivity for such commands as stated in the previous example.
### Application Modules
In order to configure your shell for running a particular application on Anselm, we use the Module package interface.
!!! Note
The modules set up the application paths, library paths and environment variables for running a particular application.
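A typical session might look like the following; the module name is illustrative, check `module avail` for the names actually installed:

```bash
$ module avail          # list all available application modules
$ module load octave    # set up paths and environment for the selected application
$ module list           # show currently loaded modules
```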
We also have a second modules repository, created using a tool called EasyBuild. On the Salomon cluster, all modules will be built by this tool. If you want to use software from this modules repository, please follow the instructions in section [Application Modules Path Expansion](environment-and-modules/#EasyBuild).
......
......@@ -12,10 +12,10 @@ The cluster compute nodes cn[1-207] are organized within 13 chassis.
There are four types of compute nodes:
* 180 compute nodes without an accelerator
* 23 compute nodes with GPU accelerator - equipped with NVIDIA Tesla Kepler K20
* 4 compute nodes with MIC accelerator - equipped with Intel Xeon Phi 5110P
* 2 fat nodes - equipped with 512 GB RAM and two 100 GB SSD drives
[More about Compute nodes](compute-nodes/).
......@@ -31,7 +31,7 @@ The user access to the Anselm cluster is provided by two login nodes login1, log
The parameters are summarized in the following tables:
| **In general** | |
| ------------------------------------------- | -------------------------------------------- |
| Primary purpose | High Performance Computing |
| Architecture of compute nodes | x86-64 |
| Operating system | Linux |
......@@ -39,7 +39,7 @@ The parameters are summarized in the following tables:
| Total | 209 |
| Processor cores | 16 (2 x 8 cores) |
| RAM | min. 64 GB, min. 4 GB per core |
| Local disk drive | yes - usually 500 GB |
| Compute network | InfiniBand QDR, fully non-blocking, fat-tree |
| w/o accelerator | 180, cn[1-180] |
| GPU accelerated | 23, cn[181-203] |
......@@ -51,10 +51,10 @@ The parameters are summarized in the following tables:
| Total amount of RAM | 15.136 TB |
| Node | Processor | Memory | Accelerator |
| ---------------- | --------------------------------------- | ------ | -------------------- |
| w/o accelerator | 2 x Intel Sandy Bridge E5-2665, 2.4 GHz | 64 GB | - |
| GPU accelerated | 2 x Intel Sandy Bridge E5-2470, 2.3 GHz | 96 GB | NVIDIA Kepler K20 |
| MIC accelerated | 2 x Intel Sandy Bridge E5-2470, 2.3 GHz | 96 GB | Intel Xeon Phi 5110P |
| Fat compute node | 2 x Intel Sandy Bridge E5-2665, 2.4 GHz | 512 GB | - |
For more details please refer to the [Compute nodes](compute-nodes/), [Storage](storage/), and [Network](network/).
......@@ -36,7 +36,7 @@ Usage counts allocated core-hours (`ncpus x walltime`). Usage is decayed, or cut
Jobs queued in the qexp queue are not counted into the project's usage.
!!! Note
Calculated usage and fair-share priority can be seen at <https://extranet.it4i.cz/anselm/projects>.
Calculated fair-share priority can also be seen as the Resource_List.fairshare attribute of a job.
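For example, the attribute of a queued or running job can be inspected as follows (the job ID is illustrative):

```bash
$ qstat -f 12345.srv11 | grep fairshare
```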
......@@ -65,6 +65,6 @@ The scheduler makes a list of jobs to run in order of execution priority. Schedu
This means that jobs with lower execution priority can be run before jobs with higher execution priority.
!!! Note
It is **very beneficial to specify the walltime** when submitting jobs.
Specifying more accurate walltime enables better scheduling, better execution times and better resource usage. Jobs with suitable (small) walltime could be backfilled - and overtake job(s) with higher priority.
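For example, a job known to finish within two hours can request exactly that, making it a good backfill candidate (the project ID and jobscript are placeholders):

```bash
$ qsub -A PROJECT-ID -q qprod -l select=4:ncpus=16 -l walltime=02:00:00 ./myjob
```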
......@@ -77,7 +77,7 @@ In this example, we allocate nodes cn171 and cn172, all 16 cores per node, for 2
Nodes equipped with the Intel Xeon E5-2665 CPU have a base clock frequency of 2.4 GHz, nodes equipped with the Intel Xeon E5-2470 CPU have a base frequency of 2.3 GHz (see section Compute Nodes for details). Nodes may be selected via the PBS resource attribute cpu_freq.
| CPU Type | base freq. | Nodes | cpu_freq attribute |
| ------------------ | ---------- | ---------------------- | ------------------ |
| Intel Xeon E5-2665 | 2.4GHz | cn[1-180], cn[208-209] | 24 |
| Intel Xeon E5-2470 | 2.3GHz | cn[181-207] | 23 |
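For example, to request only the 2.4 GHz nodes (the project ID and jobscript are placeholders):

```bash
$ qsub -A PROJECT-ID -q qprod -l select=4:ncpus=16:cpu_freq=24 ./myjob
```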
......@@ -150,10 +150,10 @@ $ qstat -a
srv11:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -- |---|---| ------ --- --- ------ ----- - -----
16287.srv11 user1 qlong job1 6183 4 64 -- 144:0 R 38:25
16468.srv11 user1 qlong job2 8060 4 64 -- 144:0 R 17:44
16547.srv11 user2 qprod job3x 13516 2 32 -- 48:00 R 00:58
```
In this example user1 and user2 are running jobs named job1, job2 and job3x. The jobs job1 and job2 are using 4 nodes, 16 cores per node each. The job1 has already run for 38 hours and 25 minutes, job2 for 17 hours 44 minutes. The job1 has already consumed `64 x 38.41 = 2458.6` core hours. The job3x has already consumed `32 x (58/60) = 30.93` core hours. These consumed core hours will be accounted on the respective project accounts, regardless of whether the allocated cores were actually used for computations.
......@@ -253,8 +253,8 @@ $ qstat -n -u username
srv11:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -- |---|---| ------ --- --- ------ ----- - -----
15209.srv11 username qexp Name0 5530 4 64 -- 01:00 R 00:00
cn17/0*16+cn108/0*16+cn109/0*16+cn110/0*16
```
......
......@@ -9,7 +9,7 @@ All compute and login nodes of Anselm are interconnected by a high-bandwidth, lo
The compute nodes may be accessed via the InfiniBand network using ib0 network interface, in address range 10.2.1.1-209. The MPI may be used to establish native InfiniBand connection among the nodes.
!!! Note
The network provides **2170 MB/s** transfer rates via the TCP connection (single stream) and up to **3600 MB/s** via native InfiniBand protocol.
The Fat tree topology ensures that peak transfer rates are achieved between any two nodes, independent of network traffic exchanged among other nodes concurrently.
......@@ -24,8 +24,8 @@ $ qsub -q qexp -l select=4:ncpus=16 -N Name0 ./myjob
$ qstat -n -u username
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -- |---|---| ------ --- --- ------ ----- - -----
15209.srv11 username qexp Name0 5530 4 64 -- 01:00 R 00:00
cn17/0*16+cn108/0*16+cn109/0*16+cn110/0*16
$ ssh 10.2.1.110
......
......@@ -28,11 +28,11 @@ The user will need a valid certificate and to be present in the PRACE LDAP (plea
Most of the information needed by PRACE users accessing the Anselm TIER-1 system can be found here:
* [General user's FAQ](http://www.prace-ri.eu/Users-General-FAQs)
* [Certificates FAQ](http://www.prace-ri.eu/Certificates-FAQ)
* [Interactive access using GSISSH](http://www.prace-ri.eu/Interactive-Access-Using-gsissh)
* [Data transfer with GridFTP](http://www.prace-ri.eu/Data-Transfer-with-GridFTP-Details)
* [Data transfer with gtransfer](http://www.prace-ri.eu/Data-Transfer-with-gtransfer)
Before you start to use any of the services, don't forget to create a proxy certificate from your certificate:
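A minimal sketch of this step, assuming the Globus Toolkit tools are available in your environment:

```bash
$ grid-proxy-init
```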
......@@ -53,7 +53,7 @@ To access Anselm cluster, two login nodes running GSI SSH service are available.
It is recommended to use the single DNS name anselm-prace.it4i.cz which is distributed between the two login nodes. If needed, user can login directly to one of the login nodes. The addresses are:
| Login address | Port | Protocol | Login node |
| --------------------------- | ---- | -------- | ---------------- |
| anselm-prace.it4i.cz | 2222 | gsissh | login1 or login2 |
| login1-prace.anselm.it4i.cz | 2222 | gsissh | login1 |
| login2-prace.anselm.it4i.cz | 2222 | gsissh | login2 |
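For example, an interactive login via GSI SSH might look as follows, assuming a valid proxy certificate has already been created:

```bash
$ gsissh -p 2222 anselm-prace.it4i.cz
```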
......@@ -73,7 +73,7 @@ When logging from other PRACE system, the prace_service script can be used:
It is recommended to use the single DNS name anselm.it4i.cz, which is distributed between the two login nodes. If needed, users can log in directly to one of the login nodes. The addresses are:
| Login address | Port | Protocol | Login node |
| --------------------- | ---- | -------- | ---------------- |
| anselm.it4i.cz | 2222 | gsissh | login1 or login2 |
| login1.anselm.it4i.cz | 2222 | gsissh | login1 |
| login2.anselm.it4i.cz | 2222 | gsissh | login2 |
......@@ -125,7 +125,7 @@ There's one control server and three backend servers for striping and/or backup
**Access from PRACE network:**
| Login address | Port | Node role |
| ---------------------------- | ---- | --------------------------- |
| gridftp-prace.anselm.it4i.cz | 2812 | Front end /control server |
| login1-prace.anselm.it4i.cz | 2813 | Backend / data mover server |
| login2-prace.anselm.it4i.cz | 2813 | Backend / data mover server |
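A hedged sketch of a GridFTP transfer to the cluster; the local and remote paths are placeholders:

```bash
$ globus-url-copy file:///home/user/local.file gsiftp://gridftp-prace.anselm.it4i.cz:2812/home/prace/login/remote.file
```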
......@@ -158,7 +158,7 @@ Or by using prace_service script:
**Access from public Internet:**
| Login address | Port | Node role |
| ---------------------- | ---- | --------------------------- |
| gridftp.anselm.it4i.cz | 2812 | Front end /control server |
| login1.anselm.it4i.cz | 2813 | Backend / data mover server |
| login2.anselm.it4i.cz | 2813 | Backend / data mover server |
......@@ -191,7 +191,7 @@ Or by using prace_service script:
Generally both shared file systems are available through GridFTP:
| File system mount point | Filesystem | Comment |
| ----------------------- | ---------- | -------------------------------------------------------------- |
| /home | Lustre | Default HOME directories of users in format /home/prace/login/ |
| /scratch | Lustre | Shared SCRATCH mounted on the whole cluster |
......@@ -220,7 +220,7 @@ General information about the resource allocation, job queuing and job execution
For PRACE users, the default production run queue is "qprace". PRACE users can also use two other queues "qexp" and "qfree".
| queue | Active project | Project resources | Nodes | priority | authorization | walltime |
| ----------------------------- | -------------- | ----------------- | ------------------- | -------- | ------------- | --------- |
| **qexp** Express queue | no | none required | 2 reserved, 8 total | high | no | 1 / 1h |
| **qprace** Production queue | yes | > 0 | 178 w/o accelerator | medium | no | 24 / 48 h |
| **qfree** Free resource queue | yes | none required | 178 w/o accelerator | very low | no | 12 / 12 h |
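A submission to the PRACE production queue might then look like this (the project ID and jobscript are placeholders):

```bash
$ qsub -A PRACE-PROJECT-ID -q qprace -l select=2:ncpus=16 -l walltime=24:00:00 ./myjob
```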
......@@ -245,7 +245,7 @@ Users who have undergone the full local registration procedure (including signin
$ it4ifree
Password:
PID Total Used ...by me Free
-------- ------- ------ -------- -------
OPEN-0-0 1500000 400644 225265 1099356
DD-13-1 10000 2606 2606 7394
```
......
......@@ -2,19 +2,19 @@
## Introduction
The goal of this service is to provide users with GPU-accelerated use of OpenGL applications, especially for pre- and post-processing work, where not only GPU performance is needed but also fast access to the shared file systems of the cluster and a reasonable amount of RAM.
The service is based on integration of open source tools VirtualGL and TurboVNC together with the cluster's job scheduler PBS Professional.
Currently two compute nodes are dedicated to this service, with the following configuration for each node:
| [**Visualization node configuration**](compute-nodes/) | |
| ------------------------------------------------------ | --------------------------------------- |
| CPU | 2 x Intel Sandy Bridge E5-2670, 2.6 GHz |
| Processor cores | 16 (2 x 8 cores) |
| RAM | 64 GB, min. 4 GB per core |
| GPU | NVIDIA Quadro 4000, 2 GB RAM |
| Local disk drive | yes - 500 GB |
| Compute network | InfiniBand QDR |
## Schematic Overview
......@@ -133,7 +133,7 @@ $ vncserver -kill :1
qviz**. The queue has the following properties:
| queue | active project | project resources | nodes | min ncpus | priority | authorization | walltime |
| ---------------------------- | -------------- | ----------------- | ----- | --------- | -------- | ------------- | ---------------- |
| **qviz** Visualization queue | yes | none required | 2 | 4 | 150 | no | 1 hour / 8 hours |
Currently when accessing the node, each user gets 4 cores of a CPU allocated, thus approximately 16 GB of RAM and 1/4 of the GPU capacity.
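An interactive allocation on the visualization queue might be requested as follows; the project ID is a placeholder and the full VNC/VirtualGL setup is described elsewhere in this section:

```bash
$ qsub -A PROJECT-ID -q qviz -l select=1:ncpus=4 -l walltime=01:00:00 -I
```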
......
......@@ -6,21 +6,21 @@ To run a [job](../introduction/), [computational resources](../introduction/) fo
The resources are allocated to the job in a fair-share fashion, subject to constraints set by the queue and the resources available to the Project. [The Fair-share](job-priority/) at Anselm ensures that individual users may consume an approximately equal amount of resources per week. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following queues are available to Anselm users:
* **qexp**, the Express queue
* **qprod**, the Production queue
* **qlong**, the Long queue
* **qnvidia**, **qmic**, **qfat**, the Dedicated queues
* **qfree**, the Free resource utilization queue
!!! Note
Check the queue status at <https://extranet.it4i.cz/anselm/>
Read more on the [Resource AllocationPolicy](resources-allocation-policy/) page.
## Job Submission and Execution
!!! Note
Use the **qsub** command to submit your jobs.
The qsub command submits the job into the queue. It creates a request to the PBS Job manager for allocation of the specified resources. The **smallest allocation unit is an entire node, 16 cores**, with the exception of the qexp queue. The resources will be allocated when available, subject to allocation policies and constraints. **After the resources are allocated, the jobscript or interactive shell is executed on the first of the allocated nodes.**
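For example, a whole node may be allocated either for a jobscript or for an interactive shell (the project ID and jobscript are placeholders):

```bash
$ qsub -A PROJECT-ID -q qprod -l select=1:ncpus=16 ./myjob    # batch jobscript
$ qsub -A PROJECT-ID -q qprod -l select=1:ncpus=16 -I         # interactive shell
```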
......@@ -29,7 +29,7 @@ Read more on the [Job submission and execution](job-submission-and-execution/) p
## Capacity Computing
!!! Note
Use Job arrays when running huge number of jobs.
Use GNU Parallel and/or Job arrays when running (many) single core jobs.
......
......@@ -8,7 +8,7 @@ The resources are allocated to the job in a fair-share fashion, subject to const
Check the queue status at <https://extranet.it4i.cz/anselm/>
| queue | active project | project resources | nodes | min ncpus | priority | authorization | walltime |
| ------------------- | -------------- | ----------------- | ---------------------------------------------------- | --------- | -------- | ------------- | -------- |
| qexp | no | none required | 2 reserved, 31 total including MIC, GPU and FAT nodes | 1 | 150 | no | 1 h |
| qprod | yes | 0 | 178 nodes w/o accelerator | 16 | 0 | no | 24/48 h |
| qlong | yes | 0 | 60 nodes w/o accelerator | 16 | 0 | no | 72/144 h |
......@@ -20,11 +20,11 @@ The resources are allocated to the job in a fair-share fashion, subject to const
**The qexp queue is equipped with nodes that do not all have the very same CPU clock speed.** Should you need the very same CPU speed, you have to select the proper nodes during the PBS job submission.
* **qexp**, the Express queue: This queue is dedicated to testing and running very small jobs. It is not required to specify a project to enter the qexp. There are always 2 nodes reserved for this queue (w/o accelerator), and a maximum of 8 nodes are available via the qexp for a particular user, from a pool of nodes containing Nvidia accelerated nodes (cn181-203), MIC accelerated nodes (cn204-207) and Fat nodes with 512 GB RAM (cn208-209). This also enables testing and tuning of accelerated code or code with higher RAM requirements. The nodes may be allocated on a per core basis. No special authorization is required to use it. The maximum runtime in qexp is 1 hour.
* **qprod**, the Production queue: This queue is intended for normal production runs. It is required that an active project with nonzero remaining resources is specified to enter the qprod. All nodes may be accessed via the qprod queue, except the reserved ones. 178 nodes without accelerator are included. Full nodes, 16 cores per node, are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qprod is 48 hours.
* **qlong**, the Long queue: This queue is intended for long production runs. It is required that an active project with nonzero remaining resources is specified to enter the qlong. Only 60 nodes without acceleration may be accessed via the qlong queue. Full nodes, 16 cores per node, are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qlong is 144 hours (three times the standard qprod time - 3 x 48 h).
* **qnvidia**, **qmic**, **qfat**, the Dedicated queues: The queue qnvidia is dedicated to accessing the Nvidia accelerated nodes, the qmic to accessing MIC nodes and qfat the Fat nodes. It is required that an active project with nonzero remaining resources is specified to enter these queues. 23 nvidia, 4 mic and 2 fat nodes are included. Full nodes, 16 cores per node, are allocated. The queues run with very high priority, and the jobs will be scheduled before the jobs coming from the qexp queue. A PI needs to explicitly ask [support](https://support.it4i.cz/rt/) for authorization to enter the dedicated queues for all users associated with her/his Project.
* **qfree**, the Free resource queue: The queue qfree is intended for utilization of free resources, after a Project has exhausted all of its allocated computational resources (this does not apply to DD projects by default; DD projects have to request permission to use qfree after exhaustion of computational resources). It is required that an active project is specified to enter the queue, however no remaining resources are required. Consumed resources will be accounted to the Project. Only 178 nodes without accelerator may be accessed from this queue. Full nodes, 16 cores per node, are allocated. The queue runs with very low priority and no special authorization is required to use it. The maximum runtime in qfree is 12 hours.
### Notes
......@@ -122,7 +122,7 @@ User may check at any time, how many core-hours have been consumed by himself/he
$ it4ifree
Password:
PID Total Used ...by me Free
-------- ------- ------ -------- -------
OPEN-0-0 1500000 400644 225265 1099356
DD-13-1 10000 2606 2606 7394
```
......@@ -5,7 +5,7 @@
The Anselm cluster is accessed by SSH protocol via login nodes login1 and login2 at address anselm.it4i.cz. The login nodes may be addressed specifically, by prepending the login node name to the address.
| Login address | Port | Protocol | Login node |
| --------------------- | ---- | -------- | -------------------------------------------- |
| anselm.it4i.cz | 22 | ssh | round-robin DNS record for login1 and login2 |
| login1.anselm.it4i.cz | 22 | ssh | login1 |
| login2.anselm.it4i.cz | 22 | ssh | login2 |
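For example (the username and key path are placeholders):

```bash
$ ssh -i ~/.ssh/id_rsa username@anselm.it4i.cz
```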
......@@ -61,7 +61,7 @@ Example to the cluster login:
Data in and out of the system may be transferred by the [scp](http://en.wikipedia.org/wiki/Secure_copy) and sftp protocols. (Not available yet.) In case large volumes of data are transferred, use dedicated data mover node dm1.anselm.it4i.cz for increased performance.
| Address | Port | Protocol |
| --------------------- | ---- | --------- |
| anselm.it4i.cz | 22 | scp, sftp |
| login1.anselm.it4i.cz | 22 | scp, sftp |
| login2.anselm.it4i.cz | 22 | scp, sftp |
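For example, to copy a local file to your home directory on the cluster (the username and paths are placeholders):

```bash
$ scp -i ~/.ssh/id_rsa my-local-file username@anselm.it4i.cz:directory/
```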
......@@ -120,7 +120,7 @@ More information about the shared file systems is available [here](storage/).
Outgoing connections, from Anselm Cluster login nodes to the outside world, are restricted to following ports:
| Port | Protocol |
| ---- | -------- |
| 22 | ssh |
| 80 | http |
| 443 | https |
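Services running on other ports have to be reached through SSH port forwarding; a hedged sketch, assuming a proxy listening on remote.host.com:1080, which applications on the cluster could then use as localhost:6000:

```bash
$ ssh -R 6000:remote.host.com:1080 anselm.it4i.cz
```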
......@@ -198,9 +198,9 @@ Now, configure the applications proxy settings to **localhost:6000**. Use port f
## Graphical User Interface
* The [X Window system](../get-started-with-it4innovations/accessing-the-clusters/graphical-user-interface/x-window-system/) is a principal way to get GUI access to the clusters.
* The [Virtual Network Computing](../get-started-with-it4innovations/accessing-the-clusters/graphical-user-interface/vnc/) is a graphical [desktop sharing](http://en.wikipedia.org/wiki/Desktop_sharing) system that uses the [Remote Frame Buffer protocol](http://en.wikipedia.org/wiki/RFB_protocol) to remotely control another [computer](http://en.wikipedia.org/wiki/Computer).
## VPN Access
* Access to IT4Innovations internal resources via [VPN](../get-started-with-it4innovations/accessing-the-clusters/vpn-access/).
......@@ -129,7 +129,7 @@ To run ANSYS Fluent in batch mode with user's config file you can utilize/modify
#Default arguments for all jobs
fluent_args="-ssh -g -i $input $fluent_args"
echo "---------- Going to start a fluent job with the following settings:
echo "---------* Going to start a fluent job with the following settings: