Commit 14676d74 authored by David Hrbáč

Remarked all the mds

parent 9f1e0b67
# Capacity computing

## Introduction
In many cases, it is useful to submit a huge number (>100) of computational jobs into the PBS queue system. A huge number of (small) jobs is one of the most effective ways to execute embarrassingly parallel calculations, achieving the best runtime, throughput, and computer utilization.
However, executing a huge number of jobs via the PBS queue may strain the system. This strain may result in slow responses to commands, inefficient scheduling, and overall degradation of performance and user experience for all users. For this reason, the number of jobs is **limited to 100 per user, 1000 per job array**.
......@@ -14,14 +13,12 @@ However, executing huge number of jobs via the PBS queue may strain the system.
- Use [GNU parallel](capacity-computing/#gnu-parallel) when running single-core jobs
- Combine [GNU parallel with Job arrays](capacity-computing/#job-arrays-and-gnu-parallel) when running a huge number of single-core jobs
## Policy
1. A user is allowed to submit at most 100 jobs. Each job may be [a job array](capacity-computing/#job-arrays).
2. The array size is at most 1000 subjobs.
## Job arrays
!!! Note "Note"
    A huge number of jobs may be easily submitted and managed as a job array.
......@@ -73,7 +70,7 @@ cp $PBS_O_WORKDIR/$TASK input ; cp $PBS_O_WORKDIR/myprog.x .
cp output $PBS_O_WORKDIR/$TASK.out
```
In this example, the submit directory holds the 900 input files, the executable myprog.x, and the jobscript file. As input for each run, we take the filename of an input file from the created tasklist file. We copy the input file to the local scratch directory /lscratch/$PBS_JOBID, execute myprog.x, and copy the output file back to the submit directory, under the $TASK.out name. The myprog.x runs on one node only and must use threads to run in parallel. Be aware that if myprog.x **is not multithreaded**, then all the **jobs are run as single-thread programs in a sequential** manner. Due to the allocation of the whole node, the accounted time is equal to the usage of the whole node, while using only 1/16 of the node!
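As a hedged sketch of how such an array might be prepared and submitted (the input-file naming pattern and the jobscript name are placeholders, not taken from the example above):

```bash
# create the tasklist file, one input filename per line (the naming pattern is an assumption)
$ find . -name 'file*' > tasklist

# submit all 900 tasks as one job array; each subjob selects its line of tasklist via $PBS_ARRAY_INDEX
$ qsub -N JOBNAME -J 1-900 jobscript
```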
If a huge number of parallel multicore jobs (in the sense of multinode, multithreaded, e.g. MPI-enabled jobs) needs to be run, then the job array approach should also be used. The main difference compared to the previous single-node example is that the local scratch should not be used (as it is not shared between nodes), and MPI or another technique for a parallel multinode run has to be used properly.
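A minimal multinode sketch under these constraints, assuming a shared /scratch filesystem, a placeholder MPI binary mympiprog.x, and a site-dependent MPI module (all of these are assumptions, not part of the original example):

```bash
#!/bin/bash
#PBS -A PROJECT_ID
#PBS -q qprod
#PBS -l select=2:ncpus=16
#PBS -l walltime=02:00:00

# pick this subjob's task from the tasklist
TASK=$(sed -n "${PBS_ARRAY_INDEX}p" "$PBS_O_WORKDIR/tasklist")

# use the shared /scratch instead of the node-local /lscratch
SCRDIR=/scratch/$USER/$PBS_JOBID
mkdir -p "$SCRDIR" && cd "$SCRDIR"

cp "$PBS_O_WORKDIR/$TASK" input
cp "$PBS_O_WORKDIR/mympiprog.x" .

# load an MPI module (the exact name depends on the site) and run across all allocated nodes;
# some MPI stacks need the host list passed explicitly via $PBS_NODEFILE
module load openmpi
mpirun ./mympiprog.x input

cp output "$PBS_O_WORKDIR/$TASK.out"
```

Such a jobscript would be submitted with qsub -J in the same way as the single-node array above.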
......@@ -150,8 +147,7 @@ $ qstat -u $USER -tJ
Read more on job arrays in the [PBSPro Users guide](../../pbspro-documentation/).
## GNU parallel
!!! Note "Note"
    Use GNU parallel to run many single-core tasks on one node.
......@@ -222,8 +218,7 @@ In this example, we submit a job of 101 tasks. 16 input files will be processed
Please note the #PBS directives at the beginning of the jobscript file; don't forget to set your valid PROJECT_ID and the desired queue.
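A single-node sketch of such a jobscript, assuming a tasklist file with one input filename per line and a placeholder executable myprog.x (the module name and the concurrency of 16 are assumptions):

```bash
#!/bin/bash
#PBS -A PROJECT_ID
#PBS -q qprod
#PBS -l select=1:ncpus=16
#PBS -l walltime=02:00:00

# the module name is an assumption; adjust to the site's module tree
module add parallel

cd "$PBS_O_WORKDIR"

# run up to 16 tasks concurrently on the node; {} is the input filename, {#} the task sequence number
cat tasklist | parallel -j 16 '
  SCR=/lscratch/$PBS_JOBID/{#}
  mkdir -p $SCR && cd $SCR
  cp $PBS_O_WORKDIR/{} input
  cp $PBS_O_WORKDIR/myprog.x .
  ./myprog.x < input > output
  cp output $PBS_O_WORKDIR/{}.out
'
```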
## Job arrays and GNU parallel
!!! Note "Note"
    Combine job arrays and GNU parallel for the best throughput of single-core jobs.
......@@ -290,7 +285,7 @@ In this example, the jobscript executes in multiple instances in parallel, on al
When deciding on these values, consider the following guiding rules (a worked example follows the list):

1. Let n = N/16. The inequality (n + 1) \* T < W should hold. N is the number of tasks per subjob, T is the expected single-task walltime, and W is the subjob walltime. A short subjob walltime improves scheduling and job throughput.
2. The number of tasks should be a multiple of 16.
3. These rules are valid only when all tasks have similar task walltimes T.
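As a worked example with assumed numbers: with 16 cores per node and N = 32 tasks per subjob, n = 32/16 = 2; if the expected single-task walltime is T = 10 minutes, the subjob walltime must satisfy W > (2 + 1) \* 10 min = 30 min, so a subjob walltime of about 40 minutes would be a reasonable, still short, choice.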
......@@ -307,8 +302,7 @@ In this example, we submit a job array of 31 subjobs. Note the -J 1-992:**32**,
Please note the #PBS directives at the beginning of the jobscript file; don't forget to set your valid PROJECT_ID and the desired queue.
## Examples
Download the examples in [capacity.zip](capacity.zip), illustrating the above listed ways to run a huge number of jobs. We recommend trying out the examples before using this approach for production jobs.
......
......@@ -6,58 +6,58 @@ Anselm is cluster of x86-64 Intel based nodes built on Bull Extreme Computing bu
### Compute Nodes Without Accelerator
- 180 nodes
- 2880 cores in total
- two Intel Sandy Bridge E5-2665, 8-core, 2.4 GHz processors per node
- 64 GB of physical memory per node
- one 500 GB SATA 2.5" 7.2 krpm HDD per node
- bullx B510 blade servers
- cn[1-180]
### Compute Nodes With GPU Accelerator
- 23 nodes
- 368 cores in total
- two Intel Sandy Bridge E5-2470, 8-core, 2.3 GHz processors per node
- 96 GB of physical memory per node
- one 500 GB SATA 2.5" 7.2 krpm HDD per node
- GPU accelerator 1x NVIDIA Tesla Kepler K20 per node
- bullx B515 blade servers
- cn[181-203]
### Compute Nodes With MIC Accelerator
- 4 nodes
- 64 cores in total
- two Intel Sandy Bridge E5-2470, 8-core, 2.3 GHz processors per node
- 96 GB of physical memory per node
- one 500 GB SATA 2.5" 7.2 krpm HDD per node
- MIC accelerator 1x Intel Xeon Phi 5110P per node
- bullx B515 blade servers
- cn[204-207]
### Fat Compute Nodes
- 2 nodes
- 32 cores in total
- two Intel Sandy Bridge E5-2665, 8-core, 2.4 GHz processors per node
- 512 GB of physical memory per node
- two 300 GB SAS 3.5" 15 krpm HDD (RAID1) per node
- two 100 GB SLC SSD per node
- bullx R423-E3 servers
- cn[208-209]
![](../img/bullxB510.png)
**Figure Anselm bullx B510 servers**
### Compute Nodes Summary
| Node type | Count | Range | Memory | Cores | [Access](resources-allocation-policy/) |
| -------------------------- | ----- | ----------- | ------ | ----------- | -------------------------------------- |
| Nodes without accelerator  | 180   | cn[1-180]   | 64GB   | 16 @ 2.4GHz | qexp, qprod, qlong, qfree              |
| Nodes with GPU accelerator | 23    | cn[181-203] | 96GB   | 16 @ 2.3GHz | qgpu, qprod                            |
| Nodes with MIC accelerator | 4 | cn[204-207] | 96GB | 16 @ 2.3GHz | qmic, qprod |
| Fat compute nodes | 2 | cn[208-209] | 512GB | 16 @ 2.4GHz | qfat, qprod |
## Processor Architecture
......@@ -65,23 +65,23 @@ Anselm is equipped with Intel Sandy Bridge processors Intel Xeon E5-2665 (nodes
### Intel Sandy Bridge E5-2665 Processor
- eight-core
- speed: 2.4 GHz, up to 3.1 GHz using Turbo Boost Technology
- peak performance: 19.2 GFLOP/s per core
- caches:
    - L2: 256 KB per core
    - L3: 20 MB per processor
- memory bandwidth at the level of the processor: 51.2 GB/s
### Intel Sandy Bridge E5-2470 Processor
- eight-core
- speed: 2.3 GHz, up to 3.1 GHz using Turbo Boost Technology
- peak performance: 18.4 GFLOP/s per core
- caches:
    - L2: 256 KB per core
    - L3: 20 MB per processor
- memory bandwidth at the level of the processor: 38.4 GB/s
Nodes equipped with the Intel Xeon E5-2665 CPU have the PBS resource attribute cpu_freq = 24 set; nodes equipped with the Intel Xeon E5-2470 CPU have cpu_freq = 23.
......@@ -101,30 +101,30 @@ Intel Turbo Boost Technology is used by default, you can disable it for all nod
### Compute Node Without Accelerator
- 2 sockets
- Memory Controllers are integrated into processors.
- 8 DDR3 DIMMs per node
- 4 DDR3 DIMMs per CPU
- 1 DDR3 DIMM per channel
- Data rate support: up to 1600 MT/s
- Populated memory: 8 x 8 GB DDR3 DIMM 1600 MHz
### Compute Node With GPU or MIC Accelerator
- 2 sockets
- Memory Controllers are integrated into processors.
- 6 DDR3 DIMMs per node
- 3 DDR3 DIMMs per CPU
- 1 DDR3 DIMM per channel
- Data rate support: up to 1600 MT/s
- Populated memory: 6 x 16 GB DDR3 DIMM 1600 MHz
### Fat Compute Node
- 2 sockets
- Memory Controllers are integrated into processors.
- 16 DDR3 DIMMs per node
- 8 DDR3 DIMMs per CPU
- 2 DDR3 DIMMs per channel
- Data rate support: up to 1600 MT/s
- Populated memory: 16 x 32 GB DDR3 DIMM 1600 MHz
# Environment and Modules
### Environment Customization
......@@ -77,7 +76,7 @@ PrgEnv-gnu sets up the GNU development environment in conjunction with the bullx
PrgEnv-intel sets up the INTEL development environment in conjunction with the Intel MPI library
How to use modules is shown in the following examples:
<tty-player controls src=/src/anselm/modules_anselm.ttyrec></tty-player>
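For reference, a typical module workflow (the module name shown is the PrgEnv-intel module mentioned above) looks like this:

```bash
# list the available modules
$ module avail

# load the Intel development environment
$ module load PrgEnv-intel

# show the currently loaded modules
$ module list
```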
### Application Modules Path Expansion
......
# Hardware Overview
The Anselm cluster consists of 209 computational nodes named cn[1-209], of which 180 are regular compute nodes, 23 are GPU Kepler K20 accelerated nodes, 4 are MIC Xeon Phi 5110P accelerated nodes, and 2 are fat nodes. Each node is a powerful x86-64 computer, equipped with 16 cores (two eight-core Intel Sandy Bridge processors), at least 64 GB of RAM, and a local hard drive. User access to the Anselm cluster is provided by two login nodes, login[1,2]. The nodes are interlinked by high-speed InfiniBand and Ethernet networks. All nodes share 320 TB of /home disk storage to store user files. The 146 TB shared /scratch storage is available for scratch data.
......@@ -31,31 +30,31 @@ The user access to the Anselm cluster is provided by two login nodes login1, log
The parameters are summarized in the following tables:
| **In general** | |
| ------------------------------------------- | -------------------------------------------- |
| Primary purpose | High Performance Computing |
| Architecture of compute nodes | x86-64 |
| Operating system | Linux |
| [**Compute nodes**](compute-nodes/) | |
| Total                                        | 209                                          |
| Processor cores | 16 (2 x 8 cores) |
| RAM | min. 64 GB, min. 4 GB per core |
| Local disk drive | yes - usually 500 GB |
| Compute network | InfiniBand QDR, fully non-blocking, fat-tree |
| w/o accelerator | 180, cn[1-180] |
| GPU accelerated | 23, cn[181-203] |
| MIC accelerated | 4, cn[204-207] |
| Fat compute nodes | 2, cn[208-209] |
| **In total** | |
| Total theoretical peak performance (Rpeak) | 94 TFLOP/s |
| Total max. LINPACK performance (Rmax) | 73 TFLOP/s |
| Total amount of RAM | 15.136 TB |
| Node | Processor | Memory | Accelerator |
| ---------------- | --------------------------------------- | ------ | -------------------- |
| w/o accelerator | 2 x Intel Sandy Bridge E5-2665, 2.4 GHz | 64 GB | - |
| GPU accelerated | 2 x Intel Sandy Bridge E5-2470, 2.3 GHz | 96 GB | NVIDIA Kepler K20 |
| MIC accelerated | 2 x Intel Sandy Bridge E5-2470, 2.3 GHz | 96 GB | Intel Xeon Phi 5110P |
| Fat compute node | 2 x Intel Sandy Bridge E5-2665, 2.4 GHz | 512 GB | - |
For more details please refer to the [Compute nodes](compute-nodes/), [Storage](storage/), and [Network](network/).
# Job scheduling
## Job execution priority
The scheduler gives each job an execution priority and then uses this priority to select which job(s) to run.
......@@ -31,8 +29,8 @@ Fair-share priority is calculated as
![](../img/fairshare_formula.png)

where MAX_FAIRSHARE has the value 1E6,
usage_Project_ is the cumulated usage by all members of the selected project,
usage_Total_ is the total usage by all users, across all projects.
Usage counts allocated core-hours (`ncpus x walltime`). Usage is decayed, or cut in half, periodically at an interval of 168 hours (one week).
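For illustration, a job allocated on 4 nodes (4 x 16 = 64 cores) with a walltime of 10 hours adds 64 x 10 = 640 core-hours to the project's usage; one decay interval (168 hours) later, that contribution counts as 320 core-hours.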
Jobs queued in the qexp queue are not counted towards the project's usage.
......@@ -42,7 +40,7 @@ Jobs queued in queue qexp are not calculated to project's usage.
The calculated fair-share priority can also be seen as the Resource_List.fairshare attribute of a job.
### Eligible time
Eligible time is the amount (in seconds) of eligible time a job has accrued while waiting to run. Jobs with higher eligible time gain higher priority.
......
......@@ -5,11 +5,11 @@
When allocating computational resources for the job, please specify
1. suitable queue for your job (default is qprod)
2. number of computational nodes required
3. number of cores per node required
4. maximum wall time allocated to your calculation; note that jobs exceeding the maximum wall time will be killed
5. Project ID
6. Jobscript or interactive switch
!!! Note "Note"
Use the **qsub** command to submit your job to a queue for allocation of the computational resources.
......@@ -76,10 +76,10 @@ In this example, we allocate nodes cn171 and cn172, all 16 cores per node, for 2
Nodes equipped with the Intel Xeon E5-2665 CPU have a base clock frequency of 2.4 GHz; nodes equipped with the Intel Xeon E5-2470 CPU have a base frequency of 2.3 GHz (see the Compute Nodes section for details). Nodes may be selected via the PBS resource attribute cpu_freq.
| CPU Type | base freq. | Nodes | cpu_freq attribute |
| ------------------ | ---------- | ---------------------- | ------------------ |
| Intel Xeon E5-2665 | 2.4GHz | cn[1-180], cn[208-209] | 24 |
| Intel Xeon E5-2470 | 2.3GHz | cn[181-207] | 23 |
```bash
$ qsub -A OPEN-0-0 -q qprod -l select=4:ncpus=16:cpu_freq=24 -I
......@@ -156,7 +156,7 @@ Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
16547.srv11 user2 qprod job3x 13516 2 32 -- 48:00 R 00:58
```
In this example, user1 and user2 are running jobs named job1, job2, and job3x. The jobs job1 and job2 are using 4 nodes, 16 cores per node each. The job1 has already run for 38 hours and 25 minutes, job2 for 17 hours 44 minutes. The job1 has already consumed 64 x 38.41 = 2458.6 core-hours. The job3x has already consumed 0.96 x 32 = 30.93 core-hours. These consumed core-hours will be accounted on the respective project accounts, regardless of whether the allocated cores were actually used for computations.
Check the status of your jobs using the check-pbs-jobs command. It checks for the presence of the user's PBS job processes on the execution hosts, displays load and processes, displays the job's standard and error output, and can continuously display (tail -f) the job's standard or error output.
......
# Network
All compute and login nodes of Anselm are interconnected by [InfiniBand](http://en.wikipedia.org/wiki/InfiniBand) QDR network and by Gigabit [Ethernet](http://en.wikipedia.org/wiki/Ethernet) network. Both networks may be used to transfer user data.
## InfiniBand Network
All compute and login nodes of Anselm are interconnected by a high-bandwidth, low-latency [InfiniBand](http://en.wikipedia.org/wiki/InfiniBand) QDR network (IB 4 x QDR, 40 Gbps). The network topology is a fully non-blocking fat-tree.
The compute nodes may be accessed via the InfiniBand network using the ib0 network interface, in the address range 10.2.1.1-209. MPI may be used to establish a native InfiniBand connection among the nodes.
......@@ -14,12 +13,11 @@ The compute nodes may be accessed via the InfiniBand network using ib0 network i
The Fat tree topology ensures that peak transfer rates are achieved between any two nodes, independent of network traffic exchanged among other nodes concurrently.
## Ethernet Network
The compute nodes may be accessed via the regular Gigabit Ethernet network interface eth0, in the address range 10.1.1.1-209, or by using the aliases cn1-cn209. The network provides **114 MB/s** transfer rates via the TCP connection.
## Example
```bash
$ qsub -q qexp -l select=4:ncpus=16 -N Name0 ./myjob
......
......@@ -28,11 +28,11 @@ The user will need a valid certificate and to be present in the PRACE LDAP (plea
Most of the information needed by PRACE users accessing the Anselm TIER-1 system can be found here:
- [General user's FAQ](http://www.prace-ri.eu/Users-General-FAQs)
- [Certificates FAQ](http://www.prace-ri.eu/Certificates-FAQ)
- [Interactive access using GSISSH](http://www.prace-ri.eu/Interactive-Access-Using-gsissh)
- [Data transfer with GridFTP](http://www.prace-ri.eu/Data-Transfer-with-GridFTP-Details)
- [Data transfer with gtransfer](http://www.prace-ri.eu/Data-Transfer-with-gtransfer)
Before you start to use any of the services, don't forget to create a proxy certificate from your certificate:
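A minimal sketch, assuming the Globus Toolkit client tools are installed on your machine (the exact command and validity period may differ):

```bash
# create a proxy certificate valid for 12 hours from your grid certificate
$ grid-proxy-init -valid 12:00
```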
......@@ -52,11 +52,11 @@ To access Anselm cluster, two login nodes running GSI SSH service are available.
It is recommended to use the single DNS name anselm-prace.it4i.cz, which is distributed between the two login nodes. If needed, a user can log in directly to one of the login nodes. The addresses are:
| Login address | Port | Protocol | Login node |
| --------------------------- | ---- | -------- | ---------------- |
| anselm-prace.it4i.cz | 2222 | gsissh | login1 or login2 |
| login1-prace.anselm.it4i.cz | 2222 | gsissh | login1 |
| login2-prace.anselm.it4i.cz | 2222 | gsissh | login2 |
```bash
$ gsissh -p 2222 anselm-prace.it4i.cz
......@@ -72,11 +72,11 @@ When logging from other PRACE system, the prace_service script can be used:
It is recommended to use the single DNS name anselm.it4i.cz, which is distributed between the two login nodes. If needed, a user can log in directly to one of the login nodes. The addresses are:
| Login address | Port | Protocol | Login node |
| --------------------- | ---- | -------- | ---------------- |
| anselm.it4i.cz | 2222 | gsissh | login1 or login2 |
| login1.anselm.it4i.cz | 2222 | gsissh | login1 |
| login2.anselm.it4i.cz | 2222 | gsissh | login2 |
```bash
$ gsissh -p 2222 anselm.it4i.cz
......@@ -124,12 +124,12 @@ There's one control server and three backend servers for striping and/or backup
**Access from PRACE network:**
| Login address | Port | Node role |
| ---------------------------- | ---- | --------------------------- |
| gridftp-prace.anselm.it4i.cz | 2812 | Front end /control server |
| login1-prace.anselm.it4i.cz | 2813 | Backend / data mover server |
| login2-prace.anselm.it4i.cz | 2813 | Backend / data mover server |
| dm1-prace.anselm.it4i.cz | 2813 | Backend / data mover server |
Copy files **to** Anselm by running the following commands on your local machine:
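As a hedged illustration (the server name and port follow the table above; the local file and the remote home path are placeholders):

```bash
# transfer a local file to your Anselm home directory over the PRACE GridFTP front end
$ globus-url-copy file://$HOME/datafile.dat \
    gsiftp://gridftp-prace.anselm.it4i.cz:2812/home/prace/login/datafile.dat
```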
......@@ -157,12 +157,12 @@ Or by using prace_service script:
**Access from public Internet:**
| Login address | Port | Node role |
| ---------------------- | ---- | --------------------------- |
| gridftp.anselm.it4i.cz | 2812 | Front end /control server |
| login1.anselm.it4i.cz | 2813 | Backend / data mover server |
| login2.anselm.it4i.cz | 2813 | Backend / data mover server |
| dm1.anselm.it4i.cz | 2813 | Backend / data mover server |
Copy files **to** Anselm by running the following commands on your local machine:
......@@ -190,10 +190,10 @@ Or by using prace_service script:
Generally both shared file systems are available through GridFTP:
| File system mount point | Filesystem | Comment |
| ----------------------- | ---------- | -------------------------------------------------------------- |
| /home | Lustre | Default HOME directories of users in format /home/prace/login/ |
| /scratch | Lustre | Shared SCRATCH mounted on the whole cluster |
More information about the shared file systems is available [here](storage/).
......@@ -219,11 +219,11 @@ General information about the resource allocation, job queuing and job execution
For PRACE users, the default production run queue is "qprace". PRACE users can also use two other queues, "qexp" and "qfree".
| queue | Active project | Project resources | Nodes | priority | authorization | walltime |
| ----------------------------- | -------------- | ----------------- | ------------------- | -------- | ------------- | --------- |
| **qexp** Express queue | no | none required | 2 reserved, 8 total | high | no | 1 / 1h |
| **qprace** Production queue | yes | > 0 | 178 w/o accelerator | medium | no | 24 / 48 h |
| **qfree** Free resource queue | yes | none required | 178 w/o accelerator | very low | no | 12 / 12 h |
**qprace**, the PRACE queue: This queue is intended for normal production runs. It is required that an active project with nonzero remaining resources is specified to enter qprace. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qprace is 48 hours (see the walltime column in the table above). If the job needs a longer time, it must use checkpoint/restart functionality.
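For instance (the project ID, node count, and walltime are placeholders):

```bash
$ qsub -A PROJECT_ID -q qprace -l select=2:ncpus=16 -l walltime=24:00:00 ./myjob
```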
......
......@@ -8,14 +8,14 @@ The service is based on integration of open source tools VirtualGL and TurboVNC
Currently, two compute nodes are dedicated to this service, with the following configuration for each node:
| [**Visualization node configuration**](compute-nodes/) | |
| ------------------------------------------------------ | --------------------------------------- |
| CPU | 2 x Intel Sandy Bridge E5-2670, 2.6 GHz |
| Processor cores | 16 (2 x 8 cores) |
| RAM | 64 GB, min. 4 GB per core |
| GPU | NVIDIA Quadro 4000, 2 GB RAM |
| Local disk drive | yes - 500 GB |
| Compute network | InfiniBand QDR |
## Schematic overview
......@@ -41,7 +41,7 @@ Please [follow the documentation](shell-and-data-access/).
To have OpenGL acceleration, a **24-bit color depth must be used**. Otherwise, only the geometry (desktop size) definition is needed.
_At the first VNC server run, you need to define a password._
This example defines a desktop with dimensions of 1200x700 pixels and a 24-bit color depth.
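A hedged sketch of the corresponding TurboVNC command (the display number :1 is an assumption; a different free display may be assigned on a shared login node):

```bash
$ vncserver :1 -geometry 1200x700 -depth 24
```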
......@@ -97,7 +97,7 @@ $ ssh login2.anselm.it4i.cz -L 5901:localhost:5901
```
_If you use Windows and PuTTY, please refer to the port forwarding setup in the documentation:_
[x-window-and-vnc#section-12](../get-started-with-it4innovations/accessing-the-clusters/graphical-user-interface/x-window-system/)
#### 7. If you don't have TurboVNC installed on your workstation
......@@ -112,15 +112,15 @@ Mind that you should connect through the SSH tunneled port. In this example it i
$ vncviewer localhost:5901
```
_If you use the Windows version of TurboVNC Viewer, just run the Viewer and use the address **localhost:5901**._
#### 9. Proceed to the chapter "Access the visualization node"
_Now you should have a working TurboVNC session connected to your workstation._
#### 10. After you end your visualization session
_Don't forget to correctly shut down your own VNC server on the login node!_
```bash
$ vncserver -kill :1
......@@ -131,17 +131,17 @@ $ vncserver -kill :1
**To access the node, use the dedicated PBS Professional scheduler queue qviz**. The queue has the following properties:
| queue | active project | project resources | nodes | min ncpus | priority | authorization | walltime |
| ---------------------------- | -------------- | ----------------- | ----- | --------- | -------- | ------------- | ---------------- |
| **qviz** Visualization queue | yes | none required | 2 | 4 | 150 | no | 1 hour / 8 hours |
Currently, when accessing the node, each user gets 4 cores of a CPU allocated, thus approximately 16 GB of RAM and 1/4 of the GPU capacity. _If more GPU power or RAM is required, it is recommended to allocate one whole node per user, so that all 16 cores, the whole RAM, and the whole GPU are exclusive to that user. This is currently also the maximum allowed allocation per user. One hour of work is allocated by default; the user may ask for 2 hours maximum._
To access the visualization node, follow these steps:
#### 1. In your VNC session, open a terminal and allocate a node using PBSPro qsub command
_This step is necessary to allow you to proceed with the next steps._
```bash
$ qsub -I -q qviz -A PROJECT_ID
......@@ -153,7 +153,7 @@ In this example the default values for CPU cores and usage time are used.
$ qsub -I -q qviz -A PROJECT_ID -l select=1:ncpus=16 -l walltime=02:00:00
```
_Substitute **PROJECT_ID** with the assigned project identification string._