Commit df901f9f authored by Lukáš Krupčík

Merge branch 'anselm_removal_from_doc' into 'master'

Anselm removal from doc

See merge request !294
parents 0b6b5297 adc0cffe
Showing 62 additions and 317 deletions
......@@ -6,7 +6,6 @@ The table shows which shells are supported on the IT4Innovations clusters.
| Cluster Name | bash | tcsh | zsh | ksh |
| --------------- | ---- | ---- | --- | --- |
| Anselm Cluster | yes | yes | yes | yes |
| Salomon Cluster | yes | yes | yes | yes |
| Barbora Cluster | yes | yes | yes | yes |
| DGX-2 Cluster | yes | no | no | no |
......
......@@ -47,31 +47,6 @@ $ find . -name 'file*' > tasklist
Then we create a jobscript:
#### Anselm
```bash
#!/bin/bash
#PBS -A PROJECT_ID
#PBS -q qprod
#PBS -l select=1:ncpus=16,walltime=02:00:00
# change to local scratch directory
SCR=/lscratch/$PBS_JOBID
mkdir -p $SCR ; cd $SCR || exit
# get individual tasks from tasklist with index from PBS JOB ARRAY
TASK=$(sed -n "${PBS_ARRAY_INDEX}p" $PBS_O_WORKDIR/tasklist)
# copy input file and executable to scratch
cp $PBS_O_WORKDIR/$TASK input ; cp $PBS_O_WORKDIR/myprog.x .
# execute the calculation
./myprog.x < input > output
# copy output file to submit directory
cp output $PBS_O_WORKDIR/$TASK.out
```
#### Salomon
```bash
......@@ -105,13 +80,6 @@ If running a huge number of parallel multicore (in means of multinode multithrea
To submit the job array, use the `qsub -J` command. The 900 jobs of the [example above][5] may be submitted like this:
#### Anselm
```console
$ qsub -N JOBNAME -J 1-900 jobscript
12345[].dm2
```
#### Salomon
```console
......
......@@ -19,39 +19,34 @@ $ qsub -A Project_ID -q queue -l select=x:ncpus=y,walltime=[[hh:]mm:]ss[.ms] job
The qsub command submits the job to the queue, i.e. the qsub command creates a request to the PBS Job manager for allocation of specified resources. The resources will be allocated when available, subject to the above described policies and constraints. **After the resources are allocated, the jobscript or interactive shell is executed on the first of the allocated nodes.**
!!! note
PBS statement nodes (qsub -l nodes=nodespec) are not supported on Anselm.
### Job Submission Examples
!!! note
Anselm: ncpus=16\
Salomon: ncpus=24\
Barbora: ncpus=36, or ncpus=24 for accelerate node
```console
$ qsub -A OPEN-0-0 -q qprod -l select=64:ncpus=16,walltime=03:00:00 ./myjob
$ qsub -A OPEN-0-0 -q qprod -l select=64:ncpus=24,walltime=03:00:00 ./myjob
```
In this example, we allocate 64 nodes, 16 cores per node, for 3 hours. We allocate these resources via the qprod queue, consumed resources will be accounted to the Project identified by Project ID OPEN-0-0. The jobscript 'myjob' will be executed on the first node in the allocation.
In this example, we allocate 64 nodes, 24 cores per node, for 3 hours. We allocate these resources via the qprod queue, consumed resources will be accounted to the Project identified by Project ID OPEN-0-0. The jobscript 'myjob' will be executed on the first node in the allocation.
```console
$ qsub -q qexp -l select=4:ncpus=16 -I
$ qsub -q qexp -l select=4:ncpus=24 -I
```
In this example, we allocate 4 nodes, 16 cores per node, for 1 hour. We allocate these resources via the qexp queue. The resources will be available interactively.
In this example, we allocate 4 nodes, 24 cores per node, for 1 hour. We allocate these resources via the qexp queue. The resources will be available interactively.
```console
$ qsub -A OPEN-0-0 -q qnvidia -l select=10:ncpus=16 ./myjob
$ qsub -A OPEN-0-0 -q qnvidia -l select=10:ncpus=24 ./myjob
```
In this example, we allocate 10 NVIDIA accelerated nodes, 16 cores per node, for 24 hours. We allocate these resources via the qnvidia queue. The jobscript 'myjob' will be executed on the first node in the allocation.
In this example, we allocate 10 NVIDIA accelerated nodes, 24 cores per node, for 24 hours. We allocate these resources via the qnvidia queue. The jobscript 'myjob' will be executed on the first node in the allocation.
```console
$ qsub -A OPEN-0-0 -q qfree -l select=10:ncpus=16 ./myjob
$ qsub -A OPEN-0-0 -q qfree -l select=10:ncpus=24 ./myjob
```
In this example, we allocate 10 nodes, 16 cores per node, for 12 hours. We allocate these resources via the qfree queue. It is not required that the project OPEN-0-0 has any available resources left. Consumed resources are still accounted for. The jobscript myjob will be executed on the first node in the allocation.
In this example, we allocate 10 nodes, 24 cores per node, for 12 hours. We allocate these resources via the qfree queue. It is not required that the project OPEN-0-0 has any available resources left. Consumed resources are still accounted for. The jobscript myjob will be executed on the first node in the allocation.
All qsub options may be [saved directly into the jobscript][1]. In such cases, it is not necessary to specify any options for qsub.
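For illustration, a minimal jobscript with the options embedded as `#PBS` directives might look like the following sketch (the project ID, queue, node count, and `ncpus` value are placeholders; adjust them for the target cluster):

```bash
#!/bin/bash
#PBS -A OPEN-0-0
#PBS -q qprod
#PBS -l select=4:ncpus=24,walltime=03:00:00

# change to the submit directory
cd $PBS_O_WORKDIR || exit

# run the calculation
./myprog.x < input > output
```

With the options stored in the jobscript, the job is then submitted simply as `qsub ./myjob`.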
......@@ -156,48 +151,11 @@ $ qsub -m n
Specific nodes may be allocated via PBS:
```console
$ qsub -A OPEN-0-0 -q qprod -l select=1:ncpus=16:host=cn171+1:ncpus=16:host=cn172 -I
```
```console
$ qsub -A OPEN-0-0 -q qprod -l select=1:ncpus=24:host=r24u35n680+1:ncpus=24:host=r24u36n681 -I
```
In the first example, we allocate on Anselm nodes cn171 and cn172, all 16 cores per node, for 24 hours. Consumed resources will be accounted to the Project identified by Project ID OPEN-0-0. The resources will be available interactively.
The second example shows similar job placement for Salomon.
### Anselm - Placement by CPU Type
Nodes equipped with an Intel Xeon E5-2665 CPU have a base clock frequency of 2.4GHz; nodes equipped with an Intel Xeon E5-2470 CPU have a base frequency of 2.3 GHz (for details, see the Compute Nodes section). Nodes may be selected via the PBS resource attribute `cpu_freq`.
#### Anselm
| CPU Type | base freq. | Nodes | cpu_freq attribute |
| ------------------ | ---------- | ---------------------- | ------------------ |
| Intel Xeon E5-2665 | 2.4GHz | cn[1-180], cn[208-209] | 24 |
| Intel Xeon E5-2470 | 2.3GHz | cn[181-207] | 23 |
```console
$ qsub -A OPEN-0-0 -q qprod -l select=4:ncpus=16:cpu_freq=24 -I
```
In this example, we allocate 4 nodes, 16 cores per node, selecting only the nodes with the Intel Xeon E5-2665 CPU.
### Anselm - Placement by IB Switch
Groups of computational nodes are connected to chassis integrated InfiniBand switches. These switches form the leaf switch layer of the [Infiniband network][3] fat tree topology. Nodes sharing the leaf switch can communicate most efficiently. Sharing the same switch prevents hops in the network and facilitates unbiased, highly efficient network communication.
Nodes sharing the same switch may be selected via the PBS resource attribute `ibswitch`. Values of this attribute are `iswXX`, where `XX` is the switch number. The node-switch mapping can be seen in the [Hardware Overview][4] section.
We recommend allocating compute nodes to a single switch when best possible computational network performance is required to run the job efficiently:
```console
$ qsub -A OPEN-0-0 -q qprod -l select=18:ncpus=16:ibswitch=isw11 ./myjob
```
In this example, we request all of the 18 nodes sharing the isw11 switch for 24 hours. A full chassis will be allocated.
In this example, we allocate on Salomon nodes r24u35n680 and r24u36n681, all 24 cores per node, for 24 hours. Consumed resources will be accounted to the Project identified by Project ID OPEN-0-0. The resources will be available interactively.
### Salomon - Placement by Network Location
......@@ -610,10 +568,9 @@ Further jobscript examples may be found in the software section and the [Capacit
[1]: #example-jobscript-for-mpi-calculation-with-preloaded-inputs
[2]: resources-allocation-policy.md
[3]: ../anselm/network.md
[4]: ../anselm/hardware-overview.md
[3]: ../salomon/network.md
[5]: ../salomon/7d-enhanced-hypercube.md
[6]: ../anselm/storage.md
[6]: ../salomon/storage.md
[7]: ../software/mpi/running_openmpi.md
[8]: ../software/mpi/running-mpich2.md
[9]: capacity-computing.md
......@@ -2,7 +2,7 @@
## Job Queue Policies
Resources are allocated to jobs in a fair-share fashion, subject to constraints set by the queue and the resources available to the project. Anselm's fair-share system ensures that individual users may consume approximately equal amounts of resources per week. Detailed information can be found in the [Job scheduling][1] section. Resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following table provides the queue partitioning overview:
Resources are allocated to jobs in a fair-share fashion, subject to constraints set by the queue and the resources available to the project. The fair-share system ensures that individual users may consume approximately equal amounts of resources per week. Detailed information can be found in the [Job scheduling][1] section. Resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following table provides the queue partitioning overview:
!!! hint
The qexp queue is configured to run one job and accept five jobs in a queue per user.
......@@ -13,23 +13,6 @@ Resources are allocated to jobs in a fair-share fashion, subject to constraints
!!! note
**The qexp queue is equipped with nodes that do not have exactly the same CPU clock speed.** Should you need the nodes to have exactly the same CPU speed, you have to select the proper nodes during the PBS job submission.
### Anselm
| queue | active project | project resources | nodes | min ncpus | priority | authorization | walltime |
| ------------------- | -------------- | -------------------- | ---------------------------------------------------- | --------- | -------- | ------------- | -------- |
| qexp | no | none required | 209 nodes | 1 | 150 | no | 1 h |
| qprod | yes | > 0 | 180 nodes w/o accelerator | 16 | 0 | no | 24/48 h |
| qlong | yes | > 0 | 180 nodes w/o accelerator | 16 | 0 | no | 72/144 h |
| qnvidia | yes | > 0 | 23 NVIDIA nodes | 16 | 200 | yes | 24/48 h |
| qfat | yes | > 0 | 2 fat nodes | 16 | 200 | yes | 24/144 h |
| qfree | yes | < 120% of allocation | 180 w/o accelerator | 16 | -1024 | no | 12 h |
* **qexp**, the Express queue: This queue is dedicated to testing and running very small jobs. It is not required to specify a project to enter the qexp. There are always 2 nodes reserved for this queue (w/o accelerators), a maximum 8 nodes are available via the qexp for a particular user, from a pool of nodes containing Nvidia accelerated nodes (cn181-203), MIC accelerated nodes (cn204-207) and Fat nodes with 512GB of RAM (cn208-209). This enables us to test and tune accelerated code and code with higher RAM requirements. The nodes may be allocated on a per core basis. No special authorization is required to use qexp. The maximum runtime in qexp is 1 hour.
* **qprod**, the Production queue: This queue is intended for normal production runs. It is required that an active project with nonzero remaining resources is specified to enter the qprod. All nodes may be accessed via the qprod queue, except the reserved ones. Included are 178 nodes without accelerators. Full nodes, 16 cores per node, are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qprod is 48 hours.
* **qlong**, the Long queue: This queue is intended for long production runs. It is required that an active project with nonzero remaining resources is specified to enter the qlong. Only 60 nodes without acceleration may be accessed via the qlong queue. Full nodes, 16 cores per node, are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qlong is 144 hours (three times that of the standard qprod time - 3 x 48 h).
* **qnvidia**, **qmic**, **qfat**, the Dedicated queues: The queue qnvidia is dedicated to accessing the NVIDIA-accelerated nodes, the qmic to accessing MIC nodes and qfat the Fat nodes. It is required that an active project with nonzero remaining resources is specified to enter these queues. Included are 23 NVIDIA, 4 mic, and 2 fat nodes. Full nodes, 16 cores per node, are allocated. The queues run with very high priority, the jobs will be scheduled before the jobs coming from the qexp queue. The PI needs to explicitly ask [support][a] for authorization to enter the dedicated queues for all users associated with their project.
* **qfree**, The Free resource queue: The queue qfree is intended for utilization of free resources, after a project has exhausted all of its allocated computational resources (Does not apply to DD projects by default; DD projects have to request permission to use qfree after exhaustion of computational resources). It is required that active project is specified to enter the queue. Consumed resources will be accounted to the Project. Access to the qfree queue is automatically removed if consumed resources exceed 120% of the resources allocated to the Project. Only 180 nodes without accelerators may be accessed from this queue. Full nodes, 16 cores per node, are allocated. The queue runs with a very low priority and no special authorization is required to use it. The maximum runtime in qfree is 12 hours.
### Salomon
| queue | active project | project resources | nodes | min ncpus | priority | authorization | walltime |
......@@ -80,16 +63,16 @@ The job wall clock time defaults to **half the maximum time**, see the table abo
Jobs that exceed the reserved wall clock time (Req'd Time) get killed automatically. The wall clock time limit can be changed for queuing jobs (state Q) using the `qalter` command; however, it cannot be changed for a running job (state R).
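For example, the walltime of a queued job may be shortened with `qalter` (a sketch only; the job ID is hypothetical):

```console
$ qalter -l walltime=12:00:00 12345
```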
Anselm users may check the current queue configuration [here][b].
You can check the current queue configuration on rsweb: [Barbora][b] or [Salomon][d].
## Queue Status
!!! tip
Check the status of jobs, queues and compute nodes [here][c].
![rspbs web interface](../img/rsweb.png)
![rspbs web interface](../img/barbora_cluster_usage.png)
Display the queue status on Anselm:
Display the queue status:
```console
$ qstat -q
......@@ -207,5 +190,6 @@ Options:
[3]: job-submission-and-execution.md
[a]: https://support.it4i.cz/rt/
[b]: https://extranet.it4i.cz/rsweb/anselm/queues
[c]: https://extranet.it4i.cz/rsweb/anselm/
[b]: https://extranet.it4i.cz/rsweb/barbora/queues
[c]: https://extranet.it4i.cz/rsweb
[d]: https://extranet.it4i.cz/rsweb/salomon/queues
......@@ -7,14 +7,6 @@ All IT4Innovations clusters are accessed by the SSH protocol via login nodes at
!!! note
The **cluster-name.it4i.cz** alias is currently not available through VPN connection. Use **loginX.cluster-name.it4i.cz** when connected to VPN.
### Anselm Cluster
| Login address | Port | Protocol | Login node |
| --------------------- | ---- | -------- | --------------------------------------|
| anselm.it4i.cz | 22 | SSH | round-robin DNS record for login[1-2] |
| login1.anselm.it4i.cz | 22 | SSH | login1 |
| login2.anselm.it4i.cz | 22 | SSH | login2 |
### Barbora Cluster
| Login address | Port | Protocol | Login node |
......@@ -37,22 +29,6 @@ All IT4Innovations clusters are accessed by the SSH protocol via login nodes at
Authentication is available by [private key][1] only. Verify SSH fingerprints during the first logon:
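One way to obtain the fingerprints for comparison is to query the host key and hash it locally (a sketch only, shown here for the Barbora address; requires a reasonably recent OpenSSH):

```console
$ ssh-keyscan barbora.it4i.cz 2>/dev/null | ssh-keygen -lf -
```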
Anselm:
```console
md5:
29:b3:f4:64:b0:73:f5:6f:a7:85:0f:e0:0d:be:76:bf (DSA)
d4:6f:5c:18:f4:3f:70:ef:bc:fc:cc:2b:fd:13:36:b7 (RSA)
1a:19:75:31:ab:53:45:53:ce:35:82:13:29:e4:0d:d5 (ECDSA)
db:c7:5e:f6:31:be:80:9f:25:79:20:60:ad:93:f4:3b (ED25519)
sha256:
LX2034TYy6Lf0Q7Zf3zOIZuFlG09DaSGROGBz6LBUy4 (DSA)
+DcED3GDoA9piuyvQOho+ltNvwB9SJSYXbB639hbejY (RSA)
2Keuu9gzrcs1K8pu7ljm2wDdUXU6f+QGGSs8pyrMM3M (ECDSA)
C2ppGEk5QyB2ov+/9yPMqXpQvv1xu2UupoHFnwsLLWs (ED25519)
```
Barbora:
```console
......@@ -125,14 +101,6 @@ Last login: Tue Jul 9 15:57:38 2013 from your-host.example.com
Data in and out of the system may be transferred by SCP and SFTP protocols.
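For example, a file may be copied to your home directory on the cluster with `scp` (a sketch only; the cluster address, username, and file name are placeholders):

```console
$ scp mydata.tar.gz username@barbora.it4i.cz:~/
```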
### Anselm Cluster
| Address | Port | Protocol |
| --------------------- | ---- | --------- |
| anselm.it4i.cz | 22 | SCP |
| login1.anselm.it4i.cz | 22 | SCP |
| login2.anselm.it4i.cz | 22 | SCP |
### Barbora Cluster
| Address | Port | Protocol |
......
docs.it4i/img/barbora_cluster_usage.png (59 KiB)
# Documentation
Welcome to the IT4Innovations documentation. The IT4Innovations National Supercomputing Center operates the [Anselm][2], [Barbora][3], and [Salomon][1] supercomputers. The supercomputers are [available][4] to the academic community within the Czech Republic and Europe, and the industrial community worldwide. The purpose of these pages is to provide comprehensive documentation of the hardware, software, and usage of the computers.
Welcome to the IT4Innovations documentation. The IT4Innovations National Supercomputing Center operates the [Barbora][3] and [Salomon][1] supercomputers. The supercomputers are [available][4] to the academic community within the Czech Republic and Europe, and the industrial community worldwide. The purpose of these pages is to provide comprehensive documentation of the hardware, software, and usage of the computers.
## How to Read the Documentation
......@@ -64,7 +64,6 @@ If you find an inconsistency or error, report it by visiting [support][b], creat
By doing so, you can save other readers from frustration and help us improve.
[1]: salomon/introduction.md
[2]: anselm/introduction.md
[3]: barbora/introduction.md
[4]: general/applying-for-resources.md
[5]: general/resources-allocation-policy.md#normalized-core-hours-nch
......
......@@ -36,7 +36,7 @@ $ qsub ... -l virt_network=true
```
!!! Warning
Available on Anselm and Salomon nodes only.
Available on Salomon nodes only.
[See Tap Interconnect][1]
......@@ -81,7 +81,7 @@ $ qsub ... -l cpu_turbo_boost=false
```
!!! Warning
Available on Anselm and Salomon nodes only.
Available on Salomon nodes only.
## Offlining CPU Cores
......
......@@ -10,7 +10,7 @@ All general [PRACE User Documentation][a] should be read before continuing readi
If you need any information, request support, or want to install additional software, use PRACE Helpdesk.
Information about the local services are provided in the [introduction of general user documentation Salomon][2] and [introduction of general user documentation Anselm][3]. Keep in mind, that standard PRACE accounts don't have a password to access the web interface of the local (IT4Innovations) request tracker and thus a new ticket should be created by sending an email to support[at]it4i.cz.
Information about the local services is provided in the [introduction of general user documentation Salomon][2] and [introduction of general user documentation Barbora][3]. Keep in mind that standard PRACE accounts don't have a password to access the web interface of the local (IT4Innovations) request tracker, and thus a new ticket should be created by sending an email to support[at]it4i.cz.
## Obtaining Login Credentials
......@@ -60,28 +60,12 @@ Salomon cluster:
$ gsissh -p 2222 salomon-prace.it4i.cz
```
Anselm cluster:
| Login address | Port | Protocol | Login node |
| --------------------------- | ---- | -------- | ---------------- |
| anselm-prace.it4i.cz | 2222 | gsissh | login1 or login2 |
| login1-prace.anselm.it4i.cz | 2222 | gsissh | login1 |
| login2-prace.anselm.it4i.cz | 2222 | gsissh | login2 |
```console
$ gsissh -p 2222 anselm-prace.it4i.cz
```
When logging in from another PRACE system, the prace_service script can be used:
```console
$ gsissh `prace_service -i -s salomon`
```
```console
$ gsissh `prace_service -i -s anselm`
```
#### Access From Public Internet:
It is recommended to use the single DNS name **cluster-name**.it4i.cz, which is distributed between the four login nodes. If needed, the user can log in directly to one of the login nodes. The addresses are:
......@@ -100,28 +84,12 @@ Salomon cluster:
$ gsissh -p 2222 salomon.it4i.cz
```
Anselm cluster:
| Login address | Port | Protocol | Login node |
| --------------------- | ---- | -------- | ---------------- |
| anselm.it4i.cz | 2222 | gsissh | login1 or login2 |
| login1.anselm.it4i.cz | 2222 | gsissh | login1 |
| login2.anselm.it4i.cz | 2222 | gsissh | login2 |
```console
$ gsissh -p 2222 anselm.it4i.cz
```
When logging in from another PRACE system, the prace_service script can be used:
```console
$ gsissh `prace_service -e -s salomon`
```
```console
$ gsissh `prace_service -e -s anselm`
```
Although the preferred and recommended file transfer mechanism is [using GridFTP][5], the GSI SSH implementation also supports SCP, so for small files transfer, gsiscp can be used:
```console
......@@ -131,13 +99,6 @@ $ gsiscp -P 2222 _LOCAL_PATH_TO_YOUR_FILE_ salomon-prace.it4i.cz:_SALOMON_PATH_T
$ gsiscp -P 2222 salomon-prace.it4i.cz:_SALOMON_PATH_TO_YOUR_FILE_ _LOCAL_PATH_TO_YOUR_FILE_
```
```console
$ gsiscp -P 2222 _LOCAL_PATH_TO_YOUR_FILE_ anselm.it4i.cz:_ANSELM_PATH_TO_YOUR_FILE_
$ gsiscp -P 2222 anselm.it4i.cz:_ANSELM_PATH_TO_YOUR_FILE_ _LOCAL_PATH_TO_YOUR_FILE_
$ gsiscp -P 2222 _LOCAL_PATH_TO_YOUR_FILE_ anselm-prace.it4i.cz:_ANSELM_PATH_TO_YOUR_FILE_
$ gsiscp -P 2222 anselm-prace.it4i.cz:_ANSELM_PATH_TO_YOUR_FILE_ _LOCAL_PATH_TO_YOUR_FILE_
```
### Access to X11 Applications (VNC)
If the user needs to run an X11-based graphical application and does not have an X11 server, the applications can be run using the VNC service. If the user is using regular SSH-based access, see this [section in general documentation][6].
......@@ -177,53 +138,24 @@ Copy files **to** Salomon by running the following commands on your local machin
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://gridftp-prace.salomon.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_SALOMON_/_PATH_TO_YOUR_FILE_
```
Anselm cluster:
| Login address | Port | Node role |
| ---------------------------- | ---- | --------------------------- |
| gridftp-prace.anselm.it4i.cz | 2812 | Front end /control server |
| login1-prace.anselm.it4i.cz | 2813 | Backend / data mover server |
| login2-prace.anselm.it4i.cz | 2813 | Backend / data mover server |
| dm1-prace.anselm.it4i.cz | 2813 | Backend / data mover server |
Copy files **to** Anselm by running the following commands on your local machine:
```console
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://gridftp-prace.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
```
Or by using the prace_service script:
```console
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://`prace_service -i -f salomon`/home/prace/_YOUR_ACCOUNT_ON_SALOMON_/_PATH_TO_YOUR_FILE_
```
```console
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://`prace_service -i -f anselm`/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
```
Copy files **from** Salomon:
```console
$ globus-url-copy gsiftp://gridftp-prace.salomon.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_SALOMON_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
```
Copy files **from** Anselm:
```console
$ globus-url-copy gsiftp://gridftp-prace.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
```
Or by using the prace_service script:
```console
$ globus-url-copy gsiftp://`prace_service -i -f salomon`/home/prace/_YOUR_ACCOUNT_ON_SALOMON_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
```
```console
$ globus-url-copy gsiftp://`prace_service -i -f anselm`/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
```
### Access From Public Internet
Salomon cluster:
......@@ -241,53 +173,24 @@ Copy files **to** Salomon by running the following commands on your local machin
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://gridftp.salomon.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_SALOMON_/_PATH_TO_YOUR_FILE_
```
Anselm cluster:
| Login address | Port | Node role |
| ---------------------- | ---- | --------------------------- |
| gridftp.anselm.it4i.cz | 2812 | Front end /control server |
| login1.anselm.it4i.cz | 2813 | Backend / data mover server |
| login2.anselm.it4i.cz | 2813 | Backend / data mover server |
| dm1.anselm.it4i.cz | 2813 | Backend / data mover server |
Copy files **to** Anselm by running the following commands on your local machine:
```console
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://gridftp.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
```
Or by using the prace_service script:
```console
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://`prace_service -e -f salomon`/home/prace/_YOUR_ACCOUNT_ON_SALOMON_/_PATH_TO_YOUR_FILE_
```
```console
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://`prace_service -e -f anselm`/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
```
Copy files **from** Salomon:
```console
$ globus-url-copy gsiftp://gridftp.salomon.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_SALOMON_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
```
Copy files **from** Anselm:
```console
$ globus-url-copy gsiftp://gridftp.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
```
Or by using the prace_service script:
```console
$ globus-url-copy gsiftp://`prace_service -e -f salomon`/home/prace/_YOUR_ACCOUNT_ON_SALOMON_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
```
```console
$ globus-url-copy gsiftp://`prace_service -e -f anselm`/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
```
Generally, both shared file systems are available through GridFTP:
| File system mount point | Filesystem | Comment |
......@@ -295,12 +198,12 @@ Generally, both shared file systems are available through GridFTP:
| /home | Lustre | Default HOME directories of users in format /home/prace/login/ |
| /scratch | Lustre | Shared SCRATCH mounted on the whole cluster |
More information about the shared file systems is available [for Salomon here][10] and [for Anselm here][11].
More information about the shared file systems on Salomon is available [here][10].
!!! hint
The `prace` directory is used for PRACE users on the SCRATCH file system.
Only Salomon cluster /scratch:
Salomon cluster /scratch:
| Data type | Default path |
| ---------------------------- | ------------------------------- |
......@@ -311,7 +214,7 @@ Only Salomon cluster /scratch:
There are some limitations for PRACE users when using the cluster. By default, PRACE users are not allowed to access special queues in PBS Pro that provide high priority or exclusive access to special equipment such as accelerated nodes and high memory (fat) nodes. There may also be restrictions on obtaining a working license for the commercial software installed on the cluster, mostly because of the license agreement or because of an insufficient number of licenses.
For production runs, always use scratch file systems. The available file systems are described [for Salomon here][10] and [for Anselm here][11].
For production runs, always use scratch file systems. The available file systems on Salomon are described [here][10].
### Software, Modules and PRACE Common Production Environment
......@@ -337,14 +240,6 @@ Salomon:
| **qprace** Production queue | yes | >0 | 1006 nodes, max 86 per job | 0 | no | 24 / 48 h |
| **qfree** Free resource queue | yes | none required | 752 nodes, max 86 per job | -1024 | no | 12 / 12 h |
Anselm:
| queue | Active project | Project resources | Nodes | priority | authorization | walltime |
| ----------------------------- | -------------- | ----------------- | ------------------- | -------- | ------------- | --------- |
| **qexp** Express queue | no | none required | 2 reserved, 8 total | high | no | 1 / 1h |
| **qprace** Production queue | yes | > 0 | 178 w/o accelerator | medium | no | 24 / 48 h |
| **qfree** Free resource queue | yes | none required | 178 w/o accelerator | very low | no | 12 / 12 h |
**qprace**, the PRACE queue, is intended for normal production runs. It is required that an active project with nonzero remaining resources is specified to enter the qprace. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qprace is 48 hours. If the job needs a longer time, it must use the checkpoint/restart functionality.
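A qprace submission might look like the following sketch (the project ID and resource selection are placeholders; on Salomon, use ncpus=24):

```console
$ qsub -A PROJECT_ID -q qprace -l select=4:ncpus=24,walltime=48:00:00 ./myjob
```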
### Accounting & Quota
......@@ -392,12 +287,11 @@ If the quota is insufficient, contact the [support][15] and request an increase.
[1]: general/obtaining-login-credentials/obtaining-login-credentials.md
[2]: salomon/introduction.md
[3]: anselm/introduction.md
[3]: barbora/introduction.md
[5]: #file-transfers
[6]: general/accessing-the-clusters/graphical-user-interface/x-window-system.md
[9]: general/shell-and-data-access.md
[10]: salomon/storage.md
[11]: anselm/storage.md
[12]: environment-and-modules.md
[13]: general/resources-allocation-policy.md
[15]: #help-and-support
......
......@@ -52,7 +52,7 @@ To run COMSOL in batch mode without the COMSOL Desktop GUI environment, utilize
#PBS -N JOB_NAME
#PBS -A PROJECT_ID
cd /scratch/work/user/$USER/ || exit # on Anselm use: /scratch/$USER
cd /scratch/work/user/$USER/ || exit
echo Time is `date`
echo Directory is `pwd`
......@@ -76,7 +76,7 @@ A working directory has to be created before sending the (comsol.pbs) job script
COMSOL is a software package for the numerical solution of partial differential equations. LiveLink for MATLAB allows connection to the COMSOL API (Application Programming Interface) with the benefits of the programming language and computing environment of MATLAB.
LiveLink for MATLAB is available in both **EDU** and **COM** **variant** of the COMSOL release. On the clusters there is 1 commercial (**COM**) and 5 educational (**EDU**) licenses of LiveLink for MATLAB (see the [ISV Licenses][3]). The following example shows how to start COMSOL model from MATLAB via LiveLink in the interactive mode (on Anselm, use 16 threads).
LiveLink for MATLAB is available in both the **EDU** and **COM** **variants** of the COMSOL release. On the clusters, there are 1 commercial (**COM**) and 5 educational (**EDU**) licenses of LiveLink for MATLAB (see the [ISV Licenses][3]). The following example shows how to start a COMSOL model from MATLAB via LiveLink in the interactive mode.
```console
$ xhost +
......@@ -97,7 +97,7 @@ To run LiveLink for MATLAB in batch mode with (comsol_matlab.pbs) job script, yo
#PBS -N JOB_NAME
#PBS -A PROJECT_ID
cd /scratch/work/user/$USER || exit # on Anselm use: /scratch/$USER
cd /scratch/work/user/$USER || exit
echo Time is `date`
echo Directory is `pwd`
......
......@@ -27,17 +27,17 @@ uid=1000(user) gid=1000(user) groups=1000(user),1234(open-0-0),7310(gaussian)
## Installed Version
Gaussian is available on Anselm, Salomon, Barbora, and DGX-2 systems in the latest version Gaussian 16 rev. c0.
| Module | CPU support | GPU support | Parallelization | Note | Anselm | Barbora | Salomon | DGX-2 |
|--------------------------------------|-------------|--------------|-----------------|---------------------|---------|---------|---------|-------|
| Gaussian/16_rev_c0-binary | AVX2 | Yes | SMP | Binary distribution | No | Yes | Yes | Yes |
| Gaussian/16_rev_c0-binary-Linda | AVX2 | Yes | SMP + Linda | Binary distribution | No | Yes | Yes | No |
| Gaussian/16_rev_c0-CascadeLake | AVX-512 | No | SMP | IT4I compiled | No | Yes | No | No |
| Gaussian/16_rev_c0-CascadeLake-Linda | AVX-512 | No | SMP + Linda | IT4I compiled | No | Yes | No | No |
| Gaussian/16_rev_c0-GPU-Linda | AVX-512 | Yes | SMP + Linda | IT4I compiled | No | Yes | No | No |
| Gaussian/16_rev_c0-GPU | AVX-512 | Yes | SMP | IT4I compiled | No | No | No | Yes |
| Gaussian/16_rev_c0-Linda | AVX | No | SMP + Linda | IT4I compiled | Yes | No | No | No |
Gaussian is available on Salomon, Barbora, and DGX-2 systems in the latest version Gaussian 16 rev. c0.
| Module | CPU support | GPU support | Parallelization | Note | Barbora | Salomon | DGX-2 |
|--------------------------------------|-------------|--------------|-----------------|---------------------|---------|---------|-------|
| Gaussian/16_rev_c0-binary | AVX2 | Yes | SMP | Binary distribution | Yes | Yes | Yes |
| Gaussian/16_rev_c0-binary-Linda | AVX2 | Yes | SMP + Linda | Binary distribution | Yes | Yes | No |
| Gaussian/16_rev_c0-CascadeLake | AVX-512 | No | SMP | IT4I compiled | Yes | No | No |
| Gaussian/16_rev_c0-CascadeLake-Linda | AVX-512 | No | SMP + Linda | IT4I compiled | Yes | No | No |
| Gaussian/16_rev_c0-GPU-Linda | AVX-512 | Yes | SMP + Linda | IT4I compiled | Yes | No | No |
| Gaussian/16_rev_c0-GPU | AVX-512 | Yes | SMP | IT4I compiled | No | No | Yes |
| Gaussian/16_rev_c0-Linda | AVX | No | SMP + Linda | IT4I compiled | No | No | No |
Speedup may be observed on Barbora and DGX-2 systems when using the `CascadeLake` and `GPU` modules compared to the `binary` module.
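To use one of the modules from the table above, list the available versions and load the desired one in the usual way, for example on Barbora:

```console
$ ml av Gaussian
$ ml Gaussian/16_rev_c0-CascadeLake
```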
......
......@@ -12,7 +12,11 @@ To run MOLPRO, you need to have a valid license token present in `HOME/.molpro/t
## Installed Version
On Anselm, the current version is 2010.1, patch level 45, parallel version compiled with Intel compilers and Intel MPI.
For the current list of installed versions, use:
```console
$ ml av Molpro
```
Compilation parameters are default:
......
......@@ -8,18 +8,6 @@ Allinea DDT is a commercial debugger primarily for debugging parallel MPI or Ope
Allinea MAP is a profiler for C/C++/Fortran HPC codes. It is designed for profiling parallel code, which uses pthreads, OpenMP, or MPI.
## License and Limitations for Anselm Users
On Anselm users can debug OpenMP or MPI code that runs up to 64 parallel processes. In case of debugging GPU or Xeon Phi accelerated codes, the limit is 8 accelerators. This limitation means that:
* 1 user can debug up to 64 processes, or
* 32 users can debug 2 processes, etc.
In case of debugging on accelerators:
* 1 user can debug on up to 8 accelerators, or
* 8 users can debug on single accelerator.
## Compiling Code to Run With DDT
### Modules
......
......@@ -25,7 +25,7 @@ Currently, there are two versions of CUBE 4.2.3 available as [modules][1]:
## Usage
CUBE is a graphical application. Refer to the Graphical User Interface documentation for a list of methods to launch graphical applications on Anselm.
CUBE is a graphical application. Refer to the Graphical User Interface documentation for a list of methods to launch graphical applications on the clusters.
!!! note
Analyzing large data sets can consume a large amount of CPU and RAM. Do not perform large analyses on login nodes.
......
......@@ -61,7 +61,7 @@ The `pcm-msr.x` command can be used to read/write model specific registers of th
### PCM-Numa
NUMA monitoring utility does not work on Anselm.
Monitors local and remote memory accesses.
### PCM-Pcie
......@@ -186,15 +186,12 @@ Sample output:
### PCM-Sensor
Can be used as a sensor for the ksysguard GUI, which is currently not installed on Anselm.
Can be used as a sensor for the ksysguard GUI.
## API
In a similar fashion to PAPI, PCM provides a C++ API to access the performance counters from within your application. Refer to the [Doxygen documentation][a] for details of the API.
!!! note
Due to security limitations, using PCM API to monitor your applications is currently not possible on Anselm. (The application must be run as root user)
Sample program using the API:
```cpp
......
......@@ -2,7 +2,7 @@
## Introduction
We provide state of the art programs and tools to develop, profile, and debug HPC codes at IT4Innovations. On these pages, we provide an overview of the profiling and debugging tools available on Anslem at IT4I.
We provide state of the art programs and tools to develop, profile, and debug HPC codes at IT4Innovations. In this section, we provide an overview of the profiling and debugging tools available on IT4I clusters.
## Intel Debugger
......
......@@ -8,10 +8,11 @@ Scalasca supports profiling of MPI, OpenMP and hybrid MPI+OpenMP applications.
## Installed Versions
There are currently two versions of Scalasca 2.0 [modules][1] installed on Anselm:
For the current list of installed versions, use:
* scalasca2/2.0-gcc-openmpi, for usage with [GNU Compiler][2] and [OpenMPI][3],
* scalasca2/2.0-icc-impi, for usage with [Intel Compiler][2] and [Intel MPI][4].
```console
$ ml av Scalasca
```
## Usage
......
......@@ -8,10 +8,11 @@ Score-P can be used as an instrumentation tool for [Scalasca][1].
## Installed Versions
There are currently two versions of Score-P version 1.2.6 [modules][2] installed on Anselm:
For the current list of installed versions, use:
* scorep/1.2.3-gcc-openmpi, for usage with [GNU Compiler][3] and [OpenMPI][4]
* scorep/1.2.3-icc-impi, for usage with [Intel Compiler][3] and [Intel MPI][5]
```console
$ ml av Score-P
```
## Instrumentation
......
......@@ -6,16 +6,11 @@ TotalView is a GUI-based source code multi-process, multi-thread debugger.
## License and Limitations for Cluster Users
On the cluster, users can debug OpenMP or MPI code that runs up to 64 parallel processes. This limitation means that:
```console
1 user can debug up 64 processes, or
32 users can debug 2 processes, etc.
```
On the cluster, users can debug OpenMP or MPI code that runs up to 64 parallel processes. This limitation means that 1 user can debug up to 64 processes, or 32 users can debug 2 processes, etc.
Debugging of GPU accelerated codes is also supported.
You can check the status of the licenses [Salomon][a] or [Anselm][b]:
You can check the status of the licenses for [Salomon][a] or [Barbora][b]:
```console
$ cat /apps/user/licenses/totalview_features_state.txt
......@@ -70,9 +65,9 @@ Be sure to log in with an X window forwarding enabled. This could mean using the
ssh -X username@salomon.it4i.cz
```
Another option is to access the login node using VNC. See the detailed information on how to use the GUI on Anselm.
Another option is to access the login node using VNC.
From the login node an interactive session with X windows forwarding (`-X` option) can be started by the following command (for Anselm use 16 threads):
From the login node an interactive session with X windows forwarding (`-X` option) can be started by the following command:
```console
$ qsub -I -X -A NONE-0-0 -q qexp -lselect=1:ncpus=24:mpiprocs=24,walltime=01:00:00
......@@ -119,8 +114,6 @@ The source code of this function can be also found in
```console
$ /apps/all/OpenMPI/1.10.1-GNU-4.9.3-2.25/etc/openmpi-totalview.tcl #Salomon
$ /apps/mpi/openmpi/intel/1.6.5/etc/openmpi-totalview.tcl #Anselm
```
You can also add only the following line to your ~/.tvdrc file instead of
......@@ -128,8 +121,6 @@ the entire function:
```console
$ source /apps/all/OpenMPI/1.10.1-GNU-4.9.3-2.25/etc/openmpi-totalview.tcl #Salomon
$ source /apps/mpi/openmpi/intel/1.6.5/etc/openmpi-totalview.tcl #Anselm
```
You need to do this step only once. See also [OpenMPI FAQ entry][c].
......@@ -169,6 +160,6 @@ More information regarding the command line parameters of the TotalView can be f
[1] The [TotalView documentation][d] web page is a good source for learning more about some of the advanced TotalView features.
[a]: https://extranet.it4i.cz/rsweb/salomon/license/Totalview
[b]: https://extranet.it4i.cz/rsweb/anselm/license/Totalview
[b]: https://extranet.it4i.cz/rsweb/barbora/license/Totalview
[c]: https://www.open-mpi.org/faq/?category=running#run-with-tv
[d]: http://www.roguewave.com/support/product-documentation/totalview-family.aspx#totalview
......@@ -17,11 +17,6 @@ The main tools available in Valgrind are:
## Installed Versions
There are two versions of Valgrind available on Anselm.
* Version 3.6.0, installed by operating system vendor in /usr/bin/valgrind. This version is available by default, without the need to load any module. This version however does not provide additional MPI support.
* Version 3.9.0 with support for Intel MPI, available in the `valgrind/3.9.0-impi` [module][1]. After loading the module, this version replaces the default Valgrind.
There are two versions of Valgrind available on Salomon.
* Version 3.8.1, installed by the operating system vendor in /usr/bin/valgrind. This version is available by default, without the need to load any module. However, this version does not provide additional MPI support. Also, it does not support AVX2 instructions; debugging of an AVX2-enabled executable with this version will fail.
......@@ -162,7 +157,6 @@ The default version without MPI support will however report a large number of fa
So it is better to use the MPI-enabled Valgrind from the module. The MPI version requires the library:
* Anselm: /apps/tools/valgrind/3.9.0/impi/lib/valgrind/libmpiwrap-amd64-linux.so
* Salomon: $EBROOTVALGRIND/lib/valgrind/libmpiwrap-amd64-linux.so
which must be included in the `LD_PRELOAD` environment variable.
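One possible invocation on Salomon, exporting the wrapper to a 4-rank OpenMPI run, is sketched below (the module name, rank count, and binary name are placeholders; with other MPI implementations, export `LD_PRELOAD` in the environment instead of using `-x`):

```console
$ ml Valgrind
$ mpirun -np 4 -x LD_PRELOAD=$EBROOTVALGRIND/lib/valgrind/libmpiwrap-amd64-linux.so valgrind ./my_mpi_program
```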
......