Commit ecd98f69 authored by Josef Hrabal

Merge branch 'anselm-revision' into 'master'

Anselm revision

See merge request !188
parents 4dfc7ae8 693c7ce5
@@ -21,7 +21,7 @@ Anselm is cluster of x86-64 Intel based nodes built on Bull Extreme Computing bu

* two Intel Sandy Bridge E5-2470, 8-core, 2.3GHz processors per node
* 96 GB of physical memory per node
* one 500GB SATA 2,5” 7,2 krpm HDD per node
-* GPU accelerator 1x NVIDIA Tesla Kepler K20 per node
+* GPU accelerator 1x NVIDIA Tesla Kepler K20m per node
* bullx B515 blade servers
* cn[181-203]
@@ -52,12 +52,12 @@ Anselm is cluster of x86-64 Intel based nodes built on Bull Extreme Computing bu

### Compute Nodes Summary

| Node type | Count | Range | Memory | Cores | [Access](resources-allocation-policy/) |
| -------------------------- | ----- | ----------- | ------ | ----------- | -------------------------------------- |
-| Nodes without accelerator | 180 | cn[1-180] | 64GB | 16 @ 2.4GHz | qexp, qprod, qlong, qfree |
+| Nodes without accelerator | 180 | cn[1-180] | 64GB | 16 @ 2.4GHz | qexp, qprod, qlong, qfree, qatlas, qprace |
-| Nodes with GPU accelerator | 23 | cn[181-203] | 96GB | 16 @ 2.3GHz | qgpu, qexp |
+| Nodes with GPU accelerator | 23 | cn[181-203] | 96GB | 16 @ 2.3GHz | qnvidia, qexp, qatlas |
| Nodes with MIC accelerator | 4 | cn[204-207] | 96GB | 16 @ 2.3GHz | qmic, qexp |
| Fat compute nodes | 2 | cn[208-209] | 512GB | 16 @ 2.4GHz | qfat, qexp |

## Processor Architecture
@@ -42,13 +42,13 @@ The modules may be loaded, unloaded and switched, according to momentary needs.

To check available modules use

```console
-$ module avail **or** ml av
+$ ml av
```

To load a module, for example the octave module use

```console
-$ module load octave **or** ml octave
+$ ml octave
```

loading the octave module will set up paths and environment variables of your active shell such that you are ready to run the octave software
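To see what loading a module would change before actually loading it, the show subcommand can be used (a small illustration, assuming the Lmod `ml` wrapper used above; the variables printed depend on the particular module):

```console
$ ml show octave
```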
@@ -56,19 +56,13 @@ loading the octave module will set up paths and environment variables of your ac

To check loaded modules use

```console
-$ module list **or** ml
+$ ml
```

To unload a module, for example the octave module use

```console
-$ module unload octave **or** ml -octave
+$ ml -octave
```

-Learn more on modules by reading the module man page
-
-```console
-$ man module
-```

Following modules set up the development environment
@@ -79,10 +73,6 @@ PrgEnv-intel sets up the INTEL development environment in conjunction with the I

## Application Modules Path Expansion

-All application modules on Salomon cluster (and further) will be build using tool called [EasyBuild](http://hpcugent.github.io/easybuild/ "EasyBuild"). In case that you want to use some applications that are build by EasyBuild already, you have to modify your MODULEPATH environment variable.
+All application modules on Anselm cluster (and further) will be build using tool called [EasyBuild](http://hpcugent.github.io/easybuild/ "EasyBuild").

-```console
-export MODULEPATH=$MODULEPATH:/apps/easybuild/modules/all/
-```

This command expands your searched paths to modules. You can also add this command to the .bashrc file to expand paths permanently. After this command, you can use same commands to list/add/remove modules as is described above.
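If the module path still needs to be expanded manually, the export shown in the removed block above can be appended to .bashrc to make it permanent, as the paragraph suggests. A minimal sketch, reusing the EasyBuild path from the removed line (the path may differ on the current system):

```console
$ echo 'export MODULEPATH=$MODULEPATH:/apps/easybuild/modules/all/' >> ~/.bashrc
```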
# Hardware Overview

-The Anselm cluster consists of 209 computational nodes named cn[1-209] of which 180 are regular compute nodes, 23 GPU Kepler K20 accelerated nodes, 4 MIC Xeon Phi 5110P accelerated nodes and 2 fat nodes. Each node is a powerful x86-64 computer, equipped with 16 cores (two eight-core Intel Sandy Bridge processors), at least 64 GB RAM, and local hard drive. The user access to the Anselm cluster is provided by two login nodes login[1,2]. The nodes are interlinked by high speed InfiniBand and Ethernet networks. All nodes share 320 TB /home disk storage to store the user files. The 146 TB shared /scratch storage is available for the scratch data.
+The Anselm cluster consists of 209 computational nodes named cn[1-209] of which 180 are regular compute nodes, 23 GPU Kepler K20m accelerated nodes, 4 MIC Xeon Phi 5110P accelerated nodes and 2 fat nodes. Each node is a powerful x86-64 computer, equipped with 16 cores (two eight-core Intel Sandy Bridge processors), at least 64 GB RAM, and local hard drive. The user access to the Anselm cluster is provided by two login nodes login[1,2]. The nodes are interlinked by high speed InfiniBand and Ethernet networks. All nodes share 320 TB /home disk storage to store the user files. The 146 TB shared /scratch storage is available for the scratch data.

-The Fat nodes are equipped with large amount (512 GB) of memory. Virtualization infrastructure provides resources to run long term servers and services in virtual mode. Fat nodes and virtual servers may access 45 TB of dedicated block storage. Accelerated nodes, fat nodes, and virtualization infrastructure are available [upon request](https://support.it4i.cz/rt) made by a PI.
+The Fat nodes are equipped with large amount (512 GB) of memory. Fat nodes may access 45 TB of dedicated block storage. Accelerated nodes, fat nodes are available [upon request](https://support.it4i.cz/rt) made by a PI.

Schematic representation of the Anselm cluster. Each box represents a node (computer) or storage capacity:
@@ -13,7 +13,7 @@ The cluster compute nodes cn[1-207] are organized within 13 chassis.

There are four types of compute nodes:

* 180 compute nodes without the accelerator
-* 23 compute nodes with GPU accelerator - equipped with NVIDIA Tesla Kepler K20
+* 23 compute nodes with GPU accelerator - equipped with NVIDIA Tesla Kepler K20m
* 4 compute nodes with MIC accelerator - equipped with Intel Xeon Phi 5110P
* 2 fat nodes - equipped with 512 GB RAM and two 100 GB SSD drives
@@ -34,7 +34,7 @@ The parameters are summarized in the following tables:

| ------------------------------------------- | -------------------------------------------- |
| Primary purpose | High Performance Computing |
| Architecture of compute nodes | x86-64 |
-| Operating system | Linux |
+| Operating system | Linux (CentOS) |
| [**Compute nodes**](compute-nodes/) | |
| Totally | 209 |
| Processor cores | 16 (2 x 8 cores) |
@@ -53,7 +53,7 @@ The parameters are summarized in the following tables:

| Node | Processor | Memory | Accelerator |
| ---------------- | --------------------------------------- | ------ | -------------------- |
| w/o accelerator | 2 x Intel Sandy Bridge E5-2665, 2.4 GHz | 64 GB | - |
-| GPU accelerated | 2 x Intel Sandy Bridge E5-2470, 2.3 GHz | 96 GB | NVIDIA Kepler K20 |
+| GPU accelerated | 2 x Intel Sandy Bridge E5-2470, 2.3 GHz | 96 GB | NVIDIA Kepler K20m |
| MIC accelerated | 2 x Intel Sandy Bridge E5-2470, 2.3 GHz | 96 GB | Intel Xeon Phi 5110P |
| Fat compute node | 2 x Intel Sandy Bridge E5-2665, 2.4 GHz | 512 GB | - |
@@ -324,10 +324,10 @@ cp $PBS_O_WORKDIR/input .

cp $PBS_O_WORKDIR/mympiprog.x .

# load the mpi module
-module load openmpi
+ml OpenMPI

# execute the calculation
-mpiexec -pernode ./mympiprog.x
+mpirun -pernode ./mympiprog.x

# copy output file to home
cp output $PBS_O_WORKDIR/.
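A script of this kind is handed over to the PBS batch system with qsub. A minimal sketch, assuming the script is saved as myjob.sh; the project ID, queue, and node count are placeholders to adjust to your allocation:

```console
$ qsub -A PROJECT_ID -q qprod -l select=4:ncpus=16 ./myjob.sh
```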
@@ -362,10 +362,10 @@ SCRDIR=/scratch/$USER/myjob

cd $SCRDIR || exit

# load the mpi module
-module load openmpi
+ml OpenMPI

# execute the calculation
-mpiexec ./mympiprog.x
+mpirun ./mympiprog.x

#exit
exit
@@ -210,7 +210,7 @@ All system wide installed software on the cluster is made available to the users

PRACE users can use the "prace" module to use the [PRACE Common Production Environment](http://www.prace-ri.eu/prace-common-production-environment/).

```console
-$ module load prace
+$ ml prace
```
### Resource Allocation and Job Execution
@@ -30,13 +30,13 @@ Private key authentication:

On **Linux** or **Mac**, use

```console
-local $ ssh -i /path/to/id_rsa username@anselm.it4i.cz
+$ ssh -i /path/to/id_rsa username@anselm.it4i.cz
```

If you see warning message "UNPROTECTED PRIVATE KEY FILE!", use this command to set lower permissions to private key file.

```console
-local $ chmod 600 /path/to/id_rsa
+$ chmod 600 /path/to/id_rsa
```

On **Windows**, use [PuTTY ssh client](../general/accessing-the-clusters/shell-access-and-data-transfer/putty.md).
@@ -89,23 +89,23 @@ To achieve 160MB/s transfer rates, the end user must be connected by 10G line al

On linux or Mac, use scp or sftp client to transfer the data to Anselm:

```console
-local $ scp -i /path/to/id_rsa my-local-file username@anselm.it4i.cz:directory/file
+$ scp -i /path/to/id_rsa my-local-file username@anselm.it4i.cz:directory/file
```

```console
-local $ scp -i /path/to/id_rsa -r my-local-dir username@anselm.it4i.cz:directory
+$ scp -i /path/to/id_rsa -r my-local-dir username@anselm.it4i.cz:directory
```

or

```console
-local $ sftp -o IdentityFile=/path/to/id_rsa username@anselm.it4i.cz
+$ sftp -o IdentityFile=/path/to/id_rsa username@anselm.it4i.cz
```

Very convenient way to transfer files in and out of the Anselm computer is via the fuse filesystem [sshfs](http://linux.die.net/man/1/sshfs)

```console
-local $ sshfs -o IdentityFile=/path/to/id_rsa username@anselm.it4i.cz:. mountpoint
+$ sshfs -o IdentityFile=/path/to/id_rsa username@anselm.it4i.cz:. mountpoint
```

Using sshfs, the users Anselm home directory will be mounted on your local computer, just like an external disk.
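When the mounted directory is no longer needed, detach it again on the local machine (an illustration; fusermount is the usual FUSE helper on Linux, other systems use their native umount):

```console
$ fusermount -u mountpoint
```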
@@ -150,7 +150,7 @@ It works by tunneling the connection from Anselm back to users workstation and f

Pick some unused port on Anselm login node (for example 6000) and establish the port forwarding:

```console
-local $ ssh -R 6000:remote.host.com:1234 anselm.it4i.cz
+$ ssh -R 6000:remote.host.com:1234 anselm.it4i.cz
```

In this example, we establish port forwarding between port 6000 on Anselm and port 1234 on the remote.host.com. By accessing localhost:6000 on Anselm, an application will see response of remote.host.com:1234. The traffic will run via users local workstation.
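A quick way to check the tunnel from the Anselm side is to point any client at the forwarded port (an illustration only, assuming the service on remote.host.com:1234 happens to speak HTTP; use whatever client matches your application):

```console
$ curl http://localhost:6000/
```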
@@ -190,7 +190,7 @@ Port forwarding is static, each single port is mapped to a particular port on re

To establish local proxy server on your workstation, install and run SOCKS proxy server software. On Linux, sshd demon provides the functionality. To establish SOCKS proxy server listening on port 1080 run:

```console
-local $ ssh -D 1080 localhost
+$ ssh -D 1080 localhost
```

On Windows, install and run the free, open source [Sock Puppet](http://sockspuppet.com/) server.
@@ -198,7 +198,7 @@ On Windows, install and run the free, open source [Sock Puppet](http://sockspupp

Once the proxy server is running, establish ssh port forwarding from Anselm to the proxy server, port 1080, exactly as [described above](#port-forwarding-from-login-nodes).

```console
-local $ ssh -R 6000:localhost:1080 anselm.it4i.cz
+$ ssh -R 6000:localhost:1080 anselm.it4i.cz
```

Now, configure the applications proxy settings to **localhost:6000**. Use port forwarding to access the [proxy server from compute nodes](#port-forwarding-from-compute-nodes) as well.
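Command-line applications on Anselm can then use the forwarded SOCKS port directly (a hedged illustration using curl's --socks5 option; other tools have their own proxy settings):

```console
$ curl --socks5 localhost:6000 http://example.com/
```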
@@ -10,9 +10,7 @@ Please don't use shared filesystems as a backup for large amount of data or long

Anselm computer provides two main shared filesystems, the [HOME filesystem](#home) and the [SCRATCH filesystem](#scratch). Both HOME and SCRATCH filesystems are realized as a parallel Lustre filesystem. Both shared file systems are accessible via the Infiniband network. Extended ACLs are provided on both Lustre filesystems for the purpose of sharing data with other users using fine-grained control.

-### Understanding the Lustre Filesystems
-
-(source <http://www.nas.nasa.gov>)
+### [Understanding the Lustre Filesystems](http://www.nas.nasa.gov)

A user file on the Lustre filesystem can be divided into multiple chunks (stripes) and stored across a subset of the object storage targets (OSTs) (disks). The stripes are distributed among the OSTs in a round-robin fashion to ensure load balancing.
@@ -72,7 +70,7 @@ Another good practice is to make the stripe count be an integral factor of the n

Large stripe size allows each client to have exclusive access to its own part of a file. However, it can be counterproductive in some cases if it does not match your I/O pattern. The choice of stripe size has no effect on a single-stripe file.

-Read more on <http://doc.lustre.org/lustre_manual.xhtml#managingstripingfreespace>
+Read more on [here](http://doc.lustre.org/lustre_manual.xhtml#managingstripingfreespace)
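Striping is inspected and adjusted with the lfs utility (a brief sketch; the directory and the -c/-S values are illustrative and should be chosen to match your I/O pattern, and option spelling can differ slightly between Lustre versions):

```console
$ lfs getstripe /scratch/$USER/myjob
$ lfs setstripe -c 4 -S 1m /scratch/$USER/myjob
```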
### Lustre on Anselm
@@ -56,7 +56,6 @@ pages:

- Compute Nodes: anselm/compute-nodes.md
- Storage: anselm/storage.md
- Network: anselm/network.md
-- Remote Visualization: anselm/remote-visualization.md
- PRACE User Support: anselm/prace.md
- Software:
- Modules: