Shell access and data transfer
==============================
Interactive Login
-----------------
The Salomon cluster is accessed by SSH protocol via login nodes login1, login2, login3 and login4 at address salomon.it4i.cz. The login nodes may be addressed specifically, by prepending the login node name to the address.
>The alias salomon.it4i.cz is currently not available through VPN connection. Please use loginX.salomon.it4i.cz when connected to VPN.
|Login address|Port|Protocol|Login node|
|---|---|---|---|
|salomon.it4i.cz|22|ssh|round-robin DNS record for login[1-4]|
|login1.salomon.it4i.cz|22|ssh|login1|
|login2.salomon.it4i.cz|22|ssh|login2|
|login3.salomon.it4i.cz|22|ssh|login3|
|login4.salomon.it4i.cz|22|ssh|login4|
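For example, to log in to a specific login node, prepend its name to the address (a minimal sketch; substitute your own username):

```bash
local $ ssh username@login2.salomon.it4i.cz
```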
The authentication is by the [private key](../get-started-with-it4innovations/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys.html).
>Please verify SSH fingerprints during the first logon. They are identical on all login nodes:
f6:28:98:e4:f9:b2:a6:8f:f2:f4:2d:0a:09:67:69:80 (DSA)
70:01:c9:9a:5d:88:91:c7:1b:c0:84:d1:fa:4e:83:5c (RSA)
Private key authentication:
On **Linux** or **Mac**, use
```bash
local $ ssh -i /path/to/id_rsa username@salomon.it4i.cz
```
If you see the warning message "UNPROTECTED PRIVATE KEY FILE!", use this command to restrict the permissions on the private key file.
```bash
local $ chmod 600 /path/to/id_rsa
```
On **Windows**, use the [PuTTY ssh client](../get-started-with-it4innovations/accessing-the-clusters/shell-access-and-data-transfer/putty/putty.html).
After logging in, you will see the command prompt:
```bash
                    _____       _
                   / ____|     | |
                  | (___   __ _| | ___  _ __ ___   ___  _ __
                   \___ \ / _` | |/ _ \| '_ ` _ \ / _ \| '_ \
                   ____) | (_| | | (_) | | | | | | (_) | | | |
                  |_____/ \__,_|_|\___/|_| |_| |_|\___/|_| |_|

                        http://www.it4i.cz/?lang=en

Last login: Tue Jul  9 15:57:38 2013 from your-host.example.com
[username@login2.salomon ~]$
```

>The environment is **not** shared between login nodes, except for [shared filesystems](storage/storage.html).

Data Transfer
-------------

Data in and out of the system may be transferred by the [scp](http://en.wikipedia.org/wiki/Secure_copy) and sftp protocols.

In case large volumes of data are transferred, use dedicated data mover nodes cedge[1-3].salomon.it4i.cz for increased performance.
|Address|Port|Protocol|
|---|---|---|
|salomon.it4i.cz|22|scp, sftp|
|login1.salomon.it4i.cz|22|scp, sftp|
|login2.salomon.it4i.cz|22|scp, sftp|
|login3.salomon.it4i.cz|22|scp, sftp|
|login4.salomon.it4i.cz|22|scp, sftp|
The authentication is by the [private key](../get-started-with-it4innovations/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys.html).
On Linux or Mac, use an scp or sftp client to transfer the data to Salomon:
```bash
local $ scp -i /path/to/id_rsa my-local-file username@salomon.it4i.cz:directory/file
```
```bash
local $ scp -i /path/to/id_rsa -r my-local-dir username@salomon.it4i.cz:directory
```
or
```bash
local $ sftp -o IdentityFile=/path/to/id_rsa username@salomon.it4i.cz
```
A very convenient way to transfer files in and out of Salomon is via the FUSE filesystem [sshfs](http://linux.die.net/man/1/sshfs).
```bash
local $ sshfs -o IdentityFile=/path/to/id_rsa username@salomon.it4i.cz:. mountpoint
```
Using sshfs, the user's Salomon home directory will be mounted on your local computer, just like an external disk.
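When you are done, the mounted directory can be detached again on your local Linux machine (a sketch using the standard FUSE unmount command):

```bash
local $ fusermount -u mountpoint
```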
Learn more about ssh, scp and sshfs by reading the manpages
```bash
$ man ssh
$ man scp
$ man sshfs
```
On Windows, use [WinSCP client](http://winscp.net/eng/download.php) to transfer the data. The [win-sshfs client](http://code.google.com/p/win-sshfs/) provides a way to mount the Salomon filesystems directly as an external disc.
More information about the shared file systems is available [here](storage/storage.html).
Outgoing connections
====================
Connection restrictions
-----------------------
Outgoing connections from Salomon Cluster login nodes to the outside world are restricted to the following ports:
|Port|Protocol|
|---|---|
|22|ssh|
|80|http|
|443|https|
|9418|git|
>Please use **ssh port forwarding** and proxy servers to connect from Salomon to all other remote ports.
Outgoing connections from Salomon Cluster compute nodes are restricted to the internal network. Direct connections from compute nodes to the outside world are cut.
Port forwarding
---------------
### Port forwarding from login nodes
>Port forwarding allows an application running on Salomon to connect to an arbitrary remote host and port.
It works by tunneling the connection from Salomon back to the user's workstation and forwarding from the workstation to the remote host.
Pick some unused port on the Salomon login node (for example 6000) and establish the port forwarding:
```bash
local $ ssh -R 6000:remote.host.com:1234 salomon.it4i.cz
```
In this example, we establish port forwarding between port 6000 on Salomon and port 1234 on remote.host.com. By accessing localhost:6000 on Salomon, an application will see the response of remote.host.com:1234. The traffic will run via the user's local workstation.
Port forwarding may be done **using PuTTY** as well. On the PuTTY Configuration screen, load your Salomon configuration first. Then go to Connection->SSH->Tunnels to set up the port forwarding. Click the Remote radio button. Insert 6000 into the Source port textbox and remote.host.com:1234 into the Destination textbox. Click the Add button, then Open.
Port forwarding may be established directly to the remote host. However, this requires that the user has SSH access to remote.host.com.
```bash
$ ssh -L 6000:localhost:1234 remote.host.com
```
Note: Port number 6000 is chosen as an example only. Pick any free port.
### Port forwarding from compute nodes
Remote port forwarding from compute nodes allows applications running on the compute nodes to access hosts outside Salomon Cluster.
First, establish the remote port forwarding from the login node, as [described above](outgoing-connections.html#port-forwarding-from-login-nodes).
Second, invoke port forwarding from the compute node to the login node. Insert the following line into your jobscript or interactive shell:
```bash
$ ssh -TN -f -L 6000:localhost:6000 login1
```
In this example, we assume that port forwarding from login1:6000 to remote.host.com:1234 has been established beforehand. By accessing localhost:6000, an application running on a compute node will see the response of remote.host.com:1234.
### Using proxy servers
Port forwarding is static: each port is mapped to a particular port on the remote host. Connecting to another remote host requires a new forward.
>Applications with built-in proxy support have unlimited access to remote hosts via a single proxy server.
To establish a local proxy server on your workstation, install and run SOCKS proxy server software. On Linux, the sshd daemon provides this functionality. To establish a SOCKS proxy server listening on port 1080, run:
```bash
local $ ssh -D 1080 localhost
```
On Windows, install and run the free, open source [Sock Puppet](http://sockspuppet.com/) server.
Once the proxy server is running, establish ssh port forwarding from Salomon to the proxy server, port 1080, exactly as [described above](outgoing-connections.html#port-forwarding-from-login-nodes).
```bash
local $ ssh -R 6000:localhost:1080 salomon.it4i.cz
```

Now, configure the application's proxy settings to **localhost:6000**. Use port forwarding to access the [proxy server from compute nodes](outgoing-connections.html#port-forwarding-from-compute-nodes) as well.
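For example, a command-line tool with SOCKS support can be pointed at the forwarded port like this (a sketch; assumes curl is available on the node, and the target URL is illustrative):

```bash
$ curl --socks5 localhost:6000 http://example.com/
```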
VPN Access
==========
Accessing IT4Innovations internal resources via VPN
---------------------------------------------------
To use resources and licenses located on the IT4Innovations local network, it is necessary to connect to this network via VPN. We use the Cisco AnyConnect Secure Mobility Client, which is supported on the following operating systems:
- Windows XP
- Windows Vista
- Windows 7
- Windows 8
- Linux
- MacOS
It is impossible to connect to VPN from other operating systems.
VPN client installation
------------------------------------
You can install the VPN client from the web interface after a successful login with LDAP credentials at <https://vpn.it4i.cz/user>
![](vpn_web_login.png)
Depending on your Java settings, after login the client either installs automatically or downloads an installation file for your operating system. For automatic installation, it is necessary to allow the installation tool to start.
![](vpn_web_login_2.png)
![](vpn_web_install_2.png)
![](copy_of_vpn_web_install_3.png)
After successful installation, the VPN connection will be established and you can use the available resources from the IT4I network.
![](vpn_web_install_4.png)
If your Java settings don't allow automatic installation, you can download the installation file and install the VPN client manually.
![](vpn_web_download.png)
After you click the link, the download of the installation file will start.
![](vpn_web_download_2.png)
After successfully downloading the installation file, you have to execute it with administrator's rights and install the VPN client manually.
Working with VPN client
-----------------------
You can use either the graphical user interface or the command line interface to run the VPN client on all supported operating systems. We suggest using the GUI.
Before the first login to the VPN, you have to enter the URL **[https://vpn.it4i.cz/user](https://vpn.it4i.cz/user)** into the text field.
![](vpn_contacting_https_cluster.png)
After you click the Connect button, you must fill in your login credentials.
![](vpn_contacting_https.png)
After a successful login, the client will minimize to the system tray. If everything works, you can see a lock in the Cisco tray icon.
![](anyconnecticon.jpg)
If you right-click on this icon, you will see a context menu in which you can control the VPN connection.
![](anyconnectcontextmenu.jpg)
When you connect to the VPN for the first time, the client downloads the profile and creates a new item "IT4I cluster" in the connection list. For subsequent connections, it is not necessary to re-enter the URL address, but just select the corresponding item.
![](vpn_contacting.png)
Then AnyConnect automatically proceeds as in the case of the first logon.
![](vpn_login.png)
After a successful logon, you can see a green circle with a tick mark on the lock icon.
![](vpn_successfull_connection.png)
For disconnecting, right-click on the AnyConnect client icon in the system tray and select **VPN Disconnect**.
Environment and Modules
=======================
### Environment Customization
After logging in, you may want to configure the environment. Write your preferred path definitions, aliases, functions and module loads in the .bashrc file:
```bash
# ./bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
  . /etc/bashrc
fi

# Display information to standard output - only in interactive ssh sessions
if [ -n "$SSH_TTY" ]
then
  module list # Display loaded modules
fi
```
>Do not run commands outputting to standard output (echo, module list, etc.) in .bashrc for non-interactive SSH sessions. It breaks fundamental functionality (scp, PBS) of your account! Guard such commands with a check for SSH session interactivity, as shown in the example above.
### Application Modules
In order to configure your shell for running a particular application on Salomon, we use the Module package interface.
Application modules on the Salomon cluster are built using [EasyBuild](http://hpcugent.github.io/easybuild/ "EasyBuild"). The modules are divided into the following structure:
```bash
base: Default module class
bio: Bioinformatics, biology and biomedical
cae: Computer Aided Engineering (incl. CFD)
...
toolchain: EasyBuild toolchains
tools: General purpose tools
vis: Visualization, plotting, documentation and typesetting
```
>The modules set up the application paths, library paths and environment variables for running a particular application.
The modules may be loaded, unloaded and switched, according to momentary needs.
To check available modules use
```bash
$ module avail
```
To load a module, for example the OpenMPI module use
```bash
$ module load OpenMPI
```
Loading the OpenMPI module will set up the paths and environment variables of your active shell so that you are ready to run the OpenMPI software.
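To inspect exactly which paths and environment variables a module defines, without loading it, you can use the module show subcommand (a minimal sketch, reusing the OpenMPI module from the example above):

```bash
$ module show OpenMPI
```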
To check loaded modules use
```bash
$ module list
```
To unload a module, for example the OpenMPI module use
```bash
$ module unload OpenMPI
```
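To replace a loaded module with a different version in a single step, module swap may be used (a sketch; the version string below is illustrative, check `module avail OpenMPI` for the versions actually installed):

```bash
$ module swap OpenMPI OpenMPI/1.8.8-GNU-4.9.3-2.25
```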
Learn more about modules by reading the module man page
```bash
$ man module
```
### EasyBuild Toolchains
As we wrote earlier, we are using EasyBuild for automated software installation and module creation.
EasyBuild employs so-called **compiler toolchains**, or simply toolchains for short, which are a major concept in handling the build and installation processes.
A typical toolchain consists of one or more compilers, usually put together with some libraries for specific functionality, e.g., for using an MPI stack for distributed computing, or which provide optimized routines for commonly used math operations, e.g., the well-known BLAS/LAPACK APIs for linear algebra routines.
For each software package being built, the toolchain to be used must be specified in some way.
The EasyBuild framework prepares the build environment for the different toolchain components, by loading their respective modules and defining environment variables to specify compiler commands (e.g., via `$F90`), compiler and linker options (e.g., via `$CFLAGS` and `$LDFLAGS`), the list of library names to supply to the linker (via `$LIBS`), etc. This enables making easyblocks largely toolchain-agnostic since they can simply rely on these environment variables; that is, unless they need to be aware of, for example, the particular compiler being used to determine the build configuration options.
Recent releases of EasyBuild include out-of-the-box toolchain support for:
- various compilers, including GCC, Intel, Clang, CUDA
- common MPI libraries, such as Intel MPI, MPICH, MVAPICH2, OpenMPI
- various numerical libraries, including ATLAS, Intel MKL, OpenBLAS, ScalaPACK, FFTW
On Salomon, we currently have the following toolchains installed:
...
Hardware Overview
=================
Introduction
------------
The Salomon cluster consists of 1008 computational nodes of which 576 are regular compute nodes and 432 accelerated nodes. Each node is a powerful x86-64 computer, equipped with 24 cores (two twelve-core Intel Xeon processors) and 128GB RAM. The nodes are interlinked by high speed InfiniBand and Ethernet networks. All nodes share 0.5PB /home NFS disk storage to store the user files. Users may use a DDN Lustre shared storage with capacity of 1.69 PB which is available for the scratch project data. The user access to the Salomon cluster is provided by four login nodes.
[More about schematic representation of the Salomon cluster compute nodes IB topology](../network/ib-single-plane-topology.md).
![Salomon](salomon-2)
The parameters are summarized in the following tables:
General information
-------------------
|**In general**||
|---|---|
|Primary purpose|High Performance Computing|
|Architecture of compute nodes|x86-64|
|Operating system|CentOS 6.7 Linux|
|[**Compute nodes**](../compute-nodes.md)||
|Totally|1008|
|Processor|2x Intel Xeon E5-2680v3, 2.5GHz, 12cores|
|RAM|128GB, 5.3GB per core, DDR4@2133 MHz|
|Local disk drive|no|
|Compute network / Topology|InfiniBand FDR56 / 7D Enhanced hypercube|
|w/o accelerator|576|
|MIC accelerated|432|
|**In total**||
|Total theoretical peak performance (Rpeak)|2011 Tflop/s|
|Total amount of RAM|129.024 TB|
Compute nodes
-------------
|Node|Count|Processor|Cores|Memory|Accelerator|
|---|---|---|---|---|---|
|w/o accelerator|576|2x Intel Xeon E5-2680v3, 2.5GHz|24|128GB|-|
|MIC accelerated|432|2x Intel Xeon E5-2680v3, 2.5GHz|24|128GB|2x Intel Xeon Phi 7120P, 61cores, 16GB RAM|
For more details please refer to the [Compute nodes](../compute-nodes.md).
Remote visualization nodes
--------------------------
For remote visualization two nodes with NICE DCV software are available each configured:
|Node|Count|Processor|Cores|Memory|GPU Accelerator|
|---|---|---|---|---|---|
|visualization|2|2x Intel Xeon E5-2695v3, 2.3GHz|28|512GB|NVIDIA QUADRO K5000, 4GB RAM|
SGI UV 2000
-----------
For large memory computations a special SMP/NUMA SGI UV 2000 server is available:
|Node|Count|Processor|Cores|Memory|Extra HW|
|---|---|---|---|---|---|
|UV2000|1|14x Intel Xeon E5-4627v2, 3.3GHz, 8 cores|112|3328GB DDR3@1866MHz|2x 400GB local SSD, 1x NVIDIA GM200 (GeForce GTX TITAN X), 12GB RAM|
![](uv-2000.jpeg)
Introduction
============
Welcome to the Salomon supercomputer cluster. The Salomon cluster consists of 1008 compute nodes, totaling 24192 compute cores with 129TB RAM, and gives over 2 Pflop/s theoretical peak performance. Each node is a powerful x86-64 computer, equipped with 24 cores and at least 128GB RAM. Nodes are interconnected by a 7D Enhanced hypercube Infiniband network and equipped with Intel Xeon E5-2680v3 processors. The Salomon cluster consists of 576 nodes without accelerators and 432 nodes equipped with Intel Xeon Phi MIC accelerators. Read more in the [Hardware Overview](hardware-overview-1/hardware-overview.html).
The cluster runs the [CentOS Linux](http://www.bull.com/bullx-logiciels/systeme-exploitation.html) operating system, which is compatible with the [RedHat Linux family](http://upload.wikimedia.org/wikipedia/commons/1/1b/Linux_Distribution_Timeline.svg).
**Water-cooled Compute Nodes With MIC Accelerator**
![](salomon-3.jpeg)
![](salomon-4.jpeg)
7D Enhanced Hypercube
=====================
[More about Job submission - Placement by IB switch / Hypercube dimension.](../resource-allocation-and-job-execution/job-submission-and-execution.md)
Nodes may be selected via the PBS resource attribute ehc_[1-7]d, as sketched in the example below the table.
|Hypercube dimension|PBS resource attribute|
|---|---|
|1D|ehc_1d|
|2D|ehc_2d|
|3D|ehc_3d|
|4D|ehc_4d|
|5D|ehc_5d|
|6D|ehc_6d|
|7D|ehc_7d|
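For example, nodes sharing the same 1D hypercube group might be requested at submission time like this (a sketch; the project ID is illustrative and the placement syntax assumes PBS Pro group placement):

```bash
$ qsub -A OPEN-0-0 -q qprod -l select=4:ncpus=24 -l place=group=ehc_1d -I
```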
[Schematic representation of the Salomon cluster IB single-plain topology represents hypercube dimension 0](ib-single-plane-topology.md).
### 7D Enhanced Hypercube {#d-enhanced-hypercube}
![](7D_Enhanced_hypercube.png)
|Node type|Count|Short name|Long name|Rack|
|---|---|---|---|---|
|M-Cell compute nodes w/o accelerator|576|cns1 -cns576|r1i0n0 - r4i7n17|1-4|
|compute nodes MIC accelerated|432|cns577 - cns1008|r21u01n577 - r37u31n1008|21-38|
### IB Topology
...
IB single-plane topology
========================
A complete M-Cell assembly consists of four compute racks. Each rack contains 4x physical IRUs - Independent rack units. Using one dual socket node per one blade slot leads to 8 logical IRUs. Each rack contains 4x2 SGI ICE X IB Premium Blades.
The SGI ICE X IB Premium Blade provides the first level of interconnection via dual 36-port Mellanox FDR InfiniBand ASIC switch with connections as follows:
- 9 ports from each switch chip connect to the unified backplane, to connect the 18 compute node slots
- 3 ports on each chip provide connectivity between the chips
- 24 ports from each switch chip connect to the external bulkhead, for a total of 48
### IB single-plane topology - ICEX Mcell
Each colour in each physical IRU represents one dual-switch ASIC switch.
![](IBsingleplanetopologyICEXMcellsmall.png)
### IB single-plane topology - Accelerated nodes
Each of the 3 interconnected D racks is equivalent to one half of an Mcell rack. The 18 D racks with MIC accelerated nodes [r21-r38] are equivalent to 3 Mcell racks, as shown in the [7D Enhanced Hypercube](7d-enhanced-hypercube.md) diagram.
As shown in the diagram ![IB Topology](Salomon_IB_topology.png):
- Racks 21, 22, 23, 24, 25, 26 are equivalent to one Mcell rack.
- Racks 27, 28, 29, 30, 31, 32 are equivalent to one Mcell rack.
- Racks 33, 34, 35, 36, 37, 38 are equivalent to one Mcell rack.
![](IBsingleplanetopologyAcceleratednodessmall.png)
Network
=======
All compute and login nodes of Salomon are interconnected by a 7D Enhanced hypercube [Infiniband](http://en.wikipedia.org/wiki/InfiniBand) network and by a Gigabit [Ethernet](http://en.wikipedia.org/wiki/Ethernet) network. Only the [Infiniband](http://en.wikipedia.org/wiki/InfiniBand) network may be used to transfer user data.
Infiniband Network
------------------
All compute and login nodes of Salomon are interconnected by 7D Enhanced hypercube [Infiniband](http://en.wikipedia.org/wiki/InfiniBand) network (56 Gbps). The network topology is a [7D Enhanced hypercube](7d-enhanced-hypercube.md).
Read more about the schematic representation of the Salomon cluster [IB single-plane topology](ib-single-plane-topology.md) ([hypercube dimension](7d-enhanced-hypercube.md) 0).
The compute nodes may be accessed via the Infiniband network using the ib0 network interface, in address range 10.17.0.0 (mask 255.255.224.0). MPI may be used to establish native Infiniband connections among the nodes.
The network provides **2170MB/s** transfer rates via the TCP connection (single stream) and up to **3600MB/s** via native Infiniband protocol.
Example
-------
```bash
$ qsub -q qexp -l select=4:ncpus=16 -N Name0 ./myjob
$ qstat -n -u username

                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
15209.isrv5     username qexp     Name0        5530   4  96    --  01:00 R 00:00
   r4i1n0/0*24+r4i1n1/0*24+r4i1n2/0*24+r4i1n3/0*24
```
In this example, we access the node r4i1n0 via the Infiniband network using the ib0 interface.
```bash
$ ssh 10.17.35.19
```
In this example, we get information about the Infiniband network.
```bash
$ ifconfig
....
inet addr:10.17.35.19....
....

$ ip addr show ib0
....
inet 10.17.35.19....
....
```
Resource Allocation and Job Execution
=====================================
To run a [job](job-submission-and-execution.html), [computational resources](resources-allocation-policy.html) for this particular job must be allocated. This is done via the PBS Pro job workload manager software, which efficiently distributes workloads across the supercomputer. Extensive information about PBS Pro can be found in the [official documentation here](../../pbspro-documentation.html), especially in the [PBS Pro User's Guide](https://docs.it4i.cz/pbspro-documentation/pbspro-users-guide).
Resources Allocation Policy
---------------------------
The resources are allocated to the job in a fairshare fashion, subject to constraints set by the queue and the resources available to the Project. [The Fairshare](job-priority.html) at Salomon ensures that individual users may consume approximately equal amounts of resources per week. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following queues are available to Salomon users:
- **qexp**, the Express queue
- **qprod**, the Production queue
- **qlong**, the Long queue
- **qmpp**, the Massively parallel queue
- **qfat**, the queue to access SMP UV2000 machine
- **qfree**, the Free resource utilization queue
>Check the queue status at <https://extranet.it4i.cz/rsweb/salomon/>
Read more on the [Resource Allocation Policy](resources-allocation-policy.html) page.
Job submission and execution
----------------------------
>Use the **qsub** command to submit your jobs.
The qsub command submits the job into the queue and creates a request to the PBS Job manager for allocation of the specified resources. The **smallest allocation unit is an entire node, 24 cores**, with the exception of the qexp queue. The resources will be allocated when available, subject to allocation policies and constraints. **After the resources are allocated, the jobscript or interactive shell is executed on the first of the allocated nodes.**
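For example, a production job requesting two full nodes might be submitted like this (a sketch; the project ID, walltime and script name are illustrative):

```bash
$ qsub -A OPEN-0-0 -q qprod -l select=2:ncpus=24,walltime=03:00:00 ./myjob
```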
Read more on the [Job submission and execution](job-submission-and-execution.html) page.