diff --git a/docs.it4i.cz/anselm-cluster-documentation.html b/docs.it4i.cz/anselm-cluster-documentation.html new file mode 100644 index 0000000000000000000000000000000000000000..c277b9009961453b2801d2fc50bd6f48abfacb64 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation.html @@ -0,0 +1,674 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Introduction — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+


+ +
+ +


+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Introduction +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

Welcome to the Anselm supercomputer cluster. The Anselm cluster consists of 209 compute nodes, totaling 3344 compute cores with 15TB RAM, giving over 94 Tflop/s theoretical peak performance. Each node is a powerful x86-64 computer, equipped with 16 cores, at least 64GB of RAM, and a 500GB hard drive. The nodes are interconnected by a fully non-blocking fat-tree InfiniBand network and equipped with Intel Sandy Bridge processors. A few nodes are also equipped with NVIDIA Kepler GPU or Intel Xeon Phi MIC accelerators. Read more in Hardware Overview.

+

The cluster runs bullx Linux operating system, which is compatible with the RedHat Linux family. We have installed a wide range of software packages targeted at different scientific domains. These packages are accessible via the modules environment.

+

User data shared file-system (HOME, 320TB) and job data shared file-system (SCRATCH, 146TB) are available to users.

+

The PBS Professional workload manager provides computing resource allocation and job execution.

+

Read more on how to apply for resources, obtain login credentials, and access the cluster.

+ +
+ + + + +
+
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation.md b/docs.it4i.cz/anselm-cluster-documentation.md index 6ff741f5e346648f03d2042fc150f55e924530d5..7d0f0258c0eec28b115e6619971e701ea4997fac 100644 --- a/docs.it4i.cz/anselm-cluster-documentation.md +++ b/docs.it4i.cz/anselm-cluster-documentation.md @@ -40,4 +40,3 @@ resources](get-started-with-it4innovations/applying-for-resources.html), credentials,](get-started-with-it4innovations/obtaining-login-credentials.html) and [access the cluster](anselm-cluster-documentation/accessing-the-cluster.html). - diff --git a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster.html b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster.html new file mode 100644 index 0000000000000000000000000000000000000000..f381892c7e2d520e9998c4a48a19a1976eb7dae8 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster.html @@ -0,0 +1,842 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Shell access and data transfer — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+


+ +
+ +


+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Shell access and data transfer +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

Interactive Login

+

The Anselm cluster is accessed by SSH protocol via login nodes login1 and login2 at address anselm.it4i.cz. The login nodes may be addressed specifically, by prepending the login node name to the address.

+ + + + + + + + + + + + + + + + + + + + + + +
Login address         | Port | Protocol | Login node
anselm.it4i.cz        | 22   | ssh      | round-robin DNS record for login1 and login2
login1.anselm.it4i.cz | 22   | ssh      | login1
login2.anselm.it4i.cz | 22   | ssh      | login2
+

The authentication is by the private key

+

Please verify SSH fingerprints during the first logon. They are identical on all login nodes:
29:b3:f4:64:b0:73:f5:6f:a7:85:0f:e0:0d:be:76:bf (DSA)
d4:6f:5c:18:f4:3f:70:ef:bc:fc:cc:2b:fd:13:36:b7 (RSA)

+

 

+

Private keys authentication:

+

On Linux or Mac, use

+
local $ ssh -i /path/to/id_rsa username@anselm.it4i.cz
+

If you see the warning message "UNPROTECTED PRIVATE KEY FILE!", use this command to restrict the permissions on your private key file:

+
local $ chmod 600 /path/to/id_rsa
+
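To avoid typing the -i option on every connection, the key can be configured in your local OpenSSH client configuration. This is a minimal sketch only; it assumes OpenSSH on your workstation, and the Host alias anselm, the username and the key path are illustrative:
local $ cat >> ~/.ssh/config <<'EOF'
Host anselm
    HostName anselm.it4i.cz
    User username
    IdentityFile /path/to/id_rsa
EOF
local $ ssh anselm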

On Windows, use PuTTY ssh client.

+

After logging in, you will see the command prompt:

+
                                            _
/\ | |
/ \ _ __ ___ ___| |_ __ ___
/ /\ \ | '_ \/ __|/ _ \ | '_ ` _ \
/ ____ \| | | \__ \ __/ | | | | | |
/_/ \_\_| |_|___/\___|_|_| |_| |_|


                        http://www.it4i.cz/?lang=en


Last login: Tue Jul 9 15:57:38 2013 from your-host.example.com
[username@login2.anselm ~]$
+

The environment is not shared between login nodes, except for shared filesystems.

+

Data Transfer

+

Data in and out of the system may be transferred by the scp and sftp protocols. (Not available yet.) When transferring large volumes of data, use the dedicated data mover node dm1.anselm.it4i.cz for increased performance.

+ + + + + + + + + + + + + + + + + + + + + + + + +
Address               | Port | Protocol
anselm.it4i.cz        | 22   | scp, sftp
login1.anselm.it4i.cz | 22   | scp, sftp
login2.anselm.it4i.cz | 22   | scp, sftp
dm1.anselm.it4i.cz    | 22   | scp, sftp
+

 The authentication is by the private key

+

Data transfer rates up to 160MB/s can be achieved with scp or sftp. 1TB may be transferred in 1:50h.

+

To achieve 160MB/s transfer rates, the end user must be connected by a 10G line all the way to IT4Innovations and use a computer with a fast processor for the transfer. Using a Gigabit Ethernet connection, up to 110MB/s may be expected. A fast cipher (aes128-ctr) should be used.

+

If you experience degraded data transfer performance, consult your local network provider.

+

On Linux or Mac, use an scp or sftp client to transfer the data to Anselm:

+
local $ scp -i /path/to/id_rsa my-local-file username@anselm.it4i.cz:directory/file
+
local $ scp -i /path/to/id_rsa -r my-local-dir username@anselm.it4i.cz:directory
+
or
+
local $ sftp -o IdentityFile=/path/to/id_rsa username@anselm.it4i.cz
+
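If rsync is installed on your workstation (an assumption; it is a standard tool on most Linux distributions and on macOS), it can be used over SSH for restartable transfers of whole directory trees:
local $ rsync -avP -e "ssh -i /path/to/id_rsa" my-local-dir username@anselm.it4i.cz:directory/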

A very convenient way to transfer files in and out of Anselm is via the FUSE filesystem sshfs:

+
local $ sshfs -o IdentityFile=/path/to/id_rsa username@anselm.it4i.cz:. mountpoint
+

Using sshfs, the user's Anselm home directory will be mounted on your local computer, just like an external disk.

+

Learn more on ssh, scp and sshfs by reading the manpages

+
$ man ssh
$ man scp
$ man sshfs
+

On Windows, use WinSCP client to transfer the data. The win-sshfs client provides a way to mount the Anselm filesystems directly as an external disc.

+

More information about the shared file systems is available here.

+ +
+ + + + +
+
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster.md b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster.md index ab6ec152fd0be03649370cd50347baf4cf1fc3ad..93d7a7645c65c75cbb6af1059adbed58a6713616 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster.md +++ b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster.md @@ -59,7 +59,7 @@ After logging in, you will see the command prompt:                         http://www.it4i.cz/?lang=en - Last loginTue Jul 9 15:57:38 2013 from your-host.example.com + Last login: Tue Jul 9 15:57:38 2013 from your-host.example.com [username@login2.anselm ~]$ The environment is **not** shared between login nodes, except for @@ -139,4 +139,3 @@ way to mount the Anselm filesystems directly as an external disc. More information about the shared file systems is available [here](storage.html). - diff --git a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/outgoing-connections.html b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/outgoing-connections.html new file mode 100644 index 0000000000000000000000000000000000000000..d2e95800672b304bbd98bb8deffd33056c9541ce --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/outgoing-connections.html @@ -0,0 +1,813 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Outgoing connections — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+


+ +
+ +


+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Outgoing connections +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

Connection restrictions

+

Outgoing connections, from Anselm Cluster login nodes to the outside world, are restricted to following ports:

+ + + + + + + + + + + + + + + + + + + + +
Port | Protocol
22   | ssh
80   | http
443  | https
9418 | git
+

Please use ssh port forwarding and proxy servers to connect from Anselm to all other remote ports.

+

Outgoing connections from Anselm Cluster compute nodes are restricted to the internal network. Direct connections from compute nodes to the outside world are blocked.

+

Port forwarding

+

Port forwarding from login nodes

+

Port forwarding allows an application running on Anselm to connect to arbitrary remote host and port.

+

It works by tunneling the connection from Anselm back to users workstation and forwarding from the workstation to the remote host.

+

Pick some unused port on Anselm login node  (for example 6000) and establish the port forwarding:

+
local $ ssh -R 6000:remote.host.com:1234 anselm.it4i.cz
+

In this example, we establish port forwarding between port 6000 on Anselm and  port 1234 on the remote.host.com. By accessing localhost:6000 on Anselm, an application will see response of remote.host.com:1234. The traffic will run via users local workstation.

+

Port forwarding may be done using PuTTY as well. On the PuTTY Configuration screen, load your Anselm configuration first. Then go to Connection->SSH->Tunnels to set up the port forwarding. Click the Remote radio button. Insert 6000 into the Source port textbox and remote.host.com:1234 into the Destination textbox. Click the Add button, then Open.

+

Port forwarding may be established directly to the remote host. However, this requires that the user has SSH access to remote.host.com:

+
$ ssh -L 6000:localhost:1234 remote.host.com
+

Note: Port number 6000 is chosen as an example only. Pick any free port.
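A quick way to check whether a chosen port is already taken on the login node is to look for it among the open sockets; a sketch assuming the standard netstat utility is available (no output means the port is free):
$ netstat -an | grep 6000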

+

Port forwarding from compute nodes

+

Remote port forwarding from compute nodes allows applications running on the compute nodes to access hosts outside Anselm Cluster.

+

First, establish the remote port forwarding from the login node, as described above.

+

Second, invoke port forwarding from the compute node to the login node. Insert following line into your jobscript or interactive shell

+
$ ssh  -TN -f -L 6000:localhost:6000 login1
+

In this example, we assume that port forwarding from login1:6000 to remote.host.com:1234 has been established beforehand. By accessing localhost:6000, an application running on a compute node will see response of remote.host.com:1234
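For instance, if remote.host.com:1234 happened to serve HTTP and curl were available on the compute node (both are assumptions made only for illustration), the tunnel could be exercised like this:
$ curl http://localhost:6000/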

+

Using proxy servers

+

Port forwarding is static; each port is mapped to a particular port on the remote host. A connection to another remote host requires a new forward.

+

Applications with built-in proxy support gain unrestricted access to remote hosts via a single proxy server.

+

To establish a local proxy server on your workstation, install and run SOCKS proxy server software. On Linux, the sshd daemon provides the functionality. To establish a SOCKS proxy server listening on port 1080, run:

+
local $ ssh -D 1080 localhost
+

On Windows, install and run the free, open source Sock Puppet server.

+

Once the proxy server is running, establish ssh port forwarding from Anselm to the proxy server, port 1080, exactly as described above.

+
local $ ssh -R 6000:localhost:1080 anselm.it4i.cz
+

Now, configure the application's proxy settings to localhost:6000. Use port forwarding to access the proxy server from compute nodes as well.
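As an illustration, a command-line tool with SOCKS support, such as curl (assuming it is available on the login node), can then be pointed at the forwarded proxy:
$ curl --socks5-hostname localhost:6000 http://example.com/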

+ +
+ + + + +
+
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/outgoing-connections.md b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/outgoing-connections.md index 30f6a45a805704d1e8a43594ce7b63152192d5e6..b18fe9a0209b1677e3ece6eee4f42fd83eac02dd 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/outgoing-connections.md +++ b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/outgoing-connections.md @@ -61,7 +61,7 @@ this requires that user has ssh access to remote.host.com $ ssh -L 6000:localhost:1234 remote.host.com ``` -NotePort number 6000 is chosen as an example only. Pick any free port. +Note: Port number 6000 is chosen as an example only. Pick any free port. ### []()Port forwarding from compute nodes @@ -117,4 +117,3 @@ Now, configure the applications proxy settings to **localhost:6000**. Use port forwarding  to access the [proxy server from compute nodes](outgoing-connections.html#port-forwarding-from-compute-nodes) as well . - diff --git a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/shell-and-data-access/shell-and-data-access.html b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/shell-and-data-access/shell-and-data-access.html new file mode 100644 index 0000000000000000000000000000000000000000..7dea5f2709fed1d9a511928bab9953b12840a15a --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/shell-and-data-access/shell-and-data-access.html @@ -0,0 +1,842 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Shell access and data transfer — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+


+ +
+ +


+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Shell access and data transfer +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

Interactive Login

+

The Anselm cluster is accessed by SSH protocol via login nodes login1 and login2 at address anselm.it4i.cz. The login nodes may be addressed specifically, by prepending the login node name to the address.

+ + + + + + + + + + + + + + + + + + + + + + +
Login address         | Port | Protocol | Login node
anselm.it4i.cz        | 22   | ssh      | round-robin DNS record for login1 and login2
login1.anselm.it4i.cz | 22   | ssh      | login1
login2.anselm.it4i.cz | 22   | ssh      | login2
+

The authentication is by the private key

+

Please verify SSH fingerprints during the first logon. They are identical on all login nodes:
29:b3:f4:64:b0:73:f5:6f:a7:85:0f:e0:0d:be:76:bf (DSA)
d4:6f:5c:18:f4:3f:70:ef:bc:fc:cc:2b:fd:13:36:b7 (RSA)

+

 

+

Private keys authentication:

+

On Linux or Mac, use

+
local $ ssh -i /path/to/id_rsa username@anselm.it4i.cz
+

If you see the warning message "UNPROTECTED PRIVATE KEY FILE!", use this command to restrict the permissions on your private key file:

+
local $ chmod 600 /path/to/id_rsa
+

On Windows, use PuTTY ssh client.

+

After logging in, you will see the command prompt:

+
                                            _
/\ | |
/ \ _ __ ___ ___| |_ __ ___
/ /\ \ | '_ \/ __|/ _ \ | '_ ` _ \
/ ____ \| | | \__ \ __/ | | | | | |
/_/ \_\_| |_|___/\___|_|_| |_| |_|


                        http://www.it4i.cz/?lang=en


Last login: Tue Jul 9 15:57:38 2013 from your-host.example.com
[username@login2.anselm ~]$
+

The environment is not shared between login nodes, except for shared filesystems.

+

Data Transfer

+

Data in and out of the system may be transferred by the scp and sftp protocols. (Not available yet.) When transferring large volumes of data, use the dedicated data mover node dm1.anselm.it4i.cz for increased performance.

+ + + + + + + + + + + + + + + + + + + + + + + + +
Address               | Port | Protocol
anselm.it4i.cz        | 22   | scp, sftp
login1.anselm.it4i.cz | 22   | scp, sftp
login2.anselm.it4i.cz | 22   | scp, sftp
dm1.anselm.it4i.cz    | 22   | scp, sftp
+

 The authentication is by the private key

+

Data transfer rates up to 160MB/s can be achieved with scp or sftp. 1TB may be transferred in 1:50h.

+

To achieve 160MB/s transfer rates, the end user must be connected by a 10G line all the way to IT4Innovations and use a computer with a fast processor for the transfer. Using a Gigabit Ethernet connection, up to 110MB/s may be expected. A fast cipher (aes128-ctr) should be used.

+

If you experience degraded data transfer performance, consult your local network provider.

+

On Linux or Mac, use an scp or sftp client to transfer the data to Anselm:

+
local $ scp -i /path/to/id_rsa my-local-file username@anselm.it4i.cz:directory/file
+
local $ scp -i /path/to/id_rsa -r my-local-dir username@anselm.it4i.cz:directory
+
or
+
local $ sftp -o IdentityFile=/path/to/id_rsa username@anselm.it4i.cz
+
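If rsync is installed on your workstation (an assumption; it is a standard tool on most Linux distributions and on macOS), it can be used over SSH for restartable transfers of whole directory trees:
local $ rsync -avP -e "ssh -i /path/to/id_rsa" my-local-dir username@anselm.it4i.cz:directory/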

A very convenient way to transfer files in and out of Anselm is via the FUSE filesystem sshfs:

+
local $ sshfs -o IdentityFile=/path/to/id_rsa username@anselm.it4i.cz:. mountpoint
+
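When the mount is no longer needed, it can be released again; a brief sketch assuming the FUSE userspace utilities on Linux (on macOS, use umount instead):
local $ fusermount -u mountpoint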

Using sshfs, the user's Anselm home directory will be mounted on your local computer, just like an external disk.

+

Learn more on ssh, scp and sshfs by reading the manpages

+
$ man ssh
$ man scp
$ man sshfs
+

On Windows, use WinSCP client to transfer the data. The win-sshfs client provides a way to mount the Anselm filesystems directly as an external disc.

+

More information about the shared file systems is available here.

+ +
+ + + + +
+
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/shell-and-data-access/shell-and-data-access.md b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/shell-and-data-access/shell-and-data-access.md index 46d22ec8e6d83b8645f6ebef25367c17c893828d..6ca703c9f35d55f50c729b4eb20318f4ebd70d67 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/shell-and-data-access/shell-and-data-access.md +++ b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/shell-and-data-access/shell-and-data-access.md @@ -59,7 +59,7 @@ After logging in, you will see the command prompt:                         http://www.it4i.cz/?lang=en - Last loginTue Jul 9 15:57:38 2013 from your-host.example.com + Last login: Tue Jul 9 15:57:38 2013 from your-host.example.com [username@login2.anselm ~]$ The environment is **not** shared between login nodes, except for @@ -138,4 +138,3 @@ way to mount the Anselm filesystems directly as an external disc. More information about the shared file systems is available [here](../../storage.html). - diff --git a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/storage-1.html b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/storage-1.html new file mode 100644 index 0000000000000000000000000000000000000000..1fb106c30634f91a1431d3e0de682e3f56b7d54c --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/storage-1.html @@ -0,0 +1,991 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Storage — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+


+ +
+ +


+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Storage +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

There are two main shared file systems on the Anselm cluster, HOME and SCRATCH. All login and compute nodes may access the same data on the shared filesystems. Compute nodes are also equipped with local (non-shared) scratch, ramdisk and tmp filesystems.

+

Archiving

+

Please do not use the shared filesystems as a backup for large amounts of data or as a long-term archiving solution. Academic staff and students of research institutions in the Czech Republic can use the CESNET storage service, which is available via SSHFS.

+

Shared Filesystems

+

Anselm computer provides two main shared filesystems, the HOME filesystem and the SCRATCH filesystem. Both HOME and SCRATCH filesystems are realized as a parallel Lustre filesystem. Both shared file systems are accessible via the Infiniband network. Extended ACLs are provided on both Lustre filesystems for the purpose of sharing data with other users using fine-grained control.

+

Understanding the Lustre Filesystems

+

(source http://www.nas.nasa.gov)

+

A user file on the Lustre filesystem can be divided into multiple chunks (stripes) and stored across a subset of the object storage targets (OSTs) (disks). The stripes are distributed among the OSTs in a round-robin fashion to ensure load balancing.

+

When a client (a compute node from your job) needs to create or access a file, the client queries the metadata server (MDS) and the metadata target (MDT) for the layout and location of the file's stripes. Once the file is opened and the client obtains the striping information, the MDS is no longer involved in the file I/O process. The client interacts directly with the object storage servers (OSSes) and OSTs to perform I/O operations such as locking, disk allocation, storage, and retrieval.

+

If multiple clients try to read and write the same part of a file at the same time, the Lustre distributed lock manager enforces coherency so that all clients see consistent results.

+

There is default stripe configuration for Anselm Lustre filesystems. However, users can set the following stripe parameters for their own directories or files to get optimum I/O performance:

+
      +
    1. stripe_size: the size of the chunk in bytes; specify with k, m, or g to use units of KB, MB, or GB, respectively; the size must be an even multiple of 65,536 bytes; default is 1MB for all Anselm Lustre filesystems
    2. stripe_count: the number of OSTs to stripe across; default is 1 for Anselm Lustre filesystems; one can specify -1 to use all OSTs in the filesystem
    3. stripe_offset: the index of the OST where the first stripe is to be placed; default is -1, which results in random selection; using a non-default value is NOT recommended
+

 

+

Setting stripe size and stripe count correctly for your needs may significantly impact the I/O performance you experience.

+

Use the lfs getstripe command to get the stripe parameters. Use the lfs setstripe command to set the stripe parameters for optimal I/O performance. The correct stripe setting depends on your needs and file access patterns.

+
$ lfs getstripe dir|filename 
$ lfs setstripe -s stripe_size -c stripe_count -o stripe_offset dir|filename
+

Example:

+
$ lfs getstripe /scratch/username/
/scratch/username/
stripe_count: 1 stripe_size: 1048576 stripe_offset: -1

$ lfs setstripe -c -1 /scratch/username/
$ lfs getstripe /scratch/username/
/scratch/username/
stripe_count: 10 stripe_size: 1048576 stripe_offset: -1
+

In this example, we view current stripe setting of the /scratch/username/ directory. The stripe count is changed to all OSTs, and verified. All files written to this directory will be striped over 10 OSTs

+
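As a further illustration (the directory name and the values are made up for this example), a directory holding large files written in parallel might combine a bigger stripe size with several OSTs, using the same setstripe options shown above:
$ lfs setstripe -s 4m -c 8 /scratch/username/bigfiles
$ lfs getstripe /scratch/username/bigfiles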

Use lfs check osts to see the number and status of active OSTs for each filesystem on Anselm. Learn more by reading the man page:

+
$ lfs check osts
$ man lfs
+

Hints on Lustre Striping

+

Increase the stripe_count for parallel I/O to the same file.

+

When multiple processes are writing blocks of data to the same file in parallel, the I/O performance for large files will improve when the stripe_count is set to a larger value. The stripe count sets the number of OSTs the file will be written to. By default, the stripe count is set to 1. While this default setting provides for efficient access of metadata (for example to support the ls -l command), large files should use stripe counts of greater than 1. This will increase the aggregate I/O bandwidth by using multiple OSTs in parallel instead of just one. A rule of thumb is to use a stripe count approximately equal to the number of gigabytes in the file.

+

Another good practice is to make the stripe count be an integral factor of the number of processes performing the write in parallel, so that you achieve load balance among the OSTs. For example, set the stripe count to 16 instead of 15 when you have 64 processes performing the writes.

+

Using a large stripe size can improve performance when accessing very large files

+

Large stripe size allows each client to have exclusive access to its own part of a file. However, it can be counterproductive in some cases if it does not match your I/O pattern. The choice of stripe size has no effect on a single-stripe file.

+

Read more on http://wiki.lustre.org/manual/LustreManual20_HTML/ManagingStripingFreeSpace.html

+

Lustre on Anselm

+

The  architecture of Lustre on Anselm is composed of two metadata servers (MDS) and four data/object storage servers (OSS). Two object storage servers are used for file system HOME and another two object storage servers are used for file system SCRATCH.

+

Configuration of the storage

+
+
    +
  • HOME Lustre object storage
    • One disk array NetApp E5400
    • 22 OSTs
    • 227 2TB NL-SAS 7.2krpm disks
    • 22 groups of 10 disks in RAID6 (8+2)
    • 7 hot-spare disks
  • SCRATCH Lustre object storage
    • Two disk arrays NetApp E5400
    • 10 OSTs
    • 106 2TB NL-SAS 7.2krpm disks
    • 10 groups of 10 disks in RAID6 (8+2)
    • 6 hot-spare disks
  • Lustre metadata storage
    • One disk array NetApp E2600
    • 12 300GB SAS 15krpm disks
    • 2 groups of 5 disks in RAID5
    • 2 hot-spare disks
+
+

HOME

+

The HOME filesystem is mounted in the directory /home. Users' home directories /home/username reside on this filesystem. The accessible capacity is 320TB, shared among all users. Individual users are restricted by filesystem usage quotas, set to 250GB per user. If 250GB proves insufficient for a particular user, please contact support; the quota may be lifted upon request.

+

The HOME filesystem is intended for preparation, evaluation, processing and storage of data generated by active Projects.

+

The HOME filesystem should not be used to archive data of past Projects or other unrelated data.

+

Files on the HOME filesystem will not be deleted until the end of the user's lifecycle.

+

The filesystem is backed up, such that it can be restored in case of a catastrophic failure resulting in significant data loss. This backup, however, is not intended to restore old versions of user data or to restore (accidentally) deleted files.

+

The HOME filesystem is realized as Lustre parallel filesystem and is available on all login and computational nodes.
Default stripe size is 1MB, stripe count is 1. There are 22 OSTs dedicated for the HOME filesystem.

+

Setting stripe size and stripe count correctly for your needs may significantly impact the I/O performance you experience.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
HOME filesystem
Mountpoint: /home
Capacity: 320TB
Throughput: 2GB/s
User quota: 250GB
Default stripe size: 1MB
Default stripe count: 1
Number of OSTs: 22
+

SCRATCH

+

The SCRATCH filesystem is mounted in the directory /scratch. Users may freely create subdirectories and files on the filesystem. The accessible capacity is 146TB, shared among all users. Individual users are restricted by filesystem usage quotas, set to 100TB per user. The purpose of this quota is to prevent runaway programs from filling the entire filesystem and denying service to other users. If 100TB proves insufficient for a particular user, please contact support; the quota may be lifted upon request.

+

The Scratch filesystem is intended  for temporary scratch data generated during the calculation as well as for high performance access to input and output files. All I/O intensive jobs must use the SCRATCH filesystem as their working directory.

Users are advised to save the necessary data from the SCRATCH filesystem to HOME filesystem after the calculations and clean up the scratch files.

Files on the SCRATCH filesystem that are not accessed for more than 90 days will be automatically deleted.

+

The SCRATCH filesystem is realized as Lustre parallel filesystem and is available from all login and computational nodes.
Default stripe size is 1MB, stripe count is 1. There are 10 OSTs dedicated for the SCRATCH filesystem.

+

Setting stripe size and stripe count correctly for your needs may significantly impact the I/O performance you experience.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
SCRATCH filesystem
Mountpoint: /scratch
Capacity: 146TB
Throughput: 6GB/s
User quota: 100TB
Default stripe size: 1MB
Default stripe count: 1
Number of OSTs: 10
+

Disk usage and quota commands

+

User quotas on the file systems can be checked and reviewed using following command:

+
$ lfs quota dir
+

Example for Lustre HOME directory:

+
$ lfs quota /home
Disk quotas for user user001 (uid 1234):
Filesystem kbytes quota limit grace files quota limit grace
/home 300096 0 250000000 - 2102 0 500000 -
Disk quotas for group user001 (gid 1234):
Filesystem kbytes quota limit grace files quota limit grace
/home 300096 0 0 - 2102 0 0 -
+

In this example, we see a quota limit of 250GB, with 300MB currently used by user001.

+

Example for Lustre SCRATCH directory:

+
$ lfs quota /scratch
Disk quotas for user user001 (uid 1234):
Filesystem kbytes quota limit grace files quota limit grace
 /scratch       8       0 100000000000       -       3       0       0       -
Disk quotas for group user001 (gid 1234):
Filesystem kbytes quota limit grace files quota limit grace
/scratch       8       0       0       -       3       0       0       -
+

In this example, we see a quota limit of 100TB, with 8KB currently used by user001.

+

 

+

To better understand where exactly the space is used, you can use the following command:

+
$ du -hs dir
+

Example for your HOME directory:

+
$ cd /home
$ du -hs * .[a-zA-z0-9]* | grep -E "[0-9]*G|[0-9]*M" | sort -hr
258M cuda-samples
15M .cache
13M .mozilla
5,5M .eclipse
2,7M .idb_13.0_linux_intel64_app
+

This will list all directories consuming megabytes or gigabytes of space in your current (in this example HOME) directory. The list is sorted in descending order from largest to smallest.

+


To have a better understanding of previous commands, you can read manpages.

+
$ man lfs
+
$ man du 
+

Extended ACLs

+

Extended ACLs provide another security mechanism beside the standard POSIX ACLs which are defined by three entries (for owner/group/others). Extended ACLs have more than the three basic entries. In addition, they also contain a mask entry and may contain any number of named user and named group entries.

+

ACLs on a Lustre file system work exactly like ACLs on any Linux file system. They are manipulated with the standard tools in the standard manner. Below, we create a directory and allow a specific user access.

+
[vop999@login1.anselm ~]$ umask 027
[vop999@login1.anselm ~]$ mkdir test
[vop999@login1.anselm ~]$ ls -ld test
drwxr-x--- 2 vop999 vop999 4096 Nov  5 14:17 test
[vop999@login1.anselm ~]$ getfacl test
# file: test
# owner: vop999
# group: vop999
user::rwx
group::r-x
other::---

[vop999@login1.anselm ~]$ setfacl -m user:johnsm:rwx test
[vop999@login1.anselm ~]$ ls -ld test
drwxrwx---+ 2 vop999 vop999 4096 Nov  5 14:17 test
[vop999@login1.anselm ~]$ getfacl test
# file: test
# owner: vop999
# group: vop999
user::rwx
user:johnsm:rwx
group::r-x
mask::rwx
other::---
+

Default ACL mechanism can be used to replace setuid/setgid permissions on directories. Setting a default ACL on a directory (-d flag to setfacl) will cause the ACL permissions to be inherited by any newly created file or subdirectory within the directory. Refer to this page for more information on Linux ACL:

+
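Continuing the example above (johnsm is the same illustrative username), a default ACL is set with the -d flag and is then inherited by files and subdirectories created inside the directory; getfacl -d prints the default entries:
[vop999@login1.anselm ~]$ setfacl -d -m user:johnsm:rwx test
[vop999@login1.anselm ~]$ getfacl -d test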

http://www.vanemery.com/Linux/ACL/POSIX_ACL_on_Linux.html 

+

Local Filesystems

+

Local Scratch

+

Every computational node is equipped with 330GB local scratch disk.

+

Use the local scratch in case you need to access a large number of small files during your calculation.

+

The local scratch disk is mounted as /lscratch and is accessible to user at /lscratch/$PBS_JOBID directory.

+

The local scratch filesystem is intended for temporary scratch data generated during the calculation as well as for high performance access to input and output files. All I/O intensive jobs that access a large number of small files within the calculation must use the local scratch filesystem as their working directory. This is required for performance reasons, as frequent access to a large number of small files may overload the metadata servers (MDS) of the Lustre filesystem.

+

The local scratch directory /lscratch/$PBS_JOBID will be deleted immediately after the calculation ends. Users should take care to save the output data from within the jobscript.
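A minimal jobscript sketch illustrating this pattern; the program name and the input and output file names are placeholders, not part of the Anselm documentation:
#!/bin/bash
# stage input data to the node-local scratch
SCRDIR=/lscratch/$PBS_JOBID
cp $PBS_O_WORKDIR/input.dat $SCRDIR
cd $SCRDIR

# run the calculation against the local copy
$PBS_O_WORKDIR/myprog input.dat > output.dat

# save the results back to the shared filesystem before the job ends
cp output.dat $PBS_O_WORKDIR/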

+ + + + + + + + + + + + + + + + + + + + + + + + +
local SCRATCH filesystem
Mountpoint: /lscratch
Accesspoint: /lscratch/$PBS_JOBID
Capacity: 330GB
Throughput: 100MB/s
User quota: none
+

RAM disk

+

Every computational node is equipped with a filesystem realized in memory, the so-called RAM disk.

+

Use RAM disk in case you need really fast access to your data of limited size during your calculation.
Be very careful, use of RAM disk filesystem is at the expense of operational memory.

+

The local RAM disk is mounted as /ramdisk and is accessible to user at /ramdisk/$PBS_JOBID directory.

+

The local RAM disk filesystem is intended for temporary scratch data generated during the calculation as well as for high performance access to input and output files. Size of RAM disk filesystem is limited. Be very careful, use of RAM disk filesystem is at the expense of operational memory.  It is not recommended to allocate large amount of memory and use large amount of data in RAM disk filesystem at the same time.

+

The local RAM disk directory /ramdisk/$PBS_JOBID will be deleted immediately after the calculation ends. Users should take care to save the output data from within the jobscript.

+ + + + + + + + + + + + + + + + + + + + + + + + +
RAM disk
Mountpoint: /ramdisk
Accesspoint: /ramdisk/$PBS_JOBID
Capacity: 60GB at compute nodes without accelerator, 90GB at compute nodes with accelerator, 500GB at fat nodes
Throughput: over 1.5 GB/s write, over 5 GB/s read, single thread; over 10 GB/s write, over 50 GB/s read, 16 threads
User quota: none
+

tmp

+

Each node is equipped with local /tmp directory of few GB capacity. The /tmp directory should be used to work with small temporary files. Old files in /tmp directory are automatically purged.

+
+

Summary

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Mountpoint | Usage                     | Protocol | Net Capacity   | Throughput | Limitations | Access                  | Services
/home      | home directory            | Lustre   | 320 TiB        | 2 GB/s     | Quota 250GB | Compute and login nodes | backed up
/scratch   | cluster shared jobs' data | Lustre   | 146 TiB        | 6 GB/s     | Quota 100TB | Compute and login nodes | files older 90 days removed
/lscratch  | node local jobs' data     | local    | 330 GB         | 100 MB/s   | none        | Compute nodes           | purged after job ends
/ramdisk   | node local jobs' data     | local    | 60, 90, 500 GB | 5-50 GB/s  | none        | Compute nodes           | purged after job ends
/tmp       | local temporary files     | local    |                | 100 MB/s   | none        | Compute and login nodes | auto purged
+
+
+

 

+ +
+ + + + +
+
+ + + + + +
+
+ +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+
+
+ + +
+ + +
+
+ +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/storage-1.md b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/storage-1.md index 7a0905bdee563dd060712afffcd08192a6689ec9..cb600cb9450ae40401fa6c66e6b11b831a625d47 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/storage-1.md +++ b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/storage-1.md @@ -61,7 +61,7 @@ There is default stripe configuration for Anselm Lustre filesystems. However, users can set the following stripe parameters for their own directories or files to get optimum I/O performance: -1. stripe_sizethe size of the chunk in bytes; specify with k, m, or +1. stripe_size: the size of the chunk in bytes; specify with k, m, or g to use units of KB, MB, or GB, respectively; the size must be an even multiple of 65,536 bytes; default is 1MB for all Anselm Lustre filesystems @@ -93,12 +93,12 @@ Example: ``` $ lfs getstripe /scratch/username/ /scratch/username/ -stripe_count 1 stripe_size 1048576 stripe_offset -1 +stripe_count: 1 stripe_size: 1048576 stripe_offset: -1 $ lfs setstripe -c -1 /scratch/username/ $ lfs getstripe /scratch/username/ /scratch/username/ -stripe_count 10 stripe_size 1048576 stripe_offset -1 +stripe_count: 10 stripe_size: 1048576 stripe_offset: -1 ``` In this example, we view current stripe setting of the @@ -378,9 +378,9 @@ manner. Below, we create a directory and allow a specific user access. [vop999@login1.anselm ~]$ ls -ld test drwxr-x--- 2 vop999 vop999 4096 Nov  5 14:17 test [vop999@login1.anselm ~]$ getfacl test -# filetest -# ownervop999 -# groupvop999 +# file: test +# owner: vop999 +# group: vop999 user::rwx group::r-x other::--- @@ -389,9 +389,9 @@ other::--- [vop999@login1.anselm ~]$ ls -ld test drwxrwx---+ 2 vop999 vop999 4096 Nov  5 14:17 test [vop999@login1.anselm ~]$ getfacl test -# filetest -# ownervop999 -# groupvop999 +# file: test +# owner: vop999 +# group: vop999 user::rwx user:johnsm:rwx group::r-x @@ -507,4 +507,3 @@ files in /tmp directory are automatically purged. /tmp local temporary files local 100 MB/s none Compute and login nodes auto purged   - diff --git a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/vpn-access.html b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/vpn-access.html new file mode 100644 index 0000000000000000000000000000000000000000..90b52ad1cd8138fc15d2622906384647156f49ce --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/vpn-access.html @@ -0,0 +1,806 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +VPN Access — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+


+ +
+ +


+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ VPN Access +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

Accessing IT4Innovations internal resources via VPN

+

"Failed to initialize connection subsystem" on Windows 8.1 after the 02-10-15 MS patch:
a workaround can be found at https://docs.it4i.cz/vpn-connection-fail-in-win-8.1

+

 

+

To use resources and licenses located on the IT4Innovations local network, it is necessary to connect to this network via VPN.
We use the Cisco AnyConnect Secure Mobility Client, which is supported on the following operating systems:

+
    +
  • Windows XP
  • Windows Vista
  • Windows 7
  • Windows 8
  • Linux
  • MacOS
+

It is impossible to connect to VPN from other operating systems.

+

VPN client installation

+

You can install the VPN client from the web interface after a successful login with LDAP credentials at https://vpn1.it4i.cz/anselm

+

+

Depending on your Java settings, after login the client either installs automatically or downloads an installation file for your operating system. For automatic installation, it is necessary to allow the installation tool to start.

+

Java detection

+

Execution access, Execution access 2

+

After successful installation, VPN connection will be established and you can use available resources from IT4I network.

+

Successful installation

+

If your Java setting doesn't allow automatic installation, you can download installation file and install VPN client manually.

+

Installation file

+

After you click on the link, download of installation file will start.

+

Download file successful

+

After successful download of installation file, you have to execute this tool with administrator's rights and install VPN client manually.

+

Working with VPN client

+

You can use graphical user interface or command line interface to run VPN client on all supported operating systems. We suggest using GUI.

+

Icon

+

Before the first login to VPN, you have to fill URL https://vpn1.it4i.cz/anselm into the text field.

+

First run

+

After you click on the Connect button, you must fill your login credentials.

+

Login - GUI

+

After a successful login, the client will minimize to the system tray. If everything works, you can see a lock in the Cisco tray icon.

+

Successful connection

+

If you right-click on this icon, you will see a context menu in which you can control the VPN connection.

+

Context menu

+

When you connect to the VPN for the first time, the client downloads the profile and creates a new item "ANSELM" in the connection list. For subsequent connections, it is not necessary to re-enter the URL address, but just select the corresponding item.

+

Anselm profile

+

Then AnyConnect automatically proceeds like in the case of first logon.

+

Login with profile

+

After a successful logon, you can see a green circle with a tick mark on the lock icon.

+

successful login

+

For disconnecting, right-click on the AnyConnect client icon in the system tray and select VPN Disconnect.

+ +
+ + + + +
+
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/vpn-access.md b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/vpn-access.md index ac664a415b8e432e3161a42183d9d8a96d3e7f38..cfc7e40c07a39c21432acbfe70d29f167cec9c03 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/vpn-access.md +++ b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/vpn-access.md @@ -120,4 +120,3 @@ login](../successfullconnection.jpg "successful login") For disconnecting, right-click on the AnyConnect client icon in the system tray and select **VPN Disconnect**. - diff --git a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/x-window-and-vnc.html b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/x-window-and-vnc.html new file mode 100644 index 0000000000000000000000000000000000000000..3a3d5475aef393677c9c15e8bd971cc854b8bc7e --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/x-window-and-vnc.html @@ -0,0 +1,662 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Graphical User Interface — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+


+ +
+ +


+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Graphical User Interface +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

X Window System

+

The X Window system is a principal way to get GUI access to the clusters.

+

Read more about configuring X Window System.

+

VNC

+

The Virtual Network Computing (VNC) is a graphical desktop sharing system that uses the Remote Frame Buffer protocol (RFB) to remotely control another computer.

+

Read more about configuring VNC.

+ +
+ + + + +
+
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/x-window-and-vnc.md b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/x-window-and-vnc.md index 25928db68f393a4119361ad777a960a5e7e73e4d..5e1cc315cabed61c0849f6dcb87783fab6287ef8 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/x-window-and-vnc.md +++ b/docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/x-window-and-vnc.md @@ -28,4 +28,3 @@ class="link-external">[computer](http://en.wikipedia.org/wiki/Computer "Computer Read more about configuring **[VNC](../../salomon/accessing-the-cluster/graphical-user-interface/vnc.html)**. - diff --git a/docs.it4i.cz/anselm-cluster-documentation/compute-nodes.html b/docs.it4i.cz/anselm-cluster-documentation/compute-nodes.html new file mode 100644 index 0000000000000000000000000000000000000000..cb3a7e909870f1fd57c4f763b86ba25324a043a0 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/compute-nodes.html @@ -0,0 +1,970 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Compute Nodes — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+


+ +
+ +


+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Compute Nodes +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

Nodes Configuration

+

Anselm is a cluster of x86-64 Intel-based nodes built on the Bull Extreme Computing bullx technology. The cluster contains four types of compute nodes.

+

Compute Nodes Without Accelerator

+
+
    +
  • +
  • 180 nodes
  • 2880 cores in total
  • two Intel Sandy Bridge E5-2665, 8-core, 2.4GHz processors per node
  • 64 GB of physical memory per node
  • one 500GB SATA 2.5" 7.2 krpm HDD per node
  • bullx B510 blade servers
  • cn[1-180]
+

Compute Nodes With GPU Accelerator

+
    +
  • +
  • 23 nodes
  • 368 cores in total
  • two Intel Sandy Bridge E5-2470, 8-core, 2.3GHz processors per node
  • 96 GB of physical memory per node
  • one 500GB SATA 2.5" 7.2 krpm HDD per node
  • GPU accelerator 1x NVIDIA Tesla Kepler K20 per node
  • bullx B515 blade servers
  • cn[181-203]
+

Compute Nodes With MIC Accelerator

+
    +
  • +
  • 4 nodes
  • 64 cores in total
  • two Intel Sandy Bridge E5-2470, 8-core, 2.3GHz processors per node
  • 96 GB of physical memory per node
  • one 500GB SATA 2.5" 7.2 krpm HDD per node
  • MIC accelerator 1x Intel Phi 5110P per node
  • bullx B515 blade servers
  • cn[204-207]
+

Fat Compute Nodes

+
    +
  • +
  • 2 nodes
  • 32 cores in total
  • 2 Intel Sandy Bridge E5-2665, 8-core, 2.4GHz processors per node
  • 512 GB of physical memory per node
  • two 300GB SAS 3.5" 15krpm HDDs (RAID1) per node
  • two 100GB SLC SSDs per node
  • bullx R423-E3 servers
  • cn[208-209]
+
+

 

+
+

+

Figure Anselm bullx B510 servers

+

Compute Nodes Summary

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Node type                  | Count | Range       | Memory | Cores       | Access
Nodes without accelerator  | 180   | cn[1-180]   | 64GB   | 16 @ 2.4GHz | qexp, qprod, qlong, qfree
Nodes with GPU accelerator | 23    | cn[181-203] | 96GB   | 16 @ 2.3GHz | qgpu, qprod
Nodes with MIC accelerator | 4     | cn[204-207] | 96GB   | 16 @ 2.3GHz | qmic, qprod
Fat compute nodes          | 2     | cn[208-209] | 512GB  | 16 @ 2.4GHz | qfat, qprod
+
+
+
+
+
+

Processor Architecture

+
+
+
+

Anselm is equipped with Intel Sandy Bridge processors Intel Xeon E5-2665 (nodes without accelerator and fat nodes) and Intel Xeon E5-2470 (nodes with accelerator). Processors support Advanced Vector Extensions (AVX) 256-bit instruction set.

+
+
+
+
+

Intel Sandy Bridge E5-2665 Processor

+
+
+
+
+
    +
  • eight-core
  • speed: 2.4 GHz, up to 3.1 GHz using Turbo Boost Technology
  • peak performance: 19.2 Gflop/s per core
  • caches:
    • L2: 256 KB per core
    • L3: 20 MB per processor
  • memory bandwidth at the level of the processor: 51.2 GB/s
+
+
+
+
+
+
+

Intel Sandy Bridge E5-2470 Processor

+
+
+
+
+
    +
  • eight-core
  • speed: 2.3 GHz, up to 3.1 GHz using Turbo Boost Technology
  • peak performance: 18.4 Gflop/s per core
  • caches:
    • L2: 256 KB per core
    • L3: 20 MB per processor
  • memory bandwidth at the level of the processor: 38.4 GB/s
+
+

 

+

Nodes equipped with the Intel Xeon E5-2665 CPU have the PBS resource attribute cpu_freq = 24; nodes equipped with the Intel Xeon E5-2470 CPU have cpu_freq = 23.

+
+
$ qsub -A OPEN-0-0 -q qprod -l select=4:ncpus=16:cpu_freq=24 -I
+
+
+
+
In this example, we allocate 4 nodes, 16 cores at 2.4GHz per node.
+
+
+
Intel Turbo Boost Technology is used by default; you can disable it for all nodes of a job by using the resource attribute cpu_turbo_boost:
+
$ qsub -A OPEN-0-0 -q qprod -l select=4:ncpus=16 -l cpu_turbo_boost=0 -I
+

Memory Architecture

+
+
+
+
+

Compute Node Without Accelerator

+
+
+
+
+
    +
  • 2 sockets
  • Memory controllers are integrated into the processors:
    • 8 DDR3 DIMMs per node
    • 4 DDR3 DIMMs per CPU
    • 1 DDR3 DIMM per channel
    • Data rate support: up to 1600MT/s
  • Populated memory: 8x 8GB DDR3 DIMM 1600MHz
+
+
+
+
+
+
+

Compute Node With GPU or MIC Accelerator

+
+
+
+
+
    +
  • 2 sockets
  • Memory controllers are integrated into the processors:
    • 6 DDR3 DIMMs per node
    • 3 DDR3 DIMMs per CPU
    • 1 DDR3 DIMM per channel
    • Data rate support: up to 1600MT/s
  • Populated memory: 6x 16GB DDR3 DIMM 1600MHz
+
+
+
+
+
+
+

Fat Compute Node

+
+
+
+
+
    +
  • 2 sockets
  • Memory controllers are integrated into the processors:
    • 16 DDR3 DIMMs per node
    • 8 DDR3 DIMMs per CPU
    • 2 DDR3 DIMMs per channel
    • Data rate support: up to 1600MT/s
  • Populated memory: 16x 32GB DDR3 DIMM 1600MHz
+
+
+
+ +
+ + + + +
+
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/compute-nodes.md b/docs.it4i.cz/anselm-cluster-documentation/compute-nodes.md index 6a013a209383c59f0f00610c81930a9ad2130418..5f8fe6273238b3e16196742cf6090ca48af2ad80 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/compute-nodes.md +++ b/docs.it4i.cz/anselm-cluster-documentation/compute-nodes.md @@ -219,33 +219,33 @@ with accelerator). Processors support Advanced Vector Extensions (AVX) ### Intel Sandy Bridge E5-2665 Processor - eight-core -- speed2.4 GHz, up to 3.1 GHz using Turbo Boost Technology -- peak performance19.2 Gflop/s per +- speed: 2.4 GHz, up to 3.1 GHz using Turbo Boost Technology +- peak performance: 19.2 Gflop/s per core - caches:
- - L2256 KB per core - - L320 MB per processor + - L2: 256 KB per core + - L3: 20 MB per processor -- memory bandwidth at the level of the processor51.2 GB/s +- memory bandwidth at the level of the processor: 51.2 GB/s ### Intel Sandy Bridge E5-2470 Processor - eight-core -- speed2.3 GHz, up to 3.1 GHz using Turbo Boost Technology -- peak performance18.4 Gflop/s per +- speed: 2.3 GHz, up to 3.1 GHz using Turbo Boost Technology +- peak performance: 18.4 Gflop/s per core - caches:
- - L2256 KB per core - - L320 MB per processor + - L2: 256 KB per core + - L3: 20 MB per processor -- memory bandwidth at the level of the processor38.4 GB/s +- memory bandwidth at the level of the processor: 38.4 GB/s @@ -282,11 +282,11 @@ Memory Architecture - 8 DDR3 DIMMS per node - 4 DDR3 DIMMS per CPU - 1 DDR3 DIMMS per channel - - Data rate supportup to 1600MT/s + - Data rate support: up to 1600MT/s -- Populated memory8x 8GB DDR3 DIMM 1600Mhz +- Populated memory: 8x 8GB DDR3 DIMM 1600Mhz ### Compute Node With GPU or MIC Accelerator - 2 sockets @@ -296,11 +296,11 @@ Memory Architecture - 6 DDR3 DIMMS per node - 3 DDR3 DIMMS per CPU - 1 DDR3 DIMMS per channel - - Data rate supportup to 1600MT/s + - Data rate support: up to 1600MT/s -- Populated memory6x 16GB DDR3 DIMM 1600Mhz +- Populated memory: 6x 16GB DDR3 DIMM 1600Mhz ### Fat Compute Node - 2 sockets @@ -310,12 +310,11 @@ Memory Architecture - 16 DDR3 DIMMS per node - 8 DDR3 DIMMS per CPU - 2 DDR3 DIMMS per channel - - Data rate supportup to 1600MT/s + - Data rate support: up to 1600MT/s -- Populated memory16x 32GB DDR3 DIMM 1600Mhz - +- Populated memory: 16x 32GB DDR3 DIMM 1600Mhz diff --git a/docs.it4i.cz/anselm-cluster-documentation/environment-and-modules.html b/docs.it4i.cz/anselm-cluster-documentation/environment-and-modules.html new file mode 100644 index 0000000000000000000000000000000000000000..af226bf56767113654daffe83f3827bf88e121a5 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/environment-and-modules.html @@ -0,0 +1,706 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Environment and Modules — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+


+ +
+ +


+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Environment and Modules +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

Environment Customization

+

After logging in, you may want to configure the environment. Write your preferred path definitions, aliases, functions and module loads in the .bashrc file:

+
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi

# User specific aliases and functions
alias qs='qstat -a'
module load PrgEnv-gnu

# Display information to standard output - only in an interactive ssh session
if [ -n "$SSH_TTY" ]
then
module list # Display loaded modules
fi
+

Do not run commands that write to standard output (echo, module list, etc.) in .bashrc for non-interactive SSH sessions. It breaks fundamental functionality (scp, PBS) of your account! Guard such commands with a test for SSH session interactivity, as shown in the previous example.

+

Application Modules

+

In order to configure your shell for running a particular application on Anselm, we use the Module package interface.

+

The modules set up the application paths, library paths and environment variables for running particular application.

+

We also have a second modules repository, created using a tool called EasyBuild. On the Salomon cluster, all modules will be built by this tool. If you want to use software from this modules repository, please follow the instructions in the section Application Modules Path Expansion.

+

The modules may be loaded, unloaded and switched, according to momentary needs.

+

To check the available modules, use:

+
$ module avail
+

To load a module, for example the octave module, use:

+
$ module load octave
+

Loading the octave module will set up paths and environment variables of your active shell such that you are ready to run the octave software.

+

To check the loaded modules, use:

+
$ module list
+

 To unload a module, for example the octave module use

+
$ module unload octave
+

Learn more about modules by reading the module man page:

+
$ man module
+

The following modules set up the development environment:

+

PrgEnv-gnu sets up the GNU development environment in conjunction with the bullx MPI library

+

PrgEnv-intel sets up the INTEL development environment in conjunction with the Intel MPI library

+
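
For example (a minimal sketch; it assumes your modules implementation provides the standard switch subcommand, which classic environment modules do), you can change from one development environment to the other in a single step:

+
$ module load PrgEnv-gnu
$ module switch PrgEnv-gnu PrgEnv-intel   # swap the GNU environment for the Intel one
$ module list                             # verify which development environment is loaded
+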

Application Modules Path Expansion

+

All application modules on the Salomon cluster (and beyond) will be built using a tool called EasyBuild. If you want to use applications that have already been built by EasyBuild, you have to modify your MODULEPATH environment variable.

+
export MODULEPATH=$MODULEPATH:/apps/easybuild/modules/all/
+

This command expands the paths searched for modules. You can also add this command to the .bashrc file to expand the paths permanently (see the example below). After this command, you can use the same commands to list/add/remove modules as described above.
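
For instance (a sketch, assuming you want the expanded path available in every future shell), the export line can be appended to your .bashrc like this:

+
$ echo 'export MODULEPATH=$MODULEPATH:/apps/easybuild/modules/all/' >> ~/.bashrc
+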

+ +
+ + + + +
+
+ Komentáře +
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ + Navigace + +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/environment-and-modules.md b/docs.it4i.cz/anselm-cluster-documentation/environment-and-modules.md index b72a0f0b99fc28426fe83a40f149665f26a9b89b..a4e9c7abc5686caf3f59d29b5c353c369d46cd00 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/environment-and-modules.md +++ b/docs.it4i.cz/anselm-cluster-documentation/environment-and-modules.md @@ -113,4 +113,3 @@ This command expands your searched paths to modules. You can also add this command to the .bashrc file to expand paths permanently. After this command, you can use same commands to list/add/remove modules as is described above. - diff --git a/docs.it4i.cz/anselm-cluster-documentation/hardware-overview.html b/docs.it4i.cz/anselm-cluster-documentation/hardware-overview.html new file mode 100644 index 0000000000000000000000000000000000000000..a0fce7ec9adc82db3c2ab5cf6a40c560d0246d62 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/hardware-overview.html @@ -0,0 +1,1256 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Hardware Overview — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+

+ Přejít na obsah | + + Přejít na navigaci +

+ +
+ +

Osobní nástroje

+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+ + Nacházíte se zde: + + Úvod + + / + + + + + + Anselm Cluster Documentation + + / + + + + + + + + + + Hardware Overview + + + +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Hardware Overview +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

The Anselm cluster consists of 209 computational nodes named cn[1-209] of which 180 are regular compute nodes, 23 GPU Kepler K20 accelerated nodes, 4 MIC Xeon Phi 5110 accelerated nodes and 2 fat nodes. Each node is a powerful x86-64 computer, equipped with 16 cores (two eight-core Intel Sandy Bridge processors), at least 64GB RAM, and local hard drive. The user access to the Anselm cluster is provided by two login nodes login[1,2]. The nodes are interlinked by high speed InfiniBand and Ethernet networks. All nodes share 320TB /home disk storage to store the user files. The 146TB shared /scratch storage is available for the scratch data.

+

The Fat nodes are equipped with a large amount (512GB) of memory. Virtualization infrastructure provides resources to run long term servers and services in virtual mode. Fat nodes and virtual servers may access 45 TB of dedicated block storage. Accelerated nodes, fat nodes, and virtualization infrastructure are available upon request made by a PI.

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Cluster layout (racks, switches and node assignments):

User-oriented infrastructure: login1, login2, dm1
Storage: Lustre FS /home (320TB), Lustre FS /scratch (146TB)
Management infrastructure: management nodes, block storage (45 TB), virtualization infrastructure servers, service (srv) nodes

Rack 01, Switch isw0: cn1-cn18
Rack 01, Switch isw4: cn19-cn36
Rack 01, Switch isw5: cn181-cn189
Rack 02, Switch isw6: cn37-cn54
Rack 02, Switch isw9: cn55-cn72
Rack 02, Switch isw10: cn73-cn80, cn190-cn192, cn205, cn206
Rack 03, Switch isw11: cn81-cn98
Rack 03, Switch isw14: cn99-cn116
Rack 03, Switch isw15: cn117-cn126, cn193-cn195, cn207
Rack 04, Switch isw16: cn127-cn144
Rack 04, Switch isw19: cn145-cn162
Rack 04, Switch isw20: cn163-cn180
Rack 05, Switch isw21: cn196-cn204
Fat nodes: cn208, cn209
+

The cluster compute nodes cn[1-207] are organized within 13 chassis. 

+

There are four types of compute nodes:

+
    +
  • 180 compute nodes without the accelerator
  • 23 compute nodes with GPU accelerator - equipped with NVIDIA Tesla Kepler K20
  • 4 compute nodes with MIC accelerator - equipped with Intel Xeon Phi 5110P
  • 2 fat nodes - equipped with 512GB RAM and two 100GB SSD drives
+

More about Compute nodes.

+

GPU and accelerated nodes are available upon request, see the Resources Allocation Policy.

+ + +
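
As an illustration only (a sketch, not an official recipe: access to the dedicated queues is governed by the Resources Allocation Policy, and PROJECT_ID is a placeholder), an interactive job on a GPU accelerated node might be requested through the dedicated queue:

+
$ qsub -I -q qnvidia -A PROJECT_ID -l select=1:ncpus=16,walltime=01:00:00
+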

The user access to the Anselm cluster is provided by two login nodes login1, login2, and data mover node dm1. More about accessing cluster.

+
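
For example (assuming you already have valid login credentials; replace username with your own login), a shell on one of the login nodes can be obtained with SSH:

+
$ ssh username@anselm.it4i.cz      # the single DNS name is distributed between login1 and login2
+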

 The parameters are summarized in the following tables:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
In general
Primary purpose: High Performance Computing
Architecture of compute nodes: x86-64
Operating system: Linux

Compute nodes
Total: 209
Processor cores: 16 (2x8 cores)
RAM: min. 64 GB, min. 4 GB per core
Local disk drive: yes - usually 500 GB
Compute network: InfiniBand QDR, fully non-blocking, fat-tree
w/o accelerator: 180, cn[1-180]
GPU accelerated: 23, cn[181-203]
MIC accelerated: 4, cn[204-207]
Fat compute nodes: 2, cn[208-209]

In total
Total theoretical peak performance (Rpeak): 94 Tflop/s
Total max. LINPACK performance (Rmax): 73 Tflop/s
Total amount of RAM: 15.136 TB
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Node | Processor | Memory | Accelerator
w/o accelerator | 2x Intel Sandy Bridge E5-2665, 2.4GHz | 64GB | -
GPU accelerated | 2x Intel Sandy Bridge E5-2470, 2.3GHz | 96GB | NVIDIA Kepler K20
MIC accelerated | 2x Intel Sandy Bridge E5-2470, 2.3GHz | 96GB | Intel Xeon Phi P5110
Fat compute node | 2x Intel Sandy Bridge E5-2665, 2.4GHz | 512GB | -
+

  For more details please refer to the Compute nodes, Storage, and Network.

+ +
+ + + + +
+
+ Komentáře +
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ + Navigace + +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/hardware-overview.md b/docs.it4i.cz/anselm-cluster-documentation/hardware-overview.md index 0986e4ceb5bfbcb963e1d161458e1c202b11f39c..e996eddb1f9f55a75a64afce653c90d8900b0af9 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/hardware-overview.md +++ b/docs.it4i.cz/anselm-cluster-documentation/hardware-overview.md @@ -398,4 +398,3 @@ Total amount of RAM nodes](compute-nodes.html), [Storage](storage.html), and [Network](network.html). - diff --git a/docs.it4i.cz/anselm-cluster-documentation/introduction.html b/docs.it4i.cz/anselm-cluster-documentation/introduction.html new file mode 100644 index 0000000000000000000000000000000000000000..93655e169778d7117db7d61cae3269c3aefaafe1 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/introduction.html @@ -0,0 +1,674 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Introduction — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+

+ Přejít na obsah | + + Přejít na navigaci +

+ +
+ +

Osobní nástroje

+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+ + Nacházíte se zde: + + Úvod + + / + + + + + + + + Anselm Cluster Documentation + + + +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Introduction +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

Welcome to Anselm supercomputer cluster. The Anselm cluster consists of 209 compute nodes, totaling 3344 compute cores with 15TB RAM and giving over 94 Tflop/s theoretical peak performance. Each node is a powerful x86-64 computer, equipped with 16 cores, at least 64GB RAM, and 500GB harddrive. Nodes are interconnected by fully non-blocking fat-tree Infiniband network and equipped with Intel Sandy Bridge processors. A few nodes are also equipped with NVIDIA Kepler GPU or Intel Xeon Phi MIC accelerators. Read more in Hardware Overview.

+

The cluster runs bullx Linux operating system, which is compatible with the RedHat Linux family. We have installed a wide range of software packages targeted at different scientific domains. These packages are accessible via the modules environment.

+

User data shared file-system (HOME, 320TB) and job data shared file-system (SCRATCH, 146TB) are available to users.

+

The PBS Professional workload manager provides computing resources allocations and job execution.

+

Read more on how to apply for resources, obtain login credentials, and access the cluster.

+ +
+ + + + +
+
+ Komentáře +
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ + Navigace + +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/introduction.md b/docs.it4i.cz/anselm-cluster-documentation/introduction.md index c72b8f6f22ee8eb3a5bbb6d76c06e6057b5ed3ea..5be69738b59ac40ab44c1e5aaf11616b9b774647 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/introduction.md +++ b/docs.it4i.cz/anselm-cluster-documentation/introduction.md @@ -38,4 +38,3 @@ resources](../get-started-with-it4innovations/applying-for-resources.html), [obtain login credentials,](../get-started-with-it4innovations/obtaining-login-credentials.html) and [access the cluster](accessing-the-cluster.html). - diff --git a/docs.it4i.cz/anselm-cluster-documentation/network.html b/docs.it4i.cz/anselm-cluster-documentation/network.html new file mode 100644 index 0000000000000000000000000000000000000000..83ffc73ff83fc19d4a38b5c0697295dfcf2614b8 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/network.html @@ -0,0 +1,690 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Network — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+

+ Přejít na obsah | + + Přejít na navigaci +

+ +
+ +

Osobní nástroje

+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+ + Nacházíte se zde: + + Úvod + + / + + + + + + Anselm Cluster Documentation + + / + + + + + + + + + + Network + + + +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Network +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

All compute and login nodes of Anselm are interconnected by Infiniband QDR network and by Gigabit Ethernet network. Both networks may be used to transfer user data.

+

Infiniband Network

+

All compute and login nodes of Anselm are interconnected by a high-bandwidth, low-latency Infiniband QDR network (IB 4x QDR, 40 Gbps). The network topology is a fully non-blocking fat-tree.

+

The compute nodes may be accessed via the Infiniband network using the ib0 network interface, in the address range 10.2.1.1-209. MPI may be used to establish a native Infiniband connection among the nodes.

+
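
For instance (a quick sanity check, assuming a standard Linux environment on the node), you can inspect the ib0 interface and its 10.2.1.x address from within a job:

+
$ ip addr show ib0      # shows the Infiniband IP address of the current node
+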

The network provides 2170MB/s transfer rates via the TCP connection (single stream) and up to 3600MB/s via native Infiniband protocol.

+

The Fat tree topology ensures that peak transfer rates are achieved between any two nodes, independent of network traffic exchanged among other nodes concurrently.

+

Ethernet Network

+

The compute nodes may be accessed via the regular Gigabit Ethernet network interface eth0, in address range 10.1.1.1-209, or by using aliases cn1-cn209.
The network provides 114MB/s transfer rates via the TCP connection.

+

Example

+
$ qsub -q qexp -l select=4:ncpus=16 -N Name0 ./myjob
$ qstat -n -u username
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
15209.srv11 username qexp Name0 5530 4 64 -- 01:00 R 00:00
cn17/0*16+cn108/0*16+cn109/0*16+cn110/0*16

$ ssh 10.2.1.110
$ ssh 10.1.1.108
+

In this example, we access the node cn110 by Infiniband network via the ib0 interface, then from cn110 to cn108 by Ethernet network.

+ +
+ + + + +
+
+ Komentáře +
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ + Navigace + +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/network.md b/docs.it4i.cz/anselm-cluster-documentation/network.md index 416f594dee5d36da404daae3ecb283ec17732aec..b18fe77e1bf6c2a978ab319e8f4d8b9c781b8fe8 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/network.md +++ b/docs.it4i.cz/anselm-cluster-documentation/network.md @@ -7,7 +7,7 @@ Network All compute and login nodes of Anselm are interconnected by [Infiniband](http://en.wikipedia.org/wiki/InfiniBand) -DR network and by Gigabit +QDR network and by Gigabit [Ethernet](http://en.wikipedia.org/wiki/Ethernet) network. Both networks may be used to transfer user data. @@ -17,7 +17,7 @@ Infiniband Network All compute and login nodes of Anselm are interconnected by a high-bandwidth, low-latency [Infiniband](http://en.wikipedia.org/wiki/InfiniBand) -DR network (IB 4x QDR, 40 Gbps). The network topology is a fully +QDR network (IB 4x QDR, 40 Gbps). The network topology is a fully non-blocking fat-tree. The compute nodes may be accessed via the Infiniband network using ib0 @@ -57,4 +57,3 @@ $ ssh 10.1.1.108 In this example, we access the node cn110 by Infiniband network via the ib0 interface, then from cn110 to cn108 by Ethernet network. - diff --git a/docs.it4i.cz/anselm-cluster-documentation/prace.html b/docs.it4i.cz/anselm-cluster-documentation/prace.html new file mode 100644 index 0000000000000000000000000000000000000000..be8384847c5725422656aebc5ed10fff093f5dae --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/prace.html @@ -0,0 +1,924 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +PRACE User Support — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+

+ Přejít na obsah | + + Přejít na navigaci +

+ +
+ +

Osobní nástroje

+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+ + Nacházíte se zde: + + Úvod + + / + + + + + + Anselm Cluster Documentation + + / + + + + + + + + + + PRACE User Support + + + +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ PRACE User Support +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

Intro

+

PRACE users coming to Anselm as a TIER-1 system offered through the DECI calls are in general treated as standard users, so most of the general documentation applies to them as well. This section shows the main differences for quicker orientation, but often refers to the original documentation. PRACE users who don't undergo the full procedure (including signing the IT4I AuP on top of the PRACE AuP) will not have a password and thus no access to some services intended for regular users. This can lower their comfort, but otherwise they should be able to use the TIER-1 system as intended. Please see the Obtaining Login Credentials section if the same level of access is required.

+

All general PRACE User Documentation should be read before continuing reading the local documentation here.

+

Help and Support

+

If you have any troubles, need information, request support or want to install additional software, please use PRACE Helpdesk.

+

Information about the local services is provided in the introduction of the general user documentation. Please keep in mind that standard PRACE accounts don't have a password to access the web interface of the local (IT4Innovations) request tracker, so a new ticket should be created by sending an e-mail to support[at]it4i.cz.

+

Obtaining Login Credentials

+

In general, PRACE users already have a PRACE account set up through their HOMESITE (institution from their country) as a result of an awarded PRACE project proposal. This includes a signed PRACE AuP, generated and registered certificates, etc.

+

If there's a special need, a PRACE user can get a standard (local) account at IT4Innovations. To get an account on the Anselm cluster, the user needs to obtain the login credentials. The procedure is the same as for general users of the cluster, so please see the corresponding section of the general documentation here.

+

Accessing the cluster

+

Access with GSI-SSH

+

For all PRACE users the method for interactive access (login) and data transfer based on grid services from Globus Toolkit (GSI SSH and GridFTP) is supported.

+

The user will need a valid certificate and to be present in the PRACE LDAP (please contact your HOME SITE or the primary investigator of your project for LDAP account creation).

+

Most of the information needed by PRACE users accessing the Anselm TIER-1 system can be found here:

+ +

 

+

Before you start to use any of the services don't forget to create a proxy certificate from your certificate:

+
$ grid-proxy-init
+

To check whether your proxy certificate is still valid (by default it's valid 12 hours), use:

+
$ grid-proxy-info
+

 

+

To access Anselm cluster, two login nodes running GSI SSH service are available. The service is available from public Internet as well as from the internal PRACE network (accessible only from other PRACE partners).

+

Access from PRACE network:

+

It is recommended to use the single DNS name anselm-prace.it4i.cz which is distributed between the two login nodes. If needed, user can login directly to one of the login nodes. The addresses are:

+ + + + + + + + + + + + + + + + + + + + + + +
Login address | Port | Protocol | Login node
anselm-prace.it4i.cz | 2222 | gsissh | login1 or login2
login1-prace.anselm.it4i.cz | 2222 | gsissh | login1
login2-prace.anselm.it4i.cz | 2222 | gsissh | login2
+

 

+
$ gsissh -p 2222 anselm-prace.it4i.cz
+

When logging in from another PRACE system, the prace_service script can be used:

+
$ gsissh `prace_service -i -s anselm`
+

 

+

Access from public Internet:

+

It is recommended to use the single DNS name anselm.it4i.cz which is distributed between the two login nodes. If needed, user can login directly to one of the login nodes. The addresses are:

+ + + + + + + + + + + + + + + + + + + + + + +
Login address | Port | Protocol | Login node
anselm.it4i.cz | 2222 | gsissh | login1 or login2
login1.anselm.it4i.cz | 2222 | gsissh | login1
login2.anselm.it4i.cz | 2222 | gsissh | login2
+
$ gsissh -p 2222 anselm.it4i.cz
+

When logging in from another PRACE system, the prace_service script can be used:

+
$ gsissh `prace_service -e -s anselm`
+

 

+

Although the preferred and recommended file transfer mechanism is GridFTP, the GSI SSH implementation on Anselm also supports SCP, so gsiscp can be used for transferring small files:

+
$ gsiscp -P 2222 _LOCAL_PATH_TO_YOUR_FILE_ anselm.it4i.cz:_ANSELM_PATH_TO_YOUR_FILE_
+
$ gsiscp -P 2222 anselm.it4i.cz:_ANSELM_PATH_TO_YOUR_FILE_ _LOCAL_PATH_TO_YOUR_FILE_ 
+
$ gsiscp -P 2222 _LOCAL_PATH_TO_YOUR_FILE_ anselm-prace.it4i.cz:_ANSELM_PATH_TO_YOUR_FILE_
+
$ gsiscp -P 2222 anselm-prace.it4i.cz:_ANSELM_PATH_TO_YOUR_FILE_ _LOCAL_PATH_TO_YOUR_FILE_ 
+

Access to X11 applications (VNC)

+

If the user needs to run an X11 based graphical application and does not have an X11 server, the applications can be run using the VNC service. If the user is using regular SSH based access, please see the section in the general documentation.

+

If the user uses GSI SSH based access, then the procedure is similar to the SSH based access (look here), only the port forwarding must be done using GSI SSH:

+
$ gsissh -p 2222 anselm.it4i.cz -L 5961:localhost:5961
+

Access with SSH

+

After successfully obtaining login credentials for the local IT4Innovations account, PRACE users can access the cluster as regular users using SSH. For more information, please see the section in the general documentation.

+

File transfers

+

PRACE users can use the same transfer mechanisms as regular users (if they've undergone the full registration procedure). For information about this, please see the section in the general documentation.

+

Apart from the standard mechanisms, for PRACE users to transfer data to/from Anselm cluster, a GridFTP server running Globus Toolkit GridFTP service is available. The service is available from public Internet as well as from the internal PRACE network (accessible only from other PRACE partners).

+

There's one control server and three backend servers for striping and/or backup in case one of them fails.

+

Access from PRACE network:

+ + + + + + + + + + + + + + + + + + + + + + + + +
Login address | Port | Node role
gridftp-prace.anselm.it4i.cz | 2812 | Front end / control server
login1-prace.anselm.it4i.cz | 2813 | Backend / data mover server
login2-prace.anselm.it4i.cz | 2813 | Backend / data mover server
dm1-prace.anselm.it4i.cz | 2813 | Backend / data mover server
+

Copy files to Anselm by running the following commands on your local machine:

+
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://gridftp-prace.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
+

Or by using prace_service script:

+
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://`prace_service -i -f anselm`/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
+

Copy files from Anselm:

+
$ globus-url-copy gsiftp://gridftp-prace.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
+

Or by using prace_service script:

+
$ globus-url-copy gsiftp://`prace_service -i -f anselm`/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
+
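
For larger transfers it may help to enable parallel data streams and striping; a sketch only (the -p and -stripe options are standard globus-url-copy flags, but the chosen values are an assumption and the optimal settings depend on your network path):

+
$ globus-url-copy -p 4 -stripe file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://gridftp-prace.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
+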

 

+

Access from public Internet:

+ + + + + + + + + + + + + + + + + + + + + + + + +
Login address | Port | Node role
gridftp.anselm.it4i.cz | 2812 | Front end / control server
login1.anselm.it4i.cz | 2813 | Backend / data mover server
login2.anselm.it4i.cz | 2813 | Backend / data mover server
dm1.anselm.it4i.cz | 2813 | Backend / data mover server
+

Copy files to Anselm by running the following commands on your local machine:

+
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://gridftp.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
+

Or by using prace_service script:

+
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://`prace_service -e -f anselm`/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
+

Copy files from Anselm:

+
$ globus-url-copy gsiftp://gridftp.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
+

Or by using prace_service script:

+
$ globus-url-copy gsiftp://`prace_service -e -f anselm`/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
+

 

+

Generally both shared file systems are available through GridFTP:

+ + + + + + + + + + + + + + +
File system mount point | Filesystem | Comment
/home | Lustre | Default HOME directories of users in format /home/prace/login/
/scratch | Lustre | Shared SCRATCH mounted on the whole cluster
+

More information about the shared file systems is available here.

+

Usage of the cluster

+

There are some limitations for PRACE users when using the cluster. By default, PRACE users aren't allowed to access special queues in PBS Pro that provide high priority or exclusive access to special equipment such as accelerated nodes and high memory (fat) nodes. There may also be restrictions on obtaining a working license for the commercial software installed on the cluster, mostly because of the license agreement or because of an insufficient number of licenses.

+

For production runs always use scratch file systems, either the global shared or the local ones. The available file systems are described here.

+

Software, Modules and PRACE Common Production Environment

+

All system wide installed software on the cluster is made available to the users via the modules. The information about the environment and modules usage is in this section of general documentation.

+

PRACE users can use the "prace" module to use the PRACE Common Production Environment.

+
$ module load prace
+

 

+

Resource Allocation and Job Execution

+

General information about the resource allocation, job queuing and job execution is in this section of general documentation.

+

For PRACE users, the default production run queue is "qprace". PRACE users can also use two other queues "qexp" and "qfree".

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
queue | active project | project resources | nodes | priority | authorization | walltime (default/max)
qexp (Express queue) | no | none required | 2 reserved, 8 total | high | no | 1 / 1h
qprace (Production queue) | yes | > 0 | 178 w/o accelerator | medium | no | 24 / 48h
qfree (Free resource queue) | yes | none required | 178 w/o accelerator | very low | no | 12 / 12h
+

qprace, the PRACE Production queue: This queue is intended for normal production runs. It is required that an active project with nonzero remaining resources is specified to enter qprace. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qprace is 48 hours. If the job needs a longer time, it must use checkpoint/restart functionality.

+

Accounting & Quota

+

The resources that are currently subject to accounting are the core hours. The core hours are accounted on the wall clock basis. The accounting runs whenever the computational cores are allocated or blocked via the PBS Pro workload manager (the qsub command), regardless of whether the cores are actually used for any calculation. See example in the general documentation.

+

PRACE users should check their project accounting using the PRACE Accounting Tool (DART).

+

Users who have undergone the full local registration procedure (including signing the IT4Innovations Acceptable Use Policy) and who have received a local password may check at any time how many core-hours have been consumed by themselves and their projects, using the command "it4ifree". Please note that you need to know your user password to use the command, and that the displayed core hours are "system core hours", which differ from PRACE "standardized core hours".

+

The it4ifree command is a part of it4i.portal.clients package, located here:
https://pypi.python.org/pypi/it4i.portal.clients

+
$ it4ifree
Password:
     PID   Total Used  ...by me Free
   -------- ------- ------ -------- -------
   OPEN-0-0 1500000 400644   225265 1099356
   DD-13-1    10000 2606 2606 7394
+
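
If the it4ifree client is missing on the machine you are working from, it can typically be installed from the PyPI package above into your user environment (a sketch; the exact installation method may differ):

+
$ pip install --user it4i.portal.clients
+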

 

+

By default, a file system quota is applied. To check the current status of the quota, use:

+
$ lfs quota -u USER_LOGIN /home
$ lfs quota -u USER_LOGIN /scratch
+

If the quota is insufficient, please contact the support and request an increase.

+

 

+

 

+ +
+ + + + +
+
+ Komentáře +
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ + Navigace + +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/prace.md b/docs.it4i.cz/anselm-cluster-documentation/prace.md index bf274bdc154a441ab2d2cac984d48174e3790bd8..a93c53a7af6feaa9afc033905d39635bacb943e0 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/prace.md +++ b/docs.it4i.cz/anselm-cluster-documentation/prace.md @@ -318,7 +318,7 @@ users can also use two other queues "qexp" and "qfree". Free resource queue ------------------------------------------------------------------------------------------------------------------------- -**qprace**, the PRACE Production queue****This queue is intended for +**qprace**, the PRACE Production queue****: This queue is intended for normal production runs. It is required that active project with nonzero remaining resources is specified to enter the qprace. The queue runs with medium priority and no special authorization is required to use it. @@ -373,4 +373,3 @@ increase.     - diff --git a/docs.it4i.cz/anselm-cluster-documentation/remote-visualization.html b/docs.it4i.cz/anselm-cluster-documentation/remote-visualization.html new file mode 100644 index 0000000000000000000000000000000000000000..2003dbc3edf3ea887cefd6102bf60df3273a157c --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/remote-visualization.html @@ -0,0 +1,802 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Remote visualization service — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+

+ Přejít na obsah | + + Přejít na navigaci +

+ +
+ +

Osobní nástroje

+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+ + Nacházíte se zde: + + Úvod + + / + + + + + + Anselm Cluster Documentation + + / + + + + + + + + + + Remote visualization service + + + +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Remote visualization service +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

Introduction

+

The goal of this service is to provide users with GPU-accelerated use of OpenGL applications, especially for pre- and post-processing work, where not only GPU performance is needed but also fast access to the shared file systems of the cluster and a reasonable amount of RAM.

+

The service is based on integration of open source tools VirtualGL and TurboVNC together with the cluster's job scheduler PBS Professional.

+

Currently two compute nodes are dedicated to this service, with the following configuration for each node:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Visualization node configuration
CPU: 2x Intel Sandy Bridge E5-2670, 2.6GHz
Processor cores: 16 (2x8 cores)
RAM: 64 GB, min. 4 GB per core
GPU: NVIDIA Quadro 4000, 2GB RAM
Local disk drive: yes - 500 GB
Compute network: InfiniBand QDR
+

Schematic overview

+

rem_vis_scheme

+

rem_vis_legend

+

How to use the service

+

Setup and start your own TurboVNC server.

+

TurboVNC is designed and implemented for cooperation with VirtualGL and available for free for all major platforms. For more information and download, please refer to: http://sourceforge.net/projects/turbovnc/

+

Always use TurboVNC on both sides (server and client); don't mix TurboVNC with other VNC implementations (TightVNC, TigerVNC, ...), as the VNC protocol implementations may differ slightly and diminish your user experience by introducing picture artifacts, etc.

+

The procedure is:

+

1. Connect to a login node.

+

Please follow the documentation.

+

2. Run your own instance of TurboVNC server.

+

To have the OpenGL acceleration, 24 bit color depth must be used. Otherwise only the geometry (desktop size) definition is needed.

+

The first time you run the VNC server, you need to define a password.

+

This example defines desktop with dimensions 1200x700 pixels and 24 bit color depth.

+
$ module load turbovnc/1.2.2 
$ vncserver -geometry 1200x700 -depth 24

Desktop 'TurboVNC: login2:1 (username)' started on display login2:1

Starting applications specified in /home/username/.vnc/xstartup.turbovnc
Log file is /home/username/.vnc/login2:1.log
+

3. Remember which display number your VNC server runs on (you will need it later to stop the server).

+
$ vncserver -list 

TurboVNC server sessions:

X DISPLAY # PROCESS ID
:1 23269
+

In this example the VNC server runs on display :1.

+

4. Remember the exact login node, where your VNC server runs.

+
$ uname -n
login2
+

In this example the VNC server runs on login2.

+

5. Remember on which TCP port your own VNC server is running.

+

To get the port, you have to look into the log file of your VNC server.

+
$ grep -E "VNC.*port" /home/username/.vnc/login2:1.log 
20/02/2015 14:46:41 Listening for VNC connections on TCP port 5901
+

In this example the VNC server listens on TCP port 5901.

+

6. Connect to the login node where your VNC server runs with SSH to tunnel your VNC session.

+

Tunnel the TCP port on which your VNC server is listening.

+
$ ssh login2.anselm.it4i.cz -L 5901:localhost:5901 
+

If you use Windows and Putty, please refer to port forwarding setup in the documentation: https://docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/x-window-and-vnc#section-12

+

7. If you don't have Turbo VNC installed on your workstation.

+

Get it from: http://sourceforge.net/projects/turbovnc/

+

8. Run TurboVNC Viewer from your workstation.

+

Mind that you should connect through the SSH tunneled port. In this example it is 5901 on your workstation (localhost).

+
$ vncviewer localhost:5901 
+

If you use Windows version of TurboVNC Viewer, just run the Viewer and use address *localhost:5901*.

+

9. Proceed to the chapter "Access the visualization node."

+

Now you should have a working TurboVNC session connected to your workstation.

+

10. After you end your visualization session.

+

Don't forget to correctly shutdown your own VNC server on the login node!

+
$ vncserver -kill :1 
+

Access the visualization node

+

To access the node, use the dedicated PBS Professional scheduler queue qviz. The queue has the following properties:

+ + + + + + + + + + + + + + +
queue | active project | project resources | nodes | min ncpus* | priority | authorization | walltime (default/max)
qviz (Visualization queue) | yes | none required | 2 | 4 | 150 | no | 1 hour / 2 hours
+

Currently, when accessing the node, each user gets 4 cores of a CPU allocated, thus approximately 16 GB of RAM and 1/4 of the GPU capacity. If more GPU power or RAM is required, it is recommended to allocate one whole node per user, so that all 16 cores, the whole RAM and the whole GPU are exclusive. This is currently also the maximum allowed allocation per user. One hour of work is allocated by default; the user may ask for 2 hours maximum.

+

To access the visualization node, follow these steps:

+

1. In your VNC session, open a terminal and allocate a node using PBSPro qsub command.

+

This step is necessary to allow you to proceed with next steps.

+
$ qsub -I -q qviz -A PROJECT_ID 
+

In this example the default values for CPU cores and usage time are used.

+
$ qsub -I -q qviz -A PROJECT_ID -l select=1:ncpus=16 -l walltime=02:00:00 
+

Substitute PROJECT_ID with the assigned project identification string.

+

In this example a whole node for 2 hours is requested.

+

If there are free resources for your request, you will have a shell running on an assigned node. Please remember the name of the node.

+
$ uname -n
srv8
+

In this example the visualization session was assigned to node srv8.

+

2. In your VNC session open another terminal (keep the one with interactive PBSPro job open).

+

Setup the VirtualGL connection to the node, which PBSPro allocated for your job.

+
$ vglconnect srv8 
+

You will be connected to the visualization node through the created VirtualGL tunnel, and you will have a shell there.

+

3. Load the VirtualGL module.

+
$ module load virtualgl/2.4 
+

4. Run your desired OpenGL accelerated application using VirtualGL script "vglrun".

+
$ vglrun glxgears 
+

Please note that if you want to run an OpenGL application which is available through modules, you need to load the respective module first. E.g., to run the Mentat OpenGL application from the MARC software package, use:

+
$ module load marc/2013.1 
$ vglrun mentat
+

5. After you end your work with the OpenGL application.

+

Just log out from the visualization node, exit both opened terminals, and end your VNC server session as described above.

+
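
A minimal cleanup sequence might look like this (a sketch; the display number :1 and the login node follow the examples above):

+
$ exit                  # leave the visualization node (ends the vglconnect shell)
$ exit                  # end the interactive PBS job in the other terminal
$ vncserver -kill :1    # on the login node, shut down your TurboVNC server
+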

Tips and Tricks

+

If you want to increase the responsiveness of the visualization, please adjust your TurboVNC client settings in this way:

+

rem_vis_settings

+

To get an idea of how the settings affect the resulting picture quality, three levels of "JPEG image quality" are demonstrated:

+

1. JPEG image quality = 30

+

rem_vis_q3

+

2. JPEG image quality = 15

+

rem_vis_q2

+

3. JPEG image quality = 10

+

rem_vis_q1

+



+ +
+ + + + +
+
+ Komentáře +
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ + Navigace + +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/remote-visualization.md b/docs.it4i.cz/anselm-cluster-documentation/remote-visualization.md index 9cfd5efcd599cebcf7436c9d10a490ee82fe8a00..9871e2aab91188bd4ba65f89412433b7f84723e9 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/remote-visualization.md +++ b/docs.it4i.cz/anselm-cluster-documentation/remote-visualization.md @@ -43,7 +43,7 @@ How to use the service TurboVNC is designed and implemented for cooperation with VirtualGL and available for free for all major platforms. For more information and -download, please refer to +download, please refer to: **Always use TurboVNC on both sides** (server and client) **don't mix TurboVNC and other VNC implementations** (TightVNC, TigerVNC, ...) as @@ -71,7 +71,7 @@ color depth. $ module load turbovnc/1.2.2 $ vncserver -geometry 1200x700 -depth 24 -Desktop 'TurboVNClogin2:1 (username)' started on display login2:1 +Desktop 'TurboVNC: login2:1 (username)' started on display login2:1 Starting applications specified in /home/username/.vnc/xstartup.turbovnc Log file is /home/username/.vnc/login2:1.log @@ -82,7 +82,7 @@ Log file is /home/username/.vnc/login2:1.log ``` $ vncserver -list -TurboVNC server sessions +TurboVNC server sessions: X DISPLAY # PROCESS ID :1 23269 @@ -124,7 +124,7 @@ $ ssh login2.anselm.it4i.cz -L 5901:localhost:5901 #### 7. If you don't have Turbo VNC installed on your workstation. {#7-if-you-don-t-have-turbo-vnc-installed-on-your-workstation} -Get it from +Get it from: #### 8. Run TurboVNC Viewer from your workstation. {#8-run-turbovnc-viewer-from-your-workstation} @@ -297,3 +297,7 @@ quality three levels of "JPEG image quality" are demonstrated: 3. JPEG image quality = 10 ![rem_vis_q1](quality1.png "rem_vis_q1") + + + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution.html b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution.html new file mode 100644 index 0000000000000000000000000000000000000000..59eab7c76b82dfe0cfcddb165590be06a88cd5d5 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution.html @@ -0,0 +1,772 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Resource Allocation and Job Execution — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+

+ Přejít na obsah | + + Přejít na navigaci +

+ +
+ +

Osobní nástroje

+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+ + Nacházíte se zde: + + Úvod + + / + + + + + + Anselm Cluster Documentation + + / + + + + + + + + + + Resource Allocation and Job Execution + + + +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Resource Allocation and Job Execution +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

To run a job, computational resources for this particular job must be allocated. This is done via the PBS Pro job workload manager software, which efficiently distributes workloads across the supercomputer. Extensive information about PBS Pro can be found in the official documentation here, especially in the PBS Pro User's Guide.

+

Resources Allocation Policy

+

The resources are allocated to the job in a fairshare fashion, subject to constraints set by the queue and the resources available to the Project. The fairshare at Anselm ensures that individual users may consume approximately equal amounts of resources per week. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following queues are available to Anselm users:

+
    +
  • qexp, the Express queue
  • qprod, the Production queue
  • qlong, the Long queue
  • qnvidia, qmic, qfat, the Dedicated queues
  • qfree, the Free resource utilization queue
+

Check the queue status at https://extranet.it4i.cz/anselm/

+

Read more on the Resource Allocation Policy page.

+

Job submission and execution

+

Use the qsub command to submit your jobs.

+

The qsub command submits the job into the queue. It creates a request to the PBS Job manager for allocation of the specified resources. The smallest allocation unit is an entire node (16 cores), with the exception of the qexp queue. The resources will be allocated when available, subject to allocation policies and constraints. After the resources are allocated, the jobscript or interactive shell is executed on the first of the allocated nodes.

+
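
For illustration (a sketch only; PROJECT_ID, the queue and the resource request are placeholders, see the Job submission and execution page for the authoritative syntax), a two-node job could be submitted like this:

+
$ qsub -A PROJECT_ID -q qprod -l select=2:ncpus=16,walltime=04:00:00 ./myjob
+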

Read more on the Job submission and execution page.

+

Capacity computing

+

Use Job arrays when running a huge number of jobs.
Use GNU Parallel and/or Job arrays when running (many) single core jobs.

+

In many cases, it is useful to submit a huge number (100+) of computational jobs into the PBS queue system. A huge number of (small) jobs is one of the most effective ways to execute embarrassingly parallel calculations, achieving the best runtime, throughput and computer utilization. In this chapter, we discuss the recommended way to run a huge number of jobs, including ways to run a huge number of single core jobs.

+

Read more on Capacity computing page.

+ +
+ + + + +
+
+ Komentáře +
+ + + + + +
+
+ + + +
+ +
+ + + + +
+ +
+ + +
+ + + +
+ + +
+ + + + + + +
+
+ +
+
+ + + + + + + + +
+
+ + +
+ + +
+
+ +
+ + Navigace + +
+ +
+ + + +
+
+ +
+ + + + + +
+ + + +
+ + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution.md b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution.md index d9b684756d039168a91acab2a5036fa41aaa10f5..83bd91c430e3ab53dedd0669940c5782e5334387 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution.md +++ b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution.md @@ -74,4 +74,3 @@ jobs**. Read more on [Capacity computing](resource-allocation-and-job-execution/capacity-computing.html) page. - diff --git a/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/capacity-computing.html b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/capacity-computing.html new file mode 100644 index 0000000000000000000000000000000000000000..67662f8dc16ffe781f43a39a9feb8026521cf240 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/capacity-computing.html @@ -0,0 +1,867 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Capacity computing — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+
+

+ Přejít na obsah | + + Přejít na navigaci +

+ +
+ +

Osobní nástroje

+ + + + + +
+ + + + + + + +
+ +
+
+
+ +
+ +
+ + Nacházíte se zde: + + Úvod + + / + + + + + + Anselm Cluster Documentation + + / + + + + + + + + Resource Allocation and Job Execution + + / + + + + + + + + + + Capacity computing + + + +
+
+ + +
+ + + + + + + + + + +
+ + + + + +
+ + + + +

+ Capacity computing +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

Introduction

+

In many cases, it is useful to submit a huge number (100+) of computational jobs into the PBS queue system. A huge number of (small) jobs is one of the most effective ways to execute embarrassingly parallel calculations, achieving the best runtime, throughput and computer utilization.

+

However, executing a huge number of jobs via the PBS queue may strain the system. This strain may result in slow response to commands, inefficient scheduling, and an overall degradation of performance and user experience for all users. For this reason, the number of jobs is limited to 100 per user and 1000 per job array.

+

Please follow one of the procedures below, in case you wish to schedule more than 100 jobs at a time.

+ +

Policy

+
    +
  1. A user is allowed to submit at most 100 jobs. Each job may be a job array.
  2. The array size is at most 1000 subjobs.
+

Job arrays

+

Huge number of jobs may be easily submitted and managed as a job array.

+

A job array is a compact representation of many jobs, called subjobs. The subjobs share the same job script, and have the same values for all attributes and resources, with the following exceptions:

+
    +
  • each subjob has a unique index, $PBS_ARRAY_INDEX
  • job identifiers of subjobs differ only by their indices
  • the state of subjobs can differ (R, Q, ... etc.)
+

All subjobs within a job array have the same scheduling priority and schedule as independent jobs.
The entire job array is submitted through a single qsub command and may be managed by the qdel, qalter, qhold, qrls and qsig commands as a single job.

+

Shared jobscript

+

All subjobs in a job array use the very same, single jobscript. Each subjob runs its own instance of the jobscript. The instances execute different work, controlled by the $PBS_ARRAY_INDEX variable.

+

Example:

+

Assume we have 900 input files with names beginning with "file" (e.g. file001, ..., file900). Assume we would like to use each of these input files with the program executable myprog.x, each as a separate job.

+

First, we create a tasklist file (or subjobs list), listing all tasks (subjobs) - all input files in our example:

+
$ find . -name 'file*' > tasklist
+

Then we create jobscript:

+
#!/bin/bash
#PBS -A PROJECT_ID
#PBS -q qprod
#PBS -l select=1:ncpus=16,walltime=02:00:00

# change to local scratch directory
SCR=/lscratch/$PBS_JOBID
mkdir -p $SCR ; cd $SCR || exit

# get individual tasks from tasklist with index from PBS JOB ARRAY
TASK=$(sed -n "${PBS_ARRAY_INDEX}p" $PBS_O_WORKDIR/tasklist)

# copy input file and executable to scratch
cp $PBS_O_WORKDIR/$TASK input ; cp $PBS_O_WORKDIR/myprog.x .

# execute the calculation
./myprog.x < input > output

# copy output file to submit directory
cp output $PBS_O_WORKDIR/$TASK.out
+

In this example, the submit directory holds the 900 input files, the executable myprog.x and the jobscript file. As input for each run, we take the filename of an input file from the created tasklist file. We copy the input file to the local scratch /lscratch/$PBS_JOBID, execute myprog.x and copy the output file back to the submit directory, under the $TASK.out name. The myprog.x runs on one node only and must use threads to run in parallel. Be aware that if myprog.x is not multithreaded, then all the jobs run as single-thread programs in a sequential manner. Due to the allocation of the whole node, the accounted time is equal to the usage of the whole node, while only 1/16 of the node is actually used!

+

If a huge number of parallel multicore jobs (i.e. multinode, multithreaded, e.g. MPI-enabled) needs to be run, then the job array approach should also be used. The main difference compared to the previous example using one node is that the local scratch should not be used (as it's not shared between nodes) and MPI or another technique for parallel multinode runs has to be used properly.

+

Submit the job array

+

To submit the job array, use the qsub -J command. The 900 jobs of the example above may be submitted like this:

+
$ qsub -N JOBNAME -J 1-900 jobscript
12345[].dm2
+

In this example, we submit a job array of 900 subjobs. Each subjob will run on a full node and is assumed to take less than 2 hours (please note the #PBS directives at the beginning of the jobscript file; don't forget to set your valid PROJECT_ID and desired queue).

+

Sometimes, for testing purposes, you may need to submit a one-element array. This is not allowed by PBSPro, but there's a workaround:

+
$ qsub -N JOBNAME -J 9-10:2 jobscript
+

This will only choose the lower index (9 in this example) for submitting/running your job.

+

Manage the job array

+

Check status of the job array by the qstat command.

+
$ qstat -a 12345[].dm2

dm2:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
12345[].dm2 user2 qprod xx 13516 1 16 -- 00:50 B 00:02
+

The status B means that some subjobs are already running.

+

Check status of the first 100 subjobs by the qstat command.

+
$ qstat -a 12345[1-100].dm2

dm2:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
12345[1].dm2 user2 qprod xx 13516 1 16 -- 00:50 R 00:02
12345[2].dm2 user2 qprod xx 13516 1 16 -- 00:50 R 00:02
12345[3].dm2 user2 qprod xx 13516 1 16 -- 00:50 R 00:01
12345[4].dm2 user2 qprod xx 13516 1 16 -- 00:50 Q --
. . . . . . . . . . .
, . . . . . . . . . .
12345[100].dm2 user2 qprod xx 13516 1 16 -- 00:50 Q --
+

Delete the entire job array. Running subjobs will be killed, queueing subjobs will be deleted.

+
$ qdel 12345[].dm2
+

Deleting large job arrays may take a while.

+

Display status information for all user's jobs, job arrays, and subjobs.

+
$ qstat -u $USER -t
+

Display status information for all user's subjobs.

+
$ qstat -u $USER -tJ
+

Read more on job arrays in the PBSPro Users guide.

+

GNU parallel

+

Use GNU parallel to run many single core tasks on one node.

+

GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. GNU parallel is most useful in running single core jobs via the queue system on  Anselm.

+

For more information and examples see the parallel man page:

+
$ module add parallel
$ man parallel
+
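
As a quick illustration of what GNU parallel does (a toy example, unrelated to the jobscripts below), it reads items from its input and runs the given command once per item, filling the available cores:

+
$ seq 1 4 | parallel echo "processing item {}"    # {} is replaced by each input line; four echo commands run in parallel
+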

GNU parallel jobscript

+

The GNU parallel shell executes multiple instances of the jobscript using all cores on the node. The instances execute different work, controlled by the $PARALLEL_SEQ variable.

+

Example:

+

Assume we have 101 input files with names beginning with "file" (e.g. file001, ..., file101). Assume we would like to use each of these input files with the program executable myprog.x, each as a separate single core job. We call these single core jobs tasks.

+

First, we create a tasklist file, listing all tasks - all input files in our example:

+
$ find . -name 'file*' > tasklist
+

Then we create jobscript:

+
#!/bin/bash
#PBS -A PROJECT_ID
#PBS -q qprod
#PBS -l select=1:ncpus=16,walltime=02:00:00

[ -z "$PARALLEL_SEQ" ] && \
{ module add parallel ; exec parallel -a $PBS_O_WORKDIR/tasklist $0 ; }

# change to local scratch directory
SCR=/lscratch/$PBS_JOBID/$PARALLEL_SEQ
mkdir -p $SCR ; cd $SCR || exit

# get individual task from tasklist
TASK=$1  

# copy input file to scratch
cp $PBS_O_WORKDIR/$TASK input

# execute the calculation
cat  input > output

# copy output file to submit directory
cp output $PBS_O_WORKDIR/$TASK.out
+

In this example, tasks from the tasklist are executed via GNU parallel. The jobscript executes multiple instances of itself in parallel, on all cores of the node. Once an instance of the jobscript is finished, a new instance starts until all entries in the tasklist are processed. The currently processed entry of the joblist may be retrieved via the $1 variable. The variable $TASK expands to one of the input filenames from the tasklist. We copy the input file to local scratch, execute myprog.x and copy the output file back to the submit directory, under the $TASK.out name.

+

Submit the job

+

To submit the job, use the qsub command. The 101 tasks' job of the example above may be submitted like this:

+
$ qsub -N JOBNAME jobscript
12345.dm2
+

In this example, we submit a job of 101 tasks. 16 input files will be processed in  parallel. The 101 tasks on 16 cores are assumed to complete in less than 2 hours.

+

Please note the #PBS directives at the beginning of the jobscript file; don't forget to set your valid PROJECT_ID and desired queue.

+

Job arrays and GNU parallel

+

Combine the Job arrays and GNU parallel for best throughput of single core jobs

+

While job arrays are able to utilize all available computational nodes, the GNU parallel can be used to efficiently run multiple single-core jobs on single node. The two approaches may be combined to utilize all available (current and future) resources to execute single core jobs.

+

Every subjob in an array runs GNU parallel to utilize all cores on the node

+

GNU parallel, shared jobscript

+

A combined approach, very similar to job arrays, can be taken. A job array is submitted to the queuing system. The subjobs run GNU parallel. The GNU parallel shell executes multiple instances of the jobscript using all cores on the node. The instances execute different work, controlled by the $PBS_ARRAY_INDEX and $PARALLEL_SEQ variables.

+

Example:

+

Assume we have 992 input files with names beginning with "file" (e.g. file001, ..., file992). Assume we would like to use each of these input files with the program executable myprog.x, each as a separate single core job. We call these single core jobs tasks.

+

First, we create a tasklist file, listing all tasks - all input files in our example:

+
$ find . -name 'file*' > tasklist
+

Next we create a file controlling how many tasks will be executed in one subjob:

+
$ seq 32 > numtasks
+

Then we create jobscript:

+
#!/bin/bash
#PBS -A PROJECT_ID
#PBS -q qprod
#PBS -l select=1:ncpus=16,walltime=02:00:00

[ -z "$PARALLEL_SEQ" ] && \
{ module add parallel ; exec parallel -a $PBS_O_WORKDIR/numtasks $0 ; }

# change to local scratch directory
SCR=/lscratch/$PBS_JOBID/$PARALLEL_SEQ
mkdir -p $SCR ; cd $SCR || exit

# get individual task from tasklist with index from PBS JOB ARRAY and index form Parallel
IDX=$(($PBS_ARRAY_INDEX + $PARALLEL_SEQ - 1))
TASK=$(sed -n "${IDX}p" $PBS_O_WORKDIR/tasklist)
[ -z "$TASK" ] && exit

# copy input file to scratch
cp $PBS_O_WORKDIR/$TASK input

# execute the calculation
cat input > output

# copy output file to submit directory
cp output $PBS_O_WORKDIR/$TASK.out
+

In this example, the jobscript executes in multiple instances in parallel, on all cores of a computing node. The variable $TASK expands to one of the input filenames from the tasklist. We copy the input file to local scratch, execute myprog.x and copy the output file back to the submit directory, under the $TASK.out name. The numtasks file controls how many tasks will be run per subjob. Once a task is finished, a new task starts, until the number of tasks in the numtasks file is reached.

+
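To make the indexing concrete: with 32 tasks per subjob (the numtasks file above), the array below is submitted with a step of 32, so a hypothetical subjob whose $PBS_ARRAY_INDEX is 33 processes tasklist lines 33 through 64. A minimal sketch of the arithmetic:

$ PBS_ARRAY_INDEX=33; PARALLEL_SEQ=1
$ echo $(( PBS_ARRAY_INDEX + PARALLEL_SEQ - 1 ))
33
$ PARALLEL_SEQ=32
$ echo $(( PBS_ARRAY_INDEX + PARALLEL_SEQ - 1 ))
64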

Select the subjob walltime and the number of tasks per subjob carefully.

When deciding these values, consider the following guiding rules:

1. Let n = N/16. The inequality (n+1) * T < W should hold, where N is the number of tasks per subjob, T is the expected single-task walltime and W is the subjob walltime. A short subjob walltime improves scheduling and job throughput.
2. The number of tasks should be a multiple of 16.
3. These rules are valid only when all tasks have similar task walltimes T.
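As a worked example of rule 1: with N = 32 tasks per subjob, n = 32/16 = 2; assuming a hypothetical single-task walltime T of about 20 minutes, (n+1) * T = 3 * 20 min = 60 min, which fits comfortably within the 2-hour subjob walltime used in the example jobscript.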

Submit the job array

+

To submit the job array, use the qsub -J command. The 992 tasks' job of the example above may be submitted like this:

+
$ qsub -N JOBNAME -J 1-992:32 jobscript
12345[].dm2
+

In this example, we submit a job array of 31 subjobs. Note the -J 1-992:32; the step (32) must be the same as the number in the numtasks file. Each subjob will run on a full node and process 16 input files in parallel, 32 in total per subjob. Every subjob is assumed to complete in less than 2 hours.

+

Please note the #PBS directives in the beginning of the jobscript file, and don't forget to set your valid PROJECT_ID and the desired queue.

+

Examples

+

Download the examples in capacity.zip, illustrating the ways listed above to run a huge number of jobs. We recommend trying out the examples before using this approach for running production jobs.

+

Unzip the archive in an empty directory on Anselm and follow the instructions in the README file

+
$ unzip capacity.zip
$ cat README
+

 

diff --git a/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/capacity-computing.md b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/capacity-computing.md

diff --git a/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/introduction.html b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/introduction.html

Resource Allocation and Job Execution

To run a job, computational resources for this particular job must be allocated. This is done via the PBS Pro job workload manager software, which efficiently distributes workloads across the supercomputer. Extensive information about PBS Pro can be found in the official documentation, especially in the PBS Pro User's Guide.

+

Resources Allocation Policy

+

The resources are allocated to the job in a fairshare fashion, subject to constraints set by the queue and the resources available to the Project. Fairshare at Anselm ensures that individual users may consume approximately equal amounts of resources per week. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following queues are available to Anselm users:

• qexp, the Express queue
• qprod, the Production queue
• qlong, the Long queue
• qnvidia, qmic, qfat, the Dedicated queues
• qfree, the Free resource utilization queue

Check the queue status at https://extranet.it4i.cz/anselm/

+

Read more on the Resource Allocation Policy page.

+

Job submission and execution

+

Use the qsub command to submit your jobs.

+

The qsub command submits the job into the queue and creates a request to the PBS Job manager for allocation of the specified resources. The smallest allocation unit is an entire node, 16 cores, with the exception of the qexp queue. The resources will be allocated when available, subject to allocation policies and constraints. After the resources are allocated, the jobscript or interactive shell is executed on the first of the allocated nodes.

+
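A minimal submission sketch (PROJECT_ID, the node count and the walltime are placeholders; see the Job submission and execution page for complete examples):

$ qsub -A PROJECT_ID -q qprod -l select=2:ncpus=16,walltime=04:00:00 ./myjob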

Read more on the Job submission and execution page.

+

Capacity computing

+

Use Job arrays when running a huge number of jobs.
Use GNU Parallel and/or Job arrays when running (many) single core jobs.

+

In many cases, it is useful to submit a huge number (100+) of computational jobs into the PBS queue system. A huge number of (small) jobs is one of the most effective ways to execute embarrassingly parallel calculations, achieving the best runtime, throughput and computer utilization. In this chapter, we discuss the recommended way to run a huge number of jobs, including ways to run a huge number of single core jobs.

+

Read more on Capacity computing page.

diff --git a/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/introduction.md b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/introduction.md

diff --git a/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/job-priority.html b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/job-priority.html

Job scheduling

Job execution priority

+

The scheduler gives each job an execution priority and then uses this priority to select which job(s) to run.

+

Job execution priority on Anselm is determined by these job properties (in order of importance):

1. queue priority
2. fairshare priority
3. eligible time

Queue priority

+

Queue priority is the priority of the queue in which the job is queued before execution.

Queue priority has the biggest impact on job execution priority. The execution priority of jobs in higher priority queues is always greater than the execution priority of jobs in lower priority queues. Other job properties used for determining job execution priority (fairshare priority, eligible time) cannot compete with queue priority.

+

Queue priorities can be seen at https://extranet.it4i.cz/anselm/queues

+

Fairshare priority

+

Fairshare priority is a priority calculated from recent usage of resources. It is calculated per project; all members of a project share the same fairshare priority. Projects with higher recent usage have lower fairshare priority than projects with lower or no recent usage.

+

Fairshare priority is used for ranking jobs with equal queue priority.

+

Fairshare priority is calculated as

    fairshare_priority = MAX_FAIRSHARE * (1 - usageProject / usageTotal)

where MAX_FAIRSHARE has the value 1E6,
usageProject is the cumulated usage by all members of the selected project,
usageTotal is the total usage by all users, across all projects.

+

Usage counts allocated core-hours (ncpus x walltime). Usage decays, i.e. it is cut in half periodically, at an interval of 168 hours (one week).
Jobs queued in the qexp queue are not counted into the project's usage.

+

Calculated usage and fairshare priority can be seen at https://extranet.it4i.cz/anselm/projects.

+


The calculated fairshare priority can also be seen as the Resource_List.fairshare attribute of a job.

+

Eligible time

+

Eligible time is the amount (in seconds) of eligible time a job has accrued while waiting to run. Jobs with higher eligible time gain higher priority.

Eligible time has the least impact on execution priority. Eligible time is used for sorting jobs with equal queue priority and fairshare priority. It is very, very difficult for eligible time to compete with fairshare priority.

Eligible time can be seen as the eligible_time attribute of a job.

+

Formula

+

Job execution priority (the job sort formula) is computed as a combination of the queue priority, fairshare priority and eligible time described above.

Job backfilling

+

Anselm cluster uses job backfilling.

+

Backfilling means fitting smaller jobs around the higher-priority jobs that the scheduler is going to run next, in such a way that the higher-priority jobs are not delayed. Backfilling allows us to keep resources from becoming idle when the top job (job with the highest execution priority) cannot run.

+

The scheduler makes a list of jobs to run in order of execution priority. It then looks for smaller jobs that can fit into the usage gaps around the highest-priority jobs in the list, and chooses the highest-priority smaller jobs that fit. Filler jobs are run only if they will not delay the start time of top jobs.

+

This means that jobs with lower execution priority can be run before jobs with higher execution priority.

+

It is very beneficial to specify the walltime when submitting jobs.

+

Specifying a more accurate walltime enables better scheduling, better execution times and better resource usage. Jobs with a suitable (small) walltime could be backfilled - and overtake job(s) with higher priority.
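For example, a job submitted with a short, realistic walltime becomes a good backfill candidate; a sketch (the project ID and jobscript are placeholders):

$ qsub -A OPEN-0-0 -q qprod -l select=1:ncpus=16,walltime=00:30:00 ./myjob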

diff --git a/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/job-priority.md b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/job-priority.md

diff --git a/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/job-submission-and-execution.html b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/job-submission-and-execution.html

Job submission and execution

Job Submission

+

When allocating computational resources for the job, please specify

+
1. suitable queue for your job (default is qprod)
2. number of computational nodes required
3. number of cores per node required
4. maximum wall time allocated to your calculation; note that jobs exceeding the maximum wall time will be killed
5. Project ID
6. Jobscript or interactive switch

Use the qsub command to submit your job to a queue for allocation of the computational resources.

+

Submit the job using the qsub command:

+
$ qsub -A Project_ID -q queue -l select=x:ncpus=y,walltime=[[hh:]mm:]ss[.ms] jobscript
+

The qsub command submits the job into the queue; in other words, it creates a request to the PBS Job manager for allocation of the specified resources. The resources will be allocated when available, subject to the policies and constraints described above. After the resources are allocated, the jobscript or interactive shell is executed on the first of the allocated nodes.

+

Job Submission Examples

$ qsub -A OPEN-0-0 -q qprod -l select=64:ncpus=16,walltime=03:00:00 ./myjob

In this example, we allocate 64 nodes, 16 cores per node, for 3 hours. We allocate these resources via the qprod queue; consumed resources will be accounted to the Project identified by Project ID OPEN-0-0. The jobscript myjob will be executed on the first node in the allocation.

$ qsub -q qexp -l select=4:ncpus=16 -I

In this example, we allocate 4 nodes, 16 cores per node, for 1 hour. We allocate these resources via the qexp queue. The resources will be available interactively.

$ qsub -A OPEN-0-0 -q qnvidia -l select=10:ncpus=16 ./myjob

In this example, we allocate 10 Nvidia accelerated nodes, 16 cores per node, for 24 hours. We allocate these resources via the qnvidia queue. The jobscript myjob will be executed on the first node in the allocation.

$ qsub -A OPEN-0-0 -q qfree -l select=10:ncpus=16 ./myjob

In this example, we allocate 10 nodes, 16 cores per node, for 12 hours. We allocate these resources via the qfree queue. It is not required that the project OPEN-0-0 has any available resources left. Consumed resources are still accounted for. The jobscript myjob will be executed on the first node in the allocation.

All qsub options may be saved directly into the jobscript. In such a case, no options to qsub are needed.

$ qsub ./myjob

By default, the PBS batch system sends an e-mail only when the job is aborted. Disabling mail events completely can be done like this:

$ qsub -m n
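Conversely, PBS Professional can also send mail when the job begins and ends; a sketch, assuming you want notifications for both events (the address is a placeholder):

$ qsub -m be -M your.name@example.com ./myjob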

Advanced job placement

+

Placement by name

+

Specific nodes may be allocated via PBS:

+
qsub -A OPEN-0-0 -q qprod -l select=1:ncpus=16:host=cn171+1:ncpus=16:host=cn172 -I
+

In this example, we allocate nodes cn171 and cn172, all 16 cores per node, for 24 hours.  Consumed resources will be accounted to the Project identified by Project ID OPEN-0-0. The resources will be available interactively.

+

Placement by CPU type

+

Nodes equipped with the Intel Xeon E5-2665 CPU have a base clock frequency of 2.4 GHz, nodes equipped with the Intel Xeon E5-2470 CPU have a base frequency of 2.3 GHz (see the Compute Nodes section for details). Nodes may be selected via the PBS resource attribute cpu_freq.

CPU Type             base freq.   Nodes                    cpu_freq attribute
Intel Xeon E5-2665   2.4 GHz      cn[1-180], cn[208-209]   24
Intel Xeon E5-2470   2.3 GHz      cn[181-207]              23

$ qsub -A OPEN-0-0 -q qprod -l select=4:ncpus=16:cpu_freq=24 -I

In this example, we allocate 4 nodes, 16 cores per node, selecting only the nodes with the Intel Xeon E5-2665 CPU.
+

Placement by IB switch

+
Groups of computational nodes are connected to chassis integrated Infiniband switches. These switches form the leaf switch layer of the Infiniband  network fat tree topology. Nodes sharing the leaf switch can communicate most efficiently. Sharing the same switch prevents hops in the network and provides for unbiased, most efficient network communication.

+
Nodes sharing the same switch may be selected via the PBS resource attribute ibswitch. Values of this attribute are iswXX, where XX is the switch number. The node-switch mapping can be seen in the Hardware Overview section.

+
We recommend allocating compute nodes of a single switch when best possible computational network performance is required to run the job efficiently:
+
qsub -A OPEN-0-0 -q qprod -l select=18:ncpus=16:ibswitch=isw11 ./myjob
+
In this example, we request all 18 nodes sharing the isw11 switch for 24 hours. A full chassis will be allocated.
+

Advanced job handling

+

Selecting Turbo Boost off

+
Intel Turbo Boost Technology is on by default. We strongly recommend keeping the default. 
+
If necessary (such as in the case of benchmarking), you can disable Turbo Boost for all nodes of the job by using the PBS resource attribute cpu_turbo_boost:
+
+
+
$ qsub -A OPEN-0-0 -q qprod -l select=4:ncpus=16 -l cpu_turbo_boost=0 -I
+

More about the Intel Turbo Boost in the TurboBoost section

+

Advanced examples

+

In the following example, we select an allocation for benchmarking a very special and demanding MPI program. We request Turbo off, 2 full chassis of compute nodes (nodes sharing the same IB switches) for 30 minutes:

+
$ qsub -A OPEN-0-0 -q qprod \
-l select=18:ncpus=16:ibswitch=isw10:mpiprocs=1:ompthreads=16+18:ncpus=16:ibswitch=isw20:mpiprocs=16:ompthreads=1 \
-l cpu_turbo_boost=0,walltime=00:30:00 \
-N Benchmark ./mybenchmark
+

The MPI processes will be distributed differently on the nodes connected to the two switches. On the isw10 nodes, we will run 1 MPI process per node with 16 threads per process; on the isw20 nodes, we will run 16 plain MPI processes per node.

+

Although this example is somewhat artificial, it demonstrates the flexibility of the qsub command options.

+

Job Management

+

Check status of your jobs using the qstat and check-pbs-jobs commands

+
$ qstat -a
$ qstat -a -u username
$ qstat -an -u username
$ qstat -f 12345.srv11
+

Example:

+
$ qstat -a

srv11:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
16287.srv11     user1    qlong    job1         6183   4  64    --  144:0 R 38:25
16468.srv11     user1    qlong    job2         8060   4  64    --  144:0 R 17:44
16547.srv11     user2    qprod    job3x       13516   2  32    --  48:00 R 00:58

In this example user1 and user2 are running jobs named job1, job2 and job3x. The jobs job1 and job2 are using 4 nodes, 16 cores per node each. The job1 already runs for 38 hours and 25 minutes, job2 for 17 hours 44 minutes. The job1 already consumed 64*38.41 = 2458.6 core hours. The job3x already consumed 0.96*32 = 30.93 core hours. These consumed core hours will be accounted on the respective project accounts, regardless of whether the allocated cores were actually used for computations.

+

Check the status of your jobs using the check-pbs-jobs command. It checks the presence of the user's PBS job processes on the execution hosts, displays the load and processes, displays the job standard and error output, and can continuously display (tail -f) the job standard or error output.

+
$ check-pbs-jobs --check-all
$ check-pbs-jobs --print-load --print-processes
$ check-pbs-jobs --print-job-out --print-job-err

$ check-pbs-jobs --jobid JOBID --check-all --print-all

$ check-pbs-jobs --jobid JOBID --tailf-job-out
+

Examples:

+
$ check-pbs-jobs --check-all
JOB 35141.dm2, session_id 71995, user user2, nodes cn164,cn165
Check session id: OK
Check processes
cn164: OK
cn165: No process
+

In this example we see that job 35141.dm2 currently runs no process on allocated node cn165, which may indicate an execution error.

+
$ check-pbs-jobs --print-load --print-processes
JOB 35141.dm2, session_id 71995, user user2, nodes cn164,cn165
Print load
cn164: LOAD: 16.01, 16.01, 16.00
cn165: LOAD: 0.01, 0.00, 0.01
Print processes
%CPU CMD
cn164: 0.0 -bash
cn164: 0.0 /bin/bash /var/spool/PBS/mom_priv/jobs/35141.dm2.SC
cn164: 99.7 run-task
...
+

In this example we see that job 35141.dm2 currently runs process run-task on node cn164, using one thread only, while node cn165 is empty, which may indicate an execution error.

+
$ check-pbs-jobs --jobid 35141.dm2 --print-job-out
JOB 35141.dm2, session_id 71995, user user2, nodes cn164,cn165
Print job standard output:
======================== Job start ==========================
Started at    : Fri Aug 30 02:47:53 CEST 2013
Script name   : script
Run loop 1
Run loop 2
Run loop 3
+

In this example, we see actual output (some iteration loops) of the job 35141.dm2

+

Manage your queued or running jobs, using the qhold, qrls, qdel, qsig or qalter commands

+

You may release your allocation at any time, using qdel command

+
$ qdel 12345.srv11
+

You may kill a running job by force, using qsig command

+
$ qsig -s 9 12345.srv11
+

Learn more by reading the pbs man page

+
$ man pbs_professional
+

Job Execution

+

Jobscript

+

Prepare the jobscript to run batch jobs in the PBS queue system

+

The jobscript is a user-made script controlling the sequence of commands for executing the calculation. It is often written in bash; other scripting languages may be used as well. The jobscript is supplied to the PBS qsub command as an argument and executed by the PBS Professional workload manager.

+

The jobscript or interactive shell is executed on the first of the allocated nodes.

+
$ qsub -q qexp -l select=4:ncpus=16 -N Name0 ./myjob
$ qstat -n -u username

srv11:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
15209.srv11     username qexp     Name0        5530   4  64    --  01:00 R 00:00
   cn17/0*16+cn108/0*16+cn109/0*16+cn110/0*16
+

 In this example, the nodes cn17, cn108, cn109 and cn110 were allocated for 1 hour via the qexp queue. The jobscript myjob will be executed on the node cn17, while the nodes cn108, cn109 and cn110 are available for use as well.

+

The jobscript or interactive shell is by default executed in the home directory.

+
$ qsub -q qexp -l select=4:ncpus=16 -I
qsub: waiting for job 15210.srv11 to start
qsub: job 15210.srv11 ready

$ pwd
/home/username
+

In this example, 4 nodes were allocated interactively for 1 hour via the qexp queue. The interactive shell is executed in the home directory.

+

All nodes within the allocation may be accessed via ssh.  Unallocated nodes are not accessible to user.

+

The allocated nodes are accessible via ssh from login nodes. The nodes may access each other via ssh as well.

+

Calculations on the allocated nodes may be executed remotely via MPI, ssh, pdsh or clush. You may find out which nodes belong to the allocation by reading the $PBS_NODEFILE file.

+
qsub -q qexp -l select=4:ncpus=16 -I
qsub: waiting for job 15210.srv11 to start
qsub: job 15210.srv11 ready

$ pwd
/home/username

$ sort -u $PBS_NODEFILE
cn17.bullx
cn108.bullx
cn109.bullx
cn110.bullx

$ pdsh -w cn17,cn[108-110] hostname
cn17: cn17
cn108: cn108
cn109: cn109
cn110: cn110
+

In this example, the hostname program is executed via pdsh from the interactive shell. The execution runs on all four allocated nodes. The same result would be achieved if the pdsh is called from any of the allocated nodes or from the login nodes.

+
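Besides pdsh, the allocated nodes listed in $PBS_NODEFILE can be visited with a plain ssh loop as well; a minimal sketch run from the interactive shell above:

$ for node in $(sort -u $PBS_NODEFILE); do ssh $node hostname; done
cn17
cn108
cn109
cn110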

Example Jobscript for MPI Calculation

+

Production jobs must use the /scratch directory for I/O

+

The recommended way to run production jobs is to change to the /scratch directory early in the jobscript, copy all inputs to /scratch, execute the calculations and copy the outputs back to the home directory.

+
#!/bin/bash

# change to scratch directory, exit on failure
SCRDIR=/scratch/$USER/myjob
mkdir -p $SCRDIR
cd $SCRDIR || exit

# copy input file to scratch
cp $PBS_O_WORKDIR/input .
cp $PBS_O_WORKDIR/mympiprog.x .

# load the mpi module
module load openmpi

# execute the calculation
mpiexec -pernode ./mympiprog.x

# copy output file to home
cp output $PBS_O_WORKDIR/.

#exit
exit
+

In this example, some directory on /home holds the input file input and the executable mympiprog.x. We create a directory myjob on the /scratch filesystem, copy the input and executable files from the /home directory where the qsub was invoked ($PBS_O_WORKDIR) to /scratch, execute the MPI program mympiprog.x and copy the output file back to the /home directory. The mympiprog.x is executed as one process per node, on all allocated nodes.

+

Consider preloading inputs and executables onto shared scratch before the calculation starts.

+

In some cases, it may be impractical to copy the inputs to scratch and the outputs to home. This is especially true when very large input and output files are expected, or when the files should be reused by a subsequent calculation. In such a case, it is the user's responsibility to preload the input files on shared /scratch before the job submission and to retrieve the outputs manually after all calculations are finished.

+
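A minimal sketch of such preloading, performed from a login node before submitting the job (the paths and filenames follow the example above):

$ mkdir -p /scratch/$USER/myjob
$ cp input mympiprog.x /scratch/$USER/myjob/
$ qsub ./jobscript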

Store the qsub options within the jobscript.
Use mpiprocs and ompthreads qsub options to control the MPI job execution.

+

Example jobscript for an MPI job with preloaded inputs and executables; the options for qsub are stored within the script:

+
#!/bin/bash
#PBS -q qprod
#PBS -N MYJOB
#PBS -l select=100:ncpus=16:mpiprocs=1:ompthreads=16
#PBS -A OPEN-0-0

# change to scratch directory, exit on failure
SCRDIR=/scratch/$USER/myjob
cd $SCRDIR || exit


# load the mpi module
module load openmpi

# execute the calculation
mpiexec ./mympiprog.x


#exit
exit
+

In this example, input and executable files are assumed preloaded manually in /scratch/$USER/myjob directory. Note the mpiprocs and ompthreads qsub options, controlling behavior of the MPI execution. The mympiprog.x is executed as one process per node, on all 100 allocated nodes. If mympiprog.x implements OpenMP threads, it will run 16 threads per node.

+

More information is found in the Running OpenMPI and Running MPICH2 sections.

+

Example Jobscript for Single Node Calculation

+

Local scratch directory is often useful for single node jobs. Local scratch will be deleted immediately after the job ends.

+

Example jobscript for single node calculation, using local scratch on the node:

+
#!/bin/bash

# change to local scratch directory
cd /lscratch/$PBS_JOBID || exit

# copy input file to scratch
cp $PBS_O_WORKDIR/input .
cp $PBS_O_WORKDIR/myprog.x .

# execute the calculation
./myprog.x

# copy output file to home
cp output $PBS_O_WORKDIR/.

#exit
exit
+

In this example, some directory on /home holds the input file input and the executable myprog.x. We copy the input and executable files from the home directory where the qsub was invoked ($PBS_O_WORKDIR) to the local scratch directory /lscratch/$PBS_JOBID, execute myprog.x and copy the output file back to the /home directory. The myprog.x runs on one node only and may use threads.

+

Other Jobscript Examples

+

Further jobscript examples may be found in the Software section and the Capacity computing section.

+

 

diff --git a/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/job-submission-and-execution.md b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/job-submission-and-execution.md

diff --git a/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/resources-allocation-policy.html b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/resources-allocation-policy.html

Resources Allocation Policy

+

The resources are allocated to the job in a fairshare fashion, subject to constraints set by the queue and the resources available to the Project. Fairshare at Anselm ensures that individual users may consume approximately equal amounts of resources per week. Detailed information can be found in the Job scheduling section. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following table provides the queue partitioning overview:

queue                  active project   project resources   nodes                                 min ncpus*   priority   authorization   walltime (default/max)
qexp                   no               none required       2 reserved, 31 total (including       1            150        no              1h
 (Express queue)                                            MIC, GPU and FAT nodes)
qprod                  yes              > 0                 178 nodes w/o accelerator             16           0          no              24/48h
 (Production queue)
qlong                  yes              > 0                 60 nodes w/o accelerator              16           0          no              72/144h
 (Long queue)
qnvidia, qmic, qfat    yes              > 0                 23 total qnvidia, 4 total qmic,       16           200        yes             24/48h
 (Dedicated queues)                                         2 total qfat
qfree                  yes              none required       178 w/o accelerator                   16           -1024      no              12h
 (Free resource queue)

The qfree queue is not free of charge. Normal accounting applies. However, it allows for utilization of free resources, once a Project has exhausted all of its allocated computational resources. This does not apply to Director's Discretion projects (DD projects) by default. Usage of qfree after exhaustion of a DD project's computational resources is allowed only after requesting access to this queue.

+

The qexp queue is equipped with nodes that do not all have the same CPU clock speed. Should you need the very same CPU speed, you have to select the proper nodes during PBS job submission.

+
• qexp, the Express queue: This queue is dedicated to testing and running very small jobs. It is not required to specify a project to enter the qexp. There are 2 nodes always reserved for this queue (w/o accelerator); a maximum of 8 nodes are available via the qexp for a particular user, from a pool of nodes containing Nvidia accelerated nodes (cn181-203), MIC accelerated nodes (cn204-207) and Fat nodes with 512GB RAM (cn208-209). This makes it possible to test and tune accelerated code or code with higher RAM requirements as well. The nodes may be allocated on a per-core basis. No special authorization is required to use the queue. The maximum runtime in qexp is 1 hour.
• qprod, the Production queue: This queue is intended for normal production runs. It is required that an active project with nonzero remaining resources is specified to enter the qprod. All nodes may be accessed via the qprod queue, except the reserved ones. 178 nodes without accelerator are included. Full nodes, 16 cores per node, are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qprod is 48 hours.
• qlong, the Long queue: This queue is intended for long production runs. It is required that an active project with nonzero remaining resources is specified to enter the qlong. Only 60 nodes without acceleration may be accessed via the qlong queue. Full nodes, 16 cores per node, are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qlong is 144 hours (three times the standard qprod time, 3 x 48 h).
• qnvidia, qmic, qfat, the Dedicated queues: The queue qnvidia is dedicated to accessing the Nvidia accelerated nodes, the qmic to accessing MIC nodes and qfat the Fat nodes. It is required that an active project with nonzero remaining resources is specified to enter these queues. 23 nvidia, 4 mic and 2 fat nodes are included. Full nodes, 16 cores per node, are allocated. The queues run with very high priority; the jobs will be scheduled before the jobs coming from the qexp queue. A PI needs to explicitly ask support for authorization to enter the dedicated queues for all users associated with her/his Project.
• qfree, the Free resource queue: The queue qfree is intended for utilization of free resources, after a Project has exhausted all of its allocated computational resources (this does not apply to DD projects by default; DD projects have to request permission to use qfree after exhaustion of their computational resources). It is required that an active project is specified to enter the queue, however no remaining resources are required. Consumed resources will be accounted to the Project. Only 178 nodes without accelerator may be accessed from this queue. Full nodes, 16 cores per node, are allocated. The queue runs with very low priority and no special authorization is required to use it. The maximum runtime in qfree is 12 hours.

Notes

+

The job wall clock time defaults to half the maximum time, see the table above. Longer wall time limits can be set manually, see the examples.

Jobs that exceed the reserved wall clock time (Req'd Time) get killed automatically. The wall clock time limit can be changed for queued jobs (state Q) using the qalter command; however, it cannot be changed for a running job (state R).

+
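For instance, to raise the wall clock limit of a job that is still queued (state Q), something like the following sketch should work; the job ID is a placeholder:

$ qalter -l walltime=48:00:00 12345.srv11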

Anselm users may check current queue configuration at https://extranet.it4i.cz/anselm/queues.

+

Queue status

+

Check the status of jobs, queues and compute nodes at https://extranet.it4i.cz/anselm/

+

rspbs web interface

+

Display the queue status on Anselm:

+
$ qstat -q
+

The PBS allocation overview may be obtained also using the rspbs command.

+
$ rspbs
Usage: rspbs [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  --get-node-ncpu-chart
                        Print chart of allocated ncpus per node
  --summary             Print summary
  --get-server-details  Print server
  --get-queues          Print queues
  --get-queues-details  Print queues details
  --get-reservations    Print reservations
  --get-reservations-details
                        Print reservations details
  --get-nodes           Print nodes of PBS complex
  --get-nodeset         Print nodeset of PBS complex
  --get-nodes-details   Print nodes details
  --get-jobs            Print jobs
  --get-jobs-details    Print jobs details
  --get-jobs-check-params
                        Print jobid, job state, session_id, user, nodes
  --get-users           Print users of jobs
  --get-allocated-nodes
                        Print allocated nodes of jobs
  --get-allocated-nodeset
                        Print allocated nodeset of jobs
  --get-node-users      Print node users
  --get-node-jobs       Print node jobs
  --get-node-ncpus      Print number of ncpus per node
  --get-node-allocated-ncpus
                        Print number of allocated ncpus per node
  --get-node-qlist      Print node qlist
  --get-node-ibswitch   Print node ibswitch
  --get-user-nodes      Print user nodes
  --get-user-nodeset    Print user nodeset
  --get-user-jobs       Print user jobs
  --get-user-jobc       Print number of jobs per user
  --get-user-nodec      Print number of allocated nodes per user
  --get-user-ncpus      Print number of allocated ncpus per user
  --get-qlist-nodes     Print qlist nodes
  --get-qlist-nodeset   Print qlist nodeset
  --get-ibswitch-nodes  Print ibswitch nodes
  --get-ibswitch-nodeset
                        Print ibswitch nodeset
  --state=STATE         Only for given job state
  --jobid=JOBID         Only for given job ID
  --user=USER           Only for given user
  --node=NODE           Only for given node
  --nodestate=NODESTATE
                        Only for given node state (affects only --get-node*
                        --get-qlist-* --get-ibswitch-* actions)
  --incl-finished       Include finished jobs

+

Resources Accounting Policy

+

The Core-Hour

+

The resources that are currently subject to accounting are the core-hours. The core-hours are accounted on the wall clock basis. The accounting runs whenever the computational cores are allocated or blocked via the PBS Pro workload manager (the qsub command), regardless of whether the cores are actually used for any calculation. 1 core-hour is defined as 1 processor core allocated for 1 hour of wall clock time. Allocating a full node (16 cores) for 1 hour accounts to 16 core-hours. See example in the  Job submission and execution section.

+
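As a worked example, the 64-node, 3-hour qprod allocation shown in the Job submission and execution section accounts for 64 nodes x 16 cores x 3 hours = 3072 core-hours.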

Check consumed resources

+

The it4ifree command is a part of it4i.portal.clients package, located here:
https://pypi.python.org/pypi/it4i.portal.clients

+

Users may check at any time how many core-hours have been consumed by themselves and their projects. The command is available on the clusters' login nodes.

+
$ it4ifree
Password:
     PID    Total    Used  ...by me     Free
 -------- -------- ------- --------- --------
 OPEN-0-0  1500000  400644    225265  1099356
 DD-13-1     10000    2606      2606     7394
+

 

diff --git a/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/resources-allocation-policy.md b/docs.it4i.cz/anselm-cluster-documentation/resource-allocation-and-job-execution/resources-allocation-policy.md

diff --git a/docs.it4i.cz/anselm-cluster-documentation/software.1.html b/docs.it4i.cz/anselm-cluster-documentation/software.1.html

Software

In this section we provide an overview of the installed software and its usage.
• ANSYS: An engineering simulation software
• COMSOL: A finite element analysis, solver and simulation software
• Debuggers and profilers: A collection of development tools
• Chemistry and Materials science: Tools for computational chemistry
• Intel Parallel studio: The Intel Parallel Studio XE
• MPI: Message Passing Interface libraries on ANSELM
• Numerical Libraries: Libraries for numerical computations
• Numerical languages: Interpreted languages for numerical computations
• Virtualization
• Compilers: Available compilers, including GNU, INTEL and UPC compilers
• Intel Xeon Phi: A guide to Intel Xeon Phi usage
• ISV Licenses: A guide to managing Independent Software Vendor licences
• Java: Java on ANSELM
• nVidia CUDA: A guide to nVidia CUDA programming and GPU usage
• OMICS Master
• OpenFOAM: A free, open source CFD software package
• Operating System: The operating system, deployed on ANSELM
• ParaView: An open-source, multi-platform data analysis and visualization application
• GPI-2: A library that implements the GASPI specification
diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/anselm-cluster-documentation/software/mpi-1/running-mpich2.html b/docs.it4i.cz/anselm-cluster-documentation/software/anselm-cluster-documentation/software/mpi-1/running-mpich2.html

Running MPICH2

MPICH2 program execution

+

The MPICH2 programs use an mpd daemon or an ssh connection to spawn processes; no PBS support is needed. However, a PBS allocation is required to access the compute nodes. On Anselm, Intel MPI and mpich2 1.9 are the MPICH2-based MPI implementations.

+

Basic usage

+

Use the mpirun to execute the MPICH2 code.

+

Example:

+
$ qsub -q qexp -l select=4:ncpus=16 -I
qsub: waiting for job 15210.srv11 to start
qsub: job 15210.srv11 ready

$ module load impi

$ mpirun -ppn 1 -hostfile $PBS_NODEFILE ./helloworld_mpi.x
Hello world! from rank 0 of 4 on host cn17
Hello world! from rank 1 of 4 on host cn108
Hello world! from rank 2 of 4 on host cn109
Hello world! from rank 3 of 4 on host cn110
+

In this example, we allocate 4 nodes via the express queue interactively. We set up the intel MPI environment and interactively run the helloworld_mpi.x program. We request MPI to spawn 1 process per node.
Note that the executable helloworld_mpi.x must be available within the same path on all nodes. This is automatically fulfilled on the /home and /scratch filesystem.

+

You need to preload the executable, if running on the local scratch /lscratch filesystem

+
$ pwd
/lscratch/15210.srv11
$ mpirun -ppn 1 -hostfile $PBS_NODEFILE cp /home/username/helloworld_mpi.x .
$ mpirun -ppn 1 -hostfile $PBS_NODEFILE ./helloworld_mpi.x
Hello world! from rank 0 of 4 on host cn17
Hello world! from rank 1 of 4 on host cn108
Hello world! from rank 2 of 4 on host cn109
Hello world! from rank 3 of 4 on host cn110
+

In this example, we assume the executable helloworld_mpi.x is present in the shared home directory. We run the cp command via mpirun, copying the executable from the shared home to the local scratch. The second mpirun will execute the binary in the /lscratch/15210.srv11 directory on nodes cn17, cn108, cn109 and cn110, one process per node.

+

MPI process mapping may be controlled by PBS parameters.

+

The mpiprocs and ompthreads parameters allow for selection of number of running MPI processes per node as well as number of OpenMP threads per MPI process.

+

One MPI process per node

+

Follow this example to run one MPI process per node, 16 threads per process. Note that no options to mpirun are needed

+
$ qsub -q qexp -l select=4:ncpus=16:mpiprocs=1:ompthreads=16 -I

$ module load mvapich2

$ mpirun ./helloworld_mpi.x
+

In this example, we demonstrate the recommended way to run an MPI application, using 1 MPI process per node and 16 threads per process, on 4 nodes.

+

Two MPI processes per node

+

Follow this example to run two MPI processes per node, 8 threads per process. Note the options to mpirun for mvapich2. No options are needed for impi.

+
$ qsub -q qexp -l select=4:ncpus=16:mpiprocs=2:ompthreads=8 -I

$ module load mvapich2

$ mpirun -bind-to numa ./helloworld_mpi.x
+

In this example, we demonstrate recommended way to run an MPI application, using 2 MPI processes per node and 8 threads per socket, each process and its threads bound to a separate processor socket of the node, on 4 nodes

+

16 MPI processes per node

+

Follow this example to run 16 MPI processes per node, 1 thread per process. Note the options to mpirun for mvapich2. No options are needed for impi.

+
$ qsub -q qexp -l select=4:ncpus=16:mpiprocs=16:ompthreads=1 -I

$ module load mvapich2

$ mpirun -bind-to core ./helloworld_mpi.x
+

In this example, we demonstrate recommended way to run an MPI application, using 16 MPI processes per node, single threaded. Each process is bound to separate processor core, on 4 nodes.

+

OpenMP thread affinity

+

Important!  Bind every OpenMP thread to a core!

+

In the previous two examples with one or two MPI processes per node, the operating system might still migrate OpenMP threads between cores. You might want to avoid this by setting this environment variable for GCC OpenMP:

+
$ export GOMP_CPU_AFFINITY="0-15"
+

or this one for Intel OpenMP:

+
$ export KMP_AFFINITY=granularity=fine,compact,1,0
+

As of OpenMP 4.0 (supported by GCC 4.9 and later and Intel 14.0 and later) the following variables may be used for Intel or GCC:

+
$ export OMP_PROC_BIND=true
$ export OMP_PLACES=cores
+

 

+

MPICH2 Process Mapping and Binding

+

The mpirun allows for precise selection of how the MPI processes will be mapped to the computational nodes and how these processes will bind to particular processor sockets and cores.

+

Machinefile

+

Process mapping may be controlled by specifying a machinefile input to the mpirun program. Although all implementations of MPI provide means for process mapping and binding, the following examples are valid for impi and mvapich2 only.

+

Example machinefile

+
cn110.bullx
cn109.bullx
cn108.bullx
cn17.bullx
cn108.bullx
+

Use the machinefile to control process placement

+
$ mpirun -machinefile machinefile helloworld_mpi.x
Hello world! from rank 0 of 5 on host cn110
Hello world! from rank 1 of 5 on host cn109
Hello world! from rank 2 of 5 on host cn108
Hello world! from rank 3 of 5 on host cn17
Hello world! from rank 4 of 5 on host cn108
+

In this example, we see that ranks have been mapped on nodes according to the order in which nodes show in the machinefile

+

Process Binding

+

The Intel MPI library automatically binds each process and its threads to the corresponding portion of cores on the processor socket of the node; no options are needed. The binding is primarily controlled by environment variables. Read more about MPI process binding on the Intel website. MPICH2 uses the -bind-to option: use -bind-to core or -bind-to numa to bind each process to a single core or to an entire processor socket, respectively.

+

Bindings verification

+

In all cases, binding and threading may be verified by executing

+
$ mpirun -bind-to numa numactl --show
$ mpirun -bind-to numa echo $OMP_NUM_THREADS
+

Intel MPI on Xeon Phi

+

The MPI section of Intel Xeon Phi chapter provides details on how to run Intel MPI code on Xeon Phi architecture.

diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/anselm-cluster-documentation/software/mpi-1/running-mpich2.md b/docs.it4i.cz/anselm-cluster-documentation/software/anselm-cluster-documentation/software/mpi-1/running-mpich2.md

diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys.html b/docs.it4i.cz/anselm-cluster-documentation/software/ansys.html

+ Overview of ANSYS Products +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

SVS FEM, as the ANSYS Channel partner for the Czech Republic, provided all ANSYS licenses for the Anselm cluster and provides support for all ANSYS products (Multiphysics, Mechanical, MAPDL, CFX, Fluent, Maxwell, LS-DYNA, ...) to IT staff and ANSYS users. If you encounter a problem with ANSYS functionality, please contact 

+

Anselm provides both commercial and academic variants. Academic variants are distinguished by the word "Academic..." in the license name or by the two-letter prefix "aa_" in the license feature name. The license is chosen on the command line or directly in the user's PBS file (see the individual products). More about licensing here

+

To load the latest version of any ANSYS product (Mechanical, Fluent, CFX, MAPDL,...) load the module:

+
$ module load ansys
+
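
If a particular version is required, the installed versions may first be listed with the standard modules command (the list shown depends on the current installation):

+
$ module avail ansys
+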

ANSYS supports interactive use, but because the cluster is intended for the solution of extremely demanding tasks, interactive work is not recommended.

+

If a user needs to work interactively, we recommend configuring the RSM service on the client machine, which allows submitting the solution to Anselm directly from the client's Workbench project (see ANSYS RSM service).

+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys.md b/docs.it4i.cz/anselm-cluster-documentation/software/ansys.md index 7709b2a163729fef2ebdde9da957f11fbf4da638..b8d91d70ebb3f2b44c858215c70ac5e30ad3bea5 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/software/ansys.md +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys.md @@ -32,4 +32,3 @@ solution to the Anselm directly from the client's Workbench project (see ANSYS RSM service). - diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-cfx-pbs-file/view.html b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-cfx-pbs-file/view.html new file mode 100644 index 0000000000000000000000000000000000000000..9c3e88cfb0d3f0163be6886ce3de1d4f5953ecd9 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-cfx-pbs-file/view.html @@ -0,0 +1,1170 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +ANSYS CFX PBS file — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ ANSYS CFX PBS file +

+ + + + +
+
+ + + + + + + + + + + +
+
+ + +

cfx.pbs (shell script, 1 KB, 1026 bytes)

+ + + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-cfx.html b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-cfx.html new file mode 100644 index 0000000000000000000000000000000000000000..92c62d3d55d3624cfede64a00e5704d87858b564 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-cfx.html @@ -0,0 +1,1184 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +ANSYS CFX — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ ANSYS CFX +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

ANSYS CFX software is a high-performance, general purpose fluid dynamics program that has been applied to solve wide-ranging fluid flow problems for over 20 years. At the heart of ANSYS CFX is its advanced solver technology, the key to achieving reliable and accurate solutions quickly and robustly. The modern, highly parallelized solver is the foundation for an abundant choice of physical models to capture virtually any type of phenomena related to fluid flow. The solver and its many physical models are wrapped in a modern, intuitive, and flexible GUI and user environment, with extensive capabilities for customization and automation using session files, scripting and a powerful expression language.

+

To run ANSYS CFX in batch mode you can utilize/modify the default cfx.pbs script and execute it via the qsub command.

+
#!/bin/bash
#PBS -l nodes=2:ppn=16
#PBS -q qprod
#PBS -N $USER-CFX-Project
#PBS -A XX-YY-ZZ

#! Mail to user when the job terminates or aborts
#PBS -m ae

#! Change the working directory (default is the home directory)
#cd <working directory> (working directory must exist)
WORK_DIR="/scratch/$USER/work"
cd $WORK_DIR

echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo This jobs runs on the following processors:
echo `cat $PBS_NODEFILE`

module load ansys

#### Set number of processors per host listing
#### (set to 1 as $PBS_NODEFILE lists each node twice if :ppn=2)
procs_per_host=1
#### Create host list
hl=""
for host in `cat $PBS_NODEFILE`
do
if [ "$hl" = "" ]
then hl="$host:$procs_per_host"
else hl="${hl}:$host:$procs_per_host"
fi
done

echo Machines: $hl

#-def input.def is the input of the CFX analysis in DEF format
#-P the name of the preferred license feature (aa_r=ANSYS Academic Research, ane3fl=Multiphysics (commercial))
/ansys_inc/v145/CFX/bin/cfx5solve -def input.def -size 4 -size-ni 4x -part-large -start-method "Platform MPI Distributed Parallel" -par-dist $hl -P aa_r
+
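
Assuming the script above is saved as cfx.pbs in the working directory, it is submitted to the queue as follows:

+
$ qsub cfx.pbs
+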

The header of the PBS file (above) is common, and its description can be found on this site. SVS FEM recommends requesting resources with the keywords nodes and ppn. These keywords directly specify the number of nodes (computers) and the number of cores per node (ppn) that will be utilized in the job. The rest of the script also assumes this structure of allocated resources.

+

The working directory has to be created before the PBS job is submitted into the queue. The input file should be in the working directory, or the full path to the input file has to be specified. The input has to be a common CFX .def file, which is passed to the CFX solver via the parameter -def

+

The license should be selected by the parameter -P (capital letter P). The licensed products are the following: aa_r (ANSYS Academic Research) and ane3fl (ANSYS Multiphysics, commercial).
More about licensing here

+

 

+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-cfx.md b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-cfx.md index fa6e2d1facdcf1c075c828b0217e84d3bf15ee12..e12ff157564f5d83c1218134cba586923268ff63 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-cfx.md +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-cfx.md @@ -54,7 +54,7 @@ do fi done -echo Machines$hl +echo Machines: $hl #-dev input.def includes the input of CFX analysis in DEF format #-P the name of prefered license feature (aa_r=ANSYS Academic Research, ane3fl=Multiphysics(commercial)) @@ -64,7 +64,7 @@ echo Machines$hl Header of the pbs file (above) is common and description can be find on [this site](../../resource-allocation-and-job-execution/job-submission-and-execution.html). -SVS FEM recommends to utilize sources by keywordsnodes, ppn. These +SVS FEM recommends to utilize sources by keywords: nodes, ppn. These keywords allows to address directly the number of nodes (computers) and cores (ppn) which will be utilized in the job. Also the rest of code assumes such structure of allocated resources. @@ -76,7 +76,7 @@ CFX def file which is attached to the cfx solver via parameter -def **License** should be selected by parameter -P (Big letter **P**). -Licensed products are the followingaa_r +Licensed products are the following: aa_r (ANSYS **Academic** Research), ane3fl (ANSYS Multiphysics)-**Commercial.** [More @@ -84,3 +84,7 @@ Multiphysics)-**Commercial.** class="hps">here](licensing.html)   + + + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-fluent-pbs-file/view.html b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-fluent-pbs-file/view.html new file mode 100644 index 0000000000000000000000000000000000000000..fc0721ffe30affd46586478d0d9a5b5ab9551f5d --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-fluent-pbs-file/view.html @@ -0,0 +1,1170 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +ANSYS Fluent PBS file — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ ANSYS Fluent PBS file +

+ + + + +
+
+ + + + + + + + + + + +
+
+ + +

fluent.pbs (shell script, 1 KB)

+ + + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-fluent.html b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-fluent.html new file mode 100644 index 0000000000000000000000000000000000000000..f2fe17ac24746ccc279d26e2602223e15214cdb7 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-fluent.html @@ -0,0 +1,1268 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +ANSYS Fluent — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ ANSYS Fluent +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

ANSYS Fluent software contains the broad physical modeling capabilities needed to model flow, turbulence, heat transfer, and reactions for industrial applications ranging from air flow over an aircraft wing to combustion in a furnace, from bubble columns to oil platforms, from blood flow to semiconductor manufacturing, and from clean room design to wastewater treatment plants. Special models that give the software the ability to model in-cylinder combustion, aeroacoustics, turbomachinery, and multiphase systems have served to broaden its reach.

+

1. Common way to run Fluent via a PBS file

+

To run ANSYS Fluent in batch mode you can utilize/modify the default fluent.pbs script and execute it via the qsub command.

+
#!/bin/bash
#PBS -S /bin/bash
#PBS -l nodes=2:ppn=16
#PBS -q qprod
#PBS -N $USER-Fluent-Project
#PBS -A XX-YY-ZZ

#! Mail to user when the job terminates or aborts
#PBS -m ae

#! Change the working directory (default is the home directory)
#cd <working directory> (working directory must exist)
WORK_DIR="/scratch/$USER/work"
cd $WORK_DIR

echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo This jobs runs on the following processors:
echo `cat $PBS_NODEFILE`

#### Load the ansys module so that we find the fluent command
module load ansys

# Count the total number of allocated cores from the PBS node file
NCORES=`wc -l $PBS_NODEFILE |awk '{print $1}'`

/ansys_inc/v145/fluent/bin/fluent 3d -t$NCORES -cnf=$PBS_NODEFILE -g -i fluent.jou
+

The header of the PBS file (above) is common, and its description can be found on this site. SVS FEM recommends requesting resources with the keywords nodes and ppn. These keywords directly specify the number of nodes (computers) and the number of cores per node (ppn) that will be utilized in the job. The rest of the script also assumes this structure of allocated resources.

+

The working directory has to be created before the PBS job is submitted into the queue. The input file should be in the working directory, or the full path to the input file has to be specified. The input has to be a common Fluent journal file, which is passed to the Fluent solver via the parameter -i fluent.jou

+

The journal file, which defines the input geometry and boundary conditions and prescribes the solution process, may have, for example, the following structure:

+
/file/read-case aircraft_2m.cas.gz
/solve/init
init
/solve/iterate
10
/file/write-case-dat aircraft_2m-solution
/exit yes
+

The appropriate dimension of the problem has to be set by parameter (2d/3d). 
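
For instance, a two-dimensional case would use the 2d argument in place of 3d in the solver line of the script above (a sketch only; all other arguments stay as in the script):

+
/ansys_inc/v145/fluent/bin/fluent 2d -t$NCORES -cnf=$PBS_NODEFILE -g -i fluent.jou
+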

+

2. Fast way to run Fluent from command line

+
+
fluent solver_version [FLUENT_options] -i journal_file -pbs
+
+

This syntax will start the ANSYS FLUENT job under PBS Professional using the qsub command in a batch manner. When resources are available, PBS Professional will start the job and return a job ID, usually in the form of job_ID.hostname. This job ID can then be used to query, control, or stop the job using standard PBS Professional commands, such as qstat or qdel. The job will be run out of the current working directory, and all output will be written to the file fluent.o<job_ID>.
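
A concrete instance of this syntax might look as follows (the solver version and journal file name are illustrative):

+
$ fluent 3d -g -i fluent.jou -pbs
+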

+

3. Running Fluent via user's config file

+
+

The sample script uses a configuration file called pbs_fluent.conf  if no command line arguments are present. This configuration file should be present in the directory from which the jobs are submitted (which is also the directory in which the jobs are executed). The following is an example of what the content of pbs_fluent.conf can be:

+
input="example_small.flin"
case="Small-1.65m.cas"
fluent_args="3d -pmyrinet"
outfile="fluent_test.out"
mpp="true"
+

The following is an explanation of the parameters:

+
+
+

input is the name of the input file.

+

case is the name of the .cas file that the input file will utilize.

+

fluent_args are extra ANSYS FLUENT arguments. As shown in the previous example, you can specify the interconnect by using the -p interconnect command. The available interconnects include ethernet (the default), myrinet, infiniband, vendor, altix, and crayx. The MPI is selected automatically, based on the specified interconnect.

+

outfile is the name of the file to which the standard output will be sent.

+

mpp="true" will tell the job script to execute the job across multiple processors.               

+
+
+
+
+
+

To run ANSYS Fluent in batch mode with the user's config file, you can utilize/modify the following script and execute it via the qsub command.

+
+
#!/bin/sh
#PBS -l nodes=2:ppn=4
#PBS -q qprod
#PBS -N $USER-Fluent-Project
#PBS -A XX-YY-ZZ

cd $PBS_O_WORKDIR

#We assume that if they didn't specify arguments then they should use the
#config file
if [ "xx${input}${case}${mpp}${fluent_args}zz" = "xxzz" ]; then
  if [ -f pbs_fluent.conf ]; then
    . pbs_fluent.conf
  else
    printf "No command line arguments specified, "
    printf "and no configuration file found. Exiting \n"
  fi
fi

#Augment the ANSYS FLUENT command line arguments
case "$mpp" in
  true)
    #MPI job execution scenario
    num_nodes=`cat $PBS_NODEFILE | sort -u | wc -l`
    cpus=`expr $num_nodes \* $NCPUS`
    #Default arguments for mpp jobs, these should be changed to suit your
    #needs.
    fluent_args="-t${cpus} $fluent_args -cnf=$PBS_NODEFILE"
    ;;
  *)
    #SMP case
    #Default arguments for smp jobs, should be adjusted to suit your
    #needs.
    fluent_args="-t$NCPUS $fluent_args"
    ;;
esac
#Default arguments for all jobs
fluent_args="-ssh -g -i $input $fluent_args"

echo "---------- Going to start a fluent job with the following settings:
 Input: $input
 Case: $case
 Output: $outfile
 Fluent arguments: $fluent_args"

#run the solver
/ansys_inc/v145/fluent/bin/fluent $fluent_args > $outfile
+
+

It runs the jobs out of the directory from which they are submitted (PBS_O_WORKDIR).

+

4. Running Fluent in parallel

+

Fluent can be run in parallel only under the Academic Research license. To do so, the ANSYS Academic Research license must be placed before the ANSYS CFD license in the user preferences. To make this change, the anslic_admin utility should be run:

+
/ansys_inc/shared_les/licensing/lic_admin/anslic_admin
+

The ANSLIC_ADMIN utility will start.

+

+

+

+

 

+

ANSYS Academic Research license should be moved up to the top of the list.

+

 

+

+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-fluent.md b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-fluent.md index b30190d274b0f61494779369e353c481e9979115..e43a8033c561f08c767efbfee7a909de077b08a5 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-fluent.md +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-fluent.md @@ -53,7 +53,7 @@ Header of the pbs file (above) is common and description can be find on [this site](../../resource-allocation-and-job-execution/job-submission-and-execution.html). [SVS FEM](http://www.svsfem.cz) recommends to utilize -sources by keywordsnodes, ppn. These keywords allows to address +sources by keywords: nodes, ppn. These keywords allows to address directly the number of nodes (computers) and cores (ppn) which will be utilized in the job. Also the rest of code assumes such structure of allocated resources. @@ -193,10 +193,10 @@ command. fluent_args="-ssh -g -i $input $fluent_args" echo "---------- Going to start a fluent job with the following settings: - Input$input - Case$case - Output$outfile - Fluent arguments$fluent_args" + Input: $input + Case: $case + Output: $outfile + Fluent arguments: $fluent_args" #run the solver /ansys_inc/v145/fluent/bin/fluent $fluent_args > $outfile @@ -235,3 +235,7 @@ list.   ![](Fluent_Licence_4.jpg) + + + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-ls-dyna-pbs-file/view.html b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-ls-dyna-pbs-file/view.html new file mode 100644 index 0000000000000000000000000000000000000000..78d790b6bd70f7305062c6f7162a34cbcfddc6b3 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-ls-dyna-pbs-file/view.html @@ -0,0 +1,1170 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +ANSYS LS-DYNA PBS file — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ ANSYS LS-DYNA PBS file +

+ + + + +
+
+ + + + + + + + + + + +
+
+ + +

ansysdyna.pbs (shell script, 1 KB)

+ + + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-ls-dyna.html b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-ls-dyna.html new file mode 100644 index 0000000000000000000000000000000000000000..b0a3bf535606b44c0358732c4e4b4af66bd18f6a --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-ls-dyna.html @@ -0,0 +1,1183 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +ANSYS LS-DYNA — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ ANSYS LS-DYNA +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

ANSYS LS-DYNA software provides convenient and easy-to-use access to the technology-rich, time-tested explicit solver without the need to contend with the complex input requirements of this sophisticated program. Introduced in 1996, ANSYS LS-DYNA capabilities have helped customers in numerous industries to resolve highly intricate design issues. ANSYS Mechanical users have long been able to take advantage of complex explicit solutions using the traditional ANSYS Parametric Design Language (APDL) environment. These explicit capabilities are available to ANSYS Workbench users as well. The Workbench platform is a powerful, comprehensive, easy-to-use environment for engineering simulation. CAD import from all sources, geometry cleanup, automatic meshing, solution, parametric optimization, result visualization, and comprehensive report generation are all available within a single, fully interactive, modern graphical user environment.

+

To run ANSYS LS-DYNA in batch mode you can utilize/modify the default ansysdyna.pbs script and execute it via the qsub command.

+
#!/bin/bash
#PBS -l nodes=2:ppn=16
#PBS -q qprod
#PBS -N $USER-DYNA-Project
#PBS -A XX-YY-ZZ

#! Mail to user when the job terminates or aborts
#PBS -m ae

#! Change the working directory (default is the home directory)
#cd <working directory>
WORK_DIR="/scratch/$USER/work"
cd $WORK_DIR

echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo This jobs runs on the following processors:
echo `cat $PBS_NODEFILE`

#! Count the number of allocated cores (lines in the PBS node file)
NPROCS=`wc -l < $PBS_NODEFILE`

echo This job has allocated $NPROCS cores

module load ansys

#### Set number of processors per host listing
#### (set to 1 as $PBS_NODEFILE lists each node twice if :ppn=2)
procs_per_host=1
#### Create host list
hl=""
for host in `cat $PBS_NODEFILE`
do
if [ "$hl" = "" ]
then hl="$host:$procs_per_host"
else hl="${hl}:$host:$procs_per_host"
fi
done

echo Machines: $hl

/ansys_inc/v145/ansys/bin/ansys145 -dis -lsdynampp i=input.k -machines $hl
+

The header of the PBS file (above) is common, and its description can be found on this site. SVS FEM recommends requesting resources with the keywords nodes and ppn. These keywords directly specify the number of nodes (computers) and the number of cores per node (ppn) that will be utilized in the job. The rest of the script also assumes this structure of allocated resources.

+

The working directory has to be created before the PBS job is submitted into the queue. The input file should be in the working directory, or the full path to the input file has to be specified. The input has to be a common LS-DYNA .k file, which is passed to the ANSYS solver via the parameter i=

+

 

+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-ls-dyna.md b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-ls-dyna.md index 38c83e91807de96b53e2d95384e5587b0ff00966..4f27959d582152e335b164ca112c37c1289a1075 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-ls-dyna.md +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-ls-dyna.md @@ -63,7 +63,7 @@ do fi done -echo Machines$hl +echo Machines: $hl /ansys_inc/v145/ansys/bin/ansys145 -dis -lsdynampp i=input.k -machines $hl ``` @@ -72,7 +72,7 @@ echo Machines$hl find on [this site](../../resource-allocation-and-job-execution/job-submission-and-execution.html). [SVS FEM](http://www.svsfem.cz) recommends to utilize -sources by keywordsnodes, ppn. These keywords allows to address +sources by keywords: nodes, ppn. These keywords allows to address directly the number of nodes (computers) and cores (ppn) which will be utilized in the job. Also the rest of code assumes such structure of allocated resources. @@ -83,3 +83,7 @@ file has to be specified. Input file has to be defined by common LS-DYNA .**k** file which is attached to the ansys solver via parameter i=   + + + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-mapdl-pbs-file/view.html b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-mapdl-pbs-file/view.html new file mode 100644 index 0000000000000000000000000000000000000000..9d7964651622778fb26b2a135beb4a53d95c9c78 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-mapdl-pbs-file/view.html @@ -0,0 +1,1170 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +ANSYS MAPDL PBS file — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ ANSYS MAPDL PBS file +

+ + + + +
+
+ + + + + + + + + + + +
+
+ + +

mapdl.pbs (shell script, 1 KB, 1201 bytes)

+ + + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-mechanical-apdl.html b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-mechanical-apdl.html new file mode 100644 index 0000000000000000000000000000000000000000..c3d7af92aa8e0a3b6fc947564b02a03978cbae61 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-mechanical-apdl.html @@ -0,0 +1,1183 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +ANSYS MAPDL — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ ANSYS MAPDL +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

ANSYS Multiphysics software offers a comprehensive product solution for both multiphysics and single-physics analysis. The product includes structural, thermal, fluid and both high- and low-frequency electromagnetic analysis. The product also contains solutions for both direct and sequentially coupled physics problems including direct coupled-field elements and the ANSYS multi-field solver.

+

To run ANSYS MAPDL in batch mode you can utilize/modify the default mapdl.pbs script and execute it via the qsub command.

+
#!/bin/bash
#PBS -l nodes=2:ppn=16
#PBS -q qprod
#PBS -N $USER-ANSYS-Project
#PBS -A XX-YY-ZZ

#! Mail to user when the job terminates or aborts
#PBS -m ae

#! Change the working directory (default is the home directory)
#cd <working directory> (working directory must exist)
WORK_DIR="/scratch/$USER/work"
cd $WORK_DIR

echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo This jobs runs on the following processors:
echo `cat $PBS_NODEFILE`

module load ansys

#### Set number of processors per host listing
#### (set to 1 as $PBS_NODEFILE lists each node twice if :ppn=2)
procs_per_host=1
#### Create host list
hl=""
for host in `cat $PBS_NODEFILE`
do
if [ "$hl" = "" ]
then hl="$host:$procs_per_host"
else hl="${hl}:$host:$procs_per_host"
fi
done

echo Machines: $hl

#-i input.dat is the input of the analysis in APDL format
#-o file.out is the output file from ANSYS to which all text output will be redirected
#-p the name of the license feature (aa_r=ANSYS Academic Research, ane3fl=Multiphysics (commercial), aa_r_dy=Academic AUTODYN)
/ansys_inc/v145/ansys/bin/ansys145 -b -dis -p aa_r -i input.dat -o file.out -machines $hl -dir $WORK_DIR
+

The header of the PBS file (above) is common, and its description can be found on this site. SVS FEM recommends requesting resources with the keywords nodes and ppn. These keywords directly specify the number of nodes (computers) and the number of cores per node (ppn) that will be utilized in the job. The rest of the script also assumes this structure of allocated resources.

+

The working directory has to be created before the PBS job is submitted into the queue. The input file should be in the working directory, or the full path to the input file has to be specified. The input has to be a common APDL file, which is passed to the ANSYS solver via the parameter -i

+

The license should be selected by the parameter -p. The licensed products are the following: aa_r (ANSYS Academic Research), ane3fl (ANSYS Multiphysics, commercial), and aa_r_dy (ANSYS Academic AUTODYN).
More about licensing here
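
For example, to run the same job under the commercial Multiphysics license instead of the academic one, only the -p argument of the solver line shown above needs to change (a sketch, not a separate script):

+
/ansys_inc/v145/ansys/bin/ansys145 -b -dis -p ane3fl -i input.dat -o file.out -machines $hl -dir $WORK_DIR
+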

+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-mechanical-apdl.md b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-mechanical-apdl.md index c344de56479bda0a3486cd9dd1408d54c68e8647..285f909a4c9da3a74b1393c27f7edbc0a2d26866 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-mechanical-apdl.md +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-mechanical-apdl.md @@ -49,7 +49,7 @@ do fi done -echo Machines$hl +echo Machines: $hl #-i input.dat includes the input of analysis in APDL format #-o file.out is output file from ansys where all text outputs will be redirected @@ -61,7 +61,7 @@ Header of the pbs file (above) is common and description can be find on [this site](../../resource-allocation-and-job-execution/job-submission-and-execution.html). [SVS FEM](http://www.svsfem.cz) recommends to utilize -sources by keywordsnodes, ppn. These keywords allows to address +sources by keywords: nodes, ppn. These keywords allows to address directly the number of nodes (computers) and cores (ppn) which will be utilized in the job. Also the rest of code assumes such structure of allocated resources. @@ -72,10 +72,14 @@ file has to be specified. Input file has to be defined by common APDL file which is attached to the ansys solver via parameter -i **License** should be selected by parameter -p. Licensed products are -the followingaa_r (ANSYS **Academic** Research), ane3fl (ANSYS +the following: aa_r (ANSYS **Academic** Research), ane3fl (ANSYS Multiphysics)-**Commercial**, aa_r_dy (ANSYS **Academic** AUTODYN) [More about licensing here](licensing.html) + + + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-products-mechanical-fluent-cfx-mapdl.html b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-products-mechanical-fluent-cfx-mapdl.html new file mode 100644 index 0000000000000000000000000000000000000000..cdcfaebeca3851adb6ce1ea8d036020c63fb3161 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-products-mechanical-fluent-cfx-mapdl.html @@ -0,0 +1,1173 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Overview of ANSYS Products — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ Overview of ANSYS Products +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

SVS FEM, as the ANSYS Channel partner for the Czech Republic, provided all ANSYS licenses for the Anselm cluster and provides support for all ANSYS products (Multiphysics, Mechanical, MAPDL, CFX, Fluent, Maxwell, LS-DYNA, ...) to IT staff and ANSYS users. If you encounter a problem with ANSYS functionality, please contact 

+

Anselm provides both commercial and academic variants. Academic variants are distinguished by the word "Academic..." in the license name or by the two-letter prefix "aa_" in the license feature name. The license is chosen on the command line or directly in the user's PBS file (see the individual products). More about licensing here

+

To load the latest version of any ANSYS product (Mechanical, Fluent, CFX, MAPDL,...) load the module:

+
$ module load ansys
+

ANSYS supports interactive use, but because the cluster is intended for the solution of extremely demanding tasks, interactive work is not recommended.

+

If a user needs to work interactively, we recommend configuring the RSM service on the client machine, which allows submitting the solution to Anselm directly from the client's Workbench project (see ANSYS RSM service).

+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-products-mechanical-fluent-cfx-mapdl.md b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-products-mechanical-fluent-cfx-mapdl.md index 58f64de2c66d480991233d21daaf98b11f3aec69..ea369b7a88f22f62380b9d45528d2ca494805d66 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-products-mechanical-fluent-cfx-mapdl.md +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ansys-products-mechanical-fluent-cfx-mapdl.md @@ -30,3 +30,7 @@ If user needs to work in interactive regime we recommend to configure the RSM service on the client machine which allows to forward the solution to the Anselm directly from the client's Workbench project (see ANSYS RSM service). + + + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/licensing.html b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/licensing.html new file mode 100644 index 0000000000000000000000000000000000000000..c168c66fd3940f4e772f469f3d9e7944534ff8fb --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/licensing.html @@ -0,0 +1,351 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ls-dyna-pbs-file/view.html b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ls-dyna-pbs-file/view.html new file mode 100644 index 0000000000000000000000000000000000000000..78a31628995fe3c7fb1ec6f45f5befa2dbe37394 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ls-dyna-pbs-file/view.html @@ -0,0 +1,1170 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +LS-DYNA PBS file — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ LS-DYNA PBS file +

+ + + + +
+
+ + + + + + + + + + + +
+
+ + +

lsdyna.pbs (shell script, 1 KB)

+ + + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ls-dyna.html b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ls-dyna.html new file mode 100644 index 0000000000000000000000000000000000000000..3c2fa44cb9fac1431ad88cf46a964e9635f9187e --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ls-dyna.html @@ -0,0 +1,1183 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +LS-DYNA — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ LS-DYNA +

+ + + + +
+
+ + + + + + + + + + + +
+ +
+
+ + + +
+

LS-DYNA is a multi-purpose, explicit and implicit finite element program used to analyze the nonlinear dynamic response of structures. Its fully automated contact analysis capability, a wide range of constitutive models to simulate a whole range of engineering materials (steels, composites, foams, concrete, etc.), error-checking features and the high scalability have enabled users worldwide to solve successfully many complex problems. Additionally LS-DYNA is extensively used to simulate impacts on structures from drop tests, underwater shock, explosions or high-velocity impacts. Explosive forming, process engineering, accident reconstruction, vehicle dynamics, thermal brake disc analysis or nuclear safety are further areas in the broad range of possible applications. In leading-edge research LS-DYNA is used to investigate the behaviour of materials like composites, ceramics, concrete, or wood. Moreover, it is used in biomechanics, human modelling, molecular structures, casting, forging, or virtual testing.

+

Anselm currently provides 1 commercial license of LS-DYNA without HPC support.

+

To run LS-DYNA in batch mode you can utilize/modify the default lsdyna.pbs script and execute it via the qsub command.

+
#!/bin/bash
#PBS -l nodes=1:ppn=16
#PBS -q qprod
#PBS -N $USER-LSDYNA-Project
#PBS -A XX-YY-ZZ

#! Mail to user when the job terminates or aborts
#PBS -m ae

#! Change the working directory (default is the home directory)
#cd <working directory> (working directory must exist)
WORK_DIR="/scratch/$USER/work"
cd $WORK_DIR

echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`

module load lsdyna

/apps/engineering/lsdyna/lsdyna700s i=input.k
+
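
Assuming the script above is saved as lsdyna.pbs in the working directory, it is submitted with qsub:

+
$ qsub lsdyna.pbs
+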

The header of the PBS file (above) is common, and its description can be found on this site. SVS FEM recommends requesting resources with the keywords nodes and ppn. These keywords directly specify the number of nodes (computers) and the number of cores per node (ppn) that will be utilized in the job. The rest of the script also assumes this structure of allocated resources.

+

The working directory has to be created before the PBS job is submitted into the queue. The input file should be in the working directory, or the full path to the input file has to be specified. The input has to be a common LS-DYNA .k file, which is passed to the LS-DYNA solver via the parameter i=

+ + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ls-dyna.md b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ls-dyna.md index 8113ddac5166761dd4770963e8011b092769e59d..51443a473bb79b90ee2661e542ae38ab92af9337 100644 --- a/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ls-dyna.md +++ b/docs.it4i.cz/anselm-cluster-documentation/software/ansys/ls-dyna.md @@ -53,7 +53,7 @@ Header of the pbs file (above) is common and description can be find on [this site](../../resource-allocation-and-job-execution/job-submission-and-execution.html). [SVS FEM](http://www.svsfem.cz) recommends to utilize -sources by keywordsnodes, ppn. These keywords allows to address +sources by keywords: nodes, ppn. These keywords allows to address directly the number of nodes (computers) and cores (ppn) which will be utilized in the job. Also the rest of code assumes such structure of allocated resources. @@ -62,3 +62,7 @@ Working directory has to be created before sending pbs job into the queue. Input file should be in working directory or full path to input file has to be specified. Input file has to be defined by common LS-DYNA **.k** file which is attached to the LS-DYNA solver via parameter i= + + + + diff --git a/docs.it4i.cz/anselm-cluster-documentation/software/chemistry.html b/docs.it4i.cz/anselm-cluster-documentation/software/chemistry.html new file mode 100644 index 0000000000000000000000000000000000000000..b3cb364a296aa670bc1d40df56f6dcb17d0afc19 --- /dev/null +++ b/docs.it4i.cz/anselm-cluster-documentation/software/chemistry.html @@ -0,0 +1,1089 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Chemistry and Materials science — IT4I Docs + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ Chemistry and Materials science +

+ + + + +
+
+ + + + + +
+ Tools for computational chemistry. +
+ + + + + +
+
+ + + + + + + + + +
+ + + +
+ + + + Molpro + + + + + + + + + + +
+ +
+ Molpro is a complete system of ab initio programs for molecular electronic structure calculations. +
+ + + + + + +
+ + + + NWChem + + + + + + + + + + +
+ +
+ High-Performance Computat