Commit 760a29af authored by David Hrbáč

Links OK

parent e9333cd5
@@ -16,7 +16,7 @@ Queue priority is the priority of the queue in which the job is waiting prior to
Queue priority has the biggest impact on job execution priority. The execution priority of jobs in higher priority queues is always greater than the execution priority of jobs in lower priority queues. Other properties of jobs used for determining the job execution priority (fair-share priority, eligible time) cannot compete with queue priority.
Queue priorities can be seen at [https://extranet.it4i.cz/anselm/queues](https://extranet.it4i.cz/anselm/queues)
Queue priorities can be seen [here][a].
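As a quick command-line check, the same priority value can usually be read from the queue attributes; a minimal sketch, assuming PBS's `qstat` is available on the login nodes and using qprod as an illustrative queue:

```console
$ qstat -Qf qprod | grep -i priority
```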
### Fair-Share Priority
@@ -36,7 +36,7 @@ Usage counts allocated core-hours (`ncpus x walltime`). Usage decays, halving at
Jobs queued in the queue qexp are not used to calculate the project's usage.
!!! note
Calculated usage and fair-share priority can be seen at [https://extranet.it4i.cz/anselm/projects](https://extranet.it4i.cz/anselm/projects).
Calculated usage and fair-share priority can be seen [here][b].
Calculated fair-share priority can also be seen in the Resource_List.fairshare attribute of a job.
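For example, a minimal sketch of inspecting that attribute for a queued job (the job ID below is illustrative):

```console
$ qstat -f 123456 | grep fairshare
```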
@@ -70,3 +70,6 @@ This means that jobs with lower execution priority can be run before jobs with h
Specifying more accurate walltime enables better scheduling, better execution times, and better resource usage. Jobs with suitable (small) walltime can be backfilled - and overtake job(s) with a higher priority.
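For instance, an explicit, realistic walltime can be requested at submission time; a sketch using the project and node counts from the examples elsewhere in these docs (values are illustrative):

```console
$ qsub -A OPEN-0-0 -q qprod -l select=4:ncpus=16,walltime=02:00:00 ./myjob
```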
---8<--- "mathjax.md"
[a]: https://extranet.it4i.cz/anselm/queues
[b]: https://extranet.it4i.cz/anselm/projects
@@ -51,7 +51,7 @@ $ qsub -A OPEN-0-0 -q qfree -l select=10:ncpus=16 ./myjob
In this example, we allocate 10 nodes, 16 cores per node, for 12 hours. We allocate these resources via the qfree queue. It is not required that the project OPEN-0-0 has any available resources left. Consumed resources are still accounted for. The jobscript myjob will be executed on the first node in the allocation.
All qsub options may be [saved directly into the jobscript](#example-jobscript-for-mpi-calculation-with-preloaded-inputs). In such cases, it is not necessary to specify any options for qsub.
All qsub options may be [saved directly into the jobscript][1]. In such cases, it is not necessary to specify any options for qsub.
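A minimal sketch of how such directives might look at the top of the jobscript (the values are illustrative):

```bash
#!/bin/bash
#PBS -A OPEN-0-0
#PBS -q qprod
#PBS -l select=10:ncpus=16
#PBS -N myjob
```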
```console
$ qsub ./myjob
@@ -92,9 +92,9 @@ In this example, we allocate 4 nodes, 16 cores per node, selecting only the node
### Placement by IB Switch
Groups of computational nodes are connected to chassis integrated Infiniband switches. These switches form the leaf switch layer of the [Infiniband network](anselm/network/) fat tree topology. Nodes sharing the leaf switch can communicate most efficiently. Sharing the same switch prevents hops in the network and facilitates unbiased, highly efficient network communication.
Groups of computational nodes are connected to chassis integrated Infiniband switches. These switches form the leaf switch layer of the [Infiniband network][2] fat tree topology. Nodes sharing the leaf switch can communicate most efficiently. Sharing the same switch prevents hops in the network and facilitates unbiased, highly efficient network communication.
Nodes sharing the same switch may be selected via the PBS resource attribute ibswitch. Values of this attribute are iswXX, where XX is the switch number. The node-switch mapping can be seen in the [Hardware Overview](anselm/hardware-overview/) section.
Nodes sharing the same switch may be selected via the PBS resource attribute ibswitch. Values of this attribute are iswXX, where XX is the switch number. The node-switch mapping can be seen in the [Hardware Overview][3] section.
We recommend allocating compute nodes to a single switch when best possible computational network performance is required to run the job efficiently:
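A sketch of such a request, selecting nodes attached to an illustrative switch isw20 via the ibswitch attribute:

```console
$ qsub -A OPEN-0-0 -q qprod -l select=18:ncpus=16:ibswitch=isw20 ./myjob
```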
@@ -339,7 +339,7 @@ exit
In this example, a directory in /home holds the input file input and the executable mympiprog.x. We create the directory myjob on the /scratch filesystem, copy the input and executable files from the /home directory where qsub was invoked ($PBS_O_WORKDIR) to /scratch, execute the MPI program mympiprog.x, and copy the output file back to the /home directory. mympiprog.x is executed as one process per node, on all allocated nodes.
!!! note
Consider preloading inputs and executables onto [shared scratch](storage/) memory before the calculation starts.
Consider preloading inputs and executables onto [shared scratch][4] memory before the calculation starts.
In some cases, it may be impractical to copy the inputs to the scratch memory and the outputs to the home directory. This is especially true when very large input and output files are expected, or when the files should be reused by a subsequent calculation. In such cases, it is the users' responsibility to preload the input files on shared /scratch memory before the job submission, and retrieve the outputs manually after all calculations are finished.
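A minimal sketch of such manual preloading from a login node, reusing the file names from the example above (paths are illustrative):

```console
$ mkdir -p /scratch/$USER/myjob
$ cp input mympiprog.x /scratch/$USER/myjob/
```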
@@ -373,15 +373,14 @@ exit
In this example, input and executable files are assumed to be preloaded manually in the /scratch/$USER/myjob directory. Note the **mpiprocs** and **ompthreads** qsub options controlling the behavior of the MPI execution. mympiprog.x is executed as one process per node, on all 100 allocated nodes. If mympiprog.x implements OpenMP threads, it will run 16 threads per node.
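A sketch of a matching submission line (the node count follows the example above; the other values are illustrative):

```console
$ qsub -A OPEN-0-0 -q qprod -l select=100:ncpus=16:mpiprocs=1:ompthreads=16 ./myjob
```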
More information can be found in the [Running OpenMPI](software/mpi/Running_OpenMPI/) and [Running MPICH2](software/mpi/running-mpich2/)
sections.
More information can be found in the [Running OpenMPI][5] and [Running MPICH2][6] sections.
### Example Jobscript for Single Node Calculation
!!! note
The local scratch directory is often useful for single node jobs. Local scratch memory will be deleted immediately after the job ends.
Example jobscript for single node calculation, using [local scratch](anselm/storage/) memory on the node:
Example jobscript for single node calculation, using [local scratch][4] memory on the node:
```bash
#!/bin/bash
@@ -407,4 +406,12 @@ In this example, a directory in /home holds the input file input and executable
### Other Jobscript Examples
Further jobscript examples may be found in the software section and the [Capacity computing](anselm/capacity-computing/) section.
Further jobscript examples may be found in the software section and the [Capacity computing][7] section.
[1]: ./#example-jobscript-for-mpi-calculation-with-preloaded-inputs
[2]: ./network.md
[3]: ./hardware-overview.md
[4]: ./storage.md
[5]: ../software/mpi/Running_OpenMPI.md
[6]: ../software/mpi/running-mpich2.md
[7]: ./capacity-computing.md
# Network
All of the compute and login nodes of Anselm are interconnected through an [InfiniBand](http://en.wikipedia.org/wiki/InfiniBand) QDR network and a Gigabit [Ethernet](http://en.wikipedia.org/wiki/Ethernet) network. Both networks may be used to transfer user data.
All of the compute and login nodes of Anselm are interconnected through an [InfiniBand][a] QDR network and a Gigabit [Ethernet][b] network. Both networks may be used to transfer user data.
## InfiniBand Network
All of the compute and login nodes of Anselm are interconnected through a high-bandwidth, low-latency [InfiniBand](http://en.wikipedia.org/wiki/InfiniBand) QDR network (IB 4 x QDR, 40 Gbps). The network topology is a fully non-blocking fat-tree.
All of the compute and login nodes of Anselm are interconnected through a high-bandwidth, low-latency [InfiniBand][a] QDR network (IB 4 x QDR, 40 Gbps). The network topology is a fully non-blocking fat-tree.
The compute nodes may be accessed via the InfiniBand network using the ib0 network interface, in the address range 10.2.1.1-209. MPI may be used to establish native InfiniBand connections among the nodes.
@@ -19,6 +19,8 @@ The compute nodes may be accessed via the regular Gigabit Ethernet network inter
## Example
In this example, we access the node cn110 through the InfiniBand network via the ib0 interface, then from cn110 to cn108 through the Ethernet network.
```console
$ qsub -q qexp -l select=4:ncpus=16 -N Name0 ./myjob
$ qstat -n -u username
@@ -32,4 +34,5 @@ $ ssh 10.2.1.110
$ ssh 10.1.1.108
```
In this example, we access the node cn110 through the InfiniBand network via the ib0 interface, then from cn110 to cn108 through the Ethernet network.
[a]: http://en.wikipedia.org/wiki/InfiniBand
[b]: http://en.wikipedia.org/wiki/Ethernet
@@ -2,7 +2,7 @@
## Job Queue Policies
The resources are allocated to the job in a fair-share fashion, subject to constraints set by the queue and the resources available to the Project. The Fair-share system of Anselm ensures that individual users may consume approximately equal amounts of resources per week. Detailed information can be found in the [Job scheduling](anselm/job-priority/) section. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following table provides the queue partitioning overview:
The resources are allocated to the job in a fair-share fashion, subject to constraints set by the queue and the resources available to the Project. The Fair-share system of Anselm ensures that individual users may consume approximately equal amounts of resources per week. Detailed information can be found in the [Job scheduling][1] section. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following table provides the queue partitioning overview:
!!! note
Check the queue status at <https://extranet.it4i.cz/anselm/>
@@ -17,28 +17,28 @@ The resources are allocated to the job in a fair-share fashion, subject to const
| qfree | yes | < 120% of allocation | 180 w/o accelerator | 16 | -1024 | no | 12 h |
!!! note
**The qfree queue is not free of charge**. [Normal accounting](#resources-accounting-policy) applies. However, it allows for utilization of free resources, once a project has exhausted all its allocated computational resources. This does not apply to Director's Discretion projects (DD projects) by default. Usage of qfree after exhaustion of DD projects' computational resources is allowed after request for this queue.
**The qfree queue is not free of charge**. [Normal accounting][2] applies. However, it allows for utilization of free resources, once a project has exhausted all its allocated computational resources. This does not apply to Director's Discretion projects (DD projects) by default. Usage of qfree after exhaustion of DD projects' computational resources is allowed after request for this queue.
**The qexp queue is equipped with nodes which do not have exactly the same CPU clock speed.** Should you need the nodes to have exactly the same CPU speed, you have to select the proper nodes during the PBS job submission.
* **qexp**, the Express queue: This queue is dedicated to testing and running very small jobs. It is not required to specify a project to enter the qexp. There are always 2 nodes reserved for this queue (w/o accelerators), and a maximum of 8 nodes are available via the qexp for a particular user, from a pool of nodes containing Nvidia accelerated nodes (cn181-203), MIC accelerated nodes (cn204-207), and Fat nodes with 512GB of RAM (cn208-209). This enables us to test and tune accelerated code and code with higher RAM requirements. The nodes may be allocated on a per core basis. No special authorization is required to use qexp. The maximum runtime in qexp is 1 hour (see the interactive example after this list).
* **qprod**, the Production queue: This queue is intended for normal production runs. It is required that an active project with nonzero remaining resources is specified to enter the qprod. All nodes may be accessed via the qprod queue, except the reserved ones. 178 nodes without accelerators are included. Full nodes, 16 cores per node, are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qprod is 48 hours.
* **qlong**, the Long queue: This queue is intended for long production runs. It is required that an active project with nonzero remaining resources is specified to enter the qlong. Only 60 nodes without acceleration may be accessed via the qlong queue. Full nodes, 16 cores per node, are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qlong is 144 hours (three times that of the standard qprod time - 3 x 48 h).
* **qnvidia**, qmic, qfat, the Dedicated queues: The queue qnvidia is dedicated to accessing the Nvidia accelerated nodes, the qmic to accessing MIC nodes and qfat the Fat nodes. It is required that an active project with nonzero remaining resources is specified to enter these queues. 23 nvidia, 4 mic, and 2 fat nodes are included. Full nodes, 16 cores per node, are allocated. The queues run with very high priority, the jobs will be scheduled before the jobs coming from the qexp queue. An PI needs to explicitly ask [support](https://support.it4i.cz/rt/) for authorization to enter the dedicated queues for all users associated with her/his project.
* **qnvidia**, qmic, qfat, the Dedicated queues: The queue qnvidia is dedicated to accessing the Nvidia accelerated nodes, the qmic to accessing MIC nodes, and qfat the Fat nodes. It is required that an active project with nonzero remaining resources is specified to enter these queues. 23 nvidia, 4 mic, and 2 fat nodes are included. Full nodes, 16 cores per node, are allocated. The queues run with very high priority; the jobs will be scheduled before jobs coming from the qexp queue. A PI needs to explicitly ask [support][a] for authorization to enter the dedicated queues for all users associated with her/his project.
* **qfree**, the Free resource queue: The queue qfree is intended for utilization of free resources, after a project has exhausted all of its allocated computational resources (this does not apply to DD projects by default; DD projects have to request permission to use qfree after exhaustion of computational resources). It is required that an active project is specified to enter the queue. Consumed resources will be accounted to the Project. Access to the qfree queue is automatically removed if consumed resources exceed 120% of the resources allocated to the Project. Only 180 nodes without accelerators may be accessed from this queue. Full nodes, 16 cores per node, are allocated. The queue runs with very low priority and no special authorization is required to use it. The maximum runtime in qfree is 12 hours.
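As a quick way to try the qexp queue mentioned above, an interactive session can be requested; a minimal sketch (node and core counts are illustrative):

```console
$ qsub -q qexp -l select=1:ncpus=16 -I
```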
## Queue Notes
The job wall clock time defaults to **half the maximum time**, see the table above. Longer wall time limits can be [set manually, see examples](anselm/job-submission-and-execution/).
The job wall clock time defaults to **half the maximum time**, see the table above. Longer wall time limits can be [set manually, see examples][3].
Jobs that exceed the reserved wall clock time (Req'd Time) get killed automatically. The wall clock time limit can be changed for queued jobs (state Q) using the qalter command; however, it cannot be changed for a running job (state R).
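For example, a minimal sketch of extending the limit of a queued job (the job ID and walltime are illustrative):

```console
$ qalter -l walltime=24:00:00 123456
```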
Anselm users may check the current queue configuration at [https://extranet.it4i.cz/anselm/queues](https://extranet.it4i.cz/anselm/queues).
Anselm users may check the current queue configuration [here][b].
## Queue Status
!!! tip
Check the status of jobs, queues and compute nodes at [https://extranet.it4i.cz/anselm/](https://extranet.it4i.cz/anselm/)
Check the status of jobs, queues and compute nodes [here][c].
![rspbs web interface](../img/rsweb.png)
@@ -109,3 +109,11 @@ Options:
---8<--- "resource_accounting.md"
---8<--- "mathjax.md"
[1]: ./job-priority.md
[2]: ./#resources-accounting-policy
[3]: ./job-submission-and-execution.md
[a]: https://support.it4i.cz/rt/
[b]: https://extranet.it4i.cz/anselm/queues
[c]: https://extranet.it4i.cz/anselm/
@@ -10,7 +10,7 @@ The Anselm cluster is accessed by SSH protocol via login nodes login1 and login2
| login1.anselm.it4i.cz | 22 | ssh | login1 |
| login2.anselm.it4i.cz | 22 | ssh | login2 |
Authentication is by [private key](../../general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys/)
Authentication is by [private key][1] only.
!!! note
Please verify SSH fingerprints during the first logon. They are identical on all login nodes:
@@ -39,7 +39,7 @@ If you see a warning message "UNPROTECTED PRIVATE KEY FILE!", use this command t
$ chmod 600 /path/to/id_rsa
```
On **Windows**, use [PuTTY ssh client](../general/accessing-the-clusters/shell-access-and-data-transfer/putty.md).
On **Windows**, use [PuTTY ssh client][2].
After logging in, you will see the command prompt:
@@ -61,11 +61,11 @@ Last login: Tue Jul 9 15:57:38 2013 from your-host.example.com
Example of logging in to the cluster:
!!! note
The environment is **not** shared between login nodes, except for [shared filesystems](storage/#shared-filesystems).
The environment is **not** shared between login nodes, except for [shared filesystems][3].
## Data Transfer
Data in and out of the system may be transferred by the [scp](http://en.wikipedia.org/wiki/Secure_copy) and sftp protocols. (Not available yet). In the case that large volumes of data are transferred, use the dedicated data mover node dm1.anselm.it4i.cz for increased performance.
Data in and out of the system may be transferred by the [scp][a] and sftp protocols. (Not available yet). In the case that large volumes of data are transferred, use the dedicated data mover node dm1.anselm.it4i.cz for increased performance.
| Address | Port | Protocol |
| --------------------- | ---- | --------- |
@@ -73,7 +73,7 @@ Data in and out of the system may be transferred by the [scp](http://en.wikipedi
| login1.anselm.it4i.cz | 22 | scp |
| login2.anselm.it4i.cz | 22 | scp |
Authentication is by [private key](../general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys.md)
Authentication is by [private key][1] only.
!!! note
Data transfer rates of up to **160MB/s** can be achieved with scp or sftp.
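For example, a sketch of copying a local file to the cluster with scp (the key path and file names are illustrative):

```console
$ scp -i /path/to/id_rsa my-local-file username@anselm.it4i.cz:directory/file
```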
@@ -101,7 +101,7 @@ or
$ sftp -o IdentityFile=/path/to/id_rsa username@anselm.it4i.cz
```
A very convenient way to transfer files in and out of Anselm is via the fuse filesystem [sshfs](http://linux.die.net/man/1/sshfs)
A very convenient way to transfer files in and out of Anselm is via the fuse filesystem [sshfs][b].
```console
$ sshfs -o IdentityFile=/path/to/id_rsa username@anselm.it4i.cz:. mountpoint
@@ -117,9 +117,9 @@ $ man scp
$ man sshfs
```
On Windows, use the [WinSCP client](http://winscp.net/eng/download.php) to transfer the data. The [win-sshfs client](http://code.google.com/p/win-sshfs/) provides a way to mount the Anselm filesystems directly as an external disc.
On Windows, use the [WinSCP client][c] to transfer the data. The [win-sshfs client][d] provides a way to mount the Anselm filesystems directly as an external disc.
More information about the shared file systems is available [here](access/storage/).
More information about the shared file systems is available [here][4].
## Connection Restrictions
@@ -169,15 +169,15 @@ $ ssh -L 6000:localhost:1234 remote.host.com
Remote port forwarding from compute nodes allows applications running on the compute nodes to access hosts outside the Anselm Cluster.
First, establish the remote port forwarding form the login node, as [described above](#port-forwarding-from-login-nodes).
First, establish the remote port forwarding from the login node, as [described above][5].
Second, invoke port forwarding from the compute node to the login node. Insert the following line into your jobscript or interactive shell;
Second, invoke port forwarding from the compute node to the login node. Insert the following line into your jobscript or interactive shell:
```console
$ ssh -TN -f -L 6000:localhost:6000 login1
```
In this example, we assume that port forwarding from login1:6000 to remote.host.com:1234 has been established beforehand. By accessing localhost:6000, an application running on a compute node will see the response of remote.host.com:1234
In this example, we assume that port forwarding from `login1:6000` to `remote.host.com:1234` has been established beforehand. By accessing `localhost:6000`, an application running on a compute node will see the response of `remote.host.com:1234`.
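For completeness, a sketch of how that prerequisite forwarding might be established beforehand from your local workstation (hostnames and ports are illustrative):

```console
$ ssh -R 6000:remote.host.com:1234 username@anselm.it4i.cz
```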
### Using Proxy Servers
@@ -192,21 +192,39 @@ To establish a local proxy server on your workstation, install and run SOCKS pro
$ ssh -D 1080 localhost
```
On Windows, install and run the free, open source [Sock Puppet](http://sockspuppet.com/) server.
On Windows, install and run the free, open source [Sock Puppet][e] server.
Once the proxy server is running, establish ssh port forwarding from Anselm to the proxy server, port 1080, exactly as [described above](#port-forwarding-from-login-nodes):
Once the proxy server is running, establish ssh port forwarding from Anselm to the proxy server, port 1080, exactly as [described above][5]:
```console
$ ssh -R 6000:localhost:1080 anselm.it4i.cz
```
Now, configure the applications proxy settings to **localhost:6000**. Use port forwarding to access the [proxy server from compute nodes](#port-forwarding-from-compute-nodes) as well.
Now, configure the application's proxy settings to **localhost:6000**. Use port forwarding to access the [proxy server from compute nodes][9] as well.
## Graphical User Interface
* The [X Window system](general/accessing-the-clusters/graphical-user-interface/x-window-system/) is the principal way to get GUI access to the clusters.
* [Virtual Network Computing](general/accessing-the-clusters/graphical-user-interface/vnc/) is a graphical [desktop sharing](http://en.wikipedia.org/wiki/Desktop_sharing) system that uses the [Remote Frame Buffer protocol](http://en.wikipedia.org/wiki/RFB_protocol) to remotely control another [computer](http://en.wikipedia.org/wiki/Computer).
* The [X Window system][6] is the principal way to get GUI access to the clusters.
* [Virtual Network Computing][7] is a graphical [desktop sharing][f] system that uses the [Remote Frame Buffer protocol][g] to remotely control another [computer][h].
## VPN Access
* Access IT4Innovations internal resources via [VPN](general/accessing-the-clusters/vpn-access/).
* Access IT4Innovations internal resources via [VPN][8].
[1]: ../general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys.md
[2]: ../general/accessing-the-clusters/shell-access-and-data-transfer/putty.md
[3]: ./storage.md#shared-filesystems
[4]: ./storage.md
[5]: ./#port-forwarding-from-login-nodes
[6]: ../general/accessing-the-clusters/graphical-user-interface/x-window-system.md
[7]: ../general/accessing-the-clusters/graphical-user-interface/vnc.md
[8]: ../general/accessing-the-clusters/vpn-access.md
[9]: ./#port-forwarding-from-compute-nodes
[a]: http://en.wikipedia.org/wiki/Secure_copy
[b]: http://linux.die.net/man/1/sshfs
[c]: http://winscp.net/eng/download.php
[d]: http://code.google.com/p/win-sshfs/
[e]: http://sockspuppet.com/
[f]: http://en.wikipedia.org/wiki/Desktop_sharing
[g]: http://en.wikipedia.org/wiki/RFB_protocol
[h]: http://en.wikipedia.org/wiki/Computer