Hardware Overview
=================
The Anselm cluster consists of 209 computational nodes named cn[1-209], of which 180 are regular compute nodes, 23 are GPU Kepler K20 accelerated nodes, 4 are MIC Xeon Phi 5110 accelerated nodes and 2 are fat nodes. Each node is a powerful x86-64 computer, equipped with 16 cores (two eight-core Intel Sandy Bridge processors), at least 64GB RAM, and a local hard drive. User access to the Anselm cluster is provided through two login nodes, login[1,2]. The nodes are interlinked by high-speed InfiniBand and Ethernet networks. All nodes share 320TB /home disk storage to store user files. The 146TB shared /scratch storage is available for scratch data.
The fat nodes are equipped with a large amount (512GB) of memory. The virtualization infrastructure provides resources to run long-term servers and services in virtual mode. Fat nodes and virtual servers may access 45 TB of dedicated block storage. Accelerated nodes, fat nodes, and the virtualization infrastructure are available [upon request](https://support.it4i.cz/rt) made by a PI.
Schematic representation of the Anselm cluster. Each box represents a
node (computer) or storage capacity:
[Schematic: login nodes login1 and login2 and the data mover node dm1 (user-oriented infrastructure); compute nodes cn1-cn207 housed in Racks 01-05 and grouped under per-chassis InfiniBand switches isw0-isw21; Lustre file systems /home (320TB) and /scratch (146TB) (storage); management nodes, 45 TB block storage and virtualization infrastructure servers (management infrastructure); fat nodes cn208 and cn209.]
The cluster compute nodes cn[1-207] are organized within 13 chassis.
There are four types of compute nodes:
- 180 compute nodes without an accelerator
- 23 compute nodes with a GPU accelerator - equipped with NVIDIA Tesla Kepler K20
- 4 compute nodes with a MIC accelerator - equipped with Intel Xeon Phi 5110P
- 2 fat nodes - equipped with 512GB RAM and two 100GB SSD drives
[More about Compute nodes](compute-nodes.html).
GPU and accelerated nodes are available upon request, see the [Resources
Allocation
Policy](resource-allocation-and-job-execution/resources-allocation-policy.html).
All these nodes are interconnected by a fast InfiniBand QDR network and an Ethernet network. [More about the Network](network.html).
Every chassis provides an InfiniBand switch, marked **isw**, connecting all nodes in the chassis, as well as connecting the chassis to the upper level switches.
All nodes share 320TB /home disk storage to store user files. The 146TB shared /scratch storage is available for scratch data. These file systems are provided by the Lustre parallel file system. There is also local disk storage /lscratch available on all compute nodes. [More about Storage](storage.html).
User access to the Anselm cluster is provided through two login nodes, login1 and login2, and the data mover node dm1. [More about accessing the cluster](accessing-the-cluster.html).
The parameters are summarized in the following tables:

**In general**

| Parameter                     | Value                      |
| ----------------------------- | -------------------------- |
| Primary purpose               | High Performance Computing |
| Architecture of compute nodes | x86-64                     |
| Operating system              | Linux                      |

[**Compute nodes**](compute-nodes.html)

| Parameter         | Value                                        |
| ----------------- | -------------------------------------------- |
| Total             | 209                                          |
| Processor cores   | 16 (2x8 cores)                               |
| RAM               | min. 64 GB, min. 4 GB per core               |
| Local disk drive  | yes - usually 500 GB                         |
| Compute network   | InfiniBand QDR, fully non-blocking, fat-tree |
| w/o accelerator   | 180, cn[1-180]                               |
| GPU accelerated   | 23, cn[181-203]                              |
| MIC accelerated   | 4, cn[204-207]                               |
| Fat compute nodes | 2, cn[208-209]                               |

**In total**

| Parameter                                  | Value      |
| ------------------------------------------ | ---------- |
| Total theoretical peak performance (Rpeak) | 94 Tflop/s |
| Total max. LINPACK performance (Rmax)      | 73 Tflop/s |
| Total amount of RAM                        | 15.136 TB  |
| Node             | Processor                             | Memory | Accelerator          |
| ---------------- | ------------------------------------- | ------ | -------------------- |
| w/o accelerator  | 2x Intel Sandy Bridge E5-2665, 2.4GHz | 64GB   | -                    |
| GPU accelerated  | 2x Intel Sandy Bridge E5-2470, 2.3GHz | 96GB   | NVIDIA Kepler K20    |
| MIC accelerated  | 2x Intel Sandy Bridge E5-2470, 2.3GHz | 96GB   | Intel Xeon Phi 5110P |
| Fat compute node | 2x Intel Sandy Bridge E5-2665, 2.4GHz | 512GB  | -                    |
For more details please refer to the [Compute
nodes](compute-nodes.html),
[Storage](storage.html), and
[Network](network.html).
Introduction
============
Welcome to the Anselm supercomputer cluster. The Anselm cluster consists of 209 compute nodes, totaling 3344 compute cores with 15TB RAM, giving over 94 Tflop/s theoretical peak performance. Each node is a powerful x86-64 computer, equipped with 16 cores, at least 64GB RAM, and a 500GB hard drive. Nodes are interconnected by a fully non-blocking fat-tree InfiniBand network and equipped with Intel Sandy Bridge processors. A few nodes are also equipped with NVIDIA Kepler GPU or Intel Xeon Phi MIC accelerators. Read more in the [Hardware Overview](hardware-overview.html).
The cluster runs the bullx Linux [operating system](software/operating-system.html), which is compatible with the RedHat [Linux family](http://upload.wikimedia.org/wikipedia/commons/1/1b/Linux_Distribution_Timeline.svg).
We have installed a wide range of
[software](software.1.html) packages targeted at
different scientific domains. These packages are accessible via the
[modules environment](environment-and-modules.html).
A user data shared file system (HOME, 320TB) and a job data shared file system (SCRATCH, 146TB) are available to users.
The PBS Professional workload manager provides [computing resources allocations and job execution](resource-allocation-and-job-execution.html).
Read more on how to [apply for resources](../get-started-with-it4innovations/applying-for-resources.html), [obtain login credentials](../get-started-with-it4innovations/obtaining-login-credentials.html), and [access the cluster](accessing-the-cluster.html).
Network
=======
All compute and login nodes of Anselm are interconnected by an [InfiniBand](http://en.wikipedia.org/wiki/InfiniBand) QDR network and by a Gigabit [Ethernet](http://en.wikipedia.org/wiki/Ethernet) network. Both networks may be used to transfer user data.
Infiniband Network
------------------
All compute and login nodes of Anselm are interconnected by a high-bandwidth, low-latency [InfiniBand](http://en.wikipedia.org/wiki/InfiniBand) QDR network (IB 4x QDR, 40 Gbps). The network topology is a fully non-blocking fat-tree.
The compute nodes may be accessed via the InfiniBand network using the ib0 network interface, with addresses in the range 10.2.1.1-209. MPI may be used to establish native InfiniBand connections among the nodes.
The network provides **2170MB/s** transfer rates via TCP connections (single stream) and up to **3600MB/s** via the native InfiniBand protocol. The fat-tree topology ensures that peak transfer rates are achieved between any two nodes, independent of network traffic exchanged among other nodes concurrently.
Ethernet Network
----------------
The compute nodes may be accessed via the regular Gigabit Ethernet network interface eth0, with addresses in the range 10.1.1.1-209, or by using the aliases cn1-cn209.
The network provides **114MB/s** transfer rates via TCP connections.
Example
-------
```
$ qsub -q qexp -l select=4:ncpus=16 -N Name0 ./myjob
$ qstat -n -u username
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
15209.srv11 username qexp Name0 5530 4 64 -- 01:00 R 00:00
cn17/0*16+cn108/0*16+cn109/0*16+cn110/0*16
$ ssh 10.2.1.110
$ ssh 10.1.1.108
```
In this example, we access the node cn110 over the InfiniBand network via the ib0 interface, then go from cn110 to cn108 over the Ethernet network.
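As the addressing scheme suggests, the last octet of the address corresponds to the node number on both networks. For instance, cn110 may be reached at either of these addresses; a quick reachability check from within the cluster:

```
$ ping -c 1 10.2.1.110   # cn110 over InfiniBand (ib0)
$ ping -c 1 10.1.1.110   # cn110 over Ethernet (eth0)
```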
PRACE User Support
==================
Intro
-----
PRACE users coming to Anselm as a TIER-1 system offered through the DECI calls are in general treated as standard users, so most of the general documentation applies to them as well. This section shows the main differences for quicker orientation, but often refers to the original documentation. PRACE users who don't undergo the full procedure (including signing the IT4I AuP on top of the PRACE AuP) will not have a password and thus no access to some services intended for regular users. This may lower their comfort, but otherwise they should be able to use the TIER-1 system as intended. Please see the [Obtaining Login Credentials section](../get-started-with-it4innovations/obtaining-login-credentials/obtaining-login-credentials.html) if the same level of access is required.
All general [PRACE User Documentation](http://www.prace-ri.eu/user-documentation/) should be read before continuing with the local documentation here.
Help and Support
--------------------
If you have any trouble, need information, want to request support or want to install additional software, please use the [PRACE Helpdesk](http://www.prace-ri.eu/helpdesk-guide264/).
Information about the local services is provided in the [introduction of the general user documentation](introduction.html). Please keep in mind that standard PRACE accounts don't have a password for the web interface of the local (IT4Innovations) request tracker, so a new ticket should be created by sending an e-mail to support[at]it4i.cz.
Obtaining Login Credentials
---------------------------
In general, PRACE users already have a PRACE account set up through their HOMESITE (an institution from their country) as a result of an awarded PRACE project proposal. This includes a signed PRACE AuP, generated and registered certificates, etc.
If there is a special need, a PRACE user can get a standard (local) account at IT4Innovations. To get an account on the Anselm cluster, the user needs to obtain the login credentials. The procedure is the same as for general users of the cluster, so please see the corresponding [section of the general documentation here](../get-started-with-it4innovations/obtaining-login-credentials.html).
Accessing the cluster
---------------------
### Access with GSI-SSH
For all PRACE users, the method for interactive access (login) and data transfer based on grid services from the Globus Toolkit (GSI SSH and GridFTP) is supported.
The user will need a valid certificate and an entry in the PRACE LDAP (please contact your HOME SITE or the primary investigator of your project for LDAP account creation).
Most of the information needed by PRACE users accessing the Anselm
TIER-1 system can be found here:
- [General user's
FAQ](http://www.prace-ri.eu/Users-General-FAQs)
- [Certificates
FAQ](http://www.prace-ri.eu/Certificates-FAQ)
- [Interactive access using
GSISSH](http://www.prace-ri.eu/Interactive-Access-Using-gsissh)
- [Data transfer with
GridFTP](http://www.prace-ri.eu/Data-Transfer-with-GridFTP-Details)
- [Data transfer with
gtransfer](http://www.prace-ri.eu/Data-Transfer-with-gtransfer)
Before you start using any of the services, don't forget to create a proxy certificate from your certificate:
$ grid-proxy-init
To check whether your proxy certificate is still valid (by default it is valid for 12 hours), use:
$ grid-proxy-info
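In scripts, it may be convenient to renew the proxy automatically shortly before it expires. A minimal sketch using the -timeleft option of grid-proxy-info, which prints the remaining validity in seconds:

```
# renew the proxy certificate when less than one hour of validity remains
if [ "$(grid-proxy-info -timeleft)" -lt 3600 ]; then
    grid-proxy-init
fi
```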
To access the Anselm cluster, two login nodes running the GSI SSH service are available. The service is available from the public Internet as well as from the internal PRACE network (accessible only from other PRACE partners).
**Access from PRACE network:**
It is recommended to use the single DNS name anselm-prace.it4i.cz, which is distributed between the two login nodes. If needed, users can log in directly to one of the login nodes. The addresses are:

| Login address               | Port | Protocol | Login node       |
| --------------------------- | ---- | -------- | ---------------- |
| anselm-prace.it4i.cz        | 2222 | gsissh   | login1 or login2 |
| login1-prace.anselm.it4i.cz | 2222 | gsissh   | login1           |
| login2-prace.anselm.it4i.cz | 2222 | gsissh   | login2           |
$ gsissh -p 2222 anselm-prace.it4i.cz
When logging in from another PRACE system, the prace_service script can be used:
$ gsissh `prace_service -i -s anselm`
**Access from public Internet:**
It is recommended to use the single DNS name anselm.it4i.cz, which is distributed between the two login nodes. If needed, users can log in directly to one of the login nodes. The addresses are:

| Login address         | Port | Protocol | Login node       |
| --------------------- | ---- | -------- | ---------------- |
| anselm.it4i.cz        | 2222 | gsissh   | login1 or login2 |
| login1.anselm.it4i.cz | 2222 | gsissh   | login1           |
| login2.anselm.it4i.cz | 2222 | gsissh   | login2           |
$ gsissh -p 2222 anselm.it4i.cz
When logging in from another PRACE system, the prace_service script can be used:
$ gsissh `prace_service -e -s anselm`
Although the preferred and recommended file transfer mechanism is [using GridFTP](prace.html#file-transfers), the GSI SSH implementation on Anselm also supports SCP, so gsiscp can be used for small file transfers:
$ gsiscp -P 2222 _LOCAL_PATH_TO_YOUR_FILE_ anselm.it4i.cz:_ANSELM_PATH_TO_YOUR_FILE_
$ gsiscp -P 2222 anselm.it4i.cz:_ANSELM_PATH_TO_YOUR_FILE_ _LOCAL_PATH_TO_YOUR_FILE_
$ gsiscp -P 2222 _LOCAL_PATH_TO_YOUR_FILE_ anselm-prace.it4i.cz:_ANSELM_PATH_TO_YOUR_FILE_
$ gsiscp -P 2222 anselm-prace.it4i.cz:_ANSELM_PATH_TO_YOUR_FILE_ _LOCAL_PATH_TO_YOUR_FILE_
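For illustration, a hypothetical transfer of a local archive results.tar.gz into a PRACE user's home directory (the file name is made up; substitute your own account):

```
$ gsiscp -P 2222 results.tar.gz anselm.it4i.cz:/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/results.tar.gz
```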
### Access to X11 applications (VNC)
If the user needs to run X11-based graphical applications and does not have an X11 server, the applications can be run using the VNC service. If the user is using regular SSH-based access, please see the [section in the general documentation](https://docs.it4i.cz/anselm-cluster-documentation/resolveuid/11e53ad0d2fd4c5187537f4baeedff33).
If the user uses GSI SSH-based access, then the procedure is similar to the SSH-based access ([look here](https://docs.it4i.cz/anselm-cluster-documentation/resolveuid/11e53ad0d2fd4c5187537f4baeedff33)), only the port forwarding must be done using GSI SSH:
$ gsissh -p 2222 anselm.it4i.cz -L 5961:localhost:5961
### Access with SSH
After successfully obtaining login credentials for a local IT4Innovations account, PRACE users can access the cluster as regular users using SSH. For more information, please see the [section in the general documentation](https://docs.it4i.cz/anselm-cluster-documentation/resolveuid/5d3d6f3d873a42e584cbf4365c4e251b).
File transfers
------------------
PRACE users can use the same transfer mechanisms as regular users (if they've undergone the full registration procedure). For information about this, please see [the section in the general documentation](https://docs.it4i.cz/anselm-cluster-documentation/resolveuid/5d3d6f3d873a42e584cbf4365c4e251b).
Apart from the standard mechanisms, a GridFTP server running the Globus Toolkit GridFTP service is available to PRACE users for transferring data to/from the Anselm cluster. The service is available from the public Internet as well as from the internal PRACE network (accessible only from other PRACE partners).
There is one control server and three backend servers for striping and/or backup in case one of them fails.
**Access from PRACE network:**
| Login address                | Port | Node role                   |
| ---------------------------- | ---- | --------------------------- |
| gridftp-prace.anselm.it4i.cz | 2812 | Front end / control server  |
| login1-prace.anselm.it4i.cz  | 2813 | Backend / data mover server |
| login2-prace.anselm.it4i.cz  | 2813 | Backend / data mover server |
| dm1-prace.anselm.it4i.cz     | 2813 | Backend / data mover server |
Copy files **to** Anselm by running the following commands on your local
machine:
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://gridftp-prace.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
Or by using the prace_service script:
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://`prace_service -i -f anselm`/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
Copy files **from** Anselm:
$ globus-url-copy gsiftp://gridftp-prace.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
Or by using the prace_service script:
$ globus-url-copy gsiftp://`prace_service -i -f anselm`/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
**Access from public Internet:**
| Login address          | Port | Node role                   |
| ---------------------- | ---- | --------------------------- |
| gridftp.anselm.it4i.cz | 2812 | Front end / control server  |
| login1.anselm.it4i.cz  | 2813 | Backend / data mover server |
| login2.anselm.it4i.cz  | 2813 | Backend / data mover server |
| dm1.anselm.it4i.cz     | 2813 | Backend / data mover server |
Copy files **to** Anselm by running the following commands on your local
machine:
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://gridftp.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
Or by using the prace_service script:
$ globus-url-copy file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://`prace_service -e -f anselm`/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
Copy files **from** Anselm:
$ globus-url-copy gsiftp://gridftp.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
Or by using the prace_service script:
$ globus-url-copy gsiftp://`prace_service -e -f anselm`/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_ file://_LOCAL_PATH_TO_YOUR_FILE_
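For large files, globus-url-copy can also open several parallel TCP streams within one transfer, which may improve throughput; a sketch using the -p option (number of parallel streams):

```
$ globus-url-copy -p 4 file://_LOCAL_PATH_TO_YOUR_FILE_ gsiftp://gridftp.anselm.it4i.cz:2812/home/prace/_YOUR_ACCOUNT_ON_ANSELM_/_PATH_TO_YOUR_FILE_
```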
Generally both shared file systems are available through GridFTP:
| File system mount point | Filesystem | Comment                                                        |
| ----------------------- | ---------- | -------------------------------------------------------------- |
| /home                   | Lustre     | Default HOME directories of users in format /home/prace/login/ |
| /scratch                | Lustre     | Shared SCRATCH mounted on the whole cluster                    |
More information about the shared file systems is available
[here](storage.html).
Usage of the cluster
--------------------
There are some limitations for PRACE users when using the cluster. By default, PRACE users aren't allowed to access the special PBS Pro queues that provide high priority or exclusive access to special equipment such as accelerated nodes and high-memory (fat) nodes. There may also be restrictions on obtaining a working license for the commercial software installed on the cluster, mostly because of license agreements or an insufficient number of licenses.
For production runs always use scratch file systems, either the global
shared or the local ones. The available file systems are described
[here](hardware-overview.html).
### Software, Modules and PRACE Common Production Environment
All software installed system-wide on the cluster is made available to users via modules. Information about the environment and modules usage is in this [section of the general documentation](environment-and-modules.html).
PRACE users can load the "prace" module to set up the [PRACE Common Production Environment](http://www.prace-ri.eu/PRACE-common-production).
$ module load prace
### Resource Allocation and Job Execution
General information about the resource allocation, job queuing and job
execution is in this [section of general
documentation](resource-allocation-and-job-execution/introduction.html).
For PRACE users, the default production run queue is "qprace". PRACE users can also use two other queues, "qexp" and "qfree".
| queue                         | Active project | Project resources | Nodes               | priority | authorization | walltime default/max |
| ----------------------------- | -------------- | ----------------- | ------------------- | -------- | ------------- | -------------------- |
| **qexp** Express queue        | no             | none required     | 2 reserved, 8 total | high     | no            | 1 / 1h               |
| **qprace** Production queue   | yes            | > 0               | 178 w/o accelerator | medium   | no            | 24 / 48h             |
| **qfree** Free resource queue | yes            | none required     | 178 w/o accelerator | very low | no            | 12 / 12h             |
**qprace**, the PRACE Production queue: This queue is intended for normal production runs. It is required that an active project with nonzero remaining resources is specified to enter qprace. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qprace is 48 hours; if a job needs more time, it must use checkpoint/restart functionality.
### Accounting & Quota
The resources that are currently subject to accounting are the core hours. Core hours are accounted on a wall-clock basis. Accounting runs whenever the computational cores are allocated or blocked via the PBS Pro workload manager (the qsub command), regardless of whether the cores are actually used for any calculation. See the [example in the general documentation](resource-allocation-and-job-execution/resources-allocation-policy.html).
PRACE users should check their project accounting using the [PRACE Accounting Tool (DART)](http://www.prace-ri.eu/accounting-report-tool/).
Users who have undergone the full local registration procedure (including signing the IT4Innovations Acceptable Use Policy) and who have received a local password may check at any time how many core-hours they and their projects have consumed, using the command "it4ifree". Please note that you need to know your user password to use the command, and that the displayed core hours are "system core hours", which differ from PRACE "standardized core hours".
The **it4ifree** command is part of the it4i.portal.clients package, located here:
<https://pypi.python.org/pypi/it4i.portal.clients>
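The package may be installed from PyPI, for example into your user environment (assuming pip is available):

```
$ pip install --user it4i.portal.clients
```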
$ it4ifree
Password:
PID Total Used ...by me Free
-------- ------- ------ -------- -------
OPEN-0-0 1500000 400644 225265 1099356
DD-13-1 10000 2606 2606 7394
By default, a file system quota is applied. To check the current status of your quota, use:
$ lfs quota -u USER_LOGIN /home
$ lfs quota -u USER_LOGIN /scratch
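A minimal sketch checking both shared Lustre file systems in one go:

```
# report quota usage for the current user on both shared file systems
for fs in /home /scratch; do
    lfs quota -u $USER $fs
done
```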
If the quota is insufficient, please contact the
[support](prace.html#help-and-support) and request an
increase.
Remote visualization service
============================
Introduction
------------
The goal of this service is to provide the users a GPU accelerated use
of OpenGL applications, especially for pre- and post- processing work,
where not only the GPU performance is needed but also fast access to the
shared file systems of the cluster and a reasonable amount of RAM.
The service is based on integration of open source tools VirtualGL and
TurboVNC together with the cluster's job scheduler PBS Professional.
Currently two compute nodes are dedicated for this service with
following configuration for each node:
[**Visualization node configuration**](compute-nodes.html)

| Parameter        | Value                                 |
| ---------------- | ------------------------------------- |
| CPU              | 2x Intel Sandy Bridge E5-2670, 2.6GHz |
| Processor cores  | 16 (2x8 cores)                        |
| RAM              | 64 GB, min. 4 GB per core             |
| GPU              | NVIDIA Quadro 4000, 2GB RAM           |
| Local disk drive | yes - 500 GB                          |
| Compute network  | InfiniBand QDR                        |
Schematic overview
------------------
![rem_vis_scheme](scheme.png "rem_vis_scheme")
![rem_vis_legend](legend.png "rem_vis_legend")
How to use the service
----------------------
### Setup and start your own TurboVNC server.
TurboVNC is designed and implemented for cooperation with VirtualGL and is available for free for all major platforms. For more information and downloads, please refer to <http://sourceforge.net/projects/turbovnc/>.
**Always use TurboVNC on both sides** (server and client) and **don't mix TurboVNC with other VNC implementations** (TightVNC, TigerVNC, ...), as the VNC protocol implementations may differ slightly and diminish your user experience by introducing picture artifacts, etc.
The procedure is:
#### 1. Connect to a login node. {#1-connect-to-a-login-node}
Please [follow the
documentation](https://docs.it4i.cz/anselm-cluster-documentation/resolveuid/5d3d6f3d873a42e584cbf4365c4e251b).
#### 2. Run your own instance of TurboVNC server. {#2-run-your-own-instance-of-turbovnc-server}
To have OpenGL acceleration, a **24 bit color depth must be used**. Otherwise only the geometry (desktop size) definition is needed.
*At the first VNC server run, you need to define a password.*
This example defines a desktop with dimensions of 1200x700 pixels and 24 bit color depth.
```
$ module load turbovnc/1.2.2
$ vncserver -geometry 1200x700 -depth 24
Desktop 'TurboVNClogin2:1 (username)' started on display login2:1
Starting applications specified in /home/username/.vnc/xstartup.turbovnc
Log file is /home/username/.vnc/login2:1.log
```
#### 3. Remember which display number your VNC server runs (you will need it in the future to stop the server). {#3-remember-which-display-number-your-vnc-server-runs-you-will-need-it-in-the-future-to-stop-the-server}
```
$ vncserver -list
TurboVNC server sessions
X DISPLAY # PROCESS ID
:1 23269
```
In this example the VNC server runs on display **:1**.
#### 4. Remember the exact login node, where your VNC server runs. {#4-remember-the-exact-login-node-where-your-vnc-server-runs}
```
$ uname -n
login2
```
In this example the VNC server runs on **login2**.
#### 5. Remember on which TCP port your own VNC server is running. {#5-remember-on-which-tcp-port-your-own-vnc-server-is-running}
To get the port, you have to look into the log file of your VNC server.
```
$ grep -E "VNC.*port" /home/username/.vnc/login2:1.log
20/02/2015 14:46:41 Listening for VNC connections on TCP port 5901
```
In this example the VNC server listens on TCP port **5901**.
#### 6. Connect to the login node where your VNC server runs with SSH to tunnel your VNC session. {#6-connect-to-the-login-node-where-your-vnc-server-runs-with-ssh-to-tunnel-your-vnc-session}
Tunnel the TCP port on which your VNC server is listening.
```
$ ssh login2.anselm.it4i.cz -L 5901:localhost:5901
```
*If you use Windows and Putty, please refer to the port forwarding setup in the documentation:*
[https://docs.it4i.cz/anselm-cluster-documentation/accessing-the-cluster/x-window-and-vnc#section-12](accessing-the-cluster/x-window-and-vnc.html#section-12)
#### 7. If you don't have TurboVNC installed on your workstation. {#7-if-you-don-t-have-turbo-vnc-installed-on-your-workstation}
Get it from <http://sourceforge.net/projects/turbovnc/>.
#### 8. Run TurboVNC Viewer from your workstation. {#8-run-turbovnc-viewer-from-your-workstation}
Mind that you should connect through the SSH tunneled port. In this
example it is 5901 on your workstation (localhost).
```
$ vncviewer localhost:5901
```
*If you use the Windows version of TurboVNC Viewer, just run the Viewer and use the address **localhost:5901**.*
#### 9. Proceed to the chapter "Access the visualization node." {#9-proceed-to-the-chapter-access-the-visualization-node}
*Now you should have a working TurboVNC session connected to your workstation.*
#### 10. After you end your visualization session. {#10-after-you-end-your-visualization-session}
*Don't forget to correctly shut down your own VNC server on the login node!*
```
$ vncserver -kill :1
```
Access the visualization node
-----------------------------
To access the node, use the dedicated PBS Professional scheduler queue **qviz**. The queue has the following properties:
| queue                        | active project | project resources | nodes | min ncpus* | priority | authorization | walltime default/max |
| ---------------------------- | -------------- | ----------------- | ----- | ---------- | -------- | ------------- | -------------------- |
| **qviz** Visualization queue | yes            | none required     | 2     | 4          | 150      | no            | 1 hour / 2 hours     |
Currently, when accessing the node, each user gets 4 cores of a CPU allocated, thus approximately 16 GB of RAM and 1/4 of the GPU capacity.
*If more GPU power or RAM is required, it is recommended to allocate one whole node per user, so that all 16 cores, the whole RAM and the whole GPU are exclusive. This is currently also the maximum allowed allocation per user. One hour of work is allocated by default; the user may ask for 2 hours maximum.*
To access the visualization node, follow these steps:
#### 1. In your VNC session, open a terminal and allocate a node using PBSPro qsub command. {#1-in-your-vnc-session-open-a-terminal-and-allocate-a-node-using-pbspro-qsub-command}
*This step is necessary to allow you to proceed with the next steps.*
```
$ qsub -I -q qviz -A PROJECT_ID
```
In this example the default values for CPU cores and usage time are
used.
```
$ qsub -I -q qviz -A PROJECT_ID -l select=1:ncpus=16 -l walltime=02:00:00
```
*Substitute **PROJECT_ID** with the assigned project identification
string.*
In this example a whole node for 2 hours is requested.
If there are free resources for your request, you will have a shell
running on an assigned node. Please remember the name of the node.
```
$ uname -n
srv8
```
In this example the visualization session was assigned to node **srv8**.
#### 2. In your VNC session open another terminal (keep the one with interactive PBSPro job open). {#2-in-your-vnc-session-open-another-terminal-keep-the-one-with-interactive-pbspro-job-open}
Set up the VirtualGL connection to the node which PBSPro allocated for your job.
```
$ vglconnect srv8
```
You will be connected through the newly created VirtualGL tunnel to the visualization node, where you will get a shell.
#### 3. Load the VirtualGL module. {#3-load-the-virtualgl-module}
```
$ module load virtualgl/2.4
```
#### 4. Run your desired OpenGL accelerated application using VirtualGL script "vglrun". {#4-run-your-desired-opengl-accelerated-application-using-virtualgl-script-vglrun}
```
$ vglrun glxgears
```
Please note that if you want to run an OpenGL application which is available through modules, you need to load the respective module first. E.g., to run the **Mentat** OpenGL application from the **MARC** software package, use:
```
$ module load marc/2013.1
$ vglrun mentat
```
#### 5. After you end your work with the OpenGL application. {#5-after-you-end-your-work-with-the-opengl-application}
Just log out from the visualization node, exit both open terminals, and end your VNC server session as described above.
Tips and Tricks
---------------
If you want to increase the responsiveness of the visualization, please adjust your TurboVNC client settings in this way:
![rem_vis_settings](turbovncclientsetting.png "rem_vis_settings")
To get an idea of how the settings affect the resulting picture quality, three levels of "JPEG image quality" are demonstrated:
1. JPEG image quality = 30
![rem_vis_q3](quality3.png "rem_vis_q3")
2. JPEG image quality = 15
![rem_vis_q2](quality2.png "rem_vis_q2")
3. JPEG image quality = 10
![rem_vis_q1](quality1.png "rem_vis_q1")
Resource Allocation and Job Execution
=====================================
To run a [job](introduction.html), [computational resources](introduction.html) for this particular job must be allocated. This is done via the PBS Pro job workload manager software, which efficiently distributes workloads across the supercomputer. Extensive information about PBS Pro can be found in the [official documentation here](../pbspro-documentation.html), especially in the [PBS Pro User's Guide](../pbspro-documentation/pbspro-users-guide.1).
Resources Allocation Policy
---------------------------
The resources are allocated to the job in a fairshare fashion, subject
to constraints set by the queue and resources available to the Project.
[The
Fairshare](resource-allocation-and-job-execution/job-priority.html)
at Anselm ensures that individual users may consume approximately equal
amount of resources per week. The resources are accessible via several
queues for queueing the jobs. The queues provide prioritized and
exclusive access to the computational resources. Following queues are
available to Anselm users:
- **qexp**, the Express queue
- **qprod**, the Production queue
- **qlong**, the Long queue
- **qnvidia, qmic, qfat**, the Dedicated queues
- **qfree**, the Free resource utilization queue
Check the queue status at <https://extranet.it4i.cz/anselm/>
Read more on the [Resource Allocation
Policy](resource-allocation-and-job-execution/resources-allocation-policy.html)
page.
Job submission and execution
----------------------------
Use the **qsub** command to submit your jobs.
The qsub command submits the job into the queue and creates a request to the PBS Job manager for the allocation of the specified resources. The **smallest allocation unit is an entire node, 16 cores**, with the exception of the qexp queue. The resources will be allocated when available, subject to allocation policies and constraints. **After the resources are allocated, the jobscript or interactive shell is executed on the first of the allocated nodes.**
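For instance, a jobscript may be submitted to the production queue, asking for two full nodes for four hours, like this (PROJECT_ID and the script name are placeholders):

```
$ qsub -A PROJECT_ID -q qprod -l select=2:ncpus=16,walltime=04:00:00 ./myjob
```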
Read more on the [Job submission and
execution](resource-allocation-and-job-execution/job-submission-and-execution.html)
page.
Capacity computing
------------------
Use Job arrays when running a huge number of jobs.
Use GNU Parallel and/or Job arrays when running (many) single core jobs.
In many cases, it is useful to submit a huge number (100+) of computational jobs into the PBS queue system. A huge number of (small) jobs is one of the most effective ways to execute embarrassingly parallel calculations, achieving the best runtime, throughput and computer utilization. In this chapter, we discuss the recommended way to run a huge number of jobs, including **ways to run a huge number of single core jobs**.
Read more on [Capacity
computing](resource-allocation-and-job-execution/capacity-computing.html)
page.
Capacity computing
==================
Introduction
------------
In many cases, it is useful to submit a huge number (100+) of computational jobs into the PBS queue system. A huge number of (small) jobs is one of the most effective ways to execute embarrassingly parallel calculations, achieving the best runtime, throughput and computer utilization.
However, executing a huge number of jobs via the PBS queue may strain the system. This strain may result in slow responses to commands, inefficient scheduling and overall degradation of performance and user experience for all users. For this reason, the number of jobs is **limited to 100 per user, 1000 per job array**.
Please follow one of the procedures below in case you wish to schedule more than 100 jobs at a time.
- Use [Job arrays](capacity-computing.html#job-arrays) when running a huge number of [multithread](capacity-computing.html#shared-jobscript-on-one-node) (bound to one node only) or multinode (multithread across several nodes) jobs
- Use [GNU parallel](capacity-computing.html#gnu-parallel) when running single core jobs
- Combine [GNU parallel with Job arrays](capacity-computing.html#combining-job-arrays-and-gnu-parallel) when running a huge number of single core jobs
Policy
------
1. A user is allowed to submit at most 100 jobs. Each job may be [a job
array](capacity-computing.html#job-arrays).
2. The array size is at most 1000 subjobs.
Job arrays
--------------
A huge number of jobs may be easily submitted and managed as a job array. A job array is a compact representation of many jobs, called subjobs. The subjobs share the same job script, and have the same values for all attributes and resources, with the following exceptions:
- each subjob has a unique index, $PBS_ARRAY_INDEX
- job Identifiers of subjobs only differ by their indices
- the state of subjobs can differ (R,Q,...etc.)
All subjobs within a job array have the same scheduling priority and schedule as independent jobs. The entire job array is submitted through a single qsub command and may be managed by the qdel, qalter, qhold, qrls and qsig commands as a single job.
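For example, an entire array may be held and later released just like a single job, using the array job ID format shown in the examples below:

```
$ qhold 12345[].dm2   # hold all queued subjobs of the array
$ qrls 12345[].dm2    # release them again
```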
### Shared jobscript
All subjobs in a job array use the very same, single jobscript. Each subjob runs its own instance of the jobscript. The instances execute different work controlled by the $PBS_ARRAY_INDEX variable.
Example:
Assume we have 900 input files with names beginning with "file" (e.g. file001, ..., file900). Assume we would like to use each of these input files with the program executable myprog.x, each as a separate job.
First, we create a tasklist file (or subjobs list), listing all tasks (subjobs) - all input files in our example:
```
$ find . -name 'file*' > tasklist
```
Then we create the jobscript:
```
#!/bin/bash
#PBS -A PROJECT_ID
#PBS -q qprod
#PBS -l select=1:ncpus=16,walltime=02:00:00
# change to local scratch directory
SCR=/lscratch/$PBS_JOBID
mkdir -p $SCR ; cd $SCR || exit
# get individual tasks from tasklist with index from PBS JOB ARRAY
TASK=$(sed -n "${PBS_ARRAY_INDEX}p" $PBS_O_WORKDIR/tasklist)
# copy input file and executable to scratch
cp $PBS_O_WORKDIR/$TASK input ; cp $PBS_O_WORKDIR/myprog.x .
# execute the calculation
./myprog.x < input > output
# copy output file to submit directory
cp output $PBS_O_WORKDIR/$TASK.out
```
In this example, the submit directory holds the 900 input files, the executable myprog.x and the jobscript file. As input for each run, we take the filename of an input file from the created tasklist file. We copy the input file to the local scratch /lscratch/$PBS_JOBID, execute myprog.x and copy the output file back to the submit directory, under the $TASK.out name. The myprog.x runs on one node only and must use threads to run in parallel. Be aware that if myprog.x **is not multithreaded**, then all the **jobs are run as single-thread programs in a sequential** manner. Due to the allocation of the whole node, the **accounted time is equal to the usage of the whole node**, while using only 1/16 of the node!
If a huge number of parallel multicore jobs (in the sense of multinode multithread, e.g. MPI-enabled jobs) needs to be run, then the job array approach should also be used. The main difference compared to the previous example using one node is that the local scratch must not be used (as it is not shared between nodes) and MPI or another technique for parallel multinode runs has to be used properly; a sketch of such a jobscript follows below.
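A minimal sketch of a multinode subjob script under these assumptions; mympiprog.x and the openmpi module name are hypothetical, and the shared /scratch file system replaces the node-local /lscratch:

```
#!/bin/bash
#PBS -A PROJECT_ID
#PBS -q qprod
#PBS -l select=4:ncpus=16,walltime=02:00:00
# work in the shared scratch, visible from all allocated nodes
SCR=/scratch/$USER/$PBS_JOBID
mkdir -p $SCR ; cd $SCR || exit
# get individual tasks from tasklist with index from PBS JOB ARRAY
TASK=$(sed -n "${PBS_ARRAY_INDEX}p" $PBS_O_WORKDIR/tasklist)
# copy input file and executable to the shared scratch
cp $PBS_O_WORKDIR/$TASK input ; cp $PBS_O_WORKDIR/mympiprog.x .
# run the MPI program across all allocated nodes
# (module name is an assumption; use the MPI stack of your choice)
module load openmpi
mpirun ./mympiprog.x < input > output
# copy output file to submit directory
cp output $PBS_O_WORKDIR/$TASK.out
```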
### Submit the job array
To submit the job array, use the qsub -J command. The 900 jobs of the
[example above](capacity-computing.html#array_example) may
be submitted like this:
```
$ qsub -N JOBNAME -J 1-900 jobscript
12345[].dm2
```
In this example, we submit a job array of 900 subjobs. Each subjob will run on a full node and is assumed to take less than 2 hours (please note the #PBS directives at the beginning of the jobscript file, and don't forget to set your valid PROJECT_ID and desired queue).
Sometimes for testing purposes, you may need to submit a one-element array only. This is not allowed by PBSPro, but there's a workaround:
```
$ qsub -N JOBNAME -J 9-10:2 jobscript
```
This will only choose the lower index (9 in this example) for
submitting/running your job.
### Manage the job array
Check status of the job array by the qstat command.
```
$ qstat -a 12345[].dm2
dm2:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
12345[].dm2 user2 qprod xx 13516 1 16 -- 00:50 B 00:02
```
The status B means that some subjobs are already running.
Check status of the first 100 subjobs by the qstat command.
```
$ qstat -a 12345[1-100].dm2
dm2:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
12345[1].dm2 user2 qprod xx 13516 1 16 -- 00:50 R 00:02
12345[2].dm2 user2 qprod xx 13516 1 16 -- 00:50 R 00:02
12345[3].dm2 user2 qprod xx 13516 1 16 -- 00:50 R 00:01
12345[4].dm2 user2 qprod xx 13516 1 16 -- 00:50 Q --
. . . . . . . . . . .
, . . . . . . . . . .
12345[100].dm2 user2 qprod xx 13516 1 16 -- 00:50 Q --
```
Delete the entire job array. Running subjobs will be killed, queued subjobs will be deleted.
```
$ qdel 12345[].dm2
```
Deleting large job arrays may take a while.
Display status information for all user's jobs, job arrays, and subjobs.
```
$ qstat -u $USER -t
```
Display status information for all user's subjobs.
```
$ qstat -u $USER -tJ
```
Read more on job arrays in the [PBSPro Users
guide](../../pbspro-documentation.html).
GNU parallel
----------------
Use GNU parallel to run many single core tasks on one node.
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. GNU parallel is most useful for running single core jobs via the queue system on Anselm.
For more information and examples see the parallel man page:
```
$ module add parallel
$ man parallel
```
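As a quick illustration of how parallel consumes a task list (the `::::` syntax reads arguments from a file), the following prints each entry of the tasklist used in the example below, running one instance per available core:

```
$ parallel echo Processing task {} :::: tasklist
```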
### GNU parallel jobscript
The GNU parallel shell executes multiple instances of the jobscript
using all cores on the node. The instances execute different work,
controlled by the $PARALLEL_SEQ variable.
Example:
Assume we have 101 input files with names beginning with "file" (e.g. file001, ..., file101). Assume we would like to use each of these input files with the program executable myprog.x, each as a separate single core job. We call these single core jobs tasks.
First, we create a tasklist file, listing all tasks - all input files in our example:
```
$ find . -name 'file*' > tasklist
```
Then we create the jobscript:
```
#!/bin/bash
#PBS -A PROJECT_ID
#PBS -q qprod
#PBS -l select=1:ncpus=16,walltime=02:00:00
[ -z "$PARALLEL_SEQ" ] &&
{ module add parallel ; exec parallel -a $PBS_O_WORKDIR/tasklist $0 ; }
# change to local scratch directory
SCR=/lscratch/$PBS_JOBID/$PARALLEL_SEQ
mkdir -p $SCR ; cd $SCR || exit
# get individual task from tasklist
TASK=$1
# copy input file and executable to scratch
cp $PBS_O_WORKDIR/$TASK input
# execute the calculation (cat stands in for the real myprog.x here)
cat input > output
# copy output file to submit directory
cp output $PBS_O_WORKDIR/$TASK.out
```
In this example, tasks from the tasklist are executed via GNU parallel. The jobscript executes multiple instances of itself in parallel, on all cores of the node. Once an instance of the jobscript finishes, a new instance starts, until all entries in the tasklist are processed. The currently processed entry of the tasklist may be retrieved via the $1 variable. The variable $TASK expands to one of the input filenames from the tasklist. We copy the input file to the local scratch, execute myprog.x (represented by cat in the example jobscript) and copy the output file back to the submit directory, under the $TASK.out name.
### Submit the job
To submit the job, use the qsub command. The 101-task job of the [example above](capacity-computing.html#gp_example) may be submitted like this:
```
$ qsub -N JOBNAME jobscript
12345.dm2
```
In this example, we submit a job of 101 tasks. 16 input files will be processed in parallel. The 101 tasks on 16 cores are assumed to complete in less than 2 hours.
Please note the #PBS directives at the beginning of the jobscript file, and don't forget to set your valid PROJECT_ID and desired queue.
Job arrays and GNU parallel
-------------------------------
Combine the job arrays and GNU parallel for the best throughput of single core jobs.
While job arrays are able to utilize all available computational nodes, GNU parallel can be used to efficiently run multiple single-core jobs on a single node. The two approaches may be combined to utilize all available (current and future) resources to execute single core jobs. Every subjob in an array runs GNU parallel to utilize all cores on the node.
### GNU parallel, shared jobscript
Combined approach, very similar to job arrays, can be taken. Job array
is submitted to the queuing system. The subjobs run GNU parallel. The
GNU parallel shell executes multiple instances of the jobscript using
all cores on the node. The instances execute different work, controlled
by the $PBS_JOB_ARRAY and $PARALLEL_SEQ variables.
Example:
Assume we have 992 input files with names beginning with "file" (e.g. file001, ..., file992). Assume we would like to use each of these input files with the program executable myprog.x, each as a separate single core job. We call these single core jobs tasks.
First, we create a tasklist file, listing all tasks - all input files in our example:
```
$ find . -name 'file*' > tasklist
```
Next we create a file controlling how many tasks will be executed in one subjob:
```
$ seq 32 > numtasks
```
Then we create the jobscript:
```
#!/bin/bash
#PBS -A PROJECT_ID
#PBS -q qprod
#PBS -l select=1:ncpus=16,walltime=02:00:00
[ -z "$PARALLEL_SEQ" ] &&
{ module add parallel ; exec parallel -a $PBS_O_WORKDIR/numtasks $0 ; }
# change to local scratch directory
SCR=/lscratch/$PBS_JOBID/$PARALLEL_SEQ
mkdir -p $SCR ; cd $SCR || exit
# get individual task from tasklist with index from PBS JOB ARRAY and index from parallel
IDX=$(($PBS_ARRAY_INDEX + $PARALLEL_SEQ - 1))
TASK=$(sed -n "${IDX}p" $PBS_O_WORKDIR/tasklist)
[ -z "$TASK" ] && exit
# copy input file and executable to scratch
cp $PBS_O_WORKDIR/$TASK input
# execute the calculation (cat stands in for the real myprog.x here)
cat input > output
# copy output file to submit directory
cp output $PBS_O_WORKDIR/$TASK.out
```
In this example, the jobscript executes in multiple instances in parallel, on all cores of a computing node. The variable $TASK expands to one of the input filenames from the tasklist. We copy the input file to the local scratch, execute myprog.x (represented by cat in the example) and copy the output file back to the submit directory, under the $TASK.out name. The numtasks file controls how many tasks will be run per subjob. Once a task is finished, a new task starts, until the number of tasks in the numtasks file is reached.
Select the subjob walltime and the number of tasks per subjob carefully.
When deciding on these values, think about the following guiding rules:
1. Let n=N/16, where N is the number of tasks per subjob, T is the expected single-task walltime and W is the subjob walltime. The inequality (n+1) * T < W should hold. For example, with N=32 tasks per subjob (n=2) and T=30 minutes, W should exceed (2+1) * 30 = 90 minutes. A short subjob walltime improves scheduling and job throughput.
2. The number of tasks should be a multiple of 16.
3. These rules are valid only when all tasks have similar task walltimes T.
### Submit the job array
To submit the job array, use the qsub -J command. The 992-task job of the [example above](capacity-computing.html#combined_example) may be submitted like this:
```
$ qsub -N JOBNAME -J 1-992:32 jobscript
12345[].dm2
```
In this example, we submit a job array of 31 subjobs. Note the -J 1-992:**32**; the step must be the same as the number of tasks in the numtasks file. Each subjob will run on a full node and process 16 input files in parallel, 32 in total per subjob. Every subjob is assumed to complete in less than 2 hours.
Please note the #PBS directives at the beginning of the jobscript file, and don't forget to set your valid PROJECT_ID and desired queue.
Examples
--------
Download the examples in [capacity.zip](capacity-computing-examples), illustrating the above listed ways to run a huge number of jobs. We recommend trying out the examples before using this approach for running production jobs.
Unzip the archive in an empty directory on Anselm and follow the instructions in the README file.
```
$ unzip capacity.zip
$ cat README
```