@@ -2,6 +2,7 @@ stages:
- test
- build
- deploy
- after_test
variables:
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
@@ -40,6 +41,15 @@ ext_links:
only:
- master
404s:
stage: after_test
image: davidhrbac/docker-mkdocscheck:latest
script:
- wget -V
- echo https://docs.it4i.cz/devel/$CI_BUILD_REF_NAME/
- wget --spider -e robots=off -o wget.log -r -p https://docs.it4i.cz/devel/$CI_BUILD_REF_NAME/
- awk '/^Found [0-9]+ broken links.$/,/FINISHED/ { rc=-1; print $0 }; END { exit rc }' wget.log
mkdocs:
stage: build
image: davidhrbac/docker-mkdocscheck:latest
@@ -59,9 +69,10 @@ mkdocs:
- bash scripts/add_version.sh
# get modules list from clusters
- bash scripts/get_modules.sh
#generate site_url
- sed "s/\(site_url.*$\)/\1devel\/$CI_BUILD_REF_NAME\//" mkdocs.yml | head
- (if [ "${CI_BUILD_REF_NAME}" != 'hrb3' ]; then sed -i "s/\(site_url.*$\)/\1devel\/$CI_BUILD_REF_NAME\//" mkdocs.yml;fi);
# generate site_url
- (if [ "${CI_BUILD_REF_NAME}" != 'master' ]; then sed -i "s/\(site_url.*$\)/\1devel\/$CI_BUILD_REF_NAME\//" mkdocs.yml;fi);
# generate URL for code link
- sed -i "s/master/$CI_BUILD_REF_NAME/g" material/partials/toc.html
# regenerate modules matrix
- python scripts/modules-matrix.py > docs.it4i/modules-matrix.md
- python scripts/modules-json.py > docs.it4i/modules-matrix.json
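For illustration, the `sed` substitution above captures the whole `site_url` line and appends the branch-specific path. A sketch with a hypothetical branch name and a simplified `mkdocs.yml` value:

```console
$ grep site_url mkdocs.yml
site_url: https://docs.it4i.cz/
$ sed "s/\(site_url.*$\)/\1devel\/mybranch\//" mkdocs.yml | grep site_url
site_url: https://docs.it4i.cz/devel/mybranch/
```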
@@ -9,13 +9,13 @@ However, executing a huge number of jobs via the PBS queue may strain the system
!!! note
Please follow one of the procedures below if you wish to schedule more than 100 jobs at a time.
* Use [Job arrays](/anselm/capacity-computing/#job-arrays) when running a huge number of [multithread](anselm/capacity-computing/#shared-jobscript-on-one-node) (bound to one node only) or multinode (multithread across several nodes) jobs
* Use [GNU parallel](/anselm/capacity-computing/#gnu-parallel) when running single core jobs
* Combine [GNU parallel with Job arrays](/anselm/capacity-computing/#job-arrays-and-gnu-parallel) when running huge number of single core jobs
* Use [Job arrays][1] when running a huge number of [multithread][2] (bound to one node only) or multinode (multithread across several nodes) jobs
* Use [GNU parallel][3] when running single-core jobs
* Combine [GNU parallel with Job arrays][4] when running a huge number of single-core jobs
## Policy
1. A user is allowed to submit at most 100 jobs. Each job may be [a job array](/anselm/capacity-computing/#job-arrays).
1. A user is allowed to submit at most 100 jobs. Each job may be [a job array][1].
1. The array size is at most 1000 subjobs.
## Job Arrays
@@ -76,7 +76,7 @@ If running a huge number of parallel multicore (in means of multinode multithrea
### Submit the Job Array
To submit the job array, use the qsub -J command. The 900 jobs of the [example above](/anselm/capacity-computing/#array_example) may be submitted like this:
To submit the job array, use the qsub -J command. The 900 jobs of the [example above][5] may be submitted like this:
```console
$ qsub -N JOBNAME -J 1-900 jobscript
```
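Inside the jobscript, each subjob can select its own work item via the PBS_ARRAY_INDEX variable. A minimal sketch, assuming a tasklist file with one input name per line (the file and program names are illustrative):

```bash
#!/bin/bash
# PBS Pro sets PBS_ARRAY_INDEX to this subjob's index (1..900 in the example above)
TASK=$(sed -n "${PBS_ARRAY_INDEX}p" "$PBS_O_WORKDIR/tasklist")
# run the (hypothetical) program on the selected input
./myprog.x < "$TASK" > "${TASK}.out"
```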
@@ -145,7 +145,7 @@ Display status information for all user's subjobs.
$ qstat -u $USER -tJ
```
Read more on job arrays in the [PBSPro Users guide](pbspro/).
Read more on job arrays in the [PBSPro Users guide][6].
## GNU Parallel
@@ -207,7 +207,7 @@ In this example, tasks from the tasklist are executed via the GNU parallel. The
### Submit the Job
To submit the job, use the qsub command. The 101 task job of the [example above](/anselm/capacity-computing/#gp_example) may be submitted as follows:
To submit the job, use the qsub command. The 101 task job of the [example above][7] may be submitted as follows:
```console
$ qsub -N JOBNAME jobscript
```
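For context, the jobscript referenced above typically feeds the task list to GNU parallel, one single-core task per core. A hedged sketch (the module and file names are illustrative):

```bash
#!/bin/bash
# load GNU parallel (the module name may differ)
module load parallel
# run the tasks, 16 concurrently (one per core on an Anselm node)
parallel -j 16 './myprog.x < {} > {}.out' < "$PBS_O_WORKDIR/tasklist"
```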
@@ -292,7 +292,7 @@ When deciding this values, keep in mind the following guiding rules:
### Submit the Job Array (-J)
To submit the job array, use the qsub -J command. The 992 task job of the [example above](/anselm/capacity-computing/#combined_example) may be submitted like this:
To submit the job array, use the qsub -J command. The 992 task job of the [example above][8] may be submitted like this:
```console
$ qsub -N JOBNAME -J 1-992:32 jobscript
```
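With the :32 step, PBS_ARRAY_INDEX takes the values 1, 33, 65, ..., 961, i.e. 31 subjobs, and each subjob processes its block of 32 consecutive tasks. A sketch (names illustrative):

```bash
# this subjob handles tasks PBS_ARRAY_INDEX .. PBS_ARRAY_INDEX+31 of the tasklist
sed -n "${PBS_ARRAY_INDEX},$((PBS_ARRAY_INDEX + 31))p" "$PBS_O_WORKDIR/tasklist" | \
  parallel -j 16 './myprog.x < {} > {}.out'
```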
@@ -306,7 +306,7 @@ In this example, we submit a job array of 31 subjobs. Note the -J 1-992:**32**,
## Examples
Download the examples in [capacity.zip](capacity.zip), illustrating the above listed ways to run a huge number of jobs. We recommend trying out the examples before using this for running production jobs.
Download the examples in [capacity.zip][9], illustrating the ways listed above to run a huge number of jobs. We recommend trying out the examples before using them for production runs.
Unzip the archive in an empty directory on Anselm and follow the instructions in the README file.
@@ -314,3 +314,13 @@ Unzip the archive in an empty directory on Anselm and follow the instructions in
```console
$ unzip capacity.zip
$ cat README
```
[1]: #job-arrays
[2]: #shared-jobscript-on-one-node
[3]: #gnu-parallel
[4]: #job-arrays-and-gnu-parallel
[5]: #array_example
[6]: ../pbspro.md
[7]: #gp_example
[8]: #combined_example
[9]: capacity.zip
@@ -2,7 +2,7 @@
## Node Configuration
Anselm is cluster of x86-64 Intel based nodes built with Bull Extreme Computing bullx technology. The cluster contains four types of compute nodes.
Anselm is a cluster of x86-64 Intel based nodes built with Bull Extreme Computing bullx technology. The cluster contains four types of compute nodes.
### Compute Nodes Without Accelerators
@@ -52,7 +52,7 @@ Anselm is cluster of x86-64 Intel based nodes built with Bull Extreme Computing
### Compute Node Summary
| Node type | Count | Range | Memory | Cores | [Access](/general/resources-allocation-policy/) |
| Node type | Count | Range | Memory | Cores | Queues |
| ---------------------------- | ----- | ----------- | ------ | ----------- | -------------------------------------- |
| Nodes without an accelerator | 180 | cn[1-180] | 64GB | 16 @ 2.4GHz | qexp, qprod, qlong, qfree, qprace, qatlas |
| Nodes with a GPU accelerator | 23 | cn[181-203] | 96GB | 16 @ 2.3GHz | qnvidia, qexp |
@@ -2,7 +2,7 @@
The Anselm cluster consists of 209 computational nodes named cn[1-209], of which 180 are regular compute nodes, 23 are GPU Kepler K20 accelerated nodes, 4 are MIC Xeon Phi 5110P accelerated nodes, and 2 are fat nodes. Each node is a powerful x86-64 computer, equipped with 16 cores (two eight-core Intel Sandy Bridge processors), at least 64 GB of RAM, and a local hard drive. User access to the Anselm cluster is provided by two login nodes login[1,2]. The nodes are interlinked through high-speed InfiniBand and Ethernet networks. All nodes share a 320 TB /home disk for storage of user files. The 146 TB shared /scratch storage is available for scratch data.
The Fat nodes are equipped with a large amount (512 GB) of memory. Virtualization infrastructure provides resources to run long term servers and services in virtual mode. Fat nodes and virtual servers may access 45 TB of dedicated block storage. Accelerated nodes, fat nodes, and virtualization infrastructure are available [upon request](https://support.it4i.cz/rt) from a PI.
The Fat nodes are equipped with a large amount (512 GB) of memory. Virtualization infrastructure provides resources to run long-term servers and services in virtual mode. Fat nodes and virtual servers may access 45 TB of dedicated block storage. Accelerated nodes, fat nodes, and virtualization infrastructure are available [upon request][a] from a PI.
Schematic representation of the Anselm cluster. Each box represents a node (computer) or storage capacity:
@@ -17,16 +17,16 @@ There are four types of compute nodes:
* 4 compute nodes with a MIC accelerator - an Intel Xeon Phi 5110P
* 2 fat nodes - equipped with 512 GB of RAM and two 100 GB SSD drives
[More about Compute nodes](/anselm/compute-nodes/).
[More about Compute nodes][1].
GPU and accelerated nodes are available upon request, see the [Resources Allocation Policy](/anselm/resources-allocation-policy/).
GPU and accelerated nodes are available upon request, see the [Resources Allocation Policy][2].
All of these nodes are interconnected through fast InfiniBand and Ethernet networks. [More about the Network](/anselm/network/).
All of these nodes are interconnected through fast InfiniBand and Ethernet networks. [More about the Network][3].
Every chassis provides an InfiniBand switch, marked **isw**, connecting all nodes in the chassis, as well as connecting the chassis to the upper level switches.
All of the nodes share a 360 TB /home disk for storage of user files. The 146 TB shared /scratch storage is available for scratch data. These file systems are provided by the Lustre parallel file system. There is also local disk storage available on all compute nodes in /lscratch. [More about Storage](/anselm/storage/).
All of the nodes share a 320 TB /home disk for storage of user files. The 146 TB shared /scratch storage is available for scratch data. These file systems are provided by the Lustre parallel file system. There is also local disk storage available on all compute nodes in /lscratch. [More about Storage][4].
User access to the Anselm cluster is provided by two login nodes login1, login2, and data mover node dm1. [More about accessing the cluster.](/anselm/shell-and-data-access/)
User access to the Anselm cluster is provided by two login nodes login1, login2, and data mover node dm1. [More about accessing the cluster][5].
The parameters are summarized in the following tables:
@@ -35,7 +35,7 @@ The parameters are summarized in the following tables:
| Primary purpose | High Performance Computing |
| Architecture of compute nodes | x86-64 |
| Operating system | Linux (CentOS) |
| [**Compute nodes**](/anselm/compute-nodes/) | |
| [**Compute nodes**][1] | |
| Total | 209 |
| Processor cores | 16 (2 x 8 cores) |
| RAM | min. 64 GB, min. 4 GB per core |
@@ -57,4 +57,12 @@ The parameters are summarized in the following tables:
| MIC accelerated | 2 x Intel Sandy Bridge E5-2470, 2.3 GHz | 96 GB | Intel Xeon Phi 5110P |
| Fat compute node | 2 x Intel Sandy Bridge E5-2665, 2.4 GHz | 512 GB | - |
For more details refer to [Compute nodes](/anselm/compute-nodes/), [Storage](anselm/storage/), and [Network](anselm/network/).
For more details refer to [Compute nodes][1], [Storage][4], and [Network][3].
[1]: compute-nodes.md
[2]: resources-allocation-policy.md
[3]: network.md
[4]: storage.md
[5]: shell-and-data-access.md
[a]: https://support.it4i.cz/rt
# Introduction
Welcome to Anselm supercomputer cluster. The Anselm cluster consists of 209 compute nodes, totalling 3344 compute cores with 15 TB RAM, giving over 94 TFLOP/s theoretical peak performance. Each node is a powerful x86-64 computer, equipped with 16 cores, at least 64 GB of RAM, and a 500 GB hard disk drive. Nodes are interconnected through a fully non-blocking fat-tree InfiniBand network, and are equipped with Intel Sandy Bridge processors. A few nodes are also equipped with NVIDIA Kepler GPU or Intel Xeon Phi MIC accelerators. Read more in [Hardware Overview](/anselm/hardware-overview/).
Welcome to the Anselm supercomputer cluster. The Anselm cluster consists of 209 compute nodes, totalling 3344 compute cores with 15 TB RAM, giving over 94 TFLOP/s theoretical peak performance. Each node is a powerful x86-64 computer, equipped with 16 cores, at least 64 GB of RAM, and a 500 GB hard disk drive. Nodes are interconnected through a fully non-blocking fat-tree InfiniBand network, and are equipped with Intel Sandy Bridge processors. A few nodes are also equipped with NVIDIA Kepler GPU or Intel Xeon Phi MIC accelerators. Read more in [Hardware Overview][1].
The cluster runs with an [operating system](/software/operating-system/) which is compatible with the RedHat [Linux family.](http://upload.wikimedia.org/wikipedia/commons/1/1b/Linux_Distribution_Timeline.svg) We have installed a wide range of software packages targeted at different scientific domains. These packages are accessible via the [modules environment](environment-and-modules/).
The cluster runs an operating system compatible with the RedHat [Linux family][a]. We have installed a wide range of software packages targeted at different scientific domains. These packages are accessible via the [modules environment][2].
The user data shared file-system (HOME, 320 TB) and job data shared file-system (SCRATCH, 146 TB) are available to users.
The PBS Professional workload manager provides [computing resources allocations and job execution](/anselm/resources-allocation-policy/).
The PBS Professional workload manager provides [computing resources allocations and job execution][3].
Read more on how to [apply for resources](/general/applying-for-resources/), [obtain login credentials](general/obtaining-login-credentials/obtaining-login-credentials/) and [access the cluster](/anselm/shell-and-data-access/).
Read more on how to [apply for resources][4], [obtain login credentials][5] and [access the cluster][6].
[1]: hardware-overview.md
[2]: ../environment-and-modules.md
[3]: resources-allocation-policy.md
[4]: ../general/applying-for-resources.md
[5]: ../general/obtaining-login-credentials/obtaining-login-credentials.md
[6]: shell-and-data-access.md
[a]: http://upload.wikimedia.org/wikipedia/commons/1/1b/Linux_Distribution_Timeline.svg
@@ -16,7 +16,7 @@ Queue priority is the priority of the queue in which the job is waiting prior to
Queue priority has the biggest impact on job execution priority. The execution priority of jobs in higher priority queues is always greater than the execution priority of jobs in lower priority queues. Other properties of jobs used for determining the job execution priority (fair-share priority, eligible time) cannot compete with queue priority.
Queue priorities can be seen at [https://extranet.it4i.cz/anselm/queues](https://extranet.it4i.cz/anselm/queues)
Queue priorities can be seen [here][a].
### Fair-Share Priority
@@ -36,7 +36,7 @@ Usage counts allocated core-hours (`ncpus x walltime`). Usage decays, halving at
Jobs queued in the queue qexp are not used to calculate the project's usage.
!!! note
Calculated usage and fair-share priority can be seen at [https://extranet.it4i.cz/anselm/projects](https://extranet.it4i.cz/anselm/projects).
Calculated usage and fair-share priority can be seen [here][b].
The calculated fair-share priority can also be seen in the Resource_List.fairshare attribute of a job.
@@ -70,3 +70,6 @@ This means that jobs with lower execution priority can be run before jobs with h
Specifying a more accurate walltime enables better scheduling, better execution times, and better resource usage. Jobs with a suitable (small) walltime can be backfilled and may overtake jobs with a higher priority.
---8<--- "mathjax.md"
[a]: https://extranet.it4i.cz/rsweb/anselm/queues
[b]: https://extranet.it4i.cz/rsweb/anselm/projects
@@ -51,7 +51,7 @@ $ qsub -A OPEN-0-0 -q qfree -l select=10:ncpus=16 ./myjob
In this example, we allocate 10 nodes, 16 cores per node, for 12 hours. We allocate these resources via the qfree queue. It is not required that the project OPEN-0-0 has any available resources left. Consumed resources are still accounted for. The jobscript myjob will be executed on the first node in the allocation.
All qsub options may be [saved directly into the jobscript](#example-jobscript-for-mpi-calculation-with-preloaded-inputs). In such cases, it is not necessary to specify any options for qsub.
All qsub options may be [saved directly into the jobscript][1]. In such cases, it is not necessary to specify any options for qsub.
```console
$ qsub ./myjob
```
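Saved qsub options take the form of #PBS directives at the top of the jobscript. A sketch (the project name and resource values are illustrative):

```bash
#!/bin/bash
#PBS -A OPEN-0-0
#PBS -q qprod
#PBS -l select=10:ncpus=16,walltime=12:00:00
# job commands follow
cd "$PBS_O_WORKDIR"
./myprog.x
```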
@@ -92,9 +92,9 @@ In this example, we allocate 4 nodes, 16 cores per node, selecting only the node
### Placement by IB Switch
Groups of computational nodes are connected to chassis integrated Infiniband switches. These switches form the leaf switch layer of the [Infiniband network](/anselm/network/) fat tree topology. Nodes sharing the leaf switch can communicate most efficiently. Sharing the same switch prevents hops in the network and facilitates unbiased, highly efficient network communication.
Groups of computational nodes are connected to chassis-integrated InfiniBand switches. These switches form the leaf-switch layer of the [InfiniBand network][2] fat-tree topology. Nodes sharing the leaf switch can communicate most efficiently. Sharing the same switch prevents hops in the network and facilitates unbiased, highly efficient network communication.
Nodes sharing the same switch may be selected via the PBS resource attribute ibswitch. Values of this attribute are iswXX, where XX is the switch number. The node-switch mapping can be seen in the [Hardware Overview](/anselm/hardware-overview/) section.
Nodes sharing the same switch may be selected via the PBS resource attribute ibswitch. Values of this attribute are iswXX, where XX is the switch number. The node-switch mapping can be seen in the [Hardware Overview][3] section.
We recommend allocating compute nodes to a single switch when best possible computational network performance is required to run the job efficiently:
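For illustration, nodes sharing one leaf switch might be requested like this (a sketch; the node count, switch name, and project ID are hypothetical):

```console
$ qsub -A PROJECT_ID -q qprod -l select=18:ncpus=16:ibswitch=isw10 ./myjob
```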
@@ -339,7 +339,7 @@ exit
In this example, a directory in /home holds the input file input and the executable mympiprog.x. We create the directory myjob on the /scratch filesystem, copy the input and executable files from the /home directory where the qsub was invoked ($PBS_O_WORKDIR) to /scratch, execute the MPI program mympiprog.x, and copy the output file back to the /home directory. mympiprog.x is executed as one process per node, on all allocated nodes.
!!! note
Consider preloading inputs and executables onto [shared scratch](storage/) memory before the calculation starts.
Consider preloading inputs and executables onto [shared scratch][4] storage before the calculation starts.
In some cases, it may be impractical to copy the inputs to scratch storage and the outputs to the home directory. This is especially true when very large input and output files are expected, or when the files should be reused by a subsequent calculation. In such cases, it is the users' responsibility to preload the input files on shared /scratch storage before job submission, and to retrieve the outputs manually after all calculations are finished.
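A minimal sketch of such manual preloading (the paths are illustrative):

```console
$ mkdir -p /scratch/$USER/myjob
$ cp input mympiprog.x /scratch/$USER/myjob/
$ qsub ./myjob
```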
@@ -373,15 +373,14 @@ exit
In this example, input and executable files are assumed to be preloaded manually in the /scratch/$USER/myjob directory. Note the **mpiprocs** and **ompthreads** qsub options controlling the behavior of the MPI execution. mympiprog.x is executed as one process per node, on all 100 allocated nodes. If mympiprog.x implements OpenMP threads, it will run 16 threads per node.
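Such a hybrid run might be submitted as follows (a sketch; the values mirror the example above, the project ID is hypothetical):

```console
$ qsub -A PROJECT_ID -q qprod -l select=100:ncpus=16:mpiprocs=1:ompthreads=16 ./myjob
```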
More information can be found in the [Running OpenMPI](/software/mpi/Running_OpenMPI/) and [Running MPICH2](software/mpi/running-mpich2/)
sections.
More information can be found in the [Running OpenMPI][5] and [Running MPICH2][6] sections.
### Example Jobscript for Single Node Calculation
!!! note
The local scratch directory is often useful for single-node jobs. Local scratch data is deleted immediately after the job ends.
Example jobscript for single node calculation, using [local scratch](/anselm/storage/) memory on the node:
Example jobscript for a single node calculation, using the [local scratch][4] directory on the node:
```bash
#!/bin/bash
```
@@ -407,4 +406,12 @@ In this example, a directory in /home holds the input file input and executable
### Other Jobscript Examples
Further jobscript examples may be found in the software section and the [Capacity computing](/anselm/capacity-computing/) section.
Further jobscript examples may be found in the software section and the [Capacity computing][7] section.
[1]: #example-jobscript-for-mpi-calculation-with-preloaded-inputs
[2]: network.md
[3]: hardware-overview.md
[4]: storage.md
[5]: ../software/mpi/running_openmpi.md
[6]: ../software/mpi/running-mpich2.md
[7]: capacity-computing.md
# Network
All of the compute and login nodes of Anselm are interconnected through an [InfiniBand](http://en.wikipedia.org/wiki/InfiniBand) QDR network and a Gigabit [Ethernet](http://en.wikipedia.org/wiki/Ethernet) network. Both networks may be used to transfer user data.
All of the compute and login nodes of Anselm are interconnected through an [InfiniBand][a] QDR network and a Gigabit [Ethernet][b] network. Both networks may be used to transfer user data.
## InfiniBand Network
All of the compute and login nodes of Anselm are interconnected through a high-bandwidth, low-latency [InfiniBand](http://en.wikipedia.org/wiki/InfiniBand) QDR network (IB 4 x QDR, 40 Gbps). The network topology is a fully non-blocking fat-tree.
All of the compute and login nodes of Anselm are interconnected through a high-bandwidth, low-latency [InfiniBand][a] QDR network (IB 4 x QDR, 40 Gbps). The network topology is a fully non-blocking fat-tree.
The compute nodes may be accessed via the InfiniBand network using the ib0 network interface, in the address range 10.2.1.1-209. MPI may be used to establish a native InfiniBand connection among the nodes.
@@ -19,6 +19,8 @@ The compute nodes may be accessed via the regular Gigabit Ethernet network inter
## Example
In this example, we access the node cn110 through the InfiniBand network via the ib0 interface, then from cn110 to cn108 through the Ethernet network.
```console
$ qsub -q qexp -l select=4:ncpus=16 -N Name0 ./myjob
$ qstat -n -u username
```
@@ -32,4 +34,5 @@ $ ssh 10.2.1.110
```console
$ ssh 10.1.1.108
```
In this example, we access the node cn110 through the InfiniBand network via the ib0 interface, then from cn110 to cn108 through the Ethernet network.
[a]: http://en.wikipedia.org/wiki/InfiniBand
[b]: http://en.wikipedia.org/wiki/Ethernet
@@ -2,10 +2,10 @@
## Job Queue Policies
The resources are allocated to the job in a fair-share fashion, subject to constraints set by the queue and the resources available to the Project. The Fair-share system of Anselm ensures that individual users may consume approximately equal amounts of resources per week. Detailed information can be found in the [Job scheduling](/anselm/job-priority/) section. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following table provides the queue partitioning overview:
The resources are allocated to the job in a fair-share fashion, subject to constraints set by the queue and the resources available to the Project. The Fair-share system of Anselm ensures that individual users may consume approximately equal amounts of resources per week. Detailed information can be found in the [Job scheduling][1] section. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following table provides the queue partitioning overview:
!!! note
Check the queue status at <https://extranet.it4i.cz/anselm/>
Check the queue status at <https://extranet.it4i.cz/rsweb/anselm/>
| queue | active project | project resources | nodes | min ncpus | priority | authorization | walltime |
| ------------------- | -------------- | -------------------- | ---------------------------------------------------- | --------- | -------- | ------------- | -------- |
@@ -17,28 +17,28 @@ The resources are allocated to the job in a fair-share fashion, subject to const
| qfree | yes | < 120% of allocation | 180 w/o accelerator | 16 | -1024 | no | 12 h |
!!! note
**The qfree queue is not free of charge**. [Normal accounting](#resources-accounting-policy) applies. However, it allows for utilization of free resources, once a project has exhausted all its allocated computational resources. This does not apply to Director's Discretion projects (DD projects) by default. Usage of qfree after exhaustion of DD projects' computational resources is allowed after request for this queue.
**The qfree queue is not free of charge**. [Normal accounting][2] applies. However, it allows for utilization of free resources, once a project has exhausted all its allocated computational resources. This does not apply to Director's Discretion projects (DD projects) by default. Usage of qfree after exhaustion of DD projects' computational resources is allowed after request for this queue.
**The qexp queue is equipped with nodes which do not have exactly the same CPU clock speed.** Should you need the nodes to have exactly the same CPU speed, you have to select the proper nodes during the PBS job submission.
* **qexp**, the Express queue: This queue is dedicated to testing and running very small jobs. It is not required to specify a project to enter the qexp. There are always 2 nodes reserved for this queue (w/o accelerators); a maximum of 8 nodes are available via the qexp for a particular user, from a pool of nodes containing Nvidia accelerated nodes (cn181-203), MIC accelerated nodes (cn204-207), and Fat nodes with 512 GB of RAM (cn208-209). This enables us to test and tune accelerated code and code with higher RAM requirements. The nodes may be allocated on a per-core basis. No special authorization is required to use qexp. The maximum runtime in qexp is 1 hour.
* **qprod**, the Production queue: This queue is intended for normal production runs. It is required that an active project with nonzero remaining resources is specified to enter the qprod. All nodes may be accessed via the qprod queue, except the reserved ones. 178 nodes without accelerators are included. Full nodes, 16 cores per node, are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qprod is 48 hours.
* **qlong**, the Long queue: This queue is intended for long production runs. It is required that an active project with nonzero remaining resources is specified to enter the qlong. Only 60 nodes without acceleration may be accessed via the qlong queue. Full nodes, 16 cores per node, are allocated. The queue runs with medium priority and no special authorization is required to use it. The maximum runtime in qlong is 144 hours (three times that of the standard qprod time - 3 x 48 h).
* **qnvidia**, qmic, qfat, the Dedicated queues: The queue qnvidia is dedicated to accessing the Nvidia accelerated nodes, the qmic to accessing MIC nodes and qfat the Fat nodes. It is required that an active project with nonzero remaining resources is specified to enter these queues. 23 nvidia, 4 mic, and 2 fat nodes are included. Full nodes, 16 cores per node, are allocated. The queues run with very high priority, the jobs will be scheduled before the jobs coming from the qexp queue. An PI needs to explicitly ask [support](https://support.it4i.cz/rt/) for authorization to enter the dedicated queues for all users associated with her/his project.
* **qnvidia**, qmic, qfat, the Dedicated queues: The queue qnvidia is dedicated to accessing the Nvidia accelerated nodes, the qmic to accessing MIC nodes, and qfat the Fat nodes. It is required that an active project with nonzero remaining resources is specified to enter these queues. 23 nvidia, 4 mic, and 2 fat nodes are included. Full nodes, 16 cores per node, are allocated. The queues run with very high priority; the jobs will be scheduled ahead of jobs coming from the qexp queue. A PI needs to explicitly ask [support][a] for authorization to enter the dedicated queues for all users associated with her/his project.
* **qfree**, the Free resource queue: The queue qfree is intended for utilization of free resources, after a project has exhausted all of its allocated computational resources (this does not apply to DD projects by default; DD projects have to request permission to use qfree after exhaustion of their computational resources). It is required that an active project is specified to enter the queue. Consumed resources will be accounted to the Project. Access to the qfree queue is automatically removed if consumed resources exceed 120% of the resources allocated to the Project. Only 180 nodes without accelerators may be accessed from this queue. Full nodes, 16 cores per node, are allocated. The queue runs with very low priority and no special authorization is required to use it. The maximum runtime in qfree is 12 hours.
## Queue Notes
The job wall clock time defaults to **half the maximum time**, see the table above. Longer wall time limits can be [set manually, see examples](/anselm/job-submission-and-execution/).
The job wall clock time defaults to **half the maximum time**; see the table above. Longer wall time limits can be [set manually, see examples][3].
Jobs that exceed the reserved wall clock time (Req'd Time) get killed automatically. The wall clock time limit can be changed for queuing jobs (state Q) using the qalter command; however, it cannot be changed for a running job (state R).
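For illustration (a sketch; the job ID and values are hypothetical), a longer walltime can be requested at submission and then shortened while the job is still queued:

```console
$ qsub -A PROJECT_ID -q qprod -l select=4:ncpus=16,walltime=24:00:00 ./myjob
$ qalter -l walltime=12:00:00 JOBID
```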
Anselm users may check the current queue configuration at [https://extranet.it4i.cz/anselm/queues](https://extranet.it4i.cz/anselm/queues).
Anselm users may check the current queue configuration [here][b].
## Queue Status
!!! tip
Check the status of jobs, queues and compute nodes at [https://extranet.it4i.cz/anselm/](https://extranet.it4i.cz/anselm/)
Check the status of jobs, queues and compute nodes [here][c].
![rspbs web interface](../img/rsweb.png)
@@ -109,3 +109,11 @@ Options:
---8<--- "resource_accounting.md"
---8<--- "mathjax.md"
[1]: job-priority.md
[2]: #resources-accounting-policy
[3]: job-submission-and-execution.md
[a]: https://support.it4i.cz/rt/
[b]: https://extranet.it4i.cz/rsweb/anselm/queues
[c]: https://extranet.it4i.cz/rsweb/anselm/
@@ -10,7 +10,7 @@ The Anselm cluster is accessed by SSH protocol via login nodes login1 and login2
| login1.anselm.it4i.cz | 22 | ssh | login1 |
| login2.anselm.it4i.cz | 22 | ssh | login2 |
Authentication is by [private key](../../general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys/)
Authentication is by [private key][1] only.
!!! note
Please verify SSH fingerprints during the first logon. They are identical on all login nodes:
@@ -39,7 +39,7 @@ If you see a warning message "UNPROTECTED PRIVATE KEY FILE!", use this command t
```console
$ chmod 600 /path/to/id_rsa
```
On **Windows**, use [PuTTY ssh client](../general/accessing-the-clusters/shell-access-and-data-transfer/putty.md).
On **Windows**, use [PuTTY ssh client][2].
After logging in, you will see the command prompt:
@@ -61,11 +61,11 @@ Last login: Tue Jul 9 15:57:38 2013 from your-host.example.com
An example of logging in to the cluster:
!!! note
The environment is **not** shared between login nodes, except for [shared filesystems](storage/#shared-filesystems).
The environment is **not** shared between login nodes, except for [shared filesystems][3].
## Data Transfer
Data in and out of the system may be transferred by the [scp](http://en.wikipedia.org/wiki/Secure_copy) and sftp protocols. (Not available yet). In the case that large volumes of data are transferred, use the dedicated data mover node dm1.anselm.it4i.cz for increased performance.
Data in and out of the system may be transferred by the [scp][a] and sftp protocols. (Not available yet). In the case that large volumes of data are transferred, use the dedicated data mover node dm1.anselm.it4i.cz for increased performance.
| Address | Port | Protocol |
| --------------------- | ---- | --------- |
@@ -73,7 +73,7 @@ Data in and out of the system may be transferred by the [scp](http://en.wikipedi
| login1.anselm.it4i.cz | 22 | scp |
| login2.anselm.it4i.cz | 22 | scp |
Authentication is by [private key](../general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys.md)
Authentication is by [private key][1] only.
!!! note
Data transfer rates of up to **160 MB/s** can be achieved with scp or sftp.
@@ -101,7 +101,7 @@ or
```console
$ sftp -o IdentityFile=/path/to/id_rsa username@anselm.it4i.cz
```
A very convenient way to transfer files in and out of Anselm is via the fuse filesystem [sshfs](http://linux.die.net/man/1/sshfs)
A very convenient way to transfer files in and out of Anselm is via the FUSE filesystem [sshfs][b].
```console
$ sshfs -o IdentityFile=/path/to/id_rsa username@anselm.it4i.cz:. mountpoint
```
@@ -117,9 +117,9 @@ $ man scp
```console
$ man sshfs
```
On Windows, use the [WinSCP client](http://winscp.net/eng/download.php) to transfer the data. The [win-sshfs client](http://code.google.com/p/win-sshfs/) provides a way to mount the Anselm filesystems directly as an external disc.
On Windows, use the [WinSCP client][c] to transfer the data. The [win-sshfs client][d] provides a way to mount the Anselm filesystems directly as an external disk.
More information about the shared file systems is available [here](access/storage/).
More information about the shared file systems is available [here][4].
## Connection Restrictions
@@ -169,15 +169,15 @@ $ ssh -L 6000:localhost:1234 remote.host.com
Remote port forwarding from compute nodes allows applications running on the compute nodes to access hosts outside the Anselm Cluster.
First, establish the remote port forwarding form the login node, as [described above](#port-forwarding-from-login-nodes).
First, establish the remote port forwarding from the login node, as [described above][5].
Second, invoke port forwarding from the compute node to the login node. Insert the following line into your jobscript or interactive shell;
Second, invoke port forwarding from the compute node to the login node. Insert the following line into your jobscript or interactive shell:
```console
$ ssh -TN -f -L 6000:localhost:6000 login1
```
In this example, we assume that port forwarding from login1:6000 to remote.host.com:1234 has been established beforehand. By accessing localhost:6000, an application running on a compute node will see the response of remote.host.com:1234
In this example, we assume that port forwarding from `login1:6000` to `remote.host.com:1234` has been established beforehand. By accessing `localhost:6000`, an application running on a compute node will see the response of `remote.host.com:1234`.
### Using Proxy Servers
@@ -192,21 +192,39 @@ To establish a local proxy server on your workstation, install and run SOCKS pro
```console
$ ssh -D 1080 localhost
```
On Windows, install and run the free, open source [Sock Puppet](http://sockspuppet.com/) server.
On Windows, install and run the free, open source [Sock Puppet][e] server.
Once the proxy server is running, establish ssh port forwarding from Anselm to the proxy server, port 1080, exactly as [described above](#port-forwarding-from-login-nodes):
Once the proxy server is running, establish ssh port forwarding from Anselm to the proxy server, port 1080, exactly as [described above][5]:
```console
$ ssh -R 6000:localhost:1080 anselm.it4i.cz
```
Now, configure the applications proxy settings to **localhost:6000**. Use port forwarding to access the [proxy server from compute nodes](#port-forwarding-from-compute-nodes) as well.
Now, configure the application's proxy settings to **localhost:6000**. Use port forwarding to access the [proxy server from compute nodes][9] as well.
## Graphical User Interface
* The [X Window system](/general/accessing-the-clusters/graphical-user-interface/x-window-system/) is the principal way to get GUI access to the clusters.
* [Virtual Network Computing](/general/accessing-the-clusters/graphical-user-interface/vnc/) is a graphical [desktop sharing](http://en.wikipedia.org/wiki/Desktop_sharing) system that uses the [Remote Frame Buffer protocol](http://en.wikipedia.org/wiki/RFB_protocol) to remotely control another [computer](http://en.wikipedia.org/wiki/Computer).
* The [X Window system][6] is the principal way to get GUI access to the clusters.
* [Virtual Network Computing][7] is a graphical [desktop sharing][f] system that uses the [Remote Frame Buffer protocol][g] to remotely control another [computer][h].
## VPN Access
* Access IT4Innovations internal resources via [VPN](/general/accessing-the-clusters/vpn-access/).
* Access IT4Innovations internal resources via [VPN][8].
[1]: ../general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys.md
[2]: ../general/accessing-the-clusters/shell-access-and-data-transfer/putty.md
[3]: storage.md#shared-filesystems
[4]: storage.md
[5]: #port-forwarding-from-login-nodes
[6]: ../general/accessing-the-clusters/graphical-user-interface/x-window-system.md
[7]: ../general/accessing-the-clusters/graphical-user-interface/vnc.md
[8]: ../general/accessing-the-clusters/vpn-access.md
[9]: #port-forwarding-from-compute-nodes
[a]: http://en.wikipedia.org/wiki/Secure_copy
[b]: http://linux.die.net/man/1/sshfs
[c]: http://winscp.net/eng/download.php
[d]: http://code.google.com/p/win-sshfs/
[e]: http://sockspuppet.com/
[f]: http://en.wikipedia.org/wiki/Desktop_sharing
[g]: http://en.wikipedia.org/wiki/RFB_protocol
[h]: http://en.wikipedia.org/wiki/Computer
@@ -197,11 +197,11 @@ $ ./test.cuda
### cuBLAS
The NVIDIA CUDA Basic Linear Algebra Subroutines (cuBLAS) library is a GPU-accelerated version of the complete standard BLAS library with 152 standard BLAS routines. A basic description of the library together with basic performance comparisons with MKL can be found [here](https://developer.nvidia.com/cublas "Nvidia cuBLAS").
The NVIDIA CUDA Basic Linear Algebra Subroutines (cuBLAS) library is a GPU-accelerated version of the complete standard BLAS library with 152 standard BLAS routines. A basic description of the library together with basic performance comparisons with MKL can be found [here][a].
#### cuBLAS Example: SAXPY
The SAXPY function multiplies the vector x by the scalar alpha, and adds it to the vector y, overwriting the latest vector with the result. A description of the cuBLAS function can be found in [NVIDIA CUDA documentation](http://docs.nvidia.com/cuda/cublas/index.html#cublas-lt-t-gt-axpy "Nvidia CUDA documentation "). Code can be pasted in the file and compiled without any modification.
The SAXPY function multiplies the vector x by the scalar alpha and adds it to the vector y, overwriting the latter with the result. A description of the cuBLAS function can be found in the [NVIDIA CUDA documentation][b]. The code can be pasted into a file and compiled without any modification.
```cpp
/* Includes, system */
```
@@ -283,8 +283,8 @@ int main(int argc, char **argv)
!!! note
cuBLAS has its own functions for data transfers between CPU and GPU memory:
- [cublasSetVector](http://docs.nvidia.com/cuda/cublas/index.html#cublassetvector) - transfers data from CPU to GPU memory
- [cublasGetVector](http://docs.nvidia.com/cuda/cublas/index.html#cublasgetvector) - transfers data from GPU to CPU memory
- [cublasSetVector][c] - transfers data from CPU to GPU memory
- [cublasGetVector][d] - transfers data from GPU to CPU memory
To compile the code using the NVCC compiler, the "-lcublas" compiler flag has to be specified:
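For example (a sketch; the source file name is illustrative):

```console
$ nvcc test_cublas.cu -o test_cublas_nvcc -lcublas
```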
@@ -307,3 +307,8 @@ $ ml cuda
```console
$ ml intel
$ icc -std=c99 test_cublas.c -o test_cublas_icc -lcublas -lcudart
```
[a]: https://developer.nvidia.com/cublas
[b]: http://docs.nvidia.com/cuda/cublas/index.html#cublas-lt-t-gt-axpy
[c]: http://docs.nvidia.com/cuda/cublas/index.html#cublassetvector
[d]: http://docs.nvidia.com/cuda/cublas/index.html#cublasgetvector
@@ -30,7 +30,7 @@ fi
In order to configure your shell for running a particular application on the clusters, we use the Modules package interface.
Application modules on clusters are built using [EasyBuild](/software/tools/easybuild/). The modules are divided into the following structure:
Application modules on clusters are built using [EasyBuild][1]. The modules are divided into the following structure:
```
base: Default module class
```
@@ -61,4 +61,7 @@ Application modules on clusters are built using [EasyBuild](/software/tools/easy
!!! note
The modules set up the application paths, library paths, and environment variables for running a particular application.
The modules may be loaded, unloaded and switched, according to momentary needs. For details see [here](/software/modules/lmod/).
The modules may be loaded, unloaded and switched, according to momentary needs. For details see [lmod][2].
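Typical module operations look like this (a sketch using the Lmod `ml` shorthand; the module name is illustrative):

```console
$ ml av GCC      # list available GCC modules
$ ml GCC/6.3.0   # load a module
$ ml -GCC        # unload it
$ ml             # list the currently loaded modules
```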
[1]: software/tools/easybuild.md
[2]: software/modules/lmod.md
# VNC
The **Virtual Network Computing** (**VNC**) is a graphical [desktop sharing](http://en.wikipedia.org/wiki/Desktop_sharing "Desktop sharing") system that uses the [Remote Frame Buffer protocol (RFB)](http://en.wikipedia.org/wiki/RFB_protocol "RFB protocol") to remotely control another [computer](http://en.wikipedia.org/wiki/Computer "Computer"). It transmits the [keyboard](http://en.wikipedia.org/wiki/Computer_keyboard "Computer keyboard") and [mouse](http://en.wikipedia.org/wiki/Computer_mouse") events from one computer to another, relaying the graphical [screen](http://en.wikipedia.org/wiki/Computer_screen "Computer screen") updates back in the other direction, over a [network](http://en.wikipedia.org/wiki/Computer_network "Computer network").
The **Virtual Network Computing** (**VNC**) is a graphical [desktop sharing][a] system that uses the [Remote Frame Buffer protocol (RFB)][b] to remotely control another [computer][c]. It transmits the [keyboard][d] and [mouse][e] events from one computer to another, relaying the graphical [screen][f] updates back in the other direction, over a [network][g].
Vnc-based connections are usually faster (require less network bandwidth) then [X11](/general/accessing-the-clusters/graphical-user-interface/x-window-system) applications forwarded directly through ssh.
VNC-based connections are usually faster (they require less network bandwidth) than [X11][1] applications forwarded directly through SSH.
The recommended clients are [TightVNC](http://www.tightvnc.com) or [TigerVNC](http://sourceforge.net/apps/mediawiki/tigervnc/index.php?title=Main_Page) (free, open source, available for almost any platform).
The recommended clients are [TightVNC][h] or [TigerVNC][i] (free, open source, available for almost any platform).
In this chapter, we show how to create an underlying SSH tunnel from your client machine to one of our login nodes, how to start your own VNC server on the login node, and finally how to connect to your VNC server via the encrypted SSH tunnel.
@@ -24,7 +24,7 @@ Verify:
!!! note
To access VNC, a local vncserver must be started first, and a tunnel using SSH port forwarding must be established.
[See below](#linuxmac-os-example-of-creating-a-tunnel) for the details on SSH tunnels.
[See below][2] for the details on SSH tunnels.
You should start by **choosing your display number**.
To choose a free one, first check the currently occupied display numbers:
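One hedged way to list them (the guide's exact command may differ) is to look for running Xvnc processes:

```console
$ ps aux | grep Xvnc
```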
@@ -78,7 +78,7 @@ username :102
!!! note
The VNC server runs on port 59xx, where xx is the display number, so you get your port number simply as 5900 + display number; in our example, 5900 + 61 = 5961. Similarly, display number 102 gives TCP port 5900 + 102 = 6002, but be aware that TCP ports above 6000 are often used by X11. **Please calculate your own port number and use it instead of 5961 in the examples below!**
To access the VNC server you have to create a tunnel between the login node using TCP port 5961 and your machine using a free TCP port (for simplicity the very same) in next step. See examples for [Linux/Mac OS](#linuxmac-os-example-of-creating-a-tunnel) and [Windows](#windows-example-of-creating-a-tunnel).
To access the VNC server you have to create a tunnel between the login node using TCP port 5961 and your machine using a free TCP port (for simplicity the very same) in next step. See examples for [Linux/Mac OS][2] and [Windows][3].
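On Linux/Mac OS, such a tunnel may look like this (a sketch using port 5961 from the example; the login node and username are illustrative):

```console
local $ ssh -TN -f username@login2.anselm.it4i.cz -L 5961:localhost:5961
```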
!!! note
The tunnel must point to the same login node where you launched the VNC server, e.g. login2. If you use just cluster-name.it4i.cz, the tunnel might point to a different node due to DNS round robin.
@@ -145,7 +145,7 @@ Fill the Source port and Destination fields. **Do not forget to click the Add bu
### WSL (Bash on Windows)
[Windows Subsystem for Linux](http://docs.microsoft.com/en-us/windows/wsl) is another way to run Linux software in a Windows environment.
[Windows Subsystem for Linux][j] is another way to run Linux software in a Windows environment.
On your machine, create the tunnel:
@@ -214,7 +214,7 @@ Or this way:
```
!!! note
Do not forget to terminate also SSH tunnel, if it was used. Look on end of [this section](#linuxmac-os-example-of-creating-a-tunnel) for the details.
Do not forget to also terminate the SSH tunnel, if one was used. See the end of [this section][2] for details.
## GUI Applications on Compute Nodes Over VNC
@@ -230,7 +230,7 @@ Allow incoming X11 graphics from the compute nodes at the login node:
```console
$ xhost +
```
Get an interactive session on a compute node (for more detailed info [look here](/anselm/job-submission-and-execution/)). Use the **-v DISPLAY** option to propagate the DISPLAY on the compute node. In this example, we want a complete node (16 cores in this example) from the production queue:
Get an interactive session on a compute node (for more details, [look here][4]). Use the **-v DISPLAY** option to propagate the DISPLAY on the compute node. In this example, we want a complete node (16 cores) from the production queue:
```console
$ qsub -I -v DISPLAY=$(uname -n):$(echo $DISPLAY | cut -d ':' -f 2) -A PROJECT_ID -q qprod -l select=1:ncpus=16
```
@@ -245,3 +245,19 @@ $ xterm
The example described above:
![](../../../img/gnome-compute-nodes-over-vnc.png)
[a]: http://en.wikipedia.org/wiki/Desktop_sharing
[b]: http://en.wikipedia.org/wiki/RFB_protocol
[c]: http://en.wikipedia.org/wiki/Computer
[d]: http://en.wikipedia.org/wiki/Computer_keyboard
[e]: http://en.wikipedia.org/wiki/Computer_mouse
[f]: http://en.wikipedia.org/wiki/Computer_screen
[g]: http://en.wikipedia.org/wiki/Computer_network
[h]: http://www.tightvnc.com
[i]: http://sourceforge.net/apps/mediawiki/tigervnc/index.php?title=Main_Page
[j]: http://docs.microsoft.com/en-us/windows/wsl
[1]: x-window-system.md
[2]: #linuxmac-os-example-of-creating-a-tunnel
[3]: #windows-example-of-creating-a-tunnel
[4]: ../../../anselm/job-submission-and-execution.md
# X Window System
The X Window system is a principal way to get GUI access to the clusters. The **X Window System** (commonly known as **X11**, based on its current major version being 11, or shortened to simply **X**, and sometimes informally **X-Windows**) is a computer software system and network [protocol](http://en.wikipedia.org/wiki/Protocol_%28computing%29 "Protocol (computing)") that provides a basis for [graphical user interfaces](http://en.wikipedia.org/wiki/Graphical_user_interface "Graphical user interface") (GUIs) and rich input device capability for [networked computers](http://en.wikipedia.org/wiki/Computer_network "Computer network").
The X Window system is a principal way to get GUI access to the clusters. The **X Window System** (commonly known as **X11**, based on its current major version being 11, or shortened to simply **X**, and sometimes informally **X-Windows**) is a computer software system and network [protocol][a] that provides a basis for [graphical user interfaces][b] (GUIs) and rich input device capability for [networked computers][c].
!!! tip
The X display forwarding must be activated and the X server must be running on the client side.
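X forwarding is typically requested with the ssh -X option (or -Y for trusted forwarding). A sketch (the hostname is illustrative):

```console
local $ ssh -X username@anselm.it4i.cz
```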
@@ -60,18 +60,17 @@ In order to display graphical user interface GUI of various software tools, you
### X Server on OS X
Mac OS users need to install [XQuartz server](https://www.xquartz.org).
Mac OS users need to install [XQuartz server][d].
### X Server on Windows
There are variety of X servers available for Windows environment. The commercial Xwin32 is very stable and rich featured. The Cygwin environment provides fully featured open-source XWin X server. For simplicity, we recommend open-source X server by the [Xming project](http://sourceforge.net/projects/xming/). For stability and full features we recommend the
[XWin](http://x.cygwin.com/) X server by Cygwin
There are a variety of X servers available for the Windows environment. The commercial Xwin32 is very stable and feature-rich. The Cygwin environment provides a fully featured open-source XWin X server. For simplicity, we recommend the open-source X server from the [Xming project][e]; for stability and full features, we recommend the [XWin][f] X server by Cygwin.
| How to use Xwin | How to use Xming |
|--- | --- |
| [Install Cygwin](http://x.cygwin.com/) Find and execute XWin.exe to start the X server on Windows desktop computer.[If no able to forward X11 using PuTTY to CygwinX](#if-no-able-to-forward-x11-using-putty-to-cygwinx) | Use Xlaunch to configure the Xming. Run Xming to start the X server on Windows desktop computer. |
| [Install Cygwin][g]. Find and execute XWin.exe to start the X server on the Windows desktop computer. [If not able to forward X11 using PuTTY to CygwinX][1] | Use Xlaunch to configure Xming. Run Xming to start the X server on the Windows desktop computer. |
Read more on [http://www.math.umn.edu/systems_guide/putty_xwin32.html](http://www.math.umn.edu/systems_guide/putty_xwin32.shtml)
Read more [here][h].
## Running GUI Enabled Applications
@@ -116,7 +115,7 @@ The Gnome 2.28 GUI environment is available on the clusters. We recommend to use
### Gnome on Linux and OS X
To run the remote Gnome session in a window on a Linux/OS X computer, you need to install Xephyr. The Ubuntu package is xserver-xephyr; on OS X it is part of [XQuartz][i]. First, launch Xephyr on the local machine:
xserver-xephyr, on OS X it is part of [XQuartz][i]. First, launch Xephyr on local machine:
```console
local $ Xephyr -ac -screen 1024x768 -br -reset -terminate :1 &
```
@@ -143,7 +142,7 @@ However this method does not seem to work with recent Linux distributions and yo
Use XLaunch to start the Xming server or run the XWin.exe. Select the "One window" mode.
Log in to the cluster, using [PuTTY](#putty-on-windows) or [Bash on Windows](#wsl-bash-on-windows). On the cluster, run the gnome-session command.
Log in to the cluster, using [PuTTY][2] or [Bash on Windows][3]. On the cluster, run the gnome-session command.
```console
$ gnome-session &
```
@@ -153,3 +152,16 @@ In this way, we run remote gnome session on the cluster, displaying it in the lo
Use System - Log Out to close the gnome-session.
[1]: #if-no-able-to-forward-x11-using-putty-to-cygwinx
[2]: #putty-on-windows
[3]: #wsl-bash-on-windows
[a]: http://en.wikipedia.org/wiki/Protocol_%28computing%29
[b]: http://en.wikipedia.org/wiki/Graphical_user_interface
[c]: http://en.wikipedia.org/wiki/Computer_network
[d]: https://www.xquartz.org
[e]: http://sourceforge.net/projects/xming/
[f]: http://x.cygwin.com/
[g]: http://x.cygwin.com/
[h]: http://www.math.umn.edu/systems_guide/putty_xwin32.shtml
[i]: http://xquartz.macosforge.org/landing/
@@ -2,10 +2,10 @@
## Windows PuTTY Installer
We recommned you to download "**A Windows installer for everything except PuTTYtel**" with **Pageant** (SSH authentication agent) and **PuTTYgen** (PuTTY key generator) which is available [here](http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html).
We recommend downloading "**A Windows installer for everything except PuTTYtel**" with **Pageant** (SSH authentication agent) and **PuTTYgen** (PuTTY key generator), which is available [here][a].
!!! note
After installation, you can proceed directly to private key authentication using ["PuTTY"][1].
After installation you can proceed directly to private keys authentication using ["Putty"][1].
"Change Password for Existing Private Key" is optional.
@@ -23,7 +23,7 @@ We recommned you to download "**A Windows installer for everything except PuTTYt
* Category - Connection - SSH - Auth:
Select Attempt authentication using Pageant.
Select Allow agent forwarding.
Browse and select your [private key](ssh-keys/) file.
Browse and select your [private key][2] file.
![](../../../img/PuTTY_keyV.png)
@@ -36,7 +36,7 @@ We recommned you to download "**A Windows installer for everything except PuTTYt
![](../../../img/PuTTY_open_Salomon.png)
* Enter your username if the _Host Name_ input is not in the format "username@salomon.it4i.cz".
* Enter passphrase for selected [private key](/general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys/) file if Pageant **SSH authentication agent is not used.**
* Enter the passphrase for the selected [private key][2] file if the Pageant **SSH authentication agent is not used**.
## Other PuTTY Settings
@@ -63,7 +63,7 @@ PuTTYgen is the PuTTY key generator. You can load in an existing private key and
You can change the passphrase of your SSH key with "PuTTY Key Generator". Make sure to back up the key.
* Load your [private key](/general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys/) file with _Load_ button.
* Load your [private key][2] file with _Load_ button.
* Enter your current passphrase.
* Change key passphrase.
* Confirm key passphrase.
@@ -104,4 +104,9 @@ You can generate an additional public/private key pair and insert public key int
![](../../../img/PuttyKeygenerator_006V.png)
* Now you can insert the additional public key into the authorized_keys file for authentication with your own private key.
You must log in using ssh key received after registration. Then proceed to [How to add your own key](/general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys/).
You must log in using the SSH key received after registration. Then proceed to [How to add your own key][2].
[1]: #putty
[2]: ssh-keys.md
[a]: http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
@@ -15,7 +15,7 @@ It is impossible to connect to VPN from other operating systems.
## VPN Client Installation
You can install VPN client from web interface after successful login with [IT4I credentials](/general/obtaining-login-credentials/obtaining-login-credentials/#login-credentials) on address [https://vpn.it4i.cz/user](https://vpn.it4i.cz/user)
You can install the VPN client from the web interface [here][a], after a successful login with your [IT4I credentials][1].
![](../../img/vpn_web_login.png)
@@ -43,7 +43,7 @@ After successful download of installation file, you have to execute this executa
You can use the graphical user interface or the command line interface to run the VPN client on all supported operating systems. We suggest using the GUI.
Before the first login to VPN, you have to fill URL **[https://vpn.it4i.cz/user](https://vpn.it4i.cz/user)** into the text field.
Before the first login to the VPN, you have to enter the URL **[https://vpn.it4i.cz/user][a]** into the text field.
![](../../img/vpn_contacting_https_cluster.png)
@@ -72,3 +72,8 @@ After a successful logon, you can see a green circle with a tick mark on the loc
![](../../img/vpn_successfull_connection.png)
For disconnecting, right-click on the AnyConnect client icon in the system tray and select **VPN Disconnect**.
[1]: ../../general/obtaining-login-credentials/obtaining-login-credentials.md#login-credentials
[a]: https://vpn.it4i.cz/user
# Applying for Resources
Computational resources may be allocated by any of the following [Computing resources allocation](http://www.it4i.cz/computing-resources-allocation/?lang=en) mechanisms.
Computational resources may be allocated by any of the following [Computing resources allocation][a] mechanisms.
Academic researchers can apply for computational resources via [Open Access Competitions](http://www.it4i.cz/open-access-competition/?lang=en&lang=en).
Academic researchers can apply for computational resources via [Open Access Competitions][b].
Anyone is welcomed to apply via the [Directors Discretion.](http://www.it4i.cz/obtaining-computational-resources-through-directors-discretion/?lang=en&lang=en)
Anyone is welcome to apply via the [Director's Discretion][c].
Foreign (mostly European) users can obtain computational resources via the [PRACE (DECI) program](http://www.prace-ri.eu/DECI-Projects).
Foreign (mostly European) users can obtain computational resources via the [PRACE (DECI) program][d].
In all cases, IT4Innovations’ access mechanisms are aimed at distributing computational resources while taking into account the development and application of supercomputing methods and their benefits and usefulness for society. The applicants are expected to submit a proposal. In the proposal, the applicants **apply for a particular amount of core-hours** of computational resources. The requested core-hours should be substantiated by scientific excellence of the proposal, its computational maturity and expected impacts. Proposals do undergo a scientific, technical and economic evaluation. The allocation decisions are based on this evaluation. More information at [Computing resources allocation](http://www.it4i.cz/computing-resources-allocation/?lang=en) and [Obtaining Login Credentials](/general/obtaining-login-credentials/obtaining-login-credentials/) page.
In all cases, IT4Innovations’ access mechanisms are aimed at distributing computational resources while taking into account the development and application of supercomputing methods and their benefits and usefulness for society. The applicants are expected to submit a proposal. In the proposal, the applicants **apply for a particular amount of core-hours** of computational resources. The requested core-hours should be substantiated by the scientific excellence of the proposal, its computational maturity, and expected impacts. Proposals undergo a scientific, technical, and economic evaluation. The allocation decisions are based on this evaluation. More information can be found on the [Computing resources allocation][a] and [Obtaining Login Credentials][1] pages.
[1]: obtaining-login-credentials/obtaining-login-credentials.md
[a]: http://www.it4i.cz/computing-resources-allocation/?lang=en
[b]: http://www.it4i.cz/open-access-competition/?lang=en&lang=en
[c]: http://www.it4i.cz/obtaining-computational-resources-through-directors-discretion/?lang=en&lang=en
[d]: http://www.prace-ri.eu/DECI-Projects
......@@ -17,11 +17,11 @@ However, users need only manage User and CA certificates. Note that your user ce
## Q: Which X.509 Certificates Are Recognised by IT4Innovations?
[The Certificates for Digital Signatures](#the-certificates-for-digital-signatures).
[The Certificates for Digital Signatures][1].
## Q: How Do I Get a User Certificate That Can Be Used With IT4Innovations?
To get a certificate, you must make a request to your local, IGTF approved, Certificate Authority (CA). Usually you then must visit, in person, your nearest Registration Authority (RA) to verify your affiliation and identity (photo identification is required). Usually, you will then be emailed details on how to retrieve your certificate, although procedures can vary between CAs. If you are in Europe, you can locate [your trusted CA](https://www.eugridpma.org/members/worldmap/).
To get a certificate, you must make a request to your local, IGTF approved, Certificate Authority (CA). Usually you then must visit, in person, your nearest Registration Authority (RA) to verify your affiliation and identity (photo identification is required). Usually, you will then be emailed details on how to retrieve your certificate, although procedures can vary between CAs. If you are in Europe, you can locate [your trusted CA][a].
In some countries, certificates can also be retrieved using the TERENA Certificate Service; see the FAQ below for the link.
......@@ -31,7 +31,7 @@ Yes, provided that the CA which provides this service is also a member of IGTF.
## Q: Does IT4Innovations Support the TERENA Certificate Service?
Yes, ITInnovations supports TERENA eScience personal certificates. For more information, visit [TCS - Trusted Certificate Service](https://tcs-escience-portal.terena.org/), where you also can find if your organisation/country can use this service
Yes, IT4Innovations supports TERENA eScience personal certificates. For more information, visit [TCS - Trusted Certificate Service][b], where you can also find out whether your organisation/country can use this service.
## Q: What Format Should My Certificate Take?
......@@ -51,7 +51,7 @@ To convert your Certificate from p12 to JKS, IT4Innovations recommends using the
Certification Authority (CA) certificates are used to verify the link between your user certificate and the authority which issued it. They are also used to verify the link between the host certificate of an IT4Innovations server and the CA which issued that certificate. In essence, they establish a chain of trust between you and the target server. Thus, for some grid services, users must have a copy of all the CA certificates.
To assist users, SURFsara (a member of PRACE) provides a complete and up-to-date bundle of all the CA certificates that any PRACE user (or IT4Innovations grid services user) will require. Bundle of certificates, in either p12, PEM or JKS formats, are [available here](https://winnetou.surfsara.nl/prace/certs/).
To assist users, SURFsara (a member of PRACE) provides a complete and up-to-date bundle of all the CA certificates that any PRACE user (or IT4Innovations grid services user) will require. Bundles of certificates, in p12, PEM, or JKS formats, are [available here][c].
It is worth noting that gsissh-term and DART automatically update their CA certificates from this SURFsara website. In other cases, if you receive a warning that a server’s certificate cannot be validated (not trusted), then update your CA certificates via the SURFsara website. If this fails, then contact the IT4Innovations helpdesk.
......@@ -61,7 +61,7 @@ Lastly, if you need the CA certificates for a personal Globus 5 installation, th
myproxy-get-trustroots -s myproxy-prace.lrz.de
```
If you run this command as ’root’, then it will install the certificates into /etc/grid-security/certificates. If you run this not as ’root’, then the certificates will be installed into $HOME/.globus/certificates. For Globus, you can download the globuscerts.tar.gz packet [available here](https://winnetou.surfsara.nl/prace/certs/).
If you run this command as 'root', then the certificates will be installed into /etc/grid-security/certificates. If you do not run it as 'root', the certificates will be installed into $HOME/.globus/certificates. For Globus, you can download the globuscerts.tar.gz package [available here][c].
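A possible download-and-unpack sequence for a non-root personal installation (a sketch; adjust the target directory to the archive's internal layout if needed):

```console
$ wget https://winnetou.surfsara.nl/prace/certs/globuscerts.tar.gz
$ mkdir -p $HOME/.globus/certificates
$ tar -xzf globuscerts.tar.gz -C $HOME/.globus/certificates
```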
## Q: What Is a DN and How Do I Find Mine?
......@@ -104,7 +104,7 @@ To check your certificate (e.g., DN, validity, issuer, public key algorithm, etc
openssl x509 -in usercert.pem -text -noout
```
To download openssl if not pre-installed, see [here](https://www.openssl.org/source/). On Macintosh Mac OS X computers openssl is already pre-installed and can be used immediately.
If openssl is not pre-installed, download it [here][d]. On Mac OS X computers, openssl is pre-installed and can be used immediately.
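To answer the DN question directly, one quick way to print just the subject DN of your certificate is:

```console
$ openssl x509 -in usercert.pem -noout -subject
```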
## Q: How Do I Create and Then Manage a Keystore?
......@@ -126,7 +126,7 @@ You also can import CA certificates into your java keystore with the tool, e.g.:
where $mydomain.crt is the certificate of a trusted signing authority (CA) and $mydomain is the alias name that you give to the entry.
More information on the tool can be found [here](http://docs.oracle.com/javase/7/docs/technotes/tools/solaris/keytool.html)
More information on the tool can be found [here][e].
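As a sketch of such an import (the keystore filename keystore.jks is an assumption; keytool prompts for the keystore password):

```console
$ keytool -import -trustcacerts -alias $mydomain -file $mydomain.crt -keystore keystore.jks
```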
## Q: How Do I Use My Certificate to Access the Different Grid Services?
......@@ -134,7 +134,7 @@ Most grid services require the use of your certificate; however, the format of y
If employing the PRACE version of GSISSH-term (also a Java Web Start Application), you may use either the PEM or p12 formats. Note that this service automatically installs up-to-date PRACE CA certificates.
If the grid service is UNICORE, then you bind your certificate, in either the p12 format or JKS, to UNICORE during the installation of the client on your local machine. For more information visit [UNICORE6 in PRACE](http://www.prace-ri.eu/UNICORE6-in-PRACE)
If the grid service is UNICORE, then you bind your certificate, in either the p12 format or JKS, to UNICORE during the installation of the client on your local machine. For more information visit [UNICORE6 in PRACE][f].
If the grid service is part of Globus, such as GSI-SSH, GridFTP or GRAM5, then the certificates can be in either p12 or PEM format and must reside in the "$HOME/.globus" directory for Linux and Mac users or %HOMEPATH%\.globus for Windows users. (Windows users will have to use the DOS command 'cmd' to create a directory which starts with a '.'.) Further, user certificates should be named either "usercred.p12" or "usercert.pem" and "userkey.pem", and the CA certificates must be kept in a pre-specified directory as follows. For Linux and Mac users, this directory is either $HOME/.globus/certificates or /etc/grid-security/certificates. For Windows users, this directory is %HOMEPATH%\.globus\certificates. (If you are using GSISSH-Term from prace-ri.eu, then you do not have to create the .globus directory nor install CA certificates to use this tool alone.)
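A minimal setup sketch for Linux and Mac users, assuming usercert.pem and userkey.pem are in the current directory (Globus tools typically require the private key to be readable by you only):

```console
$ mkdir -p $HOME/.globus/certificates
$ cp usercert.pem userkey.pem $HOME/.globus/
$ chmod 644 $HOME/.globus/usercert.pem
$ chmod 400 $HOME/.globus/userkey.pem
```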
......@@ -152,12 +152,23 @@ A proxy certificate is a short-lived certificate which may be employed by UNICOR
## Q: What Is the MyProxy Service?
[The MyProxy Service](http://grid.ncsa.illinois.edu/myproxy/) , can be employed by gsissh-term and Globus tools, and is an online repository that allows users to store long lived proxy certificates remotely, which can then be retrieved for use at a later date. Each proxy is protected by a password provided by the user at the time of storage. This is beneficial to Globus users as they do not have to carry their private keys and certificates when travelling; nor do users have to install private keys and certificates on possibly insecure computers.
[The MyProxy Service][g] can be employed by gsissh-term and Globus tools, and is an online repository that allows users to store long-lived proxy certificates remotely, which can then be retrieved for use at a later date. Each proxy is protected by a password provided by the user at the time of storage. This is beneficial to Globus users, as they do not have to carry their private keys and certificates when travelling; nor do users have to install private keys and certificates on possibly insecure computers.
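As a sketch, storing and later retrieving a proxy with the standard MyProxy command-line tools might look like this (using the PRACE MyProxy server mentioned earlier; both commands prompt for the protecting password):

```console
$ myproxy-init -s myproxy-prace.lrz.de
$ myproxy-logon -s myproxy-prace.lrz.de
```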
## Q: Someone May Have Copied or Had Access to the Private Key of My Certificate Either in a Separate File or in the Browser. What Should I Do?
Please ask the CA that issued your certificate to revoke this certificate and to supply you with a new one. In addition, report this to IT4Innovations by contacting [the support team](https://support.it4i.cz/rt).
Please ask the CA that issued your certificate to revoke this certificate and to supply you with a new one. In addition, report this to IT4Innovations by contacting [the support team][h].
## Q: My Certificate Expired. What Should I Do?
To continue communicating with us, request a new certificate from your Certificate Authority (CA). There is no need to explicitly send us any information about your new certificate if it has the same Distinguished Name (DN) as the old one.
[1]: #the-certificates-for-digital-signatures
[a]: https://www.eugridpma.org/members/worldmap/
[b]: https://tcs-escience-portal.terena.org/
[c]: https://winnetou.surfsara.nl/prace/certs/
[d]: https://www.openssl.org/source/
[e]: http://docs.oracle.com/javase/7/docs/technotes/tools/solaris/keytool.html
[f]: http://www.prace-ri.eu/UNICORE6-in-PRACE
[g]: http://grid.ncsa.illinois.edu/myproxy/
[h]: https://support.it4i.cz/rt
# Resource Allocation and Job Execution
To run a [job](/#terminology-frequently-used-on-these-pages), [computational resources](/salomon/resources-allocation-policy#resource-accounting-policy) for this particular job must be allocated. This is done via the PBS Pro job workload manager software, which distributes workloads across the supercomputer. Extensive information about PBS Pro can be found in the [PBS Pro User's Guide](/pbspro).
To run a [job][1], computational resources for this particular job must be allocated. This is done via the PBS Pro job workload manager software, which distributes workloads across the supercomputer. Extensive information about PBS Pro can be found in the [PBS Pro User's Guide][2].
## Resources Allocation Policy
The resources are allocated to the job in a fair-share fashion, subject to constraints set by the queue and resources available to the Project. [The Fair-share](/salomon/job-priority#fair-share-priority) ensures that individual users may consume approximately equal amount of resources per week. The resources are accessible via queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. Following queues are are the most important:
The resources are allocated to the job in a fair-share fashion, subject to constraints set by the queue and the resources available to the Project. [The Fair-share][3] ensures that individual users may consume an approximately equal amount of resources per week. The resources are accessible via queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. The following queues are the most important:
* **qexp**, the Express queue
* **qprod**, the Production queue
......@@ -14,9 +14,9 @@ The resources are allocated to the job in a fair-share fashion, subject to const
* **qfree**, the Free resource utilization queue
!!! note
Check the queue status at [https://extranet.it4i.cz/](https://extranet.it4i.cz/)
Check the queue status [here][a].
Read more on the [Resource AllocationPolicy](/salomon/resources-allocation-policy) page.
Read more on the [Resource Allocation Policy][4] page.
## Job Submission and Execution
......@@ -25,7 +25,7 @@ Read more on the [Resource AllocationPolicy](/salomon/resources-allocation-polic
The qsub command submits the job into the queue by creating a request to the PBS Job manager for allocation of the specified resources. The **smallest allocation unit is an entire node, 16 cores**, with the exception of the qexp queue. The resources will be allocated when available, subject to allocation policies and constraints. **After the resources are allocated, the jobscript or interactive shell is executed on the first of the allocated nodes.**
Read more on the [Job submission and execution](/salomon/job-submission-and-execution) page.
Read more on the [Job submission and execution][5] page.
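For illustration, a minimal allocation request might look as follows (the project ID is a placeholder; this asks for two full nodes, 16 cores each, as an interactive job):

```console
$ qsub -A PROJECT_ID -q qprod -l select=2:ncpus=16 -I
```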
## Capacity Computing
......@@ -36,4 +36,13 @@ Use GNU Parallel and/or Job arrays when running (many) single core jobs.
In many cases, it is useful to submit a huge number (100+) of computational jobs into the PBS queue system. A huge number of (small) jobs is one of the most effective ways to execute embarrassingly parallel calculations, achieving the best runtime, throughput, and computer utilization. In this chapter, we discuss the recommended way to run a huge number of jobs, including **ways to run a huge number of single-core jobs**.
Read more on [Capacity computing](/salomon/capacity-computing) page.
Read more on [Capacity computing][6] page.
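As a sketch of the GNU Parallel approach for single-core jobs, where myprog and tasklist are hypothetical names and tasklist holds one task argument per line:

```console
$ parallel -j 16 ./myprog {} :::: tasklist
```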
[1]: #terminology-frequently-used-on-these-pages
[2]: ../pbspro.md
[3]: ../salomon/job-priority.md#fair-share-priority
[4]: ../salomon/resources-allocation-policy.md
[5]: ../salomon/job-submission-and-execution.md
[6]: ../salomon/capacity-computing.md
[a]: https://extranet.it4i.cz/rsweb/salomon/queues
# Documentation
Welcome to the IT4Innovations documentation pages. The IT4Innovations national supercomputing center operates the supercomputers [Salomon](/salomon/introduction/) and [Anselm](/anselm/introduction/). The supercomputers are [available](/general/applying-for-resources/) to the academic community within the Czech Republic and Europe, and the industrial community worldwide. The purpose of these pages is to provide comprehensive documentation of the hardware, software and usage of the computers.
!!! Warning
There's a planned Salomon upgrade. Make sure to read the [details][upgrade].
Welcome to the IT4Innovations documentation pages. The IT4Innovations national supercomputing center operates the supercomputers [Salomon][1] and [Anselm][2]. The supercomputers are [available][3] to the academic community within the Czech Republic and Europe, and the industrial community worldwide. The purpose of these pages is to provide comprehensive documentation of the hardware, software and usage of the computers.
## How to Read the Documentation
......@@ -11,27 +14,27 @@ Welcome to the IT4Innovations documentation pages. The IT4Innovations national s
## Getting Help and Support
!!! note
Contact [support\[at\]it4i.cz](mailto:support@it4i.cz) for help and support regarding the cluster technology at IT4Innovations. Please use **Czech**, **Slovak** or **English** language for communication with us. Follow the status of your request to IT4Innovations at [support.it4i.cz/rt](http://support.it4i.cz/rt). The IT4Innovations support team will use best efforts to resolve requests within thirty days.
Contact [support\[at\]it4i.cz][a] for help and support regarding the cluster technology at IT4Innovations. Please use **Czech**, **Slovak** or **English** for communication with us. Follow the status of your request to IT4Innovations [here][b]. The IT4Innovations support team will use its best efforts to resolve requests within thirty days.
Use your IT4Innovations username and password to log in to the [support](http://support.it4i.cz/) portal.
Use your IT4Innovations username and password to log in to the [support][b] portal.
## Required Proficiency
!!! note
You need basic proficiency in Linux environments.
In order to use the system for your calculations, you need basic proficiency in Linux environments. To gain this proficiency we recommend you read the [introduction to Linux](http://www.tldp.org/LDP/intro-linux/html/) operating system environments, and install a Linux distribution on your personal computer. A good choice might be the [CentOS](http://www.centos.org/) distribution, as it is similar to systems on the clusters at IT4Innovations. It's easy to install and use. In fact, any Linux distribution would do.
In order to use the system for your calculations, you need basic proficiency in Linux environments. To gain this proficiency we recommend you read the [introduction to Linux][c] operating system environments, and install a Linux distribution on your personal computer. A good choice might be the [CentOS][d] distribution, as it is similar to systems on the clusters at IT4Innovations. It's easy to install and use. In fact, any Linux distribution would do.
!!! note
Learn how to parallelize your code!
In many cases, you will run your own code on the cluster. In order to fully exploit the cluster, you will need to carefully consider how to utilize all the cores available on the node and how to use multiple nodes at the same time. You need to **parallelize** your code. Proficieny in MPI, OpenMP, CUDA, UPC or GPI2 programming may be gained via [training provided by IT4Innovations.](http://prace.it4i.cz)
In many cases, you will run your own code on the cluster. In order to fully exploit the cluster, you will need to carefully consider how to utilize all the cores available on the node and how to use multiple nodes at the same time. You need to **parallelize** your code. Proficiency in MPI, OpenMP, CUDA, UPC or GPI2 programming may be gained via [training provided by IT4Innovations][e].
## Terminology Frequently Used on These Pages
* **node:** a computer, interconnected via a network to other computers. Computational nodes are powerful computers designed for, and dedicated to, executing demanding scientific computations.
* **core:** a processor core, a unit of the processor that executes computations
* **core-hour:** also normalized core-hour, NCH. A metric of computer utilization, [see definition](/salomon/resources-allocation-policy/#normalized-core-hours-nch).
* **core-hour:** also normalized core-hour, NCH. A metric of computer utilization, [see definition][4].
* **job:** a calculation running on the supercomputer - the job allocates and utilizes the resources of the supercomputer for certain time.
* **HPC:** High Performance Computing
* **HPC (computational) resources:** core-hours, storage capacity, software licences
......@@ -60,8 +63,20 @@ local $
## Errors
Although we have taken every care to ensure the accuracy of the content, mistakes do happen.
If you find an inconsistency or error, report it by visiting [http://support.it4i.cz/rt](http://support.it4i.cz/rt), creating a new ticket, and entering the details.
If you find an inconsistency or error, report it by visiting [support][b], creating a new ticket, and entering the details.
By doing so, you can save other readers from frustration and help us improve.
!!! tip
We will fix the problem as soon as possible.
[1]: salomon/introduction.md
[2]: anselm/introduction.md
[3]: general/applying-for-resources.md
[4]: salomon/resources-allocation-policy.md#normalized-core-hours-nch
[upgrade]: salomon-upgrade.md
[a]: mailto:support@it4i.cz
[b]: http://support.it4i.cz/rt
[c]: http://www.tldp.org/LDP/intro-linux/html/
[d]: http://www.centos.org/
[e]: http://prace.it4i.cz
......@@ -24,7 +24,8 @@ Install development packages (gcc, g++, make, automake, autoconf, bison, flex, p
$ qsub ... -l mic_devel=true
```
Available on Salomon Perrin nodes.
!!! Warning
Available on Salomon Perrin nodes.
## Global RAM Disk
......@@ -34,7 +35,8 @@ Create global shared file system consisting of RAM disks of allocated nodes. Fil
$ qsub ... -l global_ramdisk=true
```
Available on Salomon nodes.
!!! Warning
Available on Salomon nodes only.
## Virtualization Network
......@@ -44,7 +46,7 @@ Configure network for virtualization, create interconnect for fast communication
$ qsub ... -l virt_network=true
```
[See Tap Interconnect](/software/tools/virtualization/#tap-interconnect)
[See Tap Interconnect][1]
## x86 Adapt Support
......@@ -54,9 +56,11 @@ Load kernel module, that allows changing/toggling system parameters stored in MS
$ qsub ... -l x86_adapt=true
```
Hazardous, it causes CPU frequency disruption.
!!! Danger
Hazardous, it causes CPU frequency disruption.
Available on Salomon nodes.
!!! Warning
Available on Salomon nodes only.
## Disabling Intel Turbo Boost on CPU
......@@ -70,7 +74,8 @@ $ qsub ... -l cpu_turbo_boost=false
## Offlining CPU Cores
Not available.
!!! Info
Not available now.
To offline N CPU cores
......@@ -86,16 +91,18 @@ $ qsub ... -l cpu_offline_cores=PATTERN
where PATTERN is a list of core numbers to offline, separated by the character 'c', e.g. "5c11c16c23c"
Hazardous, it causes Lustre threads disruption.
!!! Danger
Hazardous, it causes Lustre threads disruption.
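For example, to offline cores 5, 11, 16, and 23 (the ellipsis stands for your usual qsub options, as in the examples above):

```console
$ qsub ... -l cpu_offline_cores=5c11c16c23c
```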
## Setting Intel Hyper Threading on CPU
Not available, requires changed BIOS settings.
Intel Hyper Threading is disabled by default.
To enable Intel Hyper Threading on allocated nodes CPUs
Intel Hyper Threading is disabled by default. To enable Intel Hyper Threading on the CPUs of the allocated nodes:
```console
$ qsub ... -l cpu_hyper_threading=true
```
!!! Warning
Available on Salomon nodes only.
[1]: software/tools/virtualization.md#tap-interconnect
......@@ -6,13 +6,19 @@
| ------ | ----------- |
| [icc](http://software.intel.com/en-us/intel-compilers/) | Intel C and C++ compilers |
## Data
| Module | Description |
| ------ | ----------- |
| [HDF5](http://www.hdfgroup.org/HDF5/) | HDF5 is a unique technology suite that makes possible the management of extremely large and complex data collections. |
## Devel
| Module | Description |
| ------ | ----------- |
| devel_environment | &nbsp; |
| M4 | &nbsp; |
| ncurses | &nbsp; |
| [devel_environment](https://docs.it4i.cz/software/mic/mic_environment) | Devel environment for intel xeon phi GCC 5.1.1 Python 2.7.12 Perl 5.14.2 CMake 2.8.7 Make 3.82 ncurses 5.9 ... |
| [M4](http://www.gnu.org/software/m4/m4.html) | GNU M4 is an implementation of the traditional Unix macro processor. It is mostly SVR4 compatible although it has some extensions (for example, handling more than 9 positional parameters to macros). GNU M4 also has built-in functions for including files, running shell commands, doing arithmetic, etc. |
| [ncurses](http://www.gnu.org/software/ncurses/) | The Ncurses (new curses) library is a free software emulation of curses in System V Release 4.0, and more. It uses Terminfo format, supports pads and color and multiple highlights and forms characters and function-key mapping, and has all the other SYSV-curses enhancements over BSD Curses. |
## Lang
......@@ -33,6 +39,7 @@
| Module | Description |
| ------ | ----------- |
| GMP | &nbsp; |
| [Octave](http://www.gnu.org/software/octave/) | GNU Octave is a high-level interpreted language, primarily intended for numerical computations. |
## Mpi
......@@ -41,23 +48,32 @@
| ------ | ----------- |
| [impi](http://software.intel.com/en-us/intel-mpi-library/) | Intel MPI Library, compatible with MPICH ABI |
## Numlib
| Module | Description |
| ------ | ----------- |
| [imkl](http://software.intel.com/en-us/intel-mkl/) | Intel Math Kernel Library is a library of highly optimized, extensively threaded math routines for science, engineering, and financial applications that require maximum performance. Core math functions include BLAS, LAPACK, ScaLAPACK, Sparse Solvers, Fast Fourier Transforms, Vector Math, and more. |
## Toolchain
| Module | Description |
| ------ | ----------- |
| [iccifort](http://software.intel.com/en-us/intel-cluster-toolkit-compiler/) | Intel C, C++ & Fortran compilers |
| [ifort](http://software.intel.com/en-us/intel-compilers/) | Intel Fortran compiler |
| [iimpi](http://software.intel.com/en-us/intel-cluster-toolkit-compiler/) | Intel C/C++ and Fortran compilers, alongside Intel MPI. |
| [intel](http://software.intel.com/en-us/intel-cluster-toolkit-compiler/) | Compiler toolchain including Intel compilers, Intel MPI and Intel Math Kernel Library (MKL). |
## Tools
| Module | Description |
| ------ | ----------- |
| bzip2 | &nbsp; |
| cURL | &nbsp; |
| [bzip2](http://www.bzip.org/) | bzip2 is a freely available, patent free, high-quality data compressor. It typically compresses files to within 10% to 15% of the best available techniques (the PPM family of statistical compressors), whilst being around twice as fast at compression and six times faster at decompression. |
| [cURL](http://curl.haxx.se) | libcurl is a free and easy-to-use client-side URL transfer library |
| [expat](http://expat.sourceforge.net/) | Expat is an XML parser library written in C. It is a stream-oriented parser in which an application registers handlers for things the parser might find in the XML document (like start tags) |
| OpenSSL | &nbsp; |
## Vis
| Module | Description |
| ------ | ----------- |
| gettext | &nbsp; |
| [gettext](http://www.gnu.org/software/gettext/) | GNU `gettext' is an important step for the GNU Translation Project, as it is an asset on which we may build many other steps. This package offers to programmers, translators, and even users, a well integrated set of tools and documentation |
......@@ -61,6 +61,7 @@
| [pkg-config](http://www.freedesktop.org/wiki/Software/pkg-config/) | pkg-config is a helper tool used when compiling applications and libraries. It helps you insert the correct compiler options on the command line so an application can use gcc -o test test.c `pkg-config --libs --cflags glib-2.0` for instance, rather than hard-coding values on where to find glib (or other libraries). |
| [Qt](http://qt-project.org/) | Qt is a comprehensive cross-platform C++ application framework. |
| [Qt5](http://qt.io/) | Qt is a comprehensive cross-platform C++ application framework. |
| [sparsehash](https://github.com/sparsehash/sparsehash) | An extremely memory-efficient hash_map implementation. 2 bits/entry overhead! The SparseHash library contains several hash-map implementations, including implementations that optimize for space or speed. |
| [SQLite](http://www.sqlite.org/) | SQLite: SQL Database Engine in a C Library |
| [SWIG](http://www.swig.org/) | SWIG is a software development tool that connects programs written in C and C++ with a variety of high-level programming languages. |
| [xorg-macros](http://cgit.freedesktop.org/xorg/util/macros) | X.org macros utilities. |
......@@ -140,7 +141,7 @@
| Module | Description |
| ------ | ----------- |
| CUDA | &nbsp; |
| [CUDA](https://developer.nvidia.com/cuda-toolkit) | CUDA (formerly Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that they produce. CUDA gives developers access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs. |
| [hwloc](http://www.open-mpi.org/projects/hwloc/) | The Portable Hardware Locality (hwloc) software package provides a portable abstraction (across OS, versions, architectures, ...) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading. It also gathers various system attributes such as cache and memory information as well as the locality of I/O devices such as network interfaces, InfiniBand HCAs or GPUs. It primarily aims at helping applications with gathering information about modern computing hardware so as to exploit it accordingly and efficiently. |
| [libpciaccess](http://cgit.freedesktop.org/xorg/lib/libpciaccess/) | Generic PCI access library. |
......@@ -148,13 +149,14 @@
| Module | Description |
| ------ | ----------- |
| [foss]((none)) | GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK. |
| foss | GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK. |
| [GNU](http://www.gnu.org/software/) | Compiler-only toolchain with GCC and binutils. |
| [gompi]((none)) | GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support. |
| gompi | GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support. |
| [iccifort](http://software.intel.com/en-us/intel-cluster-toolkit-compiler/) | Intel C, C++ and Fortran compilers |
| [iimpi](http://software.intel.com/en-us/intel-cluster-toolkit-compiler/) | Intel C/C++ and Fortran compilers, alongside Intel MPI. |
| [intel](http://software.intel.com/en-us/intel-cluster-toolkit-compiler/) | Intel Cluster Toolkit Compiler Edition provides Intel C/C++ and Fortran compilers, Intel MPI & Intel MKL. |
| [PRACE](http://www.prace-ri.eu/PRACE-Common-Production) | The PRACE Common Production Environment (PCPE) is a set of software tools and libraries that are planned to be available on all PRACE execution sites. The PCPE also defines a set of environment variables that try to make compilation on all sites as homogeneous and simple as possible. |
| [Py](https://www.python.org) | Python 2.7 toolchain |
## Tools
......@@ -162,6 +164,7 @@
| ------ | ----------- |
| [Bash](http://www.gnu.org/software/bash) | Bash is an sh-compatible command language interpreter that executes commands read from the standard input or from a file. Bash also incorporates useful features from the Korn and C shells (ksh and csh). |
| [binutils](http://directory.fsf.org/project/binutils/) | binutils: GNU binary utilities |
| [BLCR](http://crd.lbl.gov/departments/computer-science/CLaSS/research/BLCR/) | Future Technologies Group researchers are developing a hybrid kernel/user implementation of checkpoint/restart. Their goal is to provide a robust, production quality implementation that checkpoints a wide range of applications, without requiring changes to be made to application code. This work focuses on checkpointing parallel applications that communicate through MPI, and on compatibility with the software suite produced by the SciDAC Scalable Systems Software ISIC. |
| [bzip2](http://www.bzip.org/) | bzip2 is a freely available, patent free, high-quality data compressor. It typically compresses files to within 10% to 15% of the best available techniques (the PPM family of statistical compressors), whilst being around twice as fast at compression and six times faster at decompression. |
| [cURL](http://curl.haxx.se) | libcurl is a free and easy-to-use client-side URL transfer library, supporting DICT, FILE, FTP, FTPS, Gopher, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMTP, SMTPS, Telnet and TFTP. libcurl supports SSL certificates, HTTP POST, HTTP PUT, FTP uploading, HTTP form based upload, proxies, cookies, user+password authentication (Basic, Digest, NTLM, Negotiate, Kerberos), file transfer resume, http proxy tunneling and more. |
| [DMTCP](http://dmtcp.sourceforge.net/index.html) | DMTCP (Distributed MultiThreaded Checkpointing) transparently checkpoints a single-host or distributed computation in user-space -- with no modifications to user code or to the O/S. |
......@@ -171,6 +174,7 @@
| [gzip](http://www.gnu.org/software/gzip/) | gzip (GNU zip) is a popular data compression program as a replacement for compress |
| MATLAB | &nbsp; |
| [Mercurial](http://mercurial.selenic.com/) | Mercurial is a free, distributed source control management tool. It efficiently handles projects of any size and offers an easy and intuitive interface. |
| [moreutils](https://joeyh.name/code/moreutils/) | Moreutils is a growing collection of the unix tools that nobody thought to write long ago when unix was young. |
| [numactl](http://oss.sgi.com/projects/libnuma/) | The numactl program allows you to run your application program on specific cpu's and memory nodes. It does this by supplying a NUMA memory policy to the operating system before running your program. The libnuma library provides convenient ways for you to add NUMA memory policies into your own program. |
| pigz | &nbsp; |
| [QEMU](http://wiki.qemu.org/Main_Page) | QEMU is a generic and open source machine emulator and virtualizer. |
......
* ![pdf](img/pdf.png)[PBS Pro Programmer's Guide](http://www.pbsworks.com/pdfs/PBSProgramGuide13.0.pdf)
* ![pdf](img/pdf.png)[PBS Pro Quick Start Guide](http://www.pbsworks.com/pdfs/PBSQuickStartGuide13.0.pdf)
* ![pdf](img/pdf.png)[PBS Pro Reference Guide](http://www.pbsworks.com/pdfs/PBSReferenceGuide13.0.pdf)
* ![pdf](img/pdf.png)[PBS Pro User's Guide](http://www.pbsworks.com/pdfs/PBSUserGuide13.0.pdf)
* ![pdf](img/pdf.png)[PBS Pro Programmer's Guide][1]
* ![pdf](img/pdf.png)[PBS Pro Quick Start Guide][2]
* ![pdf](img/pdf.png)[PBS Pro Reference Guide][3]
* ![pdf](img/pdf.png)[PBS Pro User's Guide][4]
[1]: http://www.pbsworks.com/pdfs/PBSProgramGuide13.0.pdf
[2]: http://www.pbsworks.com/pdfs/PBSQuickStartGuide13.0.pdf
[3]: http://www.pbsworks.com/pdfs/PBSReferenceGuide13.0.pdf
[4]: http://www.pbsworks.com/pdfs/PBSUserGuide13.0.pdf
There's a planned upgrade of Salomon from 2018-12-04 until 2018-12-05.
!!! Warning
This upgrade will introduce a lot of changes with respect to production and user experience.
!!! Hint
You might **need** to **recompile** your binaries.
The Salomon operating system will be upgraded to the latest CentOS 7.6. After the upgrade, we will be able to support the latest software versions and keep the cluster secure with upstream releases.
Major changes are:
* the kernel will be upgraded to 3.10 (currently 2.6.32)
* glibc will be upgraded to 2.17 (currently 2.12)
* software modules/binaries should be recompiled or deleted
## Discontinued Modules
A new tag has been introduced. Modules tagged with **C6** might be malfunctioning. These modules might be recompiled during the transition period. Keep support@it4i.cz informed about malfunctioning modules.
```console
$ ml av intel/
--------------------------- /apps/modules/toolchain ----------------------------