Commit 01cd1f05 authored by Lukáš Krupčík

fix links

parent bd89c66c
Showing
with 70 additions and 73 deletions
......@@ -2,7 +2,7 @@
## Job Queue Policies
The resources are allocated to the job in a fair-share fashion, subject to constraints set by the queue and resources available to the Project. The fair-share at Anselm ensures that individual users may consume approximately equal amount of resources per week. Detailed information in the [Job scheduling](job-priority/) section. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. Following table provides the queue partitioning overview:
The resources are allocated to the job in a fair-share fashion, subject to constraints set by the queue and resources available to the Project. The fair-share at Anselm ensures that individual users may consume approximately equal amount of resources per week. Detailed information in the [Job scheduling](salomon/job-priority/) section. The resources are accessible via several queues for queueing the jobs. The queues provide prioritized and exclusive access to the computational resources. Following table provides the queue partitioning overview:
!!! note
Check the queue status at <https://extranet.it4i.cz/rsweb/salomon/>
......@@ -35,18 +35,18 @@ The resources are allocated to the job in a fair-share fashion, subject to const
## Queue Notes
The job wall-clock time defaults to **half the maximum time**, see table above. Longer wall time limits can be [set manually, see examples](job-submission-and-execution/).
The job wall-clock time defaults to **half the maximum time**, see table above. Longer wall time limits can be [set manually, see examples](salomon/job-submission-and-execution/).
Jobs that exceed the reserved wall-clock time (Req'd Time) get killed automatically. The wall-clock time limit can be changed for queued jobs (state Q) using the qalter command, but it cannot be changed for a running job (state R).
Salomon users may check current queue configuration at <https://extranet.it4i.cz/rsweb/salomon/queues>.
Salomon users may check current queue configuration at [https://extranet.it4i.cz/rsweb/salomon/queues](https://extranet.it4i.cz/rsweb/salomon/queues).
## Queue Status
!!! note
Check the status of jobs, queues and compute nodes at [https://extranet.it4i.cz/rsweb/salomon/](https://extranet.it4i.cz/rsweb/salomon)
![RSWEB Salomon](../img/rswebsalomon.png "RSWEB Salomon")
![RSWEB Salomon](img/rswebsalomon.png "RSWEB Salomon")
Display the queue status on Salomon:
......
......@@ -15,7 +15,7 @@ The Salomon cluster is accessed by SSH protocol via login nodes login1, login2,
| login3.salomon.it4i.cz | 22 | ssh | login3 |
| login4.salomon.it4i.cz | 22 | ssh | login4 |
The authentication is by the [private key](../general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys/)
The authentication is by the [private key](general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys/)
!!! note
Please verify SSH fingerprints during the first logon. They are identical on all login nodes:
......@@ -44,7 +44,7 @@ If you see warning message "UNPROTECTED PRIVATE KEY FILE!", use this command to
local $ chmod 600 /path/to/id_rsa
```
On **Windows**, use [PuTTY ssh client](../general/accessing-the-clusters/shell-access-and-data-transfer/putty.md).
On **Windows**, use [PuTTY ssh client](general/accessing-the-clusters/shell-access-and-data-transfer/putty.md).
After logging in, you will see the command prompt:
......@@ -60,12 +60,12 @@ After logging in, you will see the command prompt:
http://www.it4i.cz/?lang=en
Last login: Tue Jul 9 15:57:38 2013 from your-host.example.com
Last login: Tue Jul 9 15:57:38 2018 from your-host.example.com
[username@login2.salomon ~]$
```
!!! note
The environment is **not** shared between login nodes, except for [shared filesystems](storage/).
The environment is **not** shared between login nodes, except for [shared filesystems](salomon/storage/).
## Data Transfer
......@@ -79,7 +79,7 @@ Data in and out of the system may be transferred by the [scp](http://en.wikipedi
| login3.salomon.it4i.cz | 22 | scp, sftp |
| login4.salomon.it4i.cz | 22 | scp, sftp |
The authentication is by the [private key](../general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys/)
The authentication is by the [private key](general/accessing-the-clusters/shell-access-and-data-transfer/ssh-keys/)
On Linux or Mac, use an scp or sftp client to transfer the data to Salomon:
......@@ -115,7 +115,7 @@ $ man sshfs
On Windows, use [WinSCP client](http://winscp.net/eng/download.php) to transfer the data. The [win-sshfs client](http://code.google.com/p/win-sshfs/) provides a way to mount the Salomon filesystems directly as an external disc.
More information about the shared file systems is available [here](storage/).
More information about the shared file systems is available [here](salomon/storage/).
## Connection Restrictions
......@@ -199,9 +199,9 @@ Now, configure the applications proxy settings to **localhost:6000**. Use port f
## Graphical User Interface
* The [X Window system](../general/accessing-the-clusters/graphical-user-interface/x-window-system/) is a principal way to get GUI access to the clusters.
* The [X Window system](general/accessing-the-clusters/graphical-user-interface/x-window-system/) is a principal way to get GUI access to the clusters.
* The [Virtual Network Computing](../general/accessing-the-clusters/graphical-user-interface/vnc/) is a graphical [desktop sharing](http://en.wikipedia.org/wiki/Desktop_sharing) system that uses the [Remote Frame Buffer protocol](http://en.wikipedia.org/wiki/RFB_protocol) to remotely control another [computer](http://en.wikipedia.org/wiki/Computer).
## VPN Access
* Access to IT4Innovations internal resources via [VPN](../general/accessing-the-clusters/vpn-access/).
* Access to IT4Innovations internal resources via [VPN](general/accessing-the-clusters/vpn-access/).
......@@ -47,9 +47,8 @@ echo Machines: $hl
/ansys_inc/v145/CFX/bin/cfx5solve -def input.def -size 4 -size-ni 4x -part-large -start-method "Platform MPI Distributed Parallel" -par-dist $hl -P aa_r
```
The header of the PBS file (above) is common; its description can be found on [this site](../../job-submission-and-execution/). SVS FEM recommends requesting resources with the keywords nodes and ppn. These keywords directly specify the number of nodes (computers) and cores per node (ppn) to be used in the job. The rest of the code also assumes this structure of allocated resources.
The header of the PBS file (above) is common; its description can be found on [this site](salomon/job-submission-and-execution/). SVS FEM recommends requesting resources with the keywords nodes and ppn. These keywords directly specify the number of nodes (computers) and cores per node (ppn) to be used in the job. The rest of the code also assumes this structure of allocated resources.
The working directory has to be created before sending the PBS job into the queue. The input file should be in the working directory, or the full path to the input file has to be specified. The input file has to be defined by a common CFX def file, which is attached to the CFX solver via the parameter -def
**License** should be selected by parameter -P (capital **P**). Licensed products are the following: aa_r (ANSYS **Academic** Research), ane3fl (ANSYS Multiphysics)-**Commercial**.
[More about licensing here](licensing/)
......@@ -38,7 +38,7 @@ NCORES=`wc -l $PBS_NODEFILE |awk '{print $1}'`
/ansys_inc/v145/fluent/bin/fluent 3d -t$NCORES -cnf=$PBS_NODEFILE -g -i fluent.jou
```
The header of the PBS file (above) is common; its description can be found on [this site](../../resources-allocation-policy/). [SVS FEM](http://www.svsfem.cz) recommends requesting resources with the keywords nodes and ppn. These keywords directly specify the number of nodes (computers) and cores per node (ppn) to be used in the job. The rest of the code also assumes this structure of allocated resources.
The header of the PBS file (above) is common; its description can be found on [this site](salomon/resources-allocation-policy/). [SVS FEM](http://www.svsfem.cz) recommends requesting resources with the keywords nodes and ppn. These keywords directly specify the number of nodes (computers) and cores per node (ppn) to be used in the job. The rest of the code also assumes this structure of allocated resources.
Working directory has to be created before sending pbs job into the queue. Input file should be in working directory or full path to input file has to be specified. Input file has to be defined by common Fluent journal file which is attached to the Fluent solver via parameter -i fluent.jou
......@@ -151,12 +151,12 @@ Fluent could be run in parallel only under Academic Research license. To do so t
ANSLIC_ADMIN Utility will be run
![](../../../img/Fluent_Licence_1.jpg)
![](img/Fluent_Licence_1.jpg)
![](../../../img/Fluent_Licence_2.jpg)
![](img/Fluent_Licence_2.jpg)
![](../../../img/Fluent_Licence_3.jpg)
![](img/Fluent_Licence_3.jpg)
ANSYS Academic Research license should be moved up to the top of the list.
![](../../../img/Fluent_Licence_4.jpg)
![](img/Fluent_Licence_4.jpg)
......@@ -50,6 +50,6 @@ echo Machines: $hl
/ansys_inc/v145/ansys/bin/ansys145 -dis -lsdynampp i=input.k -machines $hl
```
The header of the PBS file (above) is common; its description can be found on [this site](../../resource-allocation-and-job-execution/job-submission-and-execution/). [SVS FEM](http://www.svsfem.cz) recommends requesting resources with the keywords nodes and ppn. These keywords directly specify the number of nodes (computers) and cores per node (ppn) to be used in the job. The rest of the code also assumes this structure of allocated resources.
The header of the PBS file (above) is common; its description can be found on [this site](salomon/resource-allocation-and-job-execution/job-submission-and-execution/). [SVS FEM](http://www.svsfem.cz) recommends requesting resources with the keywords nodes and ppn. These keywords directly specify the number of nodes (computers) and cores per node (ppn) to be used in the job. The rest of the code also assumes this structure of allocated resources.
Working directory has to be created before sending PBS job into the queue. Input file should be in working directory or full path to input file has to be specified. Input file has to be defined by common LS-DYNA .**k** file which is attached to the ansys solver via parameter i=
......@@ -49,8 +49,8 @@ echo Machines: $hl
/ansys_inc/v145/ansys/bin/ansys145 -b -dis -p aa_r -i input.dat -o file.out -machines $hl -dir $WORK_DIR
```
The header of the PBS file (above) is common; its description can be found on [this site](../../resources-allocation-policy/). [SVS FEM](http://www.svsfem.cz) recommends requesting resources with the keywords nodes and ppn. These keywords directly specify the number of nodes (computers) and cores per node (ppn) to be used in the job. The rest of the code also assumes this structure of allocated resources.
The header of the PBS file (above) is common; its description can be found on [this site](salomon/resources-allocation-policy/). [SVS FEM](http://www.svsfem.cz) recommends requesting resources with the keywords nodes and ppn. These keywords directly specify the number of nodes (computers) and cores per node (ppn) to be used in the job. The rest of the code also assumes this structure of allocated resources.
Working directory has to be created before sending PBS job into the queue. Input file should be in working directory or full path to input file has to be specified. Input file has to be defined by common APDL file which is attached to the ansys solver via parameter -i
**License** should be selected by parameter -p. Licensed products are the following: aa_r (ANSYS **Academic** Research), ane3fl (ANSYS Multiphysics)-**Commercial**, aa_r_dy (ANSYS **Academic** AUTODYN) [More about licensing here](licensing/)
**License** should be selected by parameter -p. Licensed products are the following: aa_r (ANSYS **Academic** Research), ane3fl (ANSYS Multiphysics)-**Commercial**, aa_r_dy (ANSYS **Academic** AUTODYN)
......@@ -2,7 +2,7 @@
**[SVS FEM](http://www.svsfem.cz/)**, as the **[ANSYS Channel partner](http://www.ansys.com/)** for the Czech Republic, provided all ANSYS licenses for the ANSELM cluster and supports all ANSYS products (Multiphysics, Mechanical, MAPDL, CFX, Fluent, Maxwell, LS-DYNA...) for IT staff and ANSYS users. If you encounter a problem with ANSYS functionality, contact [hotline@svsfem.cz](mailto:hotline@svsfem.cz?subject=Ostrava%20-%20ANSELM)
Anselm provides both commercial and academic variants. Academic variants are distinguished by the word "**Academic...**" in the license name or by the two-letter prefix "**aa\_**" in the license feature name. The license is changed on the command line or directly in the user's PBS file (see individual products). [More about licensing here](licensing/)
Anselm provides both commercial and academic variants. Academic variants are distinguished by the word "**Academic...**" in the license name or by the two-letter prefix "**aa\_**" in the license feature name. The license is changed on the command line or directly in the user's PBS file (see individual products).
To load the latest version of any ANSYS product (Mechanical, Fluent, CFX, MAPDL,...) load the module:
......
......@@ -40,4 +40,4 @@ The recommend to use version 6.5. Version 6.3 fails on Salomon nodes with accele
Please refer to [the documentation](http://www.nwchem-sw.org/index.php/Release62:Top-level) and in the input file set the following directives :
* MEMORY : controls the amount of memory NWChem will use
* SCRATCH_DIR : set this to a directory in [SCRATCH filesystem](../../storage/storage/) (or run the calculation completely in a scratch directory). For certain calculations, it might be advisable to reduce I/O by forcing "direct" mode, e.g. `scf direct`
* SCRATCH_DIR : set this to a directory in [SCRATCH filesystem](salomon/storage/) (or run the calculation completely in a scratch directory). For certain calculations, it might be advisable to reduce I/O by forcing "direct" mode, e.g. `scf direct`
# Octave
GNU Octave is a high-level interpreted language, primarily intended for numerical computations. It provides capabilities for the numerical solution of linear and nonlinear problems, and for performing other numerical experiments. It also provides extensive graphics capabilities for data visualization and manipulation. Octave is normally used through its interactive command line interface, but it can also be used to write non-interactive programs. The Octave language is quite similar to Matlab so that most programs are easily portable. Read more on <http://www.gnu.org/software/octave/>
GNU Octave is a high-level interpreted language, primarily intended for numerical computations. It provides capabilities for the numerical solution of linear and nonlinear problems, and for performing other numerical experiments. It also provides extensive graphics capabilities for data visualization and manipulation. Octave is normally used through its interactive command line interface, but it can also be used to write non-interactive programs. The Octave language is quite similar to Matlab so that most programs are easily portable. Read more on [http://www.gnu.org/software/octave/](http://www.gnu.org/software/octave/).
Two versions of octave are available on the cluster, via module
......@@ -45,7 +45,7 @@ To run octave in batch mode, write an octave script, then write a bash jobscript
exit
```
This script may be submitted directly to the PBS workload manager via the qsub command. The inputs are in octcode.m file, outputs in output.out file. See the single node jobscript example in the [Job execution section](../../).
This script may be submitted directly to the PBS workload manager via the qsub command. The inputs are in octcode.m file, outputs in output.out file. See the single node jobscript example in the [Job execution section](salomon/job-submission-and-execution).
The octave c compiler mkoctfile calls the GNU gcc 4.8.1 for compiling native c code. This is very useful for running native c subroutines in octave environment.
......
......@@ -3,7 +3,6 @@
## Introduction
LMGC90 is a free and open source software dedicated to multiple physics simulation of discrete material and structures.
More details on the capabilities of LMGC90 are available [here][Welcome].
## Modules
......@@ -70,7 +69,7 @@ The files inside the `DISPLAY` directory can be visualized with paraview. It is
- porofe: porous mechanical mesh
- multife: multi-phasic fluid in porous media mesh
[Welcome]: <http://www.lmgc.univ-montp2.fr/~dubois/LMGC90/Web/Welcome_!.html>
[pre_lmgc]: <http://www.lmgc.univ-montp2.fr/%7Edubois/LMGC90/UserDoc/pre/index.html>
[chipy]: <http://www.lmgc.univ-montp2.fr/%7Edubois/LMGC90/UserDoc/chipy/index.html>
[LMGC90_Postpro.pdf]: <https://git-xen.lmgc.univ-montp2.fr/lmgc90/lmgc90_user/blob/2017.rc1/manuals/LMGC90_Postpro.pdf>
[Welcome](http://www.lmgc.univ-montp2.fr/~dubois/LMGC90/Web/Welcome_!.html)
[pre_lmgc](http://www.lmgc.univ-montp2.fr/%7Edubois/LMGC90/UserDoc/pre/index.html)
[chipy](http://www.lmgc.univ-montp2.fr/%7Edubois/LMGC90/UserDoc/chipy/index.html)
[LMGC90_Postpro.pdf](https://git-xen.lmgc.univ-montp2.fr/lmgc90/lmgc90_user/blob/2017.rc1/manuals/LMGC90_Postpro.pdf)
......@@ -51,4 +51,4 @@ where
After the computation, the newly created result file *RESULT_FILE* in the current directory should contain the results. More detailed result information can be found in the file *res.txt*, which is located in every randomly named folder created by PragTic in the same current directory.
[Welcome]: <http://www.pragtic.com/>
[Welcome](http://www.pragtic.com/)
......@@ -46,7 +46,7 @@ Configuration of the SCRATCH Lustre storage
### Understanding the Lustre File Systems
<http://www.nas.nasa.gov>
[http://www.nas.nasa.gov](http://www.nas.nasa.gov)
A user file on the Lustre file system can be divided into multiple chunks (stripes) and stored across a subset of the object storage targets (OSTs) (disks). The stripes are distributed among the OSTs in a round-robin fashion to ensure load balancing.
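The round-robin distribution described above can be illustrated with a minimal sketch. It is a simplified model only: the function name `stripe_layout` and its parameters are hypothetical, OSTs are numbered from 0, and the real Lustre allocator (which also picks the starting OST) is not modeled.

```python
def stripe_layout(file_size, stripe_size, stripe_count):
    """Map a file onto OSTs in round-robin order (simplified model).

    Returns one list per OST; each list holds (offset, length) chunks
    of the file that would be stored on that OST.
    """
    if stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("stripe_size and stripe_count must be positive")

    layout = [[] for _ in range(stripe_count)]
    offset = 0
    chunk_index = 0
    while offset < file_size:
        # The last chunk may be shorter than a full stripe.
        length = min(stripe_size, file_size - offset)
        # Consecutive chunks go to consecutive OSTs, wrapping around.
        layout[chunk_index % stripe_count].append((offset, length))
        offset += length
        chunk_index += 1
    return layout


if __name__ == "__main__":
    # A 10-unit file with stripe size 3 spread over 2 OSTs.
    for ost, chunks in enumerate(stripe_layout(10, 3, 2)):
        print(f"OST {ost}: {chunks}")
```

With a stripe count of 1, every chunk lands on the same OST, which is why the stripe size has no balancing effect on a single-stripe file.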
......@@ -106,7 +106,7 @@ Another good practice is to make the stripe count be an integral factor of the n
Large stripe size allows each client to have exclusive access to its own part of a file. However, it can be counterproductive in some cases if it does not match your I/O pattern. The choice of stripe size has no effect on a single-stripe file.
Read more on <http://wiki.lustre.org/manual/LustreManual20_HTML/ManagingStripingFreeSpace.html>
Read more on [http://wiki.lustre.org/manual/LustreManual20_HTML/ManagingStripingFreeSpace.html](http://wiki.lustre.org/manual/LustreManual20_HTML/ManagingStripingFreeSpace.html)
## Disk Usage and Quota Commands
......@@ -235,7 +235,7 @@ Users home directories /home/username reside on HOME file system. Accessible cap
The HOME should not be used to archive data of past Projects or other unrelated data.
The files on HOME will not be deleted until end of the [users lifecycle](../general/obtaining-login-credentials/obtaining-login-credentials/).
The files on HOME will not be deleted until end of the [users lifecycle](general/obtaining-login-credentials/obtaining-login-credentials/).
The workspace is backed up, such that it can be restored in case of a catastrophic failure resulting in significant data loss. This backup, however, is not intended to restore old versions of user data or to restore (accidentally) deleted files.
......@@ -332,7 +332,7 @@ It is not recommended to allocate large amount of memory and use large amount of
The Global RAM disk spans the local RAM disks of all the nodes within a single job.
![Global RAM disk](../img/global_ramdisk.png)
![Global RAM disk](img/global_ramdisk.png)
The Global RAM disk deploys
BeeGFS On Demand parallel filesystem, using local RAM disks as a storage backend.
......
......@@ -14,8 +14,8 @@ Remote visualization with NICE DCV software is availabe on two nodes.
## References
* [Graphical User Interface](shell-and-data-access/#graphical-user-interface)
* [VPN Access](shell-and-data-access/#vpn-access)
* [Graphical User Interface](salomon/shell-and-data-access/#graphical-user-interface)
* [VPN Access](salomon/shell-and-data-access/#vpn-access)
## Install and Run
......@@ -25,7 +25,7 @@ Remote visualization with NICE DCV software is availabe on two nodes.
* [Linux download](http://www.nice-software.com/storage/nice-dcv/2016.0/endstation/linux/nice-dcv-endstation-2016.0-17066.run)
* [Windows download](http://www.nice-software.com/storage/nice-dcv/2016.0/endstation/win/nice-dcv-endstation-2016.0-17066-Release.msi)
**Install VPN client** [VPN Access](../general/accessing-the-clusters/vpn-access/) (user-computer)
**Install VPN client** [VPN Access](general/accessing-the-clusters/vpn-access/) (user-computer)
!!! note
Visualisation server is a compute node. You are not able to SSH with your private key. There are two solutions available to solve login issue.
......@@ -36,11 +36,11 @@ Remote visualization with NICE DCV software is availabe on two nodes.
* Generate public fingerprint for your private key with PuTTYgen
![](../img/puttygen.png)
![](img/puttygen.png)
* Add this key to `~/.ssh/authorized_keys` on the cluster
![](../img/addsshkey.png)
![](img/addsshkey.png)
* Use your standard SSH key to connect to visualization server
......@@ -49,17 +49,17 @@ Remote visualization with NICE DCV software is availabe on two nodes.
* Install WinSCP client (user-computer) [Download WinSCP installer](https://winscp.net/download/WinSCP-5.13.3-Setup.exe)
* Add credentials
![](../img/viz1-win.png)
![](img/viz1-win.png)
* Add path to key file
![](../img/viz2-win.png)
![](img/viz2-win.png)
* Save
* Copy `~/.ssh/id_rsa` to your computer
* Convert key to PuTTY format with PuTTYgen
![](../img/puttygenconvert.png)
![](img/puttygenconvert.png)
* Use this new ssh key to connect to visualization server
......@@ -69,12 +69,12 @@ Remote visualization with NICE DCV software is availabe on two nodes.
* [Download PuTTY installer](https://the.earth.li/~sgtatham/putty/latest/w64/putty-64bit-0.70-installer.msi)
* Configure PuTTY
![](../img/viz3-win.png)
![](img/viz3-win.png)
* Add credentials and key file (create 3x sessions: **vizserv1.salomon.it4i.cz**, **vizserv2.salomon.it4i.cz**, **login1.salomon.it4i.cz**)
* Config SSH tunnels (user-computer) (for sessions vizserv1 and vizserv2 only) - ports: **5901**, **5902**, **7300-7305**
![](../img/viz4-win.png)
![](img/viz4-win.png)
* Save
......@@ -110,14 +110,14 @@ $ qsub -I -q qviz -A OPEN-XX-XX -l select=1:ncpus=4:host=vizserv2,walltime=04:00
* vizserv2: localhost:5902
* fill password
![](../img/viz5-win.png)
![](../img/viz6-win.png)
![](img/viz5-win.png)
![](img/viz6-win.png)
**Check DCV status (Salomon-vizservX) in VNC window**
**Run glxgears (Salomon-vizservX)**
![](../img/viz7-win.png)
![](img/viz7-win.png)
**LOGOUT FROM MENU: System->Logout**
......@@ -170,13 +170,13 @@ $ qsub -I -q qviz -A OPEN-XX-XX -l select=1:ncpus=4:host=vizserv2,walltime=04:00
* vizserv2: localhost:5902
* fill password
![](../img/viz1.png)
![](../img/viz2.png)
![](img/viz1.png)
![](img/viz2.png)
**Check DCV status in VNC window**
**Run glxgears**
![](../img/viz3.png)
![](img/viz3.png)
**LOGOUT FROM MENU: System->Logout**
......@@ -222,7 +222,7 @@ sci-libs/umfpack-5.6.2
| libraries | 4 |
| **Total** | **93** |
![graphs](../img/bio-graphs.png)
![graphs](img/bio-graphs.png)
## Other Applications Available Through Gentoo Linux
......
......@@ -13,6 +13,6 @@ VCF files are scanned by this diagnostic tool for known diagnostic disease-assoc
TEAM (27) is an intuitive and easy-to-use web tool that fills the gap between the predicted mutations and the final diagnostic in targeted enrichment sequencing analysis. The tool searches for known diagnostic mutations, corresponding to a disease panel, among the predicted patient’s variants. Diagnostic variants for the disease are taken from four databases of disease-related variants (HGMD, HUMSAVAR, ClinVar, and COSMIC). If no primary diagnostic variant is found, then a list of secondary findings that can help to establish a diagnostic is produced. TEAM also provides an interface for defining and customizing panels, by means of which genes and mutations can be added or discarded to adjust panel definitions.
![Interface of the application. Panels for defining targeted regions of interest can be set up by just drag and drop known disease genes or disease definitions from the lists. Thus, virtual panels can be interactively improved as the knowledge of the disease increases.](../../img/fig5.png)
![Interface of the application. Panels for defining targeted regions of interest can be set up by just drag and drop known disease genes or disease definitions from the lists. Thus, virtual panels can be interactively improved as the knowledge of the disease increases.](img/fig5.png)
**Figure 5.** Interface of the application. Panels for defining targeted regions of interest can be set up by just drag and drop known disease genes or disease definitions from the lists. Thus, virtual panels can be interactively improved as the knowledge of the disease increases.
......@@ -9,7 +9,7 @@ The scope of this OMICS MASTER solution is restricted to human genomics research
The pipeline takes the raw data produced by the sequencing machines as input and runs a processing procedure consisting of quality control, mapping, and variant calling steps, which results in a file containing the set of variants in the sample. From this point, the prioritization component or the diagnostic component can be launched.
![OMICS MASTER solution overview. Data is produced in the external labs and comes to IT4I (represented by the blue dashed line). The data pre-processor converts raw data into a list of variants and annotations for each sequenced patient. These lists files together with primary and secondary (alignment) data files are stored in IT4I sequence DB and uploaded to the discovery (candidate priorization) or diagnostic component where they can be analysed directly by the user that produced
them, depending of the experimental design carried out.](../../img/fig1.png)
them, depending of the experimental design carried out.](img/fig1.png)
Figure 1. OMICS MASTER solution overview. Data is produced in the external labs and comes to IT4I (represented by the blue dashed line). The data pre-processor converts raw data into a list of variants and annotations for each sequenced patient. These lists files together with primary and secondary (alignment) data files are stored in IT4I sequence DB and uploaded to the discovery (candidate prioritization) or diagnostic component where they can be analyzed directly by the user that produced them, depending of the experimental design carried out.
......@@ -41,7 +41,7 @@ Output: FASTQ file plus an HTML file containing statistics on the data.
The FASTQ format represents the nucleotide sequence and its corresponding quality scores.
![FASTQ file.](../../img/fig2.png "fig2.png")
![FASTQ file.](img/fig2.png)
Figure 2. FASTQ file.
#### Mapping
......@@ -81,7 +81,7 @@ corresponding information is unavailable.
The standard CIGAR description of pairwise alignment defines three operations: ‘M’ for match/mismatch, ‘I’ for insertion compared with the reference and ‘D’ for deletion. The extended CIGAR proposed in SAM added four more operations: ‘N’ for skipped bases on the reference, ‘S’ for soft clipping, ‘H’ for hard clipping and ‘P’ for padding. These support splicing, clipping, multi-part and padded alignments. Figure 3 shows examples of CIGAR strings for different types of alignments.
![SAM format file. The ‘@SQ’ line in the header section gives the order of reference sequences. Notably, r001 is the name of a read pair. According to FLAG 163 (=1+2+32+128), the read mapped to position 7 is the second read in the pair (128) and regarded as properly paired (1 + 2); its mate is mapped to 37 on the reverse strand (32). Read r002 has three soft-clipped (unaligned) bases. The coordinate shown in SAM is the position of the first aligned base. The CIGAR string for this alignment contains a P (padding) operation which correctly aligns the inserted sequences. Padding operations can be absent when an aligner does not support multiple sequence alignment. The last six bases of read r003 map to position 9, and the first five to position 29 on the reverse strand. The hard clipping operation H indicates that the clipped sequence is not present in the sequence field. The NM tag gives the number of mismatches. Read r004 is aligned across an intron, indicated by the N operation.](../../img/fig3.png)
![SAM format file. The ‘@SQ’ line in the header section gives the order of reference sequences. Notably, r001 is the name of a read pair. According to FLAG 163 (=1+2+32+128), the read mapped to position 7 is the second read in the pair (128) and regarded as properly paired (1 + 2); its mate is mapped to 37 on the reverse strand (32). Read r002 has three soft-clipped (unaligned) bases. The coordinate shown in SAM is the position of the first aligned base. The CIGAR string for this alignment contains a P (padding) operation which correctly aligns the inserted sequences. Padding operations can be absent when an aligner does not support multiple sequence alignment. The last six bases of read r003 map to position 9, and the first five to position 29 on the reverse strand. The hard clipping operation H indicates that the clipped sequence is not present in the sequence field. The NM tag gives the number of mismatches. Read r004 is aligned across an intron, indicated by the N operation.](img/fig3.png)
Figure 3. SAM format file. The ‘@SQ’ line in the header section gives the order of reference sequences. Notably, r001 is the name of a read pair. According to FLAG 163 (=1+2+32+128), the read mapped to position 7 is the second read in the pair (128) and regarded as properly paired (1 + 2); its mate is mapped to 37 on the reverse strand (32). Read r002 has three soft-clipped (unaligned) bases. The coordinate shown in SAM is the position of the first aligned base. The CIGAR string for this alignment contains a P (padding) operation which correctly aligns the inserted sequences. Padding operations can be absent when an aligner does not support multiple sequence alignment. The last six bases of read r003 map to position 9, and the first five to position 29 on the reverse strand. The hard clipping operation H indicates that the clipped sequence is not present in the sequence field. The NM tag gives the number of mismatches. Read r004 is aligned across an intron, indicated by the N operation.
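The FLAG decomposition and the CIGAR operations described above can be checked with a small sketch. It is illustrative only and not part of the pipeline: the helper names `decode_flag` and `parse_cigar` and the flag labels are hypothetical, the bit values follow the SAM specification, and only the seven CIGAR operations named in the text are accepted. The sample string `3S6M1P1I4M` is an assumed example of a read with soft clipping and padding.

```python
import re

# SAM FLAG bits (values from the SAM specification); the labels are
# informal names chosen for this sketch.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}

# Only the operations discussed in the text: M, I, D, N, S, H, P.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP])")
_CIGAR_FULL = re.compile(r"(?:\d+[MIDNSHP])+")


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    if not _CIGAR_FULL.fullmatch(cigar):
        raise ValueError(f"unsupported CIGAR string: {cigar!r}")
    return [(int(length), op) for length, op in _CIGAR_OP.findall(cigar)]


if __name__ == "__main__":
    print(decode_flag(163))           # 163 = 1 + 2 + 32 + 128
    print(parse_cigar("3S6M1P1I4M"))  # soft clip, match, padding, insertion, match
```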
......@@ -124,8 +124,7 @@ VCF (3) is a standardized format for storing the most prevalent types of sequenc
A VCF file consists of a header section and a data section. The header contains an arbitrary number of metainformation lines, each starting with characters ‘##’, and a TAB delimited field definition line, starting with a single ‘#’ character. The meta-information header lines provide a standardized description of tags and annotations used in the data section. The use of meta-information allows the information stored within a VCF file to be tailored to the dataset in question. It can be also used to provide information about the means of file creation, date of creation, version of the reference sequence, software used and any other information relevant to the history of the file. The field definition line names eight mandatory columns, corresponding to data columns representing the chromosome (CHROM), a 1-based position of the start of the variant (POS), unique identifiers of the variant (ID), the reference allele (REF), a comma separated list of alternate non-reference alleles (ALT), a phred-scaled quality score (QUAL), site filtering information (FILTER) and a semicolon separated list of additional, user extensible annotation (INFO). In addition, if samples are present in the file, the mandatory header columns are followed by a FORMAT column and an arbitrary number of sample IDs that define the samples included in the VCF file. The FORMAT column is used to define the information contained within each subsequent genotype column, which consists of a colon separated list of fields. For example, the FORMAT field GT:GQ:DP in the fourth data entry of Figure 1a indicates that the subsequent entries contain information regarding the genotype, genotype quality and read depth for each sample. All data lines are TAB delimited and the number of fields in each data line must match the number of fields in the header line. It is strongly recommended that all annotation tags used are declared in the VCF header section.
![a) Example of valid VCF. The header lines ##fileformat and #CHROM are mandatory, the rest is optional but strongly recommended. Each line of the body describes variants present in the sampled population at one genomic position or region. All alternate alleles are listed in the ALT column and referenced from the genotype fields as 1-based indexes to
this list; the reference haplotype is designated as 0. For multiploid data, the separator indicates whether the data are phased (|) or unphased (/). Thus, the two alleles C and G at the positions 2 and 5 in this figure occur on the same chromosome in SAMPLE1. The first data line shows an example of a deletion (present in SAMPLE1) and a replacement of
two bases by another base (SAMPLE2); the second line shows a SNP and an insertion; the third a SNP; the fourth a large structural variant described by the annotation in the INFO column, the coordinate is that of the base before the variant. (b–f) Alignments and VCF representations of different sequence variants: SNP, insertion, deletion, replacement, and a large deletion. The REF column shows the reference bases replaced by the haplotype in the ALT column. The coordinate refers to the first reference base. (g) Users are advised to use the simplest representation possible and the lowest coordinate in cases where the position is ambiguous.](../../img/fig4.png)
this list; the reference haplotype is designated as 0. For multiploid data, the separator indicates whether the data are phased (|) or unphased (/). Thus, the two alleles C and G at the positions 2 and 5 in this figure occur on the same chromosome in SAMPLE1. The first data line shows an example of a deletion (present in SAMPLE1) and a replacement of two bases by another base (SAMPLE2); the second line shows a SNP and an insertion; the third a SNP; the fourth a large structural variant described by the annotation in the INFO column, the coordinate is that of the base before the variant. (b–f) Alignments and VCF representations of different sequence variants: SNP, insertion, deletion, replacement, and a large deletion. The REF column shows the reference bases replaced by the haplotype in the ALT column. The coordinate refers to the first reference base. (g) Users are advised to use the simplest representation possible and the lowest coordinate in cases where the position is ambiguous.](img/fig4.png)
Figure 4. (a) Example of valid VCF. The header lines ##fileformat and #CHROM are mandatory, the rest is optional but strongly recommended. Each line of the body describes variants present in the sampled population at one genomic position or region. All alternate alleles are listed in the ALT column and referenced from the genotype fields as 1-based indexes to this list; the reference haplotype is designated as 0. For multiploid data, the separator indicates whether the data are phased (|) or unphased (/). Thus, the two alleles C and G at the positions 2 and 5 in this figure occur on the same chromosome in SAMPLE1. The first data line shows an example of a deletion (present in SAMPLE1) and a replacement of two bases by another base (SAMPLE2); the second line shows a SNP and an insertion; the third a SNP; the fourth a large structural variant described by the annotation in the INFO column, the coordinate is that of the base before the variant. (b–f) Alignments and VCF representations of different sequence variants: SNP, insertion, deletion, replacement, and a large deletion. The REF column shows the reference bases replaced by the haplotype in the ALT column. The coordinate refers to the first reference base. (g) Users are advised to use the simplest representation possible and the lowest coordinate in cases where the position is ambiguous.
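The VCF layout described above can be illustrated with a short Python sketch. This is only a minimal illustration and not part of the pipeline: the function name `parse_vcf` and the sample record in `SAMPLE_VCF` are made up for this example, and real files are better handled by a dedicated VCF library. The sketch separates the `##` meta-information lines from the `#CHROM` field definition line, checks that each data line has as many fields as the header line, and uses the FORMAT column to decode the per-sample genotype fields.

```python
# Hypothetical, hand-written VCF content used only for illustration.
SAMPLE_VCF = """##fileformat=VCFv4.1
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE1\tSAMPLE2
1\t5\trs12\tA\tG,T\t67\tPASS\tDP=14\tGT:GQ:DP\t0|1:48:8\t1/2:43:5
"""


def parse_vcf(text):
    """Split VCF text into meta-information lines, column names and records."""
    meta, columns, records = [], None, []
    for line in text.splitlines():
        if not line:
            continue
        if line.startswith("##"):
            # Meta-information line: keep the text after the '##' prefix.
            meta.append(line[2:])
        elif line.startswith("#"):
            # Field definition line: TAB delimited column names.
            columns = line[1:].split("\t")
        else:
            fields = line.split("\t")
            if columns is None or len(fields) != len(columns):
                raise ValueError("data line does not match the header line")
            # The first eight columns are the mandatory ones.
            record = dict(zip(columns[:8], fields[:8]))
            record["POS"] = int(record["POS"])       # 1-based position
            record["ALT"] = record["ALT"].split(",")  # comma separated alleles
            # INFO is a semicolon separated list of key=value items (or flags).
            record["INFO"] = dict(
                item.partition("=")[::2] for item in record["INFO"].split(";")
            )
            if len(columns) > 9:
                # FORMAT names the colon separated fields of each sample column.
                keys = fields[8].split(":")
                record["samples"] = {
                    name: dict(zip(keys, value.split(":")))
                    for name, value in zip(columns[9:], fields[9:])
                }
            records.append(record)
    return meta, columns, records


meta, columns, records = parse_vcf(SAMPLE_VCF)
print(records[0]["ALT"], records[0]["samples"]["SAMPLE1"])
```

In the sample record, the genotype `0|1` of SAMPLE1 refers to the reference allele (0) and the first entry of the ALT list (1), and the `|` separator marks the data as phased.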
......@@ -167,9 +166,9 @@ Systems biology
We also import systems biology information such as interactome information from IntAct (24). Reactome (25) stores pathway and interaction information in BioPAX (26) format. The BioPAX data exchange format enables the integration of diverse pathway resources. We successfully solved the problem of storing data released in BioPAX format in a SQL relational schema, which allowed us to import Reactome into CellBase.
### [Diagnostic Component (TEAM)](diagnostic-component-team/)
### [Diagnostic Component (TEAM)](software/bio/omics-master/diagnostic-component-team/)
### [Prioritization Component (BiERApp)](priorization-component-bierapp/)
### [Prioritization Component (BiERApp)](software/bio/omics-master/priorization-component-bierapp/)
## Usage
......@@ -264,7 +263,7 @@ The ped file ( file.ped) contains the following info:
FAM sample_B 0 0 2 2
```
Now, let's load the NGSPipeline module and copy the sample data to a [scratch directory](../../salomon/storage/):
Now, let's load the NGSPipeline module and copy the sample data to a [scratch directory](salomon/storage/):
```console
$ ml ngsPipeline
......@@ -278,7 +277,7 @@ Now, we can launch the pipeline (replace OPEN-0-0 with your Project ID):
$ ngsPipeline -i /scratch/$USER/omics/sample_data/data -o /scratch/$USER/omics/results -p /scratch/$USER/omics/sample_data/data/file.ped --project OPEN-0-0 --queue qprod
```
This command submits the processing [jobs to the queue](../../salomon/job-submission-and-execution/).
This command submits the processing [jobs to the queue](salomon/job-submission-and-execution/).
To re-launch the pipeline from stage 4 through stage 20, use the following command:
......@@ -342,19 +341,19 @@ The output folder contains all the subfolders with the intermediate data. This f
Once the file has been uploaded, a panel must be chosen from the Panel list. Pressing the Run button then starts the diagnostic process. TEAM first searches for known diagnostic mutation(s) taken from four databases: HGMD-public (20), [HUMSAVAR](http://www.uniprot.org/docs/humsavar), ClinVar (29) and COSMIC (23).
![The panel manager. The elements used to define a panel are (A) disease terms, (B) diagnostic mutations and (C) genes. Arrows represent actions that can be taken in the panel manager. Panels can be defined by using the known mutations and genes of a particular disease. This can be done by dragging them to the Primary Diagnostic box (action D). This action, in addition to defining the diseases in the Primary Diagnostic box, automatically adds the corresponding genes to the Genes box. The panels can be customized by adding new genes (action F) or removing undesired genes (action G). New disease mutations can be added independently or associated to an already existing disease term (action E). Disease terms can be removed by simply dragging them back (action H).](../../img/fig7x.png)
![The panel manager. The elements used to define a panel are (A) disease terms, (B) diagnostic mutations and (C) genes. Arrows represent actions that can be taken in the panel manager. Panels can be defined by using the known mutations and genes of a particular disease. This can be done by dragging them to the Primary Diagnostic box (action D). This action, in addition to defining the diseases in the Primary Diagnostic box, automatically adds the corresponding genes to the Genes box. The panels can be customized by adding new genes (action F) or removing undesired genes (action G). New disease mutations can be added independently or associated to an already existing disease term (action E). Disease terms can be removed by simply dragging them back (action H).](img/fig7x.png)
Figure 7. The panel manager. The elements used to define a panel are (A) disease terms, (B) diagnostic mutations and (C) genes. Arrows represent actions that can be taken in the panel manager. Panels can be defined by using the known mutations and genes of a particular disease. This can be done by dragging them to the Primary Diagnostic box (action D). This action, in addition to defining the diseases in the Primary Diagnostic box, automatically adds the corresponding genes to the Genes box. The panels can be customized by adding new genes (action F) or removing undesired genes (action G). New disease mutations can be added independently or associated to an already existing disease term (action E). Disease terms can be removed by simply dragging them back (action H).
For variant discovery and filtering, upload the VCF file into BierApp using the following form:
\![BierApp VCF upload panel. It is recommended to choose a name for the job as well as a description.](../../img/fig8.png)\
![BierApp VCF upload panel. It is recommended to choose a name for the job as well as a description.](img/fig8.png)\
Figure 8. BierApp VCF upload panel. It is recommended to choose a name for the job as well as a description.
Each prioritization (‘job’) has three associated screens that facilitate the filtering steps. The first one, the ‘Summary’ tab, displays statistics for the analyzed data set, including the samples analyzed, the number and types of variants found, and their distribution according to consequence types. The second screen, in the ‘Variants and effect’ tab, is the actual filtering tool, and the third one, the ‘Genome view’ tab, offers a representation of the selected variants within the genomic context provided by an embedded version of the Genome Maps Tool (30).
![This picture shows all the information associated to the variants. If a variant has an associated phenotype we could see it in the last column. In this case, the variant 7:132481242 CT is associated to the phenotype: large intestine tumor.](../../img/fig9.png)
![This picture shows all the information associated to the variants. If a variant has an associated phenotype we could see it in the last column. In this case, the variant 7:132481242 CT is associated to the phenotype: large intestine tumor.](img/fig9.png)
Figure 9. This picture shows all the information associated to the variants. If a variant has an associated phenotype we could see it in the last column. In this case, the variant 7:132481242 CT is associated to the phenotype: large intestine tumor.
......
......@@ -13,7 +13,7 @@ BiERapp is available at the [following address](http://omics.it4i.cz/bierapp/)
BiERapp (28) efficiently helps in the identification of causative variants in family and sporadic genetic diseases. The program reads lists of predicted variants (nucleotide substitutions and indels) in affected individuals or tumor samples and controls. In family studies, different modes of inheritance can easily be defined to filter out variants that do not segregate with the disease along the family. Moreover, BiERapp integrates additional information such as allelic frequencies in the general population and the most popular damaging scores to further narrow down the number of putative variants in successive filtering steps. BiERapp provides an interactive and user-friendly interface that implements the filtering strategy used in the context of a large-scale genomic project carried out by the Spanish Network for Research in Rare Diseases (CIBERER) and the Medical Genome Project, in which more than 800 exomes have been analyzed.
![Web interface to the prioritization tool. This figure shows the interface of the web tool for candidate gene prioritization with the filters available. The tool includes a genomic viewer (Genome Maps 30) that enables the representation of the variants in the corresponding genomic coordinates.](../../img/fig6.png)
![Web interface to the prioritization tool. This figure shows the interface of the web tool for candidate gene prioritization with the filters available. The tool includes a genomic viewer (Genome Maps 30) that enables the representation of the variants in the corresponding genomic coordinates.](img/fig6.png)
**Figure 6**. Web interface to the prioritization tool. This figure shows the interface of the web tool for candidate gene prioritization with the filters available. The tool includes a genomic viewer (Genome Maps 30) that enables the representation of the variants in the corresponding genomic coordinates.
......@@ -18,7 +18,7 @@ On the clusters COMSOL is available in the latest stable version. There are two
* **Non commercial** or so called **EDU variant**, which can be used for research and educational purposes.
* **Commercial** or so called **COM variant**, which can also be used for commercial activities. The **COM variant** has only a subset of the features available in the **EDU variant**. More about licensing [here](licensing-and-available-versions/).
* **Commercial** or so called **COM variant**, which can also be used for commercial activities. The **COM variant** has only a subset of the features available in the **EDU variant**. More about licensing [here](software/cae/comsol/licensing-and-available-versions/).
To load COMSOL, load the module
......@@ -32,7 +32,7 @@ By default the **EDU variant** will be loaded. If user needs other version or va
$ ml av COMSOL
```
If the user needs to prepare COMSOL jobs in interactive mode, it is recommended to use COMSOL on the compute nodes via the PBS Pro scheduler. To run the COMSOL Desktop GUI on Windows, it is recommended to use [Virtual Network Computing (VNC)](../../general/accessing-the-clusters/graphical-user-interface/x-window-system/).
If the user needs to prepare COMSOL jobs in interactive mode, it is recommended to use COMSOL on the compute nodes via the PBS Pro scheduler. To run the COMSOL Desktop GUI on Windows, it is recommended to use [Virtual Network Computing (VNC)](general/accessing-the-clusters/graphical-user-interface/x-window-system/).
Example for Salomon:
......@@ -76,7 +76,7 @@ Working directory has to be created before sending the (comsol.pbs) job script i
COMSOL is a software package for the numerical solution of partial differential equations. LiveLink for MATLAB allows connection to the COMSOL API (Application Programming Interface) with the benefits of the programming language and computing environment of MATLAB.
LiveLink for MATLAB is available in both the **EDU** and **COM** **variants** of the COMSOL release. On the clusters, 1 commercial (**COM**) license and 5 educational (**EDU**) licenses of LiveLink for MATLAB are available (see the [ISV Licenses](../isv_licenses/)). The following example shows how to start a COMSOL model from MATLAB via LiveLink in interactive mode (on Anselm, use 16 threads).
LiveLink for MATLAB is available in both the **EDU** and **COM** **variants** of the COMSOL release. On the clusters, 1 commercial (**COM**) license and 5 educational (**EDU**) licenses of LiveLink for MATLAB are available (see the [ISV Licenses](software/isv_licenses/)). The following example shows how to start a COMSOL model from MATLAB via LiveLink in interactive mode (on Anselm, use 16 threads).
```console
$ xhost +
......
......@@ -35,7 +35,7 @@ Molpro is compiled for parallel execution using MPI and OpenMP. By default, Molp
!!! note
The OpenMP parallelization in Molpro is limited and has been observed to produce limited scaling. We therefore recommend using MPI parallelization only. This can be achieved by passing the option mpiprocs=16:ompthreads=1 to PBS.
You are advised to use the -d option to point to a directory in [SCRATCH file system - Salomon](../../salomon/storage/). Molpro can produce a large amount of temporary data during its run, and it is important that these are placed in the fast scratch file system.
You are advised to use the -d option to point to a directory in [SCRATCH file system - Salomon](salomon/storage/). Molpro can produce a large amount of temporary data during its run, and it is important that these are placed in the fast scratch file system.
### Example jobscript
......