diff --git a/docs.it4i/software/bio/bioinformatics.md b/docs.it4i/software/bio/bioinformatics.md
new file mode 100644
index 0000000000000000000000000000000000000000..91de9ca9cce57d66c005ee919a0444a852660fac
--- /dev/null
+++ b/docs.it4i/software/bio/bioinformatics.md
@@ -0,0 +1,235 @@
+# Bioinformatics Applications
+
+## Introduction
+
+In addition to the many applications available through modules (deployed via the EasyBuild packaging system), we provide an alternative source of applications on our clusters derived from [Gentoo Linux](https://www.gentoo.org/). The user's environment is set up through a script which returns a bash instance to the user (you can think of it as starting a whole virtual machine, but inside your current namespace). The applications were optimized with the GCC compiler for the SandyBridge and IvyBridge platforms. The binaries use paths under the /apps/gentoo prefix to find the required runtime dependencies, config files, etc. Gentoo Linux is a standalone installation that does not even rely on the glibc provided by the host operating system (Red Hat). The trick which allowed us to install Gentoo Linux on the host Red Hat system is called Gentoo::RAP and uses a modified loader with a hardcoded path ([links](https://wiki.gentoo.org/wiki/Prefix/libc)).
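+
+To see the Gentoo::RAP mechanism in action, you can inspect a prefix-installed binary and check that its ELF interpreter points under /apps/gentoo rather than to the system loader (the binary path below is only illustrative; pick any tool installed in the prefix):
+
+```console
+mmokrejs@login2~$ readelf -l /apps/gentoo/usr/bin/bwa | grep -A1 INTERP
+```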
+
+## Starting the Environment
+
+```console
+mmokrejs@login2~$ /apps/gentoo/startprefix
+```
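+
+The command above drops you into a bash instance rooted in the Gentoo prefix, so tools on the PATH now come from /apps/gentoo. A short illustrative session (the gcc call merely verifies which toolchain is active, and exit returns you to the host environment):
+
+```console
+$ gcc --version
+$ exit
+```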
+
+## Starting PBS Jobs Using the Applications
+
+Create a template file which can be used as an argument to the qsub command. Notably, the 'PBS -S' line specifies the full path to the Bourne shell of the Gentoo Linux environment.
+
+```console
+mmokrejs@login2~$ cat myjob.pbs
+#PBS -S /apps/gentoo/bin/sh
+#PBS -l nodes=1:ppn=16,walltime=12:00:00
+#PBS -q qfree
+#PBS -M my_email@foo.bar
+#PBS -m ea
+#PBS -N sample22
+#PBS -A DD-13-5
+#source ~/.bashrc
+
+cd $PBS_O_WORKDIR || exit 255
+
+myscript.sh foo 1>myjob.log 2>&1
+
+$ head -n 1 myscript.sh
+#! /apps/gentoo/bin/sh
+$ qsub myjob.pbs
+$ qstat
+```
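+
+The referenced myscript.sh must itself start with the Gentoo shell shebang. A minimal, hypothetical sketch (the bwa commands and file names are only an illustration):
+
+```console
+mmokrejs@login2~$ cat myscript.sh
+#! /apps/gentoo/bin/sh
+# hypothetical example: align paired-end reads for the sample name given as $1
+bwa index reference.fa
+bwa mem reference.fa "$1"_1.fq "$1"_2.fq > "$1".sam
+```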
+
+## Reading Manual Pages for Installed Applications
+
+```console
+mmokrejs@login2~$ man -M /apps/gentoo/usr/share/man bwa
+mmokrejs@login2~$ man -M /apps/gentoo/usr/share/man samtools
+```
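+
+Alternatively, you can prepend the Gentoo man directory to MANPATH for the current session, so the -M option is no longer needed:
+
+```console
+mmokrejs@login2~$ export MANPATH=/apps/gentoo/usr/share/man:$MANPATH
+mmokrejs@login2~$ man samtools
+```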
+
+## Listing of Bioinformatics Applications
+
+```console
+mmokrejs@login2~$ grep biology /scratch/mmokrejs/gentoo_rap/installed.txt
+sci-biology/ANGLE-bin-20080813-r1
+sci-biology/AlignGraph-9999
+sci-biology/Atlas-Link-0.01-r1
+sci-biology/BRANCH-9999
+sci-biology/EBARDenovo-1.2.2
+sci-biology/FLASH-1.2.9
+sci-biology/GAL-0.2.2
+sci-biology/Gambit-0.4.145
+sci-biology/HTSeq-0.6.1
+sci-biology/InterMine-0.98
+sci-biology/MochiView-1.45
+sci-biology/MuSeqBox-5.4
+sci-biology/ONTO-PERL-1.41
+sci-biology/ORFcor-20130507
+sci-biology/Rcorrector-9999
+sci-biology/SSAKE-3.8.2
+sci-biology/STAR-9999
+sci-biology/YASRA-2.33
+sci-biology/abacas-1.3.1
+sci-biology/align_to_scf-1.06
+sci-biology/assembly-stats-9999
+sci-biology/bambus-2.33
+sci-biology/bamtools-9999
+sci-biology/bcftools-1.2
+sci-biology/bedtools-2.22.1
+sci-biology/bfast-0.7.0a
+sci-biology/biobambam2-9999
+sci-biology/bismark-0.13.0
+sci-biology/blat-34-r1
+sci-biology/blue-1.1.3
+sci-biology/bowtie-2.2.9
+sci-biology/brat-1.2.4
+sci-biology/bwa-0.7.13
+sci-biology/bx-python-9999
+sci-biology/cast-bin-20080813
+sci-biology/cd-hit-4.6.5
+sci-biology/cdbfasta-0.1
+sci-biology/clover-2011.10.24
+sci-biology/clustalw-2.1
+sci-biology/cnrun-2.0.3
+sci-biology/codonw-1.4.4-r2
+sci-biology/conform-gt-1174
+sci-biology/conifer-0.2.2
+sci-biology/coral-1.4
+sci-biology/cross_genome-20140822
+sci-biology/cutadapt-9999
+sci-biology/dawg-1.1.2
+sci-biology/dna2pep-1.1
+sci-biology/edena-3.131028
+sci-biology/epga-9999
+sci-biology/erpin-5.5b
+sci-biology/estscan-3.0.3
+sci-biology/eugene-4.1d
+sci-biology/exonerate-gff3-9999
+sci-biology/fastx_toolkit-0.0.14
+sci-biology/gemini-9999
+sci-biology/geneid-1.4.4
+sci-biology/genepop-4.2.1
+sci-biology/glimmerhmm-3.0.1-r1
+sci-biology/gmap-2015.12.31.5
+sci-biology/hexamer-19990330
+sci-biology/hts-python-9999
+sci-biology/jellyfish-2.1.4
+sci-biology/jigsaw-3.2.10
+sci-biology/kallisto-9999
+sci-biology/karect-1.0.0
+sci-biology/lastz-1.03.66
+sci-biology/libgtextutils-0.6.1
+sci-biology/lucy-1.20
+sci-biology/megahit-9999
+sci-biology/merlin-1.1.2
+sci-biology/miranda-3.3a
+sci-biology/mreps-2.5
+sci-biology/mrfast-2.6.0.1
+sci-biology/mummer-3.22-r1
+sci-biology/muscle-3.8.31
+sci-biology/nrcl-110625
+sci-biology/nwalign-0.3.1
+sci-biology/oases-9999
+sci-biology/parafly-20130121
+sci-biology/phrap-1.080812-r1
+sci-biology/phred-071220
+sci-biology/phylip-3.696-r1
+sci-biology/plinkseq-0.10
+sci-biology/primer3-2.3.7
+sci-biology/prinseq-lite-0.20.4
+sci-biology/proda-1.0
+sci-biology/pybedtools-0.6.9
+sci-biology/pysam-0.9.0
+sci-biology/pysamstats-0.24.2
+sci-biology/quast-2.3
+sci-biology/quorum-1.0.0
+sci-biology/reaper-15348
+sci-biology/repeatmasker-libraries-20150807
+sci-biology/reptile-1.1
+sci-biology/samstat-20130708
+sci-biology/samtools-0.1.20-r2
+sci-biology/samtools-1.3-r1
+sci-biology/scaffold_builder-20131122-r1
+sci-biology/scan_for_matches-20121220
+sci-biology/screed-0.7.1
+sci-biology/scythe-0.992
+sci-biology/seqan-2.1.1
+sci-biology/seqtools-4.34.5
+sci-biology/sff_dump-1.04
+sci-biology/sgp2-1.1
+sci-biology/shrimp-2.2.3
+sci-biology/sickle-9999
+sci-biology/smalt-0.7.6
+sci-biology/snpomatic-9999
+sci-biology/ssaha2-bin-2.5.5
+sci-biology/stampy-1.0.28
+sci-biology/stringtie-1.2.2
+sci-biology/subread-1.4.6
+sci-biology/swissknife-1.72
+sci-biology/tagdust-20101028
+sci-biology/tclust-110625
+sci-biology/tigr-foundation-libs-2.0-r1
+sci-biology/trans-abyss-1.4.8
+sci-biology/trf-4.07b
+sci-biology/uchime-4.2.40
+sci-biology/velvet-1.2.10
+sci-biology/velvetk-20120606
+sci-biology/zmsort-110625
+```
+
+```console
+mmokrejs@login2~$ grep sci-libs /scratch/mmokrejs/gentoo_rap/installed.txt
+sci-libs/amd-2.3.1
+sci-libs/blas-reference-20151113-r1
+sci-libs/camd-2.3.1
+sci-libs/cbflib-0.9.3.3
+sci-libs/ccolamd-2.8.0
+sci-libs/cholmod-2.1.2
+sci-libs/coinor-cbc-2.8.9
+sci-libs/coinor-cgl-0.58.6
+sci-libs/coinor-clp-1.15.6-r1
+sci-libs/coinor-dylp-1.9.4
+sci-libs/coinor-osi-0.106.6
+sci-libs/coinor-utils-2.9.11
+sci-libs/coinor-vol-1.4.4
+sci-libs/colamd-2.8.0
+sci-libs/cxsparse-3.1.2
+sci-libs/dcmtk-3.6.0
+sci-libs/gsl-2.1
+sci-libs/hdf5-1.8.15_p1
+sci-libs/htslib-1.3
+sci-libs/io_lib-1.14.7
+sci-libs/lapack-reference-3.6.0-r1
+sci-libs/lemon-1.3-r2
+sci-libs/libmaus2-9999
+sci-libs/qrupdate-1.1.2-r1
+sci-libs/scikits-0.1-r1
+sci-libs/suitesparseconfig-4.2.1
+sci-libs/umfpack-5.6.2
+```
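+
+The same installed.txt file can be queried for a particular tool or used to count the installed packages, for example:
+
+```console
+mmokrejs@login2~$ grep -c '^sci-biology/' /scratch/mmokrejs/gentoo_rap/installed.txt
+mmokrejs@login2~$ grep -i samtools /scratch/mmokrejs/gentoo_rap/installed.txt
+```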
+
+## Classification of Applications
+
+| Applications for bioinformatics at IT4I  | Count  |
+| ---------------------------------------- | ------ |
+| error-correctors                        | 6      |
+| aligners                                | 20     |
+| clusterers                              | 5      |
+| assemblers                              | 9      |
+| scaffolders                             | 6      |
+| motif searching                         | 6      |
+| ORF/gene prediction/genome annotation   | 13     |
+| genotype/haplotype/population genetics  | 3      |
+| phylogenetics                           | 1      |
+| transcriptome analysis                  | 2      |
+| utilities                               | 15     |
+| GUI                                     | 3      |
+| libraries                               | 4      |
+| **Total**                               | **93** |
+
+![graphs](../img/bio-graphs.png)
+
+## Other Applications Available Through Gentoo Linux
+
+Gentoo Linux allows compilation of its applications from source code, using compiler and optimization flags of the user's choice. This facilitates the creation of binaries optimized for the host platform. Users may also choose among several versions of GCC, Python, and other tools.
+
+```console
+mmokrejs@login2~$ gcc-config -l
+mmokrejs@login2~$ java-config -L
+mmokrejs@login2~$ eselect
+```
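+
+For example, the active versions can be listed and switched as follows (the slot/profile numbers are only illustrative; pick one reported by the list commands):
+
+```console
+mmokrejs@login2~$ eselect python list
+mmokrejs@login2~$ eselect python set 2
+mmokrejs@login2~$ gcc-config 1
+```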
diff --git a/docs.it4i/software/bio/omics-master/diagnostic-component-team.md b/docs.it4i/software/bio/omics-master/diagnostic-component-team.md
new file mode 100644
index 0000000000000000000000000000000000000000..24dc717781a881901310c127739d1e873d151a6b
--- /dev/null
+++ b/docs.it4i/software/bio/omics-master/diagnostic-component-team.md
@@ -0,0 +1,18 @@
+# Diagnostic component (TEAM)
+
+## Access
+
+TEAM is available at the [following address](http://omics.it4i.cz/team/).
+
+!!! note
+    The address is accessible only via VPN.
+
+## Diagnostic Component
+
+VCF files are scanned by this diagnostic tool for known diagnostic disease-associated variants. When no diagnostic mutation is found, the file can be sent to the disease-causing gene discovery tool to see whether new disease associated variants can be found.
+
+TEAM (27) is an intuitive and easy-to-use web tool that fills the gap between the predicted mutations and the final diagnosis in targeted enrichment sequencing analysis. The tool searches for known diagnostic mutations, corresponding to a disease panel, among the predicted patient's variants. Diagnostic variants for the disease are taken from four databases of disease-related variants (HGMD-public, HUMSAVAR, ClinVar, and COSMIC). If no primary diagnostic variant is found, a list of secondary findings that can help to establish a diagnosis is produced. TEAM also provides an interface for the definition and customization of panels, by means of which genes and mutations can be added or discarded to adjust panel definitions.
+
+![Interface of the application. Panels for defining targeted regions of interest can be set up by just drag and drop known disease genes or disease definitions from the lists. Thus, virtual panels can be interactively improved as the knowledge of the disease increases.](../../img/fig5.png)
+
+**Figure 5.** Interface of the application. Panels for defining targeted regions of interest can be set up by simply dragging and dropping known disease genes or disease definitions from the lists. Thus, virtual panels can be interactively improved as the knowledge of the disease increases.
diff --git a/docs.it4i/software/bio/omics-master/overview.md b/docs.it4i/software/bio/omics-master/overview.md
new file mode 100644
index 0000000000000000000000000000000000000000..e29f1daec829dd7af8a93409314a2caef755625d
--- /dev/null
+++ b/docs.it4i/software/bio/omics-master/overview.md
@@ -0,0 +1,391 @@
+# Overview
+
+The human NGS data processing solution
+
+## Introduction
+
+The scope of this OMICS MASTER solution is restricted to human genomics research (disease causing gene discovery in whole human genome or exome) or diagnosis (panel sequencing), although it could be extended in the future to other usages.
+
+The pipeline inputs the raw data produced by the sequencing machines and applies a processing procedure that consists of quality control, mapping, and variant calling steps, resulting in a file containing the set of variants in the sample. From this point, the prioritization component or the diagnostic component can be launched.
+
+![OMICS MASTER solution overview. Data is produced in the external labs and comes to IT4I (represented by the blue dashed line). The data pre-processor converts raw data into a list of variants and annotations for each sequenced patient. These list files, together with primary and secondary (alignment) data files, are stored in the IT4I sequence DB and uploaded to the discovery (candidate prioritization) or diagnostic component, where they can be analyzed directly by the user that produced them, depending on the experimental design carried out.](../../img/fig1.png)
+
+Figure 1. OMICS MASTER solution overview. Data is produced in the external labs and comes to IT4I (represented by the blue dashed line). The data pre-processor converts raw data into a list of variants and annotations for each sequenced patient. These list files, together with primary and secondary (alignment) data files, are stored in the IT4I sequence DB and uploaded to the discovery (candidate prioritization) or diagnostic component, where they can be analyzed directly by the user that produced them, depending on the experimental design carried out.
+
+Typical genomics pipelines are composed of several components that need to be launched manually. The advantage of the OMICS MASTER pipeline is that all these components are invoked sequentially in an automated way.
+
+The OMICS MASTER pipeline inputs a FASTQ file and outputs an enriched VCF file. The pipeline is able to queue all the jobs to PBS by launching a single process that takes all the necessary input files and creates the intermediate and final folders.
+
+Let us look at each of the OMICS MASTER solution components:
+
+## Components
+
+### Processing
+
+This component is composed of a set of programs that carry out quality control, alignment, realignment, variant calling, and variant annotation. It turns raw data from the sequencing machine into files containing lists of variants (VCF) that, once annotated, can be used by the following components (discovery and diagnosis).
+
+We distinguish two types of sequencing instruments: bench sequencers (MiSeq, IonTorrent, and Roche Junior, although the last one is about to be discontinued), which produce relatively low throughput (tens of millions of reads), and high-end sequencers, which produce high throughput (hundreds of millions of reads), among which we have Illumina HiSeq 2000 (and newer models) and SOLiD. All of them but SOLiD produce data in sequence format. SOLiD produces data in a special format called color space that requires specific software for the mapping process. Once the mapping has been done, the rest of the pipeline is identical. SOLiD is also about to be discontinued by the manufacturer, so this type of data will be scarce in the future.
+
+#### Quality Control, Preprocessing and Statistics for FASTQ
+
+Component: Hpg-Fastq & FastQC.
+
+These steps are carried out over the original FASTQ file with optimized scripts and include the following: sequence cleansing, estimation of base quality scores, elimination of duplicates, and statistics.
+
+Input: FASTQ file.
+
+Output: FASTQ file plus an HTML file containing statistics on the data.
+
+FASTQ format: it represents the nucleotide sequence and its corresponding quality scores.
+
+![FASTQ file.](../../img/fig2.png "fig2.png")
+Figure 2. FASTQ file.
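+
+For reference, a minimal FASTQ record (four lines per read: identifier, sequence, separator, and per-base qualities) might look like this hypothetical example:
+
+```console
+$ cat example.fastq
+@read_001
+GATTTGGGGTTCAAAGCAGT
++
+IIIIIIIIIIIIIIIIIIII
+```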
+
+#### Mapping
+
+Component: Hpg-aligner.
+
+Sequence reads are mapped over the human reference genome. SOLiD reads are not covered by this solution; they should be mapped with specific software (among the few available options, SHRiMP seems to be the best one). For the rest of NGS machine outputs we use HPG Aligner. HPG-Aligner is an innovative solution, based on a combination of mapping with BWT and local alignment with Smith-Waterman (SW), that drastically increases mapping accuracy (97% versus 62-70% by current mappers, in the most common scenarios). This proposal provides a simple and fast solution that maps almost all the reads, even those containing a high number of mismatches or indels.
+
+Input: FASTQ file.
+
+Output: Aligned file in BAM format.
+
+#### Sequence Alignment/Map (SAM)
+
+It is a human readable tab-delimited format in which each read and its alignment is represented on a single line. The format can represent unmapped reads, reads that are mapped to unique locations, and reads that are mapped to multiple locations.
+
+The SAM format (1) consists of one header section and one alignment section. The lines in the header section start with character ‘@’, and lines in the alignment section do not. All lines are TAB delimited.
+
+In SAM, each alignment line has 11 mandatory fields and a variable number of optional fields. The mandatory fields are briefly described in Table 1. They must be present, but their value can be a '*' or a zero (depending on the field) if the corresponding information is unavailable.
+
+|  No.      |  Name      |  Description                                          |
+| --------- | ---------- | ----------------------------------------------------- |
+| 1         | QNAME      | Query NAME of the read or the read pair               |
+| 2         | FLAG       | Bitwise FLAG (pairing,strand,mate strand,etc.)        |
+| 3         | RNAME      | Reference sequence NAME                               |
+| 4         | POS        | 1-Based leftmost POSition of clipped alignment        |
+| 5         | MAPQ       | MAPping Quality (Phred-scaled)                        |
+| 6         | CIGAR      | Extended CIGAR string (operations:MIDNSHP)            |
+| 7         | MRNM       | Mate REference NaMe ('=' if same RNAME)               |
+| 8         | MPOS       | 1-Based leftmost Mate POSition                        |
+| 9         | ISIZE      | Inferred Insert SIZE                                  |
+| 10        | SEQ        | Query SEQuence on the same strand as the reference    |
+| 11        | QUAL       | Query QUALity (ASCII-33=Phred base quality)           |
+
+Table 1. Mandatory fields in the SAM format.
+
+The standard CIGAR description of pairwise alignment defines three operations: ‘M’ for match/mismatch, ‘I’ for insertion compared with the reference and ‘D’ for deletion. The extended CIGAR proposed in SAM added four more operations: ‘N’ for skipped bases on the reference, ‘S’ for soft clipping, ‘H’ for hard clipping and ‘P’ for padding. These support splicing, clipping, multi-part and padded alignments. Figure 3 shows examples of CIGAR strings for different types of alignments.
+
+![SAM format file. The ‘@SQ’ line in the header section gives the order of reference sequences. Notably, r001 is the name of a read pair. According to FLAG 163 (=1+2+32+128), the read mapped to position 7 is the second read in the pair (128) and regarded as properly paired (1 + 2); its mate is mapped to 37 on the reverse strand (32). Read r002 has three soft-clipped (unaligned) bases. The coordinate shown in SAM is the position of the first aligned base. The CIGAR string for this alignment contains a P (padding) operation which correctly aligns the inserted sequences. Padding operations can be absent when an aligner does not support multiple sequence alignment. The last six bases of read r003 map to position 9, and the first five to position 29 on the reverse strand. The hard clipping operation H indicates that the clipped sequence is not present in the sequence field. The NM tag gives the number of mismatches. Read r004 is aligned across an intron, indicated by the N operation.](../../img/fig3.png)
+
+Figure 3. SAM format file. The ‘@SQ’ line in the header section gives the order of reference sequences. Notably, r001 is the name of a read pair. According to FLAG 163 (=1+2+32+128), the read mapped to position 7 is the second read in the pair (128) and regarded as properly paired (1 + 2); its mate is mapped to 37 on the reverse strand (32). Read r002 has three soft-clipped (unaligned) bases. The coordinate shown in SAM is the position of the first aligned base. The CIGAR string for this alignment contains a P (padding) operation which correctly aligns the inserted sequences. Padding operations can be absent when an aligner does not support multiple sequence alignment. The last six bases of read r003 map to position 9, and the first five to position 29 on the reverse strand. The hard clipping operation H indicates that the clipped sequence is not present in the sequence field. The NM tag gives the number of mismatches. Read r004 is aligned across an intron, indicated by the N operation.
+
+##### Binary Alignment/Map (BAM)
+
+BAM is the binary representation of SAM and keeps exactly the same information as SAM. BAM uses lossless compression to reduce the size of the data by about 75% and provides an indexing system that allows reads that overlap a region of the genome to be retrieved and rapidly traversed.
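+
+For example, a BAM file is typically sorted, indexed, and then queried by region with samtools (file and region names below are illustrative; the -o syntax corresponds to samtools 1.x):
+
+```console
+$ samtools sort alignment.bam -o alignment.sorted.bam
+$ samtools index alignment.sorted.bam
+$ samtools view alignment.sorted.bam 20:100000-200000
+```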
+
+#### Quality Control, Preprocessing and Statistics for BAM
+
+Component: Hpg-Fastq & FastQC.
+
+Some features:
+
+* Quality control: reads with N errors, reads with multiple mappings, strand bias, paired-end insert
+* Filtering: by number of errors, number of hits
+* Comparator: stats, intersection, ...
+
+Input: BAM file.
+
+Output: BAM file plus an HTML file containing statistics.
+
+#### Variant Calling
+
+Component: GATK.
+
+Identification of single nucleotide variants and indels on the alignments is performed using the Genome Analysis Toolkit (GATK). GATK (2) is a software package developed at the Broad Institute to analyze high-throughput sequencing data. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality assurance.
+
+Input: BAM file.
+
+Output: VCF file.
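+
+Within OMICS MASTER this step is fully automated, but for reference a stand-alone GATK 3.x invocation of this kind might look like the following sketch (file names are illustrative):
+
+```console
+$ java -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R reference.fa -I sample.sorted.bam -o sample.vcf
+```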
+
+#### Variant Call Format (VCF)
+
+VCF (3) is a standardized format for storing the most prevalent types of sequence variation, including SNPs, indels and larger structural variants, together with rich annotations. The format was developed with the primary intention to represent human genetic variation, but its use is not restricted to diploid genomes and can be used in different contexts as well. Its flexibility and user extensibility allows representation of a wide variety of genomic variation with respect to a single reference sequence.
+
+A VCF file consists of a header section and a data section. The header contains an arbitrary number of metainformation lines, each starting with characters ‘##’, and a TAB delimited field definition line, starting with a single ‘#’ character. The meta-information header lines provide a standardized description of tags and annotations used in the data section. The use of meta-information allows the information stored within a VCF file to be tailored to the dataset in question. It can be also used to provide information about the means of file creation, date of creation, version of the reference sequence, software used and any other information relevant to the history of the file. The field definition line names eight mandatory columns, corresponding to data columns representing the chromosome (CHROM), a 1-based position of the start of the variant (POS), unique identifiers of the variant (ID), the reference allele (REF), a comma separated list of alternate non-reference alleles (ALT), a phred-scaled quality score (QUAL), site filtering information (FILTER) and a semicolon separated list of additional, user extensible annotation (INFO). In addition, if samples are present in the file, the mandatory header columns are followed by a FORMAT column and an arbitrary number of sample IDs that define the samples included in the VCF file. The FORMAT column is used to define the information contained within each subsequent genotype column, which consists of a colon separated list of fields. For example, the FORMAT field GT:GQ:DP in the fourth data entry of Figure 1a indicates that the subsequent entries contain information regarding the genotype, genotype quality and read depth for each sample. All data lines are TAB delimited and the number of fields in each data line must match the number of fields in the header line. It is strongly recommended that all annotation tags used are declared in the VCF header section.
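+
+As a minimal illustration of the layout described above (not taken from Figure 4), a VCF with a single sample and one SNP could look like this:
+
+```console
+$ cat example.vcf
+##fileformat=VCFv4.1
+#CHROM  POS    ID         REF  ALT  QUAL  FILTER  INFO   FORMAT    SAMPLE1
+20      14370  rs6054257  G    A    29    PASS    DP=14  GT:GQ:DP  0|1:48:8
+```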
+
+![a) Example of valid VCF. The header lines ##fileformat and #CHROM are mandatory, the rest is optional but strongly recommended. Each line of the body describes variants present in the sampled population at one genomic position or region. All alternate alleles are listed in the ALT column and referenced from the genotype fields as 1-based indexes to
+this list; the reference haplotype is designated as 0. For multiploid data, the separator indicates whether the data are phased (|) or unphased (/). Thus, the two alleles C and G at the positions 2 and 5 in this figure occur on the same chromosome in SAMPLE1. The first data line shows an example of a deletion (present in SAMPLE1) and a replacement of
+two bases by another base (SAMPLE2); the second line shows a SNP and an insertion; the third a SNP; the fourth a large structural variant described by the annotation in the INFO column, the coordinate is that of the base before the variant. (b–f ) Alignments and VCF representations of different sequence variants: SNP, insertion, deletion, replacement, and a large deletion. The REF columns shows the reference bases replaced by the haplotype in the ALT column. The coordinate refers to the first reference base. (g) Users are advised to use simplest representation possible and lowest coordinate in cases where the position is ambiguous.](../../img/fig4.png)
+
+Figure 4. (a) Example of a valid VCF. The header lines ##fileformat and #CHROM are mandatory, the rest is optional but strongly recommended. Each line of the body describes variants present in the sampled population at one genomic position or region. All alternate alleles are listed in the ALT column and referenced from the genotype fields as 1-based indexes to this list; the reference haplotype is designated as 0. For multiploid data, the separator indicates whether the data are phased (|) or unphased (/). Thus, the two alleles C and G at the positions 2 and 5 in this figure occur on the same chromosome in SAMPLE1. The first data line shows an example of a deletion (present in SAMPLE1) and a replacement of two bases by another base (SAMPLE2); the second line shows a SNP and an insertion; the third a SNP; the fourth a large structural variant described by the annotation in the INFO column, the coordinate is that of the base before the variant. (b–f) Alignments and VCF representations of different sequence variants: SNP, insertion, deletion, replacement, and a large deletion. The REF column shows the reference bases replaced by the haplotype in the ALT column. The coordinate refers to the first reference base. (g) Users are advised to use the simplest representation possible and the lowest coordinate in cases where the position is ambiguous.
+
+### Annotating
+
+Component: HPG-Variant
+
+The functional consequences of every variant found are then annotated using the HPG-Variant software, which extracts from CellBase, the knowledge database, all the relevant information on the predicted pathologic effect of the variants.
+
+VARIANT (VARIant Analysis Tool) (4) reports information on the variants found, including consequence type and annotations taken from different databases and repositories (SNPs and variants from dbSNP and 1000 Genomes, and disease-related variants from the Genome-Wide Association Study (GWAS) catalog, Online Mendelian Inheritance in Man (OMIM), Catalog of Somatic Mutations in Cancer (COSMIC), etc.). VARIANT also produces a rich variety of annotations that include information on the regulatory (transcription factor or miRNA binding sites, etc.) or structural roles, or on the selective pressures on the sites affected by the variation. This information allows extending the conventional reports beyond the coding regions and expands the knowledge on the contribution of non-coding or synonymous variants to the phenotype studied.
+
+Input: VCF
+
+Output: the output of this step is the Variant Call Format (VCF) file, which contains changes with respect to the reference genome with the corresponding QC and functional annotations.
+
+#### CellBase
+
+CellBase (5) is a relational database that integrates biological information from different sources and includes:
+
+Core features
+
+We took genome sequences, genes, transcripts, exons, cytobands, and cross-reference (xref) identifiers (IDs) from Ensembl (6). Protein information, including sequences, xrefs, and protein features (natural variants, mutagenesis sites, post-translational modifications, etc.), was imported from UniProt (7).
+
+Regulatory
+
+CellBase imports miRNA from miRBase (8); curated and non-curated miRNA targets from miRecords (9), miRTarBase (10), TargetScan (11), and microRNA.org (12); and CpG islands and conserved regions from the UCSC database (13).
+
+Functional annotation
+
+OBO Foundry (14) develops many biomedical ontologies that are implemented in OBO format. We designed a SQL schema to store these OBO ontologies and 30 ontologies were imported. OBO ontology term annotations were taken from Ensembl (6). InterPro (15) annotations were also imported.
+
+Variation
+
+CellBase includes SNPs from dbSNP (16); SNP population frequencies from HapMap (17), the 1000 Genomes project (18), and Ensembl (6); phenotypically annotated SNPs imported from the NHGRI GWAS Catalog (19), HGMD (20), the Open Access GWAS Database (21), UniProt (7), and OMIM (22); mutations from COSMIC (23); and structural variations from Ensembl (6).
+
+Systems biology
+
+We also import systems biology information such as interactome data from IntAct (24). Reactome (25) stores pathway and interaction information in BioPAX (26) format. The BioPAX data exchange format enables the integration of diverse pathway resources. We successfully solved the problem of storing data released in BioPAX format into a SQL relational schema, which allowed us to import Reactome into CellBase.
+
+### [Diagnostic Component (TEAM)](diagnostic-component-team/)
+
+### [Prioritization Component (BiERapp)](priorization-component-bierapp/)
+
+## Usage
+
+First of all, we should load the ngsPipeline module:
+
+```console
+$ ml ngsPipeline
+```
+
+This command will load the python/2.7.5 module and all the required modules (hpg-aligner, gatk, etc.).
+
+If we launch ngsPipeline with ‘-h’, we will get the usage help:
+
+```console
+$ ngsPipeline -h
+    Usage: ngsPipeline.py [-h] -i INPUT -o OUTPUT -p PED --project PROJECT --queue
+      QUEUE [--stages-path STAGES_PATH] [--email EMAIL]
+     [--prefix PREFIX] [-s START] [-e END] --log
+
+    Python pipeline
+
+    optional arguments:
+      -h, --help show this help message and exit
+      -i INPUT, --input INPUT
+      -o OUTPUT, --output OUTPUT
+        Output Data directory
+      -p PED, --ped PED     Ped file with all individuals
+      --project PROJECT     Project Id
+      --queue QUEUE         Queue Id
+      --stages-path STAGES_PATH
+        Custom Stages path
+      --email EMAIL         Email
+      --prefix PREFIX       Prefix name for Queue Jobs name
+      -s START, --start START
+        Initial stage
+      -e END, --end END     Final stage
+      --log       Log to file
+```
+
+Let us see a brief description of the arguments:
+
+* `-h`, `--help`. Show the help.
+* `-i`, `--input`. The input data directory. This directory must have a special structure: we have to create one folder per sample (with the same name). These folders will host the FASTQ files, which must follow the pattern "sampleName" + "_" + "1 or 2" + ".fq": 1 for the first pair (in paired-end sequences) and 2 for the second one.
+* `-o`, `--output`. The output folder. This folder will contain all the intermediate and final folders. When the pipeline has finished, we can remove the intermediate folders and keep only the final one (with the VCF file containing all the variants).
+* `-p`, `--ped`. The ped file with the pedigree. This file contains all the sample names. These names must coincide with the names of the input folders. If our input folder contains more samples than the .ped file, the pipeline will use only the samples from the .ped file.
+* `--email`. Email for PBS notifications.
+* `--prefix`. Prefix for PBS job names.
+* `-s`, `--start` and `-e`, `--end`. Initial and final stage. If we want to start the pipeline at a specific stage we must use -s; if we want to end the pipeline at a specific stage we must use -e.
+* `--log`. With the log argument, NGSpipeline will write all the logs to this file.
+* `--project`. Project ID of your supercomputer allocation.
+* `--queue`. [Queue](../../salomon/resources-allocation-policy/) to run the jobs in.
+
+Input, output and ped arguments are mandatory. If the output folder does not exist, the pipeline will create it.
+
+## Examples
+
+This is an example usage of NGSpipeline:
+
+We have a folder with the following structure in `/apps/bio/omics/1.0/sample_data/`:
+
+```console
+    /apps/bio/omics/1.0/sample_data
+    └── data
+        ├── file.ped
+        ├── sample1
+        │   ├── sample1_1.fq
+        │   └── sample1_2.fq
+        └── sample2
+            ├── sample2_1.fq
+            └── sample2_2.fq
+```
+
+The ped file (file.ped) contains the following info:
+
+```console
+    #family_ID sample_ID parental_ID maternal_ID sex phenotype
+    FAM sample_A 0 0 1 1
+    FAM sample_B 0 0 2 2
+```
+
+Now, let's load the ngsPipeline module and copy the sample data to a [scratch directory](../../salomon/storage/):
+
+```console
+$ ml ngsPipeline
+$ mkdir -p /scratch/$USER/omics/results
+$ cp -r /apps/bio/omics/1.0/sample_data /scratch/$USER/omics/
+```
+
+Now, we can launch the pipeline (replace OPEN-0-0 with your Project ID):
+
+```console
+$ ngsPipeline -i /scratch/$USER/omics/sample_data/data -o /scratch/$USER/omics/results -p /scratch/$USER/omics/sample_data/data/file.ped --project OPEN-0-0 --queue qprod
+```
+
+This command submits the processing [jobs to the queue](../../salomon/job-submission-and-execution/).
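+
+The individual stages appear as regular PBS jobs, so their progress can be checked with standard commands, e.g.:
+
+```console
+$ qstat -u $USER
+```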
+
+If we want to re-launch the pipeline from stage 4 until stage 20, we should use the following command:
+
+```console
+$ ngsPipeline -i /scratch/$USER/omics/sample_data/data -o /scratch/$USER/omics/results -p /scratch/$USER/omics/sample_data/data/file.ped -s 4 -e 20 --project OPEN-0-0 --queue qprod
+```
+
+## Details on the Pipeline
+
+The pipeline calls the following tools:
+
+* [fastqc](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), a quality control tool for high throughput sequence data.
+* [gatk](https://www.broadinstitute.org/gatk/), The Genome Analysis Toolkit (GATK) is a software package developed at the Broad Institute to analyze high-throughput sequencing data. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality assurance. Its robust architecture, powerful processing engine and high-performance computing features make it capable of taking on projects of any size.
+* [hpg-aligner](https://github.com/opencb-hpg/hpg-aligner), HPG Aligner has been designed to align short and long reads with high sensitivity, therefore any number of mismatches or indels are allowed. HPG Aligner implements and combines two well known algorithms: _Burrows-Wheeler Transform_ (BWT) to speed up mapping of high-quality reads, and _Smith-Waterman_ (SW) to increase sensitivity when reads cannot be mapped using BWT.
+* [hpg-fastq](http://docs.bioinfo.cipf.es/projects/fastqhpc/wiki), a quality control tool for high throughput sequence data.
+* [hpg-variant](http://docs.bioinfo.cipf.es/projects/hpg-variant/wiki), The HPG Variant suite is an ambitious project aimed to provide a complete suite of tools to work with genomic variation data, from VCF tools to variant profiling or genomic statistics. It is being implemented using High Performance Computing technologies to provide the best performance possible.
+* [picard](http://picard.sourceforge.net/), Picard comprises Java-based command-line utilities that manipulate SAM files, and a Java API (HTSJDK) for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported.
+* [samtools](http://samtools.sourceforge.net/samtools-c.shtml), SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.
+* [snpEff](http://snpeff.sourceforge.net/), a genetic variant annotation and effect prediction toolbox.
+
+This listing shows which tools are used in each step of the pipeline:
+
+* stage-00: fastqc
+* stage-01: hpg_fastq
+* stage-02: fastqc
+* stage-03: hpg_aligner and samtools
+* stage-04: samtools
+* stage-05: samtools
+* stage-06: fastqc
+* stage-07: picard
+* stage-08: fastqc
+* stage-09: picard
+* stage-10: gatk
+* stage-11: gatk
+* stage-12: gatk
+* stage-13: gatk
+* stage-14: gatk
+* stage-15: gatk
+* stage-16: samtools
+* stage-17: samtools
+* stage-18: fastqc
+* stage-19: gatk
+* stage-20: gatk
+* stage-21: gatk
+* stage-22: gatk
+* stage-23: gatk
+* stage-24: hpg-variant
+* stage-25: hpg-variant
+* stage-26: snpEff
+* stage-27: snpEff
+* stage-28: hpg-variant
+
+## Interpretation
+
+The output folder contains all the subfolders with the intermediate data, as well as the final VCF with all the variants. This file can be uploaded into [TEAM](diagnostic-component-team/) by using the VCF file button. It is important to note here that the entire management of the VCF file is local: no patient's sequence data are sent over the Internet, thus avoiding any problem of data privacy or confidentiality.
+
+![TEAM upload panel. Once the file has been uploaded, a panel must be chosen from the Panel list. Then, pressing the Run button the diagnostic process starts.](../../img/fig7.png)
+
+Figure 7. TEAM upload panel. Once the file has been uploaded, a panel must be chosen from the Panel list. Then, pressing the Run button the diagnostic process starts.
+
+Once the file has been uploaded, a panel must be chosen from the Panel list. Then, pressing the Run button the diagnostic process starts. TEAM searches first for known diagnostic mutation(s) taken from four databases: HGMD-public (20), [HUMSAVAR](http://www.uniprot.org/docs/humsavar), ClinVar (29) and COSMIC (23).
+
+![The panel manager. The elements used to define a panel are (A) disease terms, (B) diagnostic mutations and (C) genes. Arrows represent actions that can be taken in the panel manager. Panels can be defined by using the known mutations and genes of a particular disease. This can be done by dragging them to the Primary Diagnostic box (action D). This action, in addition to defining the diseases in the Primary Diagnostic box, automatically adds the corresponding genes to the Genes box. The panels can be customized by adding new genes (action F) or removing undesired genes (action G). New disease mutations can be added independently or associated to an already existing disease term (action E). Disease terms can be removed by simply dragging them back (action H).](../../img/fig7x.png)
+
+Figure 7. The panel manager. The elements used to define a panel are (A) disease terms, (B) diagnostic mutations, and (C) genes. Arrows represent actions that can be taken in the panel manager. Panels can be defined by using the known mutations and genes of a particular disease. This can be done by dragging them to the Primary Diagnostic box (action D). This action, in addition to defining the diseases in the Primary Diagnostic box, automatically adds the corresponding genes to the Genes box. The panels can be customized by adding new genes (action F) or removing undesired genes (action G). New disease mutations can be added independently or associated to an already existing disease term (action E). Disease terms can be removed by simply dragging them back (action H).
+
+For variant discovery/filtering, we should upload the VCF file into BiERapp using the following form:
+
+![BiERapp VCF upload panel. It is recommended to choose a name for the job as well as a description.](../../img/fig8.png)
+
+Figure 8. BiERapp VCF upload panel. It is recommended to choose a name for the job as well as a description.
+
+Each prioritization (‘job’) has three associated screens that facilitate the filtering steps. The first one, the ‘Summary’ tab, displays statistics of the data set analyzed, including the samples analyzed, the number and types of variants found, and their distribution according to consequence types. The second screen, the ‘Variants and effect’ tab, is the actual filtering tool, and the third one, the ‘Genome view’ tab, offers a representation of the selected variants within the genomic context provided by an embedded version of the Genome Maps tool (30).
+
+![This picture shows all the information associated with the variants. If a variant has an associated phenotype, we can see it in the last column. In this case, the variant 7:132481242 CT is associated with the phenotype: large intestine tumor.](../../img/fig9.png)
+
+Figure 9. This picture shows all the information associated with the variants. If a variant has an associated phenotype, we can see it in the last column. In this case, the variant 7:132481242 CT is associated with the phenotype: large intestine tumor.
+
+## References
+
+1. Heng Li, Bob Handsaker, Alec Wysoker, Tim Fennell, Jue Ruan, Nils Homer, Gabor Marth, Goncalo Abecasis, Richard Durbin and 1000 Genome Project Data Processing Subgroup: The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009, 25: 2078-2079.
+1. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA: The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 2010, 20:1297-1303.
+1. Petr Danecek, Adam Auton, Goncalo Abecasis, Cornelis A. Albers, Eric Banks, Mark A. DePristo, Robert E. Handsaker, Gerton Lunter, Gabor T. Marth, Stephen T. Sherry, Gilean McVean, Richard Durbin, and 1000 Genomes Project Analysis Group. The variant call format and VCFtools. Bioinformatics 2011, 27: 2156-2158.
+1. Medina I, De Maria A, Bleda M, Salavert F, Alonso R, Gonzalez CY, Dopazo J: VARIANT: Command Line, Web service and Web interface for fast and accurate functional characterization of variants found by Next-Generation Sequencing. Nucleic Acids Res 2012, 40:W54-58.
+1. Bleda M, Tarraga J, de Maria A, Salavert F, Garcia-Alonso L, Celma M, Martin A, Dopazo J, Medina I: CellBase, a comprehensive collection of RESTful web services for retrieving relevant biological information from heterogeneous sources. Nucleic Acids Res 2012, 40:W609-614.
+1. Flicek,P., Amode,M.R., Barrell,D., Beal,K., Brent,S., Carvalho-Silva,D., Clapham,P., Coates,G., Fairley,S., Fitzgerald,S. et al. (2012) Ensembl 2012. Nucleic Acids Res., 40, D84–D90.
+1. UniProt Consortium. (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res., 40, D71–D75.
+1. Kozomara,A. and Griffiths-Jones,S. (2011) miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res., 39, D152–D157.
+1. Xiao,F., Zuo,Z., Cai,G., Kang,S., Gao,X. and Li,T. (2009) miRecords: an integrated resource for microRNA-target interactions. Nucleic Acids Res., 37, D105–D110.
+1. Hsu,S.D., Lin,F.M., Wu,W.Y., Liang,C., Huang,W.C., Chan,W.L., Tsai,W.T., Chen,G.Z., Lee,C.J., Chiu,C.M. et al. (2011) miRTarBase: a database curates experimentally validated microRNA-target interactions. Nucleic Acids Res., 39, D163–D169.
+1. Friedman,R.C., Farh,K.K., Burge,C.B. and Bartel,D.P. (2009) Most mammalian mRNAs are conserved targets of microRNAs. Genome Res., 19, 92–105.
+1. Betel,D., Wilson,M., Gabow,A., Marks,D.S. and Sander,C. (2008) The microRNA.org resource: targets and expression. Nucleic Acids Res., 36, D149–D153.
+1. Dreszer,T.R., Karolchik,D., Zweig,A.S., Hinrichs,A.S., Raney,B.J., Kuhn,R.M., Meyer,L.R., Wong,M., Sloan,C.A., Rosenbloom,K.R. et al. (2012) The UCSC genome browser database: extensions and updates 2011. Nucleic Acids Res.,40, D918–D923.
+1. Smith,B., Ashburner,M., Rosse,C., Bard,J., Bug,W., Ceusters,W., Goldberg,L.J., Eilbeck,K., Ireland,A., Mungall,C.J. et al. (2007) The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat. Biotechnol., 25, 1251–1255.
+1. Hunter,S., Jones,P., Mitchell,A., Apweiler,R., Attwood,T.K.,Bateman,A., Bernard,T., Binns,D., Bork,P., Burge,S. et al. (2012) InterPro in 2011: new developments in the family and domain prediction database. Nucleic Acids Res.,40, D306–D312.
+1. Sherry,S.T., Ward,M.H., Kholodov,M., Baker,J., Phan,L., Smigielski,E.M. and Sirotkin,K. (2001) dbSNP: the NCBI database of genetic variation. Nucleic Acids Res., 29, 308–311.
+1. Altshuler,D.M., Gibbs,R.A., Peltonen,L., Dermitzakis,E., Schaffner,S.F., Yu,F., Bonnen,P.E., de Bakker,P.I., Deloukas,P., Gabriel,S.B. et al. (2010) Integrating common and rare genetic variation in diverse human populations. Nature, 467, 52–58.
+1. 1000 Genomes Project Consortium. (2010) A map of human genome variation from population-scale sequencing. Nature, 467, 1061–1073.
+1. Hindorff,L.A., Sethupathy,P., Junkins,H.A., Ramos,E.M., Mehta,J.P., Collins,F.S. and Manolio,T.A. (2009) Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl Acad. Sci. USA, 106, 9362–9367.
+1. Stenson,P.D., Ball,E.V., Mort,M., Phillips,A.D., Shiel,J.A., Thomas,N.S., Abeysinghe,S., Krawczak,M. and Cooper,D.N. (2003) Human gene mutation database (HGMD): 2003 update. Hum. Mutat., 21, 577–581.
+1. Johnson,A.D. and O’Donnell,C.J. (2009) An open access database of genome-wide association results. BMC Med. Genet, 10, 6.
+1. McKusick,V. (1998) A Catalog of Human Genes and Genetic Disorders, 12th edn. John Hopkins University Press,Baltimore, MD.
+1. Forbes,S.A., Bindal,N., Bamford,S., Cole,C., Kok,C.Y., Beare,D., Jia,M., Shepherd,R., Leung,K., Menzies,A. et al. (2011) COSMIC: mining complete cancer genomes in the catalogue of somatic mutations in cancer. Nucleic Acids Res., 39, D945–D950.
+1. Kerrien,S., Aranda,B., Breuza,L., Bridge,A., Broackes-Carter,F., Chen,C., Duesbury,M., Dumousseau,M., Feuermann,M., Hinz,U. et al. (2012) The Intact molecular interaction database in 2012. Nucleic Acids Res., 40, D841–D846.
+1. Croft,D., O’Kelly,G., Wu,G., Haw,R., Gillespie,M., Matthews,L., Caudy,M., Garapati,P., Gopinath,G., Jassal,B. et al. (2011) Reactome: a database of reactions, pathways and biological processes. Nucleic Acids Res., 39, D691–D697.
+1. Demir,E., Cary,M.P., Paley,S., Fukuda,K., Lemer,C., Vastrik,I.,Wu,G., D’Eustachio,P., Schaefer,C., Luciano,J. et al. (2010) The BioPAX community standard for pathway data sharing. Nature Biotechnol., 28, 935–942.
+1. Alemán Z, García-García F, Medina I, Dopazo J (2014): A web tool for the design and management of panels of genes for targeted enrichment and massive sequencing for clinical applications. Nucleic Acids Res 42: W83-7.
+1. [Alemán A](http://www.ncbi.nlm.nih.gov/pubmed?term=Alem%C3%A1n%20A%5BAuthor%5D&cauthor=true&cauthor_uid=24803668), [Garcia-Garcia F](http://www.ncbi.nlm.nih.gov/pubmed?term=Garcia-Garcia%20F%5BAuthor%5D&cauthor=true&cauthor_uid=24803668), [Salavert F](http://www.ncbi.nlm.nih.gov/pubmed?term=Salavert%20F%5BAuthor%5D&cauthor=true&cauthor_uid=24803668), [Medina I](http://www.ncbi.nlm.nih.gov/pubmed?term=Medina%20I%5BAuthor%5D&cauthor=true&cauthor_uid=24803668), [Dopazo J](http://www.ncbi.nlm.nih.gov/pubmed?term=Dopazo%20J%5BAuthor%5D&cauthor=true&cauthor_uid=24803668) (2014). A web-based interactive framework to assist in the prioritization of disease candidate genes in whole-exome sequencing studies. [Nucleic Acids Res.](http://www.ncbi.nlm.nih.gov/pubmed/?term=BiERapp "Nucleic acids research.") 42: W88-93.
+1. Landrum,M.J., Lee,J.M., Riley,G.R., Jang,W., Rubinstein,W.S., Church,D.M. and Maglott,D.R. (2014) ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res., 42, D980–D985.
+1. Medina I, Salavert F, Sanchez R, de Maria A, Alonso R, Escobar P, Bleda M, Dopazo J: Genome Maps, a new generation genome browser. Nucleic Acids Res 2013, 41:W41-46.
diff --git a/docs.it4i/software/bio/omics-master/priorization-component-bierapp.md b/docs.it4i/software/bio/omics-master/priorization-component-bierapp.md
new file mode 100644
index 0000000000000000000000000000000000000000..07c763fb6db2a6f31c760993b1a094a0e97ee7ff
--- /dev/null
+++ b/docs.it4i/software/bio/omics-master/priorization-component-bierapp.md
@@ -0,0 +1,19 @@
+# Prioritization component (BiERapp)
+
+## Access
+
+BiERapp is available at the [following address](http://omics.it4i.cz/bierapp/).
+
+!!! note
+    The address is accessible only via VPN.
+
+## BiERapp
+
+**This tool is aimed at discovering new disease genes or variants by studying affected families or cases and controls. It carries out a filtering process to sequentially remove: (i) variants which are not compatible with the disease because they are not expected to have an impact on the protein function; (ii) variants that exist at frequencies incompatible with the disease; (iii) variants that do not segregate with the disease. The result is a reduced set of disease gene candidates that should be further validated experimentally.**
+
+BiERapp (28) efficiently helps in the identification of causative variants in familial and sporadic genetic diseases. The program reads lists of predicted variants (nucleotide substitutions and indels) in affected individuals or tumor samples and controls. In family studies, different modes of inheritance can easily be defined to filter out variants that do not segregate with the disease along the family. Moreover, BiERapp integrates additional information such as allelic frequencies in the general population and the most popular damaging scores to further narrow down the number of putative variants in successive filtering steps. BiERapp provides an interactive and user-friendly interface that implements the filtering strategy used in the context of a large-scale genomic project carried out by the Spanish Network for Research in Rare Diseases (CIBERER) and the Medical Genome Project, in which more than 800 exomes have been analyzed.
+
+![Web interface to the prioritization tool. This figure shows the interface of the web tool for candidate gene prioritization with the filters available. The tool includes a genomic viewer (Genome Maps 30) that enables the representation of the variants in the corresponding genomic coordinates.](../../img/fig6.png)
+
+**Figure 6.** Web interface to the prioritization tool. This figure shows the interface of the web tool for candidate gene prioritization with the filters available. The tool includes a genomic viewer (Genome Maps 30) that enables the representation of the variants in the corresponding genomic coordinates.
diff --git a/docs.it4i/software/cae/comsol/comsol-multiphysics.md b/docs.it4i/software/cae/comsol/comsol-multiphysics.md
new file mode 100644
index 0000000000000000000000000000000000000000..c5170bfcffbafc2e8e744cf97ff1b7501f2c6b0b
--- /dev/null
+++ b/docs.it4i/software/cae/comsol/comsol-multiphysics.md
@@ -0,0 +1,120 @@
+# COMSOL Multiphysics
+
+## Introduction
+
+[COMSOL](http://www.comsol.com) is a powerful environment for modelling and solving various engineering and scientific problems based on partial differential equations. COMSOL is designed to solve coupled or multiphysics phenomena. For many standard engineering problems COMSOL provides add-on products such as electrical, mechanical, fluid flow, and chemical applications.
+
+* [Structural Mechanics Module](http://www.comsol.com/structural-mechanics-module),
+* [Heat Transfer Module](http://www.comsol.com/heat-transfer-module),
+* [CFD Module](http://www.comsol.com/cfd-module),
+* [Acoustics Module](http://www.comsol.com/acoustics-module),
+* and [many others](http://www.comsol.com/products)
+
+COMSOL also provides an interface for equation-based modelling of partial differential equations.
+
+## Execution
+
+On the clusters COMSOL is available in the latest stable version. There are two variants of the release:
+
+* **Non commercial** or so-called **EDU variant**, which can be used for research and educational purposes.
+
+* **Commercial** or so-called **COM variant**, which can also be used for commercial activities. The **COM variant** has only a subset of features compared to the **EDU variant**. More about licensing [here](licensing-and-available-versions/).
+
+To load the default version of COMSOL, load the module:
+
+```console
+$ ml COMSOL
+```
+
+By default the **EDU variant** will be loaded. If you need another version or variant, load that particular version. To obtain the list of available versions, use:
+
+```console
+$ ml av  COMSOL
+```
+
+If you need to prepare COMSOL jobs interactively, it is recommended to use COMSOL on the compute nodes via the PBS Pro scheduler. In order to run the COMSOL Desktop GUI on Windows, it is recommended to use [Virtual Network Computing (VNC)](../../general/accessing-the-clusters/graphical-user-interface/x-window-system/).
+
+Example for Salomon:
+
+```console
+$ xhost +
+$ qsub -I -X -A PROJECT_ID -q qprod -l select=1:ppn=24
+$ ml COMSOL
+$ comsol
+```
+
+To run COMSOL in batch mode, without the COMSOL Desktop GUI environment, users can utilize the default (comsol.pbs) job script and execute it via the qsub command.
+
+```bash
+#!/bin/bash
+#PBS -l select=3:ppn=24
+#PBS -q qprod
+#PBS -N JOB_NAME
+#PBS -A PROJECT_ID
+
+cd /scratch/work/user/$USER/ || exit   # on Anselm use: /scratch/$USER
+
+echo Time is `date`
+echo Directory is `pwd`
+echo '**PBS_NODEFILE***START*******'
+cat $PBS_NODEFILE
+echo '**PBS_NODEFILE***END*********'
+
+text_nodes=$(cat $PBS_NODEFILE)
+
+module load COMSOL
+# module load COMSOL/51-EDU
+
+ntask=$(wc -l < $PBS_NODEFILE)
+
+comsol -nn ${ntask} batch -configuration /tmp -mpiarg -rmk -mpiarg pbs -tmpdir /scratch/.../$USER/ -inputfile name_input_f.mph -outputfile name_output_f.mph -batchlog name_log_f.log
+```
+
+The working directory has to be created before submitting the (comsol.pbs) job script to the queue. The input file (name_input_f.mph) has to be in the working directory, or the full path to the input file has to be specified. The appropriate path to the temp directory of the job has to be set by the command option (-tmpdir).
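+
+For example (paths and file names are illustrative):
+
+```console
+$ mkdir -p /scratch/work/user/$USER
+$ cp name_input_f.mph comsol.pbs /scratch/work/user/$USER/
+$ cd /scratch/work/user/$USER && qsub comsol.pbs
+```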
+
+## LiveLink for MATLAB
+
+COMSOL is the software package for the numerical solution of the partial differential equations. LiveLink for MATLAB allows connection to the COMSOL API (Application Programming Interface) with the benefits of the programming language and computing environment of the MATLAB.
+
+LiveLink for MATLAB is available in both the **EDU** and **COM** **variant** of the COMSOL release. On the clusters, 1 commercial (**COM**) license and 5 educational (**EDU**) licenses of LiveLink for MATLAB are available (see the [ISV Licenses](../isv_licenses/)). The following example shows how to start a COMSOL model from MATLAB via LiveLink in interactive mode (on Anselm use 16 threads).
+
+```console
+$ xhost +
+$ qsub -I -X -A PROJECT_ID -q qexp -l select=1:ppn=24
+$ ml MATLAB
+$ ml COMSOL
+$ comsol server MATLAB
+```
+
+The first time you launch LiveLink for MATLAB (client MATLAB / server COMSOL connection), the login and password are requested; this information is not requested again.
+
+To run LiveLink for MATLAB in batch mode with the (comsol_matlab.pbs) job script, you can utilize/modify the following script and execute it via the qsub command.
+
+```bash
+#!/bin/bash
+#PBS -l select=3:ppn=24
+#PBS -q qprod
+#PBS -N JOB_NAME
+#PBS -A PROJECT_ID
+
+cd /scratch/work/user/$USER || exit   # on Anselm use: /scratch/$USER
+
+echo Time is `date`
+echo Directory is `pwd`
+echo '**PBS_NODEFILE***START*******'
+cat $PBS_NODEFILE
+echo '**PBS_NODEFILE***END*********'
+
+text_nodes=$(cat $PBS_NODEFILE)
+
+module load MATLAB
+module load COMSOL/51-EDU
+
+ntask=$(wc -l < $PBS_NODEFILE)
+
+comsol -nn ${ntask} server -configuration /tmp -mpiarg -rmk -mpiarg pbs -tmpdir /scratch/work/user/$USER/work &
+cd /apps/cae/COMSOL/51/mli
+matlab -nodesktop -nosplash -r "mphstart; addpath /scratch/work/user/$USER/work; test_job"
+```
+
+This example shows how to run LiveLink for MATLAB with the following configuration: 3 nodes and 24 cores per node. The working directory has to be created before submitting the (comsol_matlab.pbs) job script to the queue. The input file (test_job.m) has to be in the working directory, or the full path to the input file has to be specified. The MATLAB command option (-r "mphstart") creates a connection with a COMSOL server using the default port number.
diff --git a/docs.it4i/software/cae/comsol/licensing-and-available-versions.md b/docs.it4i/software/cae/comsol/licensing-and-available-versions.md
new file mode 100644
index 0000000000000000000000000000000000000000..4358b930fedbfcdf3ea9277d2fa5c89e8a74ca37
--- /dev/null
+++ b/docs.it4i/software/cae/comsol/licensing-and-available-versions.md
@@ -0,0 +1,19 @@
+# Licensing and Available Versions
+
+## Comsol Licence Can Be Used By:
+
+* all persons involved in carrying out the CE IT4Innovations Project (in addition to the primary licensee, which is VSB - Technical University of Ostrava, users are CE IT4Innovations third parties - CE IT4Innovations project partners, particularly the University of Ostrava, the Brno University of Technology - Faculty of Informatics, the Silesian University in Opava, and the Institute of Geonics AS CR)
+* all persons who have a valid license
+* students of the Technical University
+
+## Comsol EDU Network Licence
+
+The licence is intended to be used for science and research, publications, students' projects, and teaching (academic licence).
+
+## Comsol COM Network Licence
+
+The licence is intended to be used for science and research, publications, students' projects, and commercial research with no commercial use restrictions. It enables the solution of at least one job by one user in one program start.
+
+## Available Versions
+
+* ver. 51
diff --git a/docs.it4i/software/intel/intel-suite/intel-advisor.md b/docs.it4i/software/intel/intel-suite/intel-advisor.md
new file mode 100644
index 0000000000000000000000000000000000000000..688deda17708cc23578fd50dc6063fb7716c5858
--- /dev/null
+++ b/docs.it4i/software/intel/intel-suite/intel-advisor.md
@@ -0,0 +1,31 @@
+# Intel Advisor
+
+Intel Advisor is a tool that assists you with vectorization and threading of your code. You can use it to profile your application and identify loops that could benefit from vectorization and/or threading parallelism.
+
+## Installed Versions
+
+The following versions are currently available on Salomon as modules:
+
+2016 Update 2 - Advisor/2016_update2
+
+## Usage
+
+Your program should be compiled with the -g switch to include symbol names. You should compile with -O2 or higher to see code that is already vectorized by the compiler.
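+
+For example, a program might be prepared for profiling as follows; myprog.c stands in for your own source file:
+
+```console
+$ ml intel
+$ icc -g -O2 -xCORE-AVX2 myprog.c -o myprog.x
+```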
+
+Profiling is possible either directly from the GUI, or from command line.
+
+To profile from GUI, launch Advisor:
+
+```console
+$ advixe-gui
+```
+
+Then select the menu File -> New -> Project. Choose a directory to save the project data to. After clicking OK, the Project properties window will appear, where you can configure the path to your binary, launch arguments, working directory, etc. After clicking OK, the project is ready.
+
+In the left pane, you can switch between the Vectorization and Threading workflows. Each has several possible steps which you can execute by clicking the Collect button. Alternatively, you can click on Command Line to see the command line required to run the analysis directly from the command line, as sketched below.
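+
+A command-line collection might look similar to the following sketch; the project directory and binary name are illustrative, and the available analysis types may differ between Advisor versions:
+
+```console
+$ ml Advisor/2016_update2
+$ advixe-cl -collect survey -project-dir ./advi_results -- ./myprog.x
+```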
+
+## References
+
+1. [Intel® Advisor 2015 Tutorial: Find Where to Add Parallelism - C++ Sample](https://software.intel.com/en-us/intel-advisor-tutorial-vectorization-windows-cplusplus)
+1. [Product page](https://software.intel.com/en-us/intel-advisor-xe)
+1. [Documentation](https://software.intel.com/en-us/intel-advisor-2016-user-guide-linux)
diff --git a/docs.it4i/software/intel/intel-suite/intel-compilers.md b/docs.it4i/software/intel/intel-suite/intel-compilers.md
new file mode 100644
index 0000000000000000000000000000000000000000..8e2ee714f6e5c61ec8b4e3b4522a3a06fdd11f46
--- /dev/null
+++ b/docs.it4i/software/intel/intel-suite/intel-compilers.md
@@ -0,0 +1,36 @@
+# Intel Compilers
+
+The Intel compilers in multiple versions are available via the intel module. The compilers include the icc C and C++ compiler and the ifort Fortran 77/90/95 compiler.
+
+```console
+$ ml intel
+$ icc -v
+$ ifort -v
+```
+
+The Intel compilers provide vectorization of the code via the AVX2 instructions and support threading parallelization via OpenMP.
+
+For maximum performance on the Salomon cluster compute nodes, compile your programs using the AVX2 instructions, with reporting where the vectorization was used. We recommend the following compilation options for high performance:
+
+```console
+$ icc   -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec myprog.c mysubroutines.c -o myprog.x
+$ ifort -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec myprog.f mysubroutines.f -o myprog.x
+```
+
+In this example, we compile the program enabling interprocedural optimizations between source files (-ipo), aggressive loop optimizations (-O3), and vectorization (-xCORE-AVX2).
+
+The compiler recognizes the omp, simd, vector and ivdep pragmas for OpenMP parallelization and AVX2 vectorization. Enable the OpenMP parallelization by the **-openmp** compiler switch.
+
+```console
+$ icc -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec -openmp myprog.c mysubroutines.c -o myprog.x
+$ ifort -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec -openmp myprog.f mysubroutines.f -o myprog.x
+```
+
+Read more at <https://software.intel.com/en-us/intel-cplusplus-compiler-16.0-user-and-reference-guide>
+
+## Sandy Bridge/Ivy Bridge/Haswell Binary Compatibility
+
+Anselm nodes are currently equipped with Sandy Bridge CPUs, while Salomon compute nodes are equipped with the Haswell-based architecture. The UV1 SMP compute server has Ivy Bridge CPUs, which are equivalent to Sandy Bridge (only a smaller manufacturing technology). The new processors are backward compatible with the Sandy Bridge nodes, so all programs that ran on the Sandy Bridge processors should also run on the new Haswell nodes. To get optimal performance out of the Haswell processors, a program should make use of the special AVX2 instructions for this processor. One can do this by recompiling codes with the compiler flags designated to invoke these instructions. For the Intel compiler suite, there are two ways of doing this (see the example after the list):
+
+* Using compiler flag (both for Fortran and C): -xCORE-AVX2. This will create a binary with AVX2 instructions, specifically for the Haswell processors. Note that the executable will not run on Sandy Bridge/Ivy Bridge nodes.
+* Using compiler flags (both for Fortran and C): -xAVX -axCORE-AVX2. This will generate multiple, feature-specific auto-dispatch code paths for Intel® processors, if there is a performance benefit. So this binary will run both on Sandy Bridge/Ivy Bridge and Haswell processors. During runtime it will be decided which path to follow, dependent on which processor you are running on. In general this will result in larger binaries.
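+
+For illustration, the two variants might be compiled as follows; myprog.c is a placeholder for your own source file:
+
+```console
+$ icc -O3 -xCORE-AVX2 myprog.c -o myprog.avx2.x          # Haswell-only binary
+$ icc -O3 -xAVX -axCORE-AVX2 myprog.c -o myprog.multi.x  # multi-path binary for Sandy Bridge/Ivy Bridge and Haswell
+```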
diff --git a/docs.it4i/software/intel/intel-suite/intel-debugger.md b/docs.it4i/software/intel/intel-suite/intel-debugger.md
new file mode 100644
index 0000000000000000000000000000000000000000..ac7cec6ad56acbc3705fcdc478531e2cade64c47
--- /dev/null
+++ b/docs.it4i/software/intel/intel-suite/intel-debugger.md
@@ -0,0 +1,73 @@
+# Intel Debugger
+
+IDB is no longer available since Intel Parallel Studio 2015
+
+## Debugging Serial Applications
+
+The Intel debugger is available via the module intel/13.5.192. The debugger works for applications compiled with the C and C++ compiler and the ifort Fortran 77/90/95 compiler. The debugger provides a Java GUI environment. Use [X display](../../general/accessing-the-clusters/graphical-user-interface/x-window-system/) for running the GUI.
+
+```console
+$ ml intel/13.5.192
+$ ml Java
+$ idb
+```
+
+The debugger may run in text mode. To debug in text mode, use
+
+```console
+$ idbc
+```
+
+To debug on the compute nodes, module intel must be loaded. The GUI on compute nodes may be accessed using the same way as in [the GUI section](../../general/accessing-the-clusters/graphical-user-interface/x-window-system/)
+
+Example:
+
+```console
+$ qsub -q qexp -l select=1:ncpus=24 -X -I    # use 16 threads for Anselm
+    qsub: waiting for job 19654.srv11 to start
+    qsub: job 19654.srv11 ready
+$ ml intel
+$ ml Java
+$ icc -O0 -g myprog.c -o myprog.x
+$ idb ./myprog.x
+```
+
+In this example, we allocate 1 full compute node, compile program myprog.c with debugging options -O0 -g and run the idb debugger interactively on the myprog.x executable. The GUI access is via X11 port forwarding provided by the PBS workload manager.
+
+## Debugging Parallel Applications
+
+The Intel debugger is capable of debugging multithreaded and MPI parallel programs as well.
+
+### Small Number of MPI Ranks
+
+For debugging a small number of MPI ranks, you may execute and debug each rank in a separate xterm terminal (do not forget the [X display](../../general/accessing-the-clusters/graphical-user-interface/x-window-system/)). Using Intel MPI, this may be done in the following way:
+
+```console
+$ qsub -q qexp -l select=2:ncpus=24 -X -I
+    qsub: waiting for job 19654.srv11 to start
+    qsub: job 19655.srv11 ready
+$ ml intel
+$ mpirun -ppn 1 -hostfile $PBS_NODEFILE --enable-x xterm -e idbc ./mympiprog.x
+```
+
+In this example, we allocate 2 full compute nodes, run xterm on each node, and start the idb debugger in command line mode, debugging two ranks of the mympiprog.x application. An xterm will pop up for each rank, with the idb prompt ready. The example is not limited to the use of Intel MPI.
+
+### Large Number of MPI Ranks
+
+Run the idb debugger from within the MPI debug option. This will cause the debugger to bind to all ranks and provide aggregated outputs across the ranks, pausing execution automatically just after startup. You may then set breakpoints and step the execution manually. Using Intel MPI:
+
+```console
+$ qsub -q qexp -l select=2:ncpus=24 -X -I
+    qsub: waiting for job 19654.srv11 to start
+    qsub: job 19655.srv11 ready
+$ ml intel
+$ mpirun -n 48 -idb ./mympiprog.x
+```
+
+### Debugging Multithreaded Application
+
+Run the idb debugger in GUI mode. The menu Parallel contains a number of tools for debugging multiple threads. One of the most useful tools is the **Serialize Execution** tool, which serializes execution of concurrent threads for easy orientation and identification of concurrency-related bugs.
+
+## Further Information
+
+An exhaustive manual on idb features and usage is published on the Intel website, <https://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/>
diff --git a/docs.it4i/software/intel/intel-suite/intel-inspector.md b/docs.it4i/software/intel/intel-suite/intel-inspector.md
new file mode 100644
index 0000000000000000000000000000000000000000..bd298923813d786c7620c751a3c267983bb2a48d
--- /dev/null
+++ b/docs.it4i/software/intel/intel-suite/intel-inspector.md
@@ -0,0 +1,39 @@
+# Intel Inspector
+
+Intel Inspector is a dynamic memory and threading error checking tool for C/C++/Fortran applications. It can detect issues such as memory leaks, invalid memory references, uninitialized variables, race conditions, deadlocks, etc.
+
+## Installed Versions
+
+The following versions are currently available on Salomon as modules:
+
+2016 Update 1 - Inspector/2016_update1
+
+## Usage
+
+Your program should be compiled with the -g switch to include symbol names. Optimizations can be turned on.
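+
+For example, a program might be prepared for analysis as follows; mycode.c stands in for your own source file:
+
+```console
+$ ml intel
+$ icc -g -O2 mycode.c -o mycode.x
+```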
+
+Debugging is possible either directly from the GUI, or from command line.
+
+### GUI Mode
+
+To debug from GUI, launch Inspector:
+
+```console
+$ inspxe-gui &
+```
+
+Then select the menu File -> New -> Project. Choose a directory to save the project data to. After clicking OK, the Project properties window will appear, where you can configure the path to your binary, launch arguments, working directory, etc. After clicking OK, the project is ready.
+
+In the main pane, you can start a predefined analysis type or define your own. Click Start to start the analysis. Alternatively, you can click on Command Line to see the command line required to run the analysis directly from the command line.
+
+### Batch Mode
+
+Analysis can also be run from the command line in batch mode. Batch mode analysis is run with the inspxe-cl command. To obtain the required parameters, either consult the documentation or configure the analysis in the GUI and then click the "Command Line" button in the lower right corner to get the respective command line, similar to the sketch below.
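+
+A batch-mode memory error analysis might look similar to this sketch; the analysis type (mi1) and the result directory are illustrative and may differ for your use case:
+
+```console
+$ ml Inspector/2016_update1
+$ inspxe-cl -collect mi1 -result-dir ./insp_results -- ./mycode.x
+```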
+
+Results obtained from batch mode can then be viewed in the GUI by selecting File -> Open -> Result...
+
+## References
+
+1. [Product page](https://software.intel.com/en-us/intel-inspector-xe)
+1. [Documentation and Release Notes](https://software.intel.com/en-us/intel-inspector-xe-support/documentation)
+1. [Tutorials](https://software.intel.com/en-us/articles/inspectorxe-tutorials)
diff --git a/docs.it4i/software/intel/intel-suite/intel-integrated-performance-primitives.md b/docs.it4i/software/intel/intel-suite/intel-integrated-performance-primitives.md
new file mode 100644
index 0000000000000000000000000000000000000000..a47233367e4130177be4db677197a07ec26f9fb2
--- /dev/null
+++ b/docs.it4i/software/intel/intel-suite/intel-integrated-performance-primitives.md
@@ -0,0 +1,78 @@
+# Intel IPP
+
+## Intel Integrated Performance Primitives
+
+Intel Integrated Performance Primitives, version 9.0.1, compiled for AVX2 vector instructions is available, via module ipp. The IPP is a very rich library of highly optimized algorithmic building blocks for media and data applications. This includes signal, image and frame processing algorithms, such as FFT, FIR, Convolution, Optical Flow, Hough transform, Sum, MinMax, as well as cryptographic functions, linear algebra functions and many more.
+
+Check out IPP before implementing your own math functions for data processing; it is likely already there.
+
+```console
+$ ml ipp
+```
+
+The module sets up the environment variables required for linking and running IPP-enabled applications.
+
+## IPP Example
+
+```cpp
+#include "ipp.h"
+#include <stdio.h>
+int main(int argc, char* argv[])
+{
+    const IppLibraryVersion *lib;
+    Ipp64u fm;
+    IppStatus status;
+
+    status= ippInit();            //IPP initialization with the best optimization layer
+    if( status != ippStsNoErr ) {
+            printf("IppInit() Error:\n");
+            printf("%s\n", ippGetStatusString(status) );
+            return -1;
+    }
+
+    //Get version info
+    lib = ippiGetLibVersion();
+    printf("%s %s\n", lib->Name, lib->Version);
+
+    //Get CPU features enabled with selected library level
+    fm=ippGetEnabledCpuFeatures();
+    printf("SSE    :%c\n",(fm>>1)&1?'Y':'N');
+    printf("SSE2   :%c\n",(fm>>2)&1?'Y':'N');
+    printf("SSE3   :%c\n",(fm>>3)&1?'Y':'N');
+    printf("SSSE3  :%c\n",(fm>>4)&1?'Y':'N');
+    printf("SSE41  :%c\n",(fm>>6)&1?'Y':'N');
+    printf("SSE42  :%c\n",(fm>>7)&1?'Y':'N');
+    printf("AVX    :%c\n",(fm>>8)&1 ?'Y':'N');
+    printf("AVX2   :%c\n", (fm>>15)&1 ?'Y':'N' );
+    printf("----------\n");
+    printf("OS Enabled AVX :%c\n", (fm>>9)&1 ?'Y':'N');
+    printf("AES            :%c\n", (fm>>10)&1?'Y':'N');
+    printf("CLMUL          :%c\n", (fm>>11)&1?'Y':'N');
+    printf("RDRAND         :%c\n", (fm>>13)&1?'Y':'N');
+    printf("F16C           :%c\n", (fm>>14)&1?'Y':'N');
+
+    return 0;
+}
+```
+
+Compile the above example using any compiler and the ipp module.
+
+```console
+$ ml intel
+$ ml ipp
+$ icc testipp.c -o testipp.x -lippi -lipps -lippcore
+```
+
+You will need the ipp module loaded to run the IPP-enabled executable. This may be avoided by compiling the library search paths into the executable:
+
+```console
+$ ml intel
+$ ml ipp
+$ icc testipp.c -o testipp.x -Wl,-rpath=$LIBRARY_PATH -lippi -lipps -lippcore
+```
+
+## Code Samples and Documentation
+
+Intel provides a number of [Code Samples for IPP](https://software.intel.com/en-us/articles/code-samples-for-intel-integrated-performance-primitives-library), illustrating the use of IPP.
+
+Read the full documentation on IPP [on the Intel website](http://software.intel.com/sites/products/search/search.php?q=&x=15&y=6&product=ipp&version=7.1&docos=lin), in particular the [IPP Reference manual](http://software.intel.com/sites/products/documentation/doclib/ipp_sa/71/ipp_manual/index.htm).
diff --git a/docs.it4i/software/intel/intel-suite/intel-mkl.md b/docs.it4i/software/intel/intel-suite/intel-mkl.md
new file mode 100644
index 0000000000000000000000000000000000000000..2053e958b2673acb4fc79e4e552bea5cf016d85e
--- /dev/null
+++ b/docs.it4i/software/intel/intel-suite/intel-mkl.md
@@ -0,0 +1,120 @@
+# Intel MKL
+
+## Intel Math Kernel Library
+
+Intel Math Kernel Library (Intel MKL) is a library of math kernel subroutines, extensively threaded and optimized for maximum performance. Intel MKL provides these basic math kernels:
+
+* BLAS (level 1, 2, and 3) and LAPACK linear algebra routines, offering vector, vector-matrix, and matrix-matrix operations.
+* The PARDISO direct sparse solver, an iterative sparse solver, and supporting sparse BLAS (level 1, 2, and 3) routines for solving sparse systems of equations.
+* ScaLAPACK distributed processing linear algebra routines for Linux and Windows operating systems, as well as the Basic Linear Algebra Communications Subprograms (BLACS) and the Parallel Basic Linear Algebra Subprograms (PBLAS).
+* Fast Fourier transform (FFT) functions in one, two, or three dimensions with support for mixed radices (not limited to sizes that are powers of 2), as well as distributed versions of these functions.
+* Vector Math Library (VML) routines for optimized mathematical operations on vectors.
+* Vector Statistical Library (VSL) routines, which offer high-performance vectorized random number generators (RNG) for several probability distributions, convolution and correlation routines, and summary statistics functions.
+* Data Fitting Library, which provides capabilities for spline-based approximation of functions, derivatives and integrals of functions, and search.
+* Extended Eigensolver, a shared memory version of an eigensolver based on the Feast Eigenvalue Solver.
+
+For details see the [Intel MKL Reference Manual](http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mklman/index.htm).
+
+Intel MKL is available on the cluster
+
+```console
+$ ml av imkl
+$ ml imkl
+```
+
+The module sets up the environment variables required for linking and running MKL-enabled applications. The most important variables are $MKLROOT, $CPATH, $LD_LIBRARY_PATH, and $MKL_EXAMPLES.
+
+The Intel MKL library may be linked using any compiler. With the Intel compiler, use the -mkl option to link the default threaded MKL.
+
+### Interfaces
+
+The Intel MKL library provides a number of interfaces. The fundamental ones are LP64 and ILP64. The Intel MKL ILP64 libraries use the 64-bit integer type (necessary for indexing large arrays, with more than 2^31^-1 elements), whereas the LP64 libraries index arrays with the 32-bit integer type. A linking sketch for the ILP64 interface follows the table below.
+
+| Interface | Integer type                                 |
+| --------- | -------------------------------------------- |
+| LP64      | 32-bit, int, integer(kind=4), MPI_INT        |
+| ILP64     | 64-bit, long int, integer(kind=8), MPI_INT64 |
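+
+As a sketch, linking against the ILP64 interface with the Intel compiler might look like this, assuming a source file myprog.c; consult the link line advisor below for your exact configuration:
+
+```console
+$ ml intel
+$ ml imkl
+$ icc -DMKL_ILP64 -I$MKLROOT/include myprog.c -o myprog.x -L$MKLROOT/lib/intel64 -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm
+```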
+
+### Linking
+
+Linking Intel MKL libraries may be complex. Intel [mkl link line advisor](http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor) helps. See also [examples](intel-mkl/#examples) below.
+
+You will need the mkl module loaded to run the MKL-enabled executable. This may be avoided by compiling the library search paths into the executable. Include rpath on the compile line:
+
+```console
+$ icc .... -Wl,-rpath=$LIBRARY_PATH ...
+```
+
+### Threading
+
+An advantage of using the Intel MKL library is that it brings threaded parallelization to applications that are otherwise not parallel.
+
+For this to work, the application must link the threaded MKL library (default). The number and behaviour of MKL threads may be controlled via the OpenMP environment variables, such as OMP_NUM_THREADS and KMP_AFFINITY. MKL_NUM_THREADS takes precedence over OMP_NUM_THREADS.
+
+```console
+$ export OMP_NUM_THREADS=24   # 16 for Anselm
+$ export KMP_AFFINITY=granularity=fine,compact,1,0
+```
+
+The application will run with 24 threads with affinity optimized for fine grain parallelization.
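+
+To illustrate the precedence noted above, one could, for example, pin the MKL kernels to 24 threads while keeping the rest of the application single-threaded; ./myprog.x is a placeholder binary:
+
+```console
+$ export OMP_NUM_THREADS=1
+$ export MKL_NUM_THREADS=24
+$ ./myprog.x    # MKL kernels run with 24 threads, other OpenMP regions with 1
+```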
+
+## Examples
+
+A number of examples demonstrating the use of the Intel MKL library and its linking are available on the clusters, in the $MKL_EXAMPLES directory. In the examples below, we demonstrate linking Intel MKL to an Intel and a GNU compiled program for multi-threaded matrix multiplication.
+
+### Working With Examples
+
+```console
+$ ml intel
+$ ml imkl
+$ cp -a $MKL_EXAMPLES/cblas /tmp/
+$ cd /tmp/cblas
+$ make sointel64 function=cblas_dgemm
+```
+
+In this example, we compile, link and run the cblas_dgemm example, demonstrating use of MKL example suite installed on clusters.
+
+### Example: MKL and Intel Compiler
+
+```console
+$ ml intel
+$ ml imkl
+$ cp -a $MKL_EXAMPLES/cblas /tmp/
+$ cd /tmp/cblas
+$
+$ icc -w source/cblas_dgemmx.c source/common_func.c -mkl -o cblas_dgemmx.x
+$ ./cblas_dgemmx.x data/cblas_dgemmx.d
+```
+
+In this example, we compile, link and run the cblas_dgemm example, demonstrating use of MKL with icc -mkl option. Using the -mkl option is equivalent to:
+
+```console
+$ icc -w source/cblas_dgemmx.c source/common_func.c -o cblas_dgemmx.x -I$MKL_INC_DIR -L$MKL_LIB_DIR -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5
+```
+
+In this example, we compile and link the cblas_dgemm example, using LP64 interface to threaded MKL and Intel OMP threads implementation.
+
+### Example: Intel MKL and GNU Compiler
+
+```console
+$ ml GCC
+$ ml imkl
+$ cp -a $MKL_EXAMPLES/cblas /tmp/
+$ cd /tmp/cblas
+$ gcc -w source/cblas_dgemmx.c source/common_func.c -o cblas_dgemmx.x -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -lm
+$ ./cblas_dgemmx.x data/cblas_dgemmx.d
+```
+
+In this example, we compile, link and run the cblas_dgemm example, using LP64 interface to threaded MKL and gnu OMP threads implementation.
+
+## MKL and MIC Accelerators
+
+The Intel MKL is capable of automatically offloading the computations to the MIC accelerator. See the section [Intel Xeon Phi](../intel-xeon-phi/) for details.
+
+## LAPACKE C Interface
+
+MKL includes the LAPACKE C interface to LAPACK. For some reason, although Intel is the author of LAPACKE, the LAPACKE header files are not present in MKL. For this reason, we have prepared a LAPACKE module, which includes Intel's LAPACKE headers from the official LAPACK and which you can use to compile code using the LAPACKE interface against MKL.
+
+## Further Reading
+
+Read more on [Intel website](http://software.intel.com/en-us/intel-mkl), in particular the [MKL users guide](https://software.intel.com/en-us/intel-mkl/documentation/linux).
diff --git a/docs.it4i/software/intel/intel-suite/intel-parallel-studio-introduction.md b/docs.it4i/software/intel/intel-suite/intel-parallel-studio-introduction.md
new file mode 100644
index 0000000000000000000000000000000000000000..7b6ba956b932b63d535dc0e3aeb7667385fdccf8
--- /dev/null
+++ b/docs.it4i/software/intel/intel-suite/intel-parallel-studio-introduction.md
@@ -0,0 +1,69 @@
+# Intel Parallel Studio
+
+The Salomon cluster provides the following elements of Intel Parallel Studio XE:
+
+Intel Parallel Studio XE
+
+* Intel Compilers
+* Intel Debugger
+* Intel MKL Library
+* Intel Integrated Performance Primitives Library
+* Intel Threading Building Blocks Library
+* Intel Trace Analyzer and Collector
+* Intel Advisor
+* Intel Inspector
+
+## Intel Compilers
+
+The Intel compilers are available via the intel module. The compilers include the icc C and C++ compiler and the ifort Fortran 77/90/95 compiler.
+
+```console
+$ ml intel
+$ icc -v
+$ ifort -v
+```
+
+Read more at the [Intel Compilers](intel-compilers/) page.
+
+## Intel Debugger
+
+IDB is no longer available since Parallel Studio 2015.
+
+The Intel debugger version 13.0 is available via the intel module. The debugger works for applications compiled with the C and C++ compiler and the ifort Fortran 77/90/95 compiler. The debugger provides a Java GUI environment.
+
+```console
+$ ml intel
+$ idb
+```
+
+Read more at the [Intel Debugger](intel-debugger/) page.
+
+## Intel Math Kernel Library
+
+Intel Math Kernel Library (Intel MKL) is a library of math kernel subroutines, extensively threaded and optimized for maximum performance. Intel MKL unites and provides these basic components: BLAS, LAPACK, ScaLapack, PARDISO, FFT, VML, VSL, Data fitting, Feast Eigensolver and many more.
+
+```console
+$ ml imkl
+```
+
+Read more at the [Intel MKL](intel-mkl/) page.
+
+## Intel Integrated Performance Primitives
+
+Intel Integrated Performance Primitives, version 7.1.1, compiled for AVX is available, via module ipp. The IPP is a library of highly optimized algorithmic building blocks for media and data applications. This includes signal, image and frame processing algorithms, such as FFT, FIR, Convolution, Optical Flow, Hough transform, Sum, MinMax and many more.
+
+```console
+$ ml ipp
+```
+
+Read more at the [Intel IPP](intel-integrated-performance-primitives/) page.
+
+## Intel Threading Building Blocks
+
+Intel Threading Building Blocks (Intel TBB) is a library that supports scalable parallel programming using standard ISO C++ code. It does not require special languages or compilers. It is designed to promote scalable data parallel programming. Additionally, it fully supports nested parallelism, so you can build larger parallel components from smaller parallel components. To use the library, you specify tasks, not threads, and let the library map tasks onto threads in an efficient manner.
+
+```console
+$ ml tbb
+```
+
+Read more at the [Intel TBB](intel-tbb/) page.
diff --git a/docs.it4i/software/intel/intel-suite/intel-tbb.md b/docs.it4i/software/intel/intel-suite/intel-tbb.md
new file mode 100644
index 0000000000000000000000000000000000000000..59976aa7ef31d2e97e9799ced80578be11a2d8ab
--- /dev/null
+++ b/docs.it4i/software/intel/intel-suite/intel-tbb.md
@@ -0,0 +1,40 @@
+# Intel TBB
+
+## Intel Threading Building Blocks
+
+Intel Threading Building Blocks (Intel TBB) is a library that supports scalable parallel programming using standard ISO C++ code. It does not require special languages or compilers.  To use the library, you specify tasks, not threads, and let the library map tasks onto threads in an efficient manner. The tasks are executed by a runtime scheduler and may be offloaded to [MIC accelerator](../intel-xeon-phi/).
+
+Intel TBB is available on the cluster.
+
+```console
+$ ml av tbb
+```
+
+The module sets up the environment variables required for linking and running TBB-enabled applications.
+
+Link the tbb library using -ltbb.
+
+## Examples
+
+A number of examples demonstrating the use of TBB and its built-in scheduler are available on Anselm, in the $TBB_EXAMPLES directory.
+
+```console
+$ ml intel
+$ ml tbb
+$ cp -a $TBB_EXAMPLES/common $TBB_EXAMPLES/parallel_reduce /tmp/
+$ cd /tmp/parallel_reduce/primes
+$ icc -O2 -DNDEBUG -o primes.x main.cpp primes.cpp -ltbb
+$ ./primes.x
+```
+
+In this example, we compile, link and run the primes example, demonstrating use of parallel task-based reduce in computation of prime numbers.
+
+You will need the tbb module loaded to run the TBB-enabled executable. This may be avoided by compiling the library search paths into the executable.
+
+```console
+$ icc -O2 -o primes.x main.cpp primes.cpp -Wl,-rpath=$LIBRARY_PATH -ltbb
+```
+
+## Further Reading
+
+Read more on Intel website, <http://software.intel.com/sites/products/documentation/doclib/tbb_sa/help/index.htm>
diff --git a/docs.it4i/software/intel/intel-suite/intel-trace-analyzer-and-collector.md b/docs.it4i/software/intel/intel-suite/intel-trace-analyzer-and-collector.md
new file mode 100644
index 0000000000000000000000000000000000000000..b7bf6c92d3a03112392a86078037aeff28e8623f
--- /dev/null
+++ b/docs.it4i/software/intel/intel-suite/intel-trace-analyzer-and-collector.md
@@ -0,0 +1,40 @@
+# Intel Trace Analyzer and Collector
+
+Intel Trace Analyzer and Collector (ITAC) is a tool to collect and graphically analyze the behaviour of MPI applications. It helps you to analyze communication patterns of your application, identify hotspots, perform correctness checking (identify deadlocks, data corruption, etc.), and simulate how your application would run on a different interconnect.
+
+ITAC is an offline analysis tool - first you run your application to collect a trace file, then you can open the trace in a GUI analyzer to view it.
+
+## Installed Version
+
+Version 9.1.2.024 is currently available on Salomon as the module itac/9.1.2.024.
+
+## Collecting Traces
+
+ITAC can collect traces from applications that are using Intel MPI. To generate a trace, simply add the -trace option to your mpirun command:
+
+```console
+$ ml itac/9.1.2.024
+$ mpirun -trace myapp
+```
+
+The trace will be saved in file myapp.stf in the current directory.
+
+## Viewing Traces
+
+To view and analyze the trace, open ITAC GUI in a [graphical environment](../../general/accessing-the-clusters/graphical-user-interface/x-window-system/):
+
+```console
+$ ml itac/9.1.2.024
+$ traceanalyzer
+```
+
+The GUI will launch and you can open the produced `*.stf` file.
+
+![](../../img/Snmekobrazovky20151204v15.35.12.png)
+
+Please refer to the Intel documentation about the usage of the GUI tool.
+
+## References
+
+1. [Getting Started with Intel® Trace Analyzer and Collector](https://software.intel.com/en-us/get-started-with-itac-for-linux)
+1. [Intel® Trace Analyzer and Collector - Documentation](https://software.intel.com/en-us/intel-trace-analyzer)
diff --git a/docs.it4i/software/intel/intel-xeon-phi.anselm.md b/docs.it4i/software/intel/intel-xeon-phi.anselm.md
new file mode 100644
index 0000000000000000000000000000000000000000..b1e86256d093b4bd34fe799e48f64d38f48d0e83
--- /dev/null
+++ b/docs.it4i/software/intel/intel-xeon-phi.anselm.md
@@ -0,0 +1,904 @@
+# Intel Xeon Phi
+
+## Guide to Intel Xeon Phi Usage
+
+Intel Xeon Phi can be programmed in several modes. The default mode on Anselm is offload mode, but all modes described in this document are supported.
+
+## Intel Utilities for Xeon Phi
+
+To get access to a compute node with Intel Xeon Phi accelerator, use the PBS interactive session
+
+```console
+$ qsub -I -q qmic -A NONE-0-0
+```
+
+To set up the environment, the module "intel" has to be loaded.
+
+```console
+$ ml intel
+```
+
+Information about the hardware can be obtained by running the micinfo program on the host.
+
+```console
+$ /usr/bin/micinfo
+```
+
+The output of the "micinfo" utility executed on one of the Anselm nodes is as follows (note: to get PCIe-related details, the command has to be run with root privileges).
+
+```console
+MicInfo Utility Log
+Created Wed Sep 13 13:44:14 2017
+
+
+        System Info
+                HOST OS                 : Linux
+                OS Version              : 2.6.32-696.3.2.el6.Bull.120.x86_64
+                Driver Version          : 3.4.9-1
+                MPSS Version            : 3.4.9
+                Host Physical Memory    : 98836 MB
+
+Device No: 0, Device Name: mic0
+
+        Version
+                Flash Version            : 2.1.02.0391
+                SMC Firmware Version     : 1.17.6900
+                SMC Boot Loader Version  : 1.8.4326
+                uOS Version              : 2.6.38.8+mpss3.4.9
+                Device Serial Number     : ADKC30102489
+
+        Board
+                Vendor ID                : 0x8086
+                Device ID                : 0x2250
+                Subsystem ID             : 0x2500
+                Coprocessor Stepping ID  : 3
+                PCIe Width               : x16
+                PCIe Speed               : 5 GT/s
+                PCIe Max payload size    : 256 bytes
+                PCIe Max read req size   : 512 bytes
+                Coprocessor Model        : 0x01
+                Coprocessor Model Ext    : 0x00
+                Coprocessor Type         : 0x00
+                Coprocessor Family       : 0x0b
+                Coprocessor Family Ext   : 0x00
+                Coprocessor Stepping     : B1
+                Board SKU                : B1PRQ-5110P/5120D
+                ECC Mode                 : Enabled
+                SMC HW Revision          : Product 225W Passive CS
+
+        Cores
+                Total No of Active Cores : 60
+                Voltage                  : 1009000 uV
+                Frequency                : 1052631 kHz
+
+        Thermal
+                Fan Speed Control        : N/A
+                Fan RPM                  : N/A
+                Fan PWM                  : N/A
+                Die Temp                 : 53 C
+
+        GDDR
+                GDDR Vendor              : Elpida
+                GDDR Version             : 0x1
+                GDDR Density             : 2048 Mb
+                GDDR Size                : 7936 MB
+                GDDR Technology          : GDDR5
+                GDDR Speed               : 5.000000 GT/s
+                GDDR Frequency           : 2500000 kHz
+                GDDR Voltage             : 1501000 uV
+```
+
+## Offload Mode
+
+To compile code for the Intel Xeon Phi, an MPSS stack has to be installed on the machine where the compilation is executed. Currently, the MPSS stack is only installed on compute nodes equipped with accelerators.
+
+```console
+$ qsub -I -q qmic -A NONE-0-0
+$ ml intel
+```
+
+For debugging purposes, it is also recommended to set the environment variable "OFFLOAD_REPORT". The value can be set from 0 to 3, where a higher number means more debugging information.
+
+```console
+export OFFLOAD_REPORT=3
+```
+
+A very basic example of code that employs offload programming technique is shown in the next listing.
+
+!!! note
+    This code is sequential and utilizes only single core of the accelerator.
+
+```cpp
+$ vim source-offload.cpp
+
+#include <iostream>
+
+int main(int argc, char* argv[])
+{
+    const int niter = 100000;
+    double result = 0;
+
+ #pragma offload target(mic)
+    for (int i = 0; i < niter; ++i) {
+        const double t = (i + 0.5) / niter;
+        result += 4.0 / (t * t + 1.0);
+    }
+    result /= niter;
+    std::cout << "Pi ~ " << result << '\n';
+}
+```
+
+To compile a code using Intel compiler run
+
+```console
+$ icc source-offload.cpp -o bin-offload
+```
+
+To execute the code, run the following command on the host
+
+```console
+$ ./bin-offload
+```
+
+### Parallelization in Offload Mode Using OpenMP
+
+One way of parallelizing code for the Xeon Phi is using OpenMP directives. The following example shows code for parallel vector addition.
+
+```cpp
+$ vim ./vect-add
+
+#include <stdio.h>
+
+typedef int T;
+
+#define SIZE 1000
+
+#pragma offload_attribute(push, target(mic))
+T in1[SIZE];
+T in2[SIZE];
+T res[SIZE];
+#pragma offload_attribute(pop)
+
+// MIC function to add two vectors
+__attribute__((target(mic))) void add_mic(T *a, T *b, T *c, int size) {
+  int i = 0;
+  #pragma omp parallel for
+    for (i = 0; i < size; i++)
+      c[i] = a[i] + b[i];
+}
+
+// CPU function to add two vectors
+void add_cpu (T *a, T *b, T *c, int size) {
+  int i;
+  for (i = 0; i < size; i++)
+    c[i] = a[i] + b[i];
+}
+
+// CPU function to generate a vector of random numbers
+void random_T (T *a, int size) {
+  int i;
+  for (i = 0; i < size; i++)
+    a[i] = rand() % 10000; // random number between 0 and 9999
+}
+
+// CPU function to compare two vectors
+int compare(T *a, T *b, T size ){
+  int pass = 0;
+  int i;
+  for (i = 0; i < size; i++){
+    if (a[i] != b[i]) {
+      printf("Value mismatch at location %d, values %d and %d\n",i, a[i], b[i]);
+      pass = 1;
+    }
+  }
+  if (pass == 0) printf ("Test passed\n"); else printf ("Test Failed\n");
+  return pass;
+}
+
+int main()
+{
+  int i;
+  random_T(in1, SIZE);
+  random_T(in2, SIZE);
+
+  #pragma offload target(mic) in(in1,in2)  inout(res)
+  {
+
+    // Parallel loop from main function
+    #pragma omp parallel for
+    for (i=0; i<SIZE; i++)
+      res[i] = in1[i] + in2[i];
+
+    // or parallel loop is called inside the function
+    add_mic(in1, in2, res, SIZE);
+
+  }
+
+  //Check the results with CPU implementation
+  T res_cpu[SIZE];
+  add_cpu(in1, in2, res_cpu, SIZE);
+  compare(res, res_cpu, SIZE);
+
+}
+```
+
+During the compilation, the Intel compiler shows which loops have been vectorized for both the host and the accelerator. This can be enabled with the compiler option "-vec-report2". To compile and execute the code, run:
+
+```console
+$ icc vect-add.c -openmp_report2 -vec-report2 -o vect-add
+$ ./vect-add
+```
+
+Some interesting compiler flags useful not only for code debugging are:
+
+!!! note
+    Debugging
+
+    openmp_report[0|1|2] - controls the OpenMP parallelizer diagnostic level
+    vec-report[0|1|2] - controls the compiler-based vectorization diagnostic level
+
+    Performance optimization
+    xhost - FOR HOST ONLY - to generate AVX (Advanced Vector Extensions) instructions.
+
+## Automatic Offload Using Intel MKL Library
+
+Intel MKL includes an Automatic Offload (AO) feature that enables computationally intensive MKL functions called in user code to benefit from attached Intel Xeon Phi coprocessors automatically and transparently.
+
+The behaviour of the automatic offload mode is controlled by functions called within the program or by environment variables. The complete list of controls is available [here](http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm).
+
+The Automatic Offload may be enabled by either an MKL function call within the code:
+
+```cpp
+    mkl_mic_enable();
+```
+
+or by setting environment variable
+
+```console
+$ export MKL_MIC_ENABLE=1
+```
+
+To get more information about automatic offload please refer to "[Using Intel® MKL Automatic Offload on Intel ® Xeon Phi™ Coprocessors](http://software.intel.com/sites/default/files/11MIC42_How_to_Use_MKL_Automatic_Offload_0.pdf)" white paper or [Intel MKL documentation](https://software.intel.com/en-us/articles/intel-math-kernel-library-documentation).
+
+### Automatic Offload Example
+
+First, get an interactive PBS session on a node with a MIC accelerator and load the "intel" module, which automatically loads the "mkl" module as well.
+
+```console
+$ qsub -I -q qmic -A OPEN-0-0 -l select=1:ncpus=16
+$ module load intel
+```
+
+The following example shows how to automatically offload an SGEMM (single precision - general matrix multiply) function to the MIC coprocessor. The code can be copied to a file and compiled without any modification.
+
+```cpp
+$ vim sgemm-ao-short.c
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <malloc.h>
+#include <stdint.h>
+
+#include "mkl.h"
+
+int main(int argc, char **argv)
+{
+    float *A, *B, *C; /* Matrices */
+
+    MKL_INT N = 2560; /* Matrix dimensions */
+    MKL_INT LD = N; /* Leading dimension */
+    int matrix_bytes; /* Matrix size in bytes */
+    int matrix_elements; /* Matrix size in elements */
+
+    float alpha = 1.0, beta = 1.0; /* Scaling factors */
+    char transa = 'N', transb = 'N'; /* Transposition options */
+
+    int i, j; /* Counters */
+
+    matrix_elements = N * N;
+    matrix_bytes = sizeof(float) * matrix_elements;
+
+    /* Allocate the matrices */
+    A = malloc(matrix_bytes); B = malloc(matrix_bytes); C = malloc(matrix_bytes);
+
+    /* Initialize the matrices */
+    for (i = 0; i < matrix_elements; i++) {
+            A[i] = 1.0; B[i] = 2.0; C[i] = 0.0;
+    }
+
+    printf("Computing SGEMM on the host\n");
+    sgemm(&transa, &transb, &N, &N, &N, &alpha, A, &N, B, &N, &beta, C, &N);
+
+    printf("Enabling Automatic Offload\n");
+    /* Alternatively, set environment variable MKL_MIC_ENABLE=1 */
+    mkl_mic_enable();
+
+    int ndevices = mkl_mic_get_device_count(); /* Number of MIC devices */
+    printf("Automatic Offload enabled: %d MIC devices present\n",   ndevices);
+
+    printf("Computing SGEMM with automatic workdivision\n");
+    sgemm(&transa, &transb, &N, &N, &N, &alpha, A, &N, B, &N, &beta, C, &N);
+
+    /* Free the matrix memory */
+    free(A); free(B); free(C);
+
+    printf("Done\n");
+
+    return 0;
+}
+```
+
+!!! note
+    This example is a simplified version of an example from MKL. The expanded version can be found here: `$MKL_EXAMPLES/mic_ao/blasc/source/sgemm.c`.
+
+To compile a code using Intel compiler use:
+
+```console
+$ icc -mkl sgemm-ao-short.c -o sgemm
+```
+
+For debugging purposes enable the offload report to see more information about automatic offloading.
+
+```console
+$ export OFFLOAD_REPORT=2
+```
+
+The output of the code should look similar to the following listing, where the lines starting with [MKL] are generated by the offload reporting:
+
+```console
+    Computing SGEMM on the host
+    Enabling Automatic Offload
+    Automatic Offload enabled: 1 MIC devices present
+    Computing SGEMM with automatic workdivision
+    [MKL] [MIC --] [AO Function]    SGEMM
+    [MKL] [MIC --] [AO SGEMM Workdivision]  0.00 1.00
+    [MKL] [MIC 00] [AO SGEMM CPU Time]      0.463351 seconds
+    [MKL] [MIC 00] [AO SGEMM MIC Time]      0.179608 seconds
+    [MKL] [MIC 00] [AO SGEMM CPU->MIC Data] 52428800 bytes
+    [MKL] [MIC 00] [AO SGEMM MIC->CPU Data] 26214400 bytes
+    Done
+```
+
+## Native Mode
+
+In the native mode a program is executed directly on Intel Xeon Phi without involvement of the host machine. Similarly to offload mode, the code is compiled on the host computer with Intel compilers.
+
+To compile code, the user has to be connected to a compute node with a MIC and load the Intel compilers module. To get an interactive session on a compute node with an Intel Xeon Phi and load the module, use the following commands:
+
+```console
+$ qsub -I -q qmic -A NONE-0-0
+$ ml intel
+```
+
+!!! note
+    A particular version of the Intel module should be specified. This information is used later to specify the correct library paths.
+
+To produce a binary compatible with the Intel Xeon Phi architecture, the user has to specify the "-mmic" compiler flag. Two compilation examples are shown below. The first example shows how to compile the OpenMP parallel code "vect-add.c" for the host only:
+
+```console
+$ icc -xhost -no-offload -fopenmp vect-add.c -o vect-add-host
+```
+
+To run this code on host, use:
+
+```console
+$ ./vect-add-host
+```
+
+The second example shows how to compile the same code for Intel Xeon Phi:
+
+```console
+$ icc -mmic -fopenmp vect-add.c -o vect-add-mic
+```
+
+### Execution of the Program in Native Mode on Intel Xeon Phi
+
+User access to the Intel Xeon Phi is through SSH. Since user home directories are mounted using NFS on the accelerator, users do not have to copy binary files or libraries between the host and the accelerator.
+
+To connect to the accelerator run:
+
+```console
+$ ssh mic0
+```
+
+If the code is sequential, it can be executed directly:
+
+```console
+mic0 $ ~/path_to_binary/vect-add-seq-mic
+```
+
+If the code is parallelized using OpenMP, a set of additional libraries is required for execution. To locate these libraries, a new path has to be added to the LD_LIBRARY_PATH environment variable prior to the execution:
+
+```console
+mic0 $ export LD_LIBRARY_PATH=/apps/intel/composer_xe_2013.5.192/compiler/lib/mic:$LD_LIBRARY_PATH
+```
+
+!!! note
+    The path exported in the previous example contains path to a specific compiler (here the version is 5.192). This version number has to match with the version number of the Intel compiler module that was used to compile the code on the host computer.
+
+For your information the list of libraries and their location required for execution of an OpenMP parallel code on Intel Xeon Phi is:
+
+!!! note
+    /apps/intel/composer_xe_2013.5.192/compiler/lib/mic
+
+    - libiomp5.so
+    - libimf.so
+    - libsvml.so
+    - libirng.so
+    - libintlc.so.5
+
+Finally, to run the compiled code use:
+
+```console
+$ ~/path_to_binary/vect-add-mic
+```
+
+## OpenCL
+
+OpenCL (Open Computing Language) is an open standard for general-purpose parallel programming for diverse mix of multi-core CPUs, GPU coprocessors, and other parallel processors. OpenCL provides a flexible execution model and uniform programming environment for software developers to write portable code for systems running on both the CPU and graphics processors or accelerators like the Intel® Xeon Phi.
+
+On Anselm, OpenCL is installed only on compute nodes with the MIC accelerator, therefore OpenCL code can only be compiled on these nodes.
+
+```console
+module load opencl-sdk opencl-rt
+```
+
+Always load "opencl-sdk" (providing devel files like headers) and "opencl-rt" (providing dynamic library libOpenCL.so) modules to compile and link OpenCL code. Load "opencl-rt" for running your compiled code.
+
+There are two basic examples of OpenCL code in the following directory:
+
+```console
+/apps/intel/opencl-examples/
+```
+
+The first example, "CapsBasic", detects OpenCL-compatible hardware, here the CPU and MIC, and prints basic information about its capabilities.
+
+```console
+/apps/intel/opencl-examples/CapsBasic/capsbasic
+```
+
+To compile and run the example, copy it to your home directory, get a PBS interactive session on one of the nodes with a MIC, and run make for compilation. The Makefiles are very basic and show how the OpenCL code can be compiled on Anselm.
+
+```console
+$ cp /apps/intel/opencl-examples/CapsBasic/* .
+$ qsub -I -q qmic -A NONE-0-0
+$ make
+```
+
+The compilation command for this example is:
+
+```console
+$ g++ capsbasic.cpp -lOpenCL -o capsbasic -I/apps/intel/opencl/include/
+```
+
+After executing the compiled binary file, the following output should be displayed.
+
+```console
+$ ./capsbasic
+
+    Number of available platforms: 1
+    Platform names:
+        [0] Intel(R) OpenCL [Selected]
+    Number of devices available for each type:
+        CL_DEVICE_TYPE_CPU: 1
+        CL_DEVICE_TYPE_GPU: 0
+        CL_DEVICE_TYPE_ACCELERATOR: 1
+
+    ** Detailed information for each device ***
+
+    CL_DEVICE_TYPE_CPU[0]
+        CL_DEVICE_NAME:        Intel(R) Xeon(R) CPU E5-2470 0 @ 2.30GHz
+        CL_DEVICE_AVAILABLE: 1
+
+    ...
+
+    CL_DEVICE_TYPE_ACCELERATOR[0]
+        CL_DEVICE_NAME: Intel(R) Many Integrated Core Acceleration Card
+        CL_DEVICE_AVAILABLE: 1
+
+    ...
+```
+
+!!! note
+    More information about this example can be found on Intel website: <http://software.intel.com/en-us/vcsource/samples/caps-basic/>
+
+The second example that can be found in the "/apps/intel/opencl-examples" directory is General Matrix Multiply. You can follow the same procedure to copy the example to your directory and compile it.
+
+```console
+$ cp -r /apps/intel/opencl-examples/* .
+$ qsub -I -q qmic -A NONE-0-0
+$ cd GEMM
+$ make
+```
+
+The compilation command for this example is:
+
+```console
+$ g++ cmdoptions.cpp gemm.cpp ../common/basic.cpp ../common/cmdparser.cpp ../common/oclobject.cpp -I../common -lOpenCL -o gemm -I/apps/intel/opencl/include/
+```
+
+To see the performance of the Intel Xeon Phi performing the DGEMM, run the example as follows:
+
+```console
+    ./gemm -d 1
+    Platforms (1):
+     [0] Intel(R) OpenCL [Selected]
+    Devices (2):
+     [0] Intel(R) Xeon(R) CPU E5-2470 0 @ 2.30GHz
+     [1] Intel(R) Many Integrated Core Acceleration Card [Selected]
+    Build program options: "-DT=float -DTILE_SIZE_M=1 -DTILE_GROUP_M=16 -DTILE_SIZE_N=128 -DTILE_GROUP_N=1 -DTILE_SIZE_K=8"
+    Running gemm_nn kernel with matrix size: 3968x3968
+    Memory row stride to ensure necessary alignment: 15872 bytes
+    Size of memory region for one matrix: 62980096 bytes
+    Using alpha = 0.57599 and beta = 0.872412
+    ...
+    Host time: 0.292953 sec.
+    Host perf: 426.635 GFLOPS
+    Host time: 0.293334 sec.
+    Host perf: 426.081 GFLOPS
+    ...
+```
+
+!!! warning
+    GNU compiler is used to compile the OpenCL codes for Intel MIC. You do not need to load Intel compiler module.
+
+## MPI
+
+### Environment Setup and Compilation
+
+Again, an MPI code for the Intel Xeon Phi has to be compiled on a compute node with an accelerator and the MPSS software stack installed. To get to a compute node with an accelerator, use:
+
+```console
+$ qsub -I -q qmic -A NONE-0-0
+```
+
+The only supported implementation of the MPI standard for the Intel Xeon Phi is Intel MPI. To set up a fully functional development environment, a combination of the Intel compiler and Intel MPI has to be used. On the host, load the following module before compilation:
+
+```console
+$ module load intel
+```
+
+To compile an MPI code for host use:
+
+```console
+$ mpiicc -xhost -o mpi-test mpi-test.c
+```
+
+To compile the same code for Intel Xeon Phi architecture use:
+
+```console
+$ mpiicc -mmic -o mpi-test-mic mpi-test.c
+```
+
+An example of a basic MPI version of the "hello-world" program in C, which can be executed on both the host and the Xeon Phi, is shown below (it can be directly copied and pasted into a .c file):
+
+```cpp
+#include <stdio.h>
+#include <mpi.h>
+
+int main (int argc, char *argv[])
+{
+  int rank, size;
+
+  int len;
+  char node[MPI_MAX_PROCESSOR_NAME];
+
+  MPI_Init (&argc, &argv);      /* starts MPI */
+  MPI_Comm_rank (MPI_COMM_WORLD, &rank);        /* get current process id */
+  MPI_Comm_size (MPI_COMM_WORLD, &size);        /* get number of processes */
+
+  MPI_Get_processor_name(node,&len);
+
+  printf( "Hello world from process %d of %d on host %s \n", rank, size, node );
+  MPI_Finalize();
+  return 0;
+}
+```
+
+### MPI Programming Models
+
+Intel MPI for the Xeon Phi coprocessors offers different MPI programming models:
+
+!!! note
+    **Host-only model** - all MPI ranks reside on the host. The coprocessors can be used by using offload pragmas. (Using MPI calls inside offloaded code is not supported.)
+
+    **Coprocessor-only model** - all MPI ranks reside only on the coprocessors.
+
+    **Symmetric model** - the MPI ranks reside on both the host and the coprocessor. Most general MPI case.
+
+### Host-Only Model
+
+In this case all environment variables are set by modules, so to execute the compiled MPI program on a single node, use:
+
+```console
+$ mpirun -np 4 ./mpi-test
+```
+
+The output should be similar to:
+
+```console
+    Hello world from process 1 of 4 on host cn207
+    Hello world from process 3 of 4 on host cn207
+    Hello world from process 2 of 4 on host cn207
+    Hello world from process 0 of 4 on host cn207
+```
+
+### Coprocessor-Only Model
+
+There are two ways to execute an MPI code on a single coprocessor: 1) launch the program using "**mpirun**" from the coprocessor; or 2) launch the task using "**mpiexec.hydra**" from a host.
+
+#### Execution on Coprocessor
+
+Similarly to the execution of OpenMP programs in native mode, since the environment modules are not supported on the MIC, the user has to set up the paths to the Intel MPI libraries and binaries manually. A one-time setup can be done by creating a "**.profile**" file in the user's home directory. This file sets up the environment on the MIC automatically once the user accesses the accelerator through SSH.
+
+```console
+$ vim ~/.profile
+
+    PS1='[\u@\h \W]\$ '
+    export PATH=/usr/bin:/usr/sbin:/bin:/sbin
+
+    #OpenMP
+    export LD_LIBRARY_PATH=/apps/intel/composer_xe_2013.5.192/compiler/lib/mic:$LD_LIBRARY_PATH
+
+    #Intel MPI
+    export LD_LIBRARY_PATH=/apps/intel/impi/4.1.1.036/mic/lib/:$LD_LIBRARY_PATH
+    export PATH=/apps/intel/impi/4.1.1.036/mic/bin/:$PATH
+```
+
+!!! note
+    \* this file sets up the environment variables for both the MPI and OpenMP libraries.
+    \* this file sets up the paths to a particular version of the Intel MPI library and a particular version of the Intel compiler. These versions have to match the loaded modules.
+
+To access a MIC accelerator located on a node that user is currently connected to, use:
+
+```console
+$ ssh mic0
+```
+
+or in case you need to specify a MIC accelerator on a particular node, use:
+
+```console
+$ ssh cn207-mic0
+```
+
+To run the MPI code in parallel on multiple cores of the accelerator, use:
+
+```console
+$ mpirun -np 4 ./mpi-test-mic
+```
+
+The output should be similar to:
+
+```console
+    Hello world from process 1 of 4 on host cn207-mic0
+    Hello world from process 2 of 4 on host cn207-mic0
+    Hello world from process 3 of 4 on host cn207-mic0
+    Hello world from process 0 of 4 on host cn207-mic0
+```
+
+#### Execution on Host
+
+If the MPI program is launched from the host instead of the coprocessor, the environment variables are not set using the ".profile" file. Therefore, the user has to specify the library paths from the command line when calling "mpiexec".
+
+The first step is to tell mpiexec that the MPI program should be executed on a local accelerator by setting up the environment variable "I_MPI_MIC":
+
+```console
+$ export I_MPI_MIC=1
+```
+
+Now the MPI program can be executed as:
+
+```console
+$ mpiexec.hydra -genv LD_LIBRARY_PATH /apps/intel/impi/4.1.1.036/mic/lib/ -host mic0 -n 4 ~/mpi-test-mic
+```
+
+or using mpirun
+
+```console
+$ mpirun -genv LD_LIBRARY_PATH /apps/intel/impi/4.1.1.036/mic/lib/ -host mic0 -n 4 ~/mpi-test-mic
+```
+
+!!! note
+    \* the full path to the binary has to be specified (here: `~/mpi-test-mic`)
+    \* the `LD_LIBRARY_PATH` has to match the Intel MPI module used to compile the MPI code
+
+The output should be again similar to:
+
+```console
+    Hello world from process 1 of 4 on host cn207-mic0
+    Hello world from process 2 of 4 on host cn207-mic0
+    Hello world from process 3 of 4 on host cn207-mic0
+    Hello world from process 0 of 4 on host cn207-mic0
+```
+
+!!! note
+    `mpiexec.hydra` requires a file on the MIC filesystem. If the file is missing, please contact the system administrators.
+
+A simple test to see if the file is present is to execute:
+
+```console
+$ ssh mic0 ls /bin/pmi_proxy
+      /bin/pmi_proxy
+```
+
+#### Execution on Host - MPI Processes Distributed Over Multiple Accelerators on Multiple Nodes
+
+To get access to multiple nodes with MIC accelerators, the user has to use PBS to allocate the resources. To start an interactive session that allocates 2 compute nodes = 2 MIC accelerators, run the qsub command with the following parameters:
+
+```console
+$ qsub -I -q qmic -A NONE-0-0 -l select=2:ncpus=16
+$ ml intel/13.5.192 impi/4.1.1.036
+```
+
+This command connects the user through ssh to one of the nodes immediately. To see the other nodes that have been allocated, use:
+
+```console
+$ cat $PBS_NODEFILE
+```
+
+For example:
+
+```console
+    cn204.bullx
+    cn205.bullx
+```
+
+This output means that the PBS allocated nodes cn204 and cn205, which means that the user has direct access to the "**cn204-mic0**" and "**cn205-mic0**" accelerators.
+
+!!! note
+    At this point user can connect to any of the allocated nodes or any of the allocated MIC accelerators using ssh:
+    - to connect to the second node : `$ ssh cn205`
+    - to connect to the accelerator on the first node from the first node: `$ ssh cn204-mic0` or `$ ssh mic0`
+    - to connect to the accelerator on the second node from the first node: `$ ssh cn205-mic0`
+
+At this point, we expect that the correct modules are loaded and the binary is compiled. For parallel execution, mpiexec.hydra is used. Again, the first step is to tell mpiexec that the MPI program can be executed on MIC accelerators by setting up the environment variable "I_MPI_MIC":
+
+```console
+$ export I_MPI_MIC=1
+```
+
+To launch the MPI program, use:
+
+```console
+$ mpiexec.hydra -genv LD_LIBRARY_PATH /apps/intel/impi/4.1.1.036/mic/lib/
+     -genv I_MPI_FABRICS_LIST tcp
+     -genv I_MPI_FABRICS shm:tcp
+     -genv I_MPI_TCP_NETMASK=10.1.0.0/16
+     -host cn204-mic0 -n 4 ~/mpi-test-mic
+    : -host cn205-mic0 -n 6 ~/mpi-test-mic
+```
+
+or using mpirun:
+
+```console
+$ mpirun -genv LD_LIBRARY_PATH /apps/intel/impi/4.1.1.036/mic/lib/
+     -genv I_MPI_FABRICS_LIST tcp
+     -genv I_MPI_FABRICS shm:tcp
+     -genv I_MPI_TCP_NETMASK=10.1.0.0/16
+     -host cn204-mic0 -n 4 ~/mpi-test-mic
+    : -host cn205-mic0 -n 6 ~/mpi-test-mic
+```
+
+In this case, four MPI processes are executed on the accelerator cn204-mic0 and six processes are executed on the accelerator cn205-mic0. The sample output (sorted after execution) is:
+
+```console
+    Hello world from process 0 of 10 on host cn204-mic0
+    Hello world from process 1 of 10 on host cn204-mic0
+    Hello world from process 2 of 10 on host cn204-mic0
+    Hello world from process 3 of 10 on host cn204-mic0
+    Hello world from process 4 of 10 on host cn205-mic0
+    Hello world from process 5 of 10 on host cn205-mic0
+    Hello world from process 6 of 10 on host cn205-mic0
+    Hello world from process 7 of 10 on host cn205-mic0
+    Hello world from process 8 of 10 on host cn205-mic0
+    Hello world from process 9 of 10 on host cn205-mic0
+```
+
+In the same way, the MPI program can be executed on multiple hosts:
+
+```console
+$ mpiexec.hydra -genv LD_LIBRARY_PATH /apps/intel/impi/4.1.1.036/mic/lib/
+     -genv I_MPI_FABRICS_LIST tcp
+     -genv I_MPI_FABRICS shm:tcp
+     -genv I_MPI_TCP_NETMASK=10.1.0.0/16
+     -host cn204 -n 4 ~/mpi-test
+    : -host cn205 -n 6 ~/mpi-test
+```
+
+### Symmetric Model
+
+In the symmetric mode, MPI programs are executed on both the host computer(s) and the MIC accelerator(s). Since the MIC has a different architecture and requires a different binary file produced by the Intel compiler, two different files have to be compiled before the MPI program is executed.
+
+In the previous section, we compiled two binary files, one for the host ("**mpi-test**") and one for the MIC accelerators ("**mpi-test-mic**"). These two binaries can be executed at once using mpiexec.hydra:
+
+```console
+$ mpiexec.hydra
+     -genv I_MPI_FABRICS_LIST tcp
+     -genv I_MPI_FABRICS shm:tcp
+     -genv I_MPI_TCP_NETMASK=10.1.0.0/16
+     -genv LD_LIBRARY_PATH /apps/intel/impi/4.1.1.036/mic/lib/
+     -host cn205 -n 2 ~/mpi-test
+    : -host cn205-mic0 -n 2 ~/mpi-test-mic
+```
+
+In this example, the lines with -genv set up the required environment variables for execution. The line with "-host cn205" specifies the binary that is executed on the host (here cn205), and the last line specifies the binary that is executed on the accelerator (here cn205-mic0).
+
+The output of the program is:
+
+```console
+    Hello world from process 0 of 4 on host cn205
+    Hello world from process 1 of 4 on host cn205
+    Hello world from process 2 of 4 on host cn205-mic0
+    Hello world from process 3 of 4 on host cn205-mic0
+```
+
+The execution procedure can be simplified by using the mpirun command with a machine file as a parameter. The machine file contains a list of all nodes and accelerators that should be used to execute MPI processes.
+
+An example of a machine file that uses 2 hosts (**cn205** and **cn206**) and 2 accelerators (**cn205-mic0** and **cn206-mic0**) to run 2 MPI processes on each of them:
+
+```console
+$ cat hosts_file_mix
+    cn205:2
+    cn205-mic0:2
+    cn206:2
+    cn206-mic0:2
+```
+
+In addition, if a naming convention is used in which the host binary is named **"bin_name"** and the accelerator binary is named **"bin_name-mic"**, then by setting the environment variable **I_MPI_MIC_POSTFIX** to **"-mic"** the user does not have to specify the names of both binaries. In this case mpirun needs just the name of the host binary file (i.e. "mpi-test") and uses the suffix to derive the name of the accelerator binary (i.e. "mpi-test-mic").
+
+```console
+$ export I_MPI_MIC_POSTFIX=-mic
+```
+
+To run the MPI code using mpirun and the machine file "hosts_file_mix" use:
+
+```console
+$ mpirun
+     -genv LD_LIBRARY_PATH /apps/intel/impi/4.1.1.036/mic/lib/
+     -genv I_MPI_FABRICS_LIST tcp
+     -genv I_MPI_FABRICS shm:tcp
+     -genv I_MPI_TCP_NETMASK=10.1.0.0/16
+     -machinefile hosts_file_mix
+     ~/mpi-test
+```
+
+A possible output of the MPI "hello-world" example executed on two hosts and two accelerators is:
+
+```console
+    Hello world from process 0 of 8 on host cn204
+    Hello world from process 1 of 8 on host cn204
+    Hello world from process 2 of 8 on host cn204-mic0
+    Hello world from process 3 of 8 on host cn204-mic0
+    Hello world from process 4 of 8 on host cn205
+    Hello world from process 5 of 8 on host cn205
+    Hello world from process 6 of 8 on host cn205-mic0
+    Hello world from process 7 of 8 on host cn205-mic0
+```
+
+!!! note
+    At this point the MPI communication between MIC accelerators on different nodes uses 1Gb Ethernet only.
+
+### Using Automatically Generated Node-Files
+
+A set of node-files, which can be used instead of manually creating a new one every time, is generated for user convenience. Six node-files are generated:
+
+!!! note
+    **Node-files:**
+
+     - /lscratch/${PBS_JOBID}/nodefile-cn Hosts only node-file
+     - /lscratch/${PBS_JOBID}/nodefile-mic MICs only node-file
+     - /lscratch/${PBS_JOBID}/nodefile-mix Hosts and MICs node-file
+     - /lscratch/${PBS_JOBID}/nodefile-cn-sn Hosts only node-file, using short names
+     - /lscratch/${PBS_JOBID}/nodefile-mic-sn MICs only node-file, using short names
+     - /lscratch/${PBS_JOBID}/nodefile-mix-sn Hosts and MICs node-file, using short names
+
+Each host or accelerator is listed only once per file. The user has to specify how many processes should be executed per node using the `-n` parameter of the mpirun command; a possible invocation is sketched below.
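+
+For illustration, a run over the mixed node-file might look like the following (a sketch only; the binary names follow the examples above and the process count given with `-n` is just an illustration):
+
+```console
+$ export I_MPI_MIC_POSTFIX=-mic
+$ mpiexec.hydra -genv LD_LIBRARY_PATH /apps/intel/impi/4.1.1.036/mic/lib/
+     -genv I_MPI_FABRICS_LIST tcp
+     -genv I_MPI_FABRICS shm:tcp
+     -genv I_MPI_TCP_NETMASK=10.1.0.0/16
+     -machinefile /lscratch/${PBS_JOBID}/nodefile-mix
+     -n 8 ~/mpi-test
+```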
+
+## Optimization
+
+For more details about optimization techniques, please read the Intel document [Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors](http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization).
diff --git a/docs.it4i/software/intel/intel-xeon-phi.md b/docs.it4i/software/intel/intel-xeon-phi.md
new file mode 100644
index 0000000000000000000000000000000000000000..f09e343ce7c02c194c8d1406cc374442d0be4249
--- /dev/null
+++ b/docs.it4i/software/intel/intel-xeon-phi.md
@@ -0,0 +1,1011 @@
+# Intel Xeon Phi
+
+## Guide to Intel Xeon Phi Usage
+
+Intel Xeon Phi accelerator can be programmed in several modes. The default mode on the cluster is offload mode, but all modes described in this document are supported.
+
+## Intel Utilities for Xeon Phi
+
+To get access to a compute node with Intel Xeon Phi accelerator, use the PBS interactive session
+
+```console
+$ qsub -I -q qprod -l select=1:ncpus=24:accelerator=True:naccelerators=2:accelerator_model=phi7120 -A NONE-0-0
+```
+
+To set up the environment, the module "intel" has to be loaded. If no version is specified, the default version is loaded (at the time of writing, it is 2015b):
+
+```console
+$ ml intel
+```
+
+Information about the hardware can be obtained by running the micinfo program on the host.
+
+```console
+$ /usr/bin/micinfo
+```
+
+The output of the "micinfo" utility executed on one of the cluster nodes is shown below. (Note: to get PCIe related details, the command has to be run with root privileges.)
+
+```console
+MicInfo Utility Log
+Created Wed Sep 13 13:39:28 2017
+
+
+        System Info
+                HOST OS                 : Linux
+                OS Version              : 2.6.32-696.3.2.el6.x86_64
+                Driver Version          : 3.8.2-1
+                MPSS Version            : 3.8.2
+                Host Physical Memory    : 128838 MB
+
+Device No: 0, Device Name: mic0
+
+        Version
+                Flash Version            : 2.1.02.0391
+                SMC Firmware Version     : 1.17.6900
+                SMC Boot Loader Version  : 1.8.4326
+                Coprocessor OS Version   : 2.6.38.8+mpss3.8.2
+                Device Serial Number     : ADKC44601725
+
+        Board
+                Vendor ID                : 0x8086
+                Device ID                : 0x225c
+                Subsystem ID             : 0x7d95
+                Coprocessor Stepping ID  : 2
+                PCIe Width               : x16
+                PCIe Speed               : 5 GT/s
+                PCIe Max payload size    : 256 bytes
+                PCIe Max read req size   : 512 bytes
+                Coprocessor Model        : 0x01
+                Coprocessor Model Ext    : 0x00
+                Coprocessor Type         : 0x00
+                Coprocessor Family       : 0x0b
+                Coprocessor Family Ext   : 0x00
+                Coprocessor Stepping     : C0
+                Board SKU                : C0PRQ-7120 P/A/X/D
+                ECC Mode                 : Enabled
+                SMC HW Revision          : Product 300W Passive CS
+
+        Cores
+                Total No of Active Cores : 61
+                Voltage                  : 1041000 uV
+                Frequency                : 1238095 kHz
+
+        Thermal
+                Fan Speed Control        : N/A
+                Fan RPM                  : N/A
+                Fan PWM                  : N/A
+                Die Temp                 : 50 C
+
+        GDDR
+                GDDR Vendor              : Samsung
+                GDDR Version             : 0x6
+                GDDR Density             : 4096 Mb
+                GDDR Size                : 15872 MB
+                GDDR Technology          : GDDR5
+                GDDR Speed               : 5.500000 GT/s
+                GDDR Frequency           : 2750000 kHz
+                GDDR Voltage             : 1501000 uV
+
+Device No: 1, Device Name: mic1
+
+        Version
+                Flash Version            : 2.1.02.0391
+                SMC Firmware Version     : 1.17.6900
+                SMC Boot Loader Version  : 1.8.4326
+                Coprocessor OS Version   : 2.6.38.8+mpss3.8.2
+                Device Serial Number     : ADKC44601893
+
+        Board
+                Vendor ID                : 0x8086
+                Device ID                : 0x225c
+                Subsystem ID             : 0x7d95
+                Coprocessor Stepping ID  : 2
+                PCIe Width               : x16
+                PCIe Speed               : 5 GT/s
+                PCIe Max payload size    : 256 bytes
+                PCIe Max read req size   : 512 bytes
+                Coprocessor Model        : 0x01
+                Coprocessor Model Ext    : 0x00
+                Coprocessor Type         : 0x00
+                Coprocessor Family       : 0x0b
+                Coprocessor Family Ext   : 0x00
+                Coprocessor Stepping     : C0
+                Board SKU                : C0PRQ-7120 P/A/X/D
+                ECC Mode                 : Enabled
+                SMC HW Revision          : Product 300W Passive CS
+
+        Cores
+                Total No of Active Cores : 61
+                Voltage                  : 1053000 uV
+                Frequency                : 1238095 kHz
+
+        Thermal
+                Fan Speed Control        : N/A
+                Fan RPM                  : N/A
+                Fan PWM                  : N/A
+                Die Temp                 : 48 C
+
+        GDDR
+                GDDR Vendor              : Samsung
+                GDDR Version             : 0x6
+                GDDR Density             : 4096 Mb
+                GDDR Size                : 15872 MB
+                GDDR Technology          : GDDR5
+                GDDR Speed               : 5.500000 GT/s
+                GDDR Frequency           : 2750000 kHz
+                GDDR Voltage             : 1501000 uV
+```
+
+## Offload Mode
+
+To compile code for the Intel Xeon Phi, an MPSS stack has to be installed on the machine where the compilation is executed. Currently the MPSS stack is only installed on compute nodes equipped with accelerators.
+
+```console
+$ qsub -I -q qprod -l select=1:ncpus=24:accelerator=True:naccelerators=2:accelerator_model=phi7120 -A NONE-0-0
+$ ml intel
+```
+
+For debugging purposes it is also recommended to set the environment variable "OFFLOAD_REPORT". Its value can be set from 0 to 3, where a higher number means more debugging information.
+
+```console
+export OFFLOAD_REPORT=3
+```
+
+A very basic example of code that employs the offload programming technique is shown in the next listing. Please note that this code is sequential and utilizes only a single core of the accelerator.
+
+```cpp
+$ cat source-offload.cpp
+
+#include <iostream>
+
+int main(int argc, char* argv[])
+{
+    const int niter = 100000;
+    double result = 0;
+
+ #pragma offload target(mic)
+    for (int i = 0; i < niter; ++i) {
+        const double t = (i + 0.5) / niter;
+        result += 4.0 / (t * t + 1.0);
+    }
+    result /= niter;
+    std::cout << "Pi ~ " << result << '\n';
+}
+```
+
+To compile the code using the Intel compiler, run:
+
+```console
+$ icc source-offload.cpp -o bin-offload
+```
+
+To execute the code, run the following command on the host
+
+```console
+$ ./bin-offload
+```
+
+### Parallelization in Offload Mode Using OpenMP
+
+One way to parallelize a code for the Xeon Phi is to use OpenMP directives. The following example shows code for parallel vector addition.
+
+```cpp
+$ cat ./vect-add.c
+
+#include <stdio.h>
+
+typedef int T;
+
+#define SIZE 1000
+
+#pragma offload_attribute(push, target(mic))
+T in1[SIZE];
+T in2[SIZE];
+T res[SIZE];
+#pragma offload_attribute(pop)
+
+// MIC function to add two vectors
+__attribute__((target(mic))) void add_mic(T *a, T *b, T *c, int size) {
+  int i = 0;
+  #pragma omp parallel for
+    for (i = 0; i < size; i++)
+      c[i] = a[i] + b[i];
+}
+
+// CPU function to add two vectors
+void add_cpu (T *a, T *b, T *c, int size) {
+  int i;
+  for (i = 0; i < size; i++)
+    c[i] = a[i] + b[i];
+}
+
+// CPU function to generate a vector of random numbers
+void random_T (T *a, int size) {
+  int i;
+  for (i = 0; i < size; i++)
+    a[i] = rand() % 10000; // random number between 0 and 9999
+}
+
+// CPU function to compare two vectors
+int compare(T *a, T *b, T size ){
+  int pass = 0;
+  int i;
+  for (i = 0; i < size; i++){
+    if (a[i] != b[i]) {
+      printf("Value mismatch at location %d, values %d and %dn",i, a[i], b[i]);
+      pass = 1;
+    }
+  }
+  if (pass == 0) printf ("Test passed\n"); else printf ("Test Failed\n");
+  return pass;
+}
+
+int main()
+{
+  int i;
+  random_T(in1, SIZE);
+  random_T(in2, SIZE);
+
+  #pragma offload target(mic) in(in1,in2)  inout(res)
+  {
+
+    // Parallel loop from main function
+    #pragma omp parallel for
+    for (i=0; i<SIZE; i++)
+      res[i] = in1[i] + in2[i];
+
+    // or parallel loop is called inside the function
+    add_mic(in1, in2, res, SIZE);
+
+  }
+
+  //Check the results with CPU implementation
+  T res_cpu[SIZE];
+  add_cpu(in1, in2, res_cpu, SIZE);
+  compare(res, res_cpu, SIZE);
+
+}
+```
+
+During the compilation the Intel compiler reports which loops have been vectorized for both the host and the accelerator. This can be enabled with the compiler option "-vec-report2". To compile and execute the code, run:
+
+```console
+$ icc vect-add.c -openmp_report2 -vec-report2 -o vect-add
+$ ./vect-add
+```
+
+Some interesting compiler flags useful not only for code debugging are:
+
+!!! note
+    Debugging
+    openmp_report[0|1|2] - controls the OpenMP parallelizer diagnostic level
+    vec-report[0|1|2] - controls the compiler-based vectorization diagnostic level
+
+    Performance optimization
+    xhost - FOR HOST ONLY - to generate AVX (Advanced Vector Extensions) instructions.
+
+## Automatic Offload Using Intel MKL Library
+
+Intel MKL includes an Automatic Offload (AO) feature that enables computationally intensive MKL functions called in user code to benefit from attached Intel Xeon Phi coprocessors automatically and transparently.
+
+!!! note
+    The behavior of the automatic offload mode is controlled by functions called within the program or by environment variables. The complete list of controls is available [here](http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm).
+
+The Automatic Offload may be enabled by either an MKL function call within the code:
+
+```cpp
+mkl_mic_enable();
+```
+
+or by setting environment variable
+
+```console
+$ export MKL_MIC_ENABLE=1
+```
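+
+Besides MKL_MIC_ENABLE, the device selection and work division can also be influenced through environment variables. A sketch only; the variable names are taken from the Intel MKL Automatic Offload documentation and their exact behavior may depend on the MKL version:
+
+```console
+$ export MKL_MIC_ENABLE=1             # enable Automatic Offload
+$ export OFFLOAD_DEVICES=0,1          # restrict offloading to the listed coprocessors
+$ export MKL_MIC_WORKDIVISION=0.5     # fraction of the work to be offloaded to the coprocessors
+```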
+
+To get more information about automatic offload please refer to "[Using Intel® MKL Automatic Offload on Intel ® Xeon Phi™ Coprocessors](http://software.intel.com/sites/default/files/11MIC42_How_to_Use_MKL_Automatic_Offload_0.pdf)" white paper or [Intel MKL documentation](https://software.intel.com/en-us/articles/intel-math-kernel-library-documentation).
+
+### Automatic Offload Example
+
+First, get an interactive PBS session on a node with a MIC accelerator and load the "intel" module, which automatically loads the "mkl" module as well.
+
+```console
+$ qsub -I -q qprod -l select=1:ncpus=24:accelerator=True:naccelerators=2:accelerator_model=phi7120 -A NONE-0-0
+$ ml intel
+```
+
+The following code can be copied to a file and compiled without any modification.
+
+```cpp
+$ vim sgemm-ao-short.c
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <malloc.h>
+#include <stdint.h>
+
+#include "mkl.h"
+
+int main(int argc, char **argv)
+{
+    float *A, *B, *C; /* Matrices */
+
+    MKL_INT N = 2560; /* Matrix dimensions */
+    MKL_INT LD = N; /* Leading dimension */
+    int matrix_bytes; /* Matrix size in bytes */
+    int matrix_elements; /* Matrix size in elements */
+
+    float alpha = 1.0, beta = 1.0; /* Scaling factors */
+    char transa = 'N', transb = 'N'; /* Transposition options */
+
+    int i, j; /* Counters */
+
+    matrix_elements = N * N;
+    matrix_bytes = sizeof(float) * matrix_elements;
+
+    /* Allocate the matrices */
+    A = malloc(matrix_bytes); B = malloc(matrix_bytes); C = malloc(matrix_bytes);
+
+    /* Initialize the matrices */
+    for (i = 0; i < matrix_elements; i++) {
+            A[i] = 1.0; B[i] = 2.0; C[i] = 0.0;
+    }
+
+    printf("Computing SGEMM on the host\n");
+    sgemm(&transa, &transb, &N, &N, &N, &alpha, A, &N, B, &N, &beta, C, &N);
+
+    printf("Enabling Automatic Offload\n");
+    /* Alternatively, set environment variable MKL_MIC_ENABLE=1 */
+    mkl_mic_enable();
+
+    int ndevices = mkl_mic_get_device_count(); /* Number of MIC devices */
+    printf("Automatic Offload enabled: %d MIC devices present\n",   ndevices);
+
+    printf("Computing SGEMM with automatic workdivision\n");
+    sgemm(&transa, &transb, &N, &N, &N, &alpha, A, &N, B, &N, &beta, C, &N);
+
+    /* Free the matrix memory */
+    free(A); free(B); free(C);
+
+    printf("Done\n");
+
+    return 0;
+}
+```
+
+!!! note
+    This example is a simplified version of an example from MKL. The expanded version can be found here: **$MKL_EXAMPLES/mic_ao/blasc/source/sgemm.c**
+
+To compile the code using the Intel compiler, use:
+
+```console
+$ icc -mkl sgemm-ao-short.c -o sgemm
+```
+
+For debugging purposes enable the offload report to see more information about automatic offloading.
+
+```console
+$ export OFFLOAD_REPORT=2
+```
+
+The output of the code should look similar to the following listing, where lines starting with [MKL] are generated by the offload reporting:
+
+```console
+[user@r31u03n799 ~]$ ./sgemm
+Computing SGEMM on the host
+Enabling Automatic Offload
+Automatic Offload enabled: 2 MIC devices present
+Computing SGEMM with automatic workdivision
+[MKL] [MIC --] [AO Function]    SGEMM
+[MKL] [MIC --] [AO SGEMM Workdivision]    0.44 0.28 0.28
+[MKL] [MIC 00] [AO SGEMM CPU Time]    0.252427 seconds
+[MKL] [MIC 00] [AO SGEMM MIC Time]    0.091001 seconds
+[MKL] [MIC 00] [AO SGEMM CPU->MIC Data]    34078720 bytes
+[MKL] [MIC 00] [AO SGEMM MIC->CPU Data]    7864320 bytes
+[MKL] [MIC 01] [AO SGEMM CPU Time]    0.252427 seconds
+[MKL] [MIC 01] [AO SGEMM MIC Time]    0.094758 seconds
+[MKL] [MIC 01] [AO SGEMM CPU->MIC Data]    34078720 bytes
+[MKL] [MIC 01] [AO SGEMM MIC->CPU Data]    7864320 bytes
+Done
+```
+
+### Automatic Offload Example #2
+
+In this example, we will demonstrate automatic offload control via the environment variable MKL_MIC_ENABLE. The DGEMM function will be offloaded.
+
+First, get an interactive PBS session on a node with a MIC accelerator.
+
+```console
+$ qsub -I -q qprod -l select=1:ncpus=24:accelerator=True:naccelerators=2:accelerator_model=phi7120 -A NONE-0-0
+```
+
+Once in, we enable the offload and run the Octave software. In Octave, we generate two large random matrices and multiply them.
+
+```console
+$ export MKL_MIC_ENABLE=1
+$ export OFFLOAD_REPORT=2
+$ ml Octave/3.8.2-intel-2015b
+$ octave -q
+octave:1> A=rand(10000);
+octave:2> B=rand(10000);
+octave:3> C=A*B;
+[MKL] [MIC --] [AO Function]    DGEMM
+[MKL] [MIC --] [AO DGEMM Workdivision]    0.14 0.43 0.43
+[MKL] [MIC 00] [AO DGEMM CPU Time]    3.814714 seconds
+[MKL] [MIC 00] [AO DGEMM MIC Time]    2.781595 seconds
+[MKL] [MIC 00] [AO DGEMM CPU->MIC Data]    1145600000 bytes
+[MKL] [MIC 00] [AO DGEMM MIC->CPU Data]    1382400000 bytes
+[MKL] [MIC 01] [AO DGEMM CPU Time]    3.814714 seconds
+[MKL] [MIC 01] [AO DGEMM MIC Time]    2.843016 seconds
+[MKL] [MIC 01] [AO DGEMM CPU->MIC Data]    1145600000 bytes
+[MKL] [MIC 01] [AO DGEMM MIC->CPU Data]    1382400000 bytes
+octave:4> exit
+```
+
+In the example above we observe that the DGEMM workload was split over the CPU, MIC 0 and MIC 1 in the ratio 0.14 : 0.43 : 0.43. The matrix multiplication was done on the CPU, accelerated by the two Xeon Phi accelerators.
+
+## Native Mode
+
+In the native mode a program is executed directly on the Intel Xeon Phi, without involvement of the host machine. Similarly to the offload mode, the code is compiled on the host computer with the Intel compilers.
+
+To compile the code, the user has to be connected to a compute node with a MIC and load the Intel compilers module. To get an interactive session on a compute node with an Intel Xeon Phi and load the module, use the following commands:
+
+```console
+$ qsub -I -q qprod -l select=1:ncpus=24:accelerator=True:naccelerators=2:accelerator_model=phi7120 -A NONE-0-0
+$ ml intel
+```
+
+!!! note
+    Note which particular version of the Intel module is loaded. This information is used later to specify the correct library paths.
+
+To produce a binary compatible with the Intel Xeon Phi architecture, the user has to specify the "-mmic" compiler flag. Two compilation examples are shown below. The first example shows how to compile the OpenMP parallel code "vect-add.c" for the host only:
+
+```console
+$ icc -xhost -no-offload -fopenmp vect-add.c -o vect-add-host
+```
+
+To run this code on host, use:
+
+```console
+$ ./vect-add-host
+```
+
+The second example shows how to compile the same code for Intel Xeon Phi:
+
+```console
+$ icc -mmic -fopenmp vect-add.c -o vect-add-mic
+```
+
+### Execution of the Program in Native Mode on Intel Xeon Phi
+
+Users access the Intel Xeon Phi through SSH. Since user home directories are mounted on the accelerator using NFS, users do not have to copy binary files or libraries between the host and the accelerator.
+
+Get the path of the MIC enabled libraries for the currently used Intel compiler (here icc/2015.3.187-GNU-5.1.0-2.25 was used):
+
+```console
+$ echo $MIC_LD_LIBRARY_PATH
+/apps/all/icc/2015.3.187-GNU-5.1.0-2.25/composer_xe_2015.3.187/compiler/lib/mic
+```
+
+To connect to the accelerator run:
+
+```console
+$ ssh mic0
+```
+
+If the code is sequential, it can be executed directly:
+
+```console
+mic0 $ ~/path_to_binary/vect-add-seq-mic
+```
+
+If the code is parallelized using OpenMP, a set of additional libraries is required for execution. To locate these libraries, a new path has to be added to the LD_LIBRARY_PATH environment variable prior to the execution:
+
+```console
+mic0 $ export LD_LIBRARY_PATH=/apps/all/icc/2015.3.187-GNU-5.1.0-2.25/composer_xe_2015.3.187/compiler/lib/mic:$LD_LIBRARY_PATH
+```
+
+!!! note
+    Please note that the path exported in the previous example contains the path to a specific compiler version (here 2015.3.187-GNU-5.1.0-2.25). This version number has to match the version of the Intel compiler module that was used to compile the code on the host computer.
+
+For reference, the list of libraries (and their location) required for execution of an OpenMP parallel code on the Intel Xeon Phi is:
+
+!!! note
+    /apps/all/icc/2015.3.187-GNU-5.1.0-2.25/composer_xe_2015.3.187/compiler/lib/mic
+
+    libiomp5.so
+    libimf.so
+    libsvml.so
+    libirng.so
+    libintlc.so.5
+
+Finally, run the compiled code on the accelerator, as shown in the sketch below.
+
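+A possible invocation on the coprocessor is sketched below (the binary name vect-add-mic follows the compilation example above; the thread count is only an illustration):
+
+```console
+mic0 $ export OMP_NUM_THREADS=61
+mic0 $ ~/path_to_binary/vect-add-mic
+```
+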
+## OpenCL
+
+OpenCL (Open Computing Language) is an open standard for general-purpose parallel programming on a diverse mix of multi-core CPUs, GPU coprocessors, and other parallel processors. OpenCL provides a flexible execution model and a uniform programming environment for software developers to write portable code for systems running on both the CPU and graphics processors or accelerators like the Intel® Xeon Phi.
+
+On Salomon, OpenCL is installed only on compute nodes with MIC accelerators; therefore OpenCL code can be compiled only on these nodes.
+
+```console
+module load opencl-sdk opencl-rt
+```
+
+Always load both the "opencl-sdk" (providing development files such as headers) and "opencl-rt" (providing the dynamic library libOpenCL.so) modules to compile and link OpenCL code. Only the "opencl-rt" module is needed to run the compiled code.
+
+There are two basic examples of OpenCL code in the following directory:
+
+```console
+/apps/intel/opencl-examples/
+```
+
+The first example, "CapsBasic", detects OpenCL compatible hardware (here the CPU and the MIC) and prints basic information about its capabilities.
+
+```console
+/apps/intel/opencl-examples/CapsBasic/capsbasic
+```
+
+To compile and run the example, copy it to your home directory, get a PBS interactive session on one of the nodes with a MIC, and run make. The Makefiles are very basic and show how OpenCL code can be compiled on Salomon.
+
+```console
+$ cp /apps/intel/opencl-examples/CapsBasic/* .
+$ qsub -I -q qmic -A NONE-0-0
+$ make
+```
+
+The compilation command for this example is:
+
+```console
+$ g++ capsbasic.cpp -lOpenCL -o capsbasic -I/apps/intel/opencl/include/
+```
+
+After executing the compiled binary, output similar to the following should be displayed.
+
+```console
+./capsbasic
+
+Number of available platforms: 1
+Platform names:
+    [0] Intel(R) OpenCL [Selected]
+Number of devices available for each type:
+    CL_DEVICE_TYPE_CPU: 1
+    CL_DEVICE_TYPE_GPU: 0
+    CL_DEVICE_TYPE_ACCELERATOR: 1
+
+** Detailed information for each device ***
+
+CL_DEVICE_TYPE_CPU[0]
+    CL_DEVICE_NAME:        Intel(R) Xeon(R) CPU E5-2470 0 @ 2.30GHz
+    CL_DEVICE_AVAILABLE: 1
+
+...
+
+CL_DEVICE_TYPE_ACCELERATOR[0]
+    CL_DEVICE_NAME: Intel(R) Many Integrated Core Acceleration Card
+    CL_DEVICE_AVAILABLE: 1
+
+...
+```
+
+!!! note
+    More information about this example can be found on Intel website: <http://software.intel.com/en-us/vcsource/samples/caps-basic/>
+
+The second example found in the "/apps/intel/opencl-examples" directory is General Matrix Multiply (GEMM). You can follow the same procedure to copy the example to your directory and compile it.
+
+```console
+$ cp -r /apps/intel/opencl-examples/* .
+$ qsub -I -q qmic -A NONE-0-0
+$ cd GEMM
+$ make
+```
+
+The compilation command for this example is:
+
+```console
+$ g++ cmdoptions.cpp gemm.cpp ../common/basic.cpp ../common/cmdparser.cpp ../common/oclobject.cpp -I../common -lOpenCL -o gemm -I/apps/intel/opencl/include/
+```
+
+To see the performance of the Intel Xeon Phi performing the matrix multiplication (GEMM), run the example as follows:
+
+```console
+./gemm -d 1
+Platforms (1):
+ [0] Intel(R) OpenCL [Selected]
+Devices (2):
+ [0] Intel(R) Xeon(R) CPU E5-2470 0 @ 2.30GHz
+ [1] Intel(R) Many Integrated Core Acceleration Card [Selected]
+Build program options: "-DT=float -DTILE_SIZE_M=1 -DTILE_GROUP_M=16 -DTILE_SIZE_N=128 -DTILE_GROUP_N=1 -DTILE_SIZE_K=8"
+Running gemm_nn kernel with matrix size: 3968x3968
+Memory row stride to ensure necessary alignment: 15872 bytes
+Size of memory region for one matrix: 62980096 bytes
+Using alpha = 0.57599 and beta = 0.872412
+...
+Host time: 0.292953 sec.
+Host perf: 426.635 GFLOPS
+Host time: 0.293334 sec.
+Host perf: 426.081 GFLOPS
+...
+```
+
+!!! hint
+    The GNU compiler is used to compile the OpenCL codes for the Intel MIC. You do not need to load the Intel compiler module.
+
+## MPI
+
+### Environment Setup and Compilation
+
+To achieve the best MPI performance, always use the following setup for Intel MPI on Xeon Phi accelerated nodes:
+
+```console
+$ export I_MPI_FABRICS=shm:dapl
+$ export I_MPI_DAPL_PROVIDER_LIST=ofa-v2-mlx4_0-1u,ofa-v2-scif0,ofa-v2-mcm-1
+```
+
+This ensures that MPI inside a node uses SHMEM communication, that IB SCIF is used between the host and the Phi, and that a CCL-Direct proxy is used between different nodes or between Phis on different nodes.
+
+!!! note
+    Other fabrics such as tcp or ofa may be used (even combined with shm), but there is a severe loss of performance (by an order of magnitude); see the sketch after this note for a tcp-based fallback.
+    Usage of a single DAPL provider (e.g. I_MPI_DAPL_PROVIDER=ofa-v2-mlx4_0-1u) will cause failure of Host<->Phi and/or Phi<->Phi communication.
+    Usage of I_MPI_DAPL_PROVIDER_LIST on a non-accelerated node will cause failure of any MPI communication, since those nodes have no SCIF device and no CCL-Direct proxy is running.
+
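+For reference, a tcp-based setup (with the performance penalty noted above) might look as follows. This is only a sketch; the netmask is an illustrative value and has to match the actual cluster network:
+
+```console
+$ export I_MPI_MIC=1
+$ export I_MPI_FABRICS_LIST=tcp
+$ export I_MPI_FABRICS=shm:tcp
+$ export I_MPI_TCP_NETMASK=10.1.0.0/16
+```
+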
+Again, MPI code for the Intel Xeon Phi has to be compiled on a compute node with an accelerator and the MPSS software stack installed. To get to a compute node with an accelerator, use:
+
+```console
+$ qsub -I -q qprod -l select=1:ncpus=24:accelerator=True:naccelerators=2:accelerator_model=phi7120 -A NONE-0-0
+```
+
+The only supported implementation of the MPI standard for the Intel Xeon Phi is Intel MPI. To set up a fully functional development environment, a combination of the Intel compiler and Intel MPI has to be used. On the host, load the following module before compilation:
+
+```console
+$ module load intel
+```
+
+To compile an MPI code for host use:
+
+```console
+$ mpiicc -xhost -o mpi-test mpi-test.c
+```
+
+To compile the same code for Intel Xeon Phi architecture use:
+
+```console
+$ mpiicc -mmic -o mpi-test-mic mpi-test.c
+```
+
+Or, if you are using Fortran:
+
+```console
+$ mpiifort -mmic -o mpi-test-mic mpi-test.f90
+```
+
+A basic MPI "hello-world" example in C, which can be executed on both the host and the Xeon Phi, follows (it can be directly copied and pasted into a .c file):
+
+```cpp
+#include <stdio.h>
+#include <mpi.h>
+
+int main (int argc, char *argv[])
+{
+  int rank, size;
+
+  int len;
+  char node[MPI_MAX_PROCESSOR_NAME];
+
+  MPI_Init (&argc, &argv);      /* starts MPI */
+  MPI_Comm_rank (MPI_COMM_WORLD, &rank);        /* get current process id */
+  MPI_Comm_size (MPI_COMM_WORLD, &size);        /* get number of processes */
+
+  MPI_Get_processor_name(node,&len);
+
+  printf( "Hello world from process %d of %d on host %s n", rank, size, node );
+  MPI_Finalize();
+  return 0;
+}
+```
+
+### MPI Programming Models
+
+Intel MPI for the Xeon Phi coprocessors offers different MPI programming models:
+
+!!! note
+    **Host-only model** - all MPI ranks reside on the host. The coprocessors can be used by using offload pragmas. (Using MPI calls inside offloaded code is not supported.)
+
+    **Coprocessor-only model** - all MPI ranks reside only on the coprocessors.
+
+    **Symmetric model** - the MPI ranks reside on both the host and the coprocessor. Most general MPI case.
+
+### Host-Only Model
+
+In this case all environment variables are set by modules, so to execute the compiled MPI program on a single node, use:
+
+```console
+$ mpirun -np 4 ./mpi-test
+```
+
+The output should be similar to:
+
+```console
+Hello world from process 1 of 4 on host r38u31n1000
+Hello world from process 3 of 4 on host r38u31n1000
+Hello world from process 2 of 4 on host r38u31n1000
+Hello world from process 0 of 4 on host r38u31n1000
+```
+
+### Coprocessor-Only Model
+
+There are two ways to execute an MPI code on a single coprocessor: 1) launch the program using "**mpirun**" from the coprocessor; or 2) launch the task using "**mpiexec.hydra**" from the host.
+
+#### Execution on Coprocessor
+
+Similarly to the execution of OpenMP programs in native mode, since environment modules are not supported on the MIC, the user has to set up the paths to the Intel MPI libraries and binaries manually. A one-time setup can be done by creating a "**.profile**" file in the user's home directory. This file sets up the environment on the MIC automatically once the user accesses the accelerator through SSH.
+
+First, get the LD_LIBRARY_PATH for the currently used Intel compiler and Intel MPI:
+
+```console
+$ echo $MIC_LD_LIBRARY_PATH
+/apps/all/imkl/11.2.3.187-iimpi-7.3.5-GNU-5.1.0-2.25/mkl/lib/mic:/apps/all/imkl/11.2.3.187-iimpi-7.3.5-GNU-5.1.0-2.25/lib/mic:/apps/all/icc/2015.3.187-GNU-5.1.0-2.25/composer_xe_2015.3.187/compiler/lib/mic/
+```
+
+Use it in your ~/.profile:
+
+```console
+$ cat ~/.profile
+
+PS1='[\u@\h \W]\$ '
+export PATH=/usr/bin:/usr/sbin:/bin:/sbin
+
+#IMPI
+export PATH=/apps/all/impi/5.0.3.048-iccifort-2015.3.187-GNU-5.1.0-2.25/mic/bin/:$PATH
+
+#OpenMP (ICC, IFORT), IMKL and IMPI
+export LD_LIBRARY_PATH=/apps/all/imkl/11.2.3.187-iimpi-7.3.5-GNU-5.1.0-2.25/mkl/lib/mic:/apps/all/imkl/11.2.3.187-iimpi-7.3.5-GNU-5.1.0-2.25/lib/mic:/apps/all/icc/2015.3.187-GNU-5.1.0-2.25/composer_xe_2015.3.187/compiler/lib/mic:$LD_LIBRARY_PATH
+
+```
+
+!!! note
+    \* this file sets up the environment variables for both the MPI and OpenMP libraries.
+    \* this file sets up the paths to a particular version of the Intel MPI library and a particular version of the Intel compiler. These versions have to match the loaded modules.
+
+To access a MIC accelerator located on the node that the user is currently connected to, use:
+
+```console
+$ ssh mic0
+```
+
+or, in case you need to specify a MIC accelerator on a particular node, use:
+
+```console
+$ ssh r38u31n1000-mic0
+```
+
+To run the MPI code in parallel on multiple cores of the accelerator, use:
+
+```console
+$ mpirun -np 4 ./mpi-test-mic
+```
+
+The output should be similar to:
+
+```console
+Hello world from process 1 of 4 on host r38u31n1000-mic0
+Hello world from process 2 of 4 on host r38u31n1000-mic0
+Hello world from process 3 of 4 on host r38u31n1000-mic0
+Hello world from process 0 of 4 on host r38u31n1000-mic0
+```
+
+#### Execution on Host
+
+If the MPI program is launched from the host instead of the coprocessor, the environment variables are not set by the ".profile" file. Therefore the user has to specify the library paths on the command line when calling "mpiexec".
+
+The first step is to tell mpiexec that the MPI program should be executed on a local accelerator by setting the environment variable "I_MPI_MIC":
+
+```console
+$ export I_MPI_MIC=1
+```
+
+Now the MPI program can be executed as:
+
+```console
+$ mpiexec.hydra -genv LD_LIBRARY_PATH $MIC_LD_LIBRARY_PATH -host mic0 -n 4 ~/mpi-test-mic
+```
+
+or using mpirun:
+
+```console
+$ mpirun -genv LD_LIBRARY_PATH $MIC_LD_LIBRARY_PATH -host mic0 -n 4 ~/mpi-test-mic
+```
+
+!!! note
+    \* the full path to the binary has to be specified (here: "**~/mpi-test-mic**")
+    \* the LD_LIBRARY_PATH has to match the Intel MPI module used to compile the MPI code
+
+The output should be again similar to:
+
+```console
+Hello world from process 1 of 4 on host r38u31n1000-mic0
+Hello world from process 2 of 4 on host r38u31n1000-mic0
+Hello world from process 3 of 4 on host r38u31n1000-mic0
+Hello world from process 0 of 4 on host r38u31n1000-mic0
+```
+
+!!! hint
+    **"mpiexec.hydra"** requires a file the MIC filesystem. If the file is missing please contact the system administrators.
+
+A simple test to see if the file is present is to execute:
+
+```console
+$ ssh mic0 ls /bin/pmi_proxy
+  /bin/pmi_proxy
+```
+
+#### Execution on Host - MPI Processes Distributed Over Multiple Accelerators on Multiple Nodes
+
+To get access to multiple nodes with MIC accelerators, the user has to use PBS to allocate the resources. To start an interactive session that allocates 2 compute nodes (each with MIC accelerators), run the qsub command with the following parameters:
+
+```console
+$ qsub -I -q qprod -l select=2:ncpus=24:accelerator=True:naccelerators=2:accelerator_model=phi7120 -A NONE-0-0
+$ module load intel impi
+```
+
+This command immediately connects the user through ssh to one of the nodes. To see the other nodes that have been allocated, use:
+
+```console
+$ cat $PBS_NODEFILE
+```
+
+For example:
+
+```console
+r25u25n710.ib0.smc.salomon.it4i.cz
+r25u26n711.ib0.smc.salomon.it4i.cz
+```
+
+This output means that PBS allocated nodes r25u25n710 and r25u26n711, so the user has direct access to the "**r25u25n710-mic0**" and "**r25u26n711-mic0**" accelerators.
+
+!!! note
+    At this point the user can connect to any of the allocated nodes or any of the allocated MIC accelerators using ssh:
+    - to connect to the second node: `$ ssh r25u26n711`
+    - to connect to the accelerator on the first node from the first node: `$ ssh r25u25n710-mic0` or `$ ssh mic0`
+    - to connect to the accelerator on the second node from the first node: `$ ssh r25u26n711-mic0`
+
+At this point we expect that the correct modules are loaded and the binary is compiled. For parallel execution mpiexec.hydra is used. Again, the first step is to tell mpiexec that the MPI program can be executed on the MIC accelerators by setting the environment variable "I_MPI_MIC"; do not forget to have the correct fabric and provider defined.
+
+```console
+$ export I_MPI_MIC=1
+$ export I_MPI_FABRICS=shm:dapl
+$ export I_MPI_DAPL_PROVIDER_LIST=ofa-v2-mlx4_0-1u,ofa-v2-scif0,ofa-v2-mcm-1
+```
+
+To launch the MPI program, use:
+
+```console
+$ mpirun -genv LD_LIBRARY_PATH $MIC_LD_LIBRARY_PATH \
+ -host r25u25n710-mic0 -n 4 ~/mpi-test-mic \
+: -host r25u26n711-mic0 -n 6 ~/mpi-test-mic
+```
+
+or using mpirun:
+
+```console
+$ mpirun -genv LD_LIBRARY_PATH $MIC_LD_LIBRARY_PATH \
+ -host r25u25n710-mic0 -n 4 ~/mpi-test-mic \
+: -host r25u26n711-mic0 -n 6 ~/mpi-test-mic
+```
+
+In this case four MPI processes are executed on accelerator r25u25n710-mic0 and six processes on accelerator r25u26n711-mic0. The sample output (sorted after execution) is:
+
+```console
+Hello world from process 0 of 10 on host r25u25n710-mic0
+Hello world from process 1 of 10 on host r25u25n710-mic0
+Hello world from process 2 of 10 on host r25u25n710-mic0
+Hello world from process 3 of 10 on host r25u25n710-mic0
+Hello world from process 4 of 10 on host r25u26n711-mic0
+Hello world from process 5 of 10 on host r25u26n711-mic0
+Hello world from process 6 of 10 on host r25u26n711-mic0
+Hello world from process 7 of 10 on host r25u26n711-mic0
+Hello world from process 8 of 10 on host r25u26n711-mic0
+Hello world from process 9 of 10 on host r25u26n711-mic0
+```
+
+In the same way, the MPI program can be executed on multiple hosts:
+
+```console
+$ mpirun -genv LD_LIBRARY_PATH $MIC_LD_LIBRARY_PATH \
+ -host r25u25n710 -n 4 ~/mpi-test \
+: -host r25u26n711 -n 6 ~/mpi-test
+```
+
+### Symmetric Model
+
+In the symmetric mode, MPI programs are executed on both the host computer(s) and the MIC accelerator(s). Since the MIC has a different architecture and requires a different binary produced by the Intel compiler, two binaries have to be compiled before the MPI program is executed.
+
+In the previous section we have compiled two binary files, one for the hosts "**mpi-test**" and one for the MIC accelerators "**mpi-test-mic**". These two binaries can be executed at once using mpirun:
+
+```console
+$ mpirun \
+ -genv LD_LIBRARY_PATH $MIC_LD_LIBRARY_PATH \
+ -host r38u32n1001 -n 2 ~/mpi-test \
+: -host r38u32n1001-mic0 -n 2 ~/mpi-test-mic
+```
+
+In this example the -genv parameter sets up the environment variable required for execution. The first -host argument specifies the binary that is executed on the host (here r38u32n1001) and the second one specifies the binary that is executed on the accelerator (here r38u32n1001-mic0).
+
+The output of the program is:
+
+```console
+Hello world from process 0 of 4 on host r38u32n1001
+Hello world from process 1 of 4 on host r38u32n1001
+Hello world from process 2 of 4 on host r38u32n1001-mic0
+Hello world from process 3 of 4 on host r38u32n1001-mic0
+```
+
+The execution procedure can be simplified by using the mpirun command with a machine file as a parameter. The machine file contains a list of all nodes and accelerators that should be used to execute MPI processes.
+
+An example of a machine file that uses 2 hosts (**r38u32n1001** and **r38u33n1002**) and 2 accelerators (**r38u32n1001-mic0** and **r38u33n1002-mic0**) to run 2 MPI processes on each of them:
+
+```console
+$ cat hosts_file_mix
+r38u32n1001:2
+r38u32n1001-mic0:2
+r38u33n1002:2
+r38u33n1002-mic0:2
+```
+
+In addition, if a naming convention is used in which the host binary is named **"bin_name"** and the accelerator binary is named **"bin_name-mic"**, then by setting the environment variable **I_MPI_MIC_POSTFIX** to **"-mic"** the user does not have to specify the names of both binaries. In this case mpirun needs just the name of the host binary file (i.e. "mpi-test") and uses the suffix to derive the name of the accelerator binary (i.e. "mpi-test-mic").
+
+```console
+$ export I_MPI_MIC_POSTFIX=-mic
+```
+
+To run the MPI code using mpirun and the machine file "hosts_file_mix" use:
+
+```console
+$ mpirun \
+ -genv LD_LIBRARY_PATH $MIC_LD_LIBRARY_PATH \
+ -machinefile hosts_file_mix \
+ ~/mpi-test
+```
+
+A possible output of the MPI "hello-world" example executed on two hosts and two accelerators is:
+
+```console
+Hello world from process 0 of 8 on host r38u31n1000
+Hello world from process 1 of 8 on host r38u31n1000
+Hello world from process 2 of 8 on host r38u31n1000-mic0
+Hello world from process 3 of 8 on host r38u31n1000-mic0
+Hello world from process 4 of 8 on host r38u32n1001
+Hello world from process 5 of 8 on host r38u32n1001
+Hello world from process 6 of 8 on host r38u32n1001-mic0
+Hello world from process 7 of 8 on host r38u32n1001-mic0
+```
+
+!!! note
+    At this point the MPI communication between MIC accelerators on different nodes uses 1Gb Ethernet only.
+
+### Using Automatically Generated Node-Files
+
+A set of node-files, which can be used instead of manually creating a new one every time, is generated for user convenience. Six node-files are generated:
+
+!!! note
+    **Node-files:**
+
+     - /lscratch/${PBS_JOBID}/nodefile-cn Hosts only node-file
+     - /lscratch/${PBS_JOBID}/nodefile-mic MICs only node-file
+     - /lscratch/${PBS_JOBID}/nodefile-mix Hosts and MICs node-file
+     - /lscratch/${PBS_JOBID}/nodefile-cn-sn Hosts only node-file, using short names
+     - /lscratch/${PBS_JOBID}/nodefile-mic-sn MICs only node-file, using short names
+     - /lscratch/${PBS_JOBID}/nodefile-mix-sn Hosts and MICs node-file, using short names
+
+Each host or accelerator is listed only once per file. The user has to specify how many processes should be executed per node using the `-n` parameter of the mpirun command; a possible invocation is sketched below.
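+
+For illustration, a run over the mixed node-file might look like the following (a sketch only; the binary names follow the examples above and the process count given with `-n` is just an illustration):
+
+```console
+$ export I_MPI_MIC_POSTFIX=-mic
+$ mpirun \
+ -genv LD_LIBRARY_PATH $MIC_LD_LIBRARY_PATH \
+ -machinefile /lscratch/${PBS_JOBID}/nodefile-mix \
+ -n 8 ~/mpi-test
+```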
+
+## Optimization
+
+For more details about optimization techniques, please read the Intel document [Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors](http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization).
diff --git a/docs.it4i/software/modules/lmod.md b/docs.it4i/software/modules/lmod.md
new file mode 100644
index 0000000000000000000000000000000000000000..4aad91ce85aff13e59594cc1efaeee4ce6465d78
--- /dev/null
+++ b/docs.it4i/software/modules/lmod.md
@@ -0,0 +1,319 @@
+# Lmod Environment
+
+Lmod is a modules tool, a modern alternative to the outdated and no longer actively maintained Tcl-based environment modules tool.
+
+Detailed documentation on Lmod is available [here](http://lmod.readthedocs.io).
+
+## Benefits
+
+* significantly more responsive module commands, in particular module avail (ml av)
+* easier to use interface
+* module files can be written in either Tcl or Lua syntax (and both types of modules can be mixed together)
+
+## Introduction
+
+Below you will find more details and examples.
+
+| command                  | equivalent/explanation                                           |
+| ------------------------ | ---------------------------------------------------------------- |
+| ml                       | module list                                                      |
+| ml GCC/6.2.0-2.27        | module load GCC/6.2.0-2.27                                       |
+| ml -GCC/6.2.0-2.27       | module unload GCC/6.2.0-2.27                                     |
+| ml purge                 | module unload all modules                                        |
+| ml av                    | module avail                                                     |
+| ml show GCC/6.2.0-2.27   | module show GCC/6.2.0-2.27                                       |
+| ml spider gcc            | searches (case-insensitive) for gcc in all available modules     |
+| ml spider GCC/6.2.0-2.27 | show all information about the module GCC/6.2.0-2.27             |
+| ml save mycollection     | stores the currently loaded modules to a collection              |
+| ml restore mycollection  | restores a previously stored collection of modules               |
+
+## Listing Loaded Modules
+
+To get an overview of the currently loaded modules, use module list or ml (without specifying extra arguments).
+
+```console
+$ ml
+Currently Loaded Modules:
+   1) EasyBuild/3.0.0 (S)  2) lmod/7.2.2
+  Where:
+   S:  Module is Sticky, requires --force to unload or purge
+```
+
+!!! tip
+    For more details on sticky modules, see the section on [ml purge](#resetting-by-unloading-all-modules).
+
+## Searching for Available Modules
+
+To get an overview of all available modules, you can use ml avail or simply ml av:
+
+```console
+$ ml av
+---------------------------------------- /apps/modules/compiler ----------------------------------------------
+   GCC/5.2.0    GCCcore/6.2.0 (D)    icc/2013.5.192     ifort/2013.5.192    LLVM/3.9.0-intel-2017.00 (D)
+                                 ...                                  ...
+
+---------------------------------------- /apps/modules/devel -------------------------------------------------
+   Autoconf/2.69-foss-2015g    CMake/3.0.0-intel-2016.01   M4/1.4.17-intel-2016.01   pkg-config/0.27.1-foss-2015g
+   Autoconf/2.69-foss-2016a    CMake/3.3.1-foss-2015g      M4/1.4.17-intel-2017.00   pkg-config/0.27.1-intel-2015b
+                                 ...                                  ...
+```
+
+In the current module naming scheme, each module name consists of two parts:
+
+* the part before the first /, corresponding to the software name
+* the remainder, corresponding to the software version, the compiler toolchain that was used to install the software, and a possible version suffix
+
+!!! tip
+    The (D) indicates that this particular version of the module is the default, but we strongly recommend not to rely on this, as the default can change at any point. Usually, the default will point to the latest version available.
+
+## Searching for Modules
+
+If you just provide a software name, for example gcc, spider prints an overview of all available modules for GCC.
+
+```console
+$ ml spider gcc
+---------------------------------------------------------------------------------
+  GCC:
+---------------------------------------------------------------------------------
+    Description:
+      The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Java, and Ada, as well as libraries for these languages (libstdc++, libgcj,...). - Homepage: http://gcc.gnu.org/
+
+     Versions:
+        GCC/4.4.7-system
+        GCC/4.7.4
+        GCC/4.8.3
+        GCC/4.9.2-binutils-2.25
+        GCC/4.9.2
+        GCC/4.9.3-binutils-2.25
+        GCC/4.9.3
+        GCC/4.9.3-2.25
+        GCC/5.1.0-binutils-2.25
+        GCC/5.2.0
+        GCC/5.3.0-binutils-2.25
+        GCC/5.3.0-2.25
+        GCC/5.3.0-2.26
+        GCC/5.3.1-snapshot-20160419-2.25
+        GCC/5.4.0-2.26
+        GCC/6.2.0-2.27
+
+     Other possible modules matches:
+        GCCcore
+---------------------------------------------------------------------------------
+  To find other possible module matches do:
+      module -r spider '.*GCC.*'
+---------------------------------------------------------------------------------
+  For detailed information about a specific "GCC" module (including how to load the modules) use the module's full name.
+  For example:
+     $ module spider GCC/6.2.0-2.27
+---------------------------------------------------------------------------------
+```
+
+!!! tip
+    Spider is case-insensitive.
+
+If you use spider on a full module name like GCC/6.2.0-2.27, it will tell you on which cluster(s) that module is available:
+
+```console
+$ module spider GCC/6.2.0-2.27
+--------------------------------------------------------------------------------------------------------------
+  GCC: GCC/6.2.0-2.27
+--------------------------------------------------------------------------------------------------------------
+    Description:
+      The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Java, and Ada, as well as libraries for these languages (libstdc++, libgcj,...). - Homepage: http://gcc.gnu.org/
+
+    This module can be loaded directly: module load GCC/6.2.0-2.27
+
+    Help:
+      The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Java, and Ada,
+       as well as libraries for these languages (libstdc++, libgcj,...). - Homepage: http://gcc.gnu.org/
+```
+
+This tells you what the module contains and a URL to the homepage of the software.
+
+## Available Modules for a Particular Software Package
+
+To check which modules are available for a particular software package, you can provide the software name to ml av.
+For example, to check which versions of git are available:
+
+```console
+$ ml av git
+
+-------------------------------------- /apps/modules/tools ----------------------------------------
+   git/2.8.0-GNU-4.9.3-2.25    git/2.8.0-intel-2017.00    git/2.9.0    git/2.9.2    git/2.11.0 (D)
+
+  Where:
+   D:  Default Module
+
+Use "module spider" to find all possible modules.
+Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".
+```
+
+!!! tip
+    The specified software name is treated case-insensitively.
+
+Lmod does a partial match on the module name, so sometimes you need to use / to indicate the end of the software name you are interested in:
+
+```console
+$ ml av GCC/
+
+------------------------------------------ /apps/modules/compiler -------------------------------------------
+GCC/4.4.7-system    GCC/4.8.3   GCC/4.9.2   GCC/4.9.3   GCC/5.1.0-binutils-2.25 GCC/5.3.0-binutils-2.25   GCC/5.3.0-2.26   GCC/5.4.0-2.26   GCC/4.7.4   GCC/4.9.2-binutils-2.25   GCC/4.9.3-binutils-2.25   GCC/4.9.3-2.25   GCC/5.2.0   GCC/5.3.0-2.25 GCC/6.2.0-2.27 (D)
+
+  Where:
+   D:  Default Module
+
+Use "module spider" to find all possible modules.
+Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".
+```
+
+## Inspecting a Module
+
+To see how a module would change the environment, use ml show:
+
+```console
+$ ml show Python/3.5.2
+
+help([[Python is a programming language that lets you work more quickly and integrate your systems more effectively. - Homepage: http://python.org/]])
+whatis("Description: Python is a programming language that lets you work more quickly and integrate your systems more effectively. - Homepage: http://python.org/")
+conflict("Python")
+load("bzip2/1.0.6")
+load("zlib/1.2.8")
+load("libreadline/6.3")
+load("ncurses/5.9")
+load("SQLite/3.8.8.1")
+load("Tk/8.6.3")
+load("GMP/6.0.0a")
+load("XZ/5.2.2")
+prepend_path("CPATH","/apps/all/Python/3.5.2/include")
+prepend_path("LD_LIBRARY_PATH","/apps/all/Python/3.5.2/lib")
+prepend_path("LIBRARY_PATH","/apps/all/Python/3.5.2/lib")
+prepend_path("MANPATH","/apps/all/Python/3.5.2/share/man")
+prepend_path("PATH","/apps/all/Python/3.5.2/bin")
+prepend_path("PKG_CONFIG_PATH","/apps/all/Python/3.5.2/lib/pkgconfig")
+setenv("EBROOTPYTHON","/apps/all/Python/3.5.2")
+setenv("EBVERSIONPYTHON","3.5.2")
+setenv("EBDEVELPYTHON","/apps/all/Python/3.5.2/easybuild/Python-3.5.2-easybuild-devel")
+setenv("EBEXTSLISTPYTHON","setuptools-20.1.1,pip-8.0.2,nose-1.3.7")
+```
+
+!!! tip
+    Note that both the direct changes to the environment as well as other modules that will be loaded are shown.
+
+If you're not sure what all of this means: don't worry, you don't have to know; just load the module and try using the software.
+
+## Loading Modules
+
+To effectively apply the changes to the environment that are specified by a module, use ml and specify the name of the module.
+For example, to set up your environment to use intel:
+
+```console
+$ ml intel/2017.00
+$ ml
+Currently Loaded Modules:
+  1) GCCcore/5.4.0
+  2) binutils/2.26-GCCcore-5.4.0                        (H)
+  3) icc/2017.0.098-GCC-5.4.0-2.26
+  4) ifort/2017.0.098-GCC-5.4.0-2.26
+  5) iccifort/2017.0.098-GCC-5.4.0-2.26
+  6) impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26
+  7) iimpi/2017.00-GCC-5.4.0-2.26
+  8) imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26
+  9) intel/2017.00
+
+  Where:
+   H:  Hidden Module
+```
+
+!!! tip
+    Note that even though we only loaded a single module, the output of ml shows that a whole bunch of modules were loaded, which are required dependencies for intel/2017.00.
+
+## Conflicting Modules
+
+!!! warning
+    It is important to note that **only modules that are compatible with each other can be loaded together**. In particular, modules must be installed either with the same toolchain as the modules that are already loaded, or with a compatible (sub)toolchain.
+
+For example, once you have loaded one or more modules that were installed with the intel/2017.00 toolchain, all other modules that you load should have been installed with the same toolchain.
+
+In addition, only **one single version** of each software package can be loaded at a particular time. For example, once you have the Python/3.5.2-intel-2017.00 module loaded, you can not load a different version of Python in the same session/job script, neither directly, nor indirectly as a dependency of another module you want to load.
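+
+A minimal illustration of this rule (the module names are taken from the examples on this page; the second load is expected to be refused by Lmod, because a Python module is already loaded):
+
+```console
+$ ml intel/2017.00 Python/3.5.2-intel-2017.00
+$ ml Python/3.5.2        # a different build of Python - conflicts with the version already loaded
+```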
+
+## Unloading Modules
+
+To revert the changes to the environment that were made by a particular module, you can use ml -<modname>.
+For example:
+
+```console
+$ ml
+Currently Loaded Modules:
+  1) EasyBuild/3.0.0 (S)   2) lmod/7.2.2
+$ which gcc
+/usr/bin/gcc
+$ ml GCC/
+$ ml
+Currently Loaded Modules:
+  1) EasyBuild/3.0.0 (S)   2) lmod/7.2.2   3) GCCcore/6.2.0   4) binutils/2.27-GCCcore-6.2.0 (H)   5) GCC/6.2.0-2.27
+$ which gcc
+/apps/all/GCCcore/6.2.0/bin/gcc
+$ ml -GCC
+$ ml
+Currently Loaded Modules:
+  1) EasyBuild/3.0.0 (S)   2) lmod/7.2.2   3) GCCcore/6.2.0   4) binutils/2.27-GCCcore-6.2.0 (H)
+$ which gcc
+/usr/bin/gcc
+```
+
+## Resetting by Unloading All Modules
+
+To reset your environment back to a clean state, you can use ml purge or ml purge --force:
+
+```console
+$ ml
+Currently Loaded Modules:
+  1) EasyBuild/3.0.0 (S)   2) lmod/7.2.2   3) GCCcore/6.2.0   4) binutils/2.27-GCCcore-6.2.0 (H)
+$ ml purge
+The following modules were not unloaded:
+   (Use "module --force purge" to unload all):
+ 1) EasyBuild/3.0.0
+$ ml
+Currently Loaded Modules:
+ 1) EasyBuild/3.0.0 (S)
+$ ml purge --force
+$ ml
+No modules loaded
+```
+
+As such, you should not (re)load the cluster module anymore after running ml purge.
+
+## Module Collections
+
+If you have a set of modules that you need to load often, you can save these in a collection (only works with Lmod).
+
+First, load all the modules you need, for example:
+
+```console
+$ ml intel/2017.00 Python/3.5.2-intel-2017.00
+```
+
+Now store them in a collection using ml save:
+
+```console
+$ ml save my-collection
+```
+
+Later, for example in a job script, you can reload all these modules with ml restore:
+
+```console
+$ ml restore my-collection
+```
+
+With ml savelist you can get a list of all saved collections:
+
+```console
+$ ml savelist
+Named collection list:
+  1) my-collection
+  2) my-test-collection
+```
+
+To inspect a collection, use ml describe.
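+
+For example, to inspect the collection saved above:
+
+```console
+$ ml describe my-collection
+```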
+
+To remove a module collection, remove the corresponding entry in $HOME/.lmod.d.
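+
+For example, to delete the collection saved above (assuming the default location and that the file in $HOME/.lmod.d carries the collection name):
+
+```console
+$ rm $HOME/.lmod.d/my-collection
+```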
diff --git a/docs.it4i/software/viz/gpi2.md b/docs.it4i/software/viz/gpi2.md
new file mode 100644
index 0000000000000000000000000000000000000000..1de40bd8a592cf0d450a8744f704a767004b2b6a
--- /dev/null
+++ b/docs.it4i/software/viz/gpi2.md
@@ -0,0 +1,168 @@
+# GPI-2
+
+## Introduction
+
+Programming Next Generation Supercomputers: GPI-2 is an API library for asynchronous interprocess, cross-node communication. It provides a flexible, scalable and fault tolerant interface for parallel applications.
+
+The GPI-2 library ([www.gpi-site.com/gpi2/](http://www.gpi-site.com/gpi2/)) implements the GASPI specification (Global Address Space Programming Interface, [www.gaspi.de](http://www.gaspi.de/en/project.html)). GASPI is a Partitioned Global Address Space (PGAS) API. It aims at scalable, flexible and failure tolerant computing in massively parallel environments.
+
+## Modules
+
+GPI-2, version 1.0.2, is available on Anselm via the module gpi2:
+
+```console
+$ ml gpi2
+
+$ ml av GPI-2   # Salomon
+```
+
+The module sets up the environment variables required for linking and running GPI-2 enabled applications. This particular command loads the default module, which is gpi2/1.0.2.
+
+## Linking
+
+!!! note
+    Link with -lGPI2 -libverbs
+
+Load the gpi2 module. Link your code against GPI-2 using the **-lGPI2** and **-libverbs** switches. GPI-2 requires the OFED InfiniBand communication library ibverbs.
+
+### Compiling and Linking With Intel Compilers
+
+```console
+$ ml intel
+$ ml gpi2
+$ icc myprog.c -o myprog.x -Wl,-rpath=$LIBRARY_PATH -lGPI2 -libverbs
+```
+
+### Compiling and Linking With GNU Compilers
+
+```console
+$ ml gcc
+$ ml gpi2
+$ gcc myprog.c -o myprog.x -Wl,-rpath=$LIBRARY_PATH -lGPI2 -libverbs
+```
+
+## Running the GPI-2 Codes
+
+!!! note
+    gaspi_run starts the GPI-2 application
+
+The gaspi_run utility is used to start and run GPI-2 applications:
+
+```console
+$ gaspi_run -m machinefile ./myprog.x
+```
+
+A machine file (**machinefile**) with the hostnames of the nodes where the application will run must be provided. The machinefile lists all nodes on which to run, one entry per node per process. This file may be created by hand or obtained from the standard $PBS_NODEFILE:
+
+```console
+$ cut -f1 -d"." $PBS_NODEFILE > machinefile
+```
+
+machinefile:
+
+```console
+    cn79
+    cn80
+```
+
+This machinefile will run 2 GPI-2 processes, one on node cn79 and the other on node cn80.
+
+machinefile:
+
+```console
+    cn79
+    cn79
+    cn80
+    cn80
+```
+
+This machinefile will run 4 GPI-2 processes, 2 on node cn79 and 2 on node cn80.
+
+!!! note
+    Use the **mpiprocs** parameter to control how many GPI-2 processes will run per node.
+
+Example:
+
+```console
+$ qsub -A OPEN-0-0 -q qexp -l select=2:ncpus=16:mpiprocs=16 -I
+```
+
+This example will produce $PBS_NODEFILE with 16 entries per node.
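+
+To turn this nodefile into a GPI-2 machinefile with one entry per process and verify the entry count (32 for the allocation above), you can use:
+
+```console
+$ cut -f1 -d"." $PBS_NODEFILE > machinefile
+$ wc -l < machinefile
+32
+```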
+
+### Gaspi_logger
+
+!!! note
+    gaspi_logger views the output from GPI-2 application ranks
+
+The gaspi_logger utility is used to view the output from all nodes except the master node (rank 0). The gaspi_logger is started, in another session, on the master node - the node where gaspi_run is executed. The output of the application produced with gaspi_printf() will be redirected to the gaspi_logger. Other I/O routines (e.g. printf) will not.
+
+## Example
+
+Following is an example GPI-2 enabled code:
+
+```cpp
+#include <GASPI.h>
+#include <stdlib.h>
+
+void success_or_exit ( const char* file, const int line, const int ec)
+{
+  if (ec != GASPI_SUCCESS)
+    {
+      gaspi_printf ("Assertion failed in %s[%i]:%d\n", file, line, ec);
+      exit (1);
+    }
+}
+
+#define ASSERT(ec) success_or_exit (__FILE__, __LINE__, ec);
+
+int main(int argc, char *argv[])
+{
+  gaspi_rank_t rank, num;
+  gaspi_return_t ret;
+
+  /* Initialize GPI-2 */
+  ASSERT( gaspi_proc_init(GASPI_BLOCK) );
+
+  /* Get ranks information */
+  ASSERT( gaspi_proc_rank(&rank) );
+  ASSERT( gaspi_proc_num(&num) );
+
+  gaspi_printf("Hello from rank %d of %d\n",
+           rank, num);
+
+  /* Terminate */
+  ASSERT( gaspi_proc_term(GASPI_BLOCK) );
+
+  return 0;
+}
+```
+
+Load modules and compile:
+
+```console
+$ ml gcc gpi2
+$ gcc helloworld_gpi.c -o helloworld_gpi.x -Wl,-rpath=$LIBRARY_PATH -lGPI2 -libverbs
+```
+
+Submit the job and run the GPI-2 application:
+
+```console
+$ qsub -q qexp -l select=2:ncpus=1:mpiprocs=1,place=scatter,walltime=00:05:00 -I
+    qsub: waiting for job 171247.dm2 to start
+    qsub: job 171247.dm2 ready
+cn79 $ ml gpi2
+cn79 $ cut -f1 -d"." $PBS_NODEFILE > machinefile
+cn79 $ gaspi_run -m machinefile ./helloworld_gpi.x
+    Hello from rank 0 of 2
+```
+
+At the same time, in another session, you may start the gaspi_logger:
+
+```console
+$ ssh cn79
+cn79 $ gaspi_logger
+    GASPI Logger (v1.1)
+    [cn80:0] Hello from rank 1 of 2
+```
+
+In this example, we compile the helloworld_gpi.c code using the **GNU compiler** (gcc) and link it to the GPI-2 and ibverbs libraries. The library search path is compiled in. For execution, we use the qexp queue, 2 nodes, 1 core each. The GPI-2 module must be loaded on the master compute node (cn79 in this example); gaspi_logger is used from a different session to view the output of the second process.
diff --git a/docs.it4i/software/viz/openfoam.md b/docs.it4i/software/viz/openfoam.md
new file mode 100644
index 0000000000000000000000000000000000000000..27aefea264ca2414f8abde9cb734896ac1255faa
--- /dev/null
+++ b/docs.it4i/software/viz/openfoam.md
@@ -0,0 +1,228 @@
+# OpenFOAM
+
+A Free, Open Source CFD Software Package
+
+## Introduction
+
+OpenFOAM is a free, open source CFD software package developed by [**OpenCFD Ltd**](http://www.openfoam.com/about) at [**ESI Group**](http://www.esi-group.com/) and distributed by the [**OpenFOAM Foundation**](http://www.openfoam.org/). It has a large user base across most areas of engineering and science, from both commercial and academic organisations.
+
+Homepage: <http://www.openfoam.com/>
+
+### Installed Version
+
+Currently, several versions compiled by the GCC/ICC compilers in single/double precision with several versions of OpenMPI are available on Anselm.
+
+For example, the syntax of an available OpenFOAM module is:
+
+\<openfoam\/2.2.1-icc-openmpi1.6.5-DP\>
+
+This means OpenFOAM version 2.2.1 compiled by the ICC compiler with OpenMPI 1.6.5 in double precision.
+
+The naming convention of the installed versions is as follows:
+
+openfoam\<VERSION\>-\<COMPILER\>\<openmpiVERSION\>-\<PRECISION\>
+
+* \<VERSION\> - version of OpenFOAM
+* \<COMPILER\> - compiler used
+* \<openmpiVERSION\> - version of OpenMPI/Intel MPI used
+* \<PRECISION\> - DP/SP – double/single precision
+
+### Available OpenFOAM Modules
+
+To check the available modules, use:
+
+```console
+$ ml av
+```
+
+In /opt/modules/modulefiles/engineering you can see the installed engineering software:
+
+```console
+    ------------------------------------ /opt/modules/modulefiles/engineering -------------------------------------------------------------
+    ansys/14.5.x               matlab/R2013a-COM                                openfoam/2.2.1-icc-impi4.1.1.036-DP
+    comsol/43b-COM             matlab/R2013a-EDU                                openfoam/2.2.1-icc-openmpi1.6.5-DP
+    comsol/43b-EDU             openfoam/2.2.1-gcc481-openmpi1.6.5-DP            paraview/4.0.1-gcc481-bullxmpi1.2.4.1-osmesa10.0
+    lsdyna/7.x.x               openfoam/2.2.1-gcc481-openmpi1.6.5-SP
+```
+
+For information on how to use modules, [see here](../anselm/environment-and-modules/).
+
+## Getting Started
+
+To create the OpenFOAM environment on Anselm, run the following commands:
+
+```console
+$ ml openfoam/2.2.1-icc-openmpi1.6.5-DP
+$ source $FOAM_BASHRC
+```
+
+!!! note
+    Load the correct module matching your requirements (compiler: GCC/ICC, precision: DP/SP).
+
+Create a project directory within the $HOME/OpenFOAM directory named \<USER\>-\<OFversion\> and create a directory named run within it, e.g. by typing:
+
+```console
+$ mkdir -p $FOAM_RUN
+```
+
+The project directory is now accessible by typing:
+
+```console
+$ cd /home/<USER>/OpenFOAM/<USER>-<OFversion>/run
+```
+
+\<OFversion\> - for example \<2.2.1\>
+
+or
+
+```console
+$ cd $FOAM_RUN
+```
+
+Copy the tutorial examples directory in the OpenFOAM distribution to the run directory:
+
+```console
+$ cp -r $FOAM_TUTORIALS $FOAM_RUN
+```
+
+Now you can run the first case, for example the incompressible laminar flow in a cavity.
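+
+For a quick interactive test of the cavity case (a minimal sketch; use the batch scripts below for real runs), you can execute the solver directly in the loaded OpenFOAM environment:
+
+```console
+$ cd $FOAM_RUN/tutorials/incompressible/icoFoam/cavity
+$ blockMesh
+$ icoFoam
+```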
+
+## Running Serial Applications
+
+Create a Bash script test.sh:
+
+```bash
+#!/bin/bash
+module load openfoam/2.2.1-icc-openmpi1.6.5-DP
+source $FOAM_BASHRC
+
+# source to run functions
+. $WM_PROJECT_DIR/bin/tools/RunFunctions
+
+cd $FOAM_RUN/tutorials/incompressible/icoFoam/cavity
+
+runApplication blockMesh
+runApplication icoFoam
+```
+
+Job submission (example for Anselm):
+
+```console
+$ qsub -A OPEN-0-0 -q qprod -l select=1:ncpus=16,walltime=03:00:00 test.sh
+```
+
+For information about job submission, [see here](../anselm/job-submission-and-execution/).
+
+## Running Applications in Parallel
+
+Run the second case, for example the external incompressible turbulent flow case motorBike.
+
+First, we must run the serial applications blockMesh and decomposePar to prepare the parallel computation.
+
+!!! note
+    Create a Bash script test.sh:
+
+```bash
+#!/bin/bash
+module load openfoam/2.2.1-icc-openmpi1.6.5-DP
+source $FOAM_BASHRC
+
+# source to run functions
+. $WM_PROJECT_DIR/bin/tools/RunFunctions
+
+cd $FOAM_RUN/tutorials/incompressible/simpleFoam/motorBike
+
+runApplication blockMesh
+runApplication decomposePar
+```
+
+Job submission:
+
+```console
+$ qsub -A OPEN-0-0 -q qprod -l select=1:ncpus=16,walltime=03:00:00 test.sh
+```
+
+This job creates a simple block mesh and the domain decomposition. Check your decomposition and submit the parallel computation:
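+
+To check the decomposition, you can list the processor directories created by decomposePar; their number should equal the number of subdomains (the directory listing below is illustrative):
+
+```console
+$ cd $FOAM_RUN/tutorials/incompressible/simpleFoam/motorBike
+$ ls -d processor*
+processor0  processor1  processor2  processor3
+```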
+
+!!! note
+    Create a PBS script testParallel.pbs:
+
+```bash
+#!/bin/bash
+#PBS -N motorBike
+#PBS -l select=2:ncpus=16
+#PBS -l walltime=01:00:00
+#PBS -q qprod
+#PBS -A OPEN-0-0
+
+module load openfoam/2.2.1-icc-openmpi1.6.5-DP
+source $FOAM_BASHRC
+
+cd $FOAM_RUN/tutorials/incompressible/simpleFoam/motorBike
+
+nproc=32   # number of subdomains
+
+mpirun -hostfile ${PBS_NODEFILE} -np $nproc snappyHexMesh -overwrite -parallel | tee snappyHexMesh.log
+
+mpirun -hostfile ${PBS_NODEFILE} -np $nproc potentialFoam -noFunctionObjects -writep -parallel | tee potentialFoam.log
+
+mpirun -hostfile ${PBS_NODEFILE} -np $nproc simpleFoam -parallel | tee simpleFoam.log
+```
+
+nproc – number of subdomains
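+
+The value of nproc must match the numberOfSubdomains entry in system/decomposeParDict used by decomposePar in the previous step; a quick way to check it (the value shown is illustrative):
+
+```console
+$ grep numberOfSubdomains $FOAM_RUN/tutorials/incompressible/simpleFoam/motorBike/system/decomposeParDict
+numberOfSubdomains 32;
+```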
+
+Job submission:
+
+```console
+$ qsub testParallel.pbs
+```
+
+## Compile Your Own Solver
+
+Initialize the OpenFOAM environment before compiling your solver:
+
+```console
+$ ml openfoam/2.2.1-icc-openmpi1.6.5-DP
+$ source $FOAM_BASHRC
+$ cd $FOAM_RUN/
+```
+
+Create the applications/solvers directory in your user directory:
+
+```console
+$ mkdir -p applications/solvers
+$ cd applications/solvers
+```
+
+Copy the icoFoam solver's source files:
+
+```console
+$ cp -r $FOAM_SOLVERS/incompressible/icoFoam/ My_icoFoam
+$ cd My_icoFoam
+```
+
+Rename icoFoam.C to My_icoFoam.C:
+
+```console
+$ mv icoFoam.C My_icoFoam.C
+```
+
+Edit the _files_ file in the _Make_ directory:
+
+```bash
+    icoFoam.C
+    EXE = $(FOAM_APPBIN)/icoFoam
+```
+
+and change to:
+
+```bash
+    My_icoFoam.C
+    EXE = $(FOAM_USER_APPBIN)/My_icoFoam
+```
+
+In the My_icoFoam directory, run the compilation command:
+
+```console
+$ wmake
+```
diff --git a/docs.it4i/software/viz/paraview.md b/docs.it4i/software/viz/paraview.md
new file mode 100644
index 0000000000000000000000000000000000000000..7e2bae9a95bc33c6f83756188a5c1c54e4037892
--- /dev/null
+++ b/docs.it4i/software/viz/paraview.md
@@ -0,0 +1,88 @@
+# ParaView
+
+Open-Source, Multi-Platform Data Analysis and Visualization Application
+
+## Introduction
+
+**ParaView** is an open-source, multi-platform data analysis and visualization application. ParaView users can quickly build visualizations to analyze their data using qualitative and quantitative techniques. The data exploration can be done interactively in 3D or programmatically using ParaView's batch processing capabilities.
+
+ParaView was developed to analyze extremely large datasets using distributed memory computing resources. It can be run on supercomputers to analyze datasets of exascale size as well as on laptops for smaller data.
+
+Homepage: <http://www.paraview.org/>
+
+## Installed Version
+
+Currently, version 5.1.2 compiled with intel/2017a against the Intel MPI library and OSMesa 12.0.2 is installed on the clusters.
+
+## Usage
+
+On the clusters, ParaView is used in the client-server mode. A parallel ParaView server is launched on compute nodes by the user, and a client is launched on your desktop PC to control and view the visualization. Download the ParaView client application for your OS here: <http://paraview.org/paraview/resources/software.php>.
+
+!!! warning
+    Your version must match the version number installed on the cluster.
+
+### Launching Server
+
+To launch the server, you must first allocate compute nodes, for example:
+
+```console
+$ qsub -I -q qprod -A OPEN-0-0 -l select=2
+```
+
+to launch an interactive session on 2 nodes. Refer to [Resource Allocation and Job Execution](../salomon/job-submission-and-execution/) for details.
+
+After the interactive session is opened, load the ParaView module (the following examples are for Salomon; the Anselm variants are shown where they differ):
+
+```console
+$ ml ParaView/5.1.2-intel-2017a-mpi
+```
+
+Now launch the parallel server, with the number of processes equal to the number of nodes times 24 (16 on Anselm):
+
+```console
+$ mpirun -np 48 pvserver --use-offscreen-rendering
+    Waiting for client...
+    Connection URL: cs://r37u29n1006:11111
+    Accepting connection(s): r37u29n1006:11111i
+
+Anselm:
+$ mpirun -np 32 pvserver --use-offscreen-rendering
+    Waiting for client...
+    Connection URL: cs://cn77:11111
+    Accepting connection(s): cn77:11111
+```
+
+Note that the server is listening on compute node r37u29n1006 in this case; we will use this information later.
+
+### Client Connection
+
+Because a direct connection to compute nodes is not allowed on Salomon, you must establish an SSH tunnel to connect to the server. Choose a port number on your PC to be forwarded to the ParaView server, for example 12345. If your PC is running Linux, use this command to establish the SSH tunnel:
+
+```console
+Salomon: $ ssh -TN -L 12345:r37u29n1006:11111 username@salomon.it4i.cz
+Anselm: $ ssh -TN -L 12345:cn77:11111 username@anselm.it4i.cz
+```
+
+Replace username with your login and r37u29n1006 (cn77) with the name of the compute node your ParaView server is running on (see the previous step).
+
+If you use PuTTY on Windows, load Salomon connection configuration, then go to *Connection* -> *SSH* -> *Tunnels* to set up the port forwarding.
+
+Fill in the Source port and Destination fields. **Do not forget to click the Add button.**
+
+![](../../img/paraview_ssh_tunnel_salomon.png "SSH Tunnel in PuTTY")
+
+Now launch the ParaView client installed on your desktop PC. Select *File* -> *Connect...* and fill in the following:
+
+![](../../img/paraview_connect_salomon.png "ParaView - Connect to server")
+
+The configuration is now saved for later use. Now click Connect to connect to the ParaView server. In the terminal where your interactive session with the ParaView server is running, you should see:
+
+```console
+Client connected.
+```
+
+You can now use Parallel ParaView.
+
+### Close Server
+
+Remember to close the interactive session after you finish working with the ParaView server, as it will remain launched even after your client disconnects and will continue to consume resources.
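+
+For example, exit the interactive shell and then verify with qstat that the job has finished (the prompt shown is illustrative):
+
+```console
+r37u29n1006 $ exit
+$ qstat -u $USER
+```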