Vampir
======
Vampir is a commercial trace analysis and visualisation tool. It can work with traces in the OTF and OTF2 formats. It does not have the functionality to collect traces itself; you need to use a trace collection tool (such as [Score-P](../../../salomon/software/debuggers/score-p/)) first to collect the traces.
![](../../../img/Snmekobrazovky20160708v12.33.35.png)
Installed versions
------------------
Version 8.5.0 is currently installed as the module Vampir/8.5.0:
```bash
$ module load Vampir/8.5.0
$ vampir &
```
User manual
-----------
You can find the detailed user manual in PDF format at $EBROOTVAMPIR/doc/vampir-manual.pdf.
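For example, to view it directly from a login session (assuming an X display and a common PDF viewer such as evince; the viewer is an assumption, not a documented Anselm tool):

```bash
$ module load Vampir/8.5.0
$ evince $EBROOTVAMPIR/doc/vampir-manual.pdf &
```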
References
----------
[1]. <https://www.vampir.eu>
GPI-2
=====
## A library that implements the GASPI specification
Introduction
------------
Programming Next Generation Supercomputers: GPI-2 is an API library for asynchronous interprocess, cross-node communication. It provides a flexible, scalable and fault tolerant interface for parallel applications.
The GPI-2 library ([www.gpi-site.com/gpi2/](http://www.gpi-site.com/gpi2/)) implements the GASPI specification (Global Address Space Programming Interface, [www.gaspi.de](http://www.gaspi.de/en/project.html)). GASPI is a Partitioned Global Address Space (PGAS) API. It aims at scalable, flexible and failure tolerant computing in massively parallel environments.
Modules
-------
GPI-2, version 1.0.2, is available on Anselm via the gpi2 module:
```bash
$ module load gpi2
```
The module sets up environment variables required for linking and running GPI-2 enabled applications. This particular command loads the default module, which is gpi2/1.0.2.
Linking
-------
!!! Note "Note"
    Link with -lGPI2 -libverbs
Load the gpi2 module and link your code using the **-lGPI2** and **-libverbs** switches. GPI-2 requires the OFED InfiniBand communication library ibverbs.
### Compiling and linking with Intel compilers
```bash
$ module load intel
$ module load gpi2
$ icc myprog.c -o myprog.x -Wl,-rpath=$LIBRARY_PATH -lGPI2 -libverbs
```
### Compiling and linking with GNU compilers
```bash
$ module load gcc
$ module load gpi2
$ gcc myprog.c -o myprog.x -Wl,-rpath=$LIBRARY_PATH -lGPI2 -libverbs
```
Running the GPI-2 codes
-----------------------
!!! Note "Note"
    gaspi_run starts the GPI-2 application
The gaspi_run utility is used to start and run GPI-2 applications:
```bash
$ gaspi_run -m machinefile ./myprog.x
```
A machine file (**machinefile**) with the hostnames of the nodes where the application will run must be provided. The machinefile lists all nodes on which to run, one entry per node per process. This file may be created by hand or obtained from the standard $PBS_NODEFILE:
```bash
$ cut -f1 -d"." $PBS_NODEFILE > machinefile
```
machinefile:
```bash
cn79
cn80
```
This machinefile will run 2 GPI-2 processes, one on node cn79 and the other on node cn80.
machinefile:
```bash
cn79
cn79
cn80
cn80
```
This machinefile will run 4 GPI-2 processes, 2 on node cn79 and 2 on node cn80.
!!! Note "Note"
    Use **mpiprocs** to control how many GPI-2 processes will run per node
Example:
```bash
$ qsub -A OPEN-0-0 -q qexp -l select=2:ncpus=16:mpiprocs=16 -I
```
This example will produce a $PBS_NODEFILE with 16 entries per node.
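To verify how many entries per node you actually got, a quick check might be (illustrative):

```bash
$ sort $PBS_NODEFILE | uniq -c
```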
### gaspi_logger
!!! Note "Note"
    gaspi_logger views the output from the GPI-2 application ranks
The gaspi_logger utility is used to view the output from all nodes except the master node (rank 0). The gaspi_logger is started in another session on the master node, the node where gaspi_run is executed. The output of the application, when called with gaspi_printf(), will be redirected to the gaspi_logger. Other I/O routines (e.g. printf) will not.
Example
-------
The following is an example of GPI-2 enabled code:
```cpp
#include <GASPI.h>
#include <stdlib.h>

void success_or_exit ( const char* file, const int line, const int ec)
{
  if (ec != GASPI_SUCCESS)
    {
      gaspi_printf ("Assertion failed in %s[%i]:%d\n", file, line, ec);
      exit (1);
    }
}

#define ASSERT(ec) success_or_exit (__FILE__, __LINE__, ec);

int main(int argc, char *argv[])
{
  gaspi_rank_t rank, num;
  gaspi_return_t ret;

  /* Initialize GPI-2 */
  ASSERT( gaspi_proc_init(GASPI_BLOCK) );

  /* Get ranks information */
  ASSERT( gaspi_proc_rank(&rank) );
  ASSERT( gaspi_proc_num(&num) );

  gaspi_printf("Hello from rank %d of %d\n", rank, num);

  /* Terminate */
  ASSERT( gaspi_proc_term(GASPI_BLOCK) );

  return 0;
}
```
Load modules and compile:
```bash
$ module load gcc gpi2
$ gcc helloworld_gpi.c -o helloworld_gpi.x -Wl,-rpath=$LIBRARY_PATH -lGPI2 -libverbs
```
Submit the job and run the GPI-2 application:
```bash
$ qsub -q qexp -l select=2:ncpus=1:mpiprocs=1,place=scatter,walltime=00:05:00 -I
qsub: waiting for job 171247.dm2 to start
qsub: job 171247.dm2 ready
cn79 $ module load gpi2
cn79 $ cut -f1 -d"." $PBS_NODEFILE > machinefile
cn79 $ gaspi_run -m machinefile ./helloworld_gpi.x
Hello from rank 0 of 2
```
At the same time, in another session, you may start the gaspi_logger:
```bash
$ ssh cn79
cn79 $ gaspi_logger
GASPI Logger (v1.1)
[cn80:0] Hello from rank 1 of 2
```
In this example, we compile the helloworld_gpi.c code using the **GNU compiler** (gcc) and link it against the GPI-2 and ibverbs libraries. The library search path is compiled into the executable. For execution, we use the qexp queue, 2 nodes, 1 core each. The GPI-2 module must be loaded on the master compute node (cn79 in this example); gaspi_logger is used from a different session to view the output of the second process.
Intel Compilers
===============
The Intel compilers, version 13.1.1, are available via the intel module. The compilers include the icc C and C++ compiler and the ifort Fortran 77/90/95 compiler.
```bash
$ module load intel
$ icc -v
$ ifort -v
```
The Intel compilers enable vectorization of the code via the AVX instructions and support threading parallelization via OpenMP.
For maximum performance on the Anselm cluster, compile your programs using the AVX instructions, with reporting on where vectorization was used. We recommend the following compilation options for high performance:
```bash
$ icc -ipo -O3 -vec -xAVX -vec-report1 myprog.c mysubroutines.c -o myprog.x
$ ifort -ipo -O3 -vec -xAVX -vec-report1 myprog.f mysubroutines.f -o myprog.x
```
In this example, we compile the program enabling interprocedural optimizations between source files (-ipo), aggressive loop optimizations (-O3) and vectorization (-vec -xAVX).
The compiler recognizes the omp, simd, vector and ivdep pragmas for OpenMP parallelization and AVX vectorization. Enable OpenMP parallelization with the **-openmp** compiler switch; a minimal sketch using these pragmas follows the compile commands below.
```bash
$ icc -ipo -O3 -vec -xAVX -vec-report1 -openmp myprog.c mysubroutines.c -o myprog.x
$ ifort -ipo -O3 -vec -xAVX -vec-report1 -openmp myprog.f mysubroutines.f -o myprog.x
```
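For illustration, a loop annotated with two of these pragmas might look as follows (a sketch under the flags above, not taken from the Anselm examples):

```cpp
#include <stdio.h>

#define N 1000000
double a[N], b[N], c[N];

int main(void)
{
    int i;

    /* OpenMP work-sharing across threads; with -xAVX the compiler
       may additionally vectorize the loop body using AVX. */
    #pragma omp parallel for
    for (i = 0; i < N; i++)
        a[i] = b[i] + c[i];

    /* ivdep asserts there are no loop-carried dependencies,
       which helps the compiler vectorize this loop. */
    #pragma ivdep
    for (i = 0; i < N; i++)
        b[i] = 2.0 * c[i];

    printf("a[0] = %f\n", a[0]);
    return 0;
}
```

Compile it, e.g., with icc -O3 -xAVX -vec-report1 -openmp as shown above.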
Read more at <http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/composerxe/compiler/cpp-lin/index.htm>
Sandy Bridge/Haswell binary compatibility
-----------------------------------------
Anselm nodes are currently equipped with Sandy Bridge CPUs, while Salomon will use the Haswell architecture. The new processors are backward compatible with the Sandy Bridge nodes, so all programs that ran on the Sandy Bridge processors should also run on the new Haswell nodes. To get optimal performance out of the Haswell processors, a program should make use of the special AVX2 instructions for this processor. One can do this by recompiling codes with the compiler flags designated to invoke these instructions. For the Intel compiler suite, there are two ways of doing this:
- Using the compiler flag (both for Fortran and C) -xCORE-AVX2. This will create a binary with AVX2 instructions, specifically for the Haswell processors. Note that the executable will not run on Sandy Bridge nodes.
- Using the compiler flags (both for Fortran and C) -xAVX -axCORE-AVX2. This will generate multiple, feature-specific auto-dispatch code paths for Intel® processors, if there is a performance benefit. This binary will run on both Sandy Bridge and Haswell processors. At runtime it is decided which path to follow, depending on which processor you are running on. In general this will result in larger binaries.
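Concretely, the two variants would look like this on the compile line (file names illustrative):

```bash
$ icc -O3 -xCORE-AVX2 myprog.c -o myprog.x        # Haswell-only binary
$ icc -O3 -xAVX -axCORE-AVX2 myprog.c -o myprog.x # multi-path binary, runs on both
```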
Intel Debugger
==============
Debugging serial applications
-----------------------------
The Intel debugger, version 13.0, is available via the intel module. The debugger works for applications compiled with the C and C++ compiler and the ifort Fortran 77/90/95 compiler. The debugger provides a Java GUI environment. Use an X display for running the GUI.
```bash
$ module load intel
$ idb
```
The debugger may run in text mode. To debug in text mode, use:
```bash
$ idbc
```
To debug on the compute nodes, the intel module must be loaded. The GUI on compute nodes may be accessed the same way as described in the GUI section.
Example:
```bash
$ qsub -q qexp -l select=1:ncpus=16 -X -I
qsub: waiting for job 19654.srv11 to start
qsub: job 19654.srv11 ready
$ module load intel
$ module load java
$ icc -O0 -g myprog.c -o myprog.x
$ idb ./myprog.x
```
In this example, we allocate 1 full compute node, compile the program myprog.c with the debugging options -O0 -g and run the idb debugger interactively on the myprog.x executable. GUI access is via X11 port forwarding provided by the PBS workload manager.
Debugging parallel applications
-------------------------------
Intel debugger is capable of debugging multithreaded and MPI parallel programs as well.
### Small number of MPI ranks
For debugging a small number of MPI ranks, you may execute and debug each rank in a separate xterm terminal (do not forget the X display). Using Intel MPI, this may be done in the following way:
```bash
$ qsub -q qexp -l select=2:ncpus=16 -X -I
qsub: waiting for job 19654.srv11 to start
qsub: job 19655.srv11 ready
$ module load intel impi
$ mpirun -ppn 1 -hostfile $PBS_NODEFILE --enable-x xterm -e idbc ./mympiprog.x
```
In this example, we allocate 2 full compute nodes, run xterm on each node and start the idb debugger in command line mode, debugging two ranks of the mympiprog.x application. An xterm will pop up for each rank, with the idb prompt ready. The example is not limited to the use of Intel MPI.
### Large number of MPI ranks
Run the idb debugger with the MPI debug option. This will cause the debugger to bind to all ranks and provide aggregated outputs across the ranks, pausing execution automatically just after startup. You may then set breakpoints and step the execution manually. Using Intel MPI:
```bash
$ qsub -q qexp -l select=2:ncpus=16 -X -I
qsub: waiting for job 19654.srv11 to start
qsub: job 19655.srv11 ready
$ module load intel impi
$ mpirun -n 32 -idb ./mympiprog.x
```
### Debugging multithreaded application
Run the idb debugger in GUI mode. The Parallel menu contains a number of tools for debugging multiple threads. One of the most useful is the **Serialize Execution** tool, which serializes the execution of concurrent threads for easy orientation and identification of concurrency-related bugs.
Further information
-------------------
An exhaustive manual on idb features and usage is published at the [Intel website](http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/composerxe/debugger/user_guide/index.htm).
Intel IPP
=========
Intel Integrated Performance Primitives
---------------------------------------
Intel Integrated Performance Primitives, version 7.1.1, compiled for AVX vector instructions, is available via the ipp module. IPP is a very rich library of highly optimized algorithmic building blocks for media and data applications. This includes signal, image and frame processing algorithms, such as FFT, FIR, Convolution, Optical Flow, Hough transform, Sum, MinMax, as well as cryptographic functions, linear algebra functions and many more.
!!! Note "Note"
    Check out IPP before implementing your own math functions for data processing; it is likely already there.
```bash
$ module load ipp
```
The module sets up environment variables required for linking and running IPP enabled applications.
IPP example
-----------
```cpp
#include "ipp.h"
#include <stdio.h>
int main(int argc, char* argv[])
{
const IppLibraryVersion *lib;
Ipp64u fm;
IppStatus status;
status= ippInit(); //IPP initialization with the best optimization layer
if( status != ippStsNoErr ) {
printf("IppInit() Error:n");
printf("%sn", ippGetStatusString(status) );
return -1;
}
//Get version info
lib = ippiGetLibVersion();
printf("%s %sn", lib->Name, lib->Version);
//Get CPU features enabled with selected library level
fm=ippGetEnabledCpuFeatures();
printf("SSE :%cn",(fm>1)&1?'Y':'N');
printf("SSE2 :%cn",(fm>2)&1?'Y':'N');
printf("SSE3 :%cn",(fm>3)&1?'Y':'N');
printf("SSSE3 :%cn",(fm>4)&1?'Y':'N');
printf("SSE41 :%cn",(fm>6)&1?'Y':'N');
printf("SSE42 :%cn",(fm>7)&1?'Y':'N');
printf("AVX :%cn",(fm>8)&1 ?'Y':'N');
printf("AVX2 :%cn", (fm>15)&1 ?'Y':'N' );
printf("----------n");
printf("OS Enabled AVX :%cn", (fm>9)&1 ?'Y':'N');
printf("AES :%cn", (fm>10)&1?'Y':'N');
printf("CLMUL :%cn", (fm>11)&1?'Y':'N');
printf("RDRAND :%cn", (fm>13)&1?'Y':'N');
printf("F16C :%cn", (fm>14)&1?'Y':'N');
return 0;
}
```
Compile the above example, using any compiler and the ipp module:
```bash
$ module load intel
$ module load ipp
$ icc testipp.c -o testipp.x -lippi -lipps -lippcore
```
You will need the ipp module loaded to run the IPP enabled executable. This may be avoided by compiling the library search paths into the executable:
```bash
$ module load intel
$ module load ipp
$ icc testipp.c -o testipp.x -Wl,-rpath=$LIBRARY_PATH -lippi -lipps -lippcore
```
Code samples and documentation
------------------------------
Intel provides a number of [Code Samples for IPP](https://software.intel.com/en-us/articles/code-samples-for-intel-integrated-performance-primitives-library), illustrating the use of IPP.
Read the full documentation on IPP [on the Intel website,](http://software.intel.com/sites/products/search/search.php?q=&x=15&y=6&product=ipp&version=7.1&docos=lin) in particular the [IPP Reference manual.](http://software.intel.com/sites/products/documentation/doclib/ipp_sa/71/ipp_manual/index.htm)
Intel MKL
=========
Intel Math Kernel Library
-------------------------
Intel Math Kernel Library (Intel MKL) is a library of math kernel subroutines, extensively threaded and optimized for maximum performance. Intel MKL provides these basic math kernels:
- BLAS (level 1, 2, and 3) and LAPACK linear algebra routines, offering vector, vector-matrix, and matrix-matrix operations.
- The PARDISO direct sparse solver, an iterative sparse solver, and supporting sparse BLAS (level 1, 2, and 3) routines for solving sparse systems of equations.
- ScaLAPACK distributed processing linear algebra routines for Linux* and Windows* operating systems, as well as the Basic Linear Algebra Communications Subprograms (BLACS) and the Parallel Basic Linear Algebra Subprograms (PBLAS).
- Fast Fourier transform (FFT) functions in one, two, or three dimensions with support for mixed radices (not limited to sizes that are powers of 2), as well as distributed versions of these functions.
- Vector Math Library (VML) routines for optimized mathematical operations on vectors.
- Vector Statistical Library (VSL) routines, which offer high-performance vectorized random number generators (RNG) for several probability distributions, convolution and correlation routines, and summary statistics functions.
- Data Fitting Library, which provides capabilities for spline-based approximation of functions, derivatives and integrals of functions, and search.
- Extended Eigensolver, a shared memory version of an eigensolver based on the Feast Eigenvalue Solver.
For details see the [Intel MKL Reference Manual](http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mklman/index.htm).
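As one concrete illustration of the components listed above, a minimal VSL random number generation sketch might look like this (illustrative code, not from the MKL documentation; compile it with the intel and mkl modules loaded, as described below):

```cpp
#include <stdio.h>
#include "mkl_vsl.h"

int main(void)
{
    VSLStreamStatePtr stream;
    double r[10];
    int i;

    /* Mersenne Twister stream with a fixed seed */
    vslNewStream(&stream, VSL_BRNG_MT19937, 777);

    /* generate 10 uniform doubles in [0, 1) */
    vdRngUniform(VSL_RNG_METHOD_UNIFORM_STD, stream, 10, r, 0.0, 1.0);

    for (i = 0; i < 10; i++)
        printf("%f\n", r[i]);

    vslDeleteStream(&stream);
    return 0;
}
```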
Intel MKL version 13.5.192 is available on Anselm:
```bash
$ module load mkl
```
The module sets up environment variables required for linking and running MKL enabled applications. The most important variables are $MKLROOT, $MKL_INC_DIR, $MKL_LIB_DIR and $MKL_EXAMPLES.
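For instance, you can inspect what the module exposes (illustrative session):

```bash
$ echo $MKLROOT
$ ls $MKL_EXAMPLES
```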
!!! Note "Note"
    The MKL library may be linked using any compiler. With the Intel compiler, use the -mkl option to link the default threaded MKL.
### Interfaces
The MKL library provides a number of interfaces. The fundamental ones are LP64 and ILP64. The Intel MKL ILP64 libraries use the 64-bit integer type (necessary for indexing large arrays with more than 2^31-1 elements), whereas the LP64 libraries index arrays with the 32-bit integer type.
|Interface|Integer type|
|---|---|
|LP64|32-bit, int, integer(kind=4), MPI_INT|
|ILP64|64-bit, long int, integer(kind=8), MPI_INT64|
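To make the difference concrete, here is a minimal sketch using the MKL_INT type, which follows the selected interface (an illustration, not from the MKL examples):

```cpp
#include <stdio.h>
#include "mkl.h"   /* defines MKL_INT: 32-bit under LP64, 64-bit under ILP64 */

int main(void)
{
    /* Under LP64 this prints 4; under ILP64 (compiled with -DMKL_ILP64
       and linked against the ILP64 libraries) it prints 8. */
    printf("sizeof(MKL_INT) = %d bytes\n", (int)sizeof(MKL_INT));
    return 0;
}
```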
### Linking
Linking MKL libraries may be complex. The Intel [mkl link line advisor](http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor) helps. See also the [examples](intel-mkl/#examples) below.
You will need the mkl module loaded to run the MKL enabled executable. This may be avoided by compiling the library search paths into the executable. Include rpath on the compile line:
```bash
$ icc .... -Wl,-rpath=$LIBRARY_PATH ...
```
### Threading
!!! Note "Note"
    An advantage of using the MKL library is that it brings threaded parallelization to applications that are otherwise not parallel.
For this to work, the application must link the threaded MKL library (the default). The number and behaviour of MKL threads may be controlled via the OpenMP environment variables, such as OMP_NUM_THREADS and KMP_AFFINITY. MKL_NUM_THREADS takes precedence over OMP_NUM_THREADS.
```bash
$ export OMP_NUM_THREADS=16
$ export KMP_AFFINITY=granularity=fine,compact,1,0
```
The application will run with 16 threads with affinity optimized for fine grain parallelization.
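As a sketch of what such a threaded application might look like, the following calls a threaded BLAS routine; the thread count then follows the environment variables above (illustrative code, compiled e.g. with icc -mkl):

```cpp
#include <stdio.h>
#include <stdlib.h>
#include "mkl.h"

int main(void)
{
    const MKL_INT n = 1000;
    double *a = (double *)calloc(n * n, sizeof(double));
    double *b = (double *)calloc(n * n, sizeof(double));
    double *c = (double *)calloc(n * n, sizeof(double));

    /* dgemm is threaded inside MKL; the thread count follows
       MKL_NUM_THREADS / OMP_NUM_THREADS from the environment. */
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n, 1.0, a, n, b, n, 0.0, c, n);

    printf("MKL may use up to %d threads\n", mkl_get_max_threads());
    free(a); free(b); free(c);
    return 0;
}
```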
Examples
------------
A number of examples demonstrating the use of the MKL library and its linking are available on Anselm in the $MKL_EXAMPLES directory. In the examples below, we demonstrate linking MKL to an Intel and a GNU compiled program for multi-threaded matrix multiplication.
### Working with examples
```bash
$ module load intel
$ module load mkl
$ cp -a $MKL_EXAMPLES/cblas /tmp/
$ cd /tmp/cblas
$ make sointel64 function=cblas_dgemm
```
In this example, we compile, link and run the cblas_dgemm example, demonstrating the use of the MKL example suite installed on Anselm.
### Example: MKL and Intel compiler
```bash
$ module load intel
$ module load mkl
$ cp -a $MKL_EXAMPLES/cblas /tmp/
$ cd /tmp/cblas
$
$ icc -w source/cblas_dgemmx.c source/common_func.c -mkl -o cblas_dgemmx.x
$ ./cblas_dgemmx.x data/cblas_dgemmx.d
```
In this example, we compile, link and run the cblas_dgemm example, demonstrating the use of MKL with the icc -mkl option. Using the -mkl option is equivalent to:
```bash
$ icc -w source/cblas_dgemmx.c source/common_func.c -o cblas_dgemmx.x \
  -I$MKL_INC_DIR -L$MKL_LIB_DIR -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5
```
In this example, we compile and link the cblas_dgemm example, using the LP64 interface to threaded MKL and the Intel OMP threads implementation.
### Example: MKL and GNU compiler
```bash
$ module load gcc
$ module load mkl
$ cp -a $MKL_EXAMPLES/cblas /tmp/
$ cd /tmp/cblas
$ gcc -w source/cblas_dgemmx.c source/common_func.c -o cblas_dgemmx.x \
  -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -lm
$ ./cblas_dgemmx.x data/cblas_dgemmx.d
```
In this example, we compile, link and run the cblas_dgemm example, using the LP64 interface to threaded MKL and the GNU OMP threads implementation.
MKL and MIC accelerators
------------------------
The MKL is capable of automatically offloading the computations to the MIC accelerator. See the section [Intel XeonPhi](../intel-xeon-phi/) for details.
Further reading
---------------
Read more on [Intel website](http://software.intel.com/en-us/intel-mkl), in particular the [MKL users guide](https://software.intel.com/en-us/intel-mkl/documentation/linux).
Intel TBB
=========
Intel Threading Building Blocks
-------------------------------
Intel Threading Building Blocks (Intel TBB) is a library that supports scalable parallel programming using standard ISO C++ code. It does not require special languages or compilers. To use the library, you specify tasks, not threads, and let the library map tasks onto threads in an efficient manner. The tasks are executed by a runtime scheduler and may be offloaded to the [MIC accelerator](../intel-xeon-phi/).
Intel TBB version 4.1 is available on Anselm:
```bash
$ module load tbb
```
The module sets up environment variables required for linking and running TBB enabled applications.
!!! Note "Note"
    Link the tbb library using -ltbb
Examples
--------
A number of examples demonstrating the use of TBB and its built-in scheduler are available on Anselm in the $TBB_EXAMPLES directory.
```bash
$ module load intel
$ module load tbb
$ cp -a $TBB_EXAMPLES/common $TBB_EXAMPLES/parallel_reduce /tmp/
$ cd /tmp/parallel_reduce/primes
$ icc -O2 -DNDEBUG -o primes.x main.cpp primes.cpp -ltbb
$ ./primes.x
```
In this example, we compile, link and run the primes example, demonstrating the use of the parallel task-based reduce in the computation of prime numbers.
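If you prefer a self-contained starting point over the installed example, a minimal parallel_reduce sketch could look like this (illustrative; assumes C++11 lambda support, e.g. icc -std=c++11):

```cpp
#include <iostream>
#include "tbb/parallel_reduce.h"
#include "tbb/blocked_range.h"

int main()
{
    const size_t n = 1000000;

    // Sum 1..n in parallel: TBB splits the range into tasks and
    // combines the partial sums with the reduction functor.
    double sum = tbb::parallel_reduce(
        tbb::blocked_range<size_t>(1, n + 1), 0.0,
        [](const tbb::blocked_range<size_t> &r, double acc) {
            for (size_t i = r.begin(); i != r.end(); ++i)
                acc += (double)i;
            return acc;
        },
        [](double x, double y) { return x + y; });

    std::cout << "sum = " << sum << std::endl;
    return 0;
}
```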
You will need the tbb module loaded to run the TBB enabled executable. This may be avoided by compiling the library search paths into the executable:
```bash
$ icc -O2 -o primes.x main.cpp primes.cpp -Wl,-rpath=$LIBRARY_PATH -ltbb
```
Further reading
---------------
Read more at the Intel website: <http://software.intel.com/sites/products/documentation/doclib/tbb_sa/help/index.htm>
Intel Parallel Studio
=====================
The Anselm cluster provides the following elements of the Intel Parallel Studio XE:
|Intel Parallel Studio XE|
|-------------------------------------------------|
|Intel Compilers|
|Intel Debugger|
|Intel MKL Library|
|Intel Integrated Performance Primitives Library|
|Intel Threading Building Blocks Library|
Intel compilers
---------------
The Intel compilers, version 13.1.3, are available via the intel module. The compilers include the icc C and C++ compiler and the ifort Fortran 77/90/95 compiler.
```bash
$ module load intel
$ icc -v
$ ifort -v
```
Read more at the [Intel Compilers](intel-compilers/) page.
Intel debugger
--------------
The Intel debugger, version 13.0, is available via the intel module. The debugger works for applications compiled with the C and C++ compiler and the ifort Fortran 77/90/95 compiler. The debugger provides a Java GUI environment. Use an X display for running the GUI.
```bash
$ module load intel
$ idb
```
Read more at the [Intel Debugger](intel-debugger/) page.
Intel Math Kernel Library
-------------------------
Intel Math Kernel Library (Intel MKL) is a library of math kernel subroutines, extensively threaded and optimized for maximum performance. Intel MKL unites and provides these basic components: BLAS, LAPACK, ScaLAPACK, PARDISO, FFT, VML, VSL, Data Fitting, FEAST Eigensolver and many more.
```bash
$ module load mkl
```
Read more at the [Intel MKL](intel-mkl/) page.
Intel Integrated Performance Primitives
---------------------------------------
Intel Integrated Performance Primitives, version 7.1.1, compiled for AVX, is available via the ipp module. IPP is a library of highly optimized algorithmic building blocks for media and data applications. This includes signal, image and frame processing algorithms, such as FFT, FIR, Convolution, Optical Flow, Hough transform, Sum, MinMax and many more.
```bash
$ module load ipp
```
Read more at the [Intel IPP](intel-integrated-performance-primitives/) page.
Intel Threading Building Blocks
-------------------------------
Intel Threading Building Blocks (Intel TBB) is a library that supports scalable parallel programming using standard ISO C++ code. It does not require special languages or compilers. It is designed to promote scalable data parallel programming. Additionally, it fully supports nested parallelism, so you can build larger parallel components from smaller parallel components. To use the library, you specify tasks, not threads, and let the library map tasks onto threads in an efficient manner.
```bash
$ module load tbb
```
Read more at the [Intel TBB](intel-tbb/) page.
Java
====
## Java on ANSELM
Java is available on the Anselm cluster. Activate Java by loading the java module:
```bash
$ module load java
```
Note that the java module must also be loaded on the compute nodes in order to run Java there.
Check the Java version and path:
```bash
$ java -version
$ which java
```
With the module loaded, not only the runtime environment (JRE) but also the development environment (JDK) with the compiler is available:
```bash
$ javac -version
$ which javac
```
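As a quick check that both the compiler and runtime work, you might compile and run a trivial program (hypothetical Hello.java, created here via a heredoc):

```bash
$ cat > Hello.java <<'EOF'
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello from Anselm");
    }
}
EOF
$ javac Hello.java
$ java Hello
Hello from Anselm
```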
Java applications may use MPI for interprocess communication, in conjunction with OpenMPI. Read more at <http://www.open-mpi.org/faq/?category=java>. This functionality is currently not supported on the Anselm cluster. If you require the Java interface to MPI, please contact [Anselm support](https://support.it4i.cz/rt/).