Intel MKL 
=========



  


Intel Math Kernel Library
-------------------------

Intel Math Kernel Library (Intel MKL) is a library of math kernel
subroutines, extensively threaded and optimized for maximum performance.
Intel MKL provides these basic math kernels:

-   BLAS (level 1, 2, and 3) and LAPACK linear algebra routines,
    offering vector, vector-matrix, and matrix-matrix operations.
-   The PARDISO direct sparse solver, an iterative sparse solver,
    and supporting sparse BLAS (level 1, 2, and 3) routines for solving
    sparse systems of equations.
-   ScaLAPACK distributed processing linear algebra routines for
    Linux and Windows operating systems, as well as the Basic Linear
    Algebra Communications Subprograms (BLACS) and the Parallel Basic
    Linear Algebra Subprograms (PBLAS).
-   Fast Fourier transform (FFT) functions in one, two, or three
    dimensions with support for mixed radices (not limited to sizes that
    are powers of 2), as well as distributed versions of
    these functions.
-   Vector Math Library (VML) routines for optimized mathematical
    operations on vectors.
-   Vector Statistical Library (VSL) routines, which offer
    high-performance vectorized random number generators (RNG) for
    several probability distributions, convolution and correlation
    routines, and summary statistics functions.
-   Data Fitting Library, which provides capabilities for
    spline-based approximation of functions, derivatives and integrals
    of functions, and search.
-   Extended Eigensolver, a shared-memory version of an eigensolver
    based on the FEAST Eigenvalue Solver.



For details see the [Intel MKL Reference
Manual](http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mklman/index.htm).

Intel MKL version 13.5.192 is available on Anselm:

    $ module load mkl

The module sets up the environment variables required for linking and
running MKL-enabled applications. The most important variables are
$MKLROOT, $MKL_INC_DIR, $MKL_LIB_DIR and $MKL_EXAMPLES.
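
As a quick sanity check after loading the module, the variables named above can be inspected from the shell. The following is a minimal sketch for a POSIX shell; `check_mkl_env` is a hypothetical helper written for this page and is not provided by the mkl module.

```shell
# Report which of the MKL variables set by the module are present.
# check_mkl_env is a hypothetical helper, not part of the mkl module.
check_mkl_env() {
    missing=0
    for var in MKLROOT MKL_INC_DIR MKL_LIB_DIR MKL_EXAMPLES; do
        # Indirectly read the variable whose name is stored in $var.
        eval "value=\${$var:-}"
        if [ -n "$value" ]; then
            echo "$var=$value"
        else
            echo "$var is not set" >&2
            missing=1
        fi
    done
    # Return non-zero if at least one variable was missing.
    return "$missing"
}
```

Calling `check_mkl_env` after `module load mkl` should print the four paths; a non-zero exit status suggests the module is not loaded in the current shell.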

The MKL library may be linked using any compiler.
With the Intel compiler, use the -mkl option to link the default threaded MKL.

### Interfaces

The MKL library provides a number of interfaces. The fundamental ones are
LP64 and ILP64. The Intel MKL ILP64 libraries use the 64-bit integer
type (necessary for indexing large arrays with more than 2^31^-1
elements), whereas the LP64 libraries index arrays with the 32-bit
integer type.

  Interface   Integer type
  ----------- -----------------------------------------------
  LP64        32-bit, int, integer(kind=4), MPI_INT
  ILP64       64-bit, long int, integer(kind=8), MPI_INT64_T
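
The choice of interface shows up on the compile and link line. The sketch below prints the flags commonly associated with each interface; `mkl_interface_flags` is a hypothetical helper, and the ILP64 flags (`-DMKL_ILP64` and `-lmkl_intel_ilp64`) are stated here as an assumption that should be confirmed with the link line advisor for the installed MKL version.

```shell
# Print the interface-specific compile/link flags for MKL.
# mkl_interface_flags is a hypothetical helper for illustration only.
mkl_interface_flags() {
    case "$1" in
        # 32-bit integer interface
        lp64)  echo "-lmkl_intel_lp64" ;;
        # 64-bit integer interface; the macro switches MKL_INT to 64 bits
        ilp64) echo "-DMKL_ILP64 -lmkl_intel_ilp64" ;;
        *)     echo "unknown interface: $1" >&2; return 1 ;;
    esac
}
```

The output is meant to be combined with the threading and core libraries, for example `$(mkl_interface_flags lp64) -lmkl_intel_thread -lmkl_core -liomp5`.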

### Linking

Linking MKL libraries may be complex. The Intel [MKL link line
advisor](http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor)
helps. See also [examples](intel-mkl.html#examples) below.

You will need the mkl module loaded to run the MKL-enabled executable.
This may be avoided by compiling the library search paths into the
executable. Include rpath on the compile line:

    $ icc .... -Wl,-rpath=$LIBRARY_PATH ...

### Threading

An advantage of using the MKL library is that it brings threaded
parallelization to applications that are otherwise not parallel.

For this to work, the application must link the threaded MKL library
(the default). The number and behaviour of MKL threads may be controlled
via the OpenMP environment variables, such as OMP_NUM_THREADS and
KMP_AFFINITY. MKL_NUM_THREADS takes precedence over OMP_NUM_THREADS.

    $ export OMP_NUM_THREADS=16
    $ export KMP_AFFINITY=granularity=fine,compact,1,0

The application will run with 16 threads, with affinity optimized for
fine-grained parallelization.
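
The precedence between the two thread-count variables can be illustrated with a small shell function. This is only a sketch of the rule stated above: `mkl_effective_threads` is a hypothetical helper, and MKL performs the actual resolution internally at run time.

```shell
# Sketch of the precedence rule: MKL_NUM_THREADS, when set,
# wins over OMP_NUM_THREADS.
# mkl_effective_threads is a hypothetical helper for illustration only.
mkl_effective_threads() {
    if [ -n "${MKL_NUM_THREADS:-}" ]; then
        echo "$MKL_NUM_THREADS"
    elif [ -n "${OMP_NUM_THREADS:-}" ]; then
        echo "$OMP_NUM_THREADS"
    else
        # Neither variable is set; the runtime chooses its own default.
        echo "default"
    fi
}
```

With `OMP_NUM_THREADS=16` and `MKL_NUM_THREADS=4` both exported, the function prints 4, which mirrors the thread count MKL routines would be expected to use.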

Examples
------------

A number of examples demonstrating use of the MKL library and its linking
are available on Anselm, in the $MKL_EXAMPLES directory. In the
examples below, we demonstrate linking MKL to Intel- and GNU-compiled
programs for multi-threaded matrix multiplication.

### Working with examples

    $ module load intel
    $ module load mkl
    $ cp -a $MKL_EXAMPLES/cblas /tmp/
    $ cd /tmp/cblas

    $ make sointel64 function=cblas_dgemm

In this example, we compile, link and run the cblas_dgemm example,
demonstrating use of the MKL example suite installed on Anselm.

### Example: MKL and Intel compiler

    $ module load intel
    $ module load mkl
    $ cp -a $MKL_EXAMPLES/cblas /tmp/
    $ cd /tmp/cblas
    $ icc -w source/cblas_dgemmx.c source/common_func.c -mkl -o cblas_dgemmx.x
    $ ./cblas_dgemmx.x data/cblas_dgemmx.d

In this example, we compile, link and run the cblas_dgemm example,
demonstrating use of MKL with the icc -mkl option. Using the -mkl option is
equivalent to:

    $ icc -w source/cblas_dgemmx.c source/common_func.c -o cblas_dgemmx.x \
    -I$MKL_INC_DIR -L$MKL_LIB_DIR -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5

In this example, we compile and link the cblas_dgemm example, using the
LP64 interface to threaded MKL and the Intel OpenMP threads implementation.

### Example: MKL and GNU compiler

    $ module load gcc
    $ module load mkl
    $ cp -a $MKL_EXAMPLES/cblas /tmp/
    $ cd /tmp/cblas
     
    $ gcc -w source/cblas_dgemmx.c source/common_func.c -o cblas_dgemmx.x \
    -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -lm

    $ ./cblas_dgemmx.x data/cblas_dgemmx.d

In this example, we compile, link and run the cblas_dgemm example,
using the LP64 interface to threaded MKL and the GNU OpenMP threads
implementation.

MKL and MIC accelerators
------------------------

MKL is capable of automatically offloading computations to the MIC
accelerator. See section [Intel Xeon
Phi](../intel-xeon-phi.html) for details.

Further reading
---------------

Read more on the [Intel
website](http://software.intel.com/en-us/intel-mkl), in
particular the [MKL user's
guide](https://software.intel.com/en-us/intel-mkl/documentation/linux).