Commit e6fb2acb authored by David Hrbáč

Links OK

parent 75378e2f
@@ -26,6 +26,10 @@ In the left pane, you can switch between Vectorization and Threading workflows.
## References
1. [Intel® Advisor 2015 Tutorial: Find Where to Add Parallelism - C++ Sample][a]
1. [Product page][b]
1. [Documentation][c]
[a]: https://software.intel.com/en-us/intel-advisor-tutorial-vectorization-windows-cplusplus
[b]: https://software.intel.com/en-us/intel-advisor-xe
[c]: https://software.intel.com/en-us/intel-advisor-2016-user-guide-linux
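For orientation, a minimal hedged sketch of launching the Advisor GUI on a compute node (the module name is an assumption for this cluster; `advixe-gui` needs an X display):
```console
$ ml Advisor
$ advixe-gui &
```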
@@ -26,7 +26,7 @@ $ icc -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec -openmp myprog.c
$ ifort -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec -openmp myprog.f mysubroutines.f -o myprog.x
```
Read more [here][a].
## Sandy Bridge/Ivy Bridge/Haswell Binary Compatibility
@@ -34,3 +34,5 @@ Read more at <https://software.intel.com/en-us/intel-cplusplus-compiler-16.0-use
* Using compiler flag (both for Fortran and C): **-xCORE-AVX2**. This will create a binary with AVX2 instructions, specifically for the Haswell processors. Note that the executable will not run on Sandy Bridge/Ivy Bridge nodes.
* Using compiler flags (both for Fortran and C): **-xAVX -axCORE-AVX2**. This will generate multiple, feature-specific auto-dispatch code paths for Intel® processors, if there is a performance benefit, so this binary will run on both Sandy Bridge/Ivy Bridge and Haswell processors. At runtime, the path to follow is selected depending on which processor you are running on. In general, this will result in larger binaries; see the sketch below.
[a]: https://software.intel.com/en-us/intel-cplusplus-compiler-16.0-user-and-reference-guide
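As a hedged illustration of the second option above (file names mirror the compile lines earlier on this page; the opt-report and OpenMP flags are omitted for brevity):
```console
$ icc -ipo -O3 -xAVX -axCORE-AVX2 myprog.c mysubroutines.c -o myprog.x
$ ifort -ipo -O3 -xAVX -axCORE-AVX2 myprog.f mysubroutines.f -o myprog.x
```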
@@ -4,7 +4,7 @@ IDB is no longer available since Intel Parallel Studio 2015
## Debugging Serial Applications
The Intel Debugger is available via the intel/13.5.192 module. The debugger works for applications compiled with the C and C++ compilers and the ifort Fortran 77/90/95 compiler. The debugger provides a Java GUI environment. Use an [X display][1] for running the GUI.
```console
$ ml intel/13.5.192
@@ -18,7 +18,7 @@ The debugger may run in text mode. To debug in text mode, use
$ idbc
```
To debug on the compute nodes, the intel module must be loaded. The GUI on compute nodes may be accessed in the same way as described in [the GUI section][1].
Example:
@@ -40,7 +40,7 @@ In this example, we allocate 1 full compute node, compile program myprog.c with
### Small Number of MPI Ranks
For debugging a small number of MPI ranks, you may execute and debug each rank in a separate xterm terminal (do not forget the [X display][1]). Using Intel MPI, this may be done in the following way:
```console
$ qsub -q qexp -l select=2:ncpus=24 -X -I
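# Illustrative continuation (an assumption, not the literal commands from this guide):
# load the intel module, then start each of four ranks in its own xterm running idbc
$ ml intel
$ mpirun -n 4 xterm -e idbc ./mympiprog.x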
@@ -70,4 +70,8 @@ Run the idb debugger in GUI mode. The menu Parallel contains number of tools for
## Further Information
An exhaustive manual on IDB features and usage is published on the [Intel website][a].
[1]: ../../../general/accessing-the-clusters/graphical-user-interface/x-window-system.md
[a]: https://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/
@@ -34,6 +34,10 @@ Results obtained from batch mode can be then viewed in the GUI by selecting File
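A hedged sketch of collecting such a batch-mode result from the command line (the analysis type and program name are illustrative; `inspxe-cl` is Inspector's command-line driver):
```console
$ inspxe-cl -collect mi2 -- ./myprog.x
```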
## References
1. [Product page][a]
1. [Documentation and Release Notes][b]
1. [Tutorials][c]
[a]: https://software.intel.com/en-us/intel-inspector-xe
[b]: https://software.intel.com/en-us/intel-inspector-xe-support/documentation
[c]: https://software.intel.com/en-us/articles/inspectorxe-tutorials
@@ -73,6 +73,10 @@ $ icc testipp.c -o testipp.x -Wl,-rpath=$LIBRARY_PATH -lippi -lipps -lippcore
## Code Samples and Documentation
Intel provides a number of [Code Samples for IPP][a], illustrating the use of IPP.
Read the full documentation on IPP on the [Intel website][b], in particular the [IPP Reference manual][c].
[a]: https://software.intel.com/en-us/articles/code-samples-for-intel-integrated-performance-primitives-library
[b]: http://software.intel.com/sites/products/search/search.php?q=&x=15&y=6&product=ipp&version=7.1&docos=lin
[c]: http://software.intel.com/sites/products/documentation/doclib/ipp_sa/71/ipp_manual/index.htm
@@ -13,7 +13,7 @@ Intel Math Kernel Library (Intel MKL) is a library of math kernel subroutines, e
* Data Fitting Library, which provides capabilities for spline-based approximation of functions, derivatives and integrals of functions, and search.
* Extended Eigensolver, a shared memory version of an eigensolver based on the Feast Eigenvalue Solver.
For details see the [Intel MKL Reference Manual][a].
Intel MKL is available on the cluster
@@ -37,7 +37,7 @@ Intel MKL library provides number of interfaces. The fundamental once are the LP
### Linking
Linking Intel MKL libraries may be complex. The Intel [MKL Link Line Advisor][b] helps. See also the [examples][1] below.
You will need the mkl module loaded to run the MKL-enabled executable. This may be avoided by compiling the library search paths into the executable. Include rpath on the compile line:
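A minimal hedged sketch of such a compile line (the source file name is illustrative; `-mkl` links the default MKL interface and the rpath bakes the library search path into the binary):
```console
$ icc test_mkl.c -o test_mkl.x -mkl -Wl,-rpath=$LIBRARY_PATH
```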
@@ -109,7 +109,7 @@ In this example, we compile, link and run the cblas_dgemm example, using LP64 in
## MKL and MIC Accelerators
Intel MKL is capable of automatically offloading computations to the MIC accelerator. See the [Intel Xeon Phi][2] section for details.
## LAPACKE C Interface
@@ -117,4 +117,12 @@ MKL includes LAPACKE C Interface to LAPACK. For some reason, although Intel is t
## Further Reading
Read more on the [Intel website][c], in particular the [MKL users guide][d].
[1]: #examples
[2]: ../intel-xeon-phi-salomon.md
[a]: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mklman/index.htm
[b]: http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor
[c]: http://software.intel.com/en-us/intel-mkl
[d]: https://software.intel.com/en-us/intel-mkl/documentation/linux
@@ -23,7 +23,7 @@ $ icc -v
$ ifort -v
```
Read more at the [Intel Compilers][1] page.
## Intel Debugger
@@ -36,7 +36,7 @@ $ ml intel
$ idb
```
Read more at the [Intel Debugger][2] page.
## Intel Math Kernel Library
@@ -46,7 +46,7 @@ Intel Math Kernel Library (Intel MKL) is a library of math kernel subroutines, e
$ ml imkl
```
Read more at the [Intel MKL][3] page.
## Intel Integrated Performance Primitives
@@ -56,7 +56,7 @@ Intel Integrated Performance Primitives, version 7.1.1, compiled for AVX is avai
$ ml ipp
```
Read more at the [Intel IPP][4] page.
## Intel Threading Building Blocks
@@ -66,4 +66,10 @@ Intel Threading Building Blocks (Intel TBB) is a library that supports scalable
$ ml tbb
```
Read more at the [Intel TBB][5] page.
[1]: intel-compilers.md
[2]: intel-debugger.md
[3]: intel-mkl.md
[4]: intel-integrated-performance-primitives.md
[5]: intel-tbb.md
@@ -2,7 +2,7 @@
## Intel Threading Building Blocks
Intel Threading Building Blocks (Intel TBB) is a library that supports scalable parallel programming using standard ISO C++ code. It does not require special languages or compilers. To use the library, you specify tasks, not threads, and let the library map tasks onto threads in an efficient manner. The tasks are executed by a runtime scheduler and may be offloaded to the [MIC accelerator][1].
Intel TBB is available on the cluster.
@@ -37,4 +37,8 @@ $ icc -O2 -o primes.x main.cpp primes.cpp -Wl,-rpath=$LIBRARY_PATH -ltbb
## Further Reading
Read more on the Intel [website][a].
[1]: ../intel-xeon-phi-salomon.md
[a]: http://software.intel.com/sites/products/documentation/doclib/tbb_sa/help/index.htm
@@ -21,7 +21,7 @@ The trace will be saved in file myapp.stf in the current directory.
## Viewing Traces
To view and analyze the trace, open the ITAC GUI in a [graphical environment][1]:
```console
$ ml itac/9.1.2.024
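# Launching the GUI and opening the trace collected above is typically done as follows
# (an assumption for this module version; myapp.stf is the trace file mentioned earlier):
$ traceanalyzer myapp.stf &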
@@ -36,5 +36,10 @@ Please refer to Intel documenation about usage of the GUI tool.
## References
1. [Getting Started with Intel® Trace Analyzer and Collector][a]
1. [Intel® Trace Analyzer and Collector - Documentation][b]
[1]: ../../../general/accessing-the-clusters/graphical-user-interface/x-window-system.md
[a]: https://software.intel.com/en-us/get-started-with-itac-for-linux
[b]: https://software.intel.com/en-us/intel-trace-analyzer
@@ -244,12 +244,13 @@ Some interesting compiler flags useful not only for code debugging are:
Intel MKL includes an Automatic Offload (AO) feature that enables computationally intensive MKL functions called in user code to benefit from attached Intel Xeon Phi coprocessors automatically and transparently.
!!! note
    The behavior of automatic offload mode is controlled by functions called within the program or by environment variables. A complete list of controls is listed [here][a].
The Automatic Offload may be enabled by either an MKL function call within the code:
```cpp
mkl_mic_enable();
```
or by setting an environment variable:
@@ -258,7 +259,7 @@ or by setting environment variable
$ export MKL_MIC_ENABLE=1
```
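Beyond simply enabling AO, the host/coprocessor work split can be tuned via environment variables; a minimal hedged sketch (the 0.5 fraction is illustrative):
```console
$ export MKL_MIC_ENABLE=1
$ export MKL_MIC_WORKDIVISION=0.5
```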
For more information about automatic offload, refer to the "[Using Intel® MKL Automatic Offload on Intel® Xeon Phi™ Coprocessors][b]" white paper or the [Intel MKL documentation][c].
### Automatic Offload Example
@@ -476,27 +477,27 @@ After executing the complied binary file, following output should be displayed.
```console
$ ./capsbasic
Number of available platforms: 1
Platform names:
[0] Intel(R) OpenCL [Selected]
Number of devices available for each type:
CL_DEVICE_TYPE_CPU: 1
CL_DEVICE_TYPE_GPU: 0
CL_DEVICE_TYPE_ACCELERATOR: 1
** Detailed information for each device ***
CL_DEVICE_TYPE_CPU[0]
CL_DEVICE_NAME: Intel(R) Xeon(R) CPU E5-2470 0 @ 2.30GHz
CL_DEVICE_AVAILABLE: 1
...
CL_DEVICE_TYPE_ACCELERATOR[0]
CL_DEVICE_NAME: Intel(R) Many Integrated Core Acceleration Card
CL_DEVICE_AVAILABLE: 1
...
```
!!! note
@@ -619,10 +620,10 @@ $ mpirun -np 4 ./mpi-test
The output should be similar to:
```console
Hello world from process 1 of 4 on host cn207
Hello world from process 3 of 4 on host cn207
Hello world from process 2 of 4 on host cn207
Hello world from process 0 of 4 on host cn207
```
### Coprocessor-Only Model
@@ -637,15 +638,15 @@ Similarly to execution of OpenMP programs in native mode, since the environmenta
```console
$ vim ~/.profile
PS1='[\u@\h \W]\$ '
export PATH=/usr/bin:/usr/sbin:/bin:/sbin
#OpenMP
export LD_LIBRARY_PATH=/apps/intel/composer_xe_2013.5.192/compiler/lib/mic:$LD_LIBRARY_PATH
#Intel MPI
export LD_LIBRARY_PATH=/apps/intel/impi/4.1.1.036/mic/lib/:$LD_LIBRARY_PATH
export PATH=/apps/intel/impi/4.1.1.036/mic/bin/:$PATH
```
!!! note
@@ -673,10 +674,10 @@ $ mpirun -np 4 ./mpi-test-mic
The output should be similar to:
```console
Hello world from process 1 of 4 on host cn207-mic0
Hello world from process 2 of 4 on host cn207-mic0
Hello world from process 3 of 4 on host cn207-mic0
Hello world from process 0 of 4 on host cn207-mic0
```
#### Execution on Host
@@ -708,10 +709,10 @@ $ mpirun -genv LD_LIBRARY_PATH /apps/intel/impi/4.1.1.036/mic/lib/ -host mic0 -n
The output should be again similar to:
```console
Hello world from process 1 of 4 on host cn207-mic0
Hello world from process 2 of 4 on host cn207-mic0
Hello world from process 3 of 4 on host cn207-mic0
Hello world from process 0 of 4 on host cn207-mic0
```
!!! note
@@ -870,14 +871,14 @@ $ mpirun
A possible output of the MPI "hello-world" example executed on two hosts and two accelerators is:
```console
Hello world from process 0 of 8 on host cn204
Hello world from process 1 of 8 on host cn204
Hello world from process 2 of 8 on host cn204-mic0
Hello world from process 3 of 8 on host cn204-mic0
Hello world from process 4 of 8 on host cn205
Hello world from process 5 of 8 on host cn205
Hello world from process 6 of 8 on host cn205-mic0
Hello world from process 7 of 8 on host cn205-mic0
```
!!! note
@@ -901,4 +902,9 @@ Each host or accelerator is listed only once per file. User has to specify how m
## Optimization
For more details about optimization techniques, read the Intel document [Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors][d].
[a]: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm
[b]: http://software.intel.com/sites/default/files/11MIC42_How_to_Use_MKL_Automatic_Offload_0.pdf
[c]: https://software.intel.com/en-us/articles/intel-math-kernel-library-documentation
[d]: http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization
@@ -152,7 +152,7 @@ For debugging purposes it is also recommended to set environment variable "OFFLO
export OFFLOAD_REPORT=3
```
A very basic example of code that employs the offload programming technique is shown in the next listing. Note that this code is sequential and utilizes only a single core of the accelerator.
```cpp
$ cat source-offload.cpp
@@ -290,7 +290,7 @@ Some interesting compiler flags useful not only for code debugging are:
Intel MKL includes an Automatic Offload (AO) feature that enables computationally intensive MKL functions called in user code to benefit from attached Intel Xeon Phi coprocessors automatically and transparently.
!!! note
    The behavior of automatic offload mode is controlled by functions called within the program or by environment variables. A complete list of controls is listed [here][a].
The Automatic Offload may be enabled by either an MKL function call within the code:
@@ -304,7 +304,7 @@ or by setting environment variable
$ export MKL_MIC_ENABLE=1
```
For more information about automatic offload, refer to the "[Using Intel® MKL Automatic Offload on Intel® Xeon Phi™ Coprocessors][b]" white paper or the [Intel MKL documentation][c].
### Automatic Offload Example
@@ -411,7 +411,7 @@ Done
```
!!! note ""
    The behavior of automatic offload mode is controlled by functions called within the program or by environment variables. A complete list of controls is listed [here][d].
### Automatic Offload Example #2
@@ -510,7 +510,7 @@ mic0 $ export LD_LIBRARY_PATH=/apps/all/icc/2015.3.187-GNU-5.1.0-2.25/composer_x
```
!!! note
    The path exported in the previous example contains a path to a specific compiler (here the version is 2015.3.187-GNU-5.1.0-2.25). This version number has to match the version number of the Intel compiler module that was used to compile the code on the host computer.
For your information, the list of libraries and their locations required for execution of an OpenMP parallel code on Intel Xeon Phi is:
@@ -987,4 +987,10 @@ Each host or accelerator is listed only once per file. User has to specify how m
## Optimization
For more details about optimization techniques, read the Intel document [Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors][e].
[a]: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm
[b]: http://software.intel.com/sites/default/files/11MIC42_How_to_Use_MKL_Automatic_Offload_0.pdf
[c]: https://software.intel.com/en-us/articles/intel-math-kernel-library-documentation
[d]: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm
[e]: http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization