diff --git a/docs.it4i/software/intel/intel-suite/intel-advisor.md b/docs.it4i/software/intel/intel-suite/intel-advisor.md index 688deda17708cc23578fd50dc6063fb7716c5858..f2602054acaf1aae89cc4a6e27842af8ad195847 100644 --- a/docs.it4i/software/intel/intel-suite/intel-advisor.md +++ b/docs.it4i/software/intel/intel-suite/intel-advisor.md @@ -26,6 +26,10 @@ In the left pane, you can switch between Vectorization and Threading workflows. ## References -1. [Intel® Advisor 2015 Tutorial: Find Where to Add Parallelism - C++ Sample](https://software.intel.com/en-us/intel-advisor-tutorial-vectorization-windows-cplusplus) -1. [Product page](https://software.intel.com/en-us/intel-advisor-xe) -1. [Documentation](https://software.intel.com/en-us/intel-advisor-2016-user-guide-linux) +1. [Intel® Advisor 2015 Tutorial: Find Where to Add Parallelism - C++ Sample][a] +1. [Product page][b] +1. [Documentation][c] + +[a]: https://software.intel.com/en-us/intel-advisor-tutorial-vectorization-windows-cplusplus +[b]: https://software.intel.com/en-us/intel-advisor-xe +[c]: https://software.intel.com/en-us/intel-advisor-2016-user-guide-linux diff --git a/docs.it4i/software/intel/intel-suite/intel-compilers.md b/docs.it4i/software/intel/intel-suite/intel-compilers.md index 853408ca59d2984303f561b08902a6ff4dec981c..66d8a125a7e0c64e7a089030af5be3235e1141ec 100644 --- a/docs.it4i/software/intel/intel-suite/intel-compilers.md +++ b/docs.it4i/software/intel/intel-suite/intel-compilers.md @@ -26,7 +26,7 @@ $ icc -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec -openmp myprog.c $ ifort -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec -openmp myprog.f mysubroutines.f -o myprog.x ``` -Read more at <https://software.intel.com/en-us/intel-cplusplus-compiler-16.0-user-and-reference-guide> +Read more [here][a]. 
## Sandy Bridge/Ivy Bridge/Haswell Binary Compatibility @@ -34,3 +34,5 @@ Read more at <https://software.intel.com/en-us/intel-cplusplus-compiler-16.0-use * Using compiler flag (both for Fortran and C): **-xCORE-AVX2**. This will create a binary with AVX2 instructions, specifically for the Haswell processors. Note that the executable will not run on Sandy Bridge/Ivy Bridge nodes. * Using compiler flags (both for Fortran and C): **-xAVX -axCORE-AVX2**. This will generate multiple, feature specific auto-dispatch code paths for Intel® processors, if there is a performance benefit. So this binary will run both on Sandy Bridge/Ivy Bridge and Haswell processors. During runtime it will be decided which path to follow, dependent on which processor you are running on. In general this will result in larger binaries. + +[a]: https://software.intel.com/en-us/intel-cplusplus-compiler-16.0-user-and-reference-guide diff --git a/docs.it4i/software/intel/intel-suite/intel-debugger.md b/docs.it4i/software/intel/intel-suite/intel-debugger.md index db236753532b85e4757f2bd8643c5ba1a5475c5b..e7e11ba2333a50e51f00d3078081a2a6f688fcf9 100644 --- a/docs.it4i/software/intel/intel-suite/intel-debugger.md +++ b/docs.it4i/software/intel/intel-suite/intel-debugger.md @@ -4,7 +4,7 @@ IDB is no longer available since Intel Parallel Studio 2015 ## Debugging Serial Applications -The intel debugger version is available, via module intel/13.5.192. The debugger works for applications compiled with C and C++ compiler and the ifort fortran 77/90/95 compiler. The debugger provides java GUI environment. Use [X display](general/accessing-the-clusters/graphical-user-interface/x-window-system/) for running the GUI. +The intel debugger version is available, via module intel/13.5.192. The debugger works for applications compiled with C and C++ compiler and the ifort fortran 77/90/95 compiler. The debugger provides java GUI environment. Use [X display][1] for running the GUI. 
```console $ ml intel/13.5.192 @@ -18,7 +18,7 @@ The debugger may run in text mode. To debug in text mode, use $ idbc ``` -To debug on the compute nodes, module intel must be loaded. The GUI on compute nodes may be accessed using the same way as in [the GUI section](general/accessing-the-clusters/graphical-user-interface/x-window-system/) +To debug on the compute nodes, module intel must be loaded. The GUI on compute nodes may be accessed using the same way as in [the GUI section][1]. Example: @@ -40,7 +40,7 @@ In this example, we allocate 1 full compute node, compile program myprog.c with ### Small Number of MPI Ranks -For debugging small number of MPI ranks, you may execute and debug each rank in separate xterm terminal (do not forget the [X display](general/accessing-the-clusters/graphical-user-interface/x-window-system/)). Using Intel MPI, this may be done in following way: +For debugging small number of MPI ranks, you may execute and debug each rank in separate xterm terminal (do not forget the [X display][1]). Using Intel MPI, this may be done in following way: ```console $ qsub -q qexp -l select=2:ncpus=24 -X -I @@ -70,4 +70,8 @@ Run the idb debugger in GUI mode. The menu Parallel contains number of tools for ## Further Information -Exhaustive manual on IDB features and usage is published at [Intel website](https://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/). +Exhaustive manual on IDB features and usage is published at [Intel website][a]. 
+ +[1]: ../../../general/accessing-the-clusters/graphical-user-interface/x-window-system.md + +[a]: https://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/ diff --git a/docs.it4i/software/intel/intel-suite/intel-inspector.md b/docs.it4i/software/intel/intel-suite/intel-inspector.md index bd298923813d786c7620c751a3c267983bb2a48d..8b44371ab80046d76680d1a02677c4883fb19345 100644 --- a/docs.it4i/software/intel/intel-suite/intel-inspector.md +++ b/docs.it4i/software/intel/intel-suite/intel-inspector.md @@ -34,6 +34,10 @@ Results obtained from batch mode can be then viewed in the GUI by selecting File ## References -1. [Product page](https://software.intel.com/en-us/intel-inspector-xe) -1. [Documentation and Release Notes](https://software.intel.com/en-us/intel-inspector-xe-support/documentation) -1. [Tutorials](https://software.intel.com/en-us/articles/inspectorxe-tutorials) +1. [Product page][a] +1. [Documentation and Release Notes][b] +1. [Tutorials][c] + +[a]: https://software.intel.com/en-us/intel-inspector-xe +[b]: https://software.intel.com/en-us/intel-inspector-xe-support/documentation +[c]: https://software.intel.com/en-us/articles/inspectorxe-tutorials diff --git a/docs.it4i/software/intel/intel-suite/intel-integrated-performance-primitives.md b/docs.it4i/software/intel/intel-suite/intel-integrated-performance-primitives.md index a47233367e4130177be4db677197a07ec26f9fb2..e2bcbdc69ad618497125c2468353ce210767c8dd 100644 --- a/docs.it4i/software/intel/intel-suite/intel-integrated-performance-primitives.md +++ b/docs.it4i/software/intel/intel-suite/intel-integrated-performance-primitives.md @@ -73,6 +73,10 @@ $ icc testipp.c -o testipp.x -Wl,-rpath=$LIBRARY_PATH -lippi -lipps -lippcore ## Code Samples and Documentation -Intel provides number of [Code Samples for IPP](https://software.intel.com/en-us/articles/code-samples-for-intel-integrated-performance-primitives-library), illustrating use of IPP. 
+Intel provides number of [Code Samples for IPP][a], illustrating use of IPP. -Read full documentation on IPP [on Intel website,](http://software.intel.com/sites/products/search/search.php?q=&x=15&y=6&product=ipp&version=7.1&docos=lin) in particular the [IPP Reference manual.](http://software.intel.com/sites/products/documentation/doclib/ipp_sa/71/ipp_manual/index.htm) +Read full documentation on IPP [on Intel website][b], in particular the [IPP Reference manual][c]. + +[a]: https://software.intel.com/en-us/articles/code-samples-for-intel-integrated-performance-primitives-library +[b]: http://software.intel.com/sites/products/search/search.php?q=&x=15&y=6&product=ipp&version=7.1&docos=lin +[c]: http://software.intel.com/sites/products/documentation/doclib/ipp_sa/71/ipp_manual/index.htm diff --git a/docs.it4i/software/intel/intel-suite/intel-mkl.md b/docs.it4i/software/intel/intel-suite/intel-mkl.md index cc2a80c55c8b3224b470ad0042562b18078eeeaa..62193679abbcd6f6f84c725014a0a92ac5487c11 100644 --- a/docs.it4i/software/intel/intel-suite/intel-mkl.md +++ b/docs.it4i/software/intel/intel-suite/intel-mkl.md @@ -13,7 +13,7 @@ Intel Math Kernel Library (Intel MKL) is a library of math kernel subroutines, e * Data Fitting Library, which provides capabilities for spline-based approximation of functions, derivatives and integrals of functions, and search. * Extended Eigensolver, a shared memory version of an eigensolver based on the Feast Eigenvalue Solver. -For details see the [Intel MKL Reference Manual](http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mklman/index.htm). +For details see the [Intel MKL Reference Manual][a]. Intel MKL is available on the cluster @@ -37,7 +37,7 @@ Intel MKL library provides number of interfaces. The fundamental once are the LP ### Linking -Linking Intel MKL libraries may be complex. Intel [mkl link line advisor](http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor) helps. 
See also [examples](#examples) below. +Linking Intel MKL libraries may be complex. Intel [mkl link line advisor][b] helps. See also [examples][1] below. You will need the mkl module loaded to run the mkl enabled executable. This may be avoided, by compiling library search paths into the executable. Include rpath on the compile line: @@ -109,7 +109,7 @@ In this example, we compile, link and run the cblas_dgemm example, using LP64 in ## MKL and MIC Accelerators -The Intel MKL is capable to automatically offload the computations o the MIC accelerator. See section [Intel Xeon Phi](software/intel/intel-xeon-phi-salomon/) for details. +The Intel MKL is capable to automatically offload the computations o the MIC accelerator. See section [Intel Xeon Phi][2] for details. ## LAPACKE C Interface @@ -117,4 +117,12 @@ MKL includes LAPACKE C Interface to LAPACK. For some reason, although Intel is t ## Further Reading -Read more on [Intel website](http://software.intel.com/en-us/intel-mkl), in particular the [MKL users guide](https://software.intel.com/en-us/intel-mkl/documentation/linux). +Read more on [Intel website][c], in particular the [MKL users guide][d]. 
+ +[1]: #examples +[2]: ../intel-xeon-phi-salomon.md + +[a]: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mklman/index.htm +[b]: http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor +[c]: http://software.intel.com/en-us/intel-mkl +[d]: https://software.intel.com/en-us/intel-mkl/documentation/linux diff --git a/docs.it4i/software/intel/intel-suite/intel-parallel-studio-introduction.md b/docs.it4i/software/intel/intel-suite/intel-parallel-studio-introduction.md index 264b15e9dbe981c8786157cae3bf4f3e3372e3ab..998d1f867a960f03d061eb9d2def66a39a4a9798 100644 --- a/docs.it4i/software/intel/intel-suite/intel-parallel-studio-introduction.md +++ b/docs.it4i/software/intel/intel-suite/intel-parallel-studio-introduction.md @@ -23,7 +23,7 @@ $ icc -v $ ifort -v ``` -Read more at the [Intel Compilers](software/intel/intel-suite/intel-compilers/) page. +Read more at the [Intel Compilers][1] page. ## Intel Debugger @@ -36,7 +36,7 @@ $ ml intel $ idb ``` -Read more at the [Intel Debugger](software/intel/intel-suite/intel-debugger/) page. +Read more at the [Intel Debugger][2] page. ## Intel Math Kernel Library @@ -46,7 +46,7 @@ Intel Math Kernel Library (Intel MKL) is a library of math kernel subroutines, e $ ml imkl ``` -Read more at the [Intel MKL](software/intel/intel-suite/intel-mkl/) page. +Read more at the [Intel MKL][3] page. ## Intel Integrated Performance Primitives @@ -56,7 +56,7 @@ Intel Integrated Performance Primitives, version 7.1.1, compiled for AVX is avai $ ml ipp ``` -Read more at the [Intel IPP](software/intel/intel-suite/intel-integrated-performance-primitives/) page. +Read more at the [Intel IPP][4] page. ## Intel Threading Building Blocks @@ -66,4 +66,10 @@ Intel Threading Building Blocks (Intel TBB) is a library that supports scalable $ ml tbb ``` -Read more at the [Intel TBB](software/intel/intel-suite/intel-tbb/) page. +Read more at the [Intel TBB][5] page. 
+ +[1]: intel-compilers.md +[2]: intel-debugger.md +[3]: intel-mkl.md +[4]: intel-integrated-performance-primitives.md +[5]: intel-tbb.md diff --git a/docs.it4i/software/intel/intel-suite/intel-tbb.md b/docs.it4i/software/intel/intel-suite/intel-tbb.md index d28a92d244539cf04c2b0b349be5a8a421a84d4d..545f69ba561fff22decdd96c522b9e5274bbc66a 100644 --- a/docs.it4i/software/intel/intel-suite/intel-tbb.md +++ b/docs.it4i/software/intel/intel-suite/intel-tbb.md @@ -2,7 +2,7 @@ ## Intel Threading Building Blocks -Intel Threading Building Blocks (Intel TBB) is a library that supports scalable parallel programming using standard ISO C++ code. It does not require special languages or compilers. To use the library, you specify tasks, not threads, and let the library map tasks onto threads in an efficient manner. The tasks are executed by a runtime scheduler and may be offloaded to [MIC accelerator](software/intel//intel-xeon-phi-salomon/). +Intel Threading Building Blocks (Intel TBB) is a library that supports scalable parallel programming using standard ISO C++ code. It does not require special languages or compilers. To use the library, you specify tasks, not threads, and let the library map tasks onto threads in an efficient manner. The tasks are executed by a runtime scheduler and may be offloaded to [MIC accelerator][1]. Intel is available on the cluster. @@ -37,4 +37,8 @@ $ icc -O2 -o primes.x main.cpp primes.cpp -Wl,-rpath=$LIBRARY_PATH -ltbb ## Further Reading -Read more on Intel website, [http://software.intel.com/sites/products/documentation/doclib/tbb_sa/help/index.htm](http://software.intel.com/sites/products/documentation/doclib/tbb_sa/help/index.htm) +Read more on Intel [website][a]. 
+ +[1]: ../intel-xeon-phi-salomon.md + +[a]: http://software.intel.com/sites/products/documentation/doclib/tbb_sa/help/index.htm diff --git a/docs.it4i/software/intel/intel-suite/intel-trace-analyzer-and-collector.md b/docs.it4i/software/intel/intel-suite/intel-trace-analyzer-and-collector.md index 7a86f2d0a2dc5ad9f2ad5bf3287ca0353ef2cac5..c6f264ce5cd915d8d2613695cf48220ff2a97a17 100644 --- a/docs.it4i/software/intel/intel-suite/intel-trace-analyzer-and-collector.md +++ b/docs.it4i/software/intel/intel-suite/intel-trace-analyzer-and-collector.md @@ -21,7 +21,7 @@ The trace will be saved in file myapp.stf in the current directory. ## Viewing Traces -To view and analyze the trace, open ITAC GUI in a [graphical environment](general/accessing-the-clusters/graphical-user-interface/x-window-system/): +To view and analyze the trace, open ITAC GUI in a [graphical environment][1]: ```console $ ml itac/9.1.2.024 @@ -36,5 +36,10 @@ Please refer to Intel documenation about usage of the GUI tool. ## References -1. [Getting Started with Intel® Trace Analyzer and Collector](https://software.intel.com/en-us/get-started-with-itac-for-linux) -1. [Intel® Trace Analyzer and Collector - Documentation](https://software.intel.com/en-us/intel-trace-analyzer) +1. [Getting Started with Intel® Trace Analyzer and Collector][a] +1. 
[Intel® Trace Analyzer and Collector - Documentation][b] + +[1]: ../../../general/accessing-the-clusters/graphical-user-interface/x-window-system.md + +[a]: https://software.intel.com/en-us/get-started-with-itac-for-linux +[b]: https://software.intel.com/en-us/intel-trace-analyzer diff --git a/docs.it4i/software/intel/intel-xeon-phi-anselm.md b/docs.it4i/software/intel/intel-xeon-phi-anselm.md index 009c57fd5554a58e09eec9c0ca33b8955feb1a23..8c38d63df91f27c4c8f2cb5222ab1d31ed54c6e5 100644 --- a/docs.it4i/software/intel/intel-xeon-phi-anselm.md +++ b/docs.it4i/software/intel/intel-xeon-phi-anselm.md @@ -244,12 +244,13 @@ Some interesting compiler flags useful not only for code debugging are: Intel MKL includes an Automatic Offload (AO) feature that enables computationally intensive MKL functions called in user code to benefit from attached Intel Xeon Phi coprocessors automatically and transparently. -Behavioral of automatic offload mode is controlled by functions called within the program or by environmental variables. Complete list of controls is listed [here](http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm). +!!! note + Behavioral of automatic offload mode is controlled by functions called within the program or by environmental variables. Complete list of controls is listed [here][a]. The Automatic Offload may be enabled by either an MKL function call within the code: ```cpp - mkl_mic_enable(); +mkl_mic_enable(); ``` or by setting environment variable @@ -258,7 +259,7 @@ or by setting environment variable $ export MKL_MIC_ENABLE=1 ``` -To get more information about automatic offload refer to "[Using Intel® MKL Automatic Offload on Intel ® Xeon Phi™ Coprocessors](http://software.intel.com/sites/default/files/11MIC42_How_to_Use_MKL_Automatic_Offload_0.pdf)" white paper or [Intel MKL documentation](https://software.intel.com/en-us/articles/intel-math-kernel-library-documentation). 
+To get more information about automatic offload refer to "[Using Intel® MKL Automatic Offload on Intel ® Xeon Phi™ Coprocessors][b]" white paper or [Intel MKL documentation][c]. ### Automatic Offload Example @@ -476,27 +477,27 @@ After executing the complied binary file, following output should be displayed. ```console $ ./capsbasic - Number of available platforms: 1 - Platform names: - [0] Intel(R) OpenCL [Selected] - Number of devices available for each type: - CL_DEVICE_TYPE_CPU: 1 - CL_DEVICE_TYPE_GPU: 0 - CL_DEVICE_TYPE_ACCELERATOR: 1 +Number of available platforms: 1 +Platform names: + [0] Intel(R) OpenCL [Selected] +Number of devices available for each type: + CL_DEVICE_TYPE_CPU: 1 + CL_DEVICE_TYPE_GPU: 0 + CL_DEVICE_TYPE_ACCELERATOR: 1 - ** Detailed information for each device *** +** Detailed information for each device *** - CL_DEVICE_TYPE_CPU[0] - CL_DEVICE_NAME: Intel(R) Xeon(R) CPU E5-2470 0 @ 2.30GHz - CL_DEVICE_AVAILABLE: 1 +CL_DEVICE_TYPE_CPU[0] + CL_DEVICE_NAME: Intel(R) Xeon(R) CPU E5-2470 0 @ 2.30GHz + CL_DEVICE_AVAILABLE: 1 - ... +... - CL_DEVICE_TYPE_ACCELERATOR[0] - CL_DEVICE_NAME: Intel(R) Many Integrated Core Acceleration Card - CL_DEVICE_AVAILABLE: 1 +CL_DEVICE_TYPE_ACCELERATOR[0] + CL_DEVICE_NAME: Intel(R) Many Integrated Core Acceleration Card + CL_DEVICE_AVAILABLE: 1 - ... +... ``` !!! 
note @@ -619,10 +620,10 @@ $ mpirun -np 4 ./mpi-test The output should be similar to: ```console - Hello world from process 1 of 4 on host cn207 - Hello world from process 3 of 4 on host cn207 - Hello world from process 2 of 4 on host cn207 - Hello world from process 0 of 4 on host cn207 +Hello world from process 1 of 4 on host cn207 +Hello world from process 3 of 4 on host cn207 +Hello world from process 2 of 4 on host cn207 +Hello world from process 0 of 4 on host cn207 ``` ### Coprocessor-Only Model @@ -637,15 +638,15 @@ Similarly to execution of OpenMP programs in native mode, since the environmenta ```console $ vim ~/.profile - PS1='[u@h W]$ ' - export PATH=/usr/bin:/usr/sbin:/bin:/sbin +PS1='[u@h W]$ ' +export PATH=/usr/bin:/usr/sbin:/bin:/sbin - #OpenMP - export LD_LIBRARY_PATH=/apps/intel/composer_xe_2013.5.192/compiler/lib/mic:$LD_LIBRARY_PATH +#OpenMP +export LD_LIBRARY_PATH=/apps/intel/composer_xe_2013.5.192/compiler/lib/mic:$LD_LIBRARY_PATH - #Intel MPI - export LD_LIBRARY_PATH=/apps/intel/impi/4.1.1.036/mic/lib/:$LD_LIBRARY_PATH - export PATH=/apps/intel/impi/4.1.1.036/mic/bin/:$PATH +#Intel MPI +export LD_LIBRARY_PATH=/apps/intel/impi/4.1.1.036/mic/lib/:$LD_LIBRARY_PATH +export PATH=/apps/intel/impi/4.1.1.036/mic/bin/:$PATH ``` !!! 
note @@ -673,10 +674,10 @@ $ mpirun -np 4 ./mpi-test-mic The output should be similar to: ```console - Hello world from process 1 of 4 on host cn207-mic0 - Hello world from process 2 of 4 on host cn207-mic0 - Hello world from process 3 of 4 on host cn207-mic0 - Hello world from process 0 of 4 on host cn207-mic0 +Hello world from process 1 of 4 on host cn207-mic0 +Hello world from process 2 of 4 on host cn207-mic0 +Hello world from process 3 of 4 on host cn207-mic0 +Hello world from process 0 of 4 on host cn207-mic0 ``` #### Execution on Host @@ -708,10 +709,10 @@ $ mpirun -genv LD_LIBRARY_PATH /apps/intel/impi/4.1.1.036/mic/lib/ -host mic0 -n The output should be again similar to: ```console - Hello world from process 1 of 4 on host cn207-mic0 - Hello world from process 2 of 4 on host cn207-mic0 - Hello world from process 3 of 4 on host cn207-mic0 - Hello world from process 0 of 4 on host cn207-mic0 +Hello world from process 1 of 4 on host cn207-mic0 +Hello world from process 2 of 4 on host cn207-mic0 +Hello world from process 3 of 4 on host cn207-mic0 +Hello world from process 0 of 4 on host cn207-mic0 ``` !!! 
note @@ -870,14 +871,14 @@ $ mpirun A possible output of the MPI "hello-world" example executed on two hosts and two accelerators is: ```console - Hello world from process 0 of 8 on host cn204 - Hello world from process 1 of 8 on host cn204 - Hello world from process 2 of 8 on host cn204-mic0 - Hello world from process 3 of 8 on host cn204-mic0 - Hello world from process 4 of 8 on host cn205 - Hello world from process 5 of 8 on host cn205 - Hello world from process 6 of 8 on host cn205-mic0 - Hello world from process 7 of 8 on host cn205-mic0 +Hello world from process 0 of 8 on host cn204 +Hello world from process 1 of 8 on host cn204 +Hello world from process 2 of 8 on host cn204-mic0 +Hello world from process 3 of 8 on host cn204-mic0 +Hello world from process 4 of 8 on host cn205 +Hello world from process 5 of 8 on host cn205 +Hello world from process 6 of 8 on host cn205-mic0 +Hello world from process 7 of 8 on host cn205-mic0 ``` !!! note @@ -901,4 +902,9 @@ Each host or accelerator is listed only once per file. User has to specify how m ## Optimization -For more details about optimization techniques read Intel document [Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors](http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization "http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization") +For more details about optimization techniques read Intel document [Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors][d]. 
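The Automatic Offload controls discussed above are plain environment variables, so a job script can enable and observe offload without touching the code. A sketch of a typical setup; `MKL_HOST_WORKDIVISION` is an assumed control taken from the MKL user guide linked above, and the `0.3` value is only a placeholder to tune per workload:

```shell
# Enable MKL Automatic Offload for the whole run
export MKL_MIC_ENABLE=1
# Print offload activity so you can verify work is actually sent to the MIC
export OFFLOAD_REPORT=2
# Keep a fraction of the work on the host (placeholder value, tune as needed)
export MKL_HOST_WORKDIVISION=0.3
```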
+ +[a]: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm +[b]: http://software.intel.com/sites/default/files/11MIC42_How_to_Use_MKL_Automatic_Offload_0.pdf +[c]: https://software.intel.com/en-us/articles/intel-math-kernel-library-documentation +[d]: http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization diff --git a/docs.it4i/software/intel/intel-xeon-phi-salomon.md b/docs.it4i/software/intel/intel-xeon-phi-salomon.md index 982f2d59cd5395edb300efb2d8628d82f68e3e6a..059a018887916aceb52d2a11ca5976f04e3981cb 100644 --- a/docs.it4i/software/intel/intel-xeon-phi-salomon.md +++ b/docs.it4i/software/intel/intel-xeon-phi-salomon.md @@ -152,7 +152,7 @@ For debugging purposes it is also recommended to set environment variable "OFFLO export OFFLOAD_REPORT=3 ``` -A very basic example of code that employs offload programming technique is shown in the next listing. Please note that this code is sequential and utilizes only single core of the accelerator. +A very basic example of code that employs offload programming technique is shown in the next listing. Note that this code is sequential and utilizes only single core of the accelerator. ```cpp $ cat source-offload.cpp @@ -290,7 +290,7 @@ Some interesting compiler flags useful not only for code debugging are: Intel MKL includes an Automatic Offload (AO) feature that enables computationally intensive MKL functions called in user code to benefit from attached Intel Xeon Phi coprocessors automatically and transparently. !!! note - Behavioral of automatic offload mode is controlled by functions called within the program or by environmental variables. Complete list of controls is listed [here](http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm). 
+ Behavioral of automatic offload mode is controlled by functions called within the program or by environmental variables. Complete list of controls is listed [here][a]. The Automatic Offload may be enabled by either an MKL function call within the code: @@ -304,7 +304,7 @@ or by setting environment variable $ export MKL_MIC_ENABLE=1 ``` -To get more information about automatic offload refer to "[Using Intel® MKL Automatic Offload on Intel ® Xeon Phi™ Coprocessors](http://software.intel.com/sites/default/files/11MIC42_How_to_Use_MKL_Automatic_Offload_0.pdf)" white paper or [Intel MKL documentation](https://software.intel.com/en-us/articles/intel-math-kernel-library-documentation). +To get more information about automatic offload refer to "[Using Intel® MKL Automatic Offload on Intel ® Xeon Phi™ Coprocessors][b]" white paper or [Intel MKL documentation][c]. ### Automatic Offload Example @@ -411,7 +411,7 @@ Done ``` !!! note "" - Behavioral of automatic offload mode is controlled by functions called within the program or by environmental variables. Complete list of controls is listed [here](http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm). + Behavioral of automatic offload mode is controlled by functions called within the program or by environmental variables. Complete list of controls is listed [here][d]. ### Automatic Offload Example #2 @@ -510,7 +510,7 @@ mic0 $ export LD_LIBRARY_PATH=/apps/all/icc/2015.3.187-GNU-5.1.0-2.25/composer_x ``` !!! note - Please note that the path exported in the previous example contains path to a specific compiler (here the version is 2015.3.187-GNU-5.1.0-2.25). This version number has to match with the version number of the Intel compiler module that was used to compile the code on the host computer. + The path exported in the previous example contains path to a specific compiler (here the version is 2015.3.187-GNU-5.1.0-2.25). 
This version number has to match with the version number of the Intel compiler module that was used to compile the code on the host computer. For your information the list of libraries and their location required for execution of an OpenMP parallel code on Intel Xeon Phi is: @@ -987,4 +987,10 @@ Each host or accelerator is listed only once per file. User has to specify how m ## Optimization -For more details about optimization techniques read Intel document [Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors](http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization "http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization") +For more details about optimization techniques read Intel document [Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors][e]. + +[a]: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm +[b]: http://software.intel.com/sites/default/files/11MIC42_How_to_Use_MKL_Automatic_Offload_0.pdf +[c]: https://software.intel.com/en-us/articles/intel-math-kernel-library-documentation +[d]: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm +[e]: http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization