From e6fb2acb312b9eed59f4012deff2b60136795e8a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?David=20Hrb=C3=A1=C4=8D?= <david@hrbac.cz> Date: Thu, 1 Nov 2018 20:49:21 +0100 Subject: [PATCH] Links OK --- .../intel/intel-suite/intel-advisor.md | 10 +- .../intel/intel-suite/intel-compilers.md | 4 +- .../intel/intel-suite/intel-debugger.md | 12 ++- .../intel/intel-suite/intel-inspector.md | 10 +- ...intel-integrated-performance-primitives.md | 8 +- .../software/intel/intel-suite/intel-mkl.md | 16 ++- .../intel-parallel-studio-introduction.md | 16 ++- .../software/intel/intel-suite/intel-tbb.md | 8 +- .../intel-trace-analyzer-and-collector.md | 11 +- .../software/intel/intel-xeon-phi-anselm.md | 100 ++++++++++-------- .../software/intel/intel-xeon-phi-salomon.md | 18 ++-- 11 files changed, 133 insertions(+), 80 deletions(-) diff --git a/docs.it4i/software/intel/intel-suite/intel-advisor.md b/docs.it4i/software/intel/intel-suite/intel-advisor.md index 688deda17..f2602054a 100644 --- a/docs.it4i/software/intel/intel-suite/intel-advisor.md +++ b/docs.it4i/software/intel/intel-suite/intel-advisor.md @@ -26,6 +26,10 @@ In the left pane, you can switch between Vectorization and Threading workflows. ## References -1. [Intel® Advisor 2015 Tutorial: Find Where to Add Parallelism - C++ Sample](https://software.intel.com/en-us/intel-advisor-tutorial-vectorization-windows-cplusplus) -1. [Product page](https://software.intel.com/en-us/intel-advisor-xe) -1. [Documentation](https://software.intel.com/en-us/intel-advisor-2016-user-guide-linux) +1. [Intel® Advisor 2015 Tutorial: Find Where to Add Parallelism - C++ Sample][a] +1. [Product page][b] +1. 
[Documentation][c] + +[a]: https://software.intel.com/en-us/intel-advisor-tutorial-vectorization-windows-cplusplus +[b]: https://software.intel.com/en-us/intel-advisor-xe +[c]: https://software.intel.com/en-us/intel-advisor-2016-user-guide-linux diff --git a/docs.it4i/software/intel/intel-suite/intel-compilers.md b/docs.it4i/software/intel/intel-suite/intel-compilers.md index 853408ca5..66d8a125a 100644 --- a/docs.it4i/software/intel/intel-suite/intel-compilers.md +++ b/docs.it4i/software/intel/intel-suite/intel-compilers.md @@ -26,7 +26,7 @@ $ icc -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec -openmp myprog.c $ ifort -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec -openmp myprog.f mysubroutines.f -o myprog.x ``` -Read more at <https://software.intel.com/en-us/intel-cplusplus-compiler-16.0-user-and-reference-guide> +Read more [here][a]. ## Sandy Bridge/Ivy Bridge/Haswell Binary Compatibility @@ -34,3 +34,5 @@ Read more at <https://software.intel.com/en-us/intel-cplusplus-compiler-16.0-use * Using compiler flag (both for Fortran and C): **-xCORE-AVX2**. This will create a binary with AVX2 instructions, specifically for the Haswell processors. Note that the executable will not run on Sandy Bridge/Ivy Bridge nodes. * Using compiler flags (both for Fortran and C): **-xAVX -axCORE-AVX2**. This will generate multiple, feature specific auto-dispatch code paths for Intel® processors, if there is a performance benefit. So this binary will run both on Sandy Bridge/Ivy Bridge and Haswell processors. During runtime it will be decided which path to follow, dependent on which processor you are running on. In general this will result in larger binaries.
+ +[a]: https://software.intel.com/en-us/intel-cplusplus-compiler-16.0-user-and-reference-guide diff --git a/docs.it4i/software/intel/intel-suite/intel-debugger.md b/docs.it4i/software/intel/intel-suite/intel-debugger.md index db2367535..e7e11ba23 100644 --- a/docs.it4i/software/intel/intel-suite/intel-debugger.md +++ b/docs.it4i/software/intel/intel-suite/intel-debugger.md @@ -4,7 +4,7 @@ IDB is no longer available since Intel Parallel Studio 2015 ## Debugging Serial Applications -The intel debugger version is available, via module intel/13.5.192. The debugger works for applications compiled with C and C++ compiler and the ifort fortran 77/90/95 compiler. The debugger provides java GUI environment. Use [X display](general/accessing-the-clusters/graphical-user-interface/x-window-system/) for running the GUI. +The intel debugger version is available, via module intel/13.5.192. The debugger works for applications compiled with C and C++ compiler and the ifort fortran 77/90/95 compiler. The debugger provides java GUI environment. Use [X display][1] for running the GUI. ```console $ ml intel/13.5.192 @@ -18,7 +18,7 @@ The debugger may run in text mode. To debug in text mode, use $ idbc ``` -To debug on the compute nodes, module intel must be loaded. The GUI on compute nodes may be accessed using the same way as in [the GUI section](general/accessing-the-clusters/graphical-user-interface/x-window-system/) +To debug on the compute nodes, module intel must be loaded. The GUI on compute nodes may be accessed using the same way as in [the GUI section][1]. Example: @@ -40,7 +40,7 @@ In this example, we allocate 1 full compute node, compile program myprog.c with ### Small Number of MPI Ranks -For debugging small number of MPI ranks, you may execute and debug each rank in separate xterm terminal (do not forget the [X display](general/accessing-the-clusters/graphical-user-interface/x-window-system/)). 
Using Intel MPI, this may be done in following way: +For debugging small number of MPI ranks, you may execute and debug each rank in separate xterm terminal (do not forget the [X display][1]). Using Intel MPI, this may be done in following way: ```console $ qsub -q qexp -l select=2:ncpus=24 -X -I @@ -70,4 +70,8 @@ Run the idb debugger in GUI mode. The menu Parallel contains number of tools for ## Further Information -Exhaustive manual on IDB features and usage is published at [Intel website](https://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/). +Exhaustive manual on IDB features and usage is published at [Intel website][a]. + +[1]: ../../../general/accessing-the-clusters/graphical-user-interface/x-window-system.md + +[a]: https://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/ diff --git a/docs.it4i/software/intel/intel-suite/intel-inspector.md b/docs.it4i/software/intel/intel-suite/intel-inspector.md index bd2989238..8b44371ab 100644 --- a/docs.it4i/software/intel/intel-suite/intel-inspector.md +++ b/docs.it4i/software/intel/intel-suite/intel-inspector.md @@ -34,6 +34,10 @@ Results obtained from batch mode can be then viewed in the GUI by selecting File ## References -1. [Product page](https://software.intel.com/en-us/intel-inspector-xe) -1. [Documentation and Release Notes](https://software.intel.com/en-us/intel-inspector-xe-support/documentation) -1. [Tutorials](https://software.intel.com/en-us/articles/inspectorxe-tutorials) +1. [Product page][a] +1. [Documentation and Release Notes][b] +1. 
[Tutorials][c] + +[a]: https://software.intel.com/en-us/intel-inspector-xe +[b]: https://software.intel.com/en-us/intel-inspector-xe-support/documentation +[c]: https://software.intel.com/en-us/articles/inspectorxe-tutorials diff --git a/docs.it4i/software/intel/intel-suite/intel-integrated-performance-primitives.md b/docs.it4i/software/intel/intel-suite/intel-integrated-performance-primitives.md index a47233367..e2bcbdc69 100644 --- a/docs.it4i/software/intel/intel-suite/intel-integrated-performance-primitives.md +++ b/docs.it4i/software/intel/intel-suite/intel-integrated-performance-primitives.md @@ -73,6 +73,10 @@ $ icc testipp.c -o testipp.x -Wl,-rpath=$LIBRARY_PATH -lippi -lipps -lippcore ## Code Samples and Documentation -Intel provides number of [Code Samples for IPP](https://software.intel.com/en-us/articles/code-samples-for-intel-integrated-performance-primitives-library), illustrating use of IPP. +Intel provides a number of [Code Samples for IPP][a], illustrating use of IPP. -Read full documentation on IPP [on Intel website,](http://software.intel.com/sites/products/search/search.php?q=&x=15&y=6&product=ipp&version=7.1&docos=lin) in particular the [IPP Reference manual.](http://software.intel.com/sites/products/documentation/doclib/ipp_sa/71/ipp_manual/index.htm) +Read full documentation on IPP [on Intel website][b], in particular the [IPP Reference manual][c].
+ +[a]: https://software.intel.com/en-us/articles/code-samples-for-intel-integrated-performance-primitives-library +[b]: http://software.intel.com/sites/products/search/search.php?q=&x=15&y=6&product=ipp&version=7.1&docos=lin +[c]: http://software.intel.com/sites/products/documentation/doclib/ipp_sa/71/ipp_manual/index.htm diff --git a/docs.it4i/software/intel/intel-suite/intel-mkl.md b/docs.it4i/software/intel/intel-suite/intel-mkl.md index cc2a80c55..62193679a 100644 --- a/docs.it4i/software/intel/intel-suite/intel-mkl.md +++ b/docs.it4i/software/intel/intel-suite/intel-mkl.md @@ -13,7 +13,7 @@ Intel Math Kernel Library (Intel MKL) is a library of math kernel subroutines, e * Data Fitting Library, which provides capabilities for spline-based approximation of functions, derivatives and integrals of functions, and search. * Extended Eigensolver, a shared memory version of an eigensolver based on the Feast Eigenvalue Solver. -For details see the [Intel MKL Reference Manual](http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mklman/index.htm). +For details see the [Intel MKL Reference Manual][a]. Intel MKL is available on the cluster @@ -37,7 +37,7 @@ Intel MKL library provides number of interfaces. The fundamental once are the LP ### Linking -Linking Intel MKL libraries may be complex. Intel [mkl link line advisor](http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor) helps. See also [examples](#examples) below. +Linking Intel MKL libraries may be complex. Intel [mkl link line advisor][b] helps. See also [examples][1] below. You will need the mkl module loaded to run the mkl enabled executable. This may be avoided, by compiling library search paths into the executable. Include rpath on the compile line: @@ -109,7 +109,7 @@ In this example, we compile, link and run the cblas_dgemm example, using LP64 in ## MKL and MIC Accelerators -The Intel MKL is capable to automatically offload the computations o the MIC accelerator. 
See section [Intel Xeon Phi](software/intel/intel-xeon-phi-salomon/) for details. +The Intel MKL is capable of automatically offloading the computations to the MIC accelerator. See section [Intel Xeon Phi][2] for details. ## LAPACKE C Interface @@ -117,4 +117,12 @@ MKL includes LAPACKE C Interface to LAPACK. For some reason, although Intel is t ## Further Reading -Read more on [Intel website](http://software.intel.com/en-us/intel-mkl), in particular the [MKL users guide](https://software.intel.com/en-us/intel-mkl/documentation/linux). +Read more on [Intel website][c], in particular the [MKL users guide][d]. + +[1]: #examples +[2]: ../intel-xeon-phi-salomon.md + +[a]: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mklman/index.htm +[b]: http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor +[c]: http://software.intel.com/en-us/intel-mkl +[d]: https://software.intel.com/en-us/intel-mkl/documentation/linux diff --git a/docs.it4i/software/intel/intel-suite/intel-parallel-studio-introduction.md b/docs.it4i/software/intel/intel-suite/intel-parallel-studio-introduction.md index 264b15e9d..998d1f867 100644 --- a/docs.it4i/software/intel/intel-suite/intel-parallel-studio-introduction.md +++ b/docs.it4i/software/intel/intel-suite/intel-parallel-studio-introduction.md @@ -23,7 +23,7 @@ $ icc -v $ ifort -v ``` -Read more at the [Intel Compilers](software/intel/intel-suite/intel-compilers/) page. +Read more at the [Intel Compilers][1] page. ## Intel Debugger @@ -36,7 +36,7 @@ $ ml intel $ idb ``` -Read more at the [Intel Debugger](software/intel/intel-suite/intel-debugger/) page. +Read more at the [Intel Debugger][2] page. ## Intel Math Kernel Library @@ -46,7 +46,7 @@ Intel Math Kernel Library (Intel MKL) is a library of math kernel subroutines, e $ ml imkl ``` -Read more at the [Intel MKL](software/intel/intel-suite/intel-mkl/) page. +Read more at the [Intel MKL][3] page. 
## Intel Integrated Performance Primitives @@ -56,7 +56,7 @@ Intel Integrated Performance Primitives, version 7.1.1, compiled for AVX is avai $ ml ipp ``` -Read more at the [Intel IPP](software/intel/intel-suite/intel-integrated-performance-primitives/) page. +Read more at the [Intel IPP][4] page. ## Intel Threading Building Blocks @@ -66,4 +66,10 @@ Intel Threading Building Blocks (Intel TBB) is a library that supports scalable $ ml tbb ``` -Read more at the [Intel TBB](software/intel/intel-suite/intel-tbb/) page. +Read more at the [Intel TBB][5] page. + +[1]: intel-compilers.md +[2]: intel-debugger.md +[3]: intel-mkl.md +[4]: intel-integrated-performance-primitives.md +[5]: intel-tbb.md diff --git a/docs.it4i/software/intel/intel-suite/intel-tbb.md b/docs.it4i/software/intel/intel-suite/intel-tbb.md index d28a92d24..545f69ba5 100644 --- a/docs.it4i/software/intel/intel-suite/intel-tbb.md +++ b/docs.it4i/software/intel/intel-suite/intel-tbb.md @@ -2,7 +2,7 @@ ## Intel Threading Building Blocks -Intel Threading Building Blocks (Intel TBB) is a library that supports scalable parallel programming using standard ISO C++ code. It does not require special languages or compilers. To use the library, you specify tasks, not threads, and let the library map tasks onto threads in an efficient manner. The tasks are executed by a runtime scheduler and may be offloaded to [MIC accelerator](software/intel//intel-xeon-phi-salomon/). +Intel Threading Building Blocks (Intel TBB) is a library that supports scalable parallel programming using standard ISO C++ code. It does not require special languages or compilers. To use the library, you specify tasks, not threads, and let the library map tasks onto threads in an efficient manner. The tasks are executed by a runtime scheduler and may be offloaded to [MIC accelerator][1]. Intel is available on the cluster. 
@@ -37,4 +37,8 @@ $ icc -O2 -o primes.x main.cpp primes.cpp -Wl,-rpath=$LIBRARY_PATH -ltbb ## Further Reading -Read more on Intel website, [http://software.intel.com/sites/products/documentation/doclib/tbb_sa/help/index.htm](http://software.intel.com/sites/products/documentation/doclib/tbb_sa/help/index.htm) +Read more on Intel [website][a]. + +[1]: ../intel-xeon-phi-salomon.md + +[a]: http://software.intel.com/sites/products/documentation/doclib/tbb_sa/help/index.htm diff --git a/docs.it4i/software/intel/intel-suite/intel-trace-analyzer-and-collector.md b/docs.it4i/software/intel/intel-suite/intel-trace-analyzer-and-collector.md index 7a86f2d0a..c6f264ce5 100644 --- a/docs.it4i/software/intel/intel-suite/intel-trace-analyzer-and-collector.md +++ b/docs.it4i/software/intel/intel-suite/intel-trace-analyzer-and-collector.md @@ -21,7 +21,7 @@ The trace will be saved in file myapp.stf in the current directory. ## Viewing Traces -To view and analyze the trace, open ITAC GUI in a [graphical environment](general/accessing-the-clusters/graphical-user-interface/x-window-system/): +To view and analyze the trace, open ITAC GUI in a [graphical environment][1]: ```console $ ml itac/9.1.2.024 @@ -36,5 +36,10 @@ Please refer to Intel documenation about usage of the GUI tool. ## References -1. [Getting Started with Intel® Trace Analyzer and Collector](https://software.intel.com/en-us/get-started-with-itac-for-linux) -1. [Intel® Trace Analyzer and Collector - Documentation](https://software.intel.com/en-us/intel-trace-analyzer) +1. [Getting Started with Intel® Trace Analyzer and Collector][a] +1. 
[Intel® Trace Analyzer and Collector - Documentation][b] + +[1]: ../../../general/accessing-the-clusters/graphical-user-interface/x-window-system.md + +[a]: https://software.intel.com/en-us/get-started-with-itac-for-linux +[b]: https://software.intel.com/en-us/intel-trace-analyzer diff --git a/docs.it4i/software/intel/intel-xeon-phi-anselm.md b/docs.it4i/software/intel/intel-xeon-phi-anselm.md index 009c57fd5..8c38d63df 100644 --- a/docs.it4i/software/intel/intel-xeon-phi-anselm.md +++ b/docs.it4i/software/intel/intel-xeon-phi-anselm.md @@ -244,12 +244,13 @@ Some interesting compiler flags useful not only for code debugging are: Intel MKL includes an Automatic Offload (AO) feature that enables computationally intensive MKL functions called in user code to benefit from attached Intel Xeon Phi coprocessors automatically and transparently. -Behavioral of automatic offload mode is controlled by functions called within the program or by environmental variables. Complete list of controls is listed [here](http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm). +!!! note +    Behavior of the automatic offload mode is controlled by functions called within the program or by environment variables. A complete list of controls is listed [here][a]. The Automatic Offload may be enabled by either an MKL function call within the code: ```cpp - mkl_mic_enable(); +mkl_mic_enable(); ``` or by setting environment variable @@ -258,7 +259,7 @@ or by setting environment variable $ export MKL_MIC_ENABLE=1 ``` -To get more information about automatic offload refer to "[Using Intel® MKL Automatic Offload on Intel ® Xeon Phi™ Coprocessors](http://software.intel.com/sites/default/files/11MIC42_How_to_Use_MKL_Automatic_Offload_0.pdf)" white paper or [Intel MKL documentation](https://software.intel.com/en-us/articles/intel-math-kernel-library-documentation). 
+To get more information about automatic offload refer to "[Using Intel® MKL Automatic Offload on Intel ® Xeon Phi™ Coprocessors][b]" white paper or [Intel MKL documentation][c]. ### Automatic Offload Example @@ -476,27 +477,27 @@ After executing the complied binary file, following output should be displayed. ```console $ ./capsbasic - Number of available platforms: 1 - Platform names: - [0] Intel(R) OpenCL [Selected] - Number of devices available for each type: - CL_DEVICE_TYPE_CPU: 1 - CL_DEVICE_TYPE_GPU: 0 - CL_DEVICE_TYPE_ACCELERATOR: 1 +Number of available platforms: 1 +Platform names: + [0] Intel(R) OpenCL [Selected] +Number of devices available for each type: + CL_DEVICE_TYPE_CPU: 1 + CL_DEVICE_TYPE_GPU: 0 + CL_DEVICE_TYPE_ACCELERATOR: 1 - ** Detailed information for each device *** +** Detailed information for each device *** - CL_DEVICE_TYPE_CPU[0] - CL_DEVICE_NAME: Intel(R) Xeon(R) CPU E5-2470 0 @ 2.30GHz - CL_DEVICE_AVAILABLE: 1 +CL_DEVICE_TYPE_CPU[0] + CL_DEVICE_NAME: Intel(R) Xeon(R) CPU E5-2470 0 @ 2.30GHz + CL_DEVICE_AVAILABLE: 1 - ... +... - CL_DEVICE_TYPE_ACCELERATOR[0] - CL_DEVICE_NAME: Intel(R) Many Integrated Core Acceleration Card - CL_DEVICE_AVAILABLE: 1 +CL_DEVICE_TYPE_ACCELERATOR[0] + CL_DEVICE_NAME: Intel(R) Many Integrated Core Acceleration Card + CL_DEVICE_AVAILABLE: 1 - ... +... ``` !!! 
note @@ -619,10 +620,10 @@ $ mpirun -np 4 ./mpi-test The output should be similar to: ```console - Hello world from process 1 of 4 on host cn207 - Hello world from process 3 of 4 on host cn207 - Hello world from process 2 of 4 on host cn207 - Hello world from process 0 of 4 on host cn207 +Hello world from process 1 of 4 on host cn207 +Hello world from process 3 of 4 on host cn207 +Hello world from process 2 of 4 on host cn207 +Hello world from process 0 of 4 on host cn207 ``` ### Coprocessor-Only Model @@ -637,15 +638,15 @@ Similarly to execution of OpenMP programs in native mode, since the environmenta ```console $ vim ~/.profile - PS1='[u@h W]$ ' - export PATH=/usr/bin:/usr/sbin:/bin:/sbin +PS1='[u@h W]$ ' +export PATH=/usr/bin:/usr/sbin:/bin:/sbin - #OpenMP - export LD_LIBRARY_PATH=/apps/intel/composer_xe_2013.5.192/compiler/lib/mic:$LD_LIBRARY_PATH +#OpenMP +export LD_LIBRARY_PATH=/apps/intel/composer_xe_2013.5.192/compiler/lib/mic:$LD_LIBRARY_PATH - #Intel MPI - export LD_LIBRARY_PATH=/apps/intel/impi/4.1.1.036/mic/lib/:$LD_LIBRARY_PATH - export PATH=/apps/intel/impi/4.1.1.036/mic/bin/:$PATH +#Intel MPI +export LD_LIBRARY_PATH=/apps/intel/impi/4.1.1.036/mic/lib/:$LD_LIBRARY_PATH +export PATH=/apps/intel/impi/4.1.1.036/mic/bin/:$PATH ``` !!! 
note @@ -673,10 +674,10 @@ $ mpirun -np 4 ./mpi-test-mic The output should be similar to: ```console - Hello world from process 1 of 4 on host cn207-mic0 - Hello world from process 2 of 4 on host cn207-mic0 - Hello world from process 3 of 4 on host cn207-mic0 - Hello world from process 0 of 4 on host cn207-mic0 +Hello world from process 1 of 4 on host cn207-mic0 +Hello world from process 2 of 4 on host cn207-mic0 +Hello world from process 3 of 4 on host cn207-mic0 +Hello world from process 0 of 4 on host cn207-mic0 ``` #### Execution on Host @@ -708,10 +709,10 @@ $ mpirun -genv LD_LIBRARY_PATH /apps/intel/impi/4.1.1.036/mic/lib/ -host mic0 -n The output should be again similar to: ```console - Hello world from process 1 of 4 on host cn207-mic0 - Hello world from process 2 of 4 on host cn207-mic0 - Hello world from process 3 of 4 on host cn207-mic0 - Hello world from process 0 of 4 on host cn207-mic0 +Hello world from process 1 of 4 on host cn207-mic0 +Hello world from process 2 of 4 on host cn207-mic0 +Hello world from process 3 of 4 on host cn207-mic0 +Hello world from process 0 of 4 on host cn207-mic0 ``` !!! 
note @@ -870,14 +871,14 @@ $ mpirun A possible output of the MPI "hello-world" example executed on two hosts and two accelerators is: ```console - Hello world from process 0 of 8 on host cn204 - Hello world from process 1 of 8 on host cn204 - Hello world from process 2 of 8 on host cn204-mic0 - Hello world from process 3 of 8 on host cn204-mic0 - Hello world from process 4 of 8 on host cn205 - Hello world from process 5 of 8 on host cn205 - Hello world from process 6 of 8 on host cn205-mic0 - Hello world from process 7 of 8 on host cn205-mic0 +Hello world from process 0 of 8 on host cn204 +Hello world from process 1 of 8 on host cn204 +Hello world from process 2 of 8 on host cn204-mic0 +Hello world from process 3 of 8 on host cn204-mic0 +Hello world from process 4 of 8 on host cn205 +Hello world from process 5 of 8 on host cn205 +Hello world from process 6 of 8 on host cn205-mic0 +Hello world from process 7 of 8 on host cn205-mic0 ``` !!! note @@ -901,4 +902,9 @@ Each host or accelerator is listed only once per file. User has to specify how m ## Optimization -For more details about optimization techniques read Intel document [Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors](http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization "http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization") +For more details about optimization techniques, read the Intel document [Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors][d]. 
+ +[a]: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm +[b]: http://software.intel.com/sites/default/files/11MIC42_How_to_Use_MKL_Automatic_Offload_0.pdf +[c]: https://software.intel.com/en-us/articles/intel-math-kernel-library-documentation +[d]: http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization diff --git a/docs.it4i/software/intel/intel-xeon-phi-salomon.md b/docs.it4i/software/intel/intel-xeon-phi-salomon.md index 982f2d59c..059a01888 100644 --- a/docs.it4i/software/intel/intel-xeon-phi-salomon.md +++ b/docs.it4i/software/intel/intel-xeon-phi-salomon.md @@ -152,7 +152,7 @@ For debugging purposes it is also recommended to set environment variable "OFFLO export OFFLOAD_REPORT=3 ``` -A very basic example of code that employs offload programming technique is shown in the next listing. Please note that this code is sequential and utilizes only single core of the accelerator. +A very basic example of code that employs the offload programming technique is shown in the next listing. Note that this code is sequential and utilizes only a single core of the accelerator. ```cpp $ cat source-offload.cpp @@ -290,7 +290,7 @@ Some interesting compiler flags useful not only for code debugging are: Intel MKL includes an Automatic Offload (AO) feature that enables computationally intensive MKL functions called in user code to benefit from attached Intel Xeon Phi coprocessors automatically and transparently. !!! note -    Behavioral of automatic offload mode is controlled by functions called within the program or by environmental variables. Complete list of controls is listed [here](http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm). 
+    Behavior of the automatic offload mode is controlled by functions called within the program or by environment variables. A complete list of controls is listed [here][a]. The Automatic Offload may be enabled by either an MKL function call within the code: @@ -304,7 +304,7 @@ or by setting environment variable $ export MKL_MIC_ENABLE=1 ``` -To get more information about automatic offload refer to "[Using Intel® MKL Automatic Offload on Intel ® Xeon Phi™ Coprocessors](http://software.intel.com/sites/default/files/11MIC42_How_to_Use_MKL_Automatic_Offload_0.pdf)" white paper or [Intel MKL documentation](https://software.intel.com/en-us/articles/intel-math-kernel-library-documentation). +To get more information about automatic offload refer to "[Using Intel® MKL Automatic Offload on Intel ® Xeon Phi™ Coprocessors][b]" white paper or [Intel MKL documentation][c]. ### Automatic Offload Example @@ -411,7 +411,7 @@ Done ``` !!! note "" -    Behavioral of automatic offload mode is controlled by functions called within the program or by environmental variables. Complete list of controls is listed [here](http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm). +    Behavior of the automatic offload mode is controlled by functions called within the program or by environment variables. A complete list of controls is listed [here][d]. ### Automatic Offload Example #2 @@ -510,7 +510,7 @@ mic0 $ export LD_LIBRARY_PATH=/apps/all/icc/2015.3.187-GNU-5.1.0-2.25/composer_x ``` !!! note -    Please note that the path exported in the previous example contains path to a specific compiler (here the version is 2015.3.187-GNU-5.1.0-2.25). This version number has to match with the version number of the Intel compiler module that was used to compile the code on the host computer. +    The path exported in the previous example contains the path to a specific compiler (here the version is 2015.3.187-GNU-5.1.0-2.25). 
This version number has to match the version number of the Intel compiler module that was used to compile the code on the host computer. For your information the list of libraries and their location required for execution of an OpenMP parallel code on Intel Xeon Phi is: @@ -987,4 +987,10 @@ Each host or accelerator is listed only once per file. User has to specify how m ## Optimization -For more details about optimization techniques read Intel document [Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors](http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization "http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization") +For more details about optimization techniques, read the Intel document [Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors][e]. + +[a]: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm +[b]: http://software.intel.com/sites/default/files/11MIC42_How_to_Use_MKL_Automatic_Offload_0.pdf +[c]: https://software.intel.com/en-us/articles/intel-math-kernel-library-documentation +[d]: http://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_userguide_lnx/GUID-3DC4FC7D-A1E4-423D-9C0C-06AB265FFA86.htm +[e]: http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-1-optimization -- GitLab
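
Note for reviewers: the change applied throughout this patch is mechanical — each inline Markdown link `[text](url)` becomes a reference link `[text][label]`, with the `[label]: url` definitions collected at the end of the file. The conversion can be sketched in Python as follows; the helper `to_reference_links` and its simplistic regex are illustrative only (not the tooling actually used), and it ignores images, code spans, and link titles:

```python
import re

def to_reference_links(markdown, labels):
    """Rewrite inline links [text](url) as [text][label] and append
    the matching [label]: url definitions at the end of the text.
    `labels` supplies the label names in order, e.g. ["a", "b", "c"]."""
    defs = []

    def repl(match):
        text, url = match.group(1), match.group(2)
        label = labels.pop(0)              # consume the next free label
        defs.append(f"[{label}]: {url}")   # collect the definition line
        return f"[{text}][{label}]"

    body = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", repl, markdown)
    return body + "\n\n" + "\n".join(defs) + "\n"

print(to_reference_links("Read more at the [Product page](https://example.com/xe).", ["a"]))
```

A sketch like this also makes the two bugs fixed above easy to spot: a definition written as `[c]: (url` or a sentence losing its closing `)` would not survive such a round-trip.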