Intel Compilers
===============
Several versions of the Intel compilers are available via the intel
module. The suite includes the icc C and C++ compiler and the ifort
Fortran 77/90/95 compiler.
$ module load intel
$ icc -v
$ ifort -v
The Intel compilers can vectorize code using the AVX2 instructions and
support thread parallelization via OpenMP.
For maximum performance on the Salomon cluster compute nodes, compile
your programs with the AVX2 instructions enabled and with a report of
where vectorization was applied. We recommend the following compilation
options for high performance:
$ icc -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec myprog.c mysubroutines.c -o myprog.x
$ ifort -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec myprog.f mysubroutines.f -o myprog.x
In this example, we compile the program with interprocedural
optimizations between source files (-ipo), aggressive loop optimizations
(-O3), and vectorization for AVX2 (-xCORE-AVX2). The -qopt-report1 and
-qopt-report-phase=vec options request the vectorization report.
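
As an illustration of the kind of loop these options target, here is a minimal sketch of what a file such as mysubroutines.c might contain. The function name `scale_add` and its signature are hypothetical and not part of the cluster documentation. The `restrict` qualifiers state that the two arrays do not overlap, which generally makes it easier for a compiler to vectorize the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example routine: computes y[i] = a * x[i] + y[i].
 * "restrict" promises that x and y do not alias each other, so the
 * iterations are independent and the loop is a candidate for
 * vectorization. Whether it is actually vectorized is up to the
 * compiler; the vectorization report is the place to check. */
void scale_add(size_t n, double a,
               const double *restrict x, double *restrict y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The `restrict` qualifier and the loop-scoped index variable need C99 mode, which with icc is typically requested by adding -std=c99 to the command line above.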
The compiler recognizes the omp, simd, vector and ivdep pragmas for
OpenMP parallelization and AVX2 vectorization. Enable OpenMP
parallelization with the **-openmp** compiler switch (spelled -qopenmp
in newer compiler versions, where -openmp is deprecated).
$ icc -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec -openmp myprog.c mysubroutines.c -o myprog.x
$ ifort -ipo -O3 -xCORE-AVX2 -qopt-report1 -qopt-report-phase=vec -openmp myprog.f mysubroutines.f -o myprog.x
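
The following sketch shows how two of these pragmas might appear in C source. The function names `dot` and `add_in_place` are hypothetical, chosen only for illustration. The combined `parallel for simd` construct assumes a compiler with OpenMP 4.0 support. `#pragma ivdep` is an Intel-specific hint, and other compilers may ignore it with a warning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: dot product. With OpenMP enabled, the loop is
 * split across threads and each thread's chunk may be vectorized; the
 * reduction clause combines the per-thread partial sums. Without
 * OpenMP the pragma is ignored and the loop runs serially. */
double dot(long n, const double *x, const double *y)
{
    assert(n >= 0 && x != NULL && y != NULL);
    double sum = 0.0;
    #pragma omp parallel for simd reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

/* Hypothetical example: a[i] += b[i]. The ivdep hint tells the
 * compiler to assume there is no loop-carried dependence, so the
 * caller must make sure a and b do not overlap. */
void add_in_place(long n, double *a, const double *b)
{
    assert(n >= 0 && a != NULL && b != NULL);
    #pragma ivdep
    for (long i = 0; i < n; ++i) {
        a[i] += b[i];
    }
}
```

Because floating-point addition is not associative, the parallel reduction may round slightly differently from a serial loop when the inputs are not exactly representable.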
Read more at <https://software.intel.com/en-us/intel-cplusplus-compiler-16.0-user-and-reference-guide>
Sandy Bridge/Ivy Bridge/Haswell binary compatibility
----------------------------------------------------
Anselm nodes are currently equipped with Sandy Bridge CPUs, while
Salomon compute nodes are equipped with CPUs based on the Haswell
architecture. The UV1 SMP compute server has Ivy Bridge CPUs, which are
equivalent to Sandy Bridge (only manufactured on a smaller process). The
new processors are backward compatible with the Sandy Bridge nodes, so
all programs that ran on the Sandy Bridge processors should also run on
the new Haswell nodes. To get optimal performance out of the Haswell
processors, a program should make use of the AVX2 instructions specific
to these processors. One can do this by recompiling codes with the
compiler flags designated to invoke these instructions. For the Intel
compiler suite, there are two ways of doing this:
- Using the compiler flag (both for Fortran and C): -xCORE-AVX2. This
  will create a binary with AVX2 instructions, specifically for the
  Haswell processors. Note that the executable will not run on Sandy
  Bridge/Ivy Bridge nodes.
- Using the compiler flags (both for Fortran and C): -xAVX
  -axCORE-AVX2. This will generate multiple, feature-specific
  auto-dispatch code paths for Intel® processors, if there is a
  performance benefit. This binary will therefore run on both Sandy
  Bridge/Ivy Bridge and Haswell processors. At runtime, the code path is
  selected according to the processor you are running on. In general,
  this will result in larger binaries.
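
The runtime selection described in the second option can be pictured with a hand-written analogy. This is a sketch of the general idea only, not of what the Intel compiler generates internally. All function names are hypothetical, both code paths are plain C stand-ins, and the CPU check assumes the `__builtin_cpu_supports` builtin provided by GCC and Clang on x86 (it falls back to the baseline path elsewhere).

```c
#include <assert.h>
#include <stddef.h>

typedef double (*sum_fn)(const double *, size_t);

/* Baseline path: stands in for code that runs on every node. */
double sum_baseline(const double *v, size_t n)
{
    assert(v != NULL);
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        s += v[i];
    }
    return s;
}

/* Stand-in for a path tuned for AVX2-capable processors. Here it is
 * ordinary C so that the sketch stays portable. */
double sum_avx2_standin(const double *v, size_t n)
{
    assert(v != NULL);
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        s += v[i];
    }
    return s;
}

/* Returns nonzero if the running CPU reports AVX2 support.
 * Assumption: GCC/Clang builtin on x86; otherwise report "no". */
int cpu_has_avx2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx2") != 0;
#else
    return 0;
#endif
}

/* Picks one code path at runtime, based on the processor. */
sum_fn select_sum(void)
{
    return cpu_has_avx2() ? sum_avx2_standin : sum_baseline;
}
```

A caller would obtain the function pointer once, for example with `sum_fn f = select_sum();`, and then use `f(v, n)` everywhere. Carrying both paths in one executable is also why such binaries tend to be larger.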