Adjust the optimal compilation of codes on Karolina with regard to the availability of a functional...
@@ -24,29 +24,57 @@ see [Lorenz Compiler performance benchmark][a].
To combine optimizations for general CPU code with the most efficient BLAS routines, we recommend using the latest Intel Compiler suite together with Cray's Scientific Library bundle (LIBSCI). The Intel Compiler suite also includes an efficient MPI implementation, the Intel MPI library, which runs over the InfiniBand interconnect.
Most MPI libraries bind ranks automatically. The binding of MPI ranks can be inspected for any MPI implementation by running `$ mpirun -n num_of_ranks numactl --show`. However, if the ranks spawn threads, the binding of these threads should be set via the environment variables described above.
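
As a complement to `numactl --show`, the following is a minimal sketch of a script that each rank can run to report the CPUs it is allowed to use. It is an illustration only, not part of the cluster tooling: the file name `show_binding.py` and the helper names are hypothetical, it assumes a Linux node (where `os.sched_getaffinity` is available), and it assumes the launcher exports one of the rank variables `OMPI_COMM_WORLD_RANK`, `PMI_RANK`, or `SLURM_PROCID`.

```python
import os

# Rank variables exported by common launchers (Open MPI, MPICH/Intel MPI, Slurm).
# Assumption: at least one of these is set in the environment of each rank.
RANK_ENV_VARS = ("OMPI_COMM_WORLD_RANK", "PMI_RANK", "SLURM_PROCID")


def detect_rank(env=None):
    """Return the rank exported by the launcher, or "?" if none is set."""
    env = os.environ if env is None else env
    for name in RANK_ENV_VARS:
        if name in env:
            return env[name]
    return "?"


def compress(cpus):
    """Format CPU ids as a compact range string, e.g. {0, 1, 2, 8} -> "0-2,8"."""
    ids = sorted(cpus)
    parts = []
    i = 0
    while i < len(ids):
        j = i
        # Extend the run while the next id is consecutive.
        while j + 1 < len(ids) and ids[j + 1] == ids[j] + 1:
            j += 1
        parts.append(str(ids[i]) if i == j else f"{ids[i]}-{ids[j]}")
        i = j + 1
    return ",".join(parts)


def current_cpus():
    """Return the CPUs the calling process may run on (empty set if unsupported)."""
    # os.sched_getaffinity exists only on some platforms, notably Linux.
    if hasattr(os, "sched_getaffinity"):
        return os.sched_getaffinity(0)
    return set()


if __name__ == "__main__":
    print(f"rank {detect_rank()}: cpus {compress(current_cpus()) or 'unknown'}")
```

Under these assumptions it could be launched as `$ mpirun -n num_of_ranks python3 show_binding.py`, printing one line per rank. It reports the affinity of the main thread of each rank only, so it does not show how threads spawned later are bound.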