Commit 81f2bad5 authored by Lukáš Krupčík's avatar Lukáš Krupčík

merge

parents 88895ad8 18c5888c
......@@ -41,6 +41,7 @@ TotalView
Valgrind
ParaView
OpenFOAM
MAX_FAIRSHARE
MPI4Py
MPICH2
PETSc
......@@ -94,6 +95,202 @@ AnyConnect
X11
backfilling
backfilled
SCP
Lustre
QDR
TFLOP
ncpus
myjob
pernode
mpiprocs
ompthreads
qprace
runtime
SVS
ppn
Multiphysics
aeroacoustics
turbomachinery
CFD
LS-DYNA
APDL
MAPDL
multiphysics
AUTODYN
RSM
Molpro
initio
parallelization
NWChem
SCF
ISV
profiler
Pthreads
profilers
OTF
PAPI
PCM
uncore
pre-processing
prepend
CXX
prepended
POMP2
Memcheck
unaddressable
OTF2
GPI-2
GASPI
GPI
MKL
IPP
TBB
GSL
Omics
VNC
Scalasca
IFORT
interprocedural
IDB
cloop
qcow
qcow2
vmdk
vdi
virtio
paravirtualized
Gbit
tap0
UDP
TCP
preload
qfat
Rmpi
DCT
datasets
dataset
preconditioners
partitioners
PARDISO
PaStiX
SuiteSparse
SuperLU
ExodusII
NetCDF
ParMETIS
multigrid
HYPRE
SPAI
Epetra
EpetraExt
Tpetra
64-bit
Belos
GMRES
Amesos
IFPACK
preconditioner
Teuchos
Makefiles
SAXPY
NVCC
VCF
HGMD
HUMSAVAR
ClinVar
indels
CIBERER
exomes
tmp
SSHFS
RSYNC
unmount
Cygwin
CygwinX
RFB
TightVNC
TigerVNC
GUIs
XLaunch
UTF-8
numpad
PuTTYgen
OpenSSH
IE11
x86
r21u01n577
7120P
interprocessor
IPN
toolchains
toolchain
APIs
easyblocks
GM200
GeForce
GTX
IRUs
ASIC
backplane
ICEX
IRU
PFLOP
T950B
ifconfig
inet
addr
checkbox
appfile
programmatically
http
https
filesystem
phono3py
HDF
splitted
automize
llvm
PGI
GUPC
BUPC
IBV
Aislinn
nondeterminism
stdout
stderr
i.e.
pthreads
uninitialised
broadcasted
ITAC
hotspots
Bioinformatics
semiempirical
DFT
polyfill
ES6
HTML5Rocks
minifiers
CommonJS
PhantomJS
bundlers
Browserify
versioning
isflowing
ispaused
NPM
sublicense
Streams2
Streams3
blogpost
GPG
mississippi
Uint8Arrays
Uint8Array
endianness
styleguide
noop
MkDocs
- docs.it4i/anselm-cluster-documentation/environment-and-modules.md
MODULEPATH
bashrc
......@@ -127,6 +324,7 @@ Rmax
E5-2665
E5-2470
P5110
isw
- docs.it4i/anselm-cluster-documentation/introduction.md
RedHat
- docs.it4i/anselm-cluster-documentation/job-priority.md
......@@ -134,6 +332,8 @@ walltime
qexp
_List.fairshare
_time
_FAIRSHARE
1E6
- docs.it4i/anselm-cluster-documentation/job-submission-and-execution.md
15209.srv11
qsub
......@@ -154,6 +354,15 @@ jobscript
cn108
cn109
cn110
Name0
cn17
_NODEFILE
_O
_WORKDIR
mympiprog.x
_JOBID
myprog.x
openmpi
- docs.it4i/anselm-cluster-documentation/network.md
ib0
- docs.it4i/anselm-cluster-documentation/prace.md
......@@ -161,14 +370,19 @@ PRACE
qfree
it4ifree
it4i.portal.clients
prace
1h
- docs.it4i/anselm-cluster-documentation/shell-and-data-access.md
VPN
- docs.it4i/anselm-cluster-documentation/software/ansys/ansys-cfx.md
ANSYS
CFX
cfx.pbs
_r
ane3fl
- docs.it4i/anselm-cluster-documentation/software/ansys/ansys-mechanical-apdl.md
mapdl.pbs
_dy
- docs.it4i/anselm-cluster-documentation/software/ansys/ls-dyna.md
HPC
lsdyna.pbs
......@@ -183,9 +397,25 @@ Makefile
- docs.it4i/anselm-cluster-documentation/software/gpi2.md
gcc
cn79
helloworld
_gpi.c
ibverbs
gaspi
_logger
- docs.it4i/anselm-cluster-documentation/software/intel-suite/intel-compilers.md
Haswell
CPUs
ipo
O3
vec
xAVX
omp
simd
ivdep
pragmas
openmp
xCORE-AVX2
axCORE-AVX2
- docs.it4i/anselm-cluster-documentation/software/kvirtualization.md
rc.local
runlevel
......@@ -197,6 +427,8 @@ VDE
smb.conf
TMPDIR
run.bat.
slirp
NATs
- docs.it4i/anselm-cluster-documentation/software/mpi/mpi4py-mpi-for-python.md
NumPy
- docs.it4i/anselm-cluster-documentation/software/numerical-languages/matlab_1314.md
......@@ -205,33 +437,73 @@ matlabcode.m
output.out
matlabcodefile
sched
_feature
- docs.it4i/anselm-cluster-documentation/software/numerical-languages/matlab.md
UV2000
maxNumCompThreads
SalomonPBSPro
- docs.it4i/anselm-cluster-documentation/software/numerical-languages/octave.md
_THREADS
_NUM
- docs.it4i/anselm-cluster-documentation/software/numerical-libraries/trilinos.md
CMake-aware
Makefile.export
_PACKAGE
_CXX
_COMPILER
_INCLUDE
_DIRS
_LIBRARY
- docs.it4i/anselm-cluster-documentation/software/ansys/ansys-ls-dyna.md
ansysdyna.pbs
- docs.it4i/anselm-cluster-documentation/software/ansys/ansys.md
svsfem.cz
_
- docs.it4i/anselm-cluster-documentation/software/debuggers/valgrind.md
libmpiwrap-amd64-linux
O0
valgrind
malloc
_PRELOAD
- docs.it4i/anselm-cluster-documentation/software/numerical-libraries/magma-for-intel-xeon-phi.md
cn204
_LIBS
MAGMAROOT
_magma
_server
_anselm
_from
_mic.sh
_dgetrf
_mic
_03.pdf
- docs.it4i/anselm-cluster-documentation/software/paraview.md
cn77
localhost
v4.0.1
- docs.it4i/anselm-cluster-documentation/storage.md
ssh.du1.cesnet.cz
Plzen
ssh.du2.cesnet.cz
ssh.du3.cesnet.cz
tier1
_home
_cache
_tape
- docs.it4i/salomon/environment-and-modules.md
icc
ictce
ifort
imkl
intel
gompi
goolf
BLACS
iompi
iccifort
- docs.it4i/salomon/hardware-overview.md
HW
E5-4627v2
- docs.it4i/salomon/job-submission-and-execution.md
15209.isrv5
r21u01n577
......@@ -256,6 +528,7 @@ mkdir
mympiprog.x
mpiexec
myprog.x
r4i7n0.ib0.smc.salomon.it4i.cz
- docs.it4i/salomon/7d-enhanced-hypercube.md
cns1
cns576
......@@ -264,9 +537,266 @@ r4i7n17
cns577
cns1008
r37u31n1008
7D
- docs.it4i/anselm-cluster-documentation/resources-allocation-policy.md
qsub
it4ifree
it4i.portal.clients
x86
x64
- docs.it4i/anselm-cluster-documentation/software/ansys/ansys-fluent.md
anslic
_admin
- docs.it4i/anselm-cluster-documentation/software/chemistry/nwchem.md
_DIR
- docs.it4i/anselm-cluster-documentation/software/comsol-multiphysics.md
EDU
comsol
_matlab.pbs
_job.m
mphstart
- docs.it4i/anselm-cluster-documentation/software/debuggers/allinea-performance-reports.md
perf-report
perf
txt
html
mympiprog
_32p
- docs.it4i/anselm-cluster-documentation/software/debuggers/intel-vtune-amplifier.md
Hotspots
- docs.it4i/anselm-cluster-documentation/software/debuggers/scalasca.md
scorep
- docs.it4i/anselm-cluster-documentation/software/isv_licenses.md
edu
ansys
_features
_state.txt
f1
matlab
acfd
_ansys
_acfd
_aa
_comsol
HEATTRANSFER
_HEATTRANSFER
COMSOLBATCH
_COMSOLBATCH
STRUCTURALMECHANICS
_STRUCTURALMECHANICS
_matlab
_Toolbox
_Image
_Distrib
_Comp
_Engine
_Acquisition
pmode
matlabpool
- docs.it4i/anselm-cluster-documentation/software/mpi/mpi.md
mpirun
BLAS1
FFT
KMP
_AFFINITY
GOMP
_CPU
bullxmpi-1
mpich2
- docs.it4i/anselm-cluster-documentation/software/mpi/Running_OpenMPI.md
bysocket
bycore
- docs.it4i/anselm-cluster-documentation/software/numerical-libraries/fftw.md
gcc3.3.3
pthread
fftw3
lfftw3
_threads-lfftw3
_omp
icc3.3.3
FFTW2
gcc2.1.5
fftw2
lfftw
_threads
icc2.1.5
fftw-mpi3
_mpi
fftw3-mpi
fftw2-mpi
IntelMPI
- docs.it4i/anselm-cluster-documentation/software/numerical-libraries/gsl.md
dwt.c
mkl
lgsl
- docs.it4i/anselm-cluster-documentation/software/numerical-libraries/hdf5.md
icc
hdf5
_INC
_SHLIB
_CPP
_LIB
_F90
gcc49
- docs.it4i/anselm-cluster-documentation/software/numerical-libraries/petsc.md
_Dist
- docs.it4i/anselm-cluster-documentation/software/nvidia-cuda.md
lcublas
- docs.it4i/anselm-cluster-documentation/software/operating-system.md
6.x
- docs.it4i/get-started-with-it4innovations/accessing-the-clusters/graphical-user-interface/cygwin-and-x11-forwarding.md
startxwin
cygwin64binXWin.exe
tcp
- docs.it4i/get-started-with-it4innovations/accessing-the-clusters/graphical-user-interface/x-window-system.md
Xming
XWin.exe.
- docs.it4i/get-started-with-it4innovations/accessing-the-clusters/shell-access-and-data-transfer/pageant.md
_rsa.ppk
- docs.it4i/get-started-with-it4innovations/accessing-the-clusters/shell-access-and-data-transfer/puttygen.md
_keys
organization.example.com
_rsa
- docs.it4i/get-started-with-it4innovations/accessing-the-clusters/shell-access-and-data-transfer/vpn-connection-fail-in-win-8.1.md
vpnui.exe
- docs.it4i/salomon/ib-single-plane-topology.md
36-port
Mcell.pdf
r21-r38
nodes.pdf
- docs.it4i/salomon/introduction.md
E5-2680v3
- docs.it4i/salomon/network.md
r4i1n0
r4i1n1
r4i1n2
r4i1n3
ip
- docs.it4i/salomon/software/ansys/setting-license-preferences.md
ansys161
- docs.it4i/salomon/software/ansys/workbench.md
mpifile.txt
solvehandlers.xml
- docs.it4i/salomon/software/chemistry/phono3py.md
vasprun.xml
disp-XXXXX
disp
_fc3.yaml
ir
_grid
_points.yaml
gofree-cond1
- docs.it4i/salomon/software/compilers.md
HPF
- docs.it4i/salomon/software/comsol/licensing-and-available-versions.md
ver
- docs.it4i/salomon/software/debuggers/aislinn.md
test.cpp
- docs.it4i/salomon/software/debuggers/intel-vtune-amplifier.md
vtune
_update1
- docs.it4i/salomon/software/debuggers/valgrind.md
EBROOTVALGRIND
- docs.it4i/salomon/software/intel-suite/intel-advisor.md
O2
- docs.it4i/salomon/software/intel-suite/intel-compilers.md
UV1
- docs.it4i/salomon/software/numerical-languages/octave.md
octcode.m
mkoctfile
- docs.it4i/software/orca.md
pdf
- node_modules/es6-promise/README.md
rsvp.js
es6-promise
es6-promise-min
Node.js
testem
- node_modules/spawn-sync/lib/json-buffer/README.md
node.js
- node_modules/spawn-sync/node_modules/concat-stream/node_modules/readable-stream/doc/wg-meetings/2015-01-30.md
WG
domenic
mikeal
io.js
sam
calvin
whatwg
compat
mathias
isaac
chris
- node_modules/spawn-sync/node_modules/concat-stream/node_modules/readable-stream/node_modules/core-util-is/README.md
core-util-is
v0.12.
- node_modules/spawn-sync/node_modules/concat-stream/node_modules/readable-stream/node_modules/isarray/README.md
isarray
Gruber
julian
juliangruber.com
NONINFRINGEMENT
- node_modules/spawn-sync/node_modules/concat-stream/node_modules/readable-stream/node_modules/process-nextick-args/license.md
Metcalf
- node_modules/spawn-sync/node_modules/concat-stream/node_modules/readable-stream/node_modules/process-nextick-args/readme.md
process-nextick-args
process.nextTick
- node_modules/spawn-sync/node_modules/concat-stream/node_modules/readable-stream/node_modules/string_decoder/README.md
_decoder.js
Joyent
joyent
repo
- node_modules/spawn-sync/node_modules/concat-stream/node_modules/readable-stream/node_modules/util-deprecate/History.md
kumavis
jsdocs
- node_modules/spawn-sync/node_modules/concat-stream/node_modules/readable-stream/node_modules/util-deprecate/README.md
util-deprecate
Rajlich
- node_modules/spawn-sync/node_modules/concat-stream/node_modules/readable-stream/README.md
v7.0.0
userland
chrisdickinson
christopher.s.dickinson
gmail.com
9554F04D7259F04124DE6B476D5A82AC7E37093B
calvinmetcalf
calvin.metcalf
F3EF5F62A87FC27A22E643F714CE4FF5015AA242
Vagg
rvagg
vagg.org
DD8F2338BAE7501E3DD5AC78C273792F7D83545D
sonewman
newmansam
outlook.com
Buus
mafintosh
mathiasbuus
Denicola
domenic.me
Matteo
Collina
mcollina
matteo.collina
3ABC01543F22DD2239285CDD818674489FBC127E
- node_modules/spawn-sync/node_modules/concat-stream/readme.md
concat-stream
concat
cb
- node_modules/spawn-sync/node_modules/os-shim/README.md
0.10.x
os.tmpdir
os.endianness
os.EOL
os.platform
os.arch
0.4.x
Aparicio
Adesis
Netlife
S.L
- node_modules/spawn-sync/node_modules/try-thread-sleep/node_modules/thread-sleep/README.md
node-pre-gyp
npm
- node_modules/spawn-sync/README.md
iojs
# User documentation
This project contains the IT4Innovations user documentation sources.
## Environments
......@@ -42,11 +42,11 @@ $$
To enable MathJax on a page, add the line ```---8<--- "mathjax.md"``` at the end of the file.
## Development Environment
### MkDocs
Documentation pages are built with [MkDocs](http://www.mkdocs.org/), [MkDocs at GitHub](https://github.com/mkdocs/mkdocs/). You need to install MkDocs locally so that you can build the pages and run the development web server.
```bash
pip install mkdocs pygments pymdown-extensions
......
......@@ -26,7 +26,7 @@ fi
```
!!! note
Do not run commands outputting to standard output (echo, module list, etc.) in .bashrc for non-interactive SSH sessions. It breaks fundamental functionality (SCP, PBS) of your account! Consider utilizing SSH session interactivity for such commands, as stated in the previous example.
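One common pattern for such a guard is sketched below (the exact test used in the preceding example may differ); the commands inside the block run only in interactive shells:
```bash
# run output-producing commands only in interactive shells
if [[ $- == *i* ]]; then
  echo "Welcome on the cluster"
  module list
fi
```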
## Application Modules
......
......@@ -323,7 +323,7 @@ cd $SCRDIR || exit
cp $PBS_O_WORKDIR/input .
cp $PBS_O_WORKDIR/mympiprog.x .
# load the MPI module
ml OpenMPI
# execute the calculation
......@@ -361,8 +361,13 @@ Example jobscript for an MPI job with preloaded inputs and executables, options
SCRDIR=/scratch/$USER/myjob
cd $SCRDIR || exit
# load the MPI module
ml OpenMPI
# execute the calculation
mpirun ./mympiprog.x
......
# Allinea Forge (DDT,MAP)
Allinea Forge consists of two tools: the debugger DDT and the profiler MAP.
Allinea DDT is a commercial debugger primarily for debugging parallel MPI or OpenMP programs. It also supports GPU (CUDA) and Intel Xeon Phi accelerators. DDT provides all the standard debugging features (stack trace, breakpoints, watches, view variables, threads, etc.) for every thread running as part of your program, or for every process, even if these processes are distributed across a cluster using an MPI implementation.
Allinea MAP is a profiler for C/C++/Fortran HPC codes. It is designed for profiling parallel code which uses Pthreads, OpenMP or MPI.
## License and Limitations for Anselm Users
On Anselm, users can debug OpenMP or MPI code that runs up to 64 parallel processes. When debugging GPU or Xeon Phi accelerated codes, the limit is 8 accelerators. These limitations mean that:
* 1 user can debug up to 64 processes, or
* 32 users can debug 2 processes each, etc.
In case of debugging on accelerators:
* 1 user can debug on up to 8 accelerators, or
* 8 users can debug on a single accelerator.
## Compiling Code to Run With DDT
### Modules
Load all necessary modules to compile the code. For example:
```bash
$ module load intel
$ module load impi ... or ... module load openmpi/X.X.X-icc
```
Load the Allinea DDT module:
```bash
$ module load Forge
```
Compile the code:
```bash
$ mpicc -g -O0 -o test_debug test.c
$ mpif90 -g -O0 -o test_debug test.f
```
### Compiler Flags
Before debugging, you need to compile your code with these flags:
!!! note
\* **g** : Generates extra debugging information usable by GDB. -g3 includes even more debugging information. This option is available for GNU and INTEL C/C++ and Fortran compilers.
\* **O0** : Suppress all optimizations.
## Starting a Job With DDT
Be sure to log in with X window forwarding enabled. This could mean using the -X option in ssh:
```bash
$ ssh -X username@anselm.it4i.cz
```
Another option is to access the login node using VNC. Please see the detailed information on how to [use the graphical user interface on Anselm](/general/accessing-the-clusters/graphical-user-interface/x-window-system/).
From the login node, an interactive session **with X windows forwarding** (-X option) can be started by the following command:
```bash
$ qsub -I -X -A NONE-0-0 -q qexp -lselect=1:ncpus=16:mpiprocs=16,walltime=01:00:00
```
Then launch the debugger with the ddt command followed by the name of the executable to debug:
```bash
$ ddt test_debug
```
The submission window that appears has a prefilled path to the executable to debug. You can select the number of MPI processes and/or OpenMP threads on which to run and press Run. Command line arguments to the program can be entered in the "Arguments" box.
![](../../../img/ddt1.png)
To start the debugging directly without the submission window, the user can specify the debugging and execution parameters from the command line. For example, the number of MPI processes is set by the "-np 4" option. Skipping the dialog is done by the "-start" option. To see the list of the "ddt" command line parameters, run "ddt --help".
```bash
ddt -start -np 4 ./hello_debug_impi
```
## Documentation
Users can find the original User Guide after loading the DDT module:
```bash
$DDTPATH/doc/userguide.pdf
```
[1] Discipline, Magic, Inspiration and Science: Best Practice Debugging with Allinea DDT, Workshop conducted at LLNL by Allinea on May 10, 2013, [link](https://computing.llnl.gov/tutorials/allineaDDT/index.html)
# Allinea Performance Reports
## Introduction
Allinea Performance Reports characterize the performance of HPC application runs. After executing your application through the tool, a synthetic HTML report is generated automatically, containing information about several metrics along with clear behavior statements and hints to help you improve the efficiency of your runs.
Allinea Performance Reports is most useful for profiling MPI programs.
Our license is limited to 64 MPI processes.
## Modules
Allinea Performance Reports version 6.0 is available
```bash
$ module load PerformanceReports/6.0
```
The module sets up the environment variables required for using Allinea Performance Reports. This particular command loads the default module, which is Performance Reports version 4.2.
## Usage
!!! note
Use the perf-report wrapper on your (MPI) program.
Instead of [running your MPI program the usual way](../mpi/), use the perf-report wrapper:
```bash
$ perf-report mpirun ./mympiprog.x
```
The MPI program will run as usual. The perf-report creates two additional files, in \*.txt and \*.html format, containing the performance report. Note that [demanding MPI codes should be run within the queue system](../../job-submission-and-execution/).
## Example
In this example, we will profile the mympiprog.x MPI program using Allinea Performance Reports. Assume that the code is compiled with Intel compilers and linked against the Intel MPI library:
First, we allocate some nodes via the express queue:
```bash
$ qsub -q qexp -l select=2:ncpus=16:mpiprocs=16:ompthreads=1 -I
qsub: waiting for job 262197.dm2 to start
qsub: job 262197.dm2 ready
```
Then we load the modules and run the program the usual way:
```bash
$ module load intel impi allinea-perf-report/4.2
$ mpirun ./mympiprog.x
```
Now let's profile the code:
```bash
$ perf-report mpirun ./mympiprog.x
```
Performance report files [mympiprog_32p\*.txt](../../../src/mympiprog_32p_2014-10-15_16-56.txt) and [mympiprog_32p\*.html](../../../src/mympiprog_32p_2014-10-15_16-56.html) were created. We can see that the code is very efficient on MPI and is CPU bound.
# Intel Compilers
The Intel compilers version 13.1.1 are available via the intel module. The compilers include the ICC C and C++ compiler and the IFORT Fortran 77/90/95 compiler.
```bash
$ module load intel
$ icc -v
$ ifort -v
```
The Intel compilers provide vectorization of the code via the AVX instructions and support threading parallelization via OpenMP.
For maximum performance on the Anselm cluster, compile your programs using the AVX instructions, with reporting where the vectorization was used. We recommend the following compilation options for high performance:
```bash
$ icc -ipo -O3 -vec -xAVX -vec-report1 myprog.c mysubroutines.c -o myprog.x
$ ifort -ipo -O3 -vec -xAVX -vec-report1 myprog.f mysubroutines.f -o myprog.x
```
In this example, we compile the program enabling interprocedural optimizations between source files (-ipo), aggressive loop optimizations (-O3) and vectorization (-vec -xAVX).
The compiler recognizes the omp, simd, vector and ivdep pragmas for OpenMP parallelization and AVX vectorization. Enable the OpenMP parallelization by the **-openmp** compiler switch.
```bash
$ icc -ipo -O3 -vec -xAVX -vec-report1 -openmp myprog.c mysubroutines.c -o myprog.x
$ ifort -ipo -O3 -vec -xAVX -vec-report1 -openmp myprog.f mysubroutines.f -o myprog.x
```
Read more at <http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/composerxe/compiler/cpp-lin/index.htm>
## Sandy Bridge/Haswell Binary Compatibility
Anselm nodes are currently equipped with Sandy Bridge CPUs, while Salomon will use the Haswell architecture. The new processors are backward compatible with the Sandy Bridge nodes, so all programs that ran on the Sandy Bridge processors should also run on the new Haswell nodes. To get optimal performance out of the Haswell processors, a program should make use of the special AVX2 instructions for this processor. One can do this by recompiling codes with the compiler flags designated to invoke these instructions. For the Intel compiler suite, there are two ways of doing this:
* Using the compiler flag (both for Fortran and C): -xCORE-AVX2. This will create a binary with AVX2 instructions, specifically for the Haswell processors. Note that the executable will not run on Sandy Bridge nodes.
* Using the compiler flags (both for Fortran and C): -xAVX -axCORE-AVX2. This will generate multiple, feature-specific auto-dispatch code paths for Intel® processors, if there is a performance benefit. This binary will therefore run on both Sandy Bridge and Haswell processors. At runtime it will be decided which path to follow, depending on which processor you are running on. In general this will result in larger binaries. See the sketch below.
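A minimal sketch of the two variants (file names are placeholders):
```bash
# Haswell-only binary (will not run on Sandy Bridge nodes)
$ icc -O3 -xCORE-AVX2 myprog.c mysubroutines.c -o myprog.x

# multi-path binary that runs on both Sandy Bridge and Haswell
$ icc -O3 -xAVX -axCORE-AVX2 myprog.c mysubroutines.c -o myprog.x
```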
# Intel Debugger
## Debugging Serial Applications
The Intel debugger version 13.0 is available via the intel module. The debugger works for applications compiled with the C and C++ compiler and the IFORT Fortran 77/90/95 compiler. The debugger provides a Java GUI environment. Use an X display for running the GUI.
```bash
$ module load intel
$ idb
```
The debugger may run in text mode. To debug in text mode, use
```bash
$ idbc
```
To debug on the compute nodes, the intel module must be loaded. The GUI on compute nodes may be accessed in the same way as described in the GUI section.
Example:
```bash
$ qsub -q qexp -l select=1:ncpus=16 -X -I
qsub: waiting for job 19654.srv11 to start
qsub: job 19654.srv11 ready
$ module load intel
$ module load java
$ icc -O0 -g myprog.c -o myprog.x
$ idb ./myprog.x
```
In this example, we allocate 1 full compute node, compile the program myprog.c with the debugging options -O0 -g, and run the idb debugger interactively on the myprog.x executable. The GUI access is via X11 port forwarding provided by the PBS workload manager.
## Debugging Parallel Applications
Intel debugger is capable of debugging multithreaded and MPI parallel programs as well.
### Small Number of MPI Ranks
For debugging a small number of MPI ranks, you may execute and debug each rank in a separate xterm terminal (do not forget the X display). Using Intel MPI, this may be done in the following way:
```bash
$ qsub -q qexp -l select=2:ncpus=16 -X -I
qsub: waiting for job 19654.srv11 to start
qsub: job 19655.srv11 ready
$ module load intel impi
$ mpirun -ppn 1 -hostfile $PBS_NODEFILE --enable-x xterm -e idbc ./mympiprog.x
```
In this example, we allocate 2 full compute nodes, run xterm on each node and start the idb debugger in command line mode, debugging two ranks of the mympiprog.x application. The xterm will pop up for each rank, with the idb prompt ready. The example is not limited to the use of Intel MPI.
### Large Number of MPI Ranks
Run the idb debugger using the MPI debug option. This will cause the debugger to bind to all ranks and provide aggregated outputs across the ranks, pausing execution automatically just after startup. You may then set break points and step through the execution manually. Using Intel MPI:
```bash
$ qsub -q qexp -l select=2:ncpus=16 -X -I
qsub: waiting for job 19654.srv11 to start
qsub: job 19655.srv11 ready
$ module load intel impi
$ mpirun -n 32 -idb ./mympiprog.x
```
### Debugging Multithreaded Application
Run the idb debugger in GUI mode. The Parallel menu contains a number of tools for debugging multiple threads. One of the most useful tools is the **Serialize Execution** tool, which serializes execution of concurrent threads for easy orientation and identification of concurrency-related bugs.
## Further Information
An exhaustive manual on IDB features and usage is published on the [Intel website](http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/composerxe/debugger/user_guide/index.htm).
# MPI
## Setting Up MPI Environment
The Anselm cluster provides several implementations of the MPI library:
| MPI Library | Thread support |
| ---------------------------------------------------- | --------------------------------------------------------------- |
| The highly optimized and stable **bullxmpi 1.2.4.1** | Partial thread support up to MPI_THREAD_SERIALIZED |
| The **Intel MPI 4.1** | Full thread support up to MPI_THREAD_MULTIPLE |
| The [OpenMPI 1.6.5](http://www.open-mpi.org/) | Full thread support up to MPI_THREAD_MULTIPLE, BLCR c/r support |
| The OpenMPI 1.8.1 | Full thread support up to MPI_THREAD_MULTIPLE, MPI-3.0 support |
| The **mpich2 1.9** | Full thread support up to MPI_THREAD_MULTIPLE, BLCR c/r support |
MPI libraries are activated via the environment modules.
Look up section modulefiles/mpi in module avail
```bash
$ module avail
------------------------- /opt/modules/modulefiles/mpi -------------------------
bullxmpi/bullxmpi-1.2.4.1 mvapich2/1.9-icc
impi/4.0.3.008 openmpi/1.6.5-gcc(default)
impi/4.1.0.024 openmpi/1.6.5-gcc46
impi/4.1.0.030 openmpi/1.6.5-icc
impi/4.1.1.036(default) openmpi/1.8.1-gcc
openmpi/1.8.1-gcc46
mvapich2/1.9-gcc(default) openmpi/1.8.1-gcc49
mvapich2/1.9-gcc46 openmpi/1.8.1-icc
```
There are default compilers associated with any particular MPI implementation. The defaults may be changed; the MPI libraries may be used in conjunction with any compiler. The defaults are selected via the modules in the following way:
| Module | MPI | Compiler suite |
| ------------ | ---------------- | ------------------------------------------------------------------------------ |
| PrgEnv-gnu | bullxmpi-1.2.4.1 | bullx GNU 4.4.6 |
| PrgEnv-intel | Intel MPI 4.1.1 | Intel 13.1.1 |
| bullxmpi | bullxmpi-1.2.4.1 | none, select via module |
| impi | Intel MPI 4.1.1 | none, select via module |
| openmpi | OpenMPI 1.6.5 | GNU compilers 4.8.1, GNU compilers 4.4.6, Intel Compilers |
| openmpi | OpenMPI 1.8.1 | GNU compilers 4.8.1, GNU compilers 4.4.6, GNU compilers 4.9.0, Intel Compilers |
| mvapich2 | MPICH2 1.9 | GNU compilers 4.8.1, GNU compilers 4.4.6, Intel Compilers |
Examples:
```bash
$ module load openmpi
```
In this example, we activate the latest OpenMPI with the latest GNU compilers.
To use OpenMPI with the Intel compiler suite, use:
```bash
$ module load intel
$ module load openmpi/1.6.5-icc
```
In this example, OpenMPI 1.6.5 using the Intel compilers is activated.
## Compiling MPI Programs
!!! note
After setting up your MPI environment, compile your program using one of the MPI wrappers:
```bash
$ mpicc -v
$ mpif77 -v
$ mpif90 -v
```
Example program:
```cpp
// helloworld_mpi.c
#include <stdio.h>
#include<mpi.h>
int main(int argc, char **argv) {
int len;
int rank, size;
char node[MPI_MAX_PROCESSOR_NAME];
// Initiate MPI
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD,&rank);
MPI_Comm_size(MPI_COMM_WORLD,&size);
// Get hostname and print
MPI_Get_processor_name(node,&len);
printf("Hello world! from rank %d of %d on host %s\n",rank,size,node);
// Finalize and exit
MPI_Finalize();
return 0;
}
```
Compile the above example with
```bash
$ mpicc helloworld_mpi.c -o helloworld_mpi.x
```
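To quickly verify the build, one might run it in an interactive job (a sketch; the module name and select syntax follow the examples used elsewhere in this documentation):
```bash
$ qsub -I -q qexp -l select=1:ncpus=16,walltime=00:10:00
$ module load openmpi
$ mpirun -np 4 ./helloworld_mpi.x
```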
## Running MPI Programs
!!! note
The MPI program executable must be compatible with the loaded MPI module.
Always compile and execute using the very same MPI module.
It is strongly discouraged to mix MPI implementations. Linking an application with one MPI implementation and running mpirun/mpiexec from another implementation may result in unexpected errors.
The MPI program executable must be available within the same path on all nodes. This is automatically fulfilled on the /home and /scratch file systems. You need to preload the executable if running on the local scratch /lscratch file system, as sketched below.
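For illustration, one possible way to preload the executable to the node-local /lscratch on every allocated node (a sketch, assuming a jobscript, passwordless SSH among the allocated nodes, and a placeholder binary mympiprog.x):
```bash
# stage the executable to the local scratch of every allocated node
for node in `sort -u $PBS_NODEFILE`; do
    ssh $node "mkdir -p /lscratch/$PBS_JOBID"
    scp mympiprog.x $node:/lscratch/$PBS_JOBID/
done
# execute from the node-local copy
mpirun /lscratch/$PBS_JOBID/mympiprog.x
```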
### Ways to Run MPI Programs
The optimal way to run an MPI program depends on its memory requirements, memory access pattern and communication pattern.
!!! note
Consider these ways to run an MPI program:
1. One MPI process per node, 16 threads per process
2. Two MPI processes per node, 8 threads per process
3. 16 MPI processes per node, 1 thread per process.
**One MPI** process per node, using 16 threads, is most useful for memory-demanding applications that make good use of processor cache memory and are not memory bound. This is also a preferred way for communication-intensive applications, as one process per node enjoys full bandwidth access to the network interface.
**Two MPI** processes per node, using 8 threads each, bound to a processor socket, is most useful for memory-bandwidth-bound applications such as BLAS1 or FFT with scalable memory demand. However, note that the two processes will share access to the network interface. The 8 threads and socket binding should ensure maximum memory access bandwidth and minimize communication, migration and NUMA effect overheads.
!!! note
Important! Bind every OpenMP thread to a core!
In the previous two cases with one or two MPI processes per node, the operating system might still migrate OpenMP threads between cores. You can avoid this by setting the KMP_AFFINITY or GOMP_CPU_AFFINITY environment variables, for example as in the sketch below.
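For illustration, a minimal sketch of such settings (the exact values depend on the desired thread layout and on the OpenMP runtime in use):
```bash
# Intel OpenMP runtime: pin threads, one per core
export KMP_AFFINITY=granularity=fine,compact
# GNU OpenMP runtime: explicit core list (here cores 0-15)
export GOMP_CPU_AFFINITY="0-15"
```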
**16 MPI** processes per node, using 1 thread each, bound to a processor core, is most suitable for highly scalable applications with low communication demand. The sketch below illustrates how these placements can be requested with OpenMPI.
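A sketch of the three placements using OpenMPI options (assuming 2 allocated nodes with 16 cores each and a placeholder binary; see the Running OpenMPI page below for the authoritative description):
```bash
# 1 process per node, 16 threads each
$ export OMP_NUM_THREADS=16
$ mpirun -np 2 --bynode ./mympiprog.x

# 2 processes per node, 8 threads each, bound to sockets
$ export OMP_NUM_THREADS=8
$ mpirun -np 4 --npernode 2 --bysocket --bind-to-socket ./mympiprog.x

# 16 processes per node, 1 thread each, bound to cores
$ export OMP_NUM_THREADS=1
$ mpirun -np 32 --bycore --bind-to-core ./mympiprog.x
```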
### Running OpenMPI
The **bullxmpi-1.2.4.1** and [**OpenMPI 1.6.5**](http://www.open-mpi.org/) are both based on OpenMPI. Read more on [how to run OpenMPI](Running_OpenMPI/) based MPI.
### Running MPICH2
The **Intel MPI** and **mpich2 1.9** are MPICH2 based implementations. Read more on [how to run MPICH2](running-mpich2/) based MPI.
The Intel MPI may run on the Intel Xeon Phi accelerators as well. Read more on [how to run Intel MPI on accelerators](../intel-xeon-phi/).
# Matlab
## Introduction
Matlab is available in versions R2015a and R2015b. There are always two variants of the release:
* Non-commercial, or so-called EDU variant, which can be used for common research and educational purposes.
* Commercial, or so-called COM variant, which can also be used for commercial activities. The licenses for the commercial variant are much more expensive, so usually the commercial variant has only a subset of features compared to the EDU variant.
To load the latest version of Matlab, load the module:
```bash
$ module load MATLAB
```
The EDU variant is marked as default. If you need another version or variant, load the particular version. To obtain the list of available versions, use:
```bash
$ module avail MATLAB
```
If you need to use the Matlab GUI to prepare your Matlab programs, you can use Matlab directly on the login nodes. But for all computations use Matlab on the compute nodes via PBS Pro scheduler.
If you require the Matlab GUI, please follow the general information about [running graphical applications](../../../general/accessing-the-clusters/graphical-user-interface/x-window-system/).
Matlab GUI is quite slow using the X forwarding built in the PBS (qsub -X), so using X11 display redirection either via SSH or directly by xauth (please see the "GUI Applications on Compute Nodes over VNC" part [here](../../../general/accessing-the-clusters/graphical-user-interface/x-window-system/x-window-system/)) is recommended.
To run Matlab with GUI, use
```bash
$ matlab
```
To run Matlab in text mode, without the Matlab Desktop GUI environment, use
```bash
$ matlab -nodesktop -nosplash
```
Plots, images, etc. will still be available.
## Running Parallel Matlab Using Distributed Computing Toolbox / Engine
!!! note
The Distributed Computing Toolbox is available only for the EDU variant.
The MPIEXEC mode available in previous versions is no longer available in MATLAB 2015. Also, the programming interface has changed. Refer to [Release Notes](http://www.mathworks.com/help/distcomp/release-notes.html#buanp9e-1).
Delete the previously used file mpiLibConf.m; we have observed crashes when using Intel MPI.
To use Distributed Computing, you first need to set up a parallel profile. We have provided the profile for you; you can either import it on the MATLAB command line:
```bash
>> parallel.importProfile('/apps/all/MATLAB/2015a-EDU/SalomonPBSPro.settings')
ans =
SalomonPBSPro
```
Or in the GUI, go to tab HOME -> Parallel -> Manage Cluster Profiles..., click Import and navigate to:
/apps/all/MATLAB/2015a-EDU/SalomonPBSPro.settings
With the new mode, MATLAB itself launches the workers via PBS, so you can either use interactive mode or a batch mode on one node, but the actual parallel processing will be done in a separate job started by MATLAB itself. Alternatively, you can use "local" mode to run parallel code on just a single node.
!!! note
The profile is confusingly named Salomon, but you can use it also on Anselm.
### Parallel Matlab Interactive Session
The following example shows how to start an interactive session with support for the Matlab GUI. For more information about GUI-based applications on Anselm see [this page](../../../general/accessing-the-clusters/graphical-user-interface/x-window-system/x-window-system/).
```bash
$ xhost +
$ qsub -I -v DISPLAY=$(uname -n):$(echo $DISPLAY | cut -d ':' -f 2) -A NONE-0-0 -q qexp -l select=1 -l walltime=00:30:00
-l feature__matlab__MATLAB=1
```
This qsub command example shows how to run Matlab on a single node.
The second part of the command shows how to request all necessary licenses: in this case, 1 Matlab-EDU license and 48 Distributed Computing Engine licenses.
Once access to the compute nodes is granted by PBS, the user can load the following modules and start Matlab:
```bash
r1i0n17$ module load MATLAB/2015b-EDU
r1i0n17$ matlab &
```
### Parallel Matlab Batch Job in Local Mode
To run Matlab in batch mode, write a Matlab script, then write a bash jobscript and execute it via the qsub command. By default, Matlab will execute one Matlab worker instance per allocated core.
```bash
#!/bin/bash
#PBS -A PROJECT ID
#PBS -q qprod
#PBS -l select=1:ncpus=16:mpiprocs=16:ompthreads=1
# change to shared scratch directory
SCR=/scratch/work/user/$USER/$PBS_JOBID
mkdir -p $SCR ; cd $SCR || exit
# copy input file to scratch
cp $PBS_O_WORKDIR/matlabcode.m .
# load modules
module load MATLAB/2015a-EDU
# execute the calculation
matlab -nodisplay -r matlabcode > output.out
# copy output file to home
cp output.out $PBS_O_WORKDIR/.
```
This script may be submitted directly to the PBS workload manager via the qsub command. The inputs and the Matlab script are in the matlabcode.m file, outputs in the output.out file. Note the missing .m extension in the matlab -r matlabcodefile call; **the .m must not be included**. Note that the **shared /scratch must be used**. Further, it is **important to include the quit** statement at the end of the matlabcode.m script.
Submit the jobscript using qsub
```bash
$ qsub ./jobscript
```
### Parallel Matlab Local Mode Program Example
The last part of the configuration is done directly in the user Matlab script before Distributed Computing Toolbox is started.
```bash
cluster = parcluster('local')
```
This script creates a scheduler object "cluster" of type "local" that starts the workers locally.
!!! note
Every Matlab script that needs to initialize/use matlabpool has to contain these lines prior to calling the parpool(sched, ...) function.
The last step is to start matlabpool with the "cluster" object and the correct number of workers. We have 16 cores per node, so we start 16 workers.
```bash
parpool(cluster,16);
... parallel code ...
parpool close
```
The complete example showing how to use Distributed Computing Toolbox in local mode is shown here.
```bash
cluster = parcluster('local');
cluster
parpool(cluster,16);
n=2000;
W = rand(n,n);
W = distributed(W);
x = (1:n)';
x = distributed(x);
spmd
[~, name] = system('hostname')
T = W*x; % Calculation performed on labs, in parallel.
% T and W are both codistributed arrays here.
end
T;
whos % T and W are both distributed arrays here.
parpool close
quit
```
You can copy and paste the example into a .m file and execute it. Note that the parpool size should correspond to the **total number of cores** available on the allocated nodes.
### Parallel Matlab Batch Job Using PBS Mode (Workers Spawned in a Separate Job)
This mode uses the PBS scheduler to launch the parallel pool. It uses the SalomonPBSPro profile that needs to be imported to Cluster Manager, as mentioned before. This method uses MATLAB's PBS Scheduler interface - it spawns the workers in a separate job submitted by MATLAB using qsub.
This is an example of m-script using PBS mode:
```bash
cluster = parcluster('SalomonPBSPro');
set(cluster, 'SubmitArguments', '-A OPEN-0-0');
set(cluster, 'ResourceTemplate', '-q qprod -l select=10:ncpus=16');
set(cluster, 'NumWorkers', 160);
pool = parpool(cluster, 160);
n=2000;
W = rand(n,n);
W = distributed(W);
x = (1:n)';
x = distributed(x);
spmd
[~, name] = system('hostname')
T = W*x; % Calculation performed on labs, in parallel.
% T and W are both codistributed arrays here.
end
whos % T and W are both distributed arrays here.
% shut down parallel pool
delete(pool)
```
Note that we first construct a cluster object using the imported profile, then set some important options, namely: SubmitArguments, where you need to specify the accounting ID, and ResourceTemplate, where you need to specify the number of nodes to run the job on.
You can start this script in batch mode the same way as in the Local mode example.
### Parallel Matlab Batch With Direct Launch (Workers Spawned Within the Existing Job)
This method is a "hack" invented by us to emulate the mpiexec functionality found in previous MATLAB versions. We leverage the MATLAB Generic Scheduler interface, but instead of submitting the workers to PBS, we launch the workers directly within the running job, thus we avoid the issues with master script and workers running in separate jobs (issues with license not available, waiting for the worker's job to spawn etc.)
!!! warning
This method is experimental.
For this method, you need to use the SalomonDirect profile; import it [the same way as SalomonPBSPro](matlab/#running-parallel-matlab-using-distributed-computing-toolbox---engine).
This is an example of m-script using direct mode:
```bash
parallel.importProfile('/apps/all/MATLAB/2015a-EDU/SalomonDirect.settings')
cluster = parcluster('SalomonDirect');
set(cluster, 'NumWorkers', 48);
pool = parpool(cluster, 48);
n=2000;
W = rand(n,n);
W = distributed(W);
x = (1:n)';
x = distributed(x);
spmd
[~, name] = system('hostname')
T = W*x; % Calculation performed on labs, in parallel.
% T and W are both codistributed arrays here.
end
whos % T and W are both distributed arrays here.
% shut down parallel pool
delete(pool)
```
### Non-Interactive Session and Licenses
If you want to run batch jobs with Matlab, be sure to request the appropriate license features with the PBS Pro scheduler, at least `-l feature__matlab__MATLAB=1` for the EDU variant of Matlab. For more information about how to check license feature states and how to request them with PBS Pro, please [look here](../isv_licenses/).
In case of a non-interactive session, please read the [following information](../isv_licenses/) on how to modify the qsub command to test for available licenses prior to getting the resource allocation.
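For instance, a non-interactive submission might look like the following sketch (project ID and jobscript name are placeholders):
```bash
$ qsub -A PROJECT-ID -q qprod -l select=1:ncpus=16 -l feature__matlab__MATLAB=1 ./jobscript
```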
### Matlab Distributed Computing Engines Start Up Time
Starting Matlab workers is an expensive process that requires a certain amount of time. For your information, please see the following table:
| compute nodes | number of workers | start-up time[s] |
| ------------- | ----------------- | ---------------- |
| 16 | 384 | 831 |
| 8 | 192 | 807 |
| 4 | 96 | 483 |
| 2 | 48 | 16 |
## MATLAB on UV2000
The UV2000 machine, available in the qfat queue, can be used for MATLAB computations. This is an SMP NUMA machine with a large amount of RAM, which can be beneficial for certain types of MATLAB jobs. CPU cores are allocated in chunks of 8 for this machine.
You can use MATLAB on UV2000 in two parallel modes:
### Threaded Mode
Since this is a SMP machine, you can completely avoid using Parallel Toolbox and use only MATLAB's threading. MATLAB will automatically detect the number of cores you have allocated and will set maxNumCompThreads accordingly and certain operations, such as `fft`, `eig`, `svd` etc. will be automatically run in threads. The advantage of this mode is that you don't need to modify your existing sequential codes.
### Local Cluster Mode
You can also use Parallel Toolbox on UV2000. Use [local cluster mode](matlab/#parallel-matlab-batch-job-in-local-mode), "SalomonPBSPro" profile will not work.
# Storage
There are two main shared file systems on the Anselm cluster, the [HOME](#home) and [SCRATCH](#scratch). All login and compute nodes may access the same data on the shared file systems. Compute nodes are also equipped with local (non-shared) scratch, RAM disk and tmp file systems.
## Archiving
......@@ -352,7 +352,7 @@ First, create the mount point
$ mkdir cesnet
```
Mount the storage. Note that you can choose among ssh.du1.cesnet.cz (Plzen), ssh.du2.cesnet.cz (Jihlava) and ssh.du3.cesnet.cz (Brno). Mount tier1_home **(only 5120 MB!)**:
```console
$ sshfs username@ssh.du1.cesnet.cz:. cesnet/
......@@ -384,16 +384,23 @@ Once done, please remember to unmount the storage
$ fusermount -u cesnet
```
### Rsync Access
!!! note
Rsync provides delta transfer for best performance and can resume interrupted transfers.
Rsync is a fast and extraordinarily versatile file copying tool. It is famous for its delta-transfer algorithm, which reduces the amount of data sent over the network by sending only the differences between the source files and the existing files in the destination. Rsync is widely used for backups and mirroring and as an improved copy command for everyday use.
Rsync finds files that need to be transferred using a "quick check" algorithm (by default) that looks for files that have changed in size or in last-modified time. Any changes in the other preserved attributes (as requested by options) are made on the destination file directly when the quick check indicates that the file's data does not need to be updated.
[More about Rsync](https://du.cesnet.cz/en/navody/rsync/start#pro_bezne_uzivatele)
Transfer large files to/from the CESNET storage, assuming membership in the Storage VO, for example as sketched below.
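For illustration, a minimal sketch (file name and remote directory are placeholders; see the CESNET guide linked above for the recommended invocation):
```console
$ rsync --progress datafile username@ssh.du1.cesnet.cz:remote/dir/
$ rsync --progress username@ssh.du1.cesnet.cz:remote/dir/datafile .
```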
......
......@@ -122,7 +122,7 @@ However this method does not seem to work with recent Linux distributions and yo
## Gnome on Windows
Use XLaunch to start the Xming server or run the XWin.exe. Select the "One window" mode.
Log in to the cluster, using PuTTY. On the cluster, run the gnome-session command.
......
......@@ -107,4 +107,4 @@ In this example, we add an additional public key, stored in file additional_key.
## How to Remove Your Own Key
Removing your key from authorized_keys can be done simply by deleting the corresponding public key, which can be identified by the comment at the end of the line (e.g. `username@organization.example.com`).
......@@ -24,7 +24,7 @@ fi
```
!!! note
Do not run commands outputting to standard output (echo, module list, etc.) in .bashrc for non-interactive SSH sessions. It breaks fundamental functionality (SCP, PBS) of your account! Take care of SSH session interactivity for such commands, as stated in the previous example.
### Application Modules
......
# Introduction
Welcome to the Salomon supercomputer cluster. The Salomon cluster consists of 1008 compute nodes, totaling 24192 compute cores with 129 TB RAM and giving over 2 PFLOP/s theoretical peak performance. Each node is a powerful x86-64 computer equipped with 24 cores and at least 128 GB RAM. Nodes are interconnected by a 7D Enhanced hypercube InfiniBand network and equipped with Intel Xeon E5-2680v3 processors. The Salomon cluster consists of 576 nodes without accelerators and 432 nodes equipped with Intel Xeon Phi MIC accelerators. Read more in [Hardware Overview](hardware-overview/).
The cluster runs [CentOS Linux](http://www.bull.com/bullx-logiciels/systeme-exploitation.html) operating system, which is compatible with the RedHat [Linux family.](http://upload.wikimedia.org/wikipedia/commons/1/1b/Linux_Distribution_Timeline.svg)
......
......@@ -459,7 +459,7 @@ cd $SCRDIR || exit
cp $PBS_O_WORKDIR/input .
cp $PBS_O_WORKDIR/mympiprog.x .
# load the MPI module
module load OpenMPI
# execute the calculation
......@@ -497,7 +497,7 @@ Example jobscript for an MPI job with preloaded inputs and executables, options
SCRDIR=/scratch/work/user/$USER/myjob
cd $SCRDIR || exit
# load the MPI module
module load OpenMPI
# execute the calculation
......
# ANSYS CFX
[ANSYS CFX](http://www.ansys.com/products/fluids/ansys-cfx) software is a high-performance, general purpose fluid dynamics program that has been applied to solve wide-ranging fluid flow problems for over 20 years. At the heart of ANSYS CFX is its advanced solver technology, the key to achieving reliable and accurate solutions quickly and robustly. The modern, highly parallelized solver is the foundation for an abundant choice of physical models to capture virtually any type of phenomena related to fluid flow. The solver and its many physical models are wrapped in a modern, intuitive, and flexible GUI and user environment, with extensive capabilities for customization and automation using session files, scripting and a powerful expression language.
To run ANSYS CFX in batch mode you can utilize/modify the default cfx.pbs script and execute it via the qsub command.
```bash
#!/bin/bash
#PBS -l nodes=2:ppn=16
#PBS -q qprod
#PBS -N $USER-CFX-Project
#PBS -A XX-YY-ZZ
#! Mail to user when job terminate or abort
#PBS -m ae
#!change the working directory (default is home directory)
#cd <working directory> (working directory must exists)
WORK_DIR="/scratch/$USER/work"
cd $WORK_DIR
echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo This jobs runs on the following processors:
echo `cat $PBS_NODEFILE`
module load ansys
#### Set number of processors per host listing
#### (set to 1 as $PBS_NODEFILE lists each node twice if :ppn=2)
procs_per_host=1
#### Create host list
hl=""
for host in `cat $PBS_NODEFILE`
do
if ["$hl" = "" ]
then hl="$host:$procs_per_host"
else hl="${hl}:$host:$procs_per_host"
fi
done
echo Machines: $hl
#-dev input.def includes the input of CFX analysis in DEF format
#-P the name of prefered license feature (aa_r=ANSYS Academic Research, ane3fl=Multiphysics(commercial))
/ansys_inc/v145/CFX/bin/cfx5solve -def input.def -size 4 -size-ni 4x -part-large -start-method "Platform MPI Distributed Parallel" -par-dist $hl -P aa_r
```
The header of the PBS file (above) is common and its description can be found on [this site](../../job-submission-and-execution/). SVS FEM recommends addressing resources by the keywords nodes and ppn. These keywords allow you to directly specify the number of nodes (computers) and cores per node (ppn) which will be utilized by the job. The rest of the script also assumes such a structure of allocated resources.
The working directory has to be created before the PBS job is submitted to the queue. The input file should be in the working directory, or the full path to the input file has to be specified. The input file has to be a common CFX .def file, which is passed to the CFX solver via the -def parameter.
The **license** should be selected by the -P parameter (capital **P**). Licensed products are the following: aa_r (ANSYS **Academic** Research), ane3fl (ANSYS Multiphysics, **commercial**).
[More about licensing here](licensing/)
# ANSYS Fluent
[ANSYS Fluent](http://www.ansys.com/products/fluids/ansys-fluent)
software contains the broad physical modeling capabilities needed to model flow, turbulence, heat transfer, and reactions for industrial applications ranging from air flow over an aircraft wing to combustion in a furnace, from bubble columns to oil platforms, from blood flow to semiconductor manufacturing, and from clean room design to wastewater treatment plants. Special models that give the software the ability to model in-cylinder combustion, aeroacoustics, turbomachinery, and multiphase systems have served to broaden its reach.
1. Common way to run Fluent over PBS file
To run ANSYS Fluent in batch mode you can utilize/modify the default fluent.pbs script and execute it via the qsub command.
```bash
#!/bin/bash
#PBS -S /bin/bash
#PBS -l nodes=2:ppn=16
#PBS -q qprod
#PBS -N $USER-Fluent-Project
#PBS -A XX-YY-ZZ
#! Mail to user when job terminate or abort
#PBS -m ae
#!change the working directory (default is home directory)
#cd <working directory> (working directory must exists)
WORK_DIR="/scratch/$USER/work"
cd $WORK_DIR
echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo This jobs runs on the following processors:
echo `cat $PBS_NODEFILE`
#### Load ansys module so that we find the cfx5solve command
module load ansys
# Use following line to specify MPI for message-passing instead
NCORES=`wc -l $PBS_NODEFILE |awk '{print $1}'`
/ansys_inc/v145/fluent/bin/fluent 3d -t$NCORES -cnf=$PBS_NODEFILE -g -i fluent.jou
```
The header of the PBS file (above) is common and its description can be found on [this site](../../resources-allocation-policy/). [SVS FEM](http://www.svsfem.cz) recommends addressing resources by the keywords nodes and ppn. These keywords allow you to directly specify the number of nodes (computers) and cores per node (ppn) which will be utilized by the job. The rest of the script also assumes such a structure of allocated resources.
The working directory has to be created before the PBS job is submitted to the queue. The input file should be in the working directory, or the full path to the input file has to be specified. The input file has to be a common Fluent journal file, which is passed to the Fluent solver via the -i fluent.jou parameter.
A journal file with the definition of the input geometry and boundary conditions and the defined solution process has, e.g., the following structure:
```bash
/file/read-case aircraft_2m.cas.gz
/solve/init
init
/solve/iterate
10
/file/write-case-dat aircraft_2m-solution
/exit yes
```
The appropriate dimension of the problem has to be set by the parameter (2d/3d).
1. Fast way to run Fluent from the command line
```bash
fluent solver_version [FLUENT_options] -i journal_file -pbs
```
This syntax will start the ANSYS FLUENT job under PBS Professional using the qsub command in a batch manner. When resources are available, PBS Professional will start the job and return a job ID, usually in the form of _job_ID.hostname_. This job ID can then be used to query, control, or stop the job using standard PBS Professional commands, such as qstat or qdel. The job will be run out of the current working directory, and all output will be written to the file fluent.o _job_ID_.
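For instance (a sketch with a placeholder journal file):
```bash
$ fluent 3d -i fluent.jou -pbs
```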
1. Running Fluent via user's config file
The sample script uses a configuration file called pbs_fluent.conf if no command line arguments are present. This configuration file should be present in the directory from which the jobs are submitted (which is also the directory in which the jobs are executed). The following is an example of what the content of pbs_fluent.conf can be:
```bash
input="example_small.flin"
case="Small-1.65m.cas"
fluent_args="3d -pmyrinet"
outfile="fluent_test.out"
mpp="true"
```
The following is an explanation of the parameters:
* input is the name of the input file.
* case is the name of the .cas file that the input file will utilize.
* fluent_args are extra ANSYS FLUENT arguments. As shown in the previous example, you can specify the interconnect by using the -p interconnect command. The available interconnects include ethernet (the default), myrinet, infiniband, vendor, altix, and crayx. The MPI is selected automatically, based on the specified interconnect.
* outfile is the name of the file to which the standard output will be sent.
* mpp="true" will tell the job script to execute the job across multiple processors.
To run ANSYS Fluent in batch mode with user's config file you can utilize/modify the following script and execute it via the qsub command.
```bash
#!/bin/sh
#PBS -l nodes=2:ppn=4
#PBS -q qprod
#PBS -N $USER-Fluent-Project
#PBS -A XX-YY-ZZ
cd $PBS_O_WORKDIR
#We assume that if they didn't specify arguments then they should use the
#config file
if [ "xx${input}${case}${mpp}${fluent_args}zz" = "xxzz" ]; then
if [ -f pbs_fluent.conf ]; then
. pbs_fluent.conf
else
printf "No command line arguments specified, "
printf "and no configuration file found. Exiting n"
fi
fi
#Augment the ANSYS FLUENT command line arguments case "$mpp" in
true)
#MPI job execution scenario
num_nodes=`cat $PBS_NODEFILE | sort -u | wc -l`
cpus=`expr $num_nodes \* $NCPUS`
#Default arguments for mpp jobs, these should be changed to suit your
#needs.
fluent_args="-t${cpus} $fluent_args -cnf=$PBS_NODEFILE"
;;
*)
#SMP case
#Default arguments for smp jobs, should be adjusted to suit your
#needs.
fluent_args="-t$NCPUS $fluent_args"
;;
esac
#Default arguments for all jobs
fluent_args="-ssh -g -i $input $fluent_args"
echo "---------- Going to start a fluent job with the following settings:
Input: $input
Case: $case
Output: $outfile
Fluent arguments: $fluent_args"
#run the solver
/ansys_inc/v145/fluent/bin/fluent $fluent_args > $outfile
```
It runs the jobs out of the directory from which they are submitted (PBS_O_WORKDIR).
1. Running Fluent in parallel
Fluent can be run in parallel only under the Academic Research license. To do so, the ANSYS Academic Research license must be placed before the ANSYS CFD license in the user preferences. To make this change, the anslic_admin utility should be run:
```bash
/ansys_inc/shared_les/licensing/lic_admin/anslic_admin
```
The ANSLIC_ADMIN utility will start:
![](../../../img/Fluent_Licence_1.jpg)
![](../../../img/Fluent_Licence_2.jpg)
![](../../../img/Fluent_Licence_3.jpg)
ANSYS Academic Research license should be moved up to the top of the list.
![](../../../img/Fluent_Licence_4.jpg)
# ANSYS LS-DYNA
**[ANSYS LS-DYNA](http://www.ansys.com/products/structures/ansys-ls-dyna)** software provides convenient and easy-to-use access to the technology-rich, time-tested explicit solver without the need to contend with the complex input requirements of this sophisticated program. Introduced in 1996, ANSYS LS-DYNA capabilities have helped customers in numerous industries to resolve highly intricate design issues. ANSYS Mechanical users have been able to take advantage of complex explicit solutions for a long time utilizing the traditional ANSYS Parametric Design Language (APDL) environment. These explicit capabilities are available to ANSYS Workbench users as well. The Workbench platform is a powerful, comprehensive, easy-to-use environment for engineering simulation. CAD import from all sources, geometry cleanup, automatic meshing, solution, parametric optimization, result visualization and comprehensive report generation are all available within a single fully interactive modern graphical user environment.
To run ANSYS LS-DYNA in batch mode you can utilize/modify the default ansysdyna.pbs script and execute it via the qsub command.
```bash
#!/bin/bash
#PBS -l nodes=2:ppn=16
#PBS -q qprod
#PBS -N $USER-DYNA-Project
#PBS -A XX-YY-ZZ
#! Mail to user when job terminate or abort
#PBS -m ae
#!change the working directory (default is home directory)
#cd <working directory>
WORK_DIR="/scratch/$USER/work"
cd $WORK_DIR
echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo This jobs runs on the following processors:
echo `cat $PBS_NODEFILE`
#! Counts the number of processors
NPROCS=`wc -l < $PBS_NODEFILE`
echo This job has allocated $NPROCS nodes
module load ansys
#### Set number of processors per host listing
#### (set to 1 as $PBS_NODEFILE lists each node twice if :ppn=2)
procs_per_host=1
#### Create host list
hl=""
for host in `cat $PBS_NODEFILE`
do
if ["$hl" = "" ]
then hl="$host:$procs_per_host"
else hl="${hl}:$host:$procs_per_host"
fi
done
echo Machines: $hl
/ansys_inc/v145/ansys/bin/ansys145 -dis -lsdynampp i=input.k -machines $hl
```
The header of the PBS file (above) is common and its description can be found on [this site](../../job-submission-and-execution/). [SVS FEM](http://www.svsfem.cz) recommends addressing resources by the keywords nodes and ppn. These keywords allow you to directly specify the number of nodes (computers) and cores per node (ppn) which will be utilized by the job. The rest of the script also assumes such a structure of allocated resources.
The working directory has to be created before the PBS job is submitted to the queue. The input file should be in the working directory, or the full path to the input file has to be specified. The input file has to be a common LS-DYNA .**k** file, which is passed to the ansys solver via the i= parameter.
# ANSYS MAPDL
**[ANSYS Multiphysics](http://www.ansys.com/products/multiphysics)**
software offers a comprehensive product solution for both multiphysics and single-physics analysis. The product includes structural, thermal, fluid and both high- and low-frequency electromagnetic analysis. The product also contains solutions for both direct and sequentially coupled physics problems including direct coupled-field elements and the ANSYS multi-field solver.
To run ANSYS MAPDL in batch mode you can utilize/modify the default mapdl.pbs script and execute it via the qsub command.
```bash
#!/bin/bash
#PBS -l nodes=2:ppn=16
#PBS -q qprod
#PBS -N $USER-ANSYS-Project
#PBS -A XX-YY-ZZ
#! Mail to user when job terminate or abort
#PBS -m ae
#!change the working directory (default is home directory)
#cd <working directory> (working directory must exists)
WORK_DIR="/scratch/$USER/work"
cd $WORK_DIR
echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo This jobs runs on the following processors:
echo `cat $PBS_NODEFILE`
module load ansys
#### Set number of processors per host listing
#### (set to 1 as $PBS_NODEFILE lists each node twice if :ppn=2)
procs_per_host=1
#### Create host list
hl=""
for host in `cat $PBS_NODEFILE`
do
if ["$hl" = "" ]
then hl="$host:$procs_per_host"
else hl="${hl}:$host:$procs_per_host"
fi
done
echo Machines: $hl
#-i input.dat includes the input of analysis in APDL format
#-o file.out is output file from ansys where all text outputs will be redirected
#-p the name of license feature (aa_r=ANSYS Academic Research, ane3fl=Multiphysics(commercial), aa_r_dy=Academic AUTODYN)
/ansys_inc/v145/ansys/bin/ansys145 -b -dis -p aa_r -i input.dat -o file.out -machines $hl -dir $WORK_DIR
```
The header of the PBS file (above) is common and its description can be found on [this site](../../resources-allocation-policy/). [SVS FEM](http://www.svsfem.cz) recommends addressing resources by the keywords nodes and ppn. These keywords allow you to directly specify the number of nodes (computers) and cores per node (ppn) which will be utilized by the job. The rest of the script also assumes such a structure of allocated resources.
The working directory has to be created before the PBS job is submitted to the queue. The input file should be in the working directory, or the full path to the input file has to be specified. The input file has to be a common APDL file, which is passed to the ansys solver via the -i parameter.
The **license** should be selected by the -p parameter. Licensed products are the following: aa_r (ANSYS **Academic** Research), ane3fl (ANSYS Multiphysics, **commercial**), aa_r_dy (ANSYS **Academic** AUTODYN). [More about licensing here](licensing/)