diff --git a/docs.it4i/anselm-cluster-documentation/software/debuggers/cube.md b/docs.it4i/anselm-cluster-documentation/software/debuggers/cube.md
index 2a11e646f6b76fc5c8365fd6fe8fd30fd05f60c3..45b2c465c7830aa7a8a81c5044542f949bddf7c3 100644
--- a/docs.it4i/anselm-cluster-documentation/software/debuggers/cube.md
+++ b/docs.it4i/anselm-cluster-documentation/software/debuggers/cube.md
@@ -6,7 +6,7 @@ CUBE is a graphical performance report explorer for displaying data from Score-P
 
 - **performance metric**, where a number of metrics are available, such as communication time or cache misses,
 - **call path**, which contains the call tree of your program
-- s**ystem resource**, which contains system's nodes, processes and threads, depending on the parallel programming model.
+- **system resource**, which contains the system's nodes, processes, and threads, depending on the parallel programming model.
 
 Each dimension is organized in a tree, for example the time performance metric is divided into Execution time and Overhead time, call path dimension is organized by files and routines in your source code etc.
 
@@ -27,7 +27,7 @@ Currently, there are two versions of CUBE 4.2.3 available as [modules](../../env
 
 CUBE is a graphical application. Refer to Graphical User Interface documentation for a list of methods to launch graphical applications on Anselm.
 
-!!! Note "Note"
+!!! Note
     Analyzing large data sets can consume large amount of CPU and RAM. Do not perform large analysis on login nodes.
 
 After loading the appropriate module, simply launch cube command, or alternatively you can use scalasca -examine command to launch the GUI. Note that for Scalasca datasets, if you do not analyze the data with scalasca -examine before to opening them with CUBE, not all performance data will be available.
diff --git a/docs.it4i/anselm-cluster-documentation/software/debuggers/intel-performance-counter-monitor.md b/docs.it4i/anselm-cluster-documentation/software/debuggers/intel-performance-counter-monitor.md
index be48242e4d3aa5a8dc0300f11c3d62c57277b3c4..b534dd81bf7d0c745c291c9c5abd95fae53629bd 100644
--- a/docs.it4i/anselm-cluster-documentation/software/debuggers/intel-performance-counter-monitor.md
+++ b/docs.it4i/anselm-cluster-documentation/software/debuggers/intel-performance-counter-monitor.md
@@ -69,11 +69,11 @@ Can be used to monitor PCI Express bandwith. Usage: pcm-pcie.x <delay>
 
 ### pcm-power
 
-Displays energy usage and thermal headroom for CPU and DRAM sockets. Usage: pcm-power.x <delay> \| <external program>
+Displays energy usage and thermal headroom for CPU and DRAM sockets. Usage: `pcm-power.x <delay> | <external program>`
 
 ### pcm
 
-This command provides an overview of performance counters and memory usage. Usage: pcm.x <delay> \| <external program>
+This command provides an overview of performance counters and memory usage. Usage: `pcm.x <delay> | <external program>`
 
 Sample output :
 
@@ -192,7 +192,7 @@ Can be used as a sensor for ksysguard GUI, which is currently not installed on A
 
 In a similar fashion to PAPI, PCM provides a C++ API to access the performance counter from within your application. Refer to the [Doxygen documentation](http://intel-pcm-api-documentation.github.io/classPCM.html) for details of the API.
 
-!!! Note "Note"
+!!! Note
     Due to security limitations, using PCM API to monitor your applications is currently not possible on Anselm. (The application must be run as root user)
 
 Sample program using the API :
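The sample program itself lies outside the context of the hunk above. For orientation, a minimal sketch against the PCM C++ API referenced in the Doxygen link above (`PCM::getInstance`, `getSystemCounterState`, `getIPC`, `getL3CacheHitRatio`); the header ships with the PCM sources, the build setup is installation-specific, and, per the note, programming the counters requires root:

```cpp
#include "cpucounters.h"   // header from the Intel PCM source tree
#include <iostream>

int main() {
    PCM *m = PCM::getInstance();
    // program() configures the hardware counters; this is the step that
    // needs root privileges (MSR access) and therefore fails on Anselm
    if (m->program() != PCM::Success) {
        std::cerr << "Could not program PCM counters" << std::endl;
        return 1;
    }

    SystemCounterState before = getSystemCounterState();
    // ... code to be measured goes here ...
    SystemCounterState after = getSystemCounterState();

    std::cout << "IPC: " << getIPC(before, after)
              << ", L3 cache hit ratio: " << getL3CacheHitRatio(before, after)
              << std::endl;

    m->cleanup();   // restore the default counter state
    return 0;
}
```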
diff --git a/docs.it4i/anselm-cluster-documentation/software/debuggers/intel-vtune-amplifier.md b/docs.it4i/anselm-cluster-documentation/software/debuggers/intel-vtune-amplifier.md
index 278c69ba3f1cb550815195d98b527974fa02247a..e9bae568d427dcda6f11dd0a728533e13d194d17 100644
--- a/docs.it4i/anselm-cluster-documentation/software/debuggers/intel-vtune-amplifier.md
+++ b/docs.it4i/anselm-cluster-documentation/software/debuggers/intel-vtune-amplifier.md
@@ -2,7 +2,7 @@
 
 ## Introduction
 
-Intel_® _VTune™ Amplifier, part of Intel Parallel studio, is a GUI profiling tool designed for Intel processors. It offers a graphical performance analysis of single core and multithreaded applications. A highlight of the features:
+Intel VTune Amplifier, part of Intel Parallel Studio, is a GUI profiling tool designed for Intel processors. It offers a graphical performance analysis of single core and multithreaded applications. A highlight of the features:
 
 - Hotspot analysis
 - Locks and waits analysis
@@ -26,7 +26,7 @@ and launch the GUI :
 $ amplxe-gui
 ```
 
-!!! Note "Note"
+!!! Note
     To profile an application with VTune Amplifier, special kernel modules need to be loaded. The modules are not loaded on Anselm login nodes, thus direct profiling on login nodes is not possible. Use VTune on compute nodes and refer to the documentation on using GUI applications.
 
 The GUI will open in new window. Click on "_New Project..._" to create a new project. After clicking _OK_, a new window with project properties will appear. At "_Application:_", select the bath to your binary you want to profile (the binary should be compiled with -g flag). Some additional options such as command line arguments can be selected. At "_Managed code profiling mode:_" select "_Native_" (unless you want to profile managed mode .NET/Mono applications). After clicking _OK_, your project is created.
@@ -47,7 +47,7 @@ Copy the line to clipboard and then you can paste it in your jobscript or in com
 
 ## Xeon Phi
 
-!!! Note "Note"
+!!! Note
    This section is outdated. It will be updated with new information soon.
 
 It is possible to analyze both native and offload Xeon Phi applications. For offload mode, just specify the path to the binary. For native mode, you need to specify in project properties:
@@ -58,7 +58,7 @@ Application parameters: mic0 source ~/.profile && /path/to/your/bin
 
 Note that we include source ~/.profile in the command to setup environment paths [as described here](../intel-xeon-phi/).
 
-!!! Note "Note"
+!!! Note
    If the analysis is interrupted or aborted, further analysis on the card might be impossible and you will get errors like "ERROR connecting to MIC card". In this case please contact our support to reboot the MIC card.
 
 You may also use remote analysis to collect data from the MIC and then analyze it in the GUI later :
diff --git a/docs.it4i/anselm-cluster-documentation/software/debuggers/papi.md b/docs.it4i/anselm-cluster-documentation/software/debuggers/papi.md
index f449dcc88209910298a450bfd93fc68ba86f410e..a08f5fa77e03047ac816411a2f5e94d1c10cdf93 100644
--- a/docs.it4i/anselm-cluster-documentation/software/debuggers/papi.md
+++ b/docs.it4i/anselm-cluster-documentation/software/debuggers/papi.md
@@ -68,7 +68,7 @@ Prints which native events are available on the current CPU.
 
 Measures the cost (in cycles) of basic PAPI operations.
 
-\###papi_mem_info
+### papi_mem_info
 
 Prints information about the memory architecture of the current CPU.
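The utilities above only report what the hardware offers; PAPI's main use, covered earlier in papi.md, is reading counters from within a program. Below is a minimal sketch using the standard PAPI C API, compiled as C++ to match the other example here. Event availability is CPU-specific, so check papi_avail first; the compile line (`g++ sketch.cpp -lpapi` after loading the papi module) is an assumption to adjust to the local installation:

```cpp
#include <papi.h>
#include <cstdio>

int main() {
    // initialize the library; returns the version number on success
    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) {
        std::fprintf(stderr, "PAPI initialization failed\n");
        return 1;
    }

    int eventset = PAPI_NULL;
    PAPI_create_eventset(&eventset);
    PAPI_add_event(eventset, PAPI_TOT_INS);   // total instructions retired
    PAPI_add_event(eventset, PAPI_TOT_CYC);   // total cycles

    PAPI_start(eventset);
    volatile double x = 0.0;
    for (int i = 0; i < 1000000; ++i)
        x += i * 0.5;                         // workload being measured
    long long counts[2];
    PAPI_stop(eventset, counts);              // read counters and stop

    std::printf("instructions: %lld, cycles: %lld\n", counts[0], counts[1]);
    return 0;
}
```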
diff --git a/docs.it4i/anselm-cluster-documentation/software/debuggers/scalasca.md b/docs.it4i/anselm-cluster-documentation/software/debuggers/scalasca.md
index 56032e03a42a50c0c229f2c29d1e5e54ff4e6285..0e46005e8e21844cd65b55b80a4dcb4852cb12a3 100644
--- a/docs.it4i/anselm-cluster-documentation/software/debuggers/scalasca.md
+++ b/docs.it4i/anselm-cluster-documentation/software/debuggers/scalasca.md
@@ -27,9 +27,9 @@ Instrumentation via " scalasca -instrument" is discouraged. Use [Score-P instrum
 
 ### Runtime measurement
 
-After the application is instrumented, runtime measurement can be performed with the " scalasca -analyze" command. The syntax is:
+After the application is instrumented, runtime measurement can be performed with the `scalasca -analyze` command. The syntax is:
 
-scalasca -analyze [scalasca options][launcher] [launcher options][program] [program options]
+`scalasca -analyze [scalasca options] [launcher] [launcher options] [program] [program options]`
 
 An example :
 
@@ -39,10 +39,10 @@ An example :
 
 Some notable Scalasca options are:
 
-**-t Enable trace data collection. By default, only summary data are collected.**
-**-e <directory> Specify a directory to save the collected data to. By default, Scalasca saves the data to a directory with prefix scorep\_, followed by name of the executable and launch configuration.**
+- **-t** Enable trace data collection. By default, only summary data are collected.
+- **-e <directory>** Specify a directory to save the collected data to. By default, Scalasca saves the data to a directory with prefix scorep\_, followed by the name of the executable and launch configuration.
 
-!!! Note "Note"
+!!! Note
    Scalasca can generate a huge amount of data, especially if tracing is enabled. Please consider saving the data to a [scratch directory](../../storage/storage/).
 
 ### Analysis of reports
diff --git a/docs.it4i/salomon/capacity-computing.md b/docs.it4i/salomon/capacity-computing.md
index 9ad9f94d2b466fc9525fd96434eb8a84a62318a0..2f2a7308b13cda92c50e689909eca3cf55763000 100644
--- a/docs.it4i/salomon/capacity-computing.md
+++ b/docs.it4i/salomon/capacity-computing.md
@@ -287,7 +287,7 @@ In this example, the jobscript executes in multiple instances in parallel, on al
 
 When deciding this values, think about following guiding rules :
 
-1. Let n=N/24. Inequality (n+1) \* T < W should hold. The N is number of tasks per subjob, T is expected single task walltime and W is subjob walltime. Short subjob walltime improves scheduling and job throughput.
+1. Let n = N / 24. The inequality (n + 1) * T < W should hold, where N is the number of tasks per subjob, T is the expected single-task walltime, and W is the subjob walltime. Short subjob walltime improves scheduling and job throughput.
 2. Number of tasks should be modulo 24.
 3. These rules are valid only when all tasks have similar task walltimes T.
diff --git a/docs.it4i/salomon/compute-nodes.md b/docs.it4i/salomon/compute-nodes.md
index 910ac9611c43c519283272f69baa09a5bc462428..26e1541ca2808353a8a25b482cbc770b77c668e5 100644
--- a/docs.it4i/salomon/compute-nodes.md
+++ b/docs.it4i/salomon/compute-nodes.md
@@ -7,7 +7,7 @@ Compute nodes with MIC accelerator **contains two Intel Xeon Phi 7120P accelerat
 
 [More about schematic representation of the Salomon cluster compute nodes IB topology](ib-single-plane-topology/).
 
-\###Compute Nodes Without Accelerator
+### Compute Nodes Without Accelerator
 
 - codename "grafton"
 - 576 nodes
 
 
 
@@ -17,7 +17,7 @@ Compute nodes with MIC accelerator **contains two Intel Xeon Phi 7120P accelerat
 
 
 
-\###Compute Nodes With MIC Accelerator
+### Compute Nodes With MIC Accelerator
 
 - codename "perrin"
 - 432 nodes
diff --git a/docs.it4i/salomon/software/ansys/ansys-fluent.md b/docs.it4i/salomon/software/ansys/ansys-fluent.md
index c9c2a8020d09ff4b8e4503252480f6a7a5687599..aefcfbf77da974a6ec14110ef571e7dcf1f8ba36 100644
--- a/docs.it4i/salomon/software/ansys/ansys-fluent.md
+++ b/docs.it4i/salomon/software/ansys/ansys-fluent.md
@@ -5,8 +5,6 @@ software contains the broad physical modeling capabilities needed to model flow,
 
 1. Common way to run Fluent over pbs file
 
-* * *
-
 To run ANSYS Fluent in batch mode you can utilize/modify the default fluent.pbs script and execute it via the qsub command.
 
 ```bash
@@ -60,8 +58,6 @@ The appropriate dimension of the problem has to be set by parameter (2d/3d).
 
 2. Fast way to run Fluent from command line
 
-* * *
-
 ```bash
 fluent solver_version [FLUENT_options] -i journal_file -pbs
 ```
@@ -70,8 +66,6 @@ This syntax will start the ANSYS FLUENT job under PBS Professional using the qs
 
 3. Running Fluent via user's config file
 
-* * *
-
 The sample script uses a configuration file called pbs_fluent.conf if no command line arguments are present. This configuration file should be present in the directory from which the jobs are submitted (which is also the directory in which the jobs are executed). The following is an example of what the content of pbs_fluent.conf can be:
 
 ```bash
@@ -149,8 +143,6 @@ It runs the jobs out of the directory from which they are submitted (PBS_O_WORKD
 
 4. Running Fluent in parralel
 
-* * *
-
 Fluent could be run in parallel only under Academic Research license. To do so this ANSYS Academic Research license must be placed before ANSYS CFD license in user preferences. To make this change anslic_admin utility should be run
 
 ```bash
diff --git a/docs.it4i/salomon/software/ansys/workbench.md b/docs.it4i/salomon/software/ansys/workbench.md
index 8bc1edb4dd7348d27fcdf57937f04e237e3cba83..8ed07d789dea69798e68c177ac1612a3e391ec88 100644
--- a/docs.it4i/salomon/software/ansys/workbench.md
+++ b/docs.it4i/salomon/software/ansys/workbench.md
@@ -12,7 +12,7 @@ Enable Distribute Solution checkbox and enter number of cores (eg. 48 to run on
 
 -mpifile /path/to/my/job/mpifile.txt
 ```
 
-Where /path/to/my/job is the directory where your project is saved. We will create the file mpifile.txt programatically later in the batch script. For more information, refer to _ANSYS Mechanical APDL Parallel Processing_ _Guide_.
+Where /path/to/my/job is the directory where your project is saved. We will create the file mpifile.txt programmatically later in the batch script. For more information, refer to the _ANSYS Mechanical APDL Parallel Processing Guide_.
 
 Now, save the project and close Workbench. We will use this script to launch the job:
diff --git a/docs.it4i/salomon/software/compilers.md b/docs.it4i/salomon/software/compilers.md
index ca8a0f747c0e0a70ac41696b46f977b98958715c..5f9a9ccbb74efe1c624ea26a5e003717c102a26b 100644
--- a/docs.it4i/salomon/software/compilers.md
+++ b/docs.it4i/salomon/software/compilers.md
@@ -181,10 +181,10 @@ To run the example on two compute nodes using all 48 cores, with 48 threads, iss
 
 For more information see the man pages.
 
-\##Java
+## Java
 
 For information how to use Java (runtime and/or compiler), please read the [Java page](java/).
 
-\##NVIDIA CUDA
+## NVIDIA CUDA
 
 For information how to work with NVIDIA CUDA, please read the [NVIDIA CUDA page](../../anselm-cluster-documentation/software/nvidia-cuda/).