Commit d7099739 authored by David Hrbáč's avatar David Hrbáč
Browse files

Auto capitalize

parent 9ab6455c
Showing 98 additions and 98 deletions
......@@ -18,7 +18,7 @@ However, executing a huge number of jobs via the PBS queue may strain the system.
1. A user is allowed to submit at most 100 jobs. Each job may be [a job array](capacity-computing/#job-arrays).
2. The array size is at most 1000 subjobs.
## Job arrays
## Job Arrays
!!! Note "Note"
A huge number of jobs may be easily submitted and managed as a job array.
......@@ -74,7 +74,7 @@ In this example, the submit directory holds the 900 input files, executable mypr
If a huge number of parallel multicore jobs (i.e. multinode, multithreaded, e.g. MPI-enabled) needs to be run, then a job array approach should also be used. The main difference compared to the previous example using one node is that the local scratch should not be used (as it is not shared between nodes) and MPI or another technique for parallel multinode runs has to be used properly.
### Submit the job array
### Submit the Job Array
To submit the job array, use the qsub -J command. The 900 jobs of the [example above](capacity-computing/#array_example) may be submitted like this:
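A minimal sketch of such a submission, assuming the jobscript from the example and indices 1 through 900:
```bash
$ qsub -N JOBNAME -J 1-900 jobscript
```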
......@@ -93,7 +93,7 @@ $ qsub -N JOBNAME -J 9-10:2 jobscript
This will only choose the lower index (9 in this example) for submitting/running your job.
### Manage the job array
### Manage the Job Array
Check the status of the job array with the qstat command.
......@@ -147,7 +147,7 @@ $ qstat -u $USER -tJ
Read more on job arrays in the [PBSPro Users guide](../../pbspro-documentation/).
## GNU parallel
## GNU Parallel
!!! Note "Note"
Use GNU parallel to run many single core tasks on one node.
......@@ -161,7 +161,7 @@ $ module add parallel
$ man parallel
```
### GNU parallel jobscript
### GNU Parallel jobscript
The GNU parallel shell executes multiple instances of the jobscript using all cores on the node. The instances execute different work, controlled by the $PARALLEL_SEQ variable.
......@@ -205,7 +205,7 @@ cp output $PBS_O_WORKDIR/$TASK.out
In this example, tasks from tasklist are executed via GNU parallel. The jobscript executes multiple instances of itself in parallel, on all cores of the node. Once an instance of the jobscript is finished, a new instance starts until all entries in tasklist are processed. The currently processed entry of the tasklist may be retrieved via the $1 variable. The variable $TASK expands to one of the input filenames from tasklist. We copy the input file to the local scratch, execute myprog.x, and copy the output file back to the submit directory, under the $TASK.out name.
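A minimal sketch of such a jobscript follows; the scratch path, project ID, queue, and walltime are assumptions to adjust to your setup:
```bash
#!/bin/bash
#PBS -A PROJECT_ID
#PBS -q qprod
#PBS -l select=1:ncpus=16,walltime=02:00:00

# when started by PBS (PARALLEL_SEQ unset), re-launch this script under GNU parallel
[ -z "$PARALLEL_SEQ" ] && { module add parallel; exec parallel -a $PBS_O_WORKDIR/tasklist $0; }

# each parallel instance works in its own scratch directory
SCR=/lscratch/$PBS_JOBID/$PARALLEL_SEQ
mkdir -p $SCR; cd $SCR || exit

TASK=$1                              # currently processed tasklist entry
cp $PBS_O_WORKDIR/$TASK input        # stage the input file to local scratch
myprog.x < input > output            # run the calculation
cp output $PBS_O_WORKDIR/$TASK.out   # copy the result back under the $TASK.out name
```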
### Submit the job
### Submit the Job
To submit the job, use the qsub command. The 101 tasks' job of the [example above](capacity-computing/#gp_example) may be submitted like this:
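A sketch of the submission; the whole job runs on a single node, so no -J array option is needed:
```bash
$ qsub -N JOBNAME jobscript
```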
......@@ -218,7 +218,7 @@ In this example, we submit a job of 101 tasks. 16 input files will be processed
Please note the #PBS directives at the beginning of the jobscript file; don't forget to set your valid PROJECT_ID and the desired queue.
## Job arrays and GNU parallel
## Job Arrays and GNU Parallel
!!! Note "Note"
Combine job arrays and GNU parallel for the best throughput of single-core jobs
......@@ -228,7 +228,7 @@ While job arrays are able to utilize all available computational nodes, the GNU
!!! Note "Note"
Every subjob in an array runs GNU parallel to utilize all cores on the node
### GNU parallel, shared jobscript
### GNU Parallel, Shared jobscript
A combined approach, very similar to job arrays, can be taken. A job array is submitted to the queuing system. The subjobs run GNU parallel. The GNU parallel shell executes multiple instances of the jobscript using all cores on the node. The instances execute different work, controlled by the $PBS_JOB_ARRAY and $PARALLEL_SEQ variables.
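A sketch of how the two variables might combine to pick a task from the tasklist; $PBS_ARRAY_INDEX is used here as the standard PBS Pro name for the subjob index:
```bash
# map (array index, parallel sequence number) to one line of the tasklist
IDX=$(($PBS_ARRAY_INDEX + $PARALLEL_SEQ - 1))
TASK=$(sed -n "${IDX}p" $PBS_O_WORKDIR/tasklist)
[ -z "$TASK" ] && exit   # no more work for this slot
```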
......@@ -289,7 +289,7 @@ When deciding these values, think about the following guiding rules:
2. The number of tasks should be a multiple of 16.
3. These rules are valid only when all tasks have similar task walltimes T.
### Submit the job array
### Submit the Job Array
To submit the job array, use the qsub -J command. The 992 tasks' job of the [example above](capacity-computing/#combined_example) may be submitted like this:
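A sketch of such a submission, assuming each subjob processes 32 consecutive tasks from the tasklist, i.e. an array index stepped by 32:
```bash
$ qsub -N JOBNAME -J 1-992:32 jobscript
```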
......
# Job scheduling
## Job execution priority
## Job Execution Priority
The scheduler gives each job an execution priority and then uses this job execution priority to select which job(s) to run.
......@@ -10,7 +10,7 @@ Job execution priority on Anselm is determined by these job properties (in order
2. fair-share priority
3. eligible time
### Queue priority
### Queue Priority
Queue priority is the priority of the queue in which the job is queued before execution.
......@@ -18,7 +18,7 @@ Queue priority has the biggest impact on job execution priority. Execution prior
Queue priorities can be seen at <https://extranet.it4i.cz/anselm/queues>
### Fair-share priority
### Fair-Share Priority
Fair-share priority is a priority calculated from recent usage of resources. Fair-share priority is calculated per project; all members of a project share the same fair-share priority. Projects with higher recent usage have a lower fair-share priority than projects with lower or no recent usage.
......@@ -40,7 +40,7 @@ Jobs queued in the qexp queue are not counted towards the project's usage.
The calculated fair-share priority can also be seen as the Resource_List.fairshare attribute of a job.
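For example, it may be inspected with qstat (the job ID is a placeholder):
```bash
$ qstat -f 12345 | grep Resource_List.fairshare
```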
### Eligible time
### Eligible Time
Eligible time is the amount (in seconds) of eligible time a job has accrued while waiting to run. Jobs with higher eligible time gain higher priority.
......
......@@ -60,9 +60,9 @@ By default, the PBS batch system sends an e-mail only when the job is aborted. D
$ qsub -m n
```
## Advanced job placement
## Advanced Job Placement
### Placement by name
### Placement by Name
Specific nodes may be allocated via the PBS
......@@ -72,7 +72,7 @@ qsub -A OPEN-0-0 -q qprod -l select=1:ncpus=16:host=cn171+1:ncpus=16:host=cn172
In this example, we allocate nodes cn171 and cn172, all 16 cores per node, for 24 hours. Consumed resources will be accounted to the Project identified by Project ID OPEN-0-0. The resources will be available interactively.
### Placement by CPU type
### Placement by CPU Type
Nodes equipped with the Intel Xeon E5-2665 CPU have a base clock frequency of 2.4 GHz, nodes equipped with the Intel Xeon E5-2470 CPU have a base frequency of 2.3 GHz (see section Compute Nodes for details). Nodes may be selected via the PBS resource attribute cpu_freq.
......@@ -87,7 +87,7 @@ $ qsub -A OPEN-0-0 -q qprod -l select=4:ncpus=16:cpu_freq=24 -I
In this example, we allocate 4 nodes, 16 cores, selecting only the nodes with Intel Xeon E5-2665 CPU.
### Placement by IB switch
### Placement by IB Switch
Groups of computational nodes are connected to chassis-integrated Infiniband switches. These switches form the leaf switch layer of the [Infiniband network](../network/) fat tree topology. Nodes sharing a leaf switch can communicate most efficiently. Sharing the same switch prevents hops in the network and provides for unbiased, most efficient network communication.
......@@ -101,9 +101,9 @@ We recommend allocating compute nodes of a single switch when best possible comp
In this example, we request all 18 nodes sharing the isw11 switch for 24 hours. A full chassis will be allocated.
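A sketch of such an allocation, assuming the leaf switch is exposed as the PBS resource attribute ibswitch:
```bash
$ qsub -A OPEN-0-0 -q qprod -l select=18:ncpus=16:ibswitch=isw11 -l walltime=24:00:00 -I
```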
## Advanced job handling
## Advanced Job Handling
### Selecting Turbo Boost off
### Selecting Turbo Boost Off
Intel Turbo Boost Technology is on by default. We strongly recommend keeping the default.
......@@ -115,7 +115,7 @@ If necessary (such as in case of benchmarking) you can disable the Turbo for all
More about the Intel Turbo Boost in the TurboBoost section
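For reference, a sketch of such a request, assuming the Turbo Boost state is controlled by the cpu_turbo_boost PBS resource attribute:
```bash
$ qsub -A OPEN-0-0 -q qprod -l select=4:ncpus=16 -l cpu_turbo_boost=0 -l walltime=24:00:00 -I
```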
### Advanced examples
### Advanced Examples
In the following example, we select an allocation for benchmarking a very special and demanding MPI program. We request Turbo off, 2 full chassis of compute nodes (nodes sharing the same IB switches) for 30 minutes:
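A sketch of such an allocation; the switch names, benchmark binary, and the cpu_turbo_boost attribute are assumptions:
```bash
$ qsub -A OPEN-0-0 -q qprod \
    -l select=18:ncpus=16:ibswitch=isw10+18:ncpus=16:ibswitch=isw11 \
    -l cpu_turbo_boost=0,walltime=00:30:00 \
    -N Benchmark ./mybenchmark
```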
......
......@@ -18,9 +18,9 @@ In general PRACE users already have a PRACE account setup through their HOMESITE
If there's a special need, a PRACE user can get a standard (local) account at IT4Innovations. To get an account on the Anselm cluster, the user needs to obtain the login credentials. The procedure is the same as for general users of the cluster, so please see the corresponding section of the general documentation here.
## Accessing the cluster
## Accessing the Cluster
### Access with GSI-SSH
### Access With GSI-SSH
For all PRACE users, the method for interactive access (login) and data transfer based on grid services from the Globus Toolkit (GSI SSH and GridFTP) is supported.
......@@ -100,7 +100,7 @@ Although the preferred and recommended file transfer mechanism is [using GridFTP
$ gsiscp -P 2222 anselm-prace.it4i.cz:_ANSELM_PATH_TO_YOUR_FILE_ _LOCAL_PATH_TO_YOUR_FILE_
```
### Access to X11 applications (VNC)
### Access to X11 Applications (VNC)
If the user needs to run an X11-based graphical application and does not have an X11 server, the applications can be run using the VNC service. If the user is using regular SSH-based access, please see the section in the general documentation.
......@@ -110,11 +110,11 @@ If the user uses GSI SSH based access, then the procedure is similar to the SSH
$ gsissh -p 2222 anselm.it4i.cz -L 5961:localhost:5961
```
### Access with SSH
### Access With SSH
After successfully obtaining the login credentials for the local IT4Innovations account, PRACE users can access the cluster as regular users using SSH. For more information, please see the section in the general documentation.
## File transfers
## File Transfers
PRACE users can use the same transfer mechanisms as regular users (if they've undergone the full registration procedure). For information about this, please see the section in the general documentation.
......@@ -197,7 +197,7 @@ Generally both shared file systems are available through GridFTP:
More information about the shared file systems is available [here](storage/).
## Usage of the cluster
## Usage of the Cluster
There are some limitations for PRACE users when using the cluster. By default, PRACE users aren't allowed to access special queues in PBS Pro to gain high priority or exclusive access to some special equipment, like accelerated nodes and high-memory (fat) nodes. There may also be restrictions on obtaining a working license for the commercial software installed on the cluster, mostly because of the license agreement or because of an insufficient number of licenses.
......
......@@ -17,15 +17,15 @@ Currently two compute nodes are dedicated for this service with following config
| Local disk drive | yes - 500 GB |
| Compute network | InfiniBand QDR |
## Schematic overview
## Schematic Overview
![rem_vis_scheme](../img/scheme.png "rem_vis_scheme")
![rem_vis_legend](../img/legend.png "rem_vis_legend")
## How to use the service
## How to Use the Service
### Setup and start your own TurboVNC server
### Setup and Start Your Own TurboVNC Server
TurboVNC is designed and implemented for cooperation with VirtualGL and available for free for all major platforms. For more information and download, please refer to: <http://sourceforge.net/projects/turbovnc/>
......@@ -33,11 +33,11 @@ TurboVNC is designed and implemented for cooperation with VirtualGL and availabl
The procedure is:
#### 1. Connect to a login node
#### 1. Connect to a Login Node
Please [follow the documentation](shell-and-data-access/).
#### 2. Run your own instance of TurboVNC server
#### 2. Run Your Own Instance of TurboVNC Server
To have the OpenGL acceleration, **24 bit color depth must be used**. Otherwise only the geometry (desktop size) definition is needed.
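A sketch of starting the server on a login node; the module name is an assumption, check module avail for the exact version:
```bash
$ module load turbovnc
$ vncserver -geometry 1600x900 -depth 24
```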
......@@ -56,7 +56,7 @@ Starting applications specified in /home/username/.vnc/xstartup.turbovnc
Log file is /home/username/.vnc/login2:1.log
```
#### 3. Remember which display number your VNC server runs (you will need it in the future to stop the server)
#### 3. Remember Which Display Number Your VNC Server Runs (You Will Need It in the Future to Stop the Server)
```bash
$ vncserver -list
......@@ -69,7 +69,7 @@ X DISPLAY # PROCESS ID
In this example the VNC server runs on display **:1**.
#### 4. Remember the exact login node, where your VNC server runs
#### 4. Remember the Exact Login Node, Where Your VNC Server Runs
```bash
$ uname -n
......@@ -78,7 +78,7 @@ login2
In this example the VNC server runs on **login2**.
#### 5. Remember on which TCP port your own VNC server is running
#### 5. Remember on Which TCP Port Your Own VNC Server Is Running
To get the port you have to look to the log file of your VNC server.
......@@ -89,7 +89,7 @@ $ grep -E "VNC.*port" /home/username/.vnc/login2:1.log
In this example the VNC server listens on TCP port **5901**.
#### 6. Connect to the login node where your VNC server runs with SSH to tunnel your VNC session
#### 6. Connect to the Login Node Where Your VNC Server Runs With SSH to Tunnel Your VNC Session
Tunnel the TCP port on which your VNC server is listening.
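A sketch, assuming the server from the example above runs on login2 and listens on port 5901 (adjust to your own node and port):
```bash
$ ssh login2.anselm.it4i.cz -L 5901:localhost:5901
```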
......@@ -101,11 +101,11 @@ x-window-system/
If you use Windows and Putty, please refer to port forwarding setup in the documentation:
[x-window-and-vnc#section-12](../get-started-with-it4innovations/accessing-the-clusters/graphical-user-interface/x-window-system/)
#### 7. If you don't have Turbo VNC installed on your workstation
#### 7. If You Don't Have Turbo VNC Installed on Your Workstation
Get it from: <http://sourceforge.net/projects/turbovnc/>
#### 8. Run TurboVNC Viewer from your workstation
#### 8. Run TurboVNC Viewer From Your Workstation
Mind that you should connect through the SSH tunneled port. In this example it is 5901 on your workstation (localhost).
......@@ -115,11 +115,11 @@ $ vncviewer localhost:5901
If you use the Windows version of TurboVNC Viewer, just run the Viewer and use the address **localhost:5901**.
#### 9. Proceed to the chapter "Access the visualization node"
#### 9. Proceed to the Chapter "Access the Visualization Node"
Now you should have a working TurboVNC session connected to your workstation.
#### 10. After you end your visualization session
#### 10. After You End Your Visualization Session
Don't forget to correctly shut down your own VNC server on the login node!
......@@ -127,7 +127,7 @@ Don't forget to correctly shutdown your own VNC server on the login node!
$ vncserver -kill :1
```
### Access the visualization node
### Access the Visualization Node
**To access the node, use the dedicated PBS Professional scheduler queue qviz**. The queue has the following properties:
......@@ -143,7 +143,7 @@ Currently when accessing the node, each user gets 4 cores of a CPU allocated, th
To access the visualization node, follow these steps:
#### 1. In your VNC session, open a terminal and allocate a node using PBSPro qsub command
#### 1. In Your VNC Session, Open a Terminal and Allocate a Node Using PBSPro qsub Command
This step is necessary to allow you to proceed with the next steps.
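A minimal sketch of such an allocation; the project ID and walltime are placeholders:
```bash
$ qsub -I -q qviz -A PROJECT_ID -l select=1:ncpus=4 -l walltime=04:00:00
```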
......@@ -170,7 +170,7 @@ srv8
In this example the visualization session was assigned to node **srv8**.
#### 2. In your VNC session open another terminal (keep the one with interactive PBSPro job open)
#### 2. In Your VNC Session Open Another Terminal (Keep the One With Interactive PBSPro Job Open)
Set up the VirtualGL connection to the node that PBSPro allocated for your job.
......@@ -180,13 +180,13 @@ $ vglconnect srv8
You will be connected through the created VirtualGL tunnel to the visualization node, where you will have a shell.
#### 3. Load the VirtualGL module
#### 3. Load the VirtualGL Module
```bash
$ module load virtualgl/2.4
```
#### 4. Run your desired OpenGL accelerated application using VirtualGL script "vglrun"
#### 4. Run Your Desired OpenGL Accelerated Application Using VirtualGL Script "Vglrun"
```bash
$ vglrun glxgears
......@@ -199,7 +199,7 @@ $ module load marc/2013.1
$ vglrun mentat
```
#### 5. After you end your work with the OpenGL application
#### 5. After You End Your Work With the OpenGL Application
Just log out from the visualization node, exit both open terminals, and end your VNC server session as described above.
......
......@@ -17,7 +17,7 @@ The resources are allocated to the job in a fair-share fashion, subject to const
Read more on the [Resource Allocation Policy](resources-allocation-policy/) page.
## Job submission and execution
## Job Submission and Execution
!!! Note "Note"
Use the **qsub** command to submit your jobs.
......@@ -26,7 +26,7 @@ The qsub submits the job into the queue. The qsub command creates a request to t
Read more on the [Job submission and execution](job-submission-and-execution/) page.
## Capacity computing
## Capacity Computing
!!! Note "Note"
Use job arrays when running a huge number of jobs.
......
......@@ -34,7 +34,7 @@ Jobs that exceed the reserved wall clock time (Req'd Time) get killed automatica
Anselm users may check current queue configuration at <https://extranet.it4i.cz/anselm/queues>.
### Queue status
### Queue Status
!!! tip
Check the status of jobs, queues and compute nodes at <https://extranet.it4i.cz/anselm/>
......@@ -107,11 +107,11 @@ Options:
## Resources Accounting Policy
### The Core-Hour
### The Core-Hour
The resources that are currently subject to accounting are the core-hours. The core-hours are accounted on a wall clock basis. The accounting runs whenever the computational cores are allocated or blocked via the PBS Pro workload manager (the qsub command), regardless of whether the cores are actually used for any calculation. 1 core-hour is defined as 1 processor core allocated for 1 hour of wall clock time. Allocating a full node (16 cores) for 1 hour accounts for 16 core-hours. See an example in the [Job submission and execution](job-submission-and-execution/) section.
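For example, a job that holds 4 full nodes (4 x 16 = 64 cores) for 10 hours of wall clock time is accounted 640 core-hours, whether or not the cores perform any computation.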
### Check consumed resources
### Check Consumed Resources
!!! Note "Note"
The **it4ifree** command is a part of the it4i.portal.clients package, located here: <https://pypi.python.org/pypi/it4i.portal.clients>
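Once the client is installed and configured, a quick check of the consumed and remaining core-hours is (a sketch; the command prints a per-project summary):
```bash
$ it4ifree
```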
......
......@@ -115,7 +115,7 @@ On Windows, use [WinSCP client](http://winscp.net/eng/download.php) to transfer
More information about the shared file systems is available [here](storage/).
## Connection restrictions
## Connection Restrictions
Outgoing connections from Anselm Cluster login nodes to the outside world are restricted to the following ports:
......@@ -131,9 +131,9 @@ Outgoing connections, from Anselm Cluster login nodes to the outside world, are
Outgoing connections from Anselm Cluster compute nodes are restricted to the internal network. Direct connections from compute nodes to the outside world are cut.
## Port forwarding
## Port Forwarding
### Port forwarding from login nodes
### Port Forwarding From Login Nodes
!!! Note "Note"
Port forwarding allows an application running on Anselm to connect to an arbitrary remote host and port.
......@@ -159,7 +159,7 @@ $ ssh -L 6000:localhost:1234 remote.host.com
!!! note
Port number 6000 is chosen as an example only. Pick any free port.
### Port forwarding from compute nodes
### Port Forwarding From Compute Nodes
Remote port forwarding from compute nodes allows applications running on the compute nodes to access hosts outside Anselm Cluster.
......@@ -173,7 +173,7 @@ $ ssh -TN -f -L 6000:localhost:6000 login1
In this example, we assume that port forwarding from login1:6000 to remote.host.com:1234 has been established beforehand. By accessing localhost:6000, an application running on a compute node will see the response of remote.host.com:1234.
### Using proxy servers
### Using Proxy Servers
Port forwarding is static; each single port is mapped to a particular port on the remote host. A connection to another remote host requires a new forward.
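One possible pattern (a sketch, not the documented procedure): run a SOCKS proxy on your workstation and expose it to the cluster through a single remote forward, so that many destinations can be reached through one port:
```bash
# on your workstation: start a SOCKS proxy listening on local port 1080
$ ssh -D 1080 localhost
# on your workstation: make that proxy reachable from the Anselm login node as port 6000
$ ssh -R 6000:localhost:1080 anselm.it4i.cz
```
Applications on the cluster can then be pointed at localhost:6000 as a SOCKS proxy.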
......
......@@ -3,7 +3,7 @@
[ANSYS Fluent](http://www.ansys.com/products/fluids/ansys-fluent)
software contains the broad physical modeling capabilities needed to model flow, turbulence, heat transfer, and reactions for industrial applications ranging from air flow over an aircraft wing to combustion in a furnace, from bubble columns to oil platforms, from blood flow to semiconductor manufacturing, and from clean room design to wastewater treatment plants. Special models that give the software the ability to model in-cylinder combustion, aeroacoustics, turbomachinery, and multiphase systems have served to broaden its reach.
## Common way to run Fluent over PBS file
## Common Way to Run Fluent Over PBS File
To run ANSYS Fluent in batch mode you can utilize/modify the default fluent.pbs script and execute it via the qsub command.
......@@ -56,7 +56,7 @@ Journal file with definition of the input geometry and boundary conditions and d
The appropriate dimension of the problem has to be set by the parameter (2d/3d).
## Fast way to run Fluent from command line
## Fast Way to Run Fluent From Command Line
```bash
fluent solver_version [FLUENT_options] -i journal_file -pbs
......@@ -64,7 +64,7 @@ fluent solver_version [FLUENT_options] -i journal_file -pbs
This syntax will start the ANSYS FLUENT job under PBS Professional using the qsub command in a batch manner. When resources are available, PBS Professional will start the job and return a job ID, usually in the form of _job_ID.hostname_. This job ID can then be used to query, control, or stop the job using standard PBS Professional commands, such as qstat or qdel. The job will be run out of the current working directory, and all output will be written to the file fluent.o _job_ID_.
## Running Fluent via user's config file
## Running Fluent via User's Config File
The sample script uses a configuration file called pbs_fluent.conf if no command line arguments are present. This configuration file should be present in the directory from which the jobs are submitted (which is also the directory in which the jobs are executed). The following is an example of what the content of pbs_fluent.conf can be:
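A sketch of what such a file might contain; the exact set of variables is defined by the fluent.pbs sample script, so treat the names and values below as illustrative:
```bash
input="example_small.flin"   # journal file
case="main.cas"              # case file
fluent_args="3d"             # problem dimension and further Fluent arguments
outfile="fluent_test.out"    # output file name
mpp="true"                   # run the multi-processor (parallel) version
```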
......@@ -141,7 +141,7 @@ To run ANSYS Fluent in batch mode with user's config file you can utilize/modify
It runs the jobs out of the directory from which they are submitted (PBS_O_WORKDIR).
## Running Fluent in parralel
## Running Fluent in Parallel
Fluent can be run in parallel only under an Academic Research license. To do so, the ANSYS Academic Research license must be placed before the ANSYS CFD license in the user preferences. To make this change, the anslic_admin utility should be run
......
......@@ -12,7 +12,7 @@ Molpro software package is available only to users that have a valid license. Pl
To run Molpro, you need to have a valid license token present in "$HOME/.molpro/token". You can download the token from the [Molpro website](https://www.molpro.net/licensee/?portal=licensee).
## Installed version
## Installed Version
Currently, version 2010.1, patch level 45, is installed on Anselm; it is the parallel version compiled with Intel compilers and Intel MPI.
......
......@@ -8,7 +8,7 @@ NWChem aims to provide its users with computational chemistry tools that are sca
[Homepage](http://www.nwchem-sw.org/index.php/Main_Page)
## Installed versions
## Installed Versions
The following versions are currently installed:
......
# Compilers
## Available compilers, including GNU, INTEL and UPC compilers
## Available Compilers, Including GNU, INTEL and UPC Compilers
Currently there are several compilers for different programming languages available on the Anselm cluster:
......
......@@ -18,7 +18,7 @@ In case of debugging on accelerators:
- 1 user can debug on up to 8 accelerators, or
- 8 users can debug on a single accelerator.
## Compiling Code to run with DDT
## Compiling Code to Run With DDT
### Modules
......@@ -43,7 +43,7 @@ $ mpicc -g -O0 -o test_debug test.c
$ mpif90 -g -O0 -o test_debug test.f
```
### Compiler flags
### Compiler Flags
Before debugging, you need to compile your code with these flags:
......@@ -51,7 +51,7 @@ Before debugging, you need to compile your code with these flags:
- **-g**: Generates extra debugging information usable by GDB. -g3 includes even more debugging information. This option is available for GNU and Intel C/C++ and Fortran compilers.
- **-O0**: Suppresses all optimizations.
## Starting a Job with DDT
## Starting a Job With DDT
Be sure to log in with X window forwarding enabled. This could mean using the -X option in ssh:
......
......@@ -16,7 +16,7 @@ _Figure 1. Screenshot of CUBE displaying data from Scalasca._
Each node in the tree is colored by severity (the color scheme is displayed at the bottom of the window, ranging from the least severe blue to the most severe being red). For example in Figure 1, we can see that most of the point-to-point MPI communication happens in routine exch_qbc, colored red.
## Installed versions
## Installed Versions
Currently, there are two versions of CUBE 4.2.3 available as [modules](../../environment-and-modules/):
......
......@@ -4,7 +4,7 @@
We provide state-of-the-art programs and tools to develop, profile, and debug HPC codes at IT4Innovations. On these pages, we provide an overview of the profiling and debugging tools available on Anselm at IT4I.
## Intel debugger
## Intel Debugger
The Intel debugger version 13.0 is available via the intel module. The debugger works for applications compiled with the C and C++ compilers and the ifort Fortran 77/90/95 compiler. The debugger provides a Java GUI environment. Use an X display for running the GUI.
......@@ -48,7 +48,7 @@ TotalView is a source- and machine-level debugger for multi-process, multi-threa
Read more at the [Totalview](debuggers/total-view/) page.
## Vampir trace analyzer
## Vampir Trace Analyzer
Vampir is a GUI trace analyzer for traces in OTF format.
......
......@@ -4,7 +4,7 @@
Intel PCM (Performance Counter Monitor) is a tool to monitor performance hardware counters on Intel® processors, similar to [PAPI](papi/). The difference between PCM and PAPI is that PCM supports only Intel hardware, but PCM can also monitor uncore metrics, like memory controllers and QuickPath Interconnect links.
## Installed version
## Installed Version
Currently, version 2.6 is installed. To load the [module](../../environment-and-modules/), issue:
......@@ -12,11 +12,11 @@ Currently installed version 2.6. To load the [module](../../environment-and-modu
$ module load intelpcm
```
## Command line tools
## Command Line Tools
PCM provides a set of tools to monitor the system and/or applications.
### pcm-memory
### Pcm-Memory
Measures memory bandwidth of your application or the whole system. Usage:
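Assuming the same calling convention as the other PCM tools (an assumption), a measurement over 5-second intervals might look like:
```bash
$ pcm-memory.x 5
```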
......@@ -55,23 +55,23 @@ Sample output:
---------------------------------------||---------------------------------------
```
### pcm-msr
### Pcm-Msr
The pcm-msr.x command can be used to read/write model-specific registers of the CPU.
### pcm-numa
### Pcm-Numa
NUMA monitoring utility does not work on Anselm.
### pcm-pcie
### Pcm-Pcie
Can be used to monitor PCI Express bandwidth. Usage: `pcm-pcie.x <delay>`
### pcm-power
### Pcm-Power
Displays energy usage and thermal headroom for CPU and DRAM sockets. Usage: `pcm-power.x <delay> | <external program>`
### pcm
### Pcm
This command provides an overview of performance counters and memory usage. Usage: `pcm.x <delay> | <external program>`
......@@ -184,7 +184,7 @@ Sample output :
Cleaning up
```
### pcm-sensor
### Pcm-Sensor
Can be used as a sensor for ksysguard GUI, which is currently not installed on Anselm.
......
......@@ -22,7 +22,7 @@ This will load the default version. Execute module avail papi for a list of inst
The bin directory of PAPI (which is automatically added to $PATH upon loading the module) contains various utilities.
### papi_avail
### Papi_avail
Prints which preset events are available on the current CPU. The third column indicates whether the preset event is available on the current CPU.
......@@ -60,19 +60,19 @@ Prints which preset events are available on the current CPU. The third column in
....
```
### papi_native_avail
### Papi_native_avail
Prints which native events are available on the current CPU.
### papi_cost
### Papi_cost
Measures the cost (in cycles) of basic PAPI operations.
### papi_mem_info
### Papi_mem_info
Prints information about the memory architecture of the current CPU.
## PAPI API
## Papi API
PAPI provides two kinds of events:
......@@ -88,11 +88,11 @@ To use PAPI in your application, you need to link the appropriate include file.
The include path is automatically added by the papi module to $INCLUDE.
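A sketch of compiling and linking a C program against PAPI after loading the module; it assumes the module also exports the library path, otherwise add the -I/-L paths explicitly:
```bash
$ module load papi
$ gcc mytest.c -o mytest -lpapi
```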
### High level API
### High Level API
Please refer to <http://icl.cs.utk.edu/projects/papi/wiki/PAPIC:High_Level> for a description of the High level API.
### Low level API
### Low Level API
Please refer to <http://icl.cs.utk.edu/projects/papi/wiki/PAPIC:Low_Level> for a description of the Low level API.
......@@ -100,7 +100,7 @@ Please refer to <http://icl.cs.utk.edu/projects/papi/wiki/PAPIC:Low_Level> for a
PAPI provides the most accurate timers the platform can support. See <http://icl.cs.utk.edu/projects/papi/wiki/PAPIC:Timers>
### System information
### System Information
PAPI can be used to query some system information, such as CPU name and MHz. See <http://icl.cs.utk.edu/projects/papi/wiki/PAPIC:System_Information>
......
......@@ -6,7 +6,7 @@
Scalasca supports profiling of MPI, OpenMP and hybrid MPI+OpenMP applications.
## Installed versions
## Installed Versions
There are currently two versions of Scalasca 2.0 [modules](../../environment-and-modules/) installed on Anselm:
......@@ -25,7 +25,7 @@ Profiling a parallel application with Scalasca consists of three steps:
Instrumentation via "scalasca -instrument" is discouraged. Use [Score-P instrumentation](score-p/).
### Runtime measurement
### Runtime Measurement
After the application is instrumented, runtime measurement can be performed with the `scalasca -analyze` command. The syntax is:
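A sketch of the general form, with an MPI application as an example (the program name and rank count are placeholders):
```bash
$ scalasca -analyze mpiexec -n 16 ./myapp
```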
......@@ -45,7 +45,7 @@ Some notable Scalasca options are:
!!! Note
Scalasca can generate a huge amount of data, especially if tracing is enabled. Please consider saving the data to a [scratch directory](../../storage/storage/).
### Analysis of reports
### Analysis of Reports
For the analysis, you must have the [Score-P](score-p/) and [CUBE](cube/) modules loaded. The analysis is done in two steps: first, the data is preprocessed, and then the CUBE GUI tool is launched.
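A sketch of the second step, assuming an experiment directory produced by the measurement run (the directory name is a placeholder):
```bash
$ scalasca -examine scorep_myapp_16_sum
```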
......
......@@ -6,7 +6,7 @@ The [Score-P measurement infrastructure](http://www.vi-hps.org/projects/score-p/
Score-P can be used as an instrumentation tool for [Scalasca](scalasca/).
## Installed versions
## Installed Versions
There are currently two Score-P version 1.2.6 [modules](../../environment-and-modules/) installed on Anselm:
......@@ -21,7 +21,7 @@ There are three ways to instrument your parallel applications in order to enable
2. Manual instrumentation using API calls
3. Manual instrumentation using directives
### Automated instrumentation
### Automated Instrumentation
is the easiest method. Score-P will automatically add instrumentation to every routine entry and exit using compiler hooks, and will intercept MPI calls and OpenMP regions. This method might, however, produce a large amount of data. If you want to focus profiling on specific regions of your code, consider using the manual instrumentation methods. To use automated instrumentation, simply prepend scorep to your compilation command. For example, replace:
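For instance (the compiler and source file are placeholders):
```bash
$ mpif90 -c foo.f90           # original compilation command
$ scorep mpif90 -c foo.f90    # the same command with Score-P instrumentation
```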
......@@ -43,7 +43,7 @@ Usually your program is compiled using a Makefile or similar script, so it advis
It is important that scorep is also prepended to the linking command, in order to link with the Score-P instrumentation libraries.
### Manual instrumentation using API calls
### Manual Instrumentation Using API Calls
To use this kind of instrumentation, use scorep with the --user switch. You will then mark regions to be instrumented by inserting API calls.
......@@ -76,7 +76,7 @@ An example in C/C++ :
Please refer to the [documentation for description of the API](https://silc.zih.tu-dresden.de/scorep-current/pdf/scorep.pdf).
### Manual instrumentation using directives
### Manual Instrumentation Using Directives
This method uses POMP2 directives to mark regions to be instrumented. To use this method, use command scorep --pomp.
......
......@@ -27,7 +27,7 @@ You can check the status of the licenses here:
CUDA 64 0 64
```
## Compiling Code to run with TotalView
## Compiling Code to Run With TotalView
### Modules
......@@ -53,7 +53,7 @@ Compile the code:
mpif90 -g -O0 -o test_debug test.f
```
### Compiler flags
### Compiler Flags
Before debugging, you need to compile your code with these flags:
......@@ -61,7 +61,7 @@ Before debugging, you need to compile your code with these flags:
- **-g** : Generates extra debugging information usable by GDB. **-g3** includes even more debugging information. This option is available for GNU and INTEL C/C++ and Fortran compilers.
- **-O0** : Suppress all optimizations.
## Starting a Job with TotalView
## Starting a Job With TotalView
Be sure to log in with X window forwarding enabled. This could mean using the -X option in ssh:
......@@ -79,7 +79,7 @@ From the login node an interactive session with X windows forwarding (-X option)
Then launch the debugger with the totalview command followed by the name of the executable to debug.
### Debugging a serial code
### Debugging a Serial Code
To debug a serial code use:
......@@ -87,7 +87,7 @@ To debug a serial code use:
totalview test_debug
```
### Debugging a parallel code - option 1
### Debugging a Parallel Code - Option 1
To debug a parallel code compiled with **OpenMPI**, you need to set up your TotalView environment:
......@@ -140,7 +140,7 @@ At this point the main TotalView GUI window will appear and you can insert the b
![](../../../img/totalview2.png)
### Debugging a parallel code - option 2
### Debugging a Parallel Code - Option 2
Another option to start a new parallel debugging session from the command line is to let TotalView execute mpirun by itself. In this case, the user has to specify the MPI implementation used to compile the source code.
......