intel-vtune-amplifier.md 4.16 KiB
  • Lukáš Krupčík's avatar
    Lukáš Krupčík committed
    Intel VTune Amplifier
    =====================
    
    Introduction
    ------------
    
    Intel® VTune™ Amplifier, part of Intel Parallel Studio, is a GUI profiling tool designed for Intel processors. It offers graphical performance analysis of single-core and multithreaded applications. Feature highlights:
    
    -   Hotspot analysis
    -   Locks and waits analysis
    -   Low-level specific counters, such as branch analysis and memory
        bandwidth
    -   Power usage analysis: frequency and sleep states
    
    
    ![screenshot](../../../img/vtune-amplifier.png)
    
    Usage
    -----
    To launch the GUI, first load the module:
    
    
    ```bash
        $ module add VTune/2016_update1
    ```
    
    and then launch the GUI:
    
    ```bash
        $ amplxe-gui
    ```
    
    
    !!! Note "Note"
    	To profile an application with VTune Amplifier, special kernel modules need to be loaded. The modules are not loaded on Anselm login nodes, thus direct profiling on login nodes is not possible. Use VTune on compute nodes and refer to the documentation on using GUI applications.
    
    The GUI will open in a new window. Click "*New Project...*" to create a new project. After you click *OK*, a new window with project properties will appear. At "*Application:*", select the path to the binary you want to profile (the binary should be compiled with the `-g` flag). Additional options, such as command line arguments, can also be set. At "*Managed code profiling mode:*", select "*Native*" (unless you want to profile managed-mode .NET/Mono applications). After you click *OK*, your project is created.
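
As a minimal sketch of the compile step mentioned above, the snippet below builds a binary with debug symbols. The file names `myapp.c` and `myapp` are hypothetical placeholders, and `gcc` is only an assumed default compiler; neither comes from this page.

```bash
# Hypothetical file names; replace with your own source and binary.
SRC=${SRC:-myapp.c}
BIN=${BIN:-myapp}
CC=${CC:-gcc}   # assumed default compiler; set CC to use another one

# -g  : emit debug symbols so samples can be mapped to functions and source lines
# -O2 : keep optimizations on, so the profile reflects the optimized code
BUILD_CMD="$CC -g -O2 -o $BIN $SRC"

if [ -f "$SRC" ]; then
    $BUILD_CMD
else
    # Nothing to compile here: just show the command that would be used.
    echo "source file $SRC not found; would run: $BUILD_CMD"
fi
```

Keeping optimization enabled alongside `-g` is a design choice: the debug symbols do not change the generated code, so the hotspots you see should match the optimized build you actually run.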
    
    To run a new analysis, click "*New analysis...*". You will see a list of possible analyses. Some of them are not available on the current CPU (e.g. Intel Atom analysis is not possible on a Sandy Bridge CPU); the GUI will show an error box if you select an unsupported analysis. For example, select "*Advanced Hotspots*". Clicking *Start* starts profiling of the application.
    
    Remote Analysis
    ---------------
    
    VTune Amplifier also allows a form of remote analysis. In this mode, data for analysis is collected from the command line without the GUI, and the results are then loaded into the GUI on another machine. This allows profiling without interactive graphical jobs. To perform a remote analysis, launch the GUI somewhere, open the new analysis window, and then click the "*Command line*" button in the bottom right corner. It will show the command line needed to perform the selected analysis.
    
    The command line will look like this:
    
    
    ```bash
        /apps/all/VTune/2016_update1/vtune_amplifier_xe_2016.1.1.434111/bin64/amplxe-cl -collect advanced-hotspots -knob collection-detail=stack-and-callcount -mrte-mode=native -target-duration-type=veryshort -app-working-dir /home/sta545/test -- /home/sta545/test_pgsesv
    ```
    
    Copy the line to the clipboard and paste it into your jobscript or onto the command line. After the collection has run, open the GUI again, click the menu button in the upper right corner, and select "*Open > Result...*". The GUI will load the results from the run.
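
The following is a rough sketch of how such a collection might be wrapped in a jobscript, not a definitive recipe. The paths in `APP`, `WORKDIR`, and `RESULT_DIR` are hypothetical placeholders, and choosing an explicit `-result-dir` plus a text summary via `-report summary` are additions for illustration that the GUI-generated command above does not include.

```bash
#!/bin/bash
# Hypothetical placeholders: adjust the paths to your own project.
APP=${APP:-$HOME/test/myapp}
WORKDIR=${WORKDIR:-$HOME/test}
RESULT_DIR=${RESULT_DIR:-$WORKDIR/vtune_result}

# Load the VTune module when the module command is available.
if type module >/dev/null 2>&1; then
    module add VTune/2016_update1
fi

# Collection command, kept in one function so it is easy to adjust.
collect() {
    amplxe-cl -collect advanced-hotspots \
        -result-dir "$RESULT_DIR" \
        -app-working-dir "$WORKDIR" \
        -- "$APP"
}

if command -v amplxe-cl >/dev/null 2>&1; then
    collect
    # Print a text summary to the job log; the GUI can open RESULT_DIR later.
    amplxe-cl -report summary -result-dir "$RESULT_DIR"
else
    echo "amplxe-cl not found; collection skipped"
fi
```

Naming the result directory explicitly makes it easier to find the data afterwards when you open it with "*Open > Result...*" in the GUI.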
    
    Xeon Phi
    --------
    
    !!! Note "Note"
    	This section is outdated. It will be updated with new information soon.
    
    It is possible to analyze both native and offload Xeon Phi applications. For offload mode, just specify the path to the binary. For native mode, you need to specify in project properties:
    
    -   Application: `ssh`
    -   Application parameters: `mic0 source ~/.profile && /path/to/your/bin`
    
    Note that `source ~/.profile` is included in the command to set up environment paths, [as described here](../intel-xeon-phi/).
    
    !!! Note "Note"
    	If the analysis is interrupted or aborted, further analysis on the card might be impossible and you will get errors like "ERROR connecting to MIC card". In this case please contact our support to reboot the MIC card.
    
    You may also use remote analysis to collect data from the MIC and then analyze it in the GUI later:
    
    ```bash
        $ amplxe-cl -collect knc-hotspots -no-auto-finalize -- ssh mic0 \
          "export LD_LIBRARY_PATH=/apps/intel/composer_xe_2015.2.164/compiler/lib/mic/:/apps/intel/composer_xe_2015.2.164/mkl/lib/mic/; export KMP_AFFINITY=compact; /tmp/app.mic"
    ```
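
Because the command above passes `-no-auto-finalize`, the collected data is left unfinalized. The sketch below shows one way the result might be finalized from the command line before opening it in the GUI. The directory name in `RESULT_DIR` is a hypothetical placeholder (use the directory that `amplxe-cl` reports at the end of the collection), and whether this step is needed may depend on your VTune version.

```bash
# RESULT_DIR is a hypothetical placeholder for the directory created
# by the collection run; replace it with the real result directory.
RESULT_DIR=${RESULT_DIR:-r000hs}

if command -v amplxe-cl >/dev/null 2>&1 && [ -d "$RESULT_DIR" ]; then
    # Finalize the raw data so that the result can be opened in the GUI.
    amplxe-cl -finalize -result-dir "$RESULT_DIR"
else
    echo "nothing to finalize: amplxe-cl or $RESULT_DIR is missing"
fi
```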
    
    References
    ----------
    
    1.  [Performance Tuning for Intel® Xeon Phi™ Coprocessors](https://www.rcac.purdue.edu/tutorials/phi/PerformanceTuningXeonPhi-Tullos.pdf)