Performance Analysis

Basic profiling with the Intel compilers and Intel VTune.

Intel Compiler Instrumentation

The Intel compilers include tools for profiling your code to find its most time-consuming parts, or 'hotspots'. To use them, enable the following compiler switches, which add instrumentation to the application at the entry and exit points of all loops and functions:

CFLAGS = -g -O1 -profile-functions -profile-loops=all -profile-loops-report=2 ...

Running the instrumented application writes the resulting counts to both a human-readable text file (a table) and an XML file. You can analyse the data by reading the text file directly, or by using the Java-based GUI viewer that is shipped with the compiler:

$ loopprofileviewer.sh my_report.xml

The report file contains information such as:

  • Call counts of routines
  • Self-time of functions / loops
  • Total-time of functions / loops
  • Average, minimum and maximum iteration counts of loops

It is important to understand the difference between self-time and total-time of a function. The self-time is the amount of time spent inside a function excluding the time taken by other functions called from within that function body. Total-time includes the time taken by functions called from within the function body.
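The relationship between the two can be illustrated with a small sketch. It is a toy model only, not the compiler's report format: the function names and timings are invented, and it assumes that each function is called from a single place and that there is no recursion.

```python
# Toy call tree: each entry maps a function name to
# (self_time_in_seconds, list_of_functions_it_calls).
# All names and timings are hypothetical, chosen purely for illustration.
CALL_TREE = {
    "main": (0.5, ["solve", "write_output"]),
    "solve": (2.0, ["apply_stencil"]),
    "apply_stencil": (6.0, []),
    "write_output": (1.5, []),
}


def total_time(name, tree=CALL_TREE):
    """Return the self-time of `name` plus the total-time of everything it calls."""
    self_time, callees = tree[name]
    return self_time + sum(total_time(callee, tree) for callee in callees)


if __name__ == "__main__":
    for name, (self_time, _) in CALL_TREE.items():
        print(f"{name:15s} self={self_time:5.1f}s  total={total_time(name):5.1f}s")
```

In this made-up example main has the largest total-time (10.0 s) but the smallest self-time (0.5 s). The real hotspot is apply_stencil, whose self-time accounts for 6.0 s of the run.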

N.B. This compiler-based profiling only works on single-threaded code, so the code must be compiled without OpenMP, pthreads, etc. For more detailed, multi-threaded profiling, use VTune.

VTune

Xeon Applications.

From an interactive session you can run VTune via the GUI with:

$ amplxe-gui &

From the GUI you can run tests and view their results. For general profiling use the 'hotspots' test; for profiling focused on multi-threaded behaviour use the 'Locks and waits' test. Low-level tests that use hardware counters are disabled on the machines and should not be used. These tests are fairly esoteric anyway, but if you are certain you require them, contact cosmos_support.

We recommend running VTune interactively from the command line and then viewing the results with the GUI. To run a hotspots test interactively:

$ amplxe-cl -c hotspots -- ./myapp 

This produces a directory with a name like r001hs in your current directory. You can open this result in the VTune GUI via File -> Open -> Result.
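After several runs the current directory may hold a number of result directories. The following is a minimal sketch of a helper that picks the one with the highest run number. It is not part of VTune. It assumes that result names follow the r001hs pattern shown above (the letter r, a run number, then hs), and the function name latest_hotspots_result is hypothetical.

```python
import re
from pathlib import Path

# Assumption: hotspots result directories are named like r001hs,
# i.e. "r", a run number, then "hs".
RESULT_PATTERN = re.compile(r"^r(\d+)hs$")


def latest_hotspots_result(directory="."):
    """Return the result directory with the highest run number, or None if there is none."""
    candidates = []
    for entry in Path(directory).iterdir():
        match = RESULT_PATTERN.match(entry.name)
        # Ignore plain files and anything that does not match the naming pattern.
        if entry.is_dir() and match:
            candidates.append((int(match.group(1)), entry))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


if __name__ == "__main__":
    print(latest_hotspots_result("."))
```

Run from the directory where the tests were collected, this prints the path of the most recent result, which can then be opened in the GUI.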