Using perf

Performance counters are special hardware registers available on most modern CPUs. These registers count the number of certain types of hardware events, such as instructions executed, cache misses suffered, or branches mispredicted, without slowing down the kernel or applications. They can also trigger an interrupt when a threshold number of events has passed, and can thus be used to profile the code that runs on that CPU.
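The threshold-interrupt idea can be illustrated with a small simulation. This is only a conceptual sketch, not how the kernel implements sampling: the function name `sample_profile`, the made-up workload and the period of 100 events are all assumptions chosen for illustration.

```python
from collections import Counter


def sample_profile(event_stream, period):
    """Record which function is running each time the counter overflows.

    event_stream yields one function name per counted hardware event;
    every `period` events the (simulated) counter raises an interrupt.
    """
    samples = Counter()
    count = 0
    for func in event_stream:
        count += 1
        if count == period:
            samples[func] += 1  # the "interrupt handler" takes a sample
            count = 0           # the counter is re-armed
    return samples


# Hypothetical workload: 'parse' causes 7000 events, 'hash' 2000, 'log' 1000.
stream = ["parse"] * 7000 + ["hash"] * 2000 + ["log"] * 1000
samples = sample_profile(stream, period=100)

total = sum(samples.values())
for func, hits in samples.most_common():
    print(f"{100.0 * hits / total:6.2f}%  {func}")
```

The share of samples per function approximates the share of events it caused, which is the kind of percentage a profiler reports.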

The Linux Performance Counter subsystem provides rich abstractions over these hardware capabilities. It provides per task, per CPU and per-workload counters, counter groups, and it provides sampling capabilities on top of those - and more.

It also provides abstraction for 'software events' - such as minor/major page faults, task migrations, task context-switches and tracepoints.

There is a new tool ('perf') that makes full use of this new kernel subsystem. It can be used to optimize, validate and measure applications, workloads or the full system.

'perf' is hosted in the upstream kernel repository and can be found under: tools/perf/

perf structure
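Hardware breakpoints requested from different sources (ptrace, kgdb, ftrace, the perf syscall) go through a core breakpoint API and are implemented as perf events, which handle the register scheduling, thread/CPU attachment, etc.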

    ptrace      kgdb      ftrace   perf syscall
       \         |          /         /
        \        |         /         /
                                    /
         Core breakpoint API       /
                                  /
                 |               /
                 |              /
          Breakpoints perf events

That is why, to make full use of perf, you have to enable all of these modules (such as Ftrace) in the kernel configuration.

Installation
Currently, the perf tool cannot be cross-compiled because of the libraries it depends on. For now, it can be built natively on an OMAP board running Ubuntu. Before compiling it, install the libelf library, which is needed to build and run perf.


 * 1) apt-get install libelf-dev
 * 2) make
 * 3) make install

Furthermore, the following flags have to be enabled in the kernel configuration:
 * PERF_EVENTS
 * PERF_COUNTERS
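One way to check these flags on a running system is to read the kernel configuration. The sketch below is a rough helper under stated assumptions: it assumes the flags appear with the usual CONFIG_ prefix, and that the config is exposed at /proc/config.gz or /boot/config-<release>, which is not true on every system. The function names are hypothetical.

```python
import gzip
import platform

# Assumption: the flags appear in the config with the usual CONFIG_ prefix.
WANTED = ("CONFIG_PERF_EVENTS", "CONFIG_PERF_COUNTERS")


def enabled_options(config_text):
    """Return the set of option names set to 'y' or 'm' in a kernel config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and '# CONFIG_FOO is not set' comments
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return enabled


def read_kernel_config():
    """Try two common config locations; both paths are assumptions."""
    for path in ("/proc/config.gz", "/boot/config-" + platform.release()):
        try:
            opener = gzip.open if path.endswith(".gz") else open
            with opener(path, "rt", errors="replace") as handle:
                return handle.read()
        except OSError:
            continue  # file missing or unreadable, try the next location
    return None


if __name__ == "__main__":
    text = read_kernel_config()
    if text is None:
        print("kernel config not found; check your .config by hand")
    else:
        enabled = enabled_options(text)
        for option in WANTED:
            print(option, "enabled" if option in enabled else "MISSING")
```

If neither file exists, the same `enabled_options` function can be applied to the contents of the .config file in the kernel build tree.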

Getting Started
Once you have installed 'perf' on your system, the simplest way to start profiling a userspace program is to use the "perf record" and "perf report" commands as follows:

    $ perf record -f -- git gc
    Counting objects: 1283571, done.
    Compressing objects: 100% (206724/206724), done.
    Writing objects: 100% (1283571/1283571), done.
    Total 1283571 (delta 1070675), reused 1281443 (delta 1068566)
    [ perf record: Captured and wrote 31.054 MB perf.data (~1356768 samples) ]

    $ perf report --sort comm,dso,symbol | head -10
    # Samples: 1355726
    #
    # Overhead          Command                            Shared Object  Symbol
    #
        31.53%              git  /usr/bin/git                             [.] 0x0000000009804f
        13.41%        git-prune  /usr/bin/git-prune                       [.] 0x000000000ad06d
        10.05%              git  /lib/tls/i686/cmov/libc-2.8.90.so        [.] _nl_make_l10nflist
         5.36%        git-prune  /usr/lib/libz.so.1.2.3.3                 [.] 0x00000000009d51
         4.48%              git  /lib/tls/i686/cmov/libc-2.8.90.so        [.] memcpy
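The text report is easy to post-process. As a rough sketch, the Python snippet below sums the overhead per command from rows like the ones in the report above. The function name `overhead_by_command` and the regular expression are assumptions made for this example; the pattern only handles command names without spaces and is not a general parser for perf report output.

```python
import re
from collections import defaultdict

# Rows copied from the sample report; header lines starting with '#' are skipped.
REPORT = """\
# Overhead          Command                            Shared Object  Symbol
    31.53%              git  /usr/bin/git                             [.] 0x0000000009804f
    13.41%        git-prune  /usr/bin/git-prune                       [.] 0x000000000ad06d
    10.05%              git  /lib/tls/i686/cmov/libc-2.8.90.so        [.] _nl_make_l10nflist
     5.36%        git-prune  /usr/lib/libz.so.1.2.3.3                 [.] 0x00000000009d51
     4.48%              git  /lib/tls/i686/cmov/libc-2.8.90.so        [.] memcpy
"""

# Matches the leading "<percent>%  <command>" part of a data row.
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s")


def overhead_by_command(report_text):
    """Sum the overhead percentage of all rows belonging to each command."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        match = ROW.match(line)
        if match:
            totals[match.group(2)] += float(match.group(1))
    return dict(totals)


totals = overhead_by_command(REPORT)
for command, percent in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{percent:6.2f}%  {command}")
```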

perf event tracepoint
    # perf list
    [...]
    kmem:kmalloc                            [Tracepoint event]
    kmem:kmem_cache_alloc                   [Tracepoint event]
    kmem:kmalloc_node                       [Tracepoint event]
    kmem:kmem_cache_alloc_node              [Tracepoint event]
    kmem:kfree                              [Tracepoint event]
    kmem:kmem_cache_free                    [Tracepoint event]
    kmem:mm_page_free_direct                [Tracepoint event]
    kmem:mm_pagevec_free                    [Tracepoint event]
    kmem:mm_page_alloc                      [Tracepoint event]
    kmem:mm_page_alloc_zone_locked          [Tracepoint event]
    kmem:mm_page_pcpu_drain                 [Tracepoint event]

Then any (or all) of the above event sources can be activated and measured. For example the page alloc/free properties of a 'hackbench run' are:

    # perf stat -e kmem:mm_page_pcpu_drain -e kmem:mm_page_alloc -e kmem:mm_pagevec_free -e kmem:mm_page_free_direct ./hackbench 10
    Time: 0.575

     Performance counter stats for './hackbench 10':

              13857  kmem:mm_page_pcpu_drain
              27576  kmem:mm_page_alloc
               6025  kmem:mm_pagevec_free
              20934  kmem:mm_page_free_direct

        0.613972165  seconds time elapsed
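When many events are measured, typing one -e option per event gets tedious. The sketch below builds the same command line from a list of event names. The helper `perf_stat_argv` is hypothetical; its optional `repeat` parameter maps to the --repeat option of perf stat. The snippet only prints the command and does not run it.

```python
import shlex


def perf_stat_argv(events, workload, repeat=None):
    """Build an argument vector for 'perf stat' with one -e per event."""
    argv = ["perf", "stat"]
    if repeat is not None:
        argv += ["--repeat", str(repeat)]
    for event in events:
        argv += ["-e", event]
    argv += list(workload)
    return argv


EVENTS = [
    "kmem:mm_page_pcpu_drain",
    "kmem:mm_page_alloc",
    "kmem:mm_pagevec_free",
    "kmem:mm_page_free_direct",
]

argv = perf_stat_argv(EVENTS, ["./hackbench", "10"])
command = shlex.join(argv)
print(command)
```

To actually run the measurement, the list can be passed to `subprocess.run(argv)` on a machine where perf and the workload are available.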

You can observe the statistical properties as well, by using the 'repeat the workload N times' feature of perf stat:

    # perf stat --repeat 5 -e kmem:mm_page_pcpu_drain -e kmem:mm_page_alloc -e kmem:mm_pagevec_free -e kmem:mm_page_free_direct ./hackbench 10
    Time: 0.627
    Time: 0.644
    Time: 0.564
    Time: 0.559
    Time: 0.626

     Performance counter stats for './hackbench 10' (5 runs):

              12920  kmem:mm_page_pcpu_drain    ( +-   3.359% )
              25035  kmem:mm_page_alloc         ( +-   3.783% )
               6104  kmem:mm_pagevec_free       ( +-   0.934% )
              18376  kmem:mm_page_free_direct   ( +-   4.941% )

        0.643954516  seconds time elapsed   ( +-   2.363% )
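To get a feel for what a "+-" figure of this kind expresses, the sketch below computes the mean of the five "Time:" values printed by hackbench and a relative noise figure. It assumes the noise is the standard error of the mean divided by the mean; the exact formula perf uses may differ between versions. The input is the time reported by hackbench, not perf's own elapsed-time measurement, so the result is not expected to match the 2.363% shown in the output.

```python
import math
import statistics


def mean_and_noise(samples):
    """Return (mean, relative standard error of the mean in percent)."""
    mean = statistics.mean(samples)
    # Sample standard deviation (n - 1 in the denominator), scaled to the mean.
    std_error = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, 100.0 * std_error / mean


# The five "Time:" lines printed by hackbench in the run above.
times = [0.627, 0.644, 0.564, 0.559, 0.626]
mean, noise = mean_and_noise(times)
print(f"{mean:.3f} seconds  ( +- {noise:.3f}% )")
```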

Scripting support for perf
Support was recently added for using Perl and Python scripts with the perf tool. Interpreters for both Perl and Python can be embedded into the perf executable, which allows the raw perf trace data stream to be processed in either of those languages.

Several example scripts are provided with perf; they can be listed from perf itself:

    # perf trace -l
    List of available trace scripts:
      syscall-counts [comm]               system-wide syscall counts
      syscall-counts-by-pid [comm]        system-wide syscall counts, by pid
      failed-syscalls-by-pid [comm]       system-wide failed syscalls, by pid
      workqueue-stats                     workqueue stats (ins/exe/create/destroy)
      check-perf-trace                    useless but exhaustive test script
      failed-syscalls [comm]              system-wide failed syscalls
      wakeup-latency                      system-wide min/max/avg wakeup latency
      rw-by-file                          r/w activity for a program, by file
      rw-by-pid                           system-wide r/w activity

This list is a mix of Perl and Python scripts that live in the tools/perf/scripts/{perl,python} directories.

The installed scripts can be used as follows:

    # perf trace record failed-syscalls
    ^C[ perf record: Woken up 11 times to write data ]
    [ perf record: Captured and wrote 1.939 MB perf.data (~84709 samples) ]

    # perf trace report failed-syscalls
    perf trace started with Perl script \
        /root/libexec/perf-core/scripts/perl/failed-syscalls.pl

    failed syscalls, by comm:

    comm                    # errors
    --------------------  ----------
    firefox                     1721
    claws-mail                   149
    konsole                       99
    X                             77
    emacs                         56
    [...]

    failed syscalls, by syscall:

    syscall                           # errors
    ------------------------------  ----------
    sys_read                              2042
    sys_futex                              130
    sys_mmap_pgoff                          71
    sys_access                              33
    sys_stat64                               5
    sys_inotify_add_watch                    4
    [...]
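The bundled failed-syscalls script is not reproduced here. The standalone Python sketch below only illustrates the kind of aggregation such a report performs: it tallies failed system calls by command and by syscall name. The event tuples are made up, the function name `count_failures` is hypothetical, and treating a negative return value as a failure is an assumption of this example. It does not use the perf scripting interface.

```python
from collections import Counter


def count_failures(events):
    """Tally failed syscalls by command name and by syscall name.

    events is an iterable of (comm, syscall, return_value) tuples;
    a negative return value is treated as a failure (assumption).
    """
    by_comm = Counter()
    by_syscall = Counter()
    for comm, syscall, ret in events:
        if ret < 0:
            by_comm[comm] += 1
            by_syscall[syscall] += 1
    return by_comm, by_syscall


# Made-up sample events, not taken from a real trace.
EVENTS = [
    ("firefox", "sys_read", -11),
    ("firefox", "sys_read", 512),
    ("firefox", "sys_futex", -110),
    ("emacs", "sys_access", -2),
    ("konsole", "sys_read", -11),
]

by_comm, by_syscall = count_failures(EVENTS)

print("failed syscalls, by comm:")
for comm, errors in by_comm.most_common():
    print(f"{comm:<20}  {errors:>10}")

print("failed syscalls, by syscall:")
for syscall, errors in by_syscall.most_common():
    print(f"{syscall:<30}  {errors:>10}")
```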


 * Note: For further information, see the documentation at $KERNEL/tools/perf/Documentation/perf-trace-{python,perl}.txt