Friday, November 06, 2009

OProfile on ARM Linux


This article applies to OProfile version 0.9.5. OProfile is a profiling system for Linux 2.2/2.4/2.6 systems on a number of architectures. It can profile all parts of a running system, from the kernel (including modules and interrupt handlers) to shared libraries and binaries. It runs transparently in the background and collects information at low overhead, which makes it well suited to profiling entire systems and finding bottlenecks under real-world load.

Many CPUs provide "performance counters": hardware registers that count "events" such as cache misses or CPU cycles. OProfile builds its profiles from these events: every time a configurable number of events has occurred, the current PC value is recorded as a sample. The samples are then aggregated into a profile for each binary image.
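To make the sampling arithmetic concrete: if a sample is taken every N events, the sample rate is roughly the event rate divided by N. The sketch below assumes a hypothetical 600 MHz core that is busy all the time (the clock speed is my assumption, not a figure from this setup) and a count of 100000 CPU cycles per sample.

```shell
#!/bin/sh
# samples_per_second EVENT_RATE_HZ COUNT
# Rough number of PC samples per second when one sample is recorded
# every COUNT events and events occur EVENT_RATE_HZ times per second.
samples_per_second() {
    echo $(( $1 / $2 ))
}

# Hypothetical 600 MHz core, one sample per 100000 cycles.
samples_per_second 600000000 100000   # prints 6000
```

A smaller count gives more samples and a finer profile, at the cost of more interrupts and therefore more overhead.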

Some hardware setups do not allow OProfile to use performance counters. In these cases no events are available, and OProfile operates in timer/RTC mode as described below (RTC mode applies only to Linux kernels up to 2.4).

Cross-compiling OProfile for the ARM architecture

Unpack oprofile-0.9.5.tar.bz2 and go to the oprofile directory:

[test@localhost]# cd oprofile-0.9.5

Run the command below to configure OProfile for cross-compilation:

[test@localhost oprofile-0.9.5]# ./configure --host=arm-linux --with-kernel-support

Recommendation: use a cross-compilation environment such as CLFS or Scratchbox. My environment is Scratchbox. This resolves many library dependencies and saves a lot of time.

Once the configuration succeeds, run make to compile OProfile.

[test@localhost oprofile-0.9.5]# make

Now install OProfile. During installation you can specify your root file system as the install path, which copies the OProfile binaries and libraries directly into it.

Run the command below to install the OProfile daemon and tools.

[test@localhost oprofile-0.9.5]# make DESTDIR=<INSTALL PATH> install
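The three build steps above can be collected into one small script. This is only a sketch: ROOTFS is a hypothetical staging directory, the run and build_oprofile helpers are my own names, and DRY_RUN=1 (the default here) just prints the commands instead of executing them. It is meant to be started from inside the oprofile-0.9.5 directory.

```shell
#!/bin/sh
# Sketch of the configure/make/install steps as one script.
# ROOTFS is a hypothetical path to the target root file system.
ROOTFS="${ROOTFS:-/tmp/arm-rootfs}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_oprofile() {
    run ./configure --host=arm-linux --with-kernel-support || return 1
    run make || return 1
    run make DESTDIR="$ROOTFS" install
}

build_oprofile
```

Set DRY_RUN=0 and ROOTFS to your real root file system once the printed commands look right.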

Running and using oprofile tools

OProfile can run in two modes: one uses the hardware performance monitor counters (PMNC on Cortex-A8), the other uses the timer interrupt.

Tool summary

This section gives a brief description of the available OProfile utilities and their purpose.


This utility lists the available events and short descriptions.


Used for controlling the OProfile data collection, discussed below in section 4.4


This is the main tool for retrieving useful profile data, described below


his utility can be used to produce annotated source, assembly or mixed source/assembly. Source level annotation is available only if the application was compiled with debugging symbols.


This utility can output gprof-style data files for a binary, for use with gprof –p.


This utility can be used to collect executables, debuginfo, and sample files and copy the files into an archive. The archive is self-contained and can be moved to another machine for further analysis.


This utility converts sample database files from a foreign binary format (abi) to the native format. This is useful only when moving sample files between hosts, for analysis on platforms other than the one used for collection.

agent libraries

Used by virtual machines (like the Java VM) to record information about JITed code being profiled.
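After installing into the rootfs it is worth checking on the target that the utilities actually ended up on the PATH. A minimal sketch, assuming the usual OProfile 0.9.x tool names (ophelp, opcontrol, opreport, opannotate, opgprof, oparchive, opimport); the missing_tools helper is my own name, not part of OProfile.

```shell
#!/bin/sh
# Print every OProfile utility described above that is not found on PATH.
OPROFILE_TOOLS="ophelp opcontrol opreport opannotate opgprof oparchive opimport"

missing_tools() {
    for tool in $OPROFILE_TOOLS; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing_tools
```

No output means all of the listed tools were found.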

Oprofile in Timer Interrupt Mode

This section applies to 2.6 kernels and above only. In 2.6 kernels on CPUs without OProfile support for the hardware performance counters, the driver falls back to using the timer interrupt for profiling.

You can force use of the timer interrupt by using the timer=1 module parameter (or oprofile.timer=1 on the boot command line if OProfile is built-in).
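To see which mode the driver ended up in, you can read the CPU type it reports: in timer mode the string is "timer". The sketch below assumes oprofilefs is mounted at /dev/oprofile (opcontrol --init does that); the oprofile_mode helper name is hypothetical.

```shell
#!/bin/sh
# oprofile_mode [FILE]
# Report whether the driver is in timer mode, based on the CPU type
# string it exposes. FILE defaults to /dev/oprofile/cpu_type, which is
# only readable once oprofilefs has been mounted.
oprofile_mode() {
    cpu_type_file="${1:-/dev/oprofile/cpu_type}"
    if [ ! -r "$cpu_type_file" ]; then
        echo "unknown (cannot read $cpu_type_file)"
        return 1
    fi
    cpu_type=$(cat "$cpu_type_file")
    if [ "$cpu_type" = "timer" ]; then
        echo "timer interrupt mode"
    else
        echo "hardware counter mode ($cpu_type)"
    fi
}

oprofile_mode || true
```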

OProfile in hardware performance monitor counter (PMNC) mode

In this case you have to add the PMNC IRQ number for your SoC to the ARMv7 OProfile driver, in the file arch/arm/oprofile/op_model_v7.c.

Add the lines marked with "+":

static int irqs[] = {
+ #ifdef CONFIG_ARCH_
+ , // IRQ number of the PMNC for the Cortex-A8 in your SoC
+ #endif
};


Using Opcontrol

In this section we describe the configuration and control of the profiling system with opcontrol.

Boot the target board with the new zImage and a rootfs that contains the OProfile tools, then proceed as follows.

First, we need to be the root user to use OProfile, so either log in as root or use the su command to switch to root. Next, we need to set up OProfile. We have two options: profile our application with or without the Linux kernel. To profile with the Linux kernel, we need to point OProfile at the uncompressed kernel image (vmlinux), here placed in the /root directory.

# Init OProfile

[root@localhost ~]# opcontrol --reset

[root@localhost ~]# opcontrol --init

[root@localhost ~]# opcontrol --vmlinux=/root/vmlinux

# Set up events

[root@localhost ~]# opcontrol -e CPU_CYCLES:100000:0:1:1

[root@localhost ~]# opcontrol --start-daemon

[root@localhost ~]# opcontrol --start

# Verify PMNC IRQs

[root@localhost ~]# cat /proc/interrupts

Note: if you are using PMNC mode, /proc/interrupts should contain one entry for the PMNC IRQ. In timer mode there is no such entry.
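The same check can be scripted. This is a sketch only: the default pattern "pmnc" is my assumption about how the driver names its interrupt, so adjust it to whatever shows up on your kernel. The has_pmnc_irq helper is a hypothetical name.

```shell
#!/bin/sh
# has_pmnc_irq [FILE] [PATTERN]
# Succeeds if FILE (default /proc/interrupts) has a line matching
# PATTERN, case-insensitively. The default pattern "pmnc" is an
# assumption about the interrupt name registered by the driver.
has_pmnc_irq() {
    grep -i -q "${2:-pmnc}" "${1:-/proc/interrupts}" 2>/dev/null
}

if has_pmnc_irq; then
    echo "PMNC interrupt is registered"
else
    echo "no PMNC interrupt found (timer mode, or profiling not started)"
fi
```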

Run any application at this point.

# Stop profiling

[root@localhost ~]# opcontrol --dump

[root@localhost ~]# opcontrol --stop

# Deinit

[root@localhost ~]# opcontrol --shutdown

[root@localhost ~]# opcontrol --deinit
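The whole sequence can be wrapped around a single workload. A minimal sketch, following the same order of opcontrol calls as above; the run and profile_workload helpers and the ./hello workload are placeholders of mine, and DRY_RUN=1 (the default here) only prints the commands.

```shell
#!/bin/sh
# Sketch: wrap the opcontrol sequence around one workload.
# DRY_RUN=1 only prints the commands; set DRY_RUN=0 on the target.
DRY_RUN="${DRY_RUN:-1}"
VMLINUX="${VMLINUX:-/root/vmlinux}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# profile_workload CMD [ARGS...]
profile_workload() {
    run opcontrol --reset
    run opcontrol --init || return 1
    run opcontrol --vmlinux="$VMLINUX"
    run opcontrol -e CPU_CYCLES:100000:0:1:1
    run opcontrol --start-daemon || return 1
    run opcontrol --start || return 1
    run "$@"                      # the application being profiled
    run opcontrol --dump
    run opcontrol --stop
    run opcontrol --shutdown
    run opcontrol --deinit
}

# ./hello is a placeholder for your own application.
profile_workload ./hello
```

The samples stay on disk after the shutdown, so the report can be generated afterwards.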

# Get the report

[root@localhost ~]# opreport
Overflow stats not available
CPU: ARM V7 PMNC, speed 0 MHz (estimated)
Counted CPU_CYCLES events (Number of CPU cycles) with a unit mask of 0x00 (No
unit mask) count 100000
samples| %|
337 91.3279 hello
21 5.6911 no-vmlinux
6 1.6260
4 1.0840
1 0.2710 busybox

[root@localhost ~]# opreport --callgraph
CPU: ARM V7 PMNC, speed 0 MHz (estimated)
Counted CPU_CYCLES events (Number of CPU cycles) with a unit mask of 0x00 (No
unit mask) count 100000
samples % app name symbol name
477 86.1011 hello1 main
477 100.000 hello1 main [self]
15 2.7076 vmlinux _spin_unlock_irqrestore
15 100.000 vmlinux _spin_unlock_irqrestore [self]
7 1.2635 vmlinux check_poison_obj
7 100.000 vmlinux check_poison_obj [self]
4 0.7220 busybox /bin/busybox
4 100.000 busybox /bin/busybox [self]
4 0.7220 vmlinux __do_softirq
4 100.000 vmlinux __do_softirq [self]
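For longer reports it helps to filter out the noise. The sketch below reads opreport-style text on stdin and keeps only the rows at or above a given percentage; hot_entries is a hypothetical helper of mine, not an OProfile tool, and the sample rows are copied from the report above.

```shell
#!/bin/sh
# hot_entries MIN_PERCENT
# Read opreport-style text on stdin and print "percent name..." for
# rows whose percentage is at least MIN_PERCENT. Header lines
# (non-numeric first column) and the "[self]" rows of --callgraph
# output are skipped.
hot_entries() {
    awk -v min="$1" '
        /\[self\]/ { next }
        $1 ~ /^[0-9]+$/ && $2 + 0 >= min + 0 {
            printf "%s", $2
            for (i = 3; i <= NF; i++) printf " %s", $i
            printf "\n"
        }'
}

# Keep everything at or above 1%.
hot_entries 1 <<'EOF'
samples % app name symbol name
477 86.1011 hello1 main
477 100.000 hello1 main [self]
15 2.7076 vmlinux _spin_unlock_irqrestore
4 0.7220 busybox /bin/busybox
EOF
```

On the target you would pipe the real report into it, for example opreport --callgraph | hot_entries 1.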

Note that OProfile is not the right tool if you need 100% instruction-accurate profiles or cannot tolerate any disturbance to the system at all.
OProfile is easy to install and run.
OProfile can give a full profile: the kernel plus all processes.
Very detailed CPU information is available with advanced usage.
Useful links:
And Google for your questions.

My learning continues...............Alim


At 10:05 PM , Blogger sindhu said...

I am getting "overflow stats not available"... what that exactly means ? is that a kind of error or ???

