OProfile on ARM Linux
Introduction
This article applies to OProfile version 0.9.5. OProfile is a profiling system for Linux 2.2/2.4/2.6 systems on a number of architectures. It can profile all parts of a running system, from the kernel (including modules and interrupt handlers) to shared libraries to binaries. It runs transparently in the background and collects information at low overhead. These features make it well suited to profiling entire systems in order to find bottlenecks under real-world conditions.
Many CPUs provide "performance counters": hardware registers that count "events" such as cache misses or CPU cycles. OProfile builds its profiles from these events: every time a certain (configurable) number of events has occurred, the PC value is recorded. This information is aggregated into profiles for each binary image.
Some hardware setups do not allow OProfile to use performance counters. In these cases no events are available, and OProfile operates in timer mode (or RTC mode, which applies only up to Linux kernel 2.4), as described below.
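To make the sampling rule concrete, here is a small sketch of the arithmetic. It assumes a hypothetical 600 MHz core clock and a CPU_CYCLES count of 100000 (the count used in the opcontrol example later in this article); the function name samples_per_second is only for illustration.

```shell
# samples_per_second CLOCK_HZ EVENT_COUNT
# One PC sample is recorded each time EVENT_COUNT events have occurred,
# so for CPU_CYCLES the approximate sample rate on a busy core is
# CLOCK_HZ / EVENT_COUNT (integer division).
samples_per_second() {
    echo $(( $1 / $2 ))
}

# Assumed values: 600 MHz clock, one sample every 100000 cycles.
samples_per_second 600000000 100000
```

A smaller count gives more samples and finer detail, at the cost of more profiling overhead.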
Cross-Compiling oprofile for ARM arch
Unpack oprofile-0.9.5.tar.bz2 and change into the oprofile directory:
[test@localhost]# cd oprofile-0.9.5
Run the command below to configure oprofile for cross-compilation:
[test@localhost oprofile-0.9.5]# ./configure --host=arm-linux --with-kernel-support
Recommendation: use a cross-compilation environment such as CLFS or Scratchbox. My environment is Scratchbox. This resolves many library dependencies and saves a lot of time.
Once the configuration succeeds, run make to compile oprofile.
[test@localhost oprofile-0.9.5]# make
Now install oprofile. During installation you can specify your rootfs as the install path, which copies the oprofile binaries and libraries directly into your root file system.
Run the command below to install oprofile, including the daemon:
[test@localhost oprofile-0.9.5]# make DESTDIR=<INSTALL PATH> install
Running and using oprofile tools
OProfile can run in two modes: hardware performance monitor counter mode (PMNC on Cortex-A8) and timer interrupt mode.
Tool summary
This section gives a brief description of the available OProfile utilities and their purpose.
ophelp
This utility lists the available events and short descriptions.
opcontrol
Used for controlling the OProfile data collection, discussed below under "Using Opcontrol".
opreport
This is the main tool for retrieving useful profile data, described below.
opannotate
This utility can be used to produce annotated source, assembly or mixed source/assembly. Source level annotation is available only if the application was compiled with debugging symbols.
opgprof
This utility can output gprof-style data files for a binary, for use with gprof -p.
oparchive
This utility can be used to collect executables, debuginfo, and sample files and copy the files into an archive. The archive is self-contained and can be moved to another machine for further analysis.
opimport
This utility converts sample database files from a foreign binary format (abi) to the native format. This is useful only when moving sample files between hosts, for analysis on platforms other than the one used for collection.
agent libraries
Oprofile in Timer Interrupt Mode
This section applies to 2.6 kernels and above only. In 2.6 kernels on CPUs without OProfile support for the hardware performance counters, the driver falls back to using the timer interrupt for profiling.
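A quick way to see which mode the driver picked is to read the cpu_type file in oprofilefs. The sketch below assumes oprofilefs is mounted at /dev/oprofile (which opcontrol --init normally does) and that the driver reports the string "timer" in timer interrupt mode; the function name profiling_mode is only illustrative.

```shell
# profiling_mode [CPU_TYPE_FILE]
# Prints "timer" if the driver fell back to the timer interrupt,
# "counters" for any other reported CPU type, or "unknown" if the
# file cannot be read (for example, oprofilefs is not mounted yet).
profiling_mode() {
    cpu_type_file="${1:-/dev/oprofile/cpu_type}"
    if [ ! -r "$cpu_type_file" ]; then
        echo "unknown"
        return 1
    fi
    if [ "$(cat "$cpu_type_file")" = "timer" ]; then
        echo "timer"
    else
        echo "counters"
    fi
}
```

On the target, run profiling_mode with no arguments after opcontrol --init.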
Oprofile in Hardware Performance Monitor Counter (PMNC) Mode
In this case you have to add the PMNC IRQ number for your platform to the ARMv7 oprofile driver, in the file arch/arm/oprofile/op_model_v7.c
Add the lines marked with + below:
static int irqs[] = {
#ifdef CONFIG_ARCH_OMAP3
INT_34XX_BENCH_MPU_EMUL,
#endif
+ #ifdef CONFIG_ARCH_
+
+ #endif
};
Using Opcontrol
In this section we describe the configuration and control of the profiling system with opcontrol.
Boot the target board with the zImage and a rootfs that contains the OProfile tools, then run the commands below.
First, we need to be the root user to use OProfile, so either log in as root or use the su command to switch to root. Next, we need to set up OProfile. We have two options: we can profile our application with or without the Linux kernel. To profile with the Linux kernel, we need to point OProfile at the uncompressed kernel image file (vmlinux), here placed in the /root directory.
# Init OProfile:
[root@localhost ~]# opcontrol --reset
[root@localhost ~]# opcontrol --init
[root@localhost ~]# opcontrol --vmlinux=/root/vmlinux
# Set up events
[root@localhost ~]# opcontrol -e CPU_CYCLES:100000:0:1:1
# Start
[root@localhost ~]# opcontrol --start-daemon
[root@localhost ~]# opcontrol --start
# Verify PMNC IRQs
[root@localhost ~]# cat /proc/interrupts
Note: /proc/interrupts should contain an entry for the PMNC IRQ only if you are using PMNC mode.
Run any application at this point.
# Stop profiling
[root@localhost ~]# opcontrol --dump
[root@localhost ~]# opcontrol --stop
# Deinit
[root@localhost ~]# opcontrol --shutdown
[root@localhost ~]# opcontrol --deinit
# Get the report
[root@localhost ~]# opreport
Overflow stats not available
CPU: ARM V7 PMNC, speed 0 MHz (estimated)
Counted CPU_CYCLES events (Number of CPU cycles) with a unit mask of 0x00 (No
unit mask) count 100000
CPU_CYCLES:100000|
samples| %|
------------------
337 91.3279 hello
21 5.6911 no-vmlinux
6 1.6260 libc-2.5.so
4 1.0840 ld-2.5.so
1 0.2710 busybox
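The percentage column in the flat report is simply each image's sample count divided by the total. As a rough sketch, the helper below recomputes it from opreport-style lines on standard input; it assumes the three-column layout shown above (samples, percentage, image name), and the name summarize_samples is only illustrative.

```shell
# summarize_samples
# Reads flat opreport output on stdin, keeps only rows whose first
# field is a sample count, and prints "image percentage" sorted from
# the largest share down. Header and separator lines are skipped.
summarize_samples() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 3 { count[$3] = $1; total += $1 }
         END {
             for (name in count)
                 printf "%s %.4f\n", name, 100 * count[name] / total
         }' | sort -k2,2nr
}
```

For example, opreport | summarize_samples should reproduce the percentages in the listing above, since 337 of the 369 samples belong to hello.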
[root@localhost ~]# opreport --callgraph
CPU: ARM V7 PMNC, speed 0 MHz (estimated)
Counted CPU_CYCLES events (Number of CPU cycles) with a unit mask of 0x00 (No
unit mask) count 100000
samples % app name symbol name
-------------------------------------------------------------------------------
477 86.1011 hello1 main
477 100.000 hello1 main [self]
-------------------------------------------------------------------------------
15 2.7076 vmlinux _spin_unlock_irqrestore
15 100.000 vmlinux _spin_unlock_irqrestore [self]
-------------------------------------------------------------------------------
7 1.2635 vmlinux check_poison_obj
7 100.000 vmlinux check_poison_obj [self]
-------------------------------------------------------------------------------
4 0.7220 busybox /bin/busybox
4 100.000 busybox /bin/busybox [self]
-------------------------------------------------------------------------------
4 0.7220 vmlinux __do_softirq
4 100.000 vmlinux __do_softirq [self]
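The opcontrol steps shown earlier (reset, start, run the workload, dump, stop) can be wrapped in a small helper so that only the workload itself is profiled. This is a sketch, not part of OProfile: it assumes the one-time setup (opcontrol --init, --vmlinux and the event selection) has already been done, and the OPCONTROL variable and profile_run name are made up here so the sequence can be dry-run with echo.

```shell
# OPCONTROL can be overridden (for example OPCONTROL=echo) for a dry run
# on a machine without oprofile installed.
OPCONTROL="${OPCONTROL:-opcontrol}"

# profile_run COMMAND [ARGS...]
# Clears old samples, starts profiling, runs the command, then flushes
# and stops profiling. Returns the exit status of the command.
profile_run() {
    $OPCONTROL --reset || return 1
    $OPCONTROL --start || return 1
    "$@"
    status=$?
    $OPCONTROL --dump
    $OPCONTROL --stop
    return $status
}

# Example on the target (as root): profile_run ./hello
```

After profile_run returns, opreport can be used as shown above to inspect the samples.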
Note that OProfile does not provide 100% instruction-accurate profiles, and it is not suitable if you cannot accept any disturbance to the running system at all.
OProfile is easy to install and run.
OProfile can give a full profile: kernel plus all processes.
Very detailed CPU information is available with advanced usage.
useful links:
http://oprofile.sourceforge.net/
http://oprofile.sourceforge.net/doc/index.html
http://oprofile.sourceforge.net/doc/internals/index.html
and google for your questions.
My learning continues...............Alim
1 Comment:
I am getting "Overflow stats not available". What exactly does that mean? Is it some kind of error?