by Robert Cohn and Robert Muth
========================================================================================
Pin is a tool for the instrumentation of programs. It supports Linux, Windows, and MacOS executables for Intel(R) IA-32, Intel 64, and Itanium(R) processors.
Pin allows a tool to insert arbitrary code (written in C or C++) in arbitrary places in the executable. The code is added dynamically while the executable is running. This also makes it possible to attach Pin to an already running process.
Pin provides a rich API that abstracts away the underlying instruction set idiosyncrasies and allows context information such as register contents to be passed to the injected code as parameters. Pin automatically saves and restores the registers that are overwritten by the injected code so the application continues to work. Limited access to symbol and debug information is available as well.
Pin includes the source code for a large number of example instrumentation tools like basic block profilers, cache simulators, instruction trace generators, etc. It is easy to derive new tools using the examples as a template.
========================================================================================
The only code ever executed is the generated code. The original code is only used for reference. When generating code, Pin gives the user an opportunity to inject their own code (instrumentation).
Instrumentation has two components: instrumentation code, which decides where and what code is injected, and analysis code, which runs at the injection points. Both components live in a single executable, a Pintool. Pintools can be thought of as plugins that can modify the code generation process inside Pin.
The Pintool registers callback routines with Pin that are called from Pin whenever new code needs to be generated. These callback routines represent the instrumentation component. They inspect the code to be generated, investigate its static properties, and decide if and where to inject calls to analysis code. Those calls can target arbitrary functions inside the Pintool. Pin makes sure that register state is saved and restored as necessary and allows arguments to be passed to the functions.
Pin and the Pintool control a program starting with the very first instruction. For executables compiled with shared libraries this implies that the execution of the dynamic loader and all shared libraries will be visible to the Pintool.
When writing tools, it is more important to tune the analysis code than the instrumentation code. This is because the instrumentation is executed once, but analysis code is called many times.
Trace instrumentation lets the Pintool inspect and instrument an executable one trace at a time. Traces usually begin at the target of a taken branch and end with an unconditional branch, including calls and returns. Pin guarantees that a trace is only entered at the top, but it may contain multiple exits. If a branch joins the middle of a trace, Pin constructs a new trace that begins with the branch target. Pin breaks the trace into basic blocks (BBLs). A BBL is a single-entrance, single-exit sequence of instructions. Branches to the middle of a BBL begin a new trace and hence a new BBL. It is often possible to insert a single analysis call for a BBL instead of one analysis call for every instruction. Reducing the number of analysis calls makes instrumentation more efficient. Trace instrumentation utilizes the TRACE_AddInstrumentFunction API call.
As a convenience for Pintool writers, Pin also offers an instruction instrumentation mode which lets the tool inspect and instrument an executable a single instruction at a time. This is essentially identical to trace instrumentation, except that the Pintool writer is freed from the responsibility of iterating over the instructions inside a trace. As described under trace instrumentation, certain BBLs and the instructions inside of them may be generated (and hence instrumented) multiple times. Instruction instrumentation utilizes the INS_AddInstrumentFunction API call.
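The saving from BBL-level instrumentation can be illustrated with a small standalone model. The sketch below does not use the Pin API: Bbl, CountResult, CountPerInstruction, and CountPerBbl are hypothetical names, and the "program" is just a list of BBL sizes plus the order in which the BBLs execute. Both strategies arrive at the same instruction count, but the per-BBL variant needs one analysis call per executed BBL rather than one per executed instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a basic block: only its instruction count matters.
struct Bbl {
    std::uint32_t numIns;
};

// Outcome of one counting strategy over an execution.
struct CountResult {
    std::uint64_t instructions;   // total instructions counted
    std::uint64_t analysisCalls;  // number of analysis calls that were made
};

// Strategy 1: one analysis call before every instruction, each adding 1.
CountResult CountPerInstruction(const std::vector<Bbl>& bbls,
                                const std::vector<std::size_t>& executed) {
    CountResult r{0, 0};
    for (std::size_t idx : executed) {
        assert(idx < bbls.size());
        for (std::uint32_t i = 0; i < bbls[idx].numIns; ++i) {
            r.instructions += 1;
            r.analysisCalls += 1;
        }
    }
    return r;
}

// Strategy 2: one analysis call per executed BBL, adding the BBL's size.
CountResult CountPerBbl(const std::vector<Bbl>& bbls,
                        const std::vector<std::size_t>& executed) {
    CountResult r{0, 0};
    for (std::size_t idx : executed) {
        assert(idx < bbls.size());
        r.instructions += bbls[idx].numIns;
        r.analysisCalls += 1;
    }
    return r;
}
```

The per-BBL strategy is valid only because a BBL has a single entrance and a single exit, so entering it implies that all of its instructions execute.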
Sometimes, however, it can be useful to look at different granularity than a trace. For this purpose Pin offers two additional modes: image and routine instrumentation. These modes are implemented by "caching" instrumentation requests and hence incur a space overhead.
Image instrumentation lets the Pintool inspect and instrument an entire image, IMG, when it is first loaded. A Pintool can walk the sections, SEC, of the image, the routines, RTN, of a section, and the instructions, INS, of a routine. Instrumentation can be inserted so that it is executed before or after a routine is executed, or before or after an instruction is executed. Image instrumentation utilizes the IMG_AddInstrumentFunction API call. Image instrumentation depends on symbol information to determine routine boundaries; hence PIN_InitSymbols must be called before PIN_Init.
Routine instrumentation lets the Pintool inspect and instrument an entire routine before the first time it is called. A Pintool can walk the instructions of a routine. There is not enough information available to break the instructions into BBLs. Instrumentation can be inserted so that it is executed before or after a routine is executed, or before or after an instruction is executed. Routine instrumentation can be more efficient than image instrumentation in space and time when only a small number of the routines in an image are executed. Routine instrumentation utilizes the RTN_AddInstrumentFunction API call. Instrumentation of routine exits does not work reliably in the presence of tail calls or when return instructions cannot reliably be detected.
========================================================================================
To illustrate how to write Pintools, we present some simple examples. In the web based version of the manual, you can click on a function in the Pin API to see its documentation.
The instruction counting example inserts a call to docount before every instruction. When the program exits, it prints the count to stderr. Here is how to run it and the output:
$ pin -t inscount0 -- /bin/ls
Makefile          atrace.o     imageload.out  itrace      proccount
Makefile.example  imageload    inscount0      itrace.o    proccount.o
atrace            imageload.o  inscount0.o    itrace.out
Count 422838
$
The example can be found in ManualExamples/inscount0.cpp
The previous example did not pass any arguments to docount, the analysis procedure. In this example, we show how to pass arguments. When calling an analysis procedure, Pin allows you to pass the instruction pointer, current value of registers, effective address of memory operations, constants, etc. For a complete list, see IARG_TYPE. With a small change, we can turn the instruction counting example into a Pintool that prints the address of every instruction that is executed. This tool is useful for understanding the control flow of a program for debugging, or in processor design when simulating an instruction cache.
We change the arguments to INS_InsertCall to pass the address of the instruction about to be executed. We replace docount with printip, which prints the instruction address. It writes its output to the file itrace.out.
This is how to run it and look at the output:
$ pin -t itrace -- /bin/ls
Makefile          atrace.o     imageload.out  itrace      proccount
Makefile.example  imageload    inscount0      itrace.o    proccount.o
atrace            imageload.o  inscount0.o    itrace.out
$ head itrace.out
0x40001e90
0x40001e91
0x40001ee4
0x40001ee5
0x40001ee7
0x40001ee8
0x40001ee9
0x40001eea
0x40001ef0
0x40001ee0
$
The example can be found in ManualExamples/itrace.cpp
In this example, we show how to do more selective instrumentation by examining the instructions. This tool generates a trace of all memory addresses referenced by a program. This is also useful for debugging and for simulating a data cache in a processor.
We only instrument instructions that read or write memory. We also use INS_InsertPredicatedCall instead of INS_InsertCall so that no reference is generated for a predicated instruction whose predicate is false (predication is only relevant for Itanium).
Since the instrumentation functions are only called once and the analysis functions are called every time an instruction is executed, it is much faster to only instrument the memory operations, as compared to the previous instruction trace example that instruments every instruction.
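The effect of instrumenting only memory operations can be sketched with a standalone model. It does not use the Pin API: Ins, InstrumentMemoryOps, and RunAndCountAnalysisCalls are hypothetical names chosen for illustration. The instrumentation pass visits each static instruction once and decides whether to attach an analysis call; the execution pass then pays for an analysis call only on the instructions that were selected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical static description of an instruction.
struct Ins {
    bool readsMemory;
    bool writesMemory;
};

// Instrumentation pass: runs once per static instruction and decides
// whether an analysis call should be attached to it.
std::vector<bool> InstrumentMemoryOps(const std::vector<Ins>& code) {
    std::vector<bool> hasCall(code.size(), false);
    for (std::size_t i = 0; i < code.size(); ++i) {
        hasCall[i] = code[i].readsMemory || code[i].writesMemory;
    }
    return hasCall;
}

// Execution pass: every executed instruction that carries an analysis
// call triggers it once. Returns the number of analysis calls made.
std::uint64_t RunAndCountAnalysisCalls(const std::vector<bool>& hasCall,
                                       const std::vector<std::size_t>& executed) {
    std::uint64_t calls = 0;
    for (std::size_t idx : executed) {
        assert(idx < hasCall.size());
        if (hasCall[idx]) {
            ++calls;
        }
    }
    return calls;
}
```

In this model the cost of the selection test is paid once per static instruction, while the saving is collected every time an unselected instruction executes.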
Here is how to run it and the sample output:
$ pin -t pinatrace -- /bin/ls
Makefile          atrace.o    imageload.o    inscount0.o  itrace.out
Makefile.example  atrace.out  imageload.out  itrace       proccount
atrace            imageload   inscount0      itrace.o     proccount.o
$ head pinatrace.out
0x40001ee0: R 0xbfffe798
0x40001efd: W 0xbfffe7d4
0x40001f09: W 0xbfffe7d8
0x40001f20: W 0xbfffe864
0x40001f20: W 0xbfffe868
0x40001f20: W 0xbfffe86c
0x40001f20: W 0xbfffe870
0x40001f20: W 0xbfffe874
0x40001f20: W 0xbfffe878
0x40001f20: W 0xbfffe87c
$
The example can be found in ManualExamples/pinatrace.cpp
If you invoke it on ls, you would see this output:
$ pin -t imageload -- /bin/ls
Makefile          atrace.o    imageload.o    inscount0.o  proccount
Makefile.example  atrace.out  imageload.out  itrace       proccount.o
atrace            imageload   inscount0      itrace.o     trace.out
$ cat imageload.out
Loading /bin/ls
Loading /lib/ld-linux.so.2
Loading /lib/libtermcap.so.2
Loading /lib/i686/libc.so.6
Unloading /bin/ls
Unloading /lib/ld-linux.so.2
Unloading /lib/libtermcap.so.2
Unloading /lib/i686/libc.so.6
$
The example can be found in ManualExamples/imageload.cpp
The example can be found in ManualExamples/inscount1.cpp
Executing the tool and sample output:
$ pin -t proccount -- /bin/grep proccount.cpp Makefile
proccount_SOURCES = proccount.cpp
$ head proccount.out
Procedure                Image      Address     Calls  Instructions
_fini                    libc.so.6  0x40144d00  1      21
__deregister_frame_info  libc.so.6  0x40143f60  2      70
__register_frame_info    libc.so.6  0x40143df0  2      62
fde_merge                libc.so.6  0x40143870  0      8
__init_misc              libc.so.6  0x40115824  1      85
__getclktck              libc.so.6  0x401157f4  0      2
munmap                   libc.so.6  0x40112ca0  1      9
mmap                     libc.so.6  0x40112bb0  1      23
getpagesize              libc.so.6  0x4010f934  2      26
$
The example can be found in ManualExamples/proccount.cpp
The example can be found in ManualExamples/staticcount.cpp
The example can be found in ManualExamples/detach.cpp
In probe mode, the application and the replacement routine are run natively. This improves performance, but it puts more responsibility on the tool writer.
The tool writer must guarantee that there is no jump target where the probe is placed. A probe is six bytes long on IA-32 platforms, seven bytes long on Intel 64 platforms, and one bundle on Itanium platforms.
Also, it is the tool writer's responsibility to ensure that no thread is currently executing the code where a probe is inserted or removed. Tool writers are encouraged to insert probes when an image is loaded to avoid this problem.
When using probes, the "-probe" option must be used on the command line, and Pin must be started with the PIN_StartProgramProbed() API.
The example can be found in ManualExamples/replacesigprobed.cpp
========================================================================================
The examples in the previous section have introduced a number of ways to register callback functions via the Pin API.
The extra parameter val (shared by all the registration functions) will be passed to fun as its second argument whenever it is "called back". This is a standard mechanism used in GUI programming with callbacks.
If this feature is not needed, it is safe to pass 0 for val when registering a callback. The expected use of val is to pass a pointer to an instance of a class. Since val is a generic pointer, fun must cast it back to an object before dereferencing the pointer.
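The val mechanism can be sketched without any Pin headers. In the standalone model below, Registry, Callback, Profile, and OnInstruction are hypothetical stand-ins and not part of the Pin API; the sketch only shows the pattern of registering a function together with an opaque pointer and casting that pointer back inside the callback.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical callback type: the second argument is the opaque val pointer.
using Callback = void (*)(std::uint64_t ins, void* val);

// Hypothetical registry standing in for a Pin-style registration function.
class Registry {
public:
    // Remember the callback together with the pointer it should receive.
    void AddFunction(Callback fun, void* val) { entries_.emplace_back(fun, val); }

    // Invoke every callback, handing back the val it was registered with.
    void Fire(std::uint64_t ins) const {
        for (const auto& e : entries_) {
            e.first(ins, e.second);
        }
    }

private:
    std::vector<std::pair<Callback, void*>> entries_;
};

// Per-tool state that the callback wants to reach.
struct Profile {
    std::uint64_t seen = 0;
};

// The callback receives val as its second argument and must cast it back
// to the object type before dereferencing it.
void OnInstruction(std::uint64_t /*ins*/, void* val) {
    assert(val != nullptr);
    Profile* profile = static_cast<Profile*>(val);
    profile->seen += 1;
}
```

Passing the object through val avoids global state, so two instances of the same callback can update two different Profile objects.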
========================================================================================
An application and a tool are invoked as follows:
pin [pin-option]... -t [toolname] [tool-options]... -- [application] [application-option]...
The following Pin-options are currently available:
The tool-options follow immediately after the tool specification and depend on the tool used.
Everything following the -- is the command line for the application.
For example, to apply the itrace example (Instruction Address Trace (Instruction Instrumentation)) to a run of the "ls" program:
pin -t itrace -- /bin/ls
To get a listing of the available command line options for Pin:
pin -h
To get a listing of the available command line options for the itrace example:
pin -t itrace -h -- /bin/ls
Note that in the last case /bin/ls is necessary on the command line but will not be executed.
The -injection switch is Unix-only and controls the way Pin is injected into the application process. The default, dynamic, is recommended for all users. It uses parent injection unless it is unsupported (MacOS and Linux 2.4 kernels). Child injection creates the application process as a child of the Pin process, so you will see both a Pin process and the application process running. In parent injection, the Pin process exits after injecting the application and is less likely to cause a problem. Using parent injection on an unsupported platform may lead to nondeterministic errors.
IMPORTANT: The description about invoking assumes that the application is a program binary (and not a shell script). If your application is invoked indirectly (from a shell script or using 'exec') then you need to change the actual invocation of the program binary by prefixing it with pin/pintool options. Here's one way of doing that:
# Track down the actual application binary, say it is 'application_binary'.
% mv application_binary application_binary.real
# Write a shell script named 'application_binary' with the following contents.
# (change 'itrace' to your desired tool)
#!/bin/sh
pin -t itrace -- application_binary.real "$@"
After you do this, whenever 'application_binary' is invoked indirectly (from some shell script or using 'exec'), the real binary will get invoked with the right pin/pintool options.
========================================================================================
Three different programs reside in the address space: the application, the Pin instrumentation engine, and your Pintool. This section describes how to use gdb to find bugs in a Pintool. You cannot run Pin directly from gdb since Pin uses the debugging API to start the application. Instead, you must invoke Pin from the command line with the -pause_tool switch, and use gdb to attach to the Pin process from another window. The -pause_tool n switch makes Pin print out the process identifier (pid) and pause for n seconds.
If your tool is called opcodemix and the application is /bin/ls, you can use gdb as follows. Start gdb with your tool, but do not use the run command:
$ gdb opcodemix
GNU gdb 5.2.1
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(gdb)
In another window, start your application with the -pause_tool switch:
$ pin -pause_tool 5 -t opcodemix -- /bin/ls
Pausing to attach to pid 28769
Then go back to gdb and attach to the process:
(gdb) attach 28769
Attaching to program: .../build-ia32/SimpleExamples/opcodemix, process 28769
0x011ef361 in ?? ()
(gdb)
Now, instead of using the gdb run command, you use the cont command to continue execution. You can also set breakpoints as normal:
(gdb) break main
Breakpoint 1 at 0x5048d30: file .../PinTools/SimpleExamples/opcodemix.cpp, line 232.
(gdb) cont
Continuing.
Breakpoint 1, main (argc=6, argv=0x4fef534)
    at .../PinTools/SimpleExamples/opcodemix.cpp:232
(gdb)
If the program does not exit, then you should detach so gdb will release control:
(gdb) detach
Detaching from program: .../build-ia32/SimpleExamples/opcodemix, process 28769
(gdb)
If you recompile your program and then use the run command, gdb will notice that the binary has been changed and reread the debug information from the file. This does not happen automatically when using attach. You must use the file command to make gdb reread the debug information:
(gdb) file opcodemix
Load new symbol table from "opcodemix"? (y or n) y
Reading symbols from opcodemix... done.
(gdb)
========================================================================================
The way a Pintool is written can have a great impact on the performance of the tool, i.e. how much it slows down the applications it is instrumenting. This section demonstrates some techniques that can be used to improve tool performance. Let's start with an example. The following piece of code is derived from Examples/edgcnt.cpp:
The instrumentation component of the tool is shown below:
VOID Instruction(INS ins, void *v)
{
    ...
    if ( [ins is a branch or a call instruction] )
    {
        INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR) docount2,
                       IARG_INST_PTR,
                       IARG_BRANCH_TARGET_ADDR,
                       IARG_BRANCH_TAKEN,
                       IARG_END);
    }
    ...
}
The analysis component looks like this:
VOID docount2( ADDRINT src, ADDRINT dst, INT32 taken )
{
    if( !taken ) return;
    COUNTER *pedg = Lookup( src, dst );
    pedg->_count++;
}
The purpose of the tool is to count how often each control-flow-changing edge in the control flow graph is traversed. The tool considers both calls and branches, but for brevity we will not mention calls in our description. The tool works as follows: The instrumentation component instruments each branch with a call to docount2. As parameters we pass in the origin and the target of the branch and whether the branch was taken or not. Branch origin and target represent the source and destination of the control flow edge. If a branch is not taken, the control flow does not change and hence the analysis routine returns right away. If the branch is taken, we use the src and dst parameters to look up the counter associated with this edge (Lookup will create a new one if this edge has not been seen before) and increment the counter. Note that the tool could have been simplified somewhat by using the IPOINT_TAKEN_BRANCH option with INS_InsertCall().
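One way the Lookup helper described above could behave is sketched below. This is a standalone model, not the code from Examples/edgcnt.cpp: the std::map container, the Edge alias, the EdgeSet variable, and the ADDRINT typedef are assumptions made for illustration, and no Pin headers are used.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Assumed stand-in for Pin's address type in this standalone sketch.
using ADDRINT = std::uintptr_t;

struct COUNTER {
    std::uint64_t _count = 0;
};

// An edge is identified by its (source, destination) address pair.
using Edge = std::pair<ADDRINT, ADDRINT>;

// std::map does not relocate its elements on insertion, so a pointer
// returned by Lookup stays valid as more edges are added later.
static std::map<Edge, COUNTER> EdgeSet;

// Return the counter for the edge, creating a zeroed one on first use.
COUNTER* Lookup(ADDRINT src, ADDRINT dst) {
    return &EdgeSet[Edge(src, dst)];
}

// Analysis routine as described in the text: ignore untaken branches,
// otherwise find the edge counter and increment it.
void docount2(ADDRINT src, ADDRINT dst, std::int32_t taken) {
    if (!taken) return;
    COUNTER* pedg = Lookup(src, dst);
    assert(pedg != nullptr);
    pedg->_count++;
}
```

Every taken branch pays for a map lookup in this version, which is the cost the rest of this section works to remove.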
VOID docount( COUNTER *pedg, INT32 taken )
{
    if( !taken ) return;
    pedg->_count++;
}
And the instrumentation will be somewhat more complex:
VOID Instruction(INS ins, void *v)
{
    ...
    if (INS_IsDirectBranchOrCall(ins))
    {
        COUNTER *pedg = Lookup( INS_Address(ins),
                                INS_DirectBranchOrCallTargetAddress(ins) );
        INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR) docount,
                       IARG_ADDRINT, pedg,
                       IARG_BRANCH_TAKEN,
                       IARG_END);
    }
    else
    {
        INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR) docount2,
                       IARG_INST_PTR,
                       IARG_BRANCH_TARGET_ADDR,
                       IARG_BRANCH_TAKEN,
                       IARG_END);
    }
    ...
}
VOID docount( COUNTER *pedg, INT32 taken )
{
    pedg->_count += taken;
}
There is now no question whether docount() will be inlined or not.
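For the edge-counting tool, the division of labor between the two phases can be modeled in a few lines of standalone C++. The sketch does not use the Pin API; BindDirectBranch, Edges, and Edge are hypothetical names, and the model assumes that the taken flag is always 0 or 1. The lookup happens once, when a direct branch is instrumented, and the analysis routine is reduced to a single addition with no control flow.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

struct COUNTER {
    std::uint64_t _count = 0;
};

using Edge = std::pair<std::uint64_t, std::uint64_t>;

// Edge counters; std::map keeps element addresses stable across inserts.
static std::map<Edge, COUNTER> Edges;

// Instrumentation time: runs once per direct branch. It performs the
// expensive lookup and returns the pointer that is handed to every
// later analysis call for this branch.
COUNTER* BindDirectBranch(std::uint64_t src, std::uint64_t dst) {
    return &Edges[Edge(src, dst)];
}

// Analysis time: runs on every execution of the branch. There is no
// lookup and no branch in the routine itself; the assert only documents
// the 0-or-1 assumption and disappears when NDEBUG is defined.
inline void docount(COUNTER* pedg, std::int32_t taken) {
    assert(taken == 0 || taken == 1);
    pedg->_count += static_cast<std::uint64_t>(taken);
}
```

This works only for direct branches, whose target is known when the code is instrumented; indirect branches still need the lookup at analysis time.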
In the IpSample() example, the original analysis routine IpSample() has a conditional control-flow change. It is rewritten into two analysis routines: CountDown() and PrintIp(). CountDown() is the simpler of the two and has no control-flow change. It also performs the original conditional test and returns the test result. We use the conditional instrumentation APIs INS_InsertIfCall() and INS_InsertThenCall() to tell Pin that the analysis routine specified by an INS_InsertThenCall() (i.e. PrintIp() in this example) is executed only if the result of the analysis routine specified by the previous INS_InsertIfCall() (i.e. CountDown() in this example) is non-zero. Now CountDown(), the common case, can be inlined by Pin, and only once in a while does Pin need to execute PrintIp(), the non-inlined case.
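The split into a cheap test routine and a rarely executed action routine can be modeled in standalone C++. The names CountDown and PrintIp come from the text above; Sampler, its fields, and ExecuteInstruction are hypothetical, the sampling period is an assumed parameter, and a vector stands in for the output file. No Pin headers are used, so the if statement in ExecuteInstruction only models what the If/Then pairing amounts to.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sampling state: record every period-th instruction.
struct Sampler {
    std::int32_t period;
    std::int32_t countdown;
    std::vector<std::uint64_t> samples;  // stands in for the output file
};

// "If" routine: tiny and free of control flow. It returns non-zero
// exactly when a sample is due.
inline std::int32_t CountDown(Sampler& s) {
    --s.countdown;
    return s.countdown == 0;
}

// "Then" routine: heavier work, executed only when CountDown returned
// non-zero. It records the address and re-arms the countdown.
void PrintIp(Sampler& s, std::uint64_t ip) {
    s.samples.push_back(ip);
    s.countdown = s.period;
}

// Models one executed instruction: the test runs every time, the
// action runs only when the test result is non-zero.
void ExecuteInstruction(Sampler& s, std::uint64_t ip) {
    assert(s.period > 0);
    if (CountDown(s)) {
        PrintIp(s, ip);
    }
}
```

Keeping the test routine free of branches is what makes it a candidate for inlining; the conditional itself moves out of the analysis code and into the If/Then pairing.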
========================================================================================
To install a kit, unpack a kit and change to the directory:
$ tar zxf /proj/vssad/proj/pin/Kits/pin-2.0-776-ia32.tar.gz
$ cd pin-2.0-776-ia32/
Build and test the examples from the manual
$ cd ManualExamples/
$ make test
g++ -c -Wall -Werror -Wno-unknown-pragmas -I../Include -DTARGET_IA32 -g1 -o pinatrace.o pinatrace.cpp
g++ -static -Wl,-wrap,mmap,-wrap,__mmap,-wrap,brk,-wrap,__brk,--section-start,.interp=0x05048000 -g1 -o pinatrace pinatrace.o -L../Lib/ -lpin -ldwarf -lelf -lencp68 -ldecp68
../Bin/pin -t pinatrace -- /bin/cp makefile makefile.copy; cmp makefile makefile.copy
g++ -c -Wall -Werror -Wno-unknown-pragmas -I../Include -DTARGET_IA32 -g1 -o inscount0.o inscount0.cpp
g++ -static -Wl,-wrap,mmap,-wrap,__mmap,-wrap,brk,-wrap,__brk,--section-start,.interp=0x05048000 -g1 -o inscount0 inscount0.o -L../Lib/ -lpin -ldwarf -lelf -lencp68 -ldecp68
../Bin/pin -t inscount0 -- /bin/cp makefile makefile.copy; cmp makefile makefile.copy
Count 277395
g++ -c -Wall -Werror -Wno-unknown-pragmas -I../Include -DTARGET_IA32 -g1 -o itrace.o itrace.cpp
g++ -static -Wl,-wrap,mmap,-wrap,__mmap,-wrap,brk,-wrap,__brk,--section-start,.interp=0x05048000 -g1 -o itrace itrace.o -L../Lib/ -lpin -ldwarf -lelf -lencp68 -ldecp68
../Bin/pin -t itrace -- /bin/cp makefile makefile.copy; cmp makefile makefile.copy
g++ -c -Wall -Werror -Wno-unknown-pragmas -I../Include -DTARGET_IA32 -g1 -o proccount.o proccount.cpp
g++ -static -Wl,-wrap,mmap,-wrap,__mmap,-wrap,brk,-wrap,__brk,--section-start,.interp=0x05048000 -g1 -o proccount proccount.o -L../Lib/ -lpin -ldwarf -lelf -lencp68 -ldecp68
../Bin/pin -t proccount -- /bin/cp makefile makefile.copy; cmp makefile makefile.copy
$
Run one of the sample tools from the installed directory
$ ../Bin/pin -t pinatrace -- /bin/ls
_insprofiler.cpp  atrace.out     inscount0.o      itrace.cpp  proccount
atrace            imageload.cpp  inscount1.cpp    itrace.o    proccount.cpp
atrace.cpp        inscount0      insprofiler.cpp  itrace.out  proccount.o
atrace.o          inscount0.cpp  itrace           makefile    proccount.out
$ head pinatrace.out
0x40001ee0: R 0xbfffe1e8
0x40001efd: W 0xbfffe224
0x40001f09: W 0xbfffe228
0x40001f20: W 0xbfffe2b4
0x40001f20: W 0xbfffe2b8
0x40001f20: W 0xbfffe2bc
0x40001f20: W 0xbfffe2c0
0x40001f20: W 0xbfffe2c4
0x40001f20: W 0xbfffe2c8
0x40001f20: W 0xbfffe2cc
$
To write your own tool, copy one of the example directories and edit the makefile to add your tool.
========================================================================================
Each kit contains Pin and libraries for a specific architecture. Make sure the kit you download is for the right architecture. The Pin libraries use C++, and the compiler you use to build the tool must be compatible with the Pin library. This restriction only applies to building tools; you can instrument applications built by any compiler.
See the README file in the kit for specific information about compiler version and other limitations. If your compiler is not compatible with the kit, send mail to Pin.Project@intel.com.
========================================================================================
Send bugs and questions to Pin.Project@intel.com. Complete bug reports that are easy to reproduce are fixed faster, so try to provide as much information as possible. Include: kit number, your OS version, compiler version. Try to reproduce the problem in a simple example that you can send us.