Process Tracing

Installing Breakpoints

Imagine now that you want to discover what calls a buggy function. Normally, you'd set a breakpoint on it and print the backtrace when it fires. The debugger can do this, but because it knows nothing about your sources, you supply the breakpoint as a raw hexadecimal address and receive the backtrace in much the same form (Figure 2).

Breakpoints come in two flavors, hardware and software, although Python-ptrace supports only software breakpoints. To install such a breakpoint, you overwrite the instruction at the location of interest with a "breakpoint trigger" instruction. For x86, it's INT 3 (opcode 0xCC). It triggers an interrupt, which the kernel handles and delivers as SIGTRAP to the tracer. The tracer then puts the original instruction back and resumes the tracee.

In the Python-ptrace implementation, a breakpoint is encapsulated in a class of its own, ptrace.debugger.Breakpoint. To create a breakpoint, you pass its constructor a tracee process and the address at which to put the breakpoint. As you can see from Listing 3, it calls readBytes() first to read the original instructions from the tracee's memory. Then, it calls writeBytes() to put INT 3 at the address. PtraceProcess.writeBytes() translates to a PTRACE_POKETEXT request, which copies a machine word from data to the address.
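To make the save-and-patch sequence concrete, here is a minimal sketch that plays it out against a fake tracee whose "memory" is just a bytearray. FakeTracee and SketchBreakpoint are hypothetical names invented for this illustration; only the readBytes()/writeBytes() method names mirror the ones mentioned above, and nothing here touches a real process or the actual Python-ptrace code.

```python
INT3 = b"\xCC"  # x86 breakpoint trigger instruction (INT 3)


class FakeTracee:
    """Hypothetical stand-in for a traced process: memory is a bytearray."""

    def __init__(self, memory):
        self.memory = bytearray(memory)

    def readBytes(self, address, size):
        return bytes(self.memory[address:address + size])

    def writeBytes(self, address, data):
        self.memory[address:address + len(data)] = data


class SketchBreakpoint:
    """Saves the original bytes, then overwrites them with INT 3."""

    def __init__(self, process, address):
        self.process = process
        self.address = address
        # Remember what was there so it can be restored later.
        self.old_bytes = process.readBytes(address, len(INT3))
        process.writeBytes(address, INT3)

    def remove(self):
        # Put the original instruction bytes back before resuming.
        self.process.writeBytes(self.address, self.old_bytes)


# push rbp; mov rbp, rsp; ret
tracee = FakeTracee(b"\x55\x48\x89\xe5\xc3")
bp = SketchBreakpoint(tracee, 1)
patched = bytes(tracee.memory)
bp.remove()
restored = bytes(tracee.memory)
```

A real tracer would do the same two steps through ptrace requests instead of slicing a bytearray, but the bookkeeping (keep the old bytes, write 0xCC, restore on removal) stays the same.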

Listing 3

Installing a breakpoint (simplified)


Tracing System Calls

For dessert, I'll briefly skim system call tracing. A de facto standard tool for this is strace (see the "Command of the Month" section), but Python-ptrace bundles its own pure Python version (Figure 3).

Figure 3: The Python version (top) is nearly indistinguishable from the original (bottom).

When you execute it with /usr/bin/<something>, the tool runs the program you specify as a child and has it issue a PTRACE_TRACEME request, which makes the parent (i.e., the tool itself) trace it automatically. Then, a slightly modified version of the above "event loop" begins. It starts with a PtraceProcess.syscall(), which translates to a PTRACE_SYSCALL request. The kernel then stops the tracee on each syscall entry and exit with a SIGTRAP signal. This signal is somewhat oversubscribed, so to distinguish syscall traps from everything else, Linux (and some other Unices) introduces a PTRACE_O_SYSGOOD option. When it's enabled with a PTRACE_SETOPTIONS request (Python-ptrace does this automatically if supported), the kernel delivers system call traps as SIGTRAP | 0x80; that is, with bit 7 in the signal number raised (| denotes bitwise OR).
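The bit trick is easy to see in a few lines of Python. The sketch below classifies a stop signal number the way a tracer might once PTRACE_O_SYSGOOD is on; classify_stop() and SYSGOOD_BIT are names made up for this example, not part of Python-ptrace.

```python
import signal

SYSGOOD_BIT = 0x80  # bit 7, raised by the kernel for syscall traps


def classify_stop(stop_signal):
    """Tell syscall traps apart from other reasons the tracee stopped."""
    if stop_signal == (signal.SIGTRAP | SYSGOOD_BIT):
        return "syscall"   # syscall entry or exit
    if stop_signal == signal.SIGTRAP:
        return "trap"      # breakpoint, single step, and so on
    return "signal"        # an ordinary signal sent to the tracee
```

In a real tracer, the stop signal would typically come from os.WSTOPSIG() applied to the status returned by os.waitpid().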

You might wonder why it is important to notify the tracer both on syscall entry and exit. The assumption is that you use the first trap to decode the syscall arguments and the second to obtain the return value. Although it sounds simple, in fact it is rather hairy. Python-ptrace devotes a whole package, ptrace.syscall, to this purpose. Consult it if you are interested. In short, system calls are distinguished by their numbers, which are dependent on architecture and come through architecture-dependent registers. Where to get the arguments and return value also depends on the application binary interface (ABI). Not to mention that you'd expect to see flag names such as O_RDONLY instead of raw numerical values.
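To give a taste of what this decoding involves, here is a small sketch with two pieces: a table of the registers that carry the syscall number, arguments, and return value on Linux for x86-64 and 32-bit x86, and a helper that turns open() flags back into names. SYSCALL_REGS and decode_open_flags() are hypothetical names for this example, the flag list is deliberately incomplete, and none of this is taken from ptrace.syscall.

```python
import os

# Where Linux keeps the syscall number, arguments, and return value.
SYSCALL_REGS = {
    "x86_64": {
        "number": "orig_rax",
        "args": ("rdi", "rsi", "rdx", "r10", "r8", "r9"),
        "result": "rax",
    },
    "i386": {
        "number": "orig_eax",
        "args": ("ebx", "ecx", "edx", "esi", "edi", "ebp"),
        "result": "eax",
    },
}

# The access mode is a small number, not a bit flag, so mask it out first.
ACCESS_MASK = os.O_RDONLY | os.O_WRONLY | os.O_RDWR
ACCESS_MODES = {
    os.O_RDONLY: "O_RDONLY",
    os.O_WRONLY: "O_WRONLY",
    os.O_RDWR: "O_RDWR",
}
OTHER_FLAGS = [
    (os.O_CREAT, "O_CREAT"),
    (os.O_TRUNC, "O_TRUNC"),
    (os.O_APPEND, "O_APPEND"),
]


def decode_open_flags(flags):
    """Render an open() flags value as symbolic names joined by '|'."""
    mode = flags & ACCESS_MASK
    names = [ACCESS_MODES.get(mode, hex(mode))]
    rest = flags & ~ACCESS_MASK
    for bit, name in OTHER_FLAGS:
        if rest & bit:
            names.append(name)
            rest &= ~bit
    if rest:
        names.append(hex(rest))  # bits this sketch doesn't know by name
    return "|".join(names)
```

A full decoder needs a table like this for every syscall and every architecture it supports, which is why the job deserves a package of its own.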

When all this grunt work is finished, the tracer issues PTRACE_SYSCALL once again to run the tracee until the next syscall entry or exit, and the loop continues. The addr argument is unused, and data stores a signal to inject into the tracee when it's resumed.
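Because entry and exit stops simply alternate, the tracer has to remember which one it is looking at. The sketch below shows one way to pair them, fed with plain dictionaries in place of real register dumps. SyscallPairer is a hypothetical class for this illustration, and the register names assume x86-64; the actual Python-ptrace code is organized differently.

```python
class SyscallPairer:
    """Pairs alternating syscall-entry and syscall-exit stops."""

    def __init__(self):
        self.pending = None  # (number, args) saved at the entry stop

    def on_stop(self, regs):
        if self.pending is None:
            # Entry stop: remember the syscall number and arguments.
            args = (regs["rdi"], regs["rsi"], regs["rdx"])
            self.pending = (regs["orig_rax"], args)
            return None
        # Exit stop: the return value is now in rax.
        number, args = self.pending
        self.pending = None
        args_text = ", ".join(str(arg) for arg in args)
        return f"syscall_{number}({args_text}) = {regs['rax']}"


pairer = SyscallPairer()
# Syscall 0 is read() on x86-64; rax holds -ENOSYS (-38) at entry.
entry = {"orig_rax": 0, "rdi": 3, "rsi": 4096, "rdx": 128, "rax": -38}
leave = dict(entry, rax=128)
first = pairer.on_stop(entry)
second = pairer.on_stop(leave)
```

Only the first three arguments are printed here to keep the example short; a real tracer would consult a per-syscall table to know how many arguments to read and how to format them.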

Command of the Month: strace

Strace (Figure 4) is a venerable tool with noble SunOS origins that dates back to the early days of Linux. It appears you can teach an old dog some new tricks, though. Version 4.15, released around last Christmas, brings some "cool stuff" created in the course of last year's Google Summer of Code program.

Figure 4: The good old strace tool recently got a facelift and a fashionable domain name.

I'm speaking about fault injection. Handling error conditions in system programming can be tricky, so how do you check if you have accounted for all of them? It's relatively easy to test if an application misbehaves when it can't open a file, but how do you test for more convoluted things such as interrupted system calls or per-process limits that have been reached?

As of strace 4.15, you can instruct the tool to forge the return value for a selected system call. Consider the example

strace -e fault=open:error=ENOSPC:when=5+ U


which makes strace return an ENOSPC error for the fifth and subsequent open() system calls. According to the open() man page [6], this happens when the filesystem can hold no more files. This state isn't trivial to achieve in the real world, but strace makes testing for such tricky conditions a breeze.
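On the receiving end, the code you'd want to exercise with such a fault might look like the sketch below. save_report() and full_disk_opener() are hypothetical names; the opener parameter exists only so the error path can be triggered here without strace or a full disk.

```python
import errno


def save_report(path, text, opener=open):
    """Write text to path; return False if the filesystem is full."""
    try:
        with opener(path, "w") as handle:
            handle.write(text)
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return False  # out of space: let the caller decide what to do
        raise             # anything else is unexpected here
    return True


def full_disk_opener(path, mode):
    """Stand-in for open() that always fails with ENOSPC."""
    raise OSError(errno.ENOSPC, "No space left on device", path)


saved = save_report("report.txt", "data", opener=full_disk_opener)
```

With fault injection, the same error path can be reached in the unmodified program, which is the point of the feature.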

Internally, strace sets the syscall number to -1 before resuming a tracee. Because it is an invalid syscall number, the kernel replies with ENOSYS, but the error specification overrides this return value. The when parameter tells strace when to inject the fault: You can do it for every matching system call or for the first invocation only, for instance. A newer strace (unreleased at the time of writing) allows you to inject a signal alongside the error code. You already know how this works: The signal is supplied as the data argument to the corresponding PTRACE_SYSCALL request.
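The when expression takes one of three forms: first (only that invocation), first+ (that invocation and all later ones), or first+step (that invocation and every step-th one after it). The following sketch models these semantics in Python; should_inject() is a hypothetical helper written for this illustration, not strace's own code.

```python
def should_inject(when, invocation):
    """Decide whether the given (1-based) invocation gets the fault."""
    first, plus, step = when.partition("+")
    first = int(first)
    if not plus:
        # "first": exactly one invocation
        return invocation == first
    if not step:
        # "first+": this invocation and every one after it
        return invocation >= first
    # "first+step": first, first+step, first+2*step, ...
    return invocation >= first and (invocation - first) % int(step) == 0
```

So when=5+ from the example above leaves the first four open() calls alone and fails every one from the fifth onward.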

The only catch is that your distribution (if it's not Arch, you know) probably still ships the old strace. Packages for selected distributions are available through the openSUSE Build Service [7].

The Author

Valentine Sinitsyn develops high-loaded services and teaches students completely unrelated subjects. He also has a KDE developer account that he's never really used.
