Process Tracing
Installing Breakpoints
Imagine now you want to discover what calls a buggy function. Normally, you'd set a breakpoint on it and print the backtrace when it fires. Although gdb.py can do this, it knows nothing about your sources, so you supply the breakpoint as a raw hex address and receive the backtrace in much the same form (Figure 2).
Breakpoints come in two flavors – hardware and software – although gdb.py supports only software breakpoints. To install such a breakpoint, you replace the instruction at the location of interest with a "breakpoint trigger" instruction. On x86, this is INT 3 (opcode 0xCC). It triggers an interrupt, which the kernel handles and delivers as SIGTRAP to the tracer. The tracer then puts the original instruction back and resumes the tracee.
In the Python-ptrace implementation, a breakpoint is encapsulated in a class of its own, ptrace.debugger.Breakpoint. To create a breakpoint, you pass its constructor a tracee process and the address at which to put the breakpoint. As you can see from Listing 3, it calls readBytes() first to save the original instruction bytes from the tracee's memory. Then, it calls writeBytes() to put INT 3 at the address. PtraceProcess.writeBytes() translates to a PTRACE_POKETEXT request, which copies a machine word from data to the address.
Listing 3
Installing a breakpoint (simplified)
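Independently of python-ptrace's actual source, the save-then-patch idea can be sketched in a few lines. This is a minimal illustration, not the library's code: FakeMemory is a hypothetical stand-in for tracee memory (a plain bytearray, no ptrace involved), SketchBreakpoint and its remove() method are made-up names, and only the readBytes()/writeBytes() method names are borrowed from the text above.

```python
INT3 = b"\xCC"  # x86 breakpoint trigger instruction (INT 3)


class FakeMemory:
    """Hypothetical stand-in for tracee memory; a real tracer would use ptrace."""

    def __init__(self, data):
        self._data = bytearray(data)

    def readBytes(self, address, size):
        return bytes(self._data[address:address + size])

    def writeBytes(self, address, data):
        self._data[address:address + len(data)] = data


class SketchBreakpoint:
    """Illustrative software breakpoint: save the original byte, then patch in INT 3."""

    def __init__(self, process, address):
        self.process = process
        self.address = address
        # Remember what was there so it can be restored later
        self.old_bytes = process.readBytes(address, len(INT3))
        process.writeBytes(address, INT3)
        self.installed = True

    def remove(self):
        # Put the original instruction byte back (done before resuming the tracee)
        if self.installed:
            self.process.writeBytes(self.address, self.old_bytes)
            self.installed = False


# Usage: patch the second byte of a fake function prologue, then restore it
memory = FakeMemory(b"\x55\x48\x89\xe5")
breakpoint_ = SketchBreakpoint(memory, 1)
patched = memory.readBytes(0, 4)
breakpoint_.remove()
restored = memory.readBytes(0, 4)
```

Keeping the saved bytes inside the breakpoint object means the tracer needs no separate bookkeeping when the trap fires: it only has to look up the breakpoint by address and ask it to undo itself.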
Tracing System Calls
For dessert, I'll briefly skim system call tracing. A de facto standard tool for this is strace (see the "Command of the Month" section), but Python-ptrace bundles its own pure Python version, strace.py (Figure 3).
When you execute
strace.py /usr/bin/<something>
strace.py runs the program you specify as a child and has it issue a PTRACE_TRACEME request so that the parent (i.e., strace.py itself) traces it automatically. Then, a slightly modified version of the above "event loop" begins. It starts with a PtraceProcess.syscall() call, which translates to a PTRACE_SYSCALL request. The kernel then stops the tracee on each syscall entry and exit with a SIGTRAP signal. This signal is somewhat oversubscribed, so to distinguish syscall traps from everything else, Linux (and some other Unices) introduces a PTRACE_O_TRACESYSGOOD option. When it's enabled with a PTRACE_SETOPTIONS request (python-ptrace does this automatically if supported), the kernel delivers system call traps as SIGTRAP | 0x80 – that is, with bit 7 in the signal number set (| denotes bitwise OR).
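To make the bit arithmetic concrete, here is a small sketch of how a tracer might classify the stop signal it gets back from waitpid(). The function name classify_stop and the returned labels are made up for illustration; the only facts taken from the text are that syscall stops arrive as SIGTRAP with bit 7 (0x80) set once the option is enabled.

```python
import signal

SYSGOOD_BIT = 0x80  # bit 7, set by the kernel on syscall stops when the option is on


def classify_stop(stop_signal):
    """Return a label for the signal that stopped the tracee (illustrative only)."""
    if stop_signal == (signal.SIGTRAP | SYSGOOD_BIT):
        return "syscall"   # syscall entry or exit
    if stop_signal == signal.SIGTRAP:
        return "trap"      # breakpoint, single step, exec, and so on
    return "signal"        # an ordinary signal on its way to the tracee


syscall_stop = classify_stop(signal.SIGTRAP | 0x80)
plain_trap = classify_stop(signal.SIGTRAP)
other = classify_stop(signal.SIGINT)
```

Without the extra bit, the first two cases would be indistinguishable by signal number alone, which is exactly the ambiguity the option removes.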
You might wonder why it is important to notify the tracer on both syscall entry and exit. The assumption is that you use the first trap to decode the syscall arguments and the second to obtain the return value. Although it sounds simple, in fact it is rather hairy. Python-ptrace devotes a whole package, ptrace.syscall, to this purpose. Consult it if you are interested. In short, system calls are distinguished by their numbers, which are architecture dependent and arrive in architecture-dependent registers. Where to get the arguments and return value also depends on the application binary interface (ABI). Not to mention that you'd expect to see flag names such as O_RDONLY instead of raw numerical values.
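The following sketch illustrates both halves of that grunt work; it is not how ptrace.syscall is organized. The SYSCALL_REGISTERS table records where the Linux syscall number, arguments, and return value live on two common architectures, and decode_open_flags() (a hypothetical helper) turns a raw open() flags value into symbolic names. Only a handful of flags are handled, and any other bits are silently ignored.

```python
import os

# Where Linux keeps syscall data in the tracee's registers at a syscall stop
SYSCALL_REGISTERS = {
    "x86_64": {
        "number": "orig_rax",
        "args": ("rdi", "rsi", "rdx", "r10", "r8", "r9"),
        "result": "rax",
    },
    "i386": {
        "number": "orig_eax",
        "args": ("ebx", "ecx", "edx", "esi", "edi", "ebp"),
        "result": "eax",
    },
}

# The access mode is a small enumerated field, not a set of independent bits
ACCESS_MASK = os.O_RDONLY | os.O_WRONLY | os.O_RDWR
ACCESS_NAMES = {
    os.O_RDONLY: "O_RDONLY",
    os.O_WRONLY: "O_WRONLY",
    os.O_RDWR: "O_RDWR",
}
# A deliberately short selection of flag bits; a full decoder would list them all
FLAG_NAMES = ("O_CREAT", "O_EXCL", "O_TRUNC", "O_APPEND", "O_NONBLOCK")


def decode_open_flags(flags):
    """Render an open() flags value as 'O_WRONLY|O_CREAT|...' (partial decoder)."""
    mode = flags & ACCESS_MASK
    names = [ACCESS_NAMES.get(mode, hex(mode))]
    for name in FLAG_NAMES:
        if flags & getattr(os, name):
            names.append(name)
    return "|".join(names)


readable = decode_open_flags(os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
```

The access mode needs special treatment because O_RDONLY is zero on Linux, so it cannot be detected with a simple bit test.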
When all this grunt work is finished, strace.py issues PTRACE_SYSCALL once again to run the tracee until the next syscall entry or exit, and the loop repeats. For this request, the addr argument is unused, and data stores a signal to inject into the tracee when it's resumed.
Command of the Month: strace
Strace (Figure 4) is a venerable tool with noble SunOS origins that dates back to the early days of Linux. It appears you can teach an old dog some new tricks, though. Version 4.15, released around last Christmas, brings some "cool stuff" created during last year's Google Summer of Code program.
I'm speaking about fault injection. Handling error conditions in system programming can be tricky, so how do you check if you have accounted for all of them? It's relatively easy to test if an application misbehaves when it can't open a file, but how do you test for more convoluted things such as interrupted system calls or per-process limits that have been reached?
As of strace 4.15, you can instruct the tool to forge the return value for a selected system call. Consider the example
strace -e fault=open:error=ENOSPC:when=5+ /some/program
which makes strace return an ENOSPC error for the fifth and subsequent open() system calls. According to the man page [6], this happens when the filesystem can hold no more files. This state isn't trivial to achieve in the real world, but strace makes testing for such tricky conditions a breeze.
Internally, strace sets the syscall number to -1 before resuming a tracee. Because it is an invalid syscall number, the kernel replies with ENOSYS, but the error specification overrides this return value. The when expression tells strace when to inject the fault: You can do it for every matching system call or for the first invocation only, for instance. A newer strace (unreleased at the time of writing) allows you to inject a signal alongside the error code. You already know how this works: The signal is supplied as the data argument of the corresponding PTRACE_SYSCALL request.
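The semantics of the when expression can be modeled in a few lines. This sketch is not strace's implementation: should_inject() is a made-up helper, and it assumes the three forms described in the strace man page, namely first (only that invocation), first+ (that invocation and all later ones), and first+step (that invocation and every step-th one after it).

```python
def should_inject(when, invocation):
    """Decide whether the fault applies to the given 1-based matching invocation.

    `when` is assumed to look like "3", "5+", or "2+3".
    """
    first_text, plus, step_text = when.partition("+")
    first = int(first_text)
    if not plus:
        # Plain "N": inject on the N-th matching call only
        return invocation == first
    # "N+" means every call from N on; "N+S" means every S-th call from N on
    step = int(step_text) if step_text else 1
    return invocation >= first and (invocation - first) % step == 0


# With when=5+, calls one through four succeed and all later ones fail
decisions = [should_inject("5+", n) for n in range(1, 8)]
```

Counting only matching invocations (here, calls to open()) is what lets you aim the fault at a specific point in a program's lifetime, such as after startup has finished.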
The only catch is that your distribution (unless it's Arch, you know) probably still ships an older strace. Packages for selected distributions are available through the openSUSE Build Service [7].
Infos
[1] TUI key bindings: https://sourceware.org/gdb/current/onlinedocs/gdb/TUI-Keys.html#TUI-Keys
[2] ptrace(2) man page: http://man7.org/linux/man-pages/man2/ptrace.2.html
[3] prctl(2) man page: http://man7.org/linux/man-pages/man2/prctl.2.html
[4] python-ptrace home: http://python-ptrace.readthedocs.io/en/latest/
[5] diStorm disassembler home: https://github.com/gdabah/distorm
[6] open(2) man page: http://man7.org/linux/man-pages/man2/open.2.html
[7] openSUSE Build Service page for strace: https://build.opensuse.org/package/show/home:ldv_alt/strace/