Modify program behavior with LD_PRELOAD

Change Course

© Lead Image © Elnur Amikishiyev, 123RF.com

Article from Issue 277/2023

A little C code and the LD_PRELOAD variable let you customize library functions to modify program behavior.

Perhaps you want to know the files a program opens or deletes and the network connections it establishes. With a simple hack, standard functions such as opening files or listening on a TCP port can be replaced with DIY versions that not only log what the application does but can even change their behavior if desired. The key to these possibilities is the LD_PRELOAD variable, which affects the Linux program loader.

When you start a program, the Linux kernel creates a new process and loads the executable into its memory space, but that is usually not all that happens. Programs typically use libraries that are added dynamically. You can find out which libraries an application loads using the ldd command (Figure 1).

Figure 1: The ls command uses just a few libraries; the list is far longer for graphical applications.

The only "real" libraries in Figure 1 are libselinux.so.1 (which provides support for the SELinux security extension), libc.so.6 (the C standard library), and libpcre2-8.so.0 (which provides the functions that allow a program to process regular expressions). Of the other entries, linux-vdso.so.1 belongs to the kernel, and ld-linux-x86-64.so.2 is the program loader, the entity that loads the required files.
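
A process can also ask the loader which objects it has mapped. The following is a minimal sketch, assuming glibc on Linux; it uses the dl_iterate_phdr() function from link.h, and the helper names print_object() and list_loaded_objects() are made up for this illustration. The output should roughly correspond to what ldd reports for the same binary.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

/* Callback invoked once per loaded object; prints its name and counts it. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data) {
  (void) size;                       /* unused in this sketch */
  int *count = data;
  /* Some entries (for example, the main program) have an empty name. */
  const char *name = (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
                         ? info->dlpi_name
                         : "(unnamed object, e.g., the main program)";
  printf("%s\n", name);
  (*count)++;
  return 0;                          /* 0 tells the loader to keep iterating */
}

/* Hypothetical helper: lists all objects mapped into the current process
   and returns how many were found. */
int list_loaded_objects(void) {
  int count = 0;
  dl_iterate_phdr(print_object, &count);
  return count;
}
```

Calling list_loaded_objects() from a dynamically linked program prints one line per object; a statically linked program typically reports only the program itself.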

The most important standard functions (e.g., for file access and for process and thread control, as well as all low-level system call wrappers) can be found in libc.so.6, the GNU C library [1]. It provides many frequently used functions, including open(), read(), and write() for low-level access to files; malloc() for dynamic memory management; printf() for formatted output of data; and exit() to exit the program.

Programs with a graphical user interface (typically based on the X Window System, X11) also load the X11 library, libX11.so.6. Larger graphical programs add numerous other libraries. For example, calling ldd /usr/bin/gedit on Ubuntu 22.04 lists 80 libraries required by the Gnome editor.

A complete list of all the symbols provided by a library can be retrieved using readelf. For example, the readelf call in Listing 1 displays the very long list of symbols in the C standard library (over 3,000 entries). Not all of the entries are functions; the symbols also include global variables and constants.

Listing 1

Symbols

$ readelf -Ws /lib/x86_64-linux-gnu/libc.so.6
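
The reverse question, which library provides a given symbol, can be answered at run time. The sketch below assumes glibc and a dynamically linked program; it combines dlsym() with the GNU extension dladdr(), and the helper name library_of() is hypothetical.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns the path of the object that defines the
   named symbol, or NULL if the symbol cannot be resolved. */
const char *library_of(const char *symbol) {
  /* RTLD_DEFAULT searches the objects in the default lookup order. */
  void *addr = dlsym(RTLD_DEFAULT, symbol);
  if (addr == NULL)
    return NULL;

  /* dladdr() maps an address back to the object that contains it. */
  Dl_info info;
  if (dladdr(addr, &info) == 0)
    return NULL;
  return info.dli_fname;
}
```

On a typical glibc system, library_of("malloc") should return the path of the C library, although a preloaded allocator or a sanitizer runtime would change that result.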

Statically Linked

For some program files, ldd does not list libraries; instead it outputs an error message ("not a dynamic executable"). Linux launches these programs without loading additional files. If you compile a C program with GCC, you can use the -static option to create a statically linked binary. Figure 2 shows some experiments with a small C program.

Figure 2: Statically linked binaries come with all the required libraries in place, making them significantly larger than dynamically linked ones.

In Figure 2, the program prints Hello world and – if present – the first call argument. To use the printf() function from the standard library, the code includes the stdio.h header file.

The two calls to GCC create a dynamically linked (test-printf-dynamic) and a statically linked (test-printf-static) version of the executable. Note the difference in size. The dynamically linked binary weighs in at about 16KB, while the static version is about 900KB because it also contains the library functions.

Next, the two calls to strace check which files are opened during the start-up and execution of the program. The first call to strace shows two files: /etc/ld.so.cache, which contains a list of all libraries that can be loaded automatically at program start time, and libc.so.6, the standard C library. Only the dynamic version of the program opens these two files; the static version already contains all the code it needs.

At the bottom of Figure 2, you can see two test calls to the programs, which work identically.

Manipulation

You can fine-tune a program's start-up behavior in Linux. In particular, you can specify where the ld-linux-x86-64.so.2 program loader looks for libraries. To do this, you can use either a static setting in the /etc/ld.so.conf configuration file and additional files in the /etc/ld.so.conf.d/ folder (these files point to the directories containing libraries) or the LD_LIBRARY_PATH environment variable. This variable lets you specify additional folders with libraries; the loader will then search those folders first.

Among other things, this flexibility means that you can install different versions of the same library and then define which one to use for individual applications. These adjustments do not influence the basic procedure. The loader checks which libraries a program needs, searches for them in the designated folders, and loads them.

The LD_PRELOAD variable goes a step further and changes what is loaded. It lists individual libraries that the loader loads in addition to, and ahead of, the regular ones. These libraries can contain functions that already exist in one of the regular libraries; in that case, the preloaded version takes precedence. This mechanism can be used, for example, to replace a single library function with a DIY variant.

Logger

For a simple example of using LD_PRELOAD, I'll create variants of the open() and close() system call wrappers. Listing 2 shows the complete code of the openclose.c file, which you can compile to create an openclose.so library file using the command from Listing 3. The two functions use printf() to print a debug message on the console and otherwise complete their tasks by calling the standard versions of open() and close() using a simple hack.

Listing 2

openclose.c

01 #define _GNU_SOURCE
02 #include <dlfcn.h>
03 #include <unistd.h>
04 #include <stdio.h>
05 #include <stdarg.h>
06
07 int (*true_close)(int fd);
08 int (*true_open)(const char *pathname, int flags, va_list mode);
09
10 int open (const char *pathname, int flags, va_list mode) {
11   true_open = dlsym (RTLD_NEXT, "open");
12   int fd = true_open (pathname, flags, mode);
13   printf ("DEBUG: open(\"%s\") = %d\n", pathname, fd);
14   return fd;
15 }
16
17 int close(int fd) {
18   true_close = dlsym (RTLD_NEXT, "close");
19   printf ("DEBUG: Closing fd = %d\n", fd);
20   return true_close(fd);
21 }

Listing 3

Compiling openclose.c

$ gcc openclose.c -o openclose.so -fPIC -shared -ldl

The dlsym() function finds functions in libraries by name. That is why the second call parameters in lines 11 and 18 are "open" and "close". A plain lookup would return pointers to the versions implemented here, which would not help. Passing RTLD_NEXT as the first parameter instead tells dlsym() to continue the search after the current library, skipping the first match in each case. The continued search for the function names then finds the implementations in the standard library.

To call the functions that dlsym() finds, the code in lines 7 and 8 defines two function pointer variables, which must be of the correct function type. After the assignments in lines 11 and 18, it is then possible to call true_open() and true_close().
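
Listing 2 declares the optional third argument of open() as a va_list, which relies on calling-convention details rather than on the function's documented prototype. The following variant is a sketch rather than the author's code: It assumes glibc on Linux, declares open() as variadic in the same way fcntl.h does, reads the mode only when O_CREAT is set (O_TMPFILE would need the same treatment and is left out for brevity), and caches the result of dlsym().

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>

/* Pointer to the next open() in the search order; resolved on first use. */
static int (*true_open)(const char *pathname, int flags, ...);

int open(const char *pathname, int flags, ...) {
  mode_t mode = 0;

  /* The caller only passes a mode when a file may be created. */
  if (flags & O_CREAT) {
    va_list args;
    va_start(args, flags);
    mode = (mode_t) va_arg(args, int);
    va_end(args);
  }

  if (true_open == NULL) {
    true_open = (int (*)(const char *, int, ...)) dlsym(RTLD_NEXT, "open");
    if (true_open == NULL) {         /* no further open() was found */
      errno = ENOSYS;
      return -1;
    }
  }

  int fd = true_open(pathname, flags, mode);
  int saved_errno = errno;           /* logging must not disturb errno */
  fprintf(stderr, "DEBUG: open(\"%s\") = %d\n", pathname, fd);
  errno = saved_errno;
  return fd;
}
```

Writing the debug message to stderr instead of stdout is a design choice in this sketch: It keeps the log separate from the regular output of the traced program.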

You can now run an arbitrary program with these modified file access functions by prefixing the program call with a variable assignment of the form LD_PRELOAD=<path>/<to>/<library>, specifying the absolute path (Listing 4).

Listing 4

Variable Assignment

$ cat test.txt
Hello
$ LD_PRELOAD=$PWD/openclose.so cat test.txt
DEBUG: open("test.txt") = 3
Hello
DEBUG: Closing fd = 3
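
The same pattern extends to other library functions, such as those that delete files. As a minimal sketch, assuming glibc on Linux and not taken from the listings above, the following unlink() wrapper logs each call, resolves the original function only once, and handles the case where dlsym() finds nothing.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <stdio.h>

/* Pointer to the next unlink() in the search order; resolved on first use. */
static int (*true_unlink)(const char *pathname);

/* Replacement for unlink(): delegates to the original, then logs the call. */
int unlink(const char *pathname) {
  if (true_unlink == NULL) {
    true_unlink = (int (*)(const char *)) dlsym(RTLD_NEXT, "unlink");
    if (true_unlink == NULL) {       /* no further unlink() was found */
      errno = ENOSYS;
      return -1;
    }
  }

  int result = true_unlink(pathname);
  int saved_errno = errno;           /* logging must not disturb errno */
  fprintf(stderr, "DEBUG: unlink(\"%s\") = %d\n", pathname, result);
  errno = saved_errno;
  return result;
}
```

Compiled into a shared library with the options from Listing 3 and preloaded as in Listing 4, this wrapper should report the files that a program such as rm tries to delete through unlink(); programs that use unlinkat() or other calls would need additional wrappers.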
