Anatomy of a simple Linux utility
How ls Works
A Linux utility such as ls might look simple, but many steps happen behind the scenes between the time you type "ls" and the time you see the directory listing. In this article, we look at these behind-the-scenes details.
What really happens when you enter a program's name in a terminal window? This article is a journey into the workings of a commonly used program: the ubiquitous ls file listing command. This journey starts with the Bash [1] shell finding the ls program in response to the letters ls typed at the terminal, and it leads to a list of files and directories retrieved from the underlying filesystem [2].
To recreate these results, you'll need some basic understanding of standard debugging techniques using the GNU debugger (gdb), some familiarity with the SystemTap system information utility [3] [4], and an intermediate-level understanding of C programming. SystemTap is a scripting language and an instrumentation framework that allows you to examine a running Linux kernel dynamically. If you don't have all these skills, following along will still give you some insight into the inner workings of a program on Linux.
This article assumes you are running Linux kernel 3.18 [5] with the debug symbols for Bash installed, that a local copy of the 3.18 kernel source is available, and that SystemTap is set up properly. In the next section, I will describe how to configure your system to follow this article.
Setting Up Your System
To install the Bash debug symbols on Fedora 21, you can use the command:
# debuginfo-install bash
If you do not have the GNU debugger gdb installed, you can install it using yum install gdb.
The kernel 3.18 source can be downloaded from The Linux Kernel Archives [6], or, if you prefer to clone the kernel source, check out the v3.18 tag. SystemTap can be installed on Fedora 21 with:
# yum install systemtap-devel systemtap-client
# stap-prep
The last line installs the necessary kernel packages for your kernel.
Methodology
Before getting started, it is worthwhile to discuss the methodology I adopted for this investigation. The first step is to understand how the shell finds the program (an executable script or a binary) that corresponds to a command entered on the command line. By placing breakpoints at key locations in Bash, you can halt its execution and examine key variables to see what Bash is processing at that point. The next section makes this step clearer with an example that uses the ls program.
Once you know how the program to be executed is found, you want to know how the program itself works. System calls are a program's entry points into kernel space. The program invokes a system call either directly or through a library function.
After determining the key system call or calls, you then look into the kernel source code to find the function implementing that system call. SystemTap scripts can then trace the entry to and exit from these functions, illustrating how control flows into and out of kernel space.
I adopt this methodology to understand how the ls program works, but the same techniques should be relevant for any program.
First Steps: Typing ls
When I type ls, Bash first searches for the corresponding binary in the directories listed in the PATH environment variable. You can chart this action using the GNU debugger (gdb); you'll need either the debug symbols for Bash installed or a locally built copy of Bash with debugging enabled.
To begin, start a gdb session and pass in the bash binary:
> gdb bash
Place a breakpoint in the search_for_command() function and start bash, passing in ls as the argument (Listing 1).
Listing 1
Placing Breakpoints in Bash Source
As you can see from line #0 in Listing 1, the argument pathname refers to the string ls, which now has to be searched for in the locations specified by the user's $PATH variable. My $PATH is as follows:
> echo $PATH
/usr/lib64/qt-3.3/bin:/usr/lib64/ccache:/bin:/usr/bin:/usr/local/bin:/usr/local/sbin:/usr/sbin:/home/asaha/.local/bin:/home/asaha/bin
I now place a breakpoint in the find_user_command_in_path() function to see how Bash searches through all the locations present in $PATH (Listing 2).
Listing 2
Searching for the Program in $PATH
At the end of Listing 2, /usr/bin/ls has been found (/bin is a symlink to /usr/bin on Fedora 21); the function shell_execve() then invokes the execve() system call to execute the command.
The stat() system call is invoked to check whether an executable corresponding to ls exists in each path location. Listing 3 shows a snippet of the calls to stat() for the three path locations.
Listing 3
stat() Calls to Path Locations
A closer look at the kernel reveals how the stat() system call works. From here on out, all source references are relative to the top-level kernel source directory.
The stat() system call is defined in fs/stat.c (Listing 4). The vfs_stat() function, in turn, is defined as shown in Listing 5. The function vfs_fstatat() makes use of the inode data structures to check for the file's existence and, if the file exists, retrieves its attributes. To see what happens in kernel space when the stat() system call is invoked, I use the SystemTap script in Listing 6 to trace the calls to and returns from the vfs_fstatat() function.
Listing 4
Definition of stat() System Call
Listing 5
Definition of vfs_stat()
Listing 6
Tracing Call To and From vfs_fstatat()
The vfs_fstatat() function has the prototype:
int vfs_fstatat(int dfd, const char __user *filename, struct kstat *stat, int flag)
The filename parameter is what I am interested in here. When you run the SystemTap script, you will see the lines shown in Listing 7.
Listing 7
Output of Script in Listing 6
Now, execute the ls command in another terminal window. You should see the lines shown in Listing 8 in the SystemTap window.
Listing 8
Output of SystemTap Script in Listing 6
At this stage, I have a fairly reasonable idea of what happens in userspace and kernel space to find the location of the program to which ls corresponds. Now, I am ready to see how the binary is executed.