A Deep Dive into the ELF File Format
The Section Headers
Each section header describes one section: its type, permissions, size, location (on disk and in memory), and other miscellaneous information. The raw data for the sections appears at the end of the ELF file. This raw data consists of executable code, program data such as global objects, and information used during linking. Some ELF images also contain debugging data in DWARF format.
Table 1 shows some common section names and their descriptions. Most of these need little explanation. Note that the capitalized section names used in the source code don't make it into the final ELF file; they're just labels for the benefit of the assembler. The real section names are by convention lowercase and start with a dot. For more information on the .gnu.hash section, see "GNU Hash ELF Sections" online [4]; the accompanying Git repository contains a small C utility that will generate a GNU-style hash for any symbol.
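The repository's utility is not reproduced here. As a rough illustration only, the hash generally documented for .gnu.hash can be sketched in a few lines of C; the function name `gnu_hash` is an assumption made for this example, not a name taken from the article's code.

```c
#include <stdint.h>

/* Sketch of the GNU-style symbol hash: start at 5381 and, for each
 * byte of the name, multiply the running value by 33 and add the byte.
 * The arithmetic wraps modulo 2^32. The name gnu_hash is hypothetical. */
uint32_t gnu_hash(const char *name)
{
    uint32_t h = 5381;
    const unsigned char *p;

    for (p = (const unsigned char *)name; *p != '\0'; p++)
        h = h * 33 + *p;
    return h;
}
```

Calling `gnu_hash("exit")` yields the 32-bit value that a dynamic linker would look up in the hash table for that symbol.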
Table 1
Common Section Names
Name | Description |
---|---|
.interp | A (possibly) null-terminated string requesting a program interpreter (aka runtime linker/loader). On Linux, this would typically be /lib64/ld-linux-x86-64.so.2; on FreeBSD it's /libexec/ld-elf.so.1. |
.gnu.hash | When the dynamic linker combines all the objects in a process, it needs a way to discover, rapidly, whether a particular symbol is present in a given object file. The GNU hash section provides a precomputed hash table to facilitate this. |
.dynsym | A list of symbols that the object provides or requires. The first symbol should be a null symbol. For symbols we're providing ourselves, we need to supply the section index of the section where the symbol's storage is located, and a virtual address for the symbol. |
.dynstr | Null-terminated strings, usually the names of libraries and functions needed in the link. This section, like .strtab and .shstrtab, is defined to begin and end with a null character. |
.rela.plt | Relocations. Each relocation contains the address of a slot the dynamic linker needs to fill in, the index of the corresponding symbol in .dynsym, the type of relocation (we'll only be using R_X86_64_JMP_SLOT = 7), and a constant addend. The address and the addend are quad-words; the symbol index and the type share a third quad-word, for 24 bytes in total. |
.text | The actual executable code of the program; the address of this section is typically the program's entry point. |
.plt | Contains code used as a springboard to functions in other ELF images loaded in the same address space. |
.got.plt | Contains the absolute addresses of functions in other ELF images loaded in the same address space. |
.bss, .data | These sections contain only program data, holding the variables expected by libc. |
.dynamic | This section contains an array of pairs of quad-words providing extra information to help with dynamic linking. The first quad-word can be thought of as a configuration option and the second, its value. For example, DT_NEEDED followed by the offset of the string libc.so.7 indicates that this library is needed, and DT_GNU_HASH followed by a virtual address tells the linker where to find the .gnu.hash section. |
.symtab | Non-dynamic symbols; not usually loaded into memory at runtime. |
.strtab | Non-dynamic strings referenced by .symtab. |
.shstrtab | Contains the section names used by the section headers. |
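To make the .dynamic row of Table 1 more concrete, here is a minimal C sketch of the tag/value pairs and a lookup over them. The struct and function names (`dyn_entry`, `dyn_lookup`) are assumptions made for this example; the tag values are the standard ELF ones, and the array is assumed to end with a DT_NULL entry.

```c
#include <stdint.h>

/* One .dynamic entry: a tag quad-word followed by a value quad-word.
 * The struct name is hypothetical; the layout is two 8-byte fields. */
struct dyn_entry {
    int64_t  tag;
    uint64_t value;
};

enum {
    DT_NULL     = 0,          /* marks the end of the array */
    DT_NEEDED   = 1,          /* value: offset of a library name in .dynstr */
    DT_GNU_HASH = 0x6ffffef5  /* value: virtual address of .gnu.hash */
};

/* Walk the array until DT_NULL. Returns 1 and stores the value of the
 * first entry with a matching tag, or returns 0 if the tag is absent. */
int dyn_lookup(const struct dyn_entry *dyn, int64_t tag, uint64_t *value)
{
    for (; dyn->tag != DT_NULL; dyn++) {
        if (dyn->tag == tag) {
            *value = dyn->value;
            return 1;
        }
    }
    return 0;
}
```

A loader-style consumer would call `dyn_lookup(dynamic, DT_GNU_HASH, &addr)` to find the hash table before resolving any symbols.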
As with the program headers, section headers have a common format, so I wrap the declarations in a macro (Listing 3).
Listing 3
The Section Header Macro
We can then declare a section header with just one macro invocation:
SECTION_HEADER TEXT,SHSTRTAB.S6,SHT_PROGBITS,SHF_ALLOC or SHF_EXECINSTR, LOAD_BASE + PLANE1 + SECTION_TEXT,SECTION_TEXT,TEXT_SIZE,0,0,0x10,0x0
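For readers more at home in C than in assembler macros, the fields that such an invocation fills in can be sketched as a 64-byte structure. This is an illustration under assumptions, not the article's macro: the struct and function names are hypothetical, and the name offset, address, file offset, and size are passed in as plain parameters.

```c
#include <stdint.h>

/* Sketch of a 64-bit ELF section header (64 bytes in total).
 * Field names are shortened; the struct name is hypothetical. */
struct section_header {
    uint32_t name;       /* offset of the section name in .shstrtab */
    uint32_t type;       /* e.g. SHT_PROGBITS */
    uint64_t flags;      /* permission bits such as SHF_ALLOC */
    uint64_t addr;       /* virtual address in memory */
    uint64_t offset;     /* offset of the raw data in the file */
    uint64_t size;       /* size of the raw data in bytes */
    uint32_t link;
    uint32_t info;
    uint64_t addralign;  /* required alignment */
    uint64_t entsize;    /* entry size for table-like sections, else 0 */
};

enum { SHT_PROGBITS = 1, SHF_ALLOC = 0x2, SHF_EXECINSTR = 0x4 };

/* Build a header for an executable code section, mirroring the
 * argument order of the macro invocation above. */
struct section_header make_text_header(uint32_t name_off, uint64_t addr,
                                       uint64_t file_off, uint64_t size)
{
    struct section_header h = {0};

    h.name = name_off;
    h.type = SHT_PROGBITS;
    h.flags = SHF_ALLOC | SHF_EXECINSTR;
    h.addr = addr;
    h.offset = file_off;
    h.size = size;
    h.addralign = 0x10;
    return h;
}
```

The link, info, and entsize fields stay zero here, matching the three zero arguments in the invocation.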
Section Contents
This simple example does not require all the sections described in Table 1, but a brief description of the .text, .plt, .got.plt, and .rela.plt sections will give you an indication of how the sections are structured.
The .text section contains the executable code for the program proper (Listing 4). In this case, the code calls puts to print a string to the terminal and exit to return control to the operating system.
Listing 4
The main Function
x86-64 instructions often use relative addressing. For example, a CALL instruction encodes its destination as a signed 32-bit offset, so the same call is encoded differently depending on its distance from the code it's calling. This makes it impossible to encode an absolute address or to call a function whose offset won't fit in 32 bits. The solution is the Procedure Linkage Table (PLT) and the Global Offset Table (GOT), which live in the .plt and .got.plt sections. The PLT provides call destinations that are local to the ELF image, so all of its labels can be reached by a 32-bit relative call. Each PLT entry then uses a JMP instruction to jump to the real function, whose address the dynamic linker has placed in the GOT. The PLT and GOT also allow the dynamic linker to resolve function addresses only when required, speeding up the loading process.
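The 32-bit limit on relative calls can be seen in a small C sketch that encodes a near CALL (opcode 0xE8) by hand. The function name `encode_call_rel32` is hypothetical; the sketch assumes the usual five-byte form, in which the displacement is measured from the end of the instruction.

```c
#include <stdint.h>

/* Encode "CALL target" as a 5-byte relative call placed at call_addr.
 * Returns 0 on success, or -1 if the target is too far away for a
 * signed 32-bit displacement. The function name is hypothetical. */
int encode_call_rel32(uint64_t call_addr, uint64_t target, uint8_t out[5])
{
    /* The displacement is relative to the next instruction,
     * that is, the address of the call plus its 5-byte length. */
    int64_t disp = (int64_t)(target - (call_addr + 5));
    uint32_t d;

    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    d = (uint32_t)(int32_t)disp;
    out[0] = 0xE8;                      /* opcode: near relative call */
    out[1] = (uint8_t)(d & 0xff);       /* displacement, little-endian */
    out[2] = (uint8_t)((d >> 8) & 0xff);
    out[3] = (uint8_t)((d >> 16) & 0xff);
    out[4] = (uint8_t)((d >> 24) & 0xff);
    return 0;
}
```

A target more than about 2GB away makes the function fail, which is the situation a PLT entry inside the same image avoids.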
Listing 5 defines some convenience macros for the .plt and .got.plt sections and then lists the sections themselves.
Listing 5
The PLT and GOT
Next, for the .rela.plt section, I define a macro for a single relocation, using the 24-byte structure described earlier (Listing 6).
Listing 6
Relocations
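Separately from the article's assembler macro, the 24-byte relocation layout can be sketched in C as follows. The struct and function names (`rela_entry`, `make_jmp_slot`) are assumptions for illustration; the sketch relies on the standard 64-bit ELF convention that the symbol index occupies the upper 32 bits of the info field and the relocation type the lower 32 bits.

```c
#include <stdint.h>

/* Sketch of one .rela.plt entry: three quad-words, 24 bytes.
 * The struct name is hypothetical. */
struct rela_entry {
    uint64_t offset;  /* address of the slot the dynamic linker fills in */
    uint64_t info;    /* symbol index (upper 32 bits), type (lower 32) */
    int64_t  addend;  /* constant addend */
};

enum { R_X86_64_JMP_SLOT = 7 };

/* Build a jump-slot relocation for the symbol at sym_index in .dynsym,
 * targeting the GOT slot at slot_addr. */
struct rela_entry make_jmp_slot(uint64_t slot_addr, uint32_t sym_index)
{
    struct rela_entry r;

    r.offset = slot_addr;
    r.info = ((uint64_t)sym_index << 32) | R_X86_64_JMP_SLOT;
    r.addend = 0;
    return r;
}
```

For instance, `make_jmp_slot(0x403018, 2)` would describe a slot at a made-up address 0x403018 to be filled with the address of the third symbol in .dynsym.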
Exported and Imported Symbols
The executable exports two symbols, environ and __progname, expected by libc, and imports puts and exit. In Listing 7, I wrap these declarations in some convenience macros.
Listing 7
Imported and Exported Symbols
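As a rough C counterpart to such declarations, the sketch below shows how a 24-byte .dynsym entry differs between an exported data symbol and an imported function. All names here (`dyn_symbol`, `exported_object`, `imported_function`) are hypothetical; the constants are the standard ELF values, and an imported symbol is assumed to carry the undefined section index with a zero address.

```c
#include <stdint.h>

/* Sketch of one 64-bit .dynsym entry (24 bytes). Hypothetical name. */
struct dyn_symbol {
    uint32_t name;   /* offset of the symbol name in .dynstr */
    uint8_t  info;   /* binding in the high nibble, type in the low */
    uint8_t  other;
    uint16_t shndx;  /* index of the section holding the symbol */
    uint64_t value;  /* virtual address of the symbol */
    uint64_t size;   /* size of the object in bytes */
};

enum { SHN_UNDEF = 0, STB_GLOBAL = 1, STT_OBJECT = 1, STT_FUNC = 2 };

/* Pack binding and type into the one-byte info field. */
static uint8_t sym_info(unsigned bind, unsigned type)
{
    return (uint8_t)((bind << 4) | (type & 0xf));
}

/* A symbol we provide ourselves (such as environ): it names the
 * section where its storage lives and its virtual address. */
struct dyn_symbol exported_object(uint32_t name_off, uint16_t shndx,
                                  uint64_t addr, uint64_t size)
{
    struct dyn_symbol s = {0};

    s.name = name_off;
    s.info = sym_info(STB_GLOBAL, STT_OBJECT);
    s.shndx = shndx;
    s.value = addr;
    s.size = size;
    return s;
}

/* A symbol we need from elsewhere (such as puts): undefined section
 * index and no address, left for the dynamic linker to resolve. */
struct dyn_symbol imported_function(uint32_t name_off)
{
    struct dyn_symbol s = {0};

    s.name = name_off;
    s.info = sym_info(STB_GLOBAL, STT_FUNC);
    s.shndx = SHN_UNDEF;
    return s;
}
```

The null symbol that opens the table is simply an entry with every field set to zero.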