The ext filesystem – a four-generation retrospective

Family

© Lead Image © Bea Kraus, medchip, tempusfugit, Robert Mizerek 123RF.com

Author(s): Udo Seidel

The extended filesystem has been part of the Linux kernel since 0.96c – a faithful companion of the free operating system. With its developments – or, rather, rebirths – through ext2, ext3, and ext4, it is one of the oldest Linux-specific software projects.

The Linux kernel [1] is now almost 22 years old. Its faithful companion since 1992 has been the family of extended filesystems, ext [2]-[4]. For many reasons, Linux took its first steps with a filesystem derived from Minix [3]-[5]. Originally, Linus Torvalds only wanted to develop a better terminal emulator for Minix, so he had no need for a separate filesystem. Even as the Torvalds project turned into an operating system kernel, development still continued under Minix, and the shared filesystem made it easier to exchange data.

A Star Is Born

The filesystem used in Minix, which was originally developed for educational purposes, had some significant limitations. The maximum file size was 64MB and the maximum filename length was 14 characters [4] (a variant of the format with 30-character names also exists, which is the value listed in Table 1). For some Linux pioneers, these limits were eventually unsustainable, and they started to think about a new, native Linux filesystem. Linus integrated the VFS (virtual filesystem) layer into kernel version 0.96a, which made it easier to add further filesystems (Figure 1) [4] [6].

Figure 1: The Linux kernel 0.96a introduced VFS as a kind of jump-off point for the ext filesystem in 0.96c.

In version 0.96c, the first member of the ext filesystem (FS) family saw the light of the Linux world [2]. Rémy Card, the main architect, was inspired by the design of the UFS (Unix filesystem). Filenames were now allowed to be 255 characters on a filesystem of up to 2GB.

Although ext represented an improvement over the Minix filesystem, it still had a number of elements the developers absolutely hated, such as only one timestamp, instead of the three typically in use today, and the use of linked lists for free space, which quickly led to fragmentation and poor performance.

The replacement of ext was therefore inevitable and not long in coming. The successor, ext2, became part of the Linux kernel in version 0.99.7 (March 1993). The maximum size of the filesystem could now be a massive 4TB, and a file could be up to 2GB in size. In 1993, these were unbelievably large disk sizes. As you can guess, ext2 now also had the familiar three timestamps: last access, last inode change, and last modification of the file contents.
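The limits quoted above can be reproduced with simple arithmetic on field widths. The following back-of-the-envelope sketch is not taken from the kernel sources; it assumes 1KB blocks, 16-bit block numbers for Minix, 32-bit block numbers for ext2, and a signed 32-bit file size. The function names are made up for illustration.

```python
KIB = 1024


def max_fs_bytes(block_number_bits, block_size=KIB):
    """Largest volume addressable with block numbers of the given width."""
    return (2 ** block_number_bits) * block_size


def max_file_bytes_signed(size_bits):
    """Upper bound on file size when the size is a signed integer."""
    return 2 ** (size_bits - 1)


# Assumed field widths (see the lead-in); they reproduce the quoted limits.
minix_fs = max_fs_bytes(16)            # 16-bit block numbers, 1KB blocks
ext2_fs = max_fs_bytes(32)             # 32-bit block numbers, 1KB blocks
ext2_file = max_file_bytes_signed(32)  # signed 32-bit size

print(minix_fs // 2**20, "MB")   # 64 MB
print(ext2_fs // 2**40, "TB")    # 4 TB
print(ext2_file // 2**30, "GB")  # 2 GB
```

Strictly speaking, the largest file under these assumptions is one byte short of 2GB, because the largest signed 32-bit value is 2^31 - 1.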

Competitor Xia

Interestingly, ext2 was not unrivaled in this race. Xia FS – named after its developer, Frank Xia [6] – was based on Minix FS and addressed its shortcomings (Table 1). The first alpha versions of both Xia and ext2 were released in January 1993, and Xia initially proved to be more stable. However, the larger developer community backed the ext successor and soon helped to make it stable [7].

Table 1

Filesystem Features (March 1993)

Feature                   Minix          ext             ext2            Xia
Maximum filesystem size   64MB           2GB             4TB             2GB
Maximum file size         64MB           2GB             2GB             64MB
Maximum filename length   30 characters  255 characters  255 characters  248 characters
Timestamps                1              1               3               3

The Second Wave

For almost a decade, the ext2 filesystem was the de facto standard for Linux. It still enjoys significant popularity today, for example on boot media or for storing core dumps. The first versions were written mainly by Card, the author of ext, together with Stephen Tweedie and Theodore (Ted) Ts'o.

Their paper, published in 1994 [4] [8], describes the origin, the status, and the development potential of the new standard Linux filesystem.

Like its predecessor, ext2 is based on the design principles of UFS and thus has a robust basic structure. Understanding the concept of inodes, directories, and links does not require a computer science degree (Figure 2). After the boot sector, the disk is divided into block groups. At the beginning of each block group is a superblock – or its backup – that describes the filesystem. Information for the block groups then follows. The filesystem uses bitmaps and tables to manage the (unused) blocks and inodes.

Figure 2: The basic structure of ext2 comes from UFS.

The remaining space belongs to the data. When designing ext2, the developers planned for future extensions. Card et al. consciously left enough space in the structures of the filesystem to allow new features to be defined later, so users could benefit from new features without having to create the filesystem again. This approach, which is fairly simple to describe, is an important pillar of the success of the ext FS family. Because ext FS has well-understood internals and has been part of Linux almost from the outset, it has often served as a test bed for extensions of the VFS layer.
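To make the layout more tangible, the following sketch decodes a few superblock fields and derives the number of block groups. It is a minimal illustration, not the code used by any ext tool. The field offsets and the magic number 0xEF53 follow the documented ext2 on-disk layout, whereas the names parse_superblock and read_superblock and the synthetic demo values are made up for this example. Only the 32-bit block count is read, so very large ext4 volumes are out of scope.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # primary superblock starts 1024 bytes into the volume
SUPERBLOCK_SIZE = 1024
EXT2_MAGIC = 0xEF53


def parse_superblock(raw):
    """Decode a handful of little-endian fields from a raw superblock."""
    inodes_count, blocks_count = struct.unpack_from("<II", raw, 0)
    first_data_block, log_block_size = struct.unpack_from("<II", raw, 20)
    (blocks_per_group,) = struct.unpack_from("<I", raw, 32)
    (inodes_per_group,) = struct.unpack_from("<I", raw, 40)
    (magic,) = struct.unpack_from("<H", raw, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("magic number mismatch: not an ext2/3/4 superblock")
    # Ceiling division: the last block group may be only partly filled.
    data_blocks = blocks_count - first_data_block
    group_count = -(-data_blocks // blocks_per_group)
    return {
        "block_size": 1024 << log_block_size,
        "blocks_count": blocks_count,
        "inodes_count": inodes_count,
        "blocks_per_group": blocks_per_group,
        "inodes_per_group": inodes_per_group,
        "group_count": group_count,
    }


def read_superblock(path):
    """Hypothetical helper: read the primary superblock from an image file."""
    with open(path, "rb") as image:
        image.seek(SUPERBLOCK_OFFSET)
        return parse_superblock(image.read(SUPERBLOCK_SIZE))


# Synthetic superblock for demonstration: a 1GB filesystem with 4KB blocks.
demo = bytearray(SUPERBLOCK_SIZE)
struct.pack_into("<II", demo, 0, 65536, 262144)  # inode count, block count
struct.pack_into("<II", demo, 20, 0, 2)          # first data block, log2(block size) - 10
struct.pack_into("<I", demo, 32, 32768)          # blocks per group
struct.pack_into("<I", demo, 40, 8192)           # inodes per group
struct.pack_into("<H", demo, 56, EXT2_MAGIC)

info = parse_superblock(bytes(demo))
print(info["block_size"], info["group_count"])   # 4096 8
```

The demo works on an in-memory buffer so that it runs without root privileges. Pointing read_superblock at an image file created with mkfs.ext2 should yield the same kind of dictionary.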

Journaling

Over time, more filesystems have been added to the Linux kernel, and ReiserFS, JFS, and XFS were superior to ext2 in at least one way: Thanks to journaling, an fsck was much faster than in ext2. With ever-increasing volumes, filesystem checking contributes significantly to the time the operating system is occupied. In 1998, Tweedie published a design study with initial implementation approaches for ext2 with a journaling function [9].
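The reason a journal shortens recovery can be shown with a toy model: Changes are first recorded in a log, and after a crash only the log needs to be scanned, not the whole disk. The sketch below is a deliberately simplified write-ahead journal. It does not reproduce the ext3 journal format, and the class and method names are invented for illustration.

```python
class ToyJournaledStore:
    """Toy block store with a write-ahead journal (not the ext3 format)."""

    def __init__(self):
        self.blocks = {}   # "on-disk" blocks: block number -> data
        self.journal = []  # append-only log of records

    def begin(self, txid):
        self.journal.append(("begin", txid))

    def log_write(self, txid, block_no, data):
        self.journal.append(("write", txid, block_no, data))

    def commit(self, txid):
        self.journal.append(("commit", txid))

    def replay(self):
        """Apply only committed transactions, then clear the journal."""
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for rec in self.journal:
            if rec[0] == "write" and rec[1] in committed:
                _, _, block_no, data = rec
                self.blocks[block_no] = data
        self.journal.clear()


store = ToyJournaledStore()
store.begin(1)
store.log_write(1, 10, b"inode table update")
store.log_write(1, 11, b"bitmap update")
store.commit(1)
store.begin(2)
store.log_write(2, 12, b"half-finished update")
# Simulated crash here: transaction 2 never commits.

store.replay()
print(sorted(store.blocks))  # [10, 11]
```

Recovery cost in this model depends on the length of the journal rather than on the size of the volume, which is the property that made journaling filesystems attractive as disks grew.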

During a discussion of write performance problems with large files [10], Tweedie reported on the kernel mailing list that he was working on this extension of the filesystem [11]. Managing many files in a single directory was also tricky for ext2. The developers addressed this with the introduction of the H-tree, a special form of the B-tree.

With the introduction of journaling, the H-tree for large directories, and the ability to enlarge ext FS online, ext3 was born. Since version 2.4.15 (November 2001), it has been available as a Linux kernel extension [12]. The highlight here is that a conversion of ext2 to its successor was (and still is) possible without the inconvenience of shifting data around.

The filesystem developers kept their word. More than that, a way back was even possible. Simply run the fsck.ext2 command against an ext3 partition, and you could mount it again as a normal ext2 FS in the directory tree. For a more elegant approach, you have the tune2fs tool (Listing 1).

Listing 1

From ext2 to ext3 and Back

# mkfs.ext2 /dev/sdb1
mke2fs 1.42.7 (21-Jan-2013)
Filesystem label =
OS Type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
[...]
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
# mount /dev/sdb1 /tmp/mnt/
# grep mnt /proc/mounts
/dev/sdb1 /tmp/mnt ext2 rw,relatime 0 0
# umount /tmp/mnt/
# tune2fs -j /dev/sdb1
tune2fs 1.42.7 (21-Jan-2013)
Creating journal inode: done
# mount /dev/sdb1 /tmp/mnt/
# grep mnt /proc/mounts
/dev/sdb1 /tmp/mnt ext3 rw,relatime,data=ordered 0 0
# umount /tmp/mnt/
# tune2fs -O ^has_journal,^dir_index /dev/sdb1
tune2fs 1.42.7 (21-Jan-2013)
# mount /dev/sdb1 /tmp/mnt/
# grep mnt /proc/mounts
/dev/sdb1 /tmp/mnt ext2 rw,relatime 0 0

Going Back

Converting an ext3 partition back to ext2 appears useful when performance is more important than filesystem integrity. A journal requires more writes to the disk than in ext2, and unfortunately, it cannot be switched off separately. As Listing 1 shows, the conversion is fairly painless and not a one-way street.
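The conversion is cheap because, on disk, the difference between ext2 and ext3 largely comes down to feature bits in the superblock plus the journal itself. The sketch below reads the compatible-feature field and checks the two bits that Listing 1 clears. The offset (92) and the bit values for has_journal and dir_index follow the documented superblock layout. The function names and the synthetic superblocks are assumptions made for this example, and the check deliberately ignores ext4-only features.

```python
import struct

# Offset of the compatible-feature bitmap within the superblock.
FEATURE_COMPAT_OFFSET = 92
COMPAT_FLAGS = {
    0x0004: "has_journal",
    0x0020: "dir_index",
}


def compat_features(superblock):
    """Return the names of the known compatible features that are set."""
    (compat,) = struct.unpack_from("<I", superblock, FEATURE_COMPAT_OFFSET)
    return {name for bit, name in COMPAT_FLAGS.items() if compat & bit}


def looks_like(superblock):
    """Rough guess: a journal means ext3, no journal means ext2."""
    return "ext3" if "has_journal" in compat_features(superblock) else "ext2"


def fake_superblock(compat):
    """Build a zeroed 1KB superblock with only the feature field filled in."""
    raw = bytearray(1024)
    struct.pack_into("<I", raw, FEATURE_COMPAT_OFFSET, compat)
    return bytes(raw)


with_journal = fake_superblock(0x0004 | 0x0020)
without_journal = fake_superblock(0)

print(looks_like(with_journal))     # ext3
print(looks_like(without_journal))  # ext2
```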

History, however, has repeated itself: As size and performance requirements of the managed media grew, ext3 began to reach its limits, so ext4 was created from a kind of patched ext3 [13]. First characterized as experimental, the filesystem has been tagged as production-ready since kernel version 2.6.28 (released in December 2008).

Continuity

Continuity was also evident on the developer front. Ts'o, already involved in the grandfather of ext4 and the maintainer of ext3, also took the youngest representative of the ext FS family under his wing. Most importantly: you can convert an existing ext3 without data loss or migration.

Unfortunately, some new features of ext4 require a change in the on-disk format. If you do without them, again the conversion is not a one-way operation, and again the tune2fs tool provides valuable service.

The lightweight variant of the conversion is simply to mount an ext3 partition with the ext4 driver (Figure 3).

Figure 3: The kernel driver makes some ext4 improvements available to ext3 partitions.

The relationships within the ext FS family are transitive – the compatibility of ext2 with ext3 and ext3 with ext4 results in some compatibility between ext2 and ext4. Table 2 shows which kernel driver can handle which filesystem, without requiring the user to manipulate the data on the disk.

Table 2

Driver Compatibility

Filesystem                        ext2 driver  ext3 driver  ext4 driver
ext2                              Yes          Yes          Yes
ext3                              No           Yes          Yes
ext4 (with ext3 on-disk format)   No           Yes          Yes
ext4                              No           No           Yes

Aux Base: Library

As I mentioned before, a good basic understanding of ext2 and its successors has been a key element in the popularity of the filesystem. In addition to the usual tools for making and checking ext FS are some good debugging tools, such as debugfs and dumpe2fs, which allow power users to dig really deep into the filesystem to analyze it, fix bugs, or save data.

For these tools to work, they must have appropriate insider knowledge of the relevant member of the ext FS family. If a detail of the filesystem changes, the developers are forced to adapt the associated tools. As the number of helper programs grows, the maintenance effort can become immense. The trick used by ext2 and its successors is to collect filesystem information in the libext2fs library, against which the tools are linked (Listing 2). Ironically, the Btrfs tool, btrfs-convert, also uses this library, allowing you to convert an existing ext FS in situ to the Btrfs competitor [14].

Listing 2

libext2fs

# ldd /usr/sbin/debugfs /usr/sbin/tune2fs \
  /usr/sbin/mkfs.ext* /usr/sbin/fsck.ext* \
  /usr/sbin/btrfs-convert | grep ext2fs | \
  awk '{print $1}' | uniq
libext2fs.so.2

Conclusions

With more than 20 years of history, the ext family of filesystems is one of the most sustainable open source projects in the Linux environment. Its success hinges on several factors. To begin, the 1993 design on one hand allowed for future extension, and on the other hand offered strong backward compatibility. Additionally, the makeup of the development team has remained stable. Early adoption into the Linux kernel didn't hurt, either, helping the filesystem family attract a large developer and user community. Early nomination of ext FS as the standard filesystem for Linux is both a cause and consequence of the success of ext2 and its relatives.

However, it remains to be seen how ext FS will fare in the coming years (see the box "Looking to the Future"). What will happen after the changing of the guard to Btrfs or XFS? Will Linux-based Android smartphones continue to fly the banner of the ext family?

Looking to the Future

In 2008, ext4 was released for production use. Whether there will be an ext5 remains questionable. Even Ts'o believes the future belongs to Btrfs [15]. A major advantage of the ext FS family, backward compatibility, ultimately limits its scalability and manageability. SUSE has already gone down this road: After the difficult departure of ReiserFS, the distribution switched to ext3, but the affiliation did not last long; now, Btrfs has taken over. Red Hat will also do away with ext4 as the default filesystem some time soon. XFS is the designated successor.

The (future) development of ext4 hinges on several components. The core team consists of 10 people, including Ts'o. They hold weekly conference calls and meet once a year. The contributions to the source code itself are derived from the work of far more people. A recent rough analysis counted more than 120 different developers. Communication takes place partly on the Linux kernel mailing list, as well as on ext4-specific lists [16]. Good old Internet relay chat (IRC) is also still used [17].

The xfstests suite and in-house regression tests are used to verify functionality, robustness, and stability. Bugs and their solutions are documented in Bugzilla tickets [18]. These and many other facts are described on the ext4 wiki [19], where developers and users alike will find important and useful information.

The number of ext4 users is virtually impossible to determine. A large question mark hangs over the figures from Google's data centers, where the filesystem is used on their machines. By the way, the conversion from ext2 to ext4 was one of the reasons why the Internet giant took maintainer Ts'o on board. Android Honeycomb and later also rely on the youngest member of the ext FS family. Thanks to Linux smartphones, the number of ext4 instances is growing rapidly.

Infos

  1. Linux Kernel Archives: http://www.kernel.org/
  2. Linux 0.96c: https://www.kernel.org/pub/linux/kernel/Historic/old-versions/linux-0.96c.tar.gz
  3. "Anatomy of ext4" by M. Tim Jones: http://www.ibm.com/developerworks/linux/library/l-anatomy-ext4/
  4. "Design and Implementation of the Second Extended Filesystem" by Rémy Card, Theodore Ts'o, and Stephen Tweedie: http://e2fsprogs.sourceforge.net/ext2intro.html
  5. Tanenbaum, Andrew. Operating Systems: Design and Implementation. Prentice Hall, 1987
  6. Linux 0.96a: https://www.kernel.org/pub/linux/kernel/Historic/old-versions/linux-0.96a.tar.gz
  7. "The Linux ext2/3/4 Filesystem: Past, Present and Future" by Theodore Ts'o: http://www.linuxfoundation.jp/jp_uploads/seminar20060911/Ted_Tso.pdf
  8. Card, Rémy, Theodore Ts'o, and Stephen Tweedie. "Design and implementation of the second extended filesystem" in Proceedings of the First Dutch International Symposium on Linux, 1994
  9. "Journaling the Linux ext2fs Filesystem" by Stephen Tweedie: http://original.jamesthornton.com/hotlist/linux-filesystems/ext3-journal-design.pdf
  10. "fsync on large files" by Alan Curry: http://lkml.org/lkml/1999/2/12/160
  11. "Re: fsync on large files" by Stephen C. Tweedie: http://lkml.org/lkml/1999/2/17/36
  12. Kernel changelog 2.4.15: http://www.kernel.org/pub/linux/kernel/v2.4/ChangeLog-2.4.15
  13. "Proposal and plan for ext2/3 future development work" by Theodore Ts'o: http://lkml.org/lkml/2006/6/28/454
  14. "Conversion from ext3": http://btrfs.wiki.kernel.org/index.php/Conversion_from_Ext3
  15. "Panelists ponder the kernel at Linux Collaboration Summit" by Ryan Paul: http://arstechnica.com/information-technology/2009/04/linux-collaboration-summit-the-kernel-panel/
  16. Ext4 mailing lists: http://ext4.wiki.kernel.org/index.php/Mailinglists
  17. Ext4 IRC: http://ext4.wiki.kernel.org/index.php/IRC
  18. Ext4 bugs: http://bugzilla.kernel.org/buglist.cgi?product=File+System&component=ext4
  19. Ext4 wiki: http://ext4.wiki.kernel.org

The Author

Dr. Udo Seidel is a math and physics teacher and has been a Linux fan since 1996. Since finishing his doctorate, he has worked as a Linux/Unix trainer, system administrator, and senior solution engineer. Today, he is head of a Linux Strategy team at Amadeus Data Processing GmbH in Erding, Germany.