Comparing the ext3, ext4, XFS, and Btrfs filesystems
XFS
XFS [4] is a 64-bit journaling filesystem. It was originally created in 1994 by Silicon Graphics, Inc. for its IRIX operating system. XFS was later ported to the Linux kernel version 2.4 in 2002.
A benefit of XFS is its stability and maturity. XFS is often seen as the filesystem for people with massive amounts of data. Because it is a full 64-bit filesystem, XFS can handle filesystems in the exabyte range (millions of terabytes). XFS ensures data consistency via metadata journaling, which allows it to restart very quickly after an unexpected interruption, regardless of the number of files it is managing. At the same time, XFS manages to minimize the performance impact of journaling.
XFS also supports write barriers (a mechanism for enforcing a particular ordering in a sequence of writes). Another specialty of XFS is its allocation groups. Allocation groups allow systems with multiple processors or multi-core processors to provide better throughput by simultaneously reading and writing through multiple application threads. XFS is capable of delivering close to the raw I/O performance that the underlying hardware can provide.
Performance
Table 1 shows the basic features of the four filesystems at a glance. Comparing performance is more of a challenge. Evaluating filesystem performance is a very difficult task because of the complex role a filesystem plays. What does "faster" mean? One system might be faster for accessing many small files, while another is faster for accessing a single large file. One filesystem might perform better on metadata operations, and another might handle data better. At the same time, problems writing metadata to the journal can thwart the overall I/O performance. Thus, a single number can never characterize the performance of a filesystem. Instead, it is better to isolate the different aspects of performance and measure them separately. Afterward, you can determine which aspects are most significant for the workloads you envision.
Table 1
Comparing Features
Name | Btrfs | Ext3 | Ext4 | XFS |
---|---|---|---|---|
Created | 2007 | 1998 | 2006 | 1994 |
Original OS | Linux | Linux | Linux | IRIX |
Limits | | | | |
Max. filename length | 255 bytes | 255 bytes | 255 bytes | 255 bytes |
Max. file size (4k blocks) | 8EB (Linux kernel limit) | 2TB | 16TB | 8EB |
Max. volume size | 16EB | 16TB | 1EB | 16EB |
Features | | | | |
Hard links | yes | yes | yes | yes |
Symbolic links | yes | yes | yes | yes |
Meta-data journaling | no | yes | yes | yes |
Snapshots | yes | no | no | no |
Clones | yes | no | no | no |
Encryption | no | no | no | no |
Compression | yes | no | no | no |
Deduplication | yes | no | no | no |
Integrated LVM | yes | no | no | no |
Online resizing | grow/shrink | grow only | grow only | grow only |
Offline resizing | no | grow/shrink | grow/shrink | no |
Extent allocation | yes | no | yes | yes |
Delayed allocation | yes | no | yes | yes |
Choosing the right benchmarks for measuring filesystem performance is important. Some benchmarks study a filesystem's ability to scale with increasing load; other benchmarks work by replaying traces of recorded workloads. Block device benchmarks such as iometer [5] or fio [6] evaluate bandwidth and latency of read and write operations on the physical device. These benchmarks are not very useful for this study. I need benchmarks that operate on the filesystem layer, not on the block device.
My goal is to evaluate read and write performance as a function of the file size. An example of such a benchmark is iozone [7]. With small file sizes, these benchmarks can run entirely in memory and report "warm-cache" results. To keep the comparison fair, I ran all benchmarks on the same server and RAID system, so that the influence of CPU and memory is the same in all cases. I used an Exus Data ProServII server with Ubuntu Linux Server 14.04 (kernel 3.13.0) and a Transtec SCSI RAID system.
A file size larger than the buffer cache (i.e., nearly the amount of free RAM) makes performance drop to the spindle speed of the underlying hard disk or RAID group. It is also possible to empty the page cache before running the benchmark by writing a special value to /proc/sys/vm/drop_caches. Writing a 3 frees the page cache, dentries, and inodes:
echo 3 > /proc/sys/vm/drop_caches
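The kernel only drops clean cache entries, so dirty pages should be written back with sync before the caches are dropped. The following is a minimal sketch of that sequence as a shell function; the function name and its optional target argument are assumptions added for illustration (the argument only exists so the function can be tried against a scratch file).

```shell
#!/bin/sh
# Sketch: flush dirty pages, then ask the kernel to drop clean caches.
# drop_caches and its optional argument are hypothetical helpers,
# not part of any kernel or iozone interface.
drop_caches() {
    target="${1:-/proc/sys/vm/drop_caches}"
    # Dirty pages cannot be dropped, so write them back first.
    sync
    # 1 = page cache, 2 = dentries and inodes, 3 = both.
    echo 3 > "$target"
}
```

Calling drop_caches without an argument requires root. Note that a plain "sudo echo 3 > /proc/sys/vm/drop_caches" does not work, because the redirection is performed by the unprivileged calling shell.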
I chose iozone and let it run in an automated manner with the following command:
iozone -Raz -bExt3_auto_20G.xls -g20G
Iozone takes quite a while and produces a lot of data.
The program iterates through all file sizes, starting with 64KB, doubling each step, and going through all possible record sizes, starting with 4KB. The server has 16GB of RAM, so the last pass of the benchmark with 16GB file size will no longer fit in the buffer cache. It shows the speed of the physical device – all other results are defined by the speed of the CPU and buffer caches.
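The doubling sequence described above can be sketched in a few lines of shell. The function name is an assumption for illustration; the loop only reproduces the file sizes, not the record sizes or any measurement.

```shell
#!/bin/sh
# Print the file sizes (in KB) visited by a doubling sequence that
# starts at min_kb and stops once the next step would exceed max_kb.
# list_file_sizes is a hypothetical helper, not part of iozone.
list_file_sizes() {
    size_kb="$1"
    max_kb="$2"
    while [ "$size_kb" -le "$max_kb" ]; do
        echo "$size_kb"
        size_kb=$((size_kb * 2))
    done
}

# 64KB minimum and a 20GB maximum, matching -g20G in the command above.
list_file_sizes 64 $((20 * 1024 * 1024))
```

With these limits, the sequence has 19 steps and ends at 16777216KB (16GB), because the next doubling to 32GB would exceed the 20GB maximum.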
I picked samples with typical file sizes out of the data flood to compare the filesystems. They don't show big differences (Figure 1). The similarity is probably because RAM speed is the dominating effect in this case. Most of the reads or writes go to the buffer cache. The situation changes slightly if you use iozone to measure the throughput explicitly. The appropriate option (-t) lets the user specify how many threads or processes are active during the measurement. Scaling performance with the number of processes is a strength of XFS, which clearly performs better than ext3 (Figure 2).
![](/var/linux_magazin/storage/images/issues/2014/165/choose-a-filesystem/figure-2/620100-1-eng-US/Figure-2_large.png)
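A throughput run of this kind might look like the following sketch, which prints one iozone command line per thread count. The file size, record size, and thread counts are illustrative assumptions, not the values used for Figure 2.

```shell
#!/bin/sh
# Sketch: build one iozone throughput-mode command per thread count.
# throughput_cmd is a hypothetical helper; sizes are assumed values.
throughput_cmd() {
    threads="$1"
    # -t: number of threads/processes, -s: file size per thread,
    # -r: record size, -i 0: write/rewrite, -i 1: read/re-read.
    echo "iozone -t $threads -s 1g -r 64k -i 0 -i 1"
}

for n in 1 2 4 8; do
    throughput_cmd "$n"
done
```

The script only prints the commands, so they can be reviewed first; piping the output to sh would execute them in the current directory of the filesystem under test.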
Another question is how fast the candidates perform metadata tasks, such as creating or destroying files and directories. I used fdtree [8] to test metadata performance. fdtree is a highly portable shell script that recursively creates and removes directories and files. In my test, fdtree created and deleted four directory levels with 10 directories at each level for a total of 11111 directories, with 10 files of size 40KB per directory, for a total of 111110 files and 4.34GB. Figure 3 shows the results. Again, the differences were not huge, but ext4 takes first place, and ext3 ranks far behind in last place. Keep in mind that file size and the depth of the directory structure make a difference. It also matters where the journal of an ext3/4 or XFS filesystem resides: on the same disk (bad), on an extra disk (better), or on a RAM disk or SSD (best).
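The directory and file totals quoted for the fdtree run follow from the tree shape. As a rough check, this sketch sums the directories per level, counting the root directory as level zero; the function name is an assumption for illustration and the script does not touch the filesystem.

```shell
#!/bin/sh
# Count directories and files in a tree with `levels` levels below the
# root, `dirs` subdirectories per directory, and `files` files per
# directory. tree_counts is a hypothetical helper, not part of fdtree.
tree_counts() {
    levels="$1"
    dirs="$2"
    files="$3"
    total_dirs=1    # the root directory
    at_level=1
    i=0
    while [ "$i" -lt "$levels" ]; do
        at_level=$((at_level * dirs))
        total_dirs=$((total_dirs + at_level))
        i=$((i + 1))
    done
    echo "$total_dirs $((total_dirs * files))"
}

tree_counts 4 10 10    # prints "11111 111110"
```

The sum 1 + 10 + 100 + 1000 + 10000 gives the 11111 directories, and 10 files in each of them gives the 111110 files.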
Conclusion
Choosing a RAID system with lots of spindles instead of a single disk, or choosing an SSD instead of a hard disk, has a much greater influence on I/O performance than filesystem selection. Today's filesystems aren't far apart. Nevertheless, you might gain some percentage points of performance by choosing the right filesystem for your workload.
If you need the latest features and benefit from volume manager and RAID integration, self-healing, or snapshots, your only choice is Btrfs. If stability is the most important criterion, a less complex, but well-established, solution such as ext3 might be the best option. Very large filesystems and the need for high stability lead to XFS, which is also good for reading or writing many parallel streams. Ext4 is a balanced compromise, with many new features and a rock-solid foundation, that excels at metadata operations.
Infos
[1] Ext3 FAQ: http://batleth.sapienti-sat.org/projects/FAQs/ext3-faq.html
[2] Ext4 wiki: https://ext4.wiki.kernel.org/index.php/Main_Page
[3] Btrfs wiki: https://btrfs.wiki.kernel.org/index.php/Main_Page
[4] XFS: http://oss.sgi.com/projects/xfs
[5] iometer: http://www.iometer.org
[6] fio: http://freecode.com/projects/fio
[7] iozone: http://www.iozone.org
[8] fdtree: https://computing.llnl.gov/?set=code&page=sio_downloads