Comparing the ext3, ext4, XFS, and Btrfs filesystems
XFS
XFS [4] is a 64-bit journaling filesystem. It was originally created in 1994 by Silicon Graphics, Inc. for its IRIX operating system. XFS was later ported to the Linux kernel version 2.4 in 2002.
A benefit of XFS is its stability and maturity. XFS is often seen as the filesystem for people with massive amounts of data. Because it is a full 64-bit filesystem, XFS is capable of handling filesystems as large as millions of terabytes (exabytes). XFS ensures data consistency via metadata journaling, which allows it to restart very quickly after an unexpected interruption, regardless of the number of files it is managing. At the same time, XFS manages to minimize the performance impact of journaling.
XFS also supports write barriers (a mechanism for enforcing a particular ordering in a sequence of writes). Another specialty of XFS is its allocation groups. Allocation groups allow systems with multiple processors or multi-core processors to provide better throughput by simultaneously reading and writing through multiple application threads. XFS is capable of delivering close to the raw I/O performance that the underlying hardware can provide.
Performance
Table 1 shows the basic features of the four filesystems at a glance. Comparing performance is more of a challenge. Evaluating filesystem performance is a very difficult task because of the complex role a filesystem plays. What does "faster" mean? One system might be faster for accessing many small files, while another is faster for accessing a single large file. One filesystem might perform better on metadata operations, and another might handle data better. At the same time, problems writing metadata to the journal can thwart the overall I/O performance. Thus, a single number can never characterize the performance of a filesystem. Instead, it is better to isolate the different aspects of performance and measure them separately. Afterward, you can determine which aspects are most significant for the workloads you envision.
Table 1
Comparing Features
Name | Btrfs | Ext3 | Ext4 | XFS |
---|---|---|---|---|
Created | 2007 | 1998 | 2006 | 1994 |
Original OS | Linux | Linux | Linux | IRIX |
Limits | | | | |
Max. filename length | 255 bytes | 255 bytes | 255 bytes | 255 bytes |
Max. file size (4k blocks) | 8EB (Linux kernel limit) | 2TB | 16TB | 8EB |
Max. volume size | 16EB | 16TB | 1EB | 16EB |
Features | | | | |
Hard links | yes | yes | yes | yes |
Symbolic links | yes | yes | yes | yes |
Metadata journaling | no | yes | yes | yes |
Snapshots | yes | no | no | no |
Clones | yes | no | no | no |
Encryption | no | no | no | no |
Compression | yes | no | no | no |
Deduplication | yes | no | no | no |
Integrated LVM | yes | no | no | no |
Online resizing | grow/shrink | grow only | grow only | grow only |
Offline resizing | no | grow/shrink | grow/shrink | no |
Extent allocation | yes | no | yes | yes |
Delayed allocation | yes | no | yes | yes |
Choosing the right benchmarks for measuring filesystem performance is important. Some benchmarks study a filesystem's ability to scale with increasing load; other benchmarks work by replaying traces of recorded workloads. Block device benchmarks such as iometer [5] or fio [6] evaluate bandwidth and latency of read and write operations on the physical device. These benchmarks are not very useful for this study. I need benchmarks that operate on the filesystem layer, not on the block device.
My goal is to evaluate read and write performance as a function of the file size. An example of such a benchmark is iozone [7]. With small file sizes, these benchmarks can run entirely in memory and report "warm-cache" results. I can mitigate this effect by running all benchmarks on the same server and RAID system, so that the influence of CPU and memory is the same in all cases. I used an Exus Data ProServII server with Ubuntu Linux Server 14.04 (kernel 3.13.0) and a Transtec SCSI RAID system.
A file size larger than the buffer cache (i.e., nearly the amount of free RAM) makes performance drop to the spindle speed of the underlying hard disk or RAID group. It is also possible to clear the page cache before running the benchmark by writing a special value to /proc/sys/vm/drop_caches. Writing a 3 frees the page cache, dentries, and inodes:
echo 3 > /proc/sys/vm/drop_caches
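The kernel only frees clean cache objects when drop_caches is written, so it can help to flush dirty pages with sync first. The following is a minimal sketch of a helper for that; the function name and the optional path argument are assumptions made for illustration (the argument exists only so the helper can be tried without root), not part of the original test setup.

```shell
#!/bin/sh
# Hypothetical helper: flush dirty pages, then drop clean caches.
# The optional first argument overrides the target file so the
# function can be exercised against a scratch file without root.
drop_caches() {
    target="${1:-/proc/sys/vm/drop_caches}"
    sync                 # write dirty pages to disk; drop_caches frees only clean objects
    echo 3 > "$target"   # 3 = page cache plus dentries and inodes
}

# Typical use, as root, before each benchmark pass:
#   drop_caches
```

Calling the helper before every pass keeps each run starting from a cold cache, at the cost of slower first reads.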
I chose iozone and let it run in an automated manner with the following command:
iozone -Raz -bExt3_auto_20G.xls -g20G
Iozone takes quite a while and produces a lot of data.
The program iterates through all file sizes, starting with 64KB, doubling each step, and going through all possible record sizes, starting with 4KB. The server has 16GB of RAM, so the last pass of the benchmark with 16GB file size will no longer fit in the buffer cache. It shows the speed of the physical device – all other results are defined by the speed of the CPU and buffer caches.
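The sweep described above can be sketched with a few lines of shell arithmetic. This is only an illustration of the doubling sequence under the stated 64KB start and the 20GB ceiling from the -g20G option; it is not iozone's own code, and the function name is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: list the file sizes (in KB) of a sweep that
# starts at 64KB and doubles until the given limit would be exceeded.
file_sizes_kb() {
    limit_kb=$1
    size_kb=64
    while [ "$size_kb" -le "$limit_kb" ]; do
        echo "$size_kb"
        size_kb=$((size_kb * 2))
    done
}

max_kb=$((20 * 1024 * 1024))          # 20GB expressed in KB
file_sizes_kb "$max_kb" | wc -l       # number of file sizes in the sweep
file_sizes_kb "$max_kb" | tail -n 1   # largest size: 16777216 KB, i.e., 16GB
```

With a 20GB limit, the largest size reached is 16GB, which matches the size of the server's RAM and is why only the final pass falls out of the buffer cache.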
I picked samples with typical file sizes out of the data flood to compare the filesystems. They don't show big differences (Figure 1). The similarity is probably because RAM speed is the dominating effect in this case. Most of the reads or writes go to the buffer cache. The situation will change slightly if you use IOzone to measure the throughput explicitly. The appropriate option (-t) allows the user to specify how many threads or processes to have active during the measurement. Scaling performance with the number of processes is a strength for XFS, which clearly performs better than ext3 (Figure 2).
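As a rough sketch of how such a throughput series could be set up, the script below prints one iozone command line per thread count instead of running it. The thread counts, the 1GB file size per thread, and the 64KB record size are illustrative assumptions, not the settings behind Figure 2, and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) an iozone throughput-mode command.
# -t N   throughput mode with N threads or processes
# -s 1g  file size per thread (assumed value)
# -r 64k record size (assumed value)
# -i 0   write/rewrite test; -i 1 read/re-read test
iozone_throughput_cmd() {
    threads=$1
    echo "iozone -t $threads -s 1g -r 64k -i 0 -i 1"
}

# Dry run: show the commands for an increasing number of threads.
for n in 1 2 4 8; do
    iozone_throughput_cmd "$n"
done
```

Printing the commands first makes it easy to review the series before piping the output to a shell on the test machine.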
Another question is how fast the candidates perform metadata tasks, such as creating or destroying files and directories. I used fdtree [8] to test metadata performance. fdtree is a highly portable shell script that recursively creates and removes directories and files. In my test, fdtree created/deleted four directory levels with 10 directories at each level for a total of 11111 directories, with 10 files of size 40KB per directory, for a total of 111110 files and 4.34GB. Figure 3 shows the results. Again, the differences were not huge, but ext4 takes first place, and ext3 ranks far behind at last place. It is also important to consider that file size and depth of the directory structure make a difference. It also makes a difference where the journal of an ext3/4 or XFS filesystem resides – on the same disk (bad), on an extra disk (better), or on a RAM disk or SSD (best).
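The fdtree totals quoted above follow from the tree geometry and can be checked with a little shell arithmetic. The sketch below assumes the root directory is counted along with the four levels beneath it; the function and variable names are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: compute directory, file, and data totals for a
# tree with the given depth, fan-out, files per directory, and file size.
fdtree_totals() {
    levels=$1; dirs_per_level=$2; files_per_dir=$3; file_kb=$4
    total_dirs=1    # the root directory itself
    width=1
    i=0
    while [ "$i" -lt "$levels" ]; do
        width=$((width * dirs_per_level))     # directories on this level
        total_dirs=$((total_dirs + width))
        i=$((i + 1))
    done
    total_files=$((total_dirs * files_per_dir))
    total_kb=$((total_files * file_kb))
    echo "$total_dirs $total_files $total_kb"
}

# Four levels, 10 directories per level, 10 files of 40KB per directory.
fdtree_totals 4 10 10 40   # prints: 11111 111110 4444400
```

The 4444400KB of data works out to roughly 4340MB, which appears to be where the 4.34GB figure comes from.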
Conclusion
Choosing a RAID system with lots of spindles instead of a single disk, or choosing an SSD instead of a hard disk, has a much greater influence on I/O performance than filesystem selection. Today's filesystems aren't far apart. Nevertheless, you might gain some percentage points of performance by choosing the right filesystem for your workload.
If you need the latest features and benefit from volume manager and RAID integration, self-healing, or snapshots, your only choice is Btrfs. If stability is the most important criterion, a less complex, but well-established, solution such as ext3 might be the best option. Very large filesystems and the need for high stability lead to XFS, which is also good for reading or writing many parallel streams. Ext4 is a balanced compromise, with many new features and a rock-solid foundation, that excels at metadata operations.
Infos
[1] Ext3 FAQ: http://batleth.sapienti-sat.org/projects/FAQs/ext3-faq.html
[2] Ext4 wiki: https://ext4.wiki.kernel.org/index.php/Main_Page
[3] Btrfs wiki: https://btrfs.wiki.kernel.org/index.php/Main_Page
[4] XFS: http://oss.sgi.com/projects/xfs
[5] iometer: http://www.iometer.org
[6] fio: http://freecode.com/projects/fio
[7] iozone: http://www.iozone.org
[8] fdtree: https://computing.llnl.gov/?set=code&page=sio_downloads