Comparing the ext3, ext4, XFS, and Btrfs filesystems
XFS
XFS [4] is a 64-bit journaling filesystem. It was originally created in 1994 by Silicon Graphics, Inc. for its IRIX operating system. XFS was later ported to the Linux kernel version 2.4 in 2002.
A benefit of XFS is its stability and maturity. XFS is often seen as the filesystem for people with massive amounts of data. Because it is a full 64-bit filesystem, XFS is capable of handling filesystems as large as millions of terabytes (exabytes). XFS ensures data consistency via metadata journaling, which allows it to restart very quickly after an unexpected interruption, regardless of the number of files it is managing. At the same time, XFS manages to minimize the performance impact of journaling.
XFS also supports write barriers (a mechanism for enforcing a particular ordering in a sequence of writes). Another specialty of XFS is its allocation groups. Allocation groups allow systems with multiple processors or multi-core processors to provide better throughput by simultaneously reading and writing through multiple application threads. XFS is capable of delivering close to the raw I/O performance that the underlying hardware can provide.
Performance
Table 1 shows the basic features of the four filesystems at a glance. Comparing performance is more of a challenge. Evaluating filesystem performance is a very difficult task because of the complex role a filesystem plays. What does "faster" mean? One system might be faster for accessing many small files, while another is faster for accessing a single large file. One filesystem might perform better on metadata operations, and another might handle data better. At the same time, problems writing metadata to the journal can thwart the overall I/O performance. Thus, a single number can never characterize the performance of a filesystem. Instead, it is better to isolate the different aspects of performance and measure them separately. Afterward, you can determine which aspects are most significant for the workloads you envision.
Table 1
Comparing Features
Name | Btrfs | Ext3 | Ext4 | XFS |
---|---|---|---|---|
Created | 2007 | 1998 | 2006 | 1994 |
Original OS | Linux | Linux | Linux | IRIX |
Limits | | | | |
Max. filename length | 255 bytes | 255 bytes | 255 bytes | 255 bytes |
Max. file size (4K blocks) | 8EB (Linux kernel limit) | 2TB | 16TB | 8EB |
Max. volume size | 16EB | 16TB | 1EB | 16EB |
Features | | | | |
Hard links | yes | yes | yes | yes |
Symbolic links | yes | yes | yes | yes |
Meta-data journaling | no | yes | yes | yes |
Snapshots | yes | no | no | no |
Clones | yes | no | no | no |
Encryption | no | no | no | no |
Compression | yes | no | no | no |
Deduplication | yes | no | no | no |
Integrated LVM | yes | no | no | no |
Online resizing | grow/shrink | grow only | grow only | grow only |
Offline resizing | no | grow/shrink | grow/shrink | no |
Extent allocation | yes | no | yes | yes |
Delayed allocation | yes | no | yes | yes |
Choosing the right benchmarks for measuring filesystem performance is important. Some benchmarks study a filesystem's ability to scale with increasing load; other benchmarks work by replaying traces of recorded workloads. Block device benchmarks such as iometer [5] or fio [6] evaluate the bandwidth and latency of read and write operations on the physical device. These benchmarks are not very useful for this study: I need benchmarks that operate on the filesystem layer, not on the block device.
My goal is to evaluate read and write performance as a function of the file size. An example of such a benchmark is iozone [7]. With small file sizes, such benchmarks can turn into in-memory, "warm-cache" measurements. I can mitigate this effect by running all benchmarks on the same server and RAID system, so that the influence of CPU and memory is the same in all cases. I used an Exus Data ProServII server with Ubuntu Linux Server 14.04 (kernel 3.13.0) and a Transtec SCSI RAID system.
A file size larger than the buffer cache (i.e., nearly the amount of free RAM) lets the performance drop down to the spindle speed of the underlying hard disk or RAID group. You can also delete the page cache contents before running the benchmark by writing a special value to /proc/sys/vm/drop_caches. Writing a 3 will free the pagecache, dentries, and inodes; running sync first writes out dirty pages so they can be reclaimed as well:
sync
echo 3 > /proc/sys/vm/drop_caches
I chose iozone and let it run in automated mode with the following command:
iozone -Raz -bExt3_auto_20G.xls -g20G
The -a option selects automatic mode, -z also tests the small record sizes that automatic mode would otherwise skip for large files, -R generates an Excel-compatible report in the file named with -b, and -g sets the maximum file size – 20GB here, so the run exceeds the server's 16GB of RAM. Iozone takes quite a while and produces a lot of data.
The program iterates through all file sizes, starting with 64KB, doubling each step, and going through all possible record sizes, starting with 4KB. The server has 16GB of RAM, so the last pass of the benchmark with 16GB file size will no longer fit in the buffer cache. It shows the speed of the physical device – all other results are defined by the speed of the CPU and buffer caches.
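The file-size sweep described above can be reproduced with a quick shell sketch; the 64KB starting size, the doubling step, and the 16GB endpoint are taken from the run described in the text, and the script merely counts the passes:

```shell
# Count the file-size passes iozone makes: 64KB, doubled each step,
# up to 16GB (the size of the server's RAM)
size=$((64 * 1024))                  # start at 64 KB
limit=$((16 * 1024 * 1024 * 1024))   # stop at 16 GB
passes=0
while [ "$size" -le "$limit" ]; do
    passes=$((passes + 1))
    size=$((size * 2))
done
echo "$passes passes"                # 64K = 2^16 ... 16G = 2^34
```

Only the final pass, at the 16GB file size, escapes the buffer cache and reflects the physical device.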
I picked samples with typical file sizes out of the data flood to compare the filesystems. They don't show big differences (Figure 1). The similarity probably occurs because RAM speed is the dominating effect in this case: Most of the reads and writes go to the buffer cache. The situation changes slightly if you use iozone to measure throughput explicitly. The appropriate option (-t) lets the user specify how many threads or processes are active during the measurement. Scaling performance with the number of processes is a strength of XFS, which clearly performs better than ext3 (Figure 2).
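A throughput run of this kind could look like the following sketch. Because iozone might not be installed on every system, the sketch only assembles and prints the command line, and the thread count and sizes are illustrative values, not the ones used for the measurements in the article:

```shell
# -t: throughput mode with N parallel workers
# -s: file size per worker, -r: record size
# -i 0 / -i 1: run only the write and read tests
threads=8
cmd="iozone -t $threads -s 1g -r 64k -i 0 -i 1"
echo "$cmd"
```

Each worker gets its own file, so the aggregate throughput shows how well the filesystem scales across parallel writers.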
Another question is how fast the candidates perform metadata tasks, such as creating or destroying files and directories. I used fdtree [8] to test metadata performance. fdtree is a highly portable shell script that recursively creates and removes directories and files. In my test, fdtree created and deleted four directory levels with 10 directories at each level, for a total of 11111 directories, with 10 files of 40KB per directory, for a total of 111110 files and 4.34GB. Figure 3 shows the results. Again, the differences were not huge, but ext4 takes first place, and ext3 ranks far behind in last place. Keep in mind that the file size and the depth of the directory structure make a difference, as does the location of the ext3/4 or XFS journal – on the same disk (bad), on an extra disk (better), or on a RAM disk or SSD (best).
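The directory and file counts quoted above follow directly from the tree shape (four levels, 10 subdirectories per directory, 10 files per directory), which a short sketch can verify:

```shell
# 1 top-level directory, then 10, 100, 1000, and 10000
# directories at levels 1 through 4
dirs_at_level=1
total_dirs=1
for level in 1 2 3 4; do
    dirs_at_level=$((dirs_at_level * 10))
    total_dirs=$((total_dirs + dirs_at_level))
done
files=$((total_dirs * 10))   # 10 files of 40KB in each directory
echo "$total_dirs directories, $files files"
```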
Conclusion
Choosing a RAID system with lots of spindles instead of a single disk, or choosing an SSD instead of a hard disk, has a much greater influence on I/O performance than filesystem selection. Today's filesystems aren't far apart. Nevertheless, you might gain some percentage points of performance by choosing the right filesystem for your workload.
If you need the latest features, such as volume manager and RAID integration, self-healing, or snapshots, your only choice is Btrfs. If stability is the most important criterion, a less complex, but well-established, solution such as ext3 might be the best option. Very large filesystems and the need for high stability lead to XFS, which is also good for reading or writing many parallel streams. Ext4 is a balanced compromise, with many new features and a rock-solid foundation, that excels at metadata operations.
Infos
[1] Ext3 FAQ: http://batleth.sapienti-sat.org/projects/FAQs/ext3-faq.html
[2] Ext4 wiki: https://ext4.wiki.kernel.org/index.php/Main_Page
[3] Btrfs wiki: https://btrfs.wiki.kernel.org/index.php/Main_Page
[4] XFS: http://oss.sgi.com/projects/xfs
[5] iometer: http://www.iometer.org
[6] fio: http://freecode.com/projects/fio
[7] iozone: http://www.iozone.org
[8] fdtree: https://computing.llnl.gov/?set=code&page=sio_downloads