Btrfs and the future of the filesystem

New Butter

Article from Issue 189/2016

The Btrfs filesystem offers advanced features such as RAID, subvolumes, snapshots, checksums, and transparent compression, but do desktop users really need all that power?

A filesystem functions below the operating system, ensuring that abstract data is converted into physical address attributes such as tracks and sectors. Some filesystems go beyond this basic functionality. One powerful and popular filesystem for Linux is Btrfs. The Btrfs filesystem [1], which is affectionately pronounced "ButterFS," is sometimes called the next-generation filesystem. Btrfs is a copy-on-write filesystem [2] originally developed by Oracle Corporation and masterminded by Chris Mason. In some ways, Btrfs is best understood as a Linux counterpart to ZFS [3], the transactional filesystem introduced with Solaris 10. Oracle acquired ZFS in 2010 when it acquired Sun Microsystems. Btrfs is free software under the GPL and was adopted into Linux kernel 2.6.29 early in 2009.

Btrfs was declared suitable for production use in April 2013. Btrfs developer Chris Mason moved from SUSE, where he worked on ReiserFS, to Oracle, and he has worked at Facebook for several years, where Btrfs is widely used in the back end. Btrfs is no longer limited to Linux; the WinBtrfs [4] project offers drivers for Windows, although they are still experimental. In the Linux world, Oracle started using Btrfs in its Unbreakable Linux release, version 2, four years ago, and SUSE followed in SUSE Linux Enterprise Server 12 (SLES 12) and openSUSE 13.2 (Figure 1). openSUSE Leap uses Btrfs as the default for the root partition; most other distributions include Btrfs in their archives and offer it as an alternative in the installer. Fedora plans to make Btrfs the default with Fedora 24.

Figure 1: Btrfs as the default subvolume when installing openSUSE.

Trees, Nodes, and Leaves

Btrfs is not just well equipped for future desktops, it is also well suited for data centers. The maximum filesystem size of 16 exbibytes (EiB) corresponds to nearly 17 million tebibytes, or about 18 quintillion bytes. (Compare this with ext4, which can only handle 1EiB.) The filesystem can consist of one file of that size, or you can store about 18 quintillion (2^64) files, and filenames can comprise up to 255 bytes. Whereas the directories in ext4 are organized in a table as an HTree, Btrfs uses a B-tree [5], with metadata in the nodes and the data in the leaves.
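The size limits are easy to check with exact integer arithmetic. The following short Python sketch converts the 16EiB and 1EiB limits named above into bytes and tebibytes; the helper name describe_limit is only an illustration and not part of any Btrfs tool.

```python
# Size limits quoted in the text, worked out with exact integer arithmetic.
EIB = 2**60  # one exbibyte in bytes
TIB = 2**40  # one tebibyte in bytes


def describe_limit(eib):
    """Return a size given in EiB as bytes and as whole tebibytes."""
    size_bytes = eib * EIB
    return {"bytes": size_bytes, "tebibytes": size_bytes // TIB}


btrfs_max = describe_limit(16)  # Btrfs maximum filesystem size
ext4_max = describe_limit(1)    # ext4 maximum filesystem size

print(f"Btrfs: {btrfs_max['bytes']:,} bytes = {btrfs_max['tebibytes']:,} TiB")
print(f"ext4:  {ext4_max['bytes']:,} bytes = {ext4_max['tebibytes']:,} TiB")
```

The Btrfs limit works out to exactly 2^64 bytes, which is why the same figure also turns up as the upper bound on the number of files.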

The idea is to reduce access times for file elements by reducing the height of the tree structure and expanding horizontally. The use of a B-tree also explains the origin of the name Btrfs, which expands to "B-tree FS." Another feature in the filesystem structure is pseudo-subdirectories, or subvolumes, that act like separate drives but only store changes from the main volume and thus do not occupy a great deal of space. Additionally, Btrfs cannot run out of inodes [6]; with the ext family, this can happen even if free space is still available on the partition.
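To see why a wide, shallow tree shortens lookups, consider this minimal Python sketch. It counts how many levels a tree needs to index one million items for different fan-outs (the maximum number of children per node). The fan-out values are chosen purely for illustration and are not the node sizes Btrfs actually uses.

```python
def min_levels(num_keys, fanout):
    """Smallest number of levels a tree needs to index num_keys items
    when every node can point to at most `fanout` children."""
    if num_keys < 1 or fanout < 2:
        raise ValueError("need num_keys >= 1 and fanout >= 2")
    levels, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout  # each extra level multiplies the reachable items
        levels += 1
    return levels


# Illustrative fan-outs only; real B-tree nodes are sized to fit disk blocks.
for fanout in (2, 16, 128):
    print(f"fan-out {fanout:>3}: {min_levels(1_000_000, fanout)} levels")
```

Each level typically costs one node read, so fewer levels mean fewer reads per lookup.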

Differences and USPs

In contrast to ext4, currently the de facto default Linux filesystem, Btrfs offers some features that are not usually considered part of a filesystem but are hugely popular, especially in professional environments such as data centers. Useful features include built-in Logical Volume Management functionality and built-in RAID levels 0, 1, 5, 6, and 10. Compared with traditional hardware or software RAID, the advantage of Btrfs is that it can distinguish between used and unused data blocks and thus save much time on recovery. Moreover, Btrfs can convert RAID levels on the fly.
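On-the-fly conversion is handled by the btrfs balance command with convert filters for data (-dconvert) and metadata (-mconvert). The following Python sketch only assembles and prints such a command line rather than running it; the mount point /mnt/data and the target profile raid1 are assumptions for illustration, and the helper name is hypothetical.

```python
import shlex


def balance_convert_command(mount_point, profile):
    """Build (but do not run) a balance command that converts both data
    and metadata of a mounted Btrfs filesystem to one RAID profile."""
    allowed = {"raid0", "raid1", "raid5", "raid6", "raid10"}
    if profile not in allowed:
        raise ValueError(f"unsupported profile: {profile}")
    return ["btrfs", "balance", "start",
            f"-dconvert={profile}", f"-mconvert={profile}", mount_point]


# Hypothetical mount point; adjust to the filesystem you want to convert.
command = balance_convert_command("/mnt/data", "raid1")
print(shlex.join(command))
```

Printing the command instead of executing it leaves room to check the target profile and the number of available devices first, because a balance rewrites large parts of the filesystem.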

Btrfs offers optional transparent data compression, deduplication [7], checksums for data and metadata, filesystem checks and defragmentation at run time, and optimizations for solid-state drives (SSDs). Additionally, btrfs-convert can reversibly convert existing ext3 and ext4 filesystems to Btrfs, and reverting to ext4 is possible as long as the snapshot created when converting the metadata of the ext4 system is not deleted. The Btrfs filesystem encryption feature, although planned from the outset, has not yet been implemented.
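The principle behind checksums can be shown in a few lines of Python: a checksum is stored alongside each block and recomputed on every read, so silent corruption is noticed instead of being passed on to the application. This is a simplified sketch, not the Btrfs implementation. Btrfs uses CRC32C by default, whereas zlib.crc32 from the Python standard library computes plain CRC-32 and serves only as a stand-in here.

```python
import zlib
from typing import Tuple


def store_block(data: bytes) -> Tuple[bytes, int]:
    """Return the block together with the checksum computed at write time."""
    return data, zlib.crc32(data)


def verify_block(data: bytes, checksum: int) -> bool:
    """Recompute the checksum on read and compare it with the stored one."""
    return zlib.crc32(data) == checksum


block, checksum = store_block(b"important data" * 100)

# Simulate silent corruption on disk by flipping a single bit.
damaged = bytearray(block)
damaged[10] ^= 0x01

print("intact block passes: ", verify_block(block, checksum))
print("damaged block passes:", verify_block(bytes(damaged), checksum))
```

When a redundant copy exists, for example with RAID 1, a filesystem that detects a bad checksum can go on to read the intact copy instead.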

Snapshots Undo Mistakes

One outstanding feature for desktop users is the ability to create snapshots on a running system (Figure 2). Snapshots are images of a subvolume frozen in time. Particularly in connection with the Snapper [8] tool developed by openSUSE (Figure 3), snapshots allow you to roll back a system with problems to its previous state with relatively little effort. Once Snapper is configured, the ability to create a snapshot before a system upgrade removes the worries about upgrades for many users. In SUSE, snapshots are handled graphically via a YaST module for Snapper; external GUIs exist for other distributions. Btrfs snapshots are less limited than LVM snapshots, because you do not need to allocate space first and run the risk of an overflow. Even a large number of snapshots will not slow the system down.

Figure 2: Activating a snapshot during installation.
Figure 3: Creating a snapshot with Snapper.
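Why snapshots are cheap and need no space allocated in advance becomes clearer with a toy model of copy-on-write. In the following Python sketch, a snapshot copies only the table that maps names to blocks, while the blocks themselves are shared until something is rewritten. The ToySubvolume class is purely hypothetical and says nothing about the actual on-disk format.

```python
class ToySubvolume:
    """Maps file names to block ids; the blocks live in a shared store."""

    def __init__(self, store, files=None):
        self.store = store
        self.files = dict(files or {})

    def write(self, name, data):
        # Copy-on-write: never overwrite a block in place, append a new one.
        block_id = len(self.store)
        self.store.append(data)
        self.files[name] = block_id

    def read(self, name):
        return self.store[self.files[name]]

    def snapshot(self):
        # A snapshot copies only the name-to-block map, not the data.
        return ToySubvolume(self.store, self.files)


store = []
root = ToySubvolume(store)
root.write("config", "version 1")

before_upgrade = root.snapshot()            # taken before the upgrade
root.write("config", "version 2 (broken)")  # the upgrade goes wrong

print("live system:", root.read("config"))
print("snapshot:   ", before_upgrade.read("config"))

root = before_upgrade.snapshot()            # roll back to the old state
print("after rollback:", root.read("config"))
```

Only two blocks exist in the store at the end, although the old state, the broken state, and the rolled-back state were all reachable at some point.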
