A next-gen CoW filesystem enters the mainline

Super CoW

© Photo by Anand Thakur on Unsplash

Article from Issue 298/2025

Bcachefs is a next-generation Linux filesystem that has been merged into the mainline kernel, offering a feature-complete, high-performance copy-on-write design for scalable, reliable storage.

In the world of Linux filesystems, finding the perfect balance between performance and advanced features has long been a challenge. Traditional workhorses such as ext4 and XFS deliver speed and stability but lack modern capabilities, whereas feature-rich alternatives such as Btrfs and ZFS come with complexity and caveats. Bcachefs [1] – merged into Linux kernel 6.7 – promises the best of both worlds, aiming to marry the reliability and powerful features of ZFS/Btrfs with the efficiency of ext4/XFS. Bcachefs is a copy-on-write (CoW) filesystem designed for robustness without sacrificing speed. It offers native support for checksumming, compression, encryption, snapshots, tiered storage, and multi-device configurations, all while delivering consistent low-latency performance and robust data integrity.

Bcachefs developer Kent Overstreet has had some high-profile battles with Linus Torvalds, and the status of future kernel support keeps changing depending on who you talk to and when you talk to them. Linus has stated that the kernel will be "parting ways" with bcachefs in version 6.17 [2]. As of this writing, though, bcachefs is included in the current Linux kernel, and support could continue if the antagonists resolve their differences. Whether or not bcachefs continues as an active part of the kernel, it is still worth a look as an example of how the world of the Linux filesystem keeps evolving.

Architecture and Features

Bcachefs was built from the ground up to be a next-generation CoW filesystem that won't eat your data. Its core design inherits concepts from the earlier bcache block-layer cache, with which it originally shared much code, evolving that prototype into a full POSIX-compatible filesystem. Bcachefs uses CoW semantics similar to Btrfs and ZFS – data and metadata are never overwritten in place, enabling atomic updates and consistent snapshots. All modifications are written to new locations, which not only facilitates reliability but also allows features such as snapshotting and reflinking (efficiently copying or cloning files) by design.
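The CoW idea described above can be illustrated with a minimal Python sketch. This is a toy model, not bcachefs code: the CowStore class, its methods, and the file names are all hypothetical, and the "disk" is just an append-only list. It shows only the principle that writes go to new locations, so a snapshot or a reflink can share existing blocks without copying them.

```python
class CowStore:
    """Toy copy-on-write store: blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = []      # append-only "disk": list index = block address
        self.root = {}        # live mapping: file name -> block address
        self.snapshots = {}   # snapshot name -> pinned (never mutated) mapping

    def write(self, name, data):
        # Always allocate a new block; the old block stays untouched.
        self.blocks.append(data)
        new_root = dict(self.root)   # build a new root instead of editing in place
        new_root[name] = len(self.blocks) - 1
        self.root = new_root         # one pointer swap makes the update visible

    def read(self, name, snapshot=None):
        mapping = self.snapshots[snapshot] if snapshot else self.root
        return self.blocks[mapping[name]]

    def snapshot(self, snap_name):
        # A snapshot only pins the current root; no data is copied.
        self.snapshots[snap_name] = self.root

    def reflink(self, src, dst):
        # A clone shares the source's block until one side is rewritten.
        new_root = dict(self.root)
        new_root[dst] = self.root[src]
        self.root = new_root


store = CowStore()
store.write("a.txt", b"v1")
store.snapshot("before")
store.reflink("a.txt", "b.txt")
store.write("a.txt", b"v2")

print(store.read("a.txt"))                     # live view sees the new data
print(store.read("a.txt", snapshot="before"))  # snapshot still sees the old data
print(store.read("b.txt"))                     # clone is unaffected by the rewrite
```

Because the old root and old blocks are never modified, the snapshot costs nothing beyond keeping a reference, which is the property the paragraph above attributes to CoW designs in general.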

B-Tree Storage Engine

Internally, bcachefs is structured around a highly optimized B-tree. In fact, its on-disk format is essentially a key-value store acting as the filesystem's database. Unusually large B-tree node sizes (256KiB by default) are used and each node is log-structured internally. This hybrid approach reduces write amplification because updates can be buffered and coalesced in these large nodes, minimizing the need to rewrite entire metadata blocks on every change. The B-tree design, combined with a sophisticated transaction model and fine-grained locking, yields low tail-latency for I/O operations. In practice, bcachefs's write performance is remarkably consistent, avoiding the long latency outliers that sometimes plague other filesystems under heavy load. The CoW mechanics and a bucket-based allocator are also leveraged to implement a form of RAID that claims to avoid the notorious "write hole" problem (data inconsistency during partial writes in RAID 5/6) and prevent I/O fragmentation when using multiple devices.
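To make the write-amplification argument concrete, the following Python sketch models a single node that keeps a sorted base plus an append-only log of updates. It is a simplified illustration under stated assumptions, not the bcachefs on-disk format: the LogStructuredNode class, the compact_threshold parameter, and the rewrites counter are all invented for this example.

```python
import bisect


class LogStructuredNode:
    """Toy B-tree node: a sorted base plus an append-only log of updates."""

    def __init__(self, compact_threshold=4):
        self.base_keys = []    # sorted keys as of the last compaction
        self.base_vals = []    # values aligned with base_keys
        self.log = []          # (key, value) updates in arrival order
        self.compact_threshold = compact_threshold
        self.rewrites = 0      # how often the sorted base was rebuilt

    def insert(self, key, value):
        self.log.append((key, value))   # cheap append, no full node rewrite
        if len(self.log) >= self.compact_threshold:
            self.compact()

    def lookup(self, key):
        # The newest log entry wins, so scan the log from the end first.
        for k, v in reversed(self.log):
            if k == key:
                return v
        i = bisect.bisect_left(self.base_keys, key)
        if i < len(self.base_keys) and self.base_keys[i] == key:
            return self.base_vals[i]
        return None

    def compact(self):
        # Merge all buffered updates into the base in a single pass.
        merged = dict(zip(self.base_keys, self.base_vals))
        merged.update(self.log)          # later entries overwrite earlier ones
        self.base_keys = sorted(merged)
        self.base_vals = [merged[k] for k in self.base_keys]
        self.log = []
        self.rewrites += 1


node = LogStructuredNode(compact_threshold=4)
for i in range(10):
    node.insert(i % 5, f"v{i}")

print(node.lookup(3), node.rewrites)
```

In this run, ten updates trigger only two rebuilds of the sorted base; a design that rewrote the node on every change would have rebuilt it ten times. That ratio is the coalescing effect the paragraph above describes, shown here at toy scale.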

Complete Feature Set

One of bcachefs's goals is to provide the full set of features expected from a modern filesystem. It includes native support for:

  • CoW with full data and metadata checksumming: Every block and metadata record can be checksummed (CRC32C by default, with 64-bit and other algorithms available) to detect corruption. This is similar to ZFS and Btrfs, meaning bcachefs can catch silent data corruption and is capable of self-healing when used with redundant copies.
  • Integrated compression: Bcachefs supports transparent compression of data with multiple algorithms (LZ4, zlib/Deflate, and Zstandard), configurable per filesystem or even per file. Compression can improve storage efficiency and I/O throughput for suitable data, much like on Btrfs and ZFS.
  • Encryption: Unlike Btrfs (which infamously promised encryption for years but never delivered), bcachefs offers built-in whole-filesystem encryption using modern ciphers (ChaCha20-Poly1305). This allows data at rest to be secured without relying on LUKS or other layers.
  • Multiple device support and scaling: Bcachefs can span multiple block devices in a single filesystem, supporting flexible configurations akin to software RAID. It natively handles replication (e.g., mirroring data across devices) and even has experimental erasure coding for efficient storage redundancy (though the latter is not yet marked stable). By default, data is striped across devices for performance (similar to RAID 0) and with replication settings it can emulate RAID 1, RAID 10, and so on.
  • Tiered storage and caching: True to its bcache heritage, bcachefs can integrate fast and slow storage tiers within one filesystem. Devices can be assigned roles such as foreground, background, or promote targets. Writes land on the fast foreground tier (say, NVMe SSDs) and later migrate to the slower background tier (HDDs) in a form of automated tiering. Frequently read data can be promoted to a dedicated cache tier ("promote" devices) for accelerated reads. This is a compelling feature that neither Btrfs nor ZFS currently offers out of the box – bcachefs effectively combines what would otherwise require LVM caching or ZFS special devices, handling it within the filesystem. Administrators can thus achieve SSD-like performance with HDD capacity transparently.
  • Snapshotting and sub-volumes: Bcachefs implements lightweight snapshots by assigning version numbers to filesystem objects instead of duplicating entire trees. This approach, somewhat analogous to how some enterprise filesystems or LVM handle snapshots, avoids the full CoW tree clone that Btrfs performs, potentially improving snapshot scalability. Snapshots in bcachefs are associated with sub-volumes (similar to Btrfs's sub-volumes), allowing consistent point-in-time views of the filesystem for backup or recovery. The snapshot mechanism is designed for efficiency, so taking or deleting snapshots remains fast even with many snapshots present.
  • Extended attributes, ACLs, and quotas: Bcachefs supports extended attributes (xattr) and POSIX Access Control Lists (ACLs), enabling fine-grained permission management as on any enterprise filesystem. It also has quotas for managing user or group space usage. Notably, its quota implementation is much more performance-friendly than Btrfs's – bcachefs maintains regular accounting that doesn't incur the severe overhead that Btrfs's quota groups (qgroups) do on large datasets.
  • Reflinks and nocow mode: Reflinks (efficient file copy/clone) are available, and a special "nocow" mode exists for those rare cases (such as certain database or virtual machine images) where you might want to bypass CoW for performance.

Bcachefs strives to be a general-purpose filesystem suitable for a wide range of workloads, from small systems to large-scale storage servers, hence the emphasis on both feature completeness and scalability.
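The self-healing behavior mentioned in the checksumming item above can be sketched in a few lines of Python. This is a conceptual model only: the Replica class and healing_read function are hypothetical names, and Python's zlib.crc32 (plain CRC-32) stands in for the CRC32C checksum that the article names as the default.

```python
import zlib


def checksum(data: bytes) -> int:
    # Assumption: plain CRC-32 from zlib stands in for CRC32C here.
    return zlib.crc32(data)


class Replica:
    """One copy of the data, with a checksum stored alongside each block."""

    def __init__(self):
        self.blocks = {}   # address -> (data, stored checksum)

    def write(self, addr, data):
        self.blocks[addr] = (data, checksum(data))


def healing_read(replicas, addr):
    """Return verified data and repair any replica that fails its checksum."""
    good = None
    bad = []
    for replica in replicas:
        data, stored = replica.blocks[addr]
        if checksum(data) == stored:
            if good is None:
                good = data
        else:
            bad.append(replica)
    if good is None:
        raise IOError(f"block {addr}: every replica failed its checksum")
    for replica in bad:
        replica.write(addr, good)   # rewrite the damaged copy from a good one
    return good


a, b = Replica(), Replica()
for replica in (a, b):
    replica.write(0, b"important data")

# Simulate silent corruption: the data changes, the stored checksum does not.
a.blocks[0] = (b"important dat\x00", a.blocks[0][1])

print(healing_read([a, b], 0))   # served from the intact replica
print(a.blocks[0][0])            # the damaged replica has been repaired
```

Without a second copy, the checksum mismatch could still be detected but not repaired, which is why the article ties self-healing to redundant copies.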
