Zack's Kernel News
Chronicler Zack Brown reports on the latest news, views, dilemmas, and developments within the Linux kernel community.
Status of OverlayFS and Union Filesystems in General
Recently, Miklos Szeredi requested that OverlayFS be included in the main kernel tree. OverlayFS allows two directory trees to appear as one: files that live under the same directory path in each tree appear together in a single directory of the overlaid filesystem. The project has been in existence for several years, but this time Linus Torvalds replied, "Yes, I think we should just do it. It's in use, it's pretty small, and the other alternatives are worse. Let's just plan on getting this thing done with."
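To make the "two trees appear as one" idea concrete, here is a minimal userspace sketch in Python. It is not how the kernel code works. The function names (merged_listing, demo) are hypothetical, and the rule that an entry in the upper tree hides a same-named entry in the lower tree is the behavior OverlayFS documents for its upper and lower layers. Whiteouts, copy-up, and locking are all left out.

```python
import os
import tempfile


def merged_listing(upper, lower, rel="."):
    """Return {name: real_path} for one directory as a union view would show it.

    Hypothetical helper: a toy model only, with no whiteouts or copy-up.
    """
    view = {}
    # Visit the lower tree first so that same-named upper entries replace them.
    for root in (lower, upper):
        directory = os.path.normpath(os.path.join(root, rel))
        if os.path.isdir(directory):
            for name in os.listdir(directory):
                view[name] = os.path.join(directory, name)
    return view


def demo():
    """Build two small trees and show the merged view of their top directory."""
    with tempfile.TemporaryDirectory() as base:
        upper = os.path.join(base, "upper")
        lower = os.path.join(base, "lower")
        layout = ((upper, ["a.txt", "shared.txt"]), (lower, ["b.txt", "shared.txt"]))
        for root, names in layout:
            os.makedirs(root)
            for name in names:
                with open(os.path.join(root, name), "w") as handle:
                    handle.write(root)
        view = merged_listing(upper, lower)
        # Report which tree supplied the entry that exists in both.
        winner = os.path.basename(os.path.dirname(view["shared.txt"]))
        return sorted(view), winner
```

Calling demo() returns the merged names plus the tree that supplied shared.txt. The hard cases raised in review, such as removing a directory that exists in both trees, are exactly what a toy like this skips.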
Al Viro said he'd start reviewing the code, but he also suggested that if they were going to merge a union filesystem such as OverlayFS, they might as well consider merging other similar projects, such as Unionmount and Aufs. Unionmount in particular, he said, had been getting some good work lately from David Howells.
Meanwhile, Sedat Dilek jumped for joy at seeing OverlayFS close to acceptance. Al then followed up with his initial review. He'd identified some security issues and other technical problems, and he went back and forth with Miklos about them. At first, the two didn't see eye to eye about how to fix the issues, or even whether a given issue was really a problem.
At one point, George Spelvin offered an admittedly somewhat hacky solution to one of the problems Al had identified. The whole thing boiled down to the way OverlayFS or any union filesystem would behave under the full range of possible uses. Regarding George's particular suggestion, Al walked through the convoluted process necessary to remove a directory [1] and replied, "I'm sorry, but this is insane."
Elsewhere, in an entirely different thread, Sedat asked about the status of David's Unionmount project. David replied, "It's being reengineered again to take account of VFS changes that went in in the last merge window."
He added, "It's a maze of twisty locking problems – some of which also apply to things like overlayfs:-(".
The discussion in both threads ended there. It appears everyone, including Linus, is ready to see union filesystems like OverlayFS in the kernel. But no one, including Al Viro and the maintainers of the various union filesystem projects, is able to solve the remaining technical problems satisfactorily. At the moment, none of the projects seems close to getting past Al's laser-beam code reviews, and until that happens, I'm certain none of them will be merged.
Astonishing Tux3 Performance Claims
There seems to be some suspicion between certain kernel developers and Tux3 developers. Tux3 is a versioning filesystem that's been in development since 2008. Recently, Daniel Phillips, the project leader, posted some benchmarks that showed Tux3 outperforming tmpfs. As he put it, "To put this in perspective, we normally regard tmpfs as unbeatable because it is just a thin shim between the standard VFS mechanisms that every filesystem must use, and the swap device."
Dave Chinner took a look at Daniel's numbers and found some issues that he felt indicated a deliberate attempt to mislead people. In particular, he pointed out that the Tux3 benchmark didn't include any "flush" operations – the Tux3 front end was off-loading all of its work to a back end that could take all the time it needed to complete the job. The front end would never block, and so it could simply race through the benchmark and exit. Dave said, "You've carefully crafted the benchmark to demonstrate a best case workload for the tux3 architecture, then carefully not measured the overhead of the work tux3 has offloaded, and then not disclosed any of this in the hope that all people will look at is the headline."
Hirofumi Ogawa, one of the Tux3 developers, responded, saying fsync() had not yet been implemented, and the benchmarks were intended to show comparisons between just the parts of the code that had already been written.
Daniel also responded to Dave's post, saying, "I should indeed have noted that 'modified dbench' was used for this benchmark, thus amplifying Tux3's advantage in delete performance. This literary oversight does not make the results any less interesting: we beat Tmpfs on that particular load. Beating tmpfs at anything is worthy of note."
Regarding the specific issue Dave had raised about off-loading 100% of Tux3's work, Daniel said, "Yes, that is the entire point of our front/back design: reduce application latency for buffered filesystem transactions."
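The measurement question in this exchange can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not Tux3 code. The class DeferredWriter, its methods, and the 10ms per-operation cost are all hypothetical. A front end hands operations to a back-end thread and returns immediately, and a flush() call waits for the back end to finish.

```python
import queue
import threading
import time


class DeferredWriter:
    """Toy front/back split: submit() queues work, a thread completes it later."""

    def __init__(self, cost_per_op=0.01):
        self._queue = queue.Queue()
        self._cost = cost_per_op  # assumed cost of one back-end operation
        self.completed = 0
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def _drain(self):
        while True:
            self._queue.get()
            time.sleep(self._cost)  # stand-in for real I/O
            self.completed += 1
            self._queue.task_done()

    def submit(self, op):
        """Front end: queue the operation and return without waiting."""
        self._queue.put(op)

    def flush(self):
        """Block until the back end has finished everything queued so far."""
        self._queue.join()


def run(n=20):
    writer = DeferredWriter()
    start = time.perf_counter()
    for i in range(n):
        writer.submit(i)
    front_only = time.perf_counter() - start  # what a flush-free benchmark sees
    writer.flush()
    with_flush = time.perf_counter() - start  # includes the deferred work
    return front_only, with_flush, writer.completed
```

In this model, timing only the submit() loop reports the low latency Daniel describes, while timing through flush() includes the back-end work that Dave argued the benchmark left unmeasured.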
Theodore Ts'o pointed out that one couldn't simply ignore the fsync() data and expect a meaningful benchmark result. As he put it, "Since fsync() is defined as not returning until the data written to the file descriptor is flushed out to stable storage – so it is guaranteed to be seen after a system crash – it means that the foreground application must not continue until the data is written by Tux3's back-end." He added, "any advantage of decoupling the front/back end is nullified, since fsync() requires a temporal coupling."
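For readers who want to see where fsync() sits in an ordinary program, here is a short Python sketch. The function name timed_write and its parameters are hypothetical. It uses the standard os.fsync() call, which asks the kernel to flush a file descriptor's data to storage before returning. The timings it returns depend entirely on the filesystem and hardware, so the sketch makes no claim about their size.

```python
import os
import tempfile
import time


def timed_write(path, payload, durable):
    """Write payload to path; if durable, wait for fsync() before returning.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    with open(path, "wb") as handle:
        handle.write(payload)
        if durable:
            handle.flush()             # push Python's buffer to the kernel
            os.fsync(handle.fileno())  # wait until the kernel reports it flushed
    return time.perf_counter() - start


def compare(size=4096):
    """Time one buffered write and one fsync()-ed write of the same size."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "data.bin")
        buffered = timed_write(path, b"x" * size, durable=False)
        durable = timed_write(path, b"x" * size, durable=True)
        return buffered, durable, os.path.getsize(path)
```

The durable path is the one an application such as a database depends on, which is why a benchmark without it says little about that class of workload.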
Daniel replied that once they had optimized fsync(), he expected "… Tux3 to perform competitively, because our delta commit scheme does manage the job with a minimal number of block writes …" [2].
Elsewhere in the thread, Dave remarked on his real concern. He said, "I don't care how fast tux3 is – I care about being able to reproduce other people's results. Hence if you are going to report benchmark results comparing filesystems then you need to tell everyone exactly what you've tweaked and why, from the hardware all the way up to the benchmark config."
The discussion trailed off around there, but some kernel folks also seemed to feel that Daniel's approach was too marketing-oriented, favoring big announcements at the expense of clarifying the real progress made.
Dealing with Empty Symlinks
Back in January, Pádraig Brady noticed that Linux didn't allow users to create empty symlinks, that is, symlinks whose target is the empty string. He asked why this was, because POSIX specified that it should be allowed, and other operating systems supported it. There was no discussion at the time, but he recently followed up, asking if this was going to be fixed.
Part of the idea was that symlinks could be valuable just to store data in their name alone, without utilizing their traditional purpose of linking to other files.
But Al Viro thought this was "utterly pointless," especially considering that the behavior would end up being operating-system-dependent anyway. He said, "blanket refusal to traverse such beasts is a legitimate option."
Eric Blake replied that the real point was not whether creating an empty symlink should be allowed in Linux – it was the way Linux should behave when it encountered an empty symlink during path resolution.
After all, even if Linux didn't allow empty symlinks to be created, other operating systems did, and the filesystems containing those symlinks could be mounted under Linux. It would make sense to handle those cases correctly. Eric remarked:
"I personally don't care whether you fix the Linux kernel symlink() to allow empty symlinks, or successfully argue for a bug fix against POSIX to permit the existing Linux symlink() behavior. I'd love to see Linux obtain POSIX certification someday, and either of those two courses of action would get us closer. Meanwhile, I know there are enough other issues in the kernel … that it will be a long time before we ever get a POSIX certification of a Linux system."
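The symlink() behavior under discussion can be probed from userspace with a few lines of Python. This is a minimal sketch. The function name probe_empty_symlink is hypothetical, and the result depends on the operating system, which is the point of the thread. On a system that refuses empty symlinks, the call raises OSError and the sketch reports the error name. On a system that permits them, it reports the stored (empty) target.

```python
import errno
import os
import tempfile


def probe_empty_symlink():
    """Try to create a symlink whose target is the empty string.

    Returns ("refused", error_name) or ("created", stored_target).
    """
    with tempfile.TemporaryDirectory() as directory:
        link = os.path.join(directory, "empty-link")
        try:
            os.symlink("", link)
        except OSError as exc:
            # Map the numeric errno to its symbolic name where one is known.
            return "refused", errno.errorcode.get(exc.errno, str(exc.errno))
        return "created", os.readlink(link)
```

A probe like this only covers creation. Eric's larger question, how path resolution should behave when it meets an empty symlink created by another OS, would need a filesystem image that already contains one.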
Pavel Machek started exploring the extent of the issue under Linux, trying to identify which tools would break when encountering empty symlinks and how bad the breakage would be. The discussion ended at that point, however, with no clear resolution on a course of action, or even on whether it was worth doing anything about the situation.
Linus Torvalds is notoriously disdainful of compliance for compliance's sake. If there's no cost to it, he's not opposed, but if there are valid technical reasons to implement something in a non-compliant way, he'll choose that over compliance every time, and he makes no secret of his contempt for certain parts of the POSIX standard.
On the other hand, if there's a danger that users might get burned if they mount a filesystem on which another OS has created an empty symlink, Linus would rather eat sand than let that go unfixed. The real question may boil down to whether the status quo would burn anyone. At the moment, it still seems unclear.