Zack's Kernel News

Article from Issue 231/2020
Author(s): Zack Brown

Zack discusses when to break the ABI and the status of vboxsf.

When to Break the ABI

It's rare that a patch sneaks through the development process and changes the Linux kernel application binary interface (ABI), though it's common enough to see a patch that tries to do it or a developer who advocates doing it. When that happens, Linus Torvalds has always made it very clear that virtually nothing short of a security hole could possibly justify such a thing. But we almost never see a case where a patch is actually accepted into the kernel and only afterward is discovered to have changed the ABI.

To understand the ABI in context, consider the kernel application programming interface (API). The API is a set of library routines that user source code can reference in order to give commands to the kernel or get information out of it.

Linus has no trouble at all changing the kernel API. Well, he has certain standards, but there is certainly no interdiction against it; it's just a normal part of kernel development. The reason is that in order to run into problems using the kernel API, a user must be trying to compile the source code of another program that uses that API. In that case, if the user sees the problem, they can patch the source code themselves by hand, using whatever programming language their code is written in. Once their source code is updated, they can compile and run it, no problem.

The ABI is very similar to the API – but instead of representing a source code library, it represents the interface as compiled machine code sees it: the system call numbers, argument and structure layouts, and return codes that a binary has baked in. For a user to run into a problem with the ABI, they must be trying to run software that has already been compiled to use that particular ABI. In such a case, the user may not have any source code to patch; they may simply have a closed-source blob of executable code that they've been using for years and can never fix or replace.
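
To make the distinction concrete, here is a minimal sketch in Python of how a change in binary layout trips up a program whose expectations are frozen at compile time. Everything in it is hypothetical: the two record layouts, the field names, and the helper functions old_binary_read and kernel_reply are invented for illustration and do not correspond to any real kernel structure or call.

```python
import struct

# Hypothetical record layouts; neither matches a real kernel structure.
OLD_LAYOUT = struct.Struct("<IH")   # u32 size, u16 mode
NEW_LAYOUT = struct.Struct("<IIH")  # u32 size, u32 flags (newly inserted), u16 mode


def old_binary_read(buf):
    """Stand-in for an already-compiled program: its field offsets are frozen."""
    size, mode = OLD_LAYOUT.unpack_from(buf, 0)
    return {"size": size, "mode": mode}


def kernel_reply(layout, *fields):
    """Stand-in for the kernel filling a buffer using a given layout."""
    return layout.pack(*fields)


# Same "binary", two different kernels.
before = old_binary_read(kernel_reply(OLD_LAYOUT, 4096, 0o644))
after = old_binary_read(kernel_reply(NEW_LAYOUT, 4096, 1, 0o644))

print(before)  # the mode field is read correctly
print(after)   # the mode field now holds bytes that belong to the new flags field
```

With source code in hand, the fix would be a one-line change to the layout and a recompile. Without it, the old program keeps reading the wrong bytes, which is the situation the ABI rule exists to prevent.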

That's the problem. If it were possible to freely fix or replace binary executables when they ran into ABI conflicts, Linus would certainly not care nearly as much about breaking the Linux ABI. In that case, he would probably be just as willing to break the ABI as he is to break the API – the rationale being that it's OK so long as no actual users are inconvenienced.

But Linus is fanatically loyal to existing binary code. He refuses to allow the progress of Linux kernel development to break binary code that is already running in the wild. Anyone, anywhere, who is relying on compiled software, must be able to upgrade their kernel and continue to run their existing binary user code, without fear. As I say, virtually the only exception to this rule is where security is concerned. Linus would hew off any piece of the Linux kernel no matter how useful, if that were necessary to eliminate a security hole. Above all else – even functionality – Linux is a secure system.

The ABI policy was demonstrated recently when Douglas Anderson from the Chromium project posted a patch to revert a cryptography feature, on no other grounds but that the earlier patch had broken the kernel's ABI. He said, "The commit made a clear and documented ABI change that is not backward compatible. There exists userspace code that relied on the old behavior and is now broken." He gave a link to the part of the Chromium project that relied on the old ABI.

It's important to note that even though Chromium is an open source project, it still represents compiled executables that rely on the binary interface. Merely having source code available for a given project is not enough to justify breaking the kernel's binary interface relied on by that project.

Douglas was even willing to deal with an ABI change – you never know when it might actually turn out to be a legitimate security issue – but he said the Chromium developers would strongly desire some help managing the necessary changes to their codebase in that case.

Eric Biggers thanked Douglas for the report and asked to know how this broke Chrome OS. He also remarked that if it were really necessary to revert the patch, then the kernel would need to document the fact that a particular error code, which could arrive when doing encryption, was actually a filesystem-specific error. Reverting the patch would essentially make the kernel's behavior more difficult to understand and explain, which as Eric put it, "we'd really like to avoid…."
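
An error code can be part of the ABI just as much as a data layout can, because deployed programs branch on the exact value they receive. The following Python sketch illustrates the idea under stated assumptions: the function names are hypothetical, and the two errno values are chosen purely for illustration rather than taken from the discussion above.

```python
import errno


def userspace_is_encrypted(query):
    """Stand-in for a deployed program that interprets a kernel error code.

    It treats one specific errno as "not encrypted" and anything else as fatal.
    """
    try:
        query()
        return True
    except OSError as e:
        if e.errno == errno.ENODATA:  # illustrative choice of error code
            return False
        raise


def old_kernel_query():
    """Hypothetical old behavior: report 'no data' for an unencrypted directory."""
    raise OSError(errno.ENODATA, "no policy set")


def new_kernel_query():
    """Hypothetical new behavior: a different code for the same situation."""
    raise OSError(errno.EOPNOTSUPP, "feature not enabled")


print(userspace_is_encrypted(old_kernel_query))  # handled as "not encrypted"

try:
    userspace_is_encrypted(new_kernel_query)
except OSError as e:
    print("unexpected error:", errno.errorcode[e.errno])
```

The new code may well be the more logical one to return, but the program that was written against the old value has no way to know that.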

Eric pointed out that an alternative patch would break the ABI in a different way. He asked if Douglas was proposing that the kernel keep the complex, difficult-to-explain behavior.

Douglas acknowledged that he didn't really know enough about this particular area of the kernel. He said, "I guess I'd have to leave it up to the people who know this better. Mostly I just saw this as an ABI change breaking userspace which to me means revert. I have very little background here to make good decisions about the right way to move forward."

Eric swallowed that, though it must have been difficult, and agreed that it looked as if the patch did indeed need to be reverted. He proposed that for the next kernel version, the existing behavior should be broken equally across all filesystems, so that the behavior would at least be consistent. Yes. The feature proposal is to break the rest of the filesystems in the same way, rather than accept even a single ABI change. As Eric put it, "I think we should try to make things slightly more sane by removing the same check from f2fs and fixing the documentation, so that at least each ioctl will behave consistently across filesystems and be correctly documented."

In addition to this, Eric had to swallow the equally bitter pill of trying to track down exactly where the ABI breakage truly was. After all, the original patch existed for a reason, and that reason still needed to be addressed if possible, but just without the ABI breakage.

But there's a twist! As Eric proceeded to unravel the true nature of the breakage, he found a way to repair the Chromium behavior without actually reverting the patch reported by Douglas. In response to this, Guenter Roeck of Google suggested that the kernel maintainers hold off on accepting the reversion. He said, "I'll do more testing next week, but as it looks like it may no longer be needed, at least not from a Chrome OS perspective."

And Theodore Y. Ts'o, very high up in the kernel developer hierarchy, agreed! He said he would wait on sending the reversion to Linus.

That was the end of the discussion, but it's very interesting. Apparently the status of an ABI breakage can be influenced by whether there are any actual users of that particular interface anywhere in the world. Having fixed the problem from the Chrome OS perspective, the kernel developers had found a sort of reprieve. If no other users come forward to complain about this particular breakage, it may even be conceivable that Linus would allow the ABI breakage to persist.

So the policy toward ABI breakage appears to be that it's OK if it solves a security problem or if there are no actual users of the ABI in question.

Status of vboxsf

There was an interesting interaction between the Linux maintainer (Linus Torvalds) and the maintainer of the stable branch (Greg Kroah-Hartman). Linus recently accepted a patch to a kernel release candidate (-rc), which brought the VirtualBox Shared Folder (vboxsf) filesystem into the staging area of the kernel source code. Specifically, it was Linux 5.4-rc7, relatively late in the process toward the official 5.4 release. The staging area is traditionally a place where upcoming features can get the widest possible distribution. They aren't built into the kernel by default, because the staging area is kept isolated. But they are in the source tree, and anyone downloading that tree also gets the staging area. This is very useful for getting as many eyes as possible on new features, before they are actually transitioned into the main kernel code and put into actual use.

It was Greg who fed the patch up to Linus at that time, and he got the patch from Christoph Hellwig, who had said it was ready to go into the kernel. Greg apparently misconstrued what Christoph had meant and put the new filesystem into the staging area instead of the main kernel for actual use. So after Christoph pointed out Greg's mistake, Greg submitted a new patch to Linus, for the -rc8 kernel, migrating vboxsf out of the staging area and into the kernel proper.

Linus replied definitively:

"No.

"I was unhappy about a staging driver being added in rc7, but I went 'whatever, it's Greg's garbage'.

"There is no way in hell I will take a new filesystem in rc8.

"Would you take that into stable? No, you wouldn't. Then why is this being upstreamed now.

"Honestly, I think I'll just delete the whole thing, since it shouldn't have gone in in the first place. This is not how we add new filesystems."

Greg accepted this gracefully, saying, "Fair enough, sorry for the noise."

However, Hans de Goede complained, saying:

"The problem is that Al Viro, after an initial review around v2 or v3 of the patch, which I believe I have fully addressed, has been ignoring this patch/new fs for over a year now. I've pinged him repeatedly both via email and on irc, but with no luck. I guess he simply is too busy with other stuff.

"I did ask other fs developers to review and have gotten reviews from David Howell and Christoph Hellwig. I've addressed all their review remarks and I've had reviews of the newer versions with just a few nitpicks remaining. I've also addressed those nitpicks. But I never got an Acked-by or Reviewed-by from either of them on any of the newer versions.

"I even talked to various people about this at plumbers, but I did not get any traction there either.

"On the advice of Christoph I've asked Andrew Morton to take this directly under fs/ instead, twice. When this all went no where I went the staging route, with the current result."

Linus did not reply directly, but when he announced -rc8 he did say, "The other noticeable thing in the diffs is the removal of the vboxsf filesystem. It will get resubmitted properly later; there was nothing obviously wrong with it technically; it just ended up in the wrong location and submitted at the wrong time. We'll get it done properly probably during 5.5." Regarding the lateness of submitting vboxsf for inclusion, he remarked, "I considered just making a final 5.4 and be done with it, but decided that there's no real downside to just doing the rc8 after having a release cycle that took a while to calm down." So clearly the -rc7/-rc8 time frame was simply too late this time around.

For me, there are several interesting points to this whole exchange. The use of the staging area is always interesting, because it was created specifically to solve the problem of new features failing to get enough testing before going into the main tree. The idea of having a tree within the tree, to make sure upcoming features were as available as possible, was a simple and surprising innovation.

And in this case, although Linus acknowledged that the filesystem looked technically OK and had already been accepted into the staging area, he pulled the entire patch out again in the very next release candidate, just because the submission itself hadn't been done properly. This seems to indicate a general principle of keeping things orderly, in the face of the tremendous onslaught of new features that pour into the kernel with every release cycle. Linus apparently wants to make it completely unambiguous to developers exactly how each new feature should proceed on its way into the tree.

The other thing I find very interesting about this is Hans's attempt to do an end-run around developers who didn't seem to be responding fast enough – particularly Alexander Viro, who comes as close as it gets to sitting side by side with Linus in the developer hierarchy. Hans tried various ways to get Al's attention, raised the issue at developer conferences, and then followed other well-placed developers' advice to bring Linus's attention to the project. And lo and behold, it's a near certainty that vboxsf will get solid and timely consideration before the release of Linux 5.5.

The Author

The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.
