Zack's Kernel News


Article from Issue 202/2017

Chronicler Zack Brown reports on the latest news, views, dilemmas, and developments within the Linux kernel community.

Identifying the Oldest Compatible GCC Versions

Arnd Bergmann decided to test all versions of GCC to see how far back he could go and still compile a proper Linux kernel. He reported:

gcc-4.9 and higher is basically warning-free everywhere, although gcc-7 introduces some interesting new warnings (I have started doing patches for those as well). gcc-4.8 is probably good, too, and gcc-4.6 and 4.7 at least don't produce build failures in general, though the level of false-positive warnings increases (we could decide to turn those off for older compilers for build test purposes).

In gcc-4.5 and below, dead code elimination is not as good as later, causing a couple of link errors, and some of them have no good workaround.

In Arnd's ideal universe, GCC 4.5 would be declared too old for Linux and thus unsupported. But he pointed out that there were a bunch of older distributions still widely in use that relied on it.

Going farther back, Arnd reported that with GCC 4.3, "We need a couple of workaround patches beyond the problem mentioned above; more configuration options are unavailable and we get a significant number of false-positive warnings."

Going back to GCC 4.2, Arnd found that whole architectures would no longer work, in particular ARM versions 6 and 7.

He posted a big pile of patches to address all of these problems.

Sebastian Andrzej Siewior pointed out that the currently documented minimum supported GCC version was GCC 3.2, considerably older than the point at which Arnd reported GCC versions to start breaking down. Sebastian also said that H. Peter Anvin had reported that the x86 architecture wouldn't compile properly with any GCC version older than 3.4.

Meanwhile Geert Uytterhoeven reported that he'd been building test versions of Linux using GCC 4.1.2 for years, and while there were a lot of warnings, he'd gotten used to seeing the ignorable ones and was able to sift through them successfully to find any that really mattered.

Arnd, somewhat flabbergasted, asked Geert why on earth he was using such an old compiler for regular work; Geert said, "It's just the cross compiler I built .debs [on] a long time ago. As long as it works, I see no reason to upgrade, especially as long as I see warnings for bugs that no one else is seeing."

Meanwhile, Sebastian asked if Arnd's quest for the oldest viable GCC version applied to the ARM architecture only, and Arnd replied, "Clearly having the same minimum version across all architectures simplifies things a lot, because many of the bugs in old versions are architecture independent. Then again, some architectures implicitly require a new version because an old one never existed (e.g., arm64 or risc-v), while some other architectures may require an old version."

Sebastian countered that if an architecture required an older version of GCC, then that had to be some kind of bug in the kernel code that should be fixed. Russell King, however, pointed out that he used older compilers to build ARM kernels because doing so eliminated a variable when hunting for bugs. If the kernel version is the only change, he said, then he can be sure that the bugs he finds are kernel related. If he always upgraded his GCC version, it might never be obvious whether a given bug was kernel related or tools related.

Nearby, Heiko Carstens pointed out that the S390 architecture enforced a minimum GCC version of 4.3 as of two years ago.

The discussion went on hiatus for a bit, until Kees Cook asked if there had been any clear conclusion. He suggested that it might be time to raise the minimum supported GCC version to 4.7. He remarked, "I'm curious what gcc 4.6 binaries are common in the wild besides old-stable Debian (unsupported in maybe a year from now?) and 12.04 Ubuntu (going fully unsupported in 2 weeks). It looks like 4.6 was used only in Fedora 15 and 16 (both EOL)."

Arnd replied, "I think we are better off defining two versions: One that we know a lot of people care about, and we actively try to make that work well in all configurations (e.g., 4.6, 4.7, or 4.8), fixing all warnings we run into, and an older version that we try not to break intentionally (e.g., 3.4, 4.1, or 4.3) but that we only fix when someone actually runs into a problem they can't work around by upgrading to a more modern compiler."

Kees replied, "For 'working well everywhere' I feel like 4.8 is the better of those three (I'd prefer 4.9). I think we should avoid 4.6 – it seems not widely used. For an old compiler … yikes. 3.4 sounds insane to me."

Arnd offered his own detailed recommendation:

I suspect that 4.9 might be the one that actually works best across architectures, and it contained some very significant changes. In my testing gcc-5 tends to behave very similarly to 4.9, and gcc-6 introduced a larger number of new warnings, so that would clearly be too new for a recommended version.

The suggestion of 4.9 or higher is appealing as a recommendation because it matches what I would personally tell people:

If you have gcc-4.9 or newer and you don't rely on any newer features, there is no need to upgrade – with gcc-4.8, the -Wmaybe-uninitialized warnings are now turned off because they were too noisy, so upgrading is probably a good idea even though the compiler is otherwise ok and in widespread use – gcc-4.6 and 4.7 are basically usable for building kernels, but the warning output is often counterproductive, and the generated object code may be noticeably worse. … Anything before gcc-4.6 is missing too many features to be useful on ARM, but may still be fine on other architectures.

On the other hand, there is a noticeable difference in compile speed, as a 5% slowdown compared to the previous release apparently is not considered a regression. These are the times I see for building ARM vexpress_defconfig:

gcc-4.4: real 0m47.269s user 11m48.576s
gcc-4.5: real 0m44.878s user 10m58.900s
gcc-4.6: real 0m44.621s user 11m34.716s
gcc-4.7: real 0m47.476s user 12m42.924s
gcc-4.8: real 0m48.494s user 13m19.736s
gcc-4.9: real 0m50.140s user 13m44.876s
gcc-5.x: real 0m51.302s user 14m05.564s
gcc-6.x: real 0m54.615s user 15m06.304s
gcc-7.x: real 0m56.008s user 15m44.720s

That is a factor of 1.5x in CPU cycles between slowest and fastest, so there is clearly a benefit to keeping the old versions around, but there is also no clear cut-off other than noticing that gcc-4.4 is slower than 4.5 in this particular configuration.
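The ratio Arnd mentions can be checked against the table with a few lines of Python. This is only a sketch of the arithmetic: the times are copied from the "user" column above, and the helper name to_seconds is made up for the example.

```python
import re

# User CPU times copied from the table above (gcc version -> "user" column).
USER_TIMES = {
    "gcc-4.4": "11m48.576s",
    "gcc-4.5": "10m58.900s",
    "gcc-4.6": "11m34.716s",
    "gcc-4.7": "12m42.924s",
    "gcc-4.8": "13m19.736s",
    "gcc-4.9": "13m44.876s",
    "gcc-5.x": "14m05.564s",
    "gcc-6.x": "15m06.304s",
    "gcc-7.x": "15m44.720s",
}


def to_seconds(stamp):
    """Convert a time(1)-style string such as '11m48.576s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


seconds = {name: to_seconds(stamp) for name, stamp in USER_TIMES.items()}
fastest = min(seconds, key=seconds.get)
slowest = max(seconds, key=seconds.get)
ratio = seconds[slowest] / seconds[fastest]

print(f"fastest: {fastest} ({seconds[fastest]:.1f}s)")
print(f"slowest: {slowest} ({seconds[slowest]:.1f}s)")
print(f"ratio:   {ratio:.2f}x")
```

On these numbers, the fastest build is gcc-4.5 and the slowest is gcc-7.x, with a ratio of a little over 1.4, which is in line with the rounded 1.5x figure in the quote.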

But some people did not want to give up their old GCC versions. Maciej W. Rozycki mentioned that he used GCC 4.1.2 (the same version Geert had mentioned using). And Geert said, "If there's no real good reason (brokenness) to deprecate gcc-4.1, I would not do it. I guess most people using old compilers know what they're doing. My main motivation [to] keep on using gcc-4.1 is that it gives many warnings that were disabled in later gcc versions. I do look at all new warnings, and send patches when they are real bugs, or are trivial to silence."

But Geert acknowledged that the value of such an old GCC version had been diminishing lately, thanks to Arnd's work on reducing the number of compiler warnings with recent GCC versions.

There was some technical back-and-forth between Arnd, Geert, and Maciej. Finally Arnd recommended:

How about this approach then:

To keep it simple, we update the README.rst to say that a minimum gcc-4.3 is required, while recommending gcc-4.9 for all architectures … . Support for gcc-4.0 and earlier gets removed from linux/compiler.h, and instead we add a summary of what I found, explaining that gcc-4.1 has active users on a few architectures. We make the Makefile show a warning once during compilation for gcc earlier than 4.3.

Kees and Geert both said this would be fine with them, and the thread ended.
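The two-tier policy Arnd proposed (a hard minimum of 4.3, with 4.9 recommended) can be modeled in a few lines. The following Python sketch is purely illustrative: the kernel's real checks live in its build system and compiler headers, and the function names and messages here are invented for the example.

```python
MINIMUM = (4, 3)       # proposed minimum from the thread
RECOMMENDED = (4, 9)   # proposed recommended version from the thread


def parse_version(text):
    """Turn a version string such as '4.1.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def classify(version_text):
    """Classify a GCC version string against the two proposed thresholds."""
    version = parse_version(version_text)
    if version < MINIMUM:
        return "warn: older than the 4.3 minimum"
    if version < RECOMMENDED:
        return "ok: supported, but 4.9 or newer is recommended"
    return "ok: recommended"


# Version strings mentioned in the discussion, plus a newer release.
for candidate in ("4.1.2", "4.3", "4.8", "4.9", "7"):
    print(f"gcc {candidate}: {classify(candidate)}")
```

Comparing tuples of integers rather than raw strings avoids the classic mistake of treating "4.10" as older than "4.9".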

The issue of minimum supported GCC versions is more crucial than one might think. It's not just about kernel developers hunting for kernel bugs or about Linux having bragging rights for supporting 10-year-old compilers. There are a lot of enterprise systems still chugging along out in the world that for one reason or another can't be fully upgraded. Maybe the engineers who understood the system have moved on to other jobs; maybe the source code to crucial binaries is so old that changing to a newer compiler would represent too great a risk. There are plenty of rickety old systems that form the foundation of an entire company. When the Linux kernel finally ends support for the version of GCC they rely on, those companies may be forced to upgrade and risk having to pour resources into fixing breakages they've been praying never to have to deal with. Or worse, they may decide not to upgrade their kernel at all anymore, which would leave them open to increasing numbers of attack vectors. So it's pretty cool that Arnd and others are still working to keep modern kernels compatible with ancient compilers.

Identifying the Oldest Compatible Make Versions

Masahiro Yamada noticed that the oldest officially supported version of GNU Make was version 3.80, which had not worked since Linux 3.18. In fact, he said he himself was the one who had submitted the kernel patch that broke that version of Make.

Instead of reverting his patch, Masahiro suggested updating the documentation to list Make v3.81 as the officially supported minimum version. He preferred this solution partly because restoring support for version 3.80 would mean redoing a lot of work, and partly because in the three years since his patch broke that version of GNU Make, no one had noticed or complained about the breakage.

Greg Kroah-Hartman had no problem with this, and Michal Marek agreed, as did Jan Beulich. Finally, Linus Torvalds also agreed, saying, "From earlier (unrelated) discussion, I think we have other cases where the 'minimum recommended' may not be true (even gcc – it might be true on some architectures with simple configs, but we've had long-standing known issues with some more complex configurations where the 'minimum' gcc version simply didn't cut it and could generate incorrect code).

"For the GNU make case, you make a strong case of 'it hasn't worked for a while already and nobody even noticed'."

It's interesting to see the difference between the debate over minimum GCC version and minimum Make version. For GCC, there was a lot of back-and-forth, and a lot of effort to patch the kernel such that the oldest possible versions of GCC would still be supported. For Make, none of this took place, most likely because it is a case of the-proof-is-in-the-pudding. That is, if GCC versions older than version 4.6 or so had been broken for years and no one had noticed, I suspect everyone would have been content to set the minimum GCC version at 4.6 and be done with it.

Creating Mount Contexts

David Howells posted some patches to implement a mount context for all filesystem mounts. The context would contain the mount options and some useful binary data and associate it with the mounted filesystem.

In general, this would make it possible for the mount procedure to return more informative error messages. As David put it, "so many things can go wrong during a mount that a small integer isn't really sufficient to convey the issue."
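David's point about small integers can be illustrated with a toy model. The Python sketch below is not kernel code: the ToyMountContext class, the check functions, and the option names are all hypothetical, chosen only to show how two unrelated failures collapse into the same error code unless something carries a message alongside it.

```python
import errno
import os


def legacy_check(options):
    """Integer-only reporting: two different problems, one error code."""
    if "source" not in options:
        return -errno.EINVAL
    if options.get("vers", "4") not in ("3", "4"):
        return -errno.EINVAL
    return 0


class ToyMountContext:
    """Hypothetical context that stores a message alongside the error code."""

    def __init__(self, options):
        self.options = options
        self.message = None

    def check(self):
        if "source" not in self.options:
            self.message = "no source given"
            return -errno.EINVAL
        vers = self.options.get("vers", "4")
        if vers not in ("3", "4"):
            self.message = f"unsupported vers={vers}"
            return -errno.EINVAL
        return 0


bad_options = {"source": "server:/export", "vers": "9"}

# The integer alone only says that something was invalid.
code = legacy_check(bad_options)
print("legacy: ", code, os.strerror(-code))

# The context returns the same integer but also records what went wrong.
ctx = ToyMountContext(bad_options)
code = ctx.check()
print("context:", code, ctx.message)
```

Both paths return the same code, but only the context version can tell the caller which option was at fault.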

More specifically, a mount context would be able to hold namespace information, making it easier to isolate a mounted filesystem within a virtualized running Linux system.

So far, David said, he had implemented mount contexts for ProcFS and NFS, with ext4 being next on his list.

Jeff Layton was thrilled with this work and offered some minor technical suggestions.

Miklos Szeredi was also thrilled with David's work, but had some more invasive objections to David's design choices. In particular, he said, redoing the mount API needed to accomplish additional things. It needed to distinguish properly among creating a filesystem instance, attaching a filesystem instance into a directory tree, configuring the superblock, and changing the mount properties.

David's work, Miklos said, only partly achieved the separation that the kernel really needed. He summarized the goals as follows:

Why is fsopen() creating a 'mount context'? It's supposed to create a 'superblock creation context'. And indeed, there are mount flags and root path in there, which are definitely not necessary for creating a super block.

Is there a good reason why these mount-specific properties leaked into the object created by fsopen()?

Also I'd expect all context ops to be fully generic first. I.e., no filesystem code needs to be touched to make the new interface work. The context would just build the option string and when everything is ready (probably need a commit command) then it would go off and call mount_fs() to create the superblock and attach it to the context.

Then, when that works, we could add context ops, so the filesystem can do various things along the way, which is the other reason we want this. And in the end it would allow gradual migration to a new superblock creation api and phasing out the old one. But that shouldn't be observable on either the old or the new userspace interfaces.
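The fully generic first step Miklos describes (accumulate options, then commit by calling the existing mount routine) might look roughly like the following toy model. It is a sketch in Python rather than kernel C, and every name in it is an assumption made for illustration: SuperblockCreationContext, set_option, commit, and fake_mount_fs are invented, and the real mount_fs() has a different signature.

```python
class SuperblockCreationContext:
    """Toy stand-in for a generic superblock creation context (hypothetical)."""

    def __init__(self, fs_type, legacy_mount):
        self.fs_type = fs_type
        self._legacy_mount = legacy_mount  # plays the role of mount_fs()
        self._options = []
        self.superblock = None

    def set_option(self, key, value=None):
        """Record one option; no filesystem-specific code is involved."""
        if self.superblock is not None:
            raise RuntimeError("context already committed")
        self._options.append(key if value is None else f"{key}={value}")

    def option_string(self):
        """Build the comma-separated option string the old interface expects."""
        return ",".join(self._options)

    def commit(self):
        """Hand the accumulated option string to the legacy entry point."""
        if self.superblock is not None:
            raise RuntimeError("context already committed")
        self.superblock = self._legacy_mount(self.fs_type, self.option_string())
        return self.superblock


def fake_mount_fs(fs_type, data):
    """Pretend legacy mount routine; returns a dict instead of a superblock."""
    return {"type": fs_type, "data": data}


ctx = SuperblockCreationContext("ext4", fake_mount_fs)
ctx.set_option("ro")
ctx.set_option("data", "ordered")
print(ctx.commit())
```

In this model, the filesystem never sees the context at all, only the finished option string, which is what would let such an interface work before any filesystem is converted to per-context operations.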

David agreed in principle with most of Miklos's ideas. But he felt that some of the distinctions Miklos wanted to enforce were less crucial than others; he also pointed out that his own motivation in doing this work was not so much to redo the mount API as it was to support better namespace features.

Miklos recognized that David couldn't be expected to do everything and suggested, "let's just say, that everything that works now should work the same way on the old as well as the new interfaces."

There was a bit more discussion, and the thread petered out. Generally, when someone attempts to implement a new feature such as a mount context, there's a certain expectation that the person will also clean up the surrounding code and take care of any little nitpicky details that have been waiting for someone to handle. At the same time, sometimes those nitpicky details are just too thorny, and requiring contributors to deal with them would only discourage anyone from updating that particular piece of code. So there tends to be a bit of give and take. Ideally, anyone contributing code to thorny areas would at least make the surrounding issues easier to deal with rather than harder. In this case, it seems as though David's code has made the surrounding code easier to deal with, even if it didn't eliminate the problems altogether.

Zack Brown

The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.
