Zack's Kernel News

Zack's Kernel News

Article from Issue 256/2022
Author(s):

Chronicler Zack Brown reports on the latest news, views, dilemmas, and developments within the Linux kernel community.

A Butterfly Flaps Its Wings

Andy Shevchenko, who is not a butterfly, happened to remark one day on the Linux Kernel Mailing List: "would it make sense to enable -Werror for default warning level, let's say W=0, at some point?" And this ended up producing great storms of change across the world.

The -Werror option is a command-line option for the GNU C Compiler (GCC). When compiling, GCC will detect many errors in your code, report them to you, and abort the attempt to compile your code. Perhaps you haven't quite made a real error, but your code is still not a good usage of the C language. In that case, GCC produces a warning instead and continues chugging away to build your executable. However, maybe you are so enthusiastic about proper C usage that you want GCC to treat those bad usages as if they were errors. The -Werror option tells GCC to do that.

Masahiro Yamada pointed out to Andy that "Every GCC release adds new warning options." And that "Enabling -Werror by default means the kernel build is suddenly broken with new compilers."

Andy replied that in spite of Masahiro's objection, using -Werror would at least "keep our hand on the pulse of the changes, right?"

Nick Desaulniers from Google pointed out that "Google's pixel kernel team carries an out of tree patch creating exactly such a config. It helps them keep their kernels building warning free."

So there was at least a precedent for hoping such a thing could be done. But Nick himself was opposed to the idea in spite of it being standard practice at his own company. He said, "if a build isn't warning free, it can be difficult to get there; you need to at least disable the config to see how many warnings you have, and identify which are lower hanging fruit. It also makes compiler upgrades excessively difficult. In my experience on Android, if folks are too busy to address compiler warnings, then new warnings added by a new compiler version just get turned off and never addressed. My experience with the kernel has been that fixes for different warnings also take varying amounts of time to get accepted and work their way through mainline, meanwhile builds are broken."

Linus Torvalds also responded to Andy's original post regarding -Werror, saying:

"I'd love to, and would have done that a long time ago, but we just haven't been able to depend on the compilers not having random warnings in random versions.

"And making it a config option doesn't work well either, simply because that will just mean that bots will randomly fail if they set that option, and if we make it harder to set developers won't have it set anyway, and it doesn't help much.

"End result: using -Werror works wonderfully in controlled environments. Not so wonderfully in the general random mess.

"That said, I've been so aggressive against accepting new warnings that I've considered this unconditionally anyway, and then dealing with compiler version fallout by just disabling certain compilers and/or certain options.

"But it would almost certainly be pretty painful."

Elsewhere, in an entirely different thread, Kees Cook posted a patch for Linus, which Linus rejected, saying, "You can't add new warnings without fixing them, and this adds some HORRENDOUSLY ugly new warnings that would most definitely hide other warnings."

In a subsequent post, Linus added, "I really want to enable -Werror at some point, but every time I think I should, I just end up worrying about another random new compiler (or a random old one). We do have -Werror in various configurations (and in some sub-trees)."

But then, fuming over his own email, Linus responded to himself, saying:

"Whatever. I'll just make a new config option, make it 'default y', and it will be on for anybody doing allmodconfig builds etc.

"And if people have new compilers, or odd configurations that still cause warnings, they can turn it off, but hopefully this will make compiler warnings in linux-next (or any other automated builds) cause a lot more noise."

In other words, kaboom. Linus decided to enable -Werror and just deal with the enormous fallout of kernel builds breaking across the known universe.

In yet a third entirely different thread, the kaboom-heard-round-the-world echoed in the ears of Nick (the Google Android guy) who had already objected to the kabooming act itself. He saw that Linus had already committed a patch enabling -Werror, and he began to wail into the night. He said, "While I can appreciate the intent of enabling -Werror, I don't think it is the right tool to address the root cause of developers not testing certain toolchains or configurations, or taking existing reports they're getting serious enough."

He added, "I'd also like to see such a patch sent formally to the list for discussion and have time to soak in next rather than be merged directly into mainline without either."

He said that -Werror might be OK if all existing warnings had already been fixed or if the tool chain would never be updated. But Nick said, "Unfortunately, none of the above is the case for the Linux kernel at this time."

Nick added that at least at Google "This change has caused nearly all of our CI to go red, and requires us to now disable CONFIG_WERROR until every last target and every last config is addressed. Rather than require everyone to disable the above config to keep builds going, perhaps certain CI systems should instead set CFLAGS_KERNEL=-Werror."

But Linus put his foot down once and for all and said:

"It was merged in response of _years_ of pain, with the last one just being the final drop.

"I'm not going to revert that change. I probably will have to limit it (by making that WERROR option depend on certain expectations), but basically any maintainer who has code that causes warnings should expect that they will have to fix those warnings.

"If it's clang that generates bogus warnings, then we'll have to start disable clang warnings. The clang people tend to be proud of their fewer false positives, but so far looking at things, I am not convinced.

"And I'm most definitely not convinced when the 'let's finally enable -Werror after years of talking about it', people end up going 'but but but I have thousands of warnings'.

"That's the POINT of that commit. That 'but but but I have thousands of warnings' is not acceptable.

"I spent hours yesterday getting rid of some warnings. It shouldn't be on me fixing peoples code. It shouldn't be on me noticing that people send me crap that warns.

"And it really shouldn't be 'Linus cares about warnings, so configurations that Linus doesn't test can continue for years to have them'.

"My 'no warnings' policy isn't exactly new, and people shouldn't be shocked when I then say 'time to clean up *YOUR* house too'."

Nick pointed out that this patch was merged without any public discussion over a holiday weekend. It gave no one a chance to test it relative to their various configurations and tool chains. He added to Linus, "Any given maintainer sending you a PR cannot (and should not, IMO) know under what combination of configs, targets, and toolchains you'll test under, and -Werror isn't going to help them figure it out."

He went on, "To be clear, you have merged patches into mainline that broke the build for combinations of configs/targets/toolchains that you are not testing. It's not realistic for you or any one person to test all such combinations either."

And Nick concluded, "Fixing warnings in the Linux kernel for all possible configs, targets, and toolchains while the toolchains continue to add more diagnostics is more sisyphean than digging a hole in wet sand."

Fangrui Song also pointed out that "Default WERROR makes building old kernels with new compilers more painful."

Marco Elver submitted a one-line patch, changing the -Werror option from on by default to on for test builds. He said that this "might move this new Kconfig option towards its appropriate usecases by default. The intent is not to dispute the usefulness of -Werror, but select the appropriate usecases by default, limiting friction for all those who can do little more than say CONFIG_WERROR=n."

Guenter Roeck, Nathan Chancellor, and Randy Dunlap liked Marco's patch, and Linus accepted it, saying "That seems reasonable. It very much is about build-testing."

Mark Brown, however, said, "having -Werror on by default for defconfigs is going to cause issues for bisection and just generally noticing promptly when runtime issues are introduced."

Steven Rostedt, deviating from the general mood of nauseous shock pervading the mailing list, didn't really object to -Werror. He said, "ktest has a way to create a list of current warnings, and then test your code against it, and will fail on any new warning. I run this on all my pull requests to make sure that I do not introduce any new warnings. That said, I still get bug reports on various configs that I did not test, where my code introduces a warning. I hate to be the one that fails their builds."

Pavel Machek did not like -Werror, saying that it would affect the stable kernel tree. The purpose of the stable tree is to accept only bug fixes. Something like -Werror would suddenly inspire a lot of people to try to fix all the newly appearing bugs. This, he said, would result in a lot of patch submissions to the stable tree, which could introduce more instability than it fixed.

On that same topic, Greg Kroah-Hartman said, "I will not be backporting this ['-Werror'] patch to older stable kernels, but I _want_ to see stable builds build with no warnings. When we add warnings, they are almost always things we need to fix up properly. Over time, I have worked to reduce the number of build warnings in older stable kernels. For newer versions of gcc, sometimes that is impossible, but we are close…."

Pavel replied, "You clearly can't backport this patch, but for 5.16-stable, you'll have it in, and now warnings are same as errors… and I don't believe that's [a] good idea for stable."

But Greg disagreed. He felt it was a fine idea for the stable tree, and that "it will force us to keep these trees clean over time." He also added that in the "worst case, we disable it in 4 years when gcc 15 or so generates so many errors we can't resolve them in this old kernel."

Florian Weimer raised some concerns about -Werror in general, saying "there are also warnings which are emitted by the GCC middle-end (the optimizers), and turning on -Werror for those is very problematic. These warnings are very target-specific and also depend on compiler version and optimization parameters." And he added, "GCC also lacks a facility to suppress warnings if they concern code that was introduced during optimization and removed again later (e.g. inlining, constant propagation, dead code removal)."

Linus jumped back into the conversation here, regarding Florian's statement about "middle-end" warnings. Linus said to him:

"People say that, but let's face it, that's simply not true.

"There are real problematic warnings, and we just turn those warnings off. People who want the self-flagellation can enable them with W=1 (or bigger values), and spend their life fighting stupid random compiler warnings that have tons of false positives.

"But the fact is, I've required a warning-free build on x86-64 for anything I notice for the last several years by now, and it really hasn't been a problem.

"What _has_ been a problem is that (a) build bots don't care about [warnings] and (b) the configs I don't personally test (other non-x86-64 architectures stand out, but there's certainly been other configuration issues too).

"But 'bogus compiler warnings' is very much *not* in that list of problems.

"I've looked at a lot of the warnings that are now errors, and while a number of them have made me go 'So why didn't we see that on x86-64?' not one of them has actually made me go '-Werror was wrong'.

"Because EVERY single one I've seen has been for something that should have been fixed. Presumably long long ago, but the warning it generated had been ignored.

"So stop with the 'some warnings just happen' crap. Outside of actual compiler bugs, and truly stupid warnings (that we turn off), that's simply not true.

"And yes, those compiler bugs happen. The new warning already found one issue with current gcc trunk (non-released). So right now the count is 'lots of valid warnings, and one compiler bug that was found _thanks_ to me enabling -Werror'.

"Yes, we have issues with having to work around older compiler bugs. Those aren't going away, and yes, -Werror may well mean that non-x86-64 people now have to deal with them.

"And yes, this is painful. I'm very much aware of that. But we just need to do it. Because the warnings don't go away on their own, and not making them fatal clearly just means that they'll stay around forever."

Thus spake Linus, and that was the end of that. At least for that thread in the mailing list.

Later, Linus gave an update on how the kernel was handling -Werror so far. He said:

"I've spent a fair amount of this week trying to sort out all the odd warnings, and I want to particularly thank Guenter Roeck for his work on tracking where the build failures due to -Werror come from.

"Is it done? No. But on the whole I'm feeling fairly good about this all, even if it has meant that I've been looking at some really odd and grotty code. Who knew I'd still worry about some odd EISA driver on alpha, after all these years? A slight change of pace ;).

"The most annoying thing is probably the 'fix one odd corner case, three others rear their ugly heads'. But I remain convinced that it's all for a good cause, and that we really do want to have a clean build even for the crazy odd cases.

"We'll get there."

Still later and in an unrelated thread, someone complained about the deep, echoing silence from Linus in response to a patch. But Linus explained, "Normally I get to clean up my inbox the week after the merge window, but the -Werror things kept my attention for one extra week, and so my mailbox has been a disaster area as a result. So only today does my inbox start to look reasonable again after the merge window."

A month later, Linus announced Linux Kernel 5.15, with the -Werror patch attached. He remarked, "This release may have started out with some -Werror pain, but it calmed down fairly quickly and on the whole 5.15 was fair[ly] small and calm. Let's hope for more of the same – without Werror issues this time – for the upcoming merge window."

And that's the story.

Kaaa…boom?

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy Linux Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • Kernel News

    Chronicler Zack Brown reports on the latest news, views, dilemmas, and developments within the Linux kernel community.

  • Kernel News

    Zack discusses mysterious alignments in the kernel; and discovery and invention.

  • Kernel News

    Chronicler Zack Brown reports cleaning up build warnings, improving (?) kernel code generation, and working around missing future compiler features.

  • Kernel News

    In kernel news: Rust in Linux; and Compiler and Kernel Frenemies.

  • Kernel News

    This month in Kernel News: Spanking Linus; Controlling Boot Parameters via Sysfs; Finessing GCC; and Dealing with Loose Build Dependencies.

comments powered by Disqus
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters

Support Our Work

Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.

Learn More

News