Zack's Kernel News

Article from Issue 236/2020

This month Zack discusses adapting to COVID-19, l33t security, and the Linux/GCC wars (or not).

Adapting to COVID-19

The ongoing COVID-19 pandemic does not seem to have slowed Linux kernel development, although in-person gatherings are being abandoned in favor of online alternatives. For example, Josef Bacik announced in late April that the Linux Storage, Filesystem, and Memory Management Summit would be canceled this year. He added, "Next year the summit will be held in Palm Springs, on May 12-14, 2021 at the Riviera Palm Springs. A new CFP and registration will be held again, along with a new round of invites. The program committee will remain the same, and next year we will choose new members."

The issue is somewhat significant for the same reason that these in-person events started up in the first place. There's something different about online interactions. It's maybe the same difference that allows months-long flame wars on mailing lists but far fewer shouting matches in meeting rooms. And it's maybe the same difference that allows you to "forget" to answer an email, where you wouldn't forget to answer a question posed by someone sitting in the room while everyone looks at you expectantly.

Whatever in-person hangouts contribute to Linux development, the contribution is real, and the kernel developers will have to find a substitute of similar worth or else just live without it for a year or two.

As for other impacts of COVID-19 on kernel development, Linus Torvalds said in a completely different thread, "I did have a request from the kernel technical advisory board (aka TAB) to mention that if anyone's had (or is predicting) disruptions to their kernel work from COVID-19 that they'd like help solving (finding backup maintainers, etc.), the kernel TAB has offered to help however they can. If this would be useful, please contact them at:"

l33t Security

I've often said that Linus Torvalds considers security to be the highest priority, valuing it above any kernel feature. Recently, Linus clarified this to some extent (or maybe muddied it – you can decide for yourself), during a security discussion. Christophe Leroy posted a patch to limit the size of one of Linux's attack surfaces. Specifically, he wanted to prevent attackers from trying to overwrite certain kernel functions, so the kernel would not unwittingly execute untrusted code when trying to call those functions. To do this he wanted to implement certain functions statically, so they couldn't change at runtime.

Kees Cook was fine with the patch, and Al Viro had some suggestions to regularize the calling conventions across all supported CPU architectures. The patch was originally written for PowerPC, but Al felt that ARM, RISC-V, and S390 needed special handling. He added, "let's sort that out while we still have few users of that interface; changing the calling conventions later will be much harder." He made some suggestions for how to handle things better.

Christophe pointed out that Al's suggestions were directly contrary to some comments by H. Peter Anvin back in January, in response to an earlier version of Christophe's patch. At that time, Peter had said, "I have *deep* concern with carrying state in a 'key' variable: It's a direct attack vector for a crowbar attack, especially since it is by definition live inside a user access region."

Christophe offered to try to mix and match elements of the current patch with the one he posted in January.

Kees agreed with Peter, saying he'd rather accept the current patch as is. And Al didn't push the point. But he did notice some naming convention issues, and he felt that some nesting in the ARM architecture code needed to be sorted out before Christophe's patch could be applied.

Kees agreed that "it's a weakness of the ARM implementation and I'd like to not extend it further. AFAIK [as far as I know] we should never nest, but I would not be surprised at all if we did."

This is where Linus joined the conversation, because Kees also remarked, "If we were looking at a design goal for all architectures, I'd like to be doing what the public PaX patchset did for their memory access switching, which is to alarm if calling into 'enable' found the access already enabled, etc."
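
To make Kees's principle concrete, here is a minimal userspace sketch in C of an "alarm on nested enable" check. Everything in it is an assumption made for illustration: the names access_enable(), access_disable(), and alarms are hypothetical, and it is neither the PaX code nor anything from Christophe's patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of an "access window"; none of these names are kernel APIs. */
static bool access_enabled;   /* true while the window is open */
static unsigned int alarms;   /* how many times the pairing rule was broken */

/* Open the window; raise an alarm if it was already open (a nested enable). */
static void access_enable(void)
{
    if (access_enabled) {
        alarms++;
        fprintf(stderr, "alarm: enable called while already enabled\n");
    }
    access_enabled = true;
}

/* Close the window; raise an alarm if it was not open in the first place. */
static void access_disable(void)
{
    if (!access_enabled) {
        alarms++;
        fprintf(stderr, "alarm: disable called while already disabled\n");
    }
    access_enabled = false;
}
```

A correctly paired access_enable()/access_disable() sequence leaves alarms at zero, while a second access_enable() without an intervening access_disable() bumps the counter. The sketch only shows the bookkeeping; where such state would live and what it would cost on real hardware is exactly what the rest of the thread argues about.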

The PaX patchset came from an anonymous author in the year 2000 as part of the larger grsecurity project and specifically addressed the type of vulnerability that Christophe's patch also tried to address – namely the case where an attacker tries to replace a portion of the kernel code with the attacker's own construction, so that the kernel will then execute that code and thus hand control to the attacker.

At this point, Linus sidled in and derisively remonstrated, "We already do better than PaX ever did. Seriously. Mainline has long since passed their hacky garbage. Plus PaX and grsecurity should be actively shunned. Don't look at it, don't use it, and tell everybody you know to not use that shit."

Kees clarified that he'd only been referring to the principle that, "if the 'enable' is called when already enabled, Something Is Wrong." To which Linus replied:

"Well, the 'something is wrong' could easily be 'the hardware does not support this'.

"I'm not at all interested in the crazy code to do this in software. Nobody sane should ever do that.

"Yes, I realize that PaX did software emulation of things like that, and it was one of the reasons why it was never useful to any normal user.

"Security is not an end goal in itself, it's always secondary to 'can I use this'.

"Security that means 'normal people can't use this, it's only for the special l33t users' is not security, it's garbage. That 'do page tables in software' was a prime example of garbage."

It's an interesting statement. When Linus says that security is secondary to "can I use this," is he saying that user features are more important than security?

Of course not. Though undoubtedly someone in the future will take that quote out of context and try to say that Linus was saying exactly that.

I'm sure he'll clarify for himself when the time comes, but I believe Linus's point in that post was that security features must be of use to regular users. If a security feature does not actually provide any added security for regular users, then it's worthless. And he's drawing a distinction between regular users and l33t users, who are trying to push the system in special subtle ways that they shouldn't.

It's useful to look back at Linus's 2017 statement:

"As a security person, you need to repeat this mantra:

"'security problems are just bugs'

"and you need to _internalize_ it, instead of scoff at it.

"The important part about 'just bugs' is that you need to understand that the patches you then introduce for things like hardening are primarily for DEBUGGING.

"I'm not at all interested in killing processes. The only process I'm interested in is the _development_ process, where we find bugs and fix them.

"As long as you see your hardening efforts primarily as a 'let me kill the machine/process on bad behavior', I will stop taking those shit patches.

"I'm deadly serious about this.

"Some security people have scoffed at me when I say that security problems are primarily 'just bugs'.

"Those security people are f*cking morons.

"Because honestly, the kind of security person who doesn't accept that security problems are primarily just bugs, I don't want to work with. If you don't see your job as 'debugging first', I'm simply not interested.

"So I think the hardening project needs to really take a good look at itself in the mirror.

"Because the primary focus should be 'debugging'. The primary focus should be 'let's make sure the kernel released in a year is better than the one released today'.

"And the primary focus right now seems to be 'let's kill things for bugs'. That's wrong."

The Linux/GCC Wars (or Not)

There was some discussion of the GNU Compiler Collection (GCC) recently among the kernel developers. Waiman Long has been working on a security patch to clear certain memory blocks before freeing up that memory, in order to avoid making the data in those blocks readable by hostile code. He explained, "Using memset() alone for buffer clearing may not provide certainty as the compiler may compile it away. To be sure, the special memzero_explicit() has to be used."

And Waiman said, "this patch introduces a new kvfree_sensitive() for freeing those sensitive data objects allocated by kvmalloc(). The relevant places where kvfree_sensitive() can be used are modified to use it."
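
The idea behind Waiman's patch can be sketched in userspace C. This is only an illustration under stated assumptions: zero_explicit() and free_sensitive() are hypothetical names standing in for memzero_explicit() and kvfree_sensitive(), the empty asm statement assumes a GCC- or Clang-compatible compiler, and the real kernel functions may differ in detail.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Userspace stand-in for memzero_explicit(): clear the buffer, then use an
 * empty asm statement (GCC/Clang syntax) as a barrier, so the compiler has
 * to assume the zeroed bytes are still observed and keeps the memset().
 */
static void zero_explicit(void *buf, size_t len)
{
    memset(buf, 0, len);
    __asm__ __volatile__("" : : "r"(buf) : "memory");
}

/*
 * Hypothetical analogue of kvfree_sensitive(): wipe first, then free.
 * The pointer is const-qualified to mirror the kvfree() prototype discussed
 * in the thread; the casts are needed because free() takes a plain void *.
 */
static void free_sensitive(const void *addr, size_t len)
{
    if (addr == NULL)
        return;
    zero_explicit((void *)addr, len);
    free((void *)addr);
}
```

The caller passes the length because, unlike the allocator, the wiping step has no way to know how many bytes the buffer holds.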

Joe Perches noticed that in Waiman's patch, the kvfree() function prototype took a const void *addr as input and wondered why the pointer had to be a constant. He tracked the prototype back to Linux v2.1.44, which changed the pointer from void * to const void *, but couldn't find any explanation.

Waiman said he was just letting sleeping dogs lie, as his patch didn't change that particular pointer. He offered to change it if Joe wanted, but Joe said it really didn't matter; he was just curious. At this point, Linus Torvalds threw in an explanation:

"Because 'free()' should always have been const (and volatile, for that matter, but the kernel doesn't care since we eschew volatile data structures).

"It's a bug in the C library standard.

"Think of it this way: free() doesn't really change the data, it kills the lifetime of it. You can't access it afterwards – you can neither read it nor write it validly. That is a completely different – and independent – operation from writing to it.

"And more importantly, it's perfectly fine to have a const data structure (or a volatile one) that you free. The allocation may have done something like this:

struct mystruct {
  const struct dictionary *dictionary;
};

"and it was allocated and initialized before it was assigned to that 'dictionary' pointer. That's _good_ code.

"So it wasn't const before the allocation, but it turned const afterwards, and freeing it doesn't change that, it just kills the lifetime entirely.

"So 'free()' should take a const pointer without complaining, and saying


"is a sensible an[d] correct thing to do. Warning about – or requiring that dictionary pointer to be cast to be freed – is fundamentally wrong.

"We're not bound by the fact that the C standard library got their rules wrong, so we can fix it in the kernel."

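The pattern Linus describes can be sketched in userspace C as follows. The struct names follow his example, but the function names (free_const(), mystruct_create(), mystruct_destroy()) and the contents of struct dictionary are assumptions made for illustration, not code from the thread.

```c
#include <stdlib.h>
#include <string.h>

struct dictionary {
    char word[16];
};

struct mystruct {
    const struct dictionary *dictionary;  /* read-only once published */
};

/* Hypothetical free() wrapper that accepts a const pointer, as kvfree() does. */
static void free_const(const void *ptr)
{
    free((void *)ptr);  /* the one cast lives here, not at every call site */
}

/* Build the dictionary while it is still writable, then publish it as const. */
static struct mystruct *mystruct_create(const char *word)
{
    struct mystruct *m = malloc(sizeof(*m));
    struct dictionary *d = malloc(sizeof(*d));

    if (m == NULL || d == NULL) {
        free(m);
        free(d);
        return NULL;
    }
    strncpy(d->word, word, sizeof(d->word) - 1);
    d->word[sizeof(d->word) - 1] = '\0';
    m->dictionary = d;  /* from here on, nobody writes through this pointer */
    return m;
}

/* Ending the lifetime of a const object needs no write access to it. */
static void mystruct_destroy(struct mystruct *m)
{
    if (m == NULL)
        return;
    /* Passing m->dictionary to plain free() would discard the const qualifier. */
    free_const(m->dictionary);
    free(m);
}
```

With the standard library's free(), the call in mystruct_destroy() would need a cast or draw a diagnostic about the discarded qualifier. A const-accepting prototype moves that cast into one place.
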
After some further thought, Linus added:

"I'd really love to be able to describe that operation, but there's sadly no such extension.

"So the _real_ prototype for 'free()'-like operations should be something like

void free(const volatile killed void *ptr);

"where that 'killed' also tells the compiler that the pointer lifetime is dead, so that using it afterwards is invalid. So that the compiler could warn us about some of the most trivial use-after-free cases.

"Because we've had even those trivially stupid ones.

"Yes, obviously various analysis systems do exactly that kind of analysis (and usually go much further), but then it's external things like coverity etc.

"The point being that the lifetime of an object is independent from being able to write to an object, and the 'const' in the 'free()' is not 'I promise to not write to it', but 'I can accept a constant pointer'.

"We've had a number of places in the kernel where we do that kind of 'lifetime' marking explicitly by assigning a NULL (or invalid value) to the pointer when we free it.

"I have this dim memory of us even (long long long ago) trying to use a #define kfree() … to do that, but it turns out to be basically impossible to get the proper 'use once' semantics, so it doesn't work if the argument to kfree() has side effects."

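The "use once" problem Linus mentions is easy to reproduce in userspace. The sketch below is not the kernel's old macro, which the thread does not show; FREE_AND_NULL(), next_slot(), and demo_double_evaluation() are hypothetical names chosen for illustration.

```c
#include <stdlib.h>

/*
 * Hypothetical free-and-poison macro. The argument appears twice in the
 * expansion, so any side effect in the argument also happens twice.
 */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

static int calls;  /* counts how often next_slot() runs */

/* Index helper with a visible side effect. */
static int next_slot(void)
{
    return calls++;
}

/*
 * Frees through an argument that has a side effect and reports how many
 * times that argument was evaluated.
 */
static int demo_double_evaluation(void)
{
    void *second = malloc(8);
    void *slots[2] = { malloc(8), second };

    calls = 0;
    FREE_AND_NULL(slots[next_slot()]);
    /* Expands to: free(slots[0]), then slots[1] = NULL. */

    free(second);  /* slots[1] was nulled without being freed; release it here */
    return calls;  /* 2: the argument was evaluated twice */
}
```

A single use of the macro frees one slot and nulls a different one, leaving slots[0] dangling, which is the opposite of what the macro was meant to guarantee. A function cannot fix this either, because it receives a copy of the pointer and cannot null the caller's variable.
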
David Howells suggested actually mentioning this to the GCC developers, on the off chance that they might like the idea. David said:

"It might be worth asking the compiler folks to give us an __attribute__ for that – even if they don't do anything with it immediately. So we might have something like:

void free(const volatile void *ptr) __attribute__((free(1)));

"There are some for allocation functions, some of which we use, though I'm not sure we do so as consistently as we should."

Linus replied: "Yes, having the free attribute would be consistent (even if the syntax for it might be as you suggest, kind of like the __printf() attribute works). Even if it wasn't initially used for anything it wouldn't hurt, and maybe some day it would improve warnings (and allow the compiler to do the dead store elimination that started this whole long set of threads in the first place…)."

David submitted a GCC feature request at

So the discussion continued inside the feature request itself. Jeffrey A. Law of the GCC team said that GCC already treated the free() function itself the way Linus wanted, but not other free-like functions, because GCC had no way of knowing that some other function was free-like.

Linus replied, "Oh, ok, so the logic already exists, just not the interface to tell anybody else." And he also said, "I also realize that it might not be worth it to you guys. Since you already effectively have the DSE code, that looks like a much cheaper thing to do."

But Richard Biener of the GCC team "agreed that having an attribute to annotate free-like functions similar to how we have one for malloc-like functions would be nice."

And Martin Sebor of the GCC team also said:

"I've actually been experimenting with this for GCC 11 as an extension of detecting uninitialized reads from dynamically allocated storage. My initial approach is to

"1) add a second (optional) argument to attribute malloc to mention the deallocation function (e.g., free for calloc, malloc, strdup, etc., or fclose for fopen and fdopen)

"2) add the free function attribute as described in comment #0

"Besides (or instead of just) detecting uninitialized reads from allocated storage this approach detects all accesses to freed pointers the same way -Wreturn-local-addr detects returning addresses of auto variables (i.e., not just dereferences of the pointers but also plain reads). In addition, it detects invalid pairs of calls (such as the free(fopen (…) kind, or the similar C++ new/delete mismatch), as well as attempts to free pointers known not to have been returned from an allocation function at all (e.g., pointers to VLAs or those returned from alloca())."

He volunteered to assign the project to himself and finish implementing it. And that was the end of that.
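
For a rough idea of what the first item on Martin's list looks like in practice: GCC releases from version 11 onward document a form of attribute malloc that names the matching deallocator. The sketch below assumes that documented syntax and compiles it only on such a compiler; struct widget, widget_create(), widget_destroy(), and the PAIRED_WITH() macro are hypothetical names, and this is not how the kernel annotates its own allocators.

```c
#include <stdlib.h>

/* The two-argument form of attribute malloc is GCC-specific (GCC 11 or later). */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11
#define PAIRED_WITH(dealloc) __attribute__((malloc, malloc(dealloc, 1)))
#else
#define PAIRED_WITH(dealloc)  /* expands to nothing on other compilers */
#endif

struct widget {
    int id;
};

/* Hypothetical free-like function; declared first so it can be named below. */
void widget_destroy(struct widget *w);

/* Hypothetical allocator whose result should only go to widget_destroy(). */
PAIRED_WITH(widget_destroy)
struct widget *widget_create(int id);

struct widget *widget_create(int id)
{
    struct widget *w = malloc(sizeof(*w));

    if (w != NULL)
        w->id = id;
    return w;
}

void widget_destroy(struct widget *w)
{
    free(w);
}
```

On a compiler that understands the annotation, handing the result of widget_create() to plain free() instead of widget_destroy() should draw a mismatched-deallocation warning. Elsewhere the macro expands to nothing and the code behaves as ordinary C.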

This was a very different interaction between the kernel and GCC folks from those in days of yore. In the before times, there was a fair bit of resentment across the development groups. The kernel folks would think a particular thing should be done in the compiler, while the compiler folks would say it should be done in the kernel. Disputes like that led to Linus simply refusing to support later GCC versions for quite a long time, insisting that one particular very old version of GCC was the only one supported by Linux, because later versions all did something he didn't want to handle in the kernel.

Times have changed: possibly because open source ultimately took over the world and is no longer an upstart scrambling to defend itself against giant enemies, possibly because there are now reasonable tools and protocols for making feature requests and reporting bugs, and possibly because the various interdependent projects have developed friendly relationships over the past 30 years.

It's nice to see.

The Author

The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.
