Kernel News

Zack's Kernel News

Article from Issue 241/2020
Author(s): Zack Brown

This month in Kernel News: Shared Processes with Hyper-Threading; Cleaning Up printk(); and Rust in the Kernel.

Shared Processes with Hyper-Threading

Joel Fernandes wanted to speed up hyper-threading by making it possible for processes that trust each other to share the hyper-threads of a single CPU core. Hyper-threading is Intel's proprietary form of hardware-based multithreading. Generally in Linux and other operating systems (OSs), the OS is responsible for switching rapidly between processes, so everything on the system seems to be running at once. Hyper-threading adds a layer of this at the hardware level, with one physical core presenting itself to the OS as two virtual CPUs and interleaving their work on its own. But the OS can still interact with Intel's hyper-threading features, so folks like Joel can try to eke out performance improvements.

Joel specifically wanted to improve Peter Zijlstra's "core-scheduling" patches, which Peter famously hates the way one hates slowly pulling out their own fingernails. He actually did this once. No, not really. However, the point of hyper-threading is to make each physical CPU core act like two virtual CPUs. This way, for moments when one of the virtual CPUs has nothing to do, the other virtual CPU will keep chugging away, making sure the hardware "real" CPU is as fully utilized as possible. The core-scheduling patches control which processes are allowed to run on those two sibling virtual CPUs at the same time.

Of course, processes running on those two hyper-threading virtual CPUs have to be treated like potential security threats, the same as all processes everywhere. If a hostile actor gets into a user process on a given system, the kernel wants to limit the amount of damage that actor can do. That's just part of standard Linux procedure. Keep everything isolated, and then nothing can hurt anything else too badly.

However, Joel's idea is that if a pair of virtual CPUs were each running processes that were known to trust each other, then the OS could relax a little and not pay such close attention to limiting each of those processes' access to each other. So without that extra security burden, the two virtual CPUs could run that teensy bit faster.
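To make the trade-off concrete, here is a toy user-space model in Python, not kernel code. The Task class, its trust_tag field, and the pick_for_core() function are all invented for this illustration, and the selection rule is a simplified guess at the general idea rather than what the actual patches do. It shows one physical core choosing work for its two virtual CPUs: tasks with matching tags run side by side, while a mismatch leaves one virtual CPU idle.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Task:
    name: str
    priority: int   # higher number runs first
    trust_tag: int  # hypothetical label: equal tags mean mutual trust


def best(tasks: List[Task], tag: Optional[int] = None) -> Optional[Task]:
    """Highest-priority task, optionally restricted to one trust tag."""
    pool = [t for t in tasks if tag is None or t.trust_tag == tag]
    return max(pool, key=lambda t: t.priority, default=None)


def pick_for_core(
    queue_a: List[Task], queue_b: List[Task]
) -> Tuple[Optional[Task], Optional[Task]]:
    """Choose what the two virtual CPUs of one physical core run next."""
    a, b = best(queue_a), best(queue_b)
    if a is None or b is None or a.trust_tag == b.trust_tag:
        return a, b  # nothing to protect, or mutual trust: run both
    # Mismatch: the higher-priority task keeps its virtual CPU. The sibling
    # may only run a task with the same tag, or else it sits idle (None).
    if a.priority >= b.priority:
        return a, best(queue_b, a.trust_tag)
    return best(queue_a, b.trust_tag), b


# Sample tasks (made up for the example)
web = Task("web", priority=5, trust_tag=1)
worker = Task("worker", priority=3, trust_tag=1)
guest = Task("guest", priority=4, trust_tag=2)

print(pick_for_core([web], [worker]))         # both virtual CPUs busy
print(pick_for_core([web], [guest]))          # sibling forced idle
print(pick_for_core([web], [guest, worker]))  # sibling falls back to a trusted task
```

In this model, the idle sibling in the second case is the price of isolation, and the first case is where trusting processes win back that lost capacity.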

Of course, Joel's future patch would need to account for all workflows. There couldn't, for example, be a massive slowdown in the event that the two processes didn't actually trust each other.

Additionally, as Joel explained, "Currently no universally agreed set of interface exists and companies have been hacking up their own interface to make use of the patches. This post aims to list use cases which I got after talking to various people at Google and Oracle."

Aside from everything else, Joel needed to account for the shenanigans of corporate players who had already been hacking around on this topic for a while themselves. Various folks chimed in on the discussion.

How do processes know they trust each other? One way is that, if one process creates a bunch of other processes, those processes can all trust each other unless otherwise informed. So the question becomes how to keep track of process families and how trustworthy the parent process thinks the child processes are.

Or, as expressed by Vineeth Pillai, sometimes the user wouldn't need nested hierarchies of processes, but would simply want to designate a set of existing processes as "trustworthy" and then have them all run on the same CPU, so as to take advantage of core scheduling and, of course, Joel's proposed speed-ups.

In that use case, Vineeth suggested creating a new coreschedfs filesystem, which would have file entries representing groups of processes that all trusted each other.

However, Dhaval Giani didn't like adding to the proliferation of filesystems in Linux. He said, "We are just reinventing the wheel here. Let's try to stick within cgroupfs first and see if we can make it work there."
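Both ways of forming a trust group described above, inheriting it from a parent or designating it by hand, can be sketched in a few lines of Python. This is only a model: Proc, fork(), designate(), and the integer group labels are hypothetical names chosen for the example, and nothing here reflects an interface that was actually agreed on in the thread.

```python
import itertools
from dataclasses import dataclass, field

# Hypothetical source of fresh trust-group labels
_next_group = itertools.count(1)


@dataclass
class Proc:
    pid: int
    group: int = field(default_factory=lambda: next(_next_group))

    def fork(self, pid: int, trusted: bool = True) -> "Proc":
        """Create a child. A trusted child joins the parent's group;
        an untrusted one gets a fresh group of its own."""
        if trusted:
            return Proc(pid, self.group)
        return Proc(pid)


def designate(*procs: Proc) -> None:
    """Mark a set of already-running processes as mutually trusting."""
    shared = next(_next_group)
    for proc in procs:
        proc.group = shared


def trust_each_other(a: Proc, b: Proc) -> bool:
    return a.group == b.group


# Process family: trust flows down from the parent unless it says otherwise
parent = Proc(100)
child = parent.fork(101)
grandchild = child.fork(102)
sandbox = parent.fork(103, trusted=False)

# Unrelated processes that the user groups together by hand
db = Proc(200)
cache = Proc(201)
unrelated_before = trust_each_other(db, cache)
designate(db, cache)
unrelated_after = trust_each_other(db, cache)

print(trust_each_other(parent, grandchild), trust_each_other(parent, sandbox))
print(unrelated_before, unrelated_after)
```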

There was not a huge amount of discussion, beyond everyone agreeing that faster is better, and that their use cases are the good use cases.

This is how a project gets started sometimes. And a lot of these big companies really want that little tiny bit of extra speed. Their data centers contain millions of machines, and, for that sort of situation, a little speed-up can go a long way.

Cleaning Up printk()

The printk() function is what everyone wants to work on these days. It's the hot, sexy kernel feature that spells good jobs and better press for the lucky developer who snags it.

No, nobody likes printk(). It's always been a mess, and it gets messier just about as often as it gets cleaner. But if you want to debug the kernel or submit a bug report to people who do, then you need those printk() outputs, right up to the microsecond before the crash.

John Ogness recently posted a patch to the printk() subsystem. Specifically, it changes the handling of LOG_CONT, the feature that lets a single line of output be assembled from several individual printk() calls. This approach, John said, "is necessary in order to support the upcoming move to a fully lockless printk() implementation." John offloaded the job of piecing those fragments together from the writer (i.e., printk()) to the reader, the part of the kernel that handles the printk() output.

Just as nobody likes printk(), nobody likes locks. A lock reserves a resource, like a printer or a RAM chip, for one single process and prevents any other process from using it. Generally locks are held and released at lightning speed, to maintain the illusion that all processes on the system are running simultaneously. But locks do use a little time and can cause very brief system-wide delays – in general, the fewer locks, the better.
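As a rough user-space analogy (kernel locks are a different beast, written in C), the following Python sketch shows a lock reserving a shared counter for one thread at a time. The counter, the thread count, and the add_many() helper are made up for the example.

```python
import threading

counter = 0
counter_lock = threading.Lock()


def add_many(n: int) -> None:
    """Increment the shared counter n times, one locked step at a time."""
    global counter
    for _ in range(n):
        with counter_lock:  # reserve the counter for this thread...
            counter += 1    # ...update it; the lock is released on leaving the block


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

Every increment has to wait its turn for the lock, which is exactly the small cost the paragraph above describes: correct results, paid for with a little serialization.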

There used to be something called "The Big Kernel Lock" that forced the whole system to wait, instead of simply reserving individual resources. Eventually it was replaced with lots of little baby locks that were highly targeted. This sped up the system and gave everyone a smoother overall user experience.

A lockless printk() is good for other reasons – it reduces the risk that an important kernel message won't be output before a crash, just because a lock hadn't been released in time.

All seemed well. John's patch seemed to be part of a larger effort to make things better. However, Linus Torvalds did not have confidence that the readers would properly put the printk() output back together into a coherent line of text. He said:

"The last time we did things like this, it was a disaster, because a concurrent reader would see and return the _incomplete_ line, and the next entry was still being generated on another CPU.

"The reader would then decide to return that incomplete line, because it had something.

"And while in theory this could then be handled properly in user space, in practice it wasn't. So you'd see a lot of logging tools that would then report all those continuations as separate log events.

"Which is the whole point of LOG_CONT – for that *not* to happen."

He said he'd only accept this sort of patch, "as long as the kernel makes sure the joining does happen [to] it at some point." He added that, "It obviously doesn't have to happen at printk() time, just as long as incomplete records aren't exposed even to concurrent readers."

John said he was on top of it, and he thought this would be handled as Linus wanted. He added, "Petr and Sergey are also strict about this. We are making a serious effort to avoid breaking things for userspace."
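Linus's requirement can be illustrated with a small Python model of a reader that joins fragments itself and never hands out a half-finished line. It assumes each stored fragment carries an ID for whoever wrote it and that a trailing newline marks the end of a line; the JoiningReader class and the sample messages are hypothetical, not John's actual design.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed fragment layout: (caller ID, piece of text).
# A trailing "\n" marks the end of that caller's line.
Fragment = Tuple[int, str]


class JoiningReader:
    def __init__(self) -> None:
        self._pending: Dict[int, List[str]] = defaultdict(list)

    def feed(self, fragments: Iterable[Fragment]) -> List[str]:
        """Consume stored fragments and return only the lines completed so far."""
        complete = []
        for caller, text in fragments:
            self._pending[caller].append(text)
            if text.endswith("\n"):
                line = "".join(self._pending.pop(caller))
                complete.append(line.rstrip("\n"))
        return complete

    def pending(self) -> Dict[int, str]:
        """Unfinished text per caller, kept hidden from log consumers."""
        return {caller: "".join(parts) for caller, parts in self._pending.items()}


reader = JoiningReader()

# Caller 0 is still building its line while caller 1 logs a complete one
first = reader.feed([(0, "Modules linked in:"), (1, "eth0: link up\n"), (0, " ext4")])
hidden = reader.pending()
second = reader.feed([(0, " e1000\n")])

print(first)
print(hidden)
print(second)
```

The first read returns only caller 1's finished line. Caller 0's module list stays out of sight until its final newline arrives, at which point it comes out as one line instead of three separate log events.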

Linus breathed a highly tentative sigh of relief, saying:

"Over the years, we've gotten printk wrong so many times that I get a bit paranoid. Things can look fine on the screen, but then have odd line breaks in the logs. Or vice versa. Or work fine on some machine, but consistently show some race on another.

"And some of the more complex features are hardly ever actually used – I'm not sure the optional message context (aka dictionary) is ever actually used.

[...]

"So there are hidden things in there that can easily break *subtly* and then take ages for people to notice, because while some are very obvious indeed ('why is my module list message broken up into a hundred lines?') others might be things people aren't even aware of."

At this point, Petr Mladek stomped all over Linus's tentative sigh of relief, saying he'd already found a bug in which the kernel was outputting empty log lines. Apparently he traced the problem to gaps between the log line numbers, causing extra lines to be output to fill in the gaps. But Petr said, "I can't find any simple or even working solution for maintaining a monotonic sequence number a lockless way that would be the same for all stored pieces."

However, Petr did have an unpleasant suggestion:

"I am afraid that the only working solution is to store all pieces in a single lockless transaction. I think that John already proposed using 2nd small lockless buffer for this. The problem might be how to synchronize flushing the pieces into the final buffer.

"Another solution would be to use separate buffer for each context and CPU. The problem is a missing final '\n'. It might cause that a buffer is not flushed for a long time until another message is printed in the same context on the same CPU.

"The 2nd small logbuffer looks like a better choice if we are able to solve the lockless flush."

There was no further discussion in this thread. Obviously, a solution will be found, since the whole point is to make printk() less revolting. But clearly Linus is pretty jumpy when it comes to printk() changes. It may be a long road to maintainability.
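Petr's second option, a separate buffer per context and CPU, and the missing-newline problem he describes can be modeled as below. This is a Python sketch under simplifying assumptions: the PerContextBuffers class, the context names, and the messages are invented, and real kernel buffers would not be Python dictionaries.

```python
from typing import Dict, List, Tuple

# Assumed key: (CPU number, context name such as "task" or "irq")
Key = Tuple[int, str]


class PerContextBuffers:
    def __init__(self) -> None:
        self.final_log: List[str] = []
        self._partial: Dict[Key, str] = {}

    def write(self, cpu: int, context: str, text: str) -> None:
        key = (cpu, context)
        data = self._partial.pop(key, "") + text
        *lines, rest = data.split("\n")
        self.final_log.extend(lines)   # every newline-terminated line is flushed
        if rest:
            self._partial[key] = rest  # leftover waits for its final newline

    def stuck(self) -> Dict[Key, str]:
        """Text that has been written but not yet flushed to the final log."""
        return dict(self._partial)


log = PerContextBuffers()
log.write(0, "task", "probing device...")
log.write(1, "irq", "timer fired\n")
log.write(0, "task", " done\n")
log.write(1, "task", "no newline here")

print(log.final_log)
print(log.stuck())
```

The last message never reaches the final log in this model. It sits in its buffer until something else is printed from the same context on the same CPU, which is the delay Petr was worried about.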

Rust in the Kernel

Nick Desaulniers recently asked for a show of hands from people planning to attend the Linux Plumbers conference regarding how many wanted to have a micro-conference on support for writing kernel modules in the Rust programming language.

There was a general chorus of interest. Rust's value is that it's a slightly higher-level language than C and thus doesn't suffer from some of the usual C language pain points, like use-after-free bugs and buffer overflows. For core kernel implementations, Rust would probably not be acceptable in the same way C++ is not acceptable – those languages don't appear to give enough fine-grained control over the generated machine code. But for modules, the Rust language might tend to be a more inviting choice than C.

Josh Triplett, the leader of the Rust language development team, replied to Nick's post, saying:

"I'd love to see a path to incorporating Rust into the kernel, as long as we can ensure that:

- There are appropriate Rustic interfaces that are natural and safe to use (not just C FFI, and not *just* trivial transformations like slices instead of buffer+len pairs).

- Those Rustic interfaces are easy to maintain and evolve with the kernel.

- We provide compelling use cases that go beyond just basic safety, such as concurrency checking, or lifetimes for object ownership.

- We make Rust fit naturally into the kernel's norms and standards, while also introducing some of Rust's norms and standards where they make sense. (We want to fit into the kernel, and at the same time, we don't want to hastily saw off all the corners that don't immediately fit, because some of those corners provide value. Let's take our time.)

- We move slowly and carefully, making sure it's a gradual introduction, and give people time to incorporate the Rust toolchain into their kernel workflows.

"Also, with my 'Rust language team lead' hat on, I'd be happy to have the Linux kernel feeding into Rust language development priorities. If building Rustic interfaces within the kernel requires some additional language features, we should see what enhancements to the language would best serve those requirements. I've often seen the sentiment that co-evolving Linux and a C compiler would be beneficial for both; I think the same would be true of Linux and the Rust compiler."

There was a general round of fond recollections of earlier discussions of possibly including Rust in the kernel. At one point, Josh remarked, "As I recall, Greg's biggest condition for initial introduction of this was to do the same kind of 'turn this Kconfig option on and turn an option under it off' trick that LTO uses, so that neither 'make allnoconfig' nor 'make allyesconfig' would require Rust until we've had plenty of time to experiment with it. And that seems entirely reasonable to me too."

However, Linus Torvalds objected to exactly that, saying:

"No, please make it a 'is rust available' automatic config option. The exact same way we already do the compiler versions and check for various availability of compiler flags at config time.

"See init/Kconfig for things like

config LD_IS_LLD
  def_bool $(success,$(LD) -v | head -n 1 | grep -q LLD)

"and the rust support should be similar. Something like

config RUST_IS_AVAILABLE
  def_bool $(success,$(RUST) ..sometest..)

"because I _don't_ want us to be in the situation where any new rust support isn't even build-tested by default.

"Quite the reverse. I'd want the first rust driver (or whatever) to be introduced in such a simple format that failures will be obvious and simple.

"The _worst_ situation to be in is that a (small) group of people start testing their very special situation, and do bad and crazy things because 'nobody else cares, it's hidden'.

"No, thank you."

To which Josh replied, "That sounds even better, and will definitely allow for more testing. We just need to make sure that any kernel CI infrastructure tests that right away, then, so that failures don't get introduced by a patch from someone without a Rust toolchain and not noticed until someone with a Rust toolchain tests it."
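For readers who don't speak Kconfig, the $(success,...) macro in Linus's examples runs a shell command at configuration time and turns its exit status into y or n. The Python sketch below imitates that idea. The success() helper here is a stand-in written for this example (it takes an argument list rather than a shell string), and the probed commands are placeholders, since Linus left the actual Rust test as "..sometest..".

```python
import subprocess
import sys
from typing import Sequence


def success(command: Sequence[str]) -> str:
    """Return 'y' if the command runs and exits with status 0, else 'n'."""
    try:
        result = subprocess.run(
            command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except OSError:
        return "n"  # the tool is not installed at all
    return "y" if result.returncode == 0 else "n"


# Stand-in probes: the Python interpreter plays the role of the tool under test
probe_ok = success([sys.executable, "-c", "raise SystemExit(0)"])
probe_fail = success([sys.executable, "-c", "raise SystemExit(1)"])
# Hypothetical command name, chosen so that it does not exist
probe_missing = success(["no-such-rust-compiler-xyz", "--version"])

print(probe_ok, probe_fail, probe_missing)
```

The appeal of the automatic approach is visible even in this toy: nobody has to flip a switch. If the toolchain is present and passes the probe, the option turns on and the code gets build-tested; if not, it quietly stays off.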

Meanwhile, Adrian Bunk had a concern of a different sort. He noted that Firefox depended on Rust, which meant that Linux distributions tended to always ship with a recent version of Rust. If this meant that the kernel would also be relying on a constantly updating version of Rust, Adrian felt there could be problems. As he put it:

"It would not sound good to me if security updates of distribution kernels might additionally end up using a different version of the Rust compiler – the toolchain for the kernel should be stable.

"Would Rust usage in the kernel require distributions to ship a 'Rust for Firefox' and a 'Rust for the kernel'?"

Josh replied, "Rust has hard stability guarantees when upgrading from one stable version to the next. If code compiles with a given stable version of Rust, it'll compile with a newer stable version of Rust. Given that, a stable distribution will just need a single sufficiently up-to-date Rust that meets the minimum version requirements of both Firefox and Linux."

But this was not good enough for Linus. He agreed with Adrian – this was a problem. Linus said:

"I think the worry is more about actual compiler bugs, not the set of exposed features.

"That's always been the biggest pain point. Compiler bugs are very rare, but they are so incredibly hard to debug when they happen that they end up being extra special.

"Random 'we need this compiler for this feature' is actually fairly rare. Yes, the most recent case of me just saying 'let's use 4.9 rather than 4.8' was due to that, but honestly, that's the exception rather than the rule, and is to occasionally simplify the code (and the test coverage).

"The most common case of compiler version checks are due to 'compiler XYZ is known to mis-compile ABC on target IDK'."

Or as Adrian said with fewer words, "Rust cannot offer a hard stability guarantee that there will never be a code generation regression on any platform."

And David Laight added:

"This reminds me of why I never want to use an online compiler service – never mind how hard companies push them.

"If I need to do a bug-fix build of something that was released 2 (or more) years ago I want to use exactly the same toolchain (warts and all) that was used for the original build.

"If the compiler has changed I need to do a full test – just in case it compiles some 'dodgy' code differently. With the same compiler I only need to test the fix."

Meanwhile, Arnd Bergmann speculated, "While Linux used to build with 12 year old compilers (4.1 until 2018), we now require a 6 year old gcc (4.9) or 1 year old clang/llvm. I don't know whether these will fully converge over time but it seems sensible that the minimum rust frontend version we require for a new kernel release would eventually also fall in that range, requiring a compiler that is no more than a few years old, but not requiring the latest stable release."
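The kind of minimum-version gate Arnd describes boils down to comparing version numbers numerically. The following Python sketch assumes the compiler reports a dotted version somewhere in its banner; parse_version(), new_enough(), and the sample banner strings are illustrative, with the 4.9 minimum taken from Arnd's message.

```python
import re
from typing import Tuple

GCC_MINIMUM = (4, 9)  # the gcc minimum mentioned in Arnd's message


def parse_version(text: str) -> Tuple[int, ...]:
    """Pull the first dotted version number out of a banner string."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def new_enough(banner: str, minimum: Tuple[int, ...]) -> bool:
    # Tuples of integers compare field by field, so 10.2 correctly
    # counts as newer than 4.9 (a plain string comparison would not).
    return parse_version(banner) >= minimum


# Sample banners (made up for the example)
for banner in ("gcc (GCC) 4.8.5", "gcc (GCC) 4.9.4", "gcc (GCC) 10.2.0"):
    print(banner, "->", new_enough(banner, GCC_MINIMUM))
```

A check like this only covers the "set of exposed features" side of the argument. The compiler bugs Linus is worried about need the opposite kind of list, one that names specific versions known to miscompile something.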

However, since Rust is still changing rapidly – and will probably change rapidly specifically in order to support Linux kernel integration – Josh replied, "I expect in the short term that we will likely have a need for features from recent stable releases, especially when those features were added specifically to support the kernel or similar, but the rate at which we need new features will slow over time, and eventually we'll go from 'latest stable' to 'within a year or so'."

But Adrian didn't like this chaos, and he didn't like the kernel being too tightly coupled to something that was changing that rapidly. He said, "If you want to keep a tool that tightly to the kernel, please bundle it with the kernel and build it as part of the kernel build. I would suggest to start with a proper design/specification what the kernel wants to use, so that you are confident that a compiler implementing this will be sufficient for the next 5 years."

He added that, if the kernel clearly specified its needs for a tool like Rust, "it would avoid tying the kernel to one specific compiler implementation. A compiler like mrustc or a hypothetical Rust frontend for gcc could then implement a superset of what the kernel needs."

The discussion continued, with more issues raised. For example, Pavel Machek was concerned that Rust might extend build times and use a lot more RAM. However, these issues were not resolved during the thread.

It's obvious that unless a surprising problem suddenly flies out of left field, Rust support will go into the kernel in the not-too-distant future. Linus's concerns were not dire at all, and basically everyone was either in favor of adding Rust support or else had concerns that could be addressed over time. And the willingness of Josh and his team to adapt the Rust language itself to the kernel's needs probably doesn't hurt either.

The Author

The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.
