Zack's Kernel News

Article from Issue 246/2021

This month in Kernel News: Opening a Random Can of Worms and Out with the Old.

Opening a Random Can of Worms

Torsten Duwe was mad as hell, and he wasn't going to take it anymore! Or at least, he had certain objections to /dev/random, which he felt should be addressed. In particular, one of the main points of random numbers in the Linux kernel is to support system security. Torsten pointed out that "Input entropy amounts are guesstimated in advance, obviously much too conservatively, compiled in and never checked thereafter; the whitening is done using some home-grown hash function derivative and other non-cryptographic, non-standard operations."
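
The "whitening" Torsten refers to is conditioning: compressing raw, biased noise samples into output that looks uniformly random. A minimal sketch of the concept – using timing jitter as a weak, illustrative noise source and SHA-256 as the conditioner; neither is the kernel's actual mechanism:

```python
import hashlib
import time

def sample_jitter(n=256):
    # Collect n raw samples: the low byte of a high-resolution clock.
    # Raw samples like these are biased and correlated -- not uniform.
    return bytes(time.perf_counter_ns() & 0xFF for _ in range(n))

def whiten(raw: bytes) -> bytes:
    # "Whitening"/conditioning: hash the raw pool so the 32 output
    # bytes look uniform even though the input was merely unpredictable.
    return hashlib.sha256(raw).digest()

seed = whiten(sample_jitter())
```

The point of Torsten's complaint is that the kernel historically did this step with home-grown, non-standard constructions rather than an assessed primitive like the hash used above.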

He also remarked, with restraint and decorum, that "meanwhile there's quite a maintenance backlog; minor fixes are pending, medium-sized cleanups are ignored and major patch sets to add the missing features are not even discussed."

Torsten said he was in favor of bringing the Linux kernel up to some sort of standards compliance with regard to random numbers, preferably obtaining official certification from one of the organizations that did that sort of thing. But he said he'd settle for /dev/random simply being a reliable source of entropy, even without any certification.

In posting this message, Torsten did an end run around the official /dev/random maintainer, Theodore Y. Ts'o, sending the email directly to Linus Torvalds and asking for a new maintainer. Apparently Torsten felt that Ted had been AWOL on /dev/random patches and needed to be replaced.

Jason A. Donenfeld volunteered to be the new official /dev/random maintainer, though he also remarked, "I think Ted's reluctance to not accept the recent patches sent to this list is mostly justified, and I have no desire to see us rush into replacing random.c with something suboptimal."

And Ted also said, "I do plan to make time to catch up on reviewing patches this cycle. One thing that would help me is if folks (especially Jason, if you would) could start with a detailed review of Nicolai's patches. His incremental approach is I believe the best one from a review perspective, and certainly his cleanup patches are ones which I would expect are no-brainers."

Jason agreed and the discussion seemed to end there. A couple of weeks later, Marcelo Henrique Cerri asked about the status of the /dev/random patch review process, remarking, "I don't believe Torsten's concerns are simply about *applying* patches but more about these long periods of radio silence. That kills collaboration and disengage[s] people. More than simply reviewing patches I would expect a maintainer to give directions and drive the community. Asking Jason to review Nicolai's patches was a step towards that, but I believe we still could benefit from better communication."

Torsten wholeheartedly agreed, saying, "Exactly. I could live with replies in the style of 'old' Linus like: 'Your code is crap, because it does X and Y'. Then I knew how to proceed. But this extended silence slows things down a lot."

Torsten was also not satisfied with having Jason review the patches. He said, "Jason seems to narrow the proposed changes down to 'FIPS [Federal Information Processing Standard] certification', when it actually is a lot more. I think his motivation suffers because of his personal dislike."

At this point Petr Tesarik of SUSE joined the discussion, saying, "Upfront, let me admit that SUSE has a vested interest in a FIPS-certifiable Linux kernel."

He proceeded to say:

"However, it seems to me that nobody can be happy about keeping the current status quo forever. Even in the hypothetical case that the RNG [Random Number Generator] maintainer rejected the whole idea merely because it makes it possible to achieve NIST compliance, and he detests standards compliance, it would still be better than no decision at all. The silence is paralyzing, as it blocks any changes in upstream, while also making it difficult to maintain an out-of-tree implementation that aims at becoming upstream eventually.

"The only option ATM [at the moment] is a fork (similar to what the Xen folks did with XenLinux many years ago). IOW [in other words] the current situation demotivates contributors from being good citizens. I hope we can find a better solution together."

Jason voiced his opinion of this quite clearly, replying, "just because you have a 'vested interest', or a financial interest, or because you want it does not suddenly make it a good idea. The idea is to have good crypto, not to merely check some boxes for the bean counters."

But Stephan Müller disagreed with Jason, saying, "using non-assessed cryptography? Sounds dangerous to me even though it may be based on some well-known construction."

Stephan went on, "I thought Linux in general and crypto in particular is about allowing [the] user (or the vendor) to decide about the used algorithm. So, let us have a mechanism that gives them this freedom."

Jason pointed out that "assessed" was not the same as FIPS certification, which was what Petr had advocated. Jason accused Stephan of intentionally conflating the idea of rejecting FIPS certification with the idea of rejecting all possible mechanisms for confirming good entropy.

And to clarify his position, Jason added, "new constructions that I'm interested in would be formally verified (like the other crypto work I've done) with review and buy-in from the cryptographic community, both engineering and academic. I have no interest in submitting 'non-assessed' things developed in a vacuum, and I'm displeased with your attempting to make that characterization."

Jason said that regardless of FIPS certification, he wanted rigorous confirmation of correctness in any proposal he made or code he submitted. He added, "The current RNG is admittedly a bit of a mess, but at least it's a design that's evolved. Something that's 'revolutionary', rather than evolutionary, needs considerably more argumentation."

Petr came back into the conversation, trying to defuse some of the tensions by pointing out that by originally admitting SUSE's vested interest, he was "just trying to be honest about our motivations." He added, "I'm a bit sad that this discussion has quickly gone back to the choice of algorithms and how they can be implemented. The real issue is that the RNG subsystem has not developed as fast as it could. This had not been much of an issue as long as nobody was really interested in making any substantial changes to that code, but it is more apparent now. Torsten believes it can be partly because of a maintainer who is too busy with other tasks, and he suggested we try to improve the situation by giving the RNG-related tasks to someone else. I have not seen a clear answer to this suggestion, except Jason offering his helping hand with Nicolai's cleanup patches, but nothing wrt [with respect to] Stephan's patches. So, what is the plan?"

Jason picked up on Petr's statement that he was sad the discussion had become about algorithms. Jason replied that this was directly relevant to the whole conversation, saying "why are you sad? You are interested in FIPS. FIPS indicates a certain set of algorithms. The ones most suitable to the task seem like they'd run into real practical problems in the kernel's RNG. That's not the _only_ reason I'm not keen on FIPS, but it does seem like a very basic one."

In a subsequent email replying to himself, Jason went on, "in working through Nicholai's patches (an ongoing process), I'm reminded of his admonishment in the 00 cover letter that at some point chacha20 will have to be replaced, due to FIPS. So it seems like that's very much on the table." And he further clarified, "If you want to make lots of changes for cryptographic or technical reasons, that seems like a decent way to engage. But if the motivation for each of these is the bean counting, then again, I'm pretty wary of churn for nothing. And if that bean counting will eventually lead us into bad corners, like the concerns I brought up about FPU usage in the kernel, then I'm even more hesitant. However, I think there may be good arguments to be made that some of Nicholai's patches stand on their own, without the FIPS motivation. And that's the set of arguments that are compelling."

At this point Pavel Machek joined the discussion, responding to the whole premise of the conversation – was it even necessary to change /dev/random at all? As he put it, "does RNG subsystem need to evolve? Its task is to get random numbers. Does it fail at the task?" And on the side of leaving it as is, he remarked that the "problem is, random subsystem is hard to verify, and big rewrite is likely to cause security problems."

Sandy Harris responded to that point in particular:

"Parts of the problem, though, are dead easy in many of today's environments.

"Many CPUs, e.g. Intel, have an instruction that gives random numbers. Some systems have another hardware RNG. Some can add one using a USB device or Denker's Turbid. Many Linux instances run on VMs so they have an emulated HWRNG using the host's /dev/random.

"None of those is necessarily 100% trustworthy, though the published analysis for Turbid & for (one version of) the Intel device seem adequate to me. However, if you use any of them to scribble over the entire 4k-bit input pool and/or a 512-bit Salsa context during initialisation, then it seems almost certain you'll get enough entropy to block attacks.

"They are all dirt cheap so doing that, and using them again later for incremental squirts of randomness, looks reasonable.

"In many cases you could go further. Consider a system with an intel CPU and another HWRNG, perhaps a VM. Get 128 bits from each source & combine them using the 128-bit finite field multiplication from the GSM authentication. Still cheap & it cannot be worse than the better of the two sources. If both sources are anywhere near reasonable, this should produce 128 bits of very high grade random material, cheaply.

"I am not suggesting any of these should be used for output, but using them for initialisation whenever possible looks obvious to me."
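
Sandy's combining trick can be sketched concretely. The following is an illustrative sketch, not code from the thread: it multiplies two 128-bit inputs in GF(2^128) using the reduction polynomial from GCM authentication (presumably what "GSM" in the quote refers to), without GCM's bit-reflection convention. One caveat the quote glosses over: an all-zero input annihilates the product, which is one reason real combiners usually hash or XOR their inputs.

```python
def clmul(a: int, b: int) -> int:
    # Carry-less (polynomial) multiplication over GF(2).
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def gf128_mul(a: int, b: int) -> int:
    # Multiply in GF(2^128) with GCM's reduction polynomial
    # x^128 + x^7 + x^2 + x + 1 (0x87 encodes the low terms).
    MOD = (1 << 128) | 0x87
    p = clmul(a, b)
    while p.bit_length() > 128:       # reduce: x^(128+k) == 0x87 << k
        p ^= MOD << (p.bit_length() - 129)
    return p

def combine(a16: bytes, b16: bytes) -> bytes:
    # Combine 128 bits from two independent RNG sources.
    # Caveat: an all-zero input annihilates the product, which is
    # why practical combiners tend to hash or XOR instead.
    x = int.from_bytes(a16, "big")
    y = int.from_bytes(b16, "big")
    return gf128_mul(x, y).to_bytes(16, "big")
```

The claimed property is that the product is no easier to predict than the better of the two inputs, provided the sources are independent and neither is zero.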

At this point the conversation came to an abrupt halt, at least on the Linux Kernel Mailing List. However, a month later, Stephan posted a new patch (or rather, version 38 of his ongoing work), saying, "The following patch set provides a different approach to /dev/random which is called Linux Random Number Generator (LRNG) to collect entropy within the Linux kernel. It provides the same API and ABI and can be used as a drop-in replacement."
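
"Drop-in replacement" has a concrete meaning here: user space obtains randomness through a stable contract – the /dev/random and /dev/urandom device nodes and the getrandom(2) system call – and a replacement generator must keep existing consumers working unchanged. A hypothetical consumer, for illustration (Python's os.getrandom() and os.urandom() wrap the same kernel facility):

```python
import os

def get_random_bytes(n: int) -> bytes:
    # A consumer of the kernel RNG's stable user-space interface.
    # Whatever sits behind it -- random.c or a replacement such as
    # LRNG -- this code must behave identically.
    try:
        return os.getrandom(n)    # getrandom(2), Linux 3.17+
    except (AttributeError, OSError):
        return os.urandom(n)      # portable wrapper over the same facility

session_key = get_random_bytes(32)
```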

He listed some technical advantages over the existing /dev/random implementation, including a significant speed increase and a variety of other benefits. But there was no response. No further discussion at all.

It's unclear what that means, especially since one of Torsten's original complaints was that patches from Stephan were not being reviewed.

It's not surprising that /dev/random is such a controversial subject, one that inspires such heated discussion. A large portion of Linux security rests on that particular feature. And as Pavel pointed out, it will always be difficult to confirm that changes to /dev/random produce entropy that is actually useful, rather than easily predictable by hostile actors.

Ultimately, if /dev/random can't be changed because it's too important, that in itself will be considered a bug by Linus Torvalds and other top dogs, and someone will eventually have a brain wave and figure out how to split /dev/random up into pieces that are more easily maintained. But until then, debates and disputes like the above will be a necessary outgrowth of the problem.

Out with the Old

An interesting aspect of Linux kernel development is how much effort is put into removing existing features. In most open source projects, features are generally added and never removed. In Linux, removing features has become practically standardized – specifically, removing support for old hardware that no one uses anymore.

Linus Torvalds has been fairly consistent over the years – if there is even a single user of a given hardware platform, he won't remove support for that platform. But if there are no users, he has no sentimental attachment. Linux was developed on the i386 platform, and Linus said at the time that it would probably never run on anything else. In 2012, support for i386 was ripped off like an old bandage, with never a backward glance.

Why? There were no i386 systems left in the world. Or at least none that were running Linux. Or at least, at the critical moment no one stepped forward to say they still needed Linux support for their i386 systems. If they had, we'd have i386 support to this very day.

Removing support for dead hardware platforms has clear benefits. It simplifies the whole kernel. It gets rid of special cases that need to be maintained. It reduces the size of the source tree as well as the compiled binary. It makes it easier for developers to maintain the tree. It makes it easier for newcomers to join in.

Sometimes the decision is easy – someone discovers that support for a given architecture has been broken for a couple years. Nobody squawked, so almost certainly nobody's using that architecture. Easy peasy.

Recently, Sam Ravnborg proposed "sunsetting" the sun4m and sun4d versions of the SPARCstation architecture from Sun Microsystems. Popular in the '90s, these versions were "then replaced by the more powerful sparc64 class of machines," he said.

Sam had done his due diligence, and he added, "Cobham Gaisler have variants of the LEON processer that runs sparc32 – and they are in production today." He proposed that Linux focus its sparc32 support on the LEON machines, rather than the whole set of sun4m and sun4d machines. He also pointed out that the QEMU emulator did support emulating these platforms, which meant that dropping sun4m might mean losing some testing possibilities.

Sam said the Gaisler folks were interested in putting in development work to support their LEON machines, and he pointed out that "this will only be easier with a kernel where the legacy stuff is dropped."

He asked if there were any objections to sunsetting sun4m and sun4d.

Arnd Bergmann replied, "Thank you for doing this, it looks like a very nice cleanup." Though he also acknowledged, "I have no insight on whether there are any users left that would miss it, but I'm fairly sure that there are lots of people that would rather see it gone."

Kjetil Oftedal said he was sad to see these architectures go, but acknowledged, "I guess I haven't had any time to put into the sparc32 port for many years, so I guess it is time to let go."

However, Kjetil did suggest that Gaisler make some Sparc32 machines available for kernel developers to help maintain LEON support.

Sam replied to his own initial email, saying he'd gotten a few private messages from concerned citizens. One person had said (paraphrased by Sam) that "it was better to sunset now when it is actually working, so there is a working state to return to."

A second message, Sam said, argued for continuing to support these architectures, because, in fact, there were still a lot of those machines in existence. Maybe someone would want to use them. This message also pointed out that the NetBSD OS still supported those systems.

Sam invited more people to speak out, for or against dropping sun4m and sun4d.

John Paul Adrian Glaubitz replied, "I would personally be in favor of keeping it and I should finally get my SPARCstation 5 up and running again."

Romain Dolbeau also said, "If there's still a distribution willing to build for Sparc v8, then I believe the kernel should try to keep support of the relevant machine architectures if at all possible."

Julian Calaby said that he had two SPARCstation 10s and a SPARCstation LX. He summarized his situation, saying:

"If I want to run them, assuming the hardware still works, I need to netboot them as I cannot find working, compatible HDDs for them as everything has switched to SATA or SAS.

"Then there's the issue of finding a monitor as they're not electrically compatible with VGA and I'm pretty sure none of the VGA compatible monitors I have or can lay hands on works with their specific sync frequencies.

"Ultimately it's one of those things where there's enough 'stuff' in the way that booting one up for fun is simply impractical and they're old and slow enough that they're not useful for anything else."

Julian continued:

"The last (official) version of Debian to support Sparc32 was Etch and I believe it was one of the last ones to drop support.

"I believe that Gentoo is architecture-neutral enough that it'd work, but I believe that you'll have to compile everything – there'll be no pre-built anything for sparc32 – and as it's fairly slow hardware by today's standards, that's going to take a long time, however you could probably use distcc and cross-compilers to speed it up.

"Long painful story short, it's difficult to get the hardware running, there's practically no Linux distros that support it, and the kernel code has probably bitrotted due to lack of testing.

"As much as it pains me to say this, I think this code's time has come and it's time to get rid of it.

"If there were more people using it or more testing, or more distros supporting it – not just (theoretically?) working on it – then I'd be fighting to keep it.

"But there isn't.

"I think it's time for it to go."

The discussion ended inconclusively – which generally means the architecture is most likely to stick around for a while, so more people have a chance to raise objections. The most interesting aspect of this particular debate, for me, is that there are actually plenty of these machines floating around. It's conceivable that the very effort to abandon them will inspire someone to build a giant cluster of these machines just to prove it's worth keeping support in the kernel.

The Author

The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.
