Zack's Kernel News
Zack Brown reports on fixing printk() bit by bit, kernel internationalization (or not), and kernel encryption and secure boot.
Fixing printk() Bit by Bit
The printk() function is an important way for the kernel to produce logs and other messages. The kernel doesn't use any standard library functions like printf(), so it has to roll its own. But by all accounts, printk() is a mess.
Recently, Sergey Senozhatsky tried to spruce it up a little and avoid some potential deadlocks. There was a whole range of deadlocks caused by printk() recursing onto itself, and Sergey didn't want to touch any of those. But he said there were plenty of non-recursive deadlock scenarios that needed to be fixed.
Specifically, there were ways to deadlock the system in the output console, and printk() would trigger those deadlocks by trying to write to the console. To fix some of these, Sergey wanted to introduce some new helper functions for the TTY (used to implement the console) and UART code (used to communicate asynchronously with the console).
Unfortunately, this would require updating every single serial driver to use the new helper functions. Also, it would only address deadlock issues involving particular types of kernel locks. But Sergey figured a partial fix was at least a beginning.
Alan Cox was not thrilled with Sergey's proposal. He felt that the fixes would add unnecessary code to parts of the kernel that needed to be as fast as possible. Alan felt that "printk nowadays is already somewhat unreliable with all the perf related changes," so he favored a simpler approach that would simply defer trying to produce printk() output if the lock wasn't available.
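Alan's suggestion boils down to a try-lock: if the console lock is free, emit the message now; if not, queue it and let a later caller flush it. The following is only a user-space sketch of that idea in Python, not kernel code. The names `console_lock`, `pending`, `console`, and `log` are invented for illustration.

```python
import threading
from collections import deque

console_lock = threading.Lock()  # hypothetical stand-in for the console lock
pending = deque()                # messages deferred while the lock was busy
console = []                     # hypothetical stand-in for the console device


def log(message):
    """Emit the message if the console lock is free, otherwise defer it.

    Returns True if the message reached the console, False if it was queued.
    """
    # Never block on the lock, so this path cannot deadlock on it.
    if console_lock.acquire(blocking=False):
        try:
            # Drain older deferred messages first to preserve ordering.
            while pending:
                console.append(pending.popleft())
            console.append(message)
        finally:
            console_lock.release()
        return True
    pending.append(message)
    return False
```

The trade-off is the one Alan accepted: a deferred message may show up late, or not at all if nothing ever flushes the queue.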
Sergey objected that Alan's idea wouldn't address deadlocks that had already been reported by users.
Meanwhile, Peter Zijlstra said that Alan had drastically understated the crappiness of printk(). He said, "printk is a steaming pile of @#$#@; unreliable doesn't even begin to cover it."
Petr Mladek, who had also been working with Sergey on this code, explained:
"This patch set adds yet another spin_lock API. It behaves exactly as spin_lock_irqsave()/spin_unlock_irqrestore() but in addition it sets printk_context.
Where printk_context defines what printk implementation is safe. We basically have four possibilities:
1. Normal (store in logbuf; try to handle consoles)
2. Deferred (store in logbuf; defer consoles)
3. Safe (store in per-CPU buffer, defer everything)
4. safe_nmi (store in another per-CPU buffer; defer everything)
This patchset forces safe context around TTY and UART locks. In fact, the deferred context would be enough to prevent all the mentioned deadlocks."
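The four contexts Petr lists can be pictured as a dispatch on where a message is stored and whether the console is touched right away. The sketch below is a single-CPU, user-space model in Python, not the kernel's implementation. The class, method, and attribute names are all hypothetical.

```python
from enum import Enum


class Context(Enum):
    NORMAL = 1
    DEFERRED = 2
    SAFE = 3
    SAFE_NMI = 4


class Logger:
    """Toy model of the four contexts; one CPU, no real locking."""

    def __init__(self):
        self.logbuf = []    # main log buffer
        self.safe_buf = []  # stands in for the per-CPU "safe" buffer
        self.nmi_buf = []   # stands in for the per-CPU NMI buffer
        self.console = []   # what has actually been printed
        self.printed = 0    # how much of logbuf the console has seen

    def printk(self, context, msg):
        if context is Context.NORMAL:
            self.logbuf.append(msg)
            self.flush_console()       # try to handle consoles now
        elif context is Context.DEFERRED:
            self.logbuf.append(msg)    # stored, console handled later
        elif context is Context.SAFE:
            self.safe_buf.append(msg)  # defer everything
        else:
            self.nmi_buf.append(msg)   # defer everything, separate buffer

    def flush_console(self):
        self.console.extend(self.logbuf[self.printed:])
        self.printed = len(self.logbuf)

    def drain(self):
        """Later, from a safe point: merge deferred buffers and print."""
        self.logbuf.extend(self.safe_buf)
        self.safe_buf.clear()
        self.logbuf.extend(self.nmi_buf)
        self.nmi_buf.clear()
        self.flush_console()
```

In this model, only the normal context ever touches the console from inside `printk()`, which is why forcing one of the other contexts around the TTY and UART locks would sidestep the deadlocks under discussion.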
But Linus Torvalds had strong objections to the whole idea. He said:
"The rule is simple: DO NOT DO THAT THEN.
Don't make recursive locks. Don't make random complexity. Just stop doing the thing that hurts.
There is no valid reason why an UART driver should do a printk() of any sort inside the critical region where the console is locked.
Just remove those printk's; don't add new crazy locking.
If you had a spinlock that deadlocked because it was inside an already spinlocked region, you'd say 'that's buggy'.
This is the exact same issue. We don't work around buggy garbage. We fix the bug – by removing the problematic printk."
In light of this, Steven Rostedt remarked, "Perhaps we should do an audit of the console drivers and remove all printk, pr_*, WARN*, [and] BUG* from them."
Sergey objected that this wasn't just a case of removing unwanted printk()s from the code. He said, "It's not UART on its own that immediately calls into printk(), that would be trivial to fix; it's all those subsystems that [the] serial console driver can call into."
He added, "For instance, kernel/workqueue.c – it may WARN_ON/printk in various cases. And those WARNs/printks are OK. Except for one thing: workqueue can be called from a serial console driver, which suddenly will turn those WARNs/printks into illegal ones, due to possible deadlocks. And serial consoles can call into WQ. Not directly, but via TTY code."
He added, "IOW, there is this tricky 'we were called from a serial driver' context, which is hard to track, but printk_safe can help us in those cases."
But Linus replied, "We already have the whole PRINTK_SAFE_CONTEXT_MASK model that only adds it to a secondary buffer if you get recursion. Why isn't that triggering? That's the whole point of it. I absolutely do *not* want to see any crazy changes to tty drivers. No, no, no."
And Sergey said this was exactly what his code did – it enhanced printk_safe to handle this new set of circumstances.
The thread ended inconclusively. But the desire to improve printk() is a real one. Absolutely everyone is on board, if they could only figure out the right way to do it.
Kernel Internationalization (or Not)
Sometimes discussions on particular topics seem to come out of nowhere. Recently David Howells posted some sample code for one of the kernel's new library interfaces. Pavel Machek observed that the code would output English language error messages and remarked, "Not sure that is reasonable, as that is going to cause problems with translations." David said that simply outputting error codes wouldn't cut it, because the logs by default would be output via printk(), and therefore needed to be human-readable.
Pavel replied, "Errors should have numbers, and catalog explaining what error means what. That way user space can translate, and it is what we do with errno. I believe numbers are best. If you hate numbers, you can still use strings, as long as you can enumerate them in docs (but it will be strange design). But anything else is not suitable, I'm afraid."
David objected, saying the errors consisted of various components that needed to be calculated separately and used as parameters to produce a unified message. But Pavel insisted that regardless of these requirements, it was essential for user code to be able to parse the errors as well. As a counter-example against plain text messages, he said, "One way is to pass ('not enough pixie dust (%d too short) on device %s in %s', 123, 'foo', 'warddrobe'). But if you pass it as one string, it becomes hard/impossible to parse. (For example if device is named 'foo in bar'.)"
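Pavel's point can be shown with his own format string: once the parameters are folded into a single string, two different errors can become indistinguishable. This is a small Python sketch. The helper names `structured` and `flatten` are invented here, and the second set of parameters is made up to trigger the collision.

```python
FORMAT = "not enough pixie dust (%d too short) on device %s in %s"


def structured(fmt, *args):
    """Keep the format string and its parameters separate."""
    return (fmt, args)


def flatten(error):
    """Collapse a structured error into a single human-readable string."""
    fmt, args = error
    return fmt % args


# Two different errors: the device and location names differ.
a = structured(FORMAT, 123, "foo in bar", "warddrobe")
b = structured(FORMAT, 123, "foo", "bar in warddrobe")

# As structured data they are distinct...
assert a != b
# ...but as flat strings they are byte-for-byte identical, so no parser
# can recover which device was meant.
assert flatten(a) == flatten(b)
```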
At this point in the debate, Linus Torvalds came in with a statement of policy. He seemed to agree partly with David, in that user code did not need to parse the errors, and partly with Pavel, in that errors should still be numerical codes rather than text. Linus said:
"We don't internationalize kernel strings. We never have. Yes, some people tried to do some database of kernel messages for translation purposes, but I absolutely refused to make that part of the development process. It's a pain.
For some GUI project, internationalization might be a big deal, and it might be 'TheRule(tm)'. For the kernel, not so much. We care about the technology, not the language.
So we'll continue to give error numbers for 'an error happened'. And if/when people need more information about just what triggered that error, they are as English-language strings. You can quote them and google them without having to understand them. That's just how things work.
Let's face it, the mount options themselves are already (shortened) English language words. We talk about mtime and create.
There are places where localization is a good idea. The kernel is *not* one of those places."
And Matthew Wilcox pointed out that the gettext tool "uses the English text as a search string and replaces it with the localized string. This is a very common design!"
Pavel affirmed that English language in the kernel was not a problem, but he said that error numbers currently allowed user code to read kernel messages and produce localized language output. He said, "User space does [a] good job translating errors, and it would be good to keep that capability."
Linus pointed out that even gettext was not ideal. And in terms of the readability of error messages, he said, "if you are messing with mount options and things like that, you'd better be able to google the incomprehensible words. Most of them will be incomprehensible even if you're a native speaker."
He added, in a separate email, that kernel errors could generally be highly specialized and auto-generated by the kernel code to match very specific circumstances. At that point he said, "Once the string has been generated, it can now be thousands of different strings, and you can't just look them up from a table any more [...] The string will have various random key names etc. in it."
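The disagreement is easy to reproduce with a toy catalog keyed by the English text, the design Matthew describes. The Python sketch below uses a plain dictionary rather than the real gettext API. The French entries, the option name, and the device name are illustrative assumptions.

```python
# Toy message catalog keyed by the English string (not the gettext API).
CATALOG = {
    "Invalid argument": "Argument invalide",
    "Permission denied": "Permission non accordee",
}


def translate(message):
    """Return the localized string, or the English original on a miss."""
    return CATALOG.get(message, message)


# A fixed string is found in the table.
fixed = "Invalid argument"

# A generated string has key names baked in, so the exact text was never
# in the table to begin with (option and device names are hypothetical).
generated = "Invalid argument: option '%s' on device %s" % ("noatime2", "sda1")
```

Here `translate(fixed)` hits the table, while `translate(generated)` falls through and comes back in English, which is the situation Linus describes once a message has been assembled from parameters.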
For message localization, at least for the kernel, Linus declared:
"I really think the best option is 'Ignore the problem'. The system calls will still continue to report the basic error numbers (EINVAL, etc.), and the extended error strings will be just that: extended error strings. Ignore them if you can't understand them.
That said, people have wanted these kinds of extended error descriptors forever, and the reason we haven't added them is that it generally is more pain than it is necessarily worth."
Theodore Ts'o said he also felt this really wasn't a problem worth dealing with. Even super-complex kernel error messages would be best handled, he said, by pushing them off into user space and letting user code sort it out and translate any of it into whatever languages might be needed.
Pavel essentially agreed with all of the above. But since David's original code was something new, rather than a revamp of existing code, he said, "we have chance to do it right for a minimum price (because the interface is new, we don't need compatibility)."
But Theodore drove his point home, saying, "I think David's proposal of just returning Error: followed by English text is just fine, and doing more than that is overdesign. The advantage of dmesg is that it's well understood by everyone that dmesg is English text meant for experts. The problem once we move away from dmesg, this tends to cause the I18N brigade to start agitating for something more complicated. And if the only choices were some complex I18N horror through a system call, or just leaving the (English) text messages in dmesg, I'd vote for dmesg for sure."
The discussion continued for a bit, with various developers arguing for and against various translation techniques, and explaining the problems with those techniques. Then Linus said, "Really. No translation. No design for translation. It's a nasty, nasty rat-hole, and it's a pain for everybody."
He added, "the fact is, I want simple English interfaces. And people who have issues with that should just not use them. End of story. Use the existing error numbers if you want internationalization, and live with the fact that you only get the very limited error number. It's really that simple."
David replied, "fine by me," and that was the end of the discussion.
The really strange and interesting thing about this discussion is that one of the things corporate Internet companies really focus on these days is accessibility, both for people with disabilities and for speakers of many languages. There's political pressure to do it for sensitivity reasons, and economic pressure to do it as a way to open up more markets. An Internet company really can't reach a certain level of success and growth without migrating its interfaces to a fully internationalized infrastructure.
Meanwhile, the Linux kernel has already taken over the entire world, running absolutely everywhere, relied on by absolutely everyone – even people who think they are Windows users – and the issue of internationalization is barely even a question.
Perhaps this is because open source projects have the luxury of shunting off portions of their activity to other projects that either exist now or will exist as soon as someone recognizes they need them. Corporate projects generally can't rely on getting that kind of help from their competitors.
Kernel Encryption and Secure Boot
A persistent security debate in the Linux kernel world centers around media companies trying to prevent the owner of a given system from having full control over that system. Technically, that does constitute a security issue, since it's about access control. And there is a lot of money at stake because media companies want to offer users all sorts of access to their media products, if those products can be protected against copyright infringement. But the kernel developers refuse to implement or accept such features, because they believe the machine owner has the ultimate right to control their own system.
The debate can become convoluted. Often the media companies don't want to admit that a proposed patch is really intended to take control away from the machine owner, because they know the patch would never be accepted in that case. And just as often, the kernel developers don't want to seem like they are arbitrarily rejecting patches for reasons to which the patch submitter has not admitted. What tends to follow is therefore a strange dance of call-and-response, where the kernel developers try to get the patch submitter to admit the true purpose of their patch, while the patch submitter tries to present the patch as having a general-purpose security value beyond any side effect of taking control away from the machine owner.
Recently Chen Yu from Intel wanted to add a kernel feature to encrypt the running kernel image when the user hibernated the system. This would involve installing a kernel module to generate a key from a user's passphrase, encrypting the kernel image, and decrypting when the user resumed the system.
Pavel Machek pointed out that uswsusp (userspace software suspend) already provided an encryption feature. He asked Chen to explain the specific security attacks his kernel-based encryption system would guard against that would be better than uswsusp's approach.
Chen referenced the patch log, which read, "Generally the advantage is: Users do not have to encrypt the whole swap partition as other tools. After all, ideally kernel memory should be encrypted by the kernel itself." But Pavel was not satisfied and reiterated that Chen's explanation did not address the specific security attacks and defenses that would be better than uswsusp. Pavel also added, "Also note that [Chun-Yi Lee] has patch series which encrypts both in-kernel and uswsusp hibernation methods. His motivation is secure boot. How does this compare to his work?"
Chen replied that it was better for the kernel to encrypt the running system than to have user space do it, because it avoided having to transfer the kernel memory in plain-text from kernel space to user space. It would also save time by not having to copy data between kernel and user space. By staying in the kernel, the user would not have to worry about userspace bugs introducing security holes. And he added that he had been collaborating with Chun-Yi on these patches.
Chun-Yi also spoke up, explaining:
"The pros of my solution is that the signed/encrypted snapshot image can be stored to anywhere. Both in-kernel and user space.
Yu's patch is encrypt the page buffer before sending to block io layer for writing to swap. The main logic is applied to swap.c. It's against the swap solution in-kernel.
The pros of Yu's solution is that it encrypts the compressed image data. So, for the huge system memory case, it has better performance.
Yu's plan is using the sysfs to switch different encrypt/sign solutions. And, we will share encrypt/sign helper and key manager in the above two solutions."
Pavel was unconvinced. He remarked flatly, "Answer to bugs in user space is not to move code from user space to kernel."
Pavel also asked, "So your goal is to make hibernation compatible with kernel lockdown? Do your patches provide sufficient security that hibernation can be enabled with kernel lockdown?" And Oliver Neukum requested clarification, saying, "if the key comes from user space, will that be enough?" And Pavel replied, "Yes, that seems to be one of problems of Yu Chen's patchset."
Pavel explained that he was personally opposed to doing hibernation encryption in the kernel since it could be (and was already) done successfully in user space. But he acknowledged, "We have this weird thing called secure boot [that] some people seem to want. So we may need some crypto in the kernel – but I'd like something that works with uswsusp, too. Plus, it is mandatory that patch explains what security guarantees they want to provide against what kinds of attacks."
In response to Oliver's question about the key coming from user space, Chen replied, "we once tried to generate key in kernel, but people suggest to generate key in user space and provide it to the kernel, which is what ecryptfs do currently, so it seems this should also be safe for encryption in kernel."
Chen also offered a summary of the difference between his patch and Chun-Yi's, saying, "The only difference between Chun-Yi's hibernation encryption solution and our solution is that his strategy encrypts the snapshot from scratch, and ours encrypts each page before them going to block device. The benefit of his solution is that the snapshot can be encrypt[ed] in kernel first thus the uswsusp is allowed to read it to user space even kernel is lock down. And I had a discussion with Chun-Yi that we can use his snapshot solution to make uswsusp happy, and we share the crypto help code and he can also use our user provided key for his signature. From this point of view, our code are actually the same, except that we can help clean up the code and also enhance some encryption process for his solution."
In response to Pavel's post about Secure Boot, Oliver remarked, "maybe we should state clearly that the goal of these patch set[s] is to make Secure Boot and STD coexist. Anything else is a nice side effect, but not the primary justification, right? And we further agree that the model of Secure Boot requires the encryption to be done in kernel space, don't we? Furthermore IMHO the key must also be generated in trusted code, hence in kernel space. Yu Chen, I really cannot see how a symmetrical encryption with a known key can be secure."
And Pavel added, "I don't think generating key in user space is good enough for providing guarantees for secure-boot."
Pavel also continued to ask for specific security dangers, and how any of this code might address it, to which Oliver explained:
"Unsigned code must not take over the privilege level of signed code. Hence:
1. Unsigned code must not [be] allowed to read sensitive parts of signed code's memory space
2. Unsigned code must not be able to alter the memory space of signed code – snapshots that are changed must not be able to be resumed"
But he also asked why key generation in user space would not be secure, to which Pavel replied, "Because then, user space has both key (now) and encrypted image (after reboot), so it can decrypt, modify, re-encrypt…?"
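Pavel's objection can be illustrated with a symmetric integrity tag: whoever holds the key can produce a valid tag for any content, so a key that user space has seen offers no protection against that same user space. The Python sketch below leaves encryption out entirely and uses an HMAC as a stand-in for the protection on the snapshot. The function names and the sample data are hypothetical, and none of this reflects how the patches themselves work.

```python
import hashlib
import hmac


def seal(key, snapshot):
    """Attach an integrity tag to a snapshot using a symmetric key."""
    tag = hmac.new(key, snapshot, hashlib.sha256).digest()
    return snapshot, tag


def verify(key, snapshot, tag):
    """Check that the snapshot still matches its tag."""
    expected = hmac.new(key, snapshot, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


key = b"k" * 32  # generated in user space, so user space knows it

# The original snapshot verifies, and blind tampering is caught.
snapshot, tag = seal(key, b"kernel memory")
assert verify(key, snapshot, tag)
assert not verify(key, b"patched memory", tag)

# But a user who holds the key can simply re-seal the modified image,
# and the resume path cannot tell the difference.
forged, forged_tag = seal(key, b"patched memory")
assert verify(key, forged, forged_tag)
```

This is why Oliver's second requirement, that altered snapshots must not be resumable, pushes key generation into code that user space cannot observe.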
The discussion continued, with new versions of the patches coming out and further feedback. At one point Yu Chen said, "I'm still a little confused about the 'resume' phase. Taking encryption as example (not signature), the purpose of doing hibernation encryption is to prevent other users from stealing RAM content. Say, user A uses a passphrase to generate the key and encrypted the hibernation snapshot and stores it on the disk. Then if user B wants to do a hibernation resume to A's previous environment, B has to provide the same passphrase. If I understand correctly, the secret key is saved in header and stored on the disk. Which means, any one can read the header from the disk to get the secret key in trampoline thus decrypt the image, which is not safe."
But Pavel replied, laying his cards on the table:
"No, I don't think that's purpose here.
Purpose here is to prevent user from reading/modifying kernel memory content on machine he owns.
Strange as it may sound, that is what 'secure' boot requires (and what Disney wants)."
And Yu Chen, laying his cards down too, said, "Ok, I understand this requirement, and I'm also concerning how to distinguish different users from seeing data of each other."
At one point Oliver said, "While the system is running and the fs is mounted, your data is as secure as root access to your machine, right? You encrypt a disk primarily so data cannot be recovered (and altered) while the system is not running. Secure Boot does not trust root fully. There is a cryptographic chain of trust and user space is not part of it."
Ultimately there was no resolution to the discussion. The debate is so odd! Every time Pavel asked what security weaknesses the patch addressed, he seemed really to be inviting the patch developers to admit that instead of addressing security concerns relevant to the user, they were trying to keep the user locked out of their own system. And for that same reason, the developers seemed to avoid actually listing the security weaknesses Pavel wanted.
At the same time, as Pavel said, Secure Boot is a reality. And as more and more patches arrive to support it, Disney and others do seem to be gradually making inroads towards eventually locking the user out of their own system.