Zack's Kernel News

Zack's Kernel News

Article from Issue 219/2019

This month Zack discusses the KUnit testing framework, removing the Ancient 00-INDEX files, and identifying process termination without polling.

KUnit Testing Framework

A sizable part of kernel development is taken up with testing, debugging, and speed evaluation. Even the revision control system includes the git bisect operation to quickly find which patch introduced a given bug. There are whole bug classes that, if not caught before the developer submits the patch, will get a stern rebuke from Linus Torvalds.

One type of test that has recently come to the Linux kernel is unit testing. Brendan Higgins introduced KUnit for consideration to the Linux Kernel Mailing List. The idea of unit tests is that code can be tested completely independently from the rest of the system. A given subroutine is tested solely for the action it performs. This is in contrast to other forms of testing, where code is so interconnected throughout the system that it can only be tested indirectly, by examining the behavior of a full running kernel. Unlike those more complex tests, unit tests are typically simple and repeatable, test a single thing, and give an immediate result. They are also fast and able to perform many tests in just a few seconds.

The response was a lot of enthusiasm for this work, especially from developers who had been rigging up their own unit tests on a case-by-case basis. Daniel Vetter was one of these, creating unit tests for the Direct Rendering Manager (DRM) driver. As he put it, "Having proper and standardized infrastructure for kernel unit tests sounds terrific. In other words: I want." And Dan Williams had been doing a similar thing for the libnvdimm driver. He said he planned to convert all of his unit tests to use KUnit.

Not everyone was immediately enthusiastic. Tim Bird wanted to make sure that any generalized unit testing system would behave properly under unusual conditions. For example, if a developer were running under one architecture, but testing code intended for another architecture, how would KUnit handle that? Would it compile the code for the target system or the running system? He also asked where unit tests should live. Did they belong in the same directory as the code they were testing, potentially cluttering them up? Or should unit tests all be collected together into a special directory only for unit tests? And Tim was also interested in the possibility that unit tests might be a useful entry point for new developers. If a unit test were easier to write than the code it tested, then perhaps new developers could start off writing unit tests, and then move "up" to writing actual driver and subsystem code.

Brendan addressed some of these concerns. Among other things, he confirmed that cross-compiling unit tests was not currently supported. If someone wanted to run unit tests covering a given architecture, he said, they would have to compile and run the tests on that architecture. The KUnit code would not handle that for them.

Brendan added that KUnit tests would live most naturally in the directories with the code they tested. This way it would be trivial to write driver and subsystem updates, along with all their unit tests; the same people who maintained the main code for a given project would also maintain the unit tests. He added that he thought this would actually reduce a maintainer's workload, because unit tests would result in cleaner, more maintainable code throughout a given project.

However, Brendan said he did not believe that unit tests would be easier to write than driver and subsystem code itself. He said, "the person who writes the test must understand what the code they are testing is supposed to do. To some extent that will probably require someone with some expertise to ensure that the test makes sense."

There was not much controversy over Brendan's KUnit patches. It seems as though most developers consider them an idea whose time has come; they'll soon take their place in the growing pantheon of testing and profiling code that supports the Linux kernel.

Removing the Ancient 00-INDEX Files

Sometimes kernel infrastructure can hang around for a long time after it's no longer needed. Documentation is especially susceptible to this, since out-of-date documentation doesn't actually break anything; it just makes it more difficult and annoying for humans to understand.

Originally, Linux used files named 00-INDEX in a given directory to list that directory's files and give a brief description of their purpose. Nowadays, no one really relies on those files anymore, and kernel documentation has shifted quite far from those early days.

Henrik Austad recently decided to purge the 00-INDEX files from the kernel. By now they are well-known to be very out-of-date. He wrote a script to test how out-of-date they were and found hundreds of instances. In light of that, he posted a patch to remove them from the source tree en masse.

Joe Perches liked this idea, pointing out that the kernel now used reStructuredText (.rst) files to document itself. He felt there was no need whatsoever to hold onto those old 00-INDEX files.

Jonathan Corbet suggested sharing the patch around a little bit to see if there was any pushback, since there were still people who sent him patches to update them; there was actually useful information in them.

In fact, Henrik said he'd be happy to update all the 00-INDEX files if that was the thing to do, but if there was no need, he said, he'd rather just ditch them.

Josh Triplett saw no need to keep any 00-INDEX files. At most, he thought it might be beneficial to copy their useful information into the .rst files currently used for documentation, but even that didn't seem so important to him.

Paul Moore also supported removing the 00-INDEX files. He saw no point in keeping them.

Overall, it does seem as though the 00-INDEX files no longer serve a useful purpose. For a long time, the kernel had no native documentation, and the 00-INDEX files provided an absolute minimum of information – just the names of each source file and a brief description. Since then, kernel documentation has blossomed, and there probably is no need for such primitive methods.

Identifying Process Termination Without Polling

Sometimes no one disputes the value of a new feature, but no one can agree on the right way to implement it. Often there are simply different sensibilities about what should be sacrificed in favor of what and which feature elements are the most important.

Recently Daniel Colascione posted some patches to help the Android OS identify when a given process had terminated. Of course, it wasn't only for Android; it was a general-purpose feature, but his motivation in writing the patch was to solve a particular problem for Android.

Typically processes only care about the exit status of their own children, and there are features already in the kernel for that. It's less usual for any process to care about the status of some random process elsewhere on the system. But as Daniel pointed out, Android ran a process called lmkd, which would kill processes that threatened to use too much RAM. For this, it needed to give the kill command and then check to see that the process had exited properly. Currently, lmkd would just keep checking over and over and over, polling on the process until the process ID disappeared. Daniel wanted to put a more efficient system in place. He wanted lmkd to block on a read request and simply wait until that request returned. When it did, that would mean the process had terminated. Much simpler and less resource-intensive.

This was where controversy erupted.

Joel Fernandes felt that polling on the process ID was plenty good enough. The reason was that Daniel's idea involved creating new files under the /proc directory, which would increase the overall infrastructure of the kernel. Avoiding that seemed like an important thing to Joel.

Joel's alternative, however, involved tracking the target process with ptrace(), which Daniel felt would be odd, given that ptrace() was supposed to be more of a debugging tool and not really a general-purpose tool for a running system. Using it in this case, Daniel said, would even interfere with debuggers and core dumps.

In general, all the alternatives proposed instead of Daniel's patch were more complex and heavyweight than Daniel's code. But the developers advocating those alternatives liked them, because they didn't involve creating new files in the /proc directory. Daniel, meanwhile, felt very strongly that, "given that we can, cheaply, provide a clean and consistent API to userspace, why would we instead want to inflict some exotic and hard-to-use interface on userspace instead?"

There was not really any resolution to the controversy during the conversation. Sometimes, it's not clear what the most important issues really are in a given situation. We clearly don't want the /proc directory to grow without bound, but does Daniel's feature really represent that kind of bloat? We clearly would prefer to use existing mechanisms to solve problems rather than create new infrastructure for them, but do the alternatives to Daniel's code really represent the simplest solution? In all likelihood, the debate will continue, drawing in more prominent developers, until a clear technical requirement is identified, showing that one approach is truly better than the other.

The Author

The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy Linux Magazine

Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

comments powered by Disqus

Direct Download

Read full article as PDF:

Price $2.95