Kernel Issues - The "Hurricane Katrina" of programming
Paw Prints: Writings of the maddog
Last week I spent two days that the Red Hat Summit in Boston. Unlike a lot of conferences I attend, I actually spent much of my time in technical talks listening to some of the things that Red Hat was going to be putting into RHEL 6.0 which is due out in a short time1.
I enjoy listening to technical talks, particularly ones talking about kernel issues since I used to teach operating system design. I taught other types of programming (database, compiler design, networking, graphics) but in my opinion most application-level programming (including libraries) is a “calm sea” versus the “Hurricane Katrina” of kernel programming.
One of the areas of interest to me was the various file systems being supported in the upcoming RHEL, not only the various attributes of the filesystems (that I could also get by reading various white papers and reports on the Internet) but some of the lower-level “grunt work” that needs to be done to make sure the files system is dependable and efficient under different loads.
As an example, in order to get better performance out of some newer filesystems, the device drivers had been changed to utilize various hardware features in iSCSI and SATA. These standard features had been defined in the hardware architecture for some time, but had never been utilized by Linux. Unfortunately some of the earlier implementations of these standards did not implement these features correctly, so the Red Hat engineers had to go back and re-qualify some of the hardware controllers to make sure that they would work as expected with the modified drivers.
Another area of testing is in file system limits. Some of the newer filesystems can expand to 100 TeraBytes or more2. It is one thing to write the code that “theoretically” can handle 100 TeraBytes, but quite another to actually set up 100 TB of disk space and exercise the code to get actual verification that it is working correctly, or to get true measured performance.
As another example, issues of programming the kernel have been accentuated by issues around virtualization. Kernel locks, which have always been a touchy issue with race conditions, and clock skew issues in bare metal operating systems take on a life of their own when running on a virtualized system.
One of my favorite talks of the two-day session was “Kernel Optimizations for KVM” by Rik van Riel. It was Rik's talk that illuminated the issues of clock programming that I alluded to earlier. Rik pointed out that the normal issue of the clock causing an interrupt every 100 milliseconds was not a big deal in a normal Linux kernel running on bare metal, as the CPU has enough time to go to sleep between clock tics to save energy. However, when you have many virtual machines running at one time on a piece of hardware, with a normal Linux “clock” the virtual machines would wake up the host just to have a “clock tick” registered, which would keep the host cpu from being efficient with electricity. The other issues happens if the client system happens to be swapped out when its clock tic should be registered, the client system will miss its clock tic and have an incorrect time. As Rik pointed out, an “incorrect time” on a series of virtual servers might mean that the packet that goes from one system to another may show up either “very late” or “before it was sent”. With current changes in the Linux KVM kernel the guest system can simply ask the host for the time-of-day information, and get the correct “time-of-day” for its activities.
These are the types of issues that make kernel programming “interesting”.
1Tim Burke, VP of Platform Engineering (and an old friend of mine), did a great balance of going over the major features in only an hour's time without talking faster than a tabacco auctioneer.
2- is it just me, or is it that “100 TeraBytes” does not seem that large any more? Perhaps it is because I can buy 2 TB disks at such an inexpensive price....comments powered by Disqus
News site for the openSUSE community falls victim to a Wordpress exploit.
The source code is available online.
One out of three virtual machines on Microsoft Azure Cloud run Linux.
The form factor of the board makes it a drop-in replacement for Raspberry Pi.
Makes it easier for customers to move workloads into container-centric applications.
SUSE’s answer to container-centric operating systems.
Linux 4.9 is the biggest release in terms of number of commits.
The latest version of the official RHEL clone is here.
New release targets Linux professionals.
The Fedora project adds Wayland and Gnome 3.22