Optimization
maddog's Doghouse
Understanding how optimization works is just as crucial to fast code as a good compiler.
I have written about the importance of knowing how things work "from the bottom up." Usually, this includes understanding (at least at an overall level) how the hardware works and how compilers and operating systems accomplish their tasks.
Recently a university student I know was taking a course on embedded systems, and they were writing a device handler in the C language. The student had defined a global variable to hold a value that would be filled in by the hardware itself, and not initialized by any other statement. Their professor told them that they had to declare the variable "volatile," or else the compiler might "optimize" the variable outside of the scope of the module. From the code the compiler could "see," there was no way that the variable's value would change, so normal optimizations might assume that the value in the variable was the same as the first time it was set, and that registers loaded with the contents of that variable would still be valid.
Wrong.
One main way that this variable might change is if the global variable was used in an interrupt, and the hardware itself (not the programmer's code) had asynchronously changed the value inside the global variable. The second way would be if the code was meant to be multithreaded and another thread had changed the value. This would not be seen by the compiler, because in that thread the value of the thread was untouched.
In both cases, the designation as "volatile" tells the compiler not to do various optimizations to that variable and to always take the value from the variable itself even if "it has not changed."
Multithreaded programming is more common than it was years ago; there are many fewer programmers who understand (and realize the coding issues) of interrupt handling. Without the word "volatile" on that variable, the compiler might silently move the statement using the variable's value out of the scope of a loop or even a module, with the compiler assuming that it was never updated. Most programmers would look at the source code for many weeks without understanding what was happening.
Another good "optimization" story comes from Digital Equipment Corporation's (DEC) early Unix products, and particularly the days of the X Window System.
In the years 1986 to 1988, the GNU compilers were still evolving. While many people used GNU because it was readily available and had the same language syntax and semantics across many hardware architectures and operating systems, the code GNU generated was not the most optimized in the world. As an example, a commercial compiler could generate code up to 30 percent faster in execution than the code GCC would generate. However, as people saw that CPUs were getting faster and faster with costs of good programmers continuing to climb, people often just "got a faster CPU" to make up for the lack of optimization performance.
Asking the compiler to generate lots of optimizations for any architecture or operating system also made the compilation go slower and introduced "bugs" as the optimizations changed the way the code worked.
In any case, the VAX C compiler engineers decided to change their compiler from accepting just ANSI C to accepting ANSI C and GNU C, in order to recompile not only the X Window System code but perhaps even the Unix kernel itself (based upon BSD Unix v4.1 and heavily modified by DEC engineers).
It took many months and much work to get the VAX C complier to understand the nuances of the GNU C suite, but finally the VAX C people finished the project and decided to recompile the X Window Server to see how much performance improvement they could achieve.
It was close to zero improvement.
Curious about why the "optimizing compiler" did not get better performance, they looked at the source code for the X Window System Server. They realized that it had been written by very, very expert coders who anticipated what the GNU compiler was going to do and wrote their code to do "hand optimizations" to make their code very, very efficient even if the compiler was not doing that.
Granted, other code in the X Window System had not been written by such skilled programmers, and that code did improve with the optimizing compiler, but pieces of critical code typically did not see much improvement.
Times have changed. Computer architecture has become more complex with multiple levels of cache, multicore CPUs, and other issues that make it really difficult for even the most experienced programmer to do this type of "optimization" in their source code. However combining a knowledgeable programmer with a good optimizing compiler (and the GNU compilers have improved dramatically) can make a great deal of difference in the speed of your code.
Buy Linux Magazine
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters
Support Our Work
Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.
News
-
Gnome 48 Debuts New Audio Player
To date, the audio player found within the Gnome desktop has been meh at best, but with the upcoming release that all changes.
-
Plasma 6.3 Ready for Public Beta Testing
Plasma 6.3 will ship with KDE Gear 24.12.1 and KDE Frameworks 6.10, along with some new and exciting features.
-
Budgie 10.10 Scheduled for Q1 2025 with a Surprising Desktop Update
If Budgie is your desktop environment of choice, 2025 is going to be a great year for you.
-
Firefox 134 Offers Improvements for Linux Version
Fans of Linux and Firefox rejoice, as there's a new version available that includes some handy updates.
-
Serpent OS Arrives with a New Alpha Release
After months of silence, Ikey Doherty has released a new alpha for his Serpent OS.
-
HashiCorp Cofounder Unveils Ghostty, a Linux Terminal App
Ghostty is a new Linux terminal app that's fast, feature-rich, and offers a platform-native GUI while remaining cross-platform.
-
Fedora Asahi Remix 41 Available for Apple Silicon
If you have an Apple Silicon Mac and you're hoping to install Fedora, you're in luck because the latest release supports the M1 and M2 chips.
-
Systemd Fixes Bug While Facing New Challenger in GNU Shepherd
The systemd developers have fixed a really nasty bug amid the release of the new GNU Shepherd init system.
-
AlmaLinux 10.0 Beta Released
The AlmaLinux OS Foundation has announced the availability of AlmaLinux 10.0 Beta ("Purple Lion") for all supported devices with significant changes.
-
Gnome 47.2 Now Available
Gnome 47.2 is now available for general use but don't expect much in the way of newness, as this is all about improvements and bug fixes.