Optimization
maddog's Doghouse
Understanding how optimization works is just as crucial to fast code as a good compiler.
I have written about the importance of knowing how things work "from the bottom up." Usually, this includes understanding (at least at an overall level) how the hardware works and how compilers and operating systems accomplish their tasks.
Recently a university student I know was taking a course on embedded systems, and they were writing a device handler in the C language. The student had defined a global variable to hold a value that would be filled in by the hardware itself, and not initialized by any other statement. Their professor told them that they had to declare the variable "volatile," or else the compiler might "optimize" the variable outside of the scope of the module. From the code the compiler could "see," there was no way that the variable's value would change, so normal optimizations might assume that the value in the variable was the same as the first time it was set, and that registers loaded with the contents of that variable would still be valid.
Wrong.
One main way that this variable might change is if the global variable was used in an interrupt, and the hardware itself (not the programmer's code) had asynchronously changed the value inside the global variable. The second way would be if the code was meant to be multithreaded and another thread had changed the value. This would not be seen by the compiler, because in that thread the value of the thread was untouched.
In both cases, the designation as "volatile" tells the compiler not to do various optimizations to that variable and to always take the value from the variable itself even if "it has not changed."
Multithreaded programming is more common than it was years ago; there are many fewer programmers who understand (and realize the coding issues) of interrupt handling. Without the word "volatile" on that variable, the compiler might silently move the statement using the variable's value out of the scope of a loop or even a module, with the compiler assuming that it was never updated. Most programmers would look at the source code for many weeks without understanding what was happening.
Another good "optimization" story comes from Digital Equipment Corporation's (DEC) early Unix products, and particularly the days of the X Window System.
In the years 1986 to 1988, the GNU compilers were still evolving. While many people used GNU because it was readily available and had the same language syntax and semantics across many hardware architectures and operating systems, the code GNU generated was not the most optimized in the world. As an example, a commercial compiler could generate code up to 30 percent faster in execution than the code GCC would generate. However, as people saw that CPUs were getting faster and faster with costs of good programmers continuing to climb, people often just "got a faster CPU" to make up for the lack of optimization performance.
Asking the compiler to generate lots of optimizations for any architecture or operating system also made the compilation go slower and introduced "bugs" as the optimizations changed the way the code worked.
In any case, the VAX C compiler engineers decided to change their compiler from accepting just ANSI C to accepting ANSI C and GNU C, in order to recompile not only the X Window System code but perhaps even the Unix kernel itself (based upon BSD Unix v4.1 and heavily modified by DEC engineers).
It took many months and much work to get the VAX C complier to understand the nuances of the GNU C suite, but finally the VAX C people finished the project and decided to recompile the X Window Server to see how much performance improvement they could achieve.
It was close to zero improvement.
Curious about why the "optimizing compiler" did not get better performance, they looked at the source code for the X Window System Server. They realized that it had been written by very, very expert coders who anticipated what the GNU compiler was going to do and wrote their code to do "hand optimizations" to make their code very, very efficient even if the compiler was not doing that.
Granted, other code in the X Window System had not been written by such skilled programmers, and that code did improve with the optimizing compiler, but pieces of critical code typically did not see much improvement.
Times have changed. Computer architecture has become more complex with multiple levels of cache, multicore CPUs, and other issues that make it really difficult for even the most experienced programmer to do this type of "optimization" in their source code. However combining a knowledgeable programmer with a good optimizing compiler (and the GNU compilers have improved dramatically) can make a great deal of difference in the speed of your code.
Buy Linux Magazine
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters
Support Our Work
Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.
News
-
Gnome 47.1 Released with a Few Fixes
The latest release of the Gnome desktop is all about fixing a few nagging issues and not about bringing new features into the mix.
-
System76 Unveils an Ampere-Powered Thelio Desktop
If you're looking for a new desktop system for developing autonomous driving and software-defined vehicle solutions. System76 has you covered.
-
VirtualBox 7.1.4 Includes Initial Support for Linux kernel 6.12
The latest version of VirtualBox has arrived and it not only adds initial support for kernel 6.12 but another feature that will make using the virtual machine tool much easier.
-
New Slimbook EVO with Raw AMD Ryzen Power
If you're looking for serious power in a 14" ultrabook that is powered by Linux, Slimbook has just the thing for you.
-
The Gnome Foundation Struggling to Stay Afloat
The foundation behind the Gnome desktop environment is having to go through some serious belt-tightening due to continued financial problems.
-
Thousands of Linux Servers Infected with Stealth Malware Since 2021
Perfctl is capable of remaining undetected, which makes it dangerous and hard to mitigate.
-
Halcyon Creates Anti-Ransomware Protection for Linux
As more Linux systems are targeted by ransomware, Halcyon is stepping up its protection.
-
Valve and Arch Linux Announce Collaboration
Valve and Arch have come together for two projects that will have a serious impact on the Linux distribution.
-
Hacker Successfully Runs Linux on a CPU from the Early ‘70s
From the office of "Look what I can do," Dmitry Grinberg was able to get Linux running on a processor that was created in 1971.
-
OSI and LPI Form Strategic Alliance
With a goal of strengthening Linux and open source communities, this new alliance aims to nurture the growth of more highly skilled professionals.