Optimization

maddog's Doghouse

Article from Issue 201/2017
Author(s):

Understanding how optimization works is just as crucial to fast code as a good compiler.

I have written about the importance of knowing how things work "from the bottom up." Usually, this includes understanding (at least at an overall level) how the hardware works and how compilers and operating systems accomplish their tasks.

Recently a university student I know was taking a course on embedded systems, and they were writing a device handler in the C language. The student had defined a global variable to hold a value that would be filled in by the hardware itself, and not initialized by any other statement. Their professor told them that they had to declare the variable "volatile," or else the compiler might "optimize" the variable outside of the scope of the module. From the code the compiler could "see," there was no way that the variable's value would change, so normal optimizations might assume that the value in the variable was the same as the first time it was set, and that registers loaded with the contents of that variable would still be valid.

Wrong.

One main way that this variable might change is if the global variable was used in an interrupt, and the hardware itself (not the programmer's code) had asynchronously changed the value inside the global variable. The second way would be if the code was meant to be multithreaded and another thread had changed the value. This would not be seen by the compiler, because in that thread the value of the thread was untouched.

In both cases, the designation as "volatile" tells the compiler not to do various optimizations to that variable and to always take the value from the variable itself even if "it has not changed."

Multithreaded programming is more common than it was years ago; there are many fewer programmers who understand (and realize the coding issues) of interrupt handling. Without the word "volatile" on that variable, the compiler might silently move the statement using the variable's value out of the scope of a loop or even a module, with the compiler assuming that it was never updated. Most programmers would look at the source code for many weeks without understanding what was happening.

Another good "optimization" story comes from Digital Equipment Corporation's (DEC) early Unix products, and particularly the days of the X Window System.

In the years 1986 to 1988, the GNU compilers were still evolving. While many people used GNU because it was readily available and had the same language syntax and semantics across many hardware architectures and operating systems, the code GNU generated was not the most optimized in the world. As an example, a commercial compiler could generate code up to 30 percent faster in execution than the code GCC would generate. However, as people saw that CPUs were getting faster and faster with costs of good programmers continuing to climb, people often just "got a faster CPU" to make up for the lack of optimization performance.

Asking the compiler to generate lots of optimizations for any architecture or operating system also made the compilation go slower and introduced "bugs" as the optimizations changed the way the code worked.

In any case, the VAX C compiler engineers decided to change their compiler from accepting just ANSI C to accepting ANSI C and GNU C, in order to recompile not only the X Window System code but perhaps even the Unix kernel itself (based upon BSD Unix v4.1 and heavily modified by DEC engineers).

It took many months and much work to get the VAX C complier to understand the nuances of the GNU C suite, but finally the VAX C people finished the project and decided to recompile the X Window Server to see how much performance improvement they could achieve.

It was close to zero improvement.

Curious about why the "optimizing compiler" did not get better performance, they looked at the source code for the X Window System Server. They realized that it had been written by very, very expert coders who anticipated what the GNU compiler was going to do and wrote their code to do "hand optimizations" to make their code very, very efficient even if the compiler was not doing that.

Granted, other code in the X Window System had not been written by such skilled programmers, and that code did improve with the optimizing compiler, but pieces of critical code typically did not see much improvement.

Times have changed. Computer architecture has become more complex with multiple levels of cache, multicore CPUs, and other issues that make it really difficult for even the most experienced programmer to do this type of "optimization" in their source code. However combining a knowledgeable programmer with a good optimizing compiler (and the GNU compilers have improved dramatically) can make a great deal of difference in the speed of your code.

The Author

Jon "maddog" Hall is an author, educator, computer scientist, and free software pioneer who has been a passionate advocate for Linux since 1994 when he first met Linus Torvalds and facilitated the port of Linux to a 64-bit system. He serves as president of Linux International®.

Buy Linux Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • Doghouse: Coding Contest

    "maddog" and Linaro are collaborating on a contest to improve the performance of certain GNU/Linux source code modules.

  • Compiler Design - Terror in the classroom
  • How Compilers Work

    Compilers translate source code into executable programs and libraries. Inside modern compiler suites, a multistage process analyzes the source code, points out errors, generates intermediate code and tables, rearranges a large amount of data, and adapts the code to the target processor.

  • Meson Build System

    Developers fed up with cryptic Makefiles should take a look at the new Meson build system, which is simple to operate, offers scripting capabilities, integrates external test tools, and supports Linux, Windows, and Mac OS X.

  • GCC 4.1

    The new version of the GNU Compiler (GCC) has a fresh crop of optimizations and support for Objective C. The Recursive Descent Parser introduced in version 4.0 is now used for the C and Objective C derivatives.

comments powered by Disqus

Direct Download

Read full article as PDF:

News