Improving performance of Linux on ARM
Assembly Line

"maddog" looks at some of Linaro's efforts to improve GNU/Linux performance on ARM architectures.
For the past several months, I have been working with Linaro [1], an association of companies who want to see GNU/Linux working well on ARM architectures. Although ARM Holdings designs the ARM architecture chips, various other companies manufacture the CPUs, GPUs, and SoCs (Systems on a Chip) from ARM's licensed designs. Some of these companies use these manufactured units in their own products, and some sell the manufactured units to other companies and to the general public. For the past couple of years, ARM has been working on a 64-bit chip, and their licensees are getting close to having ARM 64-bit hardware ready.
One of the ARM engineers determined that 1,400 different source code modules in either Ubuntu or Fedora (or both) have assembly language in the code. This is not to say that the assembly language (or lack of it) will stop the module from working on an ARM64 system, because there may be higher level fallback code (e.g., code written in C) that will be compiled in place of the missing ARM64 assembly language. However, the modules have not been tested and verified either on actual hardware or on the emulators that currently exist for the ARM64 architecture. Thus, Linaro decided to enlist the community in porting some of these modules and has created a contest with prizes for those people who help out [2].
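To make the fallback idea concrete, here is a minimal sketch of how a module might wrap architecture-specific assembly language around a portable C version. The function names swap32 and swap32_portable are hypothetical and not taken from any of the modules in question, and the sketch assumes a compiler that supports GCC-style inline assembly. A module that carried only the x86 branch would silently drop to the C code when built for ARM64, which is exactly the path that still needs testing.

```c
#include <stdint.h>

/* Portable fallback: plain C that compiles on any architecture. */
uint32_t swap32_portable(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Reverse the byte order of a 32-bit value, using one machine
 * instruction where a hand-written path exists (hypothetical example). */
uint32_t swap32(uint32_t x)
{
#if defined(__GNUC__) && defined(__aarch64__)
    /* ARM64 path: REV reverses the bytes of a 32-bit W register. */
    uint32_t out;
    __asm__("rev %w0, %w1" : "=r"(out) : "r"(x));
    return out;
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* x86 path: BSWAP reverses the bytes of the register in place. */
    __asm__("bswap %0" : "+r"(x));
    return x;
#else
    /* No hand-written assembly for this target: use the C version. */
    return swap32_portable(x);
#endif
}
```

With a modern optimizing compiler, the portable version is often turned into the same single instruction, which is one reason old assembly language of this kind may no longer be worth keeping.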
The engineers also noticed that a lot of the code containing assembly language was fairly old. It was written in an age when systems had a single, much slower, single-core CPU; memory was measured in megabytes, not gigabytes; Ethernet was 10Mbps, not 1,000Mbps; and the GNU compilers were not as good at optimization as commercial compilers. Therefore, people wrote assembly language for the tightest, fastest parts of the system.
If those programs were written today, however, they might have a lot less assembly language, and the code would be more portable. Thus, the contest was expanded to include improving the performance of these modules and (perhaps) eliminating some of the old assembly language where it made sense.
Embedded systems exemplify how our perspective of "performance" has changed over time, in that the size of the memory footprint is often a measure of performance, with a small footprint providing savings in the manufacturing process. Extended battery life, achieved by allowing parts of the system to be turned off after the application is finished, also represents an improvement in performance. In large server farms, performance is often measured in electricity savings, savings on cooling, or in reduction of equipment purchases and floor space.
In the early years of my programming career, my job was not to write new functionality but to get other people's programs to work "better." My manager told me not to bother with an application unless I could get it to run in half the time. In almost every case, I could make an application run not only in half the time, but often five to 10 times faster. It was a very satisfying job, so it has been interesting to investigate new techniques for profiling code and finding bottlenecks, and to see what performance improvements and efficiencies have become possible since I did this work 30 years ago.
At the same time, I am working with some very small systems that have some really interesting features. GPUs for computation, digital signal processing chips, and field-programmable gate arrays (FPGAs) all existed as concepts years ago, but they were prohibitive in cost and space. They have now become not only feasible but even competitive in price/performance with other, more "mainstream" types of circuitry.
A board from a company called Adapteva has not only an SoC with a two-core ARM processor, FPGA, and digital signal processing chips, but also a 16- or 64-core CPU. All of this, plus some system memory and USB ports, comes on a board in the US$ 100-150 price range [3]. Learning about these architectures has now become practical.
Recently some people attracted a lot of attention by building a "supercomputer" out of Raspberry Pi boards, single-core systems that do not invite the type of programming that might occur in a real HPC system. In an HPC system, each board can have several CPUs or several cores in a single CPU and use OpenMP in conjunction with MPI and other heterogeneous computing environments. Substituting computers such as the Banana Pi [4] or ODROID-U3 [5] would create a higher performing "supercomputer" at a reasonable increase in price and would afford a more realistic mix of programming styles.
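As a small illustration of the shared-memory half of that mix, the following sketch uses an OpenMP reduction to spread a summation across the cores of a single board. The function name parallel_sum is hypothetical, and the MPI side (passing data between boards) is left out entirely. The sketch assumes a compiler with OpenMP support enabled, for example through GCC's -fopenmp option; without it, the pragma is ignored and the loop simply runs on one core.

```c
#include <stddef.h>

/* Sum n doubles, dividing the loop across the available cores.
 * Each thread accumulates a private partial sum, and OpenMP
 * combines the partial sums when the loop finishes. */
double parallel_sum(const double *v, size_t n)
{
    double sum = 0.0;
    long count = (long)n;   /* signed loop bound for the OpenMP loop */
    long i;

    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < count; i++) {
        sum += v[i];
    }
    return sum;
}
```

On a single-core board, this code gains nothing, which is the point: a multicore board lets the same source exercise the threading within a node as well as the message passing between nodes.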
I encourage readers to sign up for Linaro's contest and help GNU/Linux be the best that it can be.
The author
Jon "maddog" Hall is an author, educator, computer scientist, and free software pioneer who has been a passionate advocate for Linux since 1994 when he first met Linus Torvalds and facilitated the port of Linux to a 64-bit system. He serves as president of Linux International®.
Infos
- [1] Linaro: http://linaro.org
- [2] Linaro Performance Challenge: http://performance.linaro.org
- [3] Parallella: http://www.adapteva.com/parallella/
- [4] Banana Pi: http://www.banana-pi.org
- [5] ODROID-U3: http://hardkernel.com/main/products/prdt_info.php