The ARM architecture – yesterday, today, and tomorrow
ARM Architecture
Like the x86 architecture, the ARM architecture has been extended over time to meet new requirements. Each version of the architecture defines which of these extensions are mandatory and which are optional. The extensions might not be as varied as with the x86, but the scope is still large (Figure 1), which is why this article focuses on the hitherto most-used architectures: ARMv4 through ARMv7. ARMv8, which you'll learn about later in this article, differs significantly from its predecessors with an extension to 64 bits.
32 Bits for Everything
The ARM architecture was designed from the beginning as a 32-bit architecture, which shows most clearly in the 32-bit processing width and the 32-bit address space (from ARMv3 onward; 26-bit addressing before this). An ARM core can thus address a maximum of 4GB of memory, although most implementations actually use only a part of it. Only the Cortex-A15 works around this limit with a few tricks.
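The address-space figures follow directly from the address width. A minimal sketch of the arithmetic (the helper name `address_space_bytes` is made up for illustration):

```python
GIB = 2**30  # bytes in one gibibyte
MIB = 2**20  # bytes in one mebibyte


def address_space_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits


# 32-bit addressing (ARMv3 onward) versus the earlier 26-bit addressing
print(address_space_bytes(32) // GIB, "GB with 32-bit addresses")
print(address_space_bytes(26) // MIB, "MB with 26-bit addresses")
```

With 26-bit addressing, the reachable memory is therefore 64 times smaller than the 4GB available from ARMv3 onward.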
As with other processor architectures, ARM has several processor modes. Ordinary programs run in user mode, whereas privileged modes such as system mode are reserved for operating system code. Further modes handle exceptions and, starting with ARMv7, hardware-assisted virtualization. One special feature of the ARM architecture is that most modes have their own copies of certain registers, which the hardware switches in automatically on a mode change. An interrupt handler therefore starts with its own stack pointer and link register without first having to save them, which makes interrupt handling on ARM very efficient.
Most implementations also have an MMU (Memory Management Unit) for virtual memory and memory protection, but some have only an MPU (Memory Protection Unit), which provides memory protection alone. Some very simple microcontrollers do without both.
The main difference between x86 and ARM is that ARM is a RISC architecture (see the "RISC" box), whereas the x86 is a member of the CISC (Complex Instruction Set Computer) family. ARM takes the RISC concept to its logical conclusion: It is a load-store architecture with a relatively large number of registers and a very small number of instructions.
In combination with the restricted addressing modes, this means that every instruction can be encoded in exactly 32 bits (i.e., one word) and aligned on a word boundary. The instruction decoder can thus be designed very simply: All it has to do for an instruction is read a word from memory and then decode it.
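To make the point about fixed-width instructions concrete, here is a minimal sketch that cuts a code buffer into instructions without decoding anything first. It assumes a little-endian memory layout; the function names and the three sample instructions are chosen purely for illustration.

```python
import struct


def split_into_instructions(code):
    """Split a little-endian code buffer into 32-bit ARM instruction words."""
    if len(code) % 4 != 0:
        raise ValueError("ARM code must be a whole number of 32-bit words")
    # Every instruction is exactly one word, so the boundaries are known
    # before a single bit has been decoded.
    return [word for (word,) in struct.iter_unpack("<I", code)]


def condition_field(word):
    """Return the top four bits of an instruction word."""
    return word >> 28


# MOV R0, #1 / ADD R0, R0, R1 / BX LR, as they would lie in memory
code = bytes.fromhex("0100a0e3" "010080e0" "1eff2fe1")

for address, word in zip(range(0, len(code), 4), split_into_instructions(code)):
    print(f"{address:#06x}: {word:#010x}  cond={condition_field(word):#x}")
```

The address of instruction number n is simply 4 × n, which is exactly the property a variable-length instruction set lacks.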
With an x86, the overhead is far greater because instructions are between 1 and 15 bytes long, depending on prefixes and instruction set extensions. The processor thus has to work out from the start of each instruction how long it will be, and alignment at word boundaries is not possible.
Current x86 implementations solve this problem by breaking down the complex x86 instructions into simple RISC-like operations (known as micro-operations). An ARM processor does not need this translation step, which reduces both the hardware overhead and energy consumption.
Conditional Instructions
The mode-specific registers push the total number of registers to about 40, but in any given mode only 16 registers (R0 to R15) can be addressed directly (Figure 2). The programmer can use registers R0 to R12 freely, whereas R13 serves as the stack pointer in most cases, R14 is the link register that holds the return address for procedure calls, and R15 is the program counter, which code can read and write directly just like any other register.
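How 16 visible registers map onto a larger physical set can be sketched as a lookup from (mode, register number) to a physical register. The sketch below assumes the classic set of seven modes and leaves out the modes added by later extensions, so its total comes out slightly below the "about 40" mentioned above; the function name is hypothetical.

```python
# Modes of the classic programmer's model (later additions are omitted).
MODES = ["usr", "sys", "fiq", "irq", "svc", "abt", "und"]

# Registers a mode keeps in its own bank; all others are shared with
# user mode. System mode banks nothing and uses the user registers.
BANKED = {
    "fiq": range(8, 15),   # R8 to R14
    "irq": range(13, 15),  # R13 (stack pointer) and R14 (link register)
    "svc": range(13, 15),
    "abt": range(13, 15),
    "und": range(13, 15),
}


def physical_register(mode, number):
    """Name of the physical register behind R<number> in the given mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if not 0 <= number <= 15:
        raise ValueError("register number must be between 0 and 15")
    if number in BANKED.get(mode, ()):
        return f"R{number}_{mode}"
    return f"R{number}"


general = {physical_register(m, n) for m in MODES for n in range(16)}
# One current status register plus one saved copy per exception mode.
status = {"CPSR"} | {f"SPSR_{m}" for m in BANKED}

print(len(general), "general-purpose registers,", len(status), "status registers")
print("R13 in IRQ mode is", physical_register("irq", 13))
```

Because the lookup happens in hardware on a mode change, an exception handler sees its own R13 and R14 immediately, while R0 to R7 are always the same physical registers.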
The instruction set avoids unnecessary redundancy, providing standard instructions for arithmetic and logical operations, memory access, program flow control, exception handling, controlling the various modes, and accessing coprocessors. The instructions themselves are no different from those of other architectures, so in this article, we just highlight a few features.
Unlike most other architectures, in which only branch instructions can execute conditionally, almost every ARM instruction is conditional. To this end, the instruction encoding contains a 4-bit condition field that specifies which combination of the status flags (negative, zero, carry, overflow) must be present for the instruction to execute. This allows for very compact code and avoids jumps (Listing 2).
Listing 2
Conditional Execution
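What follows is not the article's Listing 2 but a minimal Python sketch of the mechanism, written under stated assumptions: It models the flags set by a `CMP` instruction and the way each condition code tests them, then mirrors the well-known four-instruction GCD loop in which two conditional subtractions replace two branches. The function names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(a, b):
    """Flags set by CMP a, b: a 32-bit subtraction whose result is discarded."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "N": result >> 31,                   # result is negative
        "Z": int(result == 0),               # result is zero
        "C": int(a >= b),                    # no borrow occurred
        "V": ((a ^ b) & (a ^ result)) >> 31  # signed overflow
    }


# Condition mnemonics and the flag test each one stands for.
CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,
    "NE": lambda n, z, c, v: z == 0,
    "CS": lambda n, z, c, v: c == 1,
    "CC": lambda n, z, c, v: c == 0,
    "MI": lambda n, z, c, v: n == 1,
    "PL": lambda n, z, c, v: n == 0,
    "VS": lambda n, z, c, v: v == 1,
    "VC": lambda n, z, c, v: v == 0,
    "HI": lambda n, z, c, v: c == 1 and z == 0,
    "LS": lambda n, z, c, v: c == 0 or z == 1,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: z == 0 and n == v,
    "LE": lambda n, z, c, v: z == 1 or n != v,
    "AL": lambda n, z, c, v: True,
}


def condition_passed(cond, flags):
    """Decide whether an instruction carrying this condition would execute."""
    return CONDITIONS[cond](flags["N"], flags["Z"], flags["C"], flags["V"])


def gcd(a, b):
    """Mirror of:  loop: CMP a, b / SUBGT a, a, b / SUBLT b, b, a / BNE loop"""
    if not (0 < a < 2**31 and 0 < b < 2**31):
        raise ValueError("inputs must be positive signed 32-bit values")
    while True:
        flags = cmp_flags(a, b)             # CMP   a, b
        if condition_passed("GT", flags):   # SUBGT a, a, b
            a = (a - b) & MASK32
        if condition_passed("LT", flags):   # SUBLT b, b, a
            b = (b - a) & MASK32
        if not condition_passed("NE", flags):
            return a                        # BNE loop falls through


print(gcd(48, 18))
```

Both subtractions are tested against the flags from the same `CMP`, so at most one of them takes effect per pass, and the loop body contains only a single branch.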
In addition to the load-store instructions that transfer a single word between memory and a register, load-store multiple instructions transfer a series of contiguous words between memory and a set of registers. This means that the processor can load a small array into registers with a single instruction. This approach also lends itself to very efficient use of the stack because several registers can be pushed or popped at once. This is of particular interest when programming interrupt handlers or context switches in an operating system, since the entire register set can be swapped with just two instructions.
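The stack use described above can be sketched with a small model of two load-store multiple variants: store with decrement-before (`STMDB`, the usual push) and load with increment-after (`LDMIA`, the usual pop). This is a simplified illustration, not a full model: Memory is a dictionary of word addresses, the base register is always written back, and the case where the base register itself appears in the register list is ignored.

```python
SP, LR, PC = 13, 14, 15
MASK32 = 0xFFFFFFFF


def stmdb(regs, memory, base, reg_list):
    """STMDB base!, {reg_list}: store several registers below the base address."""
    ordered = sorted(reg_list)  # lowest register goes to the lowest address
    address = (regs[base] - 4 * len(ordered)) & MASK32
    regs[base] = address        # write the new base back
    for r in ordered:
        memory[address] = regs[r]
        address += 4


def ldmia(regs, memory, base, reg_list):
    """LDMIA base!, {reg_list}: load several registers from ascending addresses."""
    address = regs[base]
    for r in sorted(reg_list):
        regs[r] = memory[address]
        address += 4
    regs[base] = address & MASK32  # write the new base back


regs = [0] * 16
memory = {}
regs[SP], regs[4], regs[5], regs[LR] = 0x8000, 0x44, 0x55, 0x1234

stmdb(regs, memory, SP, [4, 5, LR])  # function entry: save R4, R5, and LR
regs[4] = regs[5] = 0                # the function body clobbers R4 and R5
ldmia(regs, memory, SP, [4, 5, PC])  # function exit: restore and return

print(hex(regs[4]), hex(regs[5]), hex(regs[PC]), hex(regs[SP]))
```

Loading the saved link register value straight into the program counter restores the registers and returns to the caller in a single instruction.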