Facebook releases its own OOM implementation
Contract Killer

Lead Image © efks, 123RF.com
When a Linux system runs out of memory, a special agent, the out-of-memory killer, rushes to its aid. Facebook has now introduced its own OOM killer. What makes it different from its kernel-based counterpart? And what is an OOM killer really?
If you have not placed an order for a large server for a long time, you will probably rub your eyes in amazement the next time you order a new device: Configurations with terabytes instead of gigabytes of RAM are easy to get, and you don't need to be a millionaire to buy them. Gone are the days when people were proud of every single gigabyte (Figure 1).
Some buyers don't even worry about RAM anymore and just assume the system will have enough; however, this might be a little too optimistic, even on a modern system. Servers still sometimes come up short on RAM, and when they do, it can have dramatic consequences: If a component such as systemd needs RAM and cannot allocate it, the system will malfunction or stop working. To keep a RAM shortage from bringing computers to their knees, the Linux kernel has a watchdog on board: the out-of-memory killer, or OOM killer for short. In an emergency, the OOM killer frees up memory by shooting down processes in a targeted way; the memory is then available for other, presumably more important purposes.
Many legends and horror stories are centered on the OOM killer, and the admin's sense of humor is typically strained when they see kernel messages in the log saying that the killer has struck again (Figure 2). The reason for the anxiety is that the OOM killer tends to pick large applications, such as Java processes, as its victims.

Java is not famed for being very sparing with resources, but it is usually necessary for running the application for which the server exists. If the OOM killer shoots down Java on a Tomcat system, a load balancer usually catches the problem, but the server taken out in this way is still gone at the end of the day.
This article introduces the current OOM implementation in Linux and explains how it works. I will then compare this standard implementation with an alternative approach chosen by Facebook.
How OOM Situations Occur
Even servers with huge amounts of RAM can get into situations where the available system RAM is not sufficient. This is because the Linux kernel uses certain ways and means to allocate memory as efficiently as possible. If you have ever called top and looked at the RAM statistics, you will be aware that even on systems with a large amount of RAM and very little load, the display for RAM utilization is often close to the 100 percent limit (Figure 3).
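To see why a nearly full display need not mean trouble, it can help to compare the memory that is strictly free with the memory the kernel estimates it could hand out on request. The following is a minimal sketch, assuming a Linux system that exposes MemTotal, MemFree, and MemAvailable in /proc/meminfo; the helper names and the sample figures are made up for illustration and are not measurements.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kiB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_summary(info):
    """Return (percent not free, percent not available) relative to MemTotal."""
    total = info["MemTotal"]
    not_free = 100.0 * (total - info["MemFree"]) / total
    not_available = 100.0 * (total - info["MemAvailable"]) / total
    return not_free, not_available


# Illustrative sample values only (assumption), used when /proc is missing.
SAMPLE = """\
MemTotal:       16000000 kB
MemFree:          400000 kB
MemAvailable:   12000000 kB
Cached:         11000000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    not_free, not_available = memory_summary(parse_meminfo(text))
    print(f"not free:      {not_free:.1f}% of RAM")
    print(f"not available: {not_available:.1f}% of RAM")
```

With the sample figures, almost all RAM counts as not free, yet only a quarter is actually unavailable, because most of the rest is cache that the kernel can reclaim.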

The Linux kernel is the interface between the hardware on one side and the programs on the other. If a program wants memory, it typically calls a library function such as malloc(), which in turn requests memory from the kernel through system calls such as brk() or mmap(). However, it would take too long for the kernel to first search for free memory and then make the requested amount available.
Instead, the kernel plans ahead: It divides the entire available memory into segments, known as memory pages. In addition, the kernel remembers which pages are already assigned to the running programs and which are thus still available. If a program now comes along and requests RAM, the kernel simply assigns it memory pages from the list of free pages. Because the memory pages are not all the same size, the kernel also has a certain degree of flexibility and can ensure that there is not too much waste.
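The arithmetic behind page-based allocation is easy to sketch. The example below assumes a single base page size, as reported by Python's mmap.PAGESIZE, and ignores larger page sizes; pages_needed and wasted_bytes are hypothetical helper names chosen for this illustration.

```python
import mmap


def pages_needed(request_bytes, page_size):
    """Number of whole pages required to cover request_bytes."""
    if request_bytes < 0 or page_size <= 0:
        raise ValueError("request must be >= 0 and page size > 0")
    return -(-request_bytes // page_size)  # ceiling division


def wasted_bytes(request_bytes, page_size):
    """Bytes in the last page that the request does not use."""
    return pages_needed(request_bytes, page_size) * page_size - request_bytes


if __name__ == "__main__":
    page_size = mmap.PAGESIZE  # base page size of this system
    for request in (1, page_size, page_size + 1, 10 * 1024 * 1024):
        print(f"{request:>10} bytes -> {pages_needed(request, page_size)} "
              f"page(s), {wasted_bytes(request, page_size)} byte(s) unused")
```

A request that is one byte larger than a page occupies two pages and leaves almost a full page unused, which is the kind of waste the kernel tries to keep small.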
Waste Is Bad
It is important to avoid waste to the greatest extent possible. Even if you have an enormous amount of RAM at your disposal, you will still want to use it as efficiently as possible. For many years, the Linux kernel has supported a function that many admins consider equivalent to opening the proverbial Pandora's box: overbooking (overcommitting) RAM.
Roughly speaking, it works like this: The kernel assigns memory pages to requesting programs as usual, but in total it promises more memory than the physical RAM can actually provide. This does not directly cause OOM problems; those are caused by programs that require too much RAM.
However, RAM overcommitment increases the risk of OOM situations because the kernel does not rigorously deal with potential difficulties in advance. If Linux did not allow applications to allocate more memory than actually exists, crashes due to a lack of memory would be unthinkable because applications would simply see an error message when they tried to claim more memory than available.
The Linux approach is different. The kernel speculates that allocated memory will never be fully used. The vm.overcommit_memory sysctl variable manages everything else: If it is set to 0, which is the default value, the kernel uses a heuristic approach to calculate how much RAM is actually free. It then sets this in relation to the memory that a requesting application wants to have. If the result is positive, the program gets the memory, even if the amount of allocated memory becomes larger than the actual memory available in the system.
Setting vm.overcommit_memory=1 makes the kernel even more radical: In this case, the kernel skips the heuristic analysis and approves every request for RAM. But if you set the value to 2, RAM overbooking is switched off: The kernel then enforces a fixed commit limit and refuses requests that would exceed it.
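The three settings can be summarized in a short sketch that reads the current value and estimates the limit that applies in mode 2. It assumes the usual /proc/sys layout and the formula from the kernel's overcommit documentation (swap plus a percentage of RAM given by vm.overcommit_ratio, which defaults to 50); it ignores vm.overcommit_kbytes, and the function names and example figures are hypothetical.

```python
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit, no checks",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(raw):
    """Translate the raw content of vm.overcommit_memory into a description."""
    return OVERCOMMIT_MODES.get(int(raw.strip()), "unknown mode")


def commit_limit_kib(ram_kib, swap_kib, overcommit_ratio=50, hugetlb_kib=0):
    """Approximate mode 2 limit: swap + (RAM - huge pages) * ratio / 100."""
    return swap_kib + (ram_kib - hugetlb_kib) * overcommit_ratio // 100


def read_sysctl(name):
    """Read a sysctl via /proc/sys; return None if it is not accessible."""
    path = "/proc/sys/" + name.replace(".", "/")
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return None


if __name__ == "__main__":
    raw = read_sysctl("vm.overcommit_memory")
    if raw is None:
        print("vm.overcommit_memory is not readable on this system")
    else:
        print("vm.overcommit_memory:", describe_overcommit(raw))
    # Example figures (assumption): 16 GB of RAM and 2 GB of swap, in kiB.
    print("mode 2 limit:", commit_limit_kib(16_000_000, 2_000_000), "kiB")
```

With the default ratio, the example machine could commit roughly 10 GB in mode 2, which is less than its physical RAM; this is one reason why strict accounting usually requires tuning vm.overcommit_ratio as well.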
What Really Helps
If, on the basis of the previous explanations, you think that deactivating RAM overbooking is sufficient, you are wrong. The OOM problem is not caused by overbooking RAM, but by programs that continuously allocate too much RAM. And unfortunately, they usually do this unpredictably and for a variety of reasons. Often the root of the problem is simply a programming error, which causes the affected program to overburden the RAM. Occasionally, a system genuinely needs more RAM than is available to process incoming requests.
If you are confronted with OOM situations, you should first try very carefully to find the cause. If the emergency is not based on a programming error and the OOM situations occur regularly and reproducibly, the long-term solution can only be more hardware. You can either put more RAM into the affected servers or scale the setup horizontally.
If you are dealing with a programming error, it is a good idea to find it and repair it – in collaboration with the developers if necessary. Troubleshooting in such cases can be tough and time consuming. But if OOM problems occur after an update where there were none before, a bug is most likely the trigger.
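When looking for the cause, a reasonable first step is often to check which processes currently hold the most memory. The sketch below ranks processes by resident set size, assuming the Name and VmRSS fields of /proc/<pid>/status on Linux; parse_status and top_consumers are hypothetical helper names, and this is only one of several possible starting points.

```python
import os


def parse_status(text):
    """Extract (name, rss_kib) from /proc/<pid>/status-style text."""
    name, rss = "?", 0
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key == "Name":
            name = value.strip()
        elif key == "VmRSS":
            fields = value.split()
            if fields and fields[0].isdigit():
                rss = int(fields[0])
    return name, rss  # kernel threads have no VmRSS line, so rss stays 0


def top_consumers(proc_root="/proc", limit=5):
    """Return up to `limit` tuples (rss_kib, pid, name), largest first."""
    rows = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return rows
    for entry in entries:
        if not entry.isdigit():
            continue
        path = os.path.join(proc_root, entry, "status")
        try:
            with open(path, errors="replace") as f:
                name, rss = parse_status(f.read())
        except OSError:
            continue  # process exited or is not readable
        rows.append((rss, int(entry), name))
    rows.sort(reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    for rss, pid, name in top_consumers():
        print(f"{rss:>12} kiB  pid {pid:<8} {name}")
```

Running this repeatedly over time and comparing the output can hint at whether one process keeps growing, which would point toward a leak rather than a genuine capacity problem.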