Tips for speeding up your Linux system

Tweak Talk

Author(s): Alexander Tolstoy

If you are looking for ways to speed up your Linux system, consider this collection of curated performance tweaks.

Linux is renowned as a high-performance operating system, and it runs on nearly all of the world's most powerful supercomputers. It also runs very well on regular desktops and workstations, but sometimes people ask for more. Whether you're faced with a low-end hardware setup or a loaded production system with high I/O, there is always room for tweaks and optimizations. Linux is an ideal OS for tinkering, and you have many options for eliminating performance bottlenecks, fixing non-optimal settings, and making the system more fluid and responsive. The goal of this article is to point you to some best practices for tweaking a typical home or office Linux-powered machine, while avoiding some of the outdated or less efficient advice.

The Curse of Low Memory

"Buy more RAM" – that's a frequent response to "I've got only 2GB." However, sometimes it is not possible to install more memory modules in a computer. An average Linux desktop runs butter-smooth with 8GB, very nicely with 4GB (with some limitations to multitasking), and quite poorly with 2GB or less. Two palliative techniques that bring relief are zram and zswap. Both compress memory to reduce (or even completely avoid) swapping memory pages to the hard drive. Thanks to compression, the system has more free RAM, and with less swapping, disk access also speeds up. The trade-off is a higher CPU load due to constant compressing and decompressing, but its impact is usually smaller than the lag caused by running out of RAM.

Zram is a compressed RAM-based swap device designed for systems with no physical swap partitions. It is a Linux kernel module (included since kernel 3.14) that creates a very fast virtual block device backed by RAM and sets it as a top-priority swap "partition." All you need to do is install the supplementary package for the zram systemd service and enable it. In Ubuntu, use the following commands:

$ sudo apt install zram-config
$ sudo systemctl start zram-config

Now there are extra /dev/zramX virtual devices, one per CPU core. It is easy to see the new swap setup with the $ sudo zramctl command (Figure 1). The size of the devices is determined automatically based on your system's specs, although you can still change it. The following example sets /dev/zram0 to 1GB:

# swapoff /dev/zram0 && zramctl --reset /dev/zram0
# zramctl --size 1024M /dev/zram0
# mkswap /dev/zram0 && swapon /dev/zram0

Figure 1: Once the zram service for systemd is started, extra compressed swap devices appear.
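To judge whether zram is paying off, it can help to compare how much data sits in a device with how much space it takes once compressed. The following is a minimal sketch, not part of the zram-config package. It assumes the mm_stat layout described in the kernel's zram documentation, where the first two fields are the original and the compressed data size in bytes. The function name zram_ratio is made up for illustration.

```shell
# Print the compression ratio (original size / compressed size) for a
# zram mm_stat line read from standard input.
# Assumption: field 1 is orig_data_size, field 2 is compr_data_size.
zram_ratio() {
    awk '{
        if ($2 > 0) printf "%.2f\n", $1 / $2
        else        print "n/a"   # nothing has been compressed yet
    }'
}
```

On a system with an active device, something like zram_ratio < /sys/block/zram0/mm_stat should print a single number; values well above 1 suggest the compression is worth its CPU cost.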

Zswap is different in that it is designed for systems with a normal swap partition. It provides a compressed cache that sits between RAM and the swap device. When it is time to offload memory pages to disk, zswap compresses the pages and stores them in the dedicated RAM cache. This technique is often referred to as a writeback cache, and it significantly reduces disk I/O.

Zswap is a kernel feature (version 3.11 onwards) that you can enable and configure via the GRUB2 bootloader parameters – in the most basic case, it is enough to add zswap.enabled=1. I'll explicitly define the compression algorithm as LZ4, let the compressed pool use up to 30 percent of RAM, and set the allocator to z3fold for better memory management:

zswap.enabled=1 zswap.compressor=lz4 zswap.max_pool_percent=30 zswap.zpool=z3fold
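After a reboot, it is worth confirming that the kernel actually picked up these options. The sketch below simply prints every file in a parameter directory as name=value. It assumes the zswap settings are exposed under /sys/module/zswap/parameters, which is where the kernel publishes them when zswap is built in. The function name show_zswap_params and the optional directory argument are illustrative only.

```shell
# Print each zswap parameter as name=value.
# The directory argument is optional and exists mainly so the function
# can be tried against a copy of the files.
show_zswap_params() {
    dir="${1:-/sys/module/zswap/parameters}"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

Called without arguments, the function should list entries such as enabled, compressor, max_pool_percent, and zpool, provided zswap is available on the running kernel.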

It is theoretically possible to use both zram and zswap at the same time, but this combination is not known to improve performance beyond what you get from choosing one of the tools. When deciding on a compression method for your system, keep in mind that zram works best in situations where you want to avoid HDD or SSD wear (e.g., on small systems with flash storage). Zswap is a more general-purpose tweak.

More on Swapping

Depending on the amount of available RAM, it may be worth adjusting parameters such as swappiness and disk cache pressure. Both can be set in /etc/sysctl.d/99-sysctl.conf. There is a popular but mistaken belief that swappiness is a threshold for RAM usage, and that swapping starts when the amount of used RAM hits that threshold. In fact, the vm.swappiness setting is a value (0-100) that controls how aggressively the kernel swaps, that is, how readily it reclaims memory pages. Lowering swappiness mainly makes sense for systems with slow or aging mechanical drives, where it reduces swapping activity to some degree. For instance, changing the value from 60 (the default) to 40 is a good practice for making a slow machine perform better, especially if it has at least 2GB of RAM.

Disk cache pressure (vm.vfs_cache_pressure) controls the tendency of the kernel to reclaim the memory that is used for caching directory and inode objects. Setting the value to 0 can easily lead to out-of-memory conditions, whereas increasing it over the default value of 100 makes the Linux kernel reclaim VFS inodes too actively and slows down the system. The optimal value is perhaps 50; in my tests, at least, setting the value to 50 made many filesystem operations (like finding files) noticeably faster.
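The two settings can be written as a small sysctl fragment. The following sketch prints such a fragment and rejects values outside the ranges discussed in this section (swappiness above 100, cache pressure of 0). The function name render_vm_tuning is hypothetical, and the limits it enforces are taken from the text above rather than from the kernel.

```shell
# Print a sysctl fragment for vm.swappiness and vm.vfs_cache_pressure.
# Usage: render_vm_tuning SWAPPINESS CACHE_PRESSURE
render_vm_tuning() {
    for value in "$1" "$2"; do
        case "$value" in
            ''|*[!0-9]*)
                echo "not a non-negative integer: '$value'" >&2
                return 1 ;;
        esac
    done
    if [ "$1" -gt 100 ]; then
        echo "swappiness must be between 0 and 100" >&2
        return 1
    fi
    if [ "$2" -lt 1 ]; then
        echo "refusing vfs_cache_pressure=0 (risk of running out of memory)" >&2
        return 1
    fi
    printf 'vm.swappiness = %s\n' "$1"
    printf 'vm.vfs_cache_pressure = %s\n' "$2"
}
```

One way to apply the result is to pipe render_vm_tuning 40 50 into sudo tee -a /etc/sysctl.d/99-sysctl.conf and then reload the settings with sudo sysctl --system.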

Tinkering with the Kernel?

There are lots of third-party patches for Linux that you have to apply manually before rebuilding the whole kernel source tree, which can be a difficult and time-consuming task. A promising, efficient, and easy way to get these options is the XanMod kernel.

The XanMod project provides a patch set and prebuilt kernel packages for Debian, Ubuntu, Arch, and Gentoo. This modified kernel features an optimized CPU scheduler, the Budget Fair Queueing (BFQ) I/O scheduler (the component that manages disk read/write requests, which plays a huge role in keeping the OS responsive when the disk is under heavy load), and other tweaks for better preemptive multitasking and lower latencies. When compared to the stock Linux kernel in tests such as encoding with FFmpeg, resizing and rotating images in GIMP, and similar CPU-heavy tasks, the XanMod 5.4 kernel was 8-10 percent faster, which is a solid margin. Install XanMod in Ubuntu as follows:

$ echo 'deb http://deb.xanmod.org releases main' | sudo tee /etc/apt/sources.list.d/xanmod-kernel.list && wget -qO - https://dl.xanmod.org/gpg.key | sudo apt-key add -
$ sudo apt update && sudo apt install linux-xanmod

Hard Disk Requests Scheduler

Many of XanMod's benefits depend on the I/O scheduler. Historically, Linux has supported many I/O schedulers, but these days, the choice comes down to Multiqueue Deadline (mq-deadline), Kyber, and BFQ. Check their availability with this command:

$ cat /sys/block/sda/queue/scheduler
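The kernel prints all available schedulers on one line and marks the active one with square brackets, for example mq-deadline kyber [bfq] none. The following sketch extracts just the active name, which is handy in scripts. The function name active_scheduler is made up for illustration.

```shell
# Print the active I/O scheduler from a line such as
# "mq-deadline kyber [bfq] none" read from standard input.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}
```

For the disk used in the example above, the call would be active_scheduler < /sys/block/sda/queue/scheduler.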

Multiqueue Deadline is an adaptation of the original Deadline scheduler, which was created to guarantee a start service time for a request. Our tests showed that mq-deadline is a good all-rounder with no drawbacks, yet with few advantages as well.

Kyber is a more recent scheduler tuned for fast multi-queue devices, such as modern NVMe drives. It has two queues: one synchronous and another for asynchronous requests.

BFQ is often advertised as the best scheduler of all. In fact, it provides the best interactivity for systems with relatively slow drives, including low-end SSDs. If you wish to eliminate the slow I/O bottleneck and make disk reads and writes appear faster, BFQ is second to none. Switch to the desired scheduler at runtime using the following template:

$ echo scheduler_name | sudo tee /sys/block/sda/queue/scheduler

In the case of BFQ, it also makes sense to add the scsi_mod.use_blk_mq=1 boot option in the GRUB configuration (Figure 2).

Figure 2: To make changes persistent, edit the GRUB2 configuration.
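Boot options like this one usually end up in the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub. The sketch below appends an option to that line only if it is not already present. It assumes the value is enclosed in double quotes, as it is in a stock Ubuntu file, and that your sed supports in-place editing with -i. The function name add_kernel_param is hypothetical.

```shell
# Append PARAM to GRUB_CMDLINE_LINUX_DEFAULT in FILE unless it is
# already there.
# Usage: add_kernel_param FILE PARAM
add_kernel_param() {
    file="$1"
    param="$2"
    # Skip the edit if the parameter is already on the command line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $param\"|" "$file"
}
```

In a root shell, you could run add_kernel_param /etc/default/grub scsi_mod.use_blk_mq=1 and then update-grub to regenerate the boot menu. It is a good idea to try the function on a copy of the file first.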

Changing the scheduler affects the Linux system most when it is under a heavy workload. For instance, try to encode or archive something big (or otherwise load the CPU), and then try to copy some data to a USB flash drive. Such an exercise will show how the system handles a huge flow of I/O requests, which could help you decide on the right scheduler.

Make the Processes Run Nicer

Modern Linux systems include several services that start at boot time. Some of these services may be unused and can therefore be disabled with no harm. The first step is to look at which services are consuming the boot time:

$ systemd-analyze blame

This command prints the list of auto-started services in descending order of the time they took to start. If you don't know what a service is used for, don't disable it. However, if you are not managing a mail server, then it is safe to turn off Postfix. Also, consider whether CUPS (for printing) and database services like PostgreSQL or MariaDB are actually in use. Disabling a service is as simple as:

$ sudo systemctl disable service_name
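On a system with many units, the blame list can be long. The following sketch filters it down to the units that took at least a given number of seconds. It assumes each line consists of a time made of tokens such as 1min, 2.5s, or 450ms, followed by the unit name. The function name slow_units is made up for illustration.

```shell
# Read `systemd-analyze blame` output on standard input and print the
# units that took at least LIMIT seconds to start (default: 1).
# Assumption: time tokens end in "min", "s", or "ms".
slow_units() {
    awk -v limit="${1:-1}" '
    {
        total = 0
        for (i = 1; i < NF; i++) {
            if ($i ~ /ms$/)       total += $i / 1000
            else if ($i ~ /min$/) total += $i * 60
            else if ($i ~ /s$/)   total += $i + 0
        }
        if (total >= limit + 0) print $NF
    }'
}
```

For example, systemd-analyze blame | slow_units 2 would list only the units that needed two seconds or more.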

Use stop instead of disable to stop a service immediately. For the processes that remain, you can adjust priorities automatically using Ananicy. Ananicy [1], a third-party tool, consists of a shell script and a systemd daemon that control the priorities of running processes and applications. This is solely a desktop-oriented tweak intended to solve such issues as, "Why does my game lag during kernel compilation?" Ananicy ships with a community-maintained list of rules, which are very sane for the most part. After installing the tool (follow the README.md guide), enable and start Ananicy as follows:

$ sudo systemctl enable ananicy && sudo systemctl start ananicy

The effect of tools such as Ananicy will differ across the endless variety of configurations, but it is usually noticeable. Ananicy is a good fit for laptops – it can make batteries last longer and fans run more quietly.
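For a one-off job, the kind of adjustment that such a tool automates can be approximated by hand with the standard nice and ionice commands. The sketch below is not how Ananicy itself works; it is only a manual illustration of lowering CPU and I/O priority for a single command. The function name run_low_priority is hypothetical, and the -t option tells ionice to carry on even if the I/O class cannot be set.

```shell
# Run a command with the lowest CPU priority and, where ionice is
# available, in the idle I/O scheduling class.
# Usage: run_low_priority COMMAND [ARGS...]
run_low_priority() {
    if command -v ionice >/dev/null 2>&1; then
        # -c 3 selects the idle class; -t ignores failures to set it.
        ionice -t -c 3 nice -n 19 "$@"
    else
        nice -n 19 "$@"
    fi
}
```

A long build could then be started with run_low_priority make, leaving interactive applications more room.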

Apply Filesystem Tweaks

In the past, it was common to hear the advice to use the noatime mount option for partitions listed in /etc/fstab. This option is counterproductive on modern Linux systems, which already use the less risky and performance-friendly relatime option by default.

For everyday scenarios with most major Linux systems, the most balanced and fastest filesystem is usually ext4, which is already tuned for the best performance in most mainstream distros. However, long-running Linux systems suffer from disk fragmentation, which leads to slowdowns. Fragmentation in Linux isn't the nightmare that it used to be in the Windows world in the late 90s, but it is still a problem if your Linux machine runs intensive disk reads and writes for a long time. The solution is defragmenting. There is a universal script by Con Kolivas, defrag [2], that rewrites files in order from largest to smallest and works for any filesystem. But, for the extX filesystem family, a better solution ships with the E2fsprogs package. Start by examining the current state of fragmentation for a test /dev/sda3 partition:

$ sudo e4defrag -c /dev/sda3

Look at the 'Fragmentation score' and check that it is not too high (a score below 30 is fine). Even if the partition is healthy, the above command will list the files that have been fragmented and may suffer from longer access times. It is easy to defragment them with this command:

$ sudo e4defrag /dev/sda3
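If you check several partitions regularly, the score can be turned into a short verdict. The sketch below assumes that the output of e4defrag -c contains a line with the words Fragmentation score followed by a number, and it uses the ranges that e4defrag prints as a legend (0-30 no problem, 31-55 a little fragmented, 56 and above needs defragmenting). The function name frag_verdict and its output words are made up for illustration.

```shell
# Read `e4defrag -c` output on standard input and print a verdict
# based on the reported fragmentation score.
frag_verdict() {
    awk '/Fragmentation score/ {
        score = $NF + 0
        if (score <= 30)      print "ok"
        else if (score <= 55) print "slightly-fragmented"
        else                  print "needs-defrag"
    }'
}
```

For the test partition, the call would be sudo e4defrag -c /dev/sda3 | frag_verdict.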

The e4defrag command also accepts directories, so that you don't have to process the whole partition if you only work with a given directory. More than that, defragmenting adjusts free extents to the size of the files that you store. So, if you run the e2freefrag tool afterwards, you'll see the table of extents of different sizes, adapted to the kind of information that already exists on the disk (Figure 3):

$ sudo e2freefrag /dev/sda3

Figure 3: Too many small extents will lead to file fragmentation on an ext4 filesystem.

In addition to traditional defragmentation, you can also improve ext4 filesystem performance by reallocating frequently used files using e4rat. This tool reduces disk access time by using the EXT4_IOC_MOVE_EXT ioctl feature of ext4 to perform so-called online defragmentation. Modern e4rat code [3] takes a few minutes to build from source (see the project page for guidance). Normally, e4rat requires three phases: the first for learning (collecting files), the second for reallocating what has been collected, and the third for preloading the reallocated data into the page cache (Figure 4). To get started with e4rat, enter:

$ sudo systemctl stop auditd   # auditd conflicts with e4rat
$ sudo e4rat-collect   # start opening apps you want to optimize, hit Ctrl+C when done
$ sudo e4rat-realloc e4rat-collect.log

Figure 4: Reallocating frequently used data can be a magic potion for systems with rotational disks.

This technique effectively makes every cold start of an optimized app feel like it is hot, which is a great aid for slow or low-end Linux systems.

Small Tweaks

Also consider the following small, yet efficient tweaks:

  • Eliminate excessive cryptographic routines that involve hard drives. To prevent the I/O on your SSD or HDD from contributing to the entropy pool, you can disable the add_random setting for your block devices:
# echo 0 > /sys/block/sda/queue/add_random

Linux uses the entropy pool for things like generating SSH keys, SSL certificates, and other cryptographic operations. Preventing your SSD from contributing to this pool probably isn't a security issue, but it will save you small amounts of I/O.

  • Bring parallel drive probing back. It may seem surprising, but modern-day Linux distributions still probe ATA devices serially by default, a technique known as staggered spin-up (SSS). This technique goes back to a time when spinning up multiple drives at once caused power usage peaks, and thus the Linux kernel avoided parallel probes. This behavior makes little or no sense on modern systems, especially if there are no rotational drives. Probing ATA drives in parallel speeds up the boot process. First, check if the SSS flag is set in your system:
$ dmesg | grep SSS

If it is, add the libahci.ignore_sss=1 boot option to GRUB. A Linux system with several hard drives or SSDs will see a better boot time.

  • Mount /tmp to RAM. Most Linux systems already use tmpfs (check it with $ df), but you can add another RAM-backed mount point to get rid of temporary runtime clutter that usually sits in /tmp:
$ echo "tmpfs /tmp tmpfs rw,nosuid,nodev 0 0" | sudo tee -a /etc/fstab

This way your system will clear everything found in /tmp upon every reboot. Doing the same for /var/tmp is not recommended, because the /var/tmp directory is meant to store data persistently.
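Because appending to /etc/fstab twice leaves duplicate entries, it can be safer to add the /tmp line only when no entry for that mount point exists yet. The following is a minimal sketch of that check. The function name ensure_tmp_tmpfs is hypothetical, and the file is passed as an argument so that you can try the function on a copy first.

```shell
# Append a tmpfs entry for /tmp to FILE unless /tmp already has one.
# Usage: ensure_tmp_tmpfs FILE
ensure_tmp_tmpfs() {
    file="$1"
    # Look for a non-comment line whose mount point (field 2) is /tmp.
    if awk '$1 !~ /^#/ && $2 == "/tmp" { found = 1 } END { exit !found }' "$file"; then
        return 0
    fi
    # Options are comma-separated with no spaces, as fstab requires.
    echo "tmpfs /tmp tmpfs rw,nosuid,nodev 0 0" >> "$file"
}
```

In a root shell, ensure_tmp_tmpfs /etc/fstab can be run repeatedly without piling up entries.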

Conclusion

Linux is very versatile and includes many modules, tools, and settings that allow you to tweak the system and improve performance. If you have a computer with insufficient memory or slower-than-expected I/O, or even if you just want to experiment to learn more about Linux, the techniques described in this article will help you take your first steps.

The Author

Alexander Tolstoy is a long-time Linux enthusiast and tech journalist. He never stops exploring hot new open source picks and loves writing reviews, tutorials, and various tips and tricks. Sometimes he must face a bitter truth thanks to the inhuman fortune | cowsay command that he thoughtlessly put in ~/.bashrc.