Automatic backups to external media

Safe House

Article from Issue 227/2019

A recent backup is more reliable than any kind of data rescue. But for many users, a backup won't happen unless the process is easy to manage.

The simplest form of backup is to copy the data to a mobile device, such as a USB stick or an external USB hard drive. You can perform this kind of backup either with the file manager or with a copy command like:

cp -a source_directory target_directory

This command uses the archive option -a to recursively copy the files, and it also includes the file permissions. If the target directory does not exist, the command creates it as a copy of the source and its contents. If, on the other hand, it already exists, it will contain a subdirectory named source_directory after the copy operation.
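
Both cases can be tried safely in a scratch directory. The following is a minimal sketch; the scratch location from mktemp and the file name notes.txt are assumptions made for the demonstration only.

```shell
# Scratch area; removed automatically when the shell exits.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

mkdir "$demo/source_directory"
echo "hello" > "$demo/source_directory/notes.txt"

# Case 1: the target does not exist yet, so it becomes a copy of the source.
cp -a "$demo/source_directory" "$demo/target_directory"
ls "$demo/target_directory"

# Case 2: the target exists now, so the source lands inside it
# as the subdirectory target_directory/source_directory.
cp -a "$demo/source_directory" "$demo/target_directory"
ls "$demo/target_directory"
```

Running the same command twice therefore gives two different layouts, which is easy to overlook in a backup routine.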

The cp command with a single option is easy to remember, but it has a number of drawbacks. Exotic files and attributes can be too much for cp. On top of that, the process is inefficient, because it stubbornly copies all the source files. Furthermore, there is no verification of the results.

That's why professionals use the more complicated command shown in Listing 1, with rsync, which is installed by default on almost every distribution. Here, too, there is the archive option -a. The other options relate to extended attributes (-X), access control lists (-A), hard links (-H), and sparse files (-S). In addition, the command does not cross filesystem boundaries (-x).

Listing 1

Copying with rsync

 

Rsync creates the target directory if it does not already exist. Without a slash after the name of the source directory, rsync copies the directory itself (line 1); with a trailing slash, it copies only the contents, including subdirectories (line 2).

If you run the command more than once, rsync will copy only modified files. It also checks whether the copy was actually successful. If you add the --delete option, the command will even delete files on the target that no longer exist in the source – so you get a perfect one-to-one copy.
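
What --delete changes is easiest to see with a file that disappears from the source between two runs. A minimal sketch, again in a scratch directory with hypothetical file names; only -a is used here to keep the demonstration short.

```shell
# Scratch area; removed automatically when the shell exits.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

mkdir "$demo/source_directory"
echo "keep" > "$demo/source_directory/keep.txt"
echo "old"  > "$demo/source_directory/obsolete.txt"

# The first run copies both files.
rsync -a "$demo/source_directory/" "$demo/target_directory"

# The file disappears from the source ...
rm "$demo/source_directory/obsolete.txt"

# ... but a plain second run leaves the stale copy on the target.
rsync -a "$demo/source_directory/" "$demo/target_directory"
if [ -e "$demo/target_directory/obsolete.txt" ]; then
    echo "stale copy still present"
fi

# Only --delete removes it on the target as well.
rsync -a --delete "$demo/source_directory/" "$demo/target_directory"
ls "$demo/target_directory"
```

Because --delete removes data on the target, it is worth checking the source and target paths twice before running it against a real backup.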

Copies Are Not Enough

The problem with this type of backup is that a simple replication of the data also copies defective files, or files encrypted by Trojans. On Linux, the danger comes less from viruses and Trojans than from the user sitting at the keyboard. If you delete or edit a file by mistake and only notice after the next backup run, the old version is long gone.

A genuine backup therefore also includes archiving old versions. At this point, things usually get complicated, and many users give up. But, fortunately, there are a number of projects dedicated to this task. In most cases, the solutions cause some overhead during the initial setup, but the daily backup then becomes all the easier.

With rsnapshot, the tool used in this article, you have to overcome two hurdles during the initial setup: preparing a suitable backup medium, and installing and configuring the software. All in all, however, this takes about fifteen minutes of prep work.

Backup Medium

For the backup medium, use either a USB stick with enough capacity or an external USB hard disk. For mobile use, I use an 8GB USB stick, which is fine for my home directory with my DIY programs and articles. For a large media collection, a 4TB disk is recommended. Make sure that the backup medium does not contain any important data, because formatting will erase everything on it. (Note: From here on, when I refer to a backup stick, I also mean the hard disk alternative.)

USB sticks and hard disks usually come preformatted with the FAT32 or NTFS filesystem, both of which are predominant in the Windows world. Neither is suitable for backing up Linux systems. With sticks, you also need to check whether there is a partition at all, which can depend on the capacity. If in doubt, repartition the stick so that it contains exactly one partition. The desktop environment typically comes with a tool for this; alternatively, you will find an application like GParted [1] in the package sources.

After plugging in the stick, first check in the file manager whether the Linux system has automatically mounted the partition on the medium. After repartitioning, this may not be the case, but it can happen. If the partition is mounted, unmount it; then format it with ext4.

This can be done in a GUI with GParted (Figure 1) or at the command line with the command from Listing 2, line 2. Replace /dev/sd<XY> with the device name of the medium (e.g., /dev/sdb1). If in doubt, it is best to check the device name before formatting with lsblk (line 1). The command outputs the partitions, the sizes, and the corresponding mount points, if any (Figure 2). In Figure 2, the stick can be easily identified by its size of almost 8GB.
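
The two steps might look roughly like this on the command line. This is a sketch, not a definitive procedure: the function names show_block_devices and format_backup_stick are hypothetical, and the guards are an addition meant to catch a mistyped device name before mkfs.ext4 erases anything.

```shell
# List block devices with size, filesystem, label, and mount point
# so the stick can be identified before anything is touched.
show_block_devices() {
    lsblk -o NAME,SIZE,FSTYPE,LABEL,MOUNTPOINT
}

# Hypothetical helper: format one partition with ext4 and assign a label.
# Usage: format_backup_stick /dev/sdb1 BACKUP
format_backup_stick() {
    dev=$1
    label=${2:-BACKUP}
    # Refuse anything that is not a block device (typo, stick not plugged in).
    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev" >&2
        return 1
    fi
    # Refuse a partition that is still mounted.
    if grep -q "^$dev " /proc/mounts 2>/dev/null; then
        echo "still mounted, unmount first: $dev" >&2
        return 1
    fi
    # Destroys all data on the partition.
    sudo mkfs.ext4 -L "$label" "$dev"
}
```

You would call show_block_devices first, pick the partition by its size, and only then pass its name to format_backup_stick.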

Listing 2

Formatting the Stick with ext4

 

Figure 1: Using GParted, you can quickly partition and format disks and assign a label to the new partition.
Figure 2: The output from lsblk: The backup USB stick can be quickly identified based on partition size.

Listing 2 also assigns a label for the partition. Instead of BACKUP you can specify something else here. You will need the identifier again later, because you only want the backup to start if you plug the correct medium into the system. With GParted, you will find the function Label File System within the Partition menu.

System Adaptations

After setting up the backup medium, some packages need to be installed and configured. First, install the rsync, git, and rsnapshot packages with the package manager. Line 1 of Listing 3 shows the command for Debian, Ubuntu, and their derivatives. If rsnapshot is missing from the repositories of your distribution, you can download the script from the project page [2].
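
On a Debian-based system, the installation step could be wrapped so that it only asks for what is actually missing. A minimal sketch; the helper names missing_tools and install_missing are hypothetical, and the package names are assumed to match the command names, as they do on Debian and Ubuntu.

```shell
# Hypothetical helper: print every named tool that is not on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Install only what is missing (Debian, Ubuntu, and derivatives).
install_missing() {
    missing=$(missing_tools rsync git rsnapshot)
    [ -z "$missing" ] && return 0
    # $missing is left unquoted on purpose so each name becomes one argument.
    sudo apt-get install $missing
}
```

Calling install_missing on a machine that already has all three tools does nothing and returns immediately.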

Listing 3

Installing rsync, git, and rsnapshot

 

You also need my autobackup script and configuration files [3], which you download and install with the commands from lines 2 to 4 of Listing 3. If you are using a Debian-based system, such as Ubuntu or Linux Mint, the command from line 4 will also install the rsync and rsnapshot packages. If you do not want to do this, comment out the command install_packages at the end of the tools/install file.

When you plug in the backup stick later on, the kernel detects this and creates a udev event (Figure 3). The constantly running udev daemon then starts the /usr/local/sbin/autobackup script. It checks whether the backup stick is actually plugged in, and whether a corresponding backup already exists for the current date, current week, or current month. If this is not the case, the script starts the rsnapshot script with the correct parameters.
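
The decision logic described here can be sketched in a few lines of shell. This is not the autobackup script itself, only an illustration of the idea: all function names and the stamp files (last_day, last_week, last_month) are hypothetical. The directory /dev/disk/by-label is where udev normally places one symlink per filesystem label.

```shell
# udev maintains one symlink per filesystem label in this directory;
# the variable can be overridden for testing.
BY_LABEL_DIR=${BY_LABEL_DIR:-/dev/disk/by-label}

# Succeeds if a partition with the given label is currently plugged in.
medium_present() {
    [ -e "$BY_LABEL_DIR/$1" ]
}

# Prints a key that changes once per day, week, or month.
period_key() {
    case $1 in
        day)   date +%Y-%m-%d ;;
        week)  date +%G-W%V ;;
        month) date +%Y-%m ;;
        *)     return 1 ;;
    esac
}

# Succeeds if no backup is recorded for the current period.
# $1 = directory holding the stamp files, $2 = day|week|month
backup_due() {
    stamp="$1/last_$2"
    [ ! -f "$stamp" ] || [ "$(cat "$stamp")" != "$(period_key "$2")" ]
}

# Records that the backup for the current period has been made.
mark_done() {
    period_key "$2" > "$1/last_$2"
}
```

A caller would run rsnapshot with the matching interval name, for example rsnapshot day, only when medium_present and backup_due both succeed, and call mark_done afterwards.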

Figure 3: The command udevadm monitor --property lets you observe the events under the hood of udev in real time.

To make all of this work, you need to configure the autobackup script as shown in Listing 4. The configuration file can be found in /etc/autobackup.conf. Here you normally only have to enter the LABEL, which in our example is BACKUP. If you set the value of SYSLOG to 1, the script writes detailed messages to the system log, so that you can view more information if problems occur. Normally the script will only make one backup per day; force_daily=1 changes this. When making any changes, be careful not to violate the shell syntax; in particular, do not put any spaces before or after the equals signs.
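
Because the file uses shell syntax, the three settings named above would look something like the following excerpt. It is a sketch limited to those settings; the values shown for SYSLOG and force_daily are assumptions, and the real file contains further lines.

```shell
# /etc/autobackup.conf (excerpt; only the settings named in the text)
# Shell syntax: no spaces before or after the equals sign.

# Label of the backup partition, as assigned when formatting.
LABEL=BACKUP

# 1 = write detailed messages to the system log.
SYSLOG=1

# 1 = allow more than one backup per day.
force_daily=0
```

A line such as LABEL = BACKUP would not assign anything: the shell would try to run a command named LABEL instead.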

Listing 4

Configuring autobackup

 

The second step is to create the /etc/rsnapshot.conf file. Installing rsnapshot usually provides an example file, and so does installing the autobackup script. Copy the /etc/rsnapshot.conf.autobackup-service file to /etc/rsnapshot.conf (Listing 3, line 5). You only have to make adjustments in the middle and at the very end of the file (Listing 5). In the middle, you need to define how many generations you want to store for each interval. The identifiers (day, week, month, and year, with the latter commented out) must match the values in the last four lines of Listing 4.

Listing 5

Adjusting /etc/rsnapshot.conf

 

At the end of the file, you specify which directories the script should save. Rsnapshot offers several possibilities to exclude files – but this is rarely worth the effort.
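
The two places to adjust could look like the fragment printed by the following sketch. The generation counts (7, 4, 12, 1) and the /home/ source directory are illustrative assumptions, as is the use of the retain keyword; the function name rsnapshot_fragment is hypothetical. Using printf with \t guarantees real tab characters between the fields, whatever the editor settings are.

```shell
# Print a fragment in rsnapshot.conf syntax with tab-separated fields.
rsnapshot_fragment() {
    # Middle of the file: how many generations to keep per interval.
    printf 'retain\tday\t7\n'
    printf 'retain\tweek\t4\n'
    printf 'retain\tmonth\t12\n'
    printf '#retain\tyear\t1\n'
    # End of the file: which directory to save, and where below the
    # snapshot root it should be stored.
    printf 'backup\t/home/\tlocalhost/\n'
}

rsnapshot_fragment
```

The output can be compared against the corresponding lines in /etc/rsnapshot.conf or pasted into the file with an editor that preserves tabs.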

There is a pitfall in /etc/rsnapshot.conf, however: You have to separate the individual fields with tabs, not spaces. Run sudo rsnapshot configtest after editing. The command checks the syntax and reveals whether an editor has silently replaced tabs with spaces.
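
A quick look for lines without any tab can complement rsnapshot's own syntax check. The helper below is a hypothetical sketch: it relies on the fact that every directive needs at least one tab between keyword and value, so it will not catch a line that mixes tabs and spaces.

```shell
# Hypothetical helper: list directive lines that contain no tab at all,
# prefixed with their line number.
lines_without_tabs() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }    # skip comments and blank lines
        index($0, "\t") == 0 { print FNR ": " $0 }
    ' "$1"
}
```

Calling lines_without_tabs /etc/rsnapshot.conf prints nothing if every directive contains at least one tab.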
