One-Step Encryption and Backup Tool

Command Line – duplicity

© Lead Image © Jakub Gojda, 123RF.com

Article from Issue 232/2020
Author(s): Bruce Byfield

With a single command, duplicity lets you encrypt and back up files. All you need to do is learn its unconventional command structure.

Despite its name, duplicity [1] is not a command to enable dishonesty. Rather, it is one of those modern command-line tools that combine more than one function in the same application: Instead of encrypting files in a separate operation and then backing them up, duplicity does both in a single step. Its only real limitation is a somewhat eccentric command structure.

Using GnuPG for encryption, duplicity backs up directories and files to a local or remote server. Although sources to back up are expressed as directory paths, targets for the backup files must be given as a URL, not a path. For example, a local target directory must be identified as file:///usr/local/backup rather than /usr/local/backup. (Note that the three forward slashes in the target URL are not an error: Two are for the URL, and the third is for the path from root.) By default, each archive is placed in a separate directory unless you use the --allow-source-mismatch option.
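As a minimal sketch of the URL rule described above, the following shell function turns a local directory path into the file:// form that duplicity expects for a local target. The name to_file_url is hypothetical and not part of duplicity; the function does nothing more than prepend the scheme to an absolute path.

```shell
#!/bin/sh
# to_file_url: hypothetical helper, not part of duplicity.
# Prints the file:// URL for a local directory path.
to_file_url() {
    case "$1" in
        /*) printf 'file://%s\n' "$1" ;;            # path is already absolute
        *)  printf 'file://%s/%s\n' "$PWD" "$1" ;;  # make a relative path absolute
    esac
}

# Example: prints file:///usr/local/backup
to_file_url /usr/local/backup

# The result could then serve as a backup target, for instance:
#   duplicity /home/tt "$(to_file_url /usr/local/backup)"
```

Because the absolute path already starts with a slash, the third slash appears automatically.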

duplicity supports backups to local drives (including mounted external drives), FTP, SFTP/SCP, Rsync, WebDAV, Google Docs, HSI, and Amazon S3. duplicity's man page does not detail how to set up all these various targets, but detailed instructions and examples are available online, particularly for those that require additional libraries, such as Google Drive, which requires PyDrive [2], and Amazon S3, which requires python-boto [3]. Some targets also take unique options. Regardless of the targets, after the first creation of a backup, later backups will be incremental, affecting only parts of files that have changed since the last backup (Figure 1). Remember that directories containing a backup display in a file manager, but the backup archives do not since they are encrypted. You will need to use duplicity to list the backup archives (see the Actions section).

Figure 1: A local backup: duplicity detects that the source and target are connected, that this is the first backup, and defaults to a full backup. Since a GnuPG key is not defined as an environment variable, duplicity then requests a GnuPG key. After the backup, a summary of statistics is given that includes the size of the backup files.

Backups created by duplicity will preserve original directories and files, including permissions and symbolic links, although not hard links. In addition, the man page warns against including /proc, the directory that displays system information in a virtual filesystem, because of the likelihood of causing a crash.

In order to encrypt, duplicity requires a GnuPG encryption key. If you do not already have a key, you can familiarize yourself with the basic concepts and the procedures for creating one by reading an online how-to [4]. When you are ready, you can generate a key with the command:

gpg --full-generate-key

You can use --gen-key, a less verbose option, if you are familiar with gpg, but --full-generate-key is easier for novices, since it gives more explanations, and, if nothing else, provides moderately secure defaults (Figure 2).

Figure 2: Before encrypting, duplicity requires a GnuPG key.
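Once a key exists, you may want its ID, for example to pass to duplicity's --encrypt-key option. The sketch below assumes gpg's machine-readable --with-colons listing, in which each secret primary key appears as a "sec" record with the key ID in the fifth field. The function name secret_key_ids is hypothetical.

```shell
#!/bin/sh
# secret_key_ids: hypothetical helper, not part of gpg or duplicity.
# Reads the output of `gpg --list-secret-keys --with-colons` on stdin and
# prints the key ID (field 5) of every secret primary key ("sec" record).
secret_key_ids() {
    awk -F: '$1 == "sec" { print $5 }'
}

# Typical use (requires gpg and at least one secret key):
#   gpg --list-secret-keys --with-colons | secret_key_ids
```

Subkey ("ssb") and user ID ("uid") records are skipped, so each key pair is listed only once.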

Actions

duplicity uses actions (aka sub-commands) to refine the basic command. You can do a basic backup with nothing more than the following structure:

duplicity SOURCE-PATH TARGET-URL

For example:

duplicity /home/tt scp://TARGET-UID/backup

However, you have the option of adding an action after the basic command. For example, when creating a backup, you can specify full rather than the default incr (incremental), even if nothing has changed in the backup. If you specifically request incr, duplicity will switch to full if anything goes wrong. If a previous backup attempt has failed or stopped before finishing, you should run cleanup before the next attempt to see if any files with errors are in the backup. If so, you will need to use the --force option to delete them.
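To illustrate where the action goes, here is a small sketch that only prints a duplicity command line rather than running it. The function name backup_cmd is hypothetical, and the source path and target URL are the placeholders used earlier in this article.

```shell
#!/bin/sh
# backup_cmd: hypothetical helper that only prints a duplicity command line.
# Usage: backup_cmd ACTION SOURCE-PATH TARGET-URL
# ACTION is "full" or "incr", the two backup actions named above.
backup_cmd() {
    case "$1" in
        full|incr) printf 'duplicity %s %s %s\n' "$1" "$2" "$3" ;;
        *) echo "unknown action: $1" >&2; return 1 ;;
    esac
}

# Force a full backup instead of the default incremental one.
# Prints: duplicity full /home/tt scp://TARGET-UID/backup
backup_cmd full /home/tt scp://TARGET-UID/backup

# After an interrupted run, cleanup takes only the target URL:
#   duplicity cleanup --force scp://TARGET-UID/backup
```

The action always comes directly after the command name, before the source and target.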

To edit an archive, you can use the remove-older-than TIME action, specifying the time in a number of different formats, including: YYYY/MM/DD; an interval after which all files will be deleted, using s, m, h, D, W, M, or Y to set the interval; or else simply now. Similarly, remove-all-but-n-full COUNT indicates how many backup sets to keep – for example, a count of 1 will leave only the latest backup set. Both actions need to be used with --force to delete files.
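The interval form of TIME is easy to mistype, so a wrapper script might check it before deleting anything. The sketch below is one possible check, with hypothetical function names; as a simplification, it accepts only a single number followed by one of the unit letters listed above, and it prints the resulting command instead of running it.

```shell
#!/bin/sh
# is_interval: hypothetical helper, not part of duplicity.
# Succeeds if the argument is a number followed by one of s, m, h, D, W, M, Y.
is_interval() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+[smhDWMY]$'
}

# prune_cmd: prints (does not run) a remove-older-than command.
# Usage: prune_cmd INTERVAL TARGET-URL
prune_cmd() {
    if is_interval "$1"; then
        printf 'duplicity remove-older-than %s --force %s\n' "$1" "$2"
    else
        echo "not a simple interval: $1" >&2
        return 1
    fi
}

# Delete backups older than six months.
# Prints: duplicity remove-older-than 6M --force scp://TARGET-UID/backup
prune_cmd 6M scp://TARGET-UID/backup
```

Note that the unit letters are case-sensitive: 6M means six months, while 6m means six minutes.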

During housekeeping, you can use list-current-files to see what files are currently backed up in an archive (Figure 3). For a quicker response time, this information is read from duplicity's signature files rather than the archive, which may mean that a corrupted file is not detected. To check the state of files directly, you can use verify, with a verbosity level of 4 or higher (see the Selected Options section for more information) to see a separate message for each altered file. Alternatively, use collection-status to see the backup's status, a summary of the sets that the backup contains, and the number of tar files in each.

Figure 3: Encrypted archives can only be detected from within duplicity.
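The three housekeeping actions above differ slightly in their arguments: collection-status and list-current-files take only the backup URL, while verify also takes a local directory. The following sketch, with the hypothetical name housekeeping_cmds, prints the three command lines for a given backup without running them.

```shell
#!/bin/sh
# housekeeping_cmds: hypothetical helper that only prints duplicity commands.
# Usage: housekeeping_cmds TARGET-URL LOCAL-DIR
housekeeping_cmds() {
    # Status and summary of the backup sets stored at the URL.
    printf 'duplicity collection-status %s\n' "$1"
    # List of backed-up files, read from the signature files.
    printf 'duplicity list-current-files %s\n' "$1"
    # Check the archive itself; verbosity 4 reports each altered file.
    printf 'duplicity verify --verbosity 4 %s %s\n' "$1" "$2"
}

housekeeping_cmds scp://TARGET-UID/backup /home/tt
```

Each printed line can be pasted into a terminal once you have replaced the placeholder URL with your own.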

To restore files, simply reverse the source and the target:

duplicity scp://TARGET-UID/backup /home/tt/

If a file or directory already exists, then duplicity will not overwrite unless the --force option is used.

Selected Options

duplicity's options are all in long form, rather than being a single character. A large number specify what to include or exclude. duplicity includes two matching sets of options that begin with --exclude or --include. In their simplest form, --exclude PATTERN and --include PATTERN take a shell-style pattern. Should either match a directory, then all the directory's sub-files are affected. Alternatively, you can match with a regular expression by using --exclude-regexp or --include-regexp, or read the patterns from a file with --exclude-filelist FILENAME or --include-filelist FILENAME. Still other variants apply to --exclude: --exclude-if-present FILENAME skips directories that contain the named file, --exclude-device-files skips device files, and --exclude-other-filesystems skips files on filesystems other than the one that holds the source directory. A separate space-saving option is --extra-clean, which makes cleanups more aggressive by deleting things like the signature files for old backups.
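When several directories need to be left out, the plain --exclude option, which takes a path or shell pattern, is simply repeated once per entry. The sketch below builds that list of flags from its arguments; the function name exclude_flags is hypothetical, and the paths are assumed to contain no spaces.

```shell
#!/bin/sh
# exclude_flags: hypothetical helper, not part of duplicity.
# Prints one "--exclude PATH" pair for every argument, on a single line.
exclude_flags() {
    flags=""
    for item in "$@"; do
        flags="$flags --exclude $item"
    done
    printf '%s\n' "${flags# }"   # drop the leading space
}

# Back up the root directory while skipping /mnt, /tmp, and /proc.
# Prints: duplicity --exclude /mnt --exclude /tmp --exclude /proc / file:///usr/local/backup
echo "duplicity $(exclude_flags /mnt /tmp /proc) / file:///usr/local/backup"
```

The command is only echoed here, so you can inspect the result before running it for real.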

Other options modify duplicity's basic behavior. With --encrypt-key KEY, you can specify which GnuPG key to use. If you are not sure whether you need a full backup, you can use --full-if-older-than TIME, which will do a full backup only if the last one is older than the given time. When restoring, you can use --rename OLD-PATH NEW-PATH to change the path under which files are written; otherwise, the archived files are restored beneath the target directory with the same relative paths that they had in the source.

If you receive errors, you might try again with --num-retries NUMBER or even --ignore-errors. Should you want to add two backups to the same directory, you can use --allow-source-mismatch or else wait for duplicity to prompt you to use that option. In truly desperate circumstances, you can use --force, either in anticipation or when prompted – although some data might be lost if you do. Perhaps, though, you may prefer to receive advance warning by testing what you are about to do with the addition of --dry-run, which will tell you what will happen without performing any action.

When using most of these options, you might want to adjust the verbosity level with --verbosity LEVEL. Like the verbose options in other commands, duplicity's verbosity option adjusts the level of feedback you receive, including which warnings are displayed. duplicity has ten levels of verbosity, numbered 0 to 9. Not all are named, but those with names are Error (0), Warning (2), Notice (4), Info (8), and Debug (9). Named levels may be set by name or by initial letter rather than by number. The default is Notice, which means that only notices, warnings, and errors are displayed. However, should you run into difficulties, you might want to increase the verbosity in the hope of displaying information that can help you solve them. For example, if verbosity is set to Debug, you can see the full details of what a command does. The alternative is to look through the log for ignored errors – which is both inconvenient and inefficient, since no error summary is given at the end.
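Because the named levels map onto numbers, a script that accepts either form may want to normalize them. The following sketch does so with a hypothetical function, verbosity_level; it takes Error to be level 0 and passes single digits through unchanged.

```shell
#!/bin/sh
# verbosity_level: hypothetical helper, not part of duplicity.
# Prints the numeric level for a named level, its initial letter, or a digit.
verbosity_level() {
    case "$1" in
        error|e)   echo 0 ;;
        warning|w) echo 2 ;;
        notice|n)  echo 4 ;;
        info|i)    echo 8 ;;
        debug|d)   echo 9 ;;
        [0-9])     echo "$1" ;;
        *) echo "unknown verbosity: $1" >&2; return 1 ;;
    esac
}

# Prints: duplicity --verbosity 8 /home/tt scp://TARGET-UID/backup
echo "duplicity --verbosity $(verbosity_level info) /home/tt scp://TARGET-UID/backup"
```

In everyday use, passing the name directly to --verbosity is enough; the mapping mainly helps when comparing levels in a script.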

For backups over SSH, duplicity supports both SFTP and SCP, defaulting to SFTP because it has fewer shell quoting problems than SCP. You can specify that SFTP/SCP look for an FTP_PASSWORD environment variable, or prompt the user if the variable cannot be found. In addition, you can switch from the default SFTP protocol with --use-scp. If necessary, you can pass SSH options with:

--ssh-options "-oOpt1=PARAMETER1 -oOpt2=PARAMETER2"
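The value handed to --ssh-options is a single quoted string in which each SSH option is written as -oOPTION=VALUE, with spaces only between options. As a sketch of that form, the hypothetical function ssh_options below assembles the string from OPTION=VALUE pairs; IdentityFile and UserKnownHostsFile are standard OpenSSH options, and the file paths are invented for illustration.

```shell
#!/bin/sh
# ssh_options: hypothetical helper, not part of duplicity.
# Turns OPTION=VALUE arguments into a single "-oOPTION=VALUE ..." string.
# Values are assumed to contain no spaces.
ssh_options() {
    opts=""
    for pair in "$@"; do
        opts="$opts -o$pair"
    done
    printf '%s\n' "${opts# }"   # drop the leading space
}

# Prints: -oIdentityFile=/home/tt/.ssh/backup_key -oUserKnownHostsFile=/home/tt/.ssh/known_hosts
ssh_options IdentityFile=/home/tt/.ssh/backup_key UserKnownHostsFile=/home/tt/.ssh/known_hosts

# The string must reach duplicity as one argument, hence the double quotes:
#   duplicity --ssh-options "$(ssh_options IdentityFile=/home/tt/.ssh/backup_key)" \
#       /home/tt sftp://TARGET-UID/backup
```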

Depending on the backup's location, you may also need other options. Using FTP, you may need to experiment with --ftp-passive and --ftp-regular if you are having trouble making connections. Similarly, when using Amazon S3, you may want --s3-use-new-style for subdomain-style bucket addressing, which is also required if you use --s3-european-buckets to create buckets in Europe. In addition, you may get faster uploads with --s3-unencrypted-connections, but at the cost of an insecure connection: Your archive will still be encrypted, but an observer could see the name of the bucket, your access key, and any increment dates and their sizes, as well as the fact that you are using duplicity.

An Efficient Command Structure

If duplicity has a fault, it is a lack of compression options. I am sure that I am not the only one who would like to have control over the archive's size. However, perhaps it is just as well, since compression and encryption might result in too many things that could go wrong.

Still, by combining two related functions in a single command, duplicity remains one of the premier backup applications for Linux. Depending on your setup, it may take you several tries to create an archive with duplicity. But, once you do, the chances are that you will use the same handful of command structures repeatedly, regardless of whether you are backing up a home system or a large network. Among the dozens of available backup tools, duplicity remains one of the most versatile and reliable.

In fact, the project site proposes replacing tar with duplicity, using a revised archival format [5]. The proposal cites tar's incompatibility with some filesystems, and its inelegant handling of long file names. The proposal does not seem to have reached the development stage, but it suggests the project's long-term goal is to improve Linux backup even more than it has already done.

The Author

Bruce Byfield is a computer journalist and a freelance writer and editor specializing in free and open source software. In addition to his writing projects, he also teaches live and e-learning courses. In his spare time, Bruce writes about Northwest coast art (http://brucebyfield.wordpress.com). He is also co-founder of Prentice Pieces, a blog about writing and fantasy at https://prenticepieces.com/.
