Make off-site backups or you will lose your data
Backup Pains

Who needs attackers when you have system administrators? Learn why copying your data doesn't mean you've backed it up.
As I write this column, I cannot help but reflect on the irony of just having wiped out a month's worth of data. In the spirit of this article, I was fiddling around with backups on my web server, and I managed to accidentally delete most of /var/ and all of the /home/ directory. This wouldn't have been so bad if I hadn't kept the daily backups in /home/backups/. Oops.
Backing Up Doesn't Always Mean You Have Backups
If your data isn't available, or the systems to process and serve it aren't available, you have a problem. In the case of my web server, the missing /var/ and /home/ render it pretty much useless: It serves 404s and little else. To make sure data is available, you need to back it up. Seems simple, right? In reality, most of us (myself included) get it wrong, and although we go through the motions of making a backup, what we're really doing is just copying the data somewhere else that is equally vulnerable to loss.
In my case, I made the classic mistake of storing my backups on the same system as the data being backed up, and to make things worse, I kept them in a commonly accessed directory. Not that this would have mattered. Because the server only has one hard drive, I am only a single disk failure away from complete data loss no matter how much I back my data up locally on the server. Even if I were to install a second hard drive in the machine, it's still all too easy for a single event (bad drive controller, attacker wiping the system, fire, flood, power supply going bonkers, theft, etc.) to wipe out more than one hard drive.
What Are Real Backups?
Three main elements go into making real backups. Number one: You have to ensure that the data was actually backed up. I have seen far too many systems that write the data to a CD, DVD, or tape improperly, which results in nonrecoverable data. Ideally, you need to test out every backup you make, but if this isn't practical, you at least need to make occasional spot checks to ensure the data can be recovered.
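A spot check of this kind can be as simple as comparing checksums. The following is a minimal sketch, not a substitute for an actual test restore: it assumes the backup is a plain directory tree mirroring the source and that the source has not changed since the backup was taken. The function names (sha256_of, verify_backup) are my own and purely illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source: Path, backup: Path) -> list:
    """Compare every regular file under source with its copy under backup.

    Returns the relative paths that are missing from the backup or whose
    contents differ. An empty list means the spot check passed.
    """
    problems = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        relative = src_file.relative_to(source)
        copy = backup / relative
        if not copy.is_file() or sha256_of(src_file) != sha256_of(copy):
            problems.append(relative.as_posix())
    return problems
```

Calling verify_backup(Path("/var/www"), Path("/mnt/backup/www")) (both paths are placeholders) returns the list of files that would not come back intact. Anything other than an empty list means the backup is just a copy you hoped was good.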
Number two: You need to have off-site backups that are as close to read-only as you can get. This doesn't necessarily mean they have to be in a different physical location (although this is always a good idea), but they have to at least be separate enough that a single failure or event such as formatting a disk array or losing a server won't wipe out both the live data and the backups. A perfect example of this in action (besides, of course, my recent faux pas) is the website AVSIM Online, which lost 13 years of data to a single attack [1]. According to reports, AVSIM Online had two servers that copied their data off each other in an effort to back each other up. As I have said before, many of us are only copying data when we do backups and not making actual backups. In this case, an attacker broke into both servers, since they were basically identical, and deleted all the live data and the copies held on both servers. AVSIM Online lost their website, email, file library, forums, and more in one fell swoop and most likely will never get the data back. In my case, I was lucky. I only deleted a month's worth of logfiles and collected data, so all I have to do is wait a month for new data to be collected – good thing this wasn't someone's financial records.
Number three: You need to make sure you aren't deleting files out of the backup or zeroing the file contents unless you are beyond 100% certain you will never need that file again. For this reason, RAID isn't a backup solution. Even if you have multiple hard drives in a RAID configuration so that the loss of one or even several drives will not cause data loss, you can still lose data through deletion (rm, mkfs, etc.) or zeroing or altering of files (cat foo > bar).
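One way to honor this rule is to write every backup run into a fresh, timestamped directory and never touch the older ones. The sketch below illustrates the idea under some loud assumptions: the snapshot function and its directory naming are hypothetical, it makes full copies (so it trades disk space for simplicity), and it still needs to write to a separate machine or medium to count as a real backup.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def snapshot(source: Path, backup_root: Path,
             now: Optional[datetime] = None) -> Path:
    """Copy source into a new, timestamped directory under backup_root.

    Existing snapshots are never modified or deleted, so a file that is
    removed or truncated in the live data survives in older snapshots.
    """
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    target = backup_root / moment.strftime("%Y%m%dT%H%M%SZ")
    # copytree raises FileExistsError if target already exists, so an
    # earlier snapshot can never be silently overwritten.
    shutil.copytree(source, target)
    return target
```

If cat foo > bar zeroes a file today, tonight's snapshot faithfully records the empty file, but yesterday's snapshot still holds the contents. A mirror (RAID or a single overwritten copy) would have replicated the damage instead.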
Principles of Security
The three principles of security are: Availability, Integrity, and Confidentiality (also referred to as the AIC triad). In a nutshell, you have to keep the stuff you need to work working, you need to ensure that your data hasn't been changed by an attacker, and you need to keep your private stuff confidential.
Get Off the System
Fortunately, almost every mature backup program supports getting data from a client and storing it somewhere else – often on a dedicated server, disk array, tape, DVD, and so on. Some excellent options exist for Linux: Amanda [2], which ships with almost every distribution, as well as BackupPC [3] and Bacula [4], both covered in Linux Pro Magazine [5] [6]. Although I won't cover the details here, suffice it to say they are very powerful, have lots of knobs to adjust, and will reliably back up your data if you set them up right. The quick and dirty option is rsync (yum install rsync, apt-get install rsync, etc.) [7].
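To give a flavor of the quick and dirty route, here is a rough sketch of wrapping rsync from a script. It assumes rsync and SSH access to a second machine; the host name, paths, and function names are made up for illustration. The -a (archive) and --link-dest options are standard rsync options: --link-dest hard-links files that are unchanged since the previous snapshot, so each run looks like a full copy without using the space of one.

```python
import subprocess
from typing import List, Optional


def build_rsync_command(source_dir: str, remote_root: str,
                        snapshot_name: str,
                        previous: Optional[str] = None) -> List[str]:
    """Build an rsync command that writes one snapshot directory per run.

    remote_root may be a local path or user@host:/path and must already
    exist; rsync only creates the final directory of the destination.
    """
    command = ["rsync", "-a"]
    if previous is not None:
        # A relative --link-dest path is resolved against the destination
        # directory, so ../NAME points at the sibling snapshot.
        command.append("--link-dest=../" + previous)
    # The trailing slash on the source copies its contents, not the
    # directory itself.
    command.append(source_dir.rstrip("/") + "/")
    command.append(remote_root.rstrip("/") + "/" + snapshot_name + "/")
    return command


def run_backup(source_dir: str, remote_root: str, snapshot_name: str,
               previous: Optional[str] = None) -> None:
    """Run the transfer and raise CalledProcessError if rsync fails."""
    command = build_rsync_command(source_dir, remote_root,
                                  snapshot_name, previous)
    subprocess.run(command, check=True)
```

For example, run_backup("/home", "backup@backup.example.com:/srv/backups/home", "2024-01-02", previous="2024-01-01") would push /home to the (hypothetical) backup host. Note that pushing from the live server means anyone who takes over that server can also reach the backups; having the backup host pull the data instead gets closer to the read-only ideal.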