Your NAS isn't enough – you still need to back up your data!
Not All NAS

Lead Image © bram janssens, 123RF.com
Some users trust their data to powerful file servers that advertise enterprise data protection, but your Network Attached Storage system might not be as safe as you think it is.
There is a point in the life of a compulsive data hoarder when a regular computer is not enough to contain a burgeoning file collection. As the collection keeps growing, the first step a home user usually takes to extend storage capacity is to purchase an external USB hard drive. The hard drive will buy the user some time, but eventually this solution will fall short. A data hoarder who is dedicated enough will eventually have to invest in a Network Attached Storage (NAS) server.
A NAS is a dedicated server optimized to store large amounts of information. NAS servers are commonly available as commercial appliances, but many power users prefer to build their own from spare parts. Serious NAS servers are scalable and allow you to increase their capacity by adding hard drives as needed. Better yet, they often offer enterprise features that come in very handy, and they promise mitigations to the most common threats against the long-term survival of your files.
NAS vendors often advertise fault tolerance and profess the immunity of their systems from disaster, which causes users to treat this sort of storage as bulletproof, dumping their data and then skipping the step of making backups. But rarely do these consumer-grade storage systems provide a complete solution. This article describes some of the things that can go wrong – and why you still need to perform backups to ensure that your data is safe.
The Features of a Quality NAS
A wide range of NAS options is available for home users. These options vary in quality from desktop toys to quasi-enterprise systems trying to pass as domestic appliances (Figure 1).

With the exception of low-end models, NAS boxes are designed to offer the highest possible availability. In this context, a high-availability machine is one that can keep serving its users under adverse conditions. Such a server needs to be able to keep functioning if a hard drive fails, if the power grid blacks out, or if its power supply malfunctions.
Servers mitigate hard drive failures by using a Redundant Array of Independent Disks (RAID). A RAID group is just a set of hard drives that are recognized as a single virtual drive by the operating system. (See the box entitled "Popular RAID Levels" for more information on some common RAID scenarios.) In a domestic NAS context, these drives will most often be grouped in the so-called RAID 5 level. RAID 5 distributes the data within the array evenly across every device, with some extra parity components. Should one of the drives fail, the server will keep functioning in a degraded state by keeping the remaining drives running and using the parity data to reconstruct lost information.
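The parity reconstruction described above can be illustrated with a toy sketch. It assumes the parity block is a byte-wise XOR of the data blocks, which is the usual scheme for single-parity RAID; the block contents and the `xor_blocks` helper are made up for illustration, and a real array works on much larger stripes and rotates the parity between drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per drive (toy sizes for illustration).
data = [b"NAS1", b"NAS2", b"NAS3"]

# The parity block is stored alongside the data blocks.
parity = xor_blocks(data)

# Simulate losing the second drive, then rebuild its block
# from the surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

Because XOR is its own inverse, combining the parity with the surviving blocks cancels them out and leaves the missing block. The same property explains why a second failure is fatal: with two blocks missing, one parity block is not enough to solve for both.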
Popular RAID Levels
RAIDs can be built in multiple ways, depending on the purpose they serve. The most popular traditional RAID levels are:
- RAID 0 stripes data across all the drives in the set for increased performance (Figure 2). The total size of the RAID is the sum of the sizes of the individual drives. A single disk failure kills the array, making it a dangerous RAID level to use. RAID 0 has better read and write throughput than a single hard drive of the same size as the array, because the workload is evenly distributed over the individual drives in the RAID.

- RAID 1 mirrors the data across all the drives in the array (Figure 3). Since every drive has a full copy of all the data, a RAID 1 can keep working as long as one of its drives is still operational. RAID 1 is good for maintaining uptime, but it is not very cost-effective, because, at the very least, it takes twice as many drives for the same storage capacity.

- RAID 5 is among the most popular in small deployments. This form of RAID is known as disk striping with parity. The disks are striped (as with RAID 0), but parity information, equal in total to the capacity of one drive, is distributed across the drives, ensuring that the array can keep working if one of the drives fails (Figure 4). RAID 6 does pretty much the same thing with twice the parity, so it can keep working after two hard drive failures.

- RAID 10 is a combination of RAID 0 and RAID 1. Drives are deployed in pairs in which each unit mirrors the other. Then all the pairs are placed in a RAID 0 (Figure 5). RAID 10 can keep functioning as long as at least one drive in each pair is in working order.
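The capacity trade-offs between the RAID levels listed above can be put into numbers with a short sketch. It assumes identical drives and ignores filesystem overhead; the `usable_capacity` function and the minimum drive counts it enforces are illustrative simplifications, not the behavior of any particular controller.

```python
def usable_capacity(level, drives, size_tb):
    """Return usable capacity in TB for an array of identical drives."""
    if level == 0:
        return drives * size_tb          # pure striping, no redundancy
    if level == 1:
        return size_tb                   # every drive holds a full copy
    if level == 5 and drives >= 3:
        return (drives - 1) * size_tb    # one drive's worth of parity
    if level == 6 and drives >= 4:
        return (drives - 2) * size_tb    # two drives' worth of parity
    if level == 10 and drives >= 4 and drives % 2 == 0:
        return (drives // 2) * size_tb   # half the drives are mirrors
    raise ValueError(f"unsupported layout: RAID {level} with {drives} drives")


# Compare four 4TB drives across the striped and parity levels.
for level in (0, 5, 6, 10):
    print(f"RAID {level}: {usable_capacity(level, 4, 4)} TB usable")
```

With four drives, RAID 6 and RAID 10 both give up half the raw capacity, whereas RAID 5 gives up a quarter, which goes some way toward explaining its popularity in small systems.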
A server can survive blackouts by using an Uninterruptible Power Supply (UPS), which is just a fancy term for a battery that kicks in when the power grid goes down (Figure 6). A modern UPS can communicate with the server over USB or Ethernet to let the operating system know how much power is left in the battery, which is useful for forcing the machine to shut down in an orderly way when the supply is about to run dry.
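The orderly shutdown decision can be sketched in a few lines. This example assumes the UPS status arrives as `key: value` text in the style printed by the `upsc` client from Network UPS Tools (a tool the article does not mention), with `ups.status` flags `OL` (on line), `OB` (on battery), and `LB` (low battery), and `battery.runtime` in seconds. The function names and the five-minute threshold are hypothetical.

```python
def parse_ups_report(text):
    """Parse 'key: value' lines into a dictionary of strings."""
    report = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            report[key.strip()] = value.strip()
    return report


def should_shut_down(report, min_runtime_s=300):
    """Decide whether the server should begin an orderly shutdown."""
    status = report.get("ups.status", "").split()
    on_battery = "OB" in status
    low_battery = "LB" in status
    # A missing runtime estimate is treated as "plenty of time left".
    runtime = float(report.get("battery.runtime", "inf"))
    return on_battery and (low_battery or runtime < min_runtime_s)


# Sample report: running on battery with four minutes of runtime left.
sample = """battery.charge: 41
battery.runtime: 240
ups.status: OB"""

print(should_shut_down(parse_ups_report(sample)))
```

In practice, the monitoring daemon that ships with the UPS software normally makes this decision for you; the sketch only shows the kind of information the UPS has to report for that to work.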

About ECC
Good NAS hardware will often feature Error-Correcting Code (ECC) RAM. ECC RAM can detect and correct random bit errors in memory, which are more frequent than you might expect [1]. RAM errors are considered dangerous for the survival of a dataset and the continued operation of a server. A botched bit in RAM could cause the operating system to malfunction or cause a file to get corrupted. ECC is intended to reduce the risk of such an event and keep the system running after a memory error.
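To give a feel for how an error-correcting code repairs a flipped bit, here is a toy Hamming(7,4) sketch. It is only an illustration of the principle: real ECC memory typically uses wider codes over whole memory words and does the work in hardware, and the function names below are made up for this example.

```python
def encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def correct(codeword):
    """Return (corrected codeword, 1-based error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # syndrome points at the bad bit
    if position:
        c[position - 1] ^= 1          # flip it back
    return c, position


def decode(codeword):
    """Extract the 4 data bits from a (corrected) codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


word = [1, 0, 1, 1]
stored = encode(word)
stored[4] ^= 1                        # simulate a single bit flip in memory
fixed, position = correct(stored)
print(decode(fixed) == word, "error was at position", position)
```

The three parity bits act as overlapping checks; the pattern of failed checks identifies which single bit is wrong. Two simultaneous flips would defeat this small code, which is why practical schemes add an extra parity bit to at least detect double errors.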
A theory holds that a bit error in RAM could cause a chain reaction, resulting in massive data corruption within a ZFS filesystem. It is therefore argued that the only safe way of running a ZFS server is with ECC RAM, and that doing otherwise is borderline suicidal.
ZFS uses no pre-mount consistency checker and lacks filesystem repair tools at the time of this writing. ZFS was conceived as a self-healing filesystem, capable of repairing data corruption on the go. Should ZFS try to read a data block that has been corrupted by, let's say, a hard drive defect, the filesystem would be able to identify the issue and attempt to repair it on the fly from parity data. Such self-healing features do, in theory, eliminate the need for recovery tools. The FreeNAS project (now TrueNAS) used to warn that a botched memory operation could cause permanent damage to the filesystem, and since there are no recovery tools available, data could end up being unrecoverable [2].
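The self-healing idea can be sketched as a checksum-verified read with repair from a redundant copy. This is a conceptual toy, not ZFS code: it assumes a two-way mirror rather than parity, uses SHA-256 purely for convenience, and the `MirroredBlock` class is hypothetical.

```python
import hashlib


def checksum(block):
    """Return a hex digest identifying the block's contents."""
    return hashlib.sha256(block).hexdigest()


class MirroredBlock:
    """One logical block stored as two copies plus a known-good checksum."""

    def __init__(self, block):
        self.copies = [bytearray(block), bytearray(block)]
        self.expected = checksum(block)

    def read(self):
        """Return verified data, repairing any copy that fails its checksum."""
        for copy in self.copies:
            if checksum(bytes(copy)) == self.expected:
                good = bytes(copy)
                # Overwrite every copy that no longer matches the good data.
                for i, other in enumerate(self.copies):
                    if bytes(other) != good:
                        self.copies[i] = bytearray(good)
                return good
        raise IOError("all copies failed checksum verification")


store = MirroredBlock(b"family photos")
store.copies[0][0] ^= 0xFF            # simulate on-disk corruption
print(store.read())                   # served from the healthy copy
print(bytes(store.copies[0]))         # the damaged copy has been repaired
```

Note that the sketch trusts whatever is in memory while the checksum is computed and compared, which is exactly where the concern about non-ECC RAM comes in.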
However, opinions differ on whether ZFS is more susceptible to failure than other filesystems. Matthew Ahrens, cofounder of Sun's ZFS project, argues that using ZFS with non-ECC RAM is about as risky as running a regular filesystem without it [3]. In his view, ECC RAM is not necessary but is highly recommended.
RAID Issues
A good NAS promises excellent uptime and looks indestructible on the surface. It would seem like files should be able to survive indefinitely in such a server. After all, if a NAS is capable of withstanding a hard drive failure (the most common hardware malfunction [4]), there is not much incentive to spend the considerable amount of money required to set up another server and keep a backup of the original one.
The problem is that there is only so much a file server can do to protect your data, especially outside of an enterprise environment. Quality server hardware is designed to guarantee good uptime in the face of trouble, but not necessarily the integrity of your information. There are a number of reasons why a NAS may still fail.
If a hard drive fails within a NAS's RAID 5 set, the whole array will work at a degraded level. From the user's viewpoint, the array is still operational, but it has ceased to offer fault tolerance. Should another drive fail before a new one is added and the array is rebuilt, the information contained in the array will be lost. Many a RAID array has failed due to owner procrastination – or due to the long wait for the attention of an overworked sys admin.
But tardy repair is just one of the reasons why some experts are wary of depending on RAID. A casual search on the Internet will find countless opinions regarding the unsuitability of RAID 5 for modern file servers [5]. Storage media are not perfect and may suffer random read failures. Hard drives are reliable enough for most purposes [6], but every now and then they will throw an Unrecoverable Read Error (URE). UREs are errors that take place when the hard drive tries to access a block of data and fails to do so. Modern drives are estimated to suffer a URE for every 10^14 bits read on average (that is, roughly once per 12.5TB read), which means errors are rare.
The bigger a disk array, the higher the chance that a defective sector exists somewhere: the more bits the RAID manages, the more likely it is that at least one block of information is problematic. The argument of RAID 5 detractors is that disk arrays are becoming so big that the probability of triggering a URE is becoming too high to be acceptable.
If a RAID 5 loses a drive to hardware failure, a new drive can be plugged in, and the RAID 5 may be rebuilt from the data on the remaining disks. However, if any of the remaining disks throws a URE during this process, the consequences may range from losing the data in that sector to being unable to rebuild the whole RAID (depending on the quality of the RAID controller and drives).
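The detractors' argument can be turned into a back-of-the-envelope calculation. The sketch below assumes that read errors are independent and occur at exactly the quoted rate of one per 10^14 bits, which is a pessimistic simplification; the function name and the example array sizes are made up for illustration.

```python
import math


def p_at_least_one_ure(bytes_read, ure_per_bit=1e-14):
    """Probability of hitting one or more UREs while reading bytes_read bytes."""
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed in a numerically stable way.
    return -math.expm1(bits * math.log1p(-ure_per_bit))


TB = 10**12

# A rebuild has to read every surviving drive in full.
for surviving_drives, size_tb in [(2, 2), (3, 4), (3, 8)]:
    p = p_at_least_one_ure(surviving_drives * size_tb * TB)
    print(f"{surviving_drives} x {size_tb}TB to read: "
          f"{p:.0%} chance of at least one URE")
```

Under this naive model, the probability climbs quickly with array size, which is where the alarming predictions come from. It treats a worst-case figure as the actual error rate, so the real-world risk depends on how conservative that figure is for the drives in question.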
Experience suggests that the fear of being unable to rebuild big arrays is blown out of proportion. Nevertheless, it is important to remember that RAID 5 is a tool for guaranteeing uptime rather than the integrity of your files.
There are RAID levels with better fault tolerance than RAID 5 (such as RAID 6 or RAID 10), but using these alternative RAID levels in a small system is comparatively expensive.
Nearly as bad is the fact that many RAID controllers are proprietary and don't offer a good migration path. If you are using a proprietary solution and want to move your hard drives from an old server – maybe because the old one finally bit the dust! – you might discover that your data is unreadable on the destination machine.
Hardware is not the only concern, either. Software issues might destroy your files just as quickly as a hardware-level malfunction, and using an enterprise-grade server won't do much for you if you are hit by a bug. For example, QNAP's NAS appliances were widely affected by a vulnerability that caused many users to fall prey to the DeadBolt ransomware [7][8].