Use mhddfs to group hard disks and directories
United

The multi-hard drive disk filesystem (mhddfs) combines directories or hard disks on a union filesystem to create a single, large, virtual filesystem that you can use both locally and via Samba or NFS.
Establishing a reliable system and keeping track of a continually growing collection of movies and audio files can be very time consuming. What makes matters worse is that multimedia data typically resides on various disks.
This is where mhddfs enters the game: Using a union filesystem, it groups files from different locations to create a virtual directory. The tool not only combines existing data, it also provides details about free storage space on the individual filesystems. (See the box "What Is a Union Filesystem?")
Consequently, it is no longer a problem to use small disks to store a music collection that extends over three disks. You could just as easily store rock music on one disk, classical tracks on another, and e-books on the third. What happens, however, if your rock music disk is full, but your e-book disk still has room to spare? Things start to become untidy again.
An alternative would be to create a RAID [1] array, but you would always have to compromise between keeping your data safe and using storage space; it does not appear to be a viable solution for the example in this article. The use of LVM [2] only makes sense with RAID for reasons of data safety, and again this does not help solve the problem presented here.
Fortunately, mhddfs offers precisely the functionality that most users need in this case: If you run out of space on one of the grouped disks, the data can be migrated in the background to a different disk with free space without the user even noticing. By default, mhddfs reserves 4GB on each disk for emergencies; if needed, you can set mlimit=<Limit> at mount time to reduce this value to as little as 100MB.
Transparent Write Access
For the virtual array to work, mhddfs makes not only read access but also write access transparent. This is in contrast to UnionFS; AuFS, as commonly used by Live media; or OverlayFS, which was recently added to the kernel. Whereas legacy union filesystems rely on copy-on-write (COW) [3], mhddfs writes not only to the top level of the filesystem, but to all underlying levels, too.
Mhddfs stores files that you add to the virtual array on the first hard disk, as long as it has sufficient space (i.e., as long as the mlimit is still upheld). After this, it checks the remaining disks in sequence to see if they have sufficient space. If no disk satisfies the mlimit, mhddfs uses the disk with the most free space.
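The placement rule just described can be sketched as a small shell function. This is a simplified model for illustration only, not code from mhddfs: the function name pick_disk, the name:free_mb argument format, and the figures in the sample calls are all assumptions.

```shell
#!/bin/sh
# Simplified model of the placement rule described in the text
# (illustrative only; not taken from the mhddfs source).
# Usage: pick_disk <mlimit_mb> <name:free_mb>...
pick_disk() {
    limit=$1
    shift
    best_name=""
    best_free=-1
    for entry in "$@"; do
        name=${entry%%:*}
        free=${entry##*:}
        # The first disk, in order, that still has at least mlimit free wins.
        if [ "$free" -ge "$limit" ]; then
            echo "$name"
            return 0
        fi
        # Otherwise remember the disk with the most free space as a fallback.
        if [ "$free" -gt "$best_free" ]; then
            best_name=$name
            best_free=$free
        fi
    done
    echo "$best_name"
}

# sda1 is below the 4096MB limit, so the next disk in sequence is chosen.
pick_disk 4096 sda1:1024 sdb1:8192 sdc1:20480
# No disk satisfies the limit, so the one with the most free space is chosen.
pick_disk 4096 sda1:3000 sdb1:2048 sdc1:512
```

The order of the arguments matters in this model, just as the order of the disks does when you set up the array: the first disk is always tried first.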
Mhddfs always stores each file whole on a single disk, avoiding the kind of file splitting that you see with LVM. This works on all popular Linux filesystems, as well as on Samba and NFS shares, because both return correct information about occupied and free space on the respective filesystems. SSHFS does not meet this criterion, and the mhddfs developers thus warn against integrating it.
If mhddfs notices during a write that the disk in question does not have enough space, it moves the data it has already written to another disk with more space as a background operation and continues the write action on that disk. The writing program does not notice this. In other words, you can work with the virtual filesystem as if you were working on a single large disk.
No matter where data resides or how much space is available on individual disks, you only see the complete remaining free space. If you later buy a disk with enough capacity and decide to stop using the smaller disks in the mhddfs array, or if you want to use the smaller disks elsewhere, you can simply copy the content of the virtual filesystem to the new disk and unmount the smaller disks.
Flexible Storage
Mhddfs is available from the repositories of most distributions; you can thus use your distribution's package manager for the install. If you prefer to build mhddfs yourself, you can pick up the source code from the mhddfs Subversion repository [4]. Using the tool is very easy in practice. In the following example, I use three hard disks: sda1, sdb1, and sdc1; Listing 1 shows the situation at the start.
Listing 1
Example Disks
You can now create a new mountpoint for the array you will be creating and assign the permissions by typing:
mkdir /mnt/media
chmod 775 /mnt/media
From now on, the FUSE filesystem, which was installed to fulfill one of mhddfs's dependencies, comes into its own with its ability to migrate kernel space functions to userspace.
You do not need to be root to use mhddfs; a normal user account is fine. The account simply needs to belong to the fuse group. You can ensure this by typing:
addgroup <User> fuse
Now create the new array (Listing 2, line 1); the -o allow_other option lets users other than the one who mounted the array access it and create files.
Listing 2
Mount Disks to Filesystem
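As a rough sketch of what such a mount call can look like: mhddfs expects a comma-separated list of the directories to combine, followed by the mountpoint. The branch paths /mnt/sda1, /mnt/sdb1, and /mnt/sdc1 are assumptions for where the three example disks are mounted, and the RUN switch is merely a hypothetical safety catch so that the script prints the command before you commit to it.

```shell
#!/bin/sh
# Assumed mountpoints of the three example disks; adjust to your system.
BRANCHES="/mnt/sda1,/mnt/sdb1,/mnt/sdc1"
TARGET="/mnt/media"

# Comma-separated branches first, then the mountpoint of the array.
# Note: a non-root user can only pass allow_other if user_allow_other
# is enabled in /etc/fuse.conf.
MOUNT_CMD="mhddfs $BRANCHES $TARGET -o allow_other"
echo "$MOUNT_CMD"

# Execute the command only on request, e.g., RUN=1 sh mount-media.sh
if [ "${RUN:-0}" = "1" ]; then
    $MOUNT_CMD
fi
```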
Additionally, you can specify the mlimit parameter, as mentioned before, but options really belong in /etc/fstab. Assuming that the mount works, you will see output as shown in Listing 2. All three disks are mounted; all logged-in users have access, and the limit is 4GB. The results, viewed using df -h, should look something like Listing 3.
Listing 3
New Filesystem
As you can see, the software has created the new filesystem; the total capacity is the sum of the individual disks, and the same is true of the free space. The next task is to provide this setup automatically at boot time. To do so, add a new line to your /etc/fstab file (Listing 4).
Listing 4
Creating the Filesystem at Bootup
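A minimal sketch of such an entry, assuming the same hypothetical branch paths as before: FUSE filesystems are commonly entered in /etc/fstab with the helper name and a hash sign in front of the source, and fuse as the filesystem type. The script only prints the line so that you can review it before adding it to the file yourself.

```shell
#!/bin/sh
# Assumed mountpoints of the three example disks; adjust to your system.
BRANCHES="/mnt/sda1,/mnt/sdb1,/mnt/sdc1"
TARGET="/mnt/media"

# Fields: source, mountpoint, type, options, dump, fsck pass
FSTAB_LINE="mhddfs#$BRANCHES $TARGET fuse defaults,allow_other 0 0"

# Print the line for review; append it to /etc/fstab as root once checked.
printf '%s\n' "$FSTAB_LINE"
```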
If you do experience problems, it is a good idea to use another option to define where the software creates a logfile and to define the verbosity level for mhddfs's output (Listing 5). For more details, refer to the mhddfs man page [5].
Listing 5
Mhddfs Options
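The following sketch combines the mlimit option with the logfile and loglevel options documented in the mhddfs man page. The branch paths, the 1GB limit, and the logfile location are assumptions chosen for illustration; according to the man page, loglevel 0 logs debug messages, 1 logs info messages, and 2 is the default.

```shell
#!/bin/sh
# Assumed mountpoints of the three example disks; adjust to your system.
BRANCHES="/mnt/sda1,/mnt/sdb1,/mnt/sdc1"
TARGET="/mnt/media"

# mlimit:   free-space threshold per disk (assumed value: 1GB)
# logfile:  where mhddfs writes its log (hypothetical path)
# loglevel: 0 = debug, 1 = info, 2 = default messages
OPTIONS="allow_other,mlimit=1g,logfile=/var/log/mhddfs.log,loglevel=1"

MOUNT_CMD="mhddfs $BRANCHES $TARGET -o $OPTIONS"
echo "$MOUNT_CMD"
```

The same comma-separated option string can go into the options field of the /etc/fstab entry.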
If needed, you can add more disks to the array at any time. To do so, unmount the array and mount it again with the additional disks in the list. Then update the entry in your /etc/fstab so that the extended array is mounted automatically.
If you want to stop using the program, remove the line from the /etc/fstab file and delete the mountpoint for the array. If you have a distribution that uses Systemd, you can launch mhddfs via the init system; the "Launching mhddfs with Systemd" box describes this option.
Launching mhddfs with Systemd
Mhddfs does not come with a service file for Systemd. For this reason, you need to create the file, named /etc/systemd/system/mnt-media.mount, then copy the script from Listing 6 to the file. The name of the unit file must match the mountpoint, /mnt/media. The command
systemctl daemon-reload
then tells Systemd to reread its unit files. To mount the array automatically at boot time, type
systemctl enable mnt-media.mount
and to mount it immediately, type
systemctl start mnt-media.mount
Listing 6
/etc/systemd/system/mnt-media.mount
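A mount unit for this purpose might look roughly like the one the following script generates. This is a sketch under stated assumptions, not the original listing: the branch paths are hypothetical, and the script deliberately writes the draft to a scratch directory (the UNIT_DIR variable is an invention of this example) so that you can review the file before copying it to /etc/systemd/system as root.

```shell
#!/bin/sh
# Write the draft to a scratch directory unless UNIT_DIR is set.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"
# The unit name must correspond to the mountpoint /mnt/media.
UNIT_FILE="$UNIT_DIR/mnt-media.mount"

cat > "$UNIT_FILE" <<'EOF'
[Unit]
Description=mhddfs array on /mnt/media
# Assumed branch mountpoints; make sure they are mounted first.
RequiresMountsFor=/mnt/sda1 /mnt/sdb1 /mnt/sdc1

[Mount]
# Same source notation as in an /etc/fstab entry for a FUSE filesystem
What=mhddfs#/mnt/sda1,/mnt/sdb1,/mnt/sdc1
Where=/mnt/media
Type=fuse
Options=defaults,allow_other

[Install]
WantedBy=multi-user.target
EOF

echo "Draft unit written to $UNIT_FILE"
```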
On the Safe Side
The driver, which is what mhddfs is at the end of the day, focuses on a single task in the classic Unix style, and it does its job well. However, it does not offer any kind of backup in the case of failure. A disk failure in the array will therefore cause loss of data. One drawback in practice is that you do not know where the software will store a new file and thus do not know what data you stand to lose if a disk dies on you.
The only remedy is to back up the data involved. Mhddfs is often used in combination with SnapRAID [6] to add a modicum of safety. Beyond this, you can also mirror the array one-to-one. To do so, create a second mhddfs instance on your backup disk and synchronize the two instances using Rsync or a similar tool [7].