ZFS on Linux with FUSE
Listing 2 demonstrates a RAID-Z array with three (virtual) disks. RAID-Z2 works the same way, except that the keyword is raidz2 instead of raidz. The commands in Listing 2 create a pool with RAID functionality. Note that ZFS will not let you extend the capacity of a RAID-Z array directly: You can't just add new disks to it. However, there is a workaround. As shown in Listing 3, you can replace the existing disks, one at a time, with three larger disks.
$ for i in $(seq 3); do dd if=/dev/zero of=/tmp/rpool$i bs=1024 count=65536; done
$ zpool create rpool raidz /tmp/rpool1 /tmp/rpool2 /tmp/rpool3
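As a rough sanity check on a RAID-Z layout like this one, usable capacity is approximately the combined size of the disks minus the parity disks. The helper below is a minimal sketch of that arithmetic. The function name is made up for illustration, and the result ignores ZFS metadata overhead, so the figures ZFS itself reports will differ somewhat.

```shell
# Hypothetical helper: rough usable capacity of one RAID-Z group, in KiB.
# Arguments: number of disks, size of the smallest disk in KiB,
# parity level (1 for raidz, 2 for raidz2).
# Ignores ZFS metadata and padding overhead.
raidz_usable_kib() {
    disks=$1
    size_kib=$2
    parity=$3
    if [ "$disks" -le "$parity" ]; then
        echo "need more disks than parity devices" >&2
        return 1
    fi
    echo $(( (disks - parity) * size_kib ))
}

# Three 64 MiB file-backed disks, as created with bs=1024 count=65536.
raidz_usable_kib 3 65536 1
```

With raidz2 the third argument becomes 2, so the same three disks would leave only one disk's worth of space for data.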
Replacing the Disks
$ for i in $(seq 4 6); do dd if=/dev/zero of=/tmp/rpool$i bs=1024 count=128000; done
$ zpool replace rpool /tmp/rpool1 /tmp/rpool4
$ zpool replace rpool /tmp/rpool2 /tmp/rpool5
$ zpool replace rpool /tmp/rpool3 /tmp/rpool6
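On current OpenZFS releases, a pool whose disks have all been replaced does not necessarily grow by itself. The extra space is claimed when the autoexpand pool property is on or when each device is brought online with the -e option. Whether the ZFS-FUSE version described here behaves the same way is an assumption. The sketch below only prints the commands unless DRY_RUN=0 is set; the run and expand_pool helpers are hypothetical names.

```shell
# Sketch only: print (or, with DRY_RUN=0, execute) the commands that let a
# pool grow once every disk in the RAID-Z group has been replaced.
# Assumes an OpenZFS-style zpool with the autoexpand property.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: echo the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

expand_pool() {
    pool=$1
    shift
    run zpool set autoexpand=on "$pool"
    for disk in "$@"; do
        # -e asks ZFS to expand the device to use all available space
        run zpool online -e "$pool" "$disk"
    done
    run zpool list "$pool"
}

expand_pool rpool /tmp/rpool4 /tmp/rpool5 /tmp/rpool6
```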
Alternatively, you can increase the capacity by adding further mirror or RAID-Z groups to the existing pool, rpool (Listing 4). This technique makes sense when the new disks are the same size as the existing disks. As long as you have more than two disks, RAID-Z is preferable to mirroring for failure safety reasons.
$ zpool add rpool mirror /tmp/rpool4 /tmp/rpool5
$ zpool add rpool raidz /tmp/rpool4 /tmp/rpool5 /tmp/rpool6
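Because an added group becomes a permanent part of the pool, it can help to preview the change first. The zpool add command accepts -n, which displays the resulting configuration without modifying the pool. The wrapper below is a minimal sketch, with a hypothetical function name, that shows the preview and applies the change only after confirmation.

```shell
# Sketch: preview a vdev addition with "zpool add -n", then ask before
# applying it. add_vdev_checked is a hypothetical helper name.
add_vdev_checked() {
    pool=$1
    shift
    # -n prints the layout that would result, without changing the pool
    zpool add -n "$pool" "$@" || return 1
    printf 'Apply this layout? [y/N] '
    read -r answer
    case $answer in
        y|Y) zpool add "$pool" "$@" ;;
        *)   echo "aborted" ;;
    esac
}

# Usage (run as a user allowed to modify the pool):
#   add_vdev_checked rpool raidz /tmp/rpool4 /tmp/rpool5 /tmp/rpool6
```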
Preventing Data Loss
Modern hard disks have self-test functions that let you check the current hardware status by running a special tool. If a disk is in a critical state, ZFS lets you take it offline in the pool so that you can check the hardware:
zpool offline rpool /tmp/rpool3
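The article does not name the self-test tool. On Linux, a common choice is smartctl from the smartmontools package, and the sketch below assumes it. For ATA disks, smartctl -H prints an overall health line that ends in PASSED when the disk considers itself healthy. The helper name and the /dev/sdX placeholder are hypothetical.

```shell
# Succeeds if the "smartctl -H" output on stdin reports a passing
# overall health check (ATA-style output is assumed).
smart_passed() {
    grep -q "test result: PASSED"
}

# Usage (as root; /dev/sdX is a placeholder for a real pool member):
#   if ! smartctl -H /dev/sdX | smart_passed; then
#       zpool offline rpool /dev/sdX
#   fi
```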
If the hardware turns out to have an irreparable defect, you have no alternative but to replace it with the zpool replace command, as shown in Listing 3. Whereas offline simply disables the disk in the array, replace swaps out the existing medium.
ZFS then proceeds to synchronize the pool, which can take a couple of minutes. The zpool status command keeps you up to date with the current status.
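Rather than rerunning zpool status by hand, a script can poll it until the synchronization (ZFS calls it resilvering) has finished. The following is a minimal sketch. It assumes that the status output contains the phrase "resilver in progress" while the rebuild is running, and the function names and the five-second default interval are arbitrary choices.

```shell
# Succeeds if the zpool status text on stdin reports a running resilver.
# Assumes the output contains the phrase "resilver in progress".
resilver_running() {
    grep -q "resilver in progress"
}

# Poll the pool until the resilver is done, then print the final status.
# Second argument: polling interval in seconds (default 5).
wait_for_resilver() {
    pool=$1
    interval=${2:-5}
    while zpool status "$pool" | resilver_running; do
        sleep "$interval"
    done
    zpool status "$pool"
}

# Usage:
#   zpool replace rpool /tmp/rpool3 /tmp/rpool6
#   wait_for_resilver rpool
```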
If you are wondering why Apple is so interested in ZFS, consider a feature in Mac OS X: Time Machine stores filesystem states and lets users restore older ones. Time Machine itself runs on HFS+ and relies on hard links rather than ZFS, but ZFS snapshots provide this kind of functionality directly in the filesystem.
In OpenSolaris, the developers have integrated snapshots with Nautilus. On Linux, you currently have no alternative but to use the command line. To create a snapshot, type zfs snapshot rpool@created. The @ sign is mandatory; the string following it is an arbitrary name for the snapshot. The zfs list command outputs the existing pools and snapshots (Listing 5).
$ zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
rpool           409K   266M  32,2K  /rpool
rpool@created      0      -  32,2K  -
If you change the pool after taking the snapshot (for example, by copying or adding files), the values in the USED and REFER columns will differ from those recorded at snapshot time. If you accidentally delete some data, zfs rollback rpool@created is all it takes to restore the pool to the state it was in when the snapshot was taken.
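The whole cycle of snapshot, change, and rollback can be sketched as a small shell function. This is an illustration under stated assumptions rather than a tested recipe for ZFS-FUSE: the function name and the scratch file are hypothetical, the pool is assumed to be mounted at /rpool unless another mount point is passed, and no snapshot named created may exist yet.

```shell
# Sketch of a snapshot / change / rollback cycle.
# Arguments: pool name (default rpool), mount point (default /<pool>).
snapshot_demo() {
    pool=${1:-rpool}
    mnt=${2:-/$pool}
    zfs snapshot "$pool@created" || return 1
    # Change the pool after the snapshot was taken
    echo "scratch data" > "$mnt/scratch.txt"
    # List the pool together with its snapshots
    zfs list -t all -r "$pool"
    # Discard everything written since the snapshot
    zfs rollback "$pool@created" || return 1
    if [ -e "$mnt/scratch.txt" ]; then
        echo "scratch.txt still present"
    else
        echo "rollback removed scratch.txt"
    fi
}

# Usage (requires sufficient privileges on the pool):
#   snapshot_demo rpool
```

A snapshot that is no longer needed can be removed with zfs destroy rpool@created.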
Compared with the current crop of popular Linux filesystems, ZFS has some very interesting features, such as the integration of the volume manager and RAID and the ability to create snapshots. Other promising traits include online compression and the ability to export and import pools. These benefits make it quite clear how big a lead ZFS has over its competitors right now. Although Oracle's Btrfs promises similar abilities, it will take some time until it is ready for production use.
- ZFS: http://opensolaris.org/os/community/zfs/
- OpenSolaris: http://www.opensolaris.com
- Btrfs: http://btrfs.wiki.kernel.org/index.php/Main_Page
- FUSE: http://fuse.sourceforge.net/
- ZFS FUSE: https://developer.berlios.de/projects/zfs-fuse/
- Snapshot integration in Nautilus: http://blogs.sun.com/erwann/entry/zfs_on_the_desktop_zfs