Staying in sync with a network filesystem

Think Sync

OFS uses synchronization in two situations: to keep the cache up to date while the server connection is up and to reintegrate any local changes with the server when a lost connection is re-established.

Alternative tools, such as Coda and Intermezzo, provide software on the server side to let the server report changes to all relevant clients. OFS, of the other hand, is a client-only solution – it is the user's responsibility to trigger the cache update process manually.

When updating the cache, OFS relies on the files' modification timestamps. Then the OFS daemon compares the timestamp of the original file in the share to the copy in the cache. If the remote file has changed, OFS copies it to the local cache.

To synchronize, OFS must perform a full search of the offline directory tree. Because this process stresses the network, the hard disk, and the CPU, the developers decided to do without full automation based on polling.

When a program issues an open() system call, the update process is triggered. At that point, OFS checks to see whether the file in its offline directory is up to date; if not, it downloads the latest version from the server. OFS uses the local cache to answer read access requests, whereas write requests modify both the cache and the share (Figure 4). This approach guarantees both fast read access and up-to-date files.

Figure 4: When a file is opened (left), OFS updates the copy in the local cache if possible. Read requests (top right) are served directly from the cache to improve performance. Write requests modify both the cache and (if possible) the original on the server (bottom right).

When the server connection is down, OFS logs all write accesses handled by the cache. Each entry in the logfile contains both the path to the file and a parameter specifying the change. The possible values are: created, deleted, and modified.

Once the server connection is restored, OFS works its way through the logfile, performing each of the steps on the server side: creating new files, deleting existing files, or uploading the cached version to the server.

During reintegration, a file the user has modified locally might have been changed on the server, too. To avoid simply overwriting the file, OFS first checks how current the remote file is. If the modification time is later than the last synchronization time, the remote file must have been changed in the meantime. In this case, OFS launches its conflict-resolution tool.

In some cases – for example, if one user has modified the local file while another user has simply renamed the version on the server – the conflict is easily resolved. However, some conflicts are impossible to resolve automatically, such as cases in which two users modify the file in different ways while it is offline. In this situation, OFS leaves the decision to the users. A diff mechanism analyzes both versions of the file, and a preview shows the results of the proposed conflict resolution. (The OFS configuration decides which diff tool to use.) Text files are typically fairly easy to merge, with the exception of a situation in which both sides have changed the same part of the file. Because it is more difficult to display and merge differences in binary files, the only approach is to choose one of the versions.

The KFilePlugin plugin for Dolphin and Konqueror adds an OFS tab to the Properties dialog field, which is displayed when you right-click a file or directory and select Properties. By selecting the OFS tab, users can add a directory to the local cache or write changes out to the server. This GUI plugin is not a required part of OFS. Machines without a GUI can still use OFS at the command line.

Attributes

For the most part, OFS interacts with applications just like any other filesystem. Its special features include a flag that specifies whether a file should be available offline if the connection is lost. Many filesystems, AFS for example, have their own tools for setting these parameters. OFS, however, removes the need for special tools by relying on extended file attributes, which the user can set and read through standard Linux file utilities.

Extended file attributes make it possible to add metadata to files (as long as the filesystem supports this). The kernel does not handle the get and set commands itself but passes them directly to the filesystem. It is thus the filesystem's responsibility to process or save the attributes. FUSE supports extended file attributes and offers callback functions for setting, deleting, reading, and listing attributes.

OFS relies on this feature to implement its own custom attributes, which serve only to exchange information between the filesystem and the shell. OFS does not save them and does not pass them on to the underlying filesystem; you might say they are virtual attributes. The current version of OFS supports two attributes in the ofs namespace:

  • ofs.offline indicates whether a file is available offline. Setting this attribute triggers the update process and adds the tree to the list of directories available offline.
  • ofs.available indicates whether the file is available right now, or whether the cached version is in use. In other words, it tells you whether the server connection is available. This attribute is read-only.

In both cases, the only two states are "set" (yes) and "not set" (no). setfattr -n attribute_path sets the state, and getfattr -n attribute_path reads it. To delete the attribute completely, you can use setfattr -x attribute_path.

Future

OFS is not a mature product, although the current version is usable if you accept its limitations. The roadmap features a number of extensions. The modular design supports the introduction of more agent classes, for example, to provide more precise details on the availability of a remote filesystem.

The Author

Peter Trommler is Professor of Theoretical Computer Science, Computer Communications, and Information Security at the Georg-Simon-Ohm University in Nuremberg, Germany. He has been working on Unix for 18 years and on Linux for 10 years.

Carsten Kolassa has worked in the Linux community for many years, focusing on embedded topics, networks, and security.

Frank Gsellmann is currently developing the resyncronisation mechanism of OFS project as his Master's thesis. Tobias Jähnel laid the foundations for the project for his thesis; he now works for EB Automotive: http://www.elektrobit.com

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy Linux Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • Managing Linux Filesystems

    Even with all the talk of Big Data and the storage revolution, a steady and reliable block-based filesystem is still a central feature of most Linux systems.

  • A web service filesystem

    The Fuse kernel module lets developers implement even the most idiosyncratic of filesystems. We’ll show you how to build a filesystem that relies on SOAP to publish data over web services.

  • Cmdfs

    Cmdfs builds a filtered virtual filesystem based on a source directory tree. You can even integrate other programs to convert data on the fly.

  • AuFS

    AuFS offers a painless filesystem for a thin client, and FS-Cache provides a persistent cache.

  • Hot Backups

    The tools and strategies you use to back up files that are not being accessed won't work when you copy data that is currently in use by a busy application. This article explains the danger of employing common Linux utilities to back up living data and examines some alternatives.

comments powered by Disqus
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters

Support Our Work

Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.

Learn More

News