Synchronizing data with the Git-annex Assistant

Git Exchange

© Lead Image © Anatoli Babi, 123RF.com

Article from Issue 167/2014

Git-annex Assistant is a handy web interface that lets you use the power of Git to synchronize data across several computers.

Git-annex comes directly from the heart of the Linux ecosystem. It lets users manage files in a Git repository and sync them across multiple storage locations, such as an encrypted archive in the cloud or a backup on an external hard drive or SSH server. Once you have mastered Git-annex, you can accomplish these tasks with ease.

The abundance of options may deter some people from using the tool. This fear is countered by the Git-annex Assistant front end, which hides the complexity behind a modern web interface.

Installation

Although packages are available for the most common distributions like Debian, Fedora, and Ubuntu, these usually lag far behind the current state of development. Debian "Wheezy" and Ubuntu "Precise" have version 3.2 in their repositories, which does not yet support the Assistant. Ubuntu "Trusty" at least comes with version 5.2, but because of the fast pace of development, it makes more sense to install the precompiled binary archives [1].

For this manual installation, you just need to extract the archive and add the git-annex.linux folder to your path. Listing 1 shows the steps required to modify the PATH variable, but this change only lasts for the current session. For a permanent installation, add the git-annex.linux folder to PATH in a startup file; in Ubuntu, you can do this in ~/.pam_environment or ~/.bashrc.

Listing 1

Adding git-annex.linux to PATH
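A minimal sketch of the temporary path change, assuming the archive was extracted in your home directory so that the folder is ~/git-annex.linux. The ANNEX_DIR variable is a name chosen for this illustration, not something Git-annex itself defines.

```shell
# Assumption: the archive was unpacked in the home directory, creating
# ~/git-annex.linux; adjust ANNEX_DIR if you extracted it elsewhere.
ANNEX_DIR="$HOME/git-annex.linux"

# Prepend the folder to PATH for the current shell session only,
# skipping the change if the folder is already listed.
case ":$PATH:" in
  *":$ANNEX_DIR:"*) ;;
  *) PATH="$ANNEX_DIR:$PATH" ;;
esac
export PATH
```

Because the change lives only in the running shell, it disappears when you close the terminal. Placing the same lines in ~/.bashrc is one way to make it stick.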

 

Git-annex Assistant

The web interface (Figure 1) is part of Git-annex and was created as a result of a crowdfunding campaign. It complements the extensive set of commands with a focus on simple input screens for creating repositories and for configuring repositories in the cloud, including their encryption. Moreover, the front end lets you configure synchronization between data repositories via dialogs and check the status of current operations.

Figure 1: The Git-annex Assistant web interface simplifies some tasks for which you would otherwise have to use the command line.

To start the application, type the following

git-annex-webapp

at the command line. The software then automatically opens the browser and calls a URL with the format:

http://127.0.0.1:59739/?auth=<Token>

The <Token> consists of a long string of letters and numbers. For security reasons, both the port and the token change every time you launch the application. The lower part of Figure 1 gives an overview of the currently synchronized repositories. The sample includes local data in the ~/annex directory.

If you look in this folder, you will only find the hidden directory, .git. However, if you create a file in the ~/annex folder, the Assistant running in the background automatically creates a Git commit and takes the file into its care. Repositories you create in the web interface are set to "direct mode," so this action is transparent to the user [2]. However, the command line lets you peek behind the curtain to see exactly what Git-annex is doing (Listing 2).

Listing 2

Checking the Log
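One way to inspect the commits the Assistant creates is to use plain Git, sketched here on the assumption that the repository lives in ~/annex as in the example above. The show_annex_log function and the ANNEX_REPO variable are hypothetical names introduced for this illustration.

```shell
# Assumption: the repository lives in ~/annex, as in the article's example.
ANNEX_REPO="${ANNEX_REPO:-$HOME/annex}"

# Print the five most recent commits along with the files each one touched.
show_annex_log() {
  repo="$1"
  # Bail out early if the directory is not a Git repository.
  [ -d "$repo/.git" ] || { echo "no repository in $repo" >&2; return 1; }
  git -C "$repo" log --stat -n 5
}

# Show the log; do not abort the shell if the repository is missing.
show_annex_log "$ANNEX_REPO" || true
```

Each file you drop into the folder should show up here as a commit made in the background, without you having run any Git command yourself.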

 

Local Pairing

In a network in which you can access the clients directly via SSH, "local pairing" offers a way of automatically synchronizing files. The members of a workgroup can exchange data on the shared network without a central server. As a starting point, you just need the individual computers with the repositories and an SSH server running on each system.

One client grants another client access to the data: The Add another repository button takes you via Local computer to a Secret Phrase prompt – this is the password for the pairing. Armed with this password, the clients can mutually authorize each other for the data exchange. In pairing, the data is multicast via UDP port 55556; the subsequent synchronization in turn uses SSH [3].

Authenticating via public keys makes the setup far easier because it removes the need to enter passwords repeatedly. Moreover, it increases security considerably. The result of these pairings is a set of Git remotes, each pointing to the other client (Listing 3). Although the configuration of the remotes could also be handled manually, the web interface simplifies the task considerably.

Listing 3

git remote
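The remotes that pairing sets up can be listed with the standard git remote command. The following sketch again assumes the repository is in ~/annex; the list_annex_remotes function and the ANNEX_REPO variable are hypothetical names used only for this illustration.

```shell
# Assumption: the paired repository lives in ~/annex.
ANNEX_REPO="${ANNEX_REPO:-$HOME/annex}"

# List every configured remote together with its fetch and push URLs.
list_annex_remotes() {
  repo="$1"
  # Bail out early if the directory is not a Git repository.
  [ -d "$repo/.git" ] || { echo "no repository in $repo" >&2; return 1; }
  git -C "$repo" remote -v
}

# Show the remotes; do not abort the shell if the repository is missing.
list_annex_remotes "$ANNEX_REPO" || true
```

After a successful pairing, each client should list the other as a remote reachable over SSH. The exact remote names depend on your hosts and are not shown here.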

 

The client folders stay in sync after pairing. To add another computer, just repeat the above steps, and migrate the data in the ~/annex folders to the other machines. A data exchange takes place immediately when changes in the directory occur, and this includes removing a file. This method does not protect you against loss due to accidental deletion.

Until now, the sync only took place on the local network. However, a person in the field or on the road does not always have access to the workgroup data. A centralized server that all clients can reach provides a remedy.
