Manage your music collection with Picard
Pickin' and Grinnin'
Getting that iTunes experience requires more than just Amarok or Rhythmbox. It also requires planning – especially if you went digital before the Linux desktop had audio players.
While the younger set lives blissfully unimpressed by anything that preceded downloaded music, many Linux users nostalgically hang on to compact disc collections. Tools such as Gnome's Sound Juicer, Gtk's Grip, and KDE's KAudioCreator copy CDs to hard drives easily, and the modern versions of these tools let users add information about the CD to the files and directories in which they are stored. But if you don't have this information or it is outdated, mislabeled, or just plain wrong, where do you get it and where do you put it? Simply put, it all comes down to tagging.
Recently I tagged my entire 400+ CD collection using Picard, the official tag editor for the MusicBrainz project. Despite its lack of meaningful documentation, Picard is both stable and easy to use. It allowed me to tag all my music quickly, relabeled the directory structures in a more appropriate and consistent manner, and even added album art. The result is a music collection fully compliant with Amarok, MythMusic, KPlaylist, and a host of other multimedia tools for Linux.
In this article, I examine the standard tag format for audio files known as ID3, I discuss working with tags in Picard and MusicBrainz, and I walk you through these tools so you can update your audio collections quickly and painlessly right from the desktop.
Picard and MusicBrainz work on ID3 tags. ID3 is a format that adds metadata – additional text information – to digital audio files. ID3v1, the first version of ID3, was created to address the lack of metadata support in the original MP3 specification. It adds a chunk of data to the end of the audio file with various information, such as the album name and artist. Because this version of the standard for identifying MP3 files does not support internationalization, and because the information is often stored as text in the user's native language, many players end up displaying the extra information incorrectly. Additionally, this format used a very small chunk of metadata, which forced the truncation of long song and album names.
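To make the "chunk of data at the end of the file" concrete, the following Python sketch reads an ID3v1 tag. It relies on the commonly documented ID3v1 layout: 128 bytes starting with the marker TAG, followed by 30-byte title, artist, album, and comment fields, a 4-byte year, and a 1-byte genre code. The function names are my own, and decoding the text as Latin-1 is only an assumption, because the tag does not say which encoding was used.

```python
import struct

ID3V1_SIZE = 128  # an ID3v1 tag is always the last 128 bytes of the file


def parse_id3v1(block):
    """Return the fields of a 128-byte ID3v1 block as a dict, or None if absent."""
    if len(block) != ID3V1_SIZE or block[:3] != b"TAG":
        return None
    _, title, artist, album, year, comment, genre = struct.unpack(
        "<3s30s30s30s4s30sB", block
    )

    def text(raw):
        # Fields are padded with NUL bytes (sometimes spaces). The encoding is
        # not recorded in the tag, so Latin-1 is a guess (assumption).
        return raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,  # numeric index into the ID3v1 genre list
    }


def read_id3v1(path):
    """Read the ID3v1 tag from the end of the file at path, if there is one."""
    with open(path, "rb") as audio:
        audio.seek(0, 2)  # jump to the end to learn the file size
        if audio.tell() < ID3V1_SIZE:
            return None
        audio.seek(-ID3V1_SIZE, 2)
        return parse_id3v1(audio.read(ID3V1_SIZE))
```

The fixed 30-byte fields explain the truncation problem: any title longer than 30 bytes simply does not fit.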
ID3v2 was created to address the shortcomings of ID3v1, although in reality, the two versions are not related. Whereas ID3v1 was a de facto standard with limited capabilities for appending data to files, ID3v2 is a fully accepted standard offering as much as 256MB of metadata.
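The 256MB figure follows from the way ID3v2 stores the tag size. As far as I know from the ID3v2 specification, the tag starts with a 10-byte header: the marker ID3, two version bytes, a flags byte, and a 4-byte "syncsafe" size in which only the lower 7 bits of each byte are used. That leaves 28 bits, or just under 256MB. The sketch below decodes such a header; the function names are hypothetical.

```python
def syncsafe_to_int(data):
    """Decode a syncsafe integer: 7 useful bits per byte, high bit always 0."""
    value = 0
    for byte in data:
        if byte & 0x80:
            raise ValueError("not a syncsafe integer: high bit is set")
        value = (value << 7) | byte
    return value


def parse_id3v2_header(header):
    """Return version, flags, and tag size from the first 10 bytes of a file."""
    if len(header) < 10 or header[:3] != b"ID3":
        return None
    major, revision, flags = header[3], header[4], header[5]
    return {
        "version": f"ID3v2.{major}.{revision}",
        "flags": flags,
        # Size of the tag body in bytes, not counting the 10-byte header
        "size": syncsafe_to_int(header[6:10]),
    }
```

Unlike ID3v1, this header sits at the start of the file, so a player can read the metadata before it reaches any audio data.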
Files with ID3v2 tags will help you get an iTunes-like experience out of your music players. Some tools allow you to add ID3v1 tags, ID3v2 tags, or both. Where you have the choice, choose ID3v2. Additionally, you might be offered variations on ID3v2. In this case, choose the highest version available, such as ID3v2.3 or ID3v2.4.
Multiple Tag Formats
Picard provides an option to remove another tag format known as APE from your audio files. If you find that format, enabling this option will remove it and help prevent problems that can arise from having two types of tags applied to a single file.
Additionally, you can embed cover art into the tags or have it saved as separate files in the album folders. Picard can apply tags to a wide variety of audio formats, including MP3, OGG, and FLAC. Picard writes ID3v2.4 tags by default, but you can configure it to write ID3v2.3. This might be necessary to work around a problem when using tagged files with iTunes, but Linux players probably won't care either way.
Compatibility with Audio Players
Of the Linux music players I tested for ID3 compatibility – Banshee, Amarok, Rhythmbox, Audacious, XMMS, VLC, MPlayer, and Xine – only Amarok appeared to have problems with the ID3 tags on my audio files. As it scanned the directory of music, Amarok printed the following message for nearly every track it found:
TagLib: ID3v2.4 no longer supports the frame type TDAT. It will be discarded from the tag.
Apparently Amarok has moved to the latest release of ID3v2 and is recognizing tag information that is no longer supported in that new release. Fortunately, it just ignores the outdated data.
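For background, TDAT is the ID3v2.3 date frame, which stores the day and month as four digits (DDMM) alongside the year in TYER and the time in TIME (HHMM). To my knowledge, ID3v2.4 replaces these frames with a single timestamp frame, TDRC. The following sketch shows how the old values map onto the new timestamp format. It is an illustration only, not what TagLib or Amarok actually do (the message above says the frame is discarded), and the function name is hypothetical.

```python
def to_tdrc(tyer, tdat=None, time=None):
    """Combine ID3v2.3 TYER/TDAT/TIME values into an ID3v2.4-style timestamp."""
    stamp = tyer
    if tdat and len(tdat) == 4 and tdat.isdigit():
        day, month = tdat[:2], tdat[2:]  # TDAT is stored as DDMM
        stamp += f"-{month}-{day}"
        # A time of day only makes sense once the full date is known
        if time and len(time) == 4 and time.isdigit():
            stamp += f"T{time[:2]}:{time[2:]}"  # TIME is stored as HHMM
    return stamp
```

For example, a year of 1976 with a TDAT value of 2508 becomes 1976-08-25.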
Banshee, Amarok, and Rhythmbox are the iTunes-style players that show cover art, as well as additional information. Each of these must scan the music directories to create a database and utilize the ID3 information – and none of these applications share their database with any of the others.
Audacious and XMMS are simple players. Audacious shows more ID3 information and can show cover art. As far as I can tell, XMMS does neither.
VLC, MPlayer, and Xine are all media players that are more typically used for video playback. VLC will display ID3 information, but it won't display cover art unless you grab it. (I couldn't get it to use the local cover art already downloaded.) If you start MPlayer on a command line, you'll see ID3 information, but it doesn't display cover art. Xine just plays the file and doesn't display ID3 information.
Note that my tests were far from exhaustive. Most of these players will let you edit the tag information directly, but doing this manually for a large collection would take a while.
Picard and MusicBrainz
MusicBrainz is a website that provides a large database of album metadata. Access to this data is offered directly through the website or through applications that can read XML data. The database is user maintained, so any user can provide updates.
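As a rough idea of what "applications that can read XML data" do, the Python sketch below builds a search URL and pulls release titles out of an XML reply. Several things here are assumptions rather than facts from this article: the endpoint and XML namespace are those of the MusicBrainz version 2 web service as I understand it, which might differ from the service available when this was written, and the sample reply is hand-written and heavily trimmed, with placeholder IDs and titles. The code does not contact the network.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Assumed values: the version 2 web service endpoint and its XML namespace
WS_RELEASE_URL = "https://musicbrainz.org/ws/2/release/"
MMD_NS = {"mb": "http://musicbrainz.org/ns/mmd-2.0#"}

# Hand-written, trimmed sample reply; IDs and titles are placeholders
SAMPLE_RESPONSE = """
<metadata xmlns="http://musicbrainz.org/ns/mmd-2.0#">
  <release-list count="2" offset="0">
    <release id="00000000-0000-0000-0000-000000000001">
      <title>Example Album</title>
    </release>
    <release id="00000000-0000-0000-0000-000000000002">
      <title>Another Example Album</title>
    </release>
  </release-list>
</metadata>
"""


def build_search_url(artist, limit=5):
    """Build a release search URL for the given artist name."""
    params = {"query": f'artist:"{artist}"', "limit": limit}
    return WS_RELEASE_URL + "?" + urllib.parse.urlencode(params)


def release_titles(xml_text):
    """Return (release id, title) pairs found in an XML reply."""
    root = ET.fromstring(xml_text.strip())
    return [
        (release.get("id"), release.findtext("mb:title", default="", namespaces=MMD_NS))
        for release in root.iterfind(".//mb:release", MMD_NS)
    ]


print(build_search_url("Boston"))
for release_id, title in release_titles(SAMPLE_RESPONSE):
    print(release_id, title)
```

A real client would fetch the URL, identify itself with a meaningful User-Agent header, and keep its request rate low. Picard handles all of this for you.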
Picard is a Python-based, cross-platform application that queries the MusicBrainz website for album metadata and simplifies the process of tagging your collection. The application uses an acoustic fingerprint in an effort to identify the audio files in an album and find the closest matches in the database. When you first use Picard, be sure to configure the default release country and enable use of folksonomies for genres in the options dialog (Options | Options). Folksonomies are community-based information that might improve the application of genres to audio files.
Picard opens with a folder browser on the left, a middle column for temporary identification, and a right column that shows the matched track and album data (Figure 1). First, find a folder of music files in the browser and drag it into the Unmatched Files entry in the middle column. I dragged a folder containing folders for each of Boston's three albums, and Picard immediately began to identify the audio files and posted matching albums in the right column.
Any files Picard can't match with data from MusicBrainz remain in the Unmatched Files collection. To match these, Picard couples with your web browser. Clicking on the entry under Unmatched Files fills in known information in the Original Metadata fields at the bottom of the window (Figure 2). Then click on the Lookup button, which opens a browser window in which you can enter the artist name and any additional information. Next, click on the Search button, choose the appropriate entry, and hit the green tagger button (Figure 3). This adds the album to the window on the right; then you can drag any other tracks you have in Unmatched Files onto the matching track name under the album.
If you use Lookup, usually a single CD will have multiple near-perfect matches because CDs are often released in multiple formats in different parts of the world. Thus, although the US version might have 10 tracks, the UK version of the same CD could have 11 tracks, or the order of tracks might be different. MusicBrainz does an excellent job of providing sets of almost exact matches from which to choose.
Alternatively, if the track isn't matched but Picard shows the album in the list on the right anyway, just drag the track from Unmatched Files to its matching album entry in that list. Tracks matched to the wrong album can be dragged from that album to the correct album (if shown) in the list on the right.
Picard is packaged for most popular Linux distributions; however, you might need to install extra packages to get acoustic fingerprinting. For example, on Fedora, you need to install both the picard and picard-freeworld packages.