Accessing iTunes XML metadata with Python

Build a Tree

The parse() method reads an XML file object and builds a hierarchical tree of elements. Each element may have any number of child elements, including none. The getroot() method returns the root element of the tree (Listing 3).

Listing 3

Build Tree from XML File

import xml.etree.ElementTree as ET
# xmlinfile: the iTunes XML file object (or its path)
tree = ET.parse(xmlinfile)
root = tree.getroot()
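
To try the parse step without an iTunes export at hand, a minimal sketch along the following lines may help. The SAMPLE_XML string is a hypothetical, heavily trimmed stand-in for a real library file, and io.StringIO takes the place of the open file object.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for an iTunes library export.
SAMPLE_XML = """<plist version="1.0">
<dict>
    <key>Tracks</key>
    <dict>
        <key>101</key>
        <dict>
            <key>Name</key><string>Song A</string>
            <key>Artist</key><string>Artist A</string>
        </dict>
    </dict>
</dict>
</plist>"""

# parse() accepts a filename or a file object; StringIO stands in for a file.
tree = ET.parse(io.StringIO(SAMPLE_XML))
root = tree.getroot()

# Tag of the top-level element, then the tags of its direct children.
root_child_tags = [child.tag for child in root]
print(root.tag)
print(root_child_tags)
```

Printing the tags of the root's children is a quick way to confirm the shape of the tree before descending into it.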

Track Dictionary

Track metadata is stored as a hybrid of nested lists and dictionaries with key-value pairs. The dictionary containing the metadata is nested two levels below the root. Each find() call in Listing 4 returns the first direct child element with a <dict> tag, so the two statements together step down two levels from the root.

Listing 4

First and Second Generation Dictionaries

dict_gen1 = root.find('dict')
dict_gen2 = dict_gen1.find('dict')

The second-generation dictionary contains a list of all track dictionaries in the library or playlist. After locating the track dictionaries, you can work on the child elements that contain the metadata for each track.
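
The two find() calls can be tried on a small example. The sketch below assumes a hypothetical, trimmed sample document with two tracks; it also shows that find() returns None when no child matches, which is worth checking before chaining calls on a real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed sample with two track dictionaries.
SAMPLE_XML = """<plist version="1.0">
<dict>
    <key>Tracks</key>
    <dict>
        <key>101</key>
        <dict>
            <key>Name</key><string>Song A</string>
        </dict>
        <key>102</key>
        <dict>
            <key>Name</key><string>Song B</string>
        </dict>
    </dict>
</dict>
</plist>"""

root = ET.fromstring(SAMPLE_XML)

# find() returns the first direct child with the given tag, or None.
dict_gen1 = root.find("dict")
if dict_gen1 is None:
    raise ValueError("no top-level <dict> element found")
dict_gen2 = dict_gen1.find("dict")

# The second-generation dictionary alternates <key> and <dict> children.
child_tags = [child.tag for child in dict_gen2]
print(child_tags)

# A tag that is not present yields None rather than raising an error.
missing = root.find("array")
print(missing)
```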

Element Tags and Text

To get the track metadata, you need to find the track metadata tags and then retrieve the text from those tags. The statement in Listing 5 uses the findall() method to find all the track dictionaries directly beneath dict_gen2 and stores them in a list object named tracklist.

Listing 5

Tracklist Metadata Dictionaries
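
A statement matching this description could look like the sketch below. It is an assumption based on the surrounding text rather than a verified copy of the listing, and the sample XML is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed sample with two track dictionaries.
SAMPLE_XML = """<plist version="1.0">
<dict>
    <key>Tracks</key>
    <dict>
        <key>101</key>
        <dict>
            <key>Name</key><string>Song A</string>
            <key>Artist</key><string>Artist A</string>
        </dict>
        <key>102</key>
        <dict>
            <key>Name</key><string>Song B</string>
            <key>Artist</key><string>Artist B</string>
        </dict>
    </dict>
</dict>
</plist>"""

root = ET.fromstring(SAMPLE_XML)
dict_gen1 = root.find("dict")
dict_gen2 = dict_gen1.find("dict")

# findall() returns a list of every direct child with a matching tag,
# so the <key> children of dict_gen2 are skipped.
tracklist = dict_gen2.findall("dict")
print(len(tracklist))
```

Because findall() matches only direct children when given a plain tag name, dictionaries nested deeper inside a track are not picked up.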


Next, extract from tracklist the lists of child elements that hold track metadata. Not all <dict> elements contain the metadata you want, so you can use the Artist text of a <key> tag to identify the desired dictionaries (see Figure 2 and Listing 6). An element's text is available through its text attribute.

Listing 6

Create Metadata List for All Tracks

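itunes_music = []
for item in tracklist:
    children = list(item)
    # Keep a track only once, and only if it has a <key> whose text is "Artist"
    if any(child.tag == "key" and child.text == "Artist" for child in children):
        itunes_music.append(children)

Figure 2: Tag and text.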

The itunes_music list object now holds one list of child elements for each song track in the input XML file. At this point, you may want to inspect the resulting metadata. Each <key> tag is followed by the corresponding <string> or <integer> tag. The code in Listing 7 assumes that the tagtrue() function was previously defined (for brevity, the code is not shown here); it tests for <key> tag text that matches the desired metadata strings (Figure 3) and then prints the text of each key-value pair to your screen.

Listing 7

Display Metadata

for i in range(len(itunes_music)):
    for j in range(len(itunes_music[i])):
        if tagtrue(itunes_music[i][j].text):
            # Print the key text and the value text that follows it
            print(itunes_music[i][j].text, itunes_music[i][j+1].text)

Figure 3: Loop index rows and columns.
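
Because tagtrue() is not shown, the following is only a guess at its shape: a hypothetical version that tests the key text against a set of wanted field names. The WANTED_KEYS set and the single-track sample XML are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of metadata keys to display; adjust as needed.
WANTED_KEYS = {"Name", "Artist", "Album"}

def tagtrue(text):
    """Return True if the <key> text names a wanted metadata field."""
    return text in WANTED_KEYS

# Hypothetical single track dictionary.
TRACK_XML = """<dict>
    <key>Track ID</key><integer>101</integer>
    <key>Name</key><string>Song A</string>
    <key>Artist</key><string>Artist A</string>
</dict>"""

track = list(ET.fromstring(TRACK_XML))

pairs = []
# Stop one short of the end so that track[j + 1] always exists.
for j in range(len(track) - 1):
    if track[j].tag == "key" and tagtrue(track[j].text):
        pairs.append((track[j].text, track[j + 1].text))

for key_text, value_text in pairs:
    print(key_text, value_text)
```

Checking the tag as well as the text guards against a value, such as a song title, that happens to match one of the key names.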

After confirming the metadata, you can use it however you need. Because the list is organized as a two-dimensional array, I will represent it in table form, as it would appear in a spreadsheet or database table (Figure 3). Note that the i loop (the outer loop variable) represents rows, and the j loop (the inner loop variable) represents columns.
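
One way to arrive at such a table is sketched below. The track_to_row() helper, the column list, and the sample XML are all hypothetical; the sketch relies on the alternating key and value layout described earlier.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample holding two track dictionaries.
TRACKS_XML = """<dict>
    <dict>
        <key>Name</key><string>Song A</string>
        <key>Artist</key><string>Artist A</string>
    </dict>
    <dict>
        <key>Name</key><string>Song B</string>
        <key>Artist</key><string>Artist B</string>
    </dict>
</dict>"""

# Same shape as itunes_music: one list of child elements per track.
itunes_music = [list(item) for item in ET.fromstring(TRACKS_XML).findall("dict")]

def track_to_row(children):
    """Pair each <key> element with the value element that follows it."""
    row = {}
    for key_elem, value_elem in zip(children[::2], children[1::2]):
        if key_elem.tag == "key":
            row[key_elem.text] = value_elem.text
    return row

columns = ["Name", "Artist"]  # hypothetical choice of table columns
rows = [track_to_row(children) for children in itunes_music]

# One table row per track, one cell per chosen column.
table = [[row.get(col, "") for col in columns] for row in rows]
for line in table:
    print(line)
```

Using row.get() with a default keeps the table rectangular even when a track lacks one of the chosen fields.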
