Storing metadata in files
Reading Lamp
Users can easily display the content of an XMP packet with the exempi command-line tool by using the -x option. This tool comes with the library of the same name; on Debian, it is found in the exempi package and on openSUSE in exempi-tools.
Listing 1 shows a shortened version of a typical XMP packet. The x:xmpmeta root element wraps the data in the XMP packet. Inside it, the rdf:RDF tag declares the rdf namespace with the URI http://www.w3.org/1999/02/22-rdf-syntax-ns#.
Listing 1
PDF File XMP Packet
The XMP elements always reside in rdf:Description blocks. For these, the rdf:about attribute is mandatory, although it usually remains empty in XMP. XMP data can be written in different styles. Commonly, the elements of one namespace are collected in a single Description block, which then declares that namespace.
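As a rough illustration of this structure, the following sketch builds an empty XMP skeleton with the xml.etree.ElementTree module from the Python standard library. The URI adobe:ns:meta/ for the x prefix is the one commonly seen in XMP packets; the function name build_xmp_skeleton is made up for this example and does not belong to any XMP library.

```python
import xml.etree.ElementTree as ET

NS_X = "adobe:ns:meta/"  # assumption: the usual URI bound to the x prefix
NS_RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Ask ElementTree to serialize with the familiar prefixes instead of ns0, ns1.
ET.register_namespace("x", NS_X)
ET.register_namespace("rdf", NS_RDF)


def build_xmp_skeleton():
    """Return an x:xmpmeta element holding one empty rdf:Description block."""
    xmpmeta = ET.Element(f"{{{NS_X}}}xmpmeta")
    rdf = ET.SubElement(xmpmeta, f"{{{NS_RDF}}}RDF")
    # rdf:about has to be present, but it is left empty here.
    ET.SubElement(rdf, f"{{{NS_RDF}}}Description", {f"{{{NS_RDF}}}about": ""})
    return xmpmeta


skeleton = build_xmp_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```

A real packet additionally carries the property elements inside the Description block; they are omitted here to keep the frame visible.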
Three Dublin Core elements – dc:format, dc:title, and dc:creator – are found in the XMP packet. The title of the document is in an alternative list (rdf:Alt) in several language versions (xml:lang), and the creator is in a sequential list (rdf:Seq).
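To make the two list types more concrete, here is a minimal sketch that reads them with the standard library's xml.etree.ElementTree. The packet below is invented for illustration and is not the article's Listing 1; the function name and the sample values are likewise assumptions.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical packet, made up for this example.
SAMPLE_PACKET = """\
<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:format>application/pdf</dc:format>
   <dc:title>
    <rdf:Alt>
     <rdf:li xml:lang="x-default">Sample Report</rdf:li>
     <rdf:li xml:lang="de">Beispielbericht</rdf:li>
    </rdf:Alt>
   </dc:title>
   <dc:creator>
    <rdf:Seq>
     <rdf:li>Jane Doe</rdf:li>
     <rdf:li>John Doe</rdf:li>
    </rdf:Seq>
   </dc:creator>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
"""


def read_dublin_core(packet):
    """Collect dc:format, the localized titles, and the ordered creators."""
    root = ET.fromstring(packet)
    desc = root.find("rdf:RDF/rdf:Description", NS)
    # rdf:Alt: one entry per language, keyed by its xml:lang attribute.
    titles = {
        li.get(XML_LANG): li.text
        for li in desc.findall("dc:title/rdf:Alt/rdf:li", NS)
    }
    # rdf:Seq: the order of the entries is significant, so keep a list.
    creators = [li.text for li in desc.findall("dc:creator/rdf:Seq/rdf:li", NS)]
    return {
        "format": desc.findtext("dc:format", namespaces=NS),
        "titles": titles,
        "creators": creators,
    }


info = read_dublin_core(SAMPLE_PACKET)
print(info["titles"].get("x-default"), info["creators"])
```

The sketch only handles this one writing style; because XMP allows several, a general-purpose reader is better served by a dedicated library.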
Core XMP elements such as xmp:CreateDate and xmp:ModifyDate, which bear date stamps, are found in a further Description block. It states here that FrameMaker 8.0 produced this document (xmp:CreatorTool, line 20). Specific description elements for PDF documents (xmlns:pdf) and data fields for media management (xmlns:xmpMM) follow in other blocks. Table 1 provides an overview of some XMP elements and namespaces.
Table 1
A Selection of XMP Elements and Namespaces
| Description | Content | Format |
|---|---|---|
| Dublin Core (http://purl.org/dc/elements/1.1/) | | |
| | Title of document or item | Alternative list with |
| | Producer (person or organization) | Ordered list |
| | Name of the publishing entity | Unordered list |
| | Collection of keywords | Unordered list |
| | Language of the item | Unordered list with RFC 3066 tags |
| | File format of the object | MIME type |
| | ISBN/ISSN, URN, DOI, and others | Text |
| XMP Core Elements (http://ns.adobe.com/xap/1.0/) | | |
| | Object's date of production | Date stamp |
| | Tool of production | Text |
| | Modification date of the metadata | Date stamp |
| | Modification date of the object | Date stamp |
| | Rating of the object | Score from -1 to 5 |
| XMP Rights Management (http://ns.adobe.com/xap/1.0/rights/) | | |
| | Copyright marking | True/false |
| | Rights holder | Unordered list |
| | License/use terms | Alternative list with |
| XMP Media Management (http://ns.adobe.com/xap/1.0/mm/) | | |
| | Identifier of the object | GUID stamp |
| | Identifier of an object instance | GUID stamp |
Python XMP Toolkit
XMP applications can be programmed without great effort with the help of a few Python libraries (see the "XMP and Exif with Python" box). The Python XMP Toolkit [7] was developed by the European Space Agency (ESA), among others, to manage images from the Hubble Telescope (Figure 1). The current version is 2.0.1.
XMP and Exif with Python
Alongside the XMP Toolkit, there are free Python libraries for programming Exif applications that can also handle XMP (although the Exif libraries do not support the same file formats). None of these tools is implemented in pure Python; instead, they are all bindings to existing C or C++ libraries.
Under the hood, the XMP Toolkit is a wrapper written with ctypes around Exempi [8], an offshoot of Adobe's official XMP Software Development Kit (the current Exempi version, 2.3.0, is based on Adobe XMP SDK 4.1.1).
Pyexiv2 [9] is a binding to the Exiv2 C++ library [10], implemented with Boost.Python, which developers can use to program applications for Exif, IPTC, and XMP metadata (current version 0.3.2). Because Pyexiv2 is no longer under active development, a switch to GExiv2 is recommended.
GExiv2 [11] is a wrapper around the Exiv2 library for the GObject programming environment (current version 0.10.3). The software supports GObject introspection, which Python programmers can access via PyGObject [12]. To do this, you need the gir1.2-gexiv2, python-gi, and python3-gi packages (e.g., on a Debian system). Then, use the following statement:
from gi.repository import GExiv2
to import the library.
If you install the XMP Toolkit's Linux package, you also get the necessary Exempi library on your computer. So far, the Toolkit has only been available in a few distributions, such as Debian and its offshoots, where it is found in the python-libxmp and python3-libxmp packages. Alternatively, you can install it from the Python Package Index [13].
The online documentation is currently incomplete [14]; Debian users are better off installing the python-libxmp-doc documentation package [15]. Alternatively, programmers can build the documentation from the GitHub repo or search the source code directly for the docstrings.
The libxmp.files.XMPFiles class controls the handling of files in the Python XMP Toolkit, and the libxmp.core.XMPMeta class offers a range of methods for manipulating XMP packets in memory. To represent XMP, the Toolkit defines its own complex data object, which Python's usual XML tools cannot process (nor do they need to). The example in Listing 2 demonstrates a few simple operations with the Toolkit in an IPython session.
Listing 2
Python XMP Toolkit Demo
Action Mode
The script in the listing imports both main classes and opens the requested file, loading the XMP packet via get_xmp() into the in-memory object myxmp. If the file does not yet contain an XMP packet, the Toolkit creates an empty template with x:xmpmeta and the basic RDF framework in memory.
The consts module offers a range of constants for the common namespaces: consts.XMP_NS_DC, for instance, represents the Dublin Core URI, and consts.XMP_NS_XMP represents the URI for the core XMP elements.
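The following sketch shows one way such constants might be used to turn a prefixed name like dc:title into the namespace URI and property name that the Toolkit's methods expect. It falls back to the literal URIs from Table 1 if the Toolkit cannot be imported; the NAMESPACES mapping and the split_qname helper are hypothetical names for this example, not part of libxmp.

```python
try:
    # Real constants, if the Python XMP Toolkit and Exempi are installed.
    from libxmp import consts
    NS_DC = consts.XMP_NS_DC
    NS_XMP = consts.XMP_NS_XMP
except Exception:
    # The import can fail if the Toolkit or the Exempi library is missing;
    # in that case, use the same URIs as plain strings.
    NS_DC = "http://purl.org/dc/elements/1.1/"
    NS_XMP = "http://ns.adobe.com/xap/1.0/"

# Hypothetical helper mapping, limited to the two namespaces named above.
NAMESPACES = {"dc": NS_DC, "xmp": NS_XMP}


def split_qname(qname):
    """Split 'dc:title' into (namespace URI, 'title')."""
    prefix, _, local = qname.partition(":")
    return NAMESPACES[prefix], local


print(split_qname("dc:title"))
```

The returned pair has the shape of the first two arguments passed to calls such as get_property() and set_property().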
The get_localized_text() method returns entries localized with xml:lang in alternative lists (e.g., the x-default entry of dc:title). The XMP object can be manipulated in a targeted way with set_property(), for instance, by converting the x-default language setting to en. The set_localized_text() method changes localized data in alternative lists; in dc:title, for example, it could add a short German title with the localization xml:lang=de.
Next, get_property() is called again, this time to request the value of the xmp:MetadataDate element. This is a date stamp in ISO 8601 format. Developers can create a new date stamp (now) with the datetime module from the Python standard library [16] and overwrite xmp:MetadataDate in the XMP packet with set_property().
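Creating such a date stamp needs nothing beyond the standard library. The sketch below is one possible way to do it; the function name xmp_timestamp and the choice of UTC with second precision are assumptions made for this example, not requirements stated in the article.

```python
from datetime import datetime, timezone


def xmp_timestamp(moment=None):
    """Format a datetime as an ISO 8601 string, truncated to whole seconds."""
    if moment is None:
        # Timezone-aware "now", so the string carries an explicit UTC offset.
        moment = datetime.now(timezone.utc)
    return moment.replace(microsecond=0).isoformat()


now = xmp_timestamp()
print(now)
# The string could then be passed as the value argument of set_property()
# for the xmp:MetadataDate element.
```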
Alternatively, the set_property_datetime() method deals with date stamps. The can_put_xmp() method checks whether the XMP packet can be written to the opened file (e.g., that the file is not write protected). If so, put_xmp() writes the packet to the file, and close_file() closes it.
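The closing steps can be summarized in a small helper. This is only a sketch under stated assumptions: it uses the method names mentioned above, assumes that set_property_datetime() accepts a datetime object, and treats a missing packet defensively. The function name refresh_metadata_date and the file name in the usage comment are hypothetical. The helper takes an already opened file object, so it can be tried out without the Toolkit.

```python
from datetime import datetime, timezone

XMP_NS_XMP = "http://ns.adobe.com/xap/1.0/"  # same URI as consts.XMP_NS_XMP


def refresh_metadata_date(xmpfile, when=None):
    """Update xmp:MetadataDate on an opened XMPFiles-like object.

    Returns True if the packet was handed back to the file, False otherwise.
    The file is closed in every case.
    """
    when = when or datetime.now(timezone.utc)
    try:
        xmp = xmpfile.get_xmp()
        if xmp is None:  # defensive: nothing to update
            return False
        # Assumption: the method takes a datetime object as its value.
        xmp.set_property_datetime(XMP_NS_XMP, "MetadataDate", when)
        if not xmpfile.can_put_xmp(xmp):
            return False
        xmpfile.put_xmp(xmp)
        return True
    finally:
        xmpfile.close_file()


# Possible usage with the Toolkit installed (file name is made up):
# from libxmp import XMPFiles
# xmpfile = XMPFiles(file_path="document.pdf", open_forupdate=True)
# refresh_metadata_date(xmpfile)
```

Closing the file in a finally clause keeps the handle from staying open if one of the intermediate calls raises an error.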