Customizing file formats with unoconv
Flexible Import/Export
A hidden utility in the LibreOffice toolbox, unoconv offers a wide array of import and export filter options for use at the command line.
LibreOffice is designed to save, import, or export one file at a time, using standard filter settings. The File menu allows you to choose PDF export options, but for most other types of files, you must use the default filter settings. If you want to convert multiple files or adjust the filter settings, you need to shift to the command line and run unoconv [1], a little-known Python script that gives you greater control over both batch conversions and filter settings, with a wide array of import and export filter options.
Unoconv is short for Universal Network Objects (UNO) conversion, a reference to the UNO API used by both LibreOffice and OpenOffice [2]. UNO bindings are available for C++, Java, and Python, and the API is used to create extensions, as well as to provide support for formats not visible in the LibreOffice desktop window, such as the obsolete OpenOffice.org 1.0 file formats.
Unsurprisingly, unoconv requires access to LibreOffice's resources. The easiest way to provide this access is to install unoconv on a system that already has LibreOffice installed. However, as detailed in the man page, you can also use the --connection (-c) option followed by a comma-separated connection string to connect to a remote LibreOffice instance, or --listener (-l) to start unoconv as a listener that other unoconv clients can connect to.
Unoconv's basic command structure (Figure 1) is:
unoconv [FILE].EXTENSION
Other files can be added, either in a space-separated list or by using shell wildcards. The command structure assumes that you are exporting the file(s) to PDF format, which is probably the most widely used operation for the command. The extension is the quickest way to specify the type of file, although alternatively you can use the option --doctype (-d) [TYPE], specifying document, graphics, presentation, or spreadsheet. Formulas, databases, and charts are not supported by unoconv, no doubt because of a lack of demand: these types of documents have existed in LibreOffice and its predecessor OpenOffice.org for over a decade. If you prefer to see confirmation that the command has been successfully carried out, you can also add up to three --verbose (-v) options. Without at least one, unoconv only displays error messages, and the only sign of a completed conversion is the return to the command prompt.
If you want to change the export format, add the option --format= (-f). The supported formats for both exports and imports are displayed by running unoconv --show. Supported formats include text, CSV, dBase, HTML, PDF, several versions of Microsoft Office formats, formats from StarOffice (LibreOffice's original ancestor), common graphic formats, and, of course, current LibreOffice formats (Figure 2).
In addition, unoconv includes several housekeeping options. If a file's attributes matter, you can add --preserve so that the output file has the same attributes as the original file. For batch conversions, you might want to use --output (-o) to place all the output files in a separate directory, rather than have them mixed together with the original files. The output file can also be password protected by adding:
--password=[PASSWORD]
Still another interesting option is to set the output file to the same format as the original, then add --template (-t) [FILE] to add styles from another file to the output. This is a command-line version of the Load Styles feature in the Styles and Formatting window on LibreOffice's desktop interface.
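As a rough sketch of how these options combine in a batch job, the following Python helper assembles an unoconv argument list from the flags described above (-f, -o, --preserve, and -v). The function name build_unoconv_command and the sample file names are hypothetical, not part of unoconv. The sketch only builds the command; handing the list to subprocess.run on a machine with unoconv installed is left to the reader.

```python
def build_unoconv_command(files, fmt="pdf", output_dir=None,
                          preserve=False, verbosity=0):
    """Assemble an unoconv argument list (hypothetical helper, not part of unoconv)."""
    cmd = ["unoconv", "-f", fmt]
    if output_dir is not None:
        cmd += ["-o", str(output_dir)]   # keep results apart from the source files
    if preserve:
        cmd.append("--preserve")         # output keeps the original file attributes
    cmd += ["-v"] * min(verbosity, 3)    # the article mentions up to three -v flags
    cmd += [str(f) for f in files]
    return cmd


if __name__ == "__main__":
    # Sample file names are placeholders.
    print(" ".join(build_unoconv_command(["report.odt", "notes.odt"],
                                         output_dir="converted", verbosity=1)))
```

Building a list rather than a single string means each file name stays one argument, even if it contains spaces.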
Import and Export Filter Settings
For many users, the default filter settings are all that is needed. However, you can adjust both import and export filter settings to your own preferences, using --export (-e) [SETTING] or --import (-i) [SETTING]. Among other purposes, this ability offers an easy way to adjust the character encoding or date formats in the original file.
Filter settings are added directly after --export (-e) or --import (-i). For text and CSV files, the settings are introduced by FilterOptions= and completed by a comma-separated list whose fields are specific to the format. In the list, a setting can be left blank (,,) or, at the end of the list, omitted altogether; either way, the default setting is used.
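One way to picture blank and omitted fields is a small helper that joins settings into a FilterOptions value. This is only a sketch: filter_options is a hypothetical name, None stands for "use the default," and the numeric values are simply the ones that appear in this article's examples.

```python
def filter_options(*tokens):
    """Join settings into a FilterOptions value; None means 'use the default'."""
    parts = ["" if t is None else str(t) for t in tokens]
    # Defaults at the end of the list can simply be dropped.
    while parts and parts[-1] == "":
        parts.pop()
    return "FilterOptions=" + ",".join(parts)


if __name__ == "__main__":
    # Second field left blank, third field set, trailing field omitted.
    print(filter_options(44, None, 76, None))
```

A blank field in the middle keeps its comma so that later fields stay in the right position, while trailing blanks disappear entirely.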
By contrast, PDF and graphics settings are added one at a time, with a separate --export (-e) or --import (-i) option for each setting. In other words, to set a password and set the maximum image resolution to 300dpi in a PDF file, the command would include:
--export PermissionPassword=abcdef --export MaxImageResolution=300
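When several PDF settings are involved, it can help to generate the repeated options from a mapping. The sketch below assumes nothing beyond the two setting names shown above; export_args is a hypothetical helper, and report.odt is a placeholder file name.

```python
def export_args(settings):
    """Turn a mapping of filter settings into repeated --export options."""
    args = []
    for name, value in settings.items():
        args += ["--export", f"{name}={value}"]   # one option per setting
    return args


if __name__ == "__main__":
    # Setting names and values are the ones from the example above.
    pdf_settings = {"PermissionPassword": "abcdef", "MaxImageResolution": 300}
    command = ["unoconv", "-f", "pdf"] + export_args(pdf_settings) + ["report.odt"]
    print(" ".join(command))
```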
A complete list of standard import and export settings is available online [3], but it is far too long to reproduce here. Keep in mind that each type of file has its own set of filter options.
Text Export and Import
For text import, the most common setting to customize is the encoding. A single value can be entered, such as
--import FilterOptions=76
which would set the encoding to UTF-8. For exporting text from a spreadsheet, however, the FilterOptions fields are encoding, field separator, text delimiter, quote all text cells, and save cell content as shown.
CSV Text File Import
CSV files have four basic settings. In order, they are the field separator, the text delimiter, the encoding, and the first line in the file to convert to or from a spreadsheet. For example,
--export FilterOptions=44,34,76,2
will set commas as the field separator, a double quotation mark as the text delimiter, UTF-8 as the encoding, and the second line of the file as the first line to convert. The first two values are ASCII character codes: 44 is a comma, and 34 is a double quotation mark. In theory, at the end of the settings, you could add the date format for each column as a series of column/format pairs joined by slashes, so that:
--export FilterOptions=44,34,76,2,1/5/2/5/3/5
would specify that the date format for the first three columns is YY/MM/DD. Any other columns would use the date format already specified for them.
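Because the separator and delimiter are given as character codes, a small helper can make the CSV settings easier to read. The following sketch assumes the four-field order described above, the value 76 for UTF-8, and column/format pairs chained with slashes as in LibreOffice's CSV filter documentation; csv_filter_options is a hypothetical name.

```python
def csv_filter_options(separator=",", delimiter='"', encoding=76,
                       first_line=1, column_formats=None):
    """Build a CSV FilterOptions value from readable settings (hypothetical helper)."""
    tokens = [
        str(ord(separator)),   # field separator as a character code, "," -> 44
        str(ord(delimiter)),   # text delimiter as a character code, '"' -> 34
        str(encoding),         # 76 is the UTF-8 value used in this article
        str(first_line),       # first line of the file to convert
    ]
    if column_formats:
        # Assumption: pairs are chained with slashes, {1: 5, 2: 5} -> "1/5/2/5"
        tokens.append("/".join(f"{col}/{code}"
                               for col, code in sorted(column_formats.items())))
    return "FilterOptions=" + ",".join(tokens)


if __name__ == "__main__":
    print(csv_filter_options(first_line=2, column_formats={1: 5, 2: 5, 3: 5}))
```

Switching to a tab-separated file would then only require passing separator="\t" rather than looking up the code by hand.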