OpenType and HarfBuzz

Hacking OpenType Fonts

© Lead Image © lassedesignen, Fotolia.com

Article from Issue 205/2017
Author(s):

Thanks to OpenType and HarfBuzz, you can now modify fonts like a designer.

Until now, digital fonts have been a static resource. From an application, users can select a font, as well as its size and weight, but no other options have been available to the average user. However, with OpenType [1] established as the dominant format for font files on all operating systems, a new relationship is opening up in which users can modify the display of fonts with the strategic placement of code tags. It's a command-line solution applied to desktop technology.

The OpenType format specifications were first drawn up by Microsoft and Adobe in 1996. In 2007, the format became an ISO standard, and it soon became the file format of choice for font designers, including those developing free-licensed fonts. OpenType has a distinct advantage over both Type 1 (PostScript) and TrueType fonts (the main formats in the 1990s), because its Unicode support potentially gives it a full range of accents and other diacritical marks, as well as all the major writing systems – or scripts, as OpenType refers to them. As computing moved away from the English-dominated ASCII standard, neither Type 1 nor TrueType could measure up. At the same time, because OpenType is backward compatible with both Type 1 and TrueType, fonts in the older formats can still be used by batch converting them with a script in FontForge, the open source font editor. You can tell an OpenType font by its file extension, usually .otf.

As OpenType became popular, its support in free software was aided by the spread of HarfBuzz [2], which renders a Unicode character into the corresponding glyph in a font. In the early years of the millennium, HarfBuzz became necessary, because font rendering was done by three separate applications: FreeType, Pango, and Qt. This meant that the same font could be displayed differently in different applications. At first, lead developer Behdad Esfahbod released an early version of HarfBuzz – now called Old HarfBuzz – that tried to draw on all three font renderers, but in the end, the complications proved so great that an independent version was written from scratch, becoming the HarfBuzz used today in Chrome OS, Chrome, Firefox, and KDE. Other projects, such as Gimp, Inkscape, Krita, and Scribus, are now in the process of switching to HarfBuzz as well.

HarfBuzz receives attention mostly for its support of all Unicode-supported scripts. Together with a font like Noto, which supports the full array of Unicode characters up to version 6.1, HarfBuzz today can consistently represent not only simple scripts such as English, Greek, or Japanese, which use sequences of standalone glyphs (Figure 1), but also more complex scripts like Persian, Arabic, or the Indic scripts (Figure 2). In these scripts, some glyphs are joined, while others stand alone and can change position according to context. In fact, the name "HarfBuzz" is a transliteration of the Persian for "OpenType," reflecting both the format it was designed for and the fact that its 15-year-long development began with Esfahbod's efforts to render Persian correctly in web browsers.

Figure 1: A simple script like Japanese or English consists of discrete characters in a standard order.
Figure 2: A complex script like Arabic or Persian uses a combination of joined and separate letters, whose positioning can change with context.

That is a time-consuming accomplishment, because each language requires its own shaping rules. Yet what HarfBuzz does for Western European languages like English is almost as impressive. Just as word processors were an improvement over typewriters because they could print italics, rather than indicating them by underlining, so HarfBuzz is an improvement over earlier font renderers because it automatically adds available advanced features. With the spread of OpenType and HarfBuzz, computer users now have the potential to produce copy as polished as anything coming from a professional print shop. What is more, users can select the features they want to use with a bit of simple hacking.

Code Tags

With OpenType and HarfBuzz, features are controlled by an extensive series of code tags. Most features can be enabled simply by giving their four-letter tag. For example, to enable the use of italics, you would use the tag ital. Similarly, -ital turns off italics. Both of these notations also have a longer form that adds a value. Usually, the value simply toggles the feature – for example, ital=1 is the same as ital, whereas ital=0 is the same as -ital. However, in a few cases, the value selects one of several choices.
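The notation described above can be illustrated with a short Python sketch. It is not HarfBuzz's own parser, only a minimal approximation that assumes the three forms mentioned in the text (a bare tag, a tag with a leading minus sign, and a tag followed by =value); the function name parse_feature is made up for this example.

```python
from typing import Tuple


def parse_feature(spec: str) -> Tuple[str, int]:
    """Turn 'ital', '-ital', or 'ital=N' into a (tag, value) pair.

    Illustrative only: this mimics the notation described in the
    article, not the full syntax that HarfBuzz itself accepts.
    """
    spec = spec.strip()
    value = 1  # a bare tag means "on"
    if spec.startswith("-"):
        value = 0  # a leading minus means "off"
        spec = spec[1:]
    if "=" in spec:
        # An explicit value overrides the default
        tag, _, raw = spec.partition("=")
        value = int(raw)
    else:
        tag = spec
    if len(tag) != 4:
        raise ValueError(f"feature tags are four characters long: {tag!r}")
    return tag, value


for example in ("ital", "-ital", "ital=1", "ital=0"):
    print(example, "->", parse_feature(example))
```

In this sketch, ital and ital=1 yield the same pair, as do -ital and ital=0, which matches the equivalences given above. A value greater than 1 would simply be passed through for those features that offer several choices.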

Many OpenType tags are relevant mainly to a particular language, although experimenting with tags in another language can sometimes lead to interesting results. Here, I am concerned only with those useful in English and, to a lesser extent, in other Western European languages that use a Latin-based alphabet, because those are the typographical standards I know and the ones most readers are likely to find relevant.

Wikipedia gives a complete list of OpenType tags [3]. Some enable features that are available in applications like word processors, but may not be available in applications like web browsers. Next, I discuss some of the most useful.

Case Tags

English uses uppercase and lowercase characters. For layout, one of the most useful case features is small capitals, a modified set of glyphs that makes a string of uppercase letters fit better into a body of text. Most word processors already support small capitals, but many do so by manufacturing their own versions, which are clumsy makeshifts that look nothing like the small capitals the font designers intended (Figure 3). Table 1 lists the tags used for capitals.

Table 1: Capital Tags

Tag     Function
smcp    Replaces lowercase letters with small capitals
c2sc    Replaces uppercase letters with small capitals
cpsp    Improves spacing between all-capital text

Figure 3: Small capitals improve the look of strings of uppercase letters.

Numeral Tags

English generally uses lining (aka ranging) figures, whose glyphs all have the same height and sit on the baseline. However, professional designers often prefer old-style figures, in which the glyphs vary in height and some descend below the baseline (Figure 4). Additionally, applications like spreadsheets can benefit from tabular figures, in which each glyph has a uniform width, although they are rarely available.

Figure 4: Lining or ranging figures are most common today, but old-style figures are more readable in blocks of text.

Typically, users write fractions with full-sized characters, making them difficult to read. Special character dialogs can replace a few common fractions, but OpenType fonts can be set to reduce the size of any two numbers separated by a forward slash, making them more readable. Some users might also appreciate the use of a slashed zero to distinguish zero from an uppercase O. Table 2 shows the tags for modifying numerals.

Table 2: Numeral Tags

Tag     Function
lnum    Replaces numerals with lining figures
onum    Replaces numerals with old-style figures
tnum    Replaces numerals with tabular figures
frac    Reduces the size of any two numbers separated by a forward slash
zero    Replaces a regular zero with a zero with a slash through it
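As a rough illustration of how numeral tags might be combined in practice, the following Python sketch assembles a command line for hb-shape, the command-line utility that ships with HarfBuzz. It assumes that hb-shape is installed and that the font implements the requested features. The font file name MyFont.otf and the helper name hb_shape_command are placeholders invented for this example, and the check that rejects lining and old-style figures together is a design choice of the sketch, not a HarfBuzz rule.

```python
from typing import List, Sequence


def hb_shape_command(font_path: str, text: str, tags: Sequence[str]) -> List[str]:
    """Build (but do not run) an hb-shape command with the given feature tags."""
    # Lining and old-style figures are alternatives, so this sketch
    # refuses a request for both at once.
    if "lnum" in tags and "onum" in tags:
        raise ValueError("choose either lnum or onum, not both")
    features = ",".join(tags)  # hb-shape takes a comma-separated list
    return ["hb-shape", f"--features={features}", font_path, text]


# Hypothetical font file; old-style figures, fractions, and slashed zero
cmd = hb_shape_command("MyFont.otf", "1/2 of 2017", ["onum", "frac", "zero"])
print(cmd)
```

The resulting list could be passed to subprocess.run() to shape the text and inspect which glyphs the font substitutes.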
