Rendering images as text

Text Art

Lead Image © anan punyod, 123RF.com

Author(s): Frank Hofmann, Axel Beckert

If you need to display an image in the terminal or as plain HTML, a variety of smart tools can help with the conversion.

Thanks to increasingly sophisticated technology, displaying high-resolution images on screen is no longer a difficult task. However, these more detailed images (with a combination of greater image size, resolution, and color depth) come at a cost: They consume more storage space and take longer to download from remote sources, such as web servers and webcams.

Sometimes you just need an image to display quickly. You can save time and bandwidth by displaying images at a lower resolution and color depth as text (ASCII or Unicode characters) directly in the terminal, colored with American National Standards Institute (ANSI) color codes [1]. You can also convert the images to plain HTML and CSS and embed the results in a web page. Some text browsers, such as ELinks, then display these images directly in the accessed web page. The Browsh [2] browser takes a similar approach internally and renders images as text itself.

In this article, we will discuss the available tools for converting images to text and explore whether this approach is suitable for everyday use. This article follows up on a previously published article [3] that dealt with tools for creating ASCII art.

The Conversion

First, you need to convert an image into individual characters before embedding or displaying it. To do this, each pixel (or group of pixels) is assigned a suitably colored single letter, or more precisely a glyph (a graphic symbol).
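
To make the idea of mapping pixels to glyphs more concrete, the following Bash sketch translates brightness values into characters from a ten-step ramp and optionally wraps a glyph in a 24-bit ANSI color sequence. The ramp and the function names glyph_for, colored_glyph, and render_row are assumptions chosen for illustration; real converters such as AAlib and libcaca use considerably more refined matching and dithering.

```shell
#!/usr/bin/env bash
# Illustrative glyph ramp, ordered from "empty" to "dense" (an assumption,
# not the ramp used by any particular library).
ramp=' .:-=+*#%@'

# Map a brightness value (0-255) to one glyph from the ramp.
glyph_for() {
    local brightness=$1
    # Scale 0..255 onto the ramp indices 0..9.
    local index=$(( brightness * (${#ramp} - 1) / 255 ))
    printf '%s' "${ramp:index:1}"
}

# Pick a glyph from the average brightness of an RGB pixel and color it
# with a 24-bit ANSI foreground sequence, followed by a reset.
colored_glyph() {
    local r=$1 g=$2 b=$3
    local glyph
    glyph=$(glyph_for $(( (r + g + b) / 3 )))
    printf '\033[38;2;%d;%d;%dm%s\033[0m' "$r" "$g" "$b" "$glyph"
}

# Render one row of grayscale "pixels" as a line of text.
render_row() {
    local value out=''
    for value in "$@"; do
        out+=$(glyph_for "$value")
    done
    printf '%s\n' "$out"
}

render_row 0 64 128 192 255
```

Darker pixels end up as sparse glyphs and brighter pixels as dense ones, which suits a terminal with a dark background; for a light background, you would reverse the ramp.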

The conversion is handled by filters that ship with libraries. For example, you can use the aview and asciiview tools from the Ascii Art Library (AAlib) [4] or img2txt and cacaview from the Colour Ascii Art Library (libcaca) [5]. The end result is a rendering of the image using letters with ANSI control codes. The resulting reduction in resolution and color depth not only reduces the volume of data to be transmitted, but it also means that the converted image can be displayed in a text terminal or text-based browser.

As an example, you can convert a PNG image of the Tux logo into letters with libcaca's img2txt (which replaces imgtoppm from earlier releases of the library). To create a text file named tux.txt, you use the following command:

$ img2txt tux.png > tux.txt

Figure 1 shows an excerpt from the content of the tux.txt file. Without additional switches in the call, the output text is 60 characters wide and formatted as colored ANSI.

Figure 1: An excerpt of tux.txt shows a portion of the Tux logo in the form of ANSI control codes.

You can use the -W (--width) option to set the width of the output. If you do not specify a height with -H (--height), img2txt scales the height of the output to match the original aspect ratio. The -f (--format) option lets you define the output format. Table 1 summarizes the formats that img2txt can output.

Table 1

img2txt Output Options

Option    Description
ansi      ANSI with color codes
bbfr      BBCode [6]
caca      Internal libcaca format
html      HTML with support for CSS and DIV
html3     HTML with tables
irc       IRC with CTRL+K control codes
ps        PostScript
svg       Scalable Vector Graphics (SVG)
tga       Targa image format
utf8      UTF-8 with carriage return
utf8cr    UTF-8 with carriage return and line feed
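
As a sketch of how the width and format options combine, the following hypothetical Bash helper writes one output file per format for a single image. The function name convert_formats, the choice of formats, and the output file names are assumptions for illustration; only the -W and -f options come from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert one image into several img2txt formats.
# Usage: convert_formats IMAGE [WIDTH]
convert_formats() {
    local src=$1 width=${2:-60}
    local base=${src%.*} format
    for format in ansi html svg; do
        # No -H is given, so img2txt derives the height from the
        # original aspect ratio.
        img2txt -W "$width" -f "$format" "$src" > "$base.$format"
    done
}
```

Calling convert_formats tux.png 80 would then create tux.ansi, tux.html, and tux.svg in the current directory, each 80 characters wide.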

If you want to save ANSI images as normal images again, you can use the AnsiLove library [7]. AnsiLove interprets the ANSI codes and renders the result as a PNG file, which saves you from having to take a screenshot of the text terminal.

Displaying Converted Images

For quite some time now, we have been experimenting with tools for displaying the converted images. In the spring of 2020, Axel's toolbox [8] included chafa [9] and aha [10], along with the catimg [11] utility. Although the name might suggest a tool for displaying cat pictures, it is actually a reference to the cat/tac commands, and catimg can do considerably more. During our research, jp2a [12], which works similarly to img2txt, also appeared on the scene.

Both chafa and catimg function as image viewers for the terminal. They have only been part of a stable Debian release since Debian GNU/Linux 10. chafa displays one or more images as a slideshow in the terminal (Figure 2) and scales each image to fit the current width and height of the terminal window. catimg, on the other hand, scales an image based on the terminal width alone, which means that the upper edge of a tall image scrolls out of the terminal display. The following two calls use chafa to show a single image and a slideshow of all PNG files in the current directory:

$ chafa linux.png
$ chafa *.png

Figure 2: Using chafa to show Tux as a text image.

chafa comes with a number of interesting options for effects. For example, you can use -c (--colors) to set the color mode to 2, 16, or 256 colors or to a 24-bit view mode. The -d (--duration) option determines how long an image remains on screen in the slideshow (the default is three seconds). You can define the output size with -s (--size) <width>x<height>. By default, chafa uses the terminal's size, or 80x25 characters if it cannot determine the size. The --watch option redisplays the image each time the file changes. With catimg, you can use the -l option to control how often it plays an animated GIF.
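
The options described above can be combined in a single call. The following Bash sketch wraps them in two hypothetical helper functions; the names slideshow and watch_image and the fixed values (256 colors, 80x25 characters) are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: show images as a slideshow with fixed settings.
# Usage: slideshow SECONDS IMAGE...
slideshow() {
    local seconds=$1
    shift
    # 256 colors, SECONDS per image, output limited to 80x25 characters.
    chafa -c 256 -d "$seconds" -s 80x25 "$@"
}

# Hypothetical wrapper: redisplay a single image whenever the file changes.
watch_image() {
    chafa --watch "$1"
}
```

For example, slideshow 5 *.png would show each PNG file in the current directory for five seconds.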

Neither chafa nor catimg can handle crossfade effects to make the slideshows more interesting. Perhaps the developers will read this article and consider adding such a feature in the future.

Differences

For all their similarities, the tools presented in this article have distinct peculiarities. For example, aview can only handle images in PNM format and only display images in grayscale since AAlib does not support colors. AAlib's asciiview functions as a wrapper around aview that converts images to the PNM format required by aview up front using external tools. Animated GIFs do not work.

Unless you specify otherwise in the call, img2txt converts images to ASCII instead of Unicode glyphs and uses only 16 colors. Depending on the terminal, some characters might flash. Working similarly to img2txt, cacaview is a plain vanilla image viewer that opens a separate window in order to use the best possible terminal settings. As a result, nothing flashes here. However, animated GIFs do not work here either.

Both chafa and catimg display PNGs, JPEGs, GIFs, and many other image formats in more than 16 colors. If the terminal supports it, they can also use Unicode glyphs. In addition, both handle animated GIFs, which they play in an endless loop instead of exiting by themselves.

Because catimg uses the whole terminal width for display by default, you can use the -w option to specify a different output width if necessary. Figure 3 shows this for a width of 100 pixels.

Figure 3: A text image of Linus Torvalds constrained to a width of 100 pixels with catimg.

On the Web

If you're familiar with HTML code, you will have seen the HTML <img> tag. The code in Listing 1 references the tux.png graphics file and embeds it for display at the current location on a webpage.

Listing 1

Including an Image in HTML

<img src="tux.png" alt="Tux, the Linux penguin">

The alt attribute (which has been mandatory since 2011) specifies an image's alternative text that the web browser will display if the image cannot (or is not supposed to) load. For instance, if you view the website with a screen reader such as Orca [13], it will read out the alternative text.

The ANSI HTML Adapter (aha) converts text with ANSI color codes into HTML text with tags that set the colors. Given the appropriate options, img2txt and jp2a do the same for images. You can then copy this output directly into your HTML file (Figure 4).

Figure 4: Tux as a text image on a web page.
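
One way to script the image-to-HTML step is sketched below in Bash. The function name image_to_html, the choice of converter by file extension, and the fixed width of 80 characters are assumptions for illustration; the sketch also assumes that jp2a accepts --html and --width and that img2txt accepts -f html and -W, as documented in their man pages.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert an image to an HTML page made of text.
# Usage: image_to_html IMAGE [OUTPUT.html]
image_to_html() {
    local src=$1
    local out=${2:-${1%.*}.html}
    case "$src" in
        # Hand JPEG files to jp2a ...
        *.jpg|*.jpeg|*.JPG|*.JPEG)
            jp2a --html --width=80 "$src" > "$out" ;;
        # ... and everything else to img2txt.
        *)
            img2txt -f html -W 80 "$src" > "$out" ;;
    esac
}
```

A call such as image_to_html tux.png would write tux.html, which you can open in a browser or copy into an existing page.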

Why would you want to use text images on the web? You might want to show terminal content on a web page without resorting to an image file. This saves data, and the "screenshot" remains more or less in its original format, namely text.

Figure 5 shows a directory listing with a pink background. Using the command sequence in Listing 2, you pipe the output of the ls command to aha, which gives it a pink background, and redirect the resulting HTML to the ls.html file.

Listing 2

HTML with Aha

$ ls --color=always | aha --pink > ls.html

Figure 5: Terminal output as an image in the web browser.

Theoretically, these results can also be read by a screen reader. In testing, however, this currently only works reliably with black and white images. Presumably, the contrast is not high enough for the screen reader if you use other color combinations.

Videos

You can also use these techniques with videos. To do this, you use AAlib and libcaca as video output plugins for MPlayer and VLC (both libraries), xine (AAlib), and mpv [14] (libcaca). The Hasciicam project [15] renders images from a TV card or camera as ASCII images.

The mpv player also comes with its own format, True Color Text (TCT). However, TCT's documentation is quite sparse; it is only mentioned in mpv's help page. (In fact, we could only work out what the TCT acronym stands for from comments in the source code.) More information on the subject would be useful.

Table 2 shows some sample commands for playing movie sequences; Figure 6 shows the matching output using aaxine.

Table 2

Converting Videos with AAlib and libcaca

Program               Sample Command
mpv with TCT library  mpv --vo=tct https://youtu.be/Qd_1t7kw5EA
mpv with libcaca      mpv --vo=caca https://youtu.be/Qd_1t7kw5EA
MPlayer with AAlib    mplayer -vo aa video.mp4
xine with AAlib       aaxine video.mp4

Figure 6: Using aaxine to play a video.

The whole thing works quite well, but it does reach its limits if you need to view recorded keynote presentations, for example. If slides are embedded in the presentation recording, the text on the slides is often unreadable. To read these slides, you would need OCR capabilities in addition to the ability to reduce the resolution and convert to a different format.

QR Codes

What works with images and videos also works with QR codes. Put simply, QR codes are actually just special images that can be displayed at the command line. The advantage here is that you don't have to render the QR code's rough "pixels" as glyphs. You simply need to convert each of the blocks into an empty space or a half or whole Unicode block character.

This means that QR codes look just as crisp on a text-based terminal as they do as real images. The same applies to the text representation on web pages. Helpful tools for doing this include qrencode [16] and qrcode [17] from the Debian go-qrcode package. Figure 7 shows the foobar string as a QR code in text form.

Figure 7: The foobar string as a QR code in text form.
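
The block conversion described above can be sketched in a few lines of Bash. The function name render_halfblocks and the input format (one string of 0s and 1s per pixel row) are assumptions for illustration; this is not how qrencode or qrcode are implemented. Each output character covers two pixel rows, one in its upper half and one in its lower half.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: render a bitmap as Unicode block characters.
# Each argument is one pixel row, written as a string of 0s and 1s.
render_halfblocks() {
    local rows=("$@") top bottom line i r
    # Process two pixel rows per line of text.
    for (( r = 0; r < ${#rows[@]}; r += 2 )); do
        top=${rows[r]}
        bottom=${rows[r+1]:-}   # Empty if the row count is odd
        line=''
        for (( i = 0; i < ${#top}; i++ )); do
            case "${top:i:1}${bottom:i:1}" in
                11)   line+='█' ;;  # Both pixels set: full block
                10|1) line+='▀' ;;  # Only the upper pixel set
                01)   line+='▄' ;;  # Only the lower pixel set
                *)    line+=' ' ;;  # Neither pixel set
            esac
        done
        printf '%s\n' "$line"
    done
}

render_halfblocks 1011 0111 1100
```

Because terminal character cells are roughly twice as tall as they are wide, pairing two pixel rows per text line keeps the pixels approximately square. With qrencode installed, a call such as qrencode -t UTF8 -o - foobar should produce comparable output for a real QR code.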

A possible application for this is provided by the pass-otp one-time password plugin [18] in the pass [19] password manager. With the help of qrencode and qrcode, secrets transferred via QR code can also be output as QR codes – even on a text-based terminal.

Conclusions

A few last suggestions for interesting applications: you might want to take a look at asciinema [20], MapSCII [21], and ASCIIQuarium [22].

With asciinema, you can record terminal sessions as video. In addition, the project website acts as a YouTube replacement and preserves your recorded sequences in 8-bit. MapSCII, a digital atlas based on OpenStreetMap, offers a zoom function (Figure 8): Use A to zoom in, Z to zoom out, and the arrow keys to navigate. The coordinates of the image's center, shown at the bottom of the screen, help with orientation.

Figure 8: MapSCII is an ASCII-based digital atlas with zoom functions.

For a little entertainment at the terminal, try the ASCIIQuarium (Figure 9), which turns your terminal into a virtual aquarium. Happy fishing!

Figure 9: ASCIIQuarium puts diverse aquatic species on your terminal.

The Authors

Frank Hofmann mostly works on the road as a developer, trainer, and author. His favorite places for working are Berlin, Geneva, and Cape Town. Axel Beckert is a Linux system administrator and network security specialist for ETH Zurich's IT services department. He is also involved with the Debian distribution, the Linux User Group Switzerland (LUGS), the Hackerfunk podcast, and various open source projects. Hofmann and Beckert are authors of Debian Package Management.