Organizing character and code sets



When foreign characters occur in program code or data, Perl programmers need a solution that avoids the tribulations of Babel.

In the beginning was the ASCII table: 128 characters, numbered 0 through 127 and encoded in 7 bits, that let users compose English-language texts. It included a handful of special characters found on any typewriter, such as % or $, and of course a number of control characters, such as line feed, form feed, or the bell. It was just a matter of time until non-English speakers started looking for ways to add the accented characters and umlauts their native languages needed, and the first approach was to squash them into the next group of 128 characters. All 256 characters were numbered 0 through 255 and encoded on computers with 8 bits (1 byte) of data. This was the birth of the ISO 8859 family of standards, whose first part, ISO 8859-1, is also known as Latin 1.