Manipulating Binary Data with Bash
ASCII to Unicode
In some situations, you might need to convert an ASCII character into Unicode. ASCII is a 7-bit character set that is normally stored one character per 8-bit byte, whereas the 16-bit Unicode encoding (UTF-16) uses at least 16 bits per character. Converting from ASCII to Unicode might seem complicated, but it is actually quite simple thanks to the backward compatibility built into the Unicode standard: the first 128 Unicode code points are the ASCII characters. To convert ASCII to big-endian UTF-16, you just need to prepend a zero byte to each ASCII character, thus making it a 16-bit character (see Listing 3).
Listing 3
ASCII to Unicode
ascii2unicode() {
  echo "$1" | sed 's/\(.\)/\1\n/g' | awk '/^.$/{ printf("%c%c",0,$0) }'
}

command> ascii2unicode jello
output> jello
In Listing 3, the output of the command appears to show no noticeable change. To get a better view of the binary data behind this text, pipe the output into xxd:
command> ascii2unicode jello | xxd
output> 0000000: 006a 0065 006c 006c 006f .j.e.l.l.o
As you can see, each ASCII value has been prepended with "00", which converts it to a 16-bit Unicode character. Take a closer look at Listing 3 to see what's happening: The output of the echo statement is piped into the sed statement, which places each character of the output on a separate line. The awk command reads the output of the sed command line by line, and when a line contains a single character, it prints that character preceded by a byte with the value 0.
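As a point of comparison, the same byte layout can be sketched in pure Bash. The function name ascii2utf16be is hypothetical and not part of the original listing; the sketch assumes plain ASCII input and relies on the printf builtin emitting a NUL byte for \0. It also sidesteps a possible awk pitfall: with %c, awk may treat numeric-looking input such as 5 as a character code rather than as a character.

```bash
# Hypothetical helper (not from the article): write each character of the
# argument preceded by a zero byte, i.e. big-endian UTF-16 for ASCII input.
ascii2utf16be() {
  local text=$1 i
  for (( i = 0; i < ${#text}; i++ )); do
    # \0 in the format string is a NUL byte; %s is the current character.
    printf '\0%s' "${text:i:1}"
  done
}
```

Piping ascii2utf16be jello into xxd should show the same byte pairs as the output of Listing 3.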
URL Encoding and Decoding
Hexadecimal data is something you see every day, but it often goes unnoticed. When data is passed as a query string in a URL, it may be encoded using special formatting. This formatting consists of a percent sign followed by the two-digit hexadecimal value of an ASCII character. For example, the URL-encoded string "%61%62%63", when decoded, becomes "abc". Listing 4 shows a pair of functions for performing URL encoding and decoding.
Listing 4
URL Encoding and Decoding
urlencode() {
  echo -n "$1" | xxd -p | tr -d '\n' | sed 's/\(..\)/%\1/g'
}

urldecode() {
  tr -d '%' <<< "$1" | xxd -r -p
}

command> urlencode name
output> %6e%61%6d%65
command> urldecode %64%6f%6e%65
output> done
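The functions in Listing 4 use the standard functionality of xxd. When encoding a string, the output of xxd is split into 1-byte chunks (two hexadecimal digits each), and the sed command prepends a "%" to each chunk. When decoding, all percent signs are stripped and the result is piped into xxd -r -p to revert the hexadecimal string to ASCII. Note that urldecode therefore expects every character of its input to be percent-encoded, as in the output of urlencode.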
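URLs found in practice usually mix literal characters with percent-encoded ones, which a decoder that strips every percent sign and hands the rest to xxd cannot handle. The following is a minimal sketch of a more tolerant pair of functions in pure Bash. The names urlencode_reserved and urldecode_mixed are hypothetical, the input is assumed to be plain ASCII, and the decoder assumes the input contains no literal backslashes, because printf '%b' would interpret them as escapes.

```bash
# Hypothetical helper: percent-encode every character outside the
# unreserved set (letters, digits, '.', '~', '_', '-'); ASCII input assumed.
urlencode_reserved() {
  local text=$1 char i
  for (( i = 0; i < ${#text}; i++ )); do
    char=${text:i:1}
    case $char in
      [a-zA-Z0-9.~_-]) printf '%s' "$char" ;;
      # "'$char" makes printf use the numeric value of the character.
      *) printf '%%%02X' "'$char" ;;
    esac
  done
}

# Hypothetical helper: turn each %HH into \xHH and let printf %b expand it;
# characters that are not percent-encoded pass through unchanged.
urldecode_mixed() {
  local encoded=$1
  printf '%b' "${encoded//%/\\x}"
}
```

With these definitions, urlencode_reserved 'a b/c' should produce a%20b%2Fc, and urldecode_mixed should turn that string back into the original.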
Calculating IP Subnets
On an IP network, the subnet mask specifies how many bits of the IP address will be dedicated to the network ID and how many will be used for the host ID. The size of the host ID address space will tell you how many host IP addresses are available. Listing 5 shows how to convert the subnet mask to a binary string and determine the host ID count.
Listing 5
Converting a Subnet Mask
subnetcalc() {
  echo -n "$1" | \
  awk 'BEGIN { FS="." ; printf("obase=2;ibase=A;") } { printf("%s;%s;%s;%s;\n",$1,$2,$3,$4) }' | \
  bc | sed 's/^0$/00000000/g;s/\(.\)/\1\n/g' | \
  awk 'BEGIN { ht = 0; nt = 0; }
  /[01]/ { if ($0=="1") nt++; if ($0=="0") ht++; }
  END { printf("Network bits: %s\nHost bits: %s\nHost IP Count: %d\n",nt,ht,2^ht); }'
}

command> subnetcalc 255.255.192.0
output> Network bits: 18
output> Host bits: 14
output> Host IP Count: 16384
The output of the echo statement is fed into the first awk statement, which generates the input for the following bc command. This input sets obase and ibase and lists each individual octet of the subnet mask. Once bc evaluates the statements, it returns four lines: one line for each octet, written in binary. The following sed statement finds lines containing only "0" and pads them to eight zeros. The sed statement also puts each bit on a line by itself, which is necessary to count the host bits properly.
The final awk statement has three sections. The first section initializes the ht and nt variables, which store the host total bits and network total bits, respectively. The next section searches for lines containing 0 or 1. If the value is 1, the network total is incremented, and if the value is 0, the host total is incremented. The final section prints the summary data for the network, including the host and network bit counts, along with the host IP count.
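The same bit counting can also be sketched with Bash integer arithmetic alone, without bc, sed, or awk. The function name subnetbits is hypothetical, and the sketch assumes a well-formed dotted-quad subnet mask as its only argument.

```bash
# Hypothetical helper: count the 1 bits (network) in a dotted-quad mask
# and derive the host bits and the size of the host address space.
subnetbits() {
  local IFS=. octet n bits=0
  local -a octets
  read -r -a octets <<< "$1"        # split the mask on "."
  for octet in "${octets[@]}"; do
    n=$(( 10#$octet ))              # force base 10, even with leading zeros
    while (( n > 0 )); do
      bits=$(( bits + (n & 1) ))    # add the lowest bit
      n=$(( n >> 1 ))               # shift to the next bit
    done
  done
  printf 'Network bits: %d\nHost bits: %d\nHost IP Count: %d\n' \
    "$bits" "$(( 32 - bits ))" "$(( 1 << (32 - bits) ))"
}
```

Note that the host IP count, here as in Listing 5, is the full size of the host address space; on a conventional subnet, the network and broadcast addresses are typically excluded, which leaves two fewer usable host addresses.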