Prepare calculations and chart results with Bash Math, Shell Style
Database Backup Example
MySQL or MariaDB relational databases are very common inside Linux servers or desktops. Their content can be saved in plain text files, whose content can then be used for any calculation necessary. Of course, the first question that comes to mind is "why not perform those calculations inside the database? Aren't database engines optimized to do just that?"
The answer to this question is yes, relational databases are much faster than Bash, but they can only process their own content! If you need to do anything more than that with your files (from comparing the numbers inside the database with those inside some other file to using the same numbers to decide what to do next), you must do it outside the database. Bash is an ideal environment for such tasks. There are two other advantages of working on plain text backups instead of the actual tables in the live database. One is that you can compare at any moment backups made at different times. Another is that, starting from the same database backup file, you can generate with Bash many partial dumps to combine their content in Bash in ways that would not be possible with any direct database queries. One application of this approach in small business would be finding all customers with unpaid balances over some threshold inside the database, in order to automatically send them reminders by email, each with the corresponding invoice attached. In the same way, a teacher could generate custom exercises for each student using research or financial databases, depending on previous results.
To produce output that is immediately usable for this kind of work, use:
mysqldump -u DBUSER --password=PASSWORD --skip-extended-insert > database-backup.sql
This is the simplest approach, because the --skip-extended-insert
option dumps each database record on a separate line. If the database contained only one table called purchaseorders
, composed of four columns that corresponded to the unique identifier, date, description, and amount of each purchase, the file database-backup.sql
would contain lots of lines with this format:
INSERT INTO `purchaseorders` VALUES (44,'2004-10-20','stationery',1855.0000);
Knowing that the database has the above format, you may save in a separate file all the records that are about stationery purchased between 2004 and 2005 with this command:
egrep "'2004\-|2005\-" stationery-expenses-2004-2005.sql
Then extract the amounts only, as follows:
cut '-d,' -f2,4 stationery-expenses-2004-2005.sql >expense-amounts.txt
Here, I have used the cut
command, setting the column delimiter to the comma ('-d,'
) and selecting only the second and fourth column (-f2,4
) to save only dates and amounts inside the file expense-amounts.txt
. Now, for example, if I wanted to calculate the average import of those stationery expenses, I would feed the second column of expense-amount
to the awk
program:
cat expense-amounts.txt | awk '-F,' '{sum+=$2} END { print "Average = ",sum/NR}'
The -F
switch tells awk
to use commas as column separators instead of spaces, which are its default. The commands inside braces mean "read the file one line at a time, adding to the sum
variable the value in the second column; when the file is finished, print the string Average
, followed by the value of sum
divided by the number of records (NR
).
While the above mix of grep
, cut
, and awk
is not the most efficient, I wanted to show you all three tools working together to give you a good idea of their potential for massive number crunching in shell scripts. Combining these programs, you can extract, reformat, and preprocess, with very little effort on your part, lots of numeric data of any nature. The awk
trick above works the same way, for example, when you want to calculate the average size of all the files in the current directory, including all its subfolders. Try it for yourself:
find . -type f -exec ls -l {} \; | awk '{sum+=$5} END { print "Average = ",sum/NR}'
Charts
While I am not sure that a picture is always worth a thousand words, a chart is surely worth much more than a huge, unreadable table full of numbers. The tool I recommend to handle such tasks from shell scripts today is the same one I used in the 90s: the venerable Gnuplot [14]. The graphs it creates are not the fanciest, but Gnuplot is very well documented and available on almost any operating system, including Linux distributions optimized for tiny servers. Besides, using Gnuplot is really simple (see Figure 1):
- 1 Save the numbers to plot in a plain text data file.
- 2 Save the plotting commands for Gnuplot in an instruction file.
- 3 Tell Gnuplot to execute the instruction file on the data file.
-
Creating the data file on the fly consists of running awk
commands similar to the one used in the database example or applying the techniques I presented in this series' previous installments. Concerning plotting commands, unless you have really uncommon needs, 90 percent of the time a quick online search will return commands ready to copy and paste into your script with little or no changes. The result will be spartan, but it will deliver clear charts like Figure 2 (to see how I generated this, see [15]) that your script can then embed in web pages, send by email, or use in any other way desired.
The Big Picture
If all you needed to know about "math" in Bash were its arithmetic operators, a tiny cheatsheet would suffice. Since this is not the case, this installment is different from the previous ones: This time, presenting one, relatively complex script that solves one specific real-world problem would have been too reductive. I wanted to introduce, instead, with the minimum amount of working code, approaches and tools suitable for any kind of number-related processing and show how they fit together (see Figure 3).

There are two take-away lessons from this installment. First, it is easy to insert single formulas in Bash scripts, using bc
if necessary. Second, very often a Bash script is the most efficient way to run or prepare calculations on large quantities of numbers of any nature, from any source (including the Internet). The real power of Gnuplot, awk
, bc
, and so on is in how you combine them in the same shell script. Give it a try!
Infos
- "Tutorials – Custom Shell Scripts" by Marco Fioretti, Linux Magazine, issue 219, February 2019, pp. 84-88
- "Tutorials – Complex Containers" by Marco Fioretti, Linux Magazine, issue 220, March 2019, pp. 84-89
- "Tutorials – Shell Flow Control" by Marco Fioretti, Linux Magazine, issue 221, April 2019, pp. 86-91
- "Tutorials – Shell Test Conditions and Exit Codes" by Marco Fioretti, Linux Magazine, issue 222, May 2019, pp. 84-88
- Genius: http://www.jirka.org/genius.html
- Graph: http://www.padowan.dk
- bc man page: https://linux.die.net/man/1/bc
- Handling time differences: https://stackoverflow.com/questions/4946785/how-to-find-the-difference-in-days-between-two-dates
- "Bash Scripting – Arithmetic, Logical, Relational and Bitwise Operators": http://www.yourownlinux.com/2016/12/bash-arithmetic-logical-relational-bitwise-operators.html
- "Geographical Distance Between Long and Lat in Bash": http://www.unix.com/shell-programming-and-scripting/250356-geographical-distance-between-long-lat-bash.html
- "Calculate Distance and Azimuth": http://www.unix.com/shell-programming-and-scripting/129989-calculate-distance-azimuth.html
- "Work the Shell – Calculating the Distance between Two Latitude/Longitude Points" by Dave Taylor, Linux Journal, December 1, 2009: http://www.linuxjournal.com/magazine/work-shell-calculating-distance-between-two-latitudelongitude-points
- R: http://www.r-project.org/
- Gnuplot: http://www.gnuplot.info
- "How to Create Stacked Area Graphs with Gnuplot": http://freesoftware.zona-m.net/how-to-create-stacked-area-graphs-with-gnuplot/
« Previous 1 2 3
Buy this article as PDF
(incl. VAT)
Buy Linux Magazine
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters
Support Our Work
Linux Magazine content is made possible with support from readers like you. Please consider contributing when you've found an article to be beneficial.
News
-
Linux Kernel Reducing Long-Term Support
LTS support for the Linux kernel is about to undergo some serious changes that will have a considerable impact on the future.
-
Fedora 39 Beta is Now Available for Testing
For fans and users of Fedora Linux, the first beta of release 39 is now available, which is a minor upgrade but does include GNOME 45.
-
Fedora Linux 40 to Drop X11 for KDE Plasma
When Fedora 40 arrives in 2024, there will be a few big changes coming, especially for the KDE Plasma option.
-
Real-Time Ubuntu Available in AWS Marketplace
Anyone looking for a Linux distribution for real-time processing could do a whole lot worse than Real-Time Ubuntu.
-
KSMBD Finally Reaches a Stable State
For those who've been looking forward to the first release of KSMBD, after two years it's no longer considered experimental.
-
Nitrux 3.0.0 Has Been Released
The latest version of Nitrux brings plenty of innovation and fresh apps to the table.
-
Linux From Scratch 12.0 Now Available
If you're looking to roll your own Linux distribution, the latest version of Linux From Scratch is now available with plenty of updates.
-
Linux Kernel 6.5 Has Been Released
The newest Linux kernel, version 6.5, now includes initial support for two very exciting features.
-
UbuntuDDE 23.04 Now Available
A new version of the UbuntuDDE remix has finally arrived with all the updates from the Deepin desktop and everything that comes with the Ubuntu 23.04 base.
-
Star Labs Reveals a New Surface-Like Linux Tablet
If you've ever wanted a tablet that rivals the MS Surface, you're in luck as Star Labs has created such a device.