Prepare calculations and chart results with Bash Math, Shell Style
Database Backup Example
MySQL or MariaDB relational databases are very common inside Linux servers or desktops. Their content can be saved in plain text files, whose content can then be used for any calculation necessary. Of course, the first question that comes to mind is "why not perform those calculations inside the database? Aren't database engines optimized to do just that?"
The answer to this question is yes, relational databases are much faster than Bash, but they can only process their own content! If you need to do anything more than that with your files (from comparing the numbers inside the database with those inside some other file to using the same numbers to decide what to do next), you must do it outside the database. Bash is an ideal environment for such tasks. There are two other advantages of working on plain text backups instead of the actual tables in the live database. One is that you can compare at any moment backups made at different times. Another is that, starting from the same database backup file, you can generate with Bash many partial dumps to combine their content in Bash in ways that would not be possible with any direct database queries. One application of this approach in small business would be finding all customers with unpaid balances over some threshold inside the database, in order to automatically send them reminders by email, each with the corresponding invoice attached. In the same way, a teacher could generate custom exercises for each student using research or financial databases, depending on previous results.
To produce output that is immediately usable for this kind of work, use:
mysqldump -u DBUSER --password=PASSWORD --skip-extended-insert > database-backup.sql
This is the simplest approach, because the --skip-extended-insert
option dumps each database record on a separate line. If the database contained only one table called purchaseorders
, composed of four columns that corresponded to the unique identifier, date, description, and amount of each purchase, the file database-backup.sql
would contain lots of lines with this format:
INSERT INTO `purchaseorders` VALUES (44,'2004-10-20','stationery',1855.0000);
Knowing that the database has the above format, you may save in a separate file all the records that are about stationery purchased between 2004 and 2005 with this command:
egrep "'2004\-|2005\-" stationery-expenses-2004-2005.sql
Then extract the amounts only, as follows:
cut '-d,' -f2,4 stationery-expenses-2004-2005.sql >expense-amounts.txt
Here, I have used the cut
command, setting the column delimiter to the comma ('-d,'
) and selecting only the second and fourth column (-f2,4
) to save only dates and amounts inside the file expense-amounts.txt
. Now, for example, if I wanted to calculate the average import of those stationery expenses, I would feed the second column of expense-amount
to the awk
program:
cat expense-amounts.txt | awk '-F,' '{sum+=$2} END { print "Average = ",sum/NR}'
The -F
switch tells awk
to use commas as column separators instead of spaces, which are its default. The commands inside braces mean "read the file one line at a time, adding to the sum
variable the value in the second column; when the file is finished, print the string Average
, followed by the value of sum
divided by the number of records (NR
).
While the above mix of grep
, cut
, and awk
is not the most efficient, I wanted to show you all three tools working together to give you a good idea of their potential for massive number crunching in shell scripts. Combining these programs, you can extract, reformat, and preprocess, with very little effort on your part, lots of numeric data of any nature. The awk
trick above works the same way, for example, when you want to calculate the average size of all the files in the current directory, including all its subfolders. Try it for yourself:
find . -type f -exec ls -l {} \; | awk '{sum+=$5} END { print "Average = ",sum/NR}'
Charts
While I am not sure that a picture is always worth a thousand words, a chart is surely worth much more than a huge, unreadable table full of numbers. The tool I recommend to handle such tasks from shell scripts today is the same one I used in the 90s: the venerable Gnuplot [14]. The graphs it creates are not the fanciest, but Gnuplot is very well documented and available on almost any operating system, including Linux distributions optimized for tiny servers. Besides, using Gnuplot is really simple (see Figure 1):
- 1 Save the numbers to plot in a plain text data file.
- 2 Save the plotting commands for Gnuplot in an instruction file.
- 3 Tell Gnuplot to execute the instruction file on the data file.
-
Creating the data file on the fly consists of running awk
commands similar to the one used in the database example or applying the techniques I presented in this series' previous installments. Concerning plotting commands, unless you have really uncommon needs, 90 percent of the time a quick online search will return commands ready to copy and paste into your script with little or no changes. The result will be spartan, but it will deliver clear charts like Figure 2 (to see how I generated this, see [15]) that your script can then embed in web pages, send by email, or use in any other way desired.
The Big Picture
If all you needed to know about "math" in Bash were its arithmetic operators, a tiny cheatsheet would suffice. Since this is not the case, this installment is different from the previous ones: This time, presenting one, relatively complex script that solves one specific real-world problem would have been too reductive. I wanted to introduce, instead, with the minimum amount of working code, approaches and tools suitable for any kind of number-related processing and show how they fit together (see Figure 3).
![](/var/linux_magazin/storage/images/issues/2019/223/tutorials-shell-math/figure-3/748487-1-eng-US/Figure-3_large.png)
There are two take-away lessons from this installment. First, it is easy to insert single formulas in Bash scripts, using bc
if necessary. Second, very often a Bash script is the most efficient way to run or prepare calculations on large quantities of numbers of any nature, from any source (including the Internet). The real power of Gnuplot, awk
, bc
, and so on is in how you combine them in the same shell script. Give it a try!
Infos
- "Tutorials – Custom Shell Scripts" by Marco Fioretti, Linux Magazine, issue 219, February 2019, pp. 84-88
- "Tutorials – Complex Containers" by Marco Fioretti, Linux Magazine, issue 220, March 2019, pp. 84-89
- "Tutorials – Shell Flow Control" by Marco Fioretti, Linux Magazine, issue 221, April 2019, pp. 86-91
- "Tutorials – Shell Test Conditions and Exit Codes" by Marco Fioretti, Linux Magazine, issue 222, May 2019, pp. 84-88
- Genius: http://www.jirka.org/genius.html
- Graph: http://www.padowan.dk
- bc man page: https://linux.die.net/man/1/bc
- Handling time differences: https://stackoverflow.com/questions/4946785/how-to-find-the-difference-in-days-between-two-dates
- "Bash Scripting – Arithmetic, Logical, Relational and Bitwise Operators": http://www.yourownlinux.com/2016/12/bash-arithmetic-logical-relational-bitwise-operators.html
- "Geographical Distance Between Long and Lat in Bash": http://www.unix.com/shell-programming-and-scripting/250356-geographical-distance-between-long-lat-bash.html
- "Calculate Distance and Azimuth": http://www.unix.com/shell-programming-and-scripting/129989-calculate-distance-azimuth.html
- "Work the Shell – Calculating the Distance between Two Latitude/Longitude Points" by Dave Taylor, Linux Journal, December 1, 2009: http://www.linuxjournal.com/magazine/work-shell-calculating-distance-between-two-latitudelongitude-points
- R: http://www.r-project.org/
- Gnuplot: http://www.gnuplot.info
- "How to Create Stacked Area Graphs with Gnuplot": http://freesoftware.zona-m.net/how-to-create-stacked-area-graphs-with-gnuplot/
« Previous 1 2 3
Buy this article as PDF
(incl. VAT)
Buy Linux Magazine
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters
Support Our Work
Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.
![Learn More](https://www.linux-magazine.com/var/linux_magazin/storage/images/media/linux-magazine-eng-us/images/misc/learn-more/834592-1-eng-US/Learn-More_medium.png)
News
-
NVIDIA Released Driver for Upcoming NVIDIA 560 GPU for Linux
Not only has NVIDIA released the driver for its upcoming CPU series, it's the first release that defaults to using open-source GPU kernel modules.
-
OpenMandriva Lx 24.07 Released
If you’re into rolling release Linux distributions, OpenMandriva ROME has a new snapshot with a new kernel.
-
Kernel 6.10 Available for General Usage
Linus Torvalds has released the 6.10 kernel and it includes significant performance increases for Intel Core hybrid systems and more.
-
TUXEDO Computers Releases InfinityBook Pro 14 Gen9 Laptop
Sporting either AMD or Intel CPUs, the TUXEDO InfinityBook Pro 14 is an extremely compact, lightweight, sturdy powerhouse.
-
Google Extends Support for Linux Kernels Used for Android
Because the LTS Linux kernel releases are so important to Android, Google has decided to extend the support period beyond that offered by the kernel development team.
-
Linux Mint 22 Stable Delayed
If you're anxious about getting your hands on the stable release of Linux Mint 22, it looks as if you're going to have to wait a bit longer.
-
Nitrux 3.5.1 Available for Install
The latest version of the immutable, systemd-free distribution includes an updated kernel and NVIDIA driver.
-
Debian 12.6 Released with Plenty of Bug Fixes and Updates
The sixth update to Debian "Bookworm" is all about security mitigations and making adjustments for some "serious problems."
-
Canonical Offers 12-Year LTS for Open Source Docker Images
Canonical is expanding its LTS offering to reach beyond the DEB packages with a new distro-less Docker image.
-
Plasma Desktop 6.1 Released with Several Enhancements
If you're a fan of Plasma Desktop, you should be excited about this new point release.