Introducing sorting algorithms in Go
Programming Snapshot – Sorting in Go
Whether alphabetical or numerical, bubble sort or quicksort, there is no escape from sorting in computer science. In this month's column, Mike Schilli sorts out the pros and cons of various sorting algorithms in Go.
Long before the first computers existed, sorting data occupied mankind. The German-American Herman Hollerith invented an electromechanical machine as early as 1890 that sorted punched cards into different exit shafts to speed up the evaluation of the US census of that era. And even today, computers continue to sort data – whether this be for a list of YouTube video suggestions, the top 100 charts in the music industry by sales figures, or the slowest queries against a MySQL database for performance analysis purposes.
Machines sort at amazing speeds with programs changing the order of arbitrarily formed data structures. Most of the time, they're relying on a simple key as a sorting criterion, often an integer or a character string. This explains why classical reference works on sorting algorithms [1] [2] often only show you how to sort a series of numbers. This is because porting these algorithms to deal with more complex data structures is trivial and can be done by simply defining a mapping of the data structure to a key.
Bubbles Rise to the Surface
As the simplest sorting algorithm, programming beginners often learn how to bubble sort with two for
loops. In the first loop, the i
index in Listing 1 [3] works its way from left to right through the elements of an array. For each element examined, another for
loop starts with the j
index; it scans all the following elements and checks if the element under review from the first loop is smaller than all the following ones. If the procedure finds a smaller element further to the right, it simply swaps it with the one it is examining. In this way, all elements looked at by the first loop are guaranteed to not have any smaller elements to their right. Finally, when the first loop arrives at the far right end, the array is sorted in ascending order (Figure 1).
Listing 1
bubble.go
01 package main 02 03 import ( 04 "fmt" 05 ) 06 07 func main() { 08 items := []string{"green", "red", 09 "blue", "black", "white"} 10 11 for i, _ := range items { 12 for j := i + 1; j < len(items); j++ { 13 if items[i] > items[j] { 14 items[i], items[j] = 15 items[j], items[i] 16 } 17 } 18 } 19 20 fmt.Printf("%+v\n", items) 21 }
In the first for
loop starting in line 11, Listing 1 shows you the range
operator; for each element of the items
array slice, it returns the index, starting at 0, and the element itself. The loop header ignores the latter by assigning it to the pseudo-variable _
, since it is only interested in the index of the currently examined element.
The second for
loop in line 12 starts with the element directly to the right of the currently examined one; the index, j
, starts at i+1
. If an element is found at index position j
that is larger than the one referenced by i
, the program uses an a,b = b,a
construct to quickly swap the two, which is a boon in Go compared to other programming languages mandating the use of temporary variables. Listing 1 is easily compiled by typing
go build bubble.go
and the program will display an alphabetically sorted list of colors (Listing 2).
Listing 2
Color List
$ ./bubble [black blue green red white]
The bubble sort algorithm is easy to understand, but requires a relatively large number (O(n2)) of steps to complete the sorting task. Smarter methods like merge sort or quicksort offer better performance in general. For small amounts of data, such as arrays with up to five elements, bubble sort works well enough. It sorts the entries in the original array without external storage requirements, it is stable (the order of identical elements remains unchanged), and it handles presorted input data quickly.
Counting Sort
In job interviews, candidates are sometimes asked to name a fast sort algorithm. Before instinctively replying "quicksort," they might do well to consider whether the specific framework conditions of the problem at hand might allow for clever alternatives. For example, if the elements to be sorted consist of, say, just a few different keys, the counting sort algorithm [4] is most likely much quicker (Figure 2).
In Listing 3, the things
array to be sorted contains just a few repeating values, 3
, 4
, and 5
. Instead of firing up a traditional sorting algorithm, the implementation uses a newly allocated count
array, which uses counters at index positions 3
, 4
, and 5
to note how often the corresponding values occur in the input array. In this way, a value of 7 is written down at index position 3
, because there are seven 3
s in things
, and index position 4
in count
contains a 5, for the five 4
s in things
.
Listing 3
counting.go
01 package main 02 03 import ( 04 "fmt" 05 ) 06 07 func main() { 08 things := []int{5, 4, 3, 3, 3, 4, 4, 09 4, 4, 5, 5, 3, 3, 3, 3} 10 fmt.Printf("%+v\n", things) 11 12 count := []int{} 13 14 for _, thing := range things { 15 if len(count) <= thing { 16 count = append(count, make([]int, 17 thing - len(count) + 1)...) 18 } 19 count[thing]++ 20 } 21 fmt.Printf("count=%+v\n", count) 22 23 idx := 0 24 for val, run := range count { 25 for x := 0; x < run; x++ { 26 things[idx] = val 27 idx++ 28 } 29 } 30 31 fmt.Printf("%+v\n", things) 32 }
Extending Static Arrays
Since the sorting routine does not initially know how many different values will occur in the array to be sorted, it first creates an empty ("count") array. If the for
loop then wants to create an entry in count
at a position that exceeds the size of the array, line 16 uses the built-in append()
function to extend count
to the required size.
In contrast to typical scripting languages, Go does not work with arrays that grow dynamically; instead, they are allocated with fixed sizes. If a program creates a new array with capacity for three elements using make([]int, 3)
and later accesses an element outside the range 0 to 2, Go will throw a run-time error and terminate the program. This also applies to array slices, by the way. They only define a sliding window onto an existing static array. An array can only be extended by the built-in append()
function, which creates a new array with all additionally required elements appended at the end.
In other words, line 16 creates a new, longer array from the old count
array, which has been found to be too short, and then reassigns it to the original count
. The new array differs from the old one by the difference between the desired index (in the thing
variable) and the last index of the existing array. To do this, the code uses make()
to create a small array that fills up the difference and feeds its elements one by one (expanded by the ...
operator) to the append()
function.
After the first for
loop, the count
array contains the counters for the collected keys, as printed out as a checkpoint in line 21. The second for
loop starting in line 24 iterates over the counters in count
and fills the things
array with seven 3
s first, then with five 4
s, and so on. Although the procedure uses an external count
array for counting, the extra memory required is not relevant for a small number of different keys. Sorting is done in situ (i.e., the original array is modified). The number of required steps is linear (O(n)), so the method works orders of magnitude faster than even the most sophisticated generic sorting function for large amounts of data – but only because of the very special properties of the input data, of course.
Buy this article as PDF
(incl. VAT)
Buy Linux Magazine
Subscribe to our Linux Newsletters
Find Linux and Open Source Jobs
Subscribe to our ADMIN Newsletters
Support Our Work
Linux Magazine content is made possible with support from readers like you. Please consider contributing when you’ve found an article to be beneficial.
News
-
New Slimbook EVO with Raw AMD Ryzen Power
If you're looking for serious power in a 14" ultrabook that is powered by Linux, Slimbook has just the thing for you.
-
The Gnome Foundation Struggling to Stay Afloat
The foundation behind the Gnome desktop environment is having to go through some serious belt-tightening due to continued financial problems.
-
Thousands of Linux Servers Infected with Stealth Malware Since 2021
Perfctl is capable of remaining undetected, which makes it dangerous and hard to mitigate.
-
Halcyon Creates Anti-Ransomware Protection for Linux
As more Linux systems are targeted by ransomware, Halcyon is stepping up its protection.
-
Valve and Arch Linux Announce Collaboration
Valve and Arch have come together for two projects that will have a serious impact on the Linux distribution.
-
Hacker Successfully Runs Linux on a CPU from the Early ‘70s
From the office of "Look what I can do," Dmitry Grinberg was able to get Linux running on a processor that was created in 1971.
-
OSI and LPI Form Strategic Alliance
With a goal of strengthening Linux and open source communities, this new alliance aims to nurture the growth of more highly skilled professionals.
-
Fedora 41 Beta Available with Some Interesting Additions
If you're a Fedora fan, you'll be excited to hear the beta version of the latest release is now available for testing and includes plenty of updates.
-
AlmaLinux Unveils New Hardware Certification Process
The AlmaLinux Hardware Certification Program run by the Certification Special Interest Group (SIG) aims to ensure seamless compatibility between AlmaLinux and a wide range of hardware configurations.
-
Wind River Introduces eLxr Pro Linux Solution
eLxr Pro offers an end-to-end Linux solution backed by expert commercial support.