Introducing sorting algorithms in Go

Programming Snapshot – Sorting in Go

Article from Issue 233/2020

Whether alphabetical or numerical, bubble sort or quicksort, there is no escape from sorting in computer science. In this month's column, Mike Schilli sorts out the pros and cons of various sorting algorithms in Go.

Long before the first computers existed, sorting data occupied mankind. The German-American Herman Hollerith invented an electromechanical machine as early as 1890 that sorted punched cards into different exit shafts to speed up the evaluation of the US census of that era. And even today, computers continue to sort data – whether this be for a list of YouTube video suggestions, the top 100 charts in the music industry by sales figures, or the slowest queries against a MySQL database for performance analysis purposes.

Machines sort at amazing speeds with programs changing the order of arbitrarily formed data structures. Most of the time, they're relying on a simple key as a sorting criterion, often an integer or a character string. This explains why classical reference works on sorting algorithms [1] [2] often only show you how to sort a series of numbers. This is because porting these algorithms to deal with more complex data structures is trivial and can be done by simply defining a mapping of the data structure to a key.

Bubbles Rise to the Surface

As the simplest sorting algorithm, programming beginners often learn how to bubble sort with two for loops. In the first loop, the i index in Listing 1 [3] works its way from left to right through the elements of an array. For each element examined, another for loop starts with the j index; it scans all the following elements and checks if the element under review from the first loop is smaller than all the following ones. If the procedure finds a smaller element further to the right, it simply swaps it with the one it is examining. In this way, all elements looked at by the first loop are guaranteed to not have any smaller elements to their right. Finally, when the first loop arrives at the far right end, the array is sorted in ascending order (Figure 1).

Listing 1


01 package main
03 import (
04   "fmt"
05 )
07 func main() {
08   items := []string{"green", "red",
09     "blue", "black", "white"}
11   for i, _ := range items {
12     for j := i + 1; j < len(items); j++ {
13       if items[i] > items[j] {
14         items[i], items[j] =
15           items[j], items[i]
16       }
17     }
18   }
20   fmt.Printf("%+v\n", items)
21 }
Figure 1: The two loop variables i and j in the bubble sort work their way through the array and swap unsorted elements.

In the first for loop starting in line 11, Listing 1 shows you the range operator; for each element of the items array slice, it returns the index, starting at 0, and the element itself. The loop header ignores the latter by assigning it to the pseudo-variable _, since it is only interested in the index of the currently examined element.

The second for loop in line 12 starts with the element directly to the right of the currently examined one; the index, j, starts at i+1. If an element is found at index position j that is larger than the one referenced by i, the program uses an a,b = b,a construct to quickly swap the two, which is a boon in Go compared to other programming languages mandating the use of temporary variables. Listing 1 is easily compiled by typing

go build bubble.go

and the program will display an alphabetically sorted list of colors (Listing 2).

Listing 2

Color List

$ ./bubble
[black blue green red white]

The bubble sort algorithm is easy to understand, but requires a relatively large number (O(n2)) of steps to complete the sorting task. Smarter methods like merge sort or quicksort offer better performance in general. For small amounts of data, such as arrays with up to five elements, bubble sort works well enough. It sorts the entries in the original array without external storage requirements, it is stable (the order of identical elements remains unchanged), and it handles presorted input data quickly.

Counting Sort

In job interviews, candidates are sometimes asked to name a fast sort algorithm. Before instinctively replying "quicksort," they might do well to consider whether the specific framework conditions of the problem at hand might allow for clever alternatives. For example, if the elements to be sorted consist of, say, just a few different keys, the counting sort algorithm [4] is most likely much quicker (Figure 2).

Figure 2: Three steps during a run of counting sort: Input data, computed index, and final result.

In Listing 3, the things array to be sorted contains just a few repeating values, 3, 4, and 5. Instead of firing up a traditional sorting algorithm, the implementation uses a newly allocated count array, which uses counters at index positions 3, 4, and 5 to note how often the corresponding values occur in the input array. In this way, a value of 7 is written down at index position 3, because there are seven 3s in things, and index position 4 in count contains a 5, for the five 4s in things.

Listing 3


01 package main
03 import (
04   "fmt"
05 )
07 func main() {
08   things := []int{5, 4, 3, 3, 3, 4, 4,
09     4, 4, 5, 5, 3, 3, 3, 3}
10   fmt.Printf("%+v\n", things)
12   count := []int{}
14   for _, thing := range things {
15     if len(count) <= thing {
16       count = append(count, make([]int,
17               thing - len(count) + 1)...)
18     }
19     count[thing]++
20   }
21   fmt.Printf("count=%+v\n", count)
23   idx := 0
24   for val, run := range count {
25     for x := 0; x < run; x++ {
26       things[idx] = val
27       idx++
28     }
29   }
31   fmt.Printf("%+v\n", things)
32 }

Extending Static Arrays

Since the sorting routine does not initially know how many different values will occur in the array to be sorted, it first creates an empty ("count") array. If the for loop then wants to create an entry in count at a position that exceeds the size of the array, line 16 uses the built-in append() function to extend count to the required size.

In contrast to typical scripting languages, Go does not work with arrays that grow dynamically; instead, they are allocated with fixed sizes. If a program creates a new array with capacity for three elements using make([]int, 3) and later accesses an element outside the range 0 to 2, Go will throw a run-time error and terminate the program. This also applies to array slices, by the way. They only define a sliding window onto an existing static array. An array can only be extended by the built-in append() function, which creates a new array with all additionally required elements appended at the end.

In other words, line 16 creates a new, longer array from the old count array, which has been found to be too short, and then reassigns it to the original count. The new array differs from the old one by the difference between the desired index (in the thing variable) and the last index of the existing array. To do this, the code uses make() to create a small array that fills up the difference and feeds its elements one by one (expanded by the ... operator) to the append() function.

After the first for loop, the count array contains the counters for the collected keys, as printed out as a checkpoint in line 21. The second for loop starting in line 24 iterates over the counters in count and fills the things array with seven 3s first, then with five 4s, and so on. Although the procedure uses an external count array for counting, the extra memory required is not relevant for a small number of different keys. Sorting is done in situ (i.e., the original array is modified). The number of required steps is linear (O(n)), so the method works orders of magnitude faster than even the most sophisticated generic sorting function for large amounts of data – but only because of the very special properties of the input data, of course.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy Linux Magazine

Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • Tutorials – Bash Arrays

    Discover the data structures that make your shell scripts really powerful.

  • Safe Passage

    How does a ferryman transport a wolf, a goat, and a cabbage across a river in a boat that can only carry the ferryman plus one item at a time, without the wolf eating the goat or the goat eating the cabbage while left together back on shore? Mike programs the solution in Go.

  • Top Coder

    Springtime is application time! Mike Schilli, who has experience with job application procedures at companies in Silicon Valley, digs up a question asked at the Google Engineering interview and explains a possible solution in Go.

  • Julia

    Parallel processing is indispensable today – particularly in the field of natural sciences and engineering. Normal desktop users, however, can also benefit from higher performance through parallel execution with at least four calculation cores.

  • Exploring Quicksort

    If you wanted to assign a flavor to the Quicksort algorithm, it would be sweet and sour. Sweet, because it is very elegant; sour, because typical implementations sometimes leave more questions than they answer.

comments powered by Disqus

Direct Download

Read full article as PDF:

Price $2.95