Introducing sorting algorithms in Go

Insertion Sort

When players of card games like Rummy or bridge pick up a new hand, they often insert new cards into their hand one by one by making room at the position where the new card fits to create a sorted hand. To do this, they move the lower value cards to the left and higher value cards to the right and insert the new card into the space created (Figure 3). This is also how the insertion sort algorithm [1] works, a procedure that sorts in-place with only a few lines of code, but requires some mental dexterity.

Figure 3: When a card player picks up a new hand, they insert the new cards in sorted order by making space where the new cards will fit.

The algorithm divides the array to be sorted into two sections: on the left, the elements already sorted so far; on the right, the unsorted elements. To do this, it determines the element to be examined, starting with the second element of the array. In Figure 4 (top), this is initially the number 4. To the left of it, the previously sorted list contains only the 3; to the right of it, what is – thus far – the unsorted part then follows, starting with a 1. Now a loop starts going left to check where to insert the current element into the sorted section. Since the current element (the 4) is larger than the only sorted element to its left, there is nothing else to do – the algorithm proceeds to the next element.

Figure 4: Moving elements in an insertion sort.

In the next step in Figure 4 (bottom), the number 1 is the current element, and the loop hops over to the left to check where to insert it in the sorted part. It first sees the 4, determines that the current element (1) must be to the left of it, and moves the 4 one place to the right as shown in step (a), in place of the current element that was previously saved in a variable. Further to the left, it then sees the 3, which also needs to be on the right of the current element, and then moves the 3 to the position previously occupied by the 4 as shown in step (b). Now there is nothing standing in the way of the 1 previously stashed away in a variable; it can move to the first position in the array as shown in step (c). The sequence is now 1-3-4-…, and the algorithm proceeds by inspecting the 2 as the current element. Lather, rinse, repeat.

Listing 4 implements the whole thing in a relatively compact way. The for loop in line 11 determines the current element, starting at j=1 (i.e., the second element of the array starting at index position 0). Listing 4's output with the various sorting steps and their current key elements in each case is shown in Figure 5.

Listing 4

insert.go

01 package main
02
03 import (
04   "fmt"
05 )
06
07 func main() {
08   things := []int{3, 4, 1, 2, 6, 5, 8, 7}
09   fmt.Printf("%+v\n", things)
10
11   for j := 1; j < len(things); j++ {
12     key := things[j]
13     fmt.Printf("key=%d\n", key)
14     i := j - 1
15     for i >= 0 && things[i] > key {
16       things[i+1] = things[i]
17       i--
18     }
19     things[i+1] = key
20     fmt.Printf("%+v\n", things)
21   }
22 }
Figure 5: The insertion sort from Listing 4 sorts the array by finding out-of-place values and inserting them into the right spot.

The test loop, which jumps to the left to make room in the sorted part, starts at i=j-1 in line 14. It continues until it encounters a sorted element that is less than or equal to the current element, or until it risks dropping off on the left end of the array at i<0. In each round of the loop, line 16 copies an element in the sorted section one place to the right, and i is decremented by one. At the end of the loop, line 19 copies the current element cached in key to the vacant position, and the outer for loop moves on to the next round.

However, the number of steps required by insertion sort increases proportionally to the square of the number of elements. The algorithm therefore does not work faster than a bubble sort. For randomly structured input data, there are definitely smarter methods that reduce the number of steps from O(n2) to O(n*lg(n)).

Merge Sort

Better methods often rely on the "divide and conquer" principle. This means that they do not solve the whole problem in one go, but divide it into two more easily solvable partial problems, which they then tackle recursively.

What I like about merge sort is its almost trivial implementation based on plain vanilla recursion. The trick is to cut the array in half, sort the two halves individually, and then merge the two sorted stacks into a new one resulting in a larger sorted stack. The individual "sorting" procedure is not really sorting at all: The sorting function again chops the array in half and calls itself against the two halves until it deals with an array with only one element. The function declares this as sorted and returns it without change.

The real grit of the procedure is in the merge() function, which merges two sorted arrays. To do this, it walks through the elements of the two sorted stacks, compares their two top elements, removes the smaller one, and puts it on the result stack. Figure 6 illustrates the procedure with a card game that combines the two upper sorted stacks to create the stack at the bottom, which ends up to be sorted as well.

Figure 6: The merge() function combines two sorted stacks of cards (on top) to create a new stack that is also sorted (bottom). It always picks the top stack with the lower card for the next step.

In Listing 5, the merge() function starting in line 24 expects an integer array a, as well as three index positions: p, q, and r. The two partial arrays that merge() has to join are available at the positions p through q and q+1 through r. The function then goes ahead and stores the results in place in the original array a.

Listing 5

merge.go

01 package main
02
03 import (
04   "fmt"
05 )
06
07 func main() {
08   things := []int{17, 34, 42, 1, 7, 9, 8}
09
10   fmt.Printf("%+v\n", things)
11   mergesort(things, 0, len(things)-1)
12   fmt.Printf("%+v\n", things)
13 }
14
15 func mergesort(a []int, p, r int) {
16   if p < r {
17     q := (p + r) / 2
18     mergesort(a, p, q)
19     mergesort(a, q+1, r)
20     merge(a, p, q, r)
21   }
22 }
23
24 func merge(a []int, p, q, r int) {
25   left := make([]int, q-p+1)
26   right := make([]int, r-q)
27
28   copy(left, a[p:])
29   copy(right, a[q+1:])
30
31   fmt.Printf("left=%d\n", left)
32   fmt.Printf("right=%d\n", right)
33
34   lidx := 0
35   ridx := 0
36   oidx := p
37
38   for lidx < len(left) &&
39       ridx < len(right) {
40     if left[lidx] < right[ridx] {
41       a[oidx] = left[lidx]
42       lidx++
43     } else {
44       a[oidx] = right[ridx]
45       ridx++
46     }
47     oidx++
48   }
49   for lidx < len(left) {
50     a[oidx] = left[lidx]
51     lidx++
52     oidx++
53   }
54   for ridx < len(right) {
55     a[oidx] = right[ridx]
56     ridx++
57     oidx++
58   }
59 }

This presupposes that merge() first creates a copy of the left and right array in the variables left and right. The function uses make() to allocate memory for arrays of the desired length. Then, starting in line 28, it calls the built-in copy() function and passes it the arrays to be copied as a[p:] ("Element at index p and following") and a[q+1:] ("Element at index q+1 and following"). Note that we don't want copy() to actually copy all following elements, but make use of the fact that copy() only copies as many characters as will fit into the target array, specified as the first parameter. But since we've allocated left and right correctly with their required lengths previously, the number of characters actually copied is also correct.

Then the for loop runs through the two arrays starting in line 38, removes the smaller current element from the corresponding stack, and terminates if either one is depleted. The remaining elements of the other array are then copied into the resulting array one by one by the two for loops starting in line 49 or 54.

The main mergesort() function in line 15, which expects the unsorted array, as well as the index positions of start and end, only needs to chop the array passed to it into two parts, call itself recursively against the two subfields, and call merge() to merge the two results. The recursion is repeated until mergesort() receives an array with only one element (then p == r and the if condition is false) whereupon mergesort() simply returns, because this one-element array is sorted by definition. Figure 7 shows how the algorithm gradually sorts an array of seven integers and prints out the result.

Figure 7: Merge sort puts a field with seven integers in the correct order.

Built-In

Normally, however, aside from coding interviews, software engineers never really code up their own sorting algorithms in Go. The language offers built-in sorting functions for slices of strings, integers, or floats. After importing the sort package (Listing 6), sort.Ints(), sort.Float64s, and sort.Strings() are available, each of which arranges a slice of integers or floating-point numbers or strings in ascending order by sorting the array passed to it in place.

Listing 6

stringsort.go

01 package main
02
03 import (
04   "fmt"
05   "sort"
06 )
07
08 func main() {
09   things := []string{"green", "red", "blue", "black", "white"}
10
11   sort.Strings(things)
12
13   fmt.Printf("%+v\n", things)
14 }

As you can see from the source code [5] on the Go website, the language uses the quicksort algorithm internally. This is a divide-and-conquer mechanism that runs at breakneck speed virtually everywhere. Usually it only takes O(n*log(n)) steps to reach its goal. On rare occasions, however, it freaks out completely to the tune of O(n2) complexity – with a million entries; this could mean the difference between seconds and years.

If the algorithm has to terminate with 100 percent assurance, other methods sometimes turn out to be more advantageous. However, in order to fend off such outliers, standard libraries of programming languages often include procedures that slightly modify the algorithm and avert worst case scenarios.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy Linux Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • Exploring Quicksort

    If you wanted to assign a flavor to the Quicksort algorithm, it would be sweet and sour. Sweet, because it is very elegant; sour, because typical implementations sometimes leave more questions than they answer.

  • Tutorials – Bash Arrays

    Discover the data structures that make your shell scripts really powerful.

  • Safe Passage

    How does a ferryman transport a wolf, a goat, and a cabbage across a river in a boat that can only carry the ferryman plus one item at a time, without the wolf eating the goat or the goat eating the cabbage while left together back on shore? Mike programs the solution in Go.

  • Top Coder

    Springtime is application time! Mike Schilli, who has experience with job application procedures at companies in Silicon Valley, digs up a question asked at the Google Engineering interview and explains a possible solution in Go.

  • Julia

    Parallel processing is indispensable today – particularly in the field of natural sciences and engineering. Normal desktop users, however, can also benefit from higher performance through parallel execution with at least four calculation cores.

comments powered by Disqus

Direct Download

Read full article as PDF:

Price $2.95

News