Track down race conditions with Go
Programming Snapshot – Racing Goroutines

If program parts running in parallel keep interfering with each other, you may have a race condition. Mike Schilli shows how to instruct the Go compiler to detect these conditions and how to avoid them in the first place.
If programmers are not careful, program parts that are running in parallel will constantly get in each other's way, whether as processes, threads, or goroutines. If you leave the order in which system components read or modify data to chance, you are adding time bombs to your code. They will blow up sooner or later, leaving you with runtime errors that are difficult to troubleshoot. But how do you avoid them?
The common assumption that components will run in the same order that a program calls them is a fallacy – one easily refuted with an example such as in Listing 1. But coincidence can also be a factor. It is quite possible for something to work once but then crash after a small, and often unrelated, change to the code. The load on the system you are using can also play a role: Something may work flawlessly in slack times but fall apart unexpectedly under a heavy load.
Listing 1
orderfail.go
01 package main
02 import (
03   "fmt"
04 )
05
06 func main() {
07   done := make(chan bool)
08   n := 10
09
10   for i := 0; i < n; i++ {
11     go func(id int) {
12       fmt.Printf("goroutine %d\n", id)
13       done <- true
14     }(i)
15   }
16
17   for i := 0; i < n; i++ {
18     <-done
19   }
20 }
Listing 1 [1] and the output in the upper part of Figure 1 nicely illustrate that unsynchronized goroutines do not run in the order in which they are defined, even if the program starts them one after the other. Although the for loop starts goroutine 0 first, followed by 1, then 2, and so on, as defined by the index numbers in i, the compiled program's output in the upper part of Figure 1 makes it clear that chaos reigns: The goroutines write their messages to the output in no predictable order.
Each of the 10 go func() calls in the for loop passes the current loop index as a parameter to the respective goroutine, by the book, so that they do not all share the same variable. Also, to stop the program from terminating immediately after the for loop ends, rather than waiting until all the goroutines have completed their work, each goroutine sends a message to the done channel at the end of its working life. The final for loop starting in line 17 collects these messages and does not terminate until the last goroutine has said goodbye.
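Counting messages on a done channel is one way to wait for goroutines to finish. As an alternative that Listing 1 does not use, the standard library's sync.WaitGroup does the same bookkeeping. The following is a minimal sketch under that assumption; the function name runAll and the seen slice are hypothetical and exist only so that the result can be checked.

```go
package main

import (
	"fmt"
	"sync"
)

// runAll starts n goroutines, waits for all of them, and returns
// how many reported in. The name and return value are illustrative.
func runAll(n int) int {
	var wg sync.WaitGroup
	// Each goroutine writes only to its own slice element,
	// so no two goroutines touch the same memory location.
	seen := make([]bool, n)

	for i := 0; i < n; i++ {
		wg.Add(1) // register the goroutine before starting it
		go func(id int) {
			defer wg.Done() // signal completion, like "done <- true"
			fmt.Printf("goroutine %d\n", id)
			seen[id] = true
		}(i)
	}

	wg.Wait() // blocks until every goroutine has called Done

	count := 0
	for _, ok := range seen {
		if ok {
			count++
		}
	}
	return count
}

func main() {
	fmt.Println("finished:", runAll(10))
}
```

As in Listing 1, the order of the "goroutine" lines still varies from run to run; only the final count is predictable.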
One by One
But if you really want goroutine 0 to start first, then goroutine 1, and so on, you need to use a synchronization mechanism, such as channels or mutex constructs, to make sure that the Go runtime maintains the desired order, defying the natural chaos.
Listing 2 demonstrates this with an array of 10 channels. Shortly after they start, all of the goroutines block in the read statement on the starters channel array in line 17 and wait until a message arrives on the channel assigned to them. Once it does, the read returns, and the goroutine moves on to printing its "Running" message. At first, none of the channels will have a message, but line 27 after the for loop then starts a chain of events by writing a value to the first channel.
Listing 2
orderok.go
01 package main
02 import (
03   "fmt"
04 )
05
06 func main() {
07   done := make(chan bool)
08   n := 10
09
10   starters := make([](chan bool), n)
11   for i := 0; i < n; i++ {
12     starters[i] = make(chan bool)
13   }
14
15   for i := 0; i < n; i++ {
16     go func(id int) {
17       <-starters[id]
18       fmt.Printf("Running %d\n", id)
19       if id < n-1 {
20         starters[id+1] <- true
21       }
22       // [... DO WORK ...]
23       done <- true
24     }(i)
25   }
26
27   starters[0] <- true
28
29   for i := 0; i < n; i++ {
30     <-done
31   }
32 }
This releases the goroutine with the id of 0, because the block in its read statement in line 17 is now lifted. The routine then outputs its message and, to keep things ticking along, writes to the channel with the index id+1 (i.e., 1). This in turn triggers goroutine 1, which in turn triggers goroutine 2. This merry dance continues in a controlled manner until goroutine 9 initiates the completion of the program.
This approach naturally reduces the concurrency of the goroutines, which no longer all start quasi-simultaneously but wait for each other. However, each one only waits until its predecessor has triggered it through the channel. What happens afterwards within the individual goroutines (marked in line 22 with the placeholder comment DO WORK) is again a quasi-simultaneous affair.
There Can Only Be One Winner
The disastrous consequences that race conditions can cause in an application are illustrated by an airline's booking program in Listing 3. It detects in line 13 that there is still one seat available on the plane in the variable seats, which is shared by two different goroutines. It then outputs a success message to the user and sets the number of remaining seats to zero.
Listing 3
airline.go
01 package main
02 import (
03   "fmt"
04   "time"
05 )
06
07 func main() {
08   seats := 1
09
10   for i := 0; i < 2; i++ {
11     go func(id int) {
12       time.Sleep(100 * time.Millisecond)
13       if seats > 0 {
14         fmt.Printf("%d booked!\n", id)
15         seats = 0
16       } else {
17         fmt.Printf("%d missed out.\n", id)
18       }
19     }(i)
20   }
21
22   time.Sleep(1 * time.Second)
23   fmt.Println("")
24 }
However, there are two parallel goroutines fighting over the booking in the for loop starting in line 10. While one rejoices and prints the success message, the second goroutine also tests the variable seats, which is still set to 1, and proceeds to book the seat as well. The result is an overbooked plane and angry passengers.
The output at the top of Figure 2 shows that Listing 3 does indeed allow repeated double-bookings – exacerbated by the length of the microsleep instruction at line 12, simulating the actual booking process. This is not what a customer, or an airline, wants.
The root of the problem is obvious: Two concurrent goroutines share the variable seats, and nothing protects it during the time that elapses between the check seats > 0 in line 13 and the variable being reset by seats = 0 in line 15. If the second goroutine performs its check while the first is still booking the seat, the second goroutine erroneously thinks it has a free seat because seats is still set to 1. A booking error is inevitable.
The problem can be solved by either performing the check and setting the variable in a single atomic statement or by declaring the program area containing both statements to be a critical section that locks out other goroutines as long as one goroutine is working in it.
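The first option, a single atomic check-and-set, can be sketched with the standard library's sync/atomic package. This is not how the article's listings solve the problem; it is a minimal illustration that assumes seats is stored as an int32 and that exactly one seat is up for grabs. The function name bookAll and the booked counter are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// bookAll lets the given number of customers compete for one seat
// and returns how many bookings succeeded.
func bookAll(customers int) int {
	var seats int32 = 1 // one seat left on the plane
	var booked int32    // successful bookings, for checking only
	var wg sync.WaitGroup

	for i := 0; i < customers; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Compare seats with 1 and, only if it matches, set it
			// to 0. Both steps happen as one indivisible operation.
			if atomic.CompareAndSwapInt32(&seats, 1, 0) {
				fmt.Printf("%d booked!\n", id)
				atomic.AddInt32(&booked, 1)
			} else {
				fmt.Printf("%d missed out.\n", id)
			}
		}(i)
	}

	wg.Wait()
	return int(atomic.LoadInt32(&booked))
}

func main() {
	fmt.Println("bookings:", bookAll(2))
}
```

Which customer wins still depends on scheduling, but only one compare-and-swap can ever succeed. With more than one seat, the swap would have to be retried in a loop against the current value, which is one reason a critical section is often the simpler choice.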
Listing 4 shows a possible solution to the problem using a buffered booking channel with a depth of 1, as created by the make statement in line 9. Thanks to the buffer, one goroutine can write a value into the channel without it immediately blocking [2]. But if the next goroutine tries to send a value into the channel, it blocks until someone else has extracted the buffered value, and this happens at the end of the critical section in line 21.
Listing 4
airline-ok.go
01 package main
02 import (
03   "fmt"
04   "time"
05 )
06
07 func main() {
08   seats := 1
09   booking := make(chan bool, 1)
10
11   for i := 0; i < 2; i++ {
12     go func(id int) {
13       time.Sleep(100 * time.Millisecond)
14       booking <- true
15       if seats > 0 {
16         fmt.Printf("%d booked!\n", id)
17         seats = 0
18       } else {
19         fmt.Printf("%d missed out.\n", id)
20       }
21       <-booking
22     }(i)
23   }
24
25   time.Sleep(1 * time.Second)
26   fmt.Println("")
27 }
With this safeguard in place, only one goroutine traverses the critical section at any given time, and it doesn't matter how long it takes to check or set the seats variable, because no one can interfere in the meantime. The lower part of Figure 2 confirms that only one goroutine makes the booking, while the other goroutine reports that there are no more seats available, to the disappointment of the passenger who wanted to book. But that's how things have to be.
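The buffered channel in Listing 4 acts as a lock around the critical section. The same idea can be expressed with a sync.Mutex, one of the mutex constructs mentioned earlier. The sketch below is an alternative under that assumption, not the article's code; the flight type, its book method, and countBookings are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// flight bundles the seat counter with the mutex that guards it.
type flight struct {
	mu    sync.Mutex
	seats int
}

// book reports whether a seat was available and has now been taken.
// Check and update both happen while the lock is held.
func (f *flight) book() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.seats > 0 {
		f.seats--
		return true
	}
	return false
}

// countBookings lets customers compete for seats and returns
// the number of successful bookings.
func countBookings(seats, customers int) int {
	f := &flight{seats: seats}
	results := make(chan bool, customers)

	for i := 0; i < customers; i++ {
		go func() { results <- f.book() }()
	}

	booked := 0
	for i := 0; i < customers; i++ {
		if <-results {
			booked++
		}
	}
	return booked
}

func main() {
	fmt.Println("bookings:", countBookings(1, 2))
}
```

Keeping the mutex next to the data it protects makes it harder to touch seats without taking the lock, and decrementing the counter instead of zeroing it lets the same code handle more than one remaining seat.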
Reporting Speeders
During development, Go helps you detect race conditions if you compile the source code with the -race option. If two goroutines then race for a variable, the Go runtime detects this as it happens and outputs a corresponding error message (Figure 3). However, the program must actually execute the code path that triggers the problem during the test run. This makes it important for the test suite to cover the code as completely as possible.
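To try the detector on something small, a program with a deliberately unprotected counter will do. The sketch below is an assumption-laden illustration, not one of the article's listings: the file name race.go, the increment function, and the "unsafe" command-line argument are all hypothetical. Saved as race.go, it can be run with go run -race race.go for the locked path and go run -race race.go unsafe for the racy one.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// increment bumps a shared counter from n goroutines. With useLock
// set, a mutex guards the counter; without it, the goroutines race.
func increment(n int, useLock bool) int {
	var mu sync.Mutex
	var wg sync.WaitGroup
	counter := 0

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if useLock {
				mu.Lock()
				defer mu.Unlock()
			}
			counter++ // unsynchronized when useLock is false
		}()
	}

	wg.Wait()
	return counter
}

func main() {
	// Hypothetical switch: "unsafe" as the first argument
	// selects the racy path.
	racy := len(os.Args) > 1 && os.Args[1] == "unsafe"
	fmt.Println("counter:", increment(1000, !racy))
}
```

On the racy path, a binary built with -race should print a report starting with WARNING: DATA RACE that names the conflicting accesses. On the locked path, it stays silent, which also shows the limitation described above: The detector only sees races in code that actually runs.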