Performance gains with goroutines
Faster Through Parallelism
With this tool, the web client in Listing 5 now sends its requests to the different websites at the same time, saving time compared with waiting for each request to return before moving on to the next. It also uses the httpsimple package shown in Listing 1 to retrieve the data from the web. The fetchall() function starting in line 32 launches a separate goroutine for each request; this means that four goroutines are retrieving and processing the data while another one, the main program, is collecting the results, all at the same time!
Listing 5
http-parallel.go
01 package main
02
03 import(
04   "fmt"
05   "httpsimple"
06 )
07
08 type Result struct {
09   Error error
10   Body  string
11   Url   string
12 }
13
14 func main() {
15   urls := []string{
16     "https://google.com",
17     "https://facebook.com",
18     "https://yahoo.com",
19     "https://apple.com"}
20
21   results := fetchall(urls)
22
23   for i := 0; i<len(urls); i++ {
24     result := <-results
25     if result.Error == nil {
26       fmt.Printf("%s: %d bytes\n",
27         result.Url, len(result.Body))
28     }
29   }
30 }
31
32 func fetchall(
33   urls []string) (<-chan Result) {
34
35   results := make(chan Result)
36
37   for _, url := range urls {
38     go func(url string) {
39       body, err := httpsimple.Get(url)
40       results <- Result{
41         Error: err, Body: body, Url: url}
42     }(url)
43   }
44
45   return results
46 }
The channel through which the worker bees send their results to the main program is created in line 35, which sets the type of the data fed into the channel to the Result structure defined in line 8. After the Get() function of the httpsimple package has returned the text data from the retrieved web page, and the result has been stored in the body variable, line 40 packs it, along with any error and the URL, into the data structure and then writes it into the channel, using the send operator <- to the right of the results channel variable.
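If you want to try out the channel mechanics without a network connection, a stripped-down sketch along the following lines should do. It is not the code from Listing 5: The fakeGet() function is a hypothetical stand-in for httpsimple.Get() that simply returns canned text, and fetchOne() is an assumed helper name that starts a single goroutine.

```go
package main

import (
	"errors"
	"fmt"
)

// Result mirrors the structure from Listing 5.
type Result struct {
	Error error
	Body  string
	Url   string
}

// fakeGet is a hypothetical stand-in for httpsimple.Get(); it
// returns canned data so the sketch runs without a network.
func fakeGet(url string) (string, error) {
	if url == "" {
		return "", errors.New("empty URL")
	}
	return "content of " + url, nil
}

// fetchOne starts one goroutine and returns the channel on
// which that goroutine delivers exactly one Result.
func fetchOne(url string) <-chan Result {
	results := make(chan Result) // unbuffered: a send blocks until someone reads
	go func() {
		body, err := fakeGet(url)
		// Send: the operator sits to the right of the channel variable.
		results <- Result{Error: err, Body: body, Url: url}
	}()
	return results
}

func main() {
	// Receive: the operator sits to the left of the channel variable.
	result := <-fetchOne("https://example.com")
	if result.Error == nil {
		fmt.Printf("%s: %d bytes\n", result.Url, len(result.Body))
	}
}
```

Because the channel is unbuffered, the goroutine's send and the main program's receive meet at the same moment; neither side needs an explicit lock.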
Beware Pitfalls!
When firing off goroutines in for loops, there is one typical newcomer mistake that you will want to avoid [4]. The go func(){}() call to an anonymously defined function as a goroutine acts as a closure (i.e., any locally defined variables in the main program are available in the goroutines, even if the variables lose their validity on leaving the current code block).
But since the url loop variable changes its value in each new pass of the loop, and most likely none of the goroutines will start running before the loop ends, programmers will find themselves faced with the strange phenomenon that each of the goroutines is given the same value for url, usually the last element of the slice in the loop. To prevent this from happening and to make sure that each goroutine gets its own url value, the loop body in Listing 5 adds the url parameter to the argument list of the anonymous function in line 38, while line 42 passes it into the function as an argument. (As of Go 1.22, modules that declare go 1.22 or later in go.mod get a fresh loop variable in every iteration, so the problem only bites in older code.)
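The argument-passing technique can be tried in isolation. The following is a minimal sketch, not part of the original listings; the seenByGoroutines() helper is a made-up name that records which value each goroutine actually received. Passing the loop variable as an argument behaves the same way regardless of the Go version in use.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// seenByGoroutines (hypothetical helper) starts one goroutine per
// URL and records the value each goroutine actually observed.
func seenByGoroutines(urls []string) []string {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		seen []string
	)
	for _, url := range urls {
		wg.Add(1)
		// The loop variable is handed over as an argument, so each
		// goroutine works on its own copy named u.
		go func(u string) {
			defer wg.Done()
			mu.Lock() // the slice is shared, so guard the append
			seen = append(seen, u)
			mu.Unlock()
		}(url)
	}
	wg.Wait()
	sort.Strings(seen) // goroutine scheduling order is not deterministic
	return seen
}

func main() {
	fmt.Println(seenByGoroutines([]string{"a", "b", "c"}))
}
```

An equivalent fix often seen in older code is to shadow the variable inside the loop body with url := url before starting the goroutine.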
Starting in line 23, the main program iterates over a fixed number of channel entries. Thankfully, the number is known in advance: It is the length of the urls slice defined in line 15. The channel read operator can then simply block in line 24 until the next result is available in the channel, since the parallel goroutines will send exactly that number of results into the channel.
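Counting reads works because the number of results is known up front. Where it is not, a common Go idiom (not used in Listing 5) is to close the channel once all senders have finished, so that the reader can simply range over it. The following sketch assumes a made-up lengths() function that reports string lengths instead of fetching web pages:

```go
package main

import (
	"fmt"
	"sync"
)

// lengths (hypothetical) starts one goroutine per word and sends
// each word's length into the returned channel.
func lengths(words []string) <-chan int {
	out := make(chan int)
	var wg sync.WaitGroup
	for _, w := range words {
		wg.Add(1)
		go func(w string) {
			defer wg.Done()
			out <- len(w)
		}(w)
	}
	// A separate goroutine closes the channel after every sender
	// is done; closing it any earlier would make a late send panic.
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

func main() {
	total := 0
	// The range loop ends on its own when the channel is closed.
	for n := range lengths([]string{"go", "channel", "select"}) {
		total += n
	}
	fmt.Println("total length:", total)
}
```

The reader no longer needs to know how many results to expect, at the price of a sync.WaitGroup and one extra goroutine.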
Figure 4 shows that parallel data collection indeed saves a good deal of time; the program completes the process about three times as fast. It is undoubtedly more efficient to keep the computer busy with other tasks while waiting for web data than to sit around, twiddling its tiny thumbs.
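The effect can be reproduced without touching the Internet. The sketch below is an illustration only and does not reproduce the measurements in Figure 4: slowFetch() is a hypothetical stand-in for a web request that just sleeps for an assumed 50 milliseconds, and the sequential() and parallel() helpers measure how long four such calls take.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// delay is an assumed, artificial latency per request.
const delay = 50 * time.Millisecond

// slowFetch is a hypothetical stand-in for a web request; it only
// waits and then returns the length of the URL string.
func slowFetch(url string) int {
	time.Sleep(delay)
	return len(url)
}

// sequential runs the requests one after another.
func sequential(urls []string) time.Duration {
	start := time.Now()
	for _, url := range urls {
		slowFetch(url)
	}
	return time.Since(start)
}

// parallel runs each request in its own goroutine and waits
// until all of them have finished.
func parallel(urls []string) time.Duration {
	start := time.Now()
	var wg sync.WaitGroup
	for _, url := range urls {
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			slowFetch(u)
		}(url)
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	urls := []string{"a", "b", "c", "d"}
	fmt.Println("sequential:", sequential(urls))
	fmt.Println("parallel:  ", parallel(urls))
}
```

With real requests, the gain depends on how long the slowest server takes to respond, because the parallel version can never finish before its slowest goroutine does.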
I highly recommend the book by Katherine Cox-Buday on the subject of concurrency with Go [5]. It meticulously walks the reader through good and bad design with Go channels, and it not only shows the common design patterns, but also looks behind the scenes and explains why a certain approach will produce faster and less error-prone programs.
The speed increase does not come as a free gift with parallelization. If you don't pay meticulous attention, you might end up scratching your head and wondering why you have race conditions, deadlocks, or other mysterious panic attacks of the program on production systems under load. Consequently, careful design is important.
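One defensive habit against a program that hangs forever on a channel read is to combine the read with a timeout in a select statement. The sketch below is one possible approach, not something the listings in this article do; receiveWithTimeout() and the shortWait constant are names made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// shortWait is an assumed timeout chosen for this demonstration.
const shortWait = 50 * time.Millisecond

// receiveWithTimeout (hypothetical helper) waits for one value on
// ch, but gives up once limit has elapsed.
func receiveWithTimeout(ch <-chan string, limit time.Duration) (string, error) {
	select {
	case v := <-ch:
		return v, nil
	case <-time.After(limit):
		return "", errors.New("timed out waiting for result")
	}
}

func main() {
	ready := make(chan string, 1)
	ready <- "done"
	v, err := receiveWithTimeout(ready, shortWait)
	fmt.Println(v, err)

	never := make(chan string) // nobody ever sends on this channel
	v, err = receiveWithTimeout(never, shortWait)
	fmt.Println(v, err)
}
```

For data races, as opposed to hangs, the Go toolchain's race detector (go run -race or go test -race) is worth running before code goes into production.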
Infos
[1] Listings for this article: ftp://ftp.linux-magazine.com/pub/listings/linux-magazine.com/219/
[2] "Don't use Go's default HTTP client (in production)" by Nathan Smith: https://medium.com/@nate510/don-t-use-go-s-default-http-client-4804cb19f779
[3] "Tower of Babylon" by Michael Schilli, Linux Magazine, issue 201, August 2017, pp. 60-62: http://www.linux-magazine.com/Issues/2017/201/Programming-Snapshot-Multilingual-Programming/(language)/eng-US
[4] "Closure mistake with for loops": https://github.com/golang/go/wiki/CommonMistakes
[5] "Concurrency in Go" by Katherine Cox-Buday, O'Reilly, 2017