Lesson 30. Goroutines and concurrency

After reading lesson 30, you’ll be able to

  • Start a goroutine
  • Use channels to communicate
  • Understand channel pipelines

Look, it’s a gopher factory! All the gophers are busy building things. Well, almost all. Over in the corner is a sleeping gopher—or maybe he’s deep in thought. Here’s an important gopher: she’s giving orders to other gophers. They run around and do her bidding, tell others what to do, and eventually report back their findings to her. Some gophers are sending things from the factory. Others are receiving things sent from outside.

Until now, all the Go we’ve written has been like a single gopher in this factory, busy with her own tasks and not bothering with anyone else’s. Go programs are more often like a whole factory, with many independent tasks all doing their own thing, but communicating with each other towards some common goal. These concurrent tasks might include fetching data from a web server, computing millions of digits of pi, or controlling a robot arm.

In Go, an independently running task is known as a goroutine. In this lesson, you’ll learn how to start as many goroutines as you like and communicate between them with channels. Goroutines are similar to coroutines, fibers, processes, or threads in other languages, although they’re not quite the same as any of those. They’re very efficient to create, and Go makes it straightforward to coordinate many concurrent operations.

Consider this

Consider writing a program that performs a sequence of actions. Each action might take a long time and could involve waiting for something to happen before it’s done. It could be written as straightforward, sequential code. But what if you want to do two or more of those sequences at the same time?

For example, you might want one part of your program to go through a list of email addresses and send an email for each one, while another task waits for incoming email and stores them in a database. How would you write that?

In some languages, you would need to change the code quite a bit. But in Go, you can use exactly the same kind of code for each independent task. Goroutines enable you to run any number of actions at the same time.

30.1. Starting a goroutine

Starting a goroutine is as easy as calling a function. All you need is the go keyword in front of the call.

The goroutine in listing 30.1 is similar to our sleepy gopher in the corner of the factory. He doesn’t do much, though where that Sleep statement is, he could be doing some serious thought (computation) instead. When the main function returns, all the goroutines in the program are immediately stopped, so we need to wait long enough to see the sleepy gopher print his “... snore ...” message. We’ll wait for a little bit longer than necessary just to make sure.

Listing 30.1. Sleepy gopher: sleepygopher.go
package main

import (
    "fmt"
    "time"
)

func main() {
    go sleepyGopher()                 // 1
    time.Sleep(4 * time.Second)       // 2
}                                     // 3

func sleepyGopher() {
    time.Sleep(3 * time.Second)       // 4
    fmt.Println("... snore ...")
}

  • 1 The goroutine is started.
  • 2 Waiting for the gopher to snore
  • 3 When we get here, all the goroutines are stopped.
  • 4 The gopher sleeps.

Quick check 30.1

1

What would you use in Go if you wanted to do more than one thing at the same time?

2

What keyword is used to start a new independently running task?

QC 30.1 answer

1

A goroutine.

2

go.

 

30.2. More than one goroutine

Each time we use the go keyword, a new goroutine is started. All goroutines appear to run at the same time. They might not technically run at the same time, though, because computers only have a limited number of processing units.

In fact, these processors usually spend some time on one goroutine before proceeding to another, using a technique known as time sharing. Exactly how this happens is a dark secret known only to the Go runtime and the operating system and processor you’re using. It’s best always to assume that the operations in different goroutines may run in any order.

The main function in listing 30.2 starts five sleepyGopher goroutines. They all sleep for three seconds and then print the same thing.

Listing 30.2. Five sleepy gophers: sleepygophers.go
package main

import (
    "fmt"
    "time"
)

func main() {
    for i := 0; i < 5; i++ {
        go sleepyGopher()
    }
    time.Sleep(4 * time.Second)
}

func sleepyGopher() {
    time.Sleep(3 * time.Second)
    fmt.Println("... snore ...")
}

We can find out which ones finish first by passing an argument to each goroutine. Passing an argument to a goroutine is like passing an argument to any function: the value is copied and passed as a parameter.

When you run the next listing, you should see that even though we started all the goroutines in order from zero to four, they all finished at different times. If you run this outside the Go playground, you’ll likely see a different order each time.

Listing 30.3. Identified gophers: identifiedgophers.go
func main() {
    for i := 0; i < 5; i++ {
        go sleepyGopher(i)
    }
    time.Sleep(4 * time.Second)
}

func sleepyGopher(id int) {
    time.Sleep(3 * time.Second)
    fmt.Println("... ", id, " snore ...")
}

There’s a problem with this code. It’s waiting for four seconds when it only needs to wait for just over three seconds. More importantly, if the goroutines are doing more than just sleeping, we won’t know how long they’re going to take to do their work. We need some way for the code to know when all the goroutines have finished. Fortunately Go provides us with exactly what we need: channels.

Quick check 30.2

Q1:

What order do different goroutines run in?

QC 30.2 answer

1:

Any order.

 

30.3. Channels

A channel can be used to send values safely from one goroutine to another. Think of a channel as one of those pneumatic tube systems in old offices that passed around mail. If you put an object into it, it zips to the other end of the tube and can be taken out by someone else.

Like any other Go type, channels can be used as variables, passed to functions, stored in a structure, and do almost anything else you want them to do.

To create a channel, use make, the same built-in function used to make maps and slices. Channels have a type that’s specified when you make them. The following channel can only send and receive integer values:

c := make(chan int)

Once you have a channel, you can send values to it and receive the values sent to it. You send or receive values on a channel with the left arrow operator (<-).

To send a value, point the arrow toward the channel expression, as if the arrow were telling the value on the right to flow into the channel. The send operation will wait until something (in another goroutine) tries to receive on the same channel. While it’s waiting, the sender can’t do anything else, although all other goroutines will continue running freely (assuming they’re not waiting on channel operations too). The following sends the value 99:

c <- 99

To receive a value from a channel, the arrow points away from the channel (it’s to the left of the channel). In the following code, we receive a value from channel c and assign it to variable r. Similarly to sending on a channel, the receiver will wait until another goroutine tries to send on the same channel:

r := <-c

Note

Although it’s common to use a channel receive operation on its own line, that’s not required. The channel receive operation can be used anywhere any other expression can be used.
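
As a small sketch of that note, the following program uses receive operations inside a larger expression rather than on a line of their own. The helper names produce and sumTwo are made up for this illustration.

```go
package main

import "fmt"

// produce is a hypothetical helper that sends two numbers on c.
func produce(c chan int) {
    c <- 20
    c <- 22
}

// sumTwo receives twice within a single expression and returns the sum.
func sumTwo(c chan int) int {
    return (<-c) + (<-c)
}

func main() {
    c := make(chan int)
    go produce(c)
    // The receives happen while the argument to Println is evaluated.
    fmt.Println("sum:", sumTwo(c))
}
```

Each receive still waits for a matching send, so sumTwo can’t return until produce has sent both values.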

The code in listing 30.4 makes a channel and passes it to five sleepy gopher goroutines. Then it waits to receive five messages, one for each goroutine that’s been started. Each goroutine sleeps and then sends a value identifying itself. When execution reaches the end of the main function, we know for sure that all the gophers will have finished sleeping, and it can return without disturbing any gopher’s sleep. For example, say we have a program that saves the results of some number-crunching computation to online storage. It might save several things at the same time, and we don’t want to quit before all the results have been successfully saved.

Listing 30.4. Channeled sleeping gophers: simplechan.go
func main() {
    c := make(chan int)                                              // 1
    for i := 0; i < 5; i++ {
        go sleepyGopher(i, c)
    }
    for i := 0; i < 5; i++ {
        gopherID := <-c                                              // 2
        fmt.Println("gopher ", gopherID, " has finished sleeping")
    }
}

func sleepyGopher(id int, c chan int) {                              // 3
    time.Sleep(3 * time.Second)
    fmt.Println("... ", id, " snore ...")
    c <- id                                                          // 4
}

  • 1 Makes the channel to communicate over
  • 2 Receives a value from a channel
  • 3 Declares the channel as an argument
  • 4 Sends a value back to main

The square boxes in figure 30.1 represent goroutines, and the circle represents a channel. A link from a goroutine to a channel is labeled with the name of the variable that refers to the channel; the arrow direction represents the way the goroutine is using the channel. When an arrow points towards a goroutine, the goroutine is reading from the channel.

Figure 30.1. How the gophers look together

Quick check 30.3

1

What statement would you use to send the string "hello world" on a channel named c?

2

How would you receive that value and assign it to a variable?

QC 30.3 answer

1

c <- "hello world"

2

v := <-c

 

30.4. Channel surfing with select

In the preceding example, we used a single channel to wait for many goroutines. That works well when all the goroutines are producing the same type of value, but that’s not always the case. Often we’ll want to wait for two or more different kinds of values.

One example of this is when we’re waiting for some values over a channel but we want to avoid waiting too long. Perhaps we’re a little impatient with our sleepy gophers, and our patience runs out after a time. Or we may want to time out a network request after a few seconds rather than several minutes.

Fortunately, the Go standard library provides a nice function, time.After, to help. It returns a channel that delivers a value after some time has passed (the goroutine that sends the value is part of the Go runtime).

We want to continue receiving values from the sleepy gophers until either they’ve all finished sleeping or our patience runs out. That means we need to wait on both the timer channel and the other channel at the same time. The select statement allows us to do this.

The select statement looks like the switch statement covered in lesson 3. Each case inside a select holds a channel receive or send. select waits until one of the cases is ready, then performs that channel operation and runs the statements under that case. It’s as if select is looking at both channels at once and takes action when it sees something happen on either of them.

The following listing uses time.After to make a timeout channel and then uses select to wait for the channel from the sleepy gophers and the timeout channel.

Listing 30.5. Impatiently waiting for sleepy gophers: select1.go
timeout := time.After(2 * time.Second)
for i := 0; i < 5; i++ {
    select {                                                         // 1
    case gopherID := <-c:                                            // 2
        fmt.Println("gopher ", gopherID, " has finished sleeping")
    case <-timeout:                                                  // 3
        fmt.Println("my patience ran out")
        return                                                       // 4
    }
}

  • 1 The select statement
  • 2 Waits for a gopher to wake up
  • 3 Waits for time to run out
  • 4 Gives up and returns

Tip

When there are no cases in the select statement, it will wait forever. That might be useful to stop the main function returning when you’ve started some goroutines that you want to leave running indefinitely.

This isn’t very interesting when all the gophers are sleeping for exactly three seconds, because our patience always runs out before any gophers wake up. The gophers in the next listing sleep for a random amount of time. When you run this, you’ll find that some gophers wake up in time, but others don’t.

Listing 30.6. A randomly sleeping gopher: select2.go
func sleepyGopher(id int, c chan int) {
    time.Sleep(time.Duration(rand.Intn(4000)) * time.Millisecond)
    c <- id
}

Tip

This pattern is useful whenever you want to limit the amount of time spent doing something. By putting the action inside a goroutine and sending on a channel when it completes, anything in Go can be timed out.
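
Here is a minimal sketch of that pattern, under some assumptions: slowSquare stands in for any slow action, and squareWithTimeout and its millisecond parameters are names invented for this example rather than anything from the standard library.

```go
package main

import (
    "fmt"
    "time"
)

// slowSquare is a hypothetical slow action: it sleeps, then sends n*n on c.
func slowSquare(n, delayMs int, c chan int) {
    time.Sleep(time.Duration(delayMs) * time.Millisecond)
    c <- n * n
}

// squareWithTimeout runs slowSquare in a goroutine and gives up after
// limitMs milliseconds. The second result reports whether the answer
// arrived in time.
func squareWithTimeout(n, delayMs, limitMs int) (int, bool) {
    c := make(chan int)
    go slowSquare(n, delayMs, c)
    select {
    case v := <-c:
        return v, true
    case <-time.After(time.Duration(limitMs) * time.Millisecond):
        // We stop waiting here, but the goroutine is still running and
        // will block on its send because nobody receives from c anymore.
        return 0, false
    }
}

func main() {
    v, ok := squareWithTimeout(7, 10, 1000) // plenty of time
    fmt.Println(v, ok)
    v, ok = squareWithTimeout(7, 1000, 10) // very little patience
    fmt.Println(v, ok)
}
```

The caller only sees a value and a Boolean; the goroutine, the channel, and the timer stay hidden inside squareWithTimeout.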

Note

Although we’ve stopped waiting for the goroutines, if we haven’t returned from the main function, they’ll still be sitting around using up memory. It’s good practice to tell them to finish, if possible.

Nil channels do nothing

Because you need to create channels explicitly with make, you may wonder what happens if you use channel values that haven’t been “made.” As with maps, slices, and pointers, channels can be nil. In fact, nil is their default zero value.

If you try to use a nil channel, it won’t panic—instead, the operation (send or receive) will block forever, like a channel that nothing ever receives from or sends to. The exception to this is close (covered later in this lesson). If you try to close a nil channel, it will panic.

At first glance, that may not seem very useful, but it can be surprisingly helpful. Consider a loop containing a select statement. We may not want to wait for all the channels mentioned in the select every time through the loop. For example, we might only try to send on a channel when we have a value ready to send. We can do that by using a channel variable that’s only non-nil when we want to send a value.
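
The following sketch shows one way that idea might look. The names relay, feed, and runRelay are invented for this example. The relay goroutine keeps its send case switched off (nil) until it has a value waiting, and switches its receive case off while that value is still waiting to go out.

```go
package main

import "fmt"

// relay receives n values from in and passes each one to out.
func relay(n int, in, out chan int) {
    var pending int
    var sendTo chan int // nil: the send case blocks forever, so select ignores it
    recvFrom := in
    for sent := 0; sent < n; {
        select {
        case pending = <-recvFrom:
            recvFrom = nil // don't receive again until pending has gone out
            sendTo = out   // switch the send case on
        case sendTo <- pending:
            sendTo = nil // nothing left to send: switch the send case off
            recvFrom = in
            sent++
        }
    }
}

// feed is a hypothetical source that sends each value on c.
func feed(values []int, c chan int) {
    for _, v := range values {
        c <- v
    }
}

// runRelay wires feed to relay and returns whatever comes out the far end.
func runRelay(values []int) []int {
    in, out := make(chan int), make(chan int)
    go feed(values, in)
    go relay(len(values), in, out)
    var got []int
    for range values {
        got = append(got, <-out)
    }
    return got
}

func main() {
    fmt.Println(runRelay([]int{1, 2, 3}))
}
```

Only one of the two cases is ever live at a time here, so the select never sends a stale value and never overwrites one that hasn’t been sent yet.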

So far, all has been well. When our main function received on the channel, it found a gopher sending a value on the channel. But what would happen if we accidentally tried to read when there were no goroutines left to send? Or if we tried to send on a channel instead of receive?

Quick check 30.4

1

What kind of value does time.After return?

2

What happens if you send or receive on a nil channel?

3

What does each case in a select statement have in it?

QC 30.4 answer

1

A channel.

2

It will block forever.

3

A channel operation.

 

30.5. Blocking and deadlock

When a goroutine is waiting to send or receive on a channel, we say that it’s blocked. This might sound the same as if we’d written some code with a loop that spins around forever doing nothing, and on the face of it they look exactly the same. But if you run an infinite loop in a program on your laptop, you may find that the fan starts to whir and the computer gets hot because it’s doing a lot of work. By contrast, a blocked goroutine takes no resources (other than a small amount of memory used by the goroutine itself). It’s parked itself quietly, waiting for whatever is blocking it to stop blocking it.

When one or more goroutines end up blocked for something that can never happen, it’s called deadlock, and your program will generally crash or hang up. Deadlocks can be caused by something as simple as this:

func main() {
    c := make(chan int)
    <-c
}

In large programs, deadlocks can involve an intricate series of dependencies between goroutines.

Deadlocks are hard to guard against in theory, but in practice, sticking to a few simple guidelines (covered soon) makes it straightforward to write deadlock-free programs. When you do find a deadlock, Go can show you the state of all the goroutines, so it’s often easy to find out what’s going on.

Quick check 30.5

Q1:

What does a blocked goroutine do?

QC 30.5 answer

1:

It does nothing at all.

 

30.6. A gopher assembly line

So far, our gophers have been pretty sleepy. They just sleep for a while and then wake up and send a single value on their channel. But not all gophers in this factory are like that. Some are industriously working on an assembly line, receiving an item from a gopher earlier in the line, doing some work on it, then sending it on to the next gopher in the line. Although the work done by each gopher is simple, the assembly line can produce surprisingly sophisticated results.

This technique, known as a pipeline, is useful for processing large streams of data without using large quantities of memory. Although each goroutine might hold only a single value at a time, it may process millions of values over time. A pipeline is also useful because you can use it as a “thought tool” to help solve some kinds of problems more easily.

We already have all the tools we need to assemble goroutines into a pipeline. Go values flow down the pipeline, handed from one goroutine to the next. A worker in the pipeline repeatedly receives a value from its upstream neighbor, does something with it, and sends the result downstream.

Let’s build an assembly line of workers that process string values. The gopher at the start of the assembly line is shown in listing 30.7—the source of the stream. This gopher doesn’t read values, but only sends them. In another program, this might involve reading data from a file, a database, or the network, but here we’ll just send a few arbitrary values. To tell the downstream gophers that there are no more values, the source sends a sentinel value, the empty string, to indicate when it’s done.

Listing 30.7. Source gopher: pipeline1.go
func sourceGopher(downstream chan string) {
    for _, v := range []string{"hello world", "a bad apple", "goodbye all"} {
        downstream <- v
    }
    downstream <- ""
}

The gopher in listing 30.8 filters out anything bad from the assembly line. It reads an item from its upstream channel and only sends it on the downstream channel if the value doesn’t have the string "bad" in it. When it sees the final empty string, the filter gopher quits, making sure to send the empty string to the next gopher down the line too.

Listing 30.8. Filter gopher: pipeline1.go
func filterGopher(upstream, downstream chan string) {
    for {
        item := <-upstream
        if item == "" {
            downstream <- ""
            return
        }
        if !strings.Contains(item, "bad") {
            downstream <- item
        }
    }
}

The gopher that sits at the end of the assembly line—the print gopher—is shown in listing 30.9. This gopher doesn’t have anything downstream. In another program, it might save the results to a file or a database, or print a summary of the values it’s seen. Here the print gopher prints all the values it sees.

Listing 30.9. Print gopher: pipeline1.go
func printGopher(upstream chan string) {
    for {
        v := <-upstream
        if v == "" {
            return
        }
        fmt.Println(v)
    }
}

Let’s put our gopher workers together. We’ve got three stages in the pipeline (source, filter, print) but only two channels. We don’t need to start a new goroutine for the last gopher because we want to wait for it to finish before exiting the whole program. When the printGopher function returns, we know that the two other goroutines have done their work, and we can return from main, finishing the whole program, as shown in the following listing and illustrated in figure 30.2.

Listing 30.10. Assembly: pipeline1.go
func main() {
    c0 := make(chan string)
    c1 := make(chan string)
    go sourceGopher(c0)
    go filterGopher(c0, c1)
    printGopher(c1)
}

Figure 30.2. Gopher pipeline

There’s an issue with the pipeline code we have so far. We’re using the empty string as a way to signify that there aren’t any more values to process, but what if we want to process an empty string as if it were any other value? Instead of strings, we could send a struct value containing both the string we want and a Boolean field saying whether it’s the last value.
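
A rough sketch of the struct approach might look like the following. The item type, its field names, and the collect function are assumptions made up for this illustration. Notice that an empty string now travels down the line like any other value.

```go
package main

import "fmt"

// item is a hypothetical message type: the text plus an end-of-stream flag.
type item struct {
    text string
    last bool
}

func sourceGopher(downstream chan item) {
    for _, v := range []string{"hello world", "", "goodbye all"} {
        downstream <- item{text: v}
    }
    downstream <- item{last: true} // marks the end; its text is ignored
}

// collect receives until it sees the last flag and returns the strings seen.
func collect(upstream chan item) []string {
    var seen []string
    for {
        it := <-upstream
        if it.last {
            return seen
        }
        seen = append(seen, it.text)
    }
}

func main() {
    c := make(chan item)
    go sourceGopher(c)
    fmt.Printf("%q\n", collect(c))
}
```

This works, but every stage has to know about the extra field and remember to pass the final item along.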

But there’s a better way. Go lets us close a channel to signify that no more values will be sent, like so:

close(c)

When a channel is closed, you can’t write any more values to it (you’ll get a panic if you try), and any read will return immediately with the zero value for the type (the empty string in this case).

Note

Be careful! If you read from a closed channel in a loop without checking whether it’s closed, the loop will spin forever, burning lots of CPU time. Make sure you know which channels may be closed and check accordingly.

How do we tell whether the channel has been closed? Like this:

v, ok := <-c

When we assign the result to two variables, the second variable will tell us whether we’ve successfully read from the channel. It’s false when the channel has been closed.
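
To see both outcomes side by side, here is a small sketch. The helpers sendThenClose and receiveTwice are hypothetical names used only for this example.

```go
package main

import "fmt"

// sendThenClose is a hypothetical sender: one value, then close.
func sendThenClose(c chan string) {
    c <- "hello"
    close(c)
}

// receiveTwice receives twice from c and reports both results.
func receiveTwice(c chan string) (v1 string, ok1 bool, v2 string, ok2 bool) {
    v1, ok1 = <-c // a real value was sent, so ok1 is true
    v2, ok2 = <-c // the channel is closed, so v2 is "" and ok2 is false
    return
}

func main() {
    c := make(chan string)
    go sendThenClose(c)
    v1, ok1, v2, ok2 := receiveTwice(c)
    fmt.Printf("%q %v\n", v1, ok1) // "hello" true
    fmt.Printf("%q %v\n", v2, ok2) // "" false
}
```

Without the second variable, the receive from the closed channel would be indistinguishable from someone deliberately sending an empty string.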

With these new tools, we can easily close down the whole pipeline. The following listing shows the source goroutine at the head of the pipeline.

Listing 30.11. Assembly: pipeline2.go
func sourceGopher(downstream chan string) {
    for _, v := range []string{"hello world", "a bad apple", "goodbye all"} {
        downstream <- v
    }
    close(downstream)
}

The next listing shows how the filter goroutine now looks.

Listing 30.12. Assembly: pipeline2.go
func filterGopher(upstream, downstream chan string) {
    for {
        item, ok := <-upstream
        if !ok {
            close(downstream)
            return
        }
        if !strings.Contains(item, "bad") {
            downstream <- item
        }
    }
}

This pattern of reading from a channel until it’s closed is common enough that Go provides a shortcut. If we use a channel in a range statement, it will read values from the channel until the channel is closed.

This means our code can be rewritten more simply with a range loop. The following listing accomplishes the same thing as before.

Listing 30.13. Assembly: pipeline2.go
func filterGopher(upstream, downstream chan string) {
    for item := range upstream {
        if !strings.Contains(item, "bad") {
            downstream <- item
        }
    }
    close(downstream)
}

The final gopher on the assembly line reads all the messages and prints one after another, as shown in the next listing.

Listing 30.14. Assembly: pipeline2.go
func printGopher(upstream chan string) {
    for v := range upstream {
        fmt.Println(v)
    }
}
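
For reference, the pieces above can be assembled into one runnable program along the following lines. This is a sketch rather than the lesson’s own file: collectGopher is a hypothetical variant of printGopher that returns what it receives, which makes the result easy to check, and main is assumed to be wired up the same way as in listing 30.10.

```go
package main

import (
    "fmt"
    "strings"
)

func sourceGopher(downstream chan string) {
    for _, v := range []string{"hello world", "a bad apple", "goodbye all"} {
        downstream <- v
    }
    close(downstream) // no more values
}

func filterGopher(upstream, downstream chan string) {
    for item := range upstream {
        if !strings.Contains(item, "bad") {
            downstream <- item
        }
    }
    close(downstream) // pass the "no more values" signal down the line
}

// collectGopher is a hypothetical stand-in for printGopher: it gathers
// everything it receives until upstream is closed.
func collectGopher(upstream chan string) []string {
    var seen []string
    for v := range upstream {
        seen = append(seen, v)
    }
    return seen
}

func main() {
    c0 := make(chan string)
    c1 := make(chan string)
    go sourceGopher(c0)
    go filterGopher(c0, c1)
    for _, v := range collectGopher(c1) {
        fmt.Println(v)
    }
}
```

Each stage closes its downstream channel when its upstream channel runs dry, so the shutdown ripples from the source to the end of the line without any sentinel values.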

Quick check 30.6

1

What value do you see when you read from a closed channel?

2

How do you check whether a channel has been closed?

QC 30.6 answer

1

The zero value for the channel’s type.

2

Use a two-valued assignment statement:

v, ok := <-c

 

Summary

  • The go statement starts a new goroutine, running concurrently.
  • Channels are used to send values between goroutines.
  • A channel is created with make(chan string).
  • The <- operator receives from a channel (when used before a channel value).
  • The <- operator sends to a channel (when placed between the channel value and the value to be sent).
  • The close function closes a channel.
  • The range statement reads all the values from a channel until it’s closed.

Let’s see if you got this...

Experiment: remove-identical.go

It’s boring to see the same line repeated over and over again. Write a pipeline element (a goroutine) that remembers the previous value and only sends the value to the next stage of the pipeline if it’s different from the one that came before. To make things a little simpler, you may assume that the first value is never the empty string.

Experiment: split-words.go

Sometimes it’s easier to operate on words than on sentences. Write a pipeline element that takes strings, splits them up into words (you can use the Fields function from the strings package), and sends all the words, one by one, to the next pipeline stage.
