Go's Concurrency Decoded: Goroutine Scheduling
Daniel Hayes
Full-Stack Engineer · Leapcell
I. Introduction to Goroutine
Goroutine is a highly distinctive design in the Go programming language and one of its major highlights. Essentially a coroutine, it is the key to achieving concurrent computing. Using goroutines is quite straightforward: you start one simply with the `go` keyword, and it runs asynchronously, so the program continues executing the subsequent code without waiting for the goroutine to complete.

```go
go func() {
    // ...
}() // Start a goroutine to run a function using the go keyword
```
II. Internal Principles of Goroutine
Concept Introduction
Concurrency
On a single CPU, multiple tasks appear to execute simultaneously: over an extremely short period, the CPU switches rapidly between tasks (for example, it executes program A for a short while and then switches to program B). The tasks overlap in time, so from a macroscopic perspective they appear to run at once, while at a microscopic level execution is still sequential. This illusion of simultaneous execution is what we call concurrency.
Parallelism
When a system has multiple CPUs, each CPU can run a task at the same time without competing with the others for resources. The tasks truly execute simultaneously, and this is known as parallelism.
Process
When the CPU switches between programs, if it doesn't save the state of the previous program (the so-called context) and directly switches to the next program, a series of states of the previous program will be lost. To solve this problem, the concept of a process is introduced to allocate the resources required for program execution. Therefore, a process is the basic resource unit required for a program to run (it can also be regarded as an entity of program execution). For example, when running a text-editing application, the process for this application manages all its resources, such as memory space for the text buffer, file-handling resources, and so on.
Thread
Switching between multiple processes by the CPU consumes a significant amount of time because process switching requires a transition to the kernel mode, and each scheduling requires reading user-mode data. As the number of processes increases, CPU scheduling consumes a large amount of resources. Thus, the concept of a thread is introduced. Threads themselves consume very few resources; they share the resources within a process. When the kernel schedules threads, it doesn't consume as many resources as when scheduling processes. For instance, in a web-server application, multiple threads can handle different client requests simultaneously, sharing the resources of the server process such as network connections and memory caches.
Coroutine
A coroutine has its own register context and stack. When the coroutine is scheduled away, it saves the register context and stack elsewhere; when switching back, it restores them. So a coroutine can retain the state from its previous run (i.e., a specific combination of all local states), and each time it re-enters, it resumes from the position in the logical flow where it left off last time. The operations of threads and processes are triggered by the program through system interfaces, and the ultimate executor is the system. The operations of coroutines, however, are executed by the user's own program, and goroutine is a type of coroutine.
Introduction to the Scheduling Model
The powerful concurrent implementation of goroutine is achieved through the GPM scheduling model. The following explains the goroutine scheduling model.
There are four important structures inside the Go scheduler: M, P, G, and Sched (Sched is not shown in the diagram).
- M: Represents a kernel-level thread. One M is one thread, and goroutines run on M. For example, when a goroutine is launched to perform a complex calculation, it is assigned to an M for execution. M is a large structure that maintains a small-object memory cache (mcache), the currently executing goroutine, a random-number generator, and many other pieces of information.
- G: Represents a goroutine. It has its own stack for storing function-call information, an instruction pointer to specify the execution position, and other information used for scheduling, such as the channel it is waiting for. For example, if a goroutine is waiting to receive data from a channel, this information is stored in the G structure.
- P: The full name is Processor. It is mainly used to execute goroutines; you can think of it as a task dispatcher. It also maintains a goroutine queue that stores all the goroutines that it needs to execute. For example, when multiple goroutines are created, they are added to the queue maintained by P for scheduling.
- Sched: Represents the scheduler. It can be regarded as a central scheduling center. It maintains the queues of M and G, as well as some state information of the scheduler, ensuring efficient scheduling across the entire system.
Scheduling Implementation
As can be seen from the figure, there are 2 physical threads M, each of which has a processor P and a running goroutine.
- The number of Ps can be set through `runtime.GOMAXPROCS()`. It represents the true level of parallelism, that is, the number of goroutines that can run simultaneously.
- The gray goroutines in the figure are not running; they are in the ready state, waiting to be scheduled. P maintains this queue, called the runqueue.
- In the Go language, starting a goroutine is very simple: just use `go function`. Therefore, every time a `go` statement is executed, a goroutine is added to the end of the runqueue. At the next scheduling point, a goroutine is taken from the runqueue for execution (but how is it decided which goroutine to select?).
When an OS thread M0 is blocked (as shown in the figure below), P switches to run on M1, which may be newly created or taken from the thread cache.
When M0 returns, it must try to obtain a P to run its goroutine. Usually, it tries to take a P from another OS thread. If it fails to obtain one, it puts the goroutine into a global runqueue and then goes to sleep itself (it is put into the thread cache). All Ps periodically check the global runqueue and run the goroutines in it; otherwise, goroutines on the global runqueue would never be executed.
Another situation is that the G tasks assigned to a P finish quickly (uneven distribution), leaving that P idle while other Ps still have work. If there are no Gs in the global runqueue, P has to obtain some Gs from other Ps. Generally, when a P steals from another P, it takes half of that P's runqueue, ensuring that every OS thread stays fully utilized, as shown in the figure below:
III. Using Goroutine
Basic Usage
Set the number of CPUs available for goroutines to run on. Since Go 1.5, GOMAXPROCS defaults to the number of logical CPUs, so setting it explicitly is usually unnecessary.

```go
num := runtime.NumCPU()  // Get the number of logical CPUs on the host
runtime.GOMAXPROCS(num)  // Set the maximum number of CPUs that can execute simultaneously
```
Usage Examples
Example 1: Simple Goroutine Calculation
```go
package main

import (
	"fmt"
	"time"
)

// cal computes the sum of two integers and prints the result
func cal(a int, b int) {
	c := a + b
	fmt.Printf("%d + %d = %d\n", a, b, c)
}

func main() {
	for i := 0; i < 10; i++ {
		go cal(i, i+1) // Start 10 goroutines to perform calculations
	}
	time.Sleep(time.Second * 2) // Wait for all goroutines to complete
}
```
Result:
8 + 9 = 17
9 + 10 = 19
4 + 5 = 9
5 + 6 = 11
0 + 1 = 1
1 + 2 = 3
2 + 3 = 5
3 + 4 = 7
7 + 8 = 15
6 + 7 = 13
Goroutine Exception Catching
When starting multiple goroutines, if one of them panics and the panic is not recovered, the entire program terminates. Therefore, when writing a program, it is advisable to add recovery logic to the function each goroutine runs. The `recover` function can be used for this.
```go
package main

import (
	"fmt"
	"time"
)

func addele(a []int, i int) {
	// Use defer with an anonymous function to catch a possible panic
	defer func() {
		// recover returns the panic value, or nil if there was no panic
		if err := recover(); err != nil {
			fmt.Println("add ele fail")
		}
	}()
	a[i] = i // panics with index out of range when i >= len(a)
	fmt.Println(a)
}

func main() {
	arry := make([]int, 4)
	for i := 0; i < 10; i++ {
		go addele(arry, i)
	}
	time.Sleep(time.Second * 2)
}
```
Result:
add ele fail
[0 0 0 0]
[0 1 0 0]
[0 1 2 0]
[0 1 2 3]
add ele fail
add ele fail
add ele fail
add ele fail
add ele fail
Synchronized Goroutines
Since goroutines execute asynchronously, the main program may exit before some goroutines have finished, and those goroutines will exit along with it. If you want to wait for all goroutine tasks to complete before exiting, Go provides the `sync` package and channels to solve the synchronization problem. Of course, if you can predict the execution time of each goroutine, you can also use `time.Sleep` to wait for them to complete before exiting (as in the example above).
Example 1: Using the sync Package to Synchronize Goroutines
A `sync.WaitGroup` is used to wait for a group of goroutines to complete. The main program calls `Add` to set the number of goroutines to wait for. Each goroutine calls `Done` when it finishes, decreasing the counter by 1. The main program blocks in `Wait` until the counter reaches 0.
```go
package main

import (
	"fmt"
	"sync"
)

func cal(a int, b int, n *sync.WaitGroup) {
	// When the goroutine completes, Done decreases the WaitGroup count by 1
	defer n.Done()
	c := a + b
	fmt.Printf("%d + %d = %d\n", a, b, c)
}

func main() {
	var goSync sync.WaitGroup // Declare a WaitGroup
	for i := 0; i < 10; i++ {
		goSync.Add(1) // Increase the count by 1 before starting the goroutine
		go cal(i, i+1, &goSync)
	}
	goSync.Wait() // Block until the count is 0, i.e., all goroutines are done
}
```
Result:
9 + 10 = 19
2 + 3 = 5
3 + 4 = 7
4 + 5 = 9
5 + 6 = 11
1 + 2 = 3
6 + 7 = 13
7 + 8 = 15
0 + 1 = 1
8 + 9 = 17
Example 2: Implementing Synchronization between Goroutines through Channel
Implementation: multiple goroutines can communicate through a channel. When a goroutine completes, it sends an exit signal to the channel. The main program then uses a `for` loop to receive signals from the channel; each receive blocks until data is available, so the loop finishes only when every goroutine has completed. The prerequisite for this method is knowing the number of goroutines started.
```go
package main

import (
	"fmt"
	"time"
)

func cal(a int, b int, exitChan chan bool) {
	c := a + b
	fmt.Printf("%d + %d = %d\n", a, b, c)
	time.Sleep(time.Second * 2)
	exitChan <- true // Signal that this goroutine is done
}

func main() {
	// A bool channel with capacity 10 to collect completion signals
	exitChan := make(chan bool, 10)
	for i := 0; i < 10; i++ {
		go cal(i, i+1, exitChan)
	}
	for j := 0; j < 10; j++ {
		// Receive a signal; blocks until a goroutine finishes and sends one
		<-exitChan
	}
	close(exitChan)
}
```
Communication between Goroutines
Goroutine is essentially a coroutine, which can be understood as a thread managed by the Go scheduler rather than the kernel. Communication or data sharing between goroutines can be achieved through channels. Of course, global variables can also be used to share data.
Example: Using Channel to Simulate the Producer - Consumer Pattern
```go
package main

import (
	"fmt"
	"sync"
)

func Productor(mychan chan int, data int, wait *sync.WaitGroup) {
	defer wait.Done() // Mark the producer as completed
	mychan <- data    // Send data to the channel
	fmt.Println("product data:", data)
}

func Consumer(mychan chan int, wait *sync.WaitGroup) {
	defer wait.Done() // Mark the consumer as completed
	a := <-mychan // Receive data from the channel
	fmt.Println("consumer data:", a)
}

func main() {
	// An int channel with capacity 100 for passing data between producers and consumers
	datachan := make(chan int, 100)
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1) // Increase the count before starting the goroutine
		go Productor(datachan, i, &wg)
	}
	for j := 0; j < 10; j++ {
		wg.Add(1)
		go Consumer(datachan, &wg)
	}
	wg.Wait() // Block until all producers and consumers have finished
}
```
Result:
consumer data: 4
product data: 5
product data: 6
product data: 7
product data: 8
product data: 9
consumer data: 1
consumer data: 5
consumer data: 6
consumer data: 7
consumer data: 8
consumer data: 9
product data: 2
consumer data: 2
product data: 3
consumer data: 3
product data: 4
consumer data: 0
product data: 0
product data: 1
Leapcell: The Next-Gen Serverless Platform for Web Hosting, Async Tasks, and Redis
Finally, I would like to recommend the most suitable platform for deploying Go services: Leapcell
1. Multi-Language Support
- Develop with JavaScript, Python, Go, or Rust.
2. Deploy unlimited projects for free
- Pay only for usage: no requests, no charges.
3. Unbeatable Cost Efficiency
- Pay-as-you-go with no idle charges.
- Example: $25 supports 6.94M requests at a 60ms average response time.
4. Streamlined Developer Experience
- Intuitive UI for effortless setup.
- Fully automated CI/CD pipelines and GitOps integration.
- Real-time metrics and logging for actionable insights.
5. Effortless Scalability and High Performance
- Auto - scaling to handle high concurrency with ease.
- Zero operational overhead — just focus on building.
Explore more in the documentation!
Leapcell Twitter: https://x.com/LeapcellHQ