Go context Explained

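Preface

In everyday Go development, almost every default server implementation (the HTTP server, for example) starts a new goroutine to handle each incoming request.

That raised a problem from the very beginning: when a request is canceled or times out, every goroutine working on it should exit quickly so that the system can reclaim the resources they hold.

To address this, the Go team published the context package in 2014 (initially as golang.org/x/net/context) and moved it into the standard library in Go 1.7, released in 2016. Its main job is to carry a context across goroutines; that context covers both control over a goroutine's execution (cancellation and deadlines) and the passing of request-scoped values.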

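What is context

Context is one of Go's most distinctive features. It provides context control and can be passed from one goroutine to another.

Combined with select-case, context supports deadlines, cancellation signals, and value passing across goroutines, which makes it a key part of concurrent programming in Go.

Basic features of context

In practice a context is usually combined with the select keyword to watch for the context finishing or being canceled.

Demo code: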

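package main

import (
	"context"
	"fmt"
	"time"
)

func main() {
	parentCtx := context.Background()
	// Derive a context that expires after 1ms.
	ctx, cancel := context.WithTimeout(parentCtx, 1*time.Millisecond)
	defer cancel() // always release the timer held by the context

	select {
	case <-time.After(1 * time.Second):
		fmt.Println("overslept")
	case <-ctx.Done(): // closed once the 1ms timeout elapses
		fmt.Println(ctx.Err())
	}
}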

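Output:

context deadline exceeded

We call context.WithTimeout to derive from parentCtx a new context with a timeout, then use select-case to wait on the channel returned by ctx.Done. Because the 1ms deadline is reached long before the 1s timer fires, select takes the ctx.Done branch and prints ctx.Err(), which is context deadline exceeded.

Besides WithTimeout, the context package provides the following functions: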

func WithCancel(parent Context) (ctx Context, cancel CancelFunc)
func WithDeadline(parent Context, d time.Time) (Context, CancelFunc)
func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc)
type Context
    func Background() Context
    func TODO() Context
    func WithValue(parent Context, key, val interface{}) Context

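WithCancel: derives a new, cancelable context from a parent context.
WithDeadline: derives a new context with a deadline from a parent context.
WithTimeout: derives a new context with a timeout from a parent context.
Background: returns an empty context, typically used as the root parent context.
TODO: returns an empty context, used as a placeholder when it is not yet clear which context to use.
WithValue: derives a context that stores a key-value pair on top of a parent context.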
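
To make the difference between explicit cancellation and a deadline concrete, here is a minimal sketch. The helper names cancelReason and deadlineReason are made up for this illustration; only the context and time calls come from the standard library.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// cancelReason cancels a WithCancel context by hand and reports its error.
func cancelReason() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // explicit cancellation closes ctx.Done()
	<-ctx.Done()
	return ctx.Err()
}

// deadlineReason waits for a WithDeadline context to expire and reports its error.
func deadlineReason() error {
	deadline := time.Now().Add(10 * time.Millisecond)
	ctx, cancel := context.WithDeadline(context.Background(), deadline)
	defer cancel()
	<-ctx.Done()
	return ctx.Err()
}

func main() {
	fmt.Println(cancelReason())   // context canceled
	fmt.Println(deadlineReason()) // context deadline exceeded
}
```

The two errors let a caller tell "someone called cancel" apart from "time ran out".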

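Going one step further and combining context with goroutines, a common pattern is:

// gen starts a goroutine that sends 1, 2, 3, ... on the returned channel
// until ctx is canceled. The name gen is chosen here for illustration.
func gen(ctx context.Context) <-chan int {
	dst := make(chan int)
	n := 1
	go func() {
		for {
			select {
			case <-ctx.Done():
				return // stop once the context is canceled
			case dst <- n:
				n++
			}
		}
	}()
	return dst
}

Real projects start many goroutines. Inside each one, a for loop combined with select reacts to context events, which is how control is exercised across goroutines.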
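
As a usage sketch for the generator pattern above, the program below reads a few values and then cancels the context so that the producing goroutine exits instead of leaking. The function names gen and firstN are assumptions for this example; n is expected to be positive.

```go
package main

import (
	"context"
	"fmt"
)

// gen sends 1, 2, 3, ... until ctx is canceled (same shape as the snippet above).
func gen(ctx context.Context) <-chan int {
	dst := make(chan int)
	n := 1
	go func() {
		for {
			select {
			case <-ctx.Done():
				return
			case dst <- n:
				n++
			}
		}
	}()
	return dst
}

// firstN collects the first n values (n > 0) and then cancels the generator.
func firstN(n int) []int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // tells the generator goroutine to exit

	var out []int
	for v := range gen(ctx) {
		out = append(out, v)
		if len(out) == n {
			break
		}
	}
	return out
}

func main() {
	fmt.Println(firstN(5)) // [1 2 3 4 5]
}
```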

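Using context correctly

Pass a context into third-party calls

Accepting a Context has become a de facto convention in Go, so whenever we call into a third-party library or service we can pass a context along: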

package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

func main() {
	req, err := http.NewRequest("GET", "https://xxx.com/", nil)
	if err != nil {
		fmt.Printf("http.NewRequest err: %+v\n", err)
		return
	}

	// Bound the whole request, including connection setup, to 50ms.
	ctx, cancel := context.WithTimeout(req.Context(), 50*time.Millisecond)
	defer cancel()

	req = req.WithContext(ctx)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Printf("http.DefaultClient.Do err: %+v\n", err)
		return
	}
	defer resp.Body.Close()
}

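Most third-party libraries already implement context-based timeout control, so when the timeout elapses the in-flight request is aborted.

If you find that a third-party library does not support context, consider replacing it to avoid cascading failures.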
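
The effect can be observed without an external service. The sketch below, which assumes a local test server from net/http/httptest as a stand-in for a slow dependency, gives the call a 50ms budget and checks that the failure is the context's deadline error. The name slowCallTimesOut is made up for this example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// slowCallTimesOut calls a deliberately slow local test server with a 50ms
// budget and reports whether the call failed because the deadline passed.
func slowCallTimesOut() bool {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-r.Context().Done(): // the client gave up
		case <-time.After(500 * time.Millisecond): // simulated slow work
		}
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, srv.URL, nil)
	if err != nil {
		return false
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		// The client wraps the context error; errors.Is unwraps it.
		return errors.Is(err, context.DeadlineExceeded)
	}
	resp.Body.Close()
	return false
}

func main() {
	fmt.Println("timed out:", slowCallTimesOut())
}
```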

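Do not store contexts in struct types

You will notice that in Go, nearly all third-party libraries and business code take the context as a parameter of the function or method, and specifically as the first parameter.

The convention: a function or method that needs a context takes it as its first parameter, named ctx by idiom.
There are of course a few cases where the context is kept in a struct, most commonly in:

Low-level infrastructure libraries.
DDD-style structures.

Every request is independent, so every request naturally has its own context. Think carefully about your application's usage scenario; if in doubt, follow the standard Go convention.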
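
A minimal sketch of the convention follows. UserStore, Find, and findAfterCancel are hypothetical names invented for this example; the point is only that the struct holds long-lived dependencies while each call receives its own ctx.

```go
package main

import (
	"context"
	"fmt"
)

// UserStore is a hypothetical type used only for illustration. It holds
// long-lived dependencies, not a per-request context.
type UserStore struct {
	names map[int]string
}

// Find takes ctx as its first parameter, so each call carries its own context.
func (s *UserStore) Find(ctx context.Context, id int) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err // the caller's request is already canceled or expired
	}
	name, ok := s.names[id]
	if !ok {
		return "", fmt.Errorf("user %d not found", id)
	}
	return name, nil
}

// findAfterCancel shows that a canceled request context stops the call.
func findAfterCancel() error {
	store := &UserStore{names: map[int]string{1: "alice"}}
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	_, err := store.Find(ctx, 1)
	return err
}

func main() {
	store := &UserStore{names: map[int]string{1: "alice"}}
	name, err := store.Find(context.Background(), 1)
	fmt.Println(name, err)         // alice <nil>
	fmt.Println(findAfterCancel()) // context canceled
}
```

Because the store never keeps a context, the same value can safely serve many concurrent requests, each with its own deadline.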

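Propagate the context along the function call chain

We put the context first in the parameter list precisely so that it can be propagated, which lets every kind of control apply along the whole call chain: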

func List(ctx context.Context, db *sqlx.DB) ([]User, error) {
 ctx, span := trace.StartSpan(ctx, "internal.user.List")
 defer span.End()

 users := []User{}
 const q = `SELECT * FROM users`

 if err := db.SelectContext(ctx, &users, q); err != nil {
  return nil, errors.Wrap(err, "selecting users")
 }

 return users, nil
}

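As in the example above, the context passed into a function is handed down, level by level, to the functions it calls. Here the outer context is passed into List and then into the method that executes the SQL, so the caller's timeout or cancellation also bounds how long the SQL statement may run.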
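
The same idea can be sketched with the standard library alone. In the hypothetical three-layer chain below (handleRequest, listUsers, and query are made-up names, and the slow query is simulated with a timer), the budget set at the top is what stops the work at the bottom.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// query is the lowest layer; it honors whatever deadline the caller set.
func query(ctx context.Context) error {
	select {
	case <-time.After(200 * time.Millisecond): // simulated slow query
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// listUsers is a middle layer; it simply forwards ctx.
func listUsers(ctx context.Context) error {
	return query(ctx)
}

// handleRequest is the top layer; it sets a 20ms budget for the whole chain.
func handleRequest() error {
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	return listUsers(ctx)
}

func main() {
	fmt.Println(handleRequest()) // context deadline exceeded
}
```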

Inheriting and deriving contexts

The standard context package provides the following functions for deriving a context:

func WithCancel(parent Context) (ctx Context, cancel CancelFunc)
func WithDeadline(parent Context, d time.Time) (Context, CancelFunc)
func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc)

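Example code:

func handle(w http.ResponseWriter, req *http.Request) {
	// Parent context: bounded by the timeout taken from the request.
	timeout, err := time.ParseDuration(req.FormValue("timeout"))
	if err != nil {
		http.Error(w, "invalid timeout", http.StatusBadRequest)
		return
	}
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// Child context: canceled when the parent is, or when cancelChild is called.
	childCtx, cancelChild := context.WithCancel(ctx)
	defer cancelChild()

	_ = childCtx // do something with childCtx...
}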

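Contexts form parent-child relationships, and a context must remain safe for simultaneous use by multiple goroutines. Because of the parent-child link, when a parent context is canceled or times out, the contexts derived from it are affected as well.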
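
The direction of that link matters: cancellation flows from parent to child, not the other way around. A small sketch, with cancelDirections as a made-up helper name:

```go
package main

import (
	"context"
	"fmt"
)

// cancelDirections returns the child's error after its parent is canceled,
// and a parent's error after only its child is canceled.
func cancelDirections() (childAfterParent, parentAfterChild error) {
	// Canceling the parent cancels the child.
	parent, cancelParent := context.WithCancel(context.Background())
	child, cancelChild := context.WithCancel(parent)
	defer cancelChild()
	cancelParent()
	<-child.Done()
	childAfterParent = child.Err()

	// Canceling a child leaves its parent untouched.
	parent2, cancelParent2 := context.WithCancel(context.Background())
	defer cancelParent2()
	_, cancelChild2 := context.WithCancel(parent2)
	cancelChild2()
	parentAfterChild = parent2.Err()
	return
}

func main() {
	fmt.Println(cancelDirections()) // context canceled <nil>
}
```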

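Do not pass a nil context

Often, when we create a context, we do not yet know what it will be used for or what the next step is.

In that situation people tend to reach straight for context.Background: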

var (
   background = new(emptyCtx)
   todo       = new(emptyCtx)
)

func Background() Context {
   return background
}

func TODO() Context {
   return todo
}

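The recommendation, however, is to use context.TODO as the top-level context until the real purpose of the Context becomes clear, and to replace it then. Whichever you choose, never pass a nil Context.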
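
A short sketch of why nil is a bad placeholder: deriving a context from a nil parent panics, whereas context.TODO works as a stand-in. The helper names deriveFrom and todoIsSafe are assumptions for this example.

```go
package main

import (
	"context"
	"fmt"
)

// deriveFrom reports whether deriving a cancelable context from parent panics.
func deriveFrom(parent context.Context) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	_, cancel := context.WithCancel(parent)
	cancel()
	return false
}

// todoIsSafe reports whether context.TODO is accepted as a parent.
func todoIsSafe() bool {
	return !deriveFrom(context.TODO())
}

func main() {
	var missing context.Context // nil: nobody supplied a context
	fmt.Println(deriveFrom(missing)) // true
	fmt.Println(todoIsSafe())        // true
}
```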

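Pass only necessary values through context

When using context we often need to pass information along. gRPC has the concept of metadata, for example, and gin wraps its own context type to manage parameters.
The standard context package provides a function for this as well: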

type Context
    func WithValue(parent Context, key, val interface{}) Context

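Example code:

package main

import (
	"context"
	"fmt"
)

func main() {
	// A dedicated key type avoids collisions with keys from other packages.
	type favContextKey string
	f := func(ctx context.Context, k favContextKey) {
		if v := ctx.Value(k); v != nil {
			fmt.Println("found value:", v)
			return
		}
		fmt.Println("key not found:", k)
	}

	k := favContextKey("Xiaomi")
	ctx := context.WithValue(context.Background(), k, "Xiaomi")

	f(ctx, k)
	f(ctx, favContextKey("Xiaohong"))
}

Output:

found value: Xiaomi
key not found: Xiaohong

By convention, a context should carry only the values that other functions or goroutines genuinely need. gRPC goes further and strictly separates incoming and outgoing context metadata.

In business code, context values suit essential, cross-cutting attributes such as a tenant ID or a mini-program ID. Do not put optional parameters into a context, or things quickly become a mess.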
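
One common way to keep context values under control is to hide the key behind typed accessors. The sketch below assumes a tenant ID as the value; tenantKey, WithTenant, TenantFrom, and roundTrip are hypothetical names chosen for this example.

```go
package main

import (
	"context"
	"fmt"
)

// tenantKey is an unexported key type, so no other package can collide with it.
type tenantKey struct{}

// WithTenant returns a copy of ctx that carries the tenant ID.
func WithTenant(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, tenantKey{}, id)
}

// TenantFrom extracts the tenant ID, reporting whether one was set.
func TenantFrom(ctx context.Context) (string, bool) {
	id, ok := ctx.Value(tenantKey{}).(string)
	return id, ok
}

// roundTrip stores a tenant ID and reads it back.
func roundTrip(id string) (string, bool) {
	return TenantFrom(WithTenant(context.Background(), id))
}

// missingTenant looks up a tenant ID on a context that never had one.
func missingTenant() (string, bool) {
	return TenantFrom(context.Background())
}

func main() {
	fmt.Println(roundTrip("t-42")) // t-42 true
	_, ok := missingTenant()
	fmt.Println(ok) // false
}
```

Callers never touch the key or do type assertions themselves, which keeps the set of values that travel in the context small and explicit.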

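Summary

Pass a context into third-party calls to control remote invocations.
Do not store contexts in struct types; pass them as the first function parameter wherever possible.
Propagate the context along the function call chain so that control applies to the whole chain.
Derive child contexts from parent contexts so that the two stay linked.
Do not pass a nil context; use TODO when you are unsure which context to use.
Pass only necessary values through context; do not mix optional parameters in.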

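The essence of context

The basic-features section introduced quite a few context functions, all of which look much alike. They do not seem difficult, so let us now look at the underlying principles and design.

The functions that derive a context share a standard return shape:

func WithXXXX(parent Context, xxx xxx) (Context, CancelFunc)

They return a Context and a CancelFunc. Next we analyze what each of them does.

Interfaces

The Context interface: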

type Context interface {
    Deadline() (deadline time.Time, ok bool)
    Done() <-chan struct{}
    Err() error
    Value(key interface{}) interface{}
}

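Deadline: returns the deadline of the current context, if one is set.
Done: returns a receive-only channel whose element type is the empty struct. The channel is closed when the context is done, either because its deadline passed or because it was canceled.
Err: returns the reason the current context was closed.
Value: returns the value stored in the current context for the given key.

The canceler interface: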

type canceler interface {
	cancel(removeFromParent bool, err error)
	Done() <-chan struct{}
}

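cancel: cancels the current context.
Done: same as above; the returned channel reports whether the context has been canceled.

Basic structures

The standard context package provides four context types that implement the interfaces above: emptyCtx, cancelCtx, timerCtx, and valueCtx.

emptyCtx

context.Background and context.TODO are the functions used most often in day-to-day code.

Source (from the Go release this article was written against; newer releases differ in detail):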

var (
	background = new(emptyCtx)
	todo       = new(emptyCtx)
)

func Background() Context {
	return background
}

func TODO() Context {
	return todo
}

Both are thin wrappers around the emptyCtx type, which in turn implements the Context interface:

type emptyCtx int

func (*emptyCtx) Deadline() (deadline time.Time, ok bool) {
	return
}

func (*emptyCtx) Done() <-chan struct{} {
	return nil
}

func (*emptyCtx) Err() error {
	return nil
}

func (*emptyCtx) Value(key interface{}) interface{} {
	return nil
}

The emptyCtx implementation is very simple because it defines an empty context: it has no deadline, is never canceled, and carries no values. Think of it as a blank base context.
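
These properties can be checked from outside the package. The sketch below probes context.Background through the public Context interface; emptyContextFacts and probeKey are made-up names for this example.

```go
package main

import (
	"context"
	"fmt"
)

// probeKey is a hypothetical key type used only for the lookup below.
type probeKey struct{}

// emptyContextFacts probes context.Background through the Context interface.
func emptyContextFacts() (hasDeadline bool, doneIsNil bool, err error, val interface{}) {
	ctx := context.Background()
	_, hasDeadline = ctx.Deadline()   // no deadline is set
	doneIsNil = ctx.Done() == nil     // a nil channel: it can never be closed
	err = ctx.Err()                   // never canceled, so no error
	val = ctx.Value(probeKey{})       // carries no values
	return
}

func main() {
	fmt.Println(emptyContextFacts()) // false true <nil> <nil>
}
```

Receiving from a nil channel blocks forever, which is why a select on Background().Done() never takes that branch.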

cancelCtx

Calling context.WithCancel involves the cancelCtx type, whose defining feature is the cancellation event. Source:

func WithCancel(parent Context) (ctx Context, cancel CancelFunc) {
	c := newCancelCtx(parent)
	propagateCancel(parent, &c)
	return &c, func() { c.cancel(true, Canceled) }
}

func newCancelCtx(parent Context) cancelCtx {
	return cancelCtx{Context: parent}
}

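newCancelCtx creates a new cancelable context. When that context is canceled, its child contexts, and the goroutines listening on them, receive the cancellation too.

First, the main goroutine creates a new context and passes it to goroutine b; goroutine b's context is then a child of the main goroutine's context.

Goroutine b in turn passes its context on to goroutines c, d, and e. Finally, at run time, goroutine b calls cancel, so that context and all of its children receive the cancellation signal and the corresponding goroutines respond.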
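
The fan-out described above can be sketched as follows. Each worker plays the role of goroutine c, d, or e and holds a child context; one cancel call on the shared parent reaches all of them. stopWorkers is a made-up name for this example.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// stopWorkers starts n workers under one cancelable context, cancels it,
// and returns how many workers observed the cancellation.
func stopWorkers(n int) int {
	ctx, cancel := context.WithCancel(context.Background())

	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		stopped int
	)
	for i := 0; i < n; i++ {
		// Each worker gets its own child context derived from ctx.
		child, cancelChild := context.WithCancel(ctx)
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer cancelChild()
			<-child.Done() // unblocks when the parent is canceled
			mu.Lock()
			stopped++
			mu.Unlock()
		}()
	}

	cancel()  // one call reaches every child
	wg.Wait() // all workers have exited
	return stopped
}

func main() {
	fmt.Println(stopWorkers(3)) // 3
}
```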

Let us take a closer look at the cancelCtx type:

type cancelCtx struct {
	Context

	mu       sync.Mutex            // protects following fields
	done     chan struct{}         // created lazily, closed by first cancel call
	children map[canceler]struct{} // set to nil by the first cancel call
	err      error                 // set to non-nil by the first cancel call
}

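The struct has only a few fields. The key one is children, which holds every child context of this context so that each can be notified when a cancellation occurs.

The remaining fields handle concurrency control (the mutex), the cancellation signal, and the recorded error: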

func (c *cancelCtx) Value(key interface{}) interface{} {
	if key == &cancelCtxKey {
		return c
	}
	return c.Context.Value(key)
}

func (c *cancelCtx) Done() <-chan struct{} {
	c.mu.Lock()
	if c.done == nil {
		c.done = make(chan struct{})
	}
	d := c.done
	c.mu.Unlock()
	return d
}

func (c *cancelCtx) Err() error {
	c.mu.Lock()
	err := c.err
	c.mu.Unlock()
	return err
}

Notice that the done channel is created lazily, only when Done is first called. Done returns it as a receive-only channel, which callers typically use in a select-case.

timerCtx

Calling context.WithTimeout involves the timerCtx type, whose defining features are timeout and deadline events. Source:

func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc) {
	return WithDeadline(parent, time.Now().Add(timeout))
}

func WithDeadline(parent Context, d time.Time) (Context, CancelFunc) {
	...
	c := &timerCtx{
		cancelCtx: newCancelCtx(parent),
		deadline:  d,
	}
}

As you can see, timerCtx is built on cancelCtx. Let us look at the timerCtx struct:

type timerCtx struct {
	cancelCtx
	timer *time.Timer // Under cancelCtx.mu.

	deadline time.Time
}

In other words, timerCtx is a cancelCtx plus a time.Timer and the corresponding deadline, which adds time-based control.
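
The "cancelable context plus timer" idea can be imitated with public APIs. The following is a rough sketch, not the library's implementation: withTimeoutSketch is a hypothetical helper, and unlike the real WithTimeout it reports context.Canceled on expiry and does not expose a Deadline.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// withTimeoutSketch is a hypothetical stand-in for context.WithTimeout built
// from WithCancel plus a timer. Unlike the real function, it reports
// context.Canceled on expiry and does not implement Deadline.
func withTimeoutSketch(parent context.Context, d time.Duration) (context.Context, context.CancelFunc) {
	ctx, cancel := context.WithCancel(parent)
	timer := time.AfterFunc(d, cancel) // fire cancel when the time is up
	return ctx, func() {
		timer.Stop() // release the timer if we finish early
		cancel()
	}
}

// expiresOnItsOwn reports whether the sketch context ends without a manual cancel.
func expiresOnItsOwn() bool {
	ctx, stop := withTimeoutSketch(context.Background(), 10*time.Millisecond)
	defer stop()
	select {
	case <-ctx.Done():
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println(expiresOnItsOwn()) // true
}
```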

Next, look at its cancel method and consider how it performs cancellation:

func (c *timerCtx) Deadline() (deadline time.Time, ok bool) {
	return c.deadline, true
}

func (c *timerCtx) cancel(removeFromParent bool, err error) {
	c.cancelCtx.cancel(false, err)
	if removeFromParent {
		removeChild(c.cancelCtx.Context, c)
	}
	c.mu.Lock()
	if c.timer != nil {
		c.timer.Stop()
		c.timer = nil
	}
	c.mu.Unlock()
}

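cancel first triggers the cancelCtx cancellation. If removeFromParent is true, it then removes this context from its parent's children. Finally it stops the timer and sets it to nil. The deadline and timeout behavior itself is implemented in WithDeadline: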

func WithDeadline(parent Context, d time.Time) (Context, CancelFunc) {
	if cur, ok := parent.Deadline(); ok && cur.Before(d) {
		// The current deadline is already sooner than the new one.
		return WithCancel(parent)
	}
	...
}

The function starts with a guard: if the parent's deadline is earlier than the requested one, the parent will expire first anyway, so it simply returns a cancelCtx-based context.
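
A quick way to observe this guard from user code is to ask for a longer timeout than the parent allows and then inspect the child's deadline. effectiveDeadlineIsParents is a made-up helper name for this sketch.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// effectiveDeadlineIsParents derives a 1-hour child from a 50ms parent and
// reports whether the child still expires at the parent's deadline.
func effectiveDeadlineIsParents() bool {
	parent, cancelParent := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancelParent()

	child, cancelChild := context.WithTimeout(parent, time.Hour)
	defer cancelChild()

	pd, _ := parent.Deadline()
	cd, ok := child.Deadline()
	return ok && cd.Equal(pd)
}

func main() {
	fmt.Println(effectiveDeadlineIsParents()) // true
}
```

A child can shorten the time budget it inherits but never extend it.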

func WithDeadline(parent Context, d time.Time) (Context, CancelFunc) {
	...
	c := &timerCtx{
		cancelCtx: newCancelCtx(parent),
		deadline:  d,
	}
	propagateCancel(parent, c)
	dur := time.Until(d)
	if dur <= 0 {
		c.cancel(true, DeadlineExceeded) // deadline has already passed
		return c, func() { c.cancel(false, Canceled) }
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.err == nil {
		c.timer = time.AfterFunc(dur, func() {
			c.cancel(true, DeadlineExceeded)
		})
	}
	return c, func() { c.cancel(true, Canceled) }
}

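Next the function builds the timerCtx and registers it in the parent context's children. It then computes the time remaining until the deadline and uses time.AfterFunc to call cancel automatically on expiry, which in turn propagates the cancellation from parent to children.

valueCtx

Calling context.WithValue involves the valueCtx type, which carries key-value data. Source: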

func WithValue(parent Context, key, val interface{}) Context {
	...
	if !reflectlite.TypeOf(key).Comparable() {
		panic("key is not comparable")
	}
	return &valueCtx{parent, key, val}
}

The valueCtx struct is also very simple; at its core is a key-value pair:

type valueCtx struct {
	Context
	key, val interface{}
}

Its methods are not complicated either: WithValue requires the key to be comparable, and Value matches keys on lookup:

func (c *valueCtx) Value(key interface{}) interface{} {
	if c.key == key {
		return c.val
	}
	return c.Context.Value(key)
}

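At this point you may wonder how values are looked up across several levels of parent and child contexts.

The answer is already visible in the valueCtx type and its Value method above.

In essence, valueCtx values form a singly linked list. Value first checks whether its own node holds the key. If not, it follows the stored parent and searches upward level by level until the key is found or the root context returns nil.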
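
The upward walk can be seen with three nested contexts. In this sketch the key type and the lookups helper are made up for illustration; the same key set at two levels shows that the nearest node wins.

```go
package main

import (
	"context"
	"fmt"
)

// key is a hypothetical key type for this example.
type key string

// lookups builds root -> mid -> leaf and performs four lookups.
func lookups() (own, inherited, shadowed, missing interface{}) {
	root := context.WithValue(context.Background(), key("a"), "root-a")
	mid := context.WithValue(root, key("b"), "mid-b")
	leaf := context.WithValue(mid, key("a"), "leaf-a")

	own = mid.Value(key("b"))        // found on mid itself
	inherited = mid.Value(key("a"))  // not on mid, found one level up on root
	shadowed = leaf.Value(key("a"))  // leaf's own entry hides root's
	missing = leaf.Value(key("zzz")) // walks up to the root and yields nil
	return
}

func main() {
	fmt.Println(lookups()) // mid-b root-a leaf-a <nil>
}
```

Each lookup is linear in the depth of the chain, which is one more reason to keep the number of context values small.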

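In real projects you will find that major frameworks such as gin and gRPC add their own wrapper for carrying context information, with the aim of managing and observing that information more easily.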

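context cancellation events

Having analyzed the derived context types and their source, we can ask one more question: how does context propagate a cancellation across goroutines?

The answer is that both WithCancel and WithDeadline call propagateCancel, which links a child context to its parent so that a later cancellation can be handled: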

func propagateCancel(parent Context, child canceler) {
	done := parent.Done()
	if done == nil {
		return
	}

	select {
	case <-done:
		child.cancel(false, parent.Err())
		return
	default:
	}
	...
}

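When parent.Done() returns nil, the function returns immediately: the parent can never be canceled, for example because it is an empty context produced by Background or TODO.
When parent.Done() returns a non-nil channel that is already closed, the parent has already been canceled, so the child is canceled right away with the parent's error as the reason.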

func propagateCancel(parent Context, child canceler) {
	...
	if p, ok := parentCancelCtx(parent); ok {
		p.mu.Lock()
		if p.err != nil {
			child.cancel(false, p.err)
		} else {
			if p.children == nil {
				p.children = make(map[canceler]struct{})
			}
			p.children[child] = struct{}{}
		}
		p.mu.Unlock()
	} else {
		atomic.AddInt32(&goroutines, +1)
		go func() {
			select {
			case <-parent.Done():
				child.cancel(false, parent.Err())
			case <-child.Done():
			}
		}()
	}
}

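After the checks in the previous snippet, we know that the parent context has not been canceled; both parent and child are still active.

The following then happens:

If parentCancelCtx finds a parent that supports cancellation, the current context (child) is added to that parent's children and waits to be notified when the parent is canceled.
If parentCancelCtx finds none, a new goroutine is started to watch the parent's and the child's Done channels.

From the cancellation logic and the source as a whole, we can see that a cancelCtx holds references to all of its child nodes.

Its children field, a map[canceler]struct{}, is what makes the children discoverable, and therefore what makes propagating a cancellation possible.

propagateCancel, described above, only sets up this link (or cancels the child at once if the parent is already canceled). The cancellation itself happens when a cancel method runs: it checks whether the context is already canceled and, if not, records the error, closes done, and cancels each child in turn.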
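
The bookkeeping can be modeled in a few lines. The following is a deliberately simplified, single-goroutine model, not the standard library code: the node type and its derive and cancel methods are hypothetical, and the mutex, the done channel, and timers of the real cancelCtx are left out.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical, single-goroutine model of a cancelable context.
type node struct {
	name     string
	err      error
	children map[*node]struct{}
}

// derive creates a child and registers it with its parent, roughly what
// propagateCancel does when the parent is a cancelCtx.
func (n *node) derive(name string) *node {
	c := &node{name: name}
	if n.err != nil {
		c.cancel(n.err) // parent already canceled: cancel the child at once
		return c
	}
	if n.children == nil {
		n.children = make(map[*node]struct{})
	}
	n.children[c] = struct{}{}
	return c
}

// cancel records the error and cancels every descendant.
func (n *node) cancel(err error) {
	if n.err != nil {
		return // already canceled
	}
	n.err = err
	for c := range n.children {
		c.cancel(err)
	}
	n.children = nil
}

// demo builds main -> b -> {c, d, e}, cancels b, and reports how many of
// b, c, d, e were canceled and whether main was affected.
func demo() (canceled int, rootCanceled bool) {
	root := &node{name: "main"}
	b := root.derive("b")
	group := []*node{b, b.derive("c"), b.derive("d"), b.derive("e")}
	b.cancel(errors.New("canceled"))
	for _, k := range group {
		if k.err != nil {
			canceled++
		}
	}
	return canceled, root.err != nil
}

func main() {
	fmt.Println(demo()) // 4 false
}
```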

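Summary

Although it is one of Go's core facilities, the standard context package is short and compact and relies only on basic data structures and ideas. It provides control across goroutines, such as coordinating concurrent work and enforcing timeouts.

It also carries request-scoped information. In real projects, trace IDs, common parameters, authentication data, and the like all travel through context.

The official recommendation is to pass the context as the first parameter. That can feel tedious, so some people store it as a struct field instead, but doing so adds cognitive overhead: you have to work out whether a fresh one needs to be created.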