
Go: Garbage Collection

📅 2026-02-10 ✏️ 2026-03-24 Inside Go CS GO

0.1 · Step 1: Why GC Is Needed

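A Go program's objects live either on the stack or on the heap; the compiler's escape analysis decides which. Stack objects are reclaimed automatically when their function returns, but a heap object's lifetime is not known in advance: once no reference points to it, it becomes garbage and must be reclaimed. That is the GC's job.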
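To make the stack/heap distinction concrete, here is a minimal sketch that measures allocations for a value that stays local versus one whose pointer outlives the call. The names `point`, `sumOnStack`, `newOnHeap`, and `sink` are made up for this illustration, and `//go:noinline` is used only to keep the compiler from optimizing the difference away.

```go
package main

import (
	"fmt"
	"testing"
)

type point struct{ x, y int }

// sink is package-level; storing a pointer here forces it to outlive the call.
var sink *point

// sumOnStack keeps p local, so the compiler can place it on the stack.
//
//go:noinline
func sumOnStack(a, b int) int {
	p := point{a, b}
	return p.x + p.y
}

// newOnHeap returns a pointer that outlives the call, so p escapes to the heap.
//
//go:noinline
func newOnHeap(a, b int) *point {
	p := point{a, b}
	return &p
}

func main() {
	stackAllocs := testing.AllocsPerRun(100, func() { _ = sumOnStack(1, 2) })
	heapAllocs := testing.AllocsPerRun(100, func() { sink = newOnHeap(1, 2) })
	fmt.Println("stack version allocs/op:", stackAllocs)
	fmt.Println("heap version allocs/op:", heapAllocs)
}
```

Building with `go build -gcflags=-m` prints the compiler's escape decisions, which is usually the quickest way to check where a given value ends up.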

Per-P allocation: Memory is allocated from per-P caches (mcache) organized by size class, which minimizes lock contention and fragmentation

Go uses concurrent tri-color mark-and-sweep garbage collection with write barriers. The collector is non-generational and non-compacting.

0.2 · Step 2: The Mark-and-Sweep Algorithm

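The core idea of GC has two steps:

  1. Mark: starting from the root objects, traverse every reachable object and mark it as live
  2. Sweep: any object left unmarked is garbage, so reclaim its memory

The simplest implementation pauses the whole program (Stop-The-World), but Go aims for low latency, so most GC work runs concurrently: GC workers run at the same time as application goroutines (the mutator).

Concurrency introduces a problem: during marking, the mutator may change pointer relationships, which can cause a live object to be misclassified as garbage. Go solves this with a write barrier.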

0.2.1 · Write Barrier

Purpose: Prevent black objects from pointing directly to white objects

Implementation:

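  • Inserted at compile time
  • When a pointer store occurs during marking, the write barrier records it
  • The barrier shades the affected objects (the target of the overwritten pointer and the target of the newly stored pointer) so the collector still visits them

Enables concurrent marking without additional STW pauses to rescan modified objects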
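The hazard a write barrier guards against can be shown with a toy model. The sketch below is not the runtime's implementation: `object`, `writePointer`, `shade`, and `survives` are hypothetical names, each object has a single outgoing pointer, and the "concurrent" mutator is simulated by running its stores before the collector drains the gray queue.

```go
package main

import "fmt"

type color int

const (
	white color = iota
	gray
	black
)

type object struct {
	name  string
	color color
	ref   *object // single outgoing pointer, enough for the demo
}

// shade turns a white object gray and queues it for scanning.
func shade(o *object, grayQueue *[]*object) {
	if o != nil && o.color == white {
		o.color = gray
		*grayQueue = append(*grayQueue, o)
	}
}

// writePointer stores dst.ref = target. With the barrier enabled it first
// shades both the old and the new referent.
func writePointer(dst, target *object, barrier bool, grayQueue *[]*object) {
	if barrier {
		shade(dst.ref, grayQueue) // overwritten pointer
		shade(target, grayQueue)  // newly stored pointer
	}
	dst.ref = target
}

// survives simulates: A is already black, B is gray and points to C (white).
// The mutator moves the only pointer to C from B to A before B is scanned.
func survives(barrier bool) bool {
	c := &object{name: "C"}
	b := &object{name: "B", color: gray, ref: c}
	a := &object{name: "A", color: black}
	queue := []*object{b}

	writePointer(a, c, barrier, &queue)   // A.ref = C
	writePointer(b, nil, barrier, &queue) // B.ref = nil

	// Collector drains the gray queue.
	for len(queue) > 0 {
		o := queue[0]
		queue = queue[1:]
		shade(o.ref, &queue)
		o.color = black
	}
	return c.color != white // a white object would be swept
}

func main() {
	fmt.Println("without barrier, C survives:", survives(false)) // false
	fmt.Println("with barrier, C survives:", survives(true))     // true
}
```

Without the barrier, the black object A ends up pointing at the white object C and nothing ever revisits A, so C would be freed while still in use.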

However, turning the write barrier on or off requires all Ps to be in a consistent state, which is where STW and safepoints come in.

0.2.2 · STW and Safepoints

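Although Go's GC is concurrent, it still needs brief STW pauses to guarantee consistency. STW happens at two key moments:

  1. At mark start: enable the write barrier, enable mutator assist, and queue the root marking jobs. All Ps must be in a consistent state at this point
  2. At mark end: confirm that all marking work is done, disable workers and assists, and do housekeeping (flush mcaches, etc.)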

STW pauses are kept short, typically under a millisecond. Every P must reach a GC safe point before the STW can proceed.

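Safepoint: a point where execution can be paused and for which stack map / liveness information is available

Compile time: the compiler emits stack map / liveness information at specific locations (function call sites, loop back-edges, etc.), recording which registers and stack slots hold live pointers. These locations are the safepoints.

Run time: when the GC needs an STW, it sends a preemption request to every goroutine, and each goroutine pauses when it reaches its next safepoint. The runtime can then use the compiler-generated stack maps to scan the pointer roots on each stack precisely.

In essence, STW means waiting for every P/goroutine to stop at a safepoint. A goroutine cannot be paused at a location without a stack map, because the GC could not tell integers from pointers on its stack.

Go 1.14+ introduced asynchronous (signal-based) preemption, so even a tight loop with no function calls can be interrupted and brought to a safepoint.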
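The effect of asynchronous preemption can be observed with a small experiment. This is a sketch under a few assumptions: Go 1.19+ (for `atomic.Bool`), a platform where signal-based preemption is available, and a helper named `mainResumes` invented for this example. On a runtime without asynchronous preemption, the same program would be expected to hang.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// mainResumes starts a goroutine that spins without any function calls on a
// single P and reports whether the calling goroutine still gets scheduled.
func mainResumes() bool {
	prev := runtime.GOMAXPROCS(1) // one P, so the spinner can monopolize it
	defer runtime.GOMAXPROCS(prev)

	var stop atomic.Bool
	done := make(chan struct{})
	go func() {
		defer close(done)
		for !stop.Load() { // tight loop: no calls, no cooperative preemption point
		}
	}()

	time.Sleep(50 * time.Millisecond) // hands the P to the spinner
	// Reaching this line means the runtime preempted the spinner.
	stop.Store(true)
	<-done
	return true
}

func main() {
	fmt.Println("main resumed:", mainResumes())
}
```

The same mechanism is what lets an STW request stop a goroutine that never reaches a function call.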

0.3 · Step 3: Tri-Color Marking, Scanning from the Roots

The mark phase uses a tri-color abstraction to track each object's state:

Colors:

  • White: Object is unmarked (the default state; possibly garbage)
  • Gray: Object is marked, but its children haven’t been scanned yet
  • Black: Object is marked and its children have been scanned

Invariant: Black objects never point directly to white objects (enforced by write barrier)

Process:

  1. Start with roots (stack, globals, etc.) as gray
  2. Scan gray objects, mark their children as gray, color parent black
  3. Repeat until no gray objects remain
  4. All remaining white objects are garbage
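The four steps above can be sketched as a worklist algorithm over a toy object graph. The types and function names here (`object`, `mark`, `garbage`) are illustrative only; the real collector works on spans and bitmaps rather than per-object color fields.

```go
package main

import "fmt"

type color int

const (
	white color = iota
	gray
	black
)

type object struct {
	name     string
	color    color
	children []*object
}

// mark runs the tri-color loop from the given roots.
func mark(roots []*object) {
	var grayQueue []*object
	shade := func(o *object) {
		if o.color == white {
			o.color = gray
			grayQueue = append(grayQueue, o)
		}
	}
	for _, r := range roots {
		shade(r) // step 1: roots start out gray
	}
	for len(grayQueue) > 0 { // step 3: repeat until no gray objects remain
		o := grayQueue[len(grayQueue)-1]
		grayQueue = grayQueue[:len(grayQueue)-1]
		for _, c := range o.children {
			shade(c) // step 2: children become gray
		}
		o.color = black // step 2: parent becomes black
	}
}

// garbage returns the names of objects still white after marking (step 4).
func garbage(heap []*object) []string {
	var dead []string
	for _, o := range heap {
		if o.color == white {
			dead = append(dead, o.name)
		}
	}
	return dead
}

func main() {
	a, b, c, d := &object{name: "a"}, &object{name: "b"}, &object{name: "c"}, &object{name: "d"}
	a.children = []*object{b}
	b.children = []*object{a} // a cycle between a and b is handled naturally
	c.children = []*object{d} // c and d are unreachable from the root
	heap := []*object{a, b, c, d}

	mark([]*object{a})
	fmt.Println(garbage(heap)) // [c d]
}
```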

0.3.1 · Concurrent Marking Details

Marking has two parts, an STW preparation step and concurrent execution:

Preparation (STW):

  • Set gcphase to _GCmark
  • Enable write barrier
  • Enable mutator assist
  • Queue root marking jobs

Concurrent Marking:

  • Start the world (P’s resume work)
  • GC work distributed between dedicated marking workers and mutator assist
  • Write barrier records overwritten and new pointers
  • New allocations immediately marked as black
  • Scan all stacks, shade globals, and shade heap pointers held in off-heap runtime data structures
  • Exhaust gray work queue

Termination Detection:

  • Use distributed termination algorithm (gcMarkDone) to detect when no more root jobs or gray objects exist
  • Transition to mark termination phase

0.3.2 · Large Object Optimization

Problem: Scanning large objects can cause long pauses, reduce parallelism

Solution:

  • Objects larger than maxObletBytes split into array of oblets (max maxObletBytes each)
  • When scanning reaches large object, only scan first oblet
  • Remaining oblets queued as new jobs
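The splitting step can be sketched as follows. The chunk size `obletBytes`, the `oblet` type, and `splitOblets` are assumptions made for this illustration; the runtime's own constant and bookkeeping may differ.

```go
package main

import "fmt"

// obletBytes is a hypothetical chunk size chosen for this sketch.
const obletBytes = 128 << 10

// oblet is a [start, end) byte range inside one large object.
type oblet struct{ start, end uintptr }

// splitOblets cuts an object of the given size into chunks of at most obletBytes.
func splitOblets(size uintptr) []oblet {
	var out []oblet
	for off := uintptr(0); off < size; off += obletBytes {
		end := off + obletBytes
		if end > size {
			end = size
		}
		out = append(out, oblet{off, end})
	}
	return out
}

func main() {
	jobs := splitOblets(300 << 10)
	fmt.Println("scanned by the current worker:", jobs[0])
	fmt.Println("queued as separate jobs:", jobs[1:])
}
```

Because each queued chunk is an independent work item, several workers can scan one large object in parallel and no single scan runs for long.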

0.3.3 · Mutator Assist

Problem: If mutators allocate faster than collector can mark, heap grows unbounded

Solution: When allocation rate exceeds marking rate, mutators must assist GC

  • Assist work is primarily marking; allocating goroutines also help with sweeping when they need a new span
  • Amount of assist per allocation determined by pacing algorithm
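One way to picture the assist mechanism is as a per-goroutine credit account. The toy model below is only that: `assistState`, `allocate`, and the `workPerByte` ratio are hypothetical, and the runtime's real accounting is part of the pacer and differs in detail.

```go
package main

import "fmt"

// assistState is a toy model of per-goroutine assist accounting.
type assistState struct {
	credit int64 // bytes the goroutine may still allocate without assisting
}

// allocate charges size bytes against the credit. If the credit goes negative,
// it returns the amount of scan work the goroutine must perform before it may
// continue, using workPerByte as a stand-in for the pacer-supplied ratio.
func (a *assistState) allocate(size int64, workPerByte float64) (scanWork int64) {
	a.credit -= size
	if a.credit >= 0 {
		return 0 // still within credit: no assist needed
	}
	debt := -a.credit
	a.credit = 0 // the debt is settled by doing the returned scan work
	return int64(float64(debt) * workPerByte)
}

func main() {
	g := &assistState{credit: 100}
	fmt.Println("first allocation, scan work owed:", g.allocate(60, 0.5))
	fmt.Println("second allocation, scan work owed:", g.allocate(60, 0.5))
}
```

The point of the design is back-pressure: a goroutine that allocates quickly pays for it with marking work, which keeps the heap from outrunning the collector.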

0.4 · Step 4: Sweep

After marking completes, every white object is garbage. The sweep phase reclaims their memory.

Preparation:

  • Set gcphase to _GCoff
  • Configure sweep state
  • Disable write barrier

Concurrent Sweeping:

  • Start the world
  • New allocations are white
  • Spans are lazily swept as needed for allocation
  • Background goroutine sweeps spans concurrently

0.4.1 · Concurrent Sweep Details

Mechanics:

  • Background goroutine lazily sweeps spans (helps non-CPU-bound programs)
  • When goroutine needs a new span:
    1. Try to reclaim via sweeping (sweep same-size spans until at least one object freed)
    2. If insufficient, sweep larger or multiple spans as needed
    3. If still insufficient, request from OS

Safety:

  • All mcache flushed to central cache at mark termination, making them empty
  • Goroutines flush mcache when fetching new span
  • Finalizer goroutine only runs after all spans are swept
  • At next GC start, any remaining unswept spans are forcefully swept

Critical invariant: No operations on unswept spans (would corrupt GC bitmap)

0.5 · Step 5: The Full GC Cycle

A complete GC cycle has four phases:

Phase 1          Phase 2                    Phase 3          Phase 4
Sweep Term.      Mark Phase                 Mark Term.       Sweep Phase
|── STW ──|── STW ──|── Concurrent ──|── STW ──|──── Concurrent ────|
 finish    barrier   tri-color        barrier     concurrent sweep,
 sweeping  on, queue marking runs     off, flush  lazy span sweeping
           roots     concurrently     mcaches

0.5.1 · Phase 1: Sweep Termination

  • STW: All P’s reach GC safe point
  • Sweep any remaining unswept spans (only if GC was forced before expected time)

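0.5.2 · Phase 2: Mark Phase

  • STW Preparation: enable the write barrier, enable assists, queue root marking jobs
  • Concurrent Marking: tri-color marking runs concurrently
  • Termination Detection: gcMarkDone detects that all gray objects have been processed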

0.5.3 · Phase 3: Mark Termination

  • STW: All P’s reach safe point
  • Set gcphase to _GCmarktermination
  • Disable workers and assists
  • Housekeeping (flush mcaches, etc.)

0.5.4 · Phase 4: Sweep Phase

  • Concurrent Sweeping: lazy sweeping, reclaiming spans on demand

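0.5.5 · Next Cycle Trigger (GOGC)

Triggered when the heap grows past a threshold relative to the live heap left by the previous cycle:

  • Controlled by the GOGC environment variable (default 100), or at run time via debug.SetGCPercent
  • Example: If GOGC=100 and live heap is 4 MB, the heap goal for the next cycle is about 8 MB; the collector starts somewhat before the goal so that marking can finish in time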
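The arithmetic behind the example can be written down directly. The sketch below uses the heap-target formula described in the Go GC Guide (live heap plus GOGC percent of live heap and GC roots); the function name `heapGoal` is made up here, and the model ignores GOGC=off and any memory limit.

```go
package main

import "fmt"

// heapGoal returns the target heap size in bytes for a given live heap,
// GC-root size, and GOGC percentage (assumed non-negative).
func heapGoal(liveHeap, gcRoots uint64, gogc int) uint64 {
	return liveHeap + (liveHeap+gcRoots)*uint64(gogc)/100
}

func main() {
	const MiB = 1 << 20
	fmt.Println(heapGoal(4*MiB, 0, 100)/MiB, "MiB") // 8 MiB
	fmt.Println(heapGoal(4*MiB, 0, 50)/MiB, "MiB")  // 6 MiB
}
```

Lowering GOGC trades CPU for memory: the goal moves closer to the live heap, so cycles run more often.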

0.5.6 · Pacing Algorithm

Goal: Keep heap size near target while minimizing pause time

Mechanism:

  • Feedback loop: Gather info about running application
  • Stress metric: How fast application allocates heap memory
  • Before each collection: Estimate time to finish collection
  • Adjust marking pace and assist requirements dynamically

Tuning: Controlled implicitly via GOGC and allocation patterns

0.6 · GC Observability

0.6.1 · GC Trace

Enable with: GODEBUG=gctrace=1 go run main.go

Example output:

gc 1405 @6.068s 11%: 0.058+1.2+0.083 ms clock, 0.70+2.5/1.5/0+0.99 ms cpu, 7->11->6 MB, 10 MB goal, 12 P

Breakdown:

| Group | Field | Meaning |
|---|---|---|
| | gc 1405 | The 1405th GC run |
| | @6.068s | Elapsed time since program start |
| | 11% | Percent of CPU spent in GC so far |
| Wall-clock times | 0.058 ms | STW: mark start (write barrier on) |
| Wall-clock times | 1.2 ms | Concurrent: marking |
| Wall-clock times | 0.083 ms | STW: mark termination (write barrier off) |
| CPU times | 0.70 ms | STW: mark start |
| CPU times | 2.5 ms | Concurrent: mark assist (inline with allocation) |
| CPU times | 1.5 ms | Concurrent: background GC time |
| CPU times | 0 ms | Concurrent: idle GC time |
| CPU times | 0.99 ms | STW: mark termination |
| Memory | 7 MB | Heap in use before marking started |
| Memory | 11 MB | Heap in use after marking finished |
| Memory | 6 MB | Heap marked as live after marking |
| Memory | 10 MB | Collection goal for heap in use after marking |
| Threads | 12 P | Number of logical processors |
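For quick analysis it can help to pull the memory fields out of a trace line programmatically. The following is a minimal sketch, not an official parser: `heapInfo`, `heapRE`, and `parseHeap` are names invented here, the regular expression only targets the heap and processor fields, and the trace format is not guaranteed to stay stable across Go releases.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// heapRE matches "7->11->6 MB, 10 MB goal, ... 12 P" and tolerates extra
// fields between the goal and the processor count.
var heapRE = regexp.MustCompile(`(\d+)->(\d+)->(\d+) MB, (\d+) MB goal.*?(\d+) P`)

// heapInfo holds the memory fields of one gctrace line, in MB.
type heapInfo struct{ start, end, live, goal, procs int }

// parseHeap extracts the heap fields from a gctrace line.
func parseHeap(line string) (heapInfo, bool) {
	m := heapRE.FindStringSubmatch(line)
	if m == nil {
		return heapInfo{}, false
	}
	n := make([]int, 5)
	for i := range n {
		n[i], _ = strconv.Atoi(m[i+1]) // groups are digits only, so Atoi cannot fail on syntax
	}
	return heapInfo{n[0], n[1], n[2], n[3], n[4]}, true
}

func main() {
	line := "gc 1405 @6.068s 11%: 0.058+1.2+0.083 ms clock, 0.70+2.5/1.5/0+0.99 ms cpu, 7->11->6 MB, 10 MB goal, 12 P"
	info, ok := parseHeap(line)
	fmt.Println(ok, info)
}
```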

0.6.2 · Pacing Trace

Enable with: GODEBUG=gcpacertrace=1

Shows pacing decisions and feedback adjustments.

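0.7 · Forced Collection & Helper Functions

func GC()

runtime.GC forces a garbage collection cycle and blocks until it has completed:

  1. If a cycle is already in sweep termination, mark, or mark termination, waits for it to reach the sweep phase
  2. Assists with that sweep if needed
  3. Triggers the next GC cycle and waits for its mark and mark termination phases to complete
  4. Assists with sweep again if needed

Runtime-internal helpers involved (signatures only; they are unexported and cannot be called from user code):

// sweepone sweeps one unswept span and returns the number of pages
// returned to the heap, or ^uintptr(0) if there was nothing to sweep.
func sweepone() uintptr

// isSweepDone reports whether all spans are swept.
func isSweepDone() bool

// gcStart starts a GC cycle for the given trigger type.
func gcStart(trigger gcTrigger)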
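From user code, the blocking behaviour of `runtime.GC` can be observed through `runtime.MemStats`. This is a small sketch; the helper name `forcedCycles` is an assumption of this example, and a background cycle may add to the count, so the result is a lower bound.

```go
package main

import (
	"fmt"
	"runtime"
)

// forcedCycles calls runtime.GC n times and reports how many GC cycles
// completed in the meantime, according to MemStats.NumGC.
func forcedCycles(n int) uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < n; i++ {
		runtime.GC() // blocks until a full cycle has completed
	}
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	fmt.Println("completed cycles:", forcedCycles(3))
}
```

Forcing collections like this is mainly useful in tests and benchmarks; production code normally leaves scheduling to the pacer.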

0.8 · Advanced Features: Cleanups and Weak Pointers (Go 1.24+)

0.8.1 · runtime.AddCleanup

Function signature: func AddCleanup[T, S any](ptr *T, cleanup func(S), arg S) Cleanup

Purpose: Queue a cleanup function to run when object becomes unreachable (improved finalizer)

Advantages over SetFinalizer:

  1. Avoid object resurrection: Cleanup function receives only the arg, not the original object. Object is not forcibly kept alive.
  2. Faster reclamation: Object can be reclaimed immediately, no need for two GC cycles
  3. Support reference cycles: Object can participate in cycles (even self-pointers)

Why SetFinalizer is problematic:

  • Object resurrection: Finalizer called with object pointer → GC must keep object alive
  • Two GC cycles needed: First cycle detects unreachable → runs finalizer; Second cycle reclaims memory

Typical use: Clean up external resources (syscall.Munmap for mmap, file descriptor closure, etc.)

// Example: memory-mapped file (Unix-only; needs imports "runtime" and "syscall")
type MemoryMappedFile struct {
	data []byte
}

func NewMemoryMappedFile(filename string) (*MemoryMappedFile, error) {
	var data []byte
	// ... create memory mapping and assign it to data ...
	mf := &MemoryMappedFile{data: data}
	// The cleanup receives only the mapped bytes, never mf itself.
	cleanup := func(data []byte) {
		syscall.Munmap(data) // release the mapping; error ignored
	}
	runtime.AddCleanup(mf, cleanup, data) // runs some time after mf becomes unreachable
	return mf, nil
}

Rules:

  • Cleanup function must NOT reference the original object (directly or via captured variables), or cleanup never runs
  • Special check: If arg is the object itself, AddCleanup panics (prevents finalizer-style misuse)
  • Non-deterministic: GC may never run cleanups (implementation-dependent)

0.8.2 · weak.Pointer

Generic type: weak.Pointer[T]

Purpose: Safely point to object without preventing GC (pointer ignored by GC)

Key method: Value() *T returns valid pointer or nil

Core properties:

  1. Comparable and stable identity: Weak pointer identity persists even after object is reclaimed. Safe for map key or CompareAndDelete.
  2. Independent references: Multiple weak pointers to same object are independent

Typical use: Cache deduplication without manual lifecycle management

// Example: Cached memory-mapped files
var cache sync.Map // map[string]weak.Pointer[MemoryMappedFile]

func NewCachedMemoryMappedFile(filename string) (*MemoryMappedFile, error) {
	var newFile *MemoryMappedFile
	for {
		// Try to load from cache
		value, ok := cache.Load(filename)
		if !ok {
			if newFile == nil {
				var err error
				newFile, err = NewMemoryMappedFile(filename)
				if err != nil {
					return nil, err
				}
			}

			// Create weak pointer and try to insert.
			// Assign to the outer value (no :=) so the check below sees the stored entry.
			wp := weak.Make(newFile)
			var loaded bool
			value, loaded = cache.LoadOrStore(filename, wp)
			if !loaded {
				// Register cleanup: delete map entry when object unreachable
				runtime.AddCleanup(newFile, func(filename string) {
					cache.CompareAndDelete(filename, wp)
				}, filename)
				return newFile, nil
			}
		}

		// Check if cache entry is still valid
		if mf := value.(weak.Pointer[MemoryMappedFile]).Value(); mf != nil {
			return mf, nil
		}

		// Entry invalid, delete and retry
		cache.CompareAndDelete(filename, value)
	}
}

Rules:

  • As map key: Map value must NOT strongly reference the object pointed to, or object stays alive (cache ineffective)
    • Ephemerons are the usual answer to this problem; they are left as possible future work
  • Non-deterministic: Behavior depends on GC implementation details

0.8.3 · Combined Usage

Both features together enable:

  1. Self-cleaning cache: Weak pointer to object, auto-cleanup when no other references exist
  2. Composable design: Multiple independent cleanups on single object, supports modular design

0.8.4 · Caveats

  1. Error-prone:

    • Cleanup function cannot reference original object (direct or captured)
    • Weak pointer map key’s value cannot strongly reference object
    • Problems are subtle and hard to debug
  2. Non-deterministic:

    • Cleanup execution depends on GC implementation
    • May never run (though rare in practice)
    • Special testing techniques needed (see GC Guide)
  3. Advanced feature:

    • Most Go code rarely needs direct usage
    • Prefer higher-level patterns (e.g., unique package)
    • Use only when explicitly needed

0.9 · Guidance

These are low-level advanced features. Reference the GC Guide for detailed semantics and testing. Use cleanups and weak pointers only for clear-cut scenarios (cache deduplication, resource management), not as general solutions. Most code benefits from these features indirectly.

0.10 · References#