Go Memory & GC Internals
Memory management and garbage collection are two decisive factors in the performance of a Go service. Understanding them deeply helps you optimize latency, reduce memory footprint, and debug memory leaks.
💡 The Go GC is a concurrent, tri-color mark-and-sweep collector with a target STW pause of under 1ms.
Memory Allocator
Stack vs Heap
Stack:
- Fast allocation (just moves the stack pointer)
- Freed automatically when the function returns
- Small (starts at 2KB, grows automatically up to 1GB)
- Goroutine-local (no locking needed)
Heap:
- Slower allocation (goes through the allocator)
- Managed by the GC
- Large, shared across goroutines
- Has overhead (metadata, fragmentation)
Rule: stack when possible, heap when necessary.
Escape Analysis
The compiler decides whether a variable lives on the stack or the heap based on escape analysis.
When does a variable escape?
// 1. Return pointer to local variable
func newUser() *User {
	u := User{Name: "Alice"}
	return &u // ← u escapes to heap
}

// 2. Store pointer in heap-allocated struct
type Container struct {
	data *Data
}

func create() *Container {
	d := Data{}
	return &Container{data: &d} // ← d escapes
}

// 3. Pass to interface{} (concrete type not known statically)
func print(v interface{}) {
	fmt.Println(v)
}

func main() {
	x := 42
	print(x) // ← x escapes (fmt.Println takes interface{})
}

// 4. Send to channel
func send(ch chan int) {
	x := 42
	ch <- x // ← x may escape (depends on the channel's buffering)
}

// 5. Closure captures variable
func outer() func() int {
	x := 0
	return func() int {
		x++ // ← x escapes (closure outlives outer)
		return x
	}
}

// 6. Size not known at compile time
func allocate(n int) []byte {
	return make([]byte, n) // ← escapes (n is a runtime value)
}
Checking escape analysis
go build -gcflags='-m' main.go
Output:
./main.go:5:2: moved to heap: u
./main.go:6:9: &u escapes to heap
./main.go:15:13: ... argument does not escape
./main.go:15:13: x escapes to heap
Explanation:
- moved to heap: u → variable u is allocated on the heap
- escapes to heap → the pointer escapes
- does not escape → stays on the stack
Avoiding unnecessary escapes

// ❌ BAD: Returning a pointer forces the variable onto the heap
func sumPtr(nums []int) *int {
	result := 0
	for _, n := range nums {
		result += n
	}
	return &result // ← result escapes to the heap
}

// ✅ GOOD: Return the value instead — no escape
func sum(nums []int) int {
	result := 0
	for _, n := range nums {
		result += n
	}
	return result // result stays on the stack
}
Memory Allocator: TCMalloc-inspired
Go's allocator is based on TCMalloc (Thread-Caching Malloc).
Size classes
Objects are divided into size classes:
Tiny: < 16 bytes
Small: 16 bytes - 32 KB (67 size classes)
Large: > 32 KB
Example size classes:
- 8, 16, 24, 32, 48, 64, 80, 96, 112, 128, ...
- Each allocation is rounded up to the nearest size class

x := make([]byte, 33) // Rounded up to 48 bytes; 15 bytes wasted
Allocation path
Request allocation
↓
1. Tiny allocator (< 16B, no pointers)
├─ Yes → From P's tiny block
└─ No → Next step
↓
2. Small allocator (16B - 32KB)
├─ Check P's mcache
│ ├─ Has free span → Allocate
│ └─ No free span → Get from mcentral
└─ mcentral empty → Get from mheap
↓
3. Large allocator (> 32KB)
└─ Allocate directly from mheap
Structure
mcache (per-P, no lock):
P's mcache
├── Tiny allocator
├── Span list for size class 1 (8B)
├── Span list for size class 2 (16B)
├── ...
└── Span list for size class 67 (32KB)
mcentral (per size class, with lock):
- Holds the spans for one size class
- Shared across all Ps
mheap (global, with lock):
- Manages the entire heap
- Hands out spans to mcentral
- Returns memory to the OS when unused
Garbage Collector
GC algorithm: Concurrent Mark & Sweep
Phases:
1. Mark Setup (STW)
↓
2. Concurrent Mark (concurrent)
↓
3. Mark Termination (STW)
↓
4. Concurrent Sweep (concurrent)
Timeline:
User code running
↓
STW: Mark Setup (~100µs)
↓
User code + GC Mark (concurrent)
↓
STW: Mark Termination (~100µs)
↓
User code + GC Sweep (concurrent)
↓
Cycle complete
Tri-color marking
Colors:
- White: not yet scanned (initially all objects)
- Gray: marked, children not yet scanned
- Black: fully scanned, all children marked
Algorithm:
1. Start: All objects white, roots gray
2. While gray objects exist:
- Pick gray object
- Scan its pointers, mark pointed objects gray
- Mark itself black
3. End: Black objects reachable, white objects garbage
Concurrent marking issue: user code can modify pointers while the GC is marking.
Solution: a write barrier — it tracks pointer writes and re-marks objects when needed.
Write barrier
When user code writes a pointer:
obj.field = newPtr
the write barrier records:
If obj is black and newPtr is white:
	Mark newPtr gray
(This is the classic Dijkstra-style invariant; since Go 1.8 the runtime actually uses a hybrid Dijkstra/Yuasa barrier.)
Cost: a small overhead on each pointer write (~10-20 ns).
When active: only during the GC marking phase.
GC Tuning
GOGC
Default: GOGC=100
Meaning: a GC cycle triggers when the heap has grown by 100% relative to the live heap after the previous GC.
Heap after last GC: 100 MB
GOGC=100
→ Next GC triggers at: 200 MB
GOGC=200
→ Next GC triggers at: 300 MB
Trade-off:
- High GOGC → fewer GC cycles, higher memory usage
- Low GOGC → more GC cycles, lower memory usage
When to tune:
# Reduce memory usage (accept more frequent GC)
GOGC=50 ./myapp
# Reduce GC frequency (accept higher RAM usage)
GOGC=200 ./myapp
# Disable GC (development only!)
GOGC=off ./myapp
GOMEMLIMIT (Go 1.19+)
Set soft memory limit:
GOMEMLIMIT=2GiB ./myapp
Meaning: the GC tries to keep heap usage below the limit.
Priority: GOMEMLIMIT > GOGC
Benefit: predictable memory usage inside containers.
Example:
# Container has 4GB RAM
# Set the limit to 3GB to avoid OOM
GOMEMLIMIT=3GiB ./myapp
debug.SetGCPercent()
Dynamically adjust GOGC:
import "runtime/debug"
// Set GOGC to 200
debug.SetGCPercent(200)
// Disable GC
debug.SetGCPercent(-1)
Manual GC
runtime.GC() // Force GC cycle
Use cases:
- After a load spike, force a GC to free memory
- Testing, benchmarking
Note: manual GC is rarely needed — the automatic GC is already well tuned.
Monitoring GC
1. runtime.ReadMemStats
var m runtime.MemStats
runtime.ReadMemStats(&m)
fmt.Printf("Alloc: %v MB\n", m.Alloc/1024/1024)
fmt.Printf("TotalAlloc: %v MB\n", m.TotalAlloc/1024/1024)
fmt.Printf("Sys: %v MB\n", m.Sys/1024/1024)
fmt.Printf("NumGC: %v\n", m.NumGC)
fmt.Printf("PauseTotalNs: %v ms\n", m.PauseTotalNs/1e6)
Key metrics:
- Alloc: heap memory allocated and still in use
- Sys: memory requested from the OS
- NumGC: number of completed GC cycles
- PauseNs: recent STW pause durations (PauseTotalNs is the cumulative total)
2. GODEBUG=gctrace
GODEBUG=gctrace=1 ./myapp
Output:
gc 1 @0.001s 0%: 0.018+0.23+0.003 ms clock, 0.14+0.076/0.22/0.001+0.025 ms cpu, 4->4->0 MB, 5 MB goal, 8 P
Explanation:
- gc 1: GC cycle #1
- @0.001s: 0.001s after program start
- 0%: percentage of CPU time spent in GC so far
- 0.018+0.23+0.003 ms: STW sweep termination + concurrent mark + STW mark termination (wall clock)
- 4->4->0 MB: heap at GC start → heap at GC end → live heap
- 5 MB goal: target heap size for this cycle
- 8 P: 8 processors
3. pprof heap profile
go tool pprof http://localhost:6060/debug/pprof/heap
Commands:
(pprof) top # Top allocators
(pprof) list <func> # Line-by-line allocation
(pprof) web # Visualize call graph
4. trace
curl http://localhost:6060/debug/pprof/trace?seconds=5 > trace.out
go tool trace trace.out
View: GC pause timeline, heap size over time.
Optimizing Allocations
1. Object pooling (sync.Pool)
Use case: reuse objects instead of allocating new ones.
var bufferPool = sync.Pool{
	New: func() interface{} {
		return make([]byte, 4096)
	},
}

func process(data []byte) {
	buf := bufferPool.Get().([]byte)
	defer bufferPool.Put(buf)
	// Use buf
	copy(buf, data)
	// ...
}
Benefit: fewer allocations → less GC pressure.
Note: the pool is drained across GC cycles (it is not a permanent cache).
2. Pre-allocate slices
// ❌ BAD: append causes multiple allocations
var result []int
for i := 0; i < 1000; i++ {
	result = append(result, i) // Reallocates many times as the slice grows
}

// ✅ GOOD: Pre-allocate
result := make([]int, 0, 1000)
for i := 0; i < 1000; i++ {
	result = append(result, i) // No reallocation
}
3. Build strings efficiently
// ❌ BAD: String concatenation allocates
s := ""
for i := 0; i < 100; i++ {
	s += "hello" // Each concat allocates a new string
}

// ✅ GOOD: Use strings.Builder
var b strings.Builder
b.Grow(500) // Pre-allocate
for i := 0; i < 100; i++ {
	b.WriteString("hello")
}
s := b.String()
4. Avoid []byte ↔ string conversion
// ❌ BAD: Conversion allocates
func process(data []byte) {
	s := string(data) // Allocates a new string
	// ...
}

// ✅ GOOD: Work with []byte directly
func process(data []byte) {
	// ...
}

// ✅ Unsafe (zero-copy, but dangerous)
import "unsafe"

func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

Warning: the unsafe approach breaks if the []byte is modified after conversion.
5. Reduce pointer indirection
// ❌ BAD: Many pointers → GC scan overhead
type Node struct {
	Next *Node
	Data *Data
}

// ✅ GOOD: Value types where possible
type Node struct {
	Next *Node
	Data Data // Embed the value, not a pointer
}

Trade-off: the cost of copying values vs GC pointer-scan time.
Memory Leaks
Common causes
1. Goroutine leak
// ❌ Goroutine never exits
func leak() {
	ch := make(chan int)
	go func() {
		<-ch // Blocks forever if no sender
	}()
}
2. Forgotten callbacks
// ❌ Callback holds a reference
type Handler struct {
	callbacks []func()
}

func (h *Handler) Register(cb func()) {
	h.callbacks = append(h.callbacks, cb)
	// Never removed → memory leak
}
3. Large slice holding reference
// ❌ Small slice keeps the entire backing array alive
func process(data []byte) []byte {
	return data[0:10] // Small slice, but references the full array
}

// ✅ Copy into a new slice
func process(data []byte) []byte {
	result := make([]byte, 10)
	copy(result, data[0:10])
	return result // The original data can now be GC'd
}
Debugging leaks
1. pprof heap diff
# Baseline
curl http://localhost:6060/debug/pprof/heap > heap1.out
# Wait...
# After some time
curl http://localhost:6060/debug/pprof/heap > heap2.out
# Compare
go tool pprof -base heap1.out heap2.out
2. Check goroutine count
import "runtime"
ticker := time.NewTicker(10 * time.Second)
go func() {
for range ticker.C {
fmt.Println("Goroutines:", runtime.NumGoroutine())
}
}()
Nếu tăng liên tục → leak.
Summary
| Concept | Key Point |
|---|---|
| Stack vs Heap | Stack is fast; heap has GC overhead |
| Escape analysis | Compiler decides the allocation location |
| Allocator | TCMalloc-inspired, per-P cache |
| GC | Concurrent tri-color mark & sweep |
| GOGC | Default 100 (GC when the heap doubles) |
| GOMEMLIMIT | Soft limit on heap size |
| Write barrier | Tracks pointer writes during the mark phase |
| sync.Pool | Reuse objects, reduce allocations |
| Memory leak | Goroutine leaks, forgotten references |
References
- Go GC Guide: https://go.dev/doc/gc-guide
- The Go Memory Model: https://golang.org/ref/mem
- Rick Hudson - Go GC Talk: https://www.youtube.com/watch?v=aiv1JOfMjm0
- TCMalloc: https://github.com/google/tcmalloc