profile
https://fulmenflux.co/blog/go/complete-guide-to-profile-golang-services-in-production/
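0.1 · Go: profiling (performance analysis)#
When you face performance problems such as memory leaks or high CPU usage, use the pprof tool that ships with Go to profile the program and quickly locate the cause.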
0.1.1 · what is a profile#
A Profile is a collection of stack traces showing the call sequences that led to instances of a particular event, such as allocation.
- goroutine: stack traces of all current goroutines
- heap: a sampling of memory allocations of live objects
- allocs: a sampling of all past memory allocations
- threadcreate: stack traces that led to the creation of new OS threads
- block: stack traces that led to blocking on synchronization primitives
- mutex: stack traces of holders of contended mutexes
- cpu: stack traces sampled by the runtime while the program is executing on a CPU
All of these kinds of profiles (goroutine, heap allocations, etc) are just collections of stacktraces, maybe with some metadata attached. A profile is mostly a bunch of Samples. A sample is basically a stack trace.
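As a rough illustration of "a profile is a bunch of samples", the sketch below asks `runtime/pprof` for every profile it currently knows about and prints how many stack traces each one holds. The helper name `profileCounts` is made up for this example. Note that the CPU profile is not exposed this way, so it will not appear in the output.

```go
package main

import (
	"fmt"
	"runtime/pprof"
)

// profileCounts returns, for every profile registered with runtime/pprof,
// the number of stack-trace samples it currently holds, keyed by name.
// (Hypothetical helper, not part of any library.)
func profileCounts() map[string]int {
	counts := make(map[string]int)
	for _, p := range pprof.Profiles() {
		counts[p.Name()] = p.Count()
	}
	return counts
}

func main() {
	// Map iteration order is random, so the lines print in no fixed order.
	for name, n := range profileCounts() {
		fmt.Printf("%-14s %d\n", name, n)
	}
}
```

The block and mutex profiles will usually show zero samples here, because they stay empty until they are enabled with `runtime.SetBlockProfileRate` and `runtime.SetMutexProfileFraction`.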
https://github.com/prashantv/go_profiling_talk
cpu (call stack sampling), heap (allocation profiling), block, trace
Each profile is a collection of samples, where each sample is associated with a point in a location hierarchy, one or more numeric values, and a set of labels. Often these profiles represent data collected through statistical sampling of a program, so each sample describes a program call stack and a number or value of samples collected at a location.
A profiler runs your program and configures the operating system to interrupt it at regular intervals. This is done by sending SIGPROF to the program being profiled, which suspends it and transfers execution to the profiler. The profiler then grabs the program counter for each executing thread and resumes the program.
https://github.com/bradfitz/talk-yapc-asia-2015/blob/master/talk.md
0.1.2 · get started#
pprof is a tool for visualization and analysis of profiling data.
Usage Modes:
1. Report generation: Text/Graphical reports
2. Interactive terminal use
3. Web interface
flat: the value of the location itself.
cum: the value of the location plus all its descendants.
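Interpreting the Callgraph:
Node Color: large positive cum values are red, large negative cum values are green, cum values close to zero are grey.
Node Font Size: a larger font means a larger absolute flat value, a smaller font means a smaller absolute flat value.
Edge Weight: thicker edges mean more resources were used along that path, thinner edges mean fewer.
Edge Color: large positive values are red, large negative values are green, values close to zero are grey.
Dashed Edges: some locations between the two connected locations were removed.
Solid Edges: one location directly calls the other.
"(inline)" Edge Marker: the call has been inlined into the caller.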
Two steps:
- profiling
- visualization and analysis
# simulate load
wrk -c 100 -d 30s -t 4 http://127.0.0.1:9088/apm/alarm/templates
# profiling and analysis in one step
go tool pprof "http://127.0.0.1:6060/debug/pprof/profile?seconds=20"
# profiling: save a 10s CPU profile to a file
curl -o ./cpu.prof "http://127.0.0.1:6060/debug/pprof/profile?seconds=10"
# analysis: open the saved profile in the web UI
go tool pprof -http="HOSTIP:8000" ./cpu.prof
0.1.2.1 · profiling#
tests and benchmarks: using the -cpuprofile and -memprofile flags.
pkg net/http/pprof: add /debug/pprof endpoints to your service.
import _ "net/http/pprof"
curl localhost:$PORT/debug/pprof/$PROFILE_TYPE
https://pkg.go.dev/runtime https://pkg.go.dev/runtime/pprof
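package main

import (
	"log"
	"os"
	"runtime/pprof"
)

func main() {
	// Create the file that will receive the CPU profile.
	f, err := os.Create("cpu.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Start sampling; StopCPUProfile flushes the profile to f on exit.
	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	// your go code
	// ...
}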
# go tool pprof -http=<host:port> <profile file>
go tool pprof -http=":8000" ./cpu.pprof
Select "Flame Graph" from the VIEW menu in the page header.
CLI interactive mode:
go tool pprof cpu.pprof
Type help to list the available commands.
https://github.com/pkg/profile
0.1.2.2 · visualization and analysis#
Use go tool pprof to analyze a profile.
Commands inside go tool pprof:
"web" opens the call graph in a browser and "svg" writes it to an SVG file (both need Graphviz installed).
0.1.2.2.1 · Flame Graphs#
The Programmer's Road to Mastery: Flame Graphs, a Powerful Performance Tuning Tool https://zhuanlan.zhihu.com/p/147875569
https://www.brendangregg.com/flamegraphs.html
https://www.datadoghq.com/knowledge-center/distributed-tracing/flame-graph/
https://www.matoski.com/article/golang-profiling-flamegraphs/
https://cloud.tencent.com/developer/article/1656810
Generate flame graphs in Go: https://github.com/uber-archive/go-torch (archived; go tool pprof has shipped its own flame graph view since Go 1.11).
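Each column represents a call stack, and each box represents a function.
The vertical axis shows stack depth, ordered from bottom to top by call relationship; the topmost box is the function that was running on the CPU when the sample was taken.
The width of a box on the horizontal axis shows how often it appears in the samples, so the wider a box is, the more likely it is to be the cause of a bottleneck.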
0.2 · urls#
https://jvns.ca/blog/2017/09/24/profiling-go-with-pprof/
go tool pprof --help
https://rakyll.org/archive/
https://rakyll.org/custom-profiles/
https://rakyll.org/mutexprofile/
https://hackernoon.com/go-the-complete-guide-to-profiling-your-code-h51r3waz
## Difference Between Flat And Cumulative
func A() {
	B() // takes 1s
	// ... work done directly in A, takes 4s
	C() // takes 6s
}
The flat time of function A is 4s (only the work done in A itself) and the cum time is 11s (1s + 4s + 6s).
## Difference Between Allocations And Heap Profile
The allocs profile starts pprof in a mode that shows everything allocated since the program began, including memory that has since been garbage collected.
The heap profile starts pprof in a mode that shows only live allocations, which excludes anything already garbage collected. Both profiles carry the same samples; only the default sample type differs.
inuse_space: amount of memory allocated and not released yet (heap)
inuse_objects: number of objects allocated and not released yet (heap)
alloc_space: total amount of memory allocated (regardless of whether it was released)
alloc_objects: total number of objects allocated (regardless of whether they were released)
## How does a profiler work?
CPU Profiler:
The Go CPU profiler uses the SIGPROF signal to record code execution statistics. Once profiling is started, the signal is delivered at a fixed interval (100 times per second by default).
On every signal, the handler traces back the execution by unwinding the stack from the current PC value. This produces a stack trace, and the hit count for that stack trace is incremented.
Memory Profiler:
The memory profiler samples heap allocations.
Recording every allocation and unwinding its stack trace would be expensive, so a sampling technique is used.
The sampling rate is the mean of an exponential distribution: on average, one allocation is sampled for every runtime.MemProfileRate bytes allocated. The default value is 512 KB (512 * 1024 bytes).
https://blog.csdn.net/DiDi_Tech/article/details/100912814
0.3 · go-tool-trace#
https://making.pusher.com/go-tool-trace
go tool trace visualizes all the runtime events. It is useful for:
- Diagnosing latency problems
- Diagnosing poor parallelism
It is not appropriate if you want to track down slow functions, or generally find where your program is spending most of its CPU time. go tool trace is better suited to finding out what your program is doing over time, not in aggregate.
https://github.com/campoy/go-tooling-workshop/blob/master/3-dynamic-analysis/4-tracing/1-tracing.md
https://pkg.go.dev/runtime/trace
https://docs.google.com/document/d/1FP5apqzBgr7ahCCgFO-yoVhk4YZrNIDNf9RybngBc14/edit
https://www.youtube.com/watch?v=mmqDlbWk_XA https://speakerdeck.com/rhysh/an-introduction-to-go-tool-trace
https://rakyll.medium.com/debugging-latency-in-go-1-11-9f97a7910d68
Regions are sections in the code you want to collect tracing data for. A region starts and ends in the same goroutine. A task, on the other hand, is more of a logical group to categorize related regions together. A task can end in a different goroutine than the one it started in.