
holmes

WARNING: holmes is under heavy development, so the API may introduce breaking changes. If you want to use it in production, please wait for the first release.

Self-aware Golang profile dumper.

Our online system often crashes at midnight (usually killed by the OS due to OOM). As lazy developers, we don't want to be woken up at midnight and wait for the error to recur online.

holmes comes to the rescue.

how to use

dump goroutine when goroutine number spikes

h, _ := holmes.New(
    holmes.WithCollectInterval("5s"),
    holmes.WithCoolDown("1m"),
    holmes.WithDumpPath("/tmp"),
    holmes.WithTextDump(),
    holmes.WithGoroutineDump(500, 25, 20000),
)
h.EnableGoroutineDump()

// start the metrics collect and dump loop
h.Start()

// stop the dumper
h.Stop()
  • WithCollectInterval("5s") means the system metrics are collected once every 5 seconds
  • WithCoolDown("1m") means once a dump has happened, the next dump will not happen until the 1-minute cooldown finishes
  • WithDumpPath("/tmp") means the dump binary file (binary mode) or the dump log file (text mode) will be written to the /tmp dir
  • WithTextDump() means profiles are dumped in text mode instead of binary mode
  • WithGoroutineDump(500, 25, 20000) means a dump will happen when current_goroutine_num > 500 && current_goroutine_num > 125% * previous_average_goroutine_num, or when current_goroutine_num > 20000
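Read together, the three numbers form one trigger rule. Here is a minimal sketch of that rule (illustrative only; holmes' real implementation and function names differ):

```go
package main

import "fmt"

// shouldDump mirrors the documented (min, diff, abs) rule: trigger when
// current > min AND current exceeds the previous average by more than
// diff percent, OR when current > abs. A sketch, not holmes' source.
func shouldDump(current, prevAvg, min, diff, abs int) bool {
	if current > abs {
		return true
	}
	return current > min && current*100 > prevAvg*(100+diff)
}

func main() {
	// 600 goroutines, previous average 400: 600 > 500 and 600 > 125% * 400
	fmt.Println(shouldDump(600, 400, 500, 25, 20000))   // true
	fmt.Println(shouldDump(450, 400, 500, 25, 20000))   // false: below min
	fmt.Println(shouldDump(21000, 21000, 500, 25, 20000)) // true: above abs
}
```

The same shape applies to the CPU and memory rules below, with usage percentages in place of goroutine counts.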

dump cpu profile when cpu load spikes

h, _ := holmes.New(
    holmes.WithCollectInterval("5s"),
    holmes.WithCoolDown("1m"),
    holmes.WithDumpPath("/tmp"),
    holmes.WithCPUDump(10, 25, 80),
)
h.EnableCPUDump()

// start the metrics collect and dump loop
h.Start()

// stop the dumper
h.Stop()
  • WithCollectInterval("5s") means the system metrics are collected once every 5 seconds
  • WithCoolDown("1m") means once a dump has happened, the next dump will not happen until the 1-minute cooldown finishes
  • WithDumpPath("/tmp") means the dump binary file (binary mode) or the dump log file (text mode) will be written to the /tmp dir
  • WithBinaryDump() or WithTextDump() doesn't affect the CPU profile dump, because the pprof standard library doesn't support text-mode CPU dumps
  • WithCPUDump(10, 25, 80) means a dump will happen when cpu usage > 10% && cpu usage > 125% * the previously recorded cpu usage, or when cpu usage > 80%

dump heap profile when RSS spikes

h, _ := holmes.New(
    holmes.WithCollectInterval("5s"),
    holmes.WithCoolDown("1m"),
    holmes.WithDumpPath("/tmp"),
    holmes.WithTextDump(),
    holmes.WithMemDump(30, 25, 80),
)

h.EnableMemDump()

// start the metrics collect and dump loop
h.Start()

// stop the dumper
h.Stop()
  • WithCollectInterval("5s") means the system metrics are collected once every 5 seconds
  • WithCoolDown("1m") means once a dump has happened, the next dump will not happen until the 1-minute cooldown finishes
  • WithDumpPath("/tmp") means the dump binary file (binary mode) or the dump log file (text mode) will be written to the /tmp dir
  • WithTextDump() means profiles are dumped in text mode instead of binary mode
  • WithMemDump(30, 25, 80) means a dump will happen when memory usage > 30% && memory usage > 125% * previous memory usage, or when memory usage > 80%

enable them all!

It's easy.

h, _ := holmes.New(
    holmes.WithCollectInterval("5s"),
    holmes.WithCoolDown("1m"),
    holmes.WithDumpPath("/tmp"),
    holmes.WithTextDump(),

    holmes.WithCPUDump(10, 25, 80),
    holmes.WithMemDump(30, 25, 80),
    holmes.WithGoroutineDump(500, 25, 20000),
)

h.EnableMemDump().
    EnableCPUDump().
    EnableGoroutineDump()

running in docker or other cgroup-limited environments

h, _ := holmes.New(
    holmes.WithCollectInterval("5s"),
    holmes.WithCoolDown("1m"),
    holmes.WithDumpPath("/tmp"),
    holmes.WithTextDump(),

    holmes.WithCPUDump(10, 25, 80),
    holmes.WithCGroup(true), // set cgroup to true
)

known risks

Collecting a goroutine profile itself may cause a latency spike because of the stop-the-world (STW) pause.

design

Holmes collects the following stats at every collect interval:

  • Goroutine number, via runtime.NumGoroutine.
  • RSS used by the current process, via gopsutil.
  • CPU percent of the total, e.g. using 4 of 8 cores = 50%, via gopsutil.

After the warm-up phase finishes, Holmes compares the current stats with the average of the previously collected stats (10 cycles). If a dump rule matches, Holmes dumps the related profile to a log (text mode) or a binary file (binary mode).
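The rolling average over the last 10 cycles can be sketched with a small ring buffer (a hypothetical reduction, not holmes' actual ring type):

```go
package main

import "fmt"

// ring keeps the last n samples and reports their average, mirroring
// the 10-cycle comparison window described above.
type ring struct {
	data []int
	idx  int
	full bool
}

func newRing(n int) *ring { return &ring{data: make([]int, n)} }

// push overwrites the oldest sample once the buffer is full.
func (r *ring) push(v int) {
	r.data[r.idx] = v
	r.idx++
	if r.idx == len(r.data) {
		r.idx = 0
		r.full = true
	}
}

// avg returns the mean of the samples collected so far.
func (r *ring) avg() int {
	n := len(r.data)
	if !r.full {
		n = r.idx
	}
	if n == 0 {
		return 0
	}
	sum := 0
	for _, v := range r.data[:n] {
		sum += v
	}
	return sum / n
}

func main() {
	r := newRing(10)
	for i := 1; i <= 12; i++ {
		r.push(i * 100) // 100, 200, ..., 1200
	}
	fmt.Println(r.avg()) // 750: average of the last 10 samples (300..1200)
}
```

Each collected stat (goroutine count, RSS, CPU percent) would keep its own window like this, and the dump rules compare the current sample against `avg()`.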

When you get warning messages from your own monitoring system (e.g. memory usage exceeds 80%, OOM killed, CPU usage exceeds 80%, goroutine number exceeds 100k), the profile has already been dumped to your dump path. You can just fetch the profile and see what actually happened, without pressure.

case studies

RSS peak caused by making a 1GB slice

see this example

after warming up, just curl http://localhost:10003/make1gb a few times, then you'll probably see:

heap profile: 0: 0 [1: 1073741824] @ heap/1048576
0: 0 [1: 1073741824] @ 0x42ba3ef 0x4252254 0x4254095 0x4254fd3 0x425128c 0x40650a1
#	0x42ba3ee	main.make1gbslice+0x3e			/Users/xargin/go/src/github.com/mosn/holmes/example/1gbslice.go:24
#	0x4252253	net/http.HandlerFunc.ServeHTTP+0x43	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2012
#	0x4254094	net/http.(*ServeMux).ServeHTTP+0x1a4	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2387
#	0x4254fd2	net/http.serverHandler.ServeHTTP+0xa2	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2807
#	0x425128b	net/http.(*conn).serve+0x86b		/Users/xargin/sdk/go1.14.2/src/net/http/server.go:1895

1: 1073741824 means 1 object with 1GB total memory consumption.
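A hypothetical reduction of such a handler (the real make1gb example lives in the linked repo; the names here are assumptions):

```go
package main

import (
	"fmt"
	"net/http"
)

// makeSlice performs one large allocation; in the heap profile above this
// shows up as a single object holding ~1GB.
func makeSlice(n int) []byte {
	return make([]byte, n)
}

// handler mirrors the shape of the example's /make1gb endpoint.
func handler(w http.ResponseWriter, r *http.Request) {
	buf := makeSlice(1 << 30) // 1GB, kept alive for the request duration
	fmt.Fprintf(w, "allocated %d bytes\n", len(buf))
}

func main() {
	http.HandleFunc("/make1gb", handler)
	// Demonstrate the allocation size without starting the server:
	fmt.Println(len(makeSlice(1 << 20))) // 1MB here; the example uses 1<<30
}
```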

goroutine explosion caused by deadlock

See this example

curl localhost:10003/lockorder1

curl localhost:10003/lockorder2

After warming up, run wrk -c 100 http://localhost:10003/req, then you'll see the goroutine number peak caused by the deadlock:

100 @ 0x40380b0 0x4048c80 0x4048c6b 0x40489e7 0x406f72c 0x42badfc 0x42badfd 0x4252b94 0x42549d5 0x4255913 0x4251bcc 0x40659e1
#	0x40489e6	sync.runtime_SemacquireMutex+0x46	/Users/xargin/sdk/go1.14.2/src/runtime/sema.go:71
#	0x406f72b	sync.(*Mutex).lockSlow+0xfb		/Users/xargin/sdk/go1.14.2/src/sync/mutex.go:138
#	0x42badfb	sync.(*Mutex).Lock+0x8b			/Users/xargin/sdk/go1.14.2/src/sync/mutex.go:81
#	0x42badfc	main.req+0x8c				/Users/xargin/go/src/github.com/mosn/holmes/example/deadlock.go:30
#	0x4252b93	net/http.HandlerFunc.ServeHTTP+0x43	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2012
#	0x42549d4	net/http.(*ServeMux).ServeHTTP+0x1a4	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2387
#	0x4255912	net/http.serverHandler.ServeHTTP+0xa2	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2807
#	0x4251bcb	net/http.(*conn).serve+0x86b		/Users/xargin/sdk/go1.14.2/src/net/http/server.go:1895
1 @ 0x40380b0 0x4048c80 0x4048c6b 0x40489e7 0x406f72c 0x42bb041 0x42bb042 0x4252b94 0x42549d5 0x4255913 0x4251bcc 0x40659e1

#	0x40489e6	sync.runtime_SemacquireMutex+0x46	/Users/xargin/sdk/go1.14.2/src/runtime/sema.go:71
#	0x406f72b	sync.(*Mutex).lockSlow+0xfb		/Users/xargin/sdk/go1.14.2/src/sync/mutex.go:138
#	0x42bb040	sync.(*Mutex).Lock+0xf0			/Users/xargin/sdk/go1.14.2/src/sync/mutex.go:81
#	0x42bb041	main.lockorder2+0xf1			/Users/xargin/go/src/github.com/mosn/holmes/example/deadlock.go:50
#	0x4252b93	net/http.HandlerFunc.ServeHTTP+0x43	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2012
#	0x42549d4	net/http.(*ServeMux).ServeHTTP+0x1a4	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2387
#	0x4255912	net/http.serverHandler.ServeHTTP+0xa2	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2807
#	0x4251bcb	net/http.(*conn).serve+0x86b		/Users/xargin/sdk/go1.14.2/src/net/http/server.go:1895

1 @ 0x40380b0 0x4048c80 0x4048c6b 0x40489e7 0x406f72c 0x42baf11 0x42baf12 0x4252b94 0x42549d5 0x4255913 0x4251bcc 0x40659e1
#	0x40489e6	sync.runtime_SemacquireMutex+0x46	/Users/xargin/sdk/go1.14.2/src/runtime/sema.go:71
#	0x406f72b	sync.(*Mutex).lockSlow+0xfb		/Users/xargin/sdk/go1.14.2/src/sync/mutex.go:138
#	0x42baf10	sync.(*Mutex).Lock+0xf0			/Users/xargin/sdk/go1.14.2/src/sync/mutex.go:81
#	0x42baf11	main.lockorder1+0xf1			/Users/xargin/go/src/github.com/mosn/holmes/example/deadlock.go:40
#	0x4252b93	net/http.HandlerFunc.ServeHTTP+0x43	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2012
#	0x42549d4	net/http.(*ServeMux).ServeHTTP+0x1a4	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2387
#	0x4255912	net/http.serverHandler.ServeHTTP+0xa2	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2807
#	0x4251bcb	net/http.(*conn).serve+0x86b		/Users/xargin/sdk/go1.14.2/src/net/http/server.go:1895

The req API was blocked by the deadlock.

You should set DumpFullStack to true to locate deadlock bugs.
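The lock-order inversion behind those stacks reduces to the classic AB/BA pattern (a hypothetical sketch, not the linked example's actual source):

```go
package main

import (
	"fmt"
	"sync"
)

var mu1, mu2 sync.Mutex

// lockorder1 and lockorder2 acquire the same two mutexes in opposite
// order. Run concurrently under load they can deadlock, which is what
// the 100 blocked goroutines in the profile above show.
func lockorder1() {
	mu1.Lock()
	defer mu1.Unlock()
	mu2.Lock() // blocks if another goroutine holds mu2 and wants mu1
	defer mu2.Unlock()
}

func lockorder2() {
	mu2.Lock()
	defer mu2.Unlock()
	mu1.Lock() // opposite order: deadlock hazard under concurrency
	defer mu1.Unlock()
}

func main() {
	// Called sequentially this is safe; the hazard only appears when
	// both run at the same time, as under the wrk load above.
	lockorder1()
	lockorder2()
	fmt.Println("no contention when run sequentially")
}
```

The fix is to always acquire the mutexes in one global order.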

goroutine explosion caused by channel block

see this example

after warming up, just run wrk -c100 http://localhost:10003/chanblock, then you'll see:

goroutine profile: total 203
100 @ 0x4037750 0x4007011 0x4006a15 0x42ba3c9 0x4252234 0x4254075 0x4254fb3 0x425126c 0x4065081
#	0x42ba3c8	main.channelBlock+0x38			/Users/xargin/go/src/github.com/mosn/holmes/example/channelblock.go:26
#	0x4252233	net/http.HandlerFunc.ServeHTTP+0x43	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2012
#	0x4254074	net/http.(*ServeMux).ServeHTTP+0x1a4	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2387
#	0x4254fb2	net/http.serverHandler.ServeHTTP+0xa2	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2807
#	0x425126b	net/http.(*conn).serve+0x86b		/Users/xargin/sdk/go1.14.2/src/net/http/server.go:1895

It's easy to locate.

process slowly leaks goroutines

See this example

The producer forgets to close the task channel after producing finishes, so every request to this URI leaks a goroutine. We can curl http://localhost:10003/leak several times and get the following log:

goroutine profile: total 10
7 @ 0x4038380 0x4008497 0x400819b 0x42bb129 0x4065cb1
#	0x42bb128	main.leak.func1+0x48	/Users/xargin/go/src/github.com/mosn/holmes/example/slowlyleak.go:26

It's easy to find the cause of the leak.

large memory allocation caused by business logic

See this example; it is similar to the large-slice example above.

After warming up finishes, run wrk -c100 http://localhost:10003/alloc:

pprof memory, config_min : 3, config_diff : 25, config_abs : 80, previous : [0 0 0 4 0 0 0 0 0 0], current : 4
heap profile: 83: 374069984 [3300: 14768402720] @ heap/1048576
79: 374063104 [3119: 14768390144] @ 0x40104b3 0x401024f 0x42bb1ba 0x4252ff4 0x4254e35 0x4255d73 0x425202c 0x4065e41
#	0x42bb1b9	main.alloc+0x69				/Users/xargin/go/src/github.com/mosn/holmes/example/alloc.go:25
#	0x4252ff3	net/http.HandlerFunc.ServeHTTP+0x43	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2012
#	0x4254e34	net/http.(*ServeMux).ServeHTTP+0x1a4	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2387
#	0x4255d72	net/http.serverHandler.ServeHTTP+0xa2	/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2807
#	0x425202b	net/http.(*conn).serve+0x86b		/Users/xargin/sdk/go1.14.2/src/net/http/server.go:1895

cpu outage caused by a dead loop

See this example.

After warming up finishes, curl http://localhost:10003/cpuex several times; then you'll see the CPU profile dumped to your dump path.

Note that the CPU profile currently doesn't support text mode.

go tool pprof cpu.20201028100641.bin

(pprof) top
Showing nodes accounting for 19.45s, 99.95% of 19.46s total
Dropped 6 nodes (cum <= 0.10s)
      flat  flat%   sum%        cum   cum%
    17.81s 91.52% 91.52%     19.45s 99.95%  main.cpuex.func1
     1.64s  8.43% 99.95%      1.64s  8.43%  runtime.asyncPreempt

(pprof) list func1
Total: 19.46s
ROUTINE ======================== main.cpuex.func1 in /Users/xargin/go/src/github.com/mosn/holmes/example/cpu_explode.go
    17.81s     19.45s (flat, cum) 99.95% of Total
      80ms       80ms      1:package main
         .          .      2:
         .          .      3:import (
         .          .      4:	"net/http"
         .          .      5:	"time"
         .          .      6:
         .          .      7:	"github.com/mosn/holmes"
         .          .      8:)
         .          .      9:
         .          .     10:func init() {
         .          .     11:	http.HandleFunc("/cpuex", cpuex)
         .          .     12:	go http.ListenAndServe(":10003", nil)
         .          .     13:}
         .          .     14:
         .          .     15:var h = holmes.New("2s", "1m", "/tmp", false).
         .          .     16:	EnableCPUDump().Config(20, 25, 80)
         .          .     17:
         .          .     18:func main() {
         .          .     19:	h.Start()
         .          .     20:	time.Sleep(time.Hour)
         .          .     21:}
         .          .     22:
         .          .     23:func cpuex(wr http.ResponseWriter, req *http.Request) {
         .          .     24:	go func() {
    17.73s     19.37s     25:		for {
         .          .     26:		}
         .          .     27:	}()
         .          .     28:}

So we've found the culprit.

large thread allocation caused by cgo block

See this example

This is a cgo blocking example; massive cgo blocking will cause many threads to be created.

After warming up, curl http://localhost:10003/leak, then the thread profile and goroutine profile will be dumped to the dumpPath:

[2020-11-10 19:49:52.145][Holmes] pprof thread, config_min : 10, config_diff : 25, config_abs : 100,  previous : [8 8 8 8 8 8 8 8 8 1013], current : 1013
[2020-11-10 19:49:52.146]threadcreate profile: total 1013
1012 @
#	0x0

1 @ 0x403af6e 0x403b679 0x4037e34 0x4037e35 0x40677d1
#	0x403af6d	runtime.allocm+0x14d			/Users/xargin/sdk/go1.14.2/src/runtime/proc.go:1390
#	0x403b678	runtime.newm+0x38			/Users/xargin/sdk/go1.14.2/src/runtime/proc.go:1704
#	0x4037e33	runtime.startTemplateThread+0x2c3	/Users/xargin/sdk/go1.14.2/src/runtime/proc.go:1768
#	0x4037e34	runtime.main+0x2c4			/Users/xargin/sdk/go1.14.2/src/runtime/proc.go:186

goroutine profile: total 1002
999 @ 0x4004f8b 0x4394a61 0x4394f79 0x40677d1
#	0x4394a60	main._Cfunc_output+0x40	_cgo_gotypes.go:70
#	0x4394f78	main.leak.func1.1+0x48	/Users/xargin/go/src/github.com/mosn/holmes/example/thread_trigger.go:45

1 @ 0x4038160 0x40317ca 0x4030d35 0x40c6555 0x40c8db4 0x40c8d96 0x41a8f92 0x41c2a52 0x41c1894 0x42d00cd 0x42cfe17 0x4394c57 0x4394c20 0x4037d82 0x40677d1
#	0x4030d34	internal/poll.runtime_pollWait+0x54		/Users/xargin/sdk/go1.14.2/src/runtime/netpoll.go:203
#	0x40c6554	internal/poll.(*pollDesc).wait+0x44		/Users/xargin/sdk/go1.14.2/src/internal/poll/fd_poll_runtime.go:87
#	0x40c8db3	internal/poll.(*pollDesc).waitRead+0x1d3	/Users/xargin/sdk/go1.14.2/src/internal/poll/fd_poll_runtime.go:92
#	0x40c8d95	internal/poll.(*FD).Accept+0x1b5		/Users/xargin/sdk/go1.14.2/src/internal/poll/fd_unix.go:384
#	0x41a8f91	net.(*netFD).accept+0x41			/Users/xargin/sdk/go1.14.2/src/net/fd_unix.go:238
#	0x41c2a51	net.(*TCPListener).accept+0x31			/Users/xargin/sdk/go1.14.2/src/net/tcpsock_posix.go:139
#	0x41c1893	net.(*TCPListener).Accept+0x63			/Users/xargin/sdk/go1.14.2/src/net/tcpsock.go:261
#	0x42d00cc	net/http.(*Server).Serve+0x25c			/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2901
#	0x42cfe16	net/http.(*Server).ListenAndServe+0xb6		/Users/xargin/sdk/go1.14.2/src/net/http/server.go:2830
#	0x4394c56	net/http.ListenAndServe+0x96			/Users/xargin/sdk/go1.14.2/src/net/http/server.go:3086
#	0x4394c1f	main.main+0x5f					/Users/xargin/go/src/github.com/mosn/holmes/example/thread_trigger.go:55
#	0x4037d81	runtime.main+0x211				/Users/xargin/sdk/go1.14.2/src/runtime/proc.go:203

1 @ 0x4038160 0x4055bea 0x4394ead 0x40677d1
#	0x4055be9	time.Sleep+0xb9		/Users/xargin/sdk/go1.14.2/src/runtime/time.go:188
#	0x4394eac	main.init.0.func1+0x1dc	/Users/xargin/go/src/github.com/mosn/holmes/example/thread_trigger.go:34

1 @ 0x43506d5 0x43504f0 0x434d28a 0x4391872 0x43914cf 0x43902c2 0x40677d1
#	0x43506d4	runtime/pprof.writeRuntimeProfile+0x94				/Users/xargin/sdk/go1.14.2/src/runtime/pprof/pprof.go:694
#	0x43504ef	runtime/pprof.writeGoroutine+0x9f				/Users/xargin/sdk/go1.14.2/src/runtime/pprof/pprof.go:656
#	0x434d289	runtime/pprof.(*Profile).WriteTo+0x3d9				/Users/xargin/sdk/go1.14.2/src/runtime/pprof/pprof.go:329
#	0x4391871	github.com/mosn/holmes.(*Holmes).threadProfile+0x2e1		/Users/xargin/go/src/github.com/mosn/holmes/holmes.go:260
#	0x43914ce	github.com/mosn/holmes.(*Holmes).threadCheckAndDump+0x9e	/Users/xargin/go/src/github.com/mosn/holmes/holmes.go:241
#	0x43902c1	github.com/mosn/holmes.(*Holmes).startDumpLoop+0x571		/Users/xargin/go/src/github.com/mosn/holmes/holmes.go:158

So we know that the threads were created because of blocking cgo calls.

Comments
  • consul/api version conflict makes holmes unusable

    consul/api version conflict makes holmes unusable

    My own project uses consul v1.4.3, while holmes depends on consul/api v1.3.0.

    Reading the holmes code shows consul is not actually used anywhere; tracing the dependencies reveals the following chain:

    mosn.io/pkg -> github.com/dubbogo/gost -> github.com/prometheus/client_golang -> github.com/prometheus/common -> github.com/go-kit/kit -> consul/api v1.3.0

    holmes itself has no consul dependency at all; even the logging features it uses from mosn.io/pkg/log don't touch consul/api.

    Could the consul-related dependency be removed from mosn.io/pkg/log? Otherwise it would be a pity that such an excellent open-source project can't be used.

    opened by baipangbai 16
  • Add dangerous limit option and fix some typos.

    Add dangerous limit option and fix some typos.

    main changes: add the dangerous limit option (ref: this issue). other changes: fix some typos and break long sentences to make the docs easier to read without scrolling side to side.

    bug First-time contributor cla:yes size/M 
    opened by Jun10ng 9
  • feature: support upload profile to external platform.

    feature: support upload profile to external platform.

    Currently we can only write the profile to the local filesystem. It would be better to upload profiles to a central platform, so that we can get real-time alerts based on it and analyze the profiles there.

    design: we can add a new profile-callback option and invoke the callback on each profile-file write, if it has been set.

    callback API maybe:

    callback(type, filename, reason)
    
    opened by doujiang24 8
  • Add a dangerous_limit parameter to the WithCPUDump method

    Add a dangerous_limit parameter to the WithCPUDump method

    Hi team,

    In my opinion, there should be a dangerous_limit parameter meaning holmes will not dump a profile once current CPU usage has reached that limit, because CPU profiling itself usually costs some CPU (commonly 5% or less). If holmes running the CPU profiler pushed CPU usage up by 5% and crashed the service, I wouldn't want that.

    func WithCPUDump(min int, diff int, abs int)
    

    change to

    func WithCPUDump(min int, diff int, abs int, dangerous int)
    

    or add a new withOption func

    func WithDangerousLimit(d int)
    

    I prefer the latter.

    opened by Jun10ng 8
  • "open file failed" when dump path does not exist

    When using a dump path that does not exist, for example

    holmes.WithDumpPath("./tmp"),
    

    Expected behavior: The goroutine should be dumped successfully.

    Current behavior: you will encounter an error like

    2022-03-29 17:07:06,609 [ERROR] failed to write profile to file(tmp/goroutine..20220329170706.609.log), err: pprof goroutine open file failed : open tmp/goroutine..20220329170706.609.log: no such file or directory
    

    Source Code

    package main
    
    import (
    	"fmt"
    	"runtime"
    	"time"
    
    	"mosn.io/holmes"
    	mlog "mosn.io/pkg/log"
    )
    
    func main() {
    	logger := holmes.NewStdLogger()
    	logger.SetLogLevel(mlog.INFO)
    
    	h, _ := holmes.New(
    		holmes.WithCollectInterval("1s"),
    
    		holmes.WithTextDump(),
    		holmes.WithDumpPath("./tmp"),
    		// dump will happen when current_goroutine_num > 500 && current_goroutine_num < 1500
    		holmes.WithGoroutineDump(500, 0, 500, 1500, 40*time.Second),
    		holmes.WithLogger(logger),
    	)
    	h.EnableGoroutineDump()
    	h.Start()
    
    	spawnGoroutine(490)
    
    	for {
    		fmt.Println(time.Now(), "Number of goroutines:", runtime.NumGoroutine())
    		spawnGoroutine(10)
    		time.Sleep(10 * time.Second)
    	}
    }
    
    func spawnGoroutine(n int64) {
    	for i := int64(0); i < n; i++ {
    		go func() {
    			time.Sleep(500 * time.Minute)
    		}()
    	}
    }
    
    

    go.mod

    module test_holmes
    
    go 1.16
    
    require (
    	mosn.io/holmes v1.0.0
    	mosn.io/pkg v0.0.0-20211217101631-d914102d1baf
    )
    
    opened by zhangqibupt 5
  • feature: support external logger.

    feature: support external logger.

    we create a file-based logger in holmes, which makes holmes easier to use standalone. but sometimes we already have a logger, so we'd better support just using the existing logger instead of creating a new one. that will make holmes easier to integrate.

    opened by doujiang24 5
  • Some questions about the ring

    Some questions about the ring

    1. After reading the code, the ring seems to exist only to compute an average. Why not use a queue instead, deleting one element from the head and appending one at the tail on each insert?

    2. If a queue was avoided because it lacks the ring's strict size bound, why not wrap the ring provided by the standard library instead of writing a new one? In the hand-written ring, idx's maximum value equals len(ring.data), yet ring.data[ring.idx] is still accessed, which feels dangerous.

    opened by x-wesley 5
  • heap samples are not what I expect

    heap samples are not what I expect

    What version of Go are you using (go version)?

    $  1.17.8
    
    

    Does this issue reproduce with the latest release?

    Haven't tried.

    What operating system and processor architecture are you using (go env)?

    go env Output
    $ MacOS and amd64
    

    What did you do?

    I wrote a piece of code:

    var (
        defaultSize = 1073741824
        a []byte
    )
    
    func MakeNGbSlice(ctx *gin.Context) {
        sizeStr ,ok := ctx.GetQuery("size")
        if ok {
            size, _ := strconv.Atoi(sizeStr)
            if size > 0 {
                defaultSize = size * defaultSize
            }
        }
    
        a = make([]byte, 0, defaultSize)
        for i := 0; i < defaultSize; i ++ {
            a = append(a, byte('a'))
        }
        time.Sleep(time.Second * 10)
        a = nil // for gc
    }
    

    Then I curled the api to trigger the code to run.

    What did you expect to see?

    1. The RSS my app used went up to 1GB.
    2. The heap profile data was right and could help me find out the reason why RSS went up after I dumped the heap samples.

    What did you see instead?

    1. The RSS my app used went up to 1GB.
    2. The heap profile seemed to be weird:
    Type: inuse_space
    Time: Aug 3, 2022 at 2:32pm (CST)
    Entering interactive mode (type "help" for commands, "o" for options)
    (pprof) top
    Showing nodes accounting for 12052.56kB, 64.38% of 18721.44kB total
    Showing top 10 nodes out of 153
          flat  flat%   sum%        cum   cum%
     2562.81kB 13.69% 13.69%  3075.02kB 16.43%  runtime.allocm
     2048.81kB 10.94% 24.63%  2048.81kB 10.94%  runtime.malg
     2048.19kB 10.94% 35.57%  2048.19kB 10.94%  github.com/Shopify/sarama.(*TopicMetadata).decode
     1097.69kB  5.86% 41.44%  1097.69kB  5.86%  github.com/Shopify/sarama.(*client).updateMetadata
     1089.33kB  5.82% 47.26%  1089.33kB  5.82%  google.golang.org/grpc/internal/transport.newBufWriter
    // ...
    
    1. After about 2 minutes, I profiled again and the result was what I expected:
    Type: inuse_space
    Time: Aug 3, 2022 at 2:33pm (CST)
    Entering interactive mode (type "help" for commands, "o" for options)
    (pprof) top
    Showing nodes accounting for 1024.50MB, 98.25% of 1042.78MB total
    Dropped 154 nodes (cum <= 5.21MB)
    Showing top 10 nodes out of 18
          flat  flat%   sum%        cum   cum%
        1024MB 98.20% 98.20%     1024MB 98.20%  .../examples.Make1GbSlice
        0.50MB 0.048% 98.25%     6.63MB  0.64%  runtime.main
             0     0% 98.25%  1024.51MB 98.25%  github.com/gin-gonic/gin.(*Context).Next (inline)
             0     0% 98.25%  1024.51MB 98.25%  github.com/gin-gonic/gin.(*Engine).ServeHTTP
             0     0% 98.25%  1024.51MB 98.25%  github.com/gin-gonic/gin.(*Engine).handleHTTPRequest
    

    So it seems that the samples pprof dumped at that moment were wrong.

    For more discussion, Please see golang issue #54233

    opened by dumbFeng 4
  • Why CPU Dump didn't work

    Why CPU Dump didn't work

    I ran holmes inside my application and set the following options.

           // other initial
    
    
    	h, _ := holmes.New(
    		holmes.WithCollectInterval("5s"),
    		holmes.WithCoolDown("1m"),
    		holmes.WithDumpPath("/tmp"),
    		holmes.WithCPUDump(1, 25, 80),
    		holmes.WithCPUMax(90),
    	)
    	h.EnableCPUDump()
    
    	// start the metrics collect and dump loop
    	h.Start()
    
          // server start
    
    

    then I ran a CPU-eating shell script; the top command output showed the CPU usage rate at almost 100%.

    but when I check holmes.log, it shows the CPU usage rate is 0%

    opened by Jun10ng 4
  • A discussion about dynamic configuration.

    A discussion about dynamic configuration.

    Hi team,

    I saw there was an issue about "dynamic configuration", and I want to discuss some questions about it.

    1. What does "dynamic" mean here: holmes pulls/receives configuration from a remote server, or holmes adjusts its configuration automatically through pre-prepared plans?
    2. If it is the former (holmes receiving configuration from Apollo/Redis/an API or a service built by users themselves), we'd better draft an abstract API and implement the dynamic-config feature on top of it.
    3. Do we support hot-reloading the configuration? I mean, does holmes support modifying its config while it is running, or only the simpler way of loading configuration from the remote when holmes is initializing? Supporting the former seems great.
    opened by Jun10ng 3
  • Has anyone run into this problem?

    Has anyone run into this problem?

    It looks similar to https://github.com/golang/go/issues/40974

    Running go test inside GoLand works fine.

    But running go install on the project fails with: /usr/local/Cellar/go/1.15.2/libexec/pkg/tool/darwin_amd64/link: running clang failed: exit status 1 ld: sectionForAddress(0x13654DE) address not in any section file '/var/folders/rr/2rdmy50d1jb0tjnjfhcbnbqh0000gn/T/go-link-273425566/go.o' for architecture x86_64 clang: error: linker command failed with exit code 1 (use -v to see invocation)

    The problem disappeared after upgrading Go to 1.15.5.

    opened by x-wesley 3
  • Builds with Go 1.18, but tests with 1.17, 1.18 and 1.19

    Builds with Go 1.18, but tests with 1.17, 1.18 and 1.19

    This uses a test matrix to ensure the floor version of Mosn is two behind latest Go, even if one behind latest (currently 1.18) is used to build binaries.

    See https://github.com/mosn/mosn/issues/2153

    First-time contributor cla:yes size/XXL 
    opened by codefromthecrypt 3
  • feature: print the CPU usage during sampling cpu profile

    feature: print the CPU usage during sampling cpu profile

    The CPU profile may miss the high-CPU-usage window when CPU usage drops quickly. eg. https://github.com/mosn/holmes/issues/123#issuecomment-1257762975

    we could record the CPU usage while sampling the CPU profile and log it. eg. start another goroutine and log CPU usage continuously.

    opened by doujiang24 4
  • A sudden CPU spike triggered a dump, but analyzing the dumped file with go tool pprof turns up almost nothing

    A sudden CPU spike triggered a dump, but analyzing the dumped file with go tool pprof turns up almost nothing

    Code:

    h, err := holmes.New(
        holmes.WithProfileReporter(r),
        holmes.WithCollectInterval("30s"),
        holmes.WithDumpPath(pprofDumpPath),
        holmes.WithLogger(holmes.NewFileLog(holmesLogFilePath, mlog.INFO)),
        holmes.WithCPUDump(30, 50, 60, time.Minute),
        holmes.WithCPUMax(90),
        holmes.WithCGroup(true),
    )
    h.EnableCPUDump()

    // start the metrics collect and dump loop
    h.Start()

    It looks like the CPU spiked briefly and then returned to normal. Could this be a problem with the collect interval or the parameter settings?

    opened by sk872529557 15
Releases(v1.0.2)
  • v1.0.2(Sep 9, 2022)

    What's Changed

    • feat:#96 by @songzhibin97 in https://github.com/mosn/holmes/pull/98
    • fix: removed the useless dot in the dump filename by @AtlanCI in https://github.com/mosn/holmes/pull/100
    • Update util.go by @nejisama in https://github.com/mosn/holmes/pull/106
    • feat: skip goroutine dump when already finished a thread check dump in a single loop. by @AtlanCI in https://github.com/mosn/holmes/pull/103
    • fix: Fix DisableMemDump using wrong Opts #105 by @AtlanCI in https://github.com/mosn/holmes/pull/107
    • typo fix: grTriggerCount => gcHeapTriggerCount. by @doujiang24 in https://github.com/mosn/holmes/pull/110
    • Goroutine profile known risk improvement by @Jun10ng in https://github.com/mosn/holmes/pull/111
    • Enable holmes as pyroscope client and reports pprof event to pyroscope server by @Jun10ng in https://github.com/mosn/holmes/pull/109
    • fix log print cpu to curCPU by @Jun10ng in https://github.com/mosn/holmes/pull/115
    • feat(report): pass scene information to reporter by @dumbFeng in https://github.com/mosn/holmes/pull/119

    New Contributors

    • @AtlanCI made their first contribution in https://github.com/mosn/holmes/pull/100
    • @nejisama made their first contribution in https://github.com/mosn/holmes/pull/106
    • @dumbFeng made their first contribution in https://github.com/mosn/holmes/pull/119

    Full Changelog: https://github.com/mosn/holmes/compare/v1.0.1...v1.0.2

    Source code(tar.gz)
    Source code(zip)
  • v1.0.1(Apr 1, 2022)

    holmes v1.0.1

    What's Changed

    • feat:supports dump goroutine to logger and fix typos by @Jun10ng in https://github.com/mosn/holmes/pull/91
    • docs: add withDumpTo logs by @Jun10ng in https://github.com/mosn/holmes/pull/93
    • stop remind typos by @Jun10ng in https://github.com/mosn/holmes/pull/94
    • fix: #89 by @songzhibin97 in https://github.com/mosn/holmes/pull/92

    Full Changelog: https://github.com/mosn/holmes/compare/v1.0.0...v1.0.1

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0(Mar 21, 2022)

    This is the first release version of holmes; you can use holmes in your production environment now.

    Background introduction and design details were posted on our blog: holmes introduction (in Chinese). holmes is used by MOSN inside Ant Financial, which has more than 400k instances in production.

    Now holmes supports self-aware profile dump including:

    • goroutine profile dump when goroutine number spikes
    • cpu profile dump when cpu usage spikes
    • heap profile dump when memory usage spikes.

    Thanks for our contributors: @Jun10ng @doujiang24 @songzhibin97 @cch123 @Mutated1994

    If you encounter any problem when you use holmes, welcome to open an issue or pull request!

    Source code(tar.gz)
    Source code(zip)
Owner
MOSN
The Cloud Native Proxy for Edge or Service Mesh