Continuous profiling for long-term postmortem analysis

Overview

profefe


profefe, a continuous profiling system, collects profiling data from a fleet of running applications and provides an API for querying profiling samples for postmortem performance analysis.

Why Continuous Profiling?

"Continuous Profiling and Go" describes the motivation behind profefe:

With the increase in momentum around the term “observability” over the last few years, there is a common misconception amongst the developers, that observability is exclusively about metrics, logs and tracing (a.k.a. “three pillars of observability”) [..] With metrics and tracing, we can see the system on a macro-level. Logs only cover the known parts of the system. Performance profiling is another signal that uncovers the micro-level of a system; continuous profiling allows observing how the components of the application and the infrastructure it runs in, influence the overall system.

How does it work?

See the Design Docs documentation.

Quickstart

To build and start the profefe collector, run:

$ make
$ ./BUILD/profefe -addr=localhost:10100 -storage-type=badger -badger.dir=/tmp/profefe-data

2019-06-06T00:07:58.499+0200    info    profefe/main.go:86    server is running    {"addr": ":10100"}

The command above starts the profefe collector backed by BadgerDB as the storage for profiles. profefe also supports other storage types: S3, Google Cloud Storage and ClickHouse.

Run ./BUILD/profefe -help to show the list of all available options.

Example application

profefe ships with a fork of Google Stackdriver Profiler's example application, modified to use the profefe agent, which sends profiling data to the profefe collector.
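
For reference, wiring the agent into an application looks roughly like the sketch below. The agent calls (agent.Start, the WithXxxProfile options, Stop) mirror the snippet shown in the issues further down this page; the import path and the collector address are assumptions, so check the profefe repository for the current agent package.

package main

import (
	"log"
	"time"

	// NOTE: the import path is an assumption; see the profefe repository
	// for the actual location of the agent package.
	"github.com/profefe/profefe/agent"
)

func main() {
	// Start the profefe agent: it periodically collects profiles and sends
	// them to the collector (the address is assumed from the quickstart).
	pffAgent, err := agent.Start(
		"http://localhost:10100",
		"hotapp-service",
		agent.WithCPUProfile(10*time.Second),
		agent.WithHeapProfile(),
		agent.WithLabels("version", "1.0.0"),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer pffAgent.Stop()

	// ... the application's own work goes here ...
	time.Sleep(time.Minute)
}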

To start the example application, run the following command in a separate terminal window:

$ go run ./examples/hotapp/main.go

After a brief period, the application will start sending CPU profiles to the profefe collector.

send profile: http://localhost:10100/api/0/profiles?service=hotapp-service&labels=version=1.0.0&type=cpu
send profile: http://localhost:10100/api/0/profiles?service=hotapp-service&labels=version=1.0.0&type=cpu
send profile: http://localhost:10100/api/0/profiles?service=hotapp-service&labels=version=1.0.0&type=cpu

With the profiling data persisted, you can query the profiles from the collector using its HTTP API (refer to the documentation for the collector's HTTP API below). As an example, request all profiling data associated with the given meta-information (service name and a time frame) as a single merged profile:

$ go tool pprof 'http://localhost:10100/api/0/profiles/merge?service=hotapp-service&type=cpu&from=2019-05-30T11:49:00&to=2019-05-30T12:49:00&labels=version=1.0.0'

Fetching profile over HTTP from http://localhost:10100/api/0/profiles...
Saved profile in /Users/varankinv/pprof/pprof.samples.cpu.001.pb.gz
Type: cpu

(pprof) top
Showing nodes accounting for 43080ms, 99.15% of 43450ms total
Dropped 53 nodes (cum <= 217.25ms)
Showing top 10 nodes out of 12
      flat  flat%   sum%        cum   cum%
   42220ms 97.17% 97.17%    42220ms 97.17%  main.load
     860ms  1.98% 99.15%      860ms  1.98%  runtime.nanotime
         0     0% 99.15%    21050ms 48.45%  main.bar
         0     0% 99.15%    21170ms 48.72%  main.baz
         0     0% 99.15%    42250ms 97.24%  main.busyloop
         0     0% 99.15%    21010ms 48.35%  main.foo1
         0     0% 99.15%    21240ms 48.88%  main.foo2
         0     0% 99.15%    42250ms 97.24%  main.main
         0     0% 99.15%    42250ms 97.24%  runtime.main
         0     0% 99.15%     1020ms  2.35%  runtime.mstart

profefe includes a tool that imports existing pprof data into the collector. While the profefe collector is still running, run the following:

$ ./scripts/pprof_import.sh --service service1 --label region=europe-west3 --label host=backend1 --type cpu -- path/to/cpu.prof

uploading service1-cpu-backend1-20190313-0948Z.prof...OK

Using Docker

You can build a Docker image with the profefe collector by running:

$ make docker-image

The documentation about running profefe in docker is in contrib/docker/README.md.

HTTP API

Store pprof-formatted profile

POST /api/0/profiles?service=<service>&type=[cpu|heap|...]&labels=<labels>
body pprof.pb.gz

< HTTP/1.1 200 OK
< Content-Type: application/json
<
{
  "code": 200,
  "body": {
    "id": ,
    "type": ,
    ···
  }
}
  • service — service name (string)
  • type — profile type ("cpu", "heap", "block", "mutex", "goroutine", "threadcreate", or "other")
  • labels — a set of key-value pairs, e.g. "region=europe-west3,dc=fra,ip=1.2.3.4,version=1.0" (Optional)

Example

$ curl -XPOST \
  "http:///api/0/profiles?service=api-backend&type=cpu&labels=region=europe-west3,dc=fra" \
  --data-binary "@$HOME/pprof/api-backend-cpu.prof"

Store runtime execution traces (experimental)

Go's runtime traces are a special case of profiling data, that can be stored and queried with profefe.

Currently, profefe doesn't support extracting the timestamp of when the trace was created. The client may provide this information via the created_at parameter, see below.

POST /api/0/profiles?service=<service>&type=trace&created_at=<created_at>&labels=<labels>
body trace.out

< HTTP/1.1 200 OK
< Content-Type: application/json
<
{
  "code": 200,
  "body": {
    "id": ,
    "type": "trace",
    ···
  }
}
  • service — service name (string)
  • type — profile type ("trace")
  • created_at — trace profile creation time, e.g. "2006-01-02T15:04:05" (defaults to server's current time)
  • labels — a set of key-value pairs, e.g. "region=europe-west3,dc=fra,ip=1.2.3.4,version=1.0" (Optional)

Example

$ curl -XPOST \
  "http:///api/0/profiles?service=api-backend&type=trace&created_at=2019-05-01T18:45:00&labels=region=europe-west3,dc=fra" \
  --data-binary "@$HOME/pprof/api-backend-trace.out"

Query meta information about stored profiles

GET /api/0/profiles?service=<service>&type=<type>&from=<from>&to=<to>&labels=<labels>

< HTTP/1.1 200 OK
< Content-Type: application/json
<
{
  "code": 200,
  "body": [
    {
      "id": ,
      "type": 
    },
    ···
  ]
}
  • service — service name
  • from, to — a time frame in which profiling data was collected, e.g. "from=2006-01-02T15:04:05"
  • type — profile type ("cpu", "heap", "block", "mutex", "goroutine", "threadcreate", "trace", "other") (Optional)
  • labels — a set of key-value pairs, e.g. "region=europe-west3,dc=fra,ip=1.2.3.4,version=1.0" (Optional)

Example

$ curl "http:///api/0/profiles?service=api-backend&type=cpu&from=2019-05-01T17:00:00&to=2019-05-25T00:00:00"

Query saved profiling data returning it as a single merged profile

GET /api/0/profiles/merge?service=<service>&type=<type>&from=<from>&to=<to>&labels=<labels>

< HTTP/1.1 200 OK
< Content-Type: application/octet-stream
< Content-Disposition: attachment; filename="pprof.pb.gz"
<
pprof.pb.gz

Request parameters are the same as for querying meta information.

Note, "type" parameter is required; merging runtime traces is not supported.

Return individual profile as pprof-formatted data

GET /api/0/profiles/<id>

< HTTP/1.1 200 OK
< Content-Type: application/octet-stream
< Content-Disposition: attachment; filename="pprof.pb.gz"
<
pprof.pb.gz
  • id - id of a stored profile, as returned by the meta-information query above

Merge a set of individual profiles into a single profile

GET /api/0/profiles/<id1>+<id2>+...

< HTTP/1.1 200 OK
< Content-Type: application/octet-stream
< Content-Disposition: attachment; filename="pprof.pb.gz"
<
pprof.pb.gz
  • id1, id2 - ids of stored profiles

Note that merging is possible only for profiles of the same type; merging runtime traces is not supported.

Get services for which profiling data is stored

GET /api/0/services

< HTTP/1.1 200 OK
< Content-Type: application/json
<
{
  "code": 200,
  "body": [
    <service>,
    ···
  ]
}

Get profefe server version

GET /api/0/version

< HTTP/1.1 200 OK
< Content-Type: application/json
<
{
  "code": 200,
  "body": {
    "version": ,
    "commit": ,
    "build_time": "
  }
}

FAQ

Does continuous profiling affect the performance of the production system?

Profiling always comes at some cost. Go collects sampling-based profiling data, and for most applications the real overhead is small enough (refer to "Can I profile my production services?" in Go's Diagnostics documentation).

To reduce the costs, users can adjust the frequency of collection rounds, e.g. collect 10 seconds of CPU profiles every 5 minutes.

profefe-agent tries to reduce the overhead further by adding a small jitter between profile collection rounds. This spreads the total profiling overhead, making sure that not all instances in an application's cluster are profiled at the same time.
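
As an illustration of the idea (this is not the agent's actual code), a collection loop with jitter might look like:

package main

import (
	"fmt"
	"math/rand"
	"time"
)

// collectProfiles stands in for one round of profile collection
// (e.g. gathering 10 seconds of CPU profile and sending it off).
func collectProfiles() {
	fmt.Println("collecting profiles at", time.Now().Format(time.RFC3339))
}

func main() {
	const tick = 5 * time.Minute // base interval between collection rounds

	for {
		// Add up to 10% of random jitter to the interval so that the
		// instances of a service don't all profile themselves at the
		// exact same moment.
		jitter := time.Duration(rand.Int63n(int64(tick) / 10))
		time.Sleep(tick + jitter)

		collectProfiles()
	}
}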

Can I use profefe with non-Go projects?

profefe collects pprof-formatted profiling data. The format is used by the Go profiler, but third-party profilers for other programming languages support the format too. For example, google/pprof-nodejs for Node.js, tikv/pprof-rs for Rust, arnaud-lb/php-memory-profiler for PHP, etc.

Integrating them is a matter of building a transport layer between the profiler and profefe.

Further reading

While the topic of continuous profiling in production is underrepresented on the public internet, some research and commercial projects already exist.

profefe is still at an early stage. Feedback and contributions are very welcome.

License

MIT

Comments
  • clickhouse: new experimental storage type

    The PR adds a new experimental storage type based on the ClickHouse database. The storage parses pprof data and stores individual samples in ClickHouse.

    A new command-line flag, --storage-type=<type>, defines profefe's storage type:

    $ profefe \
      -addr=localhost:10100 \
      -storage-type=clickhouse \
      -clickhouse.dsn='tcp://127.0.0.1:9000?database=profefe'
    

    TODO:

    • [x] provide persistent database schema
    • [x] calculate storage requirements
    • [x] set up integration tests from storage/clickhouse to run on CI

    Implementation details

    In the current implementation, there is no way to restore the pprof from the DB. That is, Storage.ListProfiles is not supported. The storage is intended to be used with analytical subsystems that work directly with ClickHouse.

    The storage only supports pprof profiles, i.e. type=trace doesn't work — the storage returns an "unsupported profile type" error.

    The recommended SQL schema is in pkg/storage/clickhouse/schema/profefe.sql.
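
    For illustration only (this is not the PR's actual implementation), a rough sketch of flattening a pprof profile into per-sample rows resembling the pprof_samples table (see the schema below), using the github.com/google/pprof/profile package:

    package main

    import (
    	"fmt"
    	"log"
    	"os"

    	"github.com/google/pprof/profile"
    )

    // sampleRow loosely mirrors the columns of the pprof_samples table.
    type sampleRow struct {
    	funcNames []string
    	fileNames []string
    	lines     []int64
    	values    []int64
    }

    func main() {
    	f, err := os.Open("cpu.prof") // path is an assumption
    	if err != nil {
    		log.Fatal(err)
    	}
    	defer f.Close()

    	prof, err := profile.Parse(f)
    	if err != nil {
    		log.Fatal(err)
    	}

    	var rows []sampleRow
    	for _, s := range prof.Sample {
    		var row sampleRow
    		for _, loc := range s.Location {
    			for _, ln := range loc.Line {
    				if ln.Function == nil {
    					continue
    				}
    				row.funcNames = append(row.funcNames, ln.Function.Name)
    				row.fileNames = append(row.fileNames, ln.Function.Filename)
    				row.lines = append(row.lines, ln.Line)
    			}
    		}
    		row.values = s.Value
    		rows = append(rows, row)
    	}

    	fmt.Printf("flattened %d samples into rows\n", len(rows))
    }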

    Outdated details

    In this implementation, storage uses the following DB schema (not really, see the discussion below):

    CREATE TABLE IF NOT EXISTS pprof_profiles (
    	profile_key FixedString(12),
    	profile_type Enum8(
    		'cpu' = 1,
    		'heap' = 2,
    		'block' = 3,
    		'mutex' = 4,
    		'goroutine' = 5,
    		'threadcreate' = 6,
    		'other' = 100
    	),
    	external_id String,
    	service_name LowCardinality(String),
    	created_at DateTime,
    	labels Nested (
    		key LowCardinality(String),
    		value String
    	)
    ) engine=Memory
    
    CREATE TABLE IF NOT EXISTS pprof_samples (
    	profile_key FixedString(12),
    	digest UInt64,
    	locations Nested (
    		func_name LowCardinality(String),
    		file_name LowCardinality(String),
    		lineno UInt16
    	),
    	values Array(UInt64),
    	labels Nested (
    		key String,
    		value String
    	)
    ) engine=Memory
    

    Note, engine=Memory is obviously not supposed to be used in production.

    opened by narqo 10
  • S3 profile ID does not look right

    Hello,

    S3 should work as the other storage.

    S3 profile ID P0.service/4/bq0rups0s6cfii5dn030,from=label,label2=label2

    For the other storage: bq0rups0s6cfii5dn030

    This ID is complicated because this does not work:

    go tool pprof http://localhost:10100/api/0/profiles/P0.service/4/bq0rups0s6cfii5dn030,from=label,label2=label2
    

    I am not sure what changed, but the implementation we hacked in https://github.com/profefe/profefe/pull/52 was working as expected.

    opened by gianarb 6
  • S3 backing for profefe

    Hey there, here is a PR to store profiles in S3.

    I've encoded the metadata about the profile within the S3 key like this:

    /service/profile_type/created_at_unix_time/label1=value1,label2=value2/id
    

    Additionally, I tweaked the CLI parameters so that one must specify either badger or s3.

    What do you think?

    opened by goller 4
  • storage/gcs: implement storage on GCS

    Hi, this PR adds a new storage type based on GCS. I've implemented the schema for the object key referring to https://github.com/profefe/profefe/pull/69.

    opened by tatsumack 3
  • Fatal on building docker image

    go build  -trimpath -ldflags " -X github.com/profefe/profefe/version.version=git-8c2ec3f -X github.com/profefe/profefe/version.commit=8c2ec3f -X github.com/profefe/profefe/version.buildTime=2020-04-09T21:10:21Z" -o /go/src/github.com/profefe/profefe/BUILD/profefe ./cmd/profefe
    runtime: mlock of signal stack failed: 12
    runtime: increase the mlock limit (ulimit -l) or
    runtime: update your kernel to 5.3.15+, 5.4.2+, or 5.5+
    fatal error: mlock failed
    
    
    runtime stack:
    runtime.throw(0xa3b45e, 0xc)
    	/usr/local/go/src/runtime/panic.go:1112 +0x72
    runtime.mlockGsignal(0xc000383200)
    	/usr/local/go/src/runtime/os_linux_x86.go:72 +0x107
    runtime.mpreinit(0xc00079ca80)
    	/usr/local/go/src/runtime/os_linux.go:341 +0x78
    runtime.mcommoninit(0xc00079ca80)
    	/usr/local/go/src/runtime/proc.go:630 +0x108
    runtime.allocm(0xc00003a800, 0xa82468, 0xc3ce98)
    	/usr/local/go/src/runtime/proc.go:1390 +0x14e
    runtime.newm(0xa82468, 0xc00003a800)
    	/usr/local/go/src/runtime/proc.go:1704 +0x39
    runtime.startm(0x0, 0xc000038001)
    	/usr/local/go/src/runtime/proc.go:1869 +0x12a
    runtime.wakep(...)
    	/usr/local/go/src/runtime/proc.go:1953
    runtime.resetspinning()
    	/usr/local/go/src/runtime/proc.go:2415 +0x93
    runtime.schedule()
    	/usr/local/go/src/runtime/proc.go:2527 +0x2de
    runtime.exitsyscall0(0xc0001e0600)
    	/usr/local/go/src/runtime/proc.go:3263 +0x116
    runtime.mcall(0x20000)
    	/usr/local/go/src/runtime/asm_amd64.s:318 +0x5b
    
    opened by korjavin 3
  • Implement storage on top of s3

    closes #52

    This PR is an updated version of the S3 backing storage from #52.

    The major difference from #52 is that the storage uploads a single object per write call. Instead of storing a separate object for the metadata, I encode everything that's required for the existing reader's API into the object key and use the key as the profile's ID (which became possible with #68).

    The schema for the object key

    schemaV.service_name/profile_type/digest,label1=value1,label2=value2
    
    • "schemaV" shows the naming schema that was used when the profile was stored in S3. The hope is that it will allow us to evolve the API in future. Currently, the schema is P0.
    • "digest" is the unique profile id. The digest is implemented via rs/xid. The value of the ID is built from the incoming profile's creation time.

    FindProfiles and ListServices are implemented via S3's ListObjects API. The latter relies on S3's common prefixes, which are expected to return all services in a single-page response.
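
    To make the key schema concrete, here is an illustrative helper (not part of the PR; the "P0" prefix, the digest value, and the label ordering are placeholders) that builds such a key:

    package main

    import (
    	"fmt"
    	"sort"
    	"strings"
    )

    // buildKey composes an S3 object key following the
    // schemaV.service_name/profile_type/digest,label1=value1,... layout.
    func buildKey(service, profileType, digest string, labels map[string]string) string {
    	// Sort the label keys so that the resulting key is deterministic.
    	keys := make([]string, 0, len(labels))
    	for k := range labels {
    		keys = append(keys, k)
    	}
    	sort.Strings(keys)

    	parts := []string{digest}
    	for _, k := range keys {
    		parts = append(parts, k+"="+labels[k])
    	}
    	return fmt.Sprintf("P0.%s/%s/%s", service, profileType, strings.Join(parts, ","))
    }

    func main() {
    	key := buildKey("hotapp-service", "cpu", "bq0rups0s6cfii5dn030", map[string]string{
    		"version": "1.0.0",
    		"region":  "europe-west3",
    	})
    	fmt.Println(key)
    	// P0.hotapp-service/cpu/bq0rups0s6cfii5dn030,region=europe-west3,version=1.0.0
    }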


    The PR also adds support for non-AWS s3-compatible object storage services (Minio, Yandex.Cloud, etc):

    $ env AWS_ACCESS_KEY_ID=minioadmin AWS_SECRET_ACCESS_KEY=minioadmin \
       ./BUILD/profefe -s3.bucket=test.0 -s3.endpoint-url=http://127.0.0.1:9000 -s3.disable-ssl=true
    

    TODO:

    • [x] Implement Storage.ListServices.
    • [x] Update docs.
    opened by narqo 3
  • Add support for threadcreate profile type

    Hello

    Is there a particular reason why only CPU and heap are acceptable profile types? There are at least goroutine, space, and threadcreate, and I think even more.

    I don't know if it is an uncommon practice, but, for example, at InfluxDB we developed a custom profile type called all:

    https://docs.influxdata.com/influxdb/v1.7/tools/api/#debug-pprof-all-http-endpoint

    Thanks a lot

    opened by gianarb 3
  • Response after creation

    Hello

    I am writing a bit of code around profefe, and I would like to return a usable link, or at least the profile ID, for the currently submitted profile.

    Currently, the profile submission returns only {"code":200}. Would you be against returning at least its ID?

    opened by gianarb 2
  • storage/postgres: Schema migrations

    As of now, each change in Postgres's SQL schema requires starting the DB from scratch. That's okay for the early prototyping phase, but it is something that must be addressed in the future.

    One needs to investigate how SQL schema migration is typically handled in a Go project.

    Check me:

    • https://github.com/heroiclabs/nakama/tree/master/migrate
    • https://github.com/sourcegraph/sourcegraph/tree/master/migrations
    wontfix 
    opened by narqo 2
  • About strategy of deployment

    Hello, I have one question. Should I just start the collector on one machine when I use Badger as storage, since Badger is not a distributed storage? Is that right? What should I do if I want to deploy a cluster of collectors?

    opened by vision9527 1
  • Block and Mutex: failed to fetch any source profiles

    I am running into 404s when querying the profefe merge API for both block and mutex profiles. CPU/heap/threadcreate/goroutine profiles are working just fine. I'm seeing the same behavior locally when running with docker-compose and in production on Kubernetes.

    Golang Version: 1.14

    Profefe Agent setup (at the top of my main.go):

    pffAgent, err := agent.Start(
    	config.Config.ProfefeHost,
    	config.Config.ServiceName,
    	agent.WithCPUProfile(10*time.Second),
    	agent.WithHeapProfile(),
    	agent.WithBlockProfile(),
    	agent.WithMutexProfile(),
    	agent.WithGoroutineProfile(),
    	agent.WithThreadcreateProfile(),
    	agent.WithLogger(agentLogger),
    	agent.WithLabels(
    		"region", config.Config.AWSRegion,
    		"instance", instance.GetID(),
    	),
    	agent.WithTickInterval(config.Config.GetProfefeTickInterval()),
    )
    if err != nil {
    	log.Fatalln(err)
    }
    defer pffAgent.Stop()
    

    For both the following URLs, I get a {"code":404,"error":"nothing found"} response.

    Block Query URL: http://localhost:10100/api/0/profiles/merge?service=redacted&type=block&from=2020-12-11T17:32:03&to=2020-12-11T17:42:03

    Profefe Logs

    2020-12-11T17:53:45.616Z	error	profefe/reply.go:79	request failed	{"url": "/api/0/profiles?labels=region%3Dus-east-1%2Cinstance%3Dbcfa84e4f2f6&service=redacted&type=block", "error": "profile is empty: no samples"}
    

    Mutex Query URL: http://localhost:10100/api/0/profiles/merge?service=redacted&type=mutex&from=2020-12-11T17:40:33&to=2020-12-11T17:50:33

    Profefe Logs

    2020-12-11T17:53:46.619Z	error	profefe/reply.go:79	request failed	{"url": "/api/0/profiles?labels=region%3Dus-east-1%2Cinstance%3Dbcfa84e4f2f6&service=redacted&type=mutex", "error": "profile is empty: no samples"}
    

    Based on the logs, it seems like the agent isn't collecting the block and mutex profiles, or it isn't sending them (maybe they're empty, judging by the logs?) to the collector. Any help would be appreciated.

    Thanks!

    opened by connormckelvey 1
  • build(deps): bump github.com/aws/aws-sdk-go from 1.29.9 to 1.33.0

    Bumps github.com/aws/aws-sdk-go from 1.29.9 to 1.33.0.

    Changelog

    Sourced from github.com/aws/aws-sdk-go's changelog.

    Release v1.33.0 (2020-07-01)

    Service Client Updates

    • service/appsync: Updates service API and documentation
    • service/chime: Updates service API and documentation
      • This release supports third party emergency call routing configuration for Amazon Chime Voice Connectors.
    • service/codebuild: Updates service API and documentation
      • Support build status config in project source
    • service/imagebuilder: Updates service API and documentation
    • service/rds: Updates service API
      • This release adds the exceptions KMSKeyNotAccessibleFault and InvalidDBClusterStateFault to the Amazon RDS ModifyDBInstance API.
    • service/securityhub: Updates service API and documentation

    SDK Features

    • service/s3/s3crypto: Introduces EncryptionClientV2 and DecryptionClientV2 encryption and decryption clients which support a new key wrapping algorithm kms+context. (#3403)
      • DecryptionClientV2 maintains the ability to decrypt objects encrypted using the EncryptionClient.
      • Please see s3crypto documentation for migration details.

    Release v1.32.13 (2020-06-30)

    Service Client Updates

    • service/codeguru-reviewer: Updates service API and documentation
    • service/comprehendmedical: Updates service API
    • service/ec2: Updates service API and documentation
      • Added support for tag-on-create for CreateVpc, CreateEgressOnlyInternetGateway, CreateSecurityGroup, CreateSubnet, CreateNetworkInterface, CreateNetworkAcl, CreateDhcpOptions and CreateInternetGateway. You can now specify tags when creating any of these resources. For more information about tagging, see AWS Tagging Strategies.
    • service/ecr: Updates service API and documentation
      • Add a new parameter (ImageDigest) and a new exception (ImageDigestDoesNotMatchException) to PutImage API to support pushing image by digest.
    • service/rds: Updates service documentation
      • Documentation updates for rds

    Release v1.32.12 (2020-06-29)

    Service Client Updates

    • service/autoscaling: Updates service documentation and examples
      • Documentation updates for Amazon EC2 Auto Scaling.
    • service/codeguruprofiler: Updates service API, documentation, and paginators
    • service/codestar-connections: Updates service API, documentation, and paginators
    • service/ec2: Updates service API, documentation, and paginators
      • Virtual Private Cloud (VPC) customers can now create and manage their own Prefix Lists to simplify VPC configurations.

    Release v1.32.11 (2020-06-26)

    Service Client Updates

    • service/cloudformation: Updates service API and documentation
      • ListStackInstances and DescribeStackInstance now return a new StackInstanceStatus object that contains DetailedStatus values: a disambiguation of the more generic Status value. ListStackInstances output can now be filtered on DetailedStatus using the new Filters parameter.
    • service/cognito-idp: Updates service API

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • storage/s3: Consider updating to AWS SDK v2

    aws-sdk-go-v2 promises to be a more idiomatic and lighter version of AWS SDK. One should investigate if the promises hold true (and maybe we won't need #117 at all).

    Refer to https://aws.amazon.com/blogs/developer/aws-sdk-for-go-version-2-general-availability/ for details.

    opened by narqo 0
  • storage/clickhouse: Tests failing with ClickHouse 20.12

    See ClickHouse's changelog.

    Tests fail with the stacktrace below when run against ClickHouse 20.12

                --- FAIL: TestStorage/Reader/TestFindProfileIDs/by_service-type (0.01s)
                    suite.go:132:
                        	Error Trace:	suite.go:132
                        	Error:      	Received unexpected error:
                        	            	code: 386, message: There is no supertype for types Enum8('cpu' = 1, 'heap' = 2, 'block' = 3, 'mutex' = 4, 'goroutine' = 5, 'threadcreate' = 6, 'other' = 100), UInt8 because some of them are numbers and some of them are not
                        	Test:       	TestStorage/Reader/TestFindProfileIDs/by_service-type
    
    opened by narqo 1
  • storage/s3: Investigate lighter alternatives for aws-sdk-go

    Even though storage/s3 doesn't use anything more complicated than the S3 client, the dependency on aws-sdk-go adds 5 MB to the resulting 20 MB binary. More lightweight alternative S3 clients exist and are worth investigating.

    The one that looks most promising is minio-go.

    opened by narqo 0
  • Consider splitting index and data storages

    As it stands, each storage implementation (Badger, S3, ClickHouse) is self-contained. Even though that works fine for now, there could be a case where a user wants to separate the data and the index.

    One example of such a case is when a user wants to store the profiling data as blobs in an object storage (S3, GCS, Ceph), while keeping the index in the collector's local or an external storage (Badger, Postgres). That would allow introducing an API for more complicated queries when searching the profiles. E.g. there isn't an obvious and (cost-)efficient way to provide an API that returns the existing labels from data stored on S3.

    The obvious downside is that this will make the overall system more complicated. Maintaining two data stores within a single process can easily lead to data inconsistencies.
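
    Purely as a sketch of the split being discussed (all names here are hypothetical, not an actual proposal from the issue), the two concerns could be expressed as separate interfaces:

    package storage

    import (
    	"context"
    	"io"
    	"time"
    )

    // ProfileMeta is a hypothetical description of a stored profile.
    type ProfileMeta struct {
    	ID        string
    	Service   string
    	Type      string
    	Labels    map[string]string
    	CreatedAt time.Time
    }

    // BlobStore would hold the raw profiling data as opaque blobs
    // (e.g. S3, GCS, Ceph).
    type BlobStore interface {
    	Put(ctx context.Context, id string, r io.Reader) error
    	Get(ctx context.Context, id string) (io.ReadCloser, error)
    }

    // Index would hold the searchable metadata
    // (e.g. Badger, Postgres, ClickHouse).
    type Index interface {
    	Add(ctx context.Context, meta ProfileMeta) error
    	Find(ctx context.Context, service string, from, to time.Time, labels map[string]string) ([]ProfileMeta, error)
    }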

    question 
    opened by narqo 0
  • Rethink profiling types

    The profiling types that profefe-collector accepts in API requests form a predefined list. That made sense in earlier versions, when profefe stored different profile types in dedicated Postgres tables, with a different schema per type.

    Currently, all types (except "trace") are treated equally in all implementations of Storage.

    Because the list is predefined, using profefe with languages that expose different types can be confusing. For example, the profiler for Node.js provides profiling data for "Wall time" and "Heap". It's not clear to a user what profiling type they should use when sending the profiles to profefe-collector ("CPU time" shows CPU usage, but "Wall time" shows total time). The general suggestion has always been to use "other" when in doubt, but there could be a better way to handle this.

    A possible solution to explore is to suggest annotating the profiles with a "special" label, indicating the actual type of the profile. That is

    /api/profiles?service=myapp&type=other&labels=lang=nodejs,profile_type=wall_time
    
    question 
    opened by narqo 0