CNCF Jaeger, a Distributed Tracing Platform

Overview

Slack chat Project+Community stats OpenTracing-1.0 Mentioned in Awesome Go Unit Tests Coverage Status FOSSA Status CII Best Practices

Jaeger - a Distributed Tracing System

Jaeger, inspired by Dapper and OpenZipkin, is a distributed tracing platform created by Uber Technologies and donated to Cloud Native Computing Foundation. It can be used for monitoring microservices-based distributed systems:

  • Distributed context propagation
  • Distributed transaction monitoring
  • Root cause analysis
  • Service dependency analysis
  • Performance / latency optimization

See also:

Jaeger is hosted by the Cloud Native Computing Foundation (CNCF) as the 7th top-level project (graduated in October 2019). If you are a company that wants to help shape the evolution of technologies that are container-packaged, dynamically-scheduled and microservices-oriented, consider joining the CNCF. For details about who's involved and how Jaeger plays a role, read the CNCF Jaeger incubation announcement and Jaeger graduation announcement.

Get Involved

Jaeger is an open source project with open governance. We welcome contributions from the community, and we’d love your help to improve and extend the project. Here are some ideas for how to get involved. Many of them don’t even require any coding.

Features

High Scalability

Jaeger backend is designed to have no single points of failure and to scale with the business needs. For example, any given Jaeger installation at Uber is typically processing several billions of spans per day.

Native support for OpenTracing

Jaeger backend, Web UI, and instrumentation libraries have been designed from the ground up to support the OpenTracing standard.

  • Represent traces as directed acyclic graphs (not just trees) via span references
  • Support strongly typed span tags and structured logs
  • Support general distributed context propagation mechanism via baggage

OpenTelemetry

On 28-May-2019, the OpenTracing and OpenCensus projects announced their intention to merge into a new CNCF project called OpenTelemetry. The Jaeger and OpenTelemetry projects have different goals. OpenTelemetry aims to provide APIs and SDKs in multiple languages to allow applications to export various telemetry data out of the process, to any number of metrics and tracing backends. The Jaeger project is primarily the tracing backend that receives tracing telemetry data and provides processing, aggregation, data mining, and visualizations of that data. The Jaeger client libraries do overlap with OpenTelemetry in functionality. OpenTelemetry will natively support Jaeger as a tracing backend and eventually might make Jaeger native clients unnecessary. For more information please refer to a blog post Jaeger and OpenTelemetry.

Multiple storage backends

Jaeger supports two popular open source NoSQL databases as trace storage backends: Cassandra and Elasticsearch. There is also embedded database support using Badger. There are ongoing community experiments using other databases, such as ScyllaDB, InfluxDB, Amazon DynamoDB. Jaeger also ships with a simple in-memory storage for testing setups.

Modern Web UI

Jaeger Web UI is implemented in Javascript using popular open source frameworks like React. Several performance improvements have been released in v1.0 to allow the UI to efficiently deal with large volumes of data and to display traces with tens of thousands of spans (e.g. we tried a trace with 80,000 spans).

Cloud Native Deployment

Jaeger backend is distributed as a collection of Docker images. The binaries support various configuration methods, including command line options, environment variables, and configuration files in multiple formats (yaml, toml, etc.) Deployment to Kubernetes clusters is assisted by Kubernetes templates and a Helm chart.

Observability

All Jaeger backend components expose Prometheus metrics by default (other metrics backends are also supported). Logs are written to standard out using the structured logging library zap.

Security

Third-party security audits of Jaeger are available in https://github.com/jaegertracing/security-audits. Please see Issue #1718 for the summary of available security mechanisms in Jaeger.

Backwards compatibility with Zipkin

Although we recommend instrumenting applications with OpenTracing API and binding to Jaeger client libraries to benefit from advanced features not available elsewhere, if your organization has already invested in the instrumentation using Zipkin libraries, you do not have to rewrite all that code. Jaeger provides backwards compatibility with Zipkin by accepting spans in Zipkin formats (Thrift or JSON v1/v2) over HTTP. Switching from Zipkin backend is just a matter of routing the traffic from Zipkin libraries to the Jaeger backend.

Related Repositories

Documentation

Instrumentation Libraries

Deployment

Components

Building From Source

See CONTRIBUTING.

Contributing

See CONTRIBUTING.

Thanks to all the people who already contributed!

Maintainers

Rules for becoming a maintainer are defined in the GOVERNANCE document. Below are the official maintainers of the Jaeger project. Please use @jaegertracing/jaeger-maintainers to tag them on issues / PRs.

Some repositories under jaegertracing org have additional maintainers.

Emeritus Maintainers

We are grateful to our former maintainers for their contributions to the Jaeger project.

Project Status Bi-Weekly Meeting

The Jaeger contributors meet bi-weekly, and everyone is welcome to join. Agenda and meeting details here.

Roadmap

See https://www.jaegertracing.io/docs/roadmap/

Get in Touch

Have questions, suggestions, bug reports? Reach the project community via these channels:

Adopters

Jaeger as a product consists of multiple components. We want to support different types of users, whether they are only using our instrumentation libraries or full end to end Jaeger installation, whether it runs in production or you use it to troubleshoot issues in development.

Please see ADOPTERS.md for some of the organizations using Jaeger today. If you would like to add your organization to the list, please comment on our survey issue.

License

Apache 2.0 License.

Comments
  • ClickHouse as a storage backend

    ClickHouse as a storage backend

    ClickHouse, an open-source column-oriented DBMS designed initially for real-time analytics and mostly write-once/read-many big-data use cases, can be used as a very efficient log and trace storage.

    Meta issue: #638 Additional storage backends

    enhancement area/storage 
    opened by sboisson 61
  • jaeger-agent reproducible memory leak in 1.21.0

    jaeger-agent reproducible memory leak in 1.21.0

    Describe the bug I am observing very high and rapidly increasing memory usage of jaeger-agent which may be a memory leak. Eventually the agent (container) may run out of memory and crash.

    I am able to reproduce the behavior reliably. It is happening at very low span rates of <= 30 or 50 Spans/sec according to jaeger_collector_spans_received_total

    I am using a dev setup running Demo ASP.NET Core Webservices, using opentelemetry-dotnet for instrumentation. Since these are dummy projects in a home lab environment, I am able to provide the full source code of the .NET solution if necessary.

    Possible Cause & Fix

    https://github.com/open-telemetry/opentelemetry-dotnet/issues/1372 It looks like this problem can be fixed by by using MaxPayloadSizeInBytes = 65000; which was the default until mid september.

    Is this memory consumption by jaeger-agent expected behavior if a client library misbehaves? Or is this something the jaeger team would like to investigate?


    I am observing this behavior running jaeger-all-in-one native on windows. or in a linux container on DockerDesktop WSL2, or in a Linux Hyper-V VM. At first I was using and blaming badger local storage. I then switched to elasticsearch storage. I have no split up to running agent, collector and query containers on WSL 2 so I can pinpoint the memory usage to agent.

    Agent is currently not on localhost where instrumented client application is running, but I tried this also and the issue happend too. Will try this again now that I am no longer using all in one image.

    The issue does not seem to occur under very light load. I am curl'ing my services to generate spans. At first memory is stable and low. Then I started curl'ing in 10 parallel loops increasing span creation rate.

    After some minutes agents memory jumps from < 50 MB to > 2GB and then > 5 GB. The container has currently a hard memory limit of mem_limit: 8000m.

    grafik At the current moment it sites "stable" at 4.6 GB but I have seen it go beyond 8 GB as well.

    A symptom or maybe the cause for this are log errors starting to appear in agent logs. While running up to about 3 curl loops there are no log messages. A litte more requests and these start trickling in:

    jaeger-agent        | {"level":"error","ts":1605909716.7062023,"caller":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: error reading list begin: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    jaeger-agent        | {"level":"error","ts":1605909751.7633333,"caller":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 2 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    jaeger-agent        | {"level":"error","ts":1605909761.783476,"caller":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: Required field TraceIdLow is not set","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}      
    jaeger-agent        | {"level":"error","ts":1605909771.80936,"caller":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading 
    struct: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    jaeger-agent        | {"level":"error","ts":1605909791.8287015,"caller":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 8 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    

    These 5 log errors correlate with metric

    jaeger_agent_thrift_udp_t_processor_handler_errors_total{protocol="compact",model="jaeger"}
    

    grafik

    I am not sure if the instrumentation library is to blame and this is a concurrency issue there. If requests are reduced, no more log errors are happening, memory is stable (16 MB).

    When increasing request load error rate increases again, and at some point memory jumps a few gigabytes:

    grafik

    I took 300 log lines and de-duplicated them a bit to these 50:

    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 6 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: Required field TraceIdLow is not set","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Log error reading struct: Required field Timestamp is not set","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: Invalid data length","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: error reading list begin: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 8 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 9 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 8 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 11 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: Unknown data type 0","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 8 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 2 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 2 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 6 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 23 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 12 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 15 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: EOF","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 22 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 6 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.SpanRef error reading struct: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.SpanRef error reading struct: *jaeger.SpanRef field 6 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field -24 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 25 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 12 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 20 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 33 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 14 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 5 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: error reading list begin: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.SpanRef error reading struct: *jaeger.SpanRef field 5 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 10 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 13 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 12 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 13 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.SpanRef error reading struct: Required field RefType is not set","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 4 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 10 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 25 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 16 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 13 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 5 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 18 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 11 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Log error reading struct: *jaeger.Log field 9 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 19 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 25 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 22 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    

    To Reproduce Steps to reproduce the behavior:

    1. Run ASP.NET Core 3.1 Demo Solution using opentelemetry instrumentation
    2. Create load to generate spans
    3. jaeger-agent memory increase by some gigabyte and agent may crash.

    If the logs / other steps are not enough I should be able to provide you with a ready to run docker-compose setup, but it will take me some time.

    Expected behavior jaeger-agent should not consume this much memory with such little load.

    Screenshots See above.

    Version (please complete the following information):

    • OS: Windows 10 2020H2 + Docker Desktop 2.5.0.1 on WSL 2` or natively on windows.
    • Jaeger version: 1.20.0 and 1.20.1
    • Deployment: agent on bare metal windows, or WSL2 linux container, or linux vm. Agent is currently not on localhost where instrumented client application is running, but I tried this also and the issue happend too.
    • opentelemetry-dotnet version 1.0.0-RC1 or 0.8.0-beta

    What troubleshooting steps did you try? Try to follow https://www.jaegertracing.io/docs/latest/troubleshooting/ and describe how far you were able to progress and/or which steps did not work.

    I did enable debug logging for agent and collector, files attached. For this run, agent memory did only increase up until 1.9 GB. jaeger-agent-debug.log jaeger-collector-debug.log

    Not sure which other steps would apply. I am no go dev, so using debug images would be of no use.

    Additional context

    bug performance needs-info 
    opened by Mario-Hofstaetter 56
  • Sorted key/value store (badger) backed storage plugin

    Sorted key/value store (badger) backed storage plugin

    This is a storage plugin for Jaeger implemented against a sorted K/V store. Although this version was done against Badger it should work with relatively small changes against any lexicographically sorted K/V store (RocksDB is such a possibility also - but it would require cgo for compiling).

    This is WIP, pushed for early feedback. It is missing implementation for Duration index (range index scanning) as well as GetOperations & GetServices interfaces and benchmarketing/more tests of course. Some smaller TODO parts obviously remain as well, some for easier development purposes and some just lacking optimization (not to mention splitting some parts to separate functions).

    cc @pavolloffay

    enhancement review area/storage 
    opened by burmanm 54
  • How to support plugins

    How to support plugins

    We continue to be asked if we can "support X as storage backend" (e.g. #331, #421). Provided that the authors are willing to contribute, maintain, and support such backend implementations, we still have an open question of whether we want to accept those contributions into the main jaeger repository. I could be wrong, by my initial reaction is that we should keep them in separate "contrib" style repositories, for the following reasons:

    1. having half a dozen implementations is going to bloat the size of the binary, increase compile / testing time
    2. having them in core repo suggests that they are officially supported, same as Cassandra/ES, but we don't have expertise in all those different storage solutions, and cannot be on the hook to support them

    If we do keep them in the contrib repos, however, we need an approach to allow end users to use those implementations without having to rebuild the backend from source.

    One such approach is using Go plugins (https://golang.org/pkg/plugin/), for example as done in Kanali. I think it is feasible to package plugins as individual containers and mount them into a shared volume where the main Jaeger binaries can locate and load them.

    cc @pavolloffay @black-adder - any thoughts?

    Update 12/28/2017

    An alternative approach mentioned in the comments below is the sidecar plugin model (e.g. https://github.com/hashicorp/go-plugin) where the plugin runs as a separate process and the main binary communicates with it via RPC, e.g. gRPC streams. It's worth noting, however, that this approach is still a special case of in-process plugin model, so we need to start there and answer the questions below. For each type of plugin we can support a built-in "sidecar" variant.

    Update 8/2/2018

    Per https://github.com/jaegertracing/jaeger/issues/422#issuecomment-410129850, I think the following is a reasonable, actionable, and realistic plan. If someone wants to volunteer, ping me.

    • [ ] define protobuf version of SpanWriter and SpanReader interfaces
    • [ ] implement gRPC client and server, where server delegates to the respective storage.SpanReader/Writer interfaces, and client implements them.
    • [ ] extend storage factory with two new types, e.g. h-plugin (h for harshicorp) and g-plugin (plain gRPC). The h-plugin should support a cli flag for the name of the plugin executable. The g-plugin should support a cli flag for grpc server host:port.
    • [ ] implement in-memory storage as g-plugin using gRPC client/server defined above
    • [ ] implement one of the other storage types (Cassandra or ES) as h-plugin as a template.
    • [ ] TBD: how to pass configuration to the h-plugins. Because plugins are plain executables, they can use viper just like the main binaries, and the cli flag with the plugin command line might be a long string (or the user can pass params via env vars). We probably should provide a template for main, so that the actual main for a plugin is very short.
    • [ ] update documentation with example of building an h-plugin.
    • [ ] replace Cassandra with in-memory shared service in crossdock integration test.

    Update 9/4/2018

    Someone pointed out that Go's pkg/plugin now supports MacOS and Linux. This removes a significant development hurdle with using the native plugins, and makes it a viable option which is probably simpler to implement than the gRPC-based harshicorp model.

    enhancement help wanted roadmap 
    opened by yurishkuro 51
  • Use cobra/viper to manage configuration

    Use cobra/viper to manage configuration

    Fixes: https://github.com/uber/jaeger/issues/233

    • [x] fix names of binaries in Use statement (if necessary)
    • [x] export flag names to constants
    • [x] default configuration should be directly submitted to viper not flags
    opened by pavolloffay 48
  • distirbution of traces/span  amongst collector

    distirbution of traces/span amongst collector

    Requirement - what kind of business use case are you trying to solve?

    Are collector load balanced ?

    Problem - what in Jaeger blocks you from solving the requirement?

    We have our jaegertracing setup working with back end configured as elastic search. Currently we have two collector replica set up . There are 5-10 services which sends traces to the collector ( the number of services , keep changing ) . I see collectors are not evenly loaded with traffic. One collector reaches to the max queue usage where as other collector is hardly using 20-30% capacity . This causes the drop from the collector which is loaded to the capacity .
    Can we load balance the traffic (spans) amongst the both collector ? I am not sure if there is any config and i am missing it.

    Proposal - what do you suggest to solve the problem or improve the existing situation?

    Any open questions to address

    question 
    opened by prana24 47
  • Memory peaks on agent v1.19.2

    Memory peaks on agent v1.19.2

    Describe the bug One group of jaeger agents (3 replicas), deployed on kubernetes. Not so loaded system - 500-700 spans per second. Used v1.13.1 for year without any issues, upgraded all componenet to 1.19.2 (also replaced Tchannel with gRPC) and agent memory usage become unstable. Before upgrade agent instances used as low as 16 mb with 64 limit, but after upgrade memory usage peaks appeared and agent got oomkilled. I raised limits, but even 512 mb is not enough.

    After few hours and tens of oomkills, I downgraded agent to 1.18.1 version and see no issues so far.

    To Reproduce Steps to reproduce the behavior:

    1. Deploy agent v1.19.2
    2. Send 500-700 spans per second
    3. ???
    4. Observe memory usage peaks.

    Expected behavior Agent memory usage close to lower versions.

    Screenshots image

    Version (please complete the following information):

    • OS: Kubernetes nodes based on Ubuntu 20.04 LTS, 5.4.0-39 kernel
    • Jaeger version: 1.19.2
    • Deployment: Kubernetes

    What troubleshooting steps did you try? Collected /metrics, will provide if needed.

    Additional context

    bug performance needs-info 
    opened by zigmund 44
  • Add support for ES index aliases / rollover to the dependency store (Resolves #2143)

    Add support for ES index aliases / rollover to the dependency store (Resolves #2143)

    Which problem is this PR solving?

    Currently the dependency store has no support for index aliases / rollover indices. Resolves https://github.com/jaegertracing/jaeger/issues/2143#

    Short description of the changes

    Use the already existing and used config variable UseReadWriteAliases to switch between the current behavior and using non-dated "-read" and "-write" index names. This aligns with the behavior that is already in place for spans and services.

    storage/elasticsearch 
    opened by frittentheke 43
  • [tracking issue] Client libraries in different languages

    [tracking issue] Client libraries in different languages

    This issue tracks various implementations of tracing clients in different languages, either already available or under development.

    • [x] Java https://github.com/uber/jaeger-client-java
    • [x] Go https://github.com/uber/jaeger-client-go
    • [x] Python https://github.com/uber/jaeger-client-python
    • [x] Node.js https://github.com/uber/jaeger-client-node
    • [x] C/C++ (https://github.com/jaegertracing/cpp-client)
    • [x] C# / dotNET (https://github.com/jaegertracing/jaeger-client-csharp)
    • [ ] Javascript (in browser) (https://github.com/jaegertracing/jaeger-client-javascript/issues/1 - help wanted)
    • [ ] Java on Android (#577 - help wanted)
    • [ ] iOS client (#576 - help wanted)
    • [ ] Ruby / Rails (#268 - help wanted)
    • [ ] PHP (#211 - help wanted)
    • [ ] Lua (#898 - help wanted)

    As a guideline, the following is a rough outline for getting a new client out:

    1. setup new project, travis build, code coverage, release publishing, etc.
    2. implement basic tracer & span functionality with a simple 100% sampling & a configuration mechanism
    3. implement remote reporter with abstract sender
    4. implement UDP or HTTP sender or both
    5. implement crossdock integration tests
    6. implement other samplers (probabilistic, rate limiting)
    7. implement remote sampler that pulls sampling strategy from the agent
    8. implement adaptive sampler
    9. optionally other features like baggage whitelisting

    Steps 1-4 are needed to get a minimal viable client. Steps 5-9 could be added later.

    enhancement help wanted roadmap meta-issue 
    opened by yurishkuro 43
  • Tag based search in Jaeger UI is not finding spans when Badger storage is used.

    Tag based search in Jaeger UI is not finding spans when Badger storage is used.

    Requirement - what kind of business use case are you trying to solve?

    Tag based search in Jaeger UI is not finding spans when Badger storage is used.

    Problem - what in Jaeger blocks you from solving the requirement?

    Tag search in Jaeger UI not finding spans based on tags when badger plugin is used for storage across given time range. I have a series of traces recorded for different hosts and specifying hostname or custom tag that was done within last couple hours locates the span, but it doesn't find anything that was done earlier in the day.

    Looking through the code for the plugin it seems like it should be creating an index and storage for all the tags in the plugin and therefore searchable for a given service and operation. However selecting item from UI with correct settings doesn't find the right traces.

    Proposal - what do you suggest to solve the problem or improve the existing situation?

    I suspect the bug is in the plugin itself (either writer or reader) because when you use memory storage search works properly. I think plugin should allow searching across the full built index (scoped to time range) and find all the traces even if they are stored on disk and not just memory.

    Any open questions to address

    Just wanted to figure out if there is some undocumented behavior such as only doing search across traces in memory rather than storage or this is something that's not working and should be.

    storage/badger 
    opened by aachinfiev 42
  • Incomplete span support

    Incomplete span support

    Which problem is this PR solving?

    • This PR supports sending intermediate spans as suggested in #729
    • this PR is build upon PR #728

    Short description of the changes

    • changed all jaeger models to accept a new key incomplete
    • generated new thrift types
    • zipkin spans are always converted incomplete: false
    • backwards compatible with client, that do not support this feature (incomplete will be false)
    • changed elasticsearch mapping to support inomplete attribute
    • changed cassandra schema and reader/writer to support incomplete attribute
    opened by phal0r 40
  • [Bug]:

    [Bug]: "jaeger_collector_spans_serviceNames" metrics names should be written in 'snake_case' not 'camelCase'

    What happened?

    As per the Prometheus best practices, "metric names should be written in 'snake_case' not 'camelCase' There is a jaeger collector metric 'jaeger_collector_spans_serviceNames' which is following the camelCase in metric naming convention

    So an improvement is expected from the Jaeger side to follow the snake_case naming convention for above metric

    Steps to reproduce

    Collect the collector metrics Run the 'Promtool' which is a Prometheus command line tool that helps to lint service metrics and validate metric names, labels , meta data, and types based on Prometheus instrumentation best practices.

    Below warning is displayed for the " jaeger_collector_spans_serviceNames" metric

    Metric names should be written in snake_case and not camelCase

    Expected behavior

    Metric name should follow snake_case naming pattern

    So expected metric name is something like this "jaeger_collector_spans_service_names" instead of "jaeger_collector_spans_serviceNames"

    Relevant log output

    No response

    Screenshot

    No response

    Additional context

    No response

    Jaeger backend version

    1.38.1

    SDK

    No response

    Pipeline

    No response

    Stogage backend

    No response

    Operating system

    No response

    Deployment model

    No response

    Deployment configs

    No response

    bug 
    opened by manisha5464 0
  • [Bug]: The Anti-MIME-Sniffing header X-Content-Type-Options was not set to 'nosniff'

    [Bug]: The Anti-MIME-Sniffing header X-Content-Type-Options was not set to 'nosniff'

    What happened?

    The Anti-MIME-Sniffing header X-Content-Type-Options was not set to 'nosniff'. This allows older versions of Internet Explorer and Chrome to perform MIME-sniffing on the response body, potentially causing the response body to be interpreted and displayed as a content type other than the declared content type. Current (early 2014) and legacy versions of Firefox will use the declared content type (if one is set), rather than performing MIME-sniffing.

    Steps to reproduce

    Send a GET request to the Jaeger Collector/Query/Agent. The response specifies Content-Type, Date & Content-Length only.

    Expected behavior

    Ensure that the application/web server sets the Content-Type header appropriately, and that it sets the X-Content-Type-Options header to 'nosniff' for all web pages.

    Relevant log output

    No response

    Screenshot

    No response

    Additional context

    No response

    Jaeger backend version

    No response

    SDK

    No response

    Pipeline

    No response

    Stogage backend

    No response

    Operating system

    No response

    Deployment model

    No response

    Deployment configs

    No response

    bug 
    opened by mupjha 0
  • [Feature]: Remove gogo from jaeger query service

    [Feature]: Remove gogo from jaeger query service

    Requirement

    External users may want to call jaeger query service to retrieve their traces to do a variety of things, such as build custom trace monitoring UIs, or retrieve traces for auditing purposes. This functionality is advertised in the Jaeger docs here.

    Problem

    The jaeger query grpc api still depends on gogo proto, which is now deprecated (the last release was Jan 2021). This inconveniences the api user, as they must now use gogo as well, if they want to call the API.

    Note that in most cases, gogo is cross compatible with vanilla protobuf. However, this is not the case with jaeger query, as jaeger uses gogo's custom types for its traceIDs. These gogo docs explain that gogo custom types are not interopable with vanilla proto in several ways.

    So ultimately, the user has to use this long-deprecated gogo software, if they want to call jaeger query's grpc api. This is unfortunate.

    Proposal

    Remove gogo from jaeger.

    Open questions

    If maintainers agree that this work is needed, I would be happy to contribute. I have experience with gogo, grpc, and protobuf.

    enhancement 
    opened by gregorychen3 1
  • Restore NewGRPCHandler method

    Restore NewGRPCHandler method

    Signed-off-by: Prithvi Raj [email protected]

    Which problem is this PR solving?

    • Resolves #4082

    Short description of the changes

    • Add a NewGRPCHandler method. This method does not have the same signature as the previous method, so it is still breaking semver, but it provides a straightforward fix for users who which to customize the server.
    opened by vprithvi 1
  • [Bug]: `cmd/query/app.NewGRPCHandler` removed without alternatives

    [Bug]: `cmd/query/app.NewGRPCHandler` removed without alternatives

    Requirement

    Ability to get a GRPCHandler which can be used with a custom server

    Problem

    In #3091, it looks like app.NewGRPCHandler was removed accidentally - as this is a exported function, it breaks semver.

    Proposal

    Add app.NewGRPCHandler back.

    Open questions

    No response

    bug 
    opened by vprithvi 0
Releases(v1.40.0)
Owner
Jaeger - Distributed Tracing Platform
Jaeger - Distributed Tracing Platform
Tool for generating OpenTelemetry tracing decorators.

tracegen Tool for generating OpenTelemetry tracing decorators. Installation go get -u github.com/KazanExpress/tracegen/cmd/... Usage tracegen generate

Marketplace Technologies 5 Apr 7, 2022
Golog is a logger which support tracing and other custom behaviors out of the box. Blazing fast and simple to use.

GOLOG Golog is an opinionated Go logger with simple APIs and configurable behavior. Why another logger? Golog is designed to address mainly two issues

Damiano Petrungaro 37 Oct 2, 2022
Distributed-Log-Service - Distributed Log Service With Golang

Distributed Log Service This project is essentially a result of my attempt to un

Hamza Yusuff 6 Jun 1, 2022
Distributed Commit Log from Travis Jeffery's Distributed Services book

proglog A distributed commit log. This repository follows the book "Distributed Services with Go" by Travis Jeffrey. The official repository for this

Arindam Das 2 May 23, 2022
Distributed, lock-free, self-hosted health checks and status pages

Checkup is distributed, lock-free, self-hosted health checks and status pages, written in Go. It features an elegant, minimalistic CLI and an idiomati

Sourcegraph 3.3k Dec 30, 2022
distributed monitoring system

OWL OWL 是由国内领先的第三方数据智能服务商 TalkingData 开源的一款企业级分布式监控告警系统,目前由 Tech Operation Team 持续开发更新维护。 OWL 后台组件全部使用 Go 语言开发,Go 语言是 Google 开发的一种静态强类型、编译型、并发型,并具有垃圾回

null 826 Dec 24, 2022
Distributed simple and robust release management and monitoring system.

Agente Distributed simple and robust release management and monitoring system. **This project on going work. Road map Core system First worker agent M

StreetByters Community 30 Nov 17, 2022
A distributed logger micro-service built in go

Distributed Logger A distributed logger micro-service built in go using gRPC for internal communication with other micro-services and JSON for externa

Famous Ketoma 0 Nov 6, 2021
Nightingale - A Distributed and High-Performance Monitoring System. Prometheus enterprise edition

Introduction ?? A Distributed and High-Performance Monitoring System. Prometheus

taotao 1 Jan 7, 2022
The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.

The open-source platform for monitoring and observability. Grafana allows you to query, visualize, alert on and understand your metrics no matter wher

Grafana Labs 53.3k Jan 3, 2023
Detecctor is a ⚡ fast, fully customizable 💗 monitoring platform. It uses Telegram as a notification 📥 service

Detecctor is a ⚡ fast, fully customizable ?? monitoring platform. It uses Telegram as a notification ?? service. The main components are a TCP server, MongoDB and multiple clients.

null 2 Nov 16, 2021
Open Source Software monitoring platform tools.

ByteOpen Open Source Software monitoring platform tools. Usage Clone the repo to your own go src path cd ~/go/src git clone https://code.byted.org/inf

Ye Xia 2 Nov 21, 2021
BTFS - The First Scalable Decentralized Storage System - A Foundational Platform for Decentralized Applications

go-btfs What is BTFS? BitTorrent File System (BTFS) is a protocol forked from IPFS that utilizes the TRON network and the BitTorrent Ecosystem for int

BitTorrent Inc. 129 Jan 1, 2023
Butler - Aggregation and Alerting Platform

Welcome to Butler Table of Contents Welcome About The Project Contributing Developer Workflow Getting Started Configuration About The Project Contribu

Butler 2 Mar 1, 2022
CNCF Jaeger, a Distributed Tracing Platform

Jaeger - a Distributed Tracing System Jaeger, inspired by Dapper and OpenZipkin, is a distributed tracing platform created by Uber Technologies and do

Jaeger - Distributed Tracing Platform 16.9k Jan 2, 2023
CNCF Jaeger, a Distributed Tracing Platform

Jaeger - a Distributed Tracing System Jaeger, inspired by Dapper and OpenZipkin, is a distributed tracing platform created by Uber Technologies and do

Jaeger - Distributed Tracing Platform 16.9k Jan 3, 2023
Jaeger-s3 - Jaeger gRPC storage plugin for Amazon S3

jaeger-s3 jaeger-s3 is gRPC storage plugin for Jaeger, which uses Amazon Kinesis

Johannes Würbach 11 Dec 26, 2022
Jaeger-influxdb - The repository that contains InfluxDB Storage gRPC plugin for Jaeger

NOTICE: This repository is archived and is no longer maintained. Please use http

Rohan 0 Feb 16, 2022
Rest API to get KVB departures - Written in Go with hexagonal architecture and tracing via OpenTelemetry and Jaeger

KVB API Rest API to get upcoming departures per KVB train station Implemented in Go with hexagonal architecture and tracing via OpenTelemetry and Jaeg

Jan Ritter 3 May 7, 2022
A simple Go library for 3D ray-tracing rendering, implementing the book Ray Tracing in One Weekend.

Ray Tracing in Go A Go implementation of the book Ray Tracing in One Weekend. The repository provides a library to describe and render your own scenes

Yuto Takahashi 53 Sep 23, 2022