CNCF Jaeger, a Distributed Tracing Platform

Overview


Jaeger, inspired by Dapper and OpenZipkin, is a distributed tracing platform created by Uber Technologies and donated to the Cloud Native Computing Foundation. It can be used for monitoring microservices-based distributed systems:

  • Distributed context propagation
  • Distributed transaction monitoring
  • Root cause analysis
  • Service dependency analysis
  • Performance / latency optimization

Jaeger is hosted by the Cloud Native Computing Foundation (CNCF) as the 7th top-level project (graduated in October 2019). If you are a company that wants to help shape the evolution of technologies that are container-packaged, dynamically-scheduled and microservices-oriented, consider joining the CNCF. For details about who's involved and how Jaeger plays a role, read the CNCF Jaeger incubation announcement and Jaeger graduation announcement.

Get Involved

Jaeger is an open source project with open governance. We welcome contributions from the community, and we’d love your help to improve and extend the project. Here are some ideas for how to get involved. Many of them don’t even require any coding.

Features

High Scalability

The Jaeger backend is designed to have no single points of failure and to scale with business needs. For example, any given Jaeger installation at Uber typically processes several billion spans per day.

Native support for OpenTracing

Jaeger backend, Web UI, and instrumentation libraries have been designed from the ground up to support the OpenTracing standard.

  • Represent traces as directed acyclic graphs (not just trees) via span references
  • Support strongly typed span tags and structured logs
  • Support general distributed context propagation mechanism via baggage

OpenTelemetry

On May 28, 2019, the OpenTracing and OpenCensus projects announced their intention to merge into a new CNCF project called OpenTelemetry. The Jaeger and OpenTelemetry projects have different goals. OpenTelemetry aims to provide APIs and SDKs in multiple languages that allow applications to export various telemetry data out of the process, to any number of metrics and tracing backends. The Jaeger project is primarily the tracing backend that receives tracing telemetry data and provides processing, aggregation, data mining, and visualization of that data. The Jaeger client libraries do overlap with OpenTelemetry in functionality; OpenTelemetry will natively support Jaeger as a tracing backend and may eventually make the Jaeger native clients unnecessary. For more information, refer to the blog post Jaeger and OpenTelemetry.

Multiple storage backends

Jaeger supports two popular open source NoSQL databases as trace storage backends: Cassandra and Elasticsearch. There is also embedded database support using Badger, and there are ongoing community experiments with other databases, such as ScyllaDB, InfluxDB, and Amazon DynamoDB. Jaeger also ships with simple in-memory storage for testing setups.
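
As a sketch of how a backend is selected (the exact flags vary by version and storage plugin; check the deployment docs for your release), the storage type is typically chosen via the `SPAN_STORAGE_TYPE` environment variable:

```shell
# Elasticsearch-backed collector (flag names illustrative of the
# documented pattern; run with --help for the authoritative list):
SPAN_STORAGE_TYPE=elasticsearch ./jaeger-collector \
  --es.server-urls=http://elasticsearch:9200

# Embedded Badger storage for a single-node all-in-one setup:
SPAN_STORAGE_TYPE=badger ./jaeger-all-in-one

# In-memory storage, suitable only for testing:
SPAN_STORAGE_TYPE=memory ./jaeger-all-in-one
```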

Modern Web UI

The Jaeger Web UI is implemented in JavaScript using popular open source frameworks like React. Several performance improvements were released in v1.0 to allow the UI to efficiently deal with large volumes of data and to display traces with tens of thousands of spans (e.g., we tried a trace with 80,000 spans).

Cloud Native Deployment

The Jaeger backend is distributed as a collection of Docker images. The binaries support various configuration methods, including command line options, environment variables, and configuration files in multiple formats (YAML, TOML, etc.). Deployment to Kubernetes clusters is assisted by Kubernetes templates and a Helm chart.
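
To illustrate the three configuration methods with one option (the flag name below is a real collector option in recent releases, but verify against `--help` for your version):

```shell
# 1. Command line flag:
./jaeger-collector --collector.queue-size=4000

# 2. Environment variable: flags map to env vars by upper-casing
#    and replacing punctuation with underscores:
COLLECTOR_QUEUE_SIZE=4000 ./jaeger-collector

# 3. Configuration file holding the same keys:
./jaeger-collector --config-file=collector.yaml
```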

Observability

All Jaeger backend components expose Prometheus metrics by default (other metrics backends are also supported). Logs are written to standard out using the structured logging library zap.
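
For example, each component serves its metrics on its admin HTTP port; the port numbers below reflect common defaults (14269 for the collector) but may differ between versions:

```shell
# Scrape the collector's Prometheus endpoint and inspect span intake:
curl -s http://localhost:14269/metrics | grep jaeger_collector_spans_received
```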

Security

Third-party security audits of Jaeger are available at https://github.com/jaegertracing/security-audits. Please see Issue #1718 for a summary of the security mechanisms available in Jaeger.

Backwards compatibility with Zipkin

We recommend instrumenting applications with the OpenTracing API and binding to the Jaeger client libraries to benefit from advanced features not available elsewhere. However, if your organization has already invested in instrumentation using Zipkin libraries, you do not have to rewrite all that code: Jaeger provides backwards compatibility with Zipkin by accepting spans in Zipkin formats (Thrift or JSON v1/v2) over HTTP. Switching from a Zipkin backend is just a matter of routing the traffic from the Zipkin libraries to the Jaeger backend.
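
As a sketch, with the Zipkin receiver enabled on the collector (the flag spelling has changed across versions, e.g. `--collector.zipkin.host-port=:9411` in recent releases), Zipkin JSON v2 spans are accepted on the usual Zipkin endpoint:

```shell
# Post a single Zipkin JSON v2 span to the Jaeger collector:
curl -X POST http://localhost:9411/api/v2/spans \
  -H 'Content-Type: application/json' \
  -d '[{"id":"352bff9a74ca9ad2","traceId":"5af7183fb1d4cf5f","name":"get",
        "timestamp":1556604172355737,"duration":1431,
        "localEndpoint":{"serviceName":"frontend"}}]'
```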

Related Repositories

Documentation

Instrumentation Libraries

Deployment

Components

Building From Source

See CONTRIBUTING.

Contributing

See CONTRIBUTING.

Maintainers

Rules for becoming a maintainer are defined in the GOVERNANCE document. Below are the official maintainers of the Jaeger project. Please use @jaegertracing/jaeger-maintainers to tag them on issues / PRs.

Some repositories under jaegertracing org have additional maintainers.

Emeritus Maintainers

We are grateful to our former maintainers for their contributions to the Jaeger project.

Project Status Bi-Weekly Meeting

The Jaeger contributors meet bi-weekly, and everyone is welcome to join. Agenda and meeting details here.

Roadmap

See https://www.jaegertracing.io/docs/roadmap/

Questions, Discussions, Bug Reports

Reach project contributors via these channels:

Adopters

Jaeger as a product consists of multiple components. We want to support different types of users, whether they use only our instrumentation libraries or a full end-to-end Jaeger installation, and whether it runs in production or is used to troubleshoot issues in development.

Please see ADOPTERS.md for some of the organizations using Jaeger today. If you would like to add your organization to the list, please comment on our survey issue.

License

Apache 2.0 License.

Issues
  • ClickHouse as a storage backend

    ClickHouse, an open-source column-oriented DBMS designed initially for real-time analytics and mostly write-once/read-many big-data use cases, can be used as a very efficient log and trace storage.

    Meta issue: #638 Additional storage backends

    Labels: enhancement, area/storage. Opened by sboisson; 61 comments.
  • jaeger-agent reproducible memory leak in 1.21.0

    Describe the bug: I am observing very high and rapidly increasing memory usage of jaeger-agent, which may be a memory leak. Eventually the agent (container) may run out of memory and crash.

    I am able to reproduce the behavior reliably. It happens at very low span rates of <= 30 or 50 spans/sec, according to jaeger_collector_spans_received_total.

    I am using a dev setup running Demo ASP.NET Core Webservices, using opentelemetry-dotnet for instrumentation. Since these are dummy projects in a home lab environment, I am able to provide the full source code of the .NET solution if necessary.

    Possible Cause & Fix

    https://github.com/open-telemetry/opentelemetry-dotnet/issues/1372. It looks like this problem can be fixed by using MaxPayloadSizeInBytes = 65000, which was the default until mid-September.

    Is this memory consumption by jaeger-agent expected behavior if a client library misbehaves? Or is this something the jaeger team would like to investigate?


    I am observing this behavior running jaeger-all-in-one natively on Windows, in a Linux container on Docker Desktop (WSL 2), and in a Linux Hyper-V VM. At first I was using, and blaming, Badger local storage; I then switched to Elasticsearch storage. I have now split up into separate agent, collector, and query containers on WSL 2, so I can pinpoint the memory usage to the agent.

    The agent is currently not on localhost where the instrumented client application is running, but I tried that as well and the issue happened too. I will try this again now that I am no longer using the all-in-one image.

    The issue does not seem to occur under very light load. I am curl'ing my services to generate spans. At first, memory is stable and low. Then I started curl'ing in 10 parallel loops, increasing the span creation rate.

    After some minutes the agent's memory jumps from < 50 MB to > 2 GB and then > 5 GB. The container currently has a hard memory limit of mem_limit: 8000m.

    [screenshot] At the current moment it sits "stable" at 4.6 GB, but I have seen it go beyond 8 GB as well.

    A symptom, or maybe the cause, of this are log errors starting to appear in the agent logs. While running up to about 3 curl loops there are no log messages. With a few more requests, these start trickling in:

    jaeger-agent        | {"level":"error","ts":1605909716.7062023,"caller":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: error reading list begin: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    jaeger-agent        | {"level":"error","ts":1605909751.7633333,"caller":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 2 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    jaeger-agent        | {"level":"error","ts":1605909761.783476,"caller":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: Required field TraceIdLow is not set","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}      
    jaeger-agent        | {"level":"error","ts":1605909771.80936,"caller":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    jaeger-agent        | {"level":"error","ts":1605909791.8287015,"caller":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 8 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    

    These 5 log errors correlate with the metric:

    jaeger_agent_thrift_udp_t_processor_handler_errors_total{protocol="compact",model="jaeger"}
    

    [screenshot]

    I am not sure if the instrumentation library is to blame and this is a concurrency issue there. When requests are reduced, no more log errors occur and memory is stable (16 MB).

    When increasing the request load, the error rate increases again, and at some point memory jumps by a few gigabytes:

    [screenshot]

    I took 300 log lines and de-duplicated them a bit to these 50:

    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 6 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: Required field TraceIdLow is not set","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Log error reading struct: Required field Timestamp is not set","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: Invalid data length","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: error reading list begin: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 8 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 9 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 8 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 11 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: Unknown data type 0","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 8 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 2 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 2 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 6 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 23 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 12 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 15 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: EOF","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 22 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 6 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.SpanRef error reading struct: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.SpanRef error reading struct: *jaeger.SpanRef field 6 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field -24 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 25 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 12 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 20 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 33 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 14 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 5 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: error reading list begin: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.SpanRef error reading struct: *jaeger.SpanRef field 5 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 10 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 13 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 12 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 13 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.SpanRef error reading struct: Required field RefType is not set","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 4 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 10 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 25 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 16 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 13 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 5 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 18 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 11 read error: don't know what type: 15","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Log error reading struct: *jaeger.Log field 9 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 19 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 25 read error: don't know what type: 13","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    ":"processors/thrift_processor.go:123","msg":"Processor failed","error":"*jaeger.Batch error reading struct: *jaeger.Span error reading struct: *jaeger.Span field 22 read error: don't know what type: 14","stacktrace":"github.com/jaegertracing/jaeger/cmd/agent/app/processors.(*ThriftProcessor).processBuffer\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:123\ngithub.com/jaegertracing/jaeger/cmd/agent/app/processors.NewThriftProcessor.func2\n\tgithub.com/jaegertracing/jaeger/cmd/agent/app/processors/thrift_processor.go:87"}
    

    To Reproduce: Steps to reproduce the behavior:

    1. Run the ASP.NET Core 3.1 demo solution using OpenTelemetry instrumentation
    2. Create load to generate spans
    3. jaeger-agent memory increases by several gigabytes and the agent may crash.

    If the logs / other steps are not enough, I should be able to provide you with a ready-to-run docker-compose setup, but it will take me some time.

    Expected behavior: jaeger-agent should not consume this much memory under so little load.

    Screenshots: See above.

    Version (please complete the following information):

    • OS: Windows 10 2020H2 + Docker Desktop 2.5.0.1 on WSL 2, or natively on Windows.
    • Jaeger version: 1.20.0 and 1.20.1
    • Deployment: agent on bare-metal Windows, or a WSL2 Linux container, or a Linux VM. The agent is currently not on the same host as the instrumented client application, but I tried that as well and the issue happened too.
    • opentelemetry-dotnet version 1.0.0-RC1 or 0.8.0-beta

    What troubleshooting steps did you try? Try to follow https://www.jaegertracing.io/docs/latest/troubleshooting/ and describe how far you were able to progress and/or which steps did not work.

    I did enable debug logging for agent and collector, files attached. For this run, agent memory only increased to 1.9 GB. jaeger-agent-debug.log jaeger-collector-debug.log

    Not sure which other steps would apply. I am not a Go dev, so using debug images would be of no use.

    Additional context

    bug performance needs-info 
    opened by Mario-Hofstaetter 56
  • Sorted key/value store (badger) backed storage plugin

    Sorted key/value store (badger) backed storage plugin

    This is a storage plugin for Jaeger implemented against a sorted K/V store. Although this version was done against Badger, it should work with relatively small changes against any lexicographically sorted K/V store (RocksDB is also a possibility, but it would require cgo for compiling).

    This is WIP, pushed for early feedback. It is missing the implementation of the Duration index (range index scanning) as well as the GetOperations & GetServices interfaces, and benchmarking/more tests of course. Some smaller TODO parts obviously remain as well, some for easier development purposes and some just lacking optimization (not to mention splitting some parts into separate functions).

    cc @pavolloffay

    enhancement review area/storage 
    opened by burmanm 54
  • How to support plugins

    How to support plugins

    We continue to be asked if we can "support X as storage backend" (e.g. #331, #421). Provided that the authors are willing to contribute, maintain, and support such backend implementations, we still have an open question of whether we want to accept those contributions into the main jaeger repository. I could be wrong, but my initial reaction is that we should keep them in separate "contrib"-style repositories, for the following reasons:

    1. having half a dozen implementations is going to bloat the size of the binary, increase compile / testing time
    2. having them in core repo suggests that they are officially supported, same as Cassandra/ES, but we don't have expertise in all those different storage solutions, and cannot be on the hook to support them

    If we do keep them in the contrib repos, however, we need an approach to allow end users to use those implementations without having to rebuild the backend from source.

    One such approach is using Go plugins (https://golang.org/pkg/plugin/), for example as done in Kanali. I think it is feasible to package plugins as individual containers and mount them into a shared volume where the main Jaeger binaries can locate and load them.
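
    A minimal sketch of that first approach, loading a shared object from a mounted volume with Go's standard library plugin package (Linux/macOS only; the path and the NewFactory symbol name are hypothetical, not an actual Jaeger plugin contract):

```go
package main

import (
	"fmt"
	"plugin"
)

// loadStorageFactory sketches how a main binary might locate and load a
// storage plugin from a shared volume. "NewFactory" is a hypothetical
// exported symbol; a real plugin contract would pin down its name and type.
func loadStorageFactory(path string) (func() interface{}, error) {
	p, err := plugin.Open(path) // dlopen the shared object
	if err != nil {
		return nil, err
	}
	sym, err := p.Lookup("NewFactory")
	if err != nil {
		return nil, err
	}
	fn, ok := sym.(func() interface{})
	if !ok {
		return nil, fmt.Errorf("NewFactory has unexpected type %T", sym)
	}
	return fn, nil
}

func main() {
	// Without a compiled .so in the shared volume this simply reports an error,
	// and the main binary can fall back to its built-in storage types.
	if _, err := loadStorageFactory("/plugins/storage.so"); err != nil {
		fmt.Println("no loadable plugin:", err)
	}
}
```

    The main caveat of stdlib plugins is that the plugin and the host binary must be built with the same toolchain and dependency versions, which complicates distributing plugins as independent containers.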

    cc @pavolloffay @black-adder - any thoughts?

    Update 12/28/2017

    An alternative approach mentioned in the comments below is the sidecar plugin model (e.g. https://github.com/hashicorp/go-plugin) where the plugin runs as a separate process and the main binary communicates with it via RPC, e.g. gRPC streams. It's worth noting, however, that this approach is still a special case of the in-process plugin model, so we need to start there and answer the questions below. For each type of plugin we can support a built-in "sidecar" variant.

    Update 8/2/2018

    Per https://github.com/jaegertracing/jaeger/issues/422#issuecomment-410129850, I think the following is a reasonable, actionable, and realistic plan. If someone wants to volunteer, ping me.

    • [ ] define protobuf version of SpanWriter and SpanReader interfaces
    • [ ] implement gRPC client and server, where server delegates to the respective storage.SpanReader/Writer interfaces, and client implements them.
    • [ ] extend storage factory with two new types, e.g. h-plugin (h for hashicorp) and g-plugin (plain gRPC). The h-plugin should support a cli flag for the name of the plugin executable. The g-plugin should support a cli flag for the gRPC server host:port.
    • [ ] implement in-memory storage as g-plugin using gRPC client/server defined above
    • [ ] implement one of the other storage types (Cassandra or ES) as h-plugin as a template.
    • [ ] TBD: how to pass configuration to the h-plugins. Because plugins are plain executables, they can use viper just like the main binaries, and the cli flag with the plugin command line might be a long string (or the user can pass params via env vars). We probably should provide a template for main, so that the actual main for a plugin is very short.
    • [ ] update documentation with example of building an h-plugin.
    • [ ] replace Cassandra with in-memory shared service in crossdock integration test.

    Update 9/4/2018

    Someone pointed out that Go's pkg/plugin now supports macOS and Linux. This removes a significant development hurdle with using the native plugins, and makes it a viable option which is probably simpler to implement than the gRPC-based hashicorp model.

    enhancement help wanted roadmap 
    opened by yurishkuro 51
  • Use cobra/viper to manage configuration

    Use cobra/viper to manage configuration

    Fixes: https://github.com/uber/jaeger/issues/233

    • [x] fix names of binaries in Use statement (if necessary)
    • [x] export flag names to constants
    • [x] default configuration should be directly submitted to viper not flags
    opened by pavolloffay 48
  • Distribution of traces/spans amongst collectors

    Distribution of traces/spans amongst collectors

    Requirement - what kind of business use case are you trying to solve?

    Are collectors load balanced?

    Problem - what in Jaeger blocks you from solving the requirement?

    We have our Jaeger setup working with the backend configured as Elasticsearch. Currently we have two collector replicas set up. There are 5-10 services which send traces to the collectors (the number of services keeps changing). I see the collectors are not evenly loaded with traffic: one collector reaches its maximum queue usage whereas the other is hardly using 20-30% of its capacity. This causes drops from the collector that is loaded to capacity.
    Can we load balance the traffic (spans) amongst both collectors? I am not sure if there is a config for this that I am missing.

    Proposal - what do you suggest to solve the problem or improve the existing situation?

    Any open questions to address

    question 
    opened by prana24 47
  • Memory peaks on agent v1.19.2

    Memory peaks on agent v1.19.2

    Describe the bug: One group of jaeger agents (3 replicas), deployed on Kubernetes. Not a very loaded system - 500-700 spans per second. We used v1.13.1 for a year without any issues, then upgraded all components to 1.19.2 (and also replaced TChannel with gRPC), and agent memory usage became unstable. Before the upgrade, agent instances used as little as 16 MB with a 64 MB limit, but after the upgrade memory usage peaks appeared and the agent got OOMKilled. I raised the limits, but even 512 MB is not enough.

    After a few hours and tens of OOMKills, I downgraded the agent to version 1.18.1 and have seen no issues so far.

    To Reproduce: Steps to reproduce the behavior:

    1. Deploy agent v1.19.2
    2. Send 500-700 spans per second
    3. ???
    4. Observe memory usage peaks.

    Expected behavior: Agent memory usage close to that of earlier versions.

    Screenshots: (image attached)

    Version (please complete the following information):

    • OS: Kubernetes nodes based on Ubuntu 20.04 LTS, 5.4.0-39 kernel
    • Jaeger version: 1.19.2
    • Deployment: Kubernetes

    What troubleshooting steps did you try? Collected /metrics, will provide if needed.

    Additional context

    bug performance needs-info 
    opened by zigmund 44
  • Add support for ES index aliases / rollover to the dependency store (Resolves #2143)

    Add support for ES index aliases / rollover to the dependency store (Resolves #2143)

    Which problem is this PR solving?

    Currently the dependency store has no support for index aliases / rollover indices. Resolves https://github.com/jaegertracing/jaeger/issues/2143

    Short description of the changes

    Use the already existing config variable UseReadWriteAliases to switch between the current behavior and using non-dated "-read" and "-write" index names. This aligns with the behavior that is already in place for spans and services.

    storage/elasticsearch 
    opened by frittentheke 43
  • [tracking issue] Client libraries in different languages

    [tracking issue] Client libraries in different languages

    This issue tracks various implementations of tracing clients in different languages, either already available or under development.

    • [x] Java https://github.com/uber/jaeger-client-java
    • [x] Go https://github.com/uber/jaeger-client-go
    • [x] Python https://github.com/uber/jaeger-client-python
    • [x] Node.js https://github.com/uber/jaeger-client-node
    • [x] C/C++ (https://github.com/jaegertracing/cpp-client)
    • [x] C# / dotNET (https://github.com/jaegertracing/jaeger-client-csharp)
    • [ ] Javascript (in browser) (https://github.com/jaegertracing/jaeger-client-javascript/issues/1 - help wanted)
    • [ ] Java on Android (#577 - help wanted)
    • [ ] iOS client (#576 - help wanted)
    • [ ] Ruby / Rails (#268 - help wanted)
    • [ ] PHP (#211 - help wanted)
    • [ ] Lua (#898 - help wanted)

    As a guideline, the following is a rough outline for getting a new client out:

    1. setup new project, travis build, code coverage, release publishing, etc.
    2. implement basic tracer & span functionality with a simple 100% sampling & a configuration mechanism
    3. implement remote reporter with abstract sender
    4. implement UDP or HTTP sender or both
    5. implement crossdock integration tests
    6. implement other samplers (probabilistic, rate limiting)
    7. implement remote sampler that pulls sampling strategy from the agent
    8. implement adaptive sampler
    9. optionally other features like baggage whitelisting

    Steps 1-4 are needed to get a minimal viable client. Steps 5-9 could be added later.

    enhancement help wanted roadmap meta-issue 
    opened by yurishkuro 43
  • Tag based search in Jaeger UI is not finding spans when Badger storage is used.

    Tag based search in Jaeger UI is not finding spans when Badger storage is used.

    Requirement - what kind of business use case are you trying to solve?

    Tag based search in Jaeger UI is not finding spans when Badger storage is used.

    Problem - what in Jaeger blocks you from solving the requirement?

    Tag search in Jaeger UI is not finding spans based on tags across a given time range when the badger plugin is used for storage. I have a series of traces recorded for different hosts; specifying a hostname or custom tag from within the last couple of hours locates the span, but it doesn't find anything from earlier in the day.

    Looking through the code for the plugin, it seems like it should be creating an index and storage for all the tags, which should therefore be searchable for a given service and operation. However, selecting an item in the UI with correct settings doesn't find the right traces.

    Proposal - what do you suggest to solve the problem or improve the existing situation?

    I suspect the bug is in the plugin itself (either the writer or the reader), because search works properly when you use memory storage. I think the plugin should allow searching across the full built index (scoped to the time range) and find all the traces even if they are stored on disk and not just in memory.

    Any open questions to address

    Just wanted to figure out whether there is some undocumented behavior, such as only searching across traces in memory rather than in storage, or whether this is something that's not working but should be.

    storage/badger 
    opened by aachinfiev 42
  • Incomplete span support

    Incomplete span support

    Which problem is this PR solving?

    • This PR supports sending intermediate spans as suggested in #729
    • this PR is built upon PR #728

    Short description of the changes

    • changed all jaeger models to accept a new key incomplete
    • generated new thrift types
    • zipkin spans are always converted with incomplete: false
    • backwards compatible with clients that do not support this feature (incomplete will be false)
    • changed the elasticsearch mapping to support the incomplete attribute
    • changed the cassandra schema and reader/writer to support the incomplete attribute
    opened by phal0r 40
  • Bump github.com/go-openapi/swag from 0.21.1 to 0.22.0

    Bump github.com/go-openapi/swag from 0.21.1 to 0.22.0

    Bumps github.com/go-openapi/swag from 0.21.1 to 0.22.0.


    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies go 
    opened by dependabot[bot] 1
  • [Feature]: Too few metrics

    [Feature]: Too few metrics

    Requirement

    There are so few metrics that I can only monitor Jaeger services with a limited set of indicators.

    Problem

    There are so few metrics that I can only monitor Jaeger services with a limited set of indicators.

    For example: the latency of sends in jaeger-agent, the latency of jaeger-collector, the count of spans that the agent received/rejected/dropped, and the count of spans that the client created, sent, and did not send.

    Proposal

    No response

    Open questions

    No response

    enhancement 
    opened by THMAIL 3
  • Starting GRPC server 16685 connection refused

    Starting GRPC server 16685 connection refused

    What happened?

    When trying out the jaeger all-in-one image (1.29.0 & 1.37.0), I am seeing a connection refused when starting the gRPC server

    Potentially something I have missed, but I thought it better to get a bug created just in case. Maybe something else needs to be enabled when installing the all-in-one deployment via Helm. It needs to be installed via Helm for our deployment process.

    I am installing this via the Helm charts and disabling everything within a jaeger.yaml values file so only the all-in-one image is used. jaeger.yaml file:

    # Default values for jaeger.
    # This is a YAML-formatted file.
    # Jaeger values are grouped by component. Cassandra values override subchart values
    
    provisionDataStore:
      cassandra: false
      elasticsearch: false
      kafka: false
    
    # Overrides the image tag where default is the chart appVersion.
    tag: ""
    
    nameOverride: ""
    fullnameOverride: ""
    
    allInOne:
      enabled: true
      image: jaegertracing/all-in-one
      tag: 1.37.0
      pullPolicy: IfNotPresent
      extraEnv: []
      # samplingConfig: |-
      #   {
      #     "default_strategy": {
      #       "type": "probabilistic",
      #       "param": 1
      #     }
      #   }
      ingress:
        enabled: true
      # resources:
      #   limits:
      #     cpu: 500m
      #     memory: 512Mi
      #   requests:
      #     cpu: 256m
      #     memory: 128Mi
    
    storage:
      # allowed values (cassandra, elasticsearch)
      type: cassandra
      cassandra:
        host: cassandra
        port: 9042
        tls:
          enabled: false
          secretName: cassandra-tls-secret
        user: user
        usePassword: true
        password: password
        keyspace: jaeger_v1_test
        ## Use existing secret (ignores previous password)
        # existingSecret:
        ## Cassandra related env vars to be configured on the concerned components
        extraEnv: []
          # - name: CASSANDRA_SERVERS
          #   value: cassandra
          # - name: CASSANDRA_PORT
          #   value: "9042"
          # - name: CASSANDRA_KEYSPACE
          #   value: jaeger_v1_test
          # - name: CASSANDRA_TLS_ENABLED
          #   value: "false"
        ## Cassandra related cmd line opts to be configured on the concerned components
        cmdlineParams: {}
          # cassandra.servers: cassandra
          # cassandra.port: 9042
          # cassandra.keyspace: jaeger_v1_test
          # cassandra.tls.enabled: "false"
      elasticsearch:
        scheme: http
        host: elasticsearch-master
        port: 9200
        user: elastic
        usePassword: true
        password: changeme
        # indexPrefix: test
        ## Use existing secret (ignores previous password)
        # existingSecret:
        # existingSecretKey:
        nodesWanOnly: false
        extraEnv: []
        ## ES related env vars to be configured on the concerned components
          # - name: ES_SERVER_URLS
          #   value: http://elasticsearch-master:9200
          # - name: ES_USERNAME
          #   value: elastic
          # - name: ES_INDEX_PREFIX
          #   value: test
        ## ES related cmd line opts to be configured on the concerned components
        cmdlineParams: {}
          # es.server-urls: http://elasticsearch-master:9200
          # es.username: elastic
          # es.index-prefix: test
      kafka:
        brokers:
          - kafka:9092
        topic: jaeger_v1_test
        authentication: none
        extraEnv: []
      grpcPlugin:
        extraEnv: []
    
    # Begin: Override values on the Cassandra subchart to customize for Jaeger
    cassandra:
      persistence:
        # To enable persistence, please see the documentation for the Cassandra chart
        enabled: false
      config:
        cluster_name: jaeger
        seed_size: 1
        dc_name: dc1
        rack_name: rack1
        endpoint_snitch: GossipingPropertyFileSnitch
    # End: Override values on the Cassandra subchart to customize for Jaeger
    
    # Begin: Override values on the Kafka subchart to customize for Jaeger
    kafka:
      replicaCount: 1
      autoCreateTopicsEnable: true
      zookeeper:
        replicaCount: 1
        serviceAccount:
          create: true
    
    # End: Override values on the Kafka subchart to customize for Jaeger
    
    # Begin: Default values for the various components of Jaeger
    # This chart has been based on the Kubernetes integration found in the following repo:
    # https://github.com/jaegertracing/jaeger-kubernetes/blob/main/production/jaeger-production-template.yml
    #
    # This is the jaeger-cassandra-schema Job which sets up the Cassandra schema for
    # use by Jaeger
    schema:
      annotations: {}
      image: jaegertracing/jaeger-cassandra-schema
      imagePullSecrets: []
      pullPolicy: IfNotPresent
      resources: {}
        # limits:
        #   cpu: 500m
        #   memory: 512Mi
        # requests:
        #   cpu: 256m
        #   memory: 128Mi
      serviceAccount:
        create: true
        # Explicitly mounts the API credentials for the Service Account
        automountServiceAccountToken: true
        name:
      podAnnotations: {}
      podLabels: {}
      securityContext: {}
      podSecurityContext: {}
      ## Deadline for cassandra schema creation job
      activeDeadlineSeconds: 300
      extraEnv: []
        # - name: MODE
        #   value: prod
        # - name: TRACE_TTL
        #   value: "172800"
        # - name: DEPENDENCIES_TTL
        #   value: "0"
    
    # For configurable values of the elasticsearch if provisioned, please see:
    # https://github.com/elastic/helm-charts/tree/master/elasticsearch#configuration
    elasticsearch: {}
    
    ingester:
      enabled: false
      podSecurityContext: {}
      securityContext: {}
      annotations: {}
      image: jaegertracing/jaeger-ingester
      imagePullSecrets: []
      pullPolicy: IfNotPresent
      dnsPolicy: ClusterFirst
      cmdlineParams: {}
      replicaCount: 1
      autoscaling:
        enabled: false
        minReplicas: 2
        maxReplicas: 10
        # targetCPUUtilizationPercentage: 80
        # targetMemoryUtilizationPercentage: 80
      service:
        annotations: {}
        # List of IP ranges that are allowed to access the load balancer (if supported)
        loadBalancerSourceRanges: []
        type: ClusterIP
      resources: {}
        # limits:
        #   cpu: 1
        #   memory: 1Gi
        # requests:
        #   cpu: 500m
        #   memory: 512Mi
      serviceAccount:
        create: true
        # Explicitly mounts the API credentials for the Service Account
        automountServiceAccountToken: false
        name:
      nodeSelector: {}
      tolerations: []
      affinity: {}
      podAnnotations: {}
      ## Additional pod labels
      ## ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
      podLabels: {}
      extraSecretMounts: []
      extraConfigmapMounts: []
    
      serviceMonitor:
        enabled: false
        additionalLabels: {}
        # https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#relabelconfig
        relabelings: []
        # -- ServiceMonitor metric relabel configs to apply to samples before ingestion
        # https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#endpoint
        metricRelabelings: []
    
    agent:
      podSecurityContext: {}
      securityContext: {}
      enabled: false
      annotations: {}
      image: jaegertracing/jaeger-agent
      # tag: 1.22
      imagePullSecrets: []
      pullPolicy: IfNotPresent
      cmdlineParams: {}
      extraEnv: []
      daemonset:
        useHostPort: false
        updateStrategy: {}
          # type: RollingUpdate
          # rollingUpdate:
          #   maxUnavailable: 1
      service:
        annotations: {}
        # List of IP ranges that are allowed to access the load balancer (if supported)
        loadBalancerSourceRanges: []
        type: ClusterIP
        # zipkinThriftPort: accept zipkin.thrift over compact thrift protocol
        zipkinThriftPort: 5775
        # compactPort: accept jaeger.thrift over compact thrift protocol
        compactPort: 6831
        # binaryPort: accept jaeger.thrift over binary thrift protocol
        binaryPort: 6832
        # samplingPort: (HTTP) serve configs, sampling strategies
        samplingPort: 5778
      resources: {}
        # limits:
        #   cpu: 500m
        #   memory: 512Mi
        # requests:
        #   cpu: 256m
        #   memory: 128Mi
      serviceAccount:
        create: true
        # Explicitly mounts the API credentials for the Service Account
        automountServiceAccountToken: false
        name:
        annotations: {}
      nodeSelector: {}
      tolerations: []
      affinity: {}
      podAnnotations: {}
      ## Additional pod labels
      ## ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
      podLabels: {}
      extraSecretMounts: []
      # - name: jaeger-tls
      #   mountPath: /tls
      #   subPath: ""
      #   secretName: jaeger-tls
      #   readOnly: true
      extraConfigmapMounts: []
      # - name: jaeger-config
      #   mountPath: /config
      #   subPath: ""
      #   configMap: jaeger-config
      #   readOnly: true
      useHostNetwork: false
      dnsPolicy: ClusterFirst
      priorityClassName: ""
      initContainers: []
    
      serviceMonitor:
        enabled: false
        additionalLabels: {}
        # https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#relabelconfig
        relabelings: []
        # -- ServiceMonitor metric relabel configs to apply to samples before ingestion
        # https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#endpoint
        metricRelabelings: []
    
    collector:
      podSecurityContext: {}
      securityContext: {}
      enabled: false
      annotations: {}
      image: jaegertracing/jaeger-collector
      # tag: 1.22
      imagePullSecrets: []
      pullPolicy: IfNotPresent
      dnsPolicy: ClusterFirst
      extraEnv: []
      cmdlineParams: {}
      replicaCount: 1
      autoscaling:
        enabled: false
        minReplicas: 2
        maxReplicas: 10
        # targetCPUUtilizationPercentage: 80
        # targetMemoryUtilizationPercentage: 80
      service:
        annotations: {}
        # The IP to be used by the load balancer (if supported)
        loadBalancerIP: ''
        # List of IP ranges that are allowed to access the load balancer (if supported)
        loadBalancerSourceRanges: []
        type: ClusterIP
        grpc:
          port: 14250
          # nodePort:
        # http: can accept spans directly from clients in jaeger.thrift format
        http:
          port: 14268
          # nodePort:
        # can accept Zipkin spans in JSON or Thrift
        zipkin: {}
          # port: 9411
          # nodePort:
      ingress:
        enabled: false
        # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
        # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
        # ingressClassName: nginx
        annotations: {}
        labels: {}
        # Used to create an Ingress record.
        # The 'hosts' variable accepts two formats:
        # hosts:
        #   - chart-example.local
        # or:
        # hosts:
        #   - host: chart-example.local
        #     servicePort: grpc
        # annotations:
          # kubernetes.io/ingress.class: nginx
          # kubernetes.io/tls-acme: "true"
        # labels:
          # app: jaeger-collector
        # tls:
          # Secrets must be manually created in the namespace.
          # - secretName: chart-example-tls
          #   hosts:
          #     - chart-example.local
      resources: {}
        # limits:
        #   cpu: 1
        #   memory: 1Gi
        # requests:
        #   cpu: 500m
        #   memory: 512Mi
      serviceAccount:
        create: true
        # Explicitly mounts the API credentials for the Service Account
        automountServiceAccountToken: false
        name:
        annotations: {}
      nodeSelector: {}
      tolerations: []
      affinity: {}
      podAnnotations: {}
      ## Additional pod labels
      ## ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
      podLabels: {}
      extraSecretMounts: []
      # - name: jaeger-tls
      #   mountPath: /tls
      #   subPath: ""
      #   secretName: jaeger-tls
      #   readOnly: true
      extraConfigmapMounts: []
      # - name: jaeger-config
      #   mountPath: /config
      #   subPath: ""
      #   configMap: jaeger-config
      #   readOnly: true
      # samplingConfig: |-
      #   {
      #     "service_strategies": [
      #       {
      #         "service": "foo",
      #         "type": "probabilistic",
      #         "param": 0.8,
      #         "operation_strategies": [
      #           {
      #             "operation": "op1",
      #             "type": "probabilistic",
      #             "param": 0.2
      #           },
      #           {
      #             "operation": "op2",
      #             "type": "probabilistic",
      #             "param": 0.4
      #           }
      #         ]
      #       },
      #       {
      #         "service": "bar",
      #         "type": "ratelimiting",
      #         "param": 5
      #       }
      #     ],
      #     "default_strategy": {
      #       "type": "probabilistic",
      #       "param": 1
      #     }
      #   }
      priorityClassName: ""
      serviceMonitor:
        enabled: false
        additionalLabels: {}
        # https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#relabelconfig
        relabelings: []
        # -- ServiceMonitor metric relabel configs to apply to samples before ingestion
        # https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#endpoint
        metricRelabelings: []
    
    query:
      enabled: false
      oAuthSidecar:
        enabled: false
        image: quay.io/oauth2-proxy/oauth2-proxy:v7.1.0
        pullPolicy: IfNotPresent
        containerPort: 4180
        args: []
        extraEnv: []
        extraConfigmapMounts: []
        extraSecretMounts: []
      # config: |-
      #   provider = "oidc"
      #   https_address = ":4180"
      #   upstreams = ["http://localhost:16686"]
      #   redirect_url = "https://jaeger-svc-domain/oauth2/callback"
      #   client_id = "jaeger-query"
      #   oidc_issuer_url = "https://keycloak-svc-domain/auth/realms/Default"
      #   cookie_secure = "true"
      #   email_domains = "*"
      #   oidc_groups_claim = "groups"
      #   user_id_claim = "preferred_username"
      #   skip_provider_button = "true"
      podSecurityContext: {}
      securityContext: {}
      agentSidecar:
        enabled: true
    #    resources:
    #      limits:
    #        cpu: 500m
    #        memory: 512Mi
    #      requests:
    #        cpu: 256m
    #        memory: 128Mi
      annotations: {}
      image: jaegertracing/jaeger-query
      # tag: 1.22
      imagePullSecrets: []
      pullPolicy: IfNotPresent
      dnsPolicy: ClusterFirst
      cmdlineParams: {}
      extraEnv: []
      replicaCount: 1
      service:
        annotations: {}
        type: ClusterIP
        # List of IP ranges that are allowed to access the load balancer (if supported)
        loadBalancerSourceRanges: []
        port: 80
        # Specify a custom target port (e.g. port of auth proxy)
        # targetPort: 8080
        # Specify a specific node port when type is NodePort
        # nodePort: 32500
      ingress:
        enabled: false
        # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
        # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
        # ingressClassName: nginx
        annotations: {}
        labels: {}
        # Used to create an Ingress record.
        # hosts:
        #   - chart-example.local
        # annotations:
          # kubernetes.io/ingress.class: nginx
          # kubernetes.io/tls-acme: "true"
        # labels:
          # app: jaeger-query
        # tls:
          # Secrets must be manually created in the namespace.
          # - secretName: chart-example-tls
          #   hosts:
          #     - chart-example.local
        health:
          exposed: false
      resources: {}
        # limits:
        #   cpu: 500m
        #   memory: 512Mi
        # requests:
        #    cpu: 256m
        #    memory: 128Mi
      serviceAccount:
        create: true
        # Explicitly mounts the API credentials for the Service Account
        automountServiceAccountToken: false
        name:
        annotations: {}
      nodeSelector: {}
      tolerations: []
      affinity: {}
      podAnnotations: {}
      ## Additional pod labels
      ## ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
      podLabels: {}
      extraConfigmapMounts: []
      # - name: jaeger-config
      #   mountPath: /config
      #   subPath: ""
      #   configMap: jaeger-config
      #   readOnly: true
      extraVolumes: []
      sidecars: []
      ##   - name: your-image-name
      ##     image: your-image
      ##     ports:
      ##       - name: portname
      ##         containerPort: 1234
      priorityClassName: ""
      serviceMonitor:
        enabled: false
        additionalLabels: {}
        # https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#relabelconfig
        relabelings: []
        # -- ServiceMonitor metric relabel configs to apply to samples before ingestion
        # https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#endpoint
        metricRelabelings: []
      # config: |-
      #   {
      #     "dependencies": {
      #       "dagMaxNumServices": 200,
      #       "menuEnabled": true
      #     },
      #     "archiveEnabled": true,
      #     "tracking": {
      #       "gaID": "UA-000000-2",
      #       "trackErrors": true
      #     }
      #   }
    
    spark:
      enabled: false
      annotations: {}
      image: jaegertracing/spark-dependencies
      imagePullSecrets: []
      tag: latest
      pullPolicy: Always
      cmdlineParams: {}
      extraEnv: []
      schedule: "49 23 * * *"
      successfulJobsHistoryLimit: 5
      failedJobsHistoryLimit: 5
      concurrencyPolicy: Forbid
      resources: {}
        # limits:
        #   cpu: 500m
        #   memory: 512Mi
        # requests:
        #   cpu: 256m
        #   memory: 128Mi
      serviceAccount:
        create: true
        # Explicitly mounts the API credentials for the Service Account
        automountServiceAccountToken: false
        name:
      nodeSelector: {}
      tolerations: []
      affinity: {}
      extraSecretMounts: []
      extraConfigmapMounts: []
      podAnnotations: {}
      ## Additional pod labels
      ## ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
      podLabels: {}
    
    esIndexCleaner:
      enabled: false
      securityContext:
        runAsUser: 1000
      podSecurityContext:
        runAsUser: 1000
      annotations: {}
      image: jaegertracing/jaeger-es-index-cleaner
      imagePullSecrets: []
      pullPolicy: Always
      cmdlineParams: {}
      extraEnv: []
        # - name: ROLLOVER
        #   value: 'true'
      schedule: "55 23 * * *"
      successfulJobsHistoryLimit: 3
      failedJobsHistoryLimit: 3
      concurrencyPolicy: Forbid
      resources: {}
        # limits:
        #   cpu: 500m
        #   memory: 512Mi
        # requests:
        #   cpu: 256m
        #   memory: 128Mi
      numberOfDays: 7
      serviceAccount:
        create: true
        # Explicitly mounts the API credentials for the Service Account
        automountServiceAccountToken: false
        name:
      nodeSelector: {}
      tolerations: []
      affinity: {}
      extraSecretMounts: []
      extraConfigmapMounts: []
      podAnnotations: {}
      ## Additional pod labels
      ## ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
      podLabels: {}
    
    esRollover:
      enabled: false
      securityContext: {}
      podSecurityContext:
        runAsUser: 1000
      annotations: {}
      image: jaegertracing/jaeger-es-rollover
      imagePullSecrets: []
      tag: latest
      pullPolicy: Always
      cmdlineParams: {}
      extraEnv:
        - name: CONDITIONS
          value: '{"max_age": "1d"}'
      schedule: "10 0 * * *"
      successfulJobsHistoryLimit: 3
      failedJobsHistoryLimit: 3
      concurrencyPolicy: Forbid
      resources: {}
        # limits:
        #   cpu: 500m
        #   memory: 512Mi
        # requests:
        #   cpu: 256m
        #   memory: 128Mi
      serviceAccount:
        create: true
        # Explicitly mounts the API credentials for the Service Account
        automountServiceAccountToken: false
        name:
      nodeSelector: {}
      tolerations: []
      affinity: {}
      extraSecretMounts: []
      extraConfigmapMounts: []
      podAnnotations: {}
      ## Additional pod labels
      ## ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
      podLabels: {}
      initHook:
        extraEnv: []
          # - name: SHARDS
          #   value: "3"
        annotations: {}
        podAnnotations: {}
        podLabels: {}
        ttlSecondsAfterFinished: 120
    
    esLookback:
      enabled: false
      securityContext: {}
      podSecurityContext:
        runAsUser: 1000
      annotations: {}
      image: jaegertracing/jaeger-es-rollover
      imagePullSecrets: []
      tag: latest
      pullPolicy: Always
      cmdlineParams: {}
      extraEnv:
        - name: UNIT
          value: days
        - name: UNIT_COUNT
          value: '7'
      schedule: '5 0 * * *'
      successfulJobsHistoryLimit: 3
      failedJobsHistoryLimit: 3
      concurrencyPolicy: Forbid
      resources: {}
        # limits:
        #   cpu: 500m
        #   memory: 512Mi
        # requests:
        #   cpu: 256m
        #   memory: 128Mi
      serviceAccount:
        create: true
        # Explicitly mounts the API credentials for the Service Account
        automountServiceAccountToken: false
        name:
      nodeSelector: {}
      tolerations: []
      affinity: {}
      extraSecretMounts: []
      extraConfigmapMounts: []
      podAnnotations: {}
      ## Additional pod labels
      ## ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/
      podLabels: {}
    # End: Default values for the various components of Jaeger
    
    hotrod:
      enabled: false
      podSecurityContext: {}
      securityContext: {}
      replicaCount: 1
      image:
        repository: jaegertracing/example-hotrod
        pullPolicy: Always
        pullSecrets: []
      service:
        annotations: {}
        name: hotrod
        type: ClusterIP
        # List of IP ranges that are allowed to access the load balancer (if supported)
        loadBalancerSourceRanges: []
        port: 80
      ingress:
        enabled: false
        # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
        # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
        # ingressClassName: nginx
        # Used to create Ingress record (should be used with service.type: ClusterIP).
        hosts:
          - chart-example.local
        annotations: {}
          # kubernetes.io/ingress.class: nginx
          # kubernetes.io/tls-acme: "true"
        tls:
          # Secrets must be manually created in the namespace.
          # - secretName: chart-example-tls
          #   hosts:
          #     - chart-example.local
      resources: {}
        # We usually recommend not to specify default resources and to leave this as a conscious
        # choice for the user. This also increases chances charts run on environments with little
        # resources, such as Minikube. If you do want to specify resources, uncomment the following
        # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
        # limits:
        #   cpu: 100m
        #   memory: 128Mi
        # requests:
        #   cpu: 100m
        #   memory: 128Mi
      serviceAccount:
        create: true
        # Explicitly mounts the API credentials for the Service Account
        automountServiceAccountToken: false
        name:
      nodeSelector: {}
      tolerations: []
      affinity: {}
      tracing:
        host: null
        port: 6831
    
    # Array with extra yaml objects to install alongside the chart. Values are evaluated as a template.
    extraObjects: []
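
The `extraObjects` hook above accepts arbitrary extra manifests, each evaluated as a template before install. A minimal hypothetical entry (the NetworkPolicy shown is illustrative only, not a chart default) could look like:

```yaml
extraObjects:
  - apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      # Rendered through chart templating, so release values are available.
      name: "{{ .Release.Name }}-deny-ingress"
    spec:
      podSelector: {}
      policyTypes:
        - Ingress
```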
    

    I see this in the Jaeger pod logs:

    {"level":"info","ts":1659652986.6050475,"caller":"channelz/funcs.go:340","msg":"[core][Server #7 ListenSocket #10] ListenSocket created","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652987.6055405,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8 SubChannel #9] Subchannel Connectivity change to IDLE","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652987.6056607,"caller":"grpclog/component.go:71","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc0008b3320, {IDLE connection error: desc = \"transport: Error while dialing dial tcp :16685: connect: connection refused\"}","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652987.6056855,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8] Channel Connectivity change to IDLE","system":"grpc","grpc_log":true}
    

    I can see the port open via the k8s jaeger-query service:

    $ kubectl get svc  -n istio-system | grep jaeger
    jaeger-agent        ClusterIP      None             <none>        5775/UDP,5778/TCP,6831/UDP,6832/UDP          7d21h
    jaeger-collector    ClusterIP      None             <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP       7d21h
    jaeger-query        ClusterIP      None             <none>        16686/TCP,16685/TCP                          7d21h
    

    But within the only running pod, I don't see anything mentioning port 16685:

    $ kubectl get pod jaeger-7c8fb9bb6c-pmp72 -n istio-system -o=yaml | grep 16685
    $
    

    However, port 16686 is referenced:

    spec:
      containers:
      - env:
        - name: SPAN_STORAGE_TYPE
          value: memory
        - name: COLLECTOR_ZIPKIN_HOST_PORT
          value: :9411
        - name: JAEGER_DISABLED
          value: "false"
        image: jaegertracing/all-in-one:1.37.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 5
          httpGet:
            path: /
            port: 14269
            scheme: HTTP
          initialDelaySeconds: 5
          periodSeconds: 15
          successThreshold: 1
          timeoutSeconds: 1
        name: jaeger
        ports:
        - containerPort: 5775
          protocol: UDP
        - containerPort: 6831
          protocol: UDP
        - containerPort: 6832
          protocol: UDP
        - containerPort: 5778
          protocol: TCP
        - containerPort: 16686
          protocol: TCP
        - containerPort: 9411
          protocol: TCP
    

    Steps to reproduce

    1. helm repo add jaegertracing https://jaegertracing.github.io/helm-charts
    2. helm install jaeger jaegertracing/jaeger -n istio-system -f jaeger.yaml
    3. kubectl logs -f <insert_jaeger_pod_name> -n istio-system

    Expected behavior

    To not see a connection refused when starting the gRPC server on port 16685.

    Relevant log output

    $ kubectl logs -f jaeger-7c8fb9bb6c-pmp72 -n istio-system
    2022/08/04 22:43:06 maxprocs: Leaving GOMAXPROCS=16: CPU quota undefined
    {"level":"info","ts":1659652986.5780008,"caller":"flags/service.go:119","msg":"Mounting metrics handler on admin server","route":"/metrics"}
    {"level":"info","ts":1659652986.57805,"caller":"flags/service.go:125","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}
    {"level":"info","ts":1659652986.5781796,"caller":"flags/admin.go:128","msg":"Mounting health check on admin server","route":"/"}
    {"level":"info","ts":1659652986.5782237,"caller":"flags/admin.go:141","msg":"Starting admin HTTP server","http-addr":":14269"}
    {"level":"info","ts":1659652986.578245,"caller":"flags/admin.go:120","msg":"Admin server started","http.host-port":"[::]:14269","health-status":"unavailable"}
    {"level":"info","ts":1659652986.5794344,"caller":"memory/factory.go:66","msg":"Memory storage initialized","configuration":{"MaxTraces":0}}
    {"level":"info","ts":1659652986.5805168,"caller":"static/strategy_store.go:138","msg":"Loading sampling strategies","filename":"/etc/jaeger/sampling_strategies.json"}
    {"level":"info","ts":1659652986.593715,"caller":"channelz/funcs.go:340","msg":"[core][Server #1] Server created","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.5938025,"caller":"server/grpc.go:104","msg":"Starting jaeger-collector gRPC server","grpc.host-port":"[::]:14250"}
    {"level":"info","ts":1659652986.5938115,"caller":"server/http.go:48","msg":"Starting jaeger-collector HTTP server","http host-port":":14268"}
    {"level":"info","ts":1659652986.593898,"caller":"server/zipkin.go:55","msg":"Listening for Zipkin HTTP traffic","zipkin host-port":":9411"}
    {"level":"info","ts":1659652986.5939295,"caller":"channelz/funcs.go:340","msg":"[core][Server #1 ListenSocket #2] ListenSocket created","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.603039,"caller":"grpc/builder.go:71","msg":"Agent requested insecure grpc connection to collector(s)"}
    {"level":"info","ts":1659652986.603084,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3] Channel created","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6030955,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3] original dial target is: \":14250\"","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6031077,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3] dial target \":14250\" parse failed: parse \":14250\": missing protocol scheme","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.603112,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3] fallback to scheme \"passthrough\"","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.603123,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3] parsed dial target is: {Scheme:passthrough Authority: Endpoint::14250 URL:{Scheme:passthrough Opaque: User: Host: Path:/:14250 RawPath: ForceQuery:false RawQuery: Fragment: RawFragment:}}","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6031394,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3] Channel authority set to \"localhost:14250\"","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6032646,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3] Resolver state updated: {\n  \"Addresses\": [\n    {\n      \"Addr\": \":14250\",\n      \"ServerName\": \"\",\n      \"Attributes\": null,\n      \"BalancerAttributes\": null,\n      \"Type\": 0,\n      \"Metadata\": null\n    }\n  ],\n  \"ServiceConfig\": null,\n  \"Attributes\": null\n} (resolver returned new addresses)","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6034167,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3] Channel switches to new LB policy \"round_robin\"","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6036775,"caller":"grpclog/component.go:55","msg":"[balancer]base.baseBalancer: got new ClientConn state: {{[{\n  \"Addr\": \":14250\",\n  \"ServerName\": \"\",\n  \"Attributes\": null,\n  \"BalancerAttributes\": null,\n  \"Type\": 0,\n  \"Metadata\": null\n}] <nil> <nil>} <nil>}","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6038163,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3 SubChannel #4] Subchannel created","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6038601,"caller":"grpclog/component.go:71","msg":"[roundrobin]roundrobinPicker: Build called with info: {map[]}","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6038685,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3] Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.603935,"caller":"grpc/builder.go:109","msg":"Checking connection to collector"}
    {"level":"info","ts":1659652986.6039464,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3 SubChannel #4] Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6039588,"caller":"grpc/builder.go:120","msg":"Agent collector connection state change","dialTarget":":14250","status":"CONNECTING"}
    {"level":"info","ts":1659652986.6039634,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3 SubChannel #4] Subchannel picks a new address \":14250\" to connect","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6040459,"caller":"grpclog/component.go:71","msg":"[balancer]base.baseBalancer: handle SubConn state change: 0xc0009fc080, CONNECTING","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6041598,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3 SubChannel #4] Subchannel Connectivity change to READY","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6041825,"caller":"grpclog/component.go:71","msg":"[balancer]base.baseBalancer: handle SubConn state change: 0xc0009fc080, READY","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6041996,"caller":"grpclog/component.go:71","msg":"[roundrobin]roundrobinPicker: Build called with info: {map[0xc0009fc080:{{\n  \"Addr\": \":14250\",\n  \"ServerName\": \"\",\n  \"Attributes\": null,\n  \"BalancerAttributes\": null,\n  \"Type\": 0,\n  \"Metadata\": null\n}}]}","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6042056,"caller":"channelz/funcs.go:340","msg":"[core][Channel #3] Channel Connectivity change to READY","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.60421,"caller":"grpc/builder.go:120","msg":"Agent collector connection state change","dialTarget":":14250","status":"READY"}
    {"level":"info","ts":1659652986.6042097,"caller":"./main.go:256","msg":"Starting agent"}
    {"level":"info","ts":1659652986.6042488,"caller":"querysvc/query_service.go:135","msg":"Archive storage not created","reason":"archive storage not supported"}
    {"level":"info","ts":1659652986.604272,"caller":"app/flags.go:136","msg":"Archive storage not initialized"}
    {"level":"info","ts":1659652986.6043122,"caller":"app/agent.go:69","msg":"Starting jaeger-agent HTTP server","http-port":5778}
    {"level":"info","ts":1659652986.6044168,"caller":"channelz/funcs.go:340","msg":"[core][Server #7] Server created","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.604474,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8] Channel created","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6044922,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8] original dial target is: \":16685\"","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6044993,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8] dial target \":16685\" parse failed: parse \":16685\": missing protocol scheme","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6045032,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8] fallback to scheme \"passthrough\"","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.604511,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8] parsed dial target is: {Scheme:passthrough Authority: Endpoint::16685 URL:{Scheme:passthrough Opaque: User: Host: Path:/:16685 RawPath: ForceQuery:false RawQuery: Fragment: RawFragment:}}","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6045334,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8] Channel authority set to \"localhost:16685\"","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6045642,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8] Resolver state updated: {\n  \"Addresses\": [\n    {\n      \"Addr\": \":16685\",\n      \"ServerName\": \"\",\n      \"Attributes\": null,\n      \"BalancerAttributes\": null,\n      \"Type\": 0,\n      \"Metadata\": null\n    }\n  ],\n  \"ServiceConfig\": null,\n  \"Attributes\": null\n} (resolver returned new addresses)","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.604587,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8] Channel switches to new LB policy \"pick_first\"","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6046064,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8 SubChannel #9] Subchannel created","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.604689,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8 SubChannel #9] Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.604735,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8 SubChannel #9] Subchannel picks a new address \":16685\" to connect","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6047604,"caller":"grpclog/component.go:71","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc0008b3320, {CONNECTING <nil>}","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.604774,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8] Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.604839,"caller":"app/static_handler.go:181","msg":"UI config path not provided, config file will not be watched"}
    {"level":"warn","ts":1659652986.6048996,"caller":"channelz/funcs.go:342","msg":"[core][Channel #8 SubChannel #9] grpc: addrConn.createTransport failed to connect to {\n  \"Addr\": \":16685\",\n  \"ServerName\": \"localhost:16685\",\n  \"Attributes\": null,\n  \"BalancerAttributes\": null,\n  \"Type\": 0,\n  \"Metadata\": null\n}. Err: connection error: desc = \"transport: Error while dialing dial tcp :16685: connect: connection refused\"","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6049216,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8 SubChannel #9] Subchannel Connectivity change to TRANSIENT_FAILURE","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6049376,"caller":"grpclog/component.go:71","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc0008b3320, {TRANSIENT_FAILURE connection error: desc = \"transport: Error while dialing dial tcp :16685: connect: connection refused\"}","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6049545,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8] Channel Connectivity change to TRANSIENT_FAILURE","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652986.6049085,"caller":"app/server.go:217","msg":"Query server started","http_addr":"[::]:16686","grpc_addr":"[::]:16685"}
    {"level":"info","ts":1659652986.6050124,"caller":"healthcheck/handler.go:129","msg":"Health Check state change","status":"ready"}
    {"level":"info","ts":1659652986.6050298,"caller":"app/server.go:281","msg":"Starting HTTP server","port":16686,"addr":":16686"}
    {"level":"info","ts":1659652986.6050348,"caller":"app/server.go:300","msg":"Starting GRPC server","port":16685,"addr":":16685"}
    {"level":"info","ts":1659652986.6050475,"caller":"channelz/funcs.go:340","msg":"[core][Server #7 ListenSocket #10] ListenSocket created","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652987.6055405,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8 SubChannel #9] Subchannel Connectivity change to IDLE","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652987.6056607,"caller":"grpclog/component.go:71","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc0008b3320, {IDLE connection error: desc = \"transport: Error while dialing dial tcp :16685: connect: connection refused\"}","system":"grpc","grpc_log":true}
    {"level":"info","ts":1659652987.6056855,"caller":"channelz/funcs.go:340","msg":"[core][Channel #8] Channel Connectivity change to IDLE","system":"grpc","grpc_log":true}
    

    Screenshot

    No response

    Additional context

    No response

    Jaeger backend version

    1.29.0 & 1.37.0

    SDK

    No response

    Pipeline

    No response

    Storage backend

    none

    Operating system

    WSL2

    Deployment model

    Helm

    Deployment configs

    No response

    bug 
    opened by darynh72 1
  • [Feature]: Publish description and readme to Docker Hub


    Requirement

    As a user downloading Jaeger images from Docker Hub, I want to be able to read a description and basic usage in the Docker Hub repositories.

    Problem

    Right now most of Jaeger's Docker Hub repositories contain neither a description nor a README.

    Proposal

    Use the Docker Hub API to push the README files from the respective cmd/*** dirs when publishing Docker images.

    Example: https://github.com/peter-evans/dockerhub-description/blob/main/src/dockerhub-helper.ts
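    The proposal above could be sketched in Go. This assumes the unofficial Docker Hub v2 API that the linked dockerhub-description action uses (a JWT login followed by a `PATCH` of the repository's `full_description`); the endpoint path and field names are assumptions, not documented API, so verify them before relying on this.

```go
// Sketch: build the request that replaces a Docker Hub repository's
// "full description" (the README shown on the repo page).
// Assumption: Docker Hub accepts PATCH /v2/repositories/<ns>/<name>/
// with {"full_description": ...} and an "Authorization: JWT <token>"
// header, as observed in the dockerhub-description action.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newReadmePatchRequest builds (but does not send) the PATCH request
// for repo in "namespace/name" form, e.g. "jaegertracing/jaeger-collector".
func newReadmePatchRequest(repo, jwt, readme string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"full_description": readme})
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("https://hub.docker.com/v2/repositories/%s/", repo)
	req, err := http.NewRequest(http.MethodPatch, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "JWT "+jwt)
	return req, nil
}

func main() {
	// The JWT would come from POST /v2/users/login/ with CI credentials.
	req, err := newReadmePatchRequest("jaegertracing/jaeger-collector", "<jwt>", "# jaeger-collector\n...")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
}
```

    A release workflow would loop this over each cmd/*** README after the images are pushed.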

    Open questions

    No response

    enhancement help wanted good first issue 
    opened by yurishkuro 6
  • [Bug]: jaeger ingester module nil pointer error (when consume to ES)


    What happened?

    Started the Jaeger containers with Elasticsearch as the data storage; the ingester container keeps crashing with a nil pointer error.

    Versions 1.35 and 1.36 have the same issue.

    Steps to reproduce

    Step 0: run Jaeger with Elasticsearch via containers

    #!/usr/bin/env bash
    set -o errexit
    set -o nounset
    set -o pipefail
    
    BASE=/data/jaeger
    mkdir -p $BASE
    cd $BASE
    
    if [ $# -ge 1 ] && [ "$1" = "clean" ]; then
        sudo docker-compose down 2>/dev/null
        sudo rm -rf esdata01 esdata02 esdata03 kibanadata 2>/dev/null
        rm .env docker-compose.yml nginx_query.conf nginx_collector.conf 2>/dev/null
        exit 0
    fi
    
    mkdir -p esdata01 esdata02 esdata03 kibanadata
    chown 1000:1000 -R esdata01 esdata02 esdata03 kibanadata
    
    cat >.env <<'EOF'
    # Password for the 'elastic' user (at least 6 characters)
    ELASTIC_PASSWORD=passwd
    
    # Password for the 'kibana_system' user (at least 6 characters)
    KIBANA_PASSWORD=passwd
    
    # Version of Elastic products
    STACK_VERSION=7.17.5
    
    # Set the cluster name
    CLUSTER_NAME=jaeger-cluster
    
    # Set to 'basic' or 'trial' to automatically start the 30-day trial
    LICENSE=basic
    #LICENSE=trial
    
    # Port to expose Elasticsearch HTTP API to the host
    ES_PORT=9200
    
    # Port to expose Kibana to the host
    KIBANA_PORT=5601
    
    # Increase or decrease based on the available host memory (in bytes)
    MEM_LIMIT=4294967296
    
    KAFKA_BROKER=<broker1>:9002,<broker2>:9002,<broker3>:9002
    EOF
    
    cat >nginx_query.conf <<'EOF'
    upstream jaeger_query {
        server query01:16686;
        server query02:16686;
        server query03:16686;
    }
    
    server {
        listen          16686;
        server_name     0.0.0.0;
    
        location / {
          proxy_pass  http://jaeger_query;
        }
    }
    EOF
    
    cat >nginx_collector.conf <<'EOF'
    upstream jaeger_collector {
        server collector01:4318;
        server collector02:4318;
        server collector03:4318;
    }
    
    server {
        listen          4318;
        server_name     0.0.0.0;
        location / {
            proxy_pass      http://jaeger_collector;
        }
    }
    EOF
    
    cat >docker-compose.yml <<'EOF'
    version: "2.2"
    
    services:
      es01:
        image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
        volumes:
          - ./esdata01:/usr/share/elasticsearch/data
        ports:
          - 9200
        restart: always
        environment:
          - node.name=es01
          - cluster.name=${CLUSTER_NAME}
          - cluster.initial_master_nodes=es01,es02,es03
          - discovery.seed_hosts=es02,es03
          - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
          - bootstrap.memory_lock=true
          - xpack.security.enabled=false
          - xpack.license.self_generated.type=${LICENSE}
        mem_limit: ${MEM_LIMIT}
        ulimits:
          memlock:
            soft: -1
            hard: -1
        healthcheck:
          test:
            [
              "CMD-SHELL",
              "curl -s http://localhost:9200 | grep 'You Know, for Search'",
            ]
          interval: 10s
          timeout: 10s
          retries: 120
    
      es02:
        depends_on:
          - es01
        image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
        volumes:
          - ./esdata02:/usr/share/elasticsearch/data
        ports:
          - 9200
        restart: always
        environment:
          - node.name=es02
          - cluster.name=${CLUSTER_NAME}
          - cluster.initial_master_nodes=es01,es02,es03
          - discovery.seed_hosts=es01,es03
          - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
          - bootstrap.memory_lock=true
          - xpack.security.enabled=false
          - xpack.license.self_generated.type=${LICENSE}
        mem_limit: ${MEM_LIMIT}
        ulimits:
          memlock:
            soft: -1
            hard: -1
        healthcheck:
          test:
            [
              "CMD-SHELL",
              "curl -s http://localhost:9200 | grep 'You Know, for Search'",
            ]
          interval: 10s
          timeout: 10s
          retries: 120
    
      es03:
        depends_on:
          - es02
        image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION}
        volumes:
          - ./esdata03:/usr/share/elasticsearch/data
        ports:
          - 9200
        restart: always
        environment:
          - node.name=es03
          - cluster.name=${CLUSTER_NAME}
          - cluster.initial_master_nodes=es01,es02,es03
          - discovery.seed_hosts=es01,es02
          - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
          - bootstrap.memory_lock=true
          - xpack.security.enabled=false
          - xpack.license.self_generated.type=${LICENSE}
        mem_limit: ${MEM_LIMIT}
        ulimits:
          memlock:
            soft: -1
            hard: -1
        healthcheck:
          test:
            [
              "CMD-SHELL",
              "curl -s http://localhost:9200 | grep 'You Know, for Search'",
            ]
          interval: 10s
          timeout: 10s
          retries: 120
    
      kibana:
        depends_on:
          es01:
            condition: service_healthy
          es02:
            condition: service_healthy
          es03:
            condition: service_healthy
        image: docker.elastic.co/kibana/kibana:${STACK_VERSION}
        volumes:
          - ./kibanadata:/usr/share/kibana/data
        ports:
          - ${KIBANA_PORT}:5601
        restart: always
        environment:
          - SERVERNAME=kibana
          - ELASTICSEARCH_HOSTS=["http://es01:9200","http://es02:9200","http://es03:9200"]
          - ELASTICSEARCH_USERNAME=kibana_system
          - ELASTICSEARCH_PASSWORD=${KIBANA_PASSWORD}
        mem_limit: ${MEM_LIMIT}
        healthcheck:
          test:
            [
              "CMD-SHELL",
              "curl -s -I http://localhost:5601 | grep 'HTTP/1.1 302 Found'",
            ]
          interval: 10s
          timeout: 10s
          retries: 120
    
      collector01:
        depends_on:
          es01:
            condition: service_healthy
          es02:
            condition: service_healthy
          es03:
            condition: service_healthy
        image: jaegertracing/jaeger-collector:1.35
        ports:
          - 9411
          - 14250
          - 14268
          - 14269
          - 4318
          - 4317
        restart: always
        environment:
          - COLLECTOR_OTLP_ENABLED=true
          - SPAN_STORAGE_TYPE=kafka
          - KAFKA_PRODUCER_BROKERS=${KAFKA_BROKER}
    
      collector02:
        depends_on:
          es01:
            condition: service_healthy
          es02:
            condition: service_healthy
          es03:
            condition: service_healthy
        image: jaegertracing/jaeger-collector:1.35
        ports:
          - 9411
          - 14250
          - 14268
          - 14269
          - 4318
          - 4317
        restart: always
        environment:
          - COLLECTOR_OTLP_ENABLED=true
          - SPAN_STORAGE_TYPE=kafka
          - KAFKA_PRODUCER_BROKERS=${KAFKA_BROKER}
    
      collector03:
        depends_on:
          es01:
            condition: service_healthy
          es02:
            condition: service_healthy
          es03:
            condition: service_healthy
        image: jaegertracing/jaeger-collector:1.35
        ports:
          - 9411
          - 14250
          - 14268
          - 14269
          - 4318
          - 4317
        restart: always
        environment:
          - COLLECTOR_OTLP_ENABLED=true
          - SPAN_STORAGE_TYPE=kafka
          - KAFKA_PRODUCER_BROKERS=${KAFKA_BROKER}
    
      ingestor:
        depends_on:
          es01:
            condition: service_healthy
          es02:
            condition: service_healthy
          es03:
            condition: service_healthy
        image: jaegertracing/jaeger-ingester:1.35
        ports:
          - 14270
        restart: always
        environment:
          - SPAN_STORAGE_TYPE=elasticsearch
        command:
          - "--kafka.consumer.brokers=${KAFKA_BROKER}"
      - "--es.server-urls=http://es01:${ES_PORT},http://es02:${ES_PORT},http://es03:${ES_PORT}"
    
      query01:
        depends_on:
          es01:
            condition: service_healthy
          es02:
            condition: service_healthy
          es03:
            condition: service_healthy
        image: jaegertracing/jaeger-query:1.35
        ports:
          - 16685
          - 16686
          - 16687
        restart: always
        environment:
          - SPAN_STORAGE_TYPE=elasticsearch
        command:
      - "--es.server-urls=http://es01:${ES_PORT},http://es02:${ES_PORT},http://es03:${ES_PORT}"
    
      query02:
        depends_on:
          es01:
            condition: service_healthy
          es02:
            condition: service_healthy
          es03:
            condition: service_healthy
        image: jaegertracing/jaeger-query:1.35
        ports:
          - 16685
          - 16686
          - 16687
        restart: always
        environment:
          - SPAN_STORAGE_TYPE=elasticsearch
        command:
      - "--es.server-urls=http://es01:${ES_PORT},http://es02:${ES_PORT},http://es03:${ES_PORT}"
    
      query03:
        depends_on:
          es01:
            condition: service_healthy
          es02:
            condition: service_healthy
          es03:
            condition: service_healthy
        image: jaegertracing/jaeger-query:1.35
        ports:
          - 16685
          - 16686
          - 16687
        restart: always
        environment:
          - SPAN_STORAGE_TYPE=elasticsearch
        command:
      - "--es.server-urls=http://es01:${ES_PORT},http://es02:${ES_PORT},http://es03:${ES_PORT}"
    
      nginx:
        depends_on:
          - query01
          - query02
          - query03
          - collector01
          - collector02
          - collector03
        image: nginx
        ports:
          - 16686:16686
          - 4318:4318
        restart: always
        volumes:
          - ./nginx_query.conf:/etc/nginx/conf.d/nginx_query.conf
          - ./nginx_collector.conf:/etc/nginx/conf.d/nginx_collector.conf
    EOF
    
    sudo docker-compose up -d
    
    

    Step 1: At first it works fine and I can see data in the Jaeger UI, but then the ingester container keeps crashing. Error log:

    2022/07/26 10:24:35 maxprocs: Leaving GOMAXPROCS=36: CPU quota undefined
    {"level":"info","ts":1658831075.124132,"caller":"flags/service.go:119","msg":"Mounting metrics handler on admin server","route":"/metrics"}
    {"level":"info","ts":1658831075.1241765,"caller":"flags/service.go:125","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}
    {"level":"info","ts":1658831075.1243546,"caller":"flags/admin.go:128","msg":"Mounting health check on admin server","route":"/"}
    {"level":"info","ts":1658831075.1244004,"caller":"flags/admin.go:141","msg":"Starting admin HTTP server","http-addr":":14270"}
    {"level":"info","ts":1658831075.1244335,"caller":"flags/admin.go:120","msg":"Admin server started","http.host-port":"[::]:14270","health-status":"unavailable"}
    {"level":"info","ts":1658831075.1393647,"caller":"config/config.go:206","msg":"Elasticsearch detected","version":7}
    {"level":"info","ts":1658831075.73236,"caller":"healthcheck/handler.go:129","msg":"Health Check state change","status":"ready"}
    {"level":"info","ts":1658831075.7324462,"caller":"consumer/consumer.go:79","msg":"Starting main loop"}
    {"level":"info","ts":1658831102.2093694,"caller":"consumer/consumer.go:167","msg":"Starting error handler","partition":0}
    {"level":"info","ts":1658831102.2093897,"caller":"consumer/consumer.go:110","msg":"Starting message handler","partition":0}
    {"level":"info","ts":1658831102.408139,"caller":"consumer/consumer.go:167","msg":"Starting error handler","partition":2}
    {"level":"info","ts":1658831102.4081435,"caller":"consumer/consumer.go:110","msg":"Starting message handler","partition":2}
    {"level":"info","ts":1658831102.4151018,"caller":"consumer/consumer.go:167","msg":"Starting error handler","partition":1}
    {"level":"info","ts":1658831102.4151034,"caller":"consumer/consumer.go:110","msg":"Starting message handler","partition":1}
    {"level":"info","ts":1658831104.133721,"caller":"consumer/processor_factory.go:65","msg":"Creating new processors","partition":0}
    {"level":"info","ts":1658831104.194396,"caller":"consumer/processor_factory.go:65","msg":"Creating new processors","partition":2}
    {"level":"info","ts":1658831104.380035,"caller":"consumer/processor_factory.go:65","msg":"Creating new processors","partition":1}
    panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0xe417c6]
    
    goroutine 2291 [running]:
    github.com/jaegertracing/jaeger/plugin/storage/es/spanstore/dbmodel.FromDomain.convertProcess(...)
            github.com/jaegertracing/jaeger/plugin/storage/es/spanstore/dbmodel/from_domain.go:123
    github.com/jaegertracing/jaeger/plugin/storage/es/spanstore/dbmodel.FromDomain.convertSpanEmbedProcess({0x80?, 0xc0003f6ba0?, {0x15a7e88?, 0xfd1bf2?}}, 0xc0015432c0)
            github.com/jaegertracing/jaeger/plugin/storage/es/spanstore/dbmodel/from_domain.go:64 +0x126
    github.com/jaegertracing/jaeger/plugin/storage/es/spanstore/dbmodel.FromDomain.FromDomainEmbedProcess(...)
            github.com/jaegertracing/jaeger/plugin/storage/es/spanstore/dbmodel/from_domain.go:43
    github.com/jaegertracing/jaeger/plugin/storage/es/spanstore.(*SpanWriter).WriteSpan(0xc0001152c0, {0x0?, 0x0?}, 0xc0015432c0)
            github.com/jaegertracing/jaeger/plugin/storage/es/spanstore/writer.go:152 +0x7a
    github.com/jaegertracing/jaeger/cmd/ingester/app/processor.KafkaSpanProcessor.Process({{0x15ac040, 0x1e84f90}, {0x15abec0, 0xc0001152c0}, {0x0, 0x0}}, {0x15aeac0?, 0xc000f6b5e0?})
            github.com/jaegertracing/jaeger/cmd/ingester/app/processor/span_processor.go:67 +0xd3
    github.com/jaegertracing/jaeger/cmd/ingester/app/processor/decorator.(*retryDecorator).Process(0xc000515180, {0x15aeac0, 0xc000f6b5e0})
            github.com/jaegertracing/jaeger/cmd/ingester/app/processor/decorator/retry.go:110 +0x37
    github.com/jaegertracing/jaeger/cmd/ingester/app/consumer.(*comittingProcessor).Process(0xc000741e00, {0x15aeac0, 0xc000f6b5e0})
            github.com/jaegertracing/jaeger/cmd/ingester/app/consumer/committing_processor.go:44 +0x5e
    github.com/jaegertracing/jaeger/cmd/ingester/app/processor.(*metricsDecorator).Process(0xc000651840, {0x15aeac0, 0xc000f6b5e0})
            github.com/jaegertracing/jaeger/cmd/ingester/app/processor/metrics_decorator.go:44 +0x5b
    github.com/jaegertracing/jaeger/cmd/ingester/app/processor.(*ParallelProcessor).Start.func1()
            github.com/jaegertracing/jaeger/cmd/ingester/app/processor/parallel_processor.go:57 +0x42
    created by github.com/jaegertracing/jaeger/cmd/ingester/app/processor.(*ParallelProcessor).Start
            github.com/jaegertracing/jaeger/cmd/ingester/app/processor/parallel_processor.go:53 +0xf5
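    The trace points at the ES writer's domain-to-dbmodel conversion dereferencing the span's Process. A minimal sketch of that failure mode, using trimmed stand-in types rather than Jaeger's actual model package (the type shapes and the "unknown-service" fallback are assumptions for illustration): a span that reaches the converter with a nil Process panics exactly as above, while a guarded variant does not.

```go
// Minimal reproduction of a nil-Process dereference and its guard.
// Span and Process are simplified stand-ins, not Jaeger's model types.
package main

import "fmt"

type Process struct {
	ServiceName string
}

type Span struct {
	OperationName string
	Process       *Process // nil when the producer did not attach one
}

// convertProcessUnsafe mirrors a conversion that trusts Process != nil.
func convertProcessUnsafe(s *Span) string {
	return s.Process.ServiceName // panics on a nil Process
}

// convertProcessSafe is the guarded variant; the fallback name is
// an arbitrary placeholder for this sketch.
func convertProcessSafe(s *Span) string {
	if s.Process == nil {
		return "unknown-service"
	}
	return s.Process.ServiceName
}

func main() {
	s := &Span{OperationName: "op"} // Process left nil
	defer func() {
		if r := recover(); r != nil {
			// Same runtime error as in the stack trace above.
			fmt.Println("unsafe conversion panicked:", r)
		}
	}()
	fmt.Println("safe:", convertProcessSafe(s))
	fmt.Println("unsafe:", convertProcessUnsafe(s))
}
```

    Whether the real fix belongs in the converter or upstream (ensuring every span carries a Process before it is enqueued to Kafka) is a design question for the maintainers.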
    

    Expected behavior

    Everything works.

    Relevant log output

    No response

    Screenshot

    No response

    Additional context

    No response

    Jaeger backend version

    No response

    SDK

    No response

    Pipeline

    No response

    Storage backend

    No response

    Operating system

    No response

    Deployment model

    No response

    Deployment configs

    No response

    bug 
    opened by huahuayu 19