Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.

Overview

Grafana Mimir

Grafana Mimir is an open source software project that provides scalable long-term storage for Prometheus. Some of the core strengths of Grafana Mimir include:

  • Easy to install and maintain: Grafana Mimir’s extensive documentation, tutorials, and deployment tooling make it quick to get started. Using its monolithic mode, you can get Grafana Mimir up and running with just one binary and no additional dependencies. Once deployed, the best-practice dashboards, alerts, and playbooks packaged with Grafana Mimir make it easy to monitor the health of the system.
  • Massive scalability: You can run Grafana Mimir's horizontally-scalable architecture across multiple machines, resulting in the ability to process orders of magnitude more time series than a single Prometheus instance. Internal testing shows that Grafana Mimir handles up to 1 billion active time series.
  • Global view of metrics: Grafana Mimir enables you to run queries that aggregate series from multiple Prometheus instances, giving you a global view of your systems. Its query engine extensively parallelizes query execution, so that even the highest-cardinality queries complete with blazing speed.
  • Cheap, durable metric storage: Grafana Mimir uses object storage for long-term data storage, allowing it to take advantage of this ubiquitous, cost-effective, high-durability technology. It is compatible with multiple object store implementations, including AWS S3, Google Cloud Storage, Azure Blob Storage, OpenStack Swift, as well as any S3-compatible object storage.
  • High availability: Grafana Mimir replicates incoming metrics, ensuring that no data is lost in the event of machine failure. Its horizontally scalable architecture also means that it can be restarted, upgraded, or downgraded with zero downtime, which means no interruptions to metrics ingestion or querying.
  • Natively multi-tenant: Grafana Mimir’s multi-tenant architecture enables you to isolate data and queries from independent teams or business units, making it possible for these groups to share the same cluster. Advanced limits and quality-of-service controls ensure that capacity is shared fairly among tenants.
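As a rough illustration of the multi-tenant model, the sketch below builds a tenant-scoped instant query. It relies on the `X-Scope-OrgID` header that Mimir uses to identify tenants and assumes the default `/prometheus` HTTP prefix; the base URL, tenant names, and the helper `build_instant_query` are hypothetical and only serve the example.

```python
from __future__ import annotations

from urllib.parse import urlencode

# Header used to identify the tenant on each request.
TENANT_HEADER = "X-Scope-OrgID"


def build_instant_query(base_url: str, tenant_id: str, promql: str) -> tuple[str, dict[str, str]]:
    """Return the URL and headers for a tenant-scoped instant query.

    Assumes the Prometheus-compatible API is served under /prometheus.
    """
    query_string = urlencode({"query": promql})
    url = f"{base_url.rstrip('/')}/prometheus/api/v1/query?{query_string}"
    return url, {TENANT_HEADER: tenant_id}


if __name__ == "__main__":
    # Hypothetical address and tenants: the same query, isolated per tenant.
    for tenant in ("team-a", "team-b"):
        print(build_instant_query("http://localhost:8080", tenant, "up"))
```

The same query sent with two different tenant IDs is evaluated against two separate data sets, which is what lets independent teams share one cluster.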

Migrating to Grafana Mimir

If you're migrating to Grafana Mimir, refer to the following documents:

Deploying Grafana Mimir

For information about how to deploy Grafana Mimir, refer to Deploying Grafana Mimir.

Getting started

If you’re new to Grafana Mimir, read the Getting started guide.

Before deploying Grafana Mimir in a production environment, read:

  1. An overview of Grafana Mimir’s architecture
  2. Configuring Grafana Mimir
  3. Running Grafana Mimir in production

Documentation

Refer to the following links to access Grafana Mimir documentation:

Contributing

To contribute to Grafana Mimir, refer to Contributing to Grafana Mimir.

Join the Grafana Mimir discussion

If you have any questions or feedback regarding Grafana Mimir, join the Grafana Mimir Discussion. Alternatively, consider joining the monthly Grafana Mimir Community Call.

Your feedback is always welcome, and you can also share it via the #mimir Slack channel.

License

Grafana Mimir is distributed under AGPL-3.0-only.

Comments
  • Support for rollout-operator and Zone Awareness

    What this PR does

    For the record: the original PR this is based on was authored by @ryan-dyer-sp , see #2437

    Adds replication zone support for the alertmanager, ingester, and store-gateway components, including a migration path, tests, and documentation.

    The migration is written in a way so that:

    1. The first step sets the final configuration in the Mimir YAML configuration:
    • this means it can be validated right at the start
    • subsequent steps alter CLI options and restart only what's necessary
    2. Steps use named toggles; I realized that a single number would be too hard for us to maintain.

    Which issue(s) this PR fixes or relates to

    Fixes #2020

    Checklist

    • [ ] Tests updated
    • [x] Documentation added
    • [x] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
    helm release/notified-changelog-cut 
    opened by krajorama 35
  • Add compactor HTTP API for uploading TSDB blocks

    What this PR does

    Add compactor HTTP API for uploading TSDB blocks.

    TODOs

    • [x] Add HTTP endpoints in cloud gateway
    • [x] Add HTTP endpoints in GEM gateway
    • [x] Incorporate @pracucci's feedback
    • [x] Add tests
    • [x] Add validation when creating a block upload session
    • [x] Validate block metadata
    • [x] Validate block files(?) (@colega suggestion)
    • [x] Make sure that when starting a backfill, file lengths are sent with file index
    • [x] Validate that block time range is within retention period (@aldernero working on this)
    • [x] Validate minTime/maxTime in meta.json (@aldernero working on this)
    • [x] Validate block ID on backfill start
    • [x] Test output from sanitizeMeta
    • [x] Mark user-uploaded blocks, for debugging and security(?) (this is in place through thanos.source property in meta.json)
    • [x] Make sure that a backfill can be restarted, in case it got interrupted
    • [x] Add/fix tests
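The metadata checks in the list above (block ID, minTime/maxTime, retention period) could be sketched roughly as follows. This is not the compactor's actual code: `validate_block_meta` and its exact rules are assumptions for illustration; only the `ulid`, `minTime`, and `maxTime` field names come from the TSDB meta.json format.

```python
from __future__ import annotations

import re

# ULIDs are 26 characters of Crockford base32 (no I, L, O, or U).
ULID_RE = re.compile(r"^[0-9A-HJKMNP-TV-Z]{26}$")


def validate_block_meta(meta: dict, now_ms: int, retention_ms: int) -> list[str]:
    """Return a list of problems found in a block's meta.json contents."""
    problems = []
    if not ULID_RE.match(str(meta.get("ulid", ""))):
        problems.append("invalid block ID")
    min_t, max_t = meta.get("minTime"), meta.get("maxTime")
    if not isinstance(min_t, int) or not isinstance(max_t, int):
        problems.append("minTime/maxTime missing or not integers")
        return problems
    if min_t < 0 or max_t < min_t:
        problems.append("minTime must be non-negative and not after maxTime")
    if max_t > now_ms:
        problems.append("block ends in the future")
    # Assumption: retention_ms == 0 means "no retention limit".
    if retention_ms > 0 and max_t < now_ms - retention_ms:
        problems.append("block is older than the retention period")
    return problems
```

Returning every problem at once, rather than failing on the first, gives the uploader a complete picture before retrying a backfill.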

    Which issue(s) this PR fixes or relates to

    Checklist

    • [x] Tests updated
    • [x] Documentation added
    • [x] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
    enhancement component/compactor 
    opened by aknuds1 34
  • Fix panic in distributor due to early cleanup

    Fixes: https://github.com/grafana/mimir/issues/2266

    Edit 2022-07-22: Most of the code in pkg/distributor/forwarding has been rewritten, so looking at the diff between the old and the new code is probably not very helpful when reviewing. I recommend looking at the new implementation as if it were completely new.

    In a conversation via DMs with @bboreham, we concluded that it would be better to keep the forwarding logic as isolated from the distributor's PushWithCleanup() as possible. This should help to ensure that there are no race conditions or pool usage bugs.

    opened by replay 26
  • Add a docker-compose local setup to fully test Mimir

    What this PR does: In this PR I propose to introduce a docker-compose local setup (based on single binary and memberlist) to give the community a quick way to try the latest stable release of Mimir in an HA setup. It also runs Prometheus (used both to scrape Mimir metrics and to run recording rules) as well as Grafana with our dashboard provisioned.

    The PR includes a tutorial-style README.md guiding the user step by step. I've tried to follow the Grafana tutorial style, but I haven't used the tutorial markdown syntax, given that for the moment it won't be published as a Grafana tutorial.

    Which issue(s) this PR fixes: Fixes #991 Fixes #1024

    Checklist

    • [ ] Tests updated
    • [ ] Documentation added
    • [ ] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
    type/docs 
    opened by pracucci 24
  • Runtime override of tenant-specific active series custom trackers

    What this PR does

    This PR moves the active series custom trackers to the runtime configuration and also introduces tenant-specific overrides for these matchers. I am uploading this to start an early discussion of the design.

    Key points I would already like to discuss:

    • ~~Moving the custom trackers to runtime configuration is a breaking change as it removes the flag and changes the default behavior~~
    • Config change management could be better handled on manager side, with some content hashing, but that would introduce dskit changes...
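To make the idea of tenant-specific tracker overrides concrete, here is a minimal sketch of how a lookup might resolve. The function `trackers_for`, the tracker names, and the rule that a tenant entry replaces the defaults entirely (rather than merging with them) are assumptions for illustration, not the PR's implementation.

```python
from __future__ import annotations


def trackers_for(
    tenant: str,
    defaults: dict[str, str],
    overrides: dict[str, dict[str, str]],
) -> dict[str, str]:
    """Return tracker name -> label matcher for a tenant.

    Assumption: a tenant-specific entry replaces the defaults as a whole.
    """
    return dict(overrides.get(tenant, defaults))


if __name__ == "__main__":
    # Hypothetical tracker names and matchers.
    defaults = {"dev": '{namespace=~"dev-.*"}'}
    overrides = {"tenant-1": {"prod": '{namespace=~"prod-.*"}'}}
    print(trackers_for("tenant-1", defaults, overrides))
    print(trackers_for("tenant-2", defaults, overrides))
```

Because the overrides live in runtime configuration, a lookup like this can pick up a reloaded file without restarting the process.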

    Which issue(s) this PR fixes

    https://github.com/grafana/mimir-squad/issues/526

    Fixes #

    Checklist

    • [x] Tests updated
    • [x] Documentation added
    • [x] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
    opened by gubjanos 23
  • Fix makefiles in development folder to properly check yml in CI/CD

    What this PR does

    Fixes the following problem: currently, you can modify docker-compose.yml directly (although it is supposed to be generated) without modifying the docker-compose.jsonnet template, and the change still passes the CI/CD tests, which doesn't seem to be intended. The cause is described in https://stackoverflow.com/questions/3931741/why-does-make-think-the-target-is-up-to-date: if a make target has the same name as a file that already exists in that folder, make does not regenerate the file and therefore does not notice that it has changed. So we use .PHONY to ensure that make check in our CI/CD actually regenerates the file.
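The behavior described here can be modeled in a few lines. This is a deliberately simplified picture of make's decision for a single target, not its real algorithm; the function `make_would_rebuild` is hypothetical.

```python
def make_would_rebuild(target_exists: bool, stale_prerequisite: bool, phony: bool) -> bool:
    """Simplified model of whether make runs the recipe for one target."""
    if phony:
        # Targets declared .PHONY are always considered out of date.
        return True
    if not target_exists:
        return True
    # An existing file is only rebuilt when a prerequisite is newer than it.
    return stale_prerequisite


if __name__ == "__main__":
    # A hand-edited docker-compose.yml exists and no prerequisite is newer:
    print(make_would_rebuild(True, False, phony=False))  # recipe is skipped
    print(make_would_rebuild(True, False, phony=True))   # recipe runs
```

In the first case the generated file is never refreshed, so a check that compares it against the repository sees no difference; declaring the target phony forces regeneration.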

    Which issue(s) this PR fixes or relates to

    N/A. CI/CD issue only, doesn't affect users

    Checklist

    • [x] Tests updated
    • [x] Documentation added
    • [x] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
    opened by zenador 21
  • track sample OOO/wallclock delay in histogram

    This should help with building an understanding of customer requirements regarding how far back in time samples tend to go, both with respect to the wall clock and with respect to the sample with the highest timestamp seen. This whole codebase is pretty new to me, so I want to check if this makes sense. Note that we don't track this per tenant ID, as that would be expensive and I don't think we need to. cc @codesome

    Note: changes are in mimir and also in vendored prometheus. For now I made the change in vendor dir directly (hence the CI failure), but if we want to take this forward, it will be done properly.
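The two delays being tracked can be sketched as follows. This is an illustration only: the bucket bounds, the class `DelayHistogram`, and the choice to record the out-of-order delay only for samples older than the newest one seen are assumptions, not what the PR does in Mimir or Prometheus.

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds, in seconds.
BUCKETS = [1, 10, 60, 600, 3600]


class DelayHistogram:
    """Minimal histogram with inclusive upper bounds and a final +Inf bucket."""

    def __init__(self, bounds):
        self.bounds = list(bounds)
        # Per-bucket (non-cumulative) counts; the last slot is +Inf.
        self.counts = [0] * (len(self.bounds) + 1)

    def observe(self, value):
        # bisect_left finds the first bound that is >= value.
        self.counts[bisect_left(self.bounds, value)] += 1


def record_sample(sample_ts, max_seen_ts, now, ooo_hist, wallclock_hist):
    """Record both delays for one sample and return the updated max timestamp."""
    if sample_ts < max_seen_ts:
        # Delay relative to the newest sample seen so far.
        ooo_hist.observe(max_seen_ts - sample_ts)
    # Delay relative to the wall clock (clamped for samples from the future).
    wallclock_hist.observe(max(0, now - sample_ts))
    return max(max_seen_ts, sample_ts)
```

Keeping a single pair of histograms, rather than one per tenant, matches the cost concern raised above.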

    Checklist

    • [ ] Tests updated
    • [ ] Documentation added
    • [ ] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
    opened by Dieterbe 21
  • Add alertmanager fallback config in Helm chart

    What this PR does

    Adds the possibility to define a fallback config for alertmanager from the Helm chart config.

    Checklist

    • [x] Tests updated
    • [ ] Documentation added
    • [x] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
    helm release/notified-changelog-cut 
    opened by duncan485 20
  • Automate release process

    In preparation of Mimir launch, we need to automate the release process:

    • [x] Build binaries (+ checksums)
    • [x] Build and push Docker images to docker.io
    • [x] https://github.com/grafana/mimir/issues/837
    • [x] https://github.com/grafana/mimir/pull/975
    • [x] https://github.com/grafana/mimir/pull/1138
    • [x] Push all Mimir images (including tools and build image) to docker.io/grafana/*
    • [x] Revise RELEASE.md
      • [x] Release schedule: should be removed, it was from Cortex
      • [x] Change Cortex->Mimir
      • [x] Ensure release procedure is up to date
      • [x] Decide how to tag release, start at v2.0.0 -> design doc

    Out of scope:

    • publishing docs to the grafana.com website (tracked in another issue - cc @jdbaldry can you fill in the issue please?)
    • rpm, deb, homebrew packages (in fact, get rid of any leftovers in makefile and elsewhere)
    • automating upload of binaries
    opened by pracucci 18
  • Slow integration tests caused by slow update of ring client metrics

    While investigating slow integration tests, I've noticed that some of them (e.g. TestSingleBinaryWithMemberlist) are significantly slowed down by assertions on ring metrics, like this one:

    require.NoError(t, s.Stop(mimir1))
    require.NoError(t, mimir2.WaitSumMetrics(e2e.Equals(2*512), "cortex_ring_tokens_total"))
    

    Why? Because the ring client updates the metrics every 10s (hardcoded): https://github.com/grafana/dskit/blob/84c00dae89477871dbfa0b83c823c7258f53e3bd/ring/ring.go#L288-L302

    This means that every time there's a ring change (e.g. s.Stop(mimir1)) and we then wait for that change to be propagated (e.g. mimir2.WaitSumMetrics(e2e.Equals(2*512), "cortex_ring_tokens_total")), we end up waiting up to 10s after the change has been propagated, just because client metrics are not updated right after the update has been received by the ring client.

    The regression has been introduced in dskit PR 50.
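The extra wait can be illustrated with a small calculation. The sketch assumes the metrics refresh fires at fixed multiples of the period starting from time zero, which is a simplification; `observed_wait` is a hypothetical helper, not part of dskit or the test framework.

```python
import math


def observed_wait(change_at: float, refresh_period: float = 10.0) -> float:
    """Seconds between a ring change and the next periodic metrics refresh."""
    next_refresh = math.ceil(change_at / refresh_period) * refresh_period
    return next_refresh - change_at


if __name__ == "__main__":
    # A change that lands just after a refresh waits almost a full period.
    for t in (0.5, 9.0, 12.5):
        print(t, observed_wait(t))
```

A test that makes several ring changes pays this wait each time, which is how a handful of assertions can add tens of seconds to a run.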

    type/tests 
    opened by pracucci 18
  • `markblocks` tool to mark blocks for deletion or as non-compactable

    What this PR does

    I think we should have this committed in the repo rather than having to look for it in issue comments.

    Which issue(s) this PR fixes or relates to

    Ref: https://github.com/grafana/mimir/issues/1537

    Checklist

    • [ ] Tests updated
    • [x] Documentation added
    • [x] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
    opened by colega 17
  • [Helm][v4.0.0] Gateway component doesn't have `/metrics` endpoint

    Describe the bug

    After deploying mimir-distributed version 4.0.0, the gateway component has no /metrics endpoint for its ServiceMonitor to scrape.

    To Reproduce

    Steps to reproduce the behavior:

    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm repo add grafana https://grafana.github.io/helm-charts
    helm repo update
    
    helm upgrade --install prometheus-operator prometheus-community/kube-prometheus-stack
    helm fetch --untar grafana/mimir-distributed --version 4.0.0
    helm upgrade --install mimir ./mimir-distributed \
      -f ./mimir-distributed/values.yaml \
      -f ./mimir-distributed/large.yaml
    

    Expected behavior

    Either be able to disable the ServiceMonitor for the gateway or add the /metrics endpoint as expected (see workaround below)

    Environment

    • Infrastructure: AWS EKS
    • Deployment tool: Helm

    Additional Context

    My workaround has been to add nginx-prometheus-exporter as sidecar

    gateway:
      enabledNonEnterprise: true
      extraContainers:
      - args:
        - -nginx.retries=5
        - -nginx.scrape-uri=http://127.0.0.1:8080/nginx_status
        - -web.telemetry-path=/metrics
        - -web.listen-address=:9113
        image: nginx/nginx-prometheus-exporter:0.11.0
        name: nginx-exporter
        resources:
          requests:
            cpu: 20m
            memory: 32Mi
      nginx:
        config:
          serverSnippet: |
            location = /metrics {
              proxy_pass http://127.0.0.1:9113$request_uri;
              auth_basic off;
            }
            location = /nginx_status {
              stub_status;
              auth_basic off;
            }
        verboseLogging: false
      podDisruptionBudget:
        maxUnavailable: 50%
      replicas: 5
      resources: {}
    
    

    opened by carlosjgp 0
  • Make query-frontend cache TTL configurable, and increase the default.

    Is your feature request related to a problem? Please describe.

    query-frontend caches results, but it sets a TTL of 7d to those results:

    https://github.com/grafana/mimir/blob/cac269d4837e168b3b3f17b7e2e56cdacd79c4c7/pkg/frontend/querymiddleware/split_and_cache.go#L37-L38

    If you're caching a year-long query, it means that you'll have to recalculate it next week again, which is undesirable.

    Describe the solution you'd like

    Make the TTL configurable, and allow no TTL at all (Memcache is LRU, so why care about TTL?)

    Note that the maximum TTL for Memcache appears to be 30d.
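The 30d note matters for how a configurable TTL would be passed to the cache. In the memcached protocol, expiration values up to 30 days are treated as relative seconds, larger values are interpreted as absolute Unix timestamps, and 0 means no expiry. A minimal sketch under those rules (the helper `memcached_exptime` is hypothetical and not part of Mimir):

```python
THIRTY_DAYS = 30 * 24 * 60 * 60  # seconds


def memcached_exptime(ttl_seconds: int, now: int) -> int:
    """Translate a desired TTL into the expiration value sent to memcached."""
    if ttl_seconds <= 0:
        # No expiry: the entry stays until the LRU evicts it.
        return 0
    if ttl_seconds <= THIRTY_DAYS:
        # Small values are interpreted as relative seconds.
        return ttl_seconds
    # Larger values must be sent as an absolute Unix timestamp.
    return now + ttl_seconds
```

With a mapping like this, "no TTL at all" simply becomes 0, leaving eviction to the LRU as the request suggests.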

    enhancement good first issue 
    opened by colega 0
  • Docs: Introduce dedicated reference section for config parameters

    What this PR does

    This moves the configuration parameter reference from a sub-sub-sub page to a dedicated top-level section where it stands out more and is easier to access (imho) for someone not very familiar with the structure of the documentation.

    It also adds the --init flag to the docker run command that serves docs locally. At least on my system, this was necessary to make the container shut down on Ctrl+C as advertised.

    Which issue(s) this PR fixes or relates to

    n/a

    Checklist

    • [ ] Tests updated
    • [ ] Documentation added
    • [ ] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
    type/docs ease-of-use 
    opened by flxbk 0
  • Helm 4.0.0 Gateway nginx config is not working with zone-aware alertmanager configuration (mimir oss)

    Describe the bug

    Helm 4.0.0 Gateway nginx config is not working with zone-aware alertmanager configuration (mimir oss)

    To Reproduce

    Steps to reproduce the behavior:

    1. Install mimir-distributed version 3.3.x
    2. Upgrade to mimir-distributed version 4.0.0 with zone-aware configuration disabled
    3. Migrate to the unified proxy deployment using the following procedure: https://grafana.com/docs/mimir/v2.5.x/operators-guide/deploy-grafana-mimir/migrate-to-unified-proxy-deployment/
    4. Migrate alertmanager from single zone to zone-aware replication using the following procedure: https://grafana.com/docs/mimir/latest/migration-guide/migrating-from-single-zone-with-helm/

    Expected behavior

    Expect the alert manager to be available through the gateway configuration but it is not.

    The problem is that the gateway nginx.conf is not correct. The service "mimir-alertmanager" has been replaced by "mimir-alertmanager-zone-a", "mimir-alertmanager-zone-b" and "mimir-alertmanager-zone-c", but the configuration still points at the old name:

    
        # Alertmanager endpoints
        location /alertmanager {
          proxy_pass      http://mimir-alertmanager.mimir.svc.cluster.local:8080$request_uri;
        }
        location = /multitenant_alertmanager/status {
          proxy_pass      http://mimir-alertmanager.mimir.svc.cluster.local:8080$request_uri;
        }
        location = /api/v1/alerts {
          proxy_pass      http://mimir-alertmanager.mimir.svc.cluster.local:8080$request_uri;
        }
    

    Resolution proposal

    Change the gateway nginx.conf (see: https://github.com/grafana/mimir/blob/mimir-distributed-4.0.0/operations/helm/charts/mimir-distributed/values.yaml#L2485) to something like this:

    
        # Alertmanager endpoints
        location /alertmanager {
          proxy_pass      http://mimir-alertmanager-headless.mimir.svc.cluster.local:8080$request_uri;
        }
        location = /multitenant_alertmanager/status {
          proxy_pass      http://mimir-alertmanager-headless.mimir.svc.cluster.local:8080$request_uri;
        }
        location = /api/v1/alerts {
          proxy_pass      http://mimir-alertmanager-headless.mimir.svc.cluster.local:8080$request_uri;
        }
    
    opened by df-cgdm 0
  • Alerts: Add alert that triggers for idle alertmanager instances

    What this PR does

    This adds an alert to detect alertmanager instances that don't own any tenants.

    Which issue(s) this PR fixes or relates to

    Fixes #1959

    Checklist

    • [ ] Tests updated
    • [x] Documentation added
    • [x] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
    type/docs monitoring/alerts 
    opened by flxbk 0
  • Introduce querier.max-partial-query-length flag

    What this PR does

    This introduces the querier.max-partial-query-length flag to allow limiting the time range for (partial) queries at the querier level. It also deprecates store.max-query-length which became ambiguous due to its different semantics in limiting query ranges at the frontend and querier levels.
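One way to picture the difference between the two levels: the frontend sees the full query range, while each querier only sees a partial range after splitting. The sketch below is a simplified model with hypothetical helpers (`split_range`, `exceeds_limit`), an assumed one-day split interval, and half-open ranges; it does not reproduce Mimir's actual splitting or limit code.

```python
from __future__ import annotations


def split_range(start: int, end: int, interval: int) -> list[tuple[int, int]]:
    """Split [start, end) into partial ranges aligned to interval boundaries."""
    parts = []
    cur = start
    while cur < end:
        nxt = min(end, (cur // interval + 1) * interval)
        parts.append((cur, nxt))
        cur = nxt
    return parts


def exceeds_limit(start: int, end: int, limit: int) -> bool:
    """Return True if the range is longer than the limit (0 disables the limit)."""
    return limit > 0 and (end - start) > limit


if __name__ == "__main__":
    day = 24 * 60 * 60
    full = (0, 3 * day)
    limit = 2 * day
    # The full range is over the limit, but no partial range is.
    print(exceeds_limit(*full, limit))
    print([exceeds_limit(s, e, limit) for s, e in split_range(*full, day)])
```

The same numeric limit therefore accepts or rejects a query depending on where it is applied, which is the ambiguity the separate flag removes.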

    Which issue(s) this PR fixes or relates to

    Fixes #2793

    Checklist

    • [x] Tests updated
    • [ ] Documentation added
    • [x] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
    component/querier ease-of-use 
    opened by flxbk 0
Releases (mimir-2.5.0)
  • mimir-2.5.0 (Dec 14, 2022)

    This release contains 230 PRs from 43 authors, including new contributors Aldo D'Aquino, Anıl Mısırlıoğlu, Charles Korn, Danny Staple, Dylan Crees, Eduardo Silvi, FG, Jesse Weaver, KarlisAG, Leegin-darknight, Rohan Kumar, Wille Faler, Y.Horie, manohar-koukuntla, paulroche, songjiayang, Éamon Ryan. Thank you!

    Grafana Mimir version 2.5 release notes

    Grafana Labs is excited to announce version 2.5 of Grafana Mimir.

    The highlights that follow include the top features, enhancements, and bugfixes in this release. For the complete list of changes, see the changelog.

    Features and enhancements

    • Alertmanager Discord support: Alertmanager can now be configured to send alerts in Discord channels.

    • Configurable TLS minimum version and cipher suites: We added the flags -server.tls-min-version and -server.tls-cipher-suites that can be used to define the minimum TLS version and the supported cipher suites in all HTTP and gRPC servers in Mimir.

    • Lower memory usage in store-gateway, ingester and alertmanager: We made various changes related to how index lookups are performed and how the active series custom trackers are implemented, which results in better performance and lower overall memory usage in the store-gateway and ingester. We also optimized the alertmanager, which results in a 50% reduction in memory usage in use cases with larger numbers of tenants.

    • Improved Mimir dashboards: We added two new dashboards named Mimir / Overview resources and Mimir / Overview networking. Furthermore, we have made various improvements to the following existing dashboards:

      • Mimir / Overview: Add "remote read", "metadata", and "exemplar" queries.
      • Mimir / Writes: Add optional row about the distributor's new forwarding feature.
      • Mimir / Tenants: Add insights into the read path.

    Helm chart improvements

    • Zone aware replication: Helm now supports deploying the ingesters and store-gateways across different availability zones. The replication is also zone-aware, so multiple instances in one zone can fail without any service interruption, and rollouts can be performed faster because many instances of each zone can be restarted together, as opposed to them all restarting in sequence.

      This is a breaking change; for details on how to upgrade, please review the Helm changelog.

    • Running without root privileges: Mimir, GEM, and Agent processes no longer require root privileges to run.

    • Unified reverse proxy (gateway) configuration for Mimir and GEM: This change allows for an easier upgrade path from Mimir to GEM, without any downtime. The unified configuration also makes it possible to autoscale the GEM gateway pods and it supports OpenShift Route. The change also deprecates the nginx section in the configuration. The section will be removed in release 7.0.0.

    • Updated MinIO: The MinIO sub-chart was updated from 4.x to 5.0.0. Note that this update inherits a breaking change because the MinIO gateway mode was removed.

    • Updated sizing plans: We updated our sizing plans to better reflect how we recommend running Mimir and GEM in production. Note that this includes a breaking change for users of the "small" plan; more details can be found in the Helm changelog.

    • Various quality of life improvements

      • Rollout strategies without downtime
      • Read path and compactor configuration refresh, providing better default settings
      • OTLP ingestion support in the Nginx configuration
      • A default configuration for alertmanager, so the user interface and the sending of alerts from the ruler works out of the box
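The benefit of zone-aware replication described above can be sketched with a simplified quorum model. It assumes one replica per zone and a majority quorum over the replicas; the function `write_quorum_available` and the zone names are hypothetical, and the real ring logic is more involved.

```python
from __future__ import annotations


def write_quorum_available(replica_zones: list[str], failed_zones: set[str]) -> bool:
    """Return True if a majority of a series' replicas sit in healthy zones."""
    healthy = sum(1 for zone in replica_zones if zone not in failed_zones)
    quorum = len(replica_zones) // 2 + 1
    return healthy >= quorum


if __name__ == "__main__":
    # Zone-aware placement: one replica per zone, so losing a whole zone is fine.
    print(write_quorum_available(["zone-a", "zone-b", "zone-c"], {"zone-a"}))
    # Without zone awareness two replicas may share a zone and be lost together.
    print(write_quorum_available(["zone-a", "zone-a", "zone-b"], {"zone-a"}))
```

Since any number of instances within a single zone can be down at once, a rollout can restart a whole zone together instead of one instance at a time.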

    Bug fixes

    • Flusher: Added Overrides as a dependency to prevent panics when starting with -target=flusher. PR 3151
    • Query-frontend: properly close gRPC streams to the query-scheduler to stop memory and goroutines leak. PR 3302
    • Ruler: persist evaluation delay configured in the rulegroup. PR 3392
    • Fix panics in OTLP ingest path when parse errors occur. PR 3538

    Changelog

    2.5.0

    Grafana Mimir

    • [CHANGE] Flag -azure.msi-resource is now ignored, and will be removed in Mimir 2.7. This setting is now made automatically by Azure. #2682
    • [CHANGE] Experimental flag -blocks-storage.tsdb.out-of-order-capacity-min has been removed. #3261
    • [CHANGE] Distributor: Wrap errors from pushing to ingesters with useful context, for example clarifying timeouts. #3307
    • [CHANGE] The default value of -server.http-write-timeout has changed from 30s to 2m. #3346
    • [CHANGE] Reduce period of health checks in connection pools for querier->store-gateway, ruler->ruler, and alertmanager->alertmanager clients to 10s. This reduces the time to fail a gRPC call when the remote stops responding. #3168
    • [CHANGE] Hide TSDB block ranges period config from doc and mark it experimental. #3518
    • [FEATURE] Alertmanager: added Discord support. #3309
    • [ENHANCEMENT] Added -server.tls-min-version and -server.tls-cipher-suites flags to configure cipher suites and min TLS version supported by HTTP and gRPC servers. #2898
    • [ENHANCEMENT] Distributor: Add age filter to forwarding functionality, to not forward samples which are older than defined duration. If such samples are not ingested, cortex_discarded_samples_total{reason="forwarded-sample-too-old"} is increased. #3049 #3113
    • [ENHANCEMENT] Store-gateway: Reduce memory allocation when generating ids in index cache. #3179
    • [ENHANCEMENT] Query-frontend: truncate queries based on the configured creation grace period (--validation.create-grace-period) to avoid querying too far into the future. #3172
    • [ENHANCEMENT] Ingester: Reduce activity tracker memory allocation. #3203
    • [ENHANCEMENT] Query-frontend: Log more detailed information in the case of a failed query. #3190
    • [ENHANCEMENT] Added -usage-stats.installation-mode configuration to track the installation mode via the anonymous usage statistics. #3244
    • [ENHANCEMENT] Compactor: Add new cortex_compactor_block_max_time_delta_seconds histogram for detecting if compaction of blocks is lagging behind. #3240 #3429
    • [ENHANCEMENT] Ingester: reduced the memory footprint of active series custom trackers. #2568
    • [ENHANCEMENT] Distributor: Include X-Scope-OrgId header in requests forwarded to configured forwarding endpoint. #3283 #3385
    • [ENHANCEMENT] Alertmanager: reduced memory utilization in Mimir clusters with a large number of tenants. #3309
    • [ENHANCEMENT] Add experimental flag -shutdown-delay to allow components to wait after receiving SIGTERM and before stopping. In this time the component returns 503 from /ready endpoint. #3298
    • [ENHANCEMENT] Go: update to go 1.19.3. #3371
    • [ENHANCEMENT] Alerts: added RulerRemoteEvaluationFailing alert, firing when communication between ruler and frontend fails in remote operational mode. #3177 #3389
    • [ENHANCEMENT] Clarify which S3 signature versions are supported in the error "unsupported signature version". #3376
    • [ENHANCEMENT] Store-gateway: improved index header reading performance. #3393 #3397 #3436
    • [ENHANCEMENT] Store-gateway: improved performance of series matching. #3391
    • [ENHANCEMENT] Move the validation of incoming series before the distributor's forwarding functionality, so that we don't forward invalid series. #3386 #3458
    • [ENHANCEMENT] S3 bucket configuration now validates that the endpoint does not have the bucket name prefix. #3414
    • [ENHANCEMENT] Query-frontend: added "fetched index bytes" to query statistics, so that the statistics contain the total bytes read by store-gateways from TSDB block indexes. #3206
    • [ENHANCEMENT] Distributor: push wrapper should only receive unforwarded samples. #2980
    • [BUGFIX] Flusher: Add Overrides as a dependency to prevent panics when starting with -target=flusher. #3151
    • [BUGFIX] Updated golang.org/x/text dependency to fix CVE-2022-32149. #3285
    • [BUGFIX] Query-frontend: properly close gRPC streams to the query-scheduler to stop memory and goroutines leak. #3302
    • [BUGFIX] Ruler: persist evaluation delay configured in the rulegroup. #3392
    • [BUGFIX] Ring status pages: show 100% ownership as "100%", not "1e+02%". #3435
    • [BUGFIX] Fix panics in OTLP ingest path when parse errors exist. #3538

    Mixin

    • [CHANGE] Alerts: Change MimirSchedulerQueriesStuck for time to 7 minutes to account for the time it takes for HPA to scale up. #3223
    • [CHANGE] Dashboards: Removed the Querier > Stages panel from the Mimir / Queries dashboard. #3311
    • [CHANGE] Configuration: The format of the autoscaling section of the configuration has changed to support more components. #3378
      • Instead of specific config variables for each component, they are listed in a dictionary. For example, autoscaling.querier_enabled becomes autoscaling.querier.enabled.
    • [FEATURE] Dashboards: Added "Mimir / Overview resources" dashboard, providing a high-level view of a Mimir cluster's resource utilization. #3481
    • [FEATURE] Dashboards: Added "Mimir / Overview networking" dashboard, providing a high-level view of a Mimir cluster's network bandwidth, inflight requests and TCP connections. #3487
    • [FEATURE] Compile baremetal mixin along k8s mixin. #3162 #3514
    • [ENHANCEMENT] Alerts: Add MimirRingMembersMismatch firing when a component does not have the expected number of running jobs. #2404
    • [ENHANCEMENT] Dashboards: Add optional row about the Distributor's metric forwarding feature to the Mimir / Writes dashboard. #3182 #3394 #3394 #3461
    • [ENHANCEMENT] Dashboards: Remove the "Instance Mapper" row from the "Alertmanager Resources Dashboard". This is a Grafana Cloud specific service and not relevant for external users. #3152
    • [ENHANCEMENT] Dashboards: Add "remote read", "metadata", and "exemplar" queries to "Mimir / Overview" dashboard. #3245
    • [ENHANCEMENT] Dashboards: Use non-red colors for non-error series in the "Mimir / Overview" dashboard. #3246
    • [ENHANCEMENT] Dashboards: Add support to multi-zone deployments for the experimental read-write deployment mode. #3256
    • [ENHANCEMENT] Dashboards: If enabled, add a new row to the Mimir / Writes dashboard for distributor autoscaling metrics. #3378
    • [ENHANCEMENT] Dashboards: Add read path insights row to the "Mimir / Tenants" dashboard. #3326
    • [ENHANCEMENT] Alerts: Add runbook urls for alerts. #3452
    • [ENHANCEMENT] Configuration: Make it possible to configure namespace label, job label, and job prefix. #3482
    • [ENHANCEMENT] Dashboards: improved resources and networking dashboards to work with read-write deployment mode too. #3497 #3504 #3519 #3531
    • [ENHANCEMENT] Alerts: Added "MimirDistributorForwardingErrorRate" alert, which fires on high error rates in the distributor’s forwarding feature. #3200
    • [ENHANCEMENT] Improve phrasing in Overview dashboard. #3488
    • [BUGFIX] Dashboards: Fix legend showing persistentvolumeclaim when using deployment_type=baremetal for Disk space utilization panels. #3173 #3184
    • [BUGFIX] Alerts: Fixed MimirGossipMembersMismatch alert when Mimir is deployed in read-write mode. #3489
    • [BUGFIX] Dashboards: Remove "Inflight requests" from object store panels because the panel is not tracking the inflight requests to object storage. #3521

    Jsonnet

    • [CHANGE] Replaced the deprecated policy/v1beta1 with policy/v1 when configuring a PodDisruptionBudget. #3284
    • [CHANGE] Common storage configuration is now used to configure object storage in all components. This is a breaking change in terms of Jsonnet manifests and also a CLI flag update for components that use object storage, so it will require a rollout of those components. The changes include: #3257
      • blocks_storage_backend was renamed to storage_backend and is now used as the common storage backend for all components.
        • So were the related blocks_storage_azure_account_(name|key) and blocks_storage_s3_endpoint configurations.
      • storage_s3_endpoint is now rendered by default using the aws_region configuration instead of a hardcoded us-east-1.
      • ruler_client_type and alertmanager_client_type were renamed to ruler_storage_backend and alertmanager_storage_backend respectively, and their corresponding CLI flags won't be rendered unless explicitly set to a value different from the one in storage_backend (like local).
      • alertmanager_s3_bucket_name, alertmanager_gcs_bucket_name and alertmanager_azure_container_name have been removed, and replaced by a single alertmanager_storage_bucket_name configuration used for all object storages.
      • genericBlocksStorageConfig configuration object was removed, so any extensions to it will now be ignored. Use blockStorageConfig instead.
      • rulerClientConfig and alertmanagerStorageClientConfig configuration objects were renamed to rulerStorageConfig and alertmanagerStorageConfig respectively, so any extensions to their previous names will now be ignored. Use the new names instead.
      • The CLI flags *.s3.region are no longer rendered as they are optional and the region can be inferred by Mimir by performing an initial API call to the endpoint.
      • The migration to this change should usually consist of:
        • Renaming blocks_storage_backend key to storage_backend.
        • For Azure/S3:
          • Renaming blocks_storage_(azure|s3)_* configurations to storage_(azure|s3)_*.
          • If the ruler_storage_(azure|s3)_* and alertmanager_storage_(azure|s3)_* keys were different from the blocks_storage_* ones, they should now be provided using CLI flags. See the configuration reference for more details.
        • Removing ruler_client_type and alertmanager_client_type if their values match the storage_backend, or renaming them to their new names otherwise.
        • Reviewing any possible extensions to genericBlocksStorageConfig, rulerClientConfig and alertmanagerStorageClientConfig and moving them to the corresponding new options.
        • Renaming the alertmanager's bucket name configuration from provider-specific to the new alertmanager_storage_bucket_name key.
    • [CHANGE] The overrides-exporter.libsonnet file is now always imported. The overrides-exporter can be enabled in jsonnet by setting the following: #3379
      {
        _config+:: {
          overrides_exporter_enabled: true,
        }
      }
      
    • [FEATURE] Added support for experimental read-write deployment mode. Enabling the read-write deployment mode on an existing Mimir cluster is a destructive operation, because the cluster will be re-created. If you're creating a new Mimir cluster, you can deploy it in read-write mode by adding the following configuration: #3379 #3475 #3405
      {
        _config+:: {
          deployment_mode: 'read-write',
      
          // See operations/mimir/read-write-deployment.libsonnet for more configuration options.
          mimir_write_replicas: 3,
          mimir_read_replicas: 2,
          mimir_backend_replicas: 3,
        }
      }
      
    • [ENHANCEMENT] Add autoscaling support to the mimir-read component when running the read-write-deployment model. #3419
    • [ENHANCEMENT] Added $._config.usageStatsConfig to track the installation mode via the anonymous usage statistics. #3294
    • [ENHANCEMENT] The query-tee node port ($._config.query_tee_node_port) is now optional. #3272
    • [ENHANCEMENT] Add support for autoscaling distributors. #3378
    • [ENHANCEMENT] Make auto-scaling logic ensure integer KEDA thresholds. #3512
    • [BUGFIX] Fixed query-scheduler ring configuration for dedicated ruler's queries and query-frontends. #3237 #3239
    • [BUGFIX] Jsonnet: Fix auto-scaling so that ruler-querier CPU threshold is a string-encoded integer millicores value. #3520
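
    For the common storage configuration change described above, a migrated configuration for an S3 deployment might look like the following sketch. The key names come from the change notes. The values (region, bucket name, and the choice of a local ruler backend) are placeholders chosen for illustration, not recommended settings.

    ```jsonnet
    {
      _config+:: {
        // Previously: blocks_storage_backend: 's3'.
        storage_backend: 's3',

        // storage_s3_endpoint is rendered from aws_region by default,
        // so this sketch only sets the region (placeholder value).
        aws_region: 'eu-west-1',

        // Replaces the provider-specific alertmanager_s3_bucket_name
        // (placeholder bucket name).
        alertmanager_storage_bucket_name: 'example-alertmanager',

        // Previously: ruler_client_type. Only needed when the value
        // differs from storage_backend.
        ruler_storage_backend: 'local',
      },
    }
    ```

    Any extensions to the removed or renamed configuration objects still need to be reviewed separately, as the migration steps note.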

    Mimirtool

    • [FEATURE] Added mimirtool alertmanager verify command to validate configuration without uploading. #3440
    • [ENHANCEMENT] Added mimirtool rules delete-namespace command to delete all of the rule groups in a namespace including the namespace itself. #3136
    • [ENHANCEMENT] Refactor mimirtool analyze prometheus: add concurrency and resiliency #3349
      • Add --concurrency flag. Default: number of logical CPUs
    • [BUGFIX] --log.level=debug now correctly prints the response from the remote endpoint when a request fails. #3180

    Documentation

    • [ENHANCEMENT] Documented how to configure HA deduplication using Consul in a Mimir Helm deployment. #2972
    • [ENHANCEMENT] Improve MimirQuerierAutoscalerNotActive runbook. #3186
    • [ENHANCEMENT] Improve MimirSchedulerQueriesStuck runbook to reflect debug steps with querier auto-scaling enabled. #3223
    • [ENHANCEMENT] Use imperative for docs titles. #3178 #3332 #3343
    • [ENHANCEMENT] Docs: mention gRPC compression in "Production tips". #3201
    • [ENHANCEMENT] Update ADOPTERS.md. #3224 #3225
    • [ENHANCEMENT] Add a note for jsonnet deploying. #3213
    • [ENHANCEMENT] out-of-order runbook update with use case. #3253
    • [ENHANCEMENT] Fixed TSDB retention mentioned in the "Recover source blocks from ingesters" runbook. #3280
    • [ENHANCEMENT] Run Grafana Mimir in production using the Helm chart. #3072
    • [ENHANCEMENT] Use common configuration in the tutorial. #3282
    • [ENHANCEMENT] Updated detailed steps for migrating blocks from Thanos to Mimir. #3290
    • [ENHANCEMENT] Add scheme to DNS service discovery docs. #3450
    • [BUGFIX] Remove reference to file that no longer exists in contributing guide. #3404
    • [BUGFIX] Fix some minor typos in the contributing guide and on the runbooks page. #3418
    • [BUGFIX] Fix small typos in API reference. #3526
    • [BUGFIX] Fixed TSDB retention mentioned in the "Recover source blocks from ingesters" runbook. #3278
    • [BUGFIX] Fixed configuration example in the "Configuring the Grafana Mimir query-frontend to work with Prometheus" guide. #3374

    Tools

    • [FEATURE] Add copyblocks tool, to copy Mimir blocks between two GCS buckets. #3264
    • [ENHANCEMENT] copyblocks: copy no-compact global markers and optimize min time filter check. #3268
    • [ENHANCEMENT] Mimir rules GitHub action: Added the ability to change default value of label when running prepare command. #3236
    • [BUGFIX] Mimir rules GitHub action: Fix single line output. #3421

    All changes in this release: https://github.com/grafana/mimir/compare/mimir-2.4.0...mimir-2.5.0

  • mimir-2.5.0-rc.0(Nov 30, 2022)

    This release contains 227 PRs from 43 authors, including new contributors Aldo D'Aquino, Anıl Mısırlıoğlu, Charles Korn, Danny Staple, Dylan Crees, Eduardo Silvi, FG, Jesse Weaver, KarlisAG, Leegin-darknight, Rohan Kumar, Wille Faler, Y.Horie, manohar-koukuntla, paulroche, songjiayang, Éamon Ryan. Thank you!

    Grafana Mimir version 2.5.0-rc.0 release notes

    Grafana Labs is excited to announce version 2.5.0-rc.0 of Grafana Mimir.

    The highlights that follow include the top features, enhancements, and bugfixes in this release. For the complete list of changes, see the changelog.

    Features and enhancements

    • Alertmanager Discord support: Alertmanager can now be configured to send alerts to Discord channels.

    • Configurable TLS minimum version and cipher suites: We added the flags -server.tls-min-version and -server.tls-cipher-suites, which define the minimum TLS version and the supported cipher suites for all HTTP and gRPC servers in Mimir.

    • Lower memory usage in store-gateway, ingester and alertmanager: We changed how index lookups are performed and how the active series custom trackers are implemented, which results in better performance and lower overall memory usage in the store-gateway and ingester. We also optimized the alertmanager, which results in a 50% reduction in memory usage in use cases with larger numbers of tenants.

    • Improved Mimir dashboards: We added two new dashboards named Mimir / Overview resources and Mimir / Overview networking. We also made various improvements to the following existing dashboards:

      • Mimir / Overview: Add "remote read", "metadata", and "exemplar" queries.
      • Mimir / Writes: Add optional row about the distributor's new forwarding feature.
      • Mimir / Tenants: Add insights into the read path.

    Helm chart improvements

    • Zone aware replication: The Helm chart now supports deploying the ingesters and store-gateways across different availability zones. Replication is also zone-aware, so multiple instances in one zone can fail without any service interruption, and rollouts can be performed faster because many instances in each zone can be restarted together instead of all restarting in sequence.

      This is a breaking change. For details on how to upgrade, review the Helm changelog.

    • Running without root privileges: All Mimir, GEM and Agent processes no longer require root privileges to run.

    • Unified reverse proxy (gateway) configuration for Mimir and GEM: This change allows for an easier upgrade path from Mimir to GEM, without any downtime. The unified configuration also makes it possible to autoscale the GEM gateway pods, and it supports OpenShift Route. The change also deprecates the nginx section in the configuration. The section will be removed in release 7.0.0.

    • Updated MinIO: The MinIO sub-chart was updated from 4.x to 5.0.0. Note that this update inherits a breaking change because the MinIO gateway mode was removed.

    • Updated sizing plans: We updated our sizing plans so that they better reflect how we recommend running Mimir and GEM in production. Note that this includes a breaking change for users of the "small" plan. More details can be found in the Helm changelog.

    • Various quality of life improvements

      • Rollout strategies without downtime
      • Read path and compactor configuration refresh, providing better default settings
      • OTLP ingestion support in the Nginx configuration
      • A default configuration for alertmanager, so the user interface and the sending of alerts from the ruler works out of the box

    Bug fixes

    • Flusher: Added Overrides as a dependency to prevent panics when starting with -target=flusher. PR 3151
    • Query-frontend: properly close gRPC streams to the query-scheduler to stop memory and goroutine leaks. PR 3302
    • Ruler: persist evaluation delay configured in the rulegroup. PR 3392
    • Fix panics in OTLP ingest path when parse errors occur. PR 3538

    Changelog

    2.5.0-rc.0

    Grafana Mimir

    • [CHANGE] Flag -azure.msi-resource is now ignored, and will be removed in Mimir 2.7. This setting is now made automatically by Azure. #2682
    • [CHANGE] Experimental flag -blocks-storage.tsdb.out-of-order-capacity-min has been removed. #3261
    • [CHANGE] Distributor: Wrap errors from pushing to ingesters with useful context, for example clarifying timeouts. #3307
    • [CHANGE] The default value of -server.http-write-timeout has changed from 30s to 2m. #3346
    • [CHANGE] Reduce period of health checks in connection pools for querier->store-gateway, ruler->ruler, and alertmanager->alertmanager clients to 10s. This reduces the time to fail a gRPC call when the remote stops responding. #3168
    • [CHANGE] Hide TSDB block ranges period config from doc and mark it experimental. #3518
    • [FEATURE] Alertmanager: added Discord support. #3309
    • [ENHANCEMENT] Added -server.tls-min-version and -server.tls-cipher-suites flags to configure cipher suites and min TLS version supported by HTTP and gRPC servers. #2898
    • [ENHANCEMENT] Distributor: Add age filter to forwarding functionality, to not forward samples which are older than defined duration. If such samples are not ingested, cortex_discarded_samples_total{reason="forwarded-sample-too-old"} is increased. #3049 #3113
    • [ENHANCEMENT] Store-gateway: Reduce memory allocation when generating ids in index cache. #3179
    • [ENHANCEMENT] Query-frontend: truncate queries based on the configured creation grace period (--validation.create-grace-period) to avoid querying too far into the future. #3172
    • [ENHANCEMENT] Ingester: Reduce activity tracker memory allocation. #3203
    • [ENHANCEMENT] Query-frontend: Log more detailed information in the case of a failed query. #3190
    • [ENHANCEMENT] Added -usage-stats.installation-mode configuration to track the installation mode via the anonymous usage statistics. #3244
    • [ENHANCEMENT] Compactor: Add new cortex_compactor_block_max_time_delta_seconds histogram for detecting if compaction of blocks is lagging behind. #3240 #3429
    • [ENHANCEMENT] Ingester: reduced the memory footprint of active series custom trackers. #2568
    • [ENHANCEMENT] Distributor: Include X-Scope-OrgId header in requests forwarded to configured forwarding endpoint. #3283 #3385
    • [ENHANCEMENT] Alertmanager: reduced memory utilization in Mimir clusters with a large number of tenants. #3309
    • [ENHANCEMENT] Add experimental flag -shutdown-delay to allow components to wait after receiving SIGTERM and before stopping. In this time the component returns 503 from /ready endpoint. #3298
    • [ENHANCEMENT] Go: update to go 1.19.3. #3371
    • [ENHANCEMENT] Alerts: added RulerRemoteEvaluationFailing alert, firing when communication between ruler and frontend fails in remote operational mode. #3177 #3389
    • [ENHANCEMENT] Clarify which S3 signature versions are supported in the error "unsupported signature version". #3376
    • [ENHANCEMENT] Store-gateway: improved index header reading performance. #3393 #3397 #3436
    • [ENHANCEMENT] Store-gateway: improved performance of series matching. #3391
    • [ENHANCEMENT] Move the validation of incoming series before the distributor's forwarding functionality, so that we don't forward invalid series. #3386 #3458
    • [ENHANCEMENT] S3 bucket configuration now validates that the endpoint does not have the bucket name prefix. #3414
    • [ENHANCEMENT] Query-frontend: added "fetched index bytes" to query statistics, so that the statistics contain the total bytes read by store-gateways from TSDB block indexes. #3206
    • [ENHANCEMENT] Distributor: push wrapper should only receive unforwarded samples. #2980
    • [BUGFIX] Flusher: Add Overrides as a dependency to prevent panics when starting with -target=flusher. #3151
    • [BUGFIX] Updated golang.org/x/text dependency to fix CVE-2022-32149. #3285
    • [BUGFIX] Query-frontend: properly close gRPC streams to the query-scheduler to stop memory and goroutine leaks. #3302
    • [BUGFIX] Ruler: persist evaluation delay configured in the rulegroup. #3392
    • [BUGFIX] Ring status pages: show 100% ownership as "100%", not "1e+02%". #3435
    • [BUGFIX] Fix panics in OTLP ingest path when parse errors exist. #3538

    Mixin

    • [CHANGE] Alerts: Change the "for" duration of the MimirSchedulerQueriesStuck alert to 7 minutes to account for the time it takes for HPA to scale up. #3223
    • [CHANGE] Dashboards: Removed the Querier > Stages panel from the Mimir / Queries dashboard. #3311
    • [CHANGE] Configuration: The format of the autoscaling section of the configuration has changed to support more components. #3378
      • Instead of specific config variables for each component, they are listed in a dictionary. For example, autoscaling.querier_enabled becomes autoscaling.querier.enabled.
    • [FEATURE] Dashboards: Added "Mimir / Overview resources" dashboard, providing a high-level view of a Mimir cluster's resource utilization. #3481
    • [FEATURE] Dashboards: Added "Mimir / Overview networking" dashboard, providing a high-level view of a Mimir cluster's network bandwidth, inflight requests, and TCP connections. #3487
    • [FEATURE] Compile baremetal mixin along k8s mixin. #3162 #3514
    • [ENHANCEMENT] Alerts: Add MimirRingMembersMismatch firing when a component does not have the expected number of running jobs. #2404
    • [ENHANCEMENT] Dashboards: Add optional row about the Distributor's metric forwarding feature to the Mimir / Writes dashboard. #3182 #3394 #3461
    • [ENHANCEMENT] Dashboards: Remove the "Instance Mapper" row from the "Alertmanager Resources Dashboard". This is a Grafana Cloud specific service and not relevant for external users. #3152
    • [ENHANCEMENT] Dashboards: Add "remote read", "metadata", and "exemplar" queries to "Mimir / Overview" dashboard. #3245
    • [ENHANCEMENT] Dashboards: Use non-red colors for non-error series in the "Mimir / Overview" dashboard. #3246
    • [ENHANCEMENT] Dashboards: Add support to multi-zone deployments for the experimental read-write deployment mode. #3256
    • [ENHANCEMENT] Dashboards: If enabled, add new row to the Mimir / Writes dashboard for distributor autoscaling metrics. #3378
    • [ENHANCEMENT] Dashboards: Add read path insights row to the "Mimir / Tenants" dashboard. #3326
    • [ENHANCEMENT] Alerts: Add runbook urls for alerts. #3452
    • [ENHANCEMENT] Configuration: Make it possible to configure namespace label, job label, and job prefix. #3482
    • [ENHANCEMENT] Dashboards: improved resources and networking dashboards to work with read-write deployment mode too. #3497 #3504 #3519 #3531
    • [ENHANCEMENT] Alerts: Added "MimirDistributorForwardingErrorRate" alert, which fires on high error rates in the distributor’s forwarding feature. #3200
    • [ENHANCEMENT] Improve phrasing in Overview dashboard. #3488
    • [BUGFIX] Dashboards: Fix legend showing persistentvolumeclaim when using deployment_type=baremetal for Disk space utilization panels. #3173 #3184
    • [BUGFIX] Alerts: Fixed MimirGossipMembersMismatch alert when Mimir is deployed in read-write mode. #3489
    • [BUGFIX] Dashboards: Remove "Inflight requests" from object store panels because the panel is not tracking the inflight requests to object storage. #3521
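
    For the autoscaling configuration change listed above, a mixin override in the new dictionary format might look like the following sketch. It assumes the autoscaling section lives under _config, like the Jsonnet snippets elsewhere in these notes. Only the querier entry is shown because it is the one named in the change note.

    ```jsonnet
    {
      _config+:: {
        autoscaling+: {
          // Previously the flat form: querier_enabled: true
          // (autoscaling.querier_enabled).
          querier+: {
            enabled: true,
          },
        },
      },
    }
    ```

    Using +: for the nested objects merges the override into the existing defaults instead of replacing the whole autoscaling dictionary.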

    Jsonnet

    • [CHANGE] Replaced the deprecated policy/v1beta1 with policy/v1 when configuring a PodDisruptionBudget. #3284
    • [CHANGE] Common storage configuration is now used to configure object storage in all components. This is a breaking change in terms of Jsonnet manifests and also a CLI flag update for components that use object storage, so it will require a rollout of those components. The changes include: #3257
      • blocks_storage_backend was renamed to storage_backend and is now used as the common storage backend for all components.
        • So were the related blocks_storage_azure_account_(name|key) and blocks_storage_s3_endpoint configurations.
      • storage_s3_endpoint is now rendered by default using the aws_region configuration instead of a hardcoded us-east-1.
      • ruler_client_type and alertmanager_client_type were renamed to ruler_storage_backend and alertmanager_storage_backend respectively, and their corresponding CLI flags won't be rendered unless explicitly set to a value different from the one in storage_backend (like local).
      • alertmanager_s3_bucket_name, alertmanager_gcs_bucket_name and alertmanager_azure_container_name have been removed, and replaced by a single alertmanager_storage_bucket_name configuration used for all object storages.
      • genericBlocksStorageConfig configuration object was removed, so any extensions to it will now be ignored. Use blockStorageConfig instead.
      • rulerClientConfig and alertmanagerStorageClientConfig configuration objects were renamed to rulerStorageConfig and alertmanagerStorageConfig respectively, so any extensions to their previous names will now be ignored. Use the new names instead.
      • The CLI flags *.s3.region are no longer rendered as they are optional and the region can be inferred by Mimir by performing an initial API call to the endpoint.
      • The migration to this change should usually consist of:
        • Renaming blocks_storage_backend key to storage_backend.
        • For Azure/S3:
          • Renaming blocks_storage_(azure|s3)_* configurations to storage_(azure|s3)_*.
          • If the ruler_storage_(azure|s3)_* and alertmanager_storage_(azure|s3)_* keys were different from the blocks_storage_* ones, they should now be provided using CLI flags. See the configuration reference for more details.
        • Removing ruler_client_type and alertmanager_client_type if their values match the storage_backend, or renaming them to their new names otherwise.
        • Reviewing any possible extensions to genericBlocksStorageConfig, rulerClientConfig and alertmanagerStorageClientConfig and moving them to the corresponding new options.
        • Renaming the alertmanager's bucket name configuration from provider-specific to the new alertmanager_storage_bucket_name key.
    • [CHANGE] The overrides-exporter.libsonnet file is now always imported. The overrides-exporter can be enabled in jsonnet by setting the following: #3379
      {
        _config+:: {
          overrides_exporter_enabled: true,
        }
      }
      
    • [FEATURE] Added support for experimental read-write deployment mode. Enabling the read-write deployment mode on an existing Mimir cluster is a destructive operation, because the cluster will be re-created. If you're creating a new Mimir cluster, you can deploy it in read-write mode by adding the following configuration: #3379 #3475 #3405
      {
        _config+:: {
          deployment_mode: 'read-write',
      
          // See operations/mimir/read-write-deployment.libsonnet for more configuration options.
          mimir_write_replicas: 3,
          mimir_read_replicas: 2,
          mimir_backend_replicas: 3,
        }
      }
      
    • [ENHANCEMENT] Add autoscaling support to the mimir-read component when running the read-write-deployment model. #3419
    • [ENHANCEMENT] Added $._config.usageStatsConfig to track the installation mode via the anonymous usage statistics. #3294
    • [ENHANCEMENT] The query-tee node port ($._config.query_tee_node_port) is now optional. #3272
    • [ENHANCEMENT] Add support for autoscaling distributors. #3378
    • [ENHANCEMENT] Make auto-scaling logic ensure integer KEDA thresholds. #3512
    • [BUGFIX] Fixed query-scheduler ring configuration for dedicated ruler's queries and query-frontends. #3237 #3239
    • [BUGFIX] Jsonnet: Fix auto-scaling so that ruler-querier CPU threshold is a string-encoded integer millicores value. #3520

    Mimirtool

    • [FEATURE] Added mimirtool alertmanager verify command to validate configuration without uploading. #3440
    • [ENHANCEMENT] Added mimirtool rules delete-namespace command to delete all of the rule groups in a namespace including the namespace itself. #3136
    • [ENHANCEMENT] Refactor mimirtool analyze prometheus: add concurrency and resiliency #3349
      • Add --concurrency flag. Default: number of logical CPUs
    • [BUGFIX] --log.level=debug now correctly prints the response from the remote endpoint when a request fails. #3180

    Documentation

    • [ENHANCEMENT] Documented how to configure HA deduplication using Consul in a Mimir Helm deployment. #2972
    • [ENHANCEMENT] Improve MimirQuerierAutoscalerNotActive runbook. #3186
    • [ENHANCEMENT] Improve MimirSchedulerQueriesStuck runbook to reflect debug steps with querier auto-scaling enabled. #3223
    • [ENHANCEMENT] Use imperative for docs titles. #3178 #3332 #3343
    • [ENHANCEMENT] Docs: mention gRPC compression in "Production tips". #3201
    • [ENHANCEMENT] Update ADOPTERS.md. #3224 #3225
    • [ENHANCEMENT] Add a note for jsonnet deploying. #3213
    • [ENHANCEMENT] out-of-order runbook update with use case. #3253
    • [ENHANCEMENT] Fixed TSDB retention mentioned in the "Recover source blocks from ingesters" runbook. #3280
    • [ENHANCEMENT] Run Grafana Mimir in production using the Helm chart. #3072
    • [ENHANCEMENT] Use common configuration in the tutorial. #3282
    • [ENHANCEMENT] Updated detailed steps for migrating blocks from Thanos to Mimir. #3290
    • [ENHANCEMENT] Add scheme to DNS service discovery docs. #3450
    • [BUGFIX] Remove reference to file that no longer exists in contributing guide. #3404
    • [BUGFIX] Fix some minor typos in the contributing guide and on the runbooks page. #3418
    • [BUGFIX] Fix small typos in API reference. #3526
    • [BUGFIX] Fixed TSDB retention mentioned in the "Recover source blocks from ingesters" runbook. #3278
    • [BUGFIX] Fixed configuration example in the "Configuring the Grafana Mimir query-frontend to work with Prometheus" guide. #3374

    Tools

    • [FEATURE] Add copyblocks tool, to copy Mimir blocks between two GCS buckets. #3264
    • [ENHANCEMENT] copyblocks: copy no-compact global markers and optimize min time filter check. #3268
    • [ENHANCEMENT] Mimir rules GitHub action: Added the ability to change default value of label when running prepare command. #3236
    • [BUGFIX] Mimir rules GitHub action: Fix single line output. #3421

    All changes in this release: https://github.com/grafana/mimir/compare/mimir-2.4.0...mimir-2.5.0-rc.0

  • mimir-2.4.0(Oct 28, 2022)

    This release contains 190 PRs from 29 authors, including new contributors Fayzal Ghantiwala, Furkan Türkal, Joe Blubaugh, Justin Lei, Nicolas DUPEUX, Paul Puschmann, Radu Domnu, Shubham Ranjan. Thank you!

    Grafana Mimir version 2.4.0 release notes

    Grafana Labs is excited to announce version 2.4 of Grafana Mimir.

    The highlights that follow include the top features, enhancements, and bugfixes in this release. For the complete list of changes, see the changelog.

    Note: If you are upgrading from Grafana Mimir 2.3, review the list of important changes that follow.

    Features and enhancements

    • Query-scheduler ring-based service discovery: The query-scheduler is an optional, stateless component that retains a queue of queries to execute, and distributes the workload to available queriers. To use the query-scheduler, query-frontends and queriers must discover the addresses of the query-scheduler instances.

      In addition to DNS-based service discovery, Mimir 2.4 introduces the ring-based service discovery for the query-scheduler. When enabled, the query-schedulers join their own hash ring (similar to other Mimir components), and the query-frontends and queriers discover query-scheduler instances via the ring.

      Ring-based service discovery makes it easier to set up the query-scheduler in environments where you can't easily define a DNS entry that resolves to the running query-scheduler instances. For more information, refer to query-scheduler configuration.

    • New API endpoint exposes per-tenant limits: Mimir 2.4 introduces a new API endpoint, which is available on all Mimir components that load the runtime configuration. The endpoint exposes the limits of the authenticated tenant. You can use this new API endpoint when developing custom integrations with Mimir that require looking up the actual limits that are applied on a given tenant. For more information, refer to Get tenant limits.

    • New TLS configuration options: Mimir 2.4 introduces new options to configure the accepted TLS cipher suites, and the minimum versions for the HTTP and gRPC clients that are used between Mimir components, or by Mimir to communicate to external services such as Consul or etcd.

      You can use these new configuration options to override the default TLS settings and meet your security policy requirements. For more information, refer to Securing Grafana Mimir communications with TLS.

    • Maximum range query length limit: Mimir 2.4 introduces the new configuration option -query-frontend.max-total-query-length to limit the maximum range query length, which is computed as the query's end minus start timestamp. This limit is enforced in the query-frontend and defaults to -store.max-query-length if unset.

      The new configuration option allows you to set different limits between the received query maximum length (-query-frontend.max-total-query-length) and the maximum length of partial queries after splitting and sharding (-store.max-query-length).
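
    The interaction between the two limits can be pictured with a small model. This is a minimal sketch, not Mimir's implementation: the function names are hypothetical, the fallback to -store.max-query-length when the new option is unset or 0 follows the description in this release, and treating a zero effective limit as "no limit" is an assumption of the sketch.

```python
from datetime import datetime, timedelta, timezone


def effective_total_limit(max_total_query_length: timedelta,
                          store_max_query_length: timedelta) -> timedelta:
    """Fall back to the store limit when the total limit is unset (zero)."""
    if max_total_query_length == timedelta(0):
        return store_max_query_length
    return max_total_query_length


def range_query_allowed(start: datetime, end: datetime, limit: timedelta) -> bool:
    """Compare the received query length (end minus start) with the limit."""
    if limit == timedelta(0):
        # Assumption of this sketch: a zero effective limit disables the check.
        return True
    return (end - start) <= limit


if __name__ == "__main__":
    start = datetime(2022, 10, 1, tzinfo=timezone.utc)
    end = datetime(2022, 10, 31, tzinfo=timezone.utc)  # a 30-day range query
    limit = effective_total_limit(timedelta(0), timedelta(days=7))
    print(range_query_allowed(start, end, limit))
```

    In this example the total limit is unset, so the 7-day store limit applies and the 30-day query is rejected. Setting the total limit to 31 days would accept the same query while partial queries stay bounded by the store limit.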

    The following experimental features have been promoted to stable:

    • Ruler's remote evaluation mode
    • Memberlist cluster label verification

    Helm chart improvements

    The mimir-distributed Helm chart is the best way to install Mimir on Kubernetes. As part of the Mimir 2.4 release, we’re also releasing version 3.2 of the mimir-distributed Helm chart.

    Notable enhancements follow. For the full list of changes, see the Helm chart changelog.

    • Added support for topologySpreadConstraints.
    • Replaced the default anti-affinity rules with topologySpreadConstraints for all components, which puts fewer restrictions on where Kubernetes can run pods.
    • Important: if you are not using the sizing plans (small.yaml, large.yaml, capped-small.yaml, capped-large.yaml) in production, you must reintroduce pod anti-affinity rules for the ingester and store-gateway. This also fixes a missing label selector for the ingester. Merge the following with your custom values file:
      ingester:
        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: target
                      operator: In
                      values:
                        - ingester
                topologyKey: "kubernetes.io/hostname"
              - labelSelector:
                  matchExpressions:
                    - key: app.kubernetes.io/component
                      operator: In
                      values:
                        - ingester
                topologyKey: "kubernetes.io/hostname"
      store_gateway:
        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: target
                      operator: In
                      values:
                        - store-gateway
                topologyKey: "kubernetes.io/hostname"
              - labelSelector:
                  matchExpressions:
                    - key: app.kubernetes.io/component
                      operator: In
                      values:
                        - store-gateway
                topologyKey: "kubernetes.io/hostname"
      
    • Updated the anti-affinity rules in the sizing plans (small.yaml, large.yaml, capped-small.yaml, capped-large.yaml). The sizing plans now enforce that no two pods of the ingester, store-gateway, or alertmanager StatefulSets are scheduled on the same Node. Pods from different StatefulSets can share a Node.
    • Added support for the OpenShift Route resource for nginx.

    Important changes

    In Grafana Mimir 2.4, the default values of the following configuration options have changed:

    • -distributor.remote-timeout has changed from 20s to 2s.
    • -distributor.forwarding.request-timeout has changed from 10s to 2s.
    • -blocks-storage.tsdb.head-compaction-concurrency has changed from 5 to 1.
    • The hash-ring heartbeat period for distributors, ingesters, rulers, and compactors has increased from 5s to 15s.
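
    If you prefer to keep the previous behavior during the upgrade, you can set the old values explicitly. The following is a minimal sketch that renders the CLI flags to do so. It covers only the three flags named in the list above, because the heartbeat period flags are not named here, and the helper name is hypothetical.

```python
# Previous (2.3) and new (2.4) defaults, as listed in these release notes.
CHANGED_DEFAULTS = {
    "-distributor.remote-timeout": ("20s", "2s"),
    "-distributor.forwarding.request-timeout": ("10s", "2s"),
    "-blocks-storage.tsdb.head-compaction-concurrency": ("5", "1"),
}


def flags_pinning_previous_defaults(explicitly_set: set[str]) -> list[str]:
    """Return CLI flags that keep the 2.3 defaults for options not already set."""
    return [
        f"{flag}={old}"
        for flag, (old, _new) in CHANGED_DEFAULTS.items()
        if flag not in explicitly_set
    ]


if __name__ == "__main__":
    # Options you already configure explicitly are left alone.
    for flag in flags_pinning_previous_defaults({"-distributor.remote-timeout"}):
        print(flag)
```

    Options that you already set explicitly are unaffected by a change of default, which is why the helper skips them.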

    In Grafana Mimir 2.4, the following deprecated configuration options have been removed:

    • The YAML configuration option limits.active_series_custom_trackers_config.
    • The CLI flag -ingester.ring.join-after and its respective YAML configuration option ingester.ring.join_after.
    • The CLI flag -querier.shuffle-sharding-ingesters-lookback-period and its respective YAML configuration option querier.shuffle_sharding_ingesters_lookback_period.

    With Grafana Mimir 2.4, anonymous usage statistics tracking is enabled by default. Mimir maintainers use this anonymous information to learn more about how the open source community runs Mimir and what the Mimir team should focus on when working on the next features and documentation improvements. If possible, we ask you to keep the usage reporting feature enabled. To opt out of anonymous usage statistics reporting, refer to Disable the anonymous usage statistics reporting.

    Bug fixes

    • PR 2979: Fix the HTTP status code that Mimir returns for a remote write request when the write to only one ingester fails (so the quorum is still honored when running Mimir with the default replication factor of 3) and some series are not ingested because of validation errors or limits being reached.
    • PR 3005: Fix the querier to re-balance its worker connections when a query-frontend or query-scheduler instance is terminated.
    • PR 2963: Fix the remote read endpoint to correctly support the Accept-Encoding: snappy HTTP request header.

    Changelog

    2.4.0

    Grafana Mimir

    • [CHANGE] Distributor: change the default value of -distributor.remote-timeout to 2s from 20s and -distributor.forwarding.request-timeout to 2s from 10s to improve distributor resource usage when ingesters crash. #2728 #2912
    • [CHANGE] Anonymous usage statistics tracking: added the -ingester.ring.store value. #2981
    • [CHANGE] Series metadata HELP that is longer than -validation.max-metadata-length is now truncated silently, instead of being dropped with a 400 status code. #2993
    • [CHANGE] Ingester: changed default setting for -ingester.ring.readiness-check-ring-health from true to false. #2953
    • [CHANGE] Anonymous usage statistics tracking has been enabled by default, to help Mimir maintainers make better decisions to support the open source community. #2939 #3034
    • [CHANGE] Anonymous usage statistics tracking: added the minimum and maximum value of -ingester.out-of-order-time-window. #2940
    • [CHANGE] The default hash ring heartbeat period for distributors, ingesters, rulers and compactors has been increased from 5s to 15s. Now the default heartbeat period for all Mimir hash rings is 15s. #3033
    • [CHANGE] Reduce the default TSDB head compaction concurrency (-blocks-storage.tsdb.head-compaction-concurrency) from 5 to 1, in order to reduce CPU spikes. #3093
    • [CHANGE] Ruler: the ruler's remote evaluation mode (-ruler.query-frontend.address) is now stable. #3109
    • [CHANGE] Limits: removed the deprecated YAML configuration option active_series_custom_trackers_config. Please use active_series_custom_trackers instead. #3110
    • [CHANGE] Ingester: removed the deprecated configuration option -ingester.ring.join-after. #3111
    • [CHANGE] Querier: removed the deprecated configuration option -querier.shuffle-sharding-ingesters-lookback-period. The value of -querier.query-ingesters-within is now used internally for shuffle sharding lookback, while you can use -querier.shuffle-sharding-ingesters-enabled to enable or disable shuffle sharding on the read path. #3111
    • [CHANGE] Memberlist: cluster label verification feature (-memberlist.cluster-label and -memberlist.cluster-label-verification-disabled) is now marked as stable. #3108
    • [CHANGE] Distributor: only a single per-tenant forwarding endpoint can be configured now. Support for per-rule endpoints has been removed. #3095
    • [FEATURE] Query-scheduler: added an experimental ring-based service discovery support for the query-scheduler. Refer to query-scheduler configuration for more information. #2957
    • [FEATURE] Introduced the experimental endpoint /api/v1/user_limits exposed by all components that load runtime configuration. This endpoint exposes realtime limits for the authenticated tenant, in JSON format. #2864 #3017
    • [FEATURE] Query-scheduler: added the experimental configuration option -query-scheduler.max-used-instances to restrict the number of query-schedulers effectively used regardless of how many replicas are running. This feature can be useful when using the experimental read-write deployment mode. #3005
    • [ENHANCEMENT] Go: updated to go 1.19.2. #2637 #3127 #3129
    • [ENHANCEMENT] Runtime config: don't unmarshal runtime configuration files if they haven't changed. This can save a bit of CPU and memory on every component using runtime config. #2954
    • [ENHANCEMENT] Query-frontend: Add cortex_frontend_query_result_cache_skipped_total and cortex_frontend_query_result_cache_attempted_total metrics to track the reason why query results are not cached. #2855
    • [ENHANCEMENT] Distributor: pool more connections per host when forwarding request. Mark requests as idempotent so they can be retried under some conditions. #2968
    • [ENHANCEMENT] Distributor: failure to send request to forwarding target now also increments cortex_distributor_forward_errors_total, with status_code="failed". #2968
    • [ENHANCEMENT] Distributor: added support for forwarding push requests via gRPC, using httpgrpc messages from the weaveworks/common library. #2996
    • [ENHANCEMENT] Query-frontend / Querier: increase internal backoff period used to retry connections to query-frontend / query-scheduler. #3011
    • [ENHANCEMENT] Querier: do not log "error processing requests from scheduler" when the query-scheduler is shutting down. #3012
    • [ENHANCEMENT] Query-frontend: query sharding process is now time-bounded and it is cancelled if the request is aborted. #3028
    • [ENHANCEMENT] Query-frontend: improved Prometheus response JSON encoding performance. #2450
    • [ENHANCEMENT] TLS: added configuration parameters to configure the client's TLS cipher suites and minimum version. The following new CLI flags have been added: #3070
      • -alertmanager.alertmanager-client.tls-cipher-suites
      • -alertmanager.alertmanager-client.tls-min-version
      • -alertmanager.sharding-ring.etcd.tls-cipher-suites
      • -alertmanager.sharding-ring.etcd.tls-min-version
      • -compactor.ring.etcd.tls-cipher-suites
      • -compactor.ring.etcd.tls-min-version
      • -distributor.forwarding.grpc-client.tls-cipher-suites
      • -distributor.forwarding.grpc-client.tls-min-version
      • -distributor.ha-tracker.etcd.tls-cipher-suites
      • -distributor.ha-tracker.etcd.tls-min-version
      • -distributor.ring.etcd.tls-cipher-suites
      • -distributor.ring.etcd.tls-min-version
      • -ingester.client.tls-cipher-suites
      • -ingester.client.tls-min-version
      • -ingester.ring.etcd.tls-cipher-suites
      • -ingester.ring.etcd.tls-min-version
      • -memberlist.tls-cipher-suites
      • -memberlist.tls-min-version
      • -querier.frontend-client.tls-cipher-suites
      • -querier.frontend-client.tls-min-version
      • -querier.store-gateway-client.tls-cipher-suites
      • -querier.store-gateway-client.tls-min-version
      • -query-frontend.grpc-client-config.tls-cipher-suites
      • -query-frontend.grpc-client-config.tls-min-version
      • -query-scheduler.grpc-client-config.tls-cipher-suites
      • -query-scheduler.grpc-client-config.tls-min-version
      • -query-scheduler.ring.etcd.tls-cipher-suites
      • -query-scheduler.ring.etcd.tls-min-version
      • -ruler.alertmanager-client.tls-cipher-suites
      • -ruler.alertmanager-client.tls-min-version
      • -ruler.client.tls-cipher-suites
      • -ruler.client.tls-min-version
      • -ruler.query-frontend.grpc-client-config.tls-cipher-suites
      • -ruler.query-frontend.grpc-client-config.tls-min-version
      • -ruler.ring.etcd.tls-cipher-suites
      • -ruler.ring.etcd.tls-min-version
      • -store-gateway.sharding-ring.etcd.tls-cipher-suites
      • -store-gateway.sharding-ring.etcd.tls-min-version
    • [ENHANCEMENT] Store-gateway: Add -blocks-storage.bucket-store.max-concurrent-reject-over-limit option to allow requests that exceed the max number of inflight object storage requests to be rejected. #2999
    • [ENHANCEMENT] Query-frontend: allow setting a separate limit on the total (before splitting/sharding) query length of range queries with the new experimental -query-frontend.max-total-query-length flag, which defaults to -store.max-query-length if unset or set to 0. #3058
    • [ENHANCEMENT] Query-frontend: Lower TTL for cache entries overlapping the out-of-order samples ingestion window (re-using -ingester.out-of-order-allowance from ingesters). #2935
    • [ENHANCEMENT] Ruler: added support to forcefully disable recording and/or alerting rules evaluation. The following new configuration options have been introduced, which can be overridden on a per-tenant basis in the runtime configuration: #3088
      • -ruler.recording-rules-evaluation-enabled
      • -ruler.alerting-rules-evaluation-enabled
    • [ENHANCEMENT] Distributor: Add age filter to forwarding functionality, to not forward samples which are older than defined duration. #3049
    • [ENHANCEMENT] Distributor: Improved error messages reported when the distributor fails to remote write to ingesters. #3055
    • [ENHANCEMENT] Improved tracing spans tracked by distributors, ingesters and store-gateways. #2879 #3099 #3089
    • [ENHANCEMENT] Ingester: improved the performance of label value cardinality endpoint. #3044
    • [ENHANCEMENT] Ruler: use backoff retry on remote evaluation #3098
    • [ENHANCEMENT] Query-frontend: Include multiple tenant IDs in query logs when present instead of dropping them. #3125
    • [ENHANCEMENT] Query-frontend: truncate queries based on the configured blocks retention period (-compactor.blocks-retention-period) to avoid querying past this period. #3134
    • [ENHANCEMENT] Alertmanager: reduced memory utilization in Mimir clusters with a large number of tenants. #3143
    • [ENHANCEMENT] Store-gateway: added extra span logging to improve observability. #3131
    • [BUGFIX] Querier: Fix 400 response while handling streaming remote read. #2963
    • [BUGFIX] Fix a bug that caused the query-frontend, query-scheduler, and querier not to fail when one of their internal components fails. #2978
    • [BUGFIX] Querier: re-balance the querier worker connections when a query-frontend or query-scheduler is terminated. #3005
    • [BUGFIX] Distributor: Now returns the quorum error from ingesters. For example, with replication_factor=3, two HTTP 400 errors and one HTTP 500 error, now the distributor will always return HTTP 400. Previously the behaviour was to return the error which the distributor first received. #2979
    • [BUGFIX] Ruler: fix panic when ruler.external_url is explicitly set to an empty string ("") in YAML. #2915
    • [BUGFIX] Alertmanager: Fix support for the Telegram API URL in the global settings. #3097
    • [BUGFIX] Alertmanager: Fix parsing of label matchers without label value in the API used to retrieve alerts. #3097
    • [BUGFIX] Ruler: Fix not restoring alert state for rule groups when other ruler replicas shut down. #3156
    • [BUGFIX] Updated golang.org/x/net dependency to fix CVE-2022-27664. #3124
    • [BUGFIX] Fix the distributor returning a 500 status code when a 400 was received from the ingester. #3211
    • [BUGFIX] Fix incorrect OS value set in Mimir v2.3.* RPM packages. #3221
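
    The distributor quorum-error fix in the list above (#2979) can be pictured with a small model. This is a simplified sketch, not Mimir's implementation: the function name is hypothetical, and reporting 500 when no single status reaches the failure threshold is an assumption of the sketch. It reproduces the example from the changelog entry: with a replication factor of 3, two HTTP 400 errors and one HTTP 500 error yield HTTP 400, whatever the arrival order.

```python
from collections import Counter
from typing import Optional


def quorum_error_status(statuses: list[int], replication_factor: int = 3) -> Optional[int]:
    """Return the HTTP status to report for a replicated write, or None on success."""
    quorum = replication_factor // 2 + 1
    max_failures = replication_factor - quorum  # 1 when replication_factor is 3
    errors = Counter(status for status in statuses if status >= 400)
    if sum(errors.values()) <= max_failures:
        return None  # enough replicas succeeded to honor the quorum
    status, count = errors.most_common(1)[0]
    if count > max_failures:
        # One error status alone prevents the quorum, so it is the one reported.
        return status
    # Assumption of this sketch: mixed errors without a majority map to 500.
    return 500


if __name__ == "__main__":
    print(quorum_error_status([500, 400, 400]))
    print(quorum_error_status([400, 400, 500]))
```

    Both calls report 400 because the result depends on the error counts, not on which response the distributor received first.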

    Mixin

    • [CHANGE] Alerts: MimirQuerierAutoscalerNotActive is now critical and fires after 1h instead of 15m. #2958
    • [FEATURE] Dashboards: Added "Mimir / Overview" dashboards, providing a high-level view over a Mimir cluster. #3122 #3147 #3155
    • [ENHANCEMENT] Dashboards: Updated the "Writes" and "Rollout progress" dashboards to account for samples ingested via the new OTLP ingestion endpoint. #2919 #2938
    • [ENHANCEMENT] Dashboards: Include per-tenant request rate in "Tenants" dashboard. #2874
    • [ENHANCEMENT] Dashboards: Include inflight object store requests in "Reads" dashboard. #2914
    • [ENHANCEMENT] Dashboards: Make queries used to find job, cluster and namespace for dropdown menus configurable. #2893
    • [ENHANCEMENT] Dashboards: Include rate of label and series queries in "Reads" dashboard. #3065 #3074
    • [ENHANCEMENT] Dashboards: Fix legend showing on per-pod panels. #2944
    • [ENHANCEMENT] Dashboards: Use the "req/s" unit on panels showing the requests rate. #3118
    • [ENHANCEMENT] Dashboards: Use a consistent color across dashboards for the error rate. #3154

    Jsonnet

    • [FEATURE] Added support for query-scheduler ring-based service discovery. #3128
    • [ENHANCEMENT] Querier autoscaling is now slower on scale downs: scale down 10% every 1m instead of 100%. #2962
    • [BUGFIX] Memberlist: gossip_member_label is now set for ruler-queriers. #3141

    Mimirtool

    • [ENHANCEMENT] mimirtool analyze: Store the query errors instead of exiting during the analysis. #3052
    • [BUGFIX] mimirtool remote-read: fix code paths that returned a nil error even when an error occurred. #3053

    Documentation

    • [ENHANCEMENT] Added documentation on how to configure storage retention. #2970
    • [ENHANCEMENT] Improved gRPC clients config documentation. #3020
    • [ENHANCEMENT] Added documentation on how to manage alerting and recording rules. #2983
    • [ENHANCEMENT] Improved MimirSchedulerQueriesStuck runbook. #3006
    • [ENHANCEMENT] Added "Cluster label verification" section to memberlist documentation. #3096
    • [ENHANCEMENT] Mention compression in multi-zone replication documentation. #3107
    • [BUGFIX] Fixed configuration option names in "Enabling zone-awareness via the Grafana Mimir Jsonnet". #3018
    • [BUGFIX] Fixed mimirtool analyze parameters documentation. #3094
    • [BUGFIX] Fixed YAML configuration in the "Manage the configuration of Grafana Mimir with Helm" guide. #3042
    • [BUGFIX] Fixed Alertmanager capacity planning documentation. #3132

    Tools

    • [BUGFIX] trafficdump: Fixed panic occurring when -success-only=true and the captured request failed. #2863

    All changes in this release: https://github.com/grafana/mimir/compare/mimir-2.3.1...mimir-2.4.0

    Source code(tar.gz)
    Source code(zip)
    metaconvert-darwin-amd64(30.29 MB)
    metaconvert-darwin-amd64-sha-256(65 bytes)
    metaconvert-darwin-arm64(29.76 MB)
    metaconvert-darwin-arm64-sha-256(65 bytes)
    metaconvert-linux-amd64(27.50 MB)
    metaconvert-linux-amd64-sha-256(65 bytes)
    metaconvert-linux-arm64(26.31 MB)
    metaconvert-linux-arm64-sha-256(65 bytes)
    mimir-2.4.0_amd64.deb(17.01 MB)
    mimir-2.4.0_amd64.deb-sha-256(65 bytes)
    mimir-2.4.0_amd64.rpm(16.85 MB)
    mimir-2.4.0_amd64.rpm-sha-256(65 bytes)
    mimir-2.4.0_arm64.deb(15.47 MB)
    mimir-2.4.0_arm64.deb-sha-256(65 bytes)
    mimir-2.4.0_arm64.rpm(15.37 MB)
    mimir-2.4.0_arm64.rpm-sha-256(65 bytes)
    mimir-continuous-test-darwin-amd64(16.20 MB)
    mimir-continuous-test-darwin-amd64-sha-256(65 bytes)
    mimir-continuous-test-darwin-arm64(15.97 MB)
    mimir-continuous-test-darwin-arm64-sha-256(65 bytes)
    mimir-continuous-test-linux-amd64(14.60 MB)
    mimir-continuous-test-linux-amd64-sha-256(65 bytes)
    mimir-continuous-test-linux-arm64(14.00 MB)
    mimir-continuous-test-linux-arm64-sha-256(65 bytes)
    mimir-darwin-amd64(54.95 MB)
    mimir-darwin-amd64-sha-256(65 bytes)
    mimir-darwin-arm64(54.15 MB)
    mimir-darwin-arm64-sha-256(65 bytes)
    mimir-linux-amd64(49.62 MB)
    mimir-linux-amd64-sha-256(65 bytes)
    mimir-linux-arm64(47.56 MB)
    mimir-linux-arm64-sha-256(65 bytes)
    mimirtool-darwin-amd64(51.68 MB)
    mimirtool-darwin-amd64-sha-256(65 bytes)
    mimirtool-darwin-arm64(51.34 MB)
    mimirtool-darwin-arm64-sha-256(65 bytes)
    mimirtool-linux-amd64(47.26 MB)
    mimirtool-linux-amd64-sha-256(65 bytes)
    mimirtool-linux-arm64(45.62 MB)
    mimirtool-linux-arm64-sha-256(65 bytes)
    mimirtool-windows-amd64.exe(47.96 MB)
    mimirtool-windows-amd64.exe-sha-256(65 bytes)
    mimirtool-windows-arm64.exe(46.26 MB)
    mimirtool-windows-arm64.exe-sha-256(65 bytes)
    query-tee-darwin-amd64(14.37 MB)
    query-tee-darwin-amd64-sha-256(65 bytes)
    query-tee-darwin-arm64(14.20 MB)
    query-tee-darwin-arm64-sha-256(65 bytes)
    query-tee-linux-amd64(12.96 MB)
    query-tee-linux-amd64-sha-256(65 bytes)
    query-tee-linux-arm64(12.50 MB)
    query-tee-linux-arm64-sha-256(65 bytes)
  • mimir-2.4.0-rc.1(Oct 17, 2022)

    This release contains 8 PRs from 2 authors. Thank you!

    Changelog

    2.4.0-rc.1

    Grafana Mimir

    • [BUGFIX] Fix the distributor returning a 500 status code when a 400 was received from the ingester. #3211
    • [BUGFIX] Fix incorrect OS value set in Mimir v2.3.* RPM packages. #3221

    All changes in this release: https://github.com/grafana/mimir/compare/mimir-2.4.0-rc.0...mimir-2.4.0-rc.1

    Source code(tar.gz)
    Source code(zip)
    metaconvert-darwin-amd64(30.29 MB)
    metaconvert-darwin-amd64-sha-256(65 bytes)
    metaconvert-darwin-arm64(29.76 MB)
    metaconvert-darwin-arm64-sha-256(65 bytes)
    metaconvert-linux-amd64(27.50 MB)
    metaconvert-linux-amd64-sha-256(65 bytes)
    metaconvert-linux-arm64(26.31 MB)
    metaconvert-linux-arm64-sha-256(65 bytes)
    mimir-2.4.0-rc.1_amd64.deb(17.01 MB)
    mimir-2.4.0-rc.1_amd64.deb-sha-256(65 bytes)
    mimir-2.4.0-rc.1_amd64.rpm(16.85 MB)
    mimir-2.4.0-rc.1_amd64.rpm-sha-256(65 bytes)
    mimir-2.4.0-rc.1_arm64.deb(15.47 MB)
    mimir-2.4.0-rc.1_arm64.deb-sha-256(65 bytes)
    mimir-2.4.0-rc.1_arm64.rpm(15.37 MB)
    mimir-2.4.0-rc.1_arm64.rpm-sha-256(65 bytes)
    mimir-continuous-test-darwin-amd64(16.20 MB)
    mimir-continuous-test-darwin-amd64-sha-256(65 bytes)
    mimir-continuous-test-darwin-arm64(15.97 MB)
    mimir-continuous-test-darwin-arm64-sha-256(65 bytes)
    mimir-continuous-test-linux-amd64(14.60 MB)
    mimir-continuous-test-linux-amd64-sha-256(65 bytes)
    mimir-continuous-test-linux-arm64(14.00 MB)
    mimir-continuous-test-linux-arm64-sha-256(65 bytes)
    mimir-darwin-amd64(54.95 MB)
    mimir-darwin-amd64-sha-256(65 bytes)
    mimir-darwin-arm64(54.15 MB)
    mimir-darwin-arm64-sha-256(65 bytes)
    mimir-linux-amd64(49.62 MB)
    mimir-linux-amd64-sha-256(65 bytes)
    mimir-linux-arm64(47.56 MB)
    mimir-linux-arm64-sha-256(65 bytes)
    mimirtool-darwin-amd64(51.68 MB)
    mimirtool-darwin-amd64-sha-256(65 bytes)
    mimirtool-darwin-arm64(51.34 MB)
    mimirtool-darwin-arm64-sha-256(65 bytes)
    mimirtool-linux-amd64(47.26 MB)
    mimirtool-linux-amd64-sha-256(65 bytes)
    mimirtool-linux-arm64(45.62 MB)
    mimirtool-linux-arm64-sha-256(65 bytes)
    mimirtool-windows-amd64.exe(47.96 MB)
    mimirtool-windows-amd64.exe-sha-256(65 bytes)
    mimirtool-windows-arm64.exe(46.26 MB)
    mimirtool-windows-arm64.exe-sha-256(65 bytes)
    query-tee-darwin-amd64(14.37 MB)
    query-tee-darwin-amd64-sha-256(65 bytes)
    query-tee-darwin-arm64(14.20 MB)
    query-tee-darwin-arm64-sha-256(65 bytes)
    query-tee-linux-amd64(12.96 MB)
    query-tee-linux-amd64-sha-256(65 bytes)
    query-tee-linux-arm64(12.50 MB)
    query-tee-linux-arm64-sha-256(65 bytes)
  • mimir-2.4.0-rc.0(Oct 7, 2022)

    This release contains 166 PRs from 29 authors. Thank you!

    Grafana Mimir version 2.4.0-rc.0 release notes

    Grafana Labs is excited to announce version 2.4 of Grafana Mimir.

    The highlights that follow include the top features, enhancements, and bugfixes in this release. For the complete list of changes, see the changelog.

    Note: If you are upgrading from Grafana Mimir 2.3, review the list of important changes that follow.

    Features and enhancements

    • Query-scheduler ring-based service discovery: The query-scheduler is an optional, stateless component that retains a queue of queries to execute, and distributes the workload to available queriers. To use the query-scheduler, query-frontends and queriers must discover the addresses of the query-scheduler instances.

      In addition to DNS-based service discovery, Mimir 2.4 introduces the ring-based service discovery for the query-scheduler. When enabled, the query-schedulers join their own hash ring (similar to other Mimir components), and the query-frontends and queriers discover query-scheduler instances via the ring.

      Ring-based service discovery makes it easier to set up the query-scheduler in environments where you can’t easily define a DNS entry that resolves to the running query-scheduler instances. For more information, refer to query-scheduler configuration.

    • New API endpoint exposes per-tenant limits: Mimir 2.4 introduces a new API endpoint, which is available on all Mimir components that load the runtime configuration. The endpoint exposes the limits of the authenticated tenant. You can use this new API endpoint when developing custom integrations with Mimir that require looking up the actual limits that are applied on a given tenant. For more information, refer to Get tenant limits.

    • New TLS configuration options: Mimir 2.4 introduces new options to configure the accepted TLS cipher suites, and the minimum versions for the HTTP and gRPC clients that are used between Mimir components, or by Mimir to communicate to external services such as Consul or etcd.

      You can use these new configuration options to override the default TLS settings and meet your security policy requirements. For more information, refer to Securing Grafana Mimir communications with TLS.

    • Maximum range query length limit: Mimir 2.4 introduces the new configuration option -query-frontend.max-total-query-length to limit the maximum range query length, which is computed as the query’s end minus start timestamp. This limit is enforced in the query-frontend and defaults to -store.max-query-length if unset.

      The new configuration option allows you to set different limits between the received query maximum length (-query-frontend.max-total-query-length) and the maximum length of partial queries after splitting and sharding (-store.max-query-length).
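
    A custom integration could look up the limits of a tenant along the following lines. This is a minimal sketch under stated assumptions: the path /api/v1/user_limits is the one named in the changelog below, the tenant is identified with the X-Scope-OrgID header, and the base URL, the tenant ID, and the helper names are hypothetical. The response is decoded as JSON without assuming any particular field.

```python
import json
import urllib.request


def build_limits_request(base_url: str, tenant_id: str) -> urllib.request.Request:
    """Build a GET request for the tenant limits endpoint."""
    url = base_url.rstrip("/") + "/api/v1/user_limits"
    # The tenant is identified through the X-Scope-OrgID header.
    return urllib.request.Request(url, headers={"X-Scope-OrgID": tenant_id}, method="GET")


def fetch_tenant_limits(base_url: str, tenant_id: str, timeout: float = 5.0) -> dict:
    """Return the decoded JSON limits that apply to one tenant."""
    request = build_limits_request(base_url, tenant_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical address and tenant; adjust both for your deployment.
    request = build_limits_request("http://localhost:8080", "tenant-a")
    print(request.get_method(), request.full_url)
```

    Calling fetch_tenant_limits with the address of any component that loads the runtime configuration returns the limits as a dictionary.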

    Helm chart improvements

    The mimir-distributed Helm chart is the best way to install Mimir on Kubernetes. As part of the Mimir 2.4 release, we’re also releasing version 3.2 of the mimir-distributed Helm chart.

    Notable enhancements follow. For the full list of changes, see the Helm chart changelog.

    • Added support for topologySpreadConstraints.

    • Replaced the default anti-affinity rules with topologySpreadConstraints for all components, which puts fewer restrictions on where Kubernetes can run pods.

    • Important: if you are not using the sizing plans (small.yaml, large.yaml, capped-small.yaml, capped-large.yaml) in production, you must reintroduce pod anti-affinity rules for the ingester and store-gateway. This also fixes a missing label selector for the ingester. Merge the following with your custom values file:

      ingester:
        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: target
                      operator: In
                      values:
                        - ingester
                topologyKey: "kubernetes.io/hostname"
              - labelSelector:
                  matchExpressions:
                    - key: app.kubernetes.io/component
                      operator: In
                      values:
                        - ingester
                topologyKey: "kubernetes.io/hostname"
      store_gateway:
        affinity:
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: target
                      operator: In
                      values:
                        - store-gateway
                topologyKey: "kubernetes.io/hostname"
              - labelSelector:
                  matchExpressions:
                    - key: app.kubernetes.io/component
                      operator: In
                      values:
                        - store-gateway
                topologyKey: "kubernetes.io/hostname"
      
    • Updated the anti-affinity rules in the sizing plans (small.yaml, large.yaml, capped-small.yaml, capped-large.yaml). The sizing plans now enforce that no two pods of the ingester, store-gateway, or alertmanager StatefulSets are scheduled on the same Node. Pods from different StatefulSets can share a Node.

    • Added support for the OpenShift Route resource for nginx.

    Important changes

    In Grafana Mimir 2.4, the default values of the following configuration options have changed:

    • -distributor.remote-timeout has changed from 20s to 2s.
    • -distributor.forwarding.request-timeout has changed from 10s to 2s.
    • -blocks-storage.tsdb.head-compaction-concurrency has changed from 5 to 1.
    • The hash-ring heartbeat period for distributors, ingesters, rulers, and compactors has increased from 5s to 15s.

    With Grafana Mimir 2.4, anonymous usage statistics tracking is enabled by default. Mimir maintainers use this anonymous information to learn more about how the open source community runs Mimir and what the Mimir team should focus on when working on the next features and documentation improvements. If possible, we ask you to keep the usage reporting feature enabled. To opt out of anonymous usage statistics reporting, refer to Disable the anonymous usage statistics reporting.

    Bug fixes

    • PR 2979: Fix the HTTP status code that Mimir returns for a remote write request when the write to only one ingester fails (so the quorum is still honored when running Mimir with the default replication factor of 3) and some series are not ingested because of validation errors or limits being reached.
    • PR 3005: Fix the querier to re-balance its worker connections when a query-frontend or query-scheduler instance is terminated.
    • PR 2963: Fix the remote read endpoint to correctly support the Accept-Encoding: snappy HTTP request header.

    Changelog

    2.4.0-rc.0

    Grafana Mimir

    • [CHANGE] Distributor: change the default value of -distributor.remote-timeout to 2s from 20s and -distributor.forwarding.request-timeout to 2s from 10s to improve distributor resource usage when ingesters crash. #2728 #2912
    • [CHANGE] Anonymous usage statistics tracking: added the -ingester.ring.store value. #2981
    • [CHANGE] Series metadata HELP that is longer than -validation.max-metadata-length is now truncated silently, instead of being dropped with a 400 status code. #2993
    • [CHANGE] Ingester: changed default setting for -ingester.ring.readiness-check-ring-health from true to false. #2953
    • [CHANGE] Anonymous usage statistics tracking has been enabled by default, to help Mimir maintainers make better decisions to support the open source community. #2939 #3034
    • [CHANGE] Anonymous usage statistics tracking: added the minimum and maximum value of -ingester.out-of-order-time-window. #2940
    • [CHANGE] The default hash ring heartbeat period for distributors, ingesters, rulers and compactors has been increased from 5s to 15s. Now the default heartbeat period for all Mimir hash rings is 15s. #3033
    • [CHANGE] Reduce the default TSDB head compaction concurrency (-blocks-storage.tsdb.head-compaction-concurrency) from 5 to 1, in order to reduce CPU spikes. #3093
    • [CHANGE] Ruler: the ruler's remote evaluation mode (-ruler.query-frontend.address) is now stable. #3109
    • [CHANGE] Limits: removed the deprecated YAML configuration option active_series_custom_trackers_config. Please use active_series_custom_trackers instead. #3110
    • [CHANGE] Ingester: removed the deprecated configuration option -ingester.ring.join-after. #3111
    • [CHANGE] Querier: removed the deprecated configuration option -querier.shuffle-sharding-ingesters-lookback-period. The value of -querier.query-ingesters-within is now used internally for shuffle sharding lookback, while you can use -querier.shuffle-sharding-ingesters-enabled to enable or disable shuffle sharding on the read path. #3111
    • [CHANGE] Memberlist: cluster label verification feature (-memberlist.cluster-label and -memberlist.cluster-label-verification-disabled) is now marked as stable. #3108
    • [CHANGE] Distributor: only a single per-tenant forwarding endpoint can now be configured. Support for per-rule endpoints has been removed. #3095
    • [CHANGE] Query-frontend: truncate queries based on the configured blocks retention period (-compactor.blocks-retention-period) to avoid querying past this period. #3134
    • [FEATURE] Query-scheduler: added an experimental ring-based service discovery support for the query-scheduler. Refer to query-scheduler configuration for more information. #2957
    • [FEATURE] Introduced the experimental endpoint /api/v1/user_limits exposed by all components that load runtime configuration. This endpoint exposes realtime limits for the authenticated tenant, in JSON format. #2864 #3017
    • [FEATURE] Query-scheduler: added the experimental configuration option -query-scheduler.max-used-instances to restrict the number of query-schedulers effectively used, regardless of how many replicas are running. This feature can be useful when using the experimental read-write deployment mode. #3005
    • [ENHANCEMENT] Go: updated to go 1.19.2. #2637 #3127 #3129
    • [ENHANCEMENT] Runtime config: don't unmarshal runtime configuration files if they haven't changed. This can save a bit of CPU and memory on every component using runtime config. #2954
    • [ENHANCEMENT] Query-frontend: Add cortex_frontend_query_result_cache_skipped_total and cortex_frontend_query_result_cache_attempted_total metrics to track the reason why query results are not cached. #2855
    • [ENHANCEMENT] Distributor: pool more connections per host when forwarding request. Mark requests as idempotent so they can be retried under some conditions. #2968
    • [ENHANCEMENT] Distributor: failure to send request to forwarding target now also increments cortex_distributor_forward_errors_total, with status_code="failed". #2968
    • [ENHANCEMENT] Distributor: added support for forwarding push requests via gRPC, using httpgrpc messages from the weaveworks/common library. #2996
    • [ENHANCEMENT] Query-frontend / Querier: increase internal backoff period used to retry connections to query-frontend / query-scheduler. #3011
    • [ENHANCEMENT] Querier: do not log "error processing requests from scheduler" when the query-scheduler is shutting down. #3012
    • [ENHANCEMENT] Query-frontend: query sharding process is now time-bounded and it is cancelled if the request is aborted. #3028
    • [ENHANCEMENT] Query-frontend: improved Prometheus response JSON encoding performance. #2450
    • [ENHANCEMENT] TLS: added configuration parameters to configure the client's TLS cipher suites and minimum version. The following new CLI flags have been added: #3070
      • -alertmanager.alertmanager-client.tls-cipher-suites
      • -alertmanager.alertmanager-client.tls-min-version
      • -alertmanager.sharding-ring.etcd.tls-cipher-suites
      • -alertmanager.sharding-ring.etcd.tls-min-version
      • -compactor.ring.etcd.tls-cipher-suites
      • -compactor.ring.etcd.tls-min-version
      • -distributor.forwarding.grpc-client.tls-cipher-suites
      • -distributor.forwarding.grpc-client.tls-min-version
      • -distributor.ha-tracker.etcd.tls-cipher-suites
      • -distributor.ha-tracker.etcd.tls-min-version
      • -distributor.ring.etcd.tls-cipher-suites
      • -distributor.ring.etcd.tls-min-version
      • -ingester.client.tls-cipher-suites
      • -ingester.client.tls-min-version
      • -ingester.ring.etcd.tls-cipher-suites
      • -ingester.ring.etcd.tls-min-version
      • -memberlist.tls-cipher-suites
      • -memberlist.tls-min-version
      • -querier.frontend-client.tls-cipher-suites
      • -querier.frontend-client.tls-min-version
      • -querier.store-gateway-client.tls-cipher-suites
      • -querier.store-gateway-client.tls-min-version
      • -query-frontend.grpc-client-config.tls-cipher-suites
      • -query-frontend.grpc-client-config.tls-min-version
      • -query-scheduler.grpc-client-config.tls-cipher-suites
      • -query-scheduler.grpc-client-config.tls-min-version
      • -query-scheduler.ring.etcd.tls-cipher-suites
      • -query-scheduler.ring.etcd.tls-min-version
      • -ruler.alertmanager-client.tls-cipher-suites
      • -ruler.alertmanager-client.tls-min-version
      • -ruler.client.tls-cipher-suites
      • -ruler.client.tls-min-version
      • -ruler.query-frontend.grpc-client-config.tls-cipher-suites
      • -ruler.query-frontend.grpc-client-config.tls-min-version
      • -ruler.ring.etcd.tls-cipher-suites
      • -ruler.ring.etcd.tls-min-version
      • -store-gateway.sharding-ring.etcd.tls-cipher-suites
      • -store-gateway.sharding-ring.etcd.tls-min-version
    • [ENHANCEMENT] Store-gateway: Add -blocks-storage.bucket-store.max-concurrent-reject-over-limit option to allow requests that exceed the max number of inflight object storage requests to be rejected. #2999
    • [ENHANCEMENT] Query-frontend: allow setting a separate limit on the total (before splitting/sharding) query length of range queries with the new experimental -query-frontend.max-total-query-length flag, which defaults to -store.max-query-length if unset or set to 0. #3058
    • [ENHANCEMENT] Query-frontend: Lower TTL for cache entries overlapping the out-of-order samples ingestion window (re-using -ingester.out-of-order-allowance from ingesters). #2935
    • [ENHANCEMENT] Ruler: added support to forcefully disable recording and/or alerting rules evaluation. The following new configuration options have been introduced, which can be overridden on a per-tenant basis in the runtime configuration: #3088
      • -ruler.recording-rules-evaluation-enabled
      • -ruler.alerting-rules-evaluation-enabled
    • [ENHANCEMENT] Distributor: Add age filter to forwarding functionality, to not forward samples which are older than defined duration. #3049
    • [ENHANCEMENT] Distributor: Improved error messages reported when the distributor fails to remote write to ingesters. #3055
    • [ENHANCEMENT] Improved tracing spans tracked by distributors, ingesters and store-gateways. #2879 #3099 #3089
    • [ENHANCEMENT] Ingester: improved the performance of label value cardinality endpoint. #3044
    • [ENHANCEMENT] Ruler: use backoff retry on remote evaluation. #3098
    • [ENHANCEMENT] Query-frontend: Include multiple tenant IDs in query logs when present instead of dropping them. #3125
    • [ENHANCEMENT] Alertmanager: reduced memory utilization in Mimir clusters with a large number of tenants. #3143
    • [ENHANCEMENT] Store-gateway: added extra span logging to improve observability. #3131
    • [BUGFIX] Querier: Fix 400 response while handling streaming remote read. #2963
    • [BUGFIX] Fix a bug that prevented the query-frontend, query-scheduler, and querier from failing when one of their internal components fails. #2978
    • [BUGFIX] Querier: re-balance the querier worker connections when a query-frontend or query-scheduler is terminated. #3005
    • [BUGFIX] Distributor: Now returns the quorum error from ingesters. For example, with replication_factor=3, two HTTP 400 errors and one HTTP 500 error, now the distributor will always return HTTP 400. Previously the behaviour was to return the error which the distributor first received. #2979
    • [BUGFIX] Ruler: fix panic when ruler.external_url is explicitly set to an empty string ("") in YAML. #2915
    • [BUGFIX] Alertmanager: Fix support for the Telegram API URL in the global settings. #3097
    • [BUGFIX] Alertmanager: Fix parsing of label matchers without label value in the API used to retrieve alerts. #3097
    • [BUGFIX] Ruler: Fix not restoring alert state for rule groups when other ruler replicas shut down. #3156
    • [BUGFIX] Updated golang.org/x/net dependency to fix CVE-2022-27664. #3124
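
    The query truncation introduced in #3134 can be pictured with a small sketch. It assumes that a query's start time is clamped to "now minus the retention period" and that a query lying entirely before that point has nothing left to return. The helper truncate_to_retention is hypothetical, and treating a retention of zero as "keep forever" is also an assumption; neither is taken from Mimir's code.

```python
from datetime import datetime, timedelta, timezone


def truncate_to_retention(start, end, retention, now):
    """Clamp [start, end] so it never reaches past the retention period.

    Returns None when the whole range is older than the retention period.
    A retention of zero is treated as "keep forever" (an assumption).
    """
    if retention == timedelta(0):
        return start, end
    oldest = now - retention
    if end < oldest:
        return None
    return max(start, oldest), end


now = datetime(2022, 10, 1, tzinfo=timezone.utc)
retention = timedelta(days=30)

# A 90-day query is shortened to the most recent 30 days.
start, end = truncate_to_retention(now - timedelta(days=90), now, retention, now)
print(start.isoformat())  # 2022-09-01T00:00:00+00:00
```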

    Mixin

    • [CHANGE] Alerts: MimirQuerierAutoscalerNotActive is now critical and fires after 1h instead of 15m. #2958
    • [FEATURE] Dashboards: Added "Mimir / Overview" dashboards, providing a high-level view of a Mimir cluster. #3122 #3147 #3155
    • [ENHANCEMENT] Dashboards: Updated the "Writes" and "Rollout progress" dashboards to account for samples ingested via the new OTLP ingestion endpoint. #2919 #2938
    • [ENHANCEMENT] Dashboards: Include per-tenant request rate in "Tenants" dashboard. #2874
    • [ENHANCEMENT] Dashboards: Include inflight object store requests in "Reads" dashboard. #2914
    • [ENHANCEMENT] Dashboards: Make queries used to find job, cluster and namespace for dropdown menus configurable. #2893
    • [ENHANCEMENT] Dashboards: Include rate of label and series queries in "Reads" dashboard. #3065 #3074
    • [ENHANCEMENT] Dashboards: Fix legend showing on per-pod panels. #2944
    • [ENHANCEMENT] Dashboards: Use the "req/s" unit on panels showing the requests rate. #3118
    • [ENHANCEMENT] Dashboards: Use a consistent color across dashboards for the error rate. #3154

    Jsonnet

    • [FEATURE] Added support for query-scheduler ring-based service discovery. #3128
    • [ENHANCEMENT] Querier autoscaling is now slower on scale downs: scale down 10% every 1m instead of 100%. #2962
    • [BUGFIX] Memberlist: gossip_member_label is now set for ruler-queriers. #3141
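
    To see what the slower scale-down from #2962 means in practice, the sketch below steps a replica count toward a lower target while removing at most 10% of the current replicas per one-minute step. The function name and the rounding rule (ceiling, with at least one replica removed per step) are assumptions for illustration; the real autoscaler may round differently.

```python
import math


def scale_down_steps(current, target, max_percent=10):
    """Yield the replica count after each one-minute scale-down step."""
    while current > target:
        # Remove at most max_percent of the current replicas, but at least one.
        step = max(1, math.ceil(current * max_percent / 100))
        current = max(target, current - step)
        yield current


print(list(scale_down_steps(20, 14)))  # [18, 16, 14]
```

    With the previous policy of 100% per minute, the same change would have happened in a single step.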

    Mimirtool

    • [ENHANCEMENT] mimirtool analyze: Store query errors instead of exiting during the analysis. #3052
    • [BUGFIX] mimir-tool remote-read: fix return paths that returned a nil error even though an error had occurred. #3053

    Documentation

    • [ENHANCEMENT] Added documentation on how to configure storage retention. #2970
    • [ENHANCEMENT] Improved gRPC clients config documentation. #3020
    • [ENHANCEMENT] Added documentation on how to manage alerting and recording rules. #2983
    • [ENHANCEMENT] Improved MimirSchedulerQueriesStuck runbook. #3006
    • [ENHANCEMENT] Added "Cluster label verification" section to memberlist documentation. #3096
    • [ENHANCEMENT] Mention compression in multi-zone replication documentation. #3107
    • [BUGFIX] Fixed configuration option names in "Enabling zone-awareness via the Grafana Mimir Jsonnet". #3018
    • [BUGFIX] Fixed mimirtool analyze parameters documentation. #3094
    • [BUGFIX] Fixed YAML configuration in the "Manage the configuration of Grafana Mimir with Helm" guide. #3042
    • [BUGFIX] Fixed Alertmanager capacity planning documentation. #3132

    Tools

    • [BUGFIX] trafficdump: Fixed panic occurring when -success-only=true and the captured request failed. #2863
  • mimir-2.3.1(Sep 27, 2022)

    This release contains 5 PRs from 1 author. Thank you!

    2.3.1

    Grafana Mimir

    • [BUGFIX] Query-frontend: query sharding took exponential time to map binary expressions. #3027
    • [BUGFIX] Distributor: Stop panics on OTLP endpoint when a single metric has multiple timeseries. #3040

    Full Changelog: https://github.com/grafana/mimir/compare/mimir-2.3.0...mimir-2.3.1

  • mimir-2.3.0(Sep 20, 2022)

    Grafana Mimir version 2.3 release notes

    Grafana Labs is excited to announce version 2.3 of Grafana Mimir, the most scalable, most performant open source time series database in the world.

    The highlights that follow include the top features, enhancements, and bugfixes in this release. For the complete list of changes, see the changelog.

    Note: If you are upgrading from Grafana Mimir 2.2, review the list of important changes that follow.

    This release contains 370 PRs from 39 authors. Thank you!

    Features and enhancements

    • Ingest metrics in OpenTelemetry format: This release of Grafana Mimir introduces experimental support for ingesting metrics from the OpenTelemetry Collector's otlphttp exporter. This adds a second ingestion option for users of the OTel Collector; Mimir was already compatible with the prometheusremotewrite exporter. For more information, please see Configure OTel Collector.

    • Tenant federation for metadata queries: Users with tenant federation enabled could already issue instant queries, range queries, and exemplar queries to multiple tenants at once and receive a single aggregated result. With Grafana Mimir 2.3, we've added tenant federation support to the /api/v1/metadata endpoint as well.

    • Simpler object storage configuration: Users can now configure block, alertmanager, and ruler storage all at once with the common YAML config option key (or -common.storage.* CLI flags). By centralizing your object storage configuration in one place, this enhancement makes configuration faster and less error prone. Users may still individually configure storage for each of these components if they desire. For more information, see the Common Configurations.

    • .deb and .rpm packages for Mimir: Starting with version 2.3, we're publishing .deb and .rpm files for Grafana Mimir, which will make installing and running it on Debian or Red Hat-based Linux systems much easier. Thank you to community contributor wilfriedroset for your work to implement this!

    • Import historic data: Users can now backfill time series data from their existing Prometheus or Cortex installation into Mimir using mimirtool, making it possible to migrate to Grafana Mimir without losing your existing metrics data. This support is still considered experimental and does not yet work for data stored in Thanos. To learn more about this feature, please see mimirtool backfill and Configure TSDB block upload.

    • Increased instant query performance: Grafana Mimir now supports splitting instant queries by time. This allows it to better parallelize execution of instant queries and therefore return results faster. At present, splitting is only supported for a subset of instant queries, which means not all instant queries will see a speedup. This feature is currently experimental and is disabled by default. It can be enabled with the split_instant_queries_by_interval YAML config option in the limits section (or the CLI flag -query-frontend.split-instant-queries-by-interval).
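
    As a rough illustration of splitting by time, the sketch below cuts the lookback range of an instant query (for example, a 3h range selector) into fixed-size sub-ranges whose partial results could be computed in parallel and then combined. The function name and the simple, alignment-free splitting are assumptions for illustration; the release notes do not describe how Mimir chooses split boundaries or which queries qualify.

```python
def split_range(end, range_seconds, interval_seconds):
    """Split the window [end - range_seconds, end] into consecutive
    sub-ranges of at most interval_seconds each, oldest first.

    Times are plain Unix-style seconds; returns a list of (start, end) pairs.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    start = end - range_seconds
    parts = []
    while start < end:
        part_end = min(start + interval_seconds, end)
        parts.append((start, part_end))
        start = part_end
    return parts


# A 3h range evaluated at t=10800s, split by a 1h interval.
print(split_range(10800, 3 * 3600, 3600))
# [(0, 3600), (3600, 7200), (7200, 10800)]
```

    In this picture, the interval plays the role of the value given to split_instant_queries_by_interval.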

    Helm chart improvements

    The Mimir Helm chart is the best way to install Mimir on Kubernetes. As part of the Mimir 2.3 release, we’re also releasing version 3.1 of the Mimir Helm chart.

    Notable enhancements follow. For the full list of changes, see the Helm chart changelog.

    • We've upgraded the MinIO subchart dependency from a deprecated chart to the supported one. This creates a breaking change in how the administrator password is set. However, as the built-in MinIO is not a recommended object store for production use cases, this change did not warrant a new major version of the Mimir Helm chart.
    • Query sharding is now enabled by default, which should give you better performance on high-cardinality metric queries.
      • To compensate for the increased number of queries generated by query sharding, the query scheduler component is now enabled by default.
    • The backfill API endpoints for importing historic time series data are now exposed on the Nginx gateway.
    • Nginx now sets the value of the X-Scope-OrgID header equal to the value of Mimir's no_auth_tenant parameter by default. The previous release set the value of X-Scope-OrgID to anonymous by default, which complicated the process of migrating to Mimir.
    • Memberlist now uses DNS service-discovery by default, which decreases startup time for large Mimir clusters.

    Important changes

    In Grafana Mimir 2.3 we have removed the following previously deprecated configuration options:

    • The extend_writes parameter in the distributor YAML configuration and -distributor.extend-writes CLI flag have been removed.
    • The active_series_custom_trackers parameter has been removed from the YAML configuration. It had already been moved to the runtime configuration. See #1188 for details.
    • The blocks-storage.tsdb.isolation-enabled parameter in the YAML configuration and -blocks-storage.tsdb.isolation-enabled CLI flag have been removed.

    With Grafana Mimir 2.3 we have also updated the default value for the CLI flag -distributor.ha-tracker.max-clusters to 100 to provide Denial-of-Service protection. Previously, -distributor.ha-tracker.max-clusters was unlimited by default, which could allow a tenant with HA Dedupe enabled to overload the HA tracker with __cluster__ label values and cause the HA Dedupe database to fail.
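
    The idea behind the new default can be sketched as a per-tenant cap on distinct cluster values. The class and method names below are hypothetical, treating 0 as "unlimited" is an assumption, and the sketch shows only the capping logic, not the HA tracker itself.

```python
class ClusterLimiter:
    """Track distinct cluster names per tenant, up to max_clusters.

    A max_clusters of 0 is treated as unlimited (an assumption for this sketch).
    """

    def __init__(self, max_clusters=100):
        self.max_clusters = max_clusters
        self._clusters = {}  # tenant -> set of cluster names seen so far

    def accept(self, tenant, cluster):
        seen = self._clusters.setdefault(tenant, set())
        if cluster in seen:
            return True  # already tracked, costs nothing extra
        if self.max_clusters and len(seen) >= self.max_clusters:
            return False  # reject new clusters once the cap is reached
        seen.add(cluster)
        return True


limiter = ClusterLimiter(max_clusters=2)
print([limiter.accept("tenant-a", c) for c in ["c1", "c2", "c3", "c1"]])
# [True, True, False, True]
```

    Because the cap is kept per tenant, one tenant sending many cluster values cannot exhaust the budget of another.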

    Also, as noted above, the administrator password for Helm chart deployments using the built-in MinIO is now set differently.

    Bug fixes

    • PR 2447: Fix incorrect mapping of HTTP status code 429 to 500 when the request queue is full in the query-frontend. This corrects behavior in the query-frontend where a retryable 429 "Too Many Outstanding Requests" error from a querier was incorrectly returned as an unretryable 500 system error.
    • PR 2505: The Memberlist key-value (KV) store now tries to "fast-join" the cluster to avoid serving an empty KV store. This fix addresses the confusing "empty ring" error response and the error log message "ring doesn't exist in KV store yet" emitted by services when there are other members present in the ring when a service starts. Those using other key-value store options (e.g., consul, etcd) are not impacted by this bug.
    • PR 2289: The "List Prometheus rules" API endpoint of the Mimir Ruler component is no longer blocked while rules are being synced. This means users can now list rules while syncing larger rule sets.
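
    The distinction restored by PR 2447 matters to clients that retry. The following is a minimal sketch, assuming a client that treats 429 as retryable and any other status as final, as described above. Here send is a hypothetical callable that returns an HTTP status code, and the attempt count and backoff values are arbitrary.

```python
import time


def call_with_retry(send, max_attempts=3, backoff_seconds=0.0):
    """Call send() until it returns something other than 429.

    Returns a (status, attempts) tuple.
    """
    status = None
    for attempt in range(1, max_attempts + 1):
        status = send()
        if status != 429:
            return status, attempt
        if attempt < max_attempts:
            # Linear backoff between attempts.
            time.sleep(backoff_seconds * attempt)
    return status, max_attempts


# A queue-full condition that clears on the third attempt is retried.
responses = iter([429, 429, 200])
print(call_with_retry(lambda: next(responses)))  # (200, 3)

# A 500 is not retried, which is why mislabeling a 429 as 500 hurt clients.
print(call_with_retry(lambda: 500))  # (500, 1)
```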

    Changelog

    2.3.0

    Grafana Mimir

    • [CHANGE] Ingester: Added user label to ingester metric cortex_ingester_tsdb_out_of_order_samples_appended_total. On multitenant clusters this helps us find the rate of appended out-of-order samples for a specific tenant. #2493
    • [CHANGE] Compactor: delete source and output blocks from local disk when compaction fails, to reduce the likelihood that subsequent compactions fail because of no space left on disk. #2261
    • [CHANGE] Ruler: Remove unused CLI flags -ruler.search-pending-for and -ruler.flush-period (and their respective YAML config options). #2288
    • [CHANGE] Successful gRPC requests are no longer logged (only affects internal API calls). #2309
    • [CHANGE] Add new -*.consul.cas-retry-delay flags. They have a default value of 1s, while previously there was no delay between retries. #2309
    • [CHANGE] Store-gateway: Remove the experimental ability to run requests in a dedicated OS thread pool and associated CLI flag -store-gateway.thread-pool-size. #2423
    • [CHANGE] Memberlist: disabled TCP-based ping fallback, because Mimir already uses a custom transport based on TCP. #2456
    • [CHANGE] Change default value for -distributor.ha-tracker.max-clusters to 100 to provide a DoS protection. #2465
    • [CHANGE] Experimental block upload API exposed by compactor has changed: Previous /api/v1/upload/block/{block} endpoint for starting block upload is now /api/v1/upload/block/{block}/start, and previous endpoint /api/v1/upload/block/{block}?uploadComplete=true for finishing block upload is now /api/v1/upload/block/{block}/finish. New API endpoint has been added: /api/v1/upload/block/{block}/check. #2486 #2548
    • [CHANGE] Compactor: changed -compactor.max-compaction-time default from 0s (disabled) to 1h. When compacting blocks for a tenant, the compactor will move to compact blocks of another tenant or re-plan blocks to compact at least every 1h. #2514
    • [CHANGE] Distributor: removed previously deprecated extend_writes (see #1856) YAML key and -distributor.extend-writes CLI flag from the distributor config. #2551
    • [CHANGE] Ingester: removed previously deprecated active_series_custom_trackers (see #1188) YAML key from the ingester config. #2552
    • [CHANGE] The tenant ID __mimir_cluster is reserved by Mimir and not allowed to store metrics. #2643
    • [CHANGE] Purger: removed the purger component and moved its API endpoints /purger/delete_tenant and /purger/delete_tenant_status to the compactor at /compactor/delete_tenant and /compactor/delete_tenant_status. The new endpoints on the compactor are stable. #2644
    • [CHANGE] Memberlist: Change the leave timeout (-memberlist.leave-timeout) from 5s to 20s and the connection timeout (-memberlist.packet-dial-timeout) from 5s to 2s. This makes the leave timeout 10x the connection timeout, so that we can communicate the leave to at least 1 node even if the first 9 nodes we try to contact time out. #2669
    • [CHANGE] Alertmanager: return status code 412 Precondition Failed and log info message when alertmanager isn't configured for a tenant. #2635
    • [CHANGE] Distributor: if forwarding rules are used to forward samples, exemplars are now removed from the request. #2710, #2725
    • [CHANGE] Limits: change the default value of max_global_series_per_metric limit to 0 (disabled). Setting this limit by default does not provide much benefit because series are sharded by all labels. #2714
    • [CHANGE] Ingester: experimental -blocks-storage.tsdb.new-chunk-disk-mapper has been removed, new chunk disk mapper is now always used, and is no longer marked experimental. Default value of -blocks-storage.tsdb.head-chunks-write-queue-size has changed to 1000000, this enables async chunk queue by default, which leads to improved latency on the write path when new chunks are created in ingesters. #2762
    • [CHANGE] Ingester: removed deprecated -blocks-storage.tsdb.isolation-enabled option. TSDB-level isolation is now always disabled in Mimir. #2782
    • [CHANGE] Compactor: -compactor.partial-block-deletion-delay must either be set to 0 (to disable partial blocks deletion) or a value higher than 4h. #2787
    • [CHANGE] Query-frontend: CLI flag -query-frontend.align-querier-with-step has been deprecated. Please use -query-frontend.align-queries-with-step instead. #2840
    • [FEATURE] Compactor: Adds the ability to delete partial blocks after a configurable delay. This option can be configured per tenant. #2285
      • -compactor.partial-block-deletion-delay, as a duration string, allows you to set the delay since a partial block has been modified before marking it for deletion. A value of 0, the default, disables this feature.
      • The metric cortex_compactor_blocks_marked_for_deletion_total has a new value for the reason label reason="partial", when a block deletion marker is triggered by the partial block deletion delay.
    • [FEATURE] Querier: enabled support for queries with negative offsets, which are not cached in the query results cache. #2429
    • [FEATURE] EXPERIMENTAL: OpenTelemetry Metrics ingestion path on /otlp/v1/metrics. #695 #2436 #2461
    • [FEATURE] Querier: Added support for tenant federation to metric metadata endpoint. #2467
    • [FEATURE] Query-frontend: introduced experimental support to split instant queries by time. The instant query splitting can be enabled setting -query-frontend.split-instant-queries-by-interval. #2469 #2564 #2565 #2570 #2571 #2572 #2573 #2574 #2575 #2576 #2581 #2582 #2601 #2632 #2633 #2634 #2641 #2642 #2766
    • [FEATURE] Introduced experimental anonymous usage statistics tracking (disabled by default), to help Mimir maintainers make better decisions to support the open source community. The tracking system anonymously collects non-sensitive, non-personally identifiable information about the running Mimir cluster. #2643 #2662 #2685 #2732 #2733 #2735
    • [FEATURE] Introduced an experimental deployment mode called read-write, which runs a fully featured Mimir cluster with three components: write, read, and backend. The read-write deployment mode is a trade-off between the monolithic mode (only one component, no isolation) and the microservices mode (many components, high isolation). #2754 #2838
    • [ENHANCEMENT] Distributor: Decreased distributor tests execution time. #2562
    • [ENHANCEMENT] Alertmanager: Allow the HTTP proxy_url configuration option in the receiver's configuration. #2317
    • [ENHANCEMENT] ring: optimize shuffle-shard computation when lookback is used and all instances have a registered timestamp within the lookback window. In that case we can immediately return the original ring, because we would select all instances anyway. #2309
    • [ENHANCEMENT] Memberlist: added experimental memberlist cluster label support via -memberlist.cluster-label and -memberlist.cluster-label-verification-disabled CLI flags (and their respective YAML config options). #2354
    • [ENHANCEMENT] Object storage can now be configured for all components using the common YAML config option key (or -common.storage.* CLI flags). #2330 #2347
    • [ENHANCEMENT] Go: updated to go 1.18.4. #2400
    • [ENHANCEMENT] Store-gateway, listblocks: list of blocks now includes stats from meta.json file: number of series, samples and chunks. #2425
    • [ENHANCEMENT] Added more buckets to cortex_ingester_client_request_duration_seconds histogram metric, to correctly track requests taking longer than 1s (up until 16s). #2445
    • [ENHANCEMENT] Azure client: Improve memory usage for large object storage downloads. #2408
    • [ENHANCEMENT] Distributor: Add -distributor.instance-limits.max-inflight-push-requests-bytes. This limit protects the distributor against multiple large requests that together may cause an OOM, but are only a few, so do not trigger the max-inflight-push-requests limit. #2413
    • [ENHANCEMENT] Distributor: Drop exemplars in distributor for tenants where exemplars are disabled. #2504
    • [ENHANCEMENT] Runtime Config: Allow operator to specify multiple comma-separated yaml files in -runtime-config.file that will be merged in left to right order. #2583
    • [ENHANCEMENT] Query sharding: shard binary operations only if it doesn't lead to non-shardable vector selectors in one of the operands. #2696
    • [ENHANCEMENT] Add packaging for both Debian-based deb files and Red Hat-based rpm files using FPM. #1803
    • [ENHANCEMENT] Distributor: Add cortex_distributor_query_ingester_chunks_deduped_total and cortex_distributor_query_ingester_chunks_total metrics for determining how effective ingester chunk deduplication at query time is. #2713
    • [ENHANCEMENT] Upgrade Docker base images to alpine:3.16.2. #2729
    • [ENHANCEMENT] Ruler: Add <prometheus-http-prefix>/api/v1/status/buildinfo endpoint. #2724
    • [ENHANCEMENT] Querier: Ensure all queries pulled from query-frontend or query-scheduler are immediately executed. The maximum workers concurrency in each querier is configured by -querier.max-concurrent. #2598
    • [ENHANCEMENT] Distributor: Add cortex_distributor_received_requests_total and cortex_distributor_requests_in_total metrics to provide visibility into appropriate per-tenant request limits. #2770
    • [ENHANCEMENT] Distributor: Add single forwarding remote-write endpoint for a tenant (forwarding_endpoint), instead of using per-rule endpoints. This takes precedence over per-rule endpoints. #2801
    • [ENHANCEMENT] Added err-mimir-distributor-max-write-message-size to the errors catalog. #2470
    • [ENHANCEMENT] Add sanity check at startup to ensure the configured filesystem directories don't overlap for different components. #2828
    • [BUGFIX] TSDB: Fixed a bug on the experimental out-of-order implementation that led to wrong query results. #2701
    • [BUGFIX] Compactor: log the actual error when compaction fails. #2261
    • [BUGFIX] Alertmanager: restore state from storage even when running a single replica. #2293
    • [BUGFIX] Ruler: do not block "List Prometheus rules" API endpoint while syncing rules. #2289
    • [BUGFIX] Ruler: return proper *status.Status error when running in remote operational mode. #2417
    • [BUGFIX] Alertmanager: ensure the configured -alertmanager.web.external-url is either a path starting with /, or a full URL including the scheme and hostname. #2381 #2542
    • [BUGFIX] Memberlist: fix problem with loss of some packets, typically ring updates when instances were removed from the ring during shutdown. #2418
    • [BUGFIX] Ingester: fix misfiring MimirIngesterHasUnshippedBlocks and stale cortex_ingester_oldest_unshipped_block_timestamp_seconds when some block uploads fail. #2435
    • [BUGFIX] Query-frontend: fix incorrect mapping of http status codes 429 to 500 when request queue is full. #2447
    • [BUGFIX] Memberlist: Fix problem with ring being empty right after startup. Memberlist KV store now tries to "fast-join" the cluster to avoid serving empty KV store. #2505
    • [BUGFIX] Compactor: Fix bug when using -compactor.partial-block-deletion-delay: compactor didn't correctly check for modification time of all block files. #2559
    • [BUGFIX] Query-frontend: fix wrong query sharding results for queries with boolean result like 1 < bool 0. #2558
    • [BUGFIX] Fixed error messages related to per-instance limits incorrectly reporting they can be set on a per-tenant basis. #2610
    • [BUGFIX] Perform HA-deduplication before forwarding samples according to forwarding rules in the distributor. #2603 #2709
    • [BUGFIX] Fix reporting of tracing spans from PromQL engine. #2707
    • [BUGFIX] Apply relabel and drop_label rules before forwarding rules in the distributor. #2703
    • [BUGFIX] Distributor: Register cortex_discarded_requests_total metric, which previously was not registered and therefore not exported. #2712
    • [BUGFIX] Ruler: fix not restoring alerts' state at startup. #2648
    • [BUGFIX] Ingester: Fix disk filling up after restarting ingesters with out-of-order support disabled while it was enabled before. #2799
    • [BUGFIX] Memberlist: retry joining memberlist cluster on startup when no nodes are resolved. #2837
    • [BUGFIX] Query-frontend: fix incorrect mapping of http status codes 413 to 500 when request is too large. #2819
    • [BUGFIX] Alertmanager: revert upstream alertmanager to v0.24.0 to fix panic when unmarshalling email headers. #2924 #2925
    • [BUGFIX] Fix sanity check done on configured filesystem directories when running Alertmanager in microservices mode. #2947
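
    The left-to-right merge of multiple -runtime-config.file entries listed above can be pictured with a minimal Python sketch. It assumes the files have already been parsed into dictionaries, and it assumes nested mappings are merged recursively with later files winning. The changelog entry does not spell out the exact merge semantics, so treat this as an illustration rather than Mimir's implementation. The tenant names and limit values are hypothetical.

```python
from functools import reduce


def merge_pair(base, override):
    """Return a new dict in which values from `override` win over `base`.

    Nested dicts are merged recursively (an assumption of this sketch).
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_pair(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_runtime_configs(configs):
    """Merge already-parsed config dicts in left-to-right order."""
    return reduce(merge_pair, configs, {})


# Hypothetical contents of two runtime config files, already parsed from YAML.
first = {"overrides": {"tenant-a": {"ingestion_rate": 10000}}}
second = {
    "overrides": {
        "tenant-a": {"ingestion_rate": 20000, "max_global_series_per_user": 150000},
        "tenant-b": {"ingestion_rate": 5000},
    }
}

merged = merge_runtime_configs([first, second])
print(merged["overrides"]["tenant-a"])
```

    Under these assumptions, the order of the files matters: swapping the two inputs would leave tenant-a with the ingestion rate from the first dictionary instead of the second.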

    Mixin

    • [CHANGE] Dashboards: "Slow Queries" dashboard no longer works with versions older than Grafana 9.0. #2223
    • [CHANGE] Alerts: use RSS memory instead of working set memory in the MimirAllocatingTooMuchMemory alert for ingesters. #2480
    • [CHANGE] Dashboards: remove the "Cache - Latency (old)" panel from the "Mimir / Queries" dashboard. #2796
    • [FEATURE] Dashboards: added support for the experimental read-write deployment mode. #2780
    • [ENHANCEMENT] Dashboards: added missed rule evaluations to the "Evaluations per second" panel in the "Mimir / Ruler" dashboard. #2314
    • [ENHANCEMENT] Dashboards: add k8s resource requests to CPU and memory panels. #2346
    • [ENHANCEMENT] Dashboards: add RSS memory utilization panel for ingesters, store-gateways and compactors. #2479
    • [ENHANCEMENT] Dashboards: allow configuring the graph tooltip. #2647
    • [ENHANCEMENT] Alerts: MimirFrontendQueriesStuck and MimirSchedulerQueriesStuck alerts are more reliable now as they consider all the intermediate samples in the minute prior to the evaluation. #2630
    • [ENHANCEMENT] Alerts: added RolloutOperatorNotReconciling alert, firing if the optional rollout-operator is not successfully reconciling. #2700
    • [ENHANCEMENT] Dashboards: added support for query-tee in front of ruler-query-frontend in the "Remote ruler reads" dashboard. #2761
    • [ENHANCEMENT] Dashboards: Introduce support for baremetal deployment, setting deployment_type: 'baremetal' in the mixin _config. #2657
    • [ENHANCEMENT] Dashboards: use timeseries panel to show exemplars. #2800
    • [BUGFIX] Dashboards: fixed unit of latency panels in the "Mimir / Ruler" dashboard. #2312
    • [BUGFIX] Dashboards: fixed "Intervals per query" panel in the "Mimir / Queries" dashboard. #2308
    • [BUGFIX] Dashboards: make the "Slow Queries" dashboard work with Grafana 9.0. #2223
    • [BUGFIX] Dashboards: add missing API routes to Ruler dashboard. #2412
    • [BUGFIX] Dashboards: stop setting 'interval' in dashboards; it should be set on your datasource. #2802

    Jsonnet

    • [CHANGE] query-scheduler is enabled by default. We advise deploying the query-scheduler to improve the scalability of the query-frontend. #2431
    • [CHANGE] Replaced anti-affinity rules with pod topology spread constraints for distributor, query-frontend, querier and ruler. #2517
      • The following configuration options have been removed:
        • distributor_allow_multiple_replicas_on_same_node
        • query_frontend_allow_multiple_replicas_on_same_node
        • querier_allow_multiple_replicas_on_same_node
        • ruler_allow_multiple_replicas_on_same_node
      • The following configuration options have been added:
        • distributor_topology_spread_max_skew
        • query_frontend_topology_spread_max_skew
        • querier_topology_spread_max_skew
        • ruler_topology_spread_max_skew
    • [CHANGE] Change max_global_series_per_metric to 0 in all plans, and as a default value. #2669
    • [FEATURE] Memberlist: added support for experimental memberlist cluster label, through the jsonnet configuration options memberlist_cluster_label and memberlist_cluster_label_verification_disabled. #2349
    • [FEATURE] Added ruler-querier autoscaling support. It requires KEDA installed in the Kubernetes cluster. Ruler-querier autoscaler can be enabled and configured through the following options in the jsonnet config: #2545
      • autoscaling_ruler_querier_enabled: true to enable autoscaling.
      • autoscaling_ruler_querier_min_replicas: minimum number of ruler-querier replicas.
      • autoscaling_ruler_querier_max_replicas: maximum number of ruler-querier replicas.
      • autoscaling_prometheus_url: Prometheus base URL from which to scrape Mimir metrics (e.g. http://prometheus.default:9090/prometheus).
    • [ENHANCEMENT] Memberlist now uses DNS service-discovery by default. #2549
    • [ENHANCEMENT] Upgrade memcached image tag to memcached:1.6.16-alpine. #2740
    • [ENHANCEMENT] Added $._config.configmaps and $._config.runtime_config_files to make it easy to add new configmaps or runtime config file to all components. #2748

    Mimirtool

    • [ENHANCEMENT] Added mimirtool backfill command to upload Prometheus blocks using API available in the compactor. #1822
    • [ENHANCEMENT] mimirtool bucket-validation: Verify existing objects can be overwritten by subsequent uploads. #2491
    • [ENHANCEMENT] mimirtool config convert: Now supports migrating to the current version of Mimir. #2629
    • [BUGFIX] mimirtool analyze: Fix dashboard JSON unmarshalling errors by using custom parsing. #2386
    • [BUGFIX] Version checking no longer prompts for updating when already on latest version. #2723

    Mimir Continuous Test

    • [ENHANCEMENT] Added basic authentication and bearer token support for when Mimir is behind a gateway authenticating the calls. #2717

    Query-tee

    • [CHANGE] Renamed CLI flag -server.service-port to -server.http-service-port. #2683
    • [CHANGE] Renamed metric cortex_querytee_request_duration_seconds to cortex_querytee_backend_request_duration_seconds. Metric cortex_querytee_request_duration_seconds is now reported without label backend. #2683
    • [ENHANCEMENT] Added HTTP over gRPC support to query-tee to allow testing gRPC requests to Mimir instances. #2683

    Documentation

    • [ENHANCEMENT] Referenced mimirtool commands in the HTTP API documentation. #2516
    • [ENHANCEMENT] Improved DNS service discovery documentation. #2513

    Tools

    • [ENHANCEMENT] markblocks now processes multiple blocks concurrently. #2677

    New Contributors

    • @ese made their first contribution in https://github.com/grafana/mimir/pull/2196
    • @micborens made their first contribution in https://github.com/grafana/mimir/pull/2321
    • @marctc made their first contribution in https://github.com/grafana/mimir/pull/2518
    • @ravilushqa made their first contribution in https://github.com/grafana/mimir/pull/2562
    • @BrandonDalton made their first contribution in https://github.com/grafana/mimir/pull/2515
    • @nervo made their first contribution in https://github.com/grafana/mimir/pull/2569
    • @lamida made their first contribution in https://github.com/grafana/mimir/pull/2427
    • @LeviHarrison made their first contribution in https://github.com/grafana/mimir/pull/2644
    • @sysedwinistrator made their first contribution in https://github.com/grafana/mimir/pull/2087

    Full Changelog: https://github.com/grafana/mimir/compare/mimir-2.2.0...mimir-2.3.0

  • mimir-2.3.0-rc.2(Sep 14, 2022)

    Changes since 2.3.0-rc.0

    This release contains 33 contributions from 9 authors. Thank you!

    Note: We tagged a 2.3.0-rc.1 but found a panic in the alertmanager before publishing the 2.3.0-rc.1 pre-release. With 2.3.0-rc.2 we have included the fix for the alertmanager and created a new tag and release candidate.


    2.3.0-rc.2

    Grafana Mimir

    • [BUGFIX] Alertmanager: revert upstream alertmanager to v0.24.0 to fix panic when unmarshalling email headers. #2924 #2925

    2.3.0-rc.1

    Grafana Mimir

    • [CHANGE] Distributor: if forwarding rules are used to forward samples, exemplars are now removed from the request #2725
    • [CHANGE] Ingester: experimental -blocks-storage.tsdb.new-chunk-disk-mapper has been removed, new chunk disk mapper is now always used, and is no longer marked experimental. Default value of -blocks-storage.tsdb.head-chunks-write-queue-size has changed to 1000000, this enables async chunk queue by default, which leads to improved latency on the write path when new chunks are created in ingesters. #2762
    • [CHANGE] Ingester: removed deprecated -blocks-storage.tsdb.isolation-enabled option. TSDB-level isolation is now always disabled in Mimir. #2782
    • [CHANGE] Compactor: -compactor.partial-block-deletion-delay must either be set to 0 (to disable partial blocks deletion) or a value higher than 4h. #2787
    • [CHANGE] Query-frontend: CLI flag -query-frontend.align-querier-with-step has been deprecated. Please use -query-frontend.align-queries-with-step instead. #2840
    • [CHANGE] Distributor: change the default value of -distributor.remote-timeout to 2s from 20s and -distributor.forwarding.request-timeout to 2s from 10s to improve distributor resource usage when ingesters crash. #2728
    • [FEATURE] Introduced experimental anonymous usage statistics tracking (disabled by default) to help Mimir maintainers make better decisions to support the open source community. The tracking system anonymously collects non-sensitive, non-personally identifiable information about the running Mimir cluster. #2643 #2662 #2685 #2732 #2733 #2735
    • [FEATURE] Introduced an experimental deployment mode called read-write, which runs a fully featured Mimir cluster with three components: write, read and backend. The read-write deployment mode is a trade-off between the monolithic mode (only one component, no isolation) and the microservices mode (many components, high isolation). #2754 #2838
    • [ENHANCEMENT] Distributor: Add cortex_distributor_query_ingester_chunks_deduped_total and cortex_distributor_query_ingester_chunks_total metrics for determining how effective ingester chunk deduplication at query time is. #2713
    • [ENHANCEMENT] Upgrade Docker base images to alpine:3.16.2. #2729
    • [ENHANCEMENT] Ruler: Add <prometheus-http-prefix>/api/v1/status/buildinfo endpoint. #2724
    • [ENHANCEMENT] Querier: Ensure all queries pulled from query-frontend or query-scheduler are immediately executed. The maximum workers concurrency in each querier is configured by -querier.max-concurrent. #2598
    • [ENHANCEMENT] Distributor: Add cortex_distributor_received_requests_total and cortex_distributor_requests_in_total metrics to provide visibility into appropriate per-tenant request limits. #2770
    • [ENHANCEMENT] Distributor: Add single forwarding remote-write endpoint for a tenant (forwarding_endpoint), instead of using per-rule endpoints. This takes precedence over per-rule endpoints. #2801
    • [ENHANCEMENT] Added err-mimir-distributor-max-write-message-size to the errors catalog. #2470
    • [ENHANCEMENT] Add sanity check at startup to ensure the configured filesystem directories don't overlap for different components. #2828
    • [ENHANCEMENT] Go: updated to go 1.19.1. #2637
    • [BUGFIX] Ruler: fix not restoring alerts' state at startup. #2648
    • [BUGFIX] Ingester: Fix disk filling up after restarting ingesters with out-of-order support disabled while it was enabled before. #2799
    • [BUGFIX] Memberlist: retry joining memberlist cluster on startup when no nodes are resolved. #2837
    • [BUGFIX] Query-frontend: fix incorrect mapping of http status codes 413 to 500 when request is too large. #2819
    • [BUGFIX] Ruler: fix panic when ruler.external_url is explicitly set to an empty string ("") in YAML. #2915
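
    The constraint on -compactor.partial-block-deletion-delay listed above (0, or a value higher than 4h) can be expressed as a small validation routine. The following Python sketch is only an illustration of that rule. The function name is hypothetical, and rejecting a delay of exactly 4h is an assumption based on a literal reading of "higher than 4h".

```python
from datetime import timedelta

# Threshold taken from the changelog entry: values must be 0 or higher than 4h.
MIN_DELAY = timedelta(hours=4)


def validate_partial_block_deletion_delay(delay):
    """Raise ValueError unless `delay` is 0 (feature disabled) or greater than 4h.

    Treating exactly 4h as invalid is an assumption of this sketch.
    """
    if delay == timedelta(0):
        return  # 0 disables partial block deletion
    if delay <= MIN_DELAY:
        raise ValueError(
            f"partial block deletion delay must be 0 or higher than {MIN_DELAY}, got {delay}"
        )


# Try a few candidate values and report which ones pass.
for candidate in (timedelta(0), timedelta(hours=1), timedelta(hours=6)):
    try:
        validate_partial_block_deletion_delay(candidate)
        print(candidate, "accepted")
    except ValueError as err:
        print(candidate, "rejected:", err)
```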

    Mixin

    • [CHANGE] Dashboards: remove the "Cache - Latency (old)" panel from the "Mimir / Queries" dashboard. #2796
    • [FEATURE] Dashboards: added support for the experimental read-write deployment mode. #2780
    • [ENHANCEMENT] Dashboards: Updated the Writes dashboard to account for samples ingested via the new OTLP ingestion endpoint. #2919
    • [ENHANCEMENT] Dashboards: added support for query-tee in front of ruler-query-frontend in the "Remote ruler reads" dashboard. #2761
    • [ENHANCEMENT] Dashboards: Introduce support for baremetal deployment, setting deployment_type: 'baremetal' in the mixin _config. #2657
    • [ENHANCEMENT] Dashboards: use timeseries panel to show exemplars. #2800
    • [ENHANCEMENT] Dashboards: Include per-tenant request rate in "Tenants" dashboard. #2874
    • [ENHANCEMENT] Dashboards: Include inflight object store requests in "Reads" dashboard. #2914
    • [BUGFIX] Dashboards: stop setting 'interval' in dashboards; it should be set on your datasource. #2802

    Jsonnet

    • [ENHANCEMENT] Upgrade memcached image tag to memcached:1.6.16-alpine. #2740
    • [ENHANCEMENT] Added $._config.configmaps and $._config.runtime_config_files to make it easy to add new configmaps or runtime config file to all components. #2748

    Mimirtool

    • [BUGFIX] Version checking no longer prompts for updating when already on latest version. #2723

    Query-tee

    • [CHANGE] Renamed CLI flag -server.service-port to -server.http-service-port. #2683
    • [CHANGE] Renamed metric cortex_querytee_request_duration_seconds to cortex_querytee_backend_request_duration_seconds. Metric cortex_querytee_request_duration_seconds is now reported without label backend. #2683
    • [ENHANCEMENT] Added HTTP over gRPC support to query-tee to allow testing gRPC requests to Mimir instances. #2683

    Mimir Continuous Test

    • [ENHANCEMENT] Added basic authentication and bearer token support for when Mimir is behind a gateway authenticating the calls. #2717

    Full Changelog: https://github.com/grafana/mimir/compare/mimir-2.3.0-rc0...mimir-2.3.0-rc.2

  • mimir-2.3.0-rc0(Aug 25, 2022)

    This release contains 333 PRs from 39 authors. Thank you!

    Grafana Mimir version 2.3 release notes

    Grafana Labs is excited to announce version 2.3 of Grafana Mimir, the most scalable, most performant open source time series database in the world.

    The highlights that follow include the top features, enhancements, and bugfixes in this release. If you are upgrading from Grafana Mimir 2.2, there is upgrade-related information as well. For the complete list of changes, see the Changelog.

    Features and enhancements

    • Ingest metrics in OpenTelemetry format: This release of Grafana Mimir introduces experimental support for ingesting metrics from the OpenTelemetry Collector's otlphttp exporter. This adds a second ingestion option for users of the OTel Collector; Mimir was already compatible with the prometheusremotewrite exporter. For more information, please see Configure OTel Collector.

    • Increased instant query performance: Grafana Mimir now supports splitting instant queries by time. This allows it to better parallelize execution of instant queries and therefore return results faster. At present, splitting is only supported for a subset of instant queries, which means not all instant queries will see a speedup. This feature is being released as experimental and is disabled by default. It can be enabled by setting -query-frontend.split-instant-queries-by-interval.

    • Tenant federation for metadata queries: Users with tenant federation enabled could previously issue instant queries, range queries, and exemplar queries to multiple tenants at once and receive a single aggregated result. With Grafana Mimir 2.3, we've added tenant federation support to the /api/v1/metadata endpoint as well.

    • Simpler object storage configuration: Users can now configure block, alertmanager, and ruler storage all at once with the common YAML config option key (or -common.storage.* CLI flags). By centralizing your object storage configuration in one place, this enhancement makes configuration faster and less error prone. Users can still individually configure storage for each of these components if they desire. For more information, see the Common Configurations.

    • DEB and RPM packages for Mimir: Starting with version 2.3, we're publishing deb and rpm files for Grafana Mimir, which will make installing and running it on Debian or RedHat-based Linux systems much easier. Thank you to community contributor wilfriedroset for your work to implement this!

    • Import historic data to Grafana Mimir: Users can now backfill time series data from their existing Prometheus or Cortex installation into Mimir using mimirtool, making it possible to migrate to Grafana Mimir without losing your existing metrics data. This support is still considered experimental and does not yet work for data stored in Thanos. To learn more about this feature, see mimirtool backfill and Configure TSDB block upload.

    • New Helm chart minor release: The Mimir Helm chart is the best way to install Mimir on Kubernetes. As part of the Mimir 2.3 release, we’re also releasing version 3.1 of the Mimir Helm chart. Notable enhancements follow. For the full list of changes, see the Helm chart changelog.

      • We've upgraded the MinIO subchart dependency from a deprecated chart to the supported one. This creates a breaking change in how the administrator password is set. However, as the built-in MinIO is not a recommended object store for production use cases, this change did not warrant a new major version of the Mimir Helm chart.
      • The backfill API endpoints for importing historic time series data are now exposed on the Nginx gateway.
      • Nginx now sets the value of the X-Scope-OrgID header equal to the value of Mimir's no_auth_tenant parameter by default. The previous release set the value of X-Scope-OrgID to anonymous by default, which complicated the process of migrating to Mimir.
      • Memberlist now uses DNS service-discovery by default, which should decrease startup time for large Mimir clusters.
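
    The idea behind splitting instant queries by time, described in the highlights above, can be illustrated with a minimal Python sketch. It is not Mimir's implementation. It assumes an aggregation whose partial results can simply be added together (a sum over a time window), uses half-open windows for simplicity, and ignores any alignment of the split boundaries. The function names and sample data are hypothetical.

```python
def split_window(start, end, interval):
    """Split the half-open window [start, end) into chunks of at most `interval` seconds."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    chunks = []
    cursor = start
    while cursor < end:
        chunk_end = min(cursor + interval, end)
        chunks.append((cursor, chunk_end))
        cursor = chunk_end
    return chunks


def sum_over_window(samples, start, end):
    """Sum the values of all (timestamp, value) samples with start <= timestamp < end."""
    return sum(value for ts, value in samples if start <= ts < end)


# Hypothetical series: one sample per minute over three hours.
samples = [(ts, 1.0) for ts in range(0, 3 * 3600, 60)]

# Split a 3h window into 1h chunks. Each chunk could be evaluated in parallel.
chunks = split_window(0, 3 * 3600, 3600)
partials = [sum_over_window(samples, s, e) for s, e in chunks]

# Combining the partial results gives the same answer as the unsplit computation.
print(chunks)
print(sum(partials) == sum_over_window(samples, 0, 3 * 3600))
```

    Only aggregations that can be recombined from partial results lend themselves to this approach, which is consistent with the note above that splitting is supported for a subset of instant queries.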

    Upgrade considerations

    In Grafana Mimir 2.3 we have removed the following previously deprecated configuration options:

    • The extend_writes parameter in the distributor YAML configuration and -distributor.extend-writes CLI flag have been removed.
    • The active_series_custom_trackers parameter has been removed from the YAML configuration. It had already been moved to the runtime configuration. See #1188 for details.

    With Grafana Mimir 2.3 we have also updated the default value for -distributor.ha-tracker.max-clusters to 100 to provide Denial-of-Service protection. Previously, -distributor.ha-tracker.max-clusters was unlimited by default, which could allow a tenant with HA Dedupe enabled to overload the HA tracker with __cluster__ label values that could cause the HA Dedupe database to fail.
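
    To make the effect of such a cap concrete, here is a minimal Python sketch of a tracker that bounds the number of distinct cluster values it stores. It is not Mimir's HA tracker. The class and method names are hypothetical, and three behaviors are assumptions of the sketch: that the limit applies per tenant, that a value of 0 means unlimited, and that exceeding the limit results in an error.

```python
class TooManyClustersError(Exception):
    """Raised when a tenant would exceed the allowed number of tracked clusters."""


class HAClusterLimiter:
    """Tracks distinct cluster label values per tenant, up to `max_clusters`."""

    def __init__(self, max_clusters=100):
        # Assumption: 0 disables the limit, mirroring the old unlimited default.
        self.max_clusters = max_clusters
        self._clusters_by_tenant = {}

    def observe(self, tenant, cluster):
        seen = self._clusters_by_tenant.setdefault(tenant, set())
        if cluster in seen:
            return  # already tracked, no growth
        if self.max_clusters > 0 and len(seen) >= self.max_clusters:
            raise TooManyClustersError(
                f"tenant {tenant!r} already tracks {len(seen)} clusters"
            )
        seen.add(cluster)

    def cluster_count(self, tenant):
        return len(self._clusters_by_tenant.get(tenant, ()))


# A small limit keeps the demonstration short.
limiter = HAClusterLimiter(max_clusters=2)
for name in ("cluster-1", "cluster-2", "cluster-1"):
    limiter.observe("tenant-a", name)

try:
    limiter.observe("tenant-a", "cluster-3")
except TooManyClustersError as err:
    print("rejected:", err)
```

    With the cap in place, the amount of state a single tenant can create stays bounded regardless of how many distinct label values it sends.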

    Bug fixes

    • PR 2447: Fix incorrect mapping of http status codes 429 to 500 when the request queue is full in the query-frontend. This corrects behavior in the query-frontend where a 429 "Too Many Outstanding Requests" error (a retriable error) from a querier was incorrectly returned as a 500 system error (an unretriable error).
    • PR 2505: The Memberlist key-value (KV) store now tries to "fast-join" the cluster to avoid serving an empty KV store. This fix addresses the confusing "empty ring" error response and the error log message "ring doesn't exist in KV store yet" emitted by services when there are other members present in the ring when a service starts. Those using other key-value store options (e.g., consul, etcd) are not impacted by this bug.
    • PR 2289: The "List Prometheus rules" API endpoint of the Mimir Ruler component is no longer blocked while rules are being synced. This means users can now list rules while syncing larger rule sets.
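
    The practical impact of the 429-to-500 mapping fix (PR 2447) shows up in client retry logic. The following Python sketch assumes a client that retries only on status codes it considers retriable, following the distinction drawn above (429 retriable, 500 not). The `send` callable and the simulated responses are hypothetical stand-ins for real HTTP requests, and a real client would normally wait between attempts.

```python
# Per the release note: 429 is a retriable error, 500 is not.
RETRIABLE_STATUS_CODES = {429}


def should_retry(status_code):
    """Return True if a request that got `status_code` is worth retrying."""
    return status_code in RETRIABLE_STATUS_CODES


def call_with_retries(send, max_attempts=3):
    """Call `send()` until it returns a non-retriable status or attempts run out.

    `send` is a hypothetical zero-argument callable returning an HTTP status code.
    Returns a tuple (last_status, attempts_made).
    """
    for attempt in range(1, max_attempts + 1):
        status = send()
        if not should_retry(status) or attempt == max_attempts:
            return status, attempt


# Simulated responses: the queue is full twice, then the query succeeds.
responses = iter([429, 429, 200])
print(call_with_retries(lambda: next(responses)))
```

    Before the fix, a client written this way would have seen a 500 on the first attempt and given up, even though the underlying condition was temporary.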

    Changelog since 2.2

    2.3.0-rc.0

    Grafana Mimir

    • [CHANGE] Ingester: Added user label to ingester metric cortex_ingester_tsdb_out_of_order_samples_appended_total. On multitenant clusters this helps us find the rate of appended out-of-order samples for a specific tenant. #2493
    • [CHANGE] Compactor: delete source and output blocks from local disk when compaction fails, to reduce the likelihood that subsequent compactions fail because of no space left on disk. #2261
    • [CHANGE] Ruler: Remove unused CLI flags -ruler.search-pending-for and -ruler.flush-period (and their respective YAML config options). #2288
    • [CHANGE] Successful gRPC requests are no longer logged (only affects internal API calls). #2309
    • [CHANGE] Add new -*.consul.cas-retry-delay flags. They have a default value of 1s, while previously there was no delay between retries. #2309
    • [CHANGE] Store-gateway: Remove the experimental ability to run requests in a dedicated OS thread pool and associated CLI flag -store-gateway.thread-pool-size. #2423
    • [CHANGE] Memberlist: disabled TCP-based ping fallback, because Mimir already uses a custom transport based on TCP. #2456
    • [CHANGE] Change default value for -distributor.ha-tracker.max-clusters to 100 to provide DoS protection. #2465
    • [CHANGE] Experimental block upload API exposed by compactor has changed: Previous /api/v1/upload/block/{block} endpoint for starting block upload is now /api/v1/upload/block/{block}/start, and previous endpoint /api/v1/upload/block/{block}?uploadComplete=true for finishing block upload is now /api/v1/upload/block/{block}/finish. New API endpoint has been added: /api/v1/upload/block/{block}/check. #2486 #2548
    • [CHANGE] Compactor: changed -compactor.max-compaction-time default from 0s (disabled) to 1h. When compacting blocks for a tenant, the compactor will move to compact blocks of another tenant or re-plan blocks to compact at least every 1h. #2514
    • [CHANGE] Distributor: removed previously deprecated extend_writes (see #1856) YAML key and -distributor.extend-writes CLI flag from the distributor config. #2551
    • [CHANGE] Ingester: removed previously deprecated active_series_custom_trackers (see #1188) YAML key from the ingester config. #2552
    • [CHANGE] The tenant ID __mimir_cluster is reserved by Mimir and not allowed to store metrics. #2643
    • [CHANGE] Purger: removed the purger component and moved its API endpoints /purger/delete_tenant and /purger/delete_tenant_status to the compactor at /compactor/delete_tenant and /compactor/delete_tenant_status. The new endpoints on the compactor are stable. #2644
    • [CHANGE] Memberlist: Change the leave timeout duration (-memberlist.leave-timeout duration) from 5s to 20s and connection timeout (-memberlist.packet-dial-timeout) from 5s to 2s. This makes leave timeout 10x the connection timeout, so that we can communicate the leave to at least 1 node, if the first 9 we try to contact time out. #2669
    • [CHANGE] Alertmanager: return status code 412 Precondition Failed and log info message when alertmanager isn't configured for a tenant. #2635
    • [CHANGE] Distributor: if forwarding rules are used to forward samples, exemplars are now removed from the request. #2710
    • [CHANGE] Limits: change the default value of max_global_series_per_metric limit to 0 (disabled). Setting this limit by default does not provide much benefit because series are sharded by all labels. #2714
    • [FEATURE] Compactor: Adds the ability to delete partial blocks after a configurable delay. This option can be configured per tenant. #2285
      • -compactor.partial-block-deletion-delay, as a duration string, allows you to set the delay since a partial block has been modified before marking it for deletion. A value of 0, the default, disables this feature.
      • The metric cortex_compactor_blocks_marked_for_deletion_total has a new value for the reason label reason="partial", when a block deletion marker is triggered by the partial block deletion delay.
    • [FEATURE] Querier: enabled support for queries with negative offsets, which are not cached in the query results cache. #2429
    • [FEATURE] EXPERIMENTAL: OpenTelemetry Metrics ingestion path on /otlp/v1/metrics. #695 #2436 #2461
    • [FEATURE] Querier: Added support for tenant federation to metric metadata endpoint. #2467
    • [FEATURE] Query-frontend: introduced experimental support to split instant queries by time. The instant query splitting can be enabled setting -query-frontend.split-instant-queries-by-interval. #2469 #2564 #2565 #2570 #2571 #2572 #2573 #2574 #2575 #2576 #2581 #2582 #2601 #2632 #2633 #2634 #2641 #2642 #2766
    • [ENHANCEMENT] Distributor: Decreased distributor tests execution time. #2562
    • [ENHANCEMENT] Alertmanager: Allow the HTTP proxy_url configuration option in the receiver's configuration. #2317
    • [ENHANCEMENT] ring: optimize shuffle-shard computation when lookback is used, and all instances have registered timestamp within the lookback window. In that case we can immediately return the original ring, because we would select all instances anyway. #2309
    • [ENHANCEMENT] Memberlist: added experimental memberlist cluster label support via -memberlist.cluster-label and -memberlist.cluster-label-verification-disabled CLI flags (and their respective YAML config options). #2354
    • [ENHANCEMENT] Object storage can now be configured for all components using the common YAML config option key (or -common.storage.* CLI flags). #2330 #2347
    • [ENHANCEMENT] Go: updated to go 1.18.4. #2400
    • [ENHANCEMENT] Store-gateway, listblocks: list of blocks now includes stats from meta.json file: number of series, samples and chunks. #2425
    • [ENHANCEMENT] Added more buckets to cortex_ingester_client_request_duration_seconds histogram metric, to correctly track requests taking longer than 1s (up until 16s). #2445
    • [ENHANCEMENT] Azure client: Improve memory usage for large object storage downloads. #2408
    • [ENHANCEMENT] Distributor: Add -distributor.instance-limits.max-inflight-push-requests-bytes. This limit protects the distributor against multiple large requests that together may cause an OOM, but are only a few, so do not trigger the max-inflight-push-requests limit. #2413
    • [ENHANCEMENT] Distributor: Drop exemplars in distributor for tenants where exemplars are disabled. #2504
    • [ENHANCEMENT] Runtime Config: Allow operator to specify multiple comma-separated yaml files in -runtime-config.file that will be merged in left to right order. #2583
    • [ENHANCEMENT] Query sharding: shard binary operations only if it doesn't lead to non-shardable vector selectors in one of the operands. #2696
    • [ENHANCEMENT] Add packaging for both debian based deb file and redhat based rpm file using FPM. #1803
    • [BUGFIX] TSDB: Fixed a bug on the experimental out-of-order implementation that led to wrong query results. #2701
    • [BUGFIX] Compactor: log the actual error when compaction fails. #2261
    • [BUGFIX] Alertmanager: restore state from storage even when running a single replica. #2293
    • [BUGFIX] Ruler: do not block "List Prometheus rules" API endpoint while syncing rules. #2289
    • [BUGFIX] Ruler: return proper *status.Status error when running in remote operational mode. #2417
    • [BUGFIX] Alertmanager: ensure the configured -alertmanager.web.external-url is either a path starting with /, or a full URL including the scheme and hostname. #2381 #2542
    • [BUGFIX] Memberlist: fix problem with loss of some packets, typically ring updates when instances were removed from the ring during shutdown. #2418
    • [BUGFIX] Ingester: fix misfiring MimirIngesterHasUnshippedBlocks and stale cortex_ingester_oldest_unshipped_block_timestamp_seconds when some block uploads fail. #2435
    • [BUGFIX] Query-frontend: fix incorrect mapping of http status codes 429 to 500 when request queue is full. #2447
    • [BUGFIX] Memberlist: Fix problem with ring being empty right after startup. Memberlist KV store now tries to "fast-join" the cluster to avoid serving empty KV store. #2505
    • [BUGFIX] Compactor: Fix bug when using -compactor.partial-block-deletion-delay: compactor didn't correctly check for modification time of all block files. #2559
    • [BUGFIX] Query-frontend: fix wrong query sharding results for queries with boolean result like 1 < bool 0. #2558
    • [BUGFIX] Fixed error messages related to per-instance limits incorrectly reporting they can be set on a per-tenant basis. #2610
    • [BUGFIX] Perform HA-deduplication before forwarding samples according to forwarding rules in the distributor. #2603 #2709
    • [BUGFIX] Fix reporting of tracing spans from PromQL engine. #2707
    • [BUGFIX] Apply relabel and drop_label rules before forwarding rules in the distributor. #2703
    • [BUGFIX] Distributor: Register cortex_discarded_requests_total metric, which previously was not registered and therefore not exported. #2712
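    The arithmetic behind the new memberlist timeouts listed above can be checked with a short sketch. It assumes, as the changelog entry implies, that contact attempts happen one after another and that each failed attempt consumes the full connection timeout. The function name is hypothetical and is not part of Mimir.

```python
def max_sequential_attempts(leave_timeout_s: float, dial_timeout_s: float) -> int:
    """Number of back-to-back dial attempts that fit in the leave timeout,
    assuming every failed attempt uses up its full dial timeout."""
    if dial_timeout_s <= 0:
        raise ValueError("dial timeout must be positive")
    return int(leave_timeout_s // dial_timeout_s)


old_attempts = max_sequential_attempts(5, 5)   # previous defaults: 5s leave, 5s dial
new_attempts = max_sequential_attempts(20, 2)  # new defaults: 20s leave, 2s dial
print(old_attempts, new_attempts)  # prints: 1 10
```

    Under these assumptions, the old defaults left room for a single attempt, while the new defaults allow a tenth attempt after nine timeouts.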

    Mixin

    • [CHANGE] Dashboards: "Slow Queries" dashboard no longer works with versions older than Grafana 9.0. #2223
    • [CHANGE] Alerts: use RSS memory instead of working set memory in the MimirAllocatingTooMuchMemory alert for ingesters. #2480
    • [ENHANCEMENT] Dashboards: added missed rule evaluations to the "Evaluations per second" panel in the "Mimir / Ruler" dashboard. #2314
    • [ENHANCEMENT] Dashboards: add k8s resource requests to CPU and memory panels. #2346
    • [ENHANCEMENT] Dashboards: add RSS memory utilization panel for ingesters, store-gateways and compactors. #2479
    • [ENHANCEMENT] Dashboards: allow configuring the graph tooltip. #2647
    • [ENHANCEMENT] Alerts: MimirFrontendQueriesStuck and MimirSchedulerQueriesStuck alerts are more reliable now as they consider all the intermediate samples in the minute prior to the evaluation. #2630
    • [ENHANCEMENT] Alerts: added RolloutOperatorNotReconciling alert, firing if the optional rollout-operator is not successfully reconciling. #2700
    • [BUGFIX] Dashboards: fixed unit of latency panels in the "Mimir / Ruler" dashboard. #2312
    • [BUGFIX] Dashboards: fixed "Intervals per query" panel in the "Mimir / Queries" dashboard. #2308
    • [BUGFIX] Dashboards: Make "Slow Queries" dashboard work with Grafana 9.0. #2223
    • [BUGFIX] Dashboards: add missing API routes to Ruler dashboard. #2412

    Jsonnet

    • [CHANGE] query-scheduler is enabled by default. We advise deploying the query-scheduler to improve the scalability of the query-frontend. #2431
    • [CHANGE] Replaced anti-affinity rules with pod topology spread constraints for distributor, query-frontend, querier and ruler. #2517
      • The following configuration options have been removed:
        • distributor_allow_multiple_replicas_on_same_node
        • query_frontend_allow_multiple_replicas_on_same_node
        • querier_allow_multiple_replicas_on_same_node
        • ruler_allow_multiple_replicas_on_same_node
      • The following configuration options have been added:
        • distributor_topology_spread_max_skew
        • query_frontend_topology_spread_max_skew
        • querier_topology_spread_max_skew
        • ruler_topology_spread_max_skew
    • [CHANGE] Change max_global_series_per_metric to 0 in all plans, and as a default value. #2669
    • [FEATURE] Memberlist: added support for experimental memberlist cluster label, through the jsonnet configuration options memberlist_cluster_label and memberlist_cluster_label_verification_disabled. #2349
    • [FEATURE] Added ruler-querier autoscaling support. It requires KEDA installed in the Kubernetes cluster. Ruler-querier autoscaler can be enabled and configured through the following options in the jsonnet config: #2545
      • autoscaling_ruler_querier_enabled: true to enable autoscaling.
      • autoscaling_ruler_querier_min_replicas: minimum number of ruler-querier replicas.
      • autoscaling_ruler_querier_max_replicas: maximum number of ruler-querier replicas.
      • autoscaling_prometheus_url: Prometheus base URL from which to scrape Mimir metrics (e.g. http://prometheus.default:9090/prometheus).
    • [ENHANCEMENT] Memberlist now uses DNS service-discovery by default. #2549
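    The *_topology_spread_max_skew options listed above bound how unevenly replicas may be spread across nodes. The sketch below is a simplified illustration of that idea, not the Kubernetes scheduler's algorithm: it treats skew as the difference between the most loaded and least loaded node. The function names and the example node names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def placement_skew(pod_nodes: Iterable[str], all_nodes: List[str]) -> int:
    """Difference between the highest and lowest pod count per node."""
    counts = Counter(pod_nodes)
    per_node = [counts.get(node, 0) for node in all_nodes]
    if not per_node:
        return 0
    return max(per_node) - min(per_node)


def satisfies_max_skew(pod_nodes: Iterable[str], all_nodes: List[str], max_skew: int) -> bool:
    """True when the placement stays within the allowed skew."""
    return placement_skew(pod_nodes, all_nodes) <= max_skew


nodes = ["node-a", "node-b", "node-c"]
print(satisfies_max_skew(["node-a", "node-a", "node-b"], nodes, max_skew=1))  # prints: False
print(satisfies_max_skew(["node-a", "node-b", "node-c", "node-a"], nodes, max_skew=1))  # prints: True
```

    Unlike the removed *_allow_multiple_replicas_on_same_node options, a skew bound still lets several replicas share a node as long as the overall spread stays even.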

    Mimirtool

    • [ENHANCEMENT] Added mimirtool backfill command to upload Prometheus blocks using API available in the compactor. #1822
    • [ENHANCEMENT] mimirtool bucket-validation: Verify existing objects can be overwritten by subsequent uploads. #2491
    • [ENHANCEMENT] mimirtool config convert: Now supports migrating to the current version of Mimir. #2629
    • [BUGFIX] mimirtool analyze: Fix dashboard JSON unmarshalling errors by using custom parsing. #2386

    Mimir Continuous Test

    Documentation

    • [ENHANCEMENT] Referenced mimirtool commands in the HTTP API documentation. #2516
    • [ENHANCEMENT] Improved DNS service discovery documentation. #2513

    Tools

    • [ENHANCEMENT] markblocks now processes multiple blocks concurrently. #2677

    New Contributors

    • @ese made their first contribution in https://github.com/grafana/mimir/pull/2196
    • @micborens made their first contribution in https://github.com/grafana/mimir/pull/2321
    • @marctc made their first contribution in https://github.com/grafana/mimir/pull/2518
    • @ravilushqa made their first contribution in https://github.com/grafana/mimir/pull/2562
    • @BrandonDalton made their first contribution in https://github.com/grafana/mimir/pull/2515
    • @nervo made their first contribution in https://github.com/grafana/mimir/pull/2569
    • @lamida made their first contribution in https://github.com/grafana/mimir/pull/2427
    • @LeviHarrison made their first contribution in https://github.com/grafana/mimir/pull/2644
    • @sysedwinistrator made their first contribution in https://github.com/grafana/mimir/pull/2087

    Full Changelog: https://github.com/grafana/mimir/compare/mimir-2.2.0...mimir-2.3.0-rc0

  • mimir-2.2.0(Jul 21, 2022)

    Grafana Labs is excited to announce version 2.2 of Grafana Mimir, the most scalable, most performant open source time series database in the world.

    The highlights that follow include the top features, enhancements, and bugfixes in this release. If you are upgrading from Grafana Mimir 2.1, there is upgrade-related information as well. For the complete list of changes, see the Changelog.

    This release contains 214 contributions from 32 authors. Thank you!

    Features and enhancements

    • Support for ingesting out-of-order samples: Grafana Mimir includes new, experimental support for ingesting out-of-order samples. This support is configurable, and it allows you to set how far out-of-order Mimir accepts samples on a per-tenant basis. This feature still needs additional testing; we do not recommend using it in a production environment. For more information, see Configuring out-of-order samples ingestion.

    • Improved error messages: The error messages that Mimir reports are more human readable, and the messages include error codes that are easily searchable. For error descriptions, see the Grafana Mimir runbooks’ Errors catalog.

    • Configurable prefix for object storage: Mimir can now store block data, rules, and alerts in one bucket, with each under its own user-defined prefix, rather than requiring one bucket for each. You can configure the storage prefix by using -<storage>.storage-prefix option for corresponding storage: ruler-storage, alertmanager-storage or blocks-storage.

    • Store-gateway performance optimization: The store-gateway can now pre-populate the file system cache when memory-mapping index-header files. This prevents the store-gateway from appearing to be stuck while loading index-headers. This feature is experimental and disabled by default; enable it using the flag -blocks-storage.bucket-store.index-header.map-populate-enabled.

    • Faster ingester startup: Ingesters now replay their WALs (write ahead logs) about 50% faster, and they also re-join the ring sooner under some conditions.

    • Helm Chart improvements: The Mimir Helm chart is the best way to install Mimir on Kubernetes. As part of the Mimir 2.2 release, we're also releasing version 3.0 of the Helm chart. Notable enhancements follow. For the full list of changes, see the Helm chart changelog.

      • The Helm chart now supports OpenShift.
      • The Helm chart can now easily deploy Grafana Agent in order to scrape metrics and logs from all Mimir pods, and ship them to a remote store, which makes it easier to monitor the health of your Mimir installation. For more information, see Collecting metrics and logs from Grafana Mimir.
      • The Helm chart now enables multi-tenancy by default. This makes it easy for you to add tenants as you grow your cluster. You can take advantage of Mimir's per-tenant quality-of-service features, which improves stability and resilience at high scale. To learn more about how multi-tenancy in Mimir works, see Grafana Mimir authorization and authentication. This change is backwards-compatible. To read about how we implemented this, see #2117.
      • We have significantly improved the configuration experience for the Helm chart, and here are a few of the most salient changes:
        • We've added an extraEnvFrom capability to all Mimir services to enable you to inject secrets via environment variables.
        • We've made it possible to globally set environment variables and inject secrets across all pods in the chart using global.extraEnv and global.extraEnvFrom. Note that the memcached and minio pods are not included.
        • We've switched the default storage of the Mimir configuration from a Secret to a ConfigMap, which makes it easier to quickly see the differences between your Mimir configurations between upgrades. We especially like the Helm diff plugin for this purpose.
        • We've added a structuredConfig option, which allows you to overwrite specific key-value pairs in the mimir.config template, which saves you from having to maintain the entire mimir.config in your own values.yaml file.
        • We've added the ability to create global pod annotations. This unlocks the ability to trigger a restart of all services in response to a single event, such as the update of the secret containing Mimir's storage credentials.
      • We've set the chart to disable -ingester.ring.unregister-on-shutdown and -distributor.extend-writes, for a smoother upgrade experience. Rolling restarts of ingesters are now less likely to cause spikes in resource usage.
      • We've improved the documentation for the Helm chart by adding a Getting started with Mimir using the Helm chart guide.
      • We've added a smoke test for your Mimir cluster to help catch errors immediately after you install or upgrade Mimir via the Helm chart.
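    To make the out-of-order time window from the first highlight above more concrete, here is a simplified model of how a sample might be classified. It assumes the window is measured back from the newest timestamp already stored for the tenant, and it does not model duplicate timestamps or the sample-out-of-bounds case. The function and its parameters are hypothetical. The rejection reasons mirror the reason label values named in the changelog.

```python
def classify_sample(sample_ts_ms: int, series_last_ts_ms: int,
                    head_max_ts_ms: int, window_ms: int) -> str:
    """Classify one incoming sample under a simplified out-of-order model."""
    if sample_ts_ms >= series_last_ts_ms:
        # Not older than the newest sample of its series.
        return "in-order"
    if window_ms <= 0:
        # Out-of-order ingestion is disabled (the default window is 0s).
        return "rejected: sample-out-of-order"
    if sample_ts_ms >= head_max_ts_ms - window_ms:
        # Older than its series, but still inside the allowed window.
        return "out-of-order, accepted"
    return "rejected: sample-too-old"


TEN_MINUTES_MS = 10 * 60 * 1000
print(classify_sample(400_000, 900_000, 1_000_000, TEN_MINUTES_MS))
print(classify_sample(399_999, 900_000, 1_000_000, TEN_MINUTES_MS))
```

    The first call lands exactly on the edge of the window and is accepted; the second is one millisecond older and is rejected as too old.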

    Upgrade considerations

    All deprecated API endpoints that are under /api/v1/rules* and /prometheus/rules* have now been removed from the ruler component in favor of identical endpoints that use the prefix /prometheus/config/v1/rules*.
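    If you have scripts that still call the removed endpoints, the prefix change can be expressed as a small path rewrite. This is a minimal sketch that assumes the paths are otherwise identical, as stated above; migrate_rules_path is a hypothetical helper and is not part of Mimir or mimirtool.

```python
LEGACY_PREFIXES = ("/api/v1/rules", "/prometheus/rules")
NEW_PREFIX = "/prometheus/config/v1/rules"


def migrate_rules_path(path: str) -> str:
    """Rewrite a legacy ruler configuration path to the new prefix.

    Paths that do not start with a legacy prefix are returned unchanged.
    """
    for legacy in LEGACY_PREFIXES:
        if path == legacy or path.startswith(legacy + "/"):
            return NEW_PREFIX + path[len(legacy):]
    return path


print(migrate_rules_path("/api/v1/rules/my-namespace"))
# prints: /prometheus/config/v1/rules/my-namespace
```

    The helper only matches whole path segments, so it leaves other paths that merely contain the word rules untouched.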

    In Grafana Mimir 2.2, we have updated default values and some parameters to give you a better out-of-the-box experience:

    • Message size limits for gRPC messages that are exchanged between internal Mimir components have increased to 100 MiB from 4 MiB. This helps to avoid internal server errors when pushing or querying large data.

    • The -blocks-storage.bucket-store.ignore-blocks-within parameter changed from 0 to 10h. The default value of -querier.query-store-after changed from 0 to 12h. For most-recent data, both changes improve query performance by querying only the ingesters, rather than object storage.

    • The option -querier.shuffle-sharding-ingesters-lookback-period has been deprecated. If you previously changed this option from its default of 0s, set -querier.shuffle-sharding-ingesters-enabled to true and specify the lookback period by setting the -querier.query-ingesters-within option.

    • The -memberlist.abort-if-join-fails parameter now defaults to false. When Mimir is using memberlist as the backend store for its hash ring, and it fails to join the memberlist cluster, Mimir no longer aborts startup by default.
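    The effect of the new -querier.query-store-after default can be illustrated with a small model of which backends a query touches. This is a sketch, not Mimir's querier code. The 12h value comes from the notes above, while the 13h value used for -querier.query-ingesters-within is an assumption about its default. A value of zero is treated as "always query this backend".

```python
from datetime import datetime, timedelta, timezone
from typing import List

QUERY_STORE_AFTER = timedelta(hours=12)       # new default from these notes
QUERY_INGESTERS_WITHIN = timedelta(hours=13)  # assumed default, not stated above


def backends_for_query(start: datetime, end: datetime, now: datetime,
                       query_store_after: timedelta = QUERY_STORE_AFTER,
                       query_ingesters_within: timedelta = QUERY_INGESTERS_WITHIN) -> List[str]:
    """Return the backends a query over [start, end] would read from."""
    backends = []
    # Ingesters are consulted when the query reaches into recent data.
    if query_ingesters_within == timedelta(0) or end >= now - query_ingesters_within:
        backends.append("ingesters")
    # Object storage is consulted only when the query reaches far enough back.
    if query_store_after == timedelta(0) or start <= now - query_store_after:
        backends.append("store-gateways")
    return backends


now = datetime(2022, 7, 21, 12, 0, tzinfo=timezone.utc)
print(backends_for_query(now - timedelta(hours=1), now, now))   # prints: ['ingesters']
print(backends_for_query(now - timedelta(hours=24), now, now))  # prints: ['ingesters', 'store-gateways']
```

    With the previous default of 0, the second condition was always true, so even a one-hour query also went to object storage.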

    If you have used a previous version of the Mimir Helm chart, you must address some of the chart's breaking changes before upgrading to Helm chart version 3.0. For detailed information about how to do this, see Upgrade the Grafana Mimir Helm chart from version 2.1 to 3.0.

    Bug fixes

    • PR 1883: Fixed a bug that caused the query-frontend and querier to crash when they received a user query with a special regular expression label matcher.
    • PR 1933: Fixed a bug in the ingester ring page, which showed incorrect status of entries in the ring.
    • PR 2090: Ruler in remote rule evaluation mode now applies the timeout correctly. Previously the ruler could get stuck forever, which halted rule evaluation.
    • PR 2036: Fixed panic at startup when Mimir is running in monolithic mode and query sharding is enabled.

    Changelog

    2.2.0

    Grafana Mimir

    • [CHANGE] Increased default configuration for -server.grpc-max-recv-msg-size-bytes and -server.grpc-max-send-msg-size-bytes from 4MB to 100MB. #1884
    • [CHANGE] Default values have changed for the following settings. This improves query performance for recent data (within 12h) by only reading from ingesters: #1909 #1921
      • -blocks-storage.bucket-store.ignore-blocks-within now defaults to 10h (previously 0)
      • -querier.query-store-after now defaults to 12h (previously 0)
    • [CHANGE] Alertmanager: removed support for migrating local files from Cortex 1.8 or earlier. Related to original Cortex PR https://github.com/cortexproject/cortex/pull/3910. #2253
    • [CHANGE] The following settings are now classified as advanced because the defaults should work for most users and tuning them requires in-depth knowledge of how the read path works: #1929
      • -querier.query-ingesters-within
      • -querier.query-store-after
    • [CHANGE] Config flag category overrides can be set dynamically at runtime. #1934
    • [CHANGE] Ingester: deprecated -ingester.ring.join-after. Mimir now behaves as if this setting were always set to 0s. This configuration option will be removed in Mimir 2.4.0. #1965
    • [CHANGE] Blocks uploaded by ingester no longer contain __org_id__ label. Compactor now ignores this label and will compact blocks with and without this label together. mimirconvert tool will remove the label from blocks as "unknown" label. #1972
    • [CHANGE] Querier: deprecated -querier.shuffle-sharding-ingesters-lookback-period, instead adding -querier.shuffle-sharding-ingesters-enabled to enable or disable shuffle sharding on the read path. The value of -querier.query-ingesters-within is now used internally for shuffle sharding lookback. #2110
    • [CHANGE] Memberlist: -memberlist.abort-if-join-fails now defaults to false. Previously it defaulted to true. #2168
    • [CHANGE] Ruler: /api/v1/rules* and /prometheus/rules* configuration endpoints are removed. Use /prometheus/config/v1/rules*. #2182
    • [CHANGE] Ingester: -ingester.exemplars-update-period has been renamed to -ingester.tsdb-config-update-period. You can use it to update multiple, per-tenant TSDB configurations. #2187
    • [FEATURE] Ingester: (Experimental) Add the ability to ingest out-of-order samples up to an allowed limit. If you enable this feature, it requires additional memory and disk space. This feature also enables a write-behind log, which might lead to longer ingester-start replays. When this feature is disabled, there is no overhead on memory, disk space, or startup times. #2187
      • -ingester.out-of-order-time-window, as a duration string, allows you to set how far back in time a sample can be. The default is 0s, where s is seconds.
      • cortex_ingester_tsdb_out_of_order_samples_appended_total metric tracks the total number of out-of-order samples ingested by the ingester.
      • cortex_discarded_samples_total has a new label reason="sample-too-old", when the -ingester.out-of-order-time-window flag is greater than zero. The label tracks the number of samples that were discarded for being too old; they were out of order, but beyond the time window allowed. The labels reason="sample-out-of-order" and reason="sample-out-of-bounds" are not used when out-of-order ingestion is enabled.
    • [ENHANCEMENT] Distributor: Added limit to prevent tenants from sending excessive number of requests: #1843
      • The following CLI flags (and their respective YAML config options) have been added:
        • -distributor.request-rate-limit
        • -distributor.request-burst-limit
      • The following metric is exposed to tell how many requests have been rejected:
        • cortex_discarded_requests_total
    • [ENHANCEMENT] Store-gateway: Add the experimental ability to run requests in a dedicated OS thread pool. This feature can be configured using -store-gateway.thread-pool-size and is disabled by default. Replaces the ability to run index header operations in a dedicated thread pool. #1660 #1812
    • [ENHANCEMENT] Improved error messages to make them easier to understand; each now has a unique, global identifier that you can look up in the runbooks for more information. #1907 #1919 #1888 #1939 #1984 #2009 #2056 #2066 #2104 #2150 #2234
    • [ENHANCEMENT] Memberlist KV: incoming messages are now processed on a per-key goroutine. This may reduce loss of "maintenance" packets in busy memberlist installations, but use more CPU. New memberlist_client_received_broadcasts_dropped_total counter tracks number of dropped per-key messages. #1912
    • [ENHANCEMENT] Blocks Storage, Alertmanager, Ruler: add support for a prefix in the bucket store (*_storage.storage_prefix). This enables using the same bucket for the three components. #1686 #1951
    • [ENHANCEMENT] Upgrade Docker base images to alpine:3.16.0. #2028
    • [ENHANCEMENT] Store-gateway: Add experimental configuration option for the store-gateway to attempt to pre-populate the file system cache when memory-mapping index-header files. Enabled with -blocks-storage.bucket-store.index-header.map-populate-enabled=true. Note this flag only has an effect when running on Linux. #2019 #2054
    • [ENHANCEMENT] Chunk Mapper: reduce memory usage of async chunk mapper. #2043
    • [ENHANCEMENT] Ingester: reduce sleep time when reading WAL. #2098
    • [ENHANCEMENT] Compactor: Run sanity check on blocks storage configuration at startup. #2144
    • [ENHANCEMENT] Compactor: Add HTTP API for uploading TSDB blocks. Enabled with -compactor.block-upload-enabled. #1694 #2126
    • [ENHANCEMENT] Ingester: Enable querying overlapping blocks by default. #2187
    • [ENHANCEMENT] Distributor: Auto-forget unhealthy distributors after ten failed ring heartbeats. #2154
    • [ENHANCEMENT] Distributor: Add new metric cortex_distributor_forward_errors_total for error codes resulting from forwarding requests. #2077
    • [ENHANCEMENT] /ready endpoint now returns and logs detailed services information. #2055
    • [ENHANCEMENT] Memcached client: Reduce number of connections required to fetch cached keys from memcached. #1920
    • [ENHANCEMENT] Improved error message returned when -querier.query-store-after validation fails. #1914
    • [BUGFIX] Fix regexp parsing panic for regexp label matchers with start/end quantifiers. #1883
    • [BUGFIX] Ingester: fixed deceiving error log "failed to update cached shipped blocks after shipper initialisation", occurring for each new tenant in the ingester. #1893
    • [BUGFIX] Ring: fix bug where instances may appear unhealthy in the hash ring web UI even though they are not. #1933
    • [BUGFIX] API: gzip is now enforced when identity encoding is explicitly rejected. #1864
    • [BUGFIX] Fix panic at startup when Mimir is running in monolithic mode and query sharding is enabled. #2036
    • [BUGFIX] Ruler: report cortex_ruler_queries_failed_total metric for any remote query error except 4xx when remote operational mode is enabled. #2053 #2143
    • [BUGFIX] Ingester: fix slow rollout when using -ingester.ring.unregister-on-shutdown=false with long -ingester.ring.heartbeat-period. #2085
    • [BUGFIX] Ruler: add timeout for remote rule evaluation queries to prevent rule group evaluations getting stuck indefinitely. The duration is configurable with -querier.timeout (default 2m). #2090 #2222
    • [BUGFIX] Limits: Active series custom tracker configuration has been renamed back from active_series_custom_trackers_config to active_series_custom_trackers. For backwards compatibility, both versions are supported until Mimir v2.4. When both fields are specified, active_series_custom_trackers_config takes precedence over active_series_custom_trackers. #2101
    • [BUGFIX] Ingester: fixed the order of labels applied when incrementing the cortex_discarded_metadata_total metric. #2096
    • [BUGFIX] Ingester: fixed bug where retrieving metadata for a metric with multiple metadata entries would return multiple copies of a single metadata entry rather than all available entries. #2096
    • [BUGFIX] Distributor: canceled requests are no longer accounted as internal errors. #2157
    • [BUGFIX] Memberlist: Fix typo in memberlist admin UI. #2202
    • [BUGFIX] Ruler: fixed typo in error message when ruler failed to decode a rule group. #2151
    • [BUGFIX] Active series custom tracker configuration is now displayed properly on /runtime_config page. #2065
    • [BUGFIX] Query-frontend: vector and time functions were sharded, which made expressions like vector(1) > 0 and vector(1) fail. #2355
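    The -distributor.request-rate-limit and -distributor.request-burst-limit flags listed above pair a sustained rate with a burst allowance. A token bucket is a common way to model such a pair; the sketch below shows that technique under the assumption that it approximates the limiter's behavior. The changelog does not describe Mimir's implementation, and the class here is hypothetical.

```python
class TokenBucket:
    """Token bucket: refills at `rate_per_s`, holds at most `burst` tokens."""

    def __init__(self, rate_per_s: float, burst: int) -> None:
        self.rate_per_s = rate_per_s
        self.burst = burst
        self.tokens = float(burst)  # start full, so an initial burst is allowed
        self.last_s = 0.0

    def allow(self, now_s: float) -> bool:
        """Return True if a request arriving at `now_s` is admitted."""
        elapsed = max(0.0, now_s - self.last_s)
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate_per_s)
        self.last_s = max(self.last_s, now_s)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_s=2, burst=3)
print([bucket.allow(0.0) for _ in range(4)])  # prints: [True, True, True, False]
print(bucket.allow(0.5))                      # prints: True
```

    In this model, rejected requests are the ones that a counter such as cortex_discarded_requests_total would count.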

    Mixin

    • [CHANGE] Split mimir_queries rules group into mimir_queries and mimir_ingester_queries to keep number of rules per group within the default per-tenant limit. #1885
    • [CHANGE] Dashboards: Expose full image tag in "Mimir / Rollout progress" dashboard's "Pod per version panel." #1932
    • [CHANGE] Dashboards: Disabled gateway panels by default, because most users don't have a gateway exposing the metrics expected by Mimir dashboards. You can re-enable it setting gateway_enabled: true in the mixin config and recompiling the mixin running make build-mixin. #1955
    • [CHANGE] Alerts: adapt MimirFrontendQueriesStuck and MimirSchedulerQueriesStuck to consider ruler query path components. #1949
    • [CHANGE] Alerts: Change MimirRulerTooManyFailedQueries severity to critical. #2165
    • [ENHANCEMENT] Dashboards: Add config option datasource_regex to customise the regular expression used to select valid datasources for Mimir dashboards. #1802
    • [ENHANCEMENT] Dashboards: Added "Mimir / Remote ruler reads" and "Mimir / Remote ruler reads resources" dashboards. #1911 #1937
    • [ENHANCEMENT] Dashboards: Make networking panels work for pods created by the mimir-distributed helm chart. #1927
    • [ENHANCEMENT] Alerts: Add MimirStoreGatewayNoSyncedTenants alert that fires when there is a store-gateway owning no tenants. #1882
    • [ENHANCEMENT] Rules: Make recording_rules_range_interval configurable for cases where Mimir metrics are scraped less often than every 30 seconds. #2118
    • [ENHANCEMENT] Added minimum Grafana version to mixin dashboards. #1943
    • [BUGFIX] Fix container_memory_usage_bytes:sum recording rule. #1865
    • [BUGFIX] Fix MimirGossipMembersMismatch alerts if Mimir alertmanager is activated. #1870
    • [BUGFIX] Fix MimirRulerMissedEvaluations to show % of missed alerts as a value between 0 and 100 instead of 0 and 1. #1895
    • [BUGFIX] Fix MimirCompactorHasNotUploadedBlocks alert false positive when Mimir is deployed in monolithic mode. #1902
    • [BUGFIX] Fix MimirGossipMembersMismatch to make it less sensitive during rollouts and fire one alert per installation, not per job. #1926
    • [BUGFIX] Do not trigger MimirAllocatingTooMuchMemory alerts if no container limits are supplied. #1905
    • [BUGFIX] Dashboards: Remove empty "Chunks per query" panel from Mimir / Queries dashboard. #1928
    • [BUGFIX] Dashboards: Use Grafana's $__rate_interval for rate queries in dashboards to support scrape intervals of >15s. #2011
    • [BUGFIX] Alerts: Make each version of MimirCompactorHasNotUploadedBlocks distinct to avoid rule evaluation failures due to duplicate series being generated. #2197
    • [BUGFIX] Fix MimirGossipMembersMismatch alert when using remote ruler evaluation. #2159

    Jsonnet

    • [CHANGE] Remove use of -querier.query-store-after, -querier.shuffle-sharding-ingesters-lookback-period, -blocks-storage.bucket-store.ignore-blocks-within, and -blocks-storage.tsdb.close-idle-tsdb-timeout CLI flags since the values now match defaults. #1915 #1921
    • [CHANGE] Change default value for -blocks-storage.bucket-store.chunks-cache.memcached.timeout to 450ms to increase use of cached data. #2035
    • [CHANGE] The memberlist_ring_enabled configuration now applies to Alertmanager. #2102 #2103 #2107
    • [CHANGE] Default value for memberlist_ring_enabled is now true. It means that all hash rings use Memberlist as default KV store instead of Consul (previous default). #2161
    • [CHANGE] Configure -ingester.max-global-metadata-per-user to correspond to 20% of the configured max number of series per tenant. #2250
    • [CHANGE] Configure -ingester.max-global-metadata-per-metric to be 10. #2250
    • [CHANGE] Change _config.multi_zone_ingester_max_unavailable to 25. #2251
    • [FEATURE] Added querier autoscaling support. It requires KEDA installed in the Kubernetes cluster and query-scheduler enabled in the Mimir cluster. Querier autoscaler can be enabled and configured through the following options in the jsonnet config: #2013 #2023
      • autoscaling_querier_enabled: true to enable autoscaling.
      • autoscaling_querier_min_replicas: minimum number of querier replicas.
      • autoscaling_querier_max_replicas: maximum number of querier replicas.
      • autoscaling_prometheus_url: Prometheus base URL from which to scrape Mimir metrics (e.g. http://prometheus.default:9090/prometheus).
    • [FEATURE] Jsonnet: Add support for ruler remote evaluation mode (ruler_remote_evaluation_enabled), which deploys and uses a dedicated query path for rule evaluation. This enables the benefits of the query-frontend for rule evaluation, such as query sharding. #2073
    • [ENHANCEMENT] Added compactor service, that can be used to route requests directly to compactor (e.g. admin UI). #2063
    • [ENHANCEMENT] Added a consul_enabled configuration option to provide the ability to disable consul. It is automatically set to false when memberlist_ring_enabled is true and multikv_migration_enabled (used for migration from Consul to memberlist) is not set. #2093 #2152
    • [BUGFIX] Querier: Fix disabling shuffle sharding on the read path whilst keeping it enabled on write path. #2164
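
    The derived Jsonnet defaults listed above can be illustrated with a short sketch. It is written in Python rather than Jsonnet, and the function name, the dictionary keys, and the assumption that consul_enabled stays true in every other case are illustrative only; they are not taken from the Mimir jsonnet library.

```python
def derive_jsonnet_settings(max_series_per_tenant,
                            memberlist_ring_enabled=True,
                            multikv_migration_enabled=False):
    """Hypothetical helper mirroring the derived defaults described in the notes."""
    # Consul is switched off only when memberlist is the ring KV store and
    # no Consul-to-memberlist migration is in progress (assumed true otherwise).
    consul_enabled = (not memberlist_ring_enabled) or multikv_migration_enabled
    return {
        "consul_enabled": consul_enabled,
        # 20% of the configured maximum number of series per tenant.
        "ingester.max-global-metadata-per-user": max_series_per_tenant * 20 // 100,
        # Fixed value named in the notes.
        "ingester.max-global-metadata-per-metric": 10,
    }


settings = derive_jsonnet_settings(150_000)
print(settings)
```

    With the defaults shown, a tenant limited to 150,000 series would get a metadata limit of 30,000 and Consul would be disabled.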

    Mimirtool

    • [CHANGE] mimirtool rules: --use-legacy-routes now toggles between using /prometheus/config/v1/rules (default) and /api/v1/rules (legacy) endpoints. #2182
    • [FEATURE] Added bearer token support for when Mimir is behind a gateway authenticating by bearer token. #2146
    • [BUGFIX] mimirtool analyze: Fix dashboard JSON unmarshalling errors (#1840). #1973
    • [BUGFIX] Make mimirtool build for Windows work again. #2273

    Mimir Continuous Test

    • [ENHANCEMENT] Added the -tests.smoke-test flag to run the mimir-continuous-test suite once and immediately exit. #2047 #2094

    Documentation

    • [ENHANCEMENT] Published Grafana Mimir runbooks as part of documentation. #1970
    • [ENHANCEMENT] Improved ruler's "remote operational mode" documentation. #1906
    • [ENHANCEMENT] Recommend fast disks for ingesters and store-gateways in production tips. #1903
    • [ENHANCEMENT] Explain the runtime override of active series matchers. #1868
    • [ENHANCEMENT] Clarify "Set rule group" API specification. #1869
    • [ENHANCEMENT] Published Mimir jsonnet documentation. #2024
    • [ENHANCEMENT] Documented required scrape interval for using alerting and recording rules from Mimir jsonnet. #2147
    • [ENHANCEMENT] Runbooks: Mention memberlist as possible source of problems for various alerts. #2158
    • [ENHANCEMENT] Added step-by-step article about migrating from Consul to Memberlist KV store using jsonnet without downtime. #2166
    • [ENHANCEMENT] Documented /memberlist admin page. #2166
    • [ENHANCEMENT] Documented how to configure Grafana Mimir's ruler with Jsonnet. #2127
    • [ENHANCEMENT] Documented how to configure queriers’ autoscaling with Jsonnet. #2128
    • [ENHANCEMENT] Updated mixin building instructions in "Installing Grafana Mimir dashboards and alerts" article. #2015 #2163
    • [ENHANCEMENT] Fix location of "Monitoring Grafana Mimir" article in the documentation hierarchy. #2130
    • [ENHANCEMENT] Runbook for MimirRequestLatency was expanded with more practical advice. #1967
    • [BUGFIX] Fixed ruler configuration used in the getting started guide. #2052
    • [BUGFIX] Fixed Mimir Alertmanager datasource in Grafana used by "Play with Grafana Mimir" tutorial. #2115
    • [BUGFIX] Fixed typos in "Scaling out Grafana Mimir" article. #2170
    • [BUGFIX] Added missing ring endpoint exposed by Ingesters. #1918

    New Contributors

    • @pdf made their first contribution in https://github.com/grafana/mimir/pull/1865
    • @secustor made their first contribution in https://github.com/grafana/mimir/pull/1870
    • @zenador made their first contribution in https://github.com/grafana/mimir/pull/1930
    • @pr00se made their first contribution in https://github.com/grafana/mimir/pull/1934
    • @hjet made their first contribution in https://github.com/grafana/mimir/pull/1973
    • @williamzelesny made their first contribution in https://github.com/grafana/mimir/pull/2028
    • @javad-hajiani made their first contribution in https://github.com/grafana/mimir/pull/2146
    • @rojas-diego made their first contribution in https://github.com/grafana/mimir/pull/2147
    • @jhesketh made their first contribution in https://github.com/grafana/mimir/pull/2163
    • @gonzalez made their first contribution in https://github.com/grafana/mimir/pull/2112
    • @Eve832 made their first contribution in https://github.com/grafana/mimir/pull/2170

    Full Changelog: https://github.com/grafana/mimir/compare/mimir-2.1.0...mimir-2.2.0

  • mimir-2.2.0-rc.1(Jul 12, 2022)

    This release contains 26 contributions from 6 authors. Thank you!

    Changes since 2.2.0-rc.0

    Grafana Mimir

    • [BUGFIX] Query-frontend: vector and time functions were sharded, which made expressions like vector(1) > 0 and vector(1) fail. #2355

    Mimirtool

    • [BUGFIX] Make mimirtool build for Windows work again. #2273

    Full Changelog: https://github.com/grafana/mimir/compare/mimir-2.2.0-rc.0...mimir-2.2.0-rc.1

  • mimir-2.2.0-rc.0(Jun 28, 2022)

    2.2.0-rc.0

    This release contains 214 contributions from 32 authors. Thank you!

    Grafana Labs is excited to announce version 2.2 of Grafana Mimir, the most scalable, most performant open source time series database in the world.

    Highlights include the top features, enhancements, and bugfixes in this release. If you are upgrading from Grafana Mimir 2.1, there is migration-related information as well. For the complete list of changes, see the Changelog.

    Features and enhancements

    • Support for ingesting out-of-order samples: Grafana Mimir includes new, experimental support for ingesting out-of-order samples. This support is configurable: you can set, on a per-tenant basis, how far out of order a sample can be and still be accepted. Note that this feature still needs heavy testing and is not yet production-ready.

    • Error messages: The error messages that Mimir reports are more human readable, and the messages include error codes that are easily searchable.

    • Configurable prefix for object storage: Mimir can now store block data, rules, and alerts in one bucket, each under its own user-defined prefix, rather than requiring one bucket for each. You can configure the storage prefix by using the -<storage>.storage-prefix option for the corresponding storage: ruler-storage, alertmanager-storage, or blocks-storage.

    • Helm Chart update: TBD

    • Store-gateway can now optionally prepopulate the file system cache when memory-mapping index-header files. This can help the store-gateway avoid appearing stuck while loading index-headers. You can enable this feature with the new experimental flag -blocks-storage.bucket-store.index-header.map-populate-enabled.

    • Faster ingester startup: Ingesters now replay the write-ahead log (WAL) about 50% faster, and they also re-join the ring sooner under some conditions.
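
    To make the out-of-order window described above more concrete, here is a minimal sketch of how a sample might be classified against such a window. The function and class names are hypothetical, the comparison against a single "newest timestamp seen" is a simplification, and duplicate-timestamp handling is ignored. The "sample-too-old" reason comes from the changelog in these notes; "sample-out-of-order" is assumed for the case where the window is zero.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    accepted: bool
    reason: str  # "in-order", "out-of-order", or a discard reason


def classify_sample(sample_ts_ms: int, newest_ts_ms: int, window_ms: int) -> Decision:
    """Classify a sample relative to the newest timestamp already ingested."""
    if sample_ts_ms >= newest_ts_ms:
        return Decision(True, "in-order")
    if window_ms > 0 and sample_ts_ms >= newest_ts_ms - window_ms:
        # Older than the newest sample, but still inside the allowed window.
        return Decision(True, "out-of-order")
    if window_ms > 0:
        # Out of order and beyond the allowed window.
        return Decision(False, "sample-too-old")
    # Window disabled (0): any out-of-order sample is rejected (assumed reason label).
    return Decision(False, "sample-out-of-order")


ten_minutes = 10 * 60 * 1000
print(classify_sample(1_000_000, 1_300_000, ten_minutes))
```

    With a window of zero, which is the default, the sketch rejects every sample older than the newest one, matching the behavior when the feature is disabled.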

    Upgrade considerations

    We have updated default values and some parameters in Grafana Mimir 2.2 to give you a better out-of-the-box experience:

    • Message size limits for gRPC messages exchanged between internal Mimir components increased to 100 MiB from the previous 4 MiB. This helps to avoid internal server errors when pushing or querying large data.

    • The default value of -blocks-storage.bucket-store.ignore-blocks-within changed from 0 to 10h, and the default value of -querier.query-store-after changed from 0 to 12h. Both changes improve query performance for the most recent data by querying only the ingesters, rather than object storage.

    • The option -querier.shuffle-sharding-ingesters-lookback-period has been deprecated. If you previously changed this option from its default of 0s, set -querier.shuffle-sharding-ingesters-enabled to true and specify the lookback period by setting the -querier.query-ingesters-within option.

    • The -memberlist.abort-if-join-fails parameter now defaults to false. When Mimir uses memberlist as the backend store for the hash ring and fails to join the memberlist cluster, Mimir no longer aborts startup by default.
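
    The effect of the new -querier.query-store-after default can be sketched as a decision about which backends a query time range touches. This is a simplified model, not Mimir's querier code: the function name is hypothetical, and the 13h value passed for -querier.query-ingesters-within in the usage lines is an illustrative assumption rather than a value stated in these notes.

```python
from datetime import datetime, timedelta, timezone


def backends_for_query(start, end, now, query_store_after, query_ingesters_within):
    """Return the set of backends a [start, end] query range would touch."""
    backends = set()
    # Ingesters serve recent data; a zero duration means "always query ingesters".
    if query_ingesters_within == timedelta(0) or end >= now - query_ingesters_within:
        backends.add("ingesters")
    # Long-term storage is consulted only for data older than query_store_after.
    if start <= now - query_store_after:
        backends.add("store-gateways")
    return backends


now = datetime(2022, 6, 28, 12, 0, tzinfo=timezone.utc)
store_after = timedelta(hours=12)
ingesters_within = timedelta(hours=13)  # assumed value for illustration

# A query over the last hour stays on the ingesters.
print(backends_for_query(now - timedelta(hours=1), now, now, store_after, ingesters_within))
# A query over the last day needs both ingesters and long-term storage.
print(backends_for_query(now - timedelta(hours=24), now, now, store_after, ingesters_within))
```

    With the previous default of 0 for -querier.query-store-after, the second condition holds for any range that starts in the past, so long-term storage was consulted even for recent data.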

    Bug fixes

    • PR 1883: Fixed a bug that caused the query-frontend and querier to crash when they received a user query with a special regular expression label matcher.
    • PR 1933: Fixed a bug in the ingester ring page, which showed incorrect status of entries in the ring.
    • PR 2090: Ruler in remote rule evaluation mode now applies the timeout correctly. Previously the ruler could get stuck forever, which halted rule evaluation.
    • PR 2036: Fixed panic at startup when Mimir is running in monolithic mode and query sharding is enabled.

    CHANGELOG

    Grafana Mimir

    • [CHANGE] Increased default configuration for -server.grpc-max-recv-msg-size-bytes and -server.grpc-max-send-msg-size-bytes from 4MB to 100MB. #1884
    • [CHANGE] Default values have changed for the following settings. This improves query performance for recent data (within 12h) by only reading from ingesters: #1909 #1921
      • -blocks-storage.bucket-store.ignore-blocks-within now defaults to 10h (previously 0)
      • -querier.query-store-after now defaults to 12h (previously 0)
    • [CHANGE] Alertmanager: removed support for migrating local files from Cortex 1.8 or earlier. Related to original Cortex PR https://github.com/cortexproject/cortex/pull/3910. #2253
    • [CHANGE] The following settings are now classified as advanced because the defaults should work for most users and tuning them requires in-depth knowledge of how the read path works: #1929
      • -querier.query-ingesters-within
      • -querier.query-store-after
    • [CHANGE] Config flag category overrides can be set dynamically at runtime. #1934
    • [CHANGE] Ingester: deprecated -ingester.ring.join-after. Mimir now behaves as if this setting is always set to 0s. This configuration option will be removed in Mimir 2.4.0. #1965
    • [CHANGE] Blocks uploaded by ingester no longer contain __org_id__ label. Compactor now ignores this label and will compact blocks with and without this label together. mimirconvert tool will remove the label from blocks as "unknown" label. #1972
    • [CHANGE] Querier: deprecated -querier.shuffle-sharding-ingesters-lookback-period, instead adding -querier.shuffle-sharding-ingesters-enabled to enable or disable shuffle sharding on the read path. The value of -querier.query-ingesters-within is now used internally for shuffle sharding lookback. #2110
    • [CHANGE] Memberlist: -memberlist.abort-if-join-fails now defaults to false. Previously it defaulted to true. #2168
    • [CHANGE] Ruler: /api/v1/rules* and /prometheus/rules* configuration endpoints are removed. Use /prometheus/config/v1/rules*. #2182
    • [CHANGE] Ingester: -ingester.exemplars-update-period has been renamed to -ingester.tsdb-config-update-period. You can use it to update multiple, per-tenant TSDB configurations. #2187
    • [FEATURE] Ingester: (Experimental) Add the ability to ingest out-of-order samples up to an allowed limit. If you enable this feature, it requires additional memory and disk space. This feature also enables a write-behind log, which might lead to longer ingester-start replays. When this feature is disabled, there is no overhead on memory, disk space, or startup times. #2187
      • -ingester.out-of-order-time-window, as a duration string, allows you to set how far back in time a sample can be. The default is 0s, where s is seconds.
      • cortex_ingester_tsdb_out_of_order_samples_appended_total metric tracks the total number of out-of-order samples ingested by the ingester.
      • cortex_discarded_samples_total has a new label reason="sample-too-old", when the -ingester.out-of-order-time-window flag is greater than zero. The label tracks the number of samples that were discarded for being too old; they were out of order, but beyond the time window allowed.
    • [ENHANCEMENT] Distributor: Added limit to prevent tenants from sending excessive number of requests: #1843
      • The following CLI flags (and their respective YAML config options) have been added:
        • -distributor.request-rate-limit
        • -distributor.request-burst-limit
      • The following metric is exposed to tell how many requests have been rejected:
        • cortex_discarded_requests_total
    • [ENHANCEMENT] Store-gateway: Add the experimental ability to run requests in a dedicated OS thread pool. This feature can be configured using -store-gateway.thread-pool-size and is disabled by default. Replaces the ability to run index header operations in a dedicated thread pool. #1660 #1812
    • [ENHANCEMENT] Improved error messages to make them easier to understand; each now has a unique, global identifier that you can look up in the runbooks for more information. #1907 #1919 #1888 #1939 #1984 #2009 #2056 #2066 #2104 #2150 #2234
    • [ENHANCEMENT] Memberlist KV: incoming messages are now processed on a per-key goroutine. This may reduce loss of "maintenance" packets in busy memberlist installations, but uses more CPU. The new memberlist_client_received_broadcasts_dropped_total counter tracks the number of dropped per-key messages. #1912
    • [ENHANCEMENT] Blocks Storage, Alertmanager, Ruler: add support for a prefix to the bucket store (*_storage.storage_prefix). This enables using the same bucket for the three components. #1686 #1951
    • [ENHANCEMENT] Upgrade Docker base images to alpine:3.16.0. #2028
    • [ENHANCEMENT] Store-gateway: Add experimental configuration option for the store-gateway to attempt to pre-populate the file system cache when memory-mapping index-header files. Enabled with -blocks-storage.bucket-store.index-header.map-populate-enabled=true. Note this flag only has an effect when running on Linux. #2019 #2054
    • [ENHANCEMENT] Chunk Mapper: reduce memory usage of async chunk mapper. #2043
    • [ENHANCEMENT] Ingester: reduce sleep time when reading WAL. #2098
    • [ENHANCEMENT] Compactor: Run sanity check on blocks storage configuration at startup. #2144
    • [ENHANCEMENT] Compactor: Add HTTP API for uploading TSDB blocks. Enabled with -compactor.block-upload-enabled. #1694 #2126
    • [ENHANCEMENT] Ingester: Enable querying overlapping blocks by default. #2187
    • [ENHANCEMENT] Distributor: Auto-forget unhealthy distributors after ten failed ring heartbeats. #2154
    • [ENHANCEMENT] Distributor: Add new metric cortex_distributor_forward_errors_total for error codes resulting from forwarding requests. #2077
    • [ENHANCEMENT] /ready endpoint now returns and logs detailed services information. #2055
    • [ENHANCEMENT] Memcached client: Reduce number of connections required to fetch cached keys from memcached. #1920
    • [ENHANCEMENT] Improved error message returned when -querier.query-store-after validation fails. #1914
    • [BUGFIX] Fix regexp parsing panic for regexp label matchers with start/end quantifiers. #1883
    • [BUGFIX] Ingester: fixed deceiving error log "failed to update cached shipped blocks after shipper initialisation", occurring for each new tenant in the ingester. #1893
    • [BUGFIX] Ring: fix bug where instances may appear unhealthy in the hash ring web UI even though they are not. #1933
    • [BUGFIX] API: gzip is now enforced when identity encoding is explicitly rejected. #1864
    • [BUGFIX] Fix panic at startup when Mimir is running in monolithic mode and query sharding is enabled. #2036
    • [BUGFIX] Ruler: report cortex_ruler_queries_failed_total metric for any remote query error except 4xx when remote operational mode is enabled. #2053 #2143
    • [BUGFIX] Ingester: fix slow rollout when using -ingester.ring.unregister-on-shutdown=false with long -ingester.ring.heartbeat-period. #2085
    • [BUGFIX] Ruler: add timeout for remote rule evaluation queries to prevent rule group evaluations getting stuck indefinitely. The duration is configurable with -querier.timeout (default 2m). #2090 #2222
    • [BUGFIX] Limits: Active series custom tracker configuration has been renamed back from active_series_custom_trackers_config to active_series_custom_trackers. For backwards compatibility, both versions are supported until Mimir v2.4. When both fields are specified, active_series_custom_trackers_config takes precedence over active_series_custom_trackers. #2101
    • [BUGFIX] Ingester: fixed the order of labels applied when incrementing the cortex_discarded_metadata_total metric. #2096
    • [BUGFIX] Ingester: fixed bug where retrieving metadata for a metric with multiple metadata entries would return multiple copies of a single metadata entry rather than all available entries. #2096
    • [BUGFIX] Distributor: canceled requests are no longer accounted as internal errors. #2157
    • [BUGFIX] Memberlist: Fix typo in memberlist admin UI. #2202
    • [BUGFIX] Ruler: fixed typo in error message when ruler failed to decode a rule group. #2151
    • [BUGFIX] Active series custom tracker configuration is now displayed properly on /runtime_config page. #2065
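
    The distributor request limit in the changelog above pairs a rate with a burst, which is the shape of a token bucket. The following is a minimal sketch of that general technique, not the distributor's implementation: the class name, the explicit clock argument, and the discarded counter (standing in for what cortex_discarded_requests_total would report) are all illustrative.

```python
class RequestLimiter:
    """Token bucket: refills at `rate` tokens per second, holds at most `burst`."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = None            # time of the previous call, in seconds
        self.discarded = 0          # rejected requests, for illustration

    def allow(self, now: float) -> bool:
        if self.last is not None:
            elapsed = max(0.0, now - self.last)
            # Refill for the elapsed time, capped at the burst size.
            self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        self.discarded += 1
        return False


limiter = RequestLimiter(rate=2, burst=3)
print([limiter.allow(0.0) for _ in range(4)])  # the burst admits three requests
print(limiter.allow(0.5), limiter.discarded)   # half a second refills one token
```

    A larger burst tolerates short spikes from a tenant, while the rate bounds the sustained number of requests per second.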

    Mixin

    • [CHANGE] Split mimir_queries rules group into mimir_queries and mimir_ingester_queries to keep number of rules per group within the default per-tenant limit. #1885
    • [CHANGE] Dashboards: Expose full image tag in the "Pod per version" panel of the "Mimir / Rollout progress" dashboard. #1932
    • [CHANGE] Dashboards: Disabled gateway panels by default, because most users don't have a gateway exposing the metrics expected by Mimir dashboards. You can re-enable them by setting gateway_enabled: true in the mixin config and recompiling the mixin by running make build-mixin. #1955
    • [CHANGE] Alerts: adapt MimirFrontendQueriesStuck and MimirSchedulerQueriesStuck to consider ruler query path components. #1949
    • [CHANGE] Alerts: Change MimirRulerTooManyFailedQueries severity to critical. #2165
    • [ENHANCEMENT] Dashboards: Add config option datasource_regex to customise the regular expression used to select valid datasources for Mimir dashboards. #1802
    • [ENHANCEMENT] Dashboards: Added "Mimir / Remote ruler reads" and "Mimir / Remote ruler reads resources" dashboards. #1911 #1937
    • [ENHANCEMENT] Dashboards: Make networking panels work for pods created by the mimir-distributed helm chart. #1927
    • [ENHANCEMENT] Alerts: Add MimirStoreGatewayNoSyncedTenants alert that fires when there is a store-gateway owning no tenants. #1882
    • [ENHANCEMENT] Rules: Make recording_rules_range_interval configurable for cases where Mimir metrics are scraped less often than every 30 seconds. #2118
    • [ENHANCEMENT] Added minimum Grafana version to mixin dashboards. #1943
    • [BUGFIX] Fix container_memory_usage_bytes:sum recording rule. #1865
    • [BUGFIX] Fix MimirGossipMembersMismatch alerts if Mimir alertmanager is activated. #1870
    • [BUGFIX] Fix MimirRulerMissedEvaluations to show % of missed alerts as a value between 0 and 100 instead of 0 and 1. #1895
    • [BUGFIX] Fix MimirCompactorHasNotUploadedBlocks alert false positive when Mimir is deployed in monolithic mode. #1902
    • [BUGFIX] Fix MimirGossipMembersMismatch to make it less sensitive during rollouts and fire one alert per installation, not per job. #1926
    • [BUGFIX] Do not trigger MimirAllocatingTooMuchMemory alerts if no container limits are supplied. #1905
    • [BUGFIX] Dashboards: Remove empty "Chunks per query" panel from Mimir / Queries dashboard. #1928
    • [BUGFIX] Dashboards: Use Grafana's $__rate_interval for rate queries in dashboards to support scrape intervals of >15s. #2011
    • [BUGFIX] Alerts: Make each version of MimirCompactorHasNotUploadedBlocks distinct to avoid rule evaluation failures due to duplicate series being generated. #2197
    • [BUGFIX] Fix MimirGossipMembersMismatch alert when using remote ruler evaluation. #2159

    Jsonnet

    • [CHANGE] Remove use of -querier.query-store-after, -querier.shuffle-sharding-ingesters-lookback-period, -blocks-storage.bucket-store.ignore-blocks-within, and -blocks-storage.tsdb.close-idle-tsdb-timeout CLI flags since the values now match defaults. #1915 #1921
    • [CHANGE] Change default value for -blocks-storage.bucket-store.chunks-cache.memcached.timeout to 450ms to increase use of cached data. #2035
    • [CHANGE] The memberlist_ring_enabled configuration now applies to Alertmanager. #2102 #2103 #2107
    • [CHANGE] Default value for memberlist_ring_enabled is now true. It means that all hash rings use Memberlist as default KV store instead of Consul (previous default). #2161
    • [CHANGE] Configure -ingester.max-global-metadata-per-user to correspond to 20% of the configured max number of series per tenant. #2250
    • [CHANGE] Configure -ingester.max-global-metadata-per-metric to be 10. #2250
    • [CHANGE] Change _config.multi_zone_ingester_max_unavailable to 25. #2251
    • [FEATURE] Added querier autoscaling support. It requires KEDA installed in the Kubernetes cluster and query-scheduler enabled in the Mimir cluster. Querier autoscaler can be enabled and configured through the following options in the jsonnet config: #2013 #2023
      • autoscaling_querier_enabled: true to enable autoscaling.
      • autoscaling_querier_min_replicas: minimum number of querier replicas.
      • autoscaling_querier_max_replicas: maximum number of querier replicas.
      • autoscaling_prometheus_url: Prometheus base URL from which to scrape Mimir metrics (e.g. http://prometheus.default:9090/prometheus).
    • [FEATURE] Jsonnet: Add support for ruler remote evaluation mode (ruler_remote_evaluation_enabled), which deploys and uses a dedicated query path for rule evaluation. This enables the benefits of the query-frontend for rule evaluation, such as query sharding. #2073
    • [ENHANCEMENT] Added a compactor service that can be used to route requests directly to the compactor (e.g. admin UI). #2063
    • [ENHANCEMENT] Added a consul_enabled configuration option to provide the ability to disable consul. It is automatically set to false when memberlist_ring_enabled is true and multikv_migration_enabled (used for migration from Consul to memberlist) is not set. #2093 #2152
    • [BUGFIX] Querier: Fix disabling shuffle sharding on the read path whilst keeping it enabled on write path. #2164

    Mimirtool

    • [CHANGE] mimirtool rules: --use-legacy-routes now toggles between using /prometheus/config/v1/rules (default) and /api/v1/rules (legacy) endpoints. #2182
    • [FEATURE] Added bearer token support for when Mimir is behind a gateway authenticating by bearer token. #2146
    • [BUGFIX] mimirtool analyze: Fix dashboard JSON unmarshalling errors (#1840). #1973
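
    As a rough illustration of the two Mimirtool changes above, the sketch below picks the rules endpoint based on a legacy-routes switch and attaches a bearer token using the standard HTTP Authorization header. The function name and parameters are hypothetical and do not reflect mimirtool's internals or its flag names beyond --use-legacy-routes.

```python
from typing import Dict, Optional, Tuple


def rules_request(address: str,
                  use_legacy_routes: bool = False,
                  bearer_token: Optional[str] = None) -> Tuple[str, Dict[str, str]]:
    """Build the URL and headers for a rules API call (illustrative only)."""
    # Default route versus the legacy route named in the changelog entry.
    path = "/api/v1/rules" if use_legacy_routes else "/prometheus/config/v1/rules"
    url = address.rstrip("/") + path
    headers: Dict[str, str] = {}
    if bearer_token:
        # Standard bearer scheme for gateways that authenticate by token.
        headers["Authorization"] = f"Bearer {bearer_token}"
    return url, headers


print(rules_request("http://mimir.example:8080"))
print(rules_request("http://mimir.example:8080", use_legacy_routes=True, bearer_token="s3cr3t"))
```

    The host name and token in the usage lines are placeholders.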

    Mimir Continuous Test

    • [ENHANCEMENT] Added the -tests.smoke-test flag to run the mimir-continuous-test suite once and immediately exit. #2047 #2094

    Documentation

    • [ENHANCEMENT] Published Grafana Mimir runbooks as part of documentation. #1970
    • [ENHANCEMENT] Improved ruler's "remote operational mode" documentation. #1906
    • [ENHANCEMENT] Recommend fast disks for ingesters and store-gateways in production tips. #1903
    • [ENHANCEMENT] Explain the runtime override of active series matchers. #1868
    • [ENHANCEMENT] Clarify "Set rule group" API specification. #1869
    • [ENHANCEMENT] Published Mimir jsonnet documentation. #2024
    • [ENHANCEMENT] Documented required scrape interval for using alerting and recording rules from Mimir jsonnet. #2147
    • [ENHANCEMENT] Runbooks: Mention memberlist as possible source of problems for various alerts. #2158
    • [ENHANCEMENT] Added step-by-step article about migrating from Consul to Memberlist KV store using jsonnet without downtime. #2166
    • [ENHANCEMENT] Documented /memberlist admin page. #2166
    • [ENHANCEMENT] Documented how to configure Grafana Mimir's ruler with Jsonnet. #2127
    • [ENHANCEMENT] Documented how to configure queriers’ autoscaling with Jsonnet. #2128
    • [ENHANCEMENT] Updated mixin building instructions in "Installing Grafana Mimir dashboards and alerts" article. #2015 #2163
    • [ENHANCEMENT] Fix location of "Monitoring Grafana Mimir" article in the documentation hierarchy. #2130
    • [ENHANCEMENT] Runbook for MimirRequestLatency was expanded with more practical advice. #1967
    • [BUGFIX] Fixed ruler configuration used in the getting started guide. #2052
    • [BUGFIX] Fixed Mimir Alertmanager datasource in Grafana used by "Play with Grafana Mimir" tutorial. #2115
    • [BUGFIX] Fixed typos in "Scaling out Grafana Mimir" article. #2170
    • [BUGFIX] Added missing ring endpoint exposed by Ingesters. #1918

    New Contributors

    • @pdf made their first contribution in https://github.com/grafana/mimir/pull/1865
    • @secustor made their first contribution in https://github.com/grafana/mimir/pull/1870
    • @zenador made their first contribution in https://github.com/grafana/mimir/pull/1930
    • @pr00se made their first contribution in https://github.com/grafana/mimir/pull/1934
    • @hjet made their first contribution in https://github.com/grafana/mimir/pull/1973
    • @williamzelesny made their first contribution in https://github.com/grafana/mimir/pull/2028
    • @javad-hajiani made their first contribution in https://github.com/grafana/mimir/pull/2146
    • @rojas-diego made their first contribution in https://github.com/grafana/mimir/pull/2147
    • @jhesketh made their first contribution in https://github.com/grafana/mimir/pull/2163
    • @gonzalez made their first contribution in https://github.com/grafana/mimir/pull/2112
    • @Eve832 made their first contribution in https://github.com/grafana/mimir/pull/2170

    Full Changelog: https://github.com/grafana/mimir/compare/mimir-2.1.0...mimir-2.2.0-rc.0

  • mimir-2.1.0(May 26, 2022)

    Grafana Labs is excited to announce version 2.1 of Grafana Mimir, the most scalable, most performant open source time series database in the world.

    Below we highlight the top features, enhancements and bugfixes in this release, as well as relevant callouts for those upgrading from Grafana Mimir 2.0. The complete list of changes is recorded in the Changelog.

    Features and enhancements

    • Mimir on ARM: We now publish Docker images for both amd64 and arm64, making it easier for those on arm-based machines to develop and run Mimir. Multiplatform images are available from the Mimir docker registry. Note that our existing integration test suite only uses the amd64 images, which means we cannot make any functional or performance guarantees about the arm64 images.

    • Remote ruler mode for improved rule evaluation performance: We've added a remote mode for the Grafana Mimir ruler, in which the ruler delegates rule evaluation to the query-frontend rather than evaluating rules directly within the ruler process itself. This allows recording and alerting rules to benefit from the query parallelization techniques implemented in the query-frontend (like query sharding). Remote mode is considered experimental and is off by default. To enable, see remote ruler.

    • Per-tenant custom trackers for monitoring cardinality: In Grafana Mimir 2.0, we introduced a custom tracker feature that allows you to track the count of active series over time that match a specific label matcher. In Grafana Mimir 2.1, we've made it possible to configure custom trackers via the runtime configuration file. This means you can now define different trackers for each tenant in your cluster and modify those trackers without an ingester restart.

    • Reduce cardinality of Grafana Mimir's /metrics endpoint: While Grafana Mimir does a good job of exposing a relatively small number of series about its own state, this number can tick up when running Grafana Mimir clusters with high tenant counts or high active series counts. To reduce this number (and the accompanying cost of scraping and storing these time series), we made several optimizations which decreased series count on the /metrics endpoint by more than 10%.
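
    As a rough sketch of what enabling remote ruler mode might look like, the snippet below builds the one flag that the changelog for this release lists for pointing the ruler at the query-frontend, -ruler.query-frontend.address. The hostname, port, and dns:/// address format are assumptions for illustration only; consult the remote ruler documentation for the values that fit your deployment.

    ```shell
    # Hypothetical query-frontend gRPC address; replace it with your own endpoint.
    QUERY_FRONTEND_ADDRESS="dns:///query-frontend.mimir.svc.cluster.local:9095"

    # Flag name taken from the 2.1 changelog. Remote mode is experimental and off by default.
    RULER_REMOTE_FLAGS="-ruler.query-frontend.address=${QUERY_FRONTEND_ADDRESS}"

    # Print the flag so it can be appended to the ruler's existing arguments.
    echo "${RULER_REMOTE_FLAGS}"
    ```

    The printed flag would be added to the arguments of the ruler process only; other components are not expected to need it.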

    Upgrade considerations

    We've updated the default values for 2 parameters in Grafana Mimir to give users better out-of-the-box performance:

    • We've changed the default for -blocks-storage.tsdb.isolation-enabled from true to false. We've marked this flag as deprecated and will remove it completely in 2 releases. TSDB isolation is a feature inherited from Prometheus that didn't provide any benefit given Grafana Mimir's distributed architecture, and in our 1 billion series load test we found that it actually hurt performance. Disabling it reduced our ingester 99th percentile latency by 90%.

    • The store-gateway attributes cache is now enabled by default (achieved by updating the default for -blocks-storage.bucket-store.chunks-cache.attributes-in-memory-max-items from 0 to 50000). This in-memory cache makes it faster to look up object attributes for chunk data. We've been running this optional cache internally for a while and upon a recent configuration audit, realized it made sense to do the same for all users. The increase in store-gateway memory utilization from enabling this cache is negligible and easily justified given the performance gains.
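
    If you need to hold on to the previous behavior temporarily while upgrading, one option is to set the two flags above back to their old defaults explicitly. The sketch below only assembles and prints those flags. The variable names are hypothetical, and which process receives each flag is inferred from the changelog entries (ingester for TSDB isolation, store-gateway for the attributes cache).

    ```shell
    # Flag names and previous default values come from the upgrade notes above.
    # -blocks-storage.tsdb.isolation-enabled is deprecated and scheduled for removal,
    # so treat this as a short-term measure only.
    INGESTER_LEGACY_FLAG="-blocks-storage.tsdb.isolation-enabled=true"
    STORE_GATEWAY_LEGACY_FLAG="-blocks-storage.bucket-store.chunks-cache.attributes-in-memory-max-items=0"

    # Print each flag next to the component it applies to.
    echo "ingester:      ${INGESTER_LEGACY_FLAG}"
    echo "store-gateway: ${STORE_GATEWAY_LEGACY_FLAG}"
    ```

    Given the load test results reported above, keeping the new defaults is likely the better choice for most clusters.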

    Bug fixes

    2.1.0 bug fixes

    • PR 1704: Fixed a bug that previously caused Grafana Mimir to crash on startup when trying to run in monolithic mode with the results cache enabled due to duplicate metric names.
    • PR 1835: Fixed a bug that caused Grafana Mimir to crash when an invalid Alertmanager configuration was set even though the Alertmanager component was disabled. After this fix, the Alertmanager configuration is only validated if the Alertmanager component is loaded.
    • PR 1836: The ability to run Alertmanager with local storage broke in Grafana Mimir 2.0 when we removed the ability to run the Alertmanager without sharding. With this bugfix, we've made it possible to again run Alertmanager with local storage. However, for production use, we still recommend using an external store since this is needed to persist Alertmanager state (e.g. silences) between replicas.
    • PR 1715: Restored Grafana Mimir's ability to use CNAME DNS records to reach memcached servers. The bug was inherited from an upstream change to Thanos; we contributed a fix to Thanos and subsequently updated our Thanos version.

    CHANGELOG

    Grafana Mimir

    • [CHANGE] Compactor: No longer upload debug meta files to object storage. #1257
    • [CHANGE] Default values have changed for the following settings: #1547
      • -alertmanager.alertmanager-client.grpc-max-recv-msg-size now defaults to 100 MiB (previously was not configurable and set to 16 MiB)
      • -alertmanager.alertmanager-client.grpc-max-send-msg-size now defaults to 100 MiB (previously was not configurable and set to 4 MiB)
      • -alertmanager.max-recv-msg-size now defaults to 100 MiB (previously was 16 MiB)
    • [CHANGE] Ingester: Add user label to metrics cortex_ingester_ingested_samples_total and cortex_ingester_ingested_samples_failures_total. #1533
    • [CHANGE] Ingester: Changed -blocks-storage.tsdb.isolation-enabled default from true to false. The config option has also been deprecated and will be removed in 2 minor versions. #1655
    • [CHANGE] Query-frontend: results cache keys are now versioned. This will cause the cache to be refilled when rolling out this version. #1631
    • [CHANGE] Store-gateway: enabled attributes in-memory cache by default. New default configuration is -blocks-storage.bucket-store.chunks-cache.attributes-in-memory-max-items=50000. #1727
    • [CHANGE] Compactor: Removed the metric cortex_compactor_garbage_collected_blocks_total since it duplicates cortex_compactor_blocks_marked_for_deletion_total. #1728
    • [CHANGE] All: Logs that used the org_id label now use the user label. #1634 #1758
    • [CHANGE] Alertmanager: the following metrics are not exported for a given user and integration when the metric value is zero: #1783
      • cortex_alertmanager_notifications_total
      • cortex_alertmanager_notifications_failed_total
      • cortex_alertmanager_notification_requests_total
      • cortex_alertmanager_notification_requests_failed_total
      • cortex_alertmanager_notification_rate_limited_total
    • [CHANGE] Removed the following metrics exposed by the Mimir hash rings: #1791
      • cortex_member_ring_tokens_owned
      • cortex_member_ring_tokens_to_own
      • cortex_ring_tokens_owned
      • cortex_ring_member_ownership_percent
    • [CHANGE] Querier / Ruler: removed the following metrics tracking the number of query requests sent to each ingester. You can use cortex_request_duration_seconds_count{route=~"/cortex.Ingester/(QueryStream|QueryExemplars)"} instead. #1797
      • cortex_distributor_ingester_queries_total
      • cortex_distributor_ingester_query_failures_total
    • [CHANGE] Distributor: removed the following metrics tracking the number of requests from a distributor to ingesters: #1799
      • cortex_distributor_ingester_appends_total
      • cortex_distributor_ingester_append_failures_total
    • [CHANGE] Distributor / Ruler: deprecated -distributor.extend-writes. Now Mimir always behaves as if this setting was set to false, which we expect to be safe for every Mimir cluster setup. #1856
    • [FEATURE] Querier: Added support for streaming remote read. Note that the benefits of chunking the response are partial here, since in a typical query-frontend setup responses are buffered until they have been completed. #1735
    • [FEATURE] Ruler: Allow setting evaluation_delay for each rule group via rules group configuration file. #1474
    • [FEATURE] Ruler: Added support for expression remote evaluation. #1536 #1818
      • The following CLI flags (and their respective YAML config options) have been added:
        • -ruler.query-frontend.address
        • -ruler.query-frontend.grpc-client-config.grpc-max-recv-msg-size
        • -ruler.query-frontend.grpc-client-config.grpc-max-send-msg-size
        • -ruler.query-frontend.grpc-client-config.grpc-compression
        • -ruler.query-frontend.grpc-client-config.grpc-client-rate-limit
        • -ruler.query-frontend.grpc-client-config.grpc-client-rate-limit-burst
        • -ruler.query-frontend.grpc-client-config.backoff-on-ratelimits
        • -ruler.query-frontend.grpc-client-config.backoff-min-period
        • -ruler.query-frontend.grpc-client-config.backoff-max-period
        • -ruler.query-frontend.grpc-client-config.backoff-retries
        • -ruler.query-frontend.grpc-client-config.tls-enabled
        • -ruler.query-frontend.grpc-client-config.tls-ca-path
        • -ruler.query-frontend.grpc-client-config.tls-cert-path
        • -ruler.query-frontend.grpc-client-config.tls-key-path
        • -ruler.query-frontend.grpc-client-config.tls-server-name
        • -ruler.query-frontend.grpc-client-config.tls-insecure-skip-verify
    • [FEATURE] Distributor: Added the ability to forward specific metrics to alternative remote_write API endpoints. #1052
    • [FEATURE] Ingester: Active series custom trackers now support runtime tenant-specific overrides. The configuration has been moved to the limits config; the ingester config has been deprecated. #1188
    • [ENHANCEMENT] Alertmanager API: Concurrency limit for GET requests is now configurable using -alertmanager.max-concurrent-get-requests-per-tenant. #1547
    • [ENHANCEMENT] Alertmanager: Added the ability to configure additional gRPC client settings for the Alertmanager distributor #1547
      • -alertmanager.alertmanager-client.backoff-max-period
      • -alertmanager.alertmanager-client.backoff-min-period
      • -alertmanager.alertmanager-client.backoff-on-ratelimits
      • -alertmanager.alertmanager-client.backoff-retries
      • -alertmanager.alertmanager-client.grpc-client-rate-limit
      • -alertmanager.alertmanager-client.grpc-client-rate-limit-burst
      • -alertmanager.alertmanager-client.grpc-compression
      • -alertmanager.alertmanager-client.grpc-max-recv-msg-size
      • -alertmanager.alertmanager-client.grpc-max-send-msg-size
    • [ENHANCEMENT] Ruler: Add more detailed query information to ruler query stats logging. #1411
    • [ENHANCEMENT] Admin: Admin API now has some styling. #1482 #1549 #1821 #1824
    • [ENHANCEMENT] Alertmanager: added insight=true field to alertmanager dispatch logs. #1379
    • [ENHANCEMENT] Store-gateway: Add the experimental ability to run index header operations in a dedicated thread pool. This feature can be configured using -blocks-storage.bucket-store.index-header-thread-pool-size and is disabled by default. #1660
    • [ENHANCEMENT] Store-gateway: don't drop all blocks if instance finds itself as unhealthy or missing in the ring. #1806 #1823
    • [ENHANCEMENT] Querier: wait until inflight queries are completed when shutting down queriers. #1756 #1767
    • [BUGFIX] Query-frontend: do not shard queries with a subquery unless the subquery is inside a shardable aggregation function call. #1542
    • [BUGFIX] Query-frontend: added component=query-frontend label to results cache memcached metrics to fix a panic when Mimir is running in single binary mode and results cache is enabled. #1704
    • [BUGFIX] Mimir: services' status content-type is now correctly set to text/html. #1575
    • [BUGFIX] Multikv: Fix panic when using runtime config to set primary KV store used by multi KV. #1587
    • [BUGFIX] Multikv: Fix watching for runtime config changes in multi KV store in ruler and querier. #1665
    • [BUGFIX] Memcached: allow using CNAME DNS records for the memcached backend addresses. #1654
    • [BUGFIX] Querier: fixed temporary partial query results when shuffle sharding is enabled and hash ring backend storage is flushed / reset. #1829
    • [BUGFIX] Alertmanager: prevent more file traversal cases related to template names. #1833
    • [BUGFIX] Alertmanager: Allow usage with -alertmanager-storage.backend=local. Note that when using this storage type, the Alertmanager is not able to persist state remotely, so it is not recommended for production use. #1836
    • [BUGFIX] Alertmanager: Do not validate alertmanager configuration if it's not running. #1835

    Mixin

    • [CHANGE] Dashboards: Remove per-user series legends from Tenants dashboard. #1605
    • [CHANGE] Dashboards: Show in-memory series and the per-user series limit on Tenants dashboard. #1613
    • [CHANGE] Dashboards: Slow-queries dashboard now uses user label from logs instead of org_id. #1634
    • [CHANGE] Dashboards: changed all Grafana dashboards UIDs to not conflict with Cortex ones, to let people install both while migrating from Cortex to Mimir: #1801 #1808
      • Alertmanager from a76bee5913c97c918d9e56a3cc88cc28 to b0d38d318bbddd80476246d4930f9e55
      • Alertmanager Resources from 68b66aed90ccab448009089544a8d6c6 to a6883fb22799ac74479c7db872451092
      • Compactor from 9c408e1d55681ecb8a22c9fab46875cc to 1b3443aea86db629e6efdb7d05c53823
      • Compactor Resources from df9added6f1f4332f95848cca48ebd99 to 09a5c49e9cdb2f2b24c6d184574a07fd
      • Config from 61bb048ced9817b2d3e07677fb1c6290 to 5d9d0b4724c0f80d68467088ec61e003
      • Object Store from d5a3a4489d57c733b5677fb55370a723 to e1324ee2a434f4158c00a9ee279d3292
      • Overrides from b5c95fee2e5e7c4b5930826ff6e89a12 to 1e2c358600ac53f09faea133f811b5bb
      • Queries from d9931b1054053c8b972d320774bb8f1d to b3abe8d5c040395cc36615cb4334c92d
      • Reads from 8d6ba60eccc4b6eedfa329b24b1bd339 to e327503188913dc38ad571c647eef643
      • Reads Networking from c0464f0d8bd026f776c9006b05910000 to 54b2a0a4748b3bd1aefa92ce5559a1c2
      • Reads Resources from 2fd2cda9eea8d8af9fbc0a5960425120 to cc86fd5aa9301c6528986572ad974db9
      • Rollout Progress from 7544a3a62b1be6ffd919fc990ab8ba8f to 7f0b5567d543a1698e695b530eb7f5de
      • Ruler from 44d12bcb1f95661c6ab6bc946dfc3473 to 631e15d5d85afb2ca8e35d62984eeaa0
      • Scaling from 88c041017b96856c9176e07cf557bdcf to 64bbad83507b7289b514725658e10352
      • Slow queries from e6f3091e29d2636e3b8393447e925668 to 6089e1ce1e678788f46312a0a1e647e6
      • Tenants from 35fa247ce651ba189debf33d7ae41611 to 35fa247ce651ba189debf33d7ae41611
      • Top Tenants from bc6e12d4fe540e4a1785b9d3ca0ffdd9 to bc6e12d4fe540e4a1785b9d3ca0ffdd9
      • Writes from 0156f6d15aa234d452a33a4f13c838e3 to 8280707b8f16e7b87b840fc1cc92d4c5
      • Writes Networking from 681cd62b680b7154811fe73af55dcfd4 to 978c1cb452585c96697a238eaac7fe2d
      • Writes Resources from c0464f0d8bd026f776c9006b0591bb0b to bc9160e50b52e89e0e49c840fea3d379
    • [FEATURE] Alerts: added the following alerts on mimir-continuous-test tool: #1676
      • MimirContinuousTestNotRunningOnWrites
      • MimirContinuousTestNotRunningOnReads
      • MimirContinuousTestFailed
    • [ENHANCEMENT] Added per_cluster_label support to allow changing the label name used to differentiate between Kubernetes clusters. #1651
    • [ENHANCEMENT] Dashboards: Show QPS and latency of the Alertmanager Distributor. #1696
    • [ENHANCEMENT] Playbooks: Add Alertmanager suggestions for MimirRequestErrors and MimirRequestLatency #1702
    • [ENHANCEMENT] Dashboards: Allow custom datasources. #1749
    • [ENHANCEMENT] Dashboards: Add config option gateway_enabled (defaults to true) to disable gateway panels from dashboards. #1761
    • [ENHANCEMENT] Dashboards: Extend Top tenants dashboard with queries for tenants with highest sample rate, discard rate, and discard rate growth. #1842
    • [ENHANCEMENT] Dashboards: Show ingestion rate limit and rule group limit on Tenants dashboard. #1845
    • [ENHANCEMENT] Dashboards: Add "last successful run" panel to compactor dashboard. #1628
    • [BUGFIX] Dashboards: Fix "Failed evaluation rate" panel on Tenants dashboard. #1629
    • [BUGFIX] Honor the configured per_instance_label in all dashboards and alerts. #1697

    Jsonnet

    • [FEATURE] Added support for mimir-continuous-test. To deploy mimir-continuous-test you can use the following configuration: #1675 #1850
      _config+: {
        continuous_test_enabled: true,
        continuous_test_tenant_id: 'type-tenant-id',
        continuous_test_write_endpoint: 'http://type-write-path-hostname',
        continuous_test_read_endpoint: 'http://type-read-path-hostname/prometheus',
      },
      
    • [ENHANCEMENT] Ingester anti-affinity can now be disabled by using ingester_allow_multiple_replicas_on_same_node configuration key. #1581
    • [ENHANCEMENT] Added node_selector configuration option to select Kubernetes nodes where Mimir should run. #1596
    • [ENHANCEMENT] Alertmanager: Added a PodDisruptionBudget of withMaxUnavailable = 1, to ensure we maintain quorum during rollouts. #1683
    • [ENHANCEMENT] Store-gateway anti-affinity can now be enabled/disabled using store_gateway_allow_multiple_replicas_on_same_node configuration key. #1730
    • [ENHANCEMENT] Added store_gateway_zone_a_args, store_gateway_zone_b_args and store_gateway_zone_c_args configuration options. #1807
    • [BUGFIX] Pass primary and secondary multikv stores via CLI flags. Introduced new multikv_switch_primary_secondary config option to flip primary and secondary in runtime config.

    Mimirtool

    • [BUGFIX] config convert: Retain Cortex defaults for blocks_storage.backend, ruler_storage.backend, alertmanager_storage.backend, auth.type, activity_tracker.filepath, alertmanager.data_dir, blocks_storage.filesystem.dir, compactor.data_dir, ruler.rule_path, ruler_storage.filesystem.dir, and graphite.querier.schemas.backend. #1626 #1762

    Tools

    • [FEATURE] Added a markblocks tool that creates no-compact and delete marks for the blocks. #1551
    • [FEATURE] Added mimir-continuous-test tool to continuously run smoke tests on live Mimir clusters. #1535 #1540 #1653 #1603 #1630 #1691 #1675 #1676 #1692 #1706 #1709 #1775 #1777 #1778 #1795
    • [FEATURE] Added mimir-rules-action GitHub action, located at operations/mimir-rules-action/, used to lint, prepare, verify, diff, and sync rules to a Mimir cluster. #1723
  • mimir-2.1.0-rc.1(May 18, 2022)

    CHANGELOG since mimir-2.1.0-rc.0

    • [CHANGE] Distributor / Ruler: deprecated -distributor.extend-writes. Now Mimir always behaves as if this setting was set to false, which we expect to be safe for every Mimir cluster setup. #1856
  • mimir-2.1.0-rc.0(May 17, 2022)

    Grafana Mimir version 2.1 release notes

    Grafana Labs is excited to announce version 2.1 of Grafana Mimir, the most scalable, most performant open source time series database in the world.

    Below we highlight the top features, enhancements and bugfixes in this release, as well as relevant callouts for those upgrading from Grafana Mimir 2.0. The complete list of changes is recorded in the Changelog.

    Features and enhancements

    • Mimir on ARM: We now publish Docker images for both amd64 and arm64, making it easier for those on ARM-based machines to develop and run Mimir. Multiplatform images are available from the Mimir Docker registry. Note that our existing integration test suite only uses the amd64 images, which means we cannot make any functional or performance guarantees about the arm64 images.

    • Remote ruler mode for improved rule evaluation performance: We've added a remote mode for the Grafana Mimir ruler, in which the ruler delegates rule evaluation to the query-frontend rather than evaluating rules directly within the ruler process itself. This allows recording and alerting rules to benefit from the query parallelization techniques implemented in the query-frontend (like query sharding). Remote mode is considered experimental and is off by default. To enable, see remote ruler.

    • Per-tenant custom trackers for monitoring cardinality: In Grafana Mimir 2.0, we introduced a custom tracker feature that allows you to track the count of active series over time that match a specific label matcher. In Grafana Mimir 2.1, we've made it possible to configure custom trackers via the runtime configuration file. This means you can now define different trackers for each tenant in your cluster and modify those trackers without an ingester restart.

    • Reduce cardinality of Grafana Mimir's /metrics endpoint: While Grafana Mimir does a good job of exposing a relatively small number of series about its own state, this number can tick up when running Grafana Mimir clusters with high tenant counts or high active series counts. To reduce this number (and the accompanying cost of scraping and storing these time series), we made several optimizations which decreased series count on the /metrics endpoint by more than 10%.

    Upgrade considerations

    We've updated the default values for 2 parameters in Grafana Mimir to give users better out-of-the-box performance:

    • We've changed the default for -blocks-storage.tsdb.isolation-enabled from true to false. We've marked this flag as deprecated and will remove it completely in 2 releases. TSDB isolation is a feature inherited from Prometheus that didn't provide any benefit given Grafana Mimir's distributed architecture, and in our 1 billion series load test we found that it actually hurt performance. Disabling it reduced our ingester 99th percentile latency by 90%.

    • The store-gateway attributes cache is now enabled by default (achieved by updating the default for -blocks-storage.bucket-store.chunks-cache.attributes-in-memory-max-items from 0 to 50000). This in-memory cache makes it faster to look up object attributes for chunk data. We've been running this optional cache internally for a while and upon a recent configuration audit, realized it made sense to do the same for all users. The increase in store-gateway memory utilization from enabling this cache is negligible and easily justified given the performance gains.

    Bug fixes

    2.1.0 bug fixes

    • PR 1704: Fixed a bug that previously caused Grafana Mimir to crash on startup when trying to run in monolithic mode with the results cache enabled due to duplicate metric names.
    • PR 1835: Fixed a bug that caused Grafana Mimir to crash when an invalid Alertmanager configuration was set even though the Alertmanager component was disabled. After this fix, the Alertmanager configuration is only validated if the Alertmanager component is loaded.
    • PR 1836: The ability to run Alertmanager with local storage broke in Grafana Mimir 2.0 when we removed the ability to run the Alertmanager without sharding. With this bugfix, we've made it possible to again run Alertmanager with local storage. However, for production use, we still recommend using an external store since this is needed to persist Alertmanager state (e.g. silences) between replicas.
    • PR 1715: Restored Grafana Mimir's ability to use CNAME DNS records to reach memcached servers. The bug was inherited from an upstream change to Thanos; we contributed a fix to Thanos and subsequently updated our Thanos version.
  • mimir-2.0.0(Mar 29, 2022)

    Grafana Labs is excited to announce the first release of Grafana Mimir, the most scalable, most performant open source time series database in the world. In customer tests, we’ve shown that a single cluster can support more than 1 billion active time series.

    Besides massive scale, Grafana Mimir offers a host of other benefits, including easy deployment, native multi-tenancy, high availability, durable long-term storage, and exceptional query performance on even the highest cardinality queries.

    We’re launching Grafana Mimir with a 2.0 version number to signal our respect for Cortex, the project from which Grafana Mimir was forked. The choice of 2.0 also represents our conviction that Grafana Mimir is real-world-tested, production-ready software. It has served as the backbone of our Grafana Cloud Metrics and Grafana Enterprise Metrics products since their inception.

    Learn more:

    The complete list of changes is recorded in the Changelog.

Owner
Grafana Labs
Grafana Labs is behind leading open source projects Grafana and Loki, and the creator of the first open & composable observability platform.
Grafana Labs
A set of components that can be composed into a highly available metric system with unlimited storage capacity

Overview Thanos is a set of components that can be composed into a highly available metric system with unlimited storage capacity, which can be added

Rohan 0 Oct 20, 2021
TiDB Mesh: Implement Multi-Tenant Keyspace by Decorating Message between Components

TiDB Mesh: Implement Multi-Tenant Keyspace by Decorating Message between Compone

null 3 Jan 11, 2022
grafana-sync Keep your grafana dashboards in sync.

grafana-sync Keep your grafana dashboards in sync. Table of Contents grafana-sync Table of Contents Installing Getting Started Pull Save all dashboard

Maksym Postument 169 Dec 14, 2022
Snowflake grafana datasource plugin allows Snowflake data to be visually represented in Grafana dashboards.

Snowflake Grafana Data Source With the Snowflake plugin, you can visualize your Snowflake data in Grafana and build awesome chart. Get started with th

Michelin 39 Dec 29, 2022
A Grafana backend plugin for automatic synchronization of dashboard between multiple Grafana instances.

Grafana Dashboard Synchronization Backend Plugin A Grafana backend plugin for automatic synchronization of dashboard between multiple Grafana instance

Novatec Consulting GmbH 8 Dec 23, 2022
Terraform-grafana-dashboard - Grafana dashboard Terraform module

terraform-grafana-dashboard terraform-grafana-dashboard for project Requirements

hadenlabs 1 May 2, 2022
Grafana-threema-forwarder - Alert forwarder from Grafana webhooks to Threema wire messages

Grafana to Threema alert forwarder Although Grafana has built in support for pus

Péter Szilágyi 4 Nov 11, 2022
Andrews-monitor - A Go program to monitor when times were available to order for Brown's Andrews dining hall. Used during the portion of the pandemic when the dining hall was only available for online order.

Andrews Dining Hall Monitor A Go program to monitor when times were available to order for Brown's Andrews dining hall. Used during the portion of the

null 0 Jan 1, 2022
Openshift's hpessa-exporter allows users to export SMART information of local storage devices as Prometheus metrics, by using HPE Smart Storage Administrator tool

hpessa-exporter Overview Openshift's hpessa-exporter allows users to export SMART information of local storage devices as Prometheus metrics, by using

Shachar Sharon 0 Jan 17, 2022
The Container Storage Interface (CSI) Driver for Fortress Block Storage This driver allows you to use Fortress Block Storage with your container orchestrator

fortress-csi The Container Storage Interface (CSI) Driver for Fortress Block Storage This driver allows you to use Fortress Block Storage with your co

Fortress 0 Jan 23, 2022
Otus prometheus grafana for golang

HW Prometheus. Grafana Clone the repo: git clone https://github.com/alikhanmurzayev/otus_kuber_part_3.git && cd otus_kuber_part_3 Prepare workspace: m

null 0 Dec 17, 2021
Flux prometheus grafana-example - A tool for keeping Kubernetes clusters in sync with sources of configuration

Flux is a tool for keeping Kubernetes clusters in sync with sources of configuration (like Git repositories), and automating updates to configuration when there is new code to deploy.

null 0 Feb 1, 2022
PolarDB Stack is a DBaaS implementation for PolarDB-for-Postgres. As an operator, it creates and manages PolarDB/PostgreSQL clusters running in Kubernetes. It provides rebuild, failover, switchover, scale up/out, and high-availability capabilities for each cluster.

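PolarDB Stack Open Source Edition Lifecycle 1 System Overview PolarDB is a cloud-native relational database developed in-house by Alibaba Cloud. It uses a Shared-Storage architecture that separates storage from compute. The database has moved from the traditional Share-Nothing architecture to a Shared-Storage architecture, going from N compute copies + N storage copies to N compute copies + 1 storage copy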

null 23 Nov 8, 2022
Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration

Karmada Karmada: Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration Karmada (Kubernetes Armada) is a Kubernetes management system that enables

null 3k Dec 30, 2022
Go WhatsApp Multi-Device Implementation in REST API with Multi-Session/Account Support

Go WhatsApp Multi-Device Implementation in REST API This repository contains example of implementation go.mau.fi/whatsmeow package with Multi-Session/

Dimas Restu H 62 Dec 3, 2022
Export Prometheus metrics from journald events using Prometheus Go client library

journald parser and Prometheus exporter Export Prometheus metrics from journald events using Prometheus Go client library. For demonstration purposes,

Mike Sgarbossa 0 Jan 3, 2022
SMART information of local storage devices as Prometheus metrics

hpessa-exporter Overview Openshift's hpessa-exporter allows users to export SMART information of local storage devices as Prometheus metrics, by using

Red Hat Storage 0 Feb 10, 2022
Orchestra is a library to manage long running go processes.

Orchestra Orchestra is a library to manage long running go processes. At the heart of the library is an interface called Player // Player is a long ru

Stephen Afam-Osemene 111 Oct 21, 2022
A long-running Go program that watches a Youtube playlist for new videos, and downloads them using yt-dlp or other preferred tool.

ytdlwatch A long-running Go program that watches a Youtube playlist for new videos, and downloads them using yt-dlp or other preferred tool. Ideal for

Raine Virta 9 Jul 25, 2022