A tool based on eBPF, Prometheus and Grafana to monitor network connectivity.

Connectivity Monitor

Tracks the connectivity of a Kubernetes cluster to its api server and exposes meaningful connectivity metrics.

Uses eBPF to observe all TCP connection establishments from the shoot cluster to the Kubernetes api server and derives meaningful connectivity metrics (an upper bound for meaningful availability) for the api server, which runs in the seed cluster.

Can be deployed in two different modes:

  • Deployed in a shoot cluster (or a regular Kubernetes cluster) to track the connectivity to the api server.
  • Deployed in a seed cluster to track the connectivity of all shoot clusters hosted on the seed.

The network path

The network path from the shoot cluster to the api server.

The shoot cluster's api server is hosted in the seed cluster and the network path involves several hops:

  • the NAT gateway in the shoot cluster,
  • the load balancer in the seed cluster,
  • a k8s service hop and
  • the envoy reverse proxy.

The reverse proxy terminates the TCP connection, starts the TLS negotiation and chooses the api server of the shoot cluster based on the server name extension in the TLS ClientHello message (SNI). The TLS negotiation is relayed to the chosen api server, so that the client actually establishes a TLS session directly with the api server. (See the SNI GEP for details.)
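
To make the SNI-based selection concrete, the following Go sketch shows how the server name can be read from a raw TLS ClientHello record. It is only an illustration of the record layout: the actual connectivity-exporter performs the equivalent parsing inside its eBPF program, and the package and function names here are hypothetical.

package sketch

import "errors"

// extractSNI returns the host name carried in the server_name extension of a
// TLS ClientHello, given the raw bytes of the TLS record. ClientHello messages
// fragmented across several records are not handled in this sketch.
func extractSNI(rec []byte) (string, error) {
    if len(rec) < 5 || rec[0] != 0x16 { // 0x16 = handshake record
        return "", errors.New("not a TLS handshake record")
    }
    b := rec[5:]
    if len(b) < 4 || b[0] != 0x01 { // 0x01 = ClientHello
        return "", errors.New("not a ClientHello")
    }
    b = b[4:] // skip handshake type (1 byte) and length (3 bytes)
    if len(b) < 35 {
        return "", errors.New("truncated ClientHello")
    }
    b = b[34:] // skip client_version (2 bytes) and random (32 bytes)
    if len(b) < 1 || len(b) < 1+int(b[0]) {
        return "", errors.New("truncated session_id")
    }
    b = b[1+int(b[0]):] // skip legacy_session_id
    if len(b) < 2 {
        return "", errors.New("truncated cipher_suites")
    }
    n := int(b[0])<<8 | int(b[1]) // cipher_suites length
    if len(b) < 2+n {
        return "", errors.New("truncated cipher_suites")
    }
    b = b[2+n:]
    if len(b) < 1 || len(b) < 1+int(b[0]) {
        return "", errors.New("truncated compression_methods")
    }
    b = b[1+int(b[0]):] // skip legacy_compression_methods
    if len(b) < 2 {
        return "", errors.New("no extensions")
    }
    total := int(b[0])<<8 | int(b[1])
    b = b[2:]
    if len(b) > total {
        b = b[:total]
    }
    for len(b) >= 4 {
        extType := int(b[0])<<8 | int(b[1])
        extLen := int(b[2])<<8 | int(b[3])
        if len(b) < 4+extLen {
            return "", errors.New("truncated extension")
        }
        ext := b[4 : 4+extLen]
        b = b[4+extLen:]
        if extType != 0x0000 { // 0x0000 = server_name
            continue
        }
        // server_name_list: list length (2), name type (1), name length (2), name
        if len(ext) < 5 || ext[2] != 0x00 { // 0x00 = host_name
            return "", errors.New("malformed server_name extension")
        }
        nameLen := int(ext[3])<<8 | int(ext[4])
        if len(ext) < 5+nameLen {
            return "", errors.New("truncated server_name")
        }
        return string(ext[5 : 5+nameLen]), nil
    }
    return "", errors.New("no server_name extension")
}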

Possible failure types

We can distinguish multiple failure types:

  • There is no network connectivity to the api server.

    The focus of this connectivity-monitor component.

    New TCP connections to the Kubernetes api server are observed to confirm that all the components along the network path, and the api server itself, are working as expected. Many things can break along this path: the DNS resolution of the load balancer's domain name can fail, packets can be dropped due to misconfigured connection tracking tables, or the reverse proxy might be too overloaded to accept new connections. The mundane failure case of no running api server processes is also covered by the connectivity monitor.

  • The api server reports an internal server error.

    Detecting this failure type is not feasible for the connectivity-monitor component; it can be achieved by processing the access logs of the api server.

    The failure cases in which the connection is successfully established, but the api server detects and returns an error (4xx - user error, 5xx - internal error), are counted as successful connection attempts; hence the connectivity monitor yields an upper bound for meaningful availability. These situations can be detected on the server side by parsing the access logs: since the connections were successful, matching access log entries can be expected.

  • The api server doesn't comply with the specification.

    Detecting this failure type requires test cases with a known expected outcome.

    The trickiest failure case is when the api server itself cannot detect the error and returns an incorrect answer as a success (2xx - ok). This failure case can only be detected by running test cases against the api server for which the result is known ahead of time, so that the expected and actual results can be asserted to be equivalent.

Observe all the connections from the shoot cluster to the api server

To capture all connection attempts by:

  • system components managed by Gardener: kubelet, kube-proxy, calico, ... and
  • any user workload that is talking to the api server

the connectivity-exporter must be deployed as a daemonset in the host network of the nodes of the shoot cluster.

Deploying the connectivity-exporter directly in the shoot cluster is motivated by:

  • the connectivity-exporter is closer to the clients that initiate the connection and hence it can even capture failed attempts that don't reach the seed cluster at all (e.g. due to DNS misconfiguration),

  • the load is considerably smaller: the exporter tracks the connections of a single shoot cluster (1-1k/s) rather than the connections of all the shoot clusters hosted on a seed cluster (~300x more).

Later, we plan to deploy the connectivity exporter in the seed cluster as well, to centrally monitor the connections from all shoot clusters that at least reach the reverse proxy (envoy).

Annotate time based on state of connections

The connectivity-exporter assesses each connection attempt based on the packet sequence it observes in a certain time window:

  • unacknowledged connection: SYN (packet sent to the api server), no acknowledgment received

  • rejected connection: SYN packet sent, SYN+ACK packet received, but e.g. during the TLS negotiation the server responds with an RST+ACK packet to abort the connection

  • successful connection: SYN (packet sent to the api server), SYN+ACK (packet received from the api server)
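
As a rough illustration, this assessment can be expressed as a small decision function. The Go types below are hypothetical; the actual exporter derives the verdict from state kept in eBPF maps, keyed by connection.

package sketch

// connObservation is a hypothetical summary of the packets seen for one
// connection within the assessment time window.
type connObservation struct {
    synSent    bool // SYN sent towards the api server
    synAckSeen bool // SYN+ACK received from the api server
    rstSeen    bool // RST(+ACK) received before the handshake completed
}

// assess maps the observed packet sequence to one of the verdicts above.
func assess(o connObservation) string {
    switch {
    case o.synSent && o.synAckSeen && o.rstSeen:
        return "rejected"
    case o.synSent && o.synAckSeen:
        return "successful"
    case o.synSent:
        return "unacknowledged"
    default:
        return "orphan" // e.g. a SYN+ACK without a preceding SYN
    }
}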

The connectivity exporter annotates 1s long time buckets after a certain offset, to tolerate late arrivals and avoid issues at second boundaries:

  • active (/inactive) second: active if there were some new connection attempts, inactive if there were no new connection attempts,

  • failed (/successful) second: failed if there was at least one failed connection attempt (unacknowledged or rejected), or if there were no connection attempts and the preceding bucket was assessed as failed; successful otherwise.

If packets arrive too late (beyond a certain time window) or out of sequence (e.g. a SYN+ACK packet without a preceding SYN packet on the same connection), they are counted as orphan packets.
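
A minimal sketch of this bucket annotation follows, under the assumption that per-second counts of attempts and failures are available; the names are hypothetical, not the exporter's actual data structures.

package sketch

// secondBucket summarizes the connection attempts that fell into one
// 1-second bucket (after the offset that tolerates late arrivals).
type secondBucket struct {
    attempts int // new connection attempts in this second
    failed   int // unacknowledged or rejected attempts in this second
}

// annotate derives the active/failed flags for a bucket. prevFailed is the
// verdict of the preceding bucket; it is carried over on inactive seconds.
func annotate(b secondBucket, prevFailed bool) (active, failed bool) {
    active = b.attempts > 0
    switch {
    case b.failed > 0:
        failed = true
    case b.attempts == 0:
        failed = prevFailed // no evidence either way: keep the previous state
    default:
        failed = false // only successful attempts in this second
    }
    return active, failed
}

With this carry-over, an outage during which no new connection attempts are made still shows up as failed time until a successful attempt is observed again.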

Prometheus metrics

The state of the connectivity exporter is exposed with Prometheus counter metrics, which can be scraped comfortably without losing the 1s granularity.

# HELP connectivity_exporter_connections_total Total number of new connections.
# TYPE connectivity_exporter_connections_total counter
connectivity_exporter_connections_total{kind="rejected"} 0
connectivity_exporter_connections_total{kind="successful"} 544
connectivity_exporter_connections_total{kind="unacknowledged"} 0

# HELP connectivity_exporter_packets_total Total number of new packets.
# TYPE connectivity_exporter_packets_total counter
connectivity_exporter_packets_total{kind="orphan"} 0

# HELP connectivity_exporter_seconds_total Total number of seconds.
# TYPE connectivity_exporter_seconds_total counter
connectivity_exporter_seconds_total{kind="active"} 337
connectivity_exporter_seconds_total{kind="active_failed"} 0
connectivity_exporter_seconds_total{kind="clock"} 2354
connectivity_exporter_seconds_total{kind="failed"} 0

When the connectivity exporter is deployed in the seed, an SNI label is added to the metrics above to differentiate the connections to the different api servers.
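
For illustration, counters with these names and labels could be exposed with the Prometheus Go client (client_golang) roughly as follows. This is a sketch, not the exporter's actual code: the metric names are taken from the output above, the SNI value is a made-up example, and in the shoot deployment the sni label would simply be omitted.

package main

import (
    "log"
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
    connections = prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "connectivity_exporter_connections_total",
            Help: "Total number of new connections.",
        },
        []string{"kind", "sni"}, // the sni label is only populated in the seed deployment
    )
    seconds = prometheus.NewCounterVec(
        prometheus.CounterOpts{
            Name: "connectivity_exporter_seconds_total",
            Help: "Total number of seconds.",
        },
        []string{"kind", "sni"},
    )
)

func main() {
    prometheus.MustRegister(connections, seconds)

    // In the packet processing loop, the verdicts and second annotations
    // would translate into counter increments, e.g.:
    connections.WithLabelValues("successful", "api.example.shoot").Inc()
    seconds.WithLabelValues("active", "api.example.shoot").Inc()

    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":19100", nil)) // 19100 is the exporter's default metrics port
}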

Inspiration

This work is motivated by the meaningful availability paper and the SRE books by Google.

The failed seconds counter metric is meaningful according to the definition of the paper: it captures what users experience. In every counted failed second, there was at least one failed connection attempt by a user or there weren't any successful connection attempts since the last failure. During the uptime of the monitoring stack itself, any failed connection attempt by a user (running in the shoot cluster) will be reported as a failed second.

Overview

The following sketch shows where the TCP connections are captured and how time is annotated based on the assessed connection states.

[overview sketch]

The big picture of meaningful availability also includes application level access logs on the server side. Connectivity monitoring is a first step on the path to meaningful availability that yields an upper bound: availability requires connectivity.

Note that this is a low-level and hence very generic approach with potential for widespread adoption. As long as the service is delivered via TCP/IP (i.e. all the services of our concern) and service instances can be differentiated by the SNI TLS extension, connectivity can be measured with 1s resolution with this approach. The connectivity exporter can be deployed anywhere along the path between the clients and the servers. This choice is a tradeoff: deployed close to the clients, it covers more failure cases and needs to handle less load; deployed closer to the servers, it might cover all the clients but miss certain failure cases.

In the Gardener architecture, we have the unique situation that all the relevant clients of the api server run in the shoot cluster, and we can deploy the connectivity exporter next to the other Gardener-managed system components in the shoot cluster as well.

Comments
  • build: container images and helm charts

    This adds Makefile targets:

    • docker/build
    • docker/push
    • helm/generate
    • helm/install
    • helm/uninstall

    The following environment variables can be redefined before running 'make':

    • REGISTRY
    • IMAGE_NAME
    • IMAGE_TAG

    For example, I run

    export REGISTRY=xxxxx.azurecr.io
    export IMAGE_NAME=connectivity-monitor
    export IMAGE_TAG=albantest
    

    With this, I can test:

    $ time make docker/build docker/push helm/install
    

    And then, the pod is deployed:

    $ kubectl logs -n connectivity-monitor connectivity-exporter-65rz4
    2021/11/08 14:45:01 maxprocs: Updating GOMAXPROCS=2: determined from CPU quota
    I1108 14:45:02.076150       9 metrics.go:24] Starting connectivity-exporter
    I1108 14:45:24.076205       9 packet.go:245] sni: dc.services.visualstudio.com, connections: 1
    ...
    

    There are some errors, but those could be debugged later:

    • packet.go:159] Empty SNI
    • sni: , connections: 2

    TODO:

    • [ ] Add missing helm charts
    • [ ]
  • CI: initial GitHub Action

    What this PR does / why we need it:

    This builds, runs the unit tests, creates a docker image and pushes it to the GitHub Container Registry.

  • Fix Prometheus counters for each SNI

    What this PR does / why we need it:

    • Rename {succeeded,failed}_seconds to {succeeded,failed}_connections in the BPF map sni_stats
    • Separate stats for each SNI
    • On inactive seconds, carry over failed second state

  • ebpf: fix integer overflow with offsets

    Offsets should not be stored in __u8 because they can exceed the range of that type. Typically, when a client such as wget advertises a large number of cipher suites, the offset of the SNI in the ClientHello grows beyond 255.

  • connectivity-exporter: add CLI flag -metrics-addr

    connectivity-exporter was previously listening on port 19100 on all network interfaces and this was not configurable.

    This patch adds a CLI flag -metrics-addr to make this configurable. The default is still ":19100" to keep the previous behaviour unchanged.

    This is useful when the Kubernetes cluster already has something listening on port 19100.

  • connectivity-exporter: monitoring all interfaces

    The network interface name could be specified with the "-i" CLI flag but it was not possible to monitor all network interfaces.

    With this patch, connectivity-exporter will monitor all network interfaces when the "-i" flag is empty or missing.

    It works by setting sll_ifindex to zero: see "man 7 packet":

    sll_ifindex is the interface index of the interface (see netdevice(7)); 0 matches any interface (only permitted for binding).
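
    As a hedged illustration (not the actual patch), binding an AF_PACKET socket to all interfaces from Go could look roughly like this, using golang.org/x/sys/unix; it requires CAP_NET_RAW and Go 1.21+ for binary.NativeEndian:

    package main

    import (
        "encoding/binary"
        "log"

        "golang.org/x/sys/unix"
    )

    // htons converts a 16-bit value to network byte order, as required for
    // the protocol field of AF_PACKET sockets.
    func htons(v uint16) uint16 {
        var b [2]byte
        binary.BigEndian.PutUint16(b[:], v)
        return binary.NativeEndian.Uint16(b[:])
    }

    func main() {
        fd, err := unix.Socket(unix.AF_PACKET, unix.SOCK_RAW, int(htons(unix.ETH_P_ALL)))
        if err != nil {
            log.Fatal(err)
        }
        // Ifindex 0 matches any interface (see packet(7)); a non-zero index
        // would restrict the capture to a single interface, as the "-i" flag did.
        sa := &unix.SockaddrLinklayer{Protocol: htons(unix.ETH_P_ALL), Ifindex: 0}
        if err := unix.Bind(fd, sa); err != nil {
            log.Fatal(err)
        }
        log.Println("capturing on all interfaces")
    }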

  • Simplify metric expiration

    What this PR does / why we need it:

    SNIs should be "expired" after 15 minutes of inactivity. Previously, each SNI received its own goroutine that started a timer, and the timer was reset whenever there was activity. This added unnecessary complexity and made the code harder to test. With this PR, the SNIs are expired in a single goroutine that checks the last time each SNI was updated, which makes the code simpler, more readable and more testable.
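
    A minimal sketch of the single-goroutine approach (hypothetical names, not the PR's actual code):

    package sketch

    import (
        "sync"
        "time"
    )

    // sniActivity records when each SNI was last seen.
    type sniActivity struct {
        mu       sync.Mutex
        lastSeen map[string]time.Time
    }

    func (a *sniActivity) touch(sni string) {
        a.mu.Lock()
        defer a.mu.Unlock()
        a.lastSeen[sni] = time.Now()
    }

    // expireLoop walks the map periodically in a single goroutine and drops
    // SNIs that were idle for longer than ttl (e.g. 15 minutes), instead of
    // running one timer goroutine per SNI.
    func (a *sniActivity) expireLoop(ttl, interval time.Duration) {
        for {
            time.Sleep(interval)
            a.mu.Lock()
            for sni, t := range a.lastSeen {
                if time.Since(t) > ttl {
                    delete(a.lastSeen, sni)
                    // here the exporter would also remove the Prometheus
                    // series labelled with this SNI, e.g. via DeleteLabelValues.
                }
            }
            a.mu.Unlock()
        }
    }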

  • Reset the weekly metrics on Sundays at midnight

    What this PR does / why we need it:

    Previously, they were reset on Thursdays at midnight, because the start of the Unix epoch, January 1, 1970, was a Thursday.

    Special notes for your reviewer:

    May 9, 2022 is a Monday.
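
    The offset arithmetic behind this can be sketched as follows (a hypothetical helper, not the PR's code): truncating Unix time to whole weeks lands on a Thursday midnight, so a three-day offset is needed to land on a Sunday midnight.

    package sketch

    import "time"

    // startOfWeek returns the most recent Sunday midnight (UTC) at or before t,
    // for times after the first Sunday of 1970. The Unix epoch (1970-01-01) was
    // a Thursday, so the first Sunday midnight after the epoch is 3 days in;
    // without this offset, truncating to whole weeks would yield Thursday midnight.
    func startOfWeek(t time.Time) time.Time {
        const day = int64(24 * 60 * 60)
        const week = 7 * day
        const offset = 3 * day
        secs := t.Unix()
        return time.Unix(secs-((secs-offset)%week), 0).UTC()
    }

    // Example: for Monday, May 9, 2022 (any time of day, UTC), startOfWeek
    // returns Sunday, May 8, 2022 00:00 UTC.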

  • Add some panels to show the cluster downtimes in seconds

    What this PR does / why we need it:

    Adds panels that show the downtime in seconds for specific SNIs. This can be useful if you want to see the duration of a downtime in seconds rather than as a percentage.

  • Rename instances of connectivity-monitor to connectivity-exporter

    Clean up any remaining instances of connectivity-monitor and replace them with connectivity-exporter. Done after renaming the repository to gardener/connectivity-exporter.

  • Fix BPF verifier issue

    What this PR does / why we need it:

    On kernel 5.15, the current version fails after 354 iterations of the unrolled for loop. TLS_MAX_SERVER_NAME_LEN (128) is less than that, and if the for loop is rewritten in this (equivalent) way, the BPF verifier accepts the program (on both 5.13 and 5.15).
