What is KubeProber?
KubeProber is a diagnostic tool designed for large-scale Kubernetes clusters. It runs diagnostic items in a Kubernetes cluster to prove that the cluster's functions work as expected. KubeProber has the following characteristics:
- Large-scale cluster support: supports multi-cluster management; the relationship between clusters and diagnostic items is configured on the management side, and the diagnostic results of all clusters can be viewed in one place;
- Cloud native: the core logic is implemented as an operator, providing complete Kubernetes API compatibility;
- Extensible: supports user-defined diagnostic items.
Unlike a monitoring system, KubeProber proves that the cluster's functions are normal from a diagnostic perspective. Monitoring is a forward link and cannot cover every scenario in the system: even when the monitoring data of every environment looks normal, that does not prove the system is fully functional. A tool is therefore needed to prove the availability of the system from the reverse direction and to discover unavailable points in the cluster before users do, such as:
- Whether all nodes in the cluster can be scheduled, whether there are special taints, etc.;
- Whether a pod can be created and destroyed normally, verifying the entire link from Kubernetes and kubelet to Docker;
- Create a service and test connectivity to verify whether the kube-proxy link is normal;
- Resolve an internal or external domain name to verify whether CoreDNS is working properly;
- Visit an ingress domain name to verify whether the ingress component in the cluster is working properly;
- Create and delete a namespace to verify whether the related webhook is working properly;
- Perform operations such as put/get/delete on Etcd to verify whether Etcd is running normally;
- Run operations through mysql-client to verify whether MySQL is working properly;
- Simulate a user logging in and operating the business system to verify whether the main business process works end to end;
- Check whether the certificates of each environment have expired;
- Expiration check of cloud resources;
- ... more!
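The checks listed above share one pattern: perform a real operation and record whether it succeeded. A minimal shell sketch of that pattern follows. `run_check` and the check names are hypothetical illustrations, not part of KubeProber's API, and `true`/`false` are stand-ins for real operations such as creating a pod or resolving a domain name.

```shell
#!/bin/sh
# Hypothetical helper (not a KubeProber API): run one diagnostic command
# and print "<name> PASS" or "<name> FAIL" based on its exit status.
run_check() {
    name=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "$name PASS"
    else
        echo "$name FAIL"
        return 1
    fi
}

# Stand-in commands; a real check might call kubectl or a DNS lookup.
run_check always-ok true
run_check always-broken false || echo "at least one check failed"
```

The point of the sketch is that a result comes from actively exercising a function rather than from passively observed metrics.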
probe-master is the operator that runs on the management cluster. It maintains two CRDs: Cluster, which is used to manage the managed clusters, and Probe, which is used to manage the built-in and user-written diagnostic items. By watching these two CRDs, probe-master pushes the latest diagnostic configuration to the managed clusters, and it also provides an interface for viewing their diagnostic results.
probe-agent is the operator that runs on a managed cluster. It maintains two CRDs: Probe, which is identical to the one in probe-master and defines the diagnostic items that probe-agent executes in the cluster, and ProbeStatus, which records the diagnostic result of each Probe. Users can view the diagnostic results of the cluster by running kubectl get probestatus in the managed cluster.
Both the master and the agent of KubeProber run as controllers in Kubernetes. Before installation, make sure that you have deployed a Kubernetes cluster and can access it with kubectl.
The webhook needs a certificate to operate, so deploy the cert-manager service first:
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.3.1/cert-manager.yaml
APP=probe-master make deploy
Before deploying the agent, make sure that you have created a cluster on the master side:
kubectl apply -f config/samples/kubeprobe_v1_cluster.yaml
kubectl get cluster
Modify the configmap configuration after creating the cluster:
vim config/manager-probe-agent/manager.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: probeagent
  namespace: system
data:
  probe-conf.yaml: |
    probe_master_addr: http://kubeprober-probe-master.kubeprober.svc.cluster.local:8088
    cluster_name: moon
    secret_key: 2f5079a5-425c-4fb7-8518-562e1685c9b4
If only probe-agent is needed (e.g. for debugging, development, or running probe cases in a single Kubernetes cluster), use the following configuration; probe-agent will then stop communicating with the master.
vim config/manager-probe-agent/manager.yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: probeagent
  namespace: system
data:
  probe-conf.yaml: |
    # default disabled, if enabled, probe-agent will stop communication with master
    agent_debug: true
    # default 1, if more verbose info needed, increase it
    debug_level: 1
APP=probe-agent make deploy
To start developing kubeprober
You can build and run probe-master and probe-agent locally. Please make sure that ~/.kube/config can access the Kubernetes cluster.
install crd && webhook resources
APP=probe-master make run
Before running probe-agent, create a Cluster CRD resource, as described in the section [Deploy probe-agent].
# create local config yaml file
touch probe-conf.yaml
# input configurations, e.g. cluster info
cat << EOF > probe-conf.yaml
probe_master_addr: http://kubeprober-probe-master.kubeprober.svc.cluster.local:8088
cluster_name: moon
secret_key: 2f5079a5-425c-4fb7-8518-562e1685c9b4
EOF
# run probe-agent with config file
APP=probe-agent CONF=./probe-conf.yaml make run
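Before starting the agent, it can help to confirm that the config file contains the three keys used in the example. The following is a hypothetical pre-flight check, assuming the flat `key: value` layout shown in the example; `check_conf` is not shipped with KubeProber.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of KubeProber): verify that a
# probe-conf.yaml file defines the three keys used in the example config.
check_conf() {
    conf=$1
    missing=0
    for key in probe_master_addr cluster_name secret_key; do
        # Match "key:" followed by a non-empty value, allowing leading spaces.
        if ! grep -Eq "^[[:space:]]*${key}:[[:space:]]*[^[:space:]]" "$conf"; then
            echo "missing or empty: $key"
            missing=1
        fi
    done
    return "$missing"
}
```

One possible use is to gate the run command on it: `check_conf ./probe-conf.yaml && APP=probe-agent CONF=./probe-conf.yaml make run`.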
build binary file
APP=probe-master make build
APP=probe-agent make build
# build with default version: latest
# output image format: kubeprober/probe-master:latest
APP=probe-master make docker-build
# build with custom version: v0.0.1
# output image format: kubeprober/probe-master:v0.0.1
APP=probe-master V=v0.0.1 make docker-build
# build with default version: latest
APP=probe-agent make docker-build
# push with default version: latest
APP=probe-agent make docker-push
# build & push
APP=probe-agent make docker-build-push
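The build comments imply a naming rule for the resulting image: `kubeprober/<APP>:<V>`, with `V` defaulting to `latest`. The sketch below restates that rule as a shell function. `image_ref` is a hypothetical helper for illustration only; the actual logic lives in the project's Makefile and may differ.

```shell
#!/bin/sh
# Hypothetical helper: derive the image reference from APP and V, following
# the "kubeprober/<APP>:<V>" format described in the build comments.
image_ref() {
    app=$1
    version=${2:-latest}   # V defaults to "latest" when not given
    echo "kubeprober/${app}:${version}"
}

image_ref probe-master          # kubeprober/probe-master:latest
image_ref probe-master v0.0.1   # kubeprober/probe-master:v0.0.1
```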
Write your prober
Contributions are always welcome. Please refer to Contributing to KubeProber for details.
If you have any questions, please feel free to contact us.
Email: [email protected]
Zhihu: Erda技术团队 (Erda Technical Team; Zhihu is a Chinese knowledge community, similar to Quora.)
WeChat Official Account:
KubeProber is under the Apache 2.0 license. See the LICENSE file for details.