Provides the ability for exactly two Single Node OpenShift (SNO) clusters to operate as a predefined pair in an active-passive or active-active configuration: each cluster detects when its peer has died and automatically takes over the peer's workloads after ensuring it is safe to do so.

Overview

High Availability Single Node OpenShift Setup Operator

Motivation

Some companies need a highly available container management solution that fits within a reduced footprint.

  • The hardware savings are significant for customers deploying many remote sites (kiosks, branch offices, restaurant chains, etc.), most notably for edge computing, and RAN in particular.
  • The physical constraints of some deployments prevent more than two nodes (planes, submarines, satellites, and also RAN).
  • Some locations have unreliable network connections or limited bandwidth (once again submarines and satellites, as well as disaster areas such as after hurricanes or floods).

Pre-requisites

  • Two SNO clusters.
  • The deployments that will be managed by the HA layer already exist.

Installation

  • Deploy this operator to each SNO cluster.
  • Load the YAML manifest of the HASNO CR on each SNO cluster.

Assumptions

  • CRs will be updated simultaneously on both clusters by a configuration management tool, for example ACM. If no such tool is used, the user must update the CRs manually on both clusters.

Example CRs

An example HASNO object.

   apiVersion: app.hasno.com/v1alpha1
   kind: HALayerSet
   metadata:
     name: halayerset-sample
   spec:
     # Existing deployments to be managed by the HA layer
     deployments:
       - "nginx-test"
       - "nginx-prod"
     fenceAgentsSpec:
     - name: "fence_ipmilan_1"
       type: "fence_ipmilan"
       params:
         ip: "192.168.126.1"
         username: "admin"
         password: "password"
         ipport: "9111"
         lanplus: "1"
         pcmk_host_list: "cluster1"
     - name: "fence_ipmilan_2"
       type: "fence_ipmilan"
       params:
         ip: "192.168.126.1"
         username: "admin"
         password: "password"
         ipport: "9222"
         lanplus: "1"
         pcmk_host_list: "cluster2"
     nodesSpec:
       firstNodeName: "cluster1"
       firstNodeIP: "192.168.126.10"
       secondNodeName: "cluster2"
       secondNodeIP: "192.168.126.11"

These CRs are created by the admin and trigger the setup of the High Availability layer on top of the SNO clusters.
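
As a rough illustration of how the fields in the example CR relate to each other, here is a minimal Go sketch. The type names, field names, and the `agentsForHost` helper are hypothetical, chosen to mirror the YAML above, and the operator's actual API types may differ. The sketch assumes that `pcmk_host_list` holds the names of the nodes a fence agent can fence, separated by spaces, commas, or semicolons.

```go
package main

import (
	"fmt"
	"strings"
)

// FenceAgentSpec mirrors one entry of fenceAgentsSpec in the example CR (hypothetical type).
type FenceAgentSpec struct {
	Name   string
	Type   string
	Params map[string]string
}

// NodesSpec mirrors the nodesSpec section of the example CR (hypothetical type).
type NodesSpec struct {
	FirstNodeName, FirstNodeIP   string
	SecondNodeName, SecondNodeIP string
}

// HALayerSetSpec mirrors the spec section of the example CR (hypothetical type).
type HALayerSetSpec struct {
	Deployments     []string
	FenceAgentsSpec []FenceAgentSpec
	NodesSpec       NodesSpec
}

// sampleSpec returns a subset of the values from the example CR.
func sampleSpec() HALayerSetSpec {
	return HALayerSetSpec{
		Deployments: []string{"nginx-test", "nginx-prod"},
		FenceAgentsSpec: []FenceAgentSpec{
			{Name: "fence_ipmilan_1", Type: "fence_ipmilan",
				Params: map[string]string{"ip": "192.168.126.1", "ipport": "9111", "pcmk_host_list": "cluster1"}},
			{Name: "fence_ipmilan_2", Type: "fence_ipmilan",
				Params: map[string]string{"ip": "192.168.126.1", "ipport": "9222", "pcmk_host_list": "cluster2"}},
		},
		NodesSpec: NodesSpec{
			FirstNodeName: "cluster1", FirstNodeIP: "192.168.126.10",
			SecondNodeName: "cluster2", SecondNodeIP: "192.168.126.11",
		},
	}
}

// agentsForHost returns the names of the fence agents whose pcmk_host_list contains host.
func agentsForHost(spec HALayerSetSpec, host string) []string {
	var names []string
	for _, agent := range spec.FenceAgentsSpec {
		hosts := strings.FieldsFunc(agent.Params["pcmk_host_list"], func(r rune) bool {
			return r == ' ' || r == ',' || r == ';'
		})
		for _, h := range hosts {
			if h == host {
				names = append(names, agent.Name)
				break
			}
		}
	}
	return names
}

func main() {
	spec := sampleSpec()
	for _, node := range []string{spec.NodesSpec.FirstNodeName, spec.NodesSpec.SecondNodeName} {
		fmt.Printf("%s can be fenced by %v\n", node, agentsForHost(spec, node))
	}
}
```

In the example CR each node has exactly one fence agent pointing at it, so that either cluster can be fenced by its peer.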

Issues
  • Support Fence Agent Update

    This PR tracks the previous status of the fence agent in the ha-sno CR.

    The purpose is to identify update requests for the fence agent. In a nutshell: when a change to the ha-sno CR is detected, the current state of the fence agent is compared with the previous state to decide whether a create, update, or delete is needed.

    Labels: lgtm, approved
    opened by mshitrit
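
The comparison described in the fence agent update PR could look roughly like the following Go sketch. It is not the PR's code: the `FenceAgent` type, the `Action` constants, and `diffFenceAgents` are hypothetical names, and the sketch assumes fence agents are identified by name.

```go
package main

import (
	"fmt"
	"reflect"
)

// FenceAgent is a hypothetical, simplified view of a fence agent entry in the CR.
type FenceAgent struct {
	Name   string
	Type   string
	Params map[string]string
}

// Action is the operation needed to bring a fence agent in line with the current CR.
type Action string

const (
	ActionCreate Action = "create"
	ActionUpdate Action = "update"
	ActionDelete Action = "delete"
)

// diffFenceAgents compares the previously recorded agents with the current ones
// and returns the required action per agent name. Unchanged agents are omitted.
func diffFenceAgents(prev, curr []FenceAgent) map[string]Action {
	prevByName := make(map[string]FenceAgent, len(prev))
	for _, a := range prev {
		prevByName[a.Name] = a
	}

	actions := make(map[string]Action)
	for _, a := range curr {
		old, existed := prevByName[a.Name]
		switch {
		case !existed:
			actions[a.Name] = ActionCreate
		case !reflect.DeepEqual(old, a):
			actions[a.Name] = ActionUpdate
		}
		// Whatever remains in prevByName afterwards no longer exists in the CR.
		delete(prevByName, a.Name)
	}
	for name := range prevByName {
		actions[name] = ActionDelete
	}
	return actions
}

func main() {
	prev := []FenceAgent{
		{Name: "fence_ipmilan_1", Type: "fence_ipmilan", Params: map[string]string{"ipport": "9111"}},
		{Name: "fence_ipmilan_2", Type: "fence_ipmilan", Params: map[string]string{"ipport": "9222"}},
	}
	curr := []FenceAgent{
		{Name: "fence_ipmilan_1", Type: "fence_ipmilan", Params: map[string]string{"ipport": "9333"}},
		{Name: "fence_ipmilan_3", Type: "fence_ipmilan", Params: map[string]string{"ipport": "9444"}},
	}
	fmt.Println(diffFenceAgents(prev, curr))
}
```

Keeping the previous state alongside the current one is what makes the delete case detectable: an agent removed from the CR leaves no trace in the current spec alone.
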
  • Support configurable container image

    The HALayer pod runs a container image that includes Pacemaker and the supported fence agents. This PR lets the user configure which container image to use (for example, when the user requires a fence agent that the default container image does not support).

    • If nothing is configured, the default container image is used.
    • At the moment this is supported only on pod creation, not on update.

    Labels: approved
    opened by mshitrit
  • Bug fixes & Refactoring

    This PR contains 3 commits:

    • Fix for a remediation hot loop that occurred when an error was returned while trying to delete an already deleted service.
    • Fix for incorrect text in a log message.
    • Refactor of the way we wait for the HALayer pod to become active (re-queuing the request instead of a hard-coded wait).

    opened by mshitrit
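
Two of the changes in the bug-fix PR follow a common reconciler pattern, sketched below in plain Go with stand-in types. Nothing here is taken from the PR: `errNotFound`, `Result`, `deleteService`, and `checkPodActive` are hypothetical, and the 5-second delay is an arbitrary value. In an operator built on controller-runtime, the equivalents would most likely be `apierrors.IsNotFound` and `ctrl.Result{RequeueAfter: ...}`.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotFound stands in for the API server's "not found" error (hypothetical sentinel).
var errNotFound = errors.New("not found")

// errTimeout stands in for any other failure (hypothetical sentinel).
var errTimeout = errors.New("timeout")

// Result mimics a reconcile result: a non-zero RequeueAfter asks to be called again later.
type Result struct {
	RequeueAfter time.Duration
}

// deleteService calls del and treats "already deleted" as success, so a missing
// service no longer produces an error that triggers another reconcile round.
func deleteService(del func() error) error {
	if err := del(); err != nil && !errors.Is(err, errNotFound) {
		return err
	}
	return nil
}

// checkPodActive returns immediately instead of blocking: if the pod is not yet
// active, it asks for the request to be re-queued after a short delay.
func checkPodActive(isActive func() bool) Result {
	if isActive() {
		return Result{}
	}
	return Result{RequeueAfter: 5 * time.Second}
}

func main() {
	fmt.Println(deleteService(func() error { return errNotFound }))
	fmt.Println(deleteService(func() error { return errTimeout }))
	fmt.Println(checkPodActive(func() bool { return false }).RequeueAfter)
}
```
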
  • Use webhook in order to make sure NodeSpec fields are Immutable

    The NodeSpec section of the ha-sno CR contains fields that should not be changed after creation (node name/IP). In this PR, webhooks are used to make sure the NodeSpec fields are immutable.

    opened by mshitrit
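
The immutability check behind such a webhook can be sketched in a few lines of Go. This is an illustration only: the `NodesSpec` type mirrors the `nodesSpec` section of the example CR, and `validateNodesSpecUpdate` is a hypothetical helper, not the operator's actual function. In a real operator a check like this would typically be called from the validating webhook's update handler, which receives both the old and the new object.

```go
package main

import "fmt"

// NodesSpec mirrors the nodesSpec section of the example CR (hypothetical type).
type NodesSpec struct {
	FirstNodeName  string
	FirstNodeIP    string
	SecondNodeName string
	SecondNodeIP   string
}

// validateNodesSpecUpdate rejects any update that changes a node name or IP.
// All fields are strings, so the two structs can be compared directly.
func validateNodesSpecUpdate(old, updated NodesSpec) error {
	if old != updated {
		return fmt.Errorf("nodesSpec is immutable: had %+v, got %+v", old, updated)
	}
	return nil
}

func main() {
	old := NodesSpec{
		FirstNodeName: "cluster1", FirstNodeIP: "192.168.126.10",
		SecondNodeName: "cluster2", SecondNodeIP: "192.168.126.11",
	}
	changed := old
	changed.SecondNodeIP = "192.168.126.12"

	fmt.Println(validateNodesSpecUpdate(old, old))
	fmt.Println(validateNodesSpecUpdate(old, changed))
}
```
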
Owner
Medik8s
Medik8s (pronounced medicates) aims for automatic detection and recovery of unhealthy k8s nodes