Distributed-Log-Service - Distributed Log Service With Golang

Overview

Distributed Log Service

This project is the result of my attempt to understand the concepts of distributed systems and implement them in Go, with the goal of building a fully fledged distributed log service. Since the learning curve is steep and time-consuming, and understanding the concepts behind a distributed service requires some prior knowledge, I have split the project's development into the stages listed below. The main aim is to build a distributed service with its own storage handling, networking between a client and a server, and a way to distribute the server instances. At the end, if possible, I plan to deploy the service to the cloud with Kubernetes.

At this point in time (8th January 2022), the first stage has been completed and tested. The second stage is expected to be completed by the end of January.

The stages follow the content structure of the book Distributed Services with Go, written by Travis Jeffery. As the book introduces each concept, I have tried to learn it independently and build the corresponding component of the service. The stages are as follows:

Building the project's storage layer, a web server to facilitate JSON over HTTP, and a custom-made log library

  • Develop the JSON over HTTP commit log service
  • Set up protobufs, and ways to automatically generate the data structures from the protobuf message definitions
  • Building a commit log library that will be the log for the entire service, used to store and look up data
    • The commit log library has the following structure:
      • A component that allows appending and reading records from the log by providing independent structures and methods for the following
        • Store file handling for record entries
        • Index file handling for index entries of the corresponding records
      • A component that combines the store and index file handling components into a Segment module that coordinates operations across the store and index files.
      • Lastly, the final component ties together all the components above, especially the Segment module, to create the Log handling package for the entire library.
    • All the files for the log library can be found under the internal/log directory.
    • To know more about the log library, scroll down to the overview below

Creating the service over a network

  • Setting up gRPC, defining the client and server APIs in protobuf, and building the client and server
  • Securing the service by authenticating the server with SSL/TLS to encrypt the data exchanged, and by authenticating requests with access tokens.
  • Making the service observable by adding logs, metrics and tracing

Distribute - making the service distributed

  • Building discovery into the service to make server instances aware of each other
  • Adding Raft consensus to coordinate the efforts of our servers, and turn them into a cluster
  • Putting discovery into our gRPC clients, so that they discover and connect to servers with client-side load balancing.

Distributed log library (Overview)

A log records what happened and when. It is like a table that always orders the records by time and indexes each record by its offset and the time it was created. Logs are split into a list of segments because disk space is not infinite. When the list of segments grows too big, old segments whose data has already been processed are deleted. This cleanup usually runs in the background, concurrently with normal operation. Each log has one active segment, where data is actively written; when the active segment fills up, the log moves on to a new segment. Each segment comprises two files, a store file and an index file. Records are written to the store file, and the index file holds each record's offset and its position in the store file.
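
The segment rollover and cleanup described above can be sketched in memory. This is a minimal illustration and not the code in internal/log: the memLog and memSegment types and the maxRecords limit are hypothetical, and the real segments are bounded by file size rather than by record count.

```go
package main

import "fmt"

// memSegment is a hypothetical in-memory stand-in for a segment.
type memSegment struct {
	baseOffset uint64   // offset of the first record in this segment
	records    [][]byte // records in append order
}

// nextOffset is the offset the next appended record would receive.
func (s *memSegment) nextOffset() uint64 {
	return s.baseOffset + uint64(len(s.records))
}

// memLog is a hypothetical log: an ordered list of segments, the last
// of which is the active one.
type memLog struct {
	maxRecords int // assumed capacity of one segment
	segments   []*memSegment
}

func newMemLog(maxRecords int) *memLog {
	return &memLog{
		maxRecords: maxRecords,
		segments:   []*memSegment{{baseOffset: 0}},
	}
}

// Append writes to the active segment, starting a new segment first if
// the active one is full, and returns the record's offset.
func (l *memLog) Append(rec []byte) uint64 {
	active := l.segments[len(l.segments)-1]
	if len(active.records) >= l.maxRecords {
		active = &memSegment{baseOffset: active.nextOffset()}
		l.segments = append(l.segments, active)
	}
	off := active.nextOffset()
	active.records = append(active.records, rec)
	return off
}

// Read finds the segment that contains the offset and returns the record.
func (l *memLog) Read(off uint64) ([]byte, error) {
	for _, s := range l.segments {
		if off >= s.baseOffset && off < s.nextOffset() {
			return s.records[off-s.baseOffset], nil
		}
	}
	return nil, fmt.Errorf("offset out of range: %d", off)
}

// Truncate drops old segments that hold only offsets below lowest.
// The active segment is always kept.
func (l *memLog) Truncate(lowest uint64) {
	last := len(l.segments) - 1
	var kept []*memSegment
	for i, s := range l.segments {
		if i == last || s.nextOffset() > lowest {
			kept = append(kept, s)
		}
	}
	l.segments = kept
}

func main() {
	l := newMemLog(2)
	for i := 0; i < 5; i++ {
		l.Append([]byte(fmt.Sprintf("record-%d", i)))
	}
	fmt.Println("segments:", len(l.segments))
	l.Truncate(2)
	fmt.Println("segments after cleanup:", len(l.segments))
}
```

Keeping the active segment during cleanup means the log always has somewhere to append, even if every older segment has been removed.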

Logs in distributed systems allow ordering changes and distributing data across the nodes in a cluster. Moreover, the distributed log is said to be a data structure that reflects the problem of consensus. With append-only logs, every replica in a cluster reads the same input in the same order, and so produces the same result.
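
The point about replicas producing the same result can be illustrated with a small sketch. It assumes a hypothetical op record type and a replay function, neither of which is part of this project; it only shows that applying the same ordered log deterministically yields the same state on every replica.

```go
package main

import "fmt"

// op is a hypothetical log record: add delta to the counter named key.
type op struct {
	key   string
	delta int
}

// replay applies every record in log order and returns the final state.
// It has no other inputs, so the same log always gives the same state.
func replay(log []op) map[string]int {
	state := map[string]int{}
	for _, o := range log {
		state[o.key] += o.delta
	}
	return state
}

func main() {
	log := []op{{"a", 1}, {"b", 5}, {"a", 2}}

	replicaOne := replay(log)
	replicaTwo := replay(log)

	fmt.Println(replicaOne["a"] == replicaTwo["a"], replicaOne["b"] == replicaTwo["b"])
}
```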

This distributed log library is built to support a replicated, coordinated cluster. It does so through methods in the log.go file that let the service know the offset range of each log. That way, we know which nodes have the oldest and newest data, and which nodes are falling behind and need to replicate. The log.go file also has functions to read all the segments of a log, and it supports snapshots and restoring logs when necessary.
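
A rough sketch of how an offset range can show which node is behind follows. The offsetLog and segmentRange types and the recordsBehind helper are assumptions made for this example; the actual method names and fields in log.go may differ.

```go
package main

import "fmt"

// segmentRange is a hypothetical summary of one segment.
type segmentRange struct {
	baseOffset uint64 // offset of the first record in the segment
	nextOffset uint64 // offset the next appended record would get
}

// offsetLog is a hypothetical log that only tracks segment ranges.
// It is assumed to hold at least one segment.
type offsetLog struct {
	segments []segmentRange
}

// LowestOffset is the offset of the oldest record still in the log.
func (l *offsetLog) LowestOffset() uint64 {
	return l.segments[0].baseOffset
}

// HighestOffset is the offset of the newest record. For an empty log
// it returns 0, so callers that care should compare next() instead.
func (l *offsetLog) HighestOffset() uint64 {
	next := l.next()
	if next == 0 {
		return 0
	}
	return next - 1
}

// next is the offset the log would assign to its next record.
func (l *offsetLog) next() uint64 {
	return l.segments[len(l.segments)-1].nextOffset
}

// recordsBehind reports how many records the follower still has to
// replicate to catch up with the leader.
func recordsBehind(leader, follower *offsetLog) uint64 {
	if follower.next() >= leader.next() {
		return 0
	}
	return leader.next() - follower.next()
}

func main() {
	leader := &offsetLog{segments: []segmentRange{{0, 3}, {3, 5}}}
	follower := &offsetLog{segments: []segmentRange{{0, 3}}}

	fmt.Println("leader range:", leader.LowestOffset(), leader.HighestOffset())
	fmt.Println("follower is behind by:", recordsBehind(leader, follower))
}
```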

The library has two separate files, index.go and store.go, which handle reading and appending entries in the index and store files respectively. Both files define structs, along with the corresponding methods to read and append records or index entries. To view the fields of these structs, see index.go and store.go inside the internal/log directory, where all the files for the library can be found. Index entries store relative offsets as uint32 rather than absolute offsets as uint64, which keeps each entry smaller and saves memory. Index files are memory-mapped, since they are small and each entry holds only two values: the record's offset and its position inside the store file. Index entries are appended to and read from the memory-mapped file, which is synced with the file on disk. This makes read and append operations faster than they would be if every operation went through the disk.
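
The index entry layout described above can be sketched as follows. This is a simplified illustration: a plain byte slice stands in for the memory-mapped region, and the memIndex type, the big-endian byte order and the constant names are assumptions for this example rather than details taken from index.go.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
)

const (
	offWidth = 4                   // relative offset stored as uint32
	posWidth = 8                   // position in the store file stored as uint64
	entWidth = offWidth + posWidth // bytes per index entry
)

// memIndex is a hypothetical index backed by a byte slice. In the real
// library this buffer would be a memory-mapped file.
type memIndex struct {
	buf  []byte
	size uint64 // bytes used so far
}

func newMemIndex(maxBytes int) *memIndex {
	return &memIndex{buf: make([]byte, maxBytes)}
}

// Write appends one entry, or returns io.EOF if the index is full.
func (i *memIndex) Write(relOff uint32, pos uint64) error {
	if uint64(len(i.buf)) < i.size+entWidth {
		return io.EOF
	}
	binary.BigEndian.PutUint32(i.buf[i.size:i.size+offWidth], relOff)
	binary.BigEndian.PutUint64(i.buf[i.size+offWidth:i.size+entWidth], pos)
	i.size += entWidth
	return nil
}

// Read returns the n-th entry (zero-based), or io.EOF if it does not exist.
func (i *memIndex) Read(n uint32) (relOff uint32, pos uint64, err error) {
	start := uint64(n) * entWidth
	if i.size < start+entWidth {
		return 0, 0, io.EOF
	}
	relOff = binary.BigEndian.Uint32(i.buf[start : start+offWidth])
	pos = binary.BigEndian.Uint64(i.buf[start+offWidth : start+entWidth])
	return relOff, pos, nil
}

func main() {
	idx := newMemIndex(1024)
	_ = idx.Write(0, 0)
	_ = idx.Write(1, 27)

	relOff, pos, err := idx.Read(1)
	fmt.Println(relOff, pos, err)
}
```

Because every entry has the same width, the n-th entry always starts at byte n * entWidth, so a lookup is a single slice operation with no scanning.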

The log goes through a graceful shutdown, which returns the service to a state where it can restart properly and efficiently. The Open function first grows the index file to its maximum possible size, which leaves empty space between the last entry and the end of the file. On shutdown, the Close method for the index file (in index.go) truncates the persisted file to remove that empty space. After truncation, the last entry appended is at the end of the file, which is where the service expects to find it on restart.
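
The grow-on-open, truncate-on-close behaviour can be sketched with the standard library alone. This sketch leaves out memory mapping entirely, and the fileIndex type, openIndex, and the demo helper are hypothetical names introduced for illustration, not the functions in index.go.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// fileIndex is a hypothetical index file wrapper without memory mapping.
type fileIndex struct {
	file *os.File
	size int64 // bytes of real data
	max  int64 // size the file is grown to while open
}

// openIndex records how much real data the file holds, then grows the
// file to maxBytes so there is room to append.
func openIndex(path string, maxBytes int64) (*fileIndex, error) {
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0644)
	if err != nil {
		return nil, err
	}
	st, err := f.Stat()
	if err != nil {
		f.Close()
		return nil, err
	}
	if st.Size() > maxBytes {
		f.Close()
		return nil, fmt.Errorf("existing data exceeds max size %d", maxBytes)
	}
	if err := f.Truncate(maxBytes); err != nil {
		f.Close()
		return nil, err
	}
	return &fileIndex{file: f, size: st.Size(), max: maxBytes}, nil
}

// append writes an entry right after the last one.
func (i *fileIndex) append(entry []byte) error {
	if i.size+int64(len(entry)) > i.max {
		return io.EOF
	}
	if _, err := i.file.WriteAt(entry, i.size); err != nil {
		return err
	}
	i.size += int64(len(entry))
	return nil
}

// Close flushes the file, then truncates it back to the real data size
// so the last entry sits at the end of the file.
func (i *fileIndex) Close() error {
	if err := i.file.Sync(); err != nil {
		return err
	}
	if err := i.file.Truncate(i.size); err != nil {
		return err
	}
	return i.file.Close()
}

// demo returns the file size while open and after a graceful close.
func demo() (int64, int64, error) {
	dir, err := os.MkdirTemp("", "index-demo")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "0.index")
	idx, err := openIndex(path, 1024)
	if err != nil {
		return 0, 0, err
	}
	if err := idx.append(make([]byte, 12)); err != nil {
		idx.Close()
		return 0, 0, err
	}
	st, err := os.Stat(path)
	if err != nil {
		idx.Close()
		return 0, 0, err
	}
	grown := st.Size()
	if err := idx.Close(); err != nil {
		return 0, 0, err
	}
	st, err = os.Stat(path)
	if err != nil {
		return 0, 0, err
	}
	return grown, st.Size(), nil
}

func main() {
	grown, closed, err := demo()
	fmt.Println(grown, closed, err)
}
```

If the process exits without calling Close, the file keeps its padding, and a later open can no longer tell where the real entries end from the file size alone. That is why the shutdown has to be graceful.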

The segment (segment.go) wraps the index and store types (defined in index.go and store.go) to coordinate operations across the store and index files. This is needed because every time a record is added to the store file, the index file must be updated with the record's offset and position. For reads, the segment looks up the entry in the index file and then reads the record at that position from the store file. The index and store files are named after their baseOffset; for example, with a baseOffset of 3, the files are 3.index and 3.store. This naming convention comes in handy when creating new segments in segment.go, as the baseOffset of a segment's index and store files can be parsed directly from the file names. To be clear, the offset of a record is its index, i.e. whether it is the first, second, or third record written. The baseOffset is the offset of the first record written into a segment's store file. When the index or store file of a segment reaches its maximum size, new index and store files are created together; one cannot be created without the other, so their contents always stay in sync.
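
Parsing the baseOffset back out of the file names can be sketched as below. The helper names baseOffsetFromName and baseOffsets are hypothetical; they only demonstrate the naming convention and are not taken from segment.go or log.go.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
	"strconv"
	"strings"
)

// baseOffsetFromName extracts the baseOffset from a file name such as
// "3.index" or "3.store".
func baseOffsetFromName(name string) (uint64, error) {
	base := filepath.Base(name)
	offStr := strings.TrimSuffix(base, filepath.Ext(base))
	return strconv.ParseUint(offStr, 10, 64)
}

// baseOffsets returns the distinct baseOffsets found in a directory
// listing, sorted from oldest to newest. Each segment contributes two
// files (index and store), so duplicates are collapsed.
func baseOffsets(names []string) ([]uint64, error) {
	seen := map[uint64]bool{}
	var offs []uint64
	for _, name := range names {
		off, err := baseOffsetFromName(name)
		if err != nil {
			return nil, err
		}
		if !seen[off] {
			seen[off] = true
			offs = append(offs, off)
		}
	}
	sort.Slice(offs, func(a, b int) bool { return offs[a] < offs[b] })
	return offs, nil
}

func main() {
	offs, err := baseOffsets([]string{"3.store", "0.index", "3.index", "0.store"})
	fmt.Println(offs, err)
}
```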


Development

The development of the project depends on my learning curve and my ability to grasp the concepts of distributed services. Since the project is entirely for educational purposes, it is hard to predict a timeline. However, the entire project is expected to be completed by the end of this month.

Disclaimer

As stated above, the project is being built by closely following the content and concepts outlined in the book Distributed Services with Go by Travis Jeffery. It is being developed purely for my own exploration and learning about distributed services. However, anyone willing to contribute is more than welcome. Thanks!

Author

Hamza Yusuff - Email: [email protected]
