CRAQ (Chain Replication with Apportioned Queries) in Go

Overview

Package go-craq implements CRAQ (Chain Replication with Apportioned Queries) as described in the CRAQ paper. MIT Licensed.

CRAQ is a replication protocol that allows reads from any replica while still maintaining strong consistency. Because every node can serve reads, read performance grows linearly with the number of nodes added to the system, so CRAQ should provide better read throughput than consensus protocols like Raft and Paxos, with significantly less network chatter.

Learn more about CRAQ

  • CRAQ Paper
  • Chain Replication: How to Build an Effective KV Storage
  • MIT 6.824 Distributed Systems Lecture on CRAQ (80 mins)

            +------------------+
            |                  |
      +-----+   Coordinator    |
      |     |                  |
Write |     +------------------+
      |
      v
  +---+----+     +--------+     +--------+
  |        +---->+        +---->+        |
  |  Node  |     |  Node  |     |  Node  |
  |        +<----+        +<----+        |
  +---+-+--+     +---+-+--+     +---+-+--+
      ^ |            ^ |            ^ |
 Read | |       Read | |       Read | |
      | |            | |            | |
      + v            + v            + v

Processes

There are three processes that must be started to run the chain: the Coordinator, the Nodes, and the Client. The default node implementation in cmd/node uses the Go net/rpc package and bbolt for storage.

Coordinator

Facilitates new writes to the chain, allows nodes to announce themselves to the chain, and manages the order of the nodes in the chain. One Coordinator should be run for each chain. For better resiliency, you could run a cluster of Coordinators and use something like Raft or Paxos for leader election, but that's outside the scope of this project.

Run Flags

-a # Local address to listen on. Default: :1234

Node

Represents a single node in the chain. Responsible for storing writes, serving reads, and forwarding messages along the chain. In practice, you would probably run a single Node process per machine. Each Node should have its own storage unit.

Run Flags

-a # Local address to listen on. Default: :1235
-p # Public address reachable by coordinator and the other nodes. Default: :1235
-c # Coordinator address. Default: :1234
-f # Bolt DB database file. Default: craq.db
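
For example, a minimal local chain with one Coordinator and two Nodes could be started like this (binary names assume you've built cmd/coordinator and cmd/node):

./coordinator -a :1234
./node -a :1235 -p :1235 -c :1234 -f craq1.db
./node -a :1236 -p :1236 -c :1234 -f craq2.db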

Client

Basic CLI tool for interacting with the chain. Allows writes and reads. The one included in this project uses the net/rpc package as the transport layer.

Usage

./client write hello "world" # Write a new entry for key 'hello'
./client read hello          # Read the latest committed version of key 'hello'

Communication

go-craq processes communicate via RPC. The project is designed so that any RPC system can be swapped in. The basic default client included in the go-craq package uses the net/rpc package from Go's stdlib; an easy-to-work-with package with a great API.

Adding a New Transport Implementation

Pull requests for additional transport implementations are very welcome. Some common ones that would be great to have are gRPC and HTTP. Start by reading through transport/transport.go. Use transport/netrpc as an example.
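
To give a feel for the shape of the work, here's a hedged sketch of what an HTTP transport client could look like. It's illustrative only: the Client interface below is an assumed stand-in, and the real interfaces in transport/transport.go are the source of truth.

package httptransport

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Client is an assumed stand-in for the connection half of the real
// transport interfaces in transport/transport.go.
type Client interface {
	Connect(address string) error
	Close() error
}

// HTTPClient satisfies Client by POSTing JSON-encoded RPC arguments to a
// per-method endpoint on the remote process.
type HTTPClient struct {
	base string
	http *http.Client
}

func (c *HTTPClient) Connect(address string) error {
	c.base = "http://" + address
	c.http = &http.Client{}
	return nil
}

// Close is a no-op because net/http manages its own connection pool.
func (c *HTTPClient) Close() error { return nil }

// call encodes args as JSON, POSTs them to the method's endpoint, and
// decodes the JSON response into reply.
func (c *HTTPClient) call(method string, args, reply interface{}) error {
	body, err := json.Marshal(args)
	if err != nil {
		return err
	}
	resp, err := c.http.Post(c.base+"/"+method, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("rpc %s: %s", method, resp.Status)
	}
	return json.NewDecoder(resp.Body).Decode(reply)
}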

Storage

go-craq is designed to make it easy to swap the persistence layer. CRAQ is flexible, and any storage unit that implements the Storer interface in store/store.go can be used. Some implementations for common storage projects can be found in the store package. The store/kv package is a very simple in-memory key/value store that is included as an example to work from when adding new storage implementations.

Adding a New Storage Implementation

Pull requests for additional storage implementations are very welcome. Start by reading through the comments in store/store.go. Use the store/kv package as an example. CRAQ should work well with either volatile or non-volatile storage, but mixing the two in one chain should be avoided, or you may see long startup times due to data propagation. Mixing different persistent storage mechanisms, on the other hand, is an interesting idea I've been playing with myself: for example, one node storing items in the cloud and another storing items locally.
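
As a concrete starting point, here's a minimal in-memory sketch in the spirit of store/kv. The Item shape and the Write/Commit/Read subset shown are assumptions for illustration; the authoritative method set is the Storer interface in store/store.go.

package sketchstore

import (
	"errors"
	"sync"
)

// Item is an assumed value-plus-version record; see store/store.go for the
// real item type.
type Item struct {
	Value     []byte
	Version   uint64
	Committed bool
}

// Store keeps every version of every key in memory, oldest first.
type Store struct {
	mu    sync.Mutex
	items map[string][]*Item
}

func New() *Store {
	return &Store{items: map[string][]*Item{}}
}

// Write records a new, uncommitted version of key.
func (s *Store) Write(key string, value []byte, version uint64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[key] = append(s.items[key], &Item{Value: value, Version: version})
	return nil
}

// Commit marks the given version of key as committed.
func (s *Store) Commit(key string, version uint64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, item := range s.items[key] {
		if item.Version == version {
			item.Committed = true
			return nil
		}
	}
	return errors.New("version not found")
}

// Read returns the newest committed version of key.
func (s *Store) Read(key string) (*Item, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	versions := s.items[key]
	for i := len(versions) - 1; i >= 0; i-- {
		if versions[i].Committed {
			return versions[i], nil
		}
	}
	return nil, errors.New("no committed version")
}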

store/storetest should be used for testing new storage implementations. Run the test suite like this:

import (
	"testing"

	"github.com/despreston/go-craq/store/storetest"
)

func TestStorer(t *testing.T) {
	storetest.Run(t, func(name string, test storetest.Test) {
		// New() is your store's constructor function.
		test(t, New())
	})
}

Reading the Code

There are several places to start that'll give you a great understanding of how things work.

connectToCoordinator method in node/node.go. This is the method the Node must run during startup to connect to the Coordinator and announce itself to the chain. The Coordinator responds with some metadata about where in the chain the node is now located. The node uses this info to connect to the predecessor in the chain.

Update method in node/node.go. This is the method the Coordinator uses to update the node's metadata. New metadata is sent to the node if the node's predecessor or successor changes, or if the address of the tail node changes.

ClientWrite method in node/node.go. This is the method the Coordinator uses to send writes to the head node. This is where propagation along the chain begins.

Q/A

What happens during a write?

A write request containing the key and value is sent to the Coordinator via the Coordinator's Write RPC method. If the chain is not empty, the Coordinator passes the write request to the head node via the node's ClientWrite method.

The head node receives the key and value. The node looks up the key in its store to determine what the latest version should be. If the key already exists, the version of the new write will be the existing version incremented by one. The new value is written to the store. If the node is not connected to another node in the chain, it commits the version. If the node does have a successor, the key, value, and version are forwarded to the next node in the chain via the successor's Write RPC method.
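
To make the version logic concrete, here's the start of a toy sketch. Everything in it is hypothetical (made-up names, direct method calls standing in for the RPC layer) and not the project's actual code:

package chainsketch

// Hypothetical types for this sketch; the real ones live in node/node.go
// and the store package.
type Item struct {
	Value     []byte
	Version   uint64
	Committed bool
}

type Node struct {
	store       map[string][]*Item // versions per key, oldest first
	successor   *Node              // nil if this node is the tail
	predecessor *Node              // nil if this node is the head
	tail        *Node              // the current tail of the chain
}

// latestVersion returns the newest known version of key, or 0 if unknown.
func (n *Node) latestVersion(key string) uint64 {
	versions := n.store[key]
	if len(versions) == 0 {
		return 0
	}
	return versions[len(versions)-1].Version
}

// clientWrite sketches the head's write path (ClientWrite in node/node.go).
func (n *Node) clientWrite(key string, value []byte) {
	// New version = latest existing version incremented by one.
	version := n.latestVersion(key) + 1
	n.store[key] = append(n.store[key], &Item{Value: value, Version: version})

	if n.successor == nil {
		// Single-node chain: the head is also the tail, so commit now.
		n.commit(key, version)
		return
	}
	// Forward the uncommitted write down the chain.
	n.successor.write(key, value, version)
}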

The key, value, and version are passed along the chain one-by-one. Each node adds the item to its store and sends a message to the successor in the chain. When the write reaches the tail node, the tail marks the item as committed. The tail sends a Commit RPC to its predecessor. The tail's predecessor commits that version of the item, then continues to forward the Commit message backwards through the chain, one node at a time, until every node has committed the version.
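
Continuing the same toy sketch, propagation down the chain and the backwards Commit wave look roughly like this:

// write sketches a non-head node receiving a forwarded write.
func (n *Node) write(key string, value []byte, version uint64) {
	n.store[key] = append(n.store[key], &Item{Value: value, Version: version})

	if n.successor == nil {
		// This node is the tail: mark the item committed and start the
		// backwards Commit wave.
		n.commit(key, version)
		return
	}
	n.successor.write(key, value, version)
}

// commit marks a version committed, then forwards the Commit backwards
// through the chain until the head is reached.
func (n *Node) commit(key string, version uint64) {
	for _, item := range n.store[key] {
		if item.Version == version {
			item.Committed = true
		}
	}
	if n.predecessor != nil {
		n.predecessor.commit(key, version)
	}
}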

What happens when a new node joins the chain?

When the node.Start method is run, the Node will backfill its list of latest versions for all committed items in its store, then connect to the coordinator. Each Node keeps the latest version of each committed item in its store in-memory. This is done so that if the node is or becomes the tail node, other nodes can query it for the latest committed version of an item. The resources at the top of the readme provide some info on why this is important, but in short it helps ensure strong consistency.

After backfilling the map of latest versions, the Node connects to the Coordinator. The Coordinator adds the new Node to the list of Nodes in the chain, connects to the Node, responds with some metadata for the new Node, then sends updated metadata to the rest of the nodes in the chain to let them know there's a new tail.

The metadata that the Coordinator sends back to the new Node tells the Node whether it's the head or the tail, and who its predecessor in the chain is. The Node uses this info to connect to the predecessor in the chain.

After connecting to the predecessor, the Node asks the predecessor for all items it has that a) the Node has no record of, or b) the Node has older versions of. This ensures that the new Node is caught up with the chain. Because the new node is now the tail, any uncommitted items sent during propagation are immediately committed.
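
In the same toy sketch form, that catch-up exchange might look like the following, where itemsNewerThan is a made-up stand-in for the real propagation RPC:

// itemsNewerThan returns, per key, every version newer than the caller's
// latest known version (0 for unknown keys).
func (n *Node) itemsNewerThan(have map[string]uint64) map[string][]*Item {
	out := map[string][]*Item{}
	for key, versions := range n.store {
		for _, item := range versions {
			if item.Version > have[key] {
				out[key] = append(out[key], item)
			}
		}
	}
	return out
}

// catchUp sketches how a newly joined tail pulls missing or newer items
// from its predecessor.
func (n *Node) catchUp() {
	have := map[string]uint64{}
	for key := range n.store {
		have[key] = n.latestVersion(key)
	}
	for key, items := range n.predecessor.itemsNewerThan(have) {
		for _, item := range items {
			// The new node is the tail, so everything it receives is
			// committed immediately. Copy so we don't mutate the
			// predecessor's record through the shared pointer.
			cp := *item
			cp.Committed = true
			n.store[key] = append(n.store[key], &cp)
		}
	}
}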

Once the Node is connected to its neighbor and the coordinator, it starts listening for RPCs. The RPC server is set up and started in cmd/node.

Why store the latest committed versions in-memory?

It's worth mentioning that CRAQ works best with read-heavy workloads. One of its best "features" is being able to read from any node in the chain. If a node receives a read request for key Y and the latest version of key Y in the store is not committed, the node sends a request to the tail to ask for the latest committed version of key Y. This helps ensure that reads to any node return the same value; in other words, it preserves strong consistency. As the chain grows, it takes longer for an item to be committed, and the probability of needing to ask the tail for the latest version rises. Asking the tail for the latest version can quickly become a bottleneck. Storing these latest versions in-memory therefore affords higher throughput from the tail for latest-version requests, at the expense of slower startup times, because the tail needs to backfill its map of latest versions.
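
Continuing the toy sketch from the write-path section, the read path is roughly:

// latestCommitted sketches the tail's answer to a "latest committed
// version" query; ok is false when no version of key is committed.
func (n *Node) latestCommitted(key string) (version uint64, ok bool) {
	for _, item := range n.store[key] {
		if item.Committed {
			version, ok = item.Version, true
		}
	}
	return version, ok
}

// read sketches the apportioned-query read path.
func (n *Node) read(key string) (*Item, bool) {
	versions := n.store[key]
	if len(versions) == 0 {
		return nil, false
	}
	newest := versions[len(versions)-1]
	if newest.Committed {
		// Clean read: the newest local version is committed; serve it.
		return newest, true
	}
	// Dirty read: ask the tail which version is the latest committed one,
	// then serve that version from the local store.
	committed, ok := n.tail.latestCommitted(key)
	if !ok {
		return nil, false
	}
	for _, item := range versions {
		if item.Version == committed {
			return item, true
		}
	}
	return nil, false
}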

In the future, it may be beneficial to let the operator of the node signify whether they'd like to backfill at startup or serve 'latest version' requests directly from the store.

Backlog

  • Benchmarks based on the tests in the paper, as close as reasonably possible.
  • gRPC transporter
  • HTTP transporter
  • Allow nodes to join at any location in the chain.