A fast, specialized time-series database for IoT, real-time internet-connected devices, and AI analytics.

Overview

Unitdb is a fast, specialized time-series database for microservices, IoT, and real-time internet-connected devices. Because Unitdb satisfies the requirements for low latency and binary messaging, it is well suited to applications such as the Internet of Things and other internet-connected devices.

Don't forget to ⭐ this repo if you like Unitdb!

About unitdb

Key characteristics

  • 100% Go
  • Can store larger-than-memory data sets
  • Optimized for fast lookups and writes
  • Supports writing billions of records per hour
  • Supports opening database with immutable flag
  • Supports database encryption
  • Supports time-to-live on message entries
  • Supports writing to wildcard topics
  • Data is written to disk safely and accurately using a high-performance block sync technique

Quick Start

To build Unitdb from source, use the go get command.

go get github.com/unit-io/unitdb

Usage

Detailed API documentation is available using the go.dev service.

To use the client, import it in your Go source code. For example:

import "github.com/unit-io/unitdb"

Unitdb supports Get, Put, and Delete operations. It also supports encryption, batch operations, and writing to wildcard topics. See the usage guide.
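To make the idea of wildcard topics concrete, here is a minimal, self-contained matcher. It is a sketch and not Unitdb's implementation or API: the function name matchTopic is hypothetical, and the syntax is an assumption (topic parts separated by ".", "*" matching exactly one part, and a trailing "..." matching zero or more remaining parts). Check the usage guide for the syntax Unitdb actually accepts.

```go
package main

import (
	"fmt"
	"strings"
)

// matchTopic reports whether topic matches pattern.
// Assumed syntax (not confirmed by this README): parts are separated by ".",
// "*" matches exactly one part, and a trailing "..." matches zero or more
// remaining parts.
func matchTopic(pattern, topic string) bool {
	multi := strings.HasSuffix(pattern, "...")
	if multi {
		pattern = strings.TrimSuffix(pattern, "...")
		if pattern == "" {
			return true // "..." alone matches every topic
		}
	}
	pp := strings.Split(pattern, ".")
	tp := strings.Split(topic, ".")
	// Without a multi-level wildcard the part counts must be equal;
	// with one, the topic needs at least as many parts as the pattern.
	if len(tp) < len(pp) || (!multi && len(tp) != len(pp)) {
		return false
	}
	for i, p := range pp {
		if p != "*" && p != tp[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(matchTopic("teams.*.ch1", "teams.alpha.ch1")) // true
	fmt.Println(matchTopic("teams...", "teams.alpha.ch1"))    // true
	fmt.Println(matchTopic("teams.*.ch1", "teams.alpha.ch2")) // false
}
```

Under these assumptions, a write to "teams.*.ch1" would be delivered to every stored topic for which matchTopic returns true.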

Samples are available in the examples directory for reference.

Clustering

To bring up a Unitdb cluster, start two or more nodes. For fault tolerance, three or more nodes are recommended.

> ./bin/unitdb -listen=:6060 -grpc_listen=:6080 -cluster_self=one -db_path=/tmp/unitdb/node1
> ./bin/unitdb -listen=:6061 -grpc_listen=:6081 -cluster_self=two -db_path=/tmp/unitdb/node2

The example above runs both Unitdb nodes on the same host, so each node must listen on different ports. This is not necessary when each node runs on a different host.

Architecture Overview

The Unitdb engine handles data from the moment a put request is received until the data is written to the physical disk. Data is compressed and, if encryption is enabled, encrypted, then written to a WAL for immediate durability. Entries are written to memdb and become immediately queryable. The memdb entries are periodically written to log files in the form of blocks.
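The ordering of that write path (WAL first, then memdb, then a periodic flush into blocks) can be sketched with a toy in-memory model. Every name below (entry, engine, put, get, flush) is hypothetical and is not Unitdb's API; compression, encryption, and real file I/O are left out.

```go
package main

import "fmt"

// entry is a hypothetical record: a sequence number, a topic and a payload.
type entry struct {
	seq     uint64
	topic   string
	payload []byte
}

// engine is a toy stand-in for the write path: a WAL slice, an in-memory
// store keyed by topic, and the blocks produced by flushing.
type engine struct {
	nextSeq uint64
	wal     []entry
	memdb   map[string][]entry
	blocks  [][]entry
}

func newEngine() *engine {
	return &engine{memdb: make(map[string][]entry)}
}

// put appends to the WAL first (durability), then stores the entry in memdb
// so that it is queryable right away.
func (e *engine) put(topic string, payload []byte) uint64 {
	e.nextSeq++
	en := entry{seq: e.nextSeq, topic: topic, payload: payload}
	e.wal = append(e.wal, en)
	e.memdb[topic] = append(e.memdb[topic], en)
	return en.seq
}

// get reads the entries for a topic straight from the in-memory store.
func (e *engine) get(topic string) []entry {
	return e.memdb[topic]
}

// flush writes the entries accumulated since the last flush as one block and
// truncates the WAL, as a periodic background task might.
func (e *engine) flush() {
	if len(e.wal) == 0 {
		return
	}
	e.blocks = append(e.blocks, e.wal)
	e.wal = nil
}

func main() {
	e := newEngine()
	e.put("sensors.temp", []byte("21.5"))
	e.put("sensors.temp", []byte("21.7"))
	fmt.Println(len(e.get("sensors.temp")), len(e.wal)) // 2 2: queryable before any flush
	e.flush()
	fmt.Println(len(e.blocks), len(e.wal)) // 1 0
}
```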

To compact and store data efficiently, the Unitdb engine groups entry sequences by topic key and then orders those sequences by time; each block keeps the offset of the previous block, so blocks are chained in reverse time order. The index block offset is calculated from the entry sequence in the time-window block. Data is read from the data block using the index entry information and is decompressed on read (and decrypted, if the encryption flag was set).
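The reverse-time chaining of blocks can be pictured with a small model in which each block stores the offset of the previous block for the same topic. This is an illustrative sketch, not Unitdb's on-disk format: the types block and blockLog, the use of slice indexes as offsets, and the per-topic latest map are all assumptions.

```go
package main

import "fmt"

// block holds the timestamps of one topic's entries in ascending order, plus
// the offset of the previous block for the same topic (-1 if there is none).
type block struct {
	topic      string
	timestamps []int64
	prev       int
}

// blockLog is an append-only list of blocks with a per-topic pointer to the
// most recently written block.
type blockLog struct {
	blocks []block
	latest map[string]int
}

func newBlockLog() *blockLog {
	return &blockLog{latest: make(map[string]int)}
}

// appendBlock adds a block, links it to the topic's previous block, and
// returns the new block's offset.
func (l *blockLog) appendBlock(topic string, timestamps []int64) int {
	prev, ok := l.latest[topic]
	if !ok {
		prev = -1
	}
	l.blocks = append(l.blocks, block{topic: topic, timestamps: timestamps, prev: prev})
	off := len(l.blocks) - 1
	l.latest[topic] = off
	return off
}

// newestFirst follows the prev offsets from the latest block backwards and
// returns the topic's timestamps in reverse time order.
func (l *blockLog) newestFirst(topic string) []int64 {
	var out []int64
	off, ok := l.latest[topic]
	if !ok {
		return out
	}
	for off >= 0 {
		b := l.blocks[off]
		for i := len(b.timestamps) - 1; i >= 0; i-- {
			out = append(out, b.timestamps[i])
		}
		off = b.prev
	}
	return out
}

func main() {
	l := newBlockLog()
	l.appendBlock("a", []int64{1, 2})
	l.appendBlock("b", []int64{3})
	l.appendBlock("a", []int64{4, 5})
	fmt.Println(l.newestFirst("a")) // [5 4 2 1]
}
```

Keeping only a backward pointer means a reader looking for recent data touches the newest block first and can stop as soon as it has enough entries.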

Unitdb stores compressed data (live records) in a memdb store. Data records in a memdb are partitioned into (live) time-blocks of configured capacity. New time-blocks are created at ingestion, while old time-blocks are appended to the log files and later synced to the disk store.
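A rough way to picture capacity-based time-blocks: records accumulate in a live block, and once it reaches the configured capacity it is sealed (standing in for "appended to the log files") and a fresh live block is started. The type timeBlocks and its fields are hypothetical, and the capacity is assumed to be greater than zero.

```go
package main

import "fmt"

// timeBlocks partitions incoming record timestamps into blocks of a fixed
// capacity. Sealed blocks stand in for blocks handed off to the log files.
type timeBlocks struct {
	capacity int // assumed to be > 0
	live     []int64
	sealed   [][]int64
}

func newTimeBlocks(capacity int) *timeBlocks {
	return &timeBlocks{capacity: capacity}
}

// add places a record in the live block, sealing the live block first if it
// is already full.
func (t *timeBlocks) add(ts int64) {
	if len(t.live) == t.capacity {
		t.sealed = append(t.sealed, t.live)
		t.live = make([]int64, 0, t.capacity)
	}
	t.live = append(t.live, ts)
}

func main() {
	tb := newTimeBlocks(3)
	for ts := int64(1); ts <= 7; ts++ {
		tb.add(ts)
	}
	fmt.Println(len(tb.sealed), len(tb.live)) // 2 1
}
```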

When Unitdb receives a put or delete request, it first writes the records into a tiny-log for recovery. Tiny-logs are added to the log queue to be written to the log file. A tiny-log write is triggered either by time or, in case of backoff due to massive loads, by the size of the tiny-log.

The tiny-log queue is maintained in memory with a pre-configured size. During massive loads, the memdb backoff process blocks incoming requests until the tiny-log queue has been cleared by a write operation. After records are appended to the tiny-log and written to the log files, they are synced to the disk store using the block sync technique.
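The blocking behavior described here is a form of backpressure, and a bounded Go channel shows the idea in a few lines. This is a sketch of the general technique and not Unitdb's code: tinyLog and runPipeline are hypothetical names, and the "write" is just a counter.

```go
package main

import (
	"fmt"
	"sync"
)

// tinyLog is a hypothetical batch of records waiting for a log-file write.
type tinyLog struct {
	records []string
}

// runPipeline pushes n tiny-logs through a queue with the given capacity and
// returns how many records the writer handled. A send on a full channel
// blocks, so producers wait until the writer has drained the queue.
func runPipeline(n, queueSize int) int {
	queue := make(chan tinyLog, queueSize)
	var wg sync.WaitGroup
	written := 0

	wg.Add(1)
	go func() {
		defer wg.Done()
		for tl := range queue {
			written += len(tl.records) // stands in for the log-file write
		}
	}()

	for i := 0; i < n; i++ {
		// Blocks here whenever queueSize tiny-logs are already pending.
		queue <- tinyLog{records: []string{fmt.Sprintf("rec-%d", i)}}
	}
	close(queue)
	wg.Wait() // written is only read after the writer has finished
	return written
}

func main() {
	fmt.Println(runPipeline(10, 2)) // 10
}
```

A small queue keeps memory bounded at the cost of stalling producers under load; a larger one absorbs bursts but holds more unwritten data in memory.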

Next steps

In the future, we intend to enhance Unitdb with the following features:

  • Distributed design: We are working on building out the distributed design of Unitdb, including replication and sharding management to improve its scalability.
  • Developer support and tooling: We are working on building more intuitive tooling, refactoring code structures, and enriching documentation to improve the onboarding experience, enabling developers to quickly integrate Unitdb into their time-series database stack.
  • Expanding feature set: We also plan to expand our query feature set to include functionality such as window functions and nested loop joins.
  • Query engine optimization: We will also be looking into developing more advanced ways to optimize query performance such as GPU memory caching.

Contributing

Unitdb is under active development, and at this time it is not seeking major changes or new features from new contributors. However, small bug fixes are encouraged.

Licensing

This project is licensed under the Apache-2.0 License.

Releases (v0.2.0)
Owner: Saffat Technologies