Simple, ordered, key-value persistence library for the Go Language



gkvlite is a simple, ordered, ACID, key-value persistence library for Go.

GoDoc Build Status Coverage Status


gkvlite is a library that provides a simple key-value persistence store, inspired by SQLite and CouchDB/Couchstore.

gkvlite has the following features...

  • 100% implemented in the Go Language (golang).
  • Open source license - MIT.
  • Keys are ordered, so range scans are supported.
  • On-disk storage for a "Store" is a single file.
  • ACID properties are supported via a simple, append-only, copy-on-write storage design.
  • Concurrent goroutine support (multiple readers, single mutator).

Key concepts

  • A Store is held in a single file.
  • A Store can have zero or more Collections.
  • A Collection can have zero or more Items.
  • An Item is a key and value.
  • A key is a []byte, max length 64KB (length is uint16).
  • A value is a []byte, max length 4GB (length is uint32).

ACID properties

  • Atomicity - all unpersisted changes from all Collections during a Store.Flush() will be persisted atomically.
  • Consistency - simple key-value level consistency is supported.
  • Isolation - mutations won't affect concurrent readers or snapshots.
  • Durability - you control when you want to Flush() & fsync so your application can address its performance-vs-safety tradeoffs appropriately.


  • In general, performance is similar to probabilistic balanced binary tree performance.
  • O(log N) performance for item retrieval, insert, update, delete.
  • O(log N) performance to find the smallest or largest items (by key).
  • Range iteration performance is same as binary tree traversal performance.
  • You can optionally retrieve just keys only, to save I/O & memory resources.


  • Read-only Store snapshots are supported.
  • Mutations on the original Store won't be seen by snapshots.
  • Snapshot creation is a fast O(1) operation per Collection.


The simplest way to use gkvlite is in single-threaded fashion, such as by using a go channel or other application-provided locking to serialize access to a Store.

More advanced users may want to use gkvlite's support for concurrent goroutines. The idea is that multiple read-only goroutines, a single read-write (mutation) goroutine, and a single persistence (flusher) goroutine do not need to block each other.

More specifically, you should have only a single read-write (or mutation) goroutine per Store, and should have only a single persistence goroutine per Store (doing Flush()'s). And, you may have multiple, concurrent read-only goroutines per Store (doing read-only operations like Get()'s, Visit()'s, Snapshot()'s, CopyTo()'s, etc).

A read-only goroutine that performs a long or slow read operation, like a Get() that must retrieve from disk or a range scan, will see a consistent, isolated view of the collection. That is, mutations that happened after the slow read started will not be seen by the reader.

IMPORTANT: In concurrent usage, the user must provide a StoreFile implementation that is concurrent safe.

Note that os.File is not a concurrent safe implementation of the StoreFile interface. You will need to provide your own implementation of the StoreFile interface, such as by using a channel to serialize StoreFile method invocations.

Finally, advanced users may also use a read-write (mutation) goroutine per Collection instead of per Store. There should only be, though, only a single persistence (flusher) goroutine per Store.

Other features

  • In-memory-only mode is supported, where you can use the same API but without any persistence.
  • You provide the os.File - this library just uses the os.File you provide.
  • You provide the os.File.Sync() - if you want to fsync your file, call file.Sync() after you do a Flush().
  • Similar to SQLite's VFS feature, you can supply your own StoreFile interface implementation instead of an actual os.File, for your own advanced testing or I/O interposing needs (e.g., compression, checksums, I/O statistics, caching, enabling concurrency, etc).
  • You can specify your own KeyCompare function. The default is bytes.Compare(). See also the StoreCallbacks.KeyCompareForCollection() callback function.
  • Collections are written to file sorted by Collection name. This allows users with advanced concurrency needs to reason about how concurrent flushes interact with concurrent mutations. For example, if you have a main data collection and a secondary-index collection, with clever collection naming you can know that the main collection will always be "ahead of" the secondary-index collection even with concurrent flushing.
  • A Store can be reverted using the FlushRevert() API to revert the last Flush(). This brings the state of a Store back to where it was as of the next-to-last Flush(). This allows the application to rollback or undo changes on a persisted file.
  • Reverted snapshots (calling FlushRevert() on a snapshot) does not affect (is isolated from) the original Store and does not affect the underlying file. Calling FlushRevert() on the main Store, however, will adversely affect any active snapshots; where the application should stop using any snapshots that were created before the FlushRevert() invocation on the main Store.
  • To evict O(log N) number of items from memory, call Collection.EvictSomeItems(), which traverses a random tree branch and evicts any clean (already persisted) items found during that traversal. Eviction means clearing out references to those clean items, which means those items can be candidates for GC.
  • You can control item priority to access hotter items faster by shuffling them closer to the top of balanced binary trees (warning: intricate/advanced tradeoffs here).
  • Tree depth is provided by using the VisitItemsAscendEx() or VisitItemsDescendEx() methods.
  • You can associate transient, ephemeral (non-persisted) data with your items. If you do use the Item.Transient field, you should use sync/atomic pointer functions for concurrency correctness. In general, Item's should be treated as immutable, except for the Item.Transient field.
  • Application-level Item.Val buffer management is possible via the optional ItemValAddRef/ItemValDecRef() store callbacks, to help reduce garbage memory. Libraries like may be helpful here.
  • Errors from file operations are propagated all the way back to your code, so your application can respond appropriately.
  • Tested - "go test" unit tests.
  • Docs - "go doc" documentation.


Open source - MIT licensed.


import (

f, err := os.Create("/tmp/test.gkvlite")
s, err := gkvlite.NewStore(f)
c := s.SetCollection("cars", nil)

// You can also retrieve the collection, where c == cc.
cc := s.GetCollection("cars")

// Insert values.
c.Set([]byte("tesla"), []byte("$$$"))
c.Set([]byte("mercedes"), []byte("$$"))
c.Set([]byte("bmw"), []byte("$"))

// Replace values.
c.Set([]byte("tesla"), []byte("$$$$"))

// Retrieve values.
mercedesPrice, err := c.Get([]byte("mercedes"))

// One of the most priceless vehicles is not in the collection.
thisIsNil, err := c.Get([]byte("the-apollo-15-moon-buggy"))

// Iterate through items.
c.VisitItemsAscend([]byte("ford"), true, func(i *gkvlite.Item) bool {
    // This visitor callback will be invoked with every item
    // with key "ford" and onwards, in key-sorted order.
    // So: "mercedes", "tesla" are visited, in that ascending order,
    // but not "bmw".
    // If we want to stop visiting, return false;
    // otherwise return true to keep visiting.
    return true

// Let's get a snapshot.
snap := s.Snapshot()
snapCars := snap.GetCollection("cars")

// The snapshot won't see modifications against the original Store.
mercedesIsNil, err := c.Get([]byte("mercedes"))
mercedesPriceFromSnapshot, err := snapCars.Get([]byte("mercedes"))

// Persist all the changes to disk.

f.Sync() // Some applications may also want to fsync the underlying file.

// Now, other file readers can see the data, too.
f2, err := os.Open("/tmp/test.gkvlite")
s2, err := gkvlite.NewStore(f2)
c2 := s.GetCollection("cars")

bmwPrice, err := c2.Get([]byte("bmw"))


Because all collections are persisted atomically when you flush a store to disk, you can implement consistent secondary indexes by maintaining additional collections per store. For example, a "users" collection can hold a JSON document per user, keyed by userId. Another "userEmails" collection can be used like a secondary index, keyed by "emailAddress:userId", with empty values (e.g., []byte{}).

Bulk inserts or batched mutations are roughly supported in gkvlite where your application should only occasionally invoke Flush() after N mutations or after a given amount of time, as opposed to invoking a Flush() after every Set/Delete().

Implementation / design

The fundamental data structure is an immutable treap (tree + heap).

When used with random heap item priorities, treaps have probabilistic balanced tree behavior with the usual O(log N) performance bounds expected of balanced binary trees.

The persistence design is append-only, using ideas from Apache CouchDB and Couchstore / Couchbase, providing a simple approach to reaching ACID properties in the face of process or machine crashes. On re-opening a file, the implementation scans the file backwards looking for the last good root record and logically "truncates" the file at that point. New mutations are appended from that last good root location. This follows the MVCC (multi-version concurrency control) and "the log is the database" approach of CouchDB / Couchstore / Couchbase.

TRADEOFF: the append-only persistence design means file sizes will grow until there's a compaction. To get a compacted file, use CopyTo() with a high "flushEvery" argument.

The append-only file format allows the FlushRevert() API (undo the changes on a file) to have a simple implementation of scanning backwards in the file for the next-to-last good root record and physically truncating the file at that point.

TRADEOFF: the append-only design means it's possible for an advanced adversary to corrupt a gkvlite file by cleverly storing the bytes of a valid gkvlite root record as a value; however, they would need to know the size of the containing gkvlite database file in order to compute a valid gkvlite root record and be able to force a process or machine crash after their fake root record is written but before the next good root record is written/sync'ed.

The immutable, copy-on-write treap plus the append-only persistence design allows for fast and efficient MVCC snapshots.

TRADEOFF: the immutable, copy-on-write design means more memory garbage may be created than other designs, meaning more work for the garbage collector (GC).

TODO / ideas

  • TODO: Performance: consider splitting item storage from node storage, so we're not mixing metadata and data in same cache pages. Need to measure how much win this could be in cases like compaction. Tradeoff as this could mean no more single file simplicity.

  • TODO: Performance: persist items as log, and don't write treap nodes on every Flush().

  • TODO: Keep stats on misses, disk fetches & writes, etc.

  • TODO: Provide public API for O(log N) collection spliting & joining.

  • TODO: Provide O(1) MidItem() or TopItem() implementation, so that users can split collections at decent points.

  • TODO: Provide item priority shifting during CopyTo().

  • TODO: Build up perfectly balanced treap from large batch of (potentially externally) sorted items.

  • TODO: Allow users to retrieve an item's value size (in bytes) without having to first fetch the item into memory.

  • TODO: Allow more fine-grained cached item and node eviction. Node and item objects are currently cached in-memory by gkvlite for higher retrieval performance, only for the nodes & items that you either have updated or have fetched from disk. Certain applications may need that memory instead, though, for more important tasks. The current coarse workaround is to drop all your references to any relevant Stores and Collections, start brand new Store/Collection instances, and let GC reclaim memory.

  • See more TODO's throughout codebase / grep.

  • union code can't be correct

    union code can't be correct

    I haven't confirmed, but I can't see how this thing could possibly return an error.

    In general, I think anything more than two or three levels should be a sign that something is not right. In this case, there's something like nine levels of conditionals nested, each declaring a distinct err var and throwing it away.

    It seems like it should be possible to make a correct implementation that's correct and less... wide.


    opened by dustin 6
  • Reusing byte slice

    Reusing byte slice

    Thanks for the awesome library!

    I'm trying to reuse a byte slice as a key in a for loop, and it appears that gkvlite keeps using the same key.

    Some quick code to show an example:

    b := make([]byte, 4)
    var i uint32 = 0
    for {
        if i >= 10 {
        binary.LittleEndian.PutUint32(b, i)
        c.Set(b, []byte(""))

    This seems to keep reusing the same key (the file size is equal to that of one entry). If you create a new byte slice though on each iteration and use that, it seems to work as expected:

    var i uint32 = 0
    for {
        if i >= 10 {
        b := make([]byte, 4)
        binary.LittleEndian.PutUint32(b, i)
        c.Set(b, []byte(""))

    Any ideas?

    opened by beeker1121 4
  • CAS


    Hello :-)

    Can you explain how to make this correct with gkvlite: We have key "Key", with value "100". Two clients try to change value of "Key" by extracting 100(booth read this value simultaneously), but by application logic "Key" value cannot be less then 0, and just one of two can make extraction.

    WBR Paul.

    opened by freepk 4
  • Memory usage during CopyTo

    Memory usage during CopyTo

    I have a large database I am trying to compact. For this I am using CopyTo as described in the readme. The problem is it blows the memory on the machine I am running it on (RaspberryPi)*

    Some digging later and I believe the issue is in func (o *Store) visitNodes It recursively calls itself and while doing so keeps the pointers choiceT and choiceF both valid. This seems to mean that every Itemloc structure that the visitNodes traverses during the compact phase has to be held in memory until the end of this recursive traversal.

    I have experimented with some fixes that reduce the memory required, but fundamentally this recursive process limits the size of the database that can be traversed to some function of available system memory.

    I've attached a memory profile diagram of the program captured shortly before it crashed due to lack of memory.

    *yes I know I could in theory run this on a machine with more memory - but it's an easy way to show the scalability issue in question here. mem

    opened by cbehopkins 3
  • Limiting memory use

    Limiting memory use


    I am doing something wrong when I use gkvlite in this example, but I'm not sure what. I thought that periodic calls to the Collection.Write and Store.Flush would limit the in-memory size of the program, but that doesn't seem to be the case.

    Could I ask you what I'm doing wrong in the following self contained example?

    An example of the memory usage as the number of keys inserted is increased:

    $ for n in 100 200 400 800 1600 3200 6400 128000 256000 512000; do /usr/bin/time -f '%C %MkB' ./grow -n ${n}; done 2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile047548036/mem.pprof ./grow -n 100 6752kB 2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile335212671/mem.pprof ./grow -n 200 6928kB 2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile506557173/mem.pprof ./grow -n 400 7424kB 2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile956633517/mem.pprof ./grow -n 800 7840kB 2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile050576843/mem.pprof ./grow -n 1600 9824kB 2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile875078316/mem.pprof ./grow -n 3200 11776kB 2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile891669337/mem.pprof ./grow -n 6400 15648kB 2014/01/22 18:51:18 profile: memory profiling enabled, /tmp/profile244818683/mem.pprof ./grow -n 128000 159392kB 2014/01/22 18:51:24 profile: memory profiling enabled, /tmp/profile906229215/mem.pprof ./grow -n 256000 280112kB 2014/01/22 18:51:37 profile: memory profiling enabled, /tmp/profile813873422/mem.pprof ./grow -n 512000 580768kB

    The memory profile appears to indicate the memory is all in gkvlite:

    Adjusting heap profiles for 1-in-4096 sampling rate Welcome to pprof! For help, type 'help'. (pprof) top10 Total: 16.0 MB 7.8 48.7% 48.7% 7.8 48.7% 5.0 31.6% 80.3% 12.8 80.3% 1.2 7.8% 88.1% 1.2 7.8% 1.2 7.8% 95.8% 1.2 7.8% 0.7 4.2% 100.0% 15.8 99.3% main.main

    opened by jimrobinson 3
  • What does MarshalJSON return?

    What does MarshalJSON return?

    I wrote a code to test the JSON output. But I got a strange result. Here is the code

    package main
    func main(){
        f,_ := os.Create( "test.db" )
        s,_ := gkvlite.NewStore(f)
        c := s.SetCollection("car", nil)
        c.Set([]byte("Honda"), []byte("good"))
        b,_ := c.MarshalJSON()
        fmt.Printf( string(b) )

    Then output is


    What does this mean? I expect the MarshalJSON function returns a Json maybe like

    opened by hawkhsieh 1
  • Empty string is not a valid key

    Empty string is not a valid key

    Storing a value under the empty string fails silently:

    import "testing"
    import "bytes"
    import "os"
    import ""
    func TestSimpleEmptyStore(t *testing.T){
        f, err := os.Create("/tmp/test.gkvlite")
        if err != nil {
            t.Errorf("Failed to create Storage File: %v",err)
        res, err := gkvlite.NewStore(f)
        if err != nil {
            t.Errorf("Failed to create Storage Object: %v",err)
        db := res.SetCollection("test",nil)
        key :=  []byte{}
        text := []byte("fnord")
        readback,err := db.Get(key)
        if err != nil {
            t.Errorf("Failed to read back %v",err)
        if bytes.Compare(text,readback) != 0 {
            t.Errorf("Failed to read back, got: %v expected: %v",readback,text)

    The test case fails after returning the empty string instead of the stored value:

    --- FAIL: TestSimpleEmptyStore (0.00 seconds)
        gkvlite_test.go:26: Failed to read back, got: [] expected: [102 110 111 114 100]
    opened by eqv 1
  • Compiling example program

    Compiling example program


    I tried compiling the example program and got this error:

    ./g.go:31: cannot use func literal (type func(*gkvlite.Item) bool) as type bool in function argument ./g.go:31: not enough arguments in call to c.VisitItemsAscend ./g.go:48: undefined: bytes

    I think the example program is out-of-sync with the present release. The source for Collection.go indicates a different number of arguments than in the example: func (t *Collection) VisitItemsAscend(target []byte, withValue bool, v ItemVisitor)

    3 arguments

    example: c.VisitItemsAscend([]byte("ford"), func(i *gkvlite.Item) bool { 2 arguments

    Any chance you could update the example program, if I am correct? :-)

    Oh, I am running on Fedora 18, go 1.1 go version go1.1 linux/amd64

    Thanks, Glen

    opened by gnewton 1
  • Any plans to add transactions?

    Any plans to add transactions?

    Transactions (for any combination of reading & writing during the transaction) maintain a consistent view of the DB during some operation (typically upsert-like ones - i.e. read value V under key K, update part of V so, that it's based on its previous value, and finally save V under K).

    Having more go routines (or even real threads from a pool) whereas one is a mutator and the others are readers makes transactions a requirement.

    If one needs just non-persistent transactions, then synchronization primitives might be enough (though not easy to grasp from the code). But if one expects transactions take longer, then such blocking behavior is not an option and persistent transactions become a requirement.

    opened by dumblob 0
  • Fix function comments based on best practices from Effective Go

    Fix function comments based on best practices from Effective Go

    Every exported function in a program should have a doc comment. The first sentence should be a summary that starts with the name being declared. From effective go.

    I generated this with CodeLingo and I'm keen to get some feedback, but this is automated so feel free to close it and just say "opt out" to opt out of future CodeLingo outreach PRs.

    opened by Daanikus 1
  • manually evict items from memory

    manually evict items from memory

    I would like to be able to evict an item (or at least its value) from memory. Reopening a database as advised in is suboptimal, as it introduces long pauses and memory usage is still rather high, because items are stored in memory all the time until I call Write, Flush and reopen the database.

    Can you provide the following methods, please?

    func (t *Collection) EvictValue(i *Item)
    func (t *Collection) EvictItem(i *Item)

    I would call one of them after I call Set or Get to reduce memory usage.

    opened by starius 0
  • EvictSomeItems does not count leaf node in numEvicted

    EvictSomeItems does not count leaf node in numEvicted

    If I understand correctly, how it works, it should be:

    diff --git a/collection.go b/collection.go
    index 6e92191..1a91b0c 100644
    --- a/collection.go
    +++ b/collection.go
    @@ -245,6 +245,7 @@ func (t *Collection) EvictSomeItems() (numEvicted uint64) {
            if i != nil && err != nil {
          , i)
    +               numEvicted++
            return numEvicted
    opened by starius 1
  • Data structure corruption caused by snapshot open/close.

    Data structure corruption caused by snapshot open/close.

    Hey! While working on a downstream project, I traced a problem that I was experiencing back to what I believe is a corruption bug that manifests when using snapshots. I haven't had time to burrow into it yet, but I came away with this fairly minimal reproduction:

    package main
    import (
    func main() {
            // To trigger: secondKey < firstKey, secondKey < lookup
            firstKey, secondKey, lookup := "c", "a", "b"
            s, _ := gkvlite.NewStore(nil)
            c := s.SetCollection("", nil)
            c.Set([]byte(firstKey), []byte("foo"))
            // If these next two lines don't run, everything works great.
            snap1 := s.Snapshot()
            c.Set([]byte(secondKey), []byte("bar"))
            v, err := c.Get([]byte(lookup))
            fmt.Println(v, err)

    Prints: [] missing item after in GetItem()

    If the triggering conditions aren't met, if I remove the two snap1 lines, or if I don't close the snapshot, everything works as expected. It looks like there is a corruption of some sort caused by creating and closing snapshots.

    I'll take a look deeper tomorrow; just reporting this since now I have a repro.

    opened by danjacques 0
Steve Yen
Steve Yen
moss - a simple, fast, ordered, persistable, key-val storage library for golang

moss provides a simple, fast, persistable, ordered key-val collection implementation as a 100% golang library.

null 866 May 20, 2022
Golang-key-value-store - Key Value Store API Service with Go DDD Architecture

This document specifies the tools used in the Key-Value store and reorganizes how to use them. In this created service, In-Memory Key-Value Service was created and how to use the API is specified in the HTML file in the folder named "doc"

Kemal Emre Ballı 3 Mar 16, 2022
A key-value db api with multiple storage engines and key generation

Jet is a deadly-simple key-value api. The main goals of this project are : Making a simple KV tool for our other projects. Learn tests writing and git

null 12 Apr 5, 2022
Distributed cache and in-memory key/value data store. It can be used both as an embedded Go library and as a language-independent service.

Olric Distributed cache and in-memory key/value data store. It can be used both as an embedded Go library and as a language-independent service. With

Burak Sezer 2.2k May 11, 2022
Keyval - A simple key-value storage library written in Go

keyval keyval is a simple key-value storage library written in Go and its back c

sina safari 3 Mar 1, 2022
An in-memory key:value store/cache (similar to Memcached) library for Go, suitable for single-machine applications.

go-cache go-cache is an in-memory key:value store/cache similar to memcached that is suitable for applications running on a single machine. Its major

Patrick Mylund Nielsen 6.1k May 14, 2022
yakv is a simple, in-memory, concurrency-safe key-value store for hobbyists.

yakv (yak-v. (originally intended to be "yet-another-key-value store")) is a simple, in-memory, concurrency-safe key-value store for hobbyists. yakv provides persistence by appending transactions to a transaction log and restoring data from the transaction log on startup.

Aadhav Vignesh 5 Feb 24, 2022
ShockV is a simple key-value store with RESTful API

ShockV is a simple key-value store based on badgerDB with RESTful API. It's best suited for experimental project which you need a lightweight data store.

delihiros 2 Sep 26, 2021
Simple in memory key-value store.

Simple in memory key-value store. Development This project is written in Go. Make sure you have Go installed (download). Version 1.17 or higher is req

Mustafa Navruz 0 Nov 6, 2021
A simple in-memory key-value store application

vtec vtec, is a simple in-memory key-value store application. vtec provides persistence by appending transactions to a json file and restoring data fr

Ahmet Tek 2 Nov 18, 2021
NutsDB a simple, fast, embeddable and persistent key/value store written in pure Go.

A simple, fast, embeddable, persistent key/value store written in pure Go. It supports fully serializable transactions and many data structures such as list, set, sorted set.

徐佳军 2.1k May 12, 2022
Simple Distributed key-value database (in-memory/disk) written with Golang.

Kallbaz DB Simple Distributed key-value store (in-memory/disk) written with Golang. Installation go get Usage API // Get

Mohamed Samir 5 Jan 18, 2022
Build a simple decomposed Key-Value store by implementing two services which communicate over gRPC.

Build a simple decomposed Key-Value store by implementing two services which communicate over gRPC.

Robert Otting 0 Feb 13, 2022
kvStore is a simple key/value in-memory store

kvStore is a simple key/value in-memory store. It is designed for the API. kvStore keeps records at /tmp/kvStore/dbName.db. You can specify server port, dbName and, file save interval in your RunServer(Addr, dbName) call.

Buğra Mengi 2 Feb 24, 2022
An embedded key/value database for Go.

Bolt Bolt is a pure Go key/value store inspired by Howard Chu's LMDB project. The goal of the project is to provide a simple, fast, and reliable datab

BoltDB 12.8k May 15, 2022
A disk-backed key-value store.

What is diskv? Diskv (disk-vee) is a simple, persistent key-value store written in the Go language. It starts with an incredibly simple API for storin

Peter Bourgon 1.2k May 11, 2022
Distributed reliable key-value store for the most critical data of a distributed system

etcd Note: The master branch may be in an unstable or even broken state during development. Please use releases instead of the master branch in order

etcd-io 39.9k May 14, 2022
LevelDB key/value database in Go.

This is an implementation of the LevelDB key/value database in the Go programming language. Installation go get R

Suryandaru Triandana 5.1k May 18, 2022
Distributed, fault-tolerant key-value storage written in go.

A simple, distributed, fault-tolerant key-value storage inspired by Redis. It uses Raft protocotol as consensus algorithm. It supports the following data structures: String, Bitmap, Map, List.

Igor German 359 Mar 4, 2022