moss - a simple, fast, ordered, persistable, key-val storage library for golang

moss

moss provides a simple, fast, persistable, ordered key-val collection implementation as a 100% golang library.

moss stands for "memory-oriented sorted segments".

Features

  • ordered key-val collection API
  • 100% go implementation
  • key range iterators
  • snapshots provide for isolated reads
  • atomic mutations via a batch API
  • merge operations allow for read-compute-write optimizations for write-heavy use cases (e.g., updating counters); see the sketch after this list
  • concurrent readers and writers don't block each other
  • child collections allow multiple related collections to be atomically grouped
  • optional, advanced APIs to avoid extra memory copying
  • optional lower-level storage implementation, called "mossStore", that uses an append-only design for writes and mmap() for reads, with configurable compaction policy; see: OpenStoreCollection()
  • mossStore supports navigating back through previous commit points in read-only fashion, and supports reverting to previous commit points.
  • optional persistence hooks to allow write-back caching to a lower-level storage implementation that advanced users may wish to provide (e.g., you can hook moss up to leveldb, sqlite, etc)
  • event callbacks allow the monitoring of asynchronous tasks
  • unit tests
  • fuzz tests via go-fuzz & smat (github.com/mschoch/smat); see README-smat.md
  • mossStore's diagnostic tool: mossScope
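
To make the merge-operation feature concrete, here is a minimal sketch of a counter-increment MergeOperator. It assumes the MergeOperator interface (Name/FullMerge/PartialMerge), the CollectionOptions.MergeOperator field, and Batch.Merge() behave as described in the moss godoc; treat it as illustrative rather than canonical.

package main

import (
	"encoding/binary"

	"github.com/couchbase/moss"
)

// counterMergeOperator interprets values and merge operands as
// little-endian uint64 counters and adds them together.
type counterMergeOperator struct{}

func (m *counterMergeOperator) Name() string { return "counter-add" }

// FullMerge folds all pending operands onto the existing value.
func (m *counterMergeOperator) FullMerge(key, existingValue []byte,
	operands [][]byte) ([]byte, bool) {
	var total uint64
	if len(existingValue) == 8 {
		total = binary.LittleEndian.Uint64(existingValue)
	}
	for _, op := range operands {
		if len(op) == 8 {
			total += binary.LittleEndian.Uint64(op)
		}
	}
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint64(buf, total)
	return buf, true
}

// PartialMerge collapses two operands into one, deferring to FullMerge
// if the operands are not the expected size.
func (m *counterMergeOperator) PartialMerge(key, left, right []byte) ([]byte, bool) {
	if len(left) != 8 || len(right) != 8 {
		return nil, false
	}
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint64(buf,
		binary.LittleEndian.Uint64(left)+binary.LittleEndian.Uint64(right))
	return buf, true
}

func main() {
	c, _ := moss.NewCollection(moss.CollectionOptions{
		MergeOperator: &counterMergeOperator{},
	})
	c.Start()
	defer c.Close()

	one := make([]byte, 8)
	binary.LittleEndian.PutUint64(one, 1)

	b, _ := c.NewBatch(0, 0)
	b.Merge([]byte("page-hits"), one) // increment without a read-modify-write round trip
	c.ExecuteBatch(b, moss.WriteOptions{})
	b.Close()
}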

License

Apache 2.0

Example

import "github.com/couchbase/moss"

c, err := moss.NewCollection(moss.CollectionOptions{})
c.Start()
defer c.Close()

batch, err := c.NewBatch(0, 0)
defer batch.Close()

batch.Set([]byte("car-0"), []byte("tesla"))
batch.Set([]byte("car-1"), []byte("honda"))

err = c.ExecuteBatch(batch, moss.WriteOptions{})

ss, err := c.Snapshot()
defer ss.Close()

ropts := moss.ReadOptions{}

val0, err := ss.Get([]byte("car-0"), ropts) // val0 == []byte("tesla").
valX, err := ss.Get([]byte("car-not-there"), ropts) // valX == nil.

// A Get can also be issued directly against the collection
val1, err := c.Get([]byte("car-1"), ropts) // val1 == []byte("honda").
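
The key range iterators mentioned in the features can also be driven from a Snapshot. The following sketch continues from the example above and assumes Snapshot.StartIterator() and moss.ErrIteratorDone behave as in the godoc, with fmt imported; most error handling is elided.

// Iterate the whole key range of the snapshot from the example above.
iter, err := ss.StartIterator(nil, nil, moss.IteratorOptions{})
if err != nil {
    // handle the error
}
defer iter.Close()

for {
    k, v, err := iter.Current()
    if err == moss.ErrIteratorDone {
        break
    }
    fmt.Printf("%s => %s\n", k, v) // e.g., car-0 => tesla

    if err = iter.Next(); err == moss.ErrIteratorDone {
        break
    }
}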

For persistence, you can use...

store, collection, err := moss.OpenStoreCollection(directoryPath,
    moss.StoreOptions{}, moss.StorePersistOptions{})
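
Continuing from that call, a fuller persistence round-trip might look like the sketch below; error handling is elided, and the close ordering (collection before store) is an assumption rather than a documented requirement.

batch, err := collection.NewBatch(0, 0)
batch.Set([]byte("car-0"), []byte("tesla"))
err = collection.ExecuteBatch(batch, moss.WriteOptions{})
batch.Close()

// Close the collection, then the store; reopening the same directoryPath
// later with OpenStoreCollection() recovers the persisted data.
collection.Close()
store.Close()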

Design

The design is similar to a (much) simplified LSM tree, with a stack of sorted, immutable key-val arrays or "segments".

To incorporate the next Batch of key-val mutations, the incoming key-val entries are first sorted into an immutable "segment", which is then atomically pushed onto the top of the stack of segments.

For readers, a higher segment in the stack will shadow entries of the same key from lower segments.

Separately, an asynchronous goroutine (the "merger") will continuously merge N sorted segments to keep stack height low.

In the best case, a remaining, single, large sorted segment will be efficient in memory usage and efficient for binary search and range iteration.

Iterations when the stack height is > 1 are implemented using an N-way heap merge.

In this design, the stack of segments is treated as immutable: whenever the stack needs to be "modified", a copy-on-write approach is used. So, multiple readers and writers won't block each other, and taking a Snapshot is similarly cheap, since it just clones the stack.
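
As a rough illustration of that copy-on-write idea, here is a simplified sketch of the concept; the types and names below are invented for illustration and are not moss's actual implementation.

// Package sketch illustrates the copy-on-write segment stack concept;
// it is a simplification, not moss's real code.
package sketch

import "sync"

type segment struct {
	keys, vals [][]byte // sorted by key, immutable once built
}

type segmentStack struct {
	segments []*segment // index 0 = oldest, last = newest
}

type collection struct {
	mu    sync.Mutex
	stack *segmentStack // a published stack is never mutated in place
}

// snapshot hands back the current stack; readers can use it lock-free
// afterwards because published segmentStacks are immutable.
func (c *collection) snapshot() *segmentStack {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.stack
}

// pushSegment "modifies" the stack by building a new slice with the new
// segment on top and publishing it under the lock; existing snapshots
// keep referring to the old stack until they are released.
func (c *collection) pushSegment(s *segment) {
	c.mu.Lock()
	defer c.mu.Unlock()
	segs := append(append([]*segment(nil), c.stack.segments...), s)
	c.stack = &segmentStack{segments: segs}
}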

See also the DESIGN.md writeup.

Limitations and considerations

NOTE: Keys in a Batch must be unique. That is, myBatch.Set("x", "foo"); myBatch.Set("x", "bar") is not supported. Applications that do not naturally meet this requirement might maintain their own map[key]val data structures to enforce this uniqueness constraint, as sketched below.
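
For example, a write path might collapse duplicate keys app-side before building the Batch. This is only a sketch that reuses the started collection c from the example above; the map and the last-write-wins policy are choices left to the application.

// Collapse duplicate keys in an ordinary map first (last write wins),
// then emit at most one Set per key into the moss Batch.
pending := map[string][]byte{}
pending["x"] = []byte("foo")
pending["x"] = []byte("bar") // overwrites "foo" app-side, never reaching the Batch

batch, err := c.NewBatch(0, 0)
defer batch.Close()

for k, v := range pending {
    batch.Set([]byte(k), v) // each key appears exactly once in the Batch
}

err = c.ExecuteBatch(batch, moss.WriteOptions{})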

Max key length is 2^24 bytes (24 bits are used to track key length).

Max val length is 2^28 bytes (28 bits are used to track val length).

Metadata overhead for each key-val operation is 16 bytes.

Read performance characterization is roughly O(log N) for key-val retrieval.

Write performance characterization is roughly O(M log M), where M is the number of mutations in a batch when invoking ExecuteBatch().

Those performance characterizations, however, don't account for background, asynchronous processing for the merging of segments and data structure maintenance.

A background merger task, for example, that is too slow can eventually stall ingest of new batches. (See the CollectionOptions settings that limit segment stack height.)

As another example, one slow reader that holds onto a Snapshot or an Iterator for a long time can hold onto a lot of resources. In the worst case, the reader's Snapshot or Iterator may delay the reclamation of large, old segments whose entries have already been obsoleted by incoming mutations.

Error handling

Please note that the background goroutines of moss may run into errors, for example during optional persistence operations. To be notified of these cases, your application can provide an optional (and highly recommended) CollectionOptions.OnError callback func, which moss will invoke.
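
For example, wiring up such a callback might look like the sketch below; it assumes CollectionOptions.OnError takes a func(error) as described in the godoc and that the log package is imported.

c, err := moss.NewCollection(moss.CollectionOptions{
    OnError: func(err error) {
        // Invoked from moss's background goroutines, e.g. when an
        // asynchronous persistence operation fails.
        log.Printf("moss background error: %v", err)
    },
})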

Logging

Please see the optional CollectionOptions.Log callback func and the CollectionOptions.Debug flag.
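
For example, a sketch assuming Log has a printf-style signature, Debug is an integer verbosity level (per the godoc), and the log package is imported:

c, err := moss.NewCollection(moss.CollectionOptions{
    Debug: 2, // assumption: higher values produce more verbose output
    Log: func(format string, a ...interface{}) {
        log.Printf(format, a...)
    },
})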

Performance

Please try go test -bench=. for some basic performance tests.

Each performance test will emit output that generally looks like...

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
spec: {numItems:1000000 keySize:20 valSize:100 batchSize:100 randomLoad:false noCopyValue:false accesses:[]}
     open || time:     0 (ms) |        0 wop/s |        0 wkb/s |        0 rop/s |        0 rkb/s || cumulative:        0 wop/s |        0 wkb/s |        0 rop/s |        0 rkb/s
     load || time:   840 (ms) |  1190476 wop/s |   139508 wkb/s |        0 rop/s |        0 rkb/s || cumulative:  1190476 wop/s |   139508 wkb/s |        0 rop/s |        0 rkb/s
    drain || time:   609 (ms) |        0 wop/s |        0 wkb/s |        0 rop/s |        0 rkb/s || cumulative:   690131 wop/s |    80874 wkb/s |        0 rop/s |        0 rkb/s
    close || time:     0 (ms) |        0 wop/s |        0 wkb/s |        0 rop/s |        0 rkb/s || cumulative:   690131 wop/s |    80874 wkb/s |        0 rop/s |        0 rkb/s
   reopen || time:     0 (ms) |        0 wop/s |        0 wkb/s |        0 rop/s |        0 rkb/s || cumulative:   690131 wop/s |    80874 wkb/s |        0 rop/s |        0 rkb/s
     iter || time:    81 (ms) |        0 wop/s |        0 wkb/s | 12344456 rop/s |  1446616 rkb/s || cumulative:   690131 wop/s |    80874 wkb/s | 12344456 rop/s |  1446616 rkb/s
    close || time:     2 (ms) |        0 wop/s |        0 wkb/s |        0 rop/s |        0 rkb/s || cumulative:   690131 wop/s |    80874 wkb/s | 12344456 rop/s |  1446616 rkb/s
total time: 1532 (ms)
file size: 135 (MB), amplification: 1.133
BenchmarkStore_numItems1M_keySize20_valSize100_batchSize100-8

There are various phases in each test...

  • open - opening a brand new moss storage instance
  • load - time to load N sequential keys
  • drain - additional time after load for persistence to complete
  • close - time to close the moss storage instance
  • reopen - time to reopen the moss storage instance (OS/filesystem caches are still warm)
  • iter - time to sequentially iterate through key-val items
  • access - time to perform various access patterns, like random or sequential reads and writes

The file size measurement is after final compaction, with amplification as a naive calculation to compare overhead against raw key-val size.
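
As a sanity check on the run above (assuming "naive" here means file size divided by raw key-val bytes): raw size ≈ numItems × (keySize + valSize) = 1,000,000 × (20 + 100) bytes = 120 MB, and 135 MB / 120 MB ≈ 1.13, which is in line with the reported amplification of 1.133.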

Contributing changes

Please see the CONTRIBUTING.md document.

Issues
  • Correction to the example in readme

    The CollectionOptions{} needed prefixing with moss., and c.NewBatch() returns a Batch and an error, so the result can't be re-assigned to c (of type Collection).

    opened by chilts 7
  • Panic on Get

    I was trying to use moss as a backend store for Dgraph (https://github.com/dgraph-io/dgraph/tree/try/moss), but ran into this issue:

    panic: runtime error: slice bounds out of range
    
    goroutine 348 [running]:
    github.com/couchbase/moss.(*segment).FindStartKeyInclusivePos(0xc4201ee000, 0xc42a1cb7e0, 0x16, 0x16, 0x16)
    	/home/ashwin/go/src/github.com/couchbase/moss/segment.go:313 +0x19b
    github.com/couchbase/moss.(*segmentStack).get(0xc4201cac30, 0xc42a1cb7e0, 0x16, 0x16, 0x1e, 0x0, 0x7f7f00, 0xc42006efc0, 0x1f, 0x2a, ...)
    	/home/ashwin/go/src/github.com/couchbase/moss/segment_stack.go:90 +0x26b
    github.com/couchbase/moss.(*segmentStack).Get(0xc4201cac30, 0xc42a1cb7e0, 0x16, 0x16, 0xc4201cac00, 0x6, 0x6, 0x6, 0x0, 0x6)
    	/home/ashwin/go/src/github.com/couchbase/moss/segment_stack.go:74 +0x75
    github.com/couchbase/moss.(*Footer).Get(0xc42006efc0, 0xc42a1cb7e0, 0x16, 0x16, 0x465600, 0x1, 0x6, 0xc424e85910, 0x465182, 0xfc7740)
    	/home/ashwin/go/src/github.com/couchbase/moss/store_footer.go:426 +0x8a
    github.com/couchbase/moss.(*snapshotWrapper).Get(0xc420192fc0, 0xc42a1cb7e0, 0x16, 0x16, 0x0, 0x0, 0xc4200928f0, 0x6, 0x0, 0xc424e85948)
    	/home/ashwin/go/src/github.com/couchbase/moss/wrap.go:94 +0x62
    github.com/couchbase/moss.(*segmentStack).get(0xc42a1f0d20, 0xc42a1cb7e0, 0x16, 0x16, 0xffffffffffffffff, 0x0, 0xc424e85a00, 0x1, 0x1, 0x6, ...)
    	/home/ashwin/go/src/github.com/couchbase/moss/segment_stack.go:110 +0xa2
    github.com/couchbase/moss.(*segmentStack).Get(0xc42a1f0d20, 0xc42a1cb7e0, 0x16, 0x16, 0x0, 0x0, 0x414ea2, 0xc42a1f0cd0, 0x50, 0x48)
    	/home/ashwin/go/src/github.com/couchbase/moss/segment_stack.go:74 +0x75
    github.com/dgraph-io/dgraph/posting.(*List).getPostingList(0xc42a1e9e00, 0x0, 0x0)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/posting/list.go:190 +0x1ef
    github.com/dgraph-io/dgraph/posting.(*List).updateMutationLayer(0xc42a1e9e00, 0xc42a1f0cd0, 0x0)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/posting/list.go:263 +0x125
    github.com/dgraph-io/dgraph/posting.(*List).addMutation(0xc42a1e9e00, 0xfdec00, 0xc425576e40, 0xc42226c660, 0x5, 0x5, 0xb090c8)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/posting/list.go:340 +0xd7
    github.com/dgraph-io/dgraph/posting.(*List).AddMutationWithIndex(0xc42a1e9e00, 0xfdec00, 0xc425576e40, 0xc42226c660, 0x0, 0x0)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/posting/index.go:171 +0x2da
    github.com/dgraph-io/dgraph/worker.runMutations(0x7f62912c2280, 0xc425576e40, 0xc4221e0000, 0x3e8, 0x400, 0x0, 0x0)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/worker/mutation.go:50 +0x21a
    github.com/dgraph-io/dgraph/worker.(*node).processMutation(0xc42008c000, 0x2, 0x114, 0x0, 0xc420f4c000, 0x92c5, 0xa000, 0x0, 0x0, 0x0, ...)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/worker/draft.go:372 +0x13e
    github.com/dgraph-io/dgraph/worker.(*node).process(0xc42008c000, 0x2, 0x114, 0x0, 0xc420f4c000, 0x92c5, 0xa000, 0x0, 0x0, 0x0, ...)
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/worker/draft.go:405 +0x36a
    created by github.com/dgraph-io/dgraph/worker.(*node).processApplyCh
    	/home/ashwin/go/src/github.com/dgraph-io/dgraph/worker/draft.go:444 +0x49b
    

    Any help would be appreciated. Thanks!

    opened by ashwin95r 5
  • avoid deadlock by starting collection

    When running the Readme's example, you immediately run into an error since Start() hasn't been called on the collection. This will help get the first example running without having to look around.

    opened by MatthiasRMS 4
  • Moss is using undefined type from ghistogram

    I was trying to use moss as a store for the bleve package, but I get the following error:

    ../../go/src/search/vendor/github.com/couchbase/moss/api.go:173:15: undefined: ghistogram.Histograms
    ./../go/src/search/vendor/github.com/couchbase/moss/collection.go:98:13: undefined: ghistogram.Histograms
    

    where ~/go/src/search is my client package that vendored bleve and moss. I see in moss/api.go:173 that it's using a struct from the ghistogram package called Histograms (plural), but this struct doesn't exist in the ghistogram package.

    Is there something I'm missing, was that a typo, or has ghistogram changed over time and become incompatible with moss's latest version?

    opened by emad-elsaid 4
  • optimization idea: copy-on-write, ref-counted segmentStack's

    Currently, creating a snapshot copies the segment stack, which means memory allocations and copying of pointers. Instead, perhaps creating a snapshot should just bump a ref-count on some existing segmentStack, which should be treated as immutable (except for the ref-count).

    Whenever a writer (such as ExecuteBatch, the background Merger, etc) wants to modify the segmentStack, it should use copy-on-write.

    The existing moss implementation actually does parts of the above anyway, so we might be close to that already.

    opened by steveyen 4
  • Buckets?

    Hi, from the description it is not obvious to me whether moss supports buckets (like bolt)?

    There is the

    child collections allow multiple related collections to be atomically grouped

    which I'm not exactly sure is something like buckets, or just a bunch of selected records manually put together?

    opened by ghost 2
  • Q: Transaction performance

    Hi, this project looks very interesting. I have a lot of transaction writes, instead of multiple writes per transaction.

    How many transaction writes of small data can moss handle per second on an average SSD? For example, with BoltDB it was only about 250, so I wonder if this project can perform better or if it is also limited by the file system.

    opened by ghost 2
  • Q on write amplification

    @mschoch I enjoyed your talk on Moss at GopherCon. At the end you pointed out a situation (I couldn't discern exactly when) where a small write led to a lot of copying/write amplification.

    I just wanted to inquire if that issue had been addressed?

    opened by glycerine 2
  • add a non-snapshot Collection.Get(key, ReaderOptions) API

    If a user just wants to lookup a single item in a collection, they have to first create a snapshot, then snapshot.Get(), then snapshot.Close().

    One issue is that creating a snapshot means memory allocations (for the snapshot instance and a copy of the segment stack).

    A "onesie" API to just look up a single item, if it can be implemented efficiently (without undue memory allocations and without having to hold a lock for a long time), would be more convenient for folks to grok.

    (See also https://github.com/couchbase/moss/issues/14)

    opened by steveyen 2
  • support for multiple collections

    Users can currently fake out multiple collections by explicitly adding a short collection name prefix to each of their keys. However, such a trick is suboptimal as it repeats a prefix for every key-val item.

    Instead, the proposal is to support multiple collections natively in moss by introducing a few additional methods to the API, so that we remain backwards compatible for current moss users.

    The idea is that the current Collection in moss now logically becomes a "top-most" collection of an optional hierarchy of child collections.

    To the Batch interface, the proposal is to add the methods...

    NewChildCollectionBatch(childCollectionName string, hints) (Batch, error)
    
    DelChildCollection(childCollectionName string) error
    

    When a batch is executed, the entire hierarchy of a top-level batch and its batches for any child collections will be committed atomically.

    Removing a child collection takes precedence over adding more child collection mutations.

    To the Snapshot interface, the proposal is to add the methods...

    ChildCollectionNames() ([]string, error)
    
    ChildCollectionSnapshot(childCollectionName string) (Snapshot, error)
    

    And, that's it.

    The proposed API allows for deeply nested child collections of child collections, but the initial implementation might just return an error if the user tries to have deep nesting.

    enhancement 
    opened by steveyen 2
  • Iterator only returning one record

    I'm using this code to store and retrieve records from moss.

    But when I call GetStoredSQSMessages() it only seems to return the last entry, as opposed to all the entries.

    If I run strings data-0000000000000001.moss I can see all the records I'm expecting, so I know they're somewhere in the moss file, but I just can't get at them with the iterator.

    Can you take a look at my GetStoredSQSMessages method and see if I'm doing anything wrong?

    If nothing is obvious, should I try repro'ing this in a unit test? I'm storing the moss in a docker volume mount, so it's possible I'm doing something funny (but like I said I can see all the records with strings, so it seems to be an iterator problem)

    opened by tleyden 2
  • graphplot deprecated?

    PS is there no go alternative for this?

    go test -run=TestMossDGM -outputToFile
    03:12:21 OpsSet 412470 numKeysRead 1811156 dbSize 37mb 75mb Memory 0mb MMap 0mb
    03:12:22 OpsSet 524287 numKeysRead 1356666 dbSize 85mb 173mb Memory 0mb MMap 0mb
    03:12:23 OpsSet 403496 numKeysRead 1131032 dbSize 122mb 300mb Memory 0mb MMap 0mb
    03:12:24 OpsSet 149049 numKeysRead 1677611 dbSize 136mb 191mb Memory 0mb MMap 0mb
    03:12:25 OpsSet 582009 numKeysRead 1313250 dbSize 189mb 293mb Memory 0mb MMap 0mb
    03:12:26 OpsSet 169398 numKeysRead 994228 dbSize 205mb 412mb Memory 0mb MMap 0mb
    03:12:27 OpsSet 619084 numKeysRead 1458883 dbSize 261mb 546mb Memory 0mb MMap 0mb
    03:12:28 OpsSet 584209 numKeysRead 1328463 dbSize 315mb 644mb Memory 0mb MMap 0mb
    03:12:29 OpsSet 597415 numKeysRead 1085736 dbSize 370mb 753mb Memory 0mb MMap 0mb
    Workers Stop...
    03:12:30 OpsSet 99802 numKeysRead 803970 dbSize 379mb 903mb Memory 0mb MMap 0mb
    03:12:30 - Closing Collections...Done 1.419
    03:12:37 - Closing Collections...Done 1.009
    PASS
    ok      _/Users/gert/Desktop/moss       17.650s
    
    python -m pip install --upgrade pandas
    
    python graph/dgm-moss-plot.py Results_024923_48983.json 
    Traceback (most recent call last):
      File "graph/dgm-moss-plot.py", line 309, in <module>
        main()
      File "graph/dgm-moss-plot.py", line 92, in main
        resultsFile['memused'] = (resultsFile['memtotal'] - (resultsFile['memfree']))/1024/1024
      File "/usr/local/lib/python2.7/site-packages/pandas/core/frame.py", line 2927, in __getitem__
        indexer = self.columns.get_loc(key)
      File "/usr/local/lib/python2.7/site-packages/pandas/core/indexes/base.py", line 2659, in get_loc
        return self._engine.get_loc(self._maybe_cast_indexer(key))
      File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
      File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
      File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
      File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
    KeyError: 'memtotal'
    
    cat Results_024923_48983.json 
    
    {"cfg_CompactionBufferPages":512,"cfg_CompactionLevelMaxSegments":9,"cfg_CompactionLevelMultiplier":3,"cfg_CompactionPercentage":0.65,"dbPath":"moss-test-data","diskMonitor":"sdc","keyLength":48,"keyOrder":"Random","memQuota":4294967296,"ncpus":8,"numReaders":1,"numWriters":1,"readBatchSize":100,"readBatchThinkTime":0,"runDescription":"-","runTime":10,"sampleFrequency":1000000000,"valueLength":48,"writeBatchSize":10000,"writeBatchThinkTime":0}
    {"intervaltime":"02:49:24","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1864177,"numKeysStart":0,"numKeysWrite":607959,"numReadBatches":39005,"numWriteBatches":60,"num_bytes_used_disk":73283936,"num_files":1,"num_segments":3,"processMem":0,"totalKeyBytes":17668224,"totalOpsDel":0,"totalOpsSet":368088,"totalValBytes":17668224,"total_compactions":1,"total_persists":19}
    {"intervaltime":"02:49:25","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1678772,"numKeysStart":0,"numKeysWrite":601788,"numReadBatches":34453,"numWriteBatches":60,"num_bytes_used_disk":168420704,"num_files":1,"num_segments":7,"processMem":0,"totalKeyBytes":42663744,"totalOpsDel":0,"totalOpsSet":520740,"totalValBytes":42663744,"total_compactions":0,"total_persists":12}
    {"intervaltime":"02:49:26","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1272640,"numKeysStart":0,"numKeysWrite":612793,"numReadBatches":26140,"numWriteBatches":62,"num_bytes_used_disk":291830838,"num_files":2,"num_segments":18,"processMem":0,"totalKeyBytes":66042192,"totalOpsDel":0,"totalOpsSet":487051,"totalValBytes":66042192,"total_compactions":0,"total_persists":19}
    {"intervaltime":"02:49:27","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1765483,"numKeysStart":0,"numKeysWrite":576301,"numReadBatches":36190,"numWriteBatches":57,"num_bytes_used_disk":174847246,"num_files":1,"num_segments":6,"processMem":0,"totalKeyBytes":69736416,"totalOpsDel":0,"totalOpsSet":76963,"totalValBytes":69736416,"total_compactions":1,"total_persists":5}
    {"intervaltime":"02:49:28","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1411938,"numKeysStart":0,"numKeysWrite":608265,"numReadBatches":28730,"numWriteBatches":61,"num_bytes_used_disk":279874336,"num_files":1,"num_segments":16,"processMem":0,"totalKeyBytes":95461344,"totalOpsDel":0,"totalOpsSet":535936,"totalValBytes":95461344,"total_compactions":0,"total_persists":18}
    {"intervaltime":"02:49:29","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1191213,"numKeysStart":0,"numKeysWrite":625228,"numReadBatches":24315,"numWriteBatches":63,"num_bytes_used_disk":382459295,"num_files":1,"num_segments":20,"processMem":0,"totalKeyBytes":128764464,"totalOpsDel":0,"totalOpsSet":693815,"totalValBytes":128764464,"total_compactions":0,"total_persists":20}
    {"intervaltime":"02:49:30","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":990468,"numKeysStart":0,"numKeysWrite":577997,"numReadBatches":20153,"numWriteBatches":58,"num_bytes_used_disk":578373280,"num_files":1,"num_segments":3,"processMem":0,"totalKeyBytes":115968048,"totalOpsDel":0,"totalOpsSet":0,"totalValBytes":115968048,"total_compactions":0,"total_persists":0}
    {"intervaltime":"02:49:31","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1462143,"numKeysStart":0,"numKeysWrite":605731,"numReadBatches":29763,"numWriteBatches":60,"num_bytes_used_disk":686256128,"num_files":1,"num_segments":16,"processMem":0,"totalKeyBytes":167349888,"totalOpsDel":0,"totalOpsSet":1070455,"totalValBytes":167349888,"total_compactions":0,"total_persists":21}
    {"intervaltime":"02:49:32","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":1152540,"numKeysStart":0,"numKeysWrite":610852,"numReadBatches":23471,"numWriteBatches":61,"num_bytes_used_disk":795824128,"num_files":1,"num_segments":20,"processMem":0,"totalKeyBytes":197996448,"totalOpsDel":0,"totalOpsSet":638470,"totalValBytes":197996448,"total_compactions":0,"total_persists":20}
    {"intervaltime":"02:49:33","mapped":0,"mhBlockDuration":0,"mhBlocks":0,"numKeysRead":864135,"numKeysStart":0,"numKeysWrite":552955,"numReadBatches":17685,"numWriteBatches":56,"num_bytes_used_disk":938213376,"num_files":1,"num_segments":22,"processMem":0,"totalKeyBytes":213553104,"totalOpsDel":0,"totalOpsSet":324097,"totalValBytes":213553104,"total_compactions":0,"total_persists":10}
    {"tot_mapped":0,"tot_mhBlockDuration":0,"tot_mhBlocks":0,"tot_numKeysRead":13653509,"tot_numKeysStart":0,"tot_numKeysWrite":5980000,"tot_numReadBatches":279905,"tot_numWriteBatches":598,"tot_num_bytes_used_disk":1030427776,"tot_num_files":1,"tot_num_segments":6,"tot_processMem":0,"tot_totalKeyBytes":216281328,"tot_totalOpsDel":0,"tot_totalOpsSet":4505861,"tot_totalValBytes":216281328,"tot_total_compactions":2,"tot_total_persists":144}
    
    opened by gertcuykens 0
  • StoreOptions parameter needed for both OpenStore and OpenCollection?

    Hi, for me it's not clear why both OpenStore and OpenCollection need a StoreOptions parameter. Is there a case for using different StoreOptions for OpenStore and store.OpenCollection? If they need to be the same, isn't it safer to only set StoreOptions on the OpenStore call, so no bugs can occur when StoreOptions get changed between the two calls?

    func OpenStoreCollection(dir string, options StoreOptions,
    	persistOptions StorePersistOptions) (*Store, Collection, error) {
    	store, err := OpenStore(dir, options)
    	if err != nil {
    		return nil, nil, err
    	}
    
    	coll, err := store.OpenCollection(options, persistOptions)
    	if err != nil {
    		store.Close()
    		return nil, nil, err
    	}
    
    	return store, coll, nil
    }
    
    opened by gertcuykens 0
  • Moss panic with runaway disk usage

    Hey everyone, I'm seeing the following panic, which looks like it has run out of memory. However, I've attached a graph from Grafana which shows the disk usage by the Moss index (it's from a file walker, so it could have a race condition if Moss is generating lots of files while it is walking).

    fatal error: runtime: cannot allocate memory
    
    goroutine 103 [running]:
    runtime.systemstack_switch()
    	stdlib%/src/runtime/asm_amd64.s:311 fp=0xc000b75970 sp=0xc000b75968 pc=0x459c90
    runtime.persistentalloc(0xd0, 0x0, 0x27ad2b0, 0x7c4eac)
    	GOROOT/src/runtime/malloc.go:1142 +0x82 fp=0xc000b759b8 sp=0xc000b75970 pc=0x40c932
    runtime.newBucket(0x1, 0x4, 0x425f76)
    	GOROOT/src/runtime/mprof.go:173 +0x5e fp=0xc000b759f0 sp=0xc000b759b8 pc=0x42573e
    runtime.stkbucket(0x1, 0x33a000, 0xc000b75a98, 0x4, 0x20, 0xc000b75a01, 0x7f08c8658138)
    	GOROOT/src/runtime/mprof.go:240 +0x1aa fp=0xc000b75a50 sp=0xc000b759f0 pc=0x425a3a
    runtime.mProf_Malloc(0xc01298a000, 0x33a000)
    	GOROOT/src/runtime/mprof.go:344 +0xd6 fp=0xc000b75bc8 sp=0xc000b75a50 pc=0x425fd6
    runtime.profilealloc(0xc0026e8000, 0xc01298a000, 0x33a000)
    	GOROOT/src/runtime/malloc.go:1058 +0x4b fp=0xc000b75be8 sp=0xc000b75bc8 pc=0x40c6cb
    runtime.mallocgc(0x33a000, 0x14f3080, 0x1, 0xc008fb0000)
    	GOROOT/src/runtime/malloc.go:983 +0x46c fp=0xc000b75c88 sp=0xc000b75be8 pc=0x40bdac
    runtime.makeslice(0x14f3080, 0x0, 0x338b32, 0xc008fb0000, 0x0, 0x17ec0)
    	GOROOT/src/runtime/slice.go:70 +0x77 fp=0xc000b75cb8 sp=0xc000b75c88 pc=0x442c17
    vendor/github.com/couchbase/moss.newSegment(...)
    	vendor/github.com/couchbase/moss/segment.go:158
    vendor/github.com/couchbase/moss.(*segmentStack).merge(0xc005eaf180, 0xc000b75e01, 0xc007dec910, 0xc002d71a90, 0xc00004bc90, 0x10, 0xc00004bcb)
    	vendor/github.com/couchbase/moss/segment_stack_merge.go:73 +0x1bb fp=0xc000b75e48 sp=0xc000b75cb8 pc=0xcb199b
    vendor/github.com/couchbase/moss.(*collection).mergerMain(0xc0004b00c0, 0xc005eaf180, 0xc007dec910, 0x1, 0xc005eaf180)
    	vendor/github.com/couchbase/moss/collection_merger.go:248 +0x306 fp=0xc000b75ef0 sp=0xc000b75e48 pc=0xca6946
    vendor/github.com/couchbase/moss.(*collection).runMerger(0xc0004b00c0)
    	vendor/github.com/couchbase/moss/collection_merger.go:126 +0x2d0 fp=0xc000b75fd8 sp=0xc000b75ef0 pc=0xca5e30
    runtime.goexit()
    	stdlib%/src/runtime/asm_amd64.s:1333 +0x1 fp=0xc000b75fe0 sp=0xc000b75fd8 pc=0x45bbf1
    created by vendor/github.com/couchbase/moss.(*collection).Start
    	vendor/github.com/couchbase/moss/collection.go:118 +0x62
    
    [attached screenshot: disk usage graph, 2018-10-30 11:07 PM]

    The disk usage grows for about 6 minutes and then implodes, at which point I assume the disk is completely full. The green line after the blip is our service restarting and our indexes being rebuilt.

    opened by connorgorman 4
  • Feature Request: Add ability to serialize and deserialize Batches and Collections

    I am working on a distributed cache that spreads Moss Collections across a cluster of nodes and while I have it working for basic Get, Set, and Delete operations, without the ability to serialize Batches, there isn't a really good way to replicate Batch operations. One solution would be to create my own batch implementation that can be serialized then "replay" the batch on the receiving node to create a moss.Batch, but it would be more convenient if a Batch could just be serialized directly and then deserialized on the receiving end.

    Similarly, I am using Raft for my replication and it would be nice if I could serialize an entire Collection so that I can create a Raft snapshot periodically. Currently, I am just iterating through all of the KVPs in the Collection and serializing them individually with my own serialization format, but this requires me to implement compaction and what-not myself; since Moss already has its own persistence format, as well as its own compaction algorithm, it would be nice to reuse this.

    I'm willing to implement both of these myself and submit PRs, but I was wondering if you had any pointers on doing this in a way that is backwards compatible and fits the overall vision and design goals of Moss.

    opened by jonbonazza 6
  • optimization - same length keys

    If moss can somehow detect, for a batch or segment, that all the keys are exactly the same length, then one optimization might be to compress the kvs array -- we wouldn't need to store the key length.

    opened by steveyen 1