Sroar - 64-bit Roaring Bitmaps in Go

Overview

sroar: Serialized Roaring Bitmaps

This is a fork of dgraph-io/sroar, being maintained by the Outcaste team.

sroar is a re-written version of Roaring Bitmaps in Go, with the aim to have equality between in-memory representation and on-disk representation. An sroar.Bitmap does not need to be marshalled or unmarshalled, as the underlying represetation is a byte slice. Therefore, it can be written to disk, brought to memory, or shipped over the network immediately.

sroar only implements array and bitmap containers. It does NOT implement run containers, which is an optimization that RoaringBitmaps has. Despite that, it outperforms RoaringBitmaps as shown in the Benchmarks section.

The code borrows concepts and code from RoaringBitmaps.

Benchmarks

The benchmarks were run:

  • Using real data set as described in RoaringBitmaps.
  • Only on the 64-bit version of roaring bitmaps (roaring64).
  • Only on FastOr, which is the more expensive operation than And or equivalent.
  • On AMD Ryzen Threadripper 2950X 16-Core Processor.
  • Using Go benchmarks serially.

Based on the benchmarks, sroar is:

  • 6.5x faster (-85% p50) for benchmarks >1ms, uses
  • 15x (-93.5% p50) less memory for allocations >1MB.
  • 25x fewer allocations.

The benchmark command and the results are:

$ go test -bench BenchmarkRealDataFastOr --run=XXX --count=5 --benchmem

name CPU                                    old time/op    new time/op    delta
RealDataFastOr/census1881-32                 302ms ± 2%       2ms ± 3%   -99.29%  (p=0.008 n=5+5)
RealDataFastOr/wikileaks-noquotes-32        76.5ms ± 1%     0.9ms ± 1%   -98.83%  (p=0.008 n=5+5)
RealDataFastOr/wikileaks-noquotes_srt-32    34.8ms ± 5%     1.0ms ± 2%   -97.07%  (p=0.008 n=5+5)
RealDataFastOr/dimension_033-32             55.0ms ± 3%     2.7ms ± 0%   -95.16%  (p=0.016 n=5+4)
RealDataFastOr/census1881_srt-32            36.8ms ± 3%     2.9ms ± 1%   -92.13%  (p=0.008 n=5+5)
RealDataFastOr/dimension_003-32             50.4ms ± 1%    11.6ms ± 4%   -77.06%  (p=0.008 n=5+5)
RealDataFastOr/dimension_008-32             10.0ms ± 2%     3.7ms ± 2%   -62.69%  (p=0.008 n=5+5)
RealDataFastOr/weather_sept_85_srt-32       6.13ms ± 3%    2.72ms ± 2%   -55.66%  (p=0.008 n=5+5)
RealDataFastOr/census-income-32             1.70ms ± 3%    1.05ms ± 1%   -38.53%  (p=0.008 n=5+5)
RealDataFastOr/weather_sept_85-32           2.28ms ± 2%    4.07ms ± 2%   +78.52%  (p=0.008 n=5+5)

RealDataFastOr/uscensus2000-32               556µs ± 2%     791µs ± 1%   +42.17%  (p=0.008 n=5+5)
RealDataFastOr/census-income_srt-32          260µs ± 4%     986µs ± 2%  +279.09%  (p=0.008 n=5+5)

name MEM_BYTES                             old alloc/op   new alloc/op   delta
RealDataFastOr/census1881-32                 585MB ± 0%       1MB ± 0%   -99.75%  (p=0.008 n=5+5)
RealDataFastOr/wikileaks-noquotes-32        76.3MB ± 0%     0.6MB ± 0%   -99.24%  (p=0.008 n=5+5)
RealDataFastOr/wikileaks-noquotes_srt-32    22.8MB ± 0%     0.6MB ± 0%   -97.46%  (p=0.008 n=5+5)
RealDataFastOr/census1881_srt-32            15.3MB ± 0%     1.4MB ± 0%   -90.58%  (p=0.008 n=5+5)
RealDataFastOr/dimension_003-32             7.78MB ± 0%    1.44MB ± 0%   -81.49%  (p=0.008 n=5+5)
RealDataFastOr/dimension_033-32             1.10MB ± 0%    1.44MB ± 0%   +30.92%  (p=0.008 n=5+5)

RealDataFastOr/dimension_008-32              537kB ± 0%      97kB ± 0%   -81.94%  (p=0.008 n=5+5)
RealDataFastOr/census-income-32              187kB ± 0%      70kB ± 0%   -62.86%  (p=0.008 n=5+5)
RealDataFastOr/census-income_srt-32         99.1kB ± 0%    69.6kB ± 0%   -29.81%  (p=0.008 n=5+5)
RealDataFastOr/weather_sept_85_srt-32        375kB ± 0%     292kB ± 0%   -21.95%  (p=0.008 n=5+5)
RealDataFastOr/uscensus2000-32               169kB ± 0%     231kB ± 0%   +36.97%  (p=0.008 n=5+5)
RealDataFastOr/weather_sept_85-32            169kB ± 0%     292kB ± 0%   +72.93%  (p=0.008 n=5+5)

name MEM_ALLOCS                           old allocs/op  new allocs/op  delta
RealDataFastOr/census1881_srt-32             29.7k ± 0%      0.0k ± 0%   -99.91%  (p=0.008 n=5+5)
RealDataFastOr/wikileaks-noquotes_srt-32     6.06k ± 0%     0.02k ± 0%   -99.74%  (p=0.008 n=5+5)
RealDataFastOr/dimension_003-32              4.57k ± 0%     0.03k ± 2%   -99.42%  (p=0.008 n=5+5)
RealDataFastOr/dimension_033-32              4.33k ± 0%     0.03k ± 0%   -99.38%  (p=0.000 n=5+4)
RealDataFastOr/uscensus2000-32               1.75k ± 0%     0.06k ± 0%   -96.85%  (p=0.008 n=5+5)
RealDataFastOr/dimension_008-32                704 ± 0%        23 ± 3%   -96.79%  (p=0.008 n=5+5)
RealDataFastOr/census-income-32                271 ± 0%         9 ± 0%   -96.68%  (p=0.008 n=5+5)
RealDataFastOr/weather_sept_85_srt-32          248 ± 0%        14 ± 0%   -94.35%  (p=0.008 n=5+5)
RealDataFastOr/weather_sept_85-32             81.0 ± 0%      14.0 ± 0%   -82.72%  (p=0.008 n=5+5)
RealDataFastOr/census-income_srt-32           40.0 ± 0%       9.0 ± 0%   -77.50%  (p=0.008 n=5+5)
RealDataFastOr/census1881-32                 54.5k ± 0%      0.0k ± 0%      ~     (p=0.079 n=4+5)
RealDataFastOr/wikileaks-noquotes-32         39.2k ± 0%      0.0k ± 0%      ~     (p=0.079 n=4+5)
You might also like...
A Go implementation of the 64-bit xxHash algorithm (XXH64)

xxhash xxhash is a Go implementation of the 64-bit xxHash algorithm, XXH64. This is a high-quality hashing algorithm that is much faster than anything

go/golang: fast bit set Bloom filter

package implements a fast bloom filter with real 'bitset' and JSONMarshal/JSONUnmarshal to store/reload the Bloom filter.

Bit is a modern Git CLI
Bit is a modern Git CLI

bit is an experimental modernized git CLI built on top of git that provides happy defaults and other niceties: command and flag suggestions to help yo

A Free 8-Bit Sprite Generator.  Create 256 variants from a single template .PNG
A Free 8-Bit Sprite Generator. Create 256 variants from a single template .PNG

BitSprite A Free 8-Bit Sprite Generator. What? BitSprite is a program that creates variants of an image across total sprite sheet of the resultant ima

Print debugging, but a little bit nicer

testlog Print debugging, but a little bit nicer. The use case this is primarily designed for is effectively debugging problematic, flaky tests.

urlhunter is a recon tool that allows searching on URLs that are exposed via shortener services such as bit.ly and goo.gl.
urlhunter is a recon tool that allows searching on URLs that are exposed via shortener services such as bit.ly and goo.gl.

a recon tool that allows searching on URLs that are exposed via shortener services

this allows you to get the real link of bit.ly
this allows you to get the real link of bit.ly

check the real url from a url shortener (bit.ly) Also you can use it as an API example with deno const rawResponse = await fetch("https://anti-url-s

Yandex Cloud Logging output for Fluent Bit

Fluent Bit plugin for Yandex Cloud Logging Fluent Bit output for Yandex Cloud Logging. Configuration parameters Key Description group_id (optional) Lo

from others structs a bit easier

structcopier This package is copy from deepcopier of Ulule team. Due to deepcopier hasn't updated for a long time, I created this repo to fix some bug

A Go package converting a monochrome 1-bit bitmap image into a set of vector paths.
A Go package converting a monochrome 1-bit bitmap image into a set of vector paths.

go-bmppath Overview Package bmppath converts a monochrome 1-bit bitmap image into a set of vector paths. Note that this package is by no means a sophi

A modification (and a bit of simplification) of the tracerr package.

Decrr A modification (and a bit of simplification) of the tracerr package. This essentially does pretty much the same, but instead of returning anothe

this allows you to get the real link of without get tracked bit.ly
this allows you to get the real link of without get tracked bit.ly

check the real url from a url shortener (bit.ly) Also you can use it as an API example with deno const rawResponse = await fetch("https://anti-url-s

Image compression codec for 16 bit medical images

MIC - Medical Image Codec This library introduces a lossless medical image compression codec MIC for 16 bit images which provides compression ratio si

other glyph sets for high-dpi 1-bit monochrome

hd1b_other other glyph sets for high-dpi 1-bit monochrome Currently included glyph sets: Hangul (Korean): U+1100..U+11FF, U+3131..U+318E, U+AC00..U+D7

This package provides manipulations for bit-packed k-mers

kmers This package provides manipulations for bit-packed k-mers (k=32, encoded in uint64). Related projects: unik provides k-mer serialization method

A bit reader tool written in golang

A bit reader tool written in golang

RIPEMD 128/160/256/320-bit Recursive Hasher (ISO/IEC 10118-3:2004)

RMDSUM RIPEMD Recursive Hasher (ISO/IEC 10118-3:2004) Usage of rmdsum: rmdsum [-v] [-b N] [-c hash.ext] [-r] file.ext -b int Bits: 128,

TUI Flappy Bird. It‘s a lil bit jank tbh

EBIRD TUI Flappy Bird. It's a lil bit jank tbh. Build and Install Build dependen

Eightbit - A converter to create shitty 8-bit like images

eightbit A converter to create shitty 8-bit like images. Usage To install: go in

Owner
Outcaste, Inc.
Outcaste, Inc.
A skip list of arbitrary elements that can be filtered using roaring bitmaps stored in an LRU cache

Skipfilter This package provides a data structure that combines a skiplist with a roaring bitmap cache. This library was created to efficiently filter

Kevin Burns 22 Aug 4, 2022
A Go implementation of the 64-bit xxHash algorithm (XXH64)

xxhash xxhash is a Go implementation of the 64-bit xxHash algorithm, XXH64. This is a high-quality hashing algorithm that is much faster than anything

Caleb Spare 1.3k Nov 30, 2022
go/golang: fast bit set Bloom filter

package implements a fast bloom filter with real 'bitset' and JSONMarshal/JSONUnmarshal to store/reload the Bloom filter.

Andreas Briese 125 Nov 3, 2022
IntSet - Integer based Set based on a bit-vector

IntSet - Integer based Set based on a bit-vector Every integer that is stored will be converted to a bit in a word in which its located. The words are

Jakob Möller 0 Feb 2, 2022
Roaring bitmaps in Go (golang)

roaring This is a go version of the Roaring bitmap data structure. Roaring bitmaps are used by several major systems such as Apache Lucene and derivat

Roaring bitmaps: A better compressed bitset 1.8k Dec 7, 2022
A skip list of arbitrary elements that can be filtered using roaring bitmaps stored in an LRU cache

Skipfilter This package provides a data structure that combines a skiplist with a roaring bitmap cache. This library was created to efficiently filter

Kevin Burns 22 Aug 4, 2022
Go implementation of BLAKE2 (b) cryptographic hash function (optimized for 64-bit platforms).

Go implementation of BLAKE2b collision-resistant cryptographic hash function created by Jean-Philippe Aumasson, Samuel Neves, Zooko Wilcox-O'Hearn, an

Dmitry Chestnykh 90 Jul 11, 2022
Optimized bit-level Reader and Writer for Go.

bitio Package bitio provides an optimized bit-level Reader and Writer for Go. You can use Reader.ReadBits() to read arbitrary number of bits from an i

András Belicza 205 Dec 1, 2022
A little bit of magic for keeping track of the things you have to do.

Be productive. To-do lists are supposed to help you get things done. And I suppose looking through all the stuff you still have to do each time you wa

Adam Lloyd 20 Jun 1, 2022
Static bit vector structures in Go

teivah/bitvector Overview A bit vector is an array data structure that compactly stores bits. This library is based on 5 static different data structu

Teiva Harsanyi 73 Nov 6, 2022