


BanyanDB, as an observability database, aims to ingest, analyze, and store Metrics, Tracing, and Logging data. It is designed to handle observability data generated by observability platforms and APM systems such as Apache SkyWalking.



For developers who want to contribute to this project, see the Contribution Guide.


Apache 2.0 License.

  • Add streaming API and topN aggregator

    Add streaming API and topN aggregator

    This PR introduces a simple stream processing API and an implementation for TopN aggregation.


    Flow is an abstraction of the streaming process, with the following operators:

    • Source: provides the data stream for the Flow. As we've discussed before, it should be a listener consuming Measure write requests continuously. Later, we could use a global binlog/WAL.
    • Mapper: func(T) R, which transforms an element from T to R.
    • Filter: func(T) bool, which decides whether an element should be passed downstream.
    • Windows: currently only SlidingEventTimeWindows is implemented.
    • Sink: the place to write the final result, e.g. the TopN ranks, into a separate Measure storage.


    s := flow.New(tt.input).
        Filter(func(i int) bool {
            return i%2 == 0
        })

    The Filter operator allows us to filter by criteria set in TopNAggregation.


    s := flow.New(tt.input).
        Map(func(i int) int {
            return i * 2
        })

    The Mapper operator allows us to extract a field from the record and transform it for the groupBy operation.

    For simplicity, we currently do not have a separate keyBy operation (as in Apache Flink) for groupBy.


    Generally, the split of the window is related to "time", which can be either of the following concepts:

    • Event Time
    • Processing Time

    The above graph from the Flink community distinguishes these concepts. In our case, however, the only "time" we care about is EventTime, which represents the exact moment the record is produced, since we need EventTime to drive the timely flush of the aggregation results, e.g. the TopN ranks.

    Sliding windows can fulfill our requirement in the sense that we need to flush the data more frequently.

    This means the flush interval should be much smaller than the interval of the real data points. For example, in OAP the downsampling rate can be MINUTE while the flush timer is set to 25 seconds by default.

    Technically, the SlidingEventWindow is built on:

    • a PriorityQueue that maintains records which have not yet been emitted,
    • a PriorityQueueSet that maintains all registered (deduplicated) timers to be triggered later.


    With the above semantics, we can implement TopN as a window aggregation function:

        Filter(...).   // where
        Mapper(...).   // select and groupBy
        Window(NewSlidingTimeWindows(time.Minute*1, time.Second*15)).
        TopN(10, OrderBy(modelv1.SORT_DESC), ...) // TopN with parameters

    TopN is implemented with the help of a TreeMap which maps sortedKey to the collection of records.

    opened by lujiajing1126 19
  • Add elementUI, sass and sass-loader@7.3.1

    Add elementUI, sass and sass-loader@7.3.1

    • Add elementUI, sass and sass-loader@7.3.1
    • Initialize page structure
      • Add Database.vue and Structure.vue
      • delete Laws.vue
    • Add Header Component
      • Add NavMenu from ElementUI
    opened by WuChuSheng1 19
  • Add groupBy to the measure query request

    Add groupBy to the measure query request

    Add groupBy and aggregation function to the query request:

    • the query request doesn't support sub- or nested aggregation
    • the response's timestamp field is null when returning an aggregated result
    • if the request doesn't specify an aggregation function on grouping, the result order is the same as the order-by
    opened by hanahmily 12
  • Add docs

    Add docs


    I leave empty CRUD examples for the future CLI tools.

    Signed-off-by: Gao Hongtao [email protected]

    opened by hanahmily 6
  • Benchmark flatbuffers and protobuf

    Benchmark flatbuffers and protobuf

    Benchmark env

    • CPU: Intel(R) Core(TM) i5-8257U CPU @ 1.40GHz
    • Memory: 8 GB 2133 MHz LPDDR3
    • Java: JDK8u292b10
    • protoc: 3.17.3
    • protobuf-java: 3.17.2
    • flatc: 2.0.0
    • flatbuffers-java: 2.0.2

    Performance: Serialization+Java

     # JMH version: 1.32
     # VM version: JDK 1.8.0_292, OpenJDK 64-Bit Server VM, 25.292-b10
     # VM invoker: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/bin/java
     # VM options: -javaagent:/Applications/IntelliJ -Dfile.encoding=UTF-8
     # Blackhole mode: full + dont-inline hint
     # Warmup: 5 iterations, 10 s each
     # Measurement: 5 iterations, 10 s each
     # Timeout: 10 min per iteration
     # Threads: 1 thread, will synchronize iterations
     # Benchmark mode: Average time, time/op

     Benchmark                                                                  Mode  Cnt     Score    Error   Units
     WriteEntitySerializationTest.flatbuffers                                   avgt   25  3044.054 ± 34.837   ns/op
     WriteEntitySerializationTest.flatbuffers:·gc.alloc.rate                    avgt   25   911.872 ± 10.301  MB/sec
     WriteEntitySerializationTest.flatbuffers:·gc.alloc.rate.norm               avgt   25  3056.000 ±  0.001    B/op
     WriteEntitySerializationTest.flatbuffers:·gc.churn.PS_Eden_Space           avgt   25   912.394 ± 10.261  MB/sec
     WriteEntitySerializationTest.flatbuffers:·gc.churn.PS_Eden_Space.norm      avgt   25  3057.783 ± 10.009    B/op
     WriteEntitySerializationTest.flatbuffers:·gc.churn.PS_Survivor_Space       avgt   25     0.190 ±  0.018  MB/sec
     WriteEntitySerializationTest.flatbuffers:·gc.churn.PS_Survivor_Space.norm  avgt   25     0.637 ±  0.059    B/op
     WriteEntitySerializationTest.flatbuffers:·gc.count                         avgt   25  3878.000           counts
     WriteEntitySerializationTest.flatbuffers:·gc.time                          avgt   25  2168.000               ms
     WriteEntitySerializationTest.protobuf                                      avgt   25   514.010 ± 12.638   ns/op
     WriteEntitySerializationTest.protobuf:·gc.alloc.rate                       avgt   25  3833.648 ± 90.162  MB/sec
     WriteEntitySerializationTest.protobuf:·gc.alloc.rate.norm                  avgt   25  2168.000 ±  0.001    B/op
     WriteEntitySerializationTest.protobuf:·gc.churn.PS_Eden_Space              avgt   25  3835.530 ± 94.020  MB/sec
     WriteEntitySerializationTest.protobuf:·gc.churn.PS_Eden_Space.norm         avgt   25  2168.989 ±  6.134    B/op
     WriteEntitySerializationTest.protobuf:·gc.churn.PS_Survivor_Space          avgt   25     0.195 ±  0.020  MB/sec
     WriteEntitySerializationTest.protobuf:·gc.churn.PS_Survivor_Space.norm     avgt   25     0.110 ±  0.011    B/op
     WriteEntitySerializationTest.protobuf:·gc.count                            avgt   25  3629.000           counts
     WriteEntitySerializationTest.protobuf:·gc.time                             avgt   25  2227.000               ms

    Performance: Deserialization+Go

    goos: darwin
    goarch: amd64
    cpu: Intel(R) Core(TM) i5-8257U CPU @ 1.40GHz
    Benchmark_Deser_Flatbuffers-8   	100000000	       730.5 ns/op	      64 B/op	       2 allocs/op
    Benchmark_Deser_Protobuf-8      	14826262	      5044 ns/op	    1944 B/op	      49 allocs/op
    ok	153.927s


    For the same entity, as illustrated in WriteEntitySerializationTest.EntityModel, the serialized sizes are (in bytes):

    • Flatbuffers: 512
    • Protobuf: 169

    which means Protobuf is much more compact.


    From the perspective of bandwidth and write performance, protobuf is definitely a better choice.

    However, flatbuffers has better deserialization performance, in particular for partial reads. The "deserialization" step (i.e. GetRootAs***) actually does nothing; the real deserialization happens when users read from the byte buffer.


    Our conclusion is similar to what has been described in the above paper. See Fig.3,4,5.

    opened by lujiajing1126 6
  • Feat: query module

    Feat: query module


    This is a very preliminary PR for the query module. Many things have to be considered further.

    Since it will be a principal module, I want to discuss the current design/implementation ASAP to avoid improper design and to find better ideas before proceeding.

    Logical Plan

    A Logical Plan is a DAG (Directed Acyclic Graph) of Params. A Param defines the necessary parameters for query execution. The parameters are prepared while composing the logical plan in order to reduce extra cost when executing the physical plan.


    The logical plan can be plotted as a Dot graph. For example,

    digraph {
    	n2[label="ChunkIDsFetch{metadata={group=skywalking,name=trace},projection=[TraceID startTime]}"];
    	n6[label="TraceIDFetch{TraceID=aaaaaaaa,metadata={group=skywalking,name=trace},projection=[TraceID startTime]}"];
    }

    We can leverage an online Graphviz toolkit to visualize the logical plan.


    Physical Plan

    A Physical Plan contains the logical plan and the Transform(s) corresponding to each logical.Op.

    When the plan is triggered to run, a reverse topologically-sorted slice (with Futures as items) is generated from the logical plan.


    • [x] client utils for building EntityCriteria
    • [x] logical plan: Ops such as Sort, OffsetAndLimit, ChunkIDMerge, TableScan and IndexScan
    • [x] physical plan: topology sort, Transform
    • [ ] complete API to connect with Liaison (Add handlers) (Maybe next PR)

    To be discussed

    Index selection and optimization stage

    For now, I only use single-value indexes. In the current implementation, we may be able to improve index selection while generating the Logical Plan.

    Any better ideas? Traditional databases normally have an optimization stage (usually after generating the hierarchical logical plan) for index selection. How can we fit this optimization stage into our implementation?

    Sort and field orderliness

    I believe we have to impose stronger preconditions on the sort field, since it is not possible to sort on a sparse field.

    And Sort requires the fields to be arranged in strict order, i.e. we have to use a fieldIndex number to access the field to be sorted quickly. Otherwise, finding the specific field every time may cost significant resources.

    opened by lujiajing1126 6
  • Add bydbctl's examples

    Add bydbctl's examples

    The CRUD and query examples are based on bydbctl.

    @lujiajing1126 @wankai123 you could use this command line tool to query schemas and data from server.

    opened by hanahmily 5
  • Update go 1.19

    Update go 1.19

    Lint issues

    Several lint errors occurred after upgrading to Go 1.19:

    • Package comments missing
    • Migrate off the fully deprecated io/ioutil
    opened by lujiajing1126 5
  • Add measure query

    Add measure query

    This PR introduces basic measure query feature with local index scan.

    The implementation is based on,

    1. No global index for measure
    2. No limit and offset for measure

    As we've discussed, GroupBy and Aggregation will come after this PR.

    opened by lujiajing1126 5
  • Reload stream when metadata changes

    Reload stream when metadata changes

    This PR supersedes the previous PR #65 to allow metadata reloading, putting the logic mostly in the stream module instead of pursuing strongly-consistent metadata as the previous PR did.

    As a result, the stream module starts a serially-running background job to continuously reconcile the opened streams with the underlying storage.

    Please have a review with the new design @hanahmily

    More test cases will be added later. I suppose the Eventually method is necessary for these kinds of tests.

    opened by lujiajing1126 5
  • Introduce bytebuffer pool

    Introduce bytebuffer pool

    In this PR, I've introduced a very simple bytebuffer pool to optimize byte manipulation.

    A benchmark of the query path has been added; the result shows approximately 10% less allocation.

    opened by lujiajing1126 5
  • Add UI for creating and editing tagfamilies and tags. Change some UI styles.

    Add UI for creating and editing tagfamilies and tags. Change some UI styles.

    This PR mainly adds a UI for creating and editing tagfamilies and tags, and changes some UI styles.

    1. add UI for creating and editing tagfamilies and tags.

    • The original 'new resources' UI:


    • The old UI does not provide functions for adding and editing tagfamilies and tags, which is not good for the user experience. So I added a UI that provides adding and editing functions.
    • First, you can right-click the group and click 'new resources' to open the dialog box for adding 'resources':


    • Obviously, it provides add, delete, edit and batch delete functions.
    • You can click the 'Add the tagfamilies' button to enter the dialog box for adding 'tagfamilies':

    It provides add, delete, edit and batch delete functions too.

    Tips: It is worth noting that the UI is not yet connected to the backend interface, but this does not affect the original 'new resources' function. I need to know whether the newly added 'resources' interface provides the newly added tagfamily function, and what its data structure is. Maybe you can help me?

    • change some UI styles:
    • The old right click menu:


    • The new right click menu:


    • The old dialog:


    • The new dialog:


    opened by WuChuSheng1 2
  • v0.2.0(Nov 9, 2022)

    What's Changed

    • Update Go to 1.18 by @lujiajing1126 in
    • Add bydbctl and ui projects by @hanahmily in
    • Add roadmap by @hanahmily in
    • Do some chores by @hanahmily in
    • Fix several flaws of the toolchain by @hanahmily in
    • Bump axios from 0.18.1 to 0.21.2 in /ui by @dependabot in
    • Add grpc-gateway by @sacloudy in
    • Restful APIs by @hanahmily in
    • Add elementUI, sass and sass-loader@7.3.1 by @WuChuSheng1 in
    • Install golangci-lint by go install by @hanahmily in
    • Bump up skywalking-eye by @hanahmily in
    • Preliminarily improve the overall structure of the database page. by @WuChuSheng1 in
    • Add OAP traffic generator by @hanahmily in
    • api: add validation for proto message by @e1ijah1 in
    • Bump terser from 4.8.0 to 4.8.1 in /ui by @dependabot in
    • Add stress test components by @hanahmily in
    • Add some functions by @WuChuSheng1 in
    • Remove term metadata store by @hanahmily in
    • Introduce race detector and coverage profile by @hanahmily in
    • Merge primary index into lsm index by @hanahmily in
    • Update go 1.19 by @lujiajing1126 in
    • Parameterize memory size and some key improvements by @hanahmily in
    • Fix tsdb's leaked goroutines by @hanahmily in
    • Add several metrics to measure the compression ratio by @hanahmily in
    • Fix data race by @hanahmily in
    • This PR is mainly used to update the optimized UI by @WuChuSheng1 in
    • Use bluge to implement inverted index by @hanahmily in
    • Introduce full text searching to stream and measure by @hanahmily in
    • Fix data race by @hanahmily in
    • Bump up api plugins by @hanahmily in
    • Update official doc path by @hanahmily in
    • Stream and measure schema management by @WuChuSheng1 in
    • [BREAKING CHANGES] Introduce OR logical operation by @hanahmily in
    • Add stream schema cli by @sacloudy in
    • Add segment interval options by @hanahmily in
    • Introduce two key feats to tsdb by @hanahmily in
    • This pr is mainly used to upgrade the technology stack of the ui by @WuChuSheng1 in
    • Refactor Property by @hanahmily in
    • Add indexed_only flag to tag spec by @hanahmily in
    • Fix test cases with Eventually semantics by @lujiajing1126 in
    • Make import order deterministic by @hanahmily in
    • add measure and group cmd by @sacloudy in
    • Add Exist endpoints to all services by @hanahmily in
    • Try to create the group if absent by @hanahmily in
    • Fix having semantic inconsistency by @hanahmily in
    • Fixes some issues found by Java client updating by @hanahmily in
    • Add streaming API and topN aggregator by @lujiajing1126 in
    • Other data models operation by @sacloudy in
    • Fix TopN sort direction issue and TopN query issue by @lujiajing1126 in
    • This PR is mainly used to solve the bugs caused by upgrading the technology stack by @WuChuSheng1 in
    • Add more data points to flush topn results by @hanahmily in
    • Add bydbctl's examples by @hanahmily in
    • Improve building system on Windows: by @hanahmily in
    • This PR is mainly used to reconstruct the top-level menu by @WuChuSheng1 in
    • Fix panic with single entity-related condition by @hanahmily in
    • Fix flaky test cases by @hanahmily in
    • Fix bugs of tsdb by @hanahmily in
    • Add load test and inject testing flags by @hanahmily in
    • Changes for releasing v0.2.0 by @hanahmily in
    • Add to explain objects in API by @hanahmily in

    New Contributors

    • @dependabot made their first contribution in
    • @WuChuSheng1 made their first contribution in
    • @e1ijah1 made their first contribution in

    Full Changelog:

    Source code(tar.gz)
    Source code(zip)
  • v0.1.0(Jun 5, 2022)



    • BanyanD is the server of BanyanDB
      • TSDB module. It provides the primary time series database with a key-value data module.
      • Stream module. It implements the stream data model's writing.
      • Measure module. It implements the measure data model's writing.
      • Metadata module. It implements resource registering and property CRUD.
      • Query module. It handles the querying requests of stream and measure.
      • Liaison module. It's the gateway to other modules and provides access endpoints to clients.
    • gRPC based APIs
    • Document
      • API reference
      • Installation instructions
      • Basic concepts
    • Testing
      • UT
      • E2E with Java Client and OAP

    Full Changelog:

    Source code(tar.gz)
    Source code(zip)
The Apache Software Foundation