A distributed and coördination-free log management system

Overview

OK Log is archived

I hoped to find the opportunity to continue developing OK Log after the spike of its creation. Unfortunately, despite effort, no such opportunity presented itself. Please look at OK Log for inspiration, and consider using the (maintained!) projects that came from it, ulid and run.


OK Log

OK Log is a distributed and coördination-free log management system for big ol' clusters. It's an on-prem solution that's designed to be a sort of building block: easy to understand, easy to operate, and easy to extend.

Is OK Log for me?

You may consider OK Log if...

  • You're tailing your logs manually, find it annoying, and want to aggregate them without a lot of fuss
  • You're using a hosted solution like Loggly, and want to move logs on-prem
  • You're using Elasticsearch, but find it unreliable, difficult to operate, or don't use many of its features
  • You're using a custom log pipeline with e.g. Fluentd or Logstash, and having performance problems
  • You just wanna, like, grep your logs — why is this all so complicated?

Getting OK Log

OK Log is distributed as a single, statically-linked binary for a variety of target architectures. Download the latest release from the releases page.

Quickstart

$ oklog ingeststore -store.segment-replication-factor 1
$ ./myservice | oklog forward localhost
$ oklog query -from 5m -q Hello
2017-01-01 12:34:56 Hello world!

Deploying

Small installations

If you have relatively small log volume, you can deploy a cluster of identical ingeststore nodes. By default, the replication factor is 2, so you need at least 2 nodes. Use the -cluster flag to specify a routable IP address or hostname for each node to advertise itself on. And let each node know about at least one other node with the -peer flag.

foo$ oklog ingeststore -cluster foo -peer foo -peer bar -peer baz
bar$ oklog ingeststore -cluster bar -peer foo -peer bar -peer baz
baz$ oklog ingeststore -cluster baz -peer foo -peer bar -peer baz

To grow the cluster, just add a new node, and tell it about at least one other node via the -peer flag. Optionally, you can run the rebalance tool (TODO) to redistribute the data over the new topology. To shrink the cluster, just kill fewer nodes than the replication factor, and run the repair tool (TODO) to re-replicate lost records.

All configuration is done via commandline flags. You can change things like the log retention period (default 7d), the target segment file size (default 128MB), and maximum time (age) of various stages of the logging pipeline. Most defaults should be sane, but you should always audit for your environment.

Large installations

If you have relatively large log volume, you can split the ingest and store (query) responsibilities. Ingest nodes make lots of sequential writes, and benefit from fast disks and moderate CPU. Store nodes make lots of random reads and writes, and benefit from large disks and lots of memory. Both ingest and store nodes join the same cluster, so provide them with the same set of peers.

ingest1$ oklog ingest -cluster 10.1.0.1 -peer ...
ingest2$ oklog ingest -cluster 10.1.0.2 -peer ...

store1$ oklog store -cluster 10.1.9.1 -peer ...
store2$ oklog store -cluster 10.1.9.2 -peer ...
store3$ oklog store -cluster 10.1.9.3 -peer ...

To add more raw ingest capacity, add more ingest nodes to the cluster. To add more storage or query capacity, add more store nodes. Also, make sure you have enough store nodes to consume from the ingest nodes without backing up.

Forwarding

The forwarder is basically just netcat with some reconnect logic. Pipe the stdout/stderr of your service to the forwarder, configured to talk to your ingesters.

$ ./myservice | oklog forward ingest1 ingest2

OK Log integrates in a straightforward way with runtimes like Docker and Kubernetes. See the Integrations page for more details.

Querying

Querying is an HTTP GET to /query on any of the store nodes. OK Log comes with a query tool to make it easier to play with. A good first step is to use the -stats flag to refine your query; when you're satisfied it's sufficiently constrained, drop -stats to get results.

$ oklog query -from 2h -to 1h -q "myservice.*(WARN|ERROR)" -regex
2016-01-01 10:34:58 [myservice] request_id 187634 -- [WARN] Get /check: HTTP 419 (0B received)
2016-01-01 10:35:02 [myservice] request_id 288211 -- [ERROR] Post /ok: HTTP 500 (0B received)
2016-01-01 10:35:09 [myservice] request_id 291014 -- [WARN] Get /next: HTTP 401 (0B received)
 ...

To query structured logs, combine a basic grep filter expression with a tool like jq.

$ oklog query -from 1h -q /api/v1/login
{"remote_addr":"10.34.115.3:50032","path":"/api/v1/login","method":"POST","status_code":200}
{"remote_addr":"10.9.101.113:51442","path":"/api/v1/login","method":"POST","status_code":500}
{"remote_addr":"10.9.55.2:55210","path":"/api/v1/login","method":"POST","status_code":200}
{"remote_addr":"10.34.115.1:51610","path":"/api/v1/login","method":"POST","status_code":200}
...

$ oklog query -from 1h -q /api/v1/login | jq '. | select(.status_code == 500)'
{
	"remote_addr": "10.9.55.2:55210",
	"path": "/api/v1/login",
	"method": "POST",
	"status_code": 500
}
...

UI

OK Log ships with a basic UI for making queries. You can access it on any store or ingeststore node, on the public API port (default 7650), path /ui. So, e.g. http://localhost:7650/ui.

Further reading

Integrations

Unofficial Docker images

Translation


OK icon by Karthik Srinivas from the Noun Project. Development supported by DigitalOcean.

Issues
  • Web UI

    Web UI

    This work started from the hypothesis that an interface which follows the same design goals as the overall system can add immediate value for the brave who want to start using OK Log, and might convince the doubtful not to discard it entirely. To learn early on what might be important for such an interface, I'm putting it out in the open in a very experimental/rough state, hoping to avoid spending energy ineffectively. Everybody should therefore feel encouraged to bring forth their ideas and needs regarding a web interface that interacts (maybe in rich ways) with large log volumes.

    Goals

    • build a tool for users and operators of the system to interact with the dataset efficiently
    • stay true to OK Log's operational simplicity
    • encode/enforce best-practices and workflows for querying

    Non-Goals

    • blindly copying other log tooling
    • arguing for benefits or drawbacks of contemporary frontend technologies

    How to look at it?

    Given you have access to this code branch, start up any of the stores (oklog store, oklog ingeststore) and point your browser to /ui/.

    Improvements

    Achievable changes in this change-set.

    • [x] simplify range controls
    • [x] remove delay on stats query and fire on initialisation
    • [x] support regex query
    • [x] clarify planning output
    • [ ] responsive ui design (media queries galore)
    • [x] remove ULID column
    • [x] move towards chunked consumption (possibly infinite scrolling)
    • [ ] provide separate ui cmd
    • [ ] encode query in URL for sharing
    opened by xla 30
  • error when querying store

    error when querying store

    I have a six-peer cluster set up: three ingesters and three store nodes. I started piling in a lot of logs for load-testing purposes. After 10 minutes I queried:

    $ oklog query -store log-store-1 -from 1h -q "JID-b6179b2707845b363de309af" results all good (about 5 lines of text)

    After 15 minutes:

    $ oklog query -store log-store-1 -from 1h -q "JID-b6179b2707845b363de309af" results all good (same 5 lines of text)

    After 20 minutes:

    $ oklog query -store log-store-1 -from 1h -q "JID-b6179b2707845b363de309af"

    Nothing.

    In the console for the store node I see:

    ts=2017-04-05T23:06:15.888364628Z level=error during=query_gather status_code=500 err="open /data/logs/01BD05E0P7H7B1WGY576CMHMSX-01BD05GNK917D75W1SNMXV1TMK.flushed: no such file or directory"
    

    I can also query log-store-2 and -3 and I get the same result. In log-store-1 I also sometimes get:

    ts=2017-04-05T23:06:01.113046951Z level=error during=query_gather err="Get http://10.240.0.32:7650/store/_query?from=2017-04-05T22%3A05%3A56Z&to=2017-04-05T23%3A05%3A56Z&q=JID-b6179b2707845b363de309af: net/http: timeout awaiting response headers"
    

    where 32 is store-2.

    ts=2017-04-05T23:06:01.113173666Z level=error during=query_gather err="Get http://10.240.0.40:7650/store/_query?from=2017-04-05T22%3A05%3A56Z&to=2017-04-05T23%3A05%3A56Z&q=JID-b6179b2707845b363de309af: net/http: timeout awaiting response headers"
    

    And where 40 is store-3.

    bug 
    opened by newhook 24
  • ingeststore non functional

    ingeststore non functional

    Following the directions in the README, I've set up two hosts to test ingeststore. After starting the hosts I get:

    log-store-1:~$ oklog ingeststore -cluster log-store-1 -peer log-store-1 -peer log-store-2
    ts=2017-03-30T17:36:20.410582124Z level=info cluster=log-store-1:7659
    ts=2017-03-30T17:36:20.410667763Z level=info fast=tcp://0.0.0.0:7651
    ts=2017-03-30T17:36:20.410689283Z level=info durable=tcp://0.0.0.0:7652
    ts=2017-03-30T17:36:20.410707298Z level=info bulk=tcp://0.0.0.0:7653
    ts=2017-03-30T17:36:20.410733243Z level=info API=tcp://0.0.0.0:7650
    ts=2017-03-30T17:36:20.410890213Z level=info ingest_path=data/ingest
    ts=2017-03-30T17:36:20.410950353Z level=info store_path=data/store
    ts=2017-03-30T17:36:20.421332098Z level=debug component=cluster Join=1
    ts=2017-03-30T17:36:20.521814528Z level=warn component=Consumer state=gather replication_factor=2 available_peers=1 err="replication currently impossible"
    ts=2017-03-30T17:36:21.522022322Z level=warn component=Consumer state=gather replication_factor=2 available_peers=1 err="replication currently impossible"
    ts=2017-03-30T17:36:22.522240947Z level=warn component=Consumer state=gather replication_factor=2 available_peers=1 err="replication currently impossible"
    ts=2017-03-30T17:36:23.621790312Z level=warn component=Consumer state=gather replication_factor=2 available_peers=1 err="replication currently impossible"
    ts=2017-03-30T17:36:24.621966192Z level=warn component=Consumer state=gather replication_factor=2 available_peers=1 err="replication currently impossible"
    ts=2017-03-30T17:36:25.42175071Z level=warn component=cluster NumMembers=1
    ts=2017-03-30T17:36:25.622150195Z level=warn component=Consumer state=gather replication_factor=2 available_peers=1 err="replication currently impossible"
    ts=2017-03-30T17:36:26.721837292Z level=warn component=Consumer state=gather replication_factor=2 available_peers=1 err="replication currently impossible"
    ts=2017-03-30T17:36:27.722031419Z level=warn component=Consumer state=gather replication_factor=2 available_peers=1 err="replication currently impossible"
    ts=2017-03-30T17:36:28.722254832Z level=warn component=Consumer state=gather replication_factor=2 available_peers=1 err="replication currently impossible"
    ts=2017-03-30T17:36:29.821805223Z level=warn component=Consumer state=gather replication_factor=2 available_peers=1 err="replication currently impossible"
    

    ....

    log-store-2:~$ oklog ingeststore -cluster log-store-2 -peer log-store-1 -peer log-store-2
    ts=2017-03-30T17:36:34.307731343Z level=info cluster=log-store-2:7659
    ts=2017-03-30T17:36:34.307863365Z level=info fast=tcp://0.0.0.0:7651
    ts=2017-03-30T17:36:34.307892432Z level=info durable=tcp://0.0.0.0:7652
    ts=2017-03-30T17:36:34.307917061Z level=info bulk=tcp://0.0.0.0:7653
    ts=2017-03-30T17:36:34.307940364Z level=info API=tcp://0.0.0.0:7650
    ts=2017-03-30T17:36:34.308098722Z level=info ingest_path=data/ingest
    ts=2017-03-30T17:36:34.308193966Z level=info store_path=data/store
    ts=2017-03-30T17:36:34.319719011Z level=debug component=cluster Join=2
    ts=2017-03-30T17:36:41.420291562Z level=warn component=Consumer state=gather replication_factor=2 available_peers=1 err="replication currently impossible"
    ts=2017-03-30T17:36:42.420504608Z level=warn component=Consumer state=gather replication_factor=2 available_peers=1 err="replication currently impossible"
    ^Cts=2017-03-30T17:36:43.420713048Z level=debug component=Compacter shutdown_took=7.052µs
    received signal interrupt
    
    question 
    opened by newhook 19
  • panic in github.com/djherbis/nio

    panic in github.com/djherbis/nio

    Was doing a simple test as described in the quickstart. When querying, I got this panic from the ingeststore:

    % ~/Downloads/oklog-0.1.0-darwin-amd64 ingeststore -store.segment-replication-factor 1
    ts=2017-01-17T14:20:54Z level=info cluster=0.0.0.0:7659
    ts=2017-01-17T14:20:54Z level=info fast=tcp://0.0.0.0:7651
    ts=2017-01-17T14:20:54Z level=info durable=tcp://0.0.0.0:7652
    ts=2017-01-17T14:20:54Z level=info bulk=tcp://0.0.0.0:7653
    ts=2017-01-17T14:20:54Z level=info API=tcp://0.0.0.0:7650
    ts=2017-01-17T14:20:54Z level=info ingest_path=data/ingest
    ts=2017-01-17T14:20:54Z level=info store_path=data/store
    ts=2017-01-17T14:20:54Z level=debug component=cluster Join=0
    panic: runtime error: slice bounds out of range
    
    goroutine 1858 [running]:
    github.com/djherbis/nio.(*PipeWriter).Write(0xc42000e050, 0xc4205ec99c, 0x1e, 0x664, 0x0, 0x0, 0x0)
    	/Users/peter/src/github.com/djherbis/nio/sync.go:135 +0x30a
    github.com/oklog/oklog/pkg/store.newConcurrentFilteringReadCloser.func1(0x2693d80, 0xc421f83620, 0xc42000e050, 0xc4200148c0)
    	/Users/peter/src/github.com/oklog/oklog/pkg/store/query.go:265 +0x198
    created by github.com/oklog/oklog/pkg/store.newConcurrentFilteringReadCloser
    	/Users/peter/src/github.com/oklog/oklog/pkg/store/query.go:276 +0x210
    

    Was running ingeststore like so:

    % ~/Downloads/oklog-0.1.0-darwin-amd64 ingeststore -store.segment-replication-factor 1
    

    a producer like so:

    % while true; do echo hi; done | ~/Downloads/oklog-0.1.0-darwin-amd64 forward localhost
    

    and this query:

    % ~/Downloads/oklog-0.1.0-darwin-amd64 query -from 1m -v -q hi
    -from 2017-01-17T10:25:07-04:00 -to 2017-01-17T10:26:07-04:00
    Get http://localhost:7650/store/query?from=2017-01-17T10%3A25%3A07-04%3A00&to=2017-01-17T10%3A26%3A07-04%3A00&q=hi: EOF
    

    Will try and dig in but perhaps you'll spot the problem sooner.

    opened by danp 11
  • integration: Docker

    integration: Docker

    Like #13, we should provide a simple drop-in integration for plain Docker hosts.

    opened by peterbourgon 11
  • store: implement streaming queries

    store: implement streaming queries

    There are lots of use cases that are well-served by streaming queries. That is, set up a query that delivers results as they arrive at each store node. At first thought, this would look like:

    • User makes a stream-type query to any store node
    • The store node broadcasts that query to all store nodes
    • Each store node registers the query in some kind of table as a websocket
    • When new segments are replicated, records are matched against each registered streaming query
    • Matching records are written to the websocket
    • The originating store node uses some kind of ringbuffer to deduplicate records over a time window
    • Every effort should be made to keep the websocket connections alive: reconnect logic, etc.
    enhancement 
    opened by peterbourgon 9
  • How to use with a service like Heroku?

    How to use with a service like Heroku?

    Hey there, I'd like to use OK Log, but I'm struggling to figure out a good way to run it with my Heroku app. The "best" thing I've come up with is:

    Copy cmd/oklog into my app's directory and let heroku compile it and make it available (heroku automatically compiles and makes things in cmd/ available to run). I could then pipe my app into oklog in my Procfile.

    The only downside to this is it seems that my STDOUT will be swallowed by oklog and I'll lose heroku logs.

    Is there a better way?

    question 
    opened by zombor 9
  • integration: Kubernetes

    integration: Kubernetes

    We should make it as easy as possible to hook up OK Log to an existing Kubernetes cluster. At first glance, this involves some configuration or manifest files to install forwarders at the appropriate place/s, and an optional set of manifest files to actually deploy an OK Log installation into the cluster. (It probably makes sense to host that off-cluster for most people, but an all-in solution will be nice to have.)

    opened by peterbourgon 8
  • Log records can be stored in wrong order

    Log records can be stored in wrong order

    Hello,

    I've found that log records can sometimes be stored in the wrong order.

    Suppose we have one ingester and two forwarders (A and B), and the store node is offline at the beginning. If, for example, the ingester saves two segments, one from A covering time range 2-5 and a second from B covering time range 3-6, then when we start the store, it will start fetching segments and appending them to "active" (Consumer.active in the code). This buffer will have records in time order 2-5-3-6, and these records will never be reordered, if I understood the code correctly. So as a result we'll see query output like:

    default aaaaaaaaa 2018-06-13T18:07:11+03:00 foo 000000197 G
    default aaaaaaaaa 2018-06-13T18:07:11+03:00 foo 000000198 J
    default aaaaaaaaa 2018-06-13T18:07:11+03:00 foo 000000199 F
    default bbbbbbb 2018-06-13T18:07:09+03:00 foo 000000150 T
    default bbbbbbb 2018-06-13T18:07:09+03:00 foo 000000151 N
    default bbbbbbb 2018-06-13T18:07:09+03:00 foo 000000152 S
    

    I suppose this behaviour is not expected, so maybe it needs to be fixed. I would be very grateful to anyone who can clarify it.

    opened by dmitry-guryanov 8
  • Buffered forwarder for #15

    Buffered forwarder for #15

    @peterbourgon please see this experimental implementation for #15, although it's a bit racy for the time being. Please don't consider this complete yet.

    • It basically works - when I kill my ingeststore the forwarder buffers some messages, and then after restarting the ingeststore it reconnects & forwards the buffer :+1: ... but sometimes a couple of messages get sent twice - it needs work still.
    • I've been testing it by piping date once per second onto oklog forward -buf -bufsize=5 localhost, then repeatedly querying oklog query -from 30s in another window, and then killing/restarting my ingeststore for 5-15 seconds. It's pretty easy to see what's going on.
    • For the buffer I tried using container/ring from the standard library. I added a mutex and a couple of other fields to maintain state. Seems OKish, but not very simple really. Any advice on how you envisioned this?
    • I chose messages as the unit for buffering, rather than bytes. I figured it fits the forwarding code reasonably well.

    It's late now so I just thought I'd elicit some feedback for now, as you may have had something very different in mind. Cheers

    opened by laher 7
  • Web UI Improvements

    Web UI Improvements

    I have a couple of suggestions for improvements that can be made to the current OkLog Web UI:

    1. Entries are shown in time ascending order. Having an option to choose descending might be helpful.

    2. Allowing the UI to specify to return at most N results would also be useful.

    3. Timestamps are not displayed per log item. These are collected as part of the ULID, but displaying these in a human-readable format may help.

    4. The bottom log line is covered by the floating debug footer, I believe this is a bug.

    5. Editing the start time, query mode (plain/regex), and streaming settings results in "Your query hasn't been planned yet." This means someone needs to edit the query again to see an estimate of the data returned.

    enhancement help wanted proposal 
    opened by msm595 0
  • Make log consume by store little more aggressive

    Make log consume by store little more aggressive

    Loop log segment gathering by store/ingeststore until a configurable timeout or until there is no more data. This change allows bursts of log entries on ingest nodes to be caught quickly.

    opened by bkmit 0
  • How oklog stores data?

    How oklog stores data?

    How does oklog store the data? Does it store data in JSON format or in text format?

    On a request to localhost:port/store/stream I get a bunch of data, and I don't understand how/what it is displaying.

    opened by pavangond 0
  • What's the best way to test the cluster in large installation is working

    What's the best way to test the cluster in large installation is working

    Hi, I have the following large installation (2 ingests and 3 stores) started as follows:

    Commands: oklog-0.3.2-linux-amd64 ingest -cluster 172.27.47.21 -peer 172.27.47.21 -peer 172.27.47.22 -peer 172.27.47.23 -peer 172.27.47.24 -peer 172.27.47.25

    oklog-0.3.2-linux-amd64 ingest -cluster 172.27.47.22 -peer 172.27.47.21 -peer 172.27.47.22 -peer 172.27.47.23 -peer 172.27.47.24 -peer 172.27.47.25

    oklog-0.3.2-linux-amd64 store -cluster 172.27.47.23 -peer 172.27.47.21 -peer 172.27.47.22 -peer 172.27.47.23 -peer 172.27.47.24 -peer 172.27.47.25

    oklog-0.3.2-linux-amd64 store -cluster 172.27.47.24 -peer 172.27.47.21 -peer 172.27.47.22 -peer 172.27.47.23 -peer 172.27.47.24 -peer 172.27.47.25

    oklog-0.3.2-linux-amd64 store -cluster 172.27.47.25 -peer 172.27.47.21 -peer 172.27.47.22 -peer 172.27.47.23 -peer 172.27.47.24 -peer 172.27.47.25

    Output: First ingest keeps showing the following output until the second ingest joins the cluster: level=warn component=cluster NumMembers=1 msg="I appear to be alone in the cluster"

    Second ingest shows the following output: level=info ingest_path=data/ingest

    First store keeps showing the following output until the second store joins the cluster: ts=2018-08-02T13:06:04.503169534Z level=warn component=Consumer op=gather warning="replication factor 2, available peers 1: replication currently impossible"

    Second and third stores show the following output (respectively): ts=2018-08-02T13:07:44.610961417Z level=info StoreLog=data/store ts=2018-08-02T13:08:08.235069864Z level=info StoreLog=data/store

    Testing cluster: When I use testsvc to test the service, I get output like the following, although I haven't hooked the cluster up to any application or source of logs:

    2018-08-02T14:26:32+01:00 foo 000000100 SBE2 0T30 0Y43 39E4 4Q7B TK7N VVTR K2HG VSKS Z8P6 R3A8 D49G FGWD 6QH2 A9Y7 41JJ 708W 6TGM 5RZQ AG4J ZGDJ JQVR PZVN ZZ8W A6WF ZTK0 0MBT WPH6 E5DH 3APC 58K8 KJMM 25GX Y440 HCWR SJ4D M8BG S21B 2B1M 1NAB XM1J 4D7Z 0QZZ 220Z QM5E 2BFN B216 4HM7 CMQN AXVT HEJ0 XGHV 17S8 WQXS M23N 55F6 RZHT XG9D 72AJ 9DGW S8NS 6RSV T2A1 FDJ2 771N 6HMQ WFQN 1KY3 3TD6 0DRW 2WWJ 1TGF CQ6W 8EMB B030 2TG2 K3Z2 Z9HQ DE1P HPPK BCZV SBBH 2RKD 6S16 DR8J P7DS 26YB Y4KC 4X8T 6E18 DGHE 8CDA 4KRG PA8W N 2018-08-02T14:26:32+01:00 foo 000000101 01SF32MTSYYSDGM 8N089YN3DFTGK0W RNV9BJ1Q03130GE FDEH838MRXE1PY4 DJKRTK0K1YD0BW3 TCVTZP4Q9SHEPRS ZDBE2N09XWZ3CDG JT1KJ9F8EPKJSYX W8K2EGX034KPZS9 8RS8QZ9GPVAMN24 MWM990KTNSSJEH5 T6VWGA30Z4SWCF6 DT8EKVC1E105BZX G1SG4TYCDKK4C1A GA545A0EK52MMSK THZXVHVV9DS8ER7 A1MD9Q4B93ED3X3 895GSW94RDQSCH0 Z05D63ZJKG8ZPFW RTKGT2VV5PC9NM1 85MQC6SG408DPKT V94F94H7B4YYTX0 GJ4E7YJPAJSG3TJ F6T6H3D79QBYVQZ MQY4EXTYDSW2AKC QY9NWDPYB4A30ZF DZF0NT2F7W056KN PDFPV5RDBCHW0V9 XFSBHZ65V497TQ2 ZD62Z4R 2018-08-02T14:26:33+01:00 foo 000000102 8XN18 E1HZC 6TS9H BPEP3 ARMCE CV25Y 0Q69D PFDHQ 2CDC6 Z708X Z3BFN EF20N Y2VP5 PAANT BW4WN EA8HJ 2SFX7 9BZ1V FD4VC 1A4ZF 1PP3W PBZJC 2B11Z GD0QK 419YZ Y0X2T BBJ2B ACX9Y G7X1W 4QN1Q CXNP9 JQAKB RETB7 0C6DM DBXSG 9MAVT SEFJN 1286S 5BY06 JP8SC XSAQD TYT3B T5FK7 JQSXX FNE53 DN71G 1R8J3 QSBTN HM7A2 PKAMV C6J5H CJKYT AHQK1 3SMJJ YTVT6 4AP5N 3RB3P 1WKS6 CP7FC BB7P4 VZ7QQ FT3JS 27T71 90F97 K3JSW XX6KA 8WJEC YQ5T4 719J9 Z5425 NK04S M966W 1VCM7 27TMX 8BHB6 090VN 0108D NNR48 JPH 2018-08-02T14:26:33+01:00 foo 000000103 J5S8CS5GJJNMQRK 41M7KPB5S8ZY686 SYT1ZE40XSETGN6 G608YKWF05C4EAK GBEVQEJWE6M3MYZ W1AGX1QE0NT47V4 91VJQPRRVG61AJM MTRTKJSPY98HMW2 QNHCEZ9FJBT93NV GEX84DTXNFHJW7T 7HRJ9MT6NK5AKQQ PN9E9QDN002M5TQ 4Z52WC0JMB491DV ZPE3RCSKK0XKTC8 0BPMVC63K9J8ZGS YEKGB1P84DYJM8W XHJ8TD31MRS339D QC2N7285DS17SP1 RHNH7NZHGGV7C1G VN9KBXR5S4MQF8A 6929BYWCFE5W8GH 7S26H0TAP8H15XJ F17MCQTQDJSG1ZF 71A6E7SVJZB2JZV XWWPYH809F2FWZG 4XVM7Y88Z4KQSH6 
YRWG31Y41BRS50N 1N12451GAM8CXCH QA62Q85H7J0JHN0 2WT4BGG 2018-08-02T14:26:33+01:00 foo 000000104 S6YJ4SDHNDNMCCZ3 MX86GFKJCC921JPA B710CFZVY5HE2HHT RZP0AMF0A7AJDDFK S9YVY0SHS0HXY56H VAXHE1DHZFNBQ7NY 5ZAC1E70MKPQAM39 PKTVTENGJB5XY91P F674EXK1464CE8TV Q833271WAYP6PX1T 1RRKZXMZSCZ1MH5V R2TK9E35NP7B7VW3 4WQF328Y7GSSYZEH TQ48GYQG7SQK3774 MCCZRSANK695A020 7C08P0VY2CVJ2719 14Z9RAC5736CRZB9 KCRQ822P39ARB2YB HM6RSVAXP17SHYTV Q65QGX2W2SD37MT6 MYN96N87Q1XK0P5X 9SA0G4VGNS5MTA6R 57E6JVWC7EKERHV8 Z2K5RAEJP8F2YE7C WYKAMRW91YDW52QZ 93F320FVWJA5061X Z1NTG5W9YK33ZR5T H3GA9G5AYBWF

    So my question is: what is the best way to test that the cluster is working before forwarding any logs to it (using output, a test, or a member-list command), since the testsvc output doesn't seem to be accurate in my opinion (unless I'm doing something wrong).

    Thanks, Hoch

    opened by Sentoj 0
  • Local node unable to connect to self after network change

    Local node unable to connect to self after network change

    Observed the following error:

    ts=2018-07-29T14:09:48.652160779Z level=warn component=Consumer op=gather warning="Get http://192.168.0.50:7650/ingest/next: dial tcp 192.168.0.50:7650: connect: no route to host" msg="ingester 192.168.0.50:7650, during /next: fatal error"

    This happened during local development and with the following start command:

    oklog ingeststore -ui.local -store.segment-replication-factor 1

    My assumption is that this is due to local IP addresses changing when switching networks while traveling. Just wanted to leave it here for a second look.

    bug 
    opened by xla 0
  • Add ability to query by topic

    Add ability to query by topic

    I believe one of the most popular query filters would be "get log records from a particular service and filter by some string"; services can be identified by topic in oklog, so you need to be able to filter records by topic somehow.

    You can do it by providing a regexp, like "^.*?", but this regexp works very slowly. So this patch adds an additional filter which works much faster.

    I've generated a log of 1.5 GB and 20,000,000 records. Filtering by regexp takes 18 seconds, while the separate filter by topic plus a plain string takes only 2.5 seconds.

    opened by dmitry-guryanov 1
  • another compaction algorithm

    another compaction algorithm

    Fix for https://github.com/oklog/oklog/issues/132

    opened by dmitry-guryanov 0
  • [RFC] store: add compression support

    [RFC] store: add compression support

    Implement compression on the store side. If compression is enabled, data is compressed before being written to segment files on disk, and decompressed on read. The code which deals with records remains the same; it isn't affected by this patch.

    Currently gzip and zstandard are implemented, but other implementations can also be easily added.

    To enable compression, provide the -compression flag to 'oklog store' or 'oklog ingeststore'.

    opened by dmitry-guryanov 2
  • Compacter writes too much data to disk

    Compacter writes too much data to disk

    Suppose you run ingeststore with default parameters (-store.segment-target-size is 128M) and you get 1 MB of log data every 4 seconds. At the beginning, the compacter will find two sequential files of 1 MB each and merge them into a file of 2 MB; then it will read 2 MB + 1 MB and write 3 MB, and so on. When the segment size is close to 128 MB, it will write more than 100 MB of data every 4 seconds, while the real amount of new log data is only 1 MB.

    I've added a Prometheus counter for the number of bytes written by the compacter, and got the picture in the attachment. By the time the logs reach 128 MB, the total amount of data written is more than 20 GB.

    I think the compaction algorithm can be improved, or at least there should be a choice between slower query times (due to a larger number of segment files) and a higher write rate.

    (screenshot attached)

    opened by dmitry-guryanov 1
  • Implement token based access control

    Implement token based access control

    The access control aspect does not seem to be mentioned anywhere. My two cents on this would be the following (probably somewhat inspired by consul/vault):

    The access control mechanism should be explicitly enabled on oklog's command line. When disabled, oklog's functionality should be completely unchanged. Read and write access control should probably be enabled separately. TBD: it may make sense to be able to enable access control both at the global level (disabled by default) and per topic (overriding the global setting).

    Let us define a token to be an arbitrary string that does not contain spaces and new-lines. Practically, this would probably be some kind of uuid but oklog should treat them as opaque.

    Let us define an access control list (ACL) to be a list of token pairs: access token and id token (akin to vault's "token accessor"). ACLs are stored as plain text files with one entry per line, the access and id tokens separated by a space. The "id token" is to be used as a non-secret reference to the "access token" in various access logs (are there any?).

    A new "update config" REST API is to be added to "oklog". Parameters would be "access token", "topic" (or empty for global configs), "config name" (a short string), and "config value" (a long string). Access token is to be used to restrict calls to this API to bearers of tokens from admin ACL (either global or topic-specific when setting topic-specific config, and global when setting global configs). As a part of the call, the config should be parsed according to rules for this particular config (e.g. above-described format for ACL storage, in case of ACL configs), parsing errors should be reported to caller and config should remain unchanged. Dynamic configs should be stored as plain text files in a a directory managed by oklog. Writing should be performed in "write to temp, then rename" fashion. This API could be made a bit more flexible by accepting a list of "topic", "config name", "config value" tuples, and making an attempt for atomic update (at least, parse and write to temp all the configs, and then do the rename pass; a stricter atomicity could be implemented by complete copying of entire set of config files into new directory, and renaming the directory at the end). The first use for this dynamic config mechanism would be dynamic update of ACLs. Upon server start, all dynamic configs should be read from files, any parsing errors should be fatal. Each oklog instance would have it's own dynamic config store (directory). Initial provisioning and updating configs (via above described API) across all instances is outside of oklog's scope.

    There are three ACLs per topic and three global ACLs: readers, writers and admins.

    There should also be a global "master" admin ACL, supplied as a file name via command line option, that would be unmodifiable via API. This is to prevent lock-out situations that could occur due to accidental or malicious deletion of admin tokens.

    If read access control is enabled, query operations should require supplying of a token from either topic-specific or global ACL. If token is omitted, an "anonymous" token should be used.

    If write access control is enabled, each ingested log entry should start with an access token followed by a single space. Tokens from global write ACL allow writing to all topics. Tokens from per-topic write ACLs allow writing to corresponding topics. Writes without a valid token should be silently ignored, possibly adding an error entry to access log. Upon ingestion, "access token" is substituted with "token id". This both allows traceability (i.e., knowing, which token was used to produce any particular log entry) and keeps access tokens secret (without this substitution, all readers would be able to see write tokens).

    opened by kshpytsya 0
Releases: v0.3.2