Fast, realtime regex-extraction, and aggregation into common formats such as histograms, numerical summaries, tables, and more!

Overview

rare

A file scanner/regex extractor and realtime summarizer.

Supports various CLI-based graphing and metric formats (histogram, table, etc).

Features

  • Multiple summary formats including: filter (like grep), histogram, and numerical analysis
  • File glob expansions (eg /var/log/* or /var/log/*/*.log) and -R
  • Optional gzip decompression (with -z)
  • Following with -f, or re-open following with -F (use --poll to poll for changes)
  • Ignoring lines that match an expression
  • Aggregating and realtime summary (Don't have to wait for all data to be scanned)
  • Multi-threaded reading, parsing, and aggregation
  • Color-coded outputs (optionally)
  • Pipe support (stdin for reading, stdout will disable color) eg. tail -f | rare ...

Installation

Notes on versions: Besides the standard OS builds, there is an additional PCRE build which is 4x faster than Go's re2 implementation. To use it, make sure that libpcre2 is installed (eg apt install libpcre2-8-0). Right now, it is only bundled with the Linux distribution.

Manual

Download the appropriate binary from Releases, unzip it, and put it in /bin

Homebrew

brew tap zix99/rare
brew install rare

From code

Clone the repo, and:

Requires Go 1.11 or higher (uses Go modules)

go get ./...

# Pack documentation (Only necessary for release builds)
go run github.com/gobuffalo/packr/v2/packr2

# Build binary
go build .

# OR, with experimental features
go build -tags experimental .

Available tags:

  • experimental Enable experimental features (eg. fuzzy search)
  • pcre2 Enables PCRE 2 (v10) where able. Currently linux only

Docs

All documentation may be found here, in the docs/ folder, and by running rare docs (embedded docs/ folder)

You can also see a dump of the CLI options at cli-help.md

Example

Extract status codes from nginx logs

$ rare histo -m '"(\w{3,4}) ([A-Za-z0-9/.]+).*" (\d{3})' -e '{3} {1}' access.log
200 GET                          160663
404 GET                          857
304 GET                          53
200 HEAD                         18
403 GET                          14

Extract number of bytes sent by bucket, and format

This shows an example of how to bucket the values into sizes of 10000. In this case, it doesn't make sense to see the histogram by exact number of bytes, but we might want to know the ratio of the various orders of magnitude.

$ rare histo -m '"(\w{3,4}) ([A-Za-z0-9/.]+).*" (\d{3}) (\d+)' -e "{bucket {4} 10000}" -n 10 access.log -b
0                   144239     ||||||||||||||||||||||||||||||||||||||||||||||||||
190000              2599       
10000               1290       
180000              821        
20000               496        
30000               445        
40000               440        
200000              427        
140000              323        
70000               222        
Matched: 161622 / 161622
Groups:  1203
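
The bucketing used in the expression above can be thought of as rounding each value down to a multiple of the bucket size. Here is a minimal sketch in Go, assuming non-negative integer values; the bucket function below is a hypothetical stand-in for illustration, not rare's actual implementation.

```go
package main

import "fmt"

// bucket rounds v down to the nearest multiple of size.
// Hypothetical helper for illustration; assumes v >= 0 and size > 0.
func bucket(v, size int) int {
	return (v / size) * size
}

func main() {
	// Sample byte counts (made up for this sketch), bucketed by 10000.
	for _, v := range []int{512, 10000, 19506, 195000} {
		fmt.Printf("%d -> %d\n", v, bucket(v, 10000))
	}
}
```

With a bucket size of 10000, every response smaller than 10000 bytes lands in bucket 0, which is why that row dominates the output above.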

Output Formats

Histogram (histo)

The histogram format outputs an aggregation by counting the occurrences of an extracted match. That is to say, on every line a regex will be matched (or not), and the matched groups can be used to extract and build a key that will act as the bucketing name.

NAME:
   rare histogram - Summarize results by extracting them to a histogram

USAGE:
   rare histogram [command options] <-|filename|glob...>

DESCRIPTION:
   Generates a live-updating histogram of the extracted information from a file
    Each line in the file will be matched, any the matching part extracted
    as a key and counted.
    If an extraction expression is provided with -e, that will be used
    as the key instead

OPTIONS:
   --follow, -f                 Read appended data as file grows
   --reopen, -F                 Same as -f, but will reopen recreated files
   --poll                       When following a file, poll for changes rather than using inotify
   --posix, -p                  Compile regex as against posix standard
   --match value, -m value      Regex to create match groups to summarize on (default: ".*")
   --extract value, -e value    Expression that will generate the key to group by (default: "{0}")
   --gunzip, -z                 Attempt to decompress file when reading
   --batch value                Specifies io batching size. Set to 1 for immediate input (default: 1000)
   --workers value, -w value    Set number of data processors (default: 5)
   --readers value, --wr value  Sets the number of concurrent readers (Infinite when -f) (default: 3)
   --ignore value, -i value     Ignore a match given a truthy expression (Can have multiple)
   --recursive, -R              Recursively walk a non-globbing path and search for plain-files
   --bars, -b                   Display bars as part of histogram
   --num value, -n value        Number of elements to display (default: 5)
   --reverse                    Reverses the display sort-order
   --sortkey, --sk              Sort by key, rather than value

Filter (filter)

Filter is a command used to match and (optionally) extract that match without any aggregation. It's effectively a grep or a combination of grep, awk, and/or sed.

NAME:
   rare filter - Filter incoming results with search criteria, and output raw matches

USAGE:
   rare filter [command options] <-|filename|glob...>

DESCRIPTION:
   Filters incoming results by a regex, and output the match or an extracted expression.
    Unable to output contextual information due to the application's parallelism.  Use grep if you
    need that

OPTIONS:
   --follow, -f                 Read appended data as file grows
   --reopen, -F                 Same as -f, but will reopen recreated files
   --poll                       When following a file, poll for changes rather than using inotify
   --posix, -p                  Compile regex as against posix standard
   --match value, -m value      Regex to create match groups to summarize on (default: ".*")
   --extract value, -e value    Expression that will generate the key to group by (default: "{0}")
   --gunzip, -z                 Attempt to decompress file when reading
   --batch value                Specifies io batching size. Set to 1 for immediate input (default: 1000)
   --workers value, -w value    Set number of data processors (default: 5)
   --readers value, --wr value  Sets the number of concurrent readers (Infinite when -f) (default: 3)
   --ignore value, -i value     Ignore a match given a truthy expression (Can have multiple)
   --recursive, -R              Recursively walk a non-globbing path and search for plain-files
   --line, -l                   Output line numbers

Numerical Analysis

This command will extract a number from logs and run basic analysis on that number (such as mean, median, mode, and quantiles).

NAME:
   rare analyze - Numerical analysis on a set of filtered data

USAGE:
   rare analyze [command options] <-|filename|glob...>

DESCRIPTION:
   Treat every extracted expression as a numerical input, and run analysis
    on that input.  Will extract mean, median, mode, min, max.  If specifying --extra
    will also extract std deviation, and quantiles

OPTIONS:
   --follow, -f                 Read appended data as file grows
   --reopen, -F                 Same as -f, but will reopen recreated files
   --poll                       When following a file, poll for changes rather than using inotify
   --posix, -p                  Compile regex as against posix standard
   --match value, -m value      Regex to create match groups to summarize on (default: ".*")
   --extract value, -e value    Expression that will generate the key to group by (default: "{0}")
   --gunzip, -z                 Attempt to decompress file when reading
   --batch value                Specifies io batching size. Set to 1 for immediate input (default: 1000)
   --workers value, -w value    Set number of data processors (default: 5)
   --readers value, --wr value  Sets the number of concurrent readers (Infinite when -f) (default: 3)
   --ignore value, -i value     Ignore a match given a truthy expression (Can have multiple)
   --recursive, -R              Recursively walk a non-globbing path and search for plain-files
   --extra                      Displays extra analysis on the data (Requires more memory and cpu)
   --reverse, -r                Reverses the numerical series when ordered-analysis takes place (eg Quantile)
   --quantile value, -q value   Adds a quantile to the output set. Requires --extra (default: "90", "99", "99.9")

Example:

$ go run *.go --color analyze -m '"(\w{3,4}) ([A-Za-z0-9/[email protected]_-]+).*" (\d{3}) (\d+)' -e "{4}" testdata/access.log 
Samples:  161,622
Mean:     2,566,283.9616
Min:      0.0000
Max:      1,198,677,592.0000

Median:   1,021.0000
Mode:     1,021.0000
P90:      19,506.0000
P99:      64,757,808.0000
P99.9:    395,186,166.0000
Matched: 161,622 / 161,622
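
For readers unfamiliar with the statistics above, here is a small sketch of how a mean and quantiles could be computed over extracted numbers. It assumes the nearest-rank quantile definition, which may differ from what rare uses internally; the mean and quantile helpers and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// mean returns the arithmetic mean of xs (assumes len(xs) > 0).
func mean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// quantile returns the p-th percentile (0 < p <= 100) of an already
// sorted, non-empty slice using the nearest-rank method.
// This definition is an assumption for the sketch.
func quantile(sorted []float64, p float64) float64 {
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

func main() {
	// Made-up byte counts standing in for the extracted "{4}" values.
	xs := []float64{1021, 0, 512, 19506, 1021, 64}
	sort.Float64s(xs)

	fmt.Printf("Samples: %d\n", len(xs))
	fmt.Printf("Mean:    %.4f\n", mean(xs))
	fmt.Printf("Min:     %.4f\n", xs[0])
	fmt.Printf("Max:     %.4f\n", xs[len(xs)-1])
	fmt.Printf("P90:     %.4f\n", quantile(xs, 90))
}
```

Quantiles need the full sorted series in memory, which is in line with --extra being documented as requiring more memory and CPU.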

Tabulate

Create a 2D view (table) of data extracted from a file. The expression needs to yield two dimensions separated by the delimiter character (\x00 by default). You can either emit \x00 directly or use the {$ a b} helper. The first element is the column name, followed by the row name.

NAME:
   rare tabulate - Create a 2D summarizing table of extracted data

USAGE:
   rare tabulate [command options] <-|filename|glob...>

DESCRIPTION:
   Summarizes the extracted data as a 2D data table.
    The key is provided in the expression, and should be separated by a tab \x00
    character or via {$ a b} Where a is the column header, and b is the row

OPTIONS:
   --follow, -f                 Read appended data as file grows
   --reopen, -F                 Same as -f, but will reopen recreated files
   --poll                       When following a file, poll for changes rather than using inotify
   --posix, -p                  Compile regex as against posix standard
   --match value, -m value      Regex to create match groups to summarize on (default: ".*")
   --extract value, -e value    Expression that will generate the key to group by (default: "{0}")
   --gunzip, -z                 Attempt to decompress file when reading
   --batch value                Specifies io batching size. Set to 1 for immediate input (default: 1000)
   --workers value, -w value    Set number of data processors (default: 5)
   --readers value, --wr value  Sets the number of concurrent readers (Infinite when -f) (default: 3)
   --ignore value, -i value     Ignore a match given a truthy expression (Can have multiple)
   --recursive, -R              Recursively walk a non-globbing path and search for plain-files
   --delim value                Character to tabulate on. Use {$} helper by default (default: "\x00")
   --num value, -n value        Number of elements to display (default: 20)
   --cols value                 Number of columns to display (default: 10)
   --sortkey, --sk              Sort rows by key name rather than by values

Example:

$ rare tabulate -m "(\d{3}) (\d+)" -e "{$ {1} {bucket {2} 100000}}" -sk access.log

         200      404      304      403      301      206      
0        153,271  860      53       14       12       2                 
1000000  796      0        0        0        0        0                 
2000000  513      0        0        0        0        0                 
7000000  262      0        0        0        0        0                 
4000000  257      0        0        0        0        0                 
6000000  221      0        0        0        0        0                 
5000000  218      0        0        0        0        0                 
9000000  206      0        0        0        0        0                 
3000000  202      0        0        0        0        0                 
10000000 201      0        0        0        0        0                 
11000000 190      0        0        0        0        0                 
21000000 142      0        0        0        0        0                 
15000000 138      0        0        0        0        0                 
8000000  137      0        0        0        0        0                 
22000000 123      0        0        0        0        0                 
14000000 121      0        0        0        0        0                 
16000000 110      0        0        0        0        0                 
17000000 99       0        0        0        0        0                 
34000000 91       0        0        0        0        0                 
Matched: 161,622 / 161,622
Rows: 223; Cols: 6
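
To make the column/row key format more concrete, here is a hedged sketch of how keys of the form column, \x00, row could be counted into a 2D table. The tabulate function and sample keys are hypothetical and only illustrate the idea; they are not taken from rare's source.

```go
package main

import (
	"fmt"
	"strings"
)

// sep is the delimiter between the column and row parts of a key,
// matching the documented default of --delim.
const sep = "\x00"

// tabulate counts keys of the form column + sep + row into a
// column -> row -> count table. Keys without a delimiter are skipped.
func tabulate(keys []string) map[string]map[string]int {
	table := make(map[string]map[string]int)
	for _, k := range keys {
		parts := strings.SplitN(k, sep, 2)
		if len(parts) != 2 {
			continue
		}
		col, row := parts[0], parts[1]
		if table[col] == nil {
			table[col] = make(map[string]int)
		}
		table[col][row]++
	}
	return table
}

func main() {
	// Made-up keys: status code as the column, bucketed size as the row.
	keys := []string{
		"200" + sep + "0",
		"200" + sep + "0",
		"404" + sep + "0",
		"200" + sep + "1000000",
	}
	table := tabulate(keys)
	fmt.Println("200 / 0:", table["200"]["0"])
	fmt.Println("404 / 0:", table["404"]["0"])
	fmt.Println("200 / 1000000:", table["200"]["1000000"])
}
```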

Performance Benchmarking

There are other solutions, and rare accomplishes summarization in a way that grep, awk, etc. can't. Still, I think it's worth comparing the performance of this tool against standard tools to show that it's at least as good.

It's worth noting that in many of these results rare is just as fast, but part of the reason is that it consumes CPU in a more efficient way (Go is great at parallelization). So take that into account, for better or worse.

All tests were done on ~200MB of gzip'd nginx logs spread across 10 files.

Each program was run 3 times and the time of the last run was taken (to make sure things were cached equally).

zcat & grep

$ time zcat testdata/* | grep -Poa '" (\d{3})' | wc -l
1131354

real	0m0.990s
user	0m1.480s
sys	0m0.080s

$ time zcat testdata/* | grep -Poa '" 200' > /dev/null

real	0m1.136s
user	0m1.644s
sys	0m0.044s

I believe the largest holdup here is the fact that zcat will pass all the data to grep via a synchronous pipe, whereas rare can process everything in async batches. Using pigz instead didn't yield different results, but on single-file results they did perform comparably.

Silver Searcher (ag)

$ ag --version
ag version 0.31.0

Features:
  +jit +lzma +zlib

$ time ag -z '" (\d{3})' testdata/* | wc -l
1131354

real	0m3.944s
user	0m3.904s
sys	0m0.152s

rare

$ rare -v
rare version 0.1.16, 11ca2bfc4ad35683c59929a74ad023cc762a29ae

$ time rare filter -m '" (\d{3})' -e "{1}" -z testdata/* | wc -l
Matched: 1,131,354 / 3,638,594
1131354

real	0m0.927s
user	0m1.764s
sys	0m1.144s

$ time rare histo -m '" (\d{3})' -e "{1}" -z testdata/*
200                 1,124,767 
404                 6,020     
304                 371       
403                 98        
301                 84        

Matched: 1,131,354 / 3,638,594
Groups:  6

real	0m0.284s
user	0m1.648s
sys	0m0.048s

Development

New additions to rare should pass the following checks

  • Documentation for any new functionality or expression changes
  • Before and after CPU and memory benchmarking for core additions (Expressions, aggregation, benchmarking, and rendering)
  • Limit memory allocations (preferably 0!) in the high-throughput functions
  • Tests, and if it makes sense, benchmarks of a given function

Running/Testing

go run .
go test ./...

Profiling

New high-throughput changes should be performance benchmarked.

To Benchmark:

go run . --profile out <your test code>
go tool pprof -http=:8080 out.cpu.prof # CPU
go tool pprof -http=:8080 out_num.prof # Memory

License

Copyright (C) 2019  Christopher LaPointe

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <https://www.gnu.org/licenses/>.
Comments
  • Panic in coloring logic when using nested groups in 'filter' mode

    First of all, let me say, this is an awesome project. Nice work! I was in the process of writing something similar but much much worse; I think I might use this as a library instead!

    I've managed to cause a panic in the following scenario:

    echo '1,2,3,4,5,6,7,8,9,0' | rare filter -m "(^[^,]*)(,([^,]*)){5}"
    panic: runtime error: slice bounds out of range [11:10]
    
    goroutine 1 [running]:
    rare/pkg/color.WrapIndices(0xc000500000, 0x13, 0xc0000b4590, 0x6, 0x6, 0x206860, 0x1207860)
    	/Users/ondrejb/Documents/git/rare/pkg/color/coloring.go:95 +0x832
    rare/cmd.filterFunction(0xc00024c840, 0x0, 0xc0004752d0)
    	/Users/ondrejb/Documents/git/rare/cmd/filter.go:33 +0x276
    github.com/urfave/cli.HandleAction(0x13b20c0, 0x144c828, 0xc00024c840, 0xc00024c840, 0x0)
    	/Users/ondrejb/go/pkg/mod/github.com/urfave/[email protected]/app.go:523 +0xfd
    github.com/urfave/cli.Command.Run(0x142c0a8, 0x6, 0x1429f04, 0x1, 0x0, 0x0, 0x0, 0x1441dfa, 0x44, 0x0, ...)
    	/Users/ondrejb/go/pkg/mod/github.com/urfave/[email protected]/command.go:174 +0x58e
    github.com/urfave/cli.(*App).Run(0xc0004d6000, 0xc000090040, 0x4, 0x4, 0x0, 0x0)
    	/Users/ondrejb/go/pkg/mod/github.com/urfave/[email protected]/app.go:276 +0x7d4
    main.cliMain(0xc000090040, 0x4, 0x4, 0x0, 0x0)
    	/Users/ondrejb/Documents/git/rare/main.go:101 +0x666
    main.main()
    	/Users/ondrejb/Documents/git/rare/main.go:105 +0x49
    

    To save you parsing manually, the three groups here are the 1st column of a csv, the 6th column but including the leading comma, and the 6th field without the leading comma.

    When I make the second (outer) group non-capturing, ie. (^[^,]*)(?:,([^,]*)){5}, everything works fine and I get the 1st and 6th field (groups {1} and {3}). Obviously, if I use the --nocolor option, or if I use an expression eg. -e '{1} {2} {3}', everything is fine.

    I haven't looked deep into the code yet, but obviously since the match groups overlap, the starting index of the inner group lies inside the outer group, and the colouring logic doesn't account for this scenario.

    I'd suggest that the inner match should take precedence when colouring the matching text (ie. inner match colours "overwrite" the outer group)

    I'll have a crack at making a pull request to fix this myself soon.

    bug 
    opened by obaudys 5
  • Invalid syntax in tap

    Hey,

    Just tried to install from Homebrew and I get an error:

    ❯ brew tap zix99/rare
    ==> Tapping zix99/rare
    Cloning into '/usr/local/Homebrew/Library/Taps/zix99/homebrew-rare'...
    remote: Enumerating objects: 45, done.
    remote: Counting objects: 100% (45/45), done.
    remote: Compressing objects: 100% (30/30), done.
    remote: Total 45 (delta 14), reused 0 (delta 0), pack-reused 0
    Receiving objects: 100% (45/45), 6.65 KiB | 3.33 MiB/s, done.
    Resolving deltas: 100% (14/14), done.
    Error: Invalid formula: /usr/local/Homebrew/Library/Taps/zix99/homebrew-rare/rare.rb
    rare: Calling bottle :unneeded is disabled! There is no replacement.
    Please report this issue to the zix99/rare tap (not Homebrew/brew or Homebrew/core):
      /usr/local/Homebrew/Library/Taps/zix99/homebrew-rare/rare.rb:9
    
    Error: Cannot tap zix99/rare: invalid syntax in tap!
    

    Is there anything I need to do other than

    brew tap zix99/rare && brew install rare
    

    I'm doing this on a 2019 Mac Book Pro (Intel) Thank you

    opened by farzadmf 4
  • Sort heatmap columns numerically

    First off, great work with the heatmaps feature!

    I've been using heatmaps for numerical data, in particular nginx response times. I usually convert them to integers in rare by matching eg 0.053 (seconds) with (\d+)\.(\d{3}) and then using the expression {sumi {multi {1} 1000} {2}} to convert to milliseconds.

    I've found that the table and heatmap sort the column names as strings, and not numerically where possible. This results in meaningless heatmaps: image

    I made a small change to a local checkout of rare to basically test if the column headers could be converted to integers and then sort them numerically if they can:

    index 1be0be8..040b889 100644
    --- a/pkg/aggregation/table.go
    +++ b/pkg/aggregation/table.go
    @@ -103,8 +103,23 @@ func (s *TableAggregator) OrderedColumns() []string {
     func (s *TableAggregator) OrderedColumnsByName() []string {
            keys := s.Columns()
     
    +       // check if keys can be sorted numerically:
    +       numeric := true
    +       for _,k := range keys {
    +               if _, err := strconv.Atoi(k); err != nil {
    +                       numeric = false
    +                       break
    +               }
    +       }
    +
            sort.Slice(keys, func(i, j int) bool {
    -               return keys[i] < keys[j]
    +               if numeric {
    +                       k0, _ := strconv.Atoi(keys[i])
    +                       k1, _ := strconv.Atoi(keys[j])
    +                       return k0 < k1
    +               } else {
    +                       return keys[i] < keys[j]
    +               }
            })
     
            return keys
    
    

    The same heatmap is now much more meaningful: image

    Would you be interested in incorporating the above diff into rare?

    opened by obaudys 3
  • Bump mkdocs from 1.2.1 to 1.2.3

    Bumps mkdocs from 1.2.1 to 1.2.3.

    Release notes

    Sourced from mkdocs's releases.

    1.2.3

    MkDocs 1.2.3 is a bugfix release for MkDocs 1.2.

    Aside: MkDocs has a new chat room on Gitter/Matrix. More details.

    Improvements:

    • Built-in themes now also support these languages:

    • Third-party plugins will take precedence over built-in plugins with the same name (#2591)

    • Bugfix: Fix ability to load translations for some languages: core support (#2565) and search plugin support with fallbacks (#2602)

    • Bugfix (regression in 1.2): Prevent directory traversal in the dev server (#2604)

    • Bugfix (regression in 1.2): Prevent webserver warnings from being treated as a build failure in strict mode (#2607)

    • Bugfix: Correctly print colorful messages in the terminal on Windows (#2606)

    • Bugfix: Python version 3.10 was displayed incorrectly in --version (#2618)

    Other small improvements; see commit log.

    1.2.2

    MkDocs 1.2.2 is a bugfix release for MkDocs 1.2 -- make sure you've seen the "major" release notes as well.

    • Bugfix (regression in 1.2): Fix serving files/paths with Unicode characters (#2464)

    • Bugfix (regression in 1.2): Revert livereload file watching to use polling observer (#2477)

      This had to be done to reasonably support usages that span virtual filesystems such as non-native Docker and network mounts.

      This goes back to the polling approach, very similar to that was always used prior, meaning most of the same downsides with latency and CPU usage.

    • Revert from 1.2: Remove the requirement of a site_url config and the restriction on use_directory_urls (#2490)

    • Bugfix (regression in 1.2): Don't require trailing slash in the URL when serving a directory index in mkdocs serve server (#2507)

      Instead of showing a 404 error, detect if it's a directory and redirect to a path with a trailing slash added, like before.

    • Bugfix: Fix gh_deploy with config-file in the current directory (#2481)

    • Bugfix: Fix reversed breadcrumbs in "readthedocs" theme (#2179)

    • Allow "mkdocs.yaml" as the file name when '--config' is not passed (#2478)

    ... (truncated)

    Commits
    • d167eab Release 1.2.3 (#2614)
    • 5629b09 Re-format translation files to pass a lint check (#2621)
    • 2c4679b Re-format translation files to pass a lint check (#2620)
    • 9262cc5 Fix the code to abbreviate Python's version (#2618)
    • 8345850 Add hint about -f/--config-file in configuration documentation (#2616)
    • 815af48 Added translation for Brazilian Portuguese (#2535)
    • 6563439 Update contact instructions: announce chat, preference for issues (#2610)
    • 6b72eef We can again announce support of zh_CN locale (#2609)
    • b18ae29 Drop assert_mock_called_once compat method from tests (#2611)
    • 7a27572 Isolate strict warning counter to just the ongoing build (#2607)
    • Additional commits viewable in compare view

    dependencies python 
    opened by dependabot[bot] 3
  • Regression:  'rare filter'  now ignores '-e' form of '--extract' flag

    Before: image

    Now: image

    Now but with --extract: image

    I've taken screenshots to preserve image highlighting; here's the cut-and-paste friendly test case: echo "a: 1 b: 2 c: 3" | ./rare f -m 'a: (\d+) b: (\d+) c: (\d+)' -e '{1} {2} {3}'

    Everything seems to work fine using the short '-e' with histogram, analyze, etc; it's just filter that is impacted.

    bug 
    opened by obaudys 3
  • Optimize memory usage by reducing buffered batches by default

    With default settings, rare used ~50MB consistently. These tweaks and settings lower it to ~10MB while maintaining performance. For io-burst systems, you can tweak up the buffered batches via CLI.

    opened by zix99 2
  • Tables2

    • Refactor how table color-coding is handled
    • More dense information in table
    • Add totals column/row
    • Change sorting to be consistent with other display methods
    opened by zix99 2
  • Follow reader

    Replacing gotail with internal file following is a significant performance improvement. The previous benchmark maxed out at 20-30 MB/sec (likely mostly because of the single-batched channel from gotail). New benchmarks max out at 500 MB/sec, and seem limited by disk at that point.

    opened by zix99 2
  • Readthru

    Introduce immediate-readahead, providing more immediate results at similar performance to readahead, and with the same memory characteristics. Resolves #61

    opened by zix99 2
  • Bump github.com/tidwall/gjson from 1.3.5 to 1.9.3

    Bumps github.com/tidwall/gjson from 1.3.5 to 1.9.3.

    dependencies go 
    opened by dependabot[bot] 2
  • Bump to go 1.17

    Initial benchmarks show the callsite improvements are speeding up rare by a few percentage points. Will post some benchmarks soon.

    Histogram benchmark (1.5 GB of logs): go 1.16: 23s real; 1m17s user time go 1.17: 20s real; 1m7s user time

    About a 12% savings on user-time at a high level.

    opened by zix99 2
  • Why not use grok?

    Hello, grok is a generally common log parsing language that allows for a clear combination of regular expressions. It is used in tools like logstash and vector. I was just curious why you opted for traditional regex and match groups rather than using grok.

    Thanks, Cam.

    enhancement 
    opened by CameronNemo 5
  • Memory and CPU usage

    I'm curious whether the README could include a comparison of memory and CPU usage between standard Unix tools, ag, and rare?

    A benchmark on an embedded device would also be useful.

    documentation 
    opened by proyb6 1
Releases(0.3.0)
Owner
Chris LaPointe
Full stack Software Engineer at TripAdvisor, focusing on common services and backend infrastructure. I like to make other engineers' lives easier.
Chris LaPointe
A CLI tool which loads data from yaml files into the Google Cloud Spanner tables

splanter A CLI tool which loads data from yaml files into the Google Cloud Spanner tables (mainly for the development).

Yuki Ito 15 Oct 27, 2022
A Go package for converting RGB and other color formats/colorspaces into DMC thread colors (DMC color name and floss number)

go-c2dmc A Go package for converting RGB and other color formats/colorspaces into DMC thread colors (DMC color name and floss number). Implemented as

null 6 Jul 25, 2022
Brigodier is a command parser & dispatcher, designed and developed for command lines such as for Discord bots or Minecraft chat commands. It is a complete port from Mojang's "brigadier" into Go.

brigodier Brigodier is a command parser & dispatcher, designed and developed to provide a simple and flexible command framework. It can be used in man

Minekube 16 Jun 5, 2022
Utilities to prettify console output of tables, lists, progress-bars, text, etc.

go-pretty Utilities to prettify console output of tables, lists, progress-bars, text, etc. Table Pretty-print tables into ASCII/Unicode strings.

Naveen Mahalingam 1.6k Nov 27, 2022
Stonks is a terminal based stock visualizer and tracker that displays realtime stocks in graph format in a terminal.

Stonks is a terminal based stock visualizer and tracker. Installation Requirements: golang >= 1.13 Manual Clone the repo Run make && make install Pack

Eric Moynihan 516 Nov 16, 2022
Use the command to convert arbitrary formats to Go Struct (including json, toml, yaml, etc.)

go2struct-tool Use the command to convert arbitrary formats to Go Struct (including json, toml, yaml, etc.) Installation Run the following command und

Afeyer 1 Dec 16, 2021
A Go library for easily configuring and running command chains, much like pipelining in Unix shells.

go-command-chain A Go library for easily configuring and running command chains, much like pipelining in Unix shells. Example cat log_file.txt | grep error |

null 36 Nov 18, 2022
ReverseSSH - a statically-linked ssh server with reverse shell functionality for CTFs and such

ReverseSSH A statically-linked ssh server with a reverse connection feature for simple yet powerful remote access. Most useful during HackTheBox chall

null 641 Nov 22, 2022
VIP video downloader, such as: iqiyi, youku, qq, ...etc.

vip-video-downloader VIP Video Downloader, such as: iqiyi, youku, qq, ...etc. usage Download vip-video-downloader download URL [flags] Merge vip-video

@Billcoding 13 Aug 18, 2022
A Go library and common interface for running local and remote commands

go-runcmd go-runcmd is a Go library and common interface for running local and remote commands providing the Runner interface which helps to abstract

AUCloud 1 Nov 25, 2021
Chore is an elegant and simple tool for executing common tasks on remote servers.

Chore is a tool for executing common tasks you run on your remote servers. You can easily setup tasks for deployment, commands, and more.

Ahmed waleed 39 May 20, 2022
kubeaudit helps you audit your Kubernetes clusters against common security controls

kubeaudit helps you audit your Kubernetes clusters against common security controls

Shopify 1.4k Nov 25, 2022
CLI tool to convert many common document types to plain text.

Textify. CLI tool to convert many common document types to plain text. Goals. SO many different document types exist today. PDFs, EPUB books, Microsof

Quin 1 Nov 19, 2021
Integrated console application library, using Go structs as commands, with menus, completions, hints, history, Vim mode, $EDITOR usage, and more ...

Gonsole - Integrated Console Application library This package rests on a readline console library, (giving advanced completion, hint, input and histor

null 18 Nov 20, 2022
convert curl commands to Python, JavaScript, Go, PHP, R, Dart, Java, MATLAB, Rust, Elixir and more

curlconverter curlconverter transpiles curl commands into programs in other programming languages. $ curlconverter --data "Hello, world!" example.com

null 6k Nov 20, 2022
Chalk is a Go Package which can be used for making terminal output more vibrant with text colors, text styles and background colors.

Chalk Chalk is a Go Package which can be used for making terminal output more vibrant with text colors, text styles and background colors. Documentati

null 6 Oct 29, 2022
Buildkite-cli - Command line tool for interacting with Buildkite pipelines, builds, and more

Buildkite CLI Command line tool for interacting with Buildkite pipelines, builds

Mark Skelton 1 Jan 7, 2022
Bk - Command line tool for interacting with Buildkite pipelines, builds, and more

Buildkite CLI Command line tool for interacting with Buildkite pipelines, builds

Mark Skelton 1 Jan 7, 2022