A statsd implementation written in Go

Overview

statsgod

Statsgod is a metric aggregation service inspired by the statsd project. Written in Go, it is designed for better performance and can be deployed without dependencies. It uses the same metric string format as statsd and adds features such as alternate sockets and authentication.

Usage

Usage:  statsgod [args]
  -config="config.yml": YAML config file path

Example:

  1. Start the daemon.

    gom exec go run statsgod.go
    
  2. Start a testing receiver.

     gom exec go run test_receiver.go
    
  3. Send data to the daemon. This sets a gauge to 3 for your.metric.name:

     echo "your.metric.name:3|g" | nc localhost 8125 # TCP
     echo "your.metric.name:3|g" | nc -4u -w0 localhost 8126 # UDP
     echo "your.metric.name:3|g" | nc -U /tmp/statsgod.sock # Unix Socket
    

Metric Format

Data is sent over a socket connection as a string using the format [namespace]:[value]|[type], where the namespace is a dot-delimited string like "user.login.success". Values are floating point numbers represented as strings. The metric type uses the following values:

  • Gauge (g): constant metric, keeps the last value.
  • Counter (c): increment/decrement a given namespace.
  • Timer (ms): a timer that calculates averages (see below).
  • Set (s): a count of unique values sent during a flush period.

Optionally you may denote that a metric has been sampled by adding "|@0.75" (where 0.75 is the sample rate as a float). Counters will inflate the value accordingly so that it can be accurately used to calculate a rate.

An example data string would be "user.login.success:123|c|@0.9"
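
The format above can be parsed with a few string operations. The following is a minimal sketch, not statsgod's own parser: the Metric type and the ParseMetric and Inflated names are hypothetical, and inflating a counter by dividing its value by the sample rate is an assumption borrowed from how statsd treats sampled counters.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Metric is a hypothetical container for one parsed metric line.
type Metric struct {
	Namespace  string
	Value      float64
	Type       string  // "g", "c", "ms" or "s"
	SampleRate float64 // 1.0 when no "|@rate" suffix is present
}

// ParseMetric splits "[namespace]:[value]|[type]" with an optional "|@rate".
func ParseMetric(line string) (Metric, error) {
	m := Metric{SampleRate: 1.0}
	line = strings.TrimSpace(line)
	colon := strings.Index(line, ":")
	if colon < 1 {
		return m, fmt.Errorf("missing namespace in %q", line)
	}
	m.Namespace = line[:colon]
	parts := strings.Split(line[colon+1:], "|")
	if len(parts) < 2 || len(parts) > 3 {
		return m, fmt.Errorf("expected value|type[|@rate] in %q", line)
	}
	value, err := strconv.ParseFloat(parts[0], 64)
	if err != nil {
		return m, fmt.Errorf("bad value in %q: %v", line, err)
	}
	m.Value = value
	switch parts[1] {
	case "g", "c", "ms", "s":
		m.Type = parts[1]
	default:
		return m, fmt.Errorf("unknown type %q", parts[1])
	}
	if len(parts) == 3 {
		if !strings.HasPrefix(parts[2], "@") {
			return m, fmt.Errorf("bad sample rate in %q", line)
		}
		rate, err := strconv.ParseFloat(parts[2][1:], 64)
		// Written so that NaN is rejected as well.
		if err != nil || !(rate > 0 && rate <= 1) {
			return m, fmt.Errorf("bad sample rate in %q", line)
		}
		m.SampleRate = rate
	}
	return m, nil
}

// Inflated returns the value scaled up by the sample rate for counters
// (assumption: value divided by rate) and the raw value for other types.
func (m Metric) Inflated() float64 {
	if m.Type != "c" {
		return m.Value
	}
	return m.Value / m.SampleRate
}

func main() {
	m, err := ParseMetric("user.login.success:123|c|@0.9")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%s type=%s value=%g inflated=%.2f\n", m.Namespace, m.Type, m.Value, m.Inflated())
}
```

With a sample rate of 0.5, a counter value of 10 would be inflated to 20 under this assumption.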

Sending Metrics

Client code can send metrics via any one of three sockets which listen concurrently:

  1. TCP

    • Allows multiple metrics to be sent over a connection, separated by a newline character.
    • Connection will remain open until closed by the client.
    • Config:
      • connection.tcp.enabled
      • connection.tcp.host
      • connection.tcp.port
  2. UDP

    • Allows multiple metrics to be sent in a single packet, separated by a newline character. Take care not to exceed the maximum packet size (default 1024 bytes).
    • Config:
      • connection.udp.enabled
      • connection.udp.host
      • connection.udp.port
      • connection.udp.maxpacket (buffer size to read incoming packets)
  3. Unix Domain Socket

    • Allows multiple metrics to be sent over a connection, separated by a newline character.
    • Connection will remain open until closed by the client.
    • Config:
      • connection.unix.enabled
      • connection.unix.file (path to the sock file)
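
Because UDP payloads must stay under the maximum packet size, a client that batches metrics needs to split them into packets. The sketch below shows one way to do that; PackMetrics is a hypothetical helper, not part of statsgod, and dropping lines that are longer than the limit is an assumption made to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
)

// PackMetrics groups metric lines into newline-separated payloads of at
// most maxPacket bytes each. A single line longer than maxPacket is skipped.
func PackMetrics(lines []string, maxPacket int) []string {
	var packets []string
	var current strings.Builder
	for _, line := range lines {
		if len(line) > maxPacket {
			continue // cannot fit in any packet
		}
		// The +1 accounts for the newline separator.
		if current.Len() > 0 && current.Len()+1+len(line) > maxPacket {
			packets = append(packets, current.String())
			current.Reset()
		}
		if current.Len() > 0 {
			current.WriteByte('\n')
		}
		current.WriteString(line)
	}
	if current.Len() > 0 {
		packets = append(packets, current.String())
	}
	return packets
}

func main() {
	lines := []string{"a.b:1|c", "c.d:2|g", "e.f:300|ms"}
	// A tiny limit is used here only to force a split.
	for i, p := range PackMetrics(lines, 16) {
		fmt.Printf("packet %d (%d bytes): %q\n", i, len(p), p)
	}
}
```

Each returned payload could then be written to a connection opened with net.Dial("udp", "localhost:8126"), using the UDP port from the usage example above.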

Configuration

All runtime options are specified in a YAML file. See example.config.yml for defaults. For example:

go run statsgod.go -config=/etc/statsgod.yml

Stats Types

Statsgod provides support for the following metric types.

  1. Counters - these are cumulative values that calculate the sum of all metrics sent. A rate is also calculated to determine how many values were sent during the flush interval:

     my.counter:1|c
     my.counter:1|c
     my.counter:1|c
     # flush produces a count and a rate:
     [prefix].my.counter.[suffix] [timestamp] 3
     [prefix].my.counter.[suffix] [timestamp] [3/(duration of flush interval in seconds)]
    
  2. Gauges - these are a "last in" measurement which discards all previously sent values:

     my.gauge:1|g
     my.gauge:2|g
     my.gauge:3|g
     # flush only sends the last value:
     [prefix].my.gauge.[suffix] [timestamp] 3
    
  3. Timers - these are timed values measured in milliseconds. Statsgod provides several calculated values based on the sent metrics:

     my.timer:100|ms
     my.timer:200|ms
     my.timer:300|ms
     # flush produces several calculated fields:
     [prefix].my.timer.mean_value.[suffix] [timestamp] [mean]
     [prefix].my.timer.median_value.[suffix] [timestamp] [median]
     [prefix].my.timer.min_value.[suffix] [timestamp] [min]
     [prefix].my.timer.max_value.[suffix] [timestamp] [max]
     [prefix].my.timer.mean_90.[suffix] [timestamp] [mean in 90th percentile]
     [prefix].my.timer.upper_90.[suffix] [timestamp] [upper in 90th percentile]
     [prefix].my.timer.sum_90.[suffix] [timestamp] [sum in 90th percentile]
    
  4. Sets - these track the number of unique values sent during a flush interval:

     my.unique:1|s
     my.unique:2|s
     my.unique:2|s
     my.unique:1|s
     # flush produces a single value counting the unique metrics sent:
     [prefix].my.unique.[suffix] [timestamp] 2
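
The flush calculations described above can be sketched in a few small functions. This is an illustration only, not statsgod's aggregation code: the function names are hypothetical, the median of an even number of samples is assumed to be the average of the two middle values, and the 90th percentile fields are left out.

```go
package main

import (
	"fmt"
	"sort"
)

// CounterFlush returns the sum of the values and the per-second rate
// over a flush interval of flushSeconds.
func CounterFlush(values []float64, flushSeconds float64) (sum, rate float64) {
	for _, v := range values {
		sum += v
	}
	return sum, sum / flushSeconds
}

// GaugeFlush returns the last value sent; ok is false when none was sent.
func GaugeFlush(values []float64) (last float64, ok bool) {
	if len(values) == 0 {
		return 0, false
	}
	return values[len(values)-1], true
}

// SetFlush counts the distinct values seen during the interval.
func SetFlush(values []string) int {
	seen := make(map[string]struct{})
	for _, v := range values {
		seen[v] = struct{}{}
	}
	return len(seen)
}

// TimerFlush returns the mean, median, minimum and maximum of the samples.
// All results are zero when no samples were sent.
func TimerFlush(values []float64) (mean, median, lo, hi float64) {
	n := len(values)
	if n == 0 {
		return 0, 0, 0, 0
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), values...)
	sort.Float64s(sorted)
	var sum float64
	for _, v := range sorted {
		sum += v
	}
	mean = sum / float64(n)
	if n%2 == 1 {
		median = sorted[n/2]
	} else {
		median = (sorted[n/2-1] + sorted[n/2]) / 2
	}
	return mean, median, sorted[0], sorted[n-1]
}

func main() {
	sum, rate := CounterFlush([]float64{1, 1, 1}, 10)
	fmt.Printf("counter: count=%g rate=%g\n", sum, rate)

	last, _ := GaugeFlush([]float64{1, 2, 3})
	fmt.Printf("gauge: %g\n", last)

	fmt.Printf("set: %d\n", SetFlush([]string{"1", "2", "2", "1"}))

	mean, median, lo, hi := TimerFlush([]float64{100, 200, 300})
	fmt.Printf("timer: mean=%g median=%g min=%g max=%g\n", mean, median, lo, hi)
}
```

The inputs in main mirror the examples in the list above, with a 10 second flush interval assumed for the counter rate.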
    

Prefix/Suffix

Prefixes and suffixes noted above can be customized in the configuration. Metrics will render as [prefix].[type prefix].[metric namespace].[type suffix].[suffix]. You may also use empty strings in the config for any values you do not wish statsgod to prefix/suffix before relaying.

namespace:
  prefix: "stats"
  prefixes:
    counters: "counts"
    gauges: "gauges"
    rates: "rates"
    sets: "sets"
    timers: "timers"
  suffix: ""
  suffixes:
    counters: ""
    gauges: ""
    rates: ""
    sets: ""
    timers: ""
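
The rendering rule can be illustrated with a short helper. This is a sketch under the assumption that empty segments are simply dropped so that no doubled dots appear; BuildNamespace is a hypothetical name and not a function from statsgod.

```go
package main

import (
	"fmt"
	"strings"
)

// BuildNamespace joins the non-empty segments with dots in the order
// [prefix].[type prefix].[metric namespace].[type suffix].[suffix].
func BuildNamespace(prefix, typePrefix, namespace, typeSuffix, suffix string) string {
	segments := make([]string, 0, 5)
	for _, s := range []string{prefix, typePrefix, namespace, typeSuffix, suffix} {
		if s != "" {
			segments = append(segments, s)
		}
	}
	return strings.Join(segments, ".")
}

func main() {
	// Values taken from the sample configuration above.
	fmt.Println(BuildNamespace("stats", "counts", "my.counter", "", ""))
	fmt.Println(BuildNamespace("stats", "timers", "my.timer.mean_value", "", ""))
}
```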

Authentication

Auth is handled via the statsgod.Auth interface. Currently there are two types of authentication: no-auth and token-auth, which are specified in the configuration file:

  1. No auth

     # config.yml
     service:
       auth: "none"
    

Works as you might expect: all metric strings are parsed without authentication or manipulation. This is the default behavior.

  2. Token auth

     # config.yml
     service:
       auth: "token"
       tokens:
         "token-name": false
         "32a3c4970093": true
    

"token" checks the configuration file for a valid auth token. The config file may specify as many tokens as needed in the service.tokens map. These are written as "string": bool, where the string is the token and the bool indicates whether the token is valid. Note that the tokens are read into memory when the process starts, so changes to the token map require a restart.

When sending metrics, the token is specified at the beginning of the metric namespace followed by a dot. For example, a metric "32a3c4970093.my.metric:123|g" would look in the config tokens for the string "32a3c4970093" and see if that is set to true. If valid, the process will strip the token from the namespace, only parsing and aggregating "my.metric:123|g". NOTE: since metric namespaces are dot-delimited, you cannot use a dot in a token.
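
The token check described above amounts to splitting the metric at the first dot and looking the prefix up in the token map. The sketch below illustrates that logic; the Authenticate function is hypothetical and does not reproduce the statsgod.Auth interface.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Authenticate treats everything before the first dot as the token. It
// returns the metric with the token stripped when the token maps to true.
func Authenticate(metric string, tokens map[string]bool) (string, error) {
	dot := strings.Index(metric, ".")
	if dot < 1 {
		return "", errors.New("metric has no token prefix")
	}
	token := metric[:dot]
	// A missing key yields false, so unknown tokens are rejected too.
	if !tokens[token] {
		return "", fmt.Errorf("invalid token %q", token)
	}
	return metric[dot+1:], nil
}

func main() {
	// Mirrors the service.tokens map from the example configuration.
	tokens := map[string]bool{
		"token-name":   false,
		"32a3c4970093": true,
	}
	for _, metric := range []string{
		"32a3c4970093.my.metric:123|g",
		"token-name.my.metric:123|g",
	} {
		stripped, err := Authenticate(metric, tokens)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", stripped)
	}
}
```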

Signal handling

The statsgod service is equipped to handle the following signals:

  1. Shut down the sockets and clean up before exiting.

    • SIGABRT
    • SIGINT
    • SIGTERM
    • SIGQUIT
  2. Reload* the configuration without restarting.

    • SIGHUP

* When reloading configuration, not all values will affect the current runtime. The following are only available on start up and not currently reloadable:

  • connection.*
  • relay.*
  • stats.percentile
  • debug.verbose
  • debug.profile

Development

Read more about the development process.

License

Except as otherwise noted this software is licensed under the Apache License, Version 2.0

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Comments
  • Why is statsgod cooler than gostatsd?

    Could be helpful to highlight why we should use this library over https://github.com/kisielk/gostatsd. Statsgod looks way cooler given the 5 minute sniff test, but spelling it out would be great.

    question 
    opened by cpliakas 3
  • Explore using Ginkgo BDD

    http://onsi.github.io/ginkgo/ looks to be a very nice testing framework. We should explore using that over our current approach (https://github.com/stretchr/testify).

    opened by kevinhankens 3
  • Support "set" metric types that track the unique values between flushes

    A set is a metric that keeps track of the number of unique occurrences of a metric sent between flush periods. We could either support the "s" metric type like statsd or we could just calculate this as a value for all metrics. Also, I think "sets" is a terrible name... "uniques" might be better.

    Option 1 - "set" metric type (notice the "s" type):

    my.value:123|s
    my.value:124|s
    my.value:125|s
    my.value:123|s
    -- flush --
    generates: stats.sets.my.value 3
    

    Option 2 - generate a unique metric for all types (notice the "g" type):

    my.value:123|g
    my.value:124|g
    my.value:125|g
    my.value:123|g
    -- flush --
    generates: stats.gauges.my.value 123 -and- stats.uniques.my.value 3
    
    opened by kevinhankens 2
  • Create configurable global/type namespace prefixes

    Instead of sending things prefixed as statsgod, they should be configurable. Examples:

    Currently

    my.stat.counter:123|c -> stats.my.stat.counter
    my.stat.gauge:123|g -> stats.gauges.my.stat.gauge
    my.stat.timer:123|ms -> stats.timers.my.stat.timer
    

    We should be able to define a global prefix to specify the leading "stats" and specify the type (which we currently don't have for counters).

    config.Namespace.Global = "acquia"
    config.Namespace.Counters = "c"
    config.Namespace.Gauges = "g"
    config.Namespace.Timers = "t"

    my.stat.counter:123|c -> acquia.c.my.stat.counter
    my.stat.gauge:123|g -> acquia.g.my.stat.gauge
    my.stat.timer:123|ms -> acquia.t.my.stat.timer
    
    opened by kevinhankens 2
  • optionally persist metrics in memory to create contiguous graphs

    Statsd has the option to either persist metrics indefinitely or to delete them after flushing to the backend relay. The persistence causes a memory leak, but we should at least allow the behavior as it could be seen as a good feature. We could also have the option to drop the metrics after a duration with no new metrics sent in.

    opened by kevinhankens 2
  • gom build fails unless yaml is already installed

    Following the instructions in the readme doesn't work unless I have already run go get for yaml.v1 and yaml.v2. We should fix the readme or get the gom process to fetch all the dependencies.

    This was on a fresh install of Go/Statsgod/etc. on a new machine.

    Initial error:

    $ gom build
    statsgod.go:22:2: cannot find package "gopkg.in/yaml.v1" in any of:
        /usr/local/go/src/gopkg.in/yaml.v1 (from $GOROOT)
        /Users/andrew.kenney/dev/statsgod/_vendor/src/gopkg.in/yaml.v1 (from $GOPATH)
        /Users/andrew.kenney/dev/statsgod/src/gopkg.in/yaml.v1
        /Users/andrew.kenney/go/src/gopkg.in/yaml.v1
    gom: exit status 1

    But once I do install yaml1/2:

    macbookpro-andrewkenney:statsgod andrew.kenney$ go get gopkg.in/yaml.v1
    macbookpro-andrewkenney:statsgod andrew.kenney$ go get gopkg.in/yaml.v2
    macbookpro-andrewkenney:statsgod andrew.kenney$ gom build
    macbookpro-andrewkenney:statsgod andrew.kenney$ gom install
    downloading github.com/jmcvetta/randutil
    downloading gopkg.in/yaml.v1
    macbookpro-andrewkenney:statsgod andrew.kenney$ gom exec go run statsgod.go
    INFO: 2015/05/08 16:07:49 statsgod.go:103: Loaded Config: map[name:statsgod host:localhost port:8125 debug:false graphiteHost:localhost graphitePost:5001 flushTime:10s percentile:80]
    INFO: 2015/05/08 16:07:49 statsgod.go:106: Starting stats server on localhost:8125
    INFO: 2015/05/08 16:07:49 statsgod.go:246: Flushing every 10s

    opened by kenney 1
  • Improving string performance.

    After profiling I noticed that it was spending a ton of time in strings.Trim() so I reworked it a little bit to iterate over the runes (int32 char value) instead of using convenience functions in the strings lib. Subsequent profiling and pidstat monitoring looks much better. There is still a lot of room for improvement, but this is an incremental improvement.

    opened by kevinhankens 1
  • UDP Socket should accept multiple metrics delimited by a newline character.

    Right now only TCP and Unix sockets will allow the client to send multiple metrics. UDP should do the same, allowing multiple metrics in one packet separated by a newline char. We currently only read 512 bytes, so also make sure we are set up to handle larger packets. The statsd docs have a decent description of the recommended packet sizes.

    opened by kevinhankens 1
  • Create a load test using unix socket connection pools

    We need to better understand the performance of using a conn pool with the unix socket. The only big change that this will require is that connection pools currently take a host and port as two arguments. Unix sockets use a single arg for the address. We could let the conn pool take a single host:port arg instead. It will also require a type arg "unix" or "tcp".

    opened by kevinhankens 1
  • The UDP listener needs to be closed more gracefully

    diff --git a/statsgod.go b/statsgod.go
    index 4b5ccad..a4090d1 100644
    --- a/statsgod.go
    +++ b/statsgod.go
    @@ -151,15 +151,15 @@ func main() {
            go func() {
                    s := <-signalChannel
                    logger.Info.Printf("Processed signal %v", s)
    -               socketTcp.Close(logger)
    -               socketUdp.Close(logger)
    -               socketUnix.Close(logger)
                    finishChannel <- 1
            }()
    
            select {
            case <-finishChannel:
                    logger.Info.Println("Exiting program.")
    +               socketTcp.Close(logger)
    +               socketUdp.Close(logger)
    +               socketUnix.Close(logger)
            }
     }
    
    
    opened by kevinhankens 1
  • Apache licenses should list the copyright owner

    Per http://www.apache.org/licenses/LICENSE-2.0.html, we need to list Acquia in the file headers above the Apache 2.0 license: "Copyright 2014 Acquia, Inc."

    e.g. https://github.com/acquia/nemesis/blob/master/Rakefile#L1-L13

    bug 
    opened by kevinhankens 1
  • Support histograms

    I'd love for statsgod to support generating histograms instead of relying on percentiles and mean / median values to aggregate values.

    Would you be open to a PR to add this feature?

    opened by fcantournet 1
  • Go vet is now bundled.

    This PR does three things

    • gets rid of the 1.4 travis tests as the version is out of date
    • gets rid of the external dependency on vet as it is now part of >1.5
    • explicitly vets and lints our files as travis includes other packages that shouldn't fail our tests
    opened by kevinhankens 0
  • Concurrency and data corruption

    With the concurrency setting in the configuration, statsgod will set up multiple consumers for the relayChannel channel:

    https://github.com/acquia/statsgod/blob/master/statsgod.go#L94

    Each consumer maintains its own hash of metrics. So how does this not cause data corruption when the same metric key is handled by more than one of the running consumers? It looks like the individual flush cycles would end up overwriting the same metric from another consumer's flush cycle.

    opened by jjneely 3
  • Create a flat file relay for local storage

    We should consider having the option for a local flat file relay for all metrics. This would allow it to write to a log file - something like a CSV of all metrics that are simply appended instead of being sent to a remote relay. This would allow for alternate uses like reading data into R or gnuplot or something for local analysis without the overhead of needing a graphite setup.

    opened by kevinhankens 0
  • Create a spool file in case the remote relay is unavailable

    In the case where carbon goes away, we might want to preserve metrics. We could accomplish this by having a spool file where we write metrics in case of a relay error. When statsgod is first started we could have an optional config that reads in from the spool file and sends to the relay.

    opened by kevinhankens 0