Netflix's Hystrix latency and fault tolerance library, for Go

Overview

hystrix-go

Hystrix is a great project from Netflix.

Hystrix is a latency and fault tolerance library designed to isolate points of access to remote systems, services and 3rd party libraries, stop cascading failure and enable resilience in complex distributed systems where failure is inevitable.

I think the Hystrix patterns of programmer-defined fallbacks and adaptive health monitoring are good for any distributed system. Goroutines and channels are great concurrency primitives, but they don't directly help an application stay available during failures.

hystrix-go aims to allow Go programmers to easily build applications with execution semantics similar to those of the Java-based Hystrix library.

For more about how Hystrix works, refer to the Java Hystrix wiki.

For API documentation, refer to GoDoc.

How to use

import "github.com/afex/hystrix-go/hystrix"

Execute code as a Hystrix command

Define the application logic that relies on external systems and pass it as a function to hystrix.Go. While those systems are healthy, this function is the only thing that executes.

hystrix.Go("my_command", func() error {
	// talk to other services
	return nil
}, nil)

Defining fallback behavior

If you want code to execute during a service outage, pass in a second function to hystrix.Go. Ideally, the logic here will allow your application to gracefully handle external services being unavailable.

The fallback triggers when your code returns an error, or whenever it is unable to complete based on a variety of health checks.

hystrix.Go("my_command", func() error {
	// talk to other services
	return nil
}, func(err error) error {
	// do this when services are down
	return nil
})

Waiting for output

Calling hystrix.Go is like launching a goroutine, except you receive a channel of errors you can choose to monitor.

output := make(chan bool, 1)
errors := hystrix.Go("my_command", func() error {
	// talk to other services
	output <- true
	return nil
}, nil)

select {
case <-output:
	// success
case err := <-errors:
	// failure
	log.Printf("my_command failed: %v", err)
}

Synchronous API

Since calling a command and immediately waiting for it to finish is a common pattern, a synchronous API is available with the hystrix.Do function which returns a single error.

err := hystrix.Do("my_command", func() error {
	// talk to other services
	return nil
}, nil)

Configure settings

During application boot, you can call hystrix.ConfigureCommand() to tweak the settings for each command.

hystrix.ConfigureCommand("my_command", hystrix.CommandConfig{
	Timeout:               1000,
	MaxConcurrentRequests: 100,
	ErrorPercentThreshold: 25,
})

You can also use hystrix.Configure() which accepts a map[string]CommandConfig.
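
As a rough illustration of the map form, the sketch below configures two commands in a single call. The command names and all values are made up for the example; Timeout is in milliseconds, as in the ConfigureCommand example above.

```go
package main

import "github.com/afex/hystrix-go/hystrix"

func main() {
	// The command names and settings here are illustrative assumptions,
	// not recommended values.
	hystrix.Configure(map[string]hystrix.CommandConfig{
		"get_user": {
			Timeout:               1000, // milliseconds
			MaxConcurrentRequests: 100,
			ErrorPercentThreshold: 25,
		},
		"send_email": {
			Timeout:               5000, // milliseconds
			MaxConcurrentRequests: 10,
			ErrorPercentThreshold: 50,
		},
	})
}
```

Keeping every command's settings in one map makes it easier to load them from a single configuration source at boot.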

Enable dashboard metrics

In your main.go, register the event stream HTTP handler on a port and launch it in a goroutine. Once you configure turbine for your Hystrix Dashboard to start streaming events, your commands will automatically begin appearing.

hystrixStreamHandler := hystrix.NewStreamHandler()
hystrixStreamHandler.Start()
go http.ListenAndServe(net.JoinHostPort("", "81"), hystrixStreamHandler)

Send circuit metrics to Statsd

c, err := plugins.InitializeStatsdCollector(&plugins.StatsdCollectorConfig{
	StatsdAddr: "localhost:8125",
	Prefix:     "myapp.hystrix",
})
if err != nil {
	log.Fatalf("could not initialize statsd client: %v", err)
}

metricCollector.Registry.Register(c.NewStatsdCollector)

FAQ

What happens if my run function panics? Does hystrix-go trigger the fallback?

No. hystrix-go does not use recover(), so panics will kill the process as usual.
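
If you do want a panic to be treated as a failure, one option is to convert it into an error yourself before handing the function to hystrix-go. The sketch below is only an illustration: recoverToError is a hypothetical helper, not part of hystrix-go, and it uses nothing beyond the standard library.

```go
package main

import "fmt"

// recoverToError is a hypothetical helper (not part of hystrix-go). It wraps a
// run function so that a panic inside it is returned as an ordinary error.
func recoverToError(run func() error) func() error {
	return func() (err error) {
		defer func() {
			if r := recover(); r != nil {
				// Overwrite the named return value with the recovered panic.
				err = fmt.Errorf("recovered from panic: %v", r)
			}
		}()
		return run()
	}
}

func main() {
	wrapped := recoverToError(func() error {
		panic("boom")
	})
	fmt.Println(wrapped()) // prints "recovered from panic: boom"
}
```

Passing recoverToError(run) instead of run to hystrix.Go or hystrix.Do means the panic surfaces as a returned error, which is one of the conditions that triggers the fallback. Whether swallowing panics is appropriate depends on your application.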

Build and Test

  • Install vagrant and VirtualBox
  • Clone the hystrix-go repository
  • Inside the hystrix-go directory, run vagrant up, then vagrant ssh
  • cd /go/src/github.com/afex/hystrix-go
  • go test ./...

Issues
  • Possible regression in 5b79165

    I'm using hystrix-go in go-kit, as one of the circuit breaker implementations. Thank you for your good work!

    On a recent CI run, I noticed an error. I've reproduced the unit test as a single executable. Consider this program. When I checkout revision be7b59e, I get this output:

    ugh ~/tmp/hystr rm -rf $GOPATH/pkg/darwin_amd64/github.com/afex/hystrix-go ; go run example.go
    priming with 40 successful requests
    switching to errors...
    now the next few requests should give us our error: kaboom
    got expected error 1
    got expected error 2
    the circuit should have opened by now
    hystrix-go: opening circuit my-endpoint
    got expected error: hystrix: circuit open
    got expected error: hystrix: circuit open
    got expected error: hystrix: circuit open
    got expected error: hystrix: circuit open
    got expected error: hystrix: circuit open
    everything works as expected
    

    When I checkout revision 5b79165, I get this output:

    ugh ~/tmp/hystr rm -rf $GOPATH/pkg/darwin_amd64/github.com/afex/hystrix-go ; go run example.go
    priming with 40 successful requests
    switching to errors...
    now the next few requests should give us our error: kaboom
    got expected error 1
    got expected error 2
    the circuit should have opened by now
    got unexpected error at request 1: kaboom
    exit status 1
    

    If there's a bug in my unit test, I suspect it's to do with the shouldPass statement. Can you find a problem, or is this a regression?

    opened by peterbourgon 11
  • Goroutine leak?

    Hi! If a function times out, ala: https://github.com/afex/hystrix-go/blob/master/hystrix/hystrix.go#L160-L165

    Then this would block, causing a leaked goroutine: https://github.com/afex/hystrix-go/blob/master/hystrix/hystrix.go#L97

    Yeah?

    opened by sethgrid 7
  • Export isOpen() and metrics

    We have a need to check if a circuit is open. Is there an easy way to do that? Also, we want to send some status to our metrics server and the only way I can think of is to hack the HTTP streamer. An example here (to stdout):

    // This is  used to output the Hystrix stream to stdout and only used for debugging
    // circuit stats
    type outputResponseStdout struct {
        HeaderMap http.Header
    }
    
    func (o *outputResponseStdout) Header() http.Header {
        m := o.HeaderMap
        if m == nil {
            m = make(http.Header)
            o.HeaderMap = m
        }
        return m
    }
    
    func (o *outputResponseStdout) Write(buf []byte) (int, error) {
        return os.Stdout.Write(buf)
    }
    
    func (o *outputResponseStdout) WriteHeader(c int) {
        fmt.Println("HTTP code: ", c)
    }
    
    func OutputHystrixEvents() {
        s := hystrix.NewStreamHandler()
        s.Start()
        rh := &outputResponseStdout{}
        req, err := http.NewRequest("GET", "", nil)
        if err != nil {
            return
        }
        s.ServeHTTP(rh, req)
    }
    
    2 - Working 
    opened by isaldana 4
  • tickets may not be returned to the pool and isTimeout race condition

    In Go(), the original logic is that one task goroutine acquires a ticket and then another timer goroutine returns it. Although it's unlikely, there's a chance that a ticket is never returned when the timer tries to return a ticket before the task acquires one.
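
One way to rule out that ordering problem is to acquire and return the ticket in the same goroutine. The sketch below is a minimal illustration of that idea using a buffered channel; ticketPool, newTicketPool and tryRun are hypothetical names, and this is not the hystrix-go implementation.

```go
package main

import "fmt"

// ticketPool is a hypothetical concurrency limiter backed by a buffered channel.
type ticketPool chan struct{}

// newTicketPool creates a pool that initially holds n tickets.
func newTicketPool(n int) ticketPool {
	p := make(ticketPool, n)
	for i := 0; i < n; i++ {
		p <- struct{}{}
	}
	return p
}

// tryRun runs task only if a ticket is available. The ticket is taken and
// returned by the same goroutine, so a return can never precede the acquire.
func (p ticketPool) tryRun(task func()) bool {
	select {
	case t := <-p:
		defer func() { p <- t }() // always hand the ticket back
		task()
		return true
	default:
		return false // pool exhausted
	}
}

func main() {
	pool := newTicketPool(2)
	ran := pool.tryRun(func() { fmt.Println("working with a ticket") })
	fmt.Println(ran, len(pool)) // prints "true 2": the ticket is back in the pool
}
```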

    opened by cfchou 3
  • Added support for CloseNotify in the event stream

    StreamHandlers were hanging around after clients disconnected and creating noise. I suppose with enough hits to ServeHTTP it could have eventually impacted legitimate clients via port exhaustion, etc.

    opened by dlclark 3
  • Reduced the number of times a lock is acquired.

    There was a race condition in getCurrentBucket which is removed. The codepaths that went down the getCurrentBucket path would sometimes acquire the same lock as many as 3 times. Given that increment and set max would eventually have to acquire an exclusive lock, I just acquire that lock off the bat. Added some benchmarks to state the case:

    Current master:
    BenchmarkRollingNumberIncrement-8 10000000 204 ns/op
    BenchmarkRollingNumberUpdateMax-8 10000000 209 ns/op

    This branch:
    BenchmarkRollingNumberIncrement-8 10000000 147 ns/op
    BenchmarkRollingNumberUpdateMax-8 20000000 149 ns/op

    1 - Ready 
    opened by dzyp 3
  • Closing error channel can panic

    If the callback returns an error (without having a fallback) an error is sent here: https://github.com/afex/hystrix-go/blob/master/hystrix/hystrix.go#L96.

    If a timeout in timer.C occurs, an error will be sent here: https://github.com/afex/hystrix-go/blob/master/hystrix/hystrix.go#L125

    Both errors can be triggered on the same call (i.e. if a timeout occurs and the callback later returns an error without a fallback function). If the error channel is closed after the caller receives the first error, sending the second error on the closed channel will panic. If this is expected behavior, should we document it here: https://github.com/afex/hystrix-go#waiting-for-output ? The workaround is not to close the channel and to let the GC collect the second error.
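
The workaround can be sketched with the standard library alone. In the example below, firstResult, slowFailure, errTimeout and shortTimeout are hypothetical names chosen for illustration, not hystrix-go code. The channel has room for both possible senders and is never closed, so the late error neither blocks its goroutine nor panics.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical values for the illustration.
var errTimeout = errors.New("timeout")

const shortTimeout = 10 * time.Millisecond

// slowFailure simulates a call that fails after the timeout has already fired.
func slowFailure() error {
	time.Sleep(200 * time.Millisecond)
	return errors.New("kaboom")
}

// firstResult returns whichever arrives first: the result of run or a timeout.
// The channel is buffered for both senders and is never closed, so the loser
// can still complete its send; the unread value is garbage collected later.
func firstResult(run func() error, timeout time.Duration) error {
	results := make(chan error, 2)
	go func() { results <- run() }()
	timer := time.AfterFunc(timeout, func() { results <- errTimeout })
	defer timer.Stop()
	return <-results
}

func main() {
	// The timeout fires first; the later "kaboom" is sent into the buffer and dropped.
	fmt.Println(firstResult(slowFailure, shortTimeout))
}
```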

    1 - Ready 
    opened by isaldana 3
  • Go() hangs if you only check for errors

    Given the following code, hystrix will permanently hang a goroutine:

    errors := hystrix.Go("foo", func() error {
        fmt.Println("Just checking if success/failure")
        return nil
    }, nil)
    
    err := <- errors
    return err
    
    opened by keyneston 3
  • support context and ignore circuit metrics for canceled contexts

    This is an attempt to appropriately support context in hystrix-go.

    It adds two new functions to the hystrix package, GoC and DoC, which behave the same as Go and Do but take a context as the first parameter.

    It keeps backwards compatibility by adapting Go and Do and calling GoC and DoC with context.Background().

    The most controversial choice is how I've chosen to deal with contexts. I've chosen to treat context errors separately from command errors. That is -- if the context passed in is canceled or deadline exceeded, we won't modify any circuit breaker metrics. The command in this case did not fail, just the computation is no longer relevant.

    This has the side effect of not triggering any 'Attempt' metrics in the case of context cancelation, which seems wrong, but avoids a mismatch between attempts and other failure conditions. We could consider adding a new metric 'cancels' and trigger 'attempts' and 'cancels' in the case we detect context canceled.
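
The distinction drawn above between context errors and command errors can be checked with the standard library. The sketch below is only an illustration of that check; isContextError and the two sample errors are hypothetical and are not taken from the pull request.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Hypothetical sample errors for the illustration.
var (
	errCommand         = errors.New("kaboom")
	errWrappedDeadline = fmt.Errorf("call backend: %w", context.DeadlineExceeded)
)

// isContextError reports whether err was caused by the caller's context being
// canceled or timing out, as opposed to the command itself failing.
func isContextError(err error) bool {
	return errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded)
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()

	fmt.Println(isContextError(ctx.Err()))          // true: the context was canceled
	fmt.Println(isContextError(errWrappedDeadline)) // true: errors.Is unwraps %w
	fmt.Println(isContextError(errCommand))         // false: a genuine command failure
}
```

A caller could use such a check to skip updating circuit metrics when the work was simply abandoned.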

    opened by brildum 2
  • Fix default metric collector data race

    Fix for this race:

    WARNING: DATA RACE
    Write by goroutine 3233:
      github.com/afex/hystrix-go/hystrix/metric_collector.(*DefaultMetricCollector).Reset()
          /build/Godeps/_workspace/src/github.com/afex/hystrix-go/hystrix/metric_collector/default_metric_collector.go:105 +0xe76
      github.com/afex/hystrix-go/hystrix.(*metricExchange).Reset()
          /build/Godeps/_workspace/src/github.com/afex/hystrix-go/hystrix/metrics.go:101 +0x11a
      github.com/afex/hystrix-go/hystrix.(*CircuitBreaker).setClose()
          /build/Godeps/_workspace/src/github.com/afex/hystrix-go/hystrix/circuit.go:154 +0x1a6
      github.com/afex/hystrix-go/hystrix.(*CircuitBreaker).ReportEvent()
          /build/Godeps/_workspace/src/github.com/afex/hystrix-go/hystrix/circuit.go:160 +0xa9
      github.com/afex/hystrix-go/hystrix.func·002()
          /build/Godeps/_workspace/src/github.com/afex/hystrix-go/hystrix/hystrix.go:106 +0x779
    
    Previous read by goroutine 33:
      github.com/afex/hystrix-go/hystrix.(*StreamHandler).publishMetrics()
          /build/Godeps/_workspace/src/github.com/afex/hystrix-go/hystrix/eventstream.go:104 +0x766
      github.com/afex/hystrix-go/hystrix.(*StreamHandler).loop()
          /build/Godeps/_workspace/src/github.com/afex/hystrix-go/hystrix/eventstream.go:65 +0x1a0
    
    opened by ghost 2
  • support go module

    • support Go modules to control the package dependencies
    • migrate github.com/cactus/go-statsd-client to v5
      • github.com/cactus/go-statsd-client/statsd does not support Go modules.

    I was getting this error when executing go mod tidy:

      github.com/cactus/go-statsd-client/statsd: module github.com/cactus/go-statsd-client@latest found (v3.2.1+incompatible), but does not contain package github.com/cactus/go-statsd-client/statsd
    
    opened by cs-lexliu 1
  • Incorrect build instructions

    Problem description

    With the current "Build and test" instructions in the readme:

    • there is an error on vagrant up because the https://github.com/smartystreets/assertions package now needs Go 1.13 (for errors.Is), but the VM provides Go 1.9
    • the go test ./... instructions don't work because /go is owned by root:root but the VM guest user is vagrant:vagrant, so the go tool can't download the dependencies
    • after adding a chown -R vagrant:vagrant /go, the go tool downloads succeed, but the build fails, again because of the https://github.com/smartystreets/assertions dependency on Go 1.13.

    Suggested changes

    • In the short term, upgrade to Go 1.17 since this is the current version
    • In the longer term convert to docker for ease of use/speed.
    opened by fgm 0
  • Integrate with go-leak library and fix leaks

    Summary

    | PR Status | Type | Impact level |
    | :---: | :---: | :---: |
    | Ready | Bug | Medium |

    Description

    • Integrates with a Goleak detection library.
    • Fixed ~26 identified leaks in tests
    opened by isopropylcyanide 2
  • method executed within hystrix behaving weirdly

    Hi,

    Sample code:

    	payload io.Reader) (bool, int) {
    	
    	//build request
    	request, err := http.NewRequest(method, url, payload)
    	if err != nil {
    		log.Printf("failed in request build %s %s \n", url, err.Error())
    		return false, 0
    	}
    
    	
    	//make request
    	response, err := h.HttpExecute(request)
    	if err != nil {
    		log.Println("HttpCommand=Error  URL=", url, " Error=", err)
    		return false, 0
    	} else {
    		log.Println("HttpCommand=Success  response=", response, " error=", err)
    		io.Copy(ioutil.Discard, response.Body)
    		defer response.Body.Close()
    
    		return true, response.StatusCode
    	}
    	
    }
    
    func (h *HTTPSink) HttpExecute(input *http.Request) (response *http.Response, err error){
    	if err := hystrix.Do("http", func() (err error) {
    		response, err = h.client.Do(input)
    		return err
    	}, nil); err != nil {
    		return nil, err
    	}
    	return response, nil
    }
    

    In the above code, response, err := h.HttpExecute(request) behaves non-deterministically. While running in debug mode, err comes back nil even though the destination server failed (4xx), but executing the same line a second time returns the correct error.

    opened by Kaustavd 0
  • circuit may open even though ErrorPercentThreshold is bigger than 100

    Bug description

    In theory, when ErrorPercentThreshold is bigger than 100, the circuit should always stay closed. But there are some exceptions: if you run the code below, you will find that the circuit can still open.

    package main
    
    import (
    	"fmt"
    	"github.com/afex/hystrix-go/hystrix"
    	"log"
    	"math/rand"
    	"sync"
    	"time"
    )
    
    func main() {
    	f := time.Millisecond * 200
    	cun := 1000
    	num := cun * 3
    	hystrix.SetLogger(log.Default())
    	hystrix.ConfigureCommand("my_command", hystrix.CommandConfig{
    		Timeout:               int(f.Milliseconds()),
    		MaxConcurrentRequests: cun,
    		RequestVolumeThreshold: 30,
    		SleepWindow: 30,
    		ErrorPercentThreshold: 120,
    	})
    
    	var lock sync.Mutex
    	var wg sync.WaitGroup
    	rand.Seed(int64(time.Now().Nanosecond()))
    	mm := make(map[string]int)
    	mark := 0
    	now := time.Now()
    	for i := 0; i < num; i++ {
    		wg.Add(1)
    		go func() {
    			defer wg.Done()
    			err := hystrix.Do("my_command", func() error {
    				a := rand.Intn(5)
    				if a < 1 {
    					time.Sleep(time.Millisecond * 10)
    					return fmt.Errorf("internal error")
    				} else if a >= 4 {
    					time.Sleep(time.Millisecond * 300)
    					return nil
    				} else {
    					time.Sleep(time.Millisecond * 9)
    					lock.Lock()
    					if _, found := mm["success"]; found {
    						mm["success"]++
    					} else {
    						mm["success"] = 1
    					}
    					lock.Unlock()
    					return nil
    				}
    			}, func(err error) error {
    				if err != nil {
    					lock.Lock()
    					if err.Error() == "hystrix: circuit open" && mark == 0 {
    						log.Println("circuit on at: ", time.Now().Sub(now).Milliseconds())
    						mark = 1
    					}
    					if _, found := mm[err.Error()]; found {
    						mm[err.Error()]++
    					} else {
    						mm[err.Error()] = 1
    					}
    					lock.Unlock()
    				} else {
    					panic("err is nil")
    				}
    
    				return nil
    			})
    
    			if err != nil {
    				fmt.Println("err: ", err)
    			}
    		}()
    	}
    	wg.Wait()
    	log.Println("end at: ", time.Now().Sub(now).Milliseconds())
    
    	count := 0
    	for key, value := range mm {
    		count += value
    		fmt.Printf("%s: %f\n", key, float32(value) / float32(num))
    	}
    	if count != num {
    		panic("don't match")
    	}
    }
    

    In https://github.com/afex/hystrix-go/blob/fa1af6a1f4f56e0e50d427fe901cd604d8c6fb8a/hystrix/metrics.go#L138, the value of errs may be larger than that of reqs because the code calculates errs later than reqs.
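
To make the arithmetic concrete: if errs is sampled after reqs and has grown in between, errs/reqs exceeds 1 and the percentage exceeds 100. The sketch below shows a clamped calculation as one possible guard. errorPercent is a hypothetical standalone function written for illustration, not the library's method.

```go
package main

import "fmt"

// errorPercent is a hypothetical helper. It returns the rounded error
// percentage and clamps it to 100, so a stale request count cannot push the
// result past a threshold that is meant to be unreachable.
func errorPercent(reqs, errs float64) int {
	if reqs <= 0 {
		return 0
	}
	pct := errs / reqs * 100
	if pct > 100 {
		pct = 100
	}
	return int(pct + 0.5)
}

func main() {
	fmt.Println(errorPercent(40, 10)) // 25
	fmt.Println(errorPercent(10, 12)) // 100: errs counted after reqs, clamped
	fmt.Println(errorPercent(0, 5))   // 0: no requests recorded yet
}
```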

    Resolution

    One simple way to fix the problem is to always treat the circuit as healthy when ErrorPercentThreshold is bigger than 100, like:

    func (m *metricExchange) IsHealthy(now time.Time) bool {
    	errRate := getSettings(m.Name).ErrorPercentThreshold
    	if errRate > 100 {
    		return true
    	}
    	return m.ErrorPercent(now) < errRate
    }
    
    opened by Heisenberg-Y 0
  • Get the original error?

    output:

    fallback failed with '{"id":"user.srv","code":500,"detail":"error","status":"Internal Server Error"}'. run error was 'hystrix: timeout'
    

    How can I get the original error?

    opened by liuaiyuan 1
Owner
keith