On-line Machine Learning in Go (and so much more)

Overview

goml

Golang Machine Learning, On The Wire

GoDoc wercker status

goml (pronounced like the data format 'toml') is a machine learning library written entirely in Golang which lets the average developer add machine learning to their applications.

While the models include traditional batch learning interfaces, goml also includes many models that let you learn in an online, reactive manner by passing data to streams held on channels.
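To make the channel-based idea concrete, here is a minimal, self-contained sketch of online learning over a Go channel. It is not goml's API: the `Datapoint` and `OnlinePerceptron` types, the `Learn` method, and the toy dataset are all hypothetical names chosen for illustration. The model updates its weights one example at a time as data arrives on the stream.

```go
package main

import "fmt"

// Datapoint is a hypothetical labeled example: features X and a label Y in {-1, +1}.
type Datapoint struct {
	X []float64
	Y float64
}

// OnlinePerceptron is an illustrative model, not goml's implementation.
type OnlinePerceptron struct {
	Weights []float64
	Bias    float64
	Rate    float64
}

// Predict returns +1 or -1 depending on which side of the hyperplane x falls.
func (p *OnlinePerceptron) Predict(x []float64) float64 {
	sum := p.Bias
	for i, w := range p.Weights {
		sum += w * x[i]
	}
	if sum > 0 {
		return 1
	}
	return -1
}

// Learn consumes datapoints until the stream is closed and
// updates the weights only when a prediction is wrong.
func (p *OnlinePerceptron) Learn(stream <-chan Datapoint) {
	for d := range stream {
		if p.Predict(d.X) != d.Y {
			for i := range p.Weights {
				p.Weights[i] += p.Rate * d.Y * d.X[i]
			}
			p.Bias += p.Rate * d.Y
		}
	}
}

// TrainOnStream pushes a small linearly separable dataset through a channel
// for several passes and returns the trained model.
func TrainOnStream() *OnlinePerceptron {
	model := &OnlinePerceptron{Weights: make([]float64, 2), Rate: 0.1}
	stream := make(chan Datapoint)
	done := make(chan struct{})

	go func() {
		model.Learn(stream)
		close(done)
	}()

	data := []Datapoint{
		{X: []float64{2, 1}, Y: 1},
		{X: []float64{3, 2}, Y: 1},
		{X: []float64{-2, -1}, Y: -1},
		{X: []float64{-3, -2}, Y: -1},
	}
	for pass := 0; pass < 10; pass++ {
		for _, d := range data {
			stream <- d
		}
	}
	close(stream) // signals the learner that no more data is coming
	<-done        // wait for the learner to finish before reading the model
	return model
}

func main() {
	model := TrainOnStream()
	fmt.Println(model.Predict([]float64{4, 3}), model.Predict([]float64{-4, -3}))
}
```

The producer and the learner run in separate goroutines, so data can arrive at its own pace; closing the channel is what ends training in this sketch.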

The library includes comprehensive tests, extensive documentation, and clean, expressive, modular source code. Community contribution is heavily encouraged.

Each package (mentioned below) includes its own README that explains the function and purpose of its models. Above all, if you want to learn about a model, read the GoDoc reference for its package. All models are, as mentioned above, heavily documented.

Installation

go get github.com/cdipaolo/goml/base

# This could be any other model package if you want
#
# Also, the base package is imported already
# by many of the packages so you might not even
# need to `go get` the package explicitly
go get github.com/cdipaolo/goml/perceptron

Documentation

All the code is well documented, and the source should be readable if you'd like to make sense of it all! Look at each package on GitHub and you will see a link to its GoDoc reference as well as an explanation of the package and an example of its usage. You can click on the main bullets below to go to those packages, or use the GoDoc link at the top of this README and navigate to the package you'd like to learn more about.

Sub-bullets below will take you directly to the source code of the model.

Currently Implemented Models

Contributing!

See CONTRIBUTING.

I'd love help with any of this. If you would like to implement a model that isn't here, improve a currently implemented model, or help with documentation, please do. Documentation help would be greatly appreciated: believe me, writing great documentation takes time! 👍

LICENSE - MIT

see LICENSE

Issues
  • Bayes.go tokenizer breaks the sentiment restoration

    #12 breaks restorations that don't call NewNaiveBayes()

    Sentiment is broken because it doesn't set the tokenizer. Should the tokenizer somehow be set when it is un-marshaled?

    opened by Corrob 9
  • Remove `fmt.Printf`s?

    Hello!

    Great library. I noticed during tests that the code decides to just fmt.Printf. I don't want the ML lib in my app to be outputting to the console without me knowing. Can we disable that? Or provide a way to provide an alternate io.Writer?

    Thanks!

    opened by mitchellh 5
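One way the request above could be met is to route messages through a configurable `io.Writer`. The sketch below is only an illustration under assumptions: the `Model` type, its `Output` field, and `NewModel` are hypothetical and are not goml's actual API.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// Model is a hypothetical stand-in for any model that reports progress.
type Model struct {
	// Output receives progress messages. Set it to io.Discard to silence them.
	Output io.Writer
}

// NewModel defaults to os.Stdout so the existing console behavior is preserved.
func NewModel() *Model {
	return &Model{Output: os.Stdout}
}

// Learn pretends to train and reports through the configured writer
// instead of calling fmt.Printf directly.
func (m *Model) Learn() {
	fmt.Fprintf(m.Output, "training complete after %d iterations\n", 3)
}

// CaptureLearnOutput runs Learn with an in-memory writer and returns what was written.
func CaptureLearnOutput() string {
	var buf bytes.Buffer
	m := NewModel()
	m.Output = &buf
	m.Learn()
	return buf.String()
}

func main() {
	quiet := NewModel()
	quiet.Output = io.Discard // Go 1.16+; older versions can use ioutil.Discard
	quiet.Learn()             // writes nothing to the console

	fmt.Print(CaptureLearnOutput())
}
```

Defaulting to `os.Stdout` keeps current behavior for existing callers, while applications that embed the library can redirect or discard the output.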
  • fmt.Errorf format %v reads arg #2, but call has 1 arg

    This line in kmeans triggers the error `fmt.Errorf format %v reads arg #2, but call has 1 arg` while running tests.

    A simple fix would be to replace the line in question

    errors <- fmt.Errorf("ERROR: point.X must have the same dimensions as clusters (len %v). Point: %v", point)

    with this

    errors <- fmt.Errorf("ERROR: point.X must have the same dimensions as clusters (len %v). Point: %v", centroids, point)

    Follow-up question: is this project in active development?

    opened by ashnair1 4
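The mismatch described above comes from a format string with two `%v` verbs but only one argument; at run time `fmt` fills the missing slot with `%!v(MISSING)`. The sketch below shows a variant of the fix that passes one argument per verb. It assumes the `(len %v)` placeholder is meant to hold the expected dimension as a number, which the issue does not confirm; `dimensionError` and its parameters are hypothetical names.

```go
package main

import "fmt"

// dimensionError builds the error with exactly one argument per %v verb.
// Assumption: "(len %v)" should show the expected dimension as an integer.
func dimensionError(expectedDim int, point []float64) error {
	return fmt.Errorf("ERROR: point.X must have the same dimensions as clusters (len %v). Point: %v", expectedDim, point)
}

func main() {
	fmt.Println(dimensionError(2, []float64{1, 2, 3}))
	// ERROR: point.X must have the same dimensions as clusters (len 2). Point: [1 2 3]
}
```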
  • add concurrency-friendly map access to fix #8

    Created a new type histogram that couples a sync.RWMutex and the existing map. The type itself isn't exported (no one should be instantiating these, right?), but its Get and Set methods are. This is a significant change for consumers that create their own NaiveBayes struct without calling NewNaiveBayes.

    opened by piazzamp 4
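The pattern described in this pull request, a map guarded by a `sync.RWMutex` behind `Get` and `Set` methods, can be sketched as follows. This is an illustration rather than the actual goml code: the field names, the `string` key type, and the extra `Add` method are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// histogram couples a map with a read/write mutex (illustrative layout).
type histogram struct {
	mu     sync.RWMutex
	counts map[string]uint64
}

func newHistogram() *histogram {
	return &histogram{counts: make(map[string]uint64)}
}

// Get takes a read lock, so many readers can proceed at once.
func (h *histogram) Get(word string) (uint64, bool) {
	h.mu.RLock()
	defer h.mu.RUnlock()
	n, ok := h.counts[word]
	return n, ok
}

// Set takes the write lock, which excludes readers and other writers.
func (h *histogram) Set(word string, n uint64) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.counts[word] = n
}

// Add is a hypothetical helper: it increments under a single write lock,
// because a separate Get followed by Set could lose concurrent updates.
func (h *histogram) Add(word string, delta uint64) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.counts[word] += delta
}

// CountConcurrently increments one key from many goroutines and returns the total.
func CountConcurrently(workers int) uint64 {
	h := newHistogram()
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			h.Add("go", 1)
		}()
	}
	wg.Wait()
	n, _ := h.Get("go")
	return n
}

func main() {
	fmt.Println(CountConcurrently(50)) // 50
}
```

Because the map field is unexported, a zero-value struct has a nil map, which is why code that builds the struct by hand instead of calling a constructor can break.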
  • Examples

    I'd like to learn more about machine learning and this library looks like a good place to start building something with. Are there any examples you could post to demonstrate some simple use cases?

    opened by jpillora 3
  • Add go.mod

    Thought I'd add this to make the project compatible with go modules. This fixes an issue with build where go get fails. Now the build should (in theory) only fail if tests don't pass.

    opened by ashnair1 2
  • Why do Predict and Probability functions use different operators?

    I understand why the Naive Bayes "Predict" function uses a math.Log() to avoid an underflow. I don't understand why on lines 288 and 293 the operator is += instead of *=... Could you provide an explanation? Maybe an update to the docs?

    opened by wagslane 2
  • handle unicode in sanitization functions

    I renamed the old functions so that everything now 'defaults' to the unicode-friendly versions. I also changed the range of digits accepted by OnlyAsciiWordsAndNumbers and refactored the tests to make what each one was testing more obvious.

    opened by piazzamp 2
  • Fix persistence & restoration of Naive Bayes models

    It seems that #9 has caused projects using github.com/cdipaolo/sentiment to break. I've found out the reason is that the new concurrentMap is no longer a simple map that Go knows how to un/marshal out of the box.

    This pull request implements the necessary functions for concurrentMap so that the Restore() function works again.

    opened by arianht 2
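The general technique behind a fix like this is to give the mutex-guarded type its own `MarshalJSON` and `UnmarshalJSON` methods so that `encoding/json` can serialize the inner map. The sketch below shows the idea under assumptions: the field layout of `concurrentMap`, the `model` wrapper, and the JSON key are hypothetical, not copied from the pull request.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// concurrentMap guards a plain map with a mutex (illustrative layout).
type concurrentMap struct {
	mu    sync.RWMutex
	words map[string]uint64
}

// MarshalJSON serializes only the inner map, under a read lock.
func (m *concurrentMap) MarshalJSON() ([]byte, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return json.Marshal(m.words)
}

// UnmarshalJSON rebuilds the inner map, under the write lock.
func (m *concurrentMap) UnmarshalJSON(data []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.words = make(map[string]uint64)
	return json.Unmarshal(data, &m.words)
}

// model is a hypothetical container standing in for a persisted classifier.
type model struct {
	Words *concurrentMap `json:"words"`
}

// RoundTrip persists a model to JSON and restores it, returning the JSON
// text and one restored count.
func RoundTrip() (string, uint64, error) {
	original := model{Words: &concurrentMap{words: map[string]uint64{"good": 2, "bad": 1}}}
	data, err := json.Marshal(original)
	if err != nil {
		return "", 0, err
	}
	var restored model
	if err := json.Unmarshal(data, &restored); err != nil {
		return "", 0, err
	}
	restored.Words.mu.RLock()
	defer restored.Words.mu.RUnlock()
	return string(data), restored.Words.words["good"], nil
}

func main() {
	text, good, err := RoundTrip()
	fmt.Println(text, good, err)
}
```

Without these methods, `encoding/json` sees a struct with only unexported fields and writes an empty object, so the word counts would be lost on restore.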
  • Text models, uint8 for number of classes?

    I don't know that much at the moment about ML so pardon me if this is ignorant. Is there a reason that the number of classes for text classification is limited to 255 via uint8? Would it be possible to increase this?

    opened by mitchellh 2
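For context on the limit being asked about: a `uint8` holds values from 0 to 255, and incrementing past 255 silently wraps to 0. The short sketch below only illustrates that Go behavior; `NextClass` is a hypothetical helper and says nothing about how goml assigns class indices.

```go
package main

import (
	"fmt"
	"math"
)

// NextClass returns the next class index and reports whether
// the increment wrapped around the uint8 range.
func NextClass(current uint8) (uint8, bool) {
	next := current + 1 // wraps to 0 after 255
	return next, next < current
}

func main() {
	fmt.Println(math.MaxUint8)  // 255
	fmt.Println(NextClass(254)) // 255 false
	fmt.Println(NextClass(255)) // 0 true
}
```

A wider type such as `uint16` or `int` would raise the ceiling at the cost of a little more memory per stored index.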
  • Any interest in XGBoost?

    Hello, I have an experimental high performance XGBoost (tree_method=exact only) implementation here:

    • https://github.com/Statfactory/cortado (python + llvm)
    • https://github.com/Statfactory/cortado-fs (F#)
    • https://github.com/Statfactory/JuML.jl (Julia)

    I could port it to golang with some help if there is interest:)

    Adam

    opened by amlocek 1
  • TFIDF doesn't work

    TFIDF doesn't work unless we actually save the DocsSeen value in the Bayes model.

    Currently the struct for Word doesn't do this.

    type Word struct {
        Count    []uint64
        Seen     uint64
        DocsSeen uint64 `json:"-"`
    }

    Should be:

    type Word struct {
        Count    []uint64
        Seen     uint64
        DocsSeen uint64
    }

    opened by mmorells 1
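The effect of the `json:"-"` tag mentioned above can be checked with a small round trip: `encoding/json` skips a field tagged this way when marshaling, so the value comes back as zero after a restore. The sketch below uses two hypothetical copies of the struct, `wordSkipped` and `wordKept`, to compare the two definitions; it does not use goml itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// wordSkipped mirrors the struct with the tag that excludes DocsSeen from JSON.
type wordSkipped struct {
	Count    []uint64
	Seen     uint64
	DocsSeen uint64 `json:"-"`
}

// wordKept mirrors the proposed struct, where DocsSeen is serialized.
type wordKept struct {
	Count    []uint64
	Seen     uint64
	DocsSeen uint64
}

// RestoreSkipped marshals and unmarshals a wordSkipped value.
func RestoreSkipped(w wordSkipped) (wordSkipped, error) {
	var out wordSkipped
	data, err := json.Marshal(w)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(data, &out)
	return out, err
}

// RestoreKept marshals and unmarshals a wordKept value.
func RestoreKept(w wordKept) (wordKept, error) {
	var out wordKept
	data, err := json.Marshal(w)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(data, &out)
	return out, err
}

func main() {
	skipped, _ := RestoreSkipped(wordSkipped{Count: []uint64{1, 2}, Seen: 3, DocsSeen: 5})
	kept, _ := RestoreKept(wordKept{Count: []uint64{1, 2}, Seen: 3, DocsSeen: 5})
	fmt.Println(skipped.DocsSeen, kept.DocsSeen) // 0 5
}
```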
  • Roadmap / Comparison to other Go ML libraries

    How does goml compare to some of the other Go libraries in terms of product vision / roadmap?

    • https://github.com/sjwhitworth/golearn
    • https://github.com/alonsovidales/go_ml

    There's a decent amount of overlap in terms of the implemented algorithms / models. Is your goal to eventually include all of the other types (neural networks, collaborative filtering, etc)? It seems like the stated goal of being more stream oriented than batch oriented differentiates this library too.

    At the end of the day, this seems like the most active repo with an exciting direction. I'm very curious to know where you plan on taking things.

    opened by derekperkins 1
  • Comparison with Weka, others?

    It would be very useful to compare performance (run time, memory used) with other commonly used machine learning libraries and frameworks, such as Weka and Apache Mahout.

    opened by gnewton 1
Owner
Conner DiPaolo
Business Manager at Citadel GQS
Self-contained Machine Learning and Natural Language Processing library in Go

NLP Odyssey 1.2k Aug 10, 2022
Machine Learning for Go

GoLearn GoLearn is a 'batteries included' machine learning library for Go. Simplicity, paired with customisability, is the goal. We are in active deve

Stephen Whitworth 8.5k Aug 18, 2022
Gorgonia is a library that helps facilitate machine learning in Go.

Gorgonia is a library that helps facilitate machine learning in Go. Write and evaluate mathematical equations involving multidimensional arrays easily

Gorgonia 4.6k Aug 16, 2022
Machine Learning libraries for Go Lang - Linear regression, Logistic regression, etc.

package ml - Machine Learning Libraries ###import "github.com/alonsovidales/go_ml" Package ml provides some implementations of usefull machine learnin

Alonso Vidales 192 Jul 27, 2022
Prophecis is a one-stop machine learning platform developed by WeBank

Prophecis is a one-stop machine learning platform developed by WeBank. It integrates multiple open-source machine learning frameworks, has the multi tenant management capability of machine learning compute cluster, and provides full stack container deployment and management services for production environment.

WeBankFinTech 356 Aug 3, 2022
Go Machine Learning Benchmarks

Benchmarks of machine learning inference for Go

Nikolay Dubina 23 May 27, 2022
A High-level Machine Learning Library for Go

Overview Goro is a high-level machine learning library for Go built on Gorgonia. It aims to have the same feel as Keras. Usage import ( . "github.

AUNUM 347 Aug 11, 2022
Standard machine learning models

Cog: Standard machine learning models Define your models in a standard format, store them in a central place, run them anywhere. Standard interface fo

Replicate 2.7k Aug 11, 2022
Katib is a Kubernetes-native project for automated machine learning (AutoML).

Katib is a Kubernetes-native project for automated machine learning (AutoML). Katib supports Hyperparameter Tuning, Early Stopping and Neural Architec

Kubeflow 1.2k Aug 12, 2022
PaddleDTX is a solution that focused on distributed machine learning technology based on decentralized storage.

中文 | English PaddleDTX PaddleDTX is a solution that focused on distributed machine learning technology based on decentralized storage. It solves the d

null 68 Aug 11, 2022
Spice.ai is an open source, portable runtime for training and using deep learning on time series data.

Spice.ai Spice.ai is an open source, portable runtime for training and using deep learning on time series data. ⚠️ DEVELOPER PREVIEW ONLY Spice.ai is

Spice.ai 742 Aug 14, 2022
Reinforcement Learning in Go

Overview Gold is a reinforcement learning library for Go. It provides a set of agents that can be used to solve challenges in various environments. Th

AUNUM 294 Aug 8, 2022
FlyML perfomant real time mashine learning libraryes in Go

FlyML perfomant real time mashine learning libraryes in Go simple & perfomant logistic regression (~100 LoC) Status: WIP! Validated on mushrooms datas

Vadim Kulibaba 1 May 30, 2022
Go (Golang) encrypted deep learning library; Fully homomorphic encryption over neural network graphs

DC DarkLantern A lantern is a portable case that protects light, A dark lantern is one who's light can be hidden at will. DC DarkLantern is a golang i

Raven 1 Dec 2, 2021
Go types, funcs, and utilities for working with cards, decks, and evaluating poker hands (Holdem, Omaha, Stud, more)

cardrank.io/cardrank Package cardrank.io/cardrank provides a library of types, funcs, and utilities for working with playing cards, decks, and evaluat

null 57 Aug 10, 2022
A tool for building identical machine images for multiple platforms from a single source configuration

Packer Packer is a tool for building identical machine images for multiple platforms from a single source configuration. Packer is lightweight, runs o

null 2 Oct 3, 2021
Command line tool for improving typing skills (programmers friendly)

Command line tool for improving typing speed and accuracy. The main goal is to help programmers practise programming languages. Demo Installation Pyth

Jan 355 Aug 9, 2022
The open source, end-to-end computer vision platform. Label, build, train, tune, deploy and automate in a unified platform that runs on any cloud and on-premises.

End-to-end computer vision platform Label, build, train, tune, deploy and automate in a unified platform that runs on any cloud and on-premises. onepa

Onepanel, Inc. 611 Aug 9, 2022