Ngram index for golang


go-ngram

N-gram index for Go.

Key features

  • Unicode support.
  • Append-only: data cannot be deleted from the index.
  • GC-friendly: all strings are pooled and compressed.
  • Application-agnostic: there is no notion of a document, or anything else the user needs to implement.
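To illustrate what Unicode support means for an n-gram index, here is a minimal sketch of splitting a string into overlapping n-grams by code point rather than by byte. The `ngrams` helper is hypothetical and written only for this illustration; it is not part of the go-ngram API, and the library's own tokenization (padding, normalization, and so on) may differ.

```go
package main

// ngrams splits s into overlapping n-grams of n Unicode code points each.
// It slices a []rune rather than the raw bytes, so multi-byte characters
// are never cut in half.
// Hypothetical helper for illustration; not part of the go-ngram API.
func ngrams(s string, n int) []string {
	runes := []rune(s)
	if n <= 0 || len(runes) < n {
		return nil // input is too short to contain a single n-gram
	}
	out := make([]string, 0, len(runes)-n+1)
	for i := 0; i+n <= len(runes); i++ {
		out = append(out, string(runes[i:i+n]))
	}
	return out
}
```

Called from a `main` function, `ngrams("héllo", 3)` returns the three trigrams `hél`, `éll`, and `llo`. Slicing the string by bytes instead would split the two-byte `é` and produce invalid UTF-8.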


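// Error handling is omitted for brevity; check err after each call.
index, err := ngram.NewNGramIndex(ngram.SetN(3)) // build an index over 3-grams
tokenId, err := index.Add("hello")               // store a string and get its token ID
str, err := index.GetString(tokenId)             // str == "hello"
resultsList, err := index.Search("world")        // look up indexed strings matching "world"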
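The calls above can be read against a toy model of an append-only n-gram index. The sketch below is an assumption-laden illustration, not the library's implementation: `toyIndex`, `newToyIndex`, `add`, and `search` are hypothetical names, strings are stored uncompressed, and ranking is a plain count of shared n-grams rather than whatever scoring go-ngram actually uses.

```go
package main

import "sort"

// toyIndex is a hypothetical, simplified stand-in for an n-gram index.
// It is append-only: strings can be added but never removed.
type toyIndex struct {
	n        int
	strings  []string         // token ID -> original string
	postings map[string][]int // n-gram -> IDs of strings containing it
}

func newToyIndex(n int) *toyIndex {
	return &toyIndex{n: n, postings: make(map[string][]int)}
}

// grams returns the distinct rune-based n-grams of s.
func (ix *toyIndex) grams(s string) []string {
	r := []rune(s)
	seen := make(map[string]bool)
	var out []string
	for i := 0; i+ix.n <= len(r); i++ {
		g := string(r[i : i+ix.n])
		if !seen[g] {
			seen[g] = true
			out = append(out, g)
		}
	}
	return out
}

// add stores s and returns its token ID.
func (ix *toyIndex) add(s string) int {
	id := len(ix.strings)
	ix.strings = append(ix.strings, s)
	for _, g := range ix.grams(s) {
		ix.postings[g] = append(ix.postings[g], id)
	}
	return id
}

// search returns the IDs of stored strings that share at least one n-gram
// with query, ordered by shared n-gram count (descending), then by ID.
func (ix *toyIndex) search(query string) []int {
	counts := make(map[int]int)
	for _, g := range ix.grams(query) {
		for _, id := range ix.postings[g] {
			counts[id]++
		}
	}
	ids := make([]int, 0, len(counts))
	for id := range counts {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool {
		if counts[ids[a]] != counts[ids[b]] {
			return counts[ids[a]] > counts[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}
```

In this model, after adding "hello", "help", and "world" to a trigram index, searching for "hello" returns the IDs of "hello" and then "help", because they share three and one trigrams with the query; "world" shares none and is not returned.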


  • Smoothing functions (Laplace, etc.)



  • Add locks for concurrent access

    Searching for a match from multiple goroutines causes a panic, because the underlying storage uses a map, which is not safe for concurrent use.

    This change adds a simple locking mechanism to prevent those panics.

    opened by phacops 2
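The locking approach described in this issue can be sketched as follows. This is a generic illustration of guarding a map with `sync.RWMutex`, not the actual patch: `safeIndex`, `add`, `lookup`, and `hammer` are hypothetical names, and the sketch assumes lookups do not modify shared state. If a search also writes to the map, it would need the exclusive lock instead of the read lock.

```go
package main

import "sync"

// safeIndex guards a map-backed store with a read-write mutex.
// Hypothetical type for illustration; not the code from the patch.
type safeIndex struct {
	mu    sync.RWMutex
	items map[string][]int // n-gram -> token IDs
}

func newSafeIndex() *safeIndex {
	return &safeIndex{items: make(map[string][]int)}
}

// add takes the exclusive lock because it mutates the map.
func (s *safeIndex) add(key string, id int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[key] = append(s.items[key], id)
}

// lookup takes the shared lock and returns a copy, so callers never
// hold a slice that a concurrent add could modify.
func (s *safeIndex) lookup(key string) []int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return append([]int(nil), s.items[key]...)
}

// hammer runs n writers and n readers concurrently against one index
// and returns how many IDs ended up stored under the shared key.
func hammer(n int) int {
	s := newSafeIndex()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(2)
		go func(id int) {
			defer wg.Done()
			s.add("hel", id)
		}(i)
		go func() {
			defer wg.Done()
			_ = s.lookup("hel")
		}()
	}
	wg.Wait()
	return len(s.lookup("hel"))
}
```

Without the mutex, the same mix of concurrent reads and writes on a plain map can abort the program with a concurrent map access error; running the sketch under `go run -race` is one way to confirm the guarded version is clean.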
Eugene Lazin
