Ngram index for golang

Overview

go-ngram Build Status

N-gram index for Go.

Key features

  • Unicode support.
  • Append only. Data can't be deleted from index.
  • GC friendly (all strings are pooled and compressed)
  • Application agnostic (there is no notion of document or something that user needs to implement)

Usage

index, err := ngram.NewNGramIndex(ngram.SetN(3))
tokenId, err := index.Add("hello") 
str, err := index.GetString(tokenId)  // str == "hello"
resultsList, err := index.Search("world")

TODO:

  • Smoothing functions (Laplace etc)

GoDoc

docs examples

library users

You might also like...
omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.
omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.

omniparser Omniparser is a native Golang ETL parser that ingests input data of various formats (CSV, txt, fixed length/width, XML, EDI/X12/EDIFACT, JS

Pagser is a simple, extensible, configurable parse and deserialize html page to struct based on goquery and struct tags for golang crawler
Pagser is a simple, extensible, configurable parse and deserialize html page to struct based on goquery and struct tags for golang crawler

Pagser Pagser inspired by page parser。 Pagser is a simple, extensible, configurable parse and deserialize html page to struct based on goquery and str

iTunes and RSS 2.0 Podcast Generator in Golang

podcast Package podcast generates a fully compliant iTunes and RSS 2.0 podcast feed for GoLang using a simple API. Full documentation with detailed ex

TOML parser for Golang with reflection.

THIS PROJECT IS UNMAINTAINED The last commit to this repo before writing this message occurred over two years ago. While it was never my intention to

Your CSV pocket-knife (golang)

csvutil - Your CSV pocket-knife (golang) #WARNING I would advise against using this package. It was a language learning exercise from a time before "e

Golang (Go) bindings for GNU's gettext (http://www.gnu.org/software/gettext/)

gosexy/gettext Go bindings for GNU gettext, an internationalization and localization library for writing multilingual systems. Requeriments GNU gettex

agrep-like fuzzy matching, but made faster using Golang and precomputation.

goagrep There are situations where you want to take the user's input and match a primary key in a database. But, immediately a problem is introduced:

String-matching in Golang using the Knuth–Morris–Pratt algorithm (KMP)

gokmp String-matching in Golang using the Knuth–Morris–Pratt algorithm (KMP). Disclaimer This library was written as part of my Master's Thesis and sh

Golang HTML to plaintext conversion library

html2text Converts HTML into text of the markdown-flavored variety Introduction Ensure your emails are readable by all! Turns HTML into raw text, usef

Comments
  • Add locks for concurrent access

    Add locks for concurrent access

    When trying to search for a match in different goroutines, we get a panic because the underlying storage uses a map (which is not concurrent safe).

    This implements a simple lock mechanism to prevent panics.

    opened by phacops 2
Owner
Eugene Lazin
Eugene Lazin
Generate a global index for multiple markdown files recursively

markdown index Markdown-index is a library to help you generate a global index for multiple markdown files recursively in a directory, containing a su

Mateus Miranda 18 Sep 25, 2022
[Crawler/Scraper for Golang]🕷A lightweight distributed friendly Golang crawler framework.一个轻量的分布式友好的 Golang 爬虫框架。

Goribot 一个分布式友好的轻量的 Golang 爬虫框架。 完整文档 | Document !! Warning !! Goribot 已经被迁移到 Gospider|github.com/zhshch2002/gospider。修复了一些调度问题并分离了网络请求部分到另一个仓库。此仓库会继续

null 210 Oct 29, 2022
golang 在线预览word,excel,pdf,MarkDown(Online Preview Word,Excel,PPT,PDF,Image by Golang)

Go View File 在线体验地址 http://39.97.98.75:8082/view/upload (不会经常更新,保留最基本的预览功能。服务器配置较低,如果出现链接超时请等待几秒刷新重试,或者换Chrome) 目前已经完成 docker部署 (不用为运行环境烦恼) Wor

CZC 76 Nov 23, 2022
bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS

bluemonday bluemonday is a HTML sanitizer implemented in Go. It is fast and highly configurable. bluemonday takes untrusted user generated content as

Microcosm 2.5k Nov 21, 2022
Elegant Scraper and Crawler Framework for Golang

Colly Lightning Fast and Elegant Scraping Framework for Gophers Colly provides a clean interface to write any kind of crawler/scraper/spider. With Col

Colly 18.3k Nov 24, 2022
A golang package to work with Decentralized Identifiers (DIDs)

did did is a Go package that provides tools to work with Decentralized Identifiers (DIDs). Install go get github.com/ockam-network/did Example packag

Ockam 69 Nov 25, 2022
wcwidth for golang

go-runewidth Provides functions to get fixed width of the character or string. Usage runewidth.StringWidth("つのだ☆HIRO") == 12 Author Yasuhiro Matsumoto

mattn 497 Nov 27, 2022
Parses the Graphviz DOT language in golang

Parses the Graphviz DOT language and creates an interface, in golang, with which to easily create new and manipulate existing graphs which can be writ

Walter Schulze 504 Nov 23, 2022
Go (Golang) GNU gettext utilities package

Gotext GNU gettext utilities for Go. Features Implements GNU gettext support in native Go. Complete support for PO files including: Support for multil

Leonel Quinteros 360 Oct 31, 2022
htmlquery is golang XPath package for HTML query.

htmlquery Overview htmlquery is an XPath query package for HTML, lets you extract data or evaluate from HTML documents by an XPath expression. htmlque

null 537 Nov 27, 2022