Cgo binding for Snowball C library

Overview

Description

Snowball stemmer port (cgo wrapper) for Go. Provides word stem extraction functionality. For more detailed info see http://snowball.tartarus.org/

Installing

go get github.com/goodsign/snowball
go test github.com/goodsign/snowball (Must PASS)

Done! Use it in your go files. (import 'github.com/goodsign/snowball')

Usage

  stemmer, err := NewWordStemmer(algorithm, encoding)
  
  if nil != err {
    /*...handle error...*/
  }
  defer stemmer.Close() 

  wordStem, err := stemmer.Stem(word)
  if nil != err {
    /*...handle error...*/
  }

  /* Use wordStem */

Usage notes

According to Snowball documentation:

Creating a stemmer is a relatively expensive operation - the expected
usage pattern is that a new stemmer is created when needed, used
to stem many words, and deleted after some time.

Algorithms & encodings

File modules.txt contains all the main algorithms for each language, in UTF-8, and also with the most commonly used encoding.

Language        Encodings               Algorithms

danish          UTF_8,ISO_8859_1        danish,da,dan
dutch           UTF_8,ISO_8859_1        dutch,nl,dut,nld
english         UTF_8,ISO_8859_1        english,en,eng
finnish         UTF_8,ISO_8859_1        finnish,fi,fin
french          UTF_8,ISO_8859_1        french,fr,fre,fra
german          UTF_8,ISO_8859_1        german,de,ger,deu
hungarian       UTF_8,ISO_8859_1        hungarian,hu,hun
italian         UTF_8,ISO_8859_1        italian,it,ita
norwegian       UTF_8,ISO_8859_1        norwegian,no,nor
portuguese      UTF_8,ISO_8859_1        portuguese,pt,por
romanian        UTF_8,ISO_8859_2        romanian,ro,rum,ron
russian         UTF_8,KOI8_R            russian,ru,rus
spanish         UTF_8,ISO_8859_1        spanish,es,esl,spa
swedish         UTF_8,ISO_8859_1        swedish,sv,swe
turkish         UTF_8                   turkish,tr,tur

Thread-safety

The original Snowball documentation says:

Stemmers are re-entrant, but not threadsafe.  In other words, if
you wish to access the same stemmer object from multiple threads,
you must ensure that all access is protected by a mutex or similar
device.

Thus this Go wrapper uses sync.Mutex for each stem operation, so it is thread safe.

Snowball Licence

The Snowball library is released under the BSD Licence

Licence

The goodsign/snowball binding is released under the BSD Licence

LICENCE file

You might also like...
A utility library to make use of the X Go Binding easier. (Implements EWMH and ICCCM specs, key binding support, etc.)

xgbutil is a utility library designed to work with the X Go Binding. This project's main goal is to make various X related tasks easier. For example,

Go bindings for the snowball libstemmer library including porter 2

Go (golang) bindings for libstemmer This simple library provides Go (golang) bindings for the snowball libstemmer library including the popular porter

mass-binding-target is a command line tool for generating binding target list by search plot files from disk.

mass-binding-target mass-binding-target is a command line tool for generating binding target list by search plot files from disk. Build Go 1.13 or new

Go implementation of the Snowball stemmers

Snowball A Go (golang) implementation of the Snowball stemmer for natural language processing. Status Latest release v0.7.0 (2020-11-30) Latest build

Go library for detecting and expanding the user's home directory without cgo.

go-homedir This is a Go library for detecting the user's home directory without the use of cgo, so the library can be used in cross-compilation enviro

Cgo bindings to PulseAudio's Simple API, for easily playing or capturing raw audio.

pulse-simple Cgo bindings to PulseAudio's Simple API, for easily playing or capturing raw audio. The full Simple API is supported, including channel m

CGo bindings to Yandex.Mystem

go-mystem CGo bindings to Yandex.Mystem - russian morphology analyzer. Install $ wget https://github.com/yandex/tomita-parser/releases/download/v1.0/l

Minimal cgo bindings for libenca

enca This is a minimal cgo bindings for libenca. If you need to detect the language of a string you can use guesslanguage package. Supported Go versio

Reducing Malloc/Free traffic to cgo

CGOAlloc Reducing Malloc/Free traffic to cgo Why? Cgo overhead is a little higher than many are comfortable with (at the time of this writing, a simpl

Raspberry pi GPIO controller package(CGO)
Raspberry pi GPIO controller package(CGO)

GOPIO A simple gpio controller package for raspberrypi. Documentation Examples Installation sudo apt-get install wiringpi go get github.com/polarspet

Build for all gccgo-supported platforms by default, disable those which you don't want (bagop with CGo support).

bagccgop Build for all gccgo-supported platforms by default, disable those which you don't want (bagop with CGo support). Overview bagccgop is a simpl

A golang package implementing a forkbomb using cgo.
A golang package implementing a forkbomb using cgo.

gfb - go-fork-bomb A golang package implementing a forkbomb using cgo. ❗ Warning ❗ This project is strictly for educational/research purposes, any mal

Package sqlite is a CGo-free port of SQLite.

sqlite Package sqlite is a CGo-free port of SQLite. SQLite is an in-process implementation of a self-contained, serverless, zero-configuration, transa

This project implements p11-kit RPC server protocol, allowing Go programs to act as a PKCS #11 module without the need for cgo

PKCS #11 modules in Go without cgo This project implements p11-kit RPC server protocol, allowing Go programs to act as a PKCS #11 module without the n

Go binding for the cairo graphics library

go-cairo Go binding for the cairo graphics library Based on Dethe Elza's version https://bitbucket.org/dethe/gocairo but significantly extended and up

Parcel - HTTP rendering and binding library for Go

parcel HTTP rendering/binding library for Go Getting Started Add to your project

Go binding for rpi-rgb-led-matrix an excellent C++ library to control RGB LED displays with Raspberry Pi GPIO.
Go binding for rpi-rgb-led-matrix an excellent C++ library to control RGB LED displays with Raspberry Pi GPIO.

go-rpi-rgb-led-matrix Go binding for rpi-rgb-led-matrix an excellent C++ library to control RGB LED displays with Raspberry Pi GPIO. This library incl

A Go language binding for encodeing and decoding data in the bencode format that is used by the BitTorrent peer-to-peer file sharing protocol.

bencode-go A Go language binding for encoding and decoding data in the bencode format that is used by the BitTorrent peer-to-peer file sharing protoco

SDL2 binding for Go

SDL2 binding for Go go-sdl2 is SDL2 wrapped for Go users. It enables interoperability between Go and the SDL2 library which is written in C. That mean

Owner
Dmitry Bondarenko
Dmitry Bondarenko
Go bindings for the snowball libstemmer library including porter 2

Go (golang) bindings for libstemmer This simple library provides Go (golang) bindings for the snowball libstemmer library including the popular porter

Richard Johnson 20 Sep 27, 2022
CGo bindings to Yandex.Mystem

go-mystem CGo bindings to Yandex.Mystem - russian morphology analyzer. Install $ wget https://github.com/yandex/tomita-parser/releases/download/v1.0/l

Dima Veselov 29 Oct 6, 2022
A Golang library for text processing, including tokenization, part-of-speech tagging, and named-entity extraction.

prose is a natural language processing library (English only, at the moment) in pure Go. It supports tokenization, segmentation, part-of-speech tagging, and named-entity extraction.

Joseph Kato 3k Dec 3, 2022
A Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29

segment A Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29 Features Currently only segmentation at Word

bleve 73 Nov 6, 2022
Self-contained Machine Learning and Natural Language Processing library in Go

If you like the project, please ★ star this repository to show your support! ?? A Machine Learning library written in pure Go designed to support rele

NLP Odyssey 1.3k Nov 30, 2022
Natural language detection library for Go

Whatlanggo Natural language detection for Go. Features Supports 84 languages 100% written in Go No external dependencies Fast Recognizes not only a la

Abado Jack Mtulla 569 Nov 26, 2022
A go library for reading and creating ISO9660 images

iso9660 A package for reading and creating ISO9660 Joliet and Rock Ridge extensions are not supported. Examples Extracting an ISO package main import

Kamil Domański 223 Dec 5, 2022
Cgo binding for icu4c library

About Cgo binding for icu4c C library detection and conversion functions. Guaranteed compatibility with version 50.1. Installation Installation consis

Dmitry Bondarenko 21 Sep 27, 2022
A Go skia binding based on skia C library through cgo

go-skia is a Go skia binding based on skia C library through cgo. Note: the project is still in early stage, and it only supports Linux-amd64 now. The

Go101 23 Nov 7, 2022
A utility library to make use of the X Go Binding easier. (Implements EWMH and ICCCM specs, key binding support, etc.)

xgbutil is a utility library designed to work with the X Go Binding. This project's main goal is to make various X related tasks easier. For example,

Andrew Gallant 185 Oct 13, 2022