A Go package for n-gram based text categorization, with support for utf-8 and raw text

Overview

A Go package for n-gram based text categorization, with support for utf-8 and raw text.
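
As an illustration of what n-gram based categorization involves, here is a minimal sketch of the common rank-order ("out-of-place") approach. It is not this package's API or implementation: the names profile, distance and classify, the n-gram length of 3, the profile size of 300 and the tiny training samples are all assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// profile returns the n-grams (1..maxN runes) found in text, ordered from
// most to least frequent and cut off at top entries. Words are padded with
// "_" so that word boundaries show up in the n-grams. Working on runes
// keeps multi-byte UTF-8 characters intact.
func profile(text string, maxN, top int) []string {
	counts := map[string]int{}
	for _, word := range strings.Fields(strings.ToLower(text)) {
		r := []rune("_" + word + "_")
		for n := 1; n <= maxN; n++ {
			for i := 0; i+n <= len(r); i++ {
				counts[string(r[i:i+n])]++
			}
		}
	}
	grams := make([]string, 0, len(counts))
	for g := range counts {
		grams = append(grams, g)
	}
	// Sort by descending count; break ties alphabetically for determinism.
	sort.Slice(grams, func(i, j int) bool {
		if counts[grams[i]] != counts[grams[j]] {
			return counts[grams[i]] > counts[grams[j]]
		}
		return grams[i] < grams[j]
	})
	if len(grams) > top {
		grams = grams[:top]
	}
	return grams
}

// distance sums, for every n-gram in doc, how far its rank is from its rank
// in cat. N-grams missing from cat get the maximum penalty len(cat).
func distance(doc, cat []string) int {
	rank := make(map[string]int, len(cat))
	for i, g := range cat {
		rank[g] = i
	}
	total := 0
	for i, g := range doc {
		j, ok := rank[g]
		switch {
		case !ok:
			total += len(cat)
		case i > j:
			total += i - j
		default:
			total += j - i
		}
	}
	return total
}

// classify returns the name of the category profile closest to text.
func classify(text string, cats map[string][]string) string {
	doc := profile(text, 3, 300)
	names := make([]string, 0, len(cats))
	for name := range cats {
		names = append(names, name)
	}
	sort.Strings(names) // fixed order, so ties resolve the same way each run
	best, bestDist := "", -1
	for _, name := range names {
		if d := distance(doc, cats[name]); bestDist < 0 || d < bestDist {
			best, bestDist = name, d
		}
	}
	return best
}

func main() {
	// Hypothetical, very small training samples; real profiles need far more text.
	cats := map[string][]string{
		"en": profile("the quick brown fox jumps over the lazy dog and the cat", 3, 300),
		"nl": profile("de snelle bruine vos springt over de luie hond en de kat", 3, 300),
	}
	fmt.Println(classify("the dog and the fox", cats))
}
```

With samples this small the result is only indicative; the quality of such a classifier depends mostly on how much training text goes into each category profile.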

To do:

  • write documentation
  • make it faster

Keywords: text categorization, language detector

Install

go get github.com/pebbe/textcat
go get github.com/pebbe/textcat/textcat
go get github.com/pebbe/textcat/textpat

Releases(v1.0.1)
Owner
Peter Kleiweg