A Go port of the Rapid Automatic Keyword Extraction algorithm (RAKE)

Overview

A Go implementation of the Rapid Automatic Keyword Extraction (RAKE) algorithm as described in: Rose, S., Engel, D., Cramer, N., & Cowley, W. (2010). Automatic Keyword Extraction from Individual Documents. In M. W. Berry & J. Kogan (Eds.), Text Mining: Theory and Applications: John Wiley & Sons.

Original Python implementation available at: https://github.com/aneesha/RAKE

The source code is released under the MIT License.

Example Usage

package main

import (
	"fmt"

	rake "github.com/afjoseph/RAKE.Go"
)

func main() {
	text := `The growing doubt of human autonomy and reason has created a state of moral confusion where man is left without the guidance of either revelation or reason. The result is the acceptance of a relativistic position which proposes that value judgements and ethical norms are exclusively matters of arbitrary preference and that no objectively valid statement can be made in this realm... But since man cannot live without values and norms, this relativism makes him an easy prey for irrational value systems.`

	candidates := rake.RunRake(text)

	for _, candidate := range candidates {
		fmt.Printf("%s --> %f\n", candidate.Key, candidate.Value)
	}

	fmt.Printf("\nsize: %d\n", len(candidates))
}

Output:

objectively valid statement --> 9.000000
exclusively matters --> 4.000000
arbitrary preference --> 4.000000
easy prey --> 4.000000
relativistic position --> 4.000000
human autonomy --> 4.000000
relativism makes --> 4.000000
growing doubt --> 4.000000
moral confusion --> 4.000000
ethical norms --> 3.500000
norms --> 1.500000
made --> 1.000000
guidance --> 1.000000
man --> 1.000000
result --> 1.000000
systems --> 1.000000
values --> 1.000000
realm --> 1.000000
live --> 1.000000
judgements --> 1.000000
reason --> 1.000000
left --> 1.000000
proposes --> 1.000000
irrational --> 1.000000
created --> 1.000000
acceptance --> 1.000000
revelation --> 1.000000
state --> 1.000000

size: 28

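The scores in the example output are consistent with the word metric from the RAKE paper, deg(w)/freq(w), summed over the words of each candidate phrase. The following is a minimal sketch of that scoring step only, not this package's actual code: the function name `scorePhrases` is hypothetical, and the candidate phrases are supplied by hand rather than extracted from text.

```go
package main

import (
	"fmt"
	"strings"
)

// scorePhrases is a hypothetical helper, not part of the RAKE.Go API.
// It scores each phrase as the sum of deg(w)/freq(w) over its words.
func scorePhrases(phrases []string) map[string]float64 {
	freq := map[string]int{}   // number of occurrences of each word
	degree := map[string]int{} // co-occurrence count, including the word itself
	for _, p := range phrases {
		words := strings.Fields(p)
		for _, w := range words {
			freq[w]++
			degree[w] += len(words)
		}
	}

	scores := make(map[string]float64, len(phrases))
	for _, p := range phrases {
		total := 0.0
		for _, w := range strings.Fields(p) {
			total += float64(degree[w]) / float64(freq[w])
		}
		scores[p] = total
	}
	return scores
}

func main() {
	// Hand-picked candidates taken from the example output.
	phrases := []string{"objectively valid statement", "ethical norms", "norms", "easy prey"}
	scores := scorePhrases(phrases)
	for _, p := range phrases {
		fmt.Printf("%s --> %f\n", p, scores[p])
	}
}
```

With these four candidates, "norms" occurs twice with a total degree of 3, so it scores 1.5 on its own and lifts "ethical norms" to 3.5, while every word of the three-word phrase has degree 3 and frequency 1, giving 9.
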
Issues
  • Performance degradation caused by commit b66ca2f2b6bf9f4d84f82946e670799b8d2b2e46

    This commit fixes #1 but adds a fairly heavy loop. The result should be cached or stored as a []string, but it isn't: https://github.com/Obaied/RAKE.Go/blob/b66ca2f2b6bf9f4d84f82946e670799b8d2b2e46/stopwords.go#L584

    opened by kirillDanshin 3
  • Updated rake.go.

    Fixes: https://github.com/afjoseph/RAKE.Go/issues/7

    After this fix, RAKE removes all consecutive stopwords instead of only the first one.

    opened by namanbansal013 1
  • Hardcoded stop-list path

    Hi.

    You sent us a pull request here: https://github.com/avelino/awesome-go/pull/1229.

    I've started my review, but I see that the hardcoded stopPath should cause a failure if the file is not found, and that file should not exist on Linux/Windows.

    opened by kirillDanshin 1
  • README.md Example: Stopwords are not Ignored

    I believe your commit 6bf7f9f5e21bfa2097164aa0958b4d6dacfe2570, 'Load stop-words as a string slice instead of splitting a large string', broke stopword removal. If I roll back to the version prior to that commit, everything works fine (git checkout 7df06d19b2795d3b3101a8da3b79efad4c2ce7be).

    Running the README.md example (with the import fixed), stopwords are not removed.

    package main
    
    import (
    	"fmt"
    
    	rake "github.com/afjoseph/RAKE.Go"
    )
    
    func main() {
    	text := `The growing doubt of human autonomy and reason has created a state of moral confusion where man is left without the guidance of either revelation or reason. The result is the acceptance of a relativistic position which proposes that value judgements and ethical norms are exclusively matters of arbitrary preference and that no objectively valid statement can be made in this realm... But since man cannot live without values and norms, this relativism makes him an easy prey for irrational value systems.`
    
    	candidates := rake.RunRake(text)
    
    	for _, candidate := range candidates {
    		fmt.Printf("%s --> %f\n", candidate.Key, candidate.Value)
    	}
    
    	fmt.Printf("\nsize: %d\n", len(candidates))
    }
    
    a relativistic position --> 9.000000
    objectively valid statement --> 9.000000
    an easy prey --> 9.000000
    value judgements --> 4.000000
    human autonomy --> 4.000000
    the acceptance --> 4.000000
    growing doubt --> 4.000000
    relativism makes --> 4.000000
    the guidance --> 4.000000
    either revelation --> 4.000000
    be made --> 4.000000
    arbitrary preference --> 4.000000
    exclusively matters --> 4.000000
    this realm --> 4.000000
    moral confusion --> 4.000000
    since man --> 3.500000
    ethical norms --> 3.500000
    norms --> 1.500000
    man --> 1.500000
    proposes --> 1.000000
    irrational --> 1.000000
    left --> 1.000000
    created --> 1.000000
    reason --> 1.000000
    state --> 1.000000
    values --> 1.000000
    result --> 1.000000
    systems --> 1.000000
    live --> 1.000000
    that --> 1.000000
    
    size: 30
    

    I am running:

    go version go1.13.1 darwin/amd64
    
    opened by garystafford 0
  • Updated import statement

    Changed the import to "github.com/afjoseph/RAKE.go" from "github.com/afjoseph/goRAKE", which was broken. Fixes https://github.com/afjoseph/RAKE.Go/issues/6

    opened by abdulsmapara 0
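
Several of the issues above concern stopword handling: building the lookup once instead of on every call, and dropping runs of consecutive stopwords rather than only the first. The sketch below illustrates both ideas under stated assumptions. The names `isStopWord` and `candidatePhrases` and the seven-word stop list are hypothetical and are not taken from this repository, and punctuation handling is left out.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// stopWords is a tiny illustrative list; a real stop list is far longer.
var stopWords = []string{"a", "an", "for", "him", "of", "the", "this"}

var (
	stopSetOnce sync.Once
	stopSet     map[string]struct{}
)

// isStopWord builds the lookup set on first use and reuses it afterwards,
// so repeated calls do not rescan or re-split the stop list.
func isStopWord(w string) bool {
	stopSetOnce.Do(func() {
		stopSet = make(map[string]struct{}, len(stopWords))
		for _, s := range stopWords {
			stopSet[s] = struct{}{}
		}
	})
	_, ok := stopSet[strings.ToLower(w)]
	return ok
}

// candidatePhrases splits whitespace-separated words into phrases at
// stopwords. A run of consecutive stopwords produces no empty phrase
// and never leaks a stopword into the result.
func candidatePhrases(text string) []string {
	var phrases []string
	var current []string
	flush := func() {
		if len(current) > 0 {
			phrases = append(phrases, strings.Join(current, " "))
			current = nil
		}
	}
	for _, w := range strings.Fields(text) {
		if isStopWord(w) {
			flush()
			continue
		}
		current = append(current, strings.ToLower(w))
	}
	flush()
	return phrases
}

func main() {
	fmt.Printf("%q\n", candidatePhrases("The acceptance of a relativistic position"))
}
```

Here "of a" is a run of two stopwords, and the sketch yields the two candidates "acceptance" and "relativistic position", which matches the phrases in the README output rather than the "the acceptance" and "a relativistic position" seen in the regression report.
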
Owner
Abdullah Joseph
Mobile Security Team Lead @ Adjust https://www.linkedin.com/in/afjoseph/