Implementation of Boyer-Moore fast string search algorithm in Go

Overview

boyermoore

Implementation of Boyer-Moore fast string search algorithm in Go, according to my understanding of https://www.cs.utexas.edu/users/moore/best-ideas/string-searching/.

If you need to search big chunks of string (or []byte) a lot, you can benefit from this package or Boyer-Moore algorithm.

In my test cases long search terms can be 10 times faster than strings package. Looks like strings package is currently using Rabin Karp algorithm for longer sub string lookups.

For more details you can check source code it is around 60 lines.

Usage/Examples

package main

import(
    "fmt"
    "github.com/sarpdag/boyermoore"
) 

func main() {
    src := `Some long string which you need to check. Bla bla ... `
    subStr := `you need to check`
    if pos := boyermoore.Index(src, subStr); pos > -1 {
        fmt.Println("Found in position: ", pos)
    }
}
-1 { fmt.Println("Found in position: ", pos) } } " aria-label="Copy" class="ClipboardButton btn js-clipboard-copy m-2 p-0 tooltipped-no-delay" data-copy-feedback="Copied!" data-tooltip-direction="w">

If you need to check the same substring against multiple sources better to pre calculate the sliding table.

package main

import(
    "fmt"
    "github.com/sarpdag/boyermoore"
) 

func main() {
    src := []string{
		`Some long string which you need to check. Bla bla ... `,
		`Another source string ,,faksl fdklssdlkfsjaklsfjklsjfalks`,
	}
	subStr := `you need to check`
	table := boyermoore.CalculateSlideTable(substr)
	if pos := boyermoore.IndexWithTable(&table, src, subStr); pos > -1 {
		fmt.Println("Found in position: ", pos)
	}
}
-1 { fmt.Println("Found in position: ", pos) } } " aria-label="Copy" class="ClipboardButton btn js-clipboard-copy m-2 p-0 tooltipped-no-delay" data-copy-feedback="Copied!" data-tooltip-direction="w">

Benchmarks

$ go test -bench=. -count=5 > bench.txt
$ benchstat bench.txt

name                             time/op
Bruteforce/shortEarlySub-12       138ns ± 2%
Bruteforce/shortLateSub-12       6.77µs ± 2%
Bruteforce/longSub-12            7.11µs ± 9%
Bruteforce/longNotFound-12       8.89µs ± 8%
Bruteforce/endOfString-12        9.04µs ± 3%
Bruteforce/begOfString-12        4.77ns ± 3%
Bruteforce/shortMid-12           4.41µs ± 2%
StringsIndex/shortEarlySub-12    63.4ns ± 3%
StringsIndex/shortLateSub-12      214ns ± 2%
StringsIndex/longSub-12          3.79µs ± 4%
StringsIndex/longNotFound-12     2.27µs ± 1%
StringsIndex/endOfString-12      4.05µs ± 2%
StringsIndex/begOfString-12      7.13ns ± 3%
StringsIndex/shortMid-12         2.02µs ± 3%
BM/shortEarlySub-12               374ns ± 7%
BM/shortLateSub-12               9.51µs ± 1%
BM/longSub-12                    1.00µs ± 9%
BM/longNotFound-12                723ns ± 7%
BM/endOfString-12                3.07µs ± 8%
BM/begOfString-12                 199ns ± 6%
BM/shortMid-12                   3.29µs ± 4%
BMPregenerated/shortEarlySub-12   186ns ± 2%
BMPregenerated/shortLateSub-12   9.54µs ± 5%
BMPregenerated/longSub-12         744ns ± 2%
BMPregenerated/longNotFound-12    444ns ± 2%
BMPregenerated/endOfString-12    2.53µs ± 2%
BMPregenerated/begOfString-12    6.28ns ± 3%
BMPregenerated/shortMid-12       2.90µs ± 3%

License

MIT

You might also like...
An in-memory string-interface{} map with various expiration options for golang

TTLCache - an in-memory cache with expiration TTLCache is a simple key/value cache in golang with the following functions: Expiration of items based o

Convert json string to Golang struct

json-to-go-cli Convert json string to Golang struct How to install git clone https://github.com/tiancheng92/json-to-go-cli.git cd json-to-go-cli go bu

A memory-efficient trie for testing the existence/prefixes of string only(for now).

Succinct Trie A memory-efficient trie for testing the existence/prefixes of string only(for now). Install go get -u github.com/nobekanai/sutrie Docume

Data structure and algorithm library for go, designed to provide functions similar to C++ STL

GoSTL English | 简体中文 Introduction GoSTL is a data structure and algorithm library for go, designed to provide functions similar to C++ STL, but more p

golang sorting algorithm and data construction.

data-structures-questions 算法和程序结构是我们学习编程的基础,但是很多的时候,我们很多都是只是在应用,而没有深入的去研究这些,所以自己也在不断的思考和探索,然后分析,学习,总结自己学习的过程,希望可以和大家一起学习和交流下算法! 目录 网络协议 数据结构 算法 数据库 Go

记录算法学习和LeetCode、LintCode、codewars的学习路程。A record of algorithm learning.

Problem List Leetcode、LintCode、Codewars Algorithm problem solution written by golang. LeetCode id Name(Github) Name(Gitee) 00001 TwoSum TwoSum 00003 L

The simplest sorting algorithm that sorts in quadratic time
The simplest sorting algorithm that sorts in quadratic time

Simplest sort Showcases the simplest sorting algorithm that works in quadratic time. From here. The pseudocode for this algo can be seen below (sorts

Smartsort - A smart sorting algorithm for Go to sort filename containing digits that is not zero padded

smartsort A smart sorting algorithm for Go to sort filename containing digits th

Comments
  • Test count

    Test count

    Could you update benchmark results in file README.md

    go test -bench=. -count=5 > bench.txt
    benchstat bench.txt 
    

    For example see https://github.com/Konstantin8105/pow

    opened by Konstantin8105 1
  • Error in example code

    Error in example code

    It seems the example code for searching a string in an Slice is wrong. IndexWithTable only takes a string as an argument but in the example code a string Slice is passed. Maybe you forgot to iterate over the Slice while writing the example.

    opened by gobsej 0
Owner
sarp dağ demirel
Working on code search. https://ifloop.io/
sarp dağ demirel
Multi-String Pattern Matching Algorithm Using TrieHashNode

Multi-String Pattern Matching algorithm. This implementation is inspired from Aho-Corasick algorithm Getting Started modelA = mspm.NewModel("mspm_mode

Sujit Shakya 20 Dec 9, 2022
Data structure and relevant algorithms for extremely fast prefix/fuzzy string searching.

Trie Data structure and relevant algorithms for extremely fast prefix/fuzzy string searching. Usage Create a Trie with: t := trie.New() Add Keys with:

Derek Parker 623 Dec 27, 2022
A fast (5x) string keyed read-only map for Go - particularly good for keys using a small set of nearby runes.

faststringmap faststringmap is a fast read-only string keyed map for Go (golang). For our use case it is approximately 5 times faster than using Go's

The Sensible Code Company 30 Jan 8, 2023
A Go implementation of the 64-bit xxHash algorithm (XXH64)

xxhash xxhash is a Go implementation of the 64-bit xxHash algorithm, XXH64. This is a high-quality hashing algorithm that is much faster than anything

Caleb Spare 1.3k Dec 22, 2022
A Go implementation of the core algorithm in paper

Boolean Expression Indexer Go library A Go implementation of the core algorithm in paper <Indexing Boolean Expression>, which already supports the fol

wangyi 54 Dec 26, 2022
A Left-Leaning Red-Black (LLRB) implementation of balanced binary search trees for Google Go

GoLLRB GoLLRB is a Left-Leaning Red-Black (LLRB) implementation of 2-3 balanced binary search trees in Go Language. Overview As of this writing and to

Petar Maymounkov 747 Dec 23, 2022
An app with Trie tree and Breve search Implementation CLI and HTTP both 🥳

Introduction LifeLongLearner project consists of two different parts. My English Vocabulary My Technical Book Notes All of them provided by me within

A.Samet İleri 15 Jul 1, 2022
Golang string comparison and edit distance algorithms library, featuring : Levenshtein, LCS, Hamming, Damerau levenshtein (OSA and Adjacent transpositions algorithms), Jaro-Winkler, Cosine, etc...

Go-edlib : Edit distance and string comparison library Golang string comparison and edit distance algorithms library featuring : Levenshtein, LCS, Ham

Hugo Bollon 373 Dec 20, 2022
An in-memory string-interface{} map with various expiration options for golang

TTLCache - an in-memory cache with expiration TTLCache is a simple key/value cache in golang with the following functions: Expiration of items based o

Rene Kroon 569 Jan 8, 2023
Decode / encode XML to/from map[string]interface{} (or JSON); extract values with dot-notation paths and wildcards. Replaces x2j and j2x packages.

mxj - to/from maps, XML and JSON Decode/encode XML to/from map[string]interface{} (or JSON) values, and extract/modify values from maps by key or key-

Charles Banning 536 Dec 22, 2022