Super Fast Regex in Go

Overview

Rubex : Super Fast Regexp for Go

by Zhigang Chen

ONLY USE go1 BRANCH

A simple regular expression library that supports Ruby's regexp syntax. It implements all the public functions of Go's Regexp package except LiteralPrefix. In the benchmark tests from Go's Regexp package, the library is 40% to 10X faster than Regexp on all but one test. Unlike Go's Regexp, this library supports named capture groups and also allows "\1" and "\k" in replacement strings.

The library calls the Oniguruma regex library (5.9.2, the latest release as of now) for regex pattern searching. All replacement code is done in Go. This library can be easily adapted to support the regex syntax used by other programming languages or tools, like Java, Perl, grep, and emacs.
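For contrast with the Ruby-style "\1" references mentioned above, here is a minimal sketch of how the standard library's regexp package spells numbered group references in a replacement string ("$1", "$2"). It uses only the standard library, not Rubex, and the function name swapStdlib is made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// swapStdlib rewrites "first last" as "last, first" using the standard
// library, where numbered groups are referenced as $1 and $2.
// (Hypothetical helper; this does not call Rubex.)
func swapStdlib(s string) string {
	re := regexp.MustCompile(`(\w+) (\w+)`)
	return re.ReplaceAllString(s, "$2, $1")
}

func main() {
	fmt.Println(swapStdlib("Zhigang Chen")) // prints "Chen, Zhigang"
}
```

With Rubex, the description above suggests the replacement string could use "\1"-style references instead; that form is not exercised here.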

Installation

First, ensure you have Oniguruma installed. On OS X with brew, it's as simple as

brew install oniguruma

On Ubuntu...

sudo apt-get install libonig2

Now that we've got Oniguruma installed, we can install Rubex!

go install github.com/moovweb/rubex

Example Usage

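import "github.com/moovweb/rubex"

// Compile returns an error for an invalid pattern; MustCompile would panic instead.
rxp, err := rubex.Compile("[a-z]*")
if err != nil {
    // whoops
}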
result := rxp.FindString("a me my")
if result != "" {
    // FOUND A STRING!! YAY! Must be "a" in this instance
} else {
    // no good
}
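Since Rubex is described as mirroring the Regexp API, the same compile-then-search flow can be sketched with the standard library for comparison. This is a stand-in that runs without Oniguruma; the helper name firstMatch is hypothetical, and swapping in Rubex is assumed to need only a different import and package qualifier.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch compiles pattern and returns the leftmost match in s.
// It reports an error instead of panicking when the pattern is invalid.
// (Hypothetical helper built on the standard library, not on Rubex.)
func firstMatch(pattern, s string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return "", err
	}
	return re.FindString(s), nil
}

func main() {
	m, err := firstMatch("[a-z]*", "a me my")
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Printf("%q\n", m) // prints "a"
}
```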
Comments
  • Size of int on 64 bit systems is 64 bits in Go 1.1.

    This commit changes the use of int -> int32 where needed to make all the unit tests pass with:

    $ go version
    go version devel +7cd19e1a734a Wed Mar 27 21:51:07 2013 +0100 darwin/amd64

    See http://tip.golang.org/doc/go1.1#int

    opened by quarnster 8
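The int-width point in the comment above can be illustrated with a small sketch. It is not the code from that commit; toInt32 is a hypothetical helper showing one way to narrow a Go int to int32 explicitly rather than relying on int being 32 bits wide.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// toInt32 narrows a Go int to int32, returning an error instead of
// silently truncating when the value is out of range.
// (Hypothetical helper, not taken from the Rubex source.)
func toInt32(n int) (int32, error) {
	if n < math.MinInt32 || n > math.MaxInt32 {
		return 0, fmt.Errorf("value %d does not fit in int32", n)
	}
	return int32(n), nil
}

func main() {
	// strconv.IntSize is 32 or 64 depending on the platform.
	fmt.Println("int is", strconv.IntSize, "bits on this platform")

	v, err := toInt32(42)
	fmt.Println(v, err)
}
```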
  • Ubuntu requires libonig-dev package

    Heyo.

    I'm running Ubuntu 12.04 (64 bit) and needed the libonig-dev package to install rubex. I'm not sure whether other versions or other distros behave differently, but I had to run:

    sudo apt-get install libonig2 libonig-dev

    So for at least a few cases the README probably needs to be updated.

    Thanks for the great library.

    opened by nodanaonlyzuul 2
  • Errors with multiple goroutines

    var re = regexp.MustCompile(test)

    re has problems when used from multiple goroutines. Is the Oniguruma library thread-safe? Go's built-in regexp library is fine here: "A Regexp is safe for concurrent use by multiple goroutines."

    Test environment: go1.1

    opened by ghost 2
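Whether Rubex or Oniguruma actually needs this is not settled by the comment above. As a generic workaround for a matcher that is not safe for concurrent use, access can be serialized with a mutex. The sketch below uses the standard library's regexp as a stand-in so it runs without Oniguruma; lockedRegexp and newLockedRegexp are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// lockedRegexp serializes access to a compiled pattern with a mutex.
// The wrapped type here is the standard library's Regexp, used only as a
// stand-in; a Rubex value is assumed to be wrappable the same way.
type lockedRegexp struct {
	mu sync.Mutex
	re *regexp.Regexp
}

// newLockedRegexp compiles pattern and panics if it is invalid.
func newLockedRegexp(pattern string) *lockedRegexp {
	return &lockedRegexp{re: regexp.MustCompile(pattern)}
}

// FindString holds the lock for the duration of the search, so only one
// goroutine uses the underlying matcher at a time.
func (l *lockedRegexp) FindString(s string) string {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.re.FindString(s)
}

func main() {
	lr := newLockedRegexp("[a-z]+")

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = lr.FindString("a me my")
		}()
	}
	wg.Wait()

	fmt.Println(lr.FindString("a me my")) // prints "a"
}
```

The trade-off is that searches no longer run in parallel; compiling one pattern per goroutine is another option when throughput matters.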
  • How to compare regexes?

    On branch go1

    I'm trying to compare two regexes and found that the comparison is always false:

    rubex.MustCompile("[a-z]*") == rubex.MustCompile("[a-z]*") // will return false

    I checked the code and it looks like the difference is in the num_comb_exp_check and _ fields inside the regex field. Is there a workaround? It is a bit weird that two identical objects aren't equal.

    opened by cthulhu 0
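One possible workaround for the question above is to compare the source text of the two patterns rather than the compiled objects. The sketch below demonstrates the idea with the standard library's regexp, where MustCompile returns a pointer and == therefore compares addresses. That Rubex exposes an equivalent String method is an assumption based on its stated API parity, and comparePatterns is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// comparePatterns compiles both patterns and reports two things:
// whether the compiled values compare equal with ==, and whether their
// source text is identical. (Standard-library stand-in, not Rubex.)
func comparePatterns(p1, p2 string) (ptrEqual, srcEqual bool) {
	a := regexp.MustCompile(p1)
	b := regexp.MustCompile(p2)
	// a and b are distinct pointers, so == is false even for equal patterns.
	return a == b, a.String() == b.String()
}

func main() {
	ptrEqual, srcEqual := comparePatterns("[a-z]*", "[a-z]*")
	fmt.Println(ptrEqual) // false: two separately compiled values
	fmt.Println(srcEqual) // true: identical source text
}
```

Comparing source text treats patterns as equal only when they are spelled identically; two different spellings of the same language still compare as different.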
  • Go1 branch is out of date

    Hi, is there a way to update the go1 branch to work with Go 1.6?

    I'm trying to install it with Go 1.6 and getting the following error:

    ./chelper.c:161:43: warning: passing 'const OnigUChar *' (aka 'const unsigned char *') to parameter of type 'const char *' converts between pointers to integer types with different sign [-Wpointer-sign]
    /usr/include/secure/_string.h:119:34: note: expanded from macro 'strncpy'
    package rubex: unrecognized import path "rubex" (import path does not begin with hostname)

    I guess it has to do with the import "C" statements. I also found that master works with 1.6, so the go1 branch is probably a bit out of date?

    Thanks in advance

    opened by cthulhu 0
Owner
Moovweb