A tool and library for using structural regular expressions.

Overview

Structural Regular Expressions

sregx is a package and tool for using structural regular expressions as described by Rob Pike. It provides a small Go package for building structural regular expression commands, plus a library for parsing and compiling sregx commands from the text format used in Pike's description. A CLI tool in ./cmd/sregx lets you perform advanced text manipulation from the command line.

In a structural regular expression, regular expressions are composed using commands to perform tasks like advanced search and replace. A command has an input string and produces an output string. The following commands are supported:

  • p: prints the input string, and then returns the input string.
  • d: returns the empty string.
  • c/<text>/: returns the string <text>.
  • s/<regex>/<text>/: returns a string where substrings matching the regular expression <regex> have been replaced with <text>.
  • g/<regex>/<cmd>: if <regex> matches the input, returns the result of <cmd> evaluated on the input. Otherwise returns the input with no modification.
  • v/<regex>/<cmd>: if <regex> does not match the input, returns the result of <cmd> evaluated on the input. Otherwise returns the input with no modification.
  • x/<regex>/<cmd>: returns a string where all substrings matching the regular expression <regex> have been replaced with the return value of <cmd> applied to the particular substring.
  • y/<regex>/<cmd>: returns a string where each part of the string that is not matched by <regex> is replaced by applying <cmd> to the particular unmatched string.
  • n[N:M]<cmd>: returns the application of <cmd> to the input sliced from [N:M). Accepts negative numbers to refer to offsets from the end of the input. Offsets are zero-indexed.
  • l[N:M]<cmd>: returns the application of <cmd> to the input sliced from line N to line M (exclusive). Assumes newlines are represented with the \n character. Accepts negative numbers to refer to offsets from the last line of the input. Lines are zero-indexed.
  • u/<sh>/: executes the shell command <sh> with the input as stdin and returns the resulting stdout of the command. Shell commands use a simple syntax where single or double quotes can be used to group arguments, and environment variables are accessible with $. This command is only directly available as part of the sregx CLI tool.

The commands n[...], l[...], and u are additions to the original description of structural regular expressions.

The sregx tool also provides another augmentation to the original description from Pike: command pipelines. A command may be given as <cmd> | <cmd> | ... where the input of each command is the output of the previous one.

Examples

Most of these examples are from Pike's description, so you can look there for more detailed explanation. Since p is the only command that prints, technically you must append | p to commands that search and replace, because otherwise nothing will be printed. However, since you will probably forget to do this, the sregx tool will print the result of the final command before terminating if there were no uses of p anywhere within the command. Thus when using the CLI tool you can omit the | p in the following commands and still see the result.

Print all lines that contain "string":

x/.*\n/ g/string/p

Delete all occurrences of "string" and print the result:

x/string/d | p

Replace all occurrences of "foo" with "bar" in lines 5 through 9 (zero-indexed; the upper bound in l[5:10] is exclusive):

l[5:10]s/foo/bar/ | p

Print all lines containing "rob" but not "robot":

x/.*\n/ g/rob/ v/robot/p

Capitalize all occurrences of the word "i":

x/[A-Za-z]+/ g/i/ v/../ c/I/ | p

or (more simply)

x/[A-Za-z]+/ g/^i$/ c/I/ | p

Print the last line of every paragraph that begins with "foo", where a paragraph is defined as text with no empty lines:

x/(.+\n)+/ g/^foo/ l[-2:-1]p

Change all occurrences of the complete word "foo" to "bar" except those occurring in double or single quoted strings:

y/".*"/ y/'.*'/ x/[a-zA-Z]+/ g/^foo$/ c/bar/ | p

Replace the complete word "TODAY" with the current date:

x/[A-Z]+/ g/^TODAY$/ u/date/ | p

Capitalize all words:

x/[a-zA-Z]+/ x/^./ u/tr a-z A-Z/ | p

Note: it is highly recommended when using the CLI tool that you enclose expressions in single or double quotes to prevent your shell from interpreting special characters.

Installation

There are three ways to install sregx.

  1. Download the prebuilt binary from the releases page (comes with man file).

  2. Install from source:

git clone https://github.com/zyedidia/sregx
cd sregx
make build # or make install to install to $GOBIN

  3. Install with go get (version info will be missing):

go get github.com/zyedidia/sregx/cmd/sregx

Usage

To use the CLI tool, first pass the expression and then the input file. If no file is given, stdin will be used. Here is an example to capitalize all occurrences of the word 'i' in file.txt:

sregx 'x/[A-Za-z]+/ g/^i$/ c/I/' file.txt

The tool tries to provide high quality error messages when you make a mistake in the expression syntax.

Base library

The base library is very simple and small (roughly 100 lines of code). In fact, it is surprisingly simple and elegant for something that can provide such powerful text manipulation, and I recommend reading the code if you are interested. Each type of command may be manually created directly in tree form. See the Go documentation for details.

Syntax library

The syntax library supports parsing and compiling a string into a structural regular expression command. The syntax uses "/" as the delimiter between a command's arguments. The backslash (\) may be used to escape / or \, or to create special characters such as \n, \r, or \t. The syntax also supports specifying arbitrary bytes using octal, for example \14. Regular expressions use Go's regular expression syntax, as documented for the standard regexp/syntax package.
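
The escaping rules above can be sketched as a small decoding function. This is an illustration only: unescape is a hypothetical helper, not a function from the syntax library. Two choices are assumptions of the sketch: an octal escape consumes at most three digits, and unknown escapes such as \d keep their backslash so a regular expression engine can still interpret them.

```go
package main

import (
	"fmt"
	"strings"
)

// unescape decodes \/, \\, \n, \r, \t, and octal escapes such as \14.
// Hypothetical helper for illustration; not the sregx implementation.
func unescape(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		if s[i] != '\\' || i+1 == len(s) {
			b.WriteByte(s[i]) // ordinary byte, or a lone trailing backslash
			continue
		}
		i++ // move to the byte after the backslash
		switch c := s[i]; {
		case c == 'n':
			b.WriteByte('\n')
		case c == 'r':
			b.WriteByte('\r')
		case c == 't':
			b.WriteByte('\t')
		case c == '/' || c == '\\':
			b.WriteByte(c)
		case c >= '0' && c <= '7':
			// Octal escape: read up to three octal digits (assumption).
			v, j := 0, i
			for ; j < len(s) && j < i+3 && s[j] >= '0' && s[j] <= '7'; j++ {
				v = v*8 + int(s[j]-'0')
			}
			b.WriteByte(byte(v))
			i = j - 1
		default:
			// Unknown escape: keep the backslash (assumption).
			b.WriteByte('\\')
			b.WriteByte(c)
		}
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", unescape(`a\/b\tc\14`)) // "a/b\tc\f"
}
```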

Future Work

Here are some ideas for some features that could be implemented in the future.

  • Internal manipulation language. Currently the u command runs shell commands. This is very flexible but can be costly because a new process is run to perform each transformation. For better performance we could provide a small language that has some string manipulation functions like toupper. A good candidate for this language would be Lua. This would also improve Windows support since most Windows environments lack utilities like tr.
  • Different regex engine. The Go regex engine is pretty good, but isn't especially performant. We could switch to Oniguruma (see the oniguruma branch), although this would mean using cgo.
  • Structural PEGs. Use PEGs instead of regular expressions.