A command-line tool and library for generating regular expressions from user-provided test cases

grex


Linux Download MacOS Download Windows Download

Table of Contents

  1. What does this tool do?
  2. Do I still need to learn to write regexes then?
  3. Current features
  4. How to install?
    4.1 The command-line tool
    4.2 The library
  5. How to use?
    5.1 The command-line tool
    5.2 The library
    5.3 Examples
  6. How to build?
  7. How does it work?
  8. Do you want to contribute?

1. What does this tool do?

grex is a library as well as a command-line utility that is meant to simplify the often complicated and tedious task of creating regular expressions. It does so by automatically generating a single regular expression from user-provided test cases. The resulting expression is guaranteed to match the test cases which it was generated from.

This project started as a Rust port of the JavaScript tool regexgen written by Devon Govett. Although many useful features could still be added to it, its development apparently ceased several years ago. The plan is now to add these new features to grex, as Rust really shines when it comes to command-line tools. grex offers all features that regexgen provides, and more.

The philosophy of this project is to generate the most specific regular expression possible by default which exactly matches the given input only and nothing else. With the use of command-line flags (in the CLI tool) or preprocessing methods (in the library), more generalized expressions can be created.

The produced expressions are Perl-compatible regular expressions which are also compatible with the regular expression parser in Rust's regex crate. Other regular expression parsers or respective libraries from other programming languages have not been tested so far, but they ought to be mostly compatible as well.

2. Do I still need to learn to write regexes then?

Definitely, yes! Using the standard settings, grex produces a regular expression that is guaranteed to match only the test cases given as input and nothing else. This has been verified by property tests. However, if the conversion to shorthand character classes such as \w is enabled, the resulting regex matches a much wider scope of test cases. Knowledge about the consequences of this conversion is essential for finding a correct regular expression for your business domain.

grex uses an algorithm that tries to find the shortest possible regex for the given test cases. Very often though, the resulting expression is still longer or more complex than it needs to be. In such cases, a more compact or elegant regex can be created only by hand. Also, every regular expression engine has different built-in optimizations. grex does not know anything about those and therefore cannot optimize its regexes for a specific engine.

So, please learn how to write regular expressions! Currently, the best use case for grex is to find an initial correct regex, which should then be inspected by hand to see whether further optimizations are possible.

3. Current Features

  • literals
  • character classes
  • detection of common prefixes and suffixes
  • detection of repeated substrings and conversion to {min,max} quantifier notation
  • alternation using | operator
  • optionality using ? quantifier
  • escaping of non-ascii characters, with optional conversion of astral code points to surrogate pairs
  • case-sensitive or case-insensitive matching
  • capturing or non-capturing groups
  • fully compliant with the newest Unicode Standard 13.0
  • fully compatible with regex crate 1.3.5+
  • correctly handles graphemes consisting of multiple Unicode symbols
  • reads input strings from the command-line or from a file
  • optional syntax highlighting for nicer output in supported terminals

4. How to install?

4.1 The command-line tool

You can download the self-contained executable for your platform above and put it in a place of your choice. Alternatively, pre-compiled 64-Bit binaries are available within the package managers Scoop (for Windows) and Homebrew (for macOS and Linux).

grex is also hosted on crates.io, the official Rust package registry. If you are a Rust developer and already have the Rust toolchain installed, you can install by compiling from source using cargo, the Rust package manager. So the summary of your installation options is:

( scoop | brew | cargo ) install grex

4.2 The library

In order to use grex as a library, simply add it as a dependency to your Cargo.toml file:

[dependencies]
grex = "1.1.0"

5. How to use?

Detailed explanations of the available settings are provided in the library section. All settings can be freely combined with each other.

5.1 The command-line tool

$ grex -h

grex 1.1.0
© 2019-2020 Peter M. Stahl <[email protected]>
Licensed under the Apache License, Version 2.0
Downloadable from https://crates.io/crates/grex
Source code at https://github.com/pemistahl/grex

grex generates regular expressions from user-provided test cases.

USAGE:
    grex [FLAGS] [OPTIONS] <INPUT>... --file <FILE>

FLAGS:
    -d, --digits             Converts any Unicode decimal digit to \d
    -D, --non-digits         Converts any character which is not a Unicode decimal digit to \D
    -s, --spaces             Converts any Unicode whitespace character to \s
    -S, --non-spaces         Converts any character which is not a Unicode whitespace character to \S
    -w, --words              Converts any Unicode word character to \w
    -W, --non-words          Converts any character which is not a Unicode word character to \W
    -r, --repetitions        Detects repeated non-overlapping substrings and
                             converts them to {min,max} quantifier notation
    -e, --escape             Replaces all non-ASCII characters with unicode escape sequences
        --with-surrogates    Converts astral code points to surrogate pairs if --escape is set
    -i, --ignore-case        Performs case-insensitive matching, letters match both upper and lower case
    -g, --capture-groups     Replaces non-capturing groups by capturing ones
    -c, --colorize           Provides syntax highlighting for the resulting regular expression
    -h, --help               Prints help information
    -v, --version            Prints version information

OPTIONS:
    -f, --file <FILE>                      Reads test cases on separate lines from a file
        --min-repetitions <QUANTITY>       Specifies the minimum quantity of substring repetitions
                                           to be converted if --repetitions is set [default: 1]
        --min-substring-length <LENGTH>    Specifies the minimum length a repeated substring must have
                                           in order to be converted if --repetitions is set [default: 1]

ARGS:
    <INPUT>...    One or more test cases separated by blank space 

5.2 The library

5.2.1 Default settings

Test cases are passed either from a collection via RegExpBuilder::from() or from a file via RegExpBuilder::from_file(). If read from a file, each test case must be on a separate line. Lines may end with either a newline \n or a carriage return followed by a line feed \r\n.

use grex::RegExpBuilder;

let regexp = RegExpBuilder::from(&["a", "aa", "aaa"]).build();
assert_eq!(regexp, "^a(?:aa?)?$");
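
As an illustration of the line-ending rule for files, here is a minimal sketch using only the Rust standard library. It is not grex's internal code, and the function name test_cases_from_content is hypothetical. It relies on str::lines, which accepts both \n and \r\n as line terminators.

```rust
/// Splits the content of a test-case file into one test case per line.
/// Hypothetical helper for illustration; not part of the grex API.
/// `str::lines` treats both "\n" and "\r\n" as line endings and does not
/// yield an extra empty item for a trailing newline.
fn test_cases_from_content(content: &str) -> Vec<&str> {
    content.lines().collect()
}
```

Note that this sketch keeps blank lines in the middle of the content as empty test cases; whether grex does the same is not stated here.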

5.2.2 Convert to character classes

use grex::{Feature, RegExpBuilder};

let regexp = RegExpBuilder::from(&["a", "aa", "123"])
    .with_conversion_of(&[Feature::Digit, Feature::Word])
    .build();
assert_eq!(regexp, "^(?:\\d\\d\\d|\\w(?:\\w)?)$");

5.2.3 Convert repeated substrings

use grex::{Feature, RegExpBuilder};

let regexp = RegExpBuilder::from(&["aa", "bcbc", "defdefdef"])
    .with_conversion_of(&[Feature::Repetition])
    .build();
assert_eq!(regexp, "^(?:a{2}|(?:bc){2}|(?:def){3})$");

By default, grex converts every substring that is at least one character long and that is repeated at least once directly after its first occurrence. You can customize these two parameters if you like.

In the following example, the test case aa is not converted to a{2} because the repeated substring a has a length of 1, but the minimum substring length has been set to 2.

use grex::{Feature, RegExpBuilder};

let regexp = RegExpBuilder::from(&["aa", "bcbc", "defdefdef"])
    .with_conversion_of(&[Feature::Repetition])
    .with_minimum_substring_length(2)
    .build();
assert_eq!(regexp, "^(?:aa|(?:bc){2}|(?:def){3})$");

With a minimum of 2 repetitions set in the next example, only the test case defdefdef is converted because its substring def is the only one that is repeated twice after its first occurrence.

use grex::{Feature, RegExpBuilder};

let regexp = RegExpBuilder::from(&["aa", "bcbc", "defdefdef"])
    .with_conversion_of(&[Feature::Repetition])
    .with_minimum_repetitions(2)
    .build();
assert_eq!(regexp, "^(?:bcbc|aa|(?:def){3})$");
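
The interplay of the two parameters can be sketched with the standard library alone. This is a simplified illustration, not grex's algorithm: it only handles the case where a whole test case is one repeated unit, it works on chars rather than graphemes, and it does not escape regex metacharacters. The names repeated_unit and convert are hypothetical.

```rust
/// Finds the shortest unit whose consecutive repetition yields `s`.
/// Returns the unit and its total number of occurrences.
fn repeated_unit(s: &str) -> Option<(&str, usize)> {
    let chars: Vec<(usize, char)> = s.char_indices().collect();
    let n = chars.len();
    for unit_len in 1..=n / 2 {
        if n % unit_len != 0 {
            continue;
        }
        // Byte offset at which the candidate unit ends.
        let byte_len = chars[unit_len].0;
        let unit = &s[..byte_len];
        if s == unit.repeat(n / unit_len) {
            return Some((unit, n / unit_len));
        }
    }
    None
}

/// Converts `s` to quantifier notation if it satisfies both thresholds.
/// A unit occurring `count` times is "repeated" `count - 1` times.
fn convert(s: &str, min_repetitions: usize, min_substring_length: usize) -> String {
    match repeated_unit(s) {
        Some((unit, count))
            if count - 1 >= min_repetitions
                && unit.chars().count() >= min_substring_length =>
        {
            if unit.chars().count() == 1 {
                format!("{}{{{}}}", unit, count)
            } else {
                format!("(?:{}){{{}}}", unit, count)
            }
        }
        _ => s.to_string(),
    }
}
```

Under these assumptions, convert("aa", 1, 2) leaves aa untouched, and convert("bcbc", 2, 1) leaves bcbc untouched, which mirrors the two examples above.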

5.2.4 Escape non-ascii characters

use grex::RegExpBuilder;

let regexp = RegExpBuilder::from(&["You smell like 💩."])
    .with_escaping_of_non_ascii_chars(false)
    .build();
assert_eq!(regexp, "^You smell like \\u{1f4a9}\\.$");

Old versions of JavaScript do not support unicode escape sequences for the astral code planes (range U+010000 to U+10FFFF). In order to support these symbols in JavaScript regular expressions, the conversion to surrogate pairs is necessary. More information on that matter can be found here.

use grex::RegExpBuilder;

let regexp = RegExpBuilder::from(&["You smell like 💩."])
    .with_escaping_of_non_ascii_chars(true)
    .build();
assert_eq!(regexp, "^You smell like \\u{d83d}\\u{dca9}\\.$");
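
How an astral code point turns into a surrogate pair can be sketched with the standard library's char::encode_utf16. This is an illustration only and not grex's implementation; the function name escape_non_ascii is hypothetical, and regex metacharacters such as . are deliberately left unescaped here.

```rust
/// Replaces every non-ASCII char with a \u{...} escape sequence.
/// With `use_surrogate_pairs`, each char is first encoded as UTF-16, so
/// code points above U+FFFF become two escapes (high and low surrogate).
fn escape_non_ascii(input: &str, use_surrogate_pairs: bool) -> String {
    let mut out = String::new();
    for c in input.chars() {
        if c.is_ascii() {
            out.push(c);
        } else if use_surrogate_pairs {
            let mut buf = [0u16; 2];
            for unit in c.encode_utf16(&mut buf).iter() {
                out.push_str(&format!("\\u{{{:x}}}", unit));
            }
        } else {
            out.push_str(&format!("\\u{{{:x}}}", c as u32));
        }
    }
    out
}
```

For U+1F4A9, subtracting 0x10000 gives 0xF4A9; the upper ten bits (0x3D) added to 0xD800 yield d83d, and the lower ten bits (0xA9) added to 0xDC00 yield dca9.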

5.2.5 Case-insensitive matching

The regular expressions that grex generates are case-sensitive by default. Case-insensitive matching can be enabled like so:

use grex::{Feature, RegExpBuilder};

let regexp = RegExpBuilder::from(&["big", "BIGGER"])
    .with_conversion_of(&[Feature::CaseInsensitivity])
    .build();
assert_eq!(regexp, "(?i)^big(?:ger)?$");

5.2.6 Capturing Groups

Non-capturing groups are used by default. Extending the previous example, you can switch to capturing groups instead.

use grex::{Feature, RegExpBuilder};

let regexp = RegExpBuilder::from(&["big", "BIGGER"])
    .with_conversion_of(&[Feature::CaseInsensitivity, Feature::CapturingGroup])
    .build();
assert_eq!(regexp, "(?i)^big(ger)?$");

5.2.7 Syntax highlighting

The method with_syntax_highlighting() may only be used if the resulting regular expression is meant to be printed to the console. The regex string representation returned from enabling this setting cannot be fed into the regex crate.

use grex::RegExpBuilder;

let regexp = RegExpBuilder::from(&["a", "aa", "123"])
    .with_syntax_highlighting()
    .build();

5.3 Examples

The following examples show the various supported regex syntax features:

$ grex a b c
^[a-c]$

$ grex a c d e f
^[ac-f]$

$ grex a b x de
^(?:de|[abx])$

$ grex abc bc
^a?bc$

$ grex a b bc
^(?:bc?|a)$

$ grex [a-z]
^\[a\-z\]$
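
The previous example shows that input is always treated literally. A minimal sketch of such escaping, using only the standard library, could look as follows. The exact set of characters that grex escapes is an assumption here, and escape_literal is a hypothetical name.

```rust
/// Prefixes every regex metacharacter in `input` with a backslash so that
/// the result matches the input literally.
/// The character set below is an assumption chosen for illustration.
fn escape_literal(input: &str) -> String {
    const SPECIAL: &str = r"\.+*?()|[]{}^$-";
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        if SPECIAL.contains(c) {
            out.push('\\');
        }
        out.push(c);
    }
    out
}
```

Wrapping the result in ^ and $ reproduces the anchored output shown above for the input [a-z].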

$ grex -r b ba baa baaa
^b(?:a{1,3})?$

$ grex -r b ba baa baaaa
^b(?:a{1,2}|a{4})?$

$ grex y̆ a z
^(?:y̆|[az])$
Note: 
Grapheme y̆ consists of two Unicode symbols:
U+0079 (Latin Small Letter Y)
U+0306 (Combining Breve)
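
The note above can be checked with the standard library, which iterates over Unicode scalar values rather than graphemes. The helper code_points below is a hypothetical name used for illustration only.

```rust
/// Lists the Unicode code points of `s` in U+XXXX notation.
/// `str::chars` yields scalar values, so a grapheme built from a base
/// letter plus a combining mark produces more than one entry.
fn code_points(s: &str) -> Vec<String> {
    s.chars().map(|c| format!("U+{:04X}", c as u32)).collect()
}
```

Grouping such code points into graphemes requires Unicode segmentation rules, which the standard library does not provide.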

$ grex "I ♥ cake" "I ♥ cookies"
^I ♥ c(?:ookies|ake)$
Note:
Input containing blank space must be 
surrounded by quotation marks.

The string "I ♥♥♥ 36 and ٣ and 💩💩." serves as input for the following examples using the command-line notation:

$ grex <INPUT>
^I ♥♥♥ 36 and ٣ and 💩💩\.$

$ grex -e <INPUT>
^I \u{2665}\u{2665}\u{2665} 36 and \u{663} and \u{1f4a9}\u{1f4a9}\.$

$ grex -e --with-surrogates <INPUT>
^I \u{2665}\u{2665}\u{2665} 36 and \u{663} and \u{d83d}\u{dca9}\u{d83d}\u{dca9}\.$

$ grex -d <INPUT>
^I ♥♥♥ \d\d and \d and 💩💩\.$

$ grex -s <INPUT>
^I\s♥♥♥\s36\sand\s٣\sand\s💩💩\.$

$ grex -w <INPUT>
^\w ♥♥♥ \w\w \w\w\w \w \w\w\w 💩💩\.$

$ grex -D <INPUT>
^\D\D\D\D\D\D36\D\D\D\D\D٣\D\D\D\D\D\D\D\D$

$ grex -S <INPUT>
^\S \S\S\S \S\S \S\S\S \S \S\S\S \S\S\S$

$ grex -dsw <INPUT>
^\w\s♥♥♥\s\d\d\s\w\w\w\s\d\s\w\w\w\s💩💩\.$

$ grex -dswW <INPUT>
^\w\s\W\W\W\s\d\d\s\w\w\w\s\d\s\w\w\w\s\W\W\W$

$ grex -r <INPUT>
^I ♥{3} 36 and ٣ and 💩{2}\.$

$ grex -er <INPUT>
^I \u{2665}{3} 36 and \u{663} and \u{1f4a9}{2}\.$

$ grex -er --with-surrogates <INPUT>
^I \u{2665}{3} 36 and \u{663} and (?:\u{d83d}\u{dca9}){2}\.$

$ grex -dgr <INPUT>
^I ♥{3} \d(\d and ){2}💩{2}\.$

$ grex -rs <INPUT>
^I\s♥{3}\s36\sand\s٣\sand\s💩{2}\.$

$ grex -rw <INPUT>
^\w ♥{3} \w(?:\w \w{3} ){2}💩{2}\.$

$ grex -Dr <INPUT>
^\D{6}36\D{5}٣\D{8}$

$ grex -rS <INPUT>
^\S \S(?:\S{2} ){2}\S{3} \S \S{3} \S{3}$

$ grex -rW <INPUT>
^I\W{5}36\Wand\W٣\Wand\W{4}$

$ grex -drsw <INPUT>
^\w\s♥{3}\s\d(?:\d\s\w{3}\s){2}💩{2}\.$

$ grex -drswW <INPUT>
^\w\s\W{3}\s\d(?:\d\s\w{3}\s){2}\W{3}$

6. How to build?

In order to build the source code yourself, you need the stable Rust toolchain installed on your machine so that cargo, the Rust package manager, is available.

git clone https://github.com/pemistahl/grex.git
cd grex
cargo build

The source code is accompanied by an extensive test suite consisting of unit tests, integration tests and property tests. For running the unit and integration tests, simply say:

cargo test

Property tests are disabled by default with the #[ignore] annotation because they are very long-running. They are used for automatically generating test cases for regular expression conversion. If a test case is found that produces a wrong conversion, it is shrunk to the shortest possible test case that still produces a wrong result. This is a very useful tool for finding bugs. If you want to run these tests, say:

cargo test -- --ignored

7. How does it work?

  1. A deterministic finite automaton (DFA) is created from the input strings.

  2. The number of states and transitions between states in the DFA is reduced by applying Hopcroft's DFA minimization algorithm.

  3. The minimized DFA is expressed as a system of linear equations which are solved with Brzozowski's algebraic method, resulting in the final regular expression.
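
Step 1 can be sketched as a trie-shaped DFA built with the standard library. This is a simplified illustration under stated assumptions, not grex's actual data structure: it works on chars instead of graphemes, the Dfa type and its methods are hypothetical, and steps 2 and 3 (Hopcroft's minimization and Brzozowski's algebraic method) are not shown.

```rust
use std::collections::HashMap;

/// A DFA whose states are indices; state 0 is the start state.
struct Dfa {
    transitions: Vec<HashMap<char, usize>>,
    accepting: Vec<bool>,
}

impl Dfa {
    /// Builds a trie-shaped DFA that accepts exactly the given strings.
    fn from_strings(inputs: &[&str]) -> Self {
        let mut dfa = Dfa {
            transitions: vec![HashMap::new()],
            accepting: vec![false],
        };
        for input in inputs {
            let mut state = 0;
            for c in input.chars() {
                // Copy the target index out so no borrow is held while mutating.
                let existing = dfa.transitions[state].get(&c).copied();
                state = match existing {
                    Some(next) => next,
                    None => {
                        // Create a fresh state and connect it.
                        let next = dfa.transitions.len();
                        dfa.transitions.push(HashMap::new());
                        dfa.accepting.push(false);
                        dfa.transitions[state].insert(c, next);
                        next
                    }
                };
            }
            // The state reached after the last char accepts this input.
            dfa.accepting[state] = true;
        }
        dfa
    }

    /// Follows the transitions for `input`; rejects on a missing edge.
    fn accepts(&self, input: &str) -> bool {
        let mut state = 0;
        for c in input.chars() {
            match self.transitions[state].get(&c) {
                Some(&next) => state = next,
                None => return false,
            }
        }
        self.accepting[state]
    }
}
```

Because the automaton is built only from the test cases, it accepts those strings and nothing else, which is the property the default settings are described as preserving.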

8. Do you want to contribute?

In case you want to contribute something to grex even though it's in a very early stage of development, then I encourage you to do so nevertheless. Do you have ideas for cool features? Or have you found any bugs so far? Feel free to open an issue or send a pull request. It's very much appreciated. :-)

Comments
  • Add option to exclude test cases

    This adds a new option --file-negative, which contains a list of negative test cases. The resulting regex will strictly not match any of these test cases. This fixes #16.


    To support negation, a second DFA is built of the negative cases, and then subtracted from the positive case DFA, using the standard DFA combination algorithm. To limit the number of nodes generated, combinations of nodes in the two DFAs are visited in depth-first order. Nodes that only occur in the negative match DFA are not visited.

    Because the repetition feature can produce grapheme transitions in the DFA that are variable length, code is added to calculate the overlap of two grapheme ranges.

    The generated graphs can contain 'dead ends', so some code is added to remove those. Some bug fixes in the recreate_graph function were also necessary for corner cases that were previously not hit. Also, find_next_state was rewritten to use the new grapheme overlapping function, to prevent sometimes creating multiple conflicting edges out of a node.

    As part of this, a bug was fixed that previously caused blank lines in the input to not be considered in the final regex, because the "initial" state could never be considered an accept state.

    I got rid of final_state_indices and moved that information into the node label. I also added descriptive labels to nodes to aid debugging.

    Adds appropriate tests. All pass. Ran through cargo fmt and cargo clippy.

    I haven't written much rust before so please let me know if there are any issues.

    opened by allanlw 10
  • Problems to consider when making anchors optional

    It seems grex inherited this bug from regexgen: https://github.com/devongovett/regexgen/issues/31

    Repro:

    $ cat input
    AGBHD
    EIBCD
    EGBCD
    FBJBF
    AGBH
    EIBC
    EGBC
    EBC
    FBC
    CD
    F
    C
    ABCD
    EBCD
    FBCD
    
    $ # note the last entry to be matched, i.e. "FBCD"
    
    $ grex --file input
    ^(?:F(?:BJBF)?|(?:E(?:[GI])?BC|(?:FB)?C)D?|A(?:GBHD?|BCD))$
    

    After removing ^ and $ (see #30), this generated pattern does not match "FBCD" despite it being one of the input strings:

    'FBCD'.match(/(?:F(?:BJBF)?|(?:E(?:[GI])?BC|(?:FB)?C)D?|A(?:GBHD?|BCD))/g);
    // → ['F', 'CD']
    

    Here’s what I think the bug is: within the generated pattern, it should never happen that something on the left matches a prefix of something that's further on the right, because then the latter can never match.

    See https://github.com/devongovett/regexgen/issues/31#issuecomment-801380409 for some more details.

    enhancement 
    opened by mathiasbynens 6
  • Add optional CLI feature if using grex as a library

    Hi @pemistahl, first off, thanks for this wonderful library!

    But would it be possible to have an optional CLI feature in cargo.toml?

    In that way, if I'm using grex as a library, I don't need to get dependencies like structopt included in my project.

    enhancement 
    opened by jqnatividad 5
  • Grex hash for v1.2.0 release fails to verify in scoop

    Simple as that:

    λ scoop install grex
    Installing 'grex' (1.2.0) [64bit]
    grex-v1.2.0-x86_64-pc-windows-msvc.zip (792,6 KB) [==========================================================================================================================] 100%
    Checking hash of grex-v1.2.0-x86_64-pc-windows-msvc.zip ... ERROR Hash check failed!
    App:         main/grex
    URL:         https://github.com/pemistahl/grex/releases/download/v1.2.0/grex-v1.2.0-x86_64-pc-windows-msvc.zip
    First bytes: 50 4B 03 04 14 00 00 00
    Expected:    d075efdbccb01c8b093b6c5120d064cc5ead534dec483c1a3d43cc4543d940ea
    Actual:      da9c50a4e19cbf7b1c4a001a9252c1a097b8eebbb9ec0bbf3f88bc79030e7d73
    

    Creating an issue as this may have been overlooked. Another option is that I'm facing a MITM attack, which would be the worst-case scenario ;)

    opened by piaseckim 5
  • Make anchors "^" and "$" optional

    Additional options:

    • -B, --match-beginning - Match the beginning of the string (prepend ^)
    • -E, --match-end - Match the end of the string (append $)
    • -X, --match-line - Match the whole string (as a shorthand for -B -E)

    It's the result of the discussion in the issue pemistahl/grex#30. Sorry if some of my modifications look silly. It's my attempt to understand Rust from scratch.

    opened by ildar-shaimordanov 4
  • Overly complex regex with input containing several common parts

    While building a regex for the various possible formats of Creative Commons' Public Domain Mark (to assist in https://github.com/spdx/license-list-XML/issues/988), I noticed that grex produces a more complex regex than what the input requires.

    Here's what I provided:

    grex \
      "This work is free of known copyright restrictions." \
      "This work (WWW) is free of known copyright restrictions." \
      "This work (by AAA) is free of known copyright restrictions." \
      "This work, identified by CCC, is free of known copyright restrictions." \
      "This work (WWW, by AAA) is free of known copyright restrictions." \
      "This work (WWW), identified by CCC, is free of known copyright restrictions." \
      "This work (WWW, by AAA), identified by CCC, is free of known copyright restrictions." \
      "This work (by AAA), identified by CCC, is free of known copyright restrictions."
    

    The result was (after manually making groups non-capturing):

    ^This work(?:(?: \((?:(?:WWW, b|b)y AAA|WWW)\),|,) identified by CCC, |(?: \((?:(?:WWW, b|b)y AAA|WWW)\) | ))is free of known copyright restrictions\.$
    

    Visualized as a Debuggex diagram:

    Screenshot 2020-03-10 at 14 20 39

    A regex produced by hand to match the same input shows that this could be simplified:

    ^This work(?: \((?:WWW(?:, by AAA)?|by AAA)\))?(?:, identified by CCC,)? is free of known copyright restrictions\.$
    

    Debuggex diagram:

    Screenshot 2020-03-10 at 12 22 50

    wontfix 
    opened by waldyrious 4
  • Add feature for disabling capturing groups

    grex produces regular expressions with capturing groups by default. Some users might prefer to create regexes with non-capturing groups instead, so I will add a new library method and a new command-line flag for handling this use case.

    enhancement 
    opened by pemistahl 4
  • Optional anchors "^" and "$"

    Added options to suppress anchors:

    • -B, --no-match-beginning - Match the beginning of the string (prepend ^)
    • -E, --no-match-end - Match the end of the string (append $)
    • -X, --no-match-line - Match the whole string (as a shorthand for -B -E)

    This PR is intended to close the issue pemistahl/grex#30 and my previous GH-39 as this one covers the requirements to keep anchors by default.

    opened by ildar-shaimordanov 3
  • Couldn't compile ndarray

    Hey I'm on Debian bullseye and cargo install grex won't succeed :

    [lots of errors]
    error: aborting due to 204 previous errors
    
    For more information about this error, try `rustc --explain E0277`.
    error: could not compile `ndarray`.
    

    ndarray version = 0.15.1
    cargo version = 1.47

    has someone experienced the same issue ?

    opened by 0-Kala-0 3
  • Installation problem

    When installing grex on Debian Linux, I get 365 syntax errors. They seem to be many repetitions of:

    the trait data_traits::RawDataSubst<u128> is not implemented for <S as data_traits::DataOwned>::MaybeUninit
    348 | impl_scalar_lhs_op!(Complex, Ordered, /, Div, div, "division");
        | -------------------------------------------------------------------- in this macro invocation
        |
       ::: /home/greg/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.15.0/src/data_traits.rs:411:1
        |
    411 | pub unsafe trait DataOwned: Data {
        | -------------------------------- required by data_traits::DataOwned

    I seem to get the same blast whether I run cargo from the command-line or in vscode. I installed with:

    $ git clone https://github.com/pemistahl/grex.git
    $ cd grex
    $ cargo build

    and

    $ cargo install grex

    Since I don't see any other complaints perhaps my distribution is to blame:

    $ uname -a
    Linux debian-dell-desktop 5.10.0-4-amd64 #1 SMP Debian 5.10.19-1 (2021-03-02) x86_64 GNU/Linux

    Creating an empty project with grex as the only dependency also fails. Version 1.1 of grex seems to run fine. I've only been programming rust for about a year, so I haven't gotten to writing macros yet, but I'll try to dig deeper.

    -- Greg

    opened by GregLawson 3
  • Inserting a character breaks repetition detection (sometimes)

    I have been looking for a way to find repeated substrings. I think I can parse grex results to find repetitions, and given that my strings are rather short, I could then compare group contents to find non-contiguous repetitions.

    I did some quick tests and I may have chanced upon a problem:

    • grex -dsr -c 'heeelooo world lalala lalala foo foo xalxalxal xalxalxal'

      gives ^he{3}lo{3}\sworld(?:\s(?:la){3}){2}(?:\sfo{2}){2}(?:\s(?:xal){3}){2}$

    • grex -dsr -c 'heeelooo world lalala lalala foo foo xalxalxal i xalxalxal'

      gives ^he{3}lo{3}\sworld\s(?:(?:la){3}\s){2}(?:fo{2}\s){2}(?:xal){3}\si\s(?:xal){3}$

    • grex -dsr -c 'heeelooo world lalala k lalala foo foo xalxalxal i xalxalxal'

      gives ^he{3}lo{3}\sworld\slalala\sk\slalala\s(?:fo{2}\s){2}(?:xal){3}\si\s(?:xal){3}$

    In the last probe, neither of the two lalala was detected as repetitious when a k was inserted, although xalxalxal was treated as expected. Any thoughts?

    bug 
    opened by loveencounterflow 3
  • Bump clap from 3.2.17 to 4.0.0

    Bumps clap from 3.2.17 to 4.0.0.

    Release notes

    Sourced from clap's releases.

    v4.0.0-rc.3

    Breaking Changes

    • ArgAction::Set, ArgAction::SetTrue, and ArgAction::SetFalse now conflict by default to be like ArgAction::StoreValue and ArgAction::IncOccurrences, requiring cmd.args_override_self(true) to override instead (#4261)
    • (help) Line wrapping of help is now behind the existing wrap_help feature flag, either enable it or hard code your wraps (#4258)

    Features

    • Add From<&OsStr>, From<OsString>, From<&str>, and From<String> to value_parser! (#4257)
    • Added StyledStr::ansi() to Display with ANSI escape codes
    • (error) Added Error::render which returns a StyledStr
    • (help) Command::render_usage now returns a StyledStr
    • (help) Command::render_help and Command::render_long_help which returned StyledStr

    v4.0.0-rc.2

    Documentation

    • (derive) Clarify relationship with value parser (#4244)

    v4.0.0-rc.1

    Highlights

    Arg::num_args(range)

    Clap has had several ways for controlling how many values will be captured without always being clear on how they interacted, including

    • Arg::multiple_values(true)
    • Arg::number_of_values(4)
    • Arg::min_values(2)
    • Arg::max_values(20)
    • Arg::takes_value(true)

    These have now all been collapsed into Arg::num_args which accepts both single values and ranges of values. num_args controls how many raw arguments on the command line will be captured as values per occurrence and independent of value delimiters.

    See Issue 2688 for more background.

    Polishing Help

    Clap strives to give a polished CLI experience out of the box with little ceremony. With some feedback that has accumulated over time, we took this release as an opportunity to re-evaluate our --help output to make sure it is meeting that goal.

    In doing this evaluation, we wanted to keep in mind:

    • Whether other CLIs had ideas that make sense to apply
    • Providing an experience that fits within the rest of applications and works across all shells

    ... (truncated)

    Changelog

    Sourced from clap's changelog.

    [4.0.0] - 2022-09-28

    Highlights

    Arg::num_args(range)

    Clap has had several ways for controlling how many values will be captured without always being clear on how they interacted, including

    • Arg::multiple_values(true)
    • Arg::number_of_values(4)
    • Arg::min_values(2)
    • Arg::max_values(20)
    • Arg::takes_value(true)

    These have now all been collapsed into Arg::num_args which accepts both single values and ranges of values. num_args controls how many raw arguments on the command line will be captured as values per occurrence and independent of value delimiters.

    See Issue 2688 for more background.

    Polishing Help

    Clap strives to give a polished CLI experience out of the box with little ceremony. With some feedback that has accumulated over time, we took this release as an opportunity to re-evaluate our --help output to make sure it is meeting that goal.

    In doing this evaluation, we wanted to keep in mind:

    • Whether other CLIs had ideas that make sense to apply
    • Providing an experience that fits within the rest of applications and works across all shells

    Before:

    git
    A fictional versioning CLI
    

    USAGE: git <SUBCOMMAND>

    OPTIONS: -h, --help Print help information

    SUBCOMMANDS: add adds things clone Clones repos help Print this message or the help of the given subcommand(s) push pushes things stash

    ... (truncated)

    Commits
    • 3a74d82 chore: Release
    • 9cd1939 Merge pull request #4269 from epage/usage
    • cb1cd67 fix(error): Include failed arg in usage in --flag=bad-value error
    • 12d76d6 fix(error): Include 'Usage:' title in --flag=bad-value error
    • 3a8d2a5 test(parser): Verify existing --flag=bad-value case
    • c7dd03e Merge pull request #4267 from jpgrayson/override-usage-help-4
    • f925ca8 docs: Clarify how to use examples
    • 7dd216b docs: Update multiline usage override rules
    • a0c8c7d docs: Clean up StyledStr entries
    • 01672f8 chore: Release
    • Additional commits viewable in compare view

    dependencies rust 
    opened by dependabot[bot] 0
  • Bump itertools from 0.10.3 to 0.10.5

    Bumps itertools from 0.10.3 to 0.10.5.

    Changelog

    Sourced from itertools's changelog.

    Changelog

    0.10.4

    0.10.2

    • Add Itertools::multiunzip (#362, #565)
    • Add intersperse and intersperse_with free functions (#555)
    • Add Itertools::sorted_by_cached_key (#424, #575)
    • Specialize ProcessResults::fold (#563)
    • Fix subtraction overflow in DuplicatesBy::size_hint (#552)
    • Fix specialization tests (#574)
    • More Debug impls (#573)
    • Deprecate fold1 (use reduce instead) (#580)
    • Documentation fixes (HomogenousTuple, into_group_map, into_group_map_by, MultiPeek::peek) (#543 et al.)

    0.10.1

    • Add Itertools::contains (#514)
    • Add Itertools::counts_by (#515)
    • Add Itertools::partition_result (#511)
    • Add Itertools::all_unique (#241)
    • Add Itertools::duplicates and Itertools::duplicates_by (#502)
    • Add chain! (#525)
    • Add Itertools::at_most_one (#523)
    • Add Itertools::flatten_ok (#527)
    • Add EitherOrBoth::or_default (#583)
    • Add Itertools::find_or_last and Itertools::find_or_first (#535)
    • Implement FusedIterator for FilterOk, FilterMapOk, InterleaveShortest, KMergeBy, MergeBy, PadUsing, Positions, Product , RcIter, TupleWindows, Unique, UniqueBy, Update, WhileSome, Combinations, CombinationsWithReplacement, Powerset, RepeatN, and WithPosition (#550)
    • Implement FusedIterator for Interleave, IntersperseWith, and ZipLongest (#548)

    0.10.0

    • Increase minimum supported Rust version to 1.32.0
    • Improve macro hygiene (#507)
    • Add Itertools::powerset (#335)
    • Add Itertools::sorted_unstable, Itertools::sorted_unstable_by, and Itertools::sorted_unstable_by_key (#494)
    • Implement Error for ExactlyOneError (#484)
    • Undeprecate Itertools::fold_while (#476)
    • Tuple-related adapters work for tuples of arity up to 12 (#475)
    • use_alloc feature for users who have alloc, but not std (#474)
    • Add Itertools::k_smallest (#473)
    • Add Itertools::into_grouping_map and GroupingMap (#465)
    • Add Itertools::into_grouping_map_by and GroupingMapBy (#465)
    • Add Itertools::counts (#468)
    • Add implementation of DoubleEndedIterator for Unique (#442)
    • Add implementation of DoubleEndedIterator for UniqueBy (#442)
    • Add implementation of DoubleEndedIterator for Zip (#346)

    ... (truncated)

    dependencies rust 
    opened by dependabot[bot] 0
  • Bump unicode-segmentation from 1.9.0 to 1.10.0

    Bumps unicode-segmentation from 1.9.0 to 1.10.0.

    dependencies rust 
    opened by dependabot[bot] 0
  • Bump criterion from 0.3.6 to 0.4.0

    Bumps criterion from 0.3.6 to 0.4.0.

    Changelog

    Sourced from criterion's changelog.

    [0.4.0] - 2022-09-10

    Removed

    • The Criterion::can_plot function has been removed.
    • The Criterion::bench_function_over_inputs function has been removed.
    • The Criterion::bench_functions function has been removed.
    • The Criterion::bench function has been removed.

    Changed

    • HTML report hidden behind non-default feature flag: 'html_reports'
    • Standalone support (ie without cargo-criterion) feature flag: 'cargo_bench_support'
    • MSRV bumped to 1.57
    • rayon and plotters are optional (and default) dependencies.
    • Status messages ('warming up', 'analyzing', etc) are printed to stderr, benchmark results are printed to stdout.
    • Accept subsecond durations for --warm-up-time, --measurement-time and --profile-time.
    • Replaced serde_cbor with ciborium because the former is no longer maintained.
    • Upgrade clap to v3 and regex to v1.5.

    Added

    • A --discard-baseline flag for discarding rather than saving benchmark results.
    • Formal support for benchmarking code compiled to web-assembly.
    • A --quiet flag for printing just a single line per benchmark.
    • A Throughput::BytesDecimal option for measuring throughput in bytes but printing them using decimal units like kilobytes instead of binary units like kibibytes.

    Fixed

    • When using bench_with_input, the input parameter will now be passed through black_box before passing it to the benchmark.
    Commits
    • 5e27b69 Merge branch 'version-0.4'
    • 4d6d69a Increment version numbers.
    • 935c632 Add Throughput::BytesDecimal. Fixes #581.
    • f82ce59 Remove critcmp code (it belongs in cargo-criterion) (#610)
    • a18d080 Merge branch 'master' into version-0.4
    • f9c6b8d Merge pull request #608 from Cryptex-github/patch-1
    • 8d0224e Fix html report path
    • 2934163 Add missing black_box for bench_with_input parameters. Fixes 566.
    • dfd7b65 Add duplicated benchmark ID to assertion message.
    • ce8259e Bump criterion-plot version number.
    • Additional commits viewable in compare view

    dependencies rust 
    opened by dependabot[bot] 0
  • Allow to specify characters that have to be converted to character class

    First of all, thank you for this great tool. When using it, I often need to convert text into more specific character classes, not just non-digits or non-blank characters. Would it be possible to customize the range of characters that are converted into character classes, such as [a-e\d], [①-⑨⒈-⒙], or ranges for specific languages such as Chinese and Japanese? For example, if the source text is 我的名字是Tom, I would like to get the regular expression [\u{4e00}-\u{9fa5}]{5}\w{3} instead of \w{8} by specifying the character class [\u{4e00}-\u{9fa5}].

    I would also like to specify the minimum and maximum length of repeated substrings. Sometimes I get results like (\w{5}|\w{7,8}|\w{10,17}), whereas the regular expression I expected is (\w{3,20}). So I hope to be able to specify the minimum and maximum number of repetitions of a substring, or to combine the repetition counts into a single interval instead of multiple branches. I think these two options could be specified together, using a format similar to \w{3,20} to name the characters that must be converted into character classes.
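
    The second request, collapsing several repetition branches into one interval, can be illustrated with a minimal sketch. The function `merge_repetitions` below is hypothetical and is not part of grex or its API; it only assumes that each branch repeats the same token and is described by a (min, max) pair.

    ```rust
    /// Hypothetical helper, NOT part of grex: collapses several repetition
    /// ranges of the same token into one covering range.
    /// Returns `None` when no ranges are given.
    fn merge_repetitions(token: &str, ranges: &[(u32, u32)]) -> Option<String> {
        // Smallest lower bound and largest upper bound over all branches.
        let min = ranges.iter().map(|&(lo, _)| lo).min()?;
        let max = ranges.iter().map(|&(_, hi)| hi).max()?;
        Some(if min == max {
            // A single exact count, e.g. \w{5}
            format!("{}{{{}}}", token, min)
        } else {
            // A covering interval, e.g. \w{5,17}
            format!("{}{{{},{}}}", token, min, max)
        })
    }
    ```

    Calling `merge_repetitions(r"\w", &[(5, 5), (7, 8), (10, 17)])` on the three branches from the example yields `\w{5,17}`. Note that the merged interval is more general than the original alternation, since it also accepts lengths 6 and 9, which is a trade-off against grex's default of producing the most specific expression.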

    opened by NightWatch0 1
Releases(v1.4.0)
Owner
Peter M. Stahl
Computational linguist, Rust enthusiast, green IT advocate