phpgrep

Syntax-aware grep for PHP code.

This repository is used for the library and command-line tool development. A good source for additional utilities and ready-to-run recipes is the phpgrep-contrib repository.

Overview

phpgrep is both a library and a command-line tool.

The library can be used to perform syntax-aware PHP code matching inside Go programs, while the binary can be used from your favorite text editor or terminal emulator.

It's very close to the structural search and replace in PhpStorm, but better suited for standalone usage.

In many ways, it's inspired by github.com/mvdan/gogrep/.

See also: "phpgrep: syntax aware code search".

Quick start

If you're using VS Code, you might be interested in the vscode-phpgrep extension.

Download a phpgrep binary from the latest release and put it somewhere under your $PATH.

Run phpgrep with the -help flag to verify that everything is okay.

$ phpgrep -help
Usage: phpgrep [flags...] targets pattern [filters...]
Where:
  flags are command-line arguments that are listed in -help (see below)
  targets is a comma-separated list of file or directory names to search in
  pattern is a string that describes what is being matched
  filters are optional arguments bound to the pattern

Examples:
  # Find f calls with a single variable argument.
  phpgrep file.php 'f(${"var"})'

  # Like the previous example, but searches inside entire
  # directory recursively and variable names are restricted
  # to $id, $uid and $gid.
  # Also uses -v flag that makes phpgrep output more info.
  phpgrep -v ~/code/php 'f(${"x:var"})' 'x=id,uid,gid'

  # Run phpgrep on 2 folders (recursively).
  phpgrep dir1,dir2 '"some string"'

  # Print only matches, without locations.
  phpgrep -format '{{.Match}}' file.php 'pattern'

  # Print only assignments right-hand side.
  phpgrep -format '{{.rhs}}' file.php '$_ = $rhs'

  # Ignore vendored source code inside project.
  phpgrep --exclude '/vendor/' project/ 'pattern'

Custom output formatting is possible via the -format flag template.
  {{.Filename}}  match containing file name
  {{.Line}}      line number where the match started
  {{.MatchLine}} a source code line that contains the match
  {{.Match}}     an entire match string
  {{.x}}         $x submatch string (can be any submatch name)

The output colors can be configured with "--color-<name>" flags.
Use --no-color to disable the output coloring.

Exit status:
  0 if something is matched
  1 if nothing is matched
  2 if error occurred

# ... rest of output
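
The {{.Name}} placeholders in the -format description above look like Go text/template actions. As a rough sketch of how such a format string could be expanded, assuming text/template semantics (renderMatch and the field values below are hypothetical and are not taken from phpgrep's source):

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderMatch is a hypothetical helper: it expands a -format style template
// against a set of named fields (Filename, Line, Match, submatch names, ...).
func renderMatch(format string, fields map[string]string) (string, error) {
	tmpl, err := template.New("format").Parse(format)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, fields); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Made-up data for one match of the pattern '$_ = $rhs'.
	fields := map[string]string{
		"Filename":  "file.php",
		"Line":      "3",
		"MatchLine": "$a = f(10);",
		"Match":     "$a = f(10)",
		"rhs":       "f(10)",
	}
	formats := []string{
		"{{.Filename}}:{{.Line}}: {{.MatchLine}}",
		"{{.rhs}}",
	}
	for _, format := range formats {
		out, err := renderMatch(format, fields)
		if err != nil {
			fmt.Println("bad format:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

A map is used for the fields so that lowercase submatch names such as rhs can be looked up the same way as the built-in ones.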

Create a test file hello.php:

<?php
function f(...$xs) {}
f(10);
f(20);
f(30); // aha!
f($x);
f();

Run phpgrep over that file:

$ phpgrep hello.php 'f(${"x:int"})' 'x!=20'
hello.php:3: f(10);
hello.php:5: f(30); // aha!
found 2 matches

We found all f calls with a single argument x that is an int literal not equal to 20.
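
To make the filter strings used in this README ('x!=20', 'x=id,uid,gid', 'x~.*_id$', 'x!~^\{') more concrete, here is a minimal sketch of how such a filter could be evaluated against the text of a submatch. The operator semantics are inferred from the examples only, and checkFilter is a hypothetical helper, not phpgrep's actual filter code:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// checkFilter reports whether text (the source text of a submatch) passes
// a filter such as "x!=20" or "x~.*_id$". The submatch name before the
// operator is only validated here; a real tool would use it to pick the
// submatch. Semantics are assumptions based on the README examples.
func checkFilter(filter, text string) (bool, error) {
	i := strings.IndexAny(filter, "!=~")
	if i <= 0 {
		return false, fmt.Errorf("filter %q: expected a name followed by an operator", filter)
	}
	op := filter[i : i+1]
	if op == "!" {
		if i+2 > len(filter) || (filter[i+1] != '=' && filter[i+1] != '~') {
			return false, fmt.Errorf("filter %q: expected != or !~", filter)
		}
		op = filter[i : i+2]
	}
	arg := filter[i+len(op):]

	switch op {
	case "=", "!=":
		// The argument is treated as a comma-separated list of allowed values.
		found := false
		for _, v := range strings.Split(arg, ",") {
			if v == text {
				found = true
				break
			}
		}
		return found == (op == "="), nil
	default: // "~" or "!~": the argument is a regular expression.
		re, err := regexp.Compile(arg)
		if err != nil {
			return false, err
		}
		return re.MatchString(text) == (op == "~"), nil
	}
}

func main() {
	// Int literal arguments from hello.php, checked against 'x!=20'.
	for _, arg := range []string{"10", "20", "30"} {
		ok, err := checkFilter("x!=20", arg)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("f(%s) accepted: %v\n", arg, ok)
	}
}
```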

The next thing to learn is the ${"*"} matcher.

Suppose you need to match all foo function calls that have a null argument.
foo is variadic, so it's unknown where that argument can be located.

This pattern will match null arguments at any position: foo(${"*"}, null, ${"*"}).
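
To build intuition for how ${"*"} can absorb any number of arguments, here is a minimal sketch of sequence matching over plain string tokens. It is not phpgrep's matcher: matchSeq, the "*" and "_" token spellings, and the try-every-split strategy are assumptions made for illustration.

```go
package main

import "fmt"

// matchSeq reports whether args matches pattern, where pattern tokens are:
//   "*"  any number of arguments (including zero)
//   "_"  exactly one argument of any kind
//   anything else: an argument with exactly that text
func matchSeq(pattern, args []string) bool {
	if len(pattern) == 0 {
		return len(args) == 0
	}
	switch head := pattern[0]; head {
	case "*":
		// Try every possible number of arguments for the wildcard and
		// accept the first split that lets the rest of the pattern match.
		for n := 0; n <= len(args); n++ {
			if matchSeq(pattern[1:], args[n:]) {
				return true
			}
		}
		return false
	case "_":
		return len(args) > 0 && matchSeq(pattern[1:], args[1:])
	default:
		return len(args) > 0 && args[0] == head && matchSeq(pattern[1:], args[1:])
	}
}

func main() {
	// Token form of the pattern foo(${"*"}, null, ${"*"}).
	pattern := []string{"*", "null", "*"}
	calls := [][]string{
		{"null"},
		{"1", "null"},
		{"1", "2", "null", "3"},
		{"1", "2"},
	}
	for _, args := range calls {
		fmt.Println(args, matchSeq(pattern, args))
	}
}
```

Trying every split keeps the sketch short at the cost of extra work on long argument lists.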

Read the pattern language docs to learn more about how to write search patterns.

Read the user manual to learn more about phpgrep command-line arguments and to get some insights on how to use it.
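
When phpgrep is driven from another program, the exit status listed in the -help output (0 for a match, 1 for no match, 2 for an error) is the simplest thing to check. The sketch below assumes a phpgrep binary is available in $PATH and reuses the hello.php example from the quick start; describeExit and runPhpgrep are hypothetical helper names.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// describeExit maps the exit statuses documented in the -help output to labels.
func describeExit(code int) string {
	switch code {
	case 0:
		return "matched"
	case 1:
		return "no matches"
	case 2:
		return "error"
	default:
		return "unknown"
	}
}

// runPhpgrep runs the phpgrep binary found in $PATH and returns its exit status.
// A non-nil error means the process could not be run at all.
func runPhpgrep(args ...string) (int, error) {
	err := exec.Command("phpgrep", args...).Run()
	if err == nil {
		return 0, nil
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode(), nil
	}
	return -1, err // binary missing, not executable, and so on
}

func main() {
	code, err := runPhpgrep("hello.php", `f(${"x:int"})`, "x!=20")
	if err != nil {
		fmt.Println("could not run phpgrep:", err)
		return
	}
	fmt.Println(describeExit(code))
}
```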

Recipes

This section contains ready-to-use phpgrep patterns.

srcdir is a target source directory (can also be a single filename).

Useful recipes

# Find arrays with at least 1 duplicated key.
$ phpgrep srcdir '[${"*"}, $k => $_, ${"*"}, $k => $_, ${"*"}]'

# Find where `$x ?: $y` can be applied.
$ phpgrep srcdir '$x ? $x : $y' # Use `$x ?: $y` instead

# Find where `$x ?? $y` can be applied.
$ phpgrep srcdir 'isset($x) ? $x : $y'

# Find in_array calls that can be replaced with $x == $y.
$ phpgrep srcdir 'in_array($x, [$y])'

# Find potential operator precedence issues.
$ phpgrep srcdir '$x & $mask == $y' # Should be ($x & $mask) == $y
$ phpgrep srcdir '$x & $mask != $y' # Should be ($x & $mask) != $y

# Find calls where func args are misplaced.
$ phpgrep srcdir 'stripos(${"str"}, $_)'
$ phpgrep srcdir 'explode($_, ${"str"}, ${"*"})'

# Find new calls without parentheses.
$ phpgrep srcdir 'new $t'

# Find all if statements with a body without {}.
$ phpgrep srcdir 'if ($cond) $x' 'x!~^\{'
# Or without regexp.
$ phpgrep srcdir 'if ($code) ${"expr"}'

# Find all error-suppress operator usages.
$ phpgrep srcdir '@$_'

# Find all == (non-strict) comparisons with null.
$ phpgrep srcdir '$_ == null'

Miscellaneous recipes

# Find all function calls that have at least one var-argument that has _id suffix.
$ phpgrep srcdir '$f(${"*"}, ${"x:var"}, ${"*"})' 'x~.*_id$'

# Find foo calls where the second argument is integer literal.
$ phpgrep srcdir 'foo($_, ${"int"})'

Install from sources

You'll need Go tools to install phpgrep from sources.

To install the phpgrep binary under your $(go env GOPATH)/bin:

go get -v github.com/quasilyte/phpgrep/cmd/phpgrep

If $GOPATH/bin is under your system $PATH, the phpgrep command should be available after that.

Comments
  • After go get

    go get -v github.com/quasilyte/phpgrep/cmd/phpgrep
    github.com/quasilyte/phpgrep (download)
    github.com/z7zmey/php-parser (download)
    github.com/quasilyte/phpgrep
    # github.com/quasilyte/phpgrep
    go/src/github.com/quasilyte/phpgrep/utils.go:180:37: cannot use bytes.NewReader(code) (type *bytes.Reader) as type []byte in argument to php7.NewParser

    opened by brother0k 8
  • Add custom printing formatting

    Define -format argument that accepts a string that describes how match results should be printed. If it's an empty string, the entire match is printed as is (current behavior). For non-empty patterns every match is formatted accordingly.

    Example:

    phpgrep -format '$s' srcdir 'die("s:str")'
    
    pattern: 'die("s:str")'
    format: '$s'
    
    input: die("hello")
    output without format: die("hello")
    output with format: "hello"
    
    opened by quasilyte 5
  • matcher.go:432:8: undefined: expr.ParenExpr

    I can't seem to install your tool

    ➜ go get -v github.com/quasilyte/phpgrep/cmd/phpgrep
    github.com/z7zmey/php-parser/walker
    github.com/z7zmey/php-parser/position
    github.com/cznic/golex/lex
    github.com/z7zmey/php-parser/errors
    github.com/z7zmey/php-parser/freefloating
    github.com/z7zmey/php-parser/node
    github.com/z7zmey/php-parser/scanner
    github.com/z7zmey/php-parser/node/name
    github.com/z7zmey/php-parser/node/expr/cast
    github.com/z7zmey/php-parser/node/scalar
    github.com/z7zmey/php-parser/node/expr/assign
    github.com/z7zmey/php-parser/node/expr/binary
    github.com/z7zmey/php-parser/node/expr
    github.com/z7zmey/php-parser/node/stmt
    github.com/z7zmey/php-parser/printer
    github.com/z7zmey/php-parser/parser
    github.com/z7zmey/php-parser/php7
    github.com/quasilyte/phpgrep
    # github.com/quasilyte/phpgrep
    ../go/src/github.com/quasilyte/phpgrep/matcher.go:432:8: undefined: expr.ParenExpr
    ../go/src/github.com/quasilyte/phpgrep/utils.go:122:4: undefined: expr.ParenExpr
    ➜
    
    opened by ostrolucky 2
  • [${"*"}, $_] doesn't match [1, 2]

    $ cat test.php
    <?php
    var_dump([]);
    var_dump([1]);
    var_dump([1, 2]);
    var_dump([1,]);
    var_dump([1,2,]);
    $ phpgrep ./test.php '[${"*"}, $_]'
    ./test.php:4: [1]
    found 1 matches
    

    Expected 2 matches.

    opened by quasilyte 1
  • Patterns that don't always work without a proper backtracking support in matcher

    All cases below where the pattern doesn't match when it should are a result of no backtracking support inside the matcher.

    The ${"*"} node stops when no more nodes are left inside the block or when the next pattern node can be matched instead. If we fail a match after that, we never try to continue from the last ${"*"} to see if we can stop at a different point and accept the match.

    Backtracking would make matching slower, but some (how many?) cases where we need it can be determined during the pattern compile time. For the block that doesn't require it we can generate a non-backtracking node kind.

    Some examples that yield the unexpected results:

    <?php
    
    f(${"*"}, $_)
    // matches f(1)
    // doesn't match f(1, 2)
    // doesn't match f(1, 2, 3)
    //
    // Potential solution: generate "ends-with" pattern for this case,
    // check that the last node matches, ignore everything before it.
    
    f(${"*"}, 1, 1, ${"*"})
    // matches f(1, 1, 1)
    // doesn't match f(1, 1)
    // doesn't match f(0, 1, 0, 1, 1, 0)
    //
    // Either implement backtracking or reject such patterns during the compile time.
    
    global ${"*"}, $x, $x
    // matches global $a, $a
    // doesn't match global $a, $b, $b
    
    opened by quasilyte 1
  • Implement all matcher classes

    | Class | Description |
    |---|---|
    | * | Any node, 0-N times |
    | + | Any node, 1-N times |
    | int | Integer literal |
    | float | Float literal |
    | num | Integer or float literal |
    | str | String literal |
    | var | Variable |

    opened by quasilyte 1
  • fix errors printing in "update" progress mode

    If --progress is set to "update", don't print the errors right away, wait until we parse all files.

    This helps to avoid the cluttered output.

    opened by quasilyte 0
  • do case-insensitive search by default

    If case sensitivity should be strict, -case-sensitive flag can be used to match spellings literally.

    Note that case-insensitive mode only applies to things that are case-insensitive in PHP as well: function names, class names, etc.

    Suppose we have this PHP code (test.php):

    a::f()
    a::F()
    

    New phpgrep behavior:

    $ phpgrep test.php 'a::f()'
    finds both a::F() and a::f()
    
    $ phpgrep -case-sensitive test.php 'a::f()'
    finds only a::f()
    

    Refs #57

    Signed-off-by: Iskander Sharipov

    opened by quasilyte 0
  • cmd/phpgrep: fix filters

    At some point we updated php-parser. A side effect of that is that there is no need to subtract 1 from the start pos to get the [begin,end] slice. We did subtract 1 and as a result text that was forwarded to the filters always had 1 extra byte from the beginning.

    This commit also adds a way to test a phpgrep binary (e2e test).

    Updates #5

    Signed-off-by: Iskander Sharipov

    opened by quasilyte 0
  • unable to use phpgrep as library

    I'm getting the error "use of internal package github.com/quasilyte/phpgrep/internal/phpgrep not allowed".

    The README says phpgrep can be used as a command as well as a library, but that does not work. Please help us find the right way to do it, or is a fix required?

    opened by ervishal 5
  • Prestashop sql statements / verify escaping or type cast

    Hi,

    Hope you are all well !

    I wanted to use phpgrep to check if my prestashop code is missing some escaping function for any sql statement.

    For eg, in this commit https://github.com/PrestaShop/PrestaShop/commit/3fa0dfa5a8f4b149c7c90b948a12b4f5999a5ef8, you can see that the pSQL and (int) functions are missing.

    Is it possible to grep a list of all "Db::getInstance()" and check if the variables are escaped or cast ?

    Thanks for any insights or inputs on that :-)

    Cheers, Luc Michalski

    opened by ghost 0
  • Question: Possible to get whole body of function not one line?

    Is it possible to get the whole body of a function as a string when a specific call is found inside that function?

    Example:

    function(){
        $var = "not important";
        notme();
    }
    
    function2(){
        $var = "important";
        imhere();
    }
    

    Command: phpgrep find only imhere();

    Output:

        $var = "important";
        imhere();
    
    opened by klebann 4
Releases(v1.0.0)
Owner
Iskander (Alex) Sharipov