EarlyBird is a sensitive data detection tool capable of scanning source code repositories for clear text password violations, PII, outdated cryptography methods, key files and more.

Overview

EarlyBird is a sensitive data detection tool capable of scanning source code repositories for clear text password violations, PII, outdated cryptography methods, key files and more. It can be used to scan remote git repositories, local files or directories or as a pre-commit step.

Installation

Linux & Mac

Running the build.sh script produces a binary for each OS, while the install.sh script installs EarlyBird on your system. The install script creates a .go-earlybird directory in your home directory containing all the configuration files, then installs go-earlybird as an executable in /usr/local/bin/.

./build.sh && ./install.sh

Windows

Running build.bat will produce your binaries while the install.bat script will create a 'go-earlybird' directory in C:\Users\[my user]\App Data\, and copy the required configurations there. This script will also install go-earlybird.exe as an executable in the App Data directory (which should be in your path).

build.bat && install.bat

Usage

To launch a basic EarlyBird scan against a directory:

$ go-earlybird --path=/path/to/directory
$ go-earlybird.exe --path=C:\path\to\directory

or to scan a remote git repo:

$ go-earlybird --git=https://github.com/americanexpress/earlybird

Click here for Detailed Usage instructions.

Documentation

Why Are We Doing This?

The MITRE Corporation provides a catalog of Common Weakness Enumerations (CWE), documenting issues that should be avoided. Some of the relevant CWEs that are handled by the use of EarlyBird include:


Contributing

We welcome your interest in the American Express Open Source Community on Github. Any Contributor to any Open Source Project managed by the American Express Open Source Community must accept and sign an Agreement indicating agreement to the terms below. Except for the rights granted in this Agreement to American Express and to recipients of software distributed by American Express, You reserve all right, title, and interest, if any, in and to your contributions. Please fill out the Agreement.

License

Any contributions made under this project will be governed by the Apache License 2.0.

Code of Conduct

This project adheres to the American Express Community Guidelines. By participating, you are expected to honor these guidelines.

Comments
  • Error: invalid memory address or nil pointer dereference

    I have just built the binaries from the source code (both linux/amd64 and windows/amd64 behave the same way). When executing it I get as a result:

    panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0xa12032]

    goroutine 1 [running]:
    github.com/americanexpress/earlybird/pkg/core.(*EarlybirdCfg).GetRuleModulesMap.func1(0xc000026f60, 0x19, 0x0, 0x0, 0xc47f00, 0xc00010ed80, 0xc000107be8, 0x4108ad)
        /var/lib/jenkins/workspace/Earlybird-build-binaries/earlybird/pkg/core/core.go:148 +0x32
    path/filepath.Walk(0xc000026f60, 0x19, 0xc000107c30, 0xc000026f60, 0x19)
        /usr/lib/golang/src/path/filepath/path.go:404 +0x6b
    github.com/americanexpress/earlybird/pkg/core.(*EarlybirdCfg).GetRuleModulesMap(0x1044e80, 0x10445e0, 0xc00002c810)
        /var/lib/jenkins/workspace/Earlybird-build-binaries/earlybird/pkg/core/core.go:147 +0xef
    github.com/americanexpress/earlybird/pkg/core.(*EarlybirdCfg).ConfigInit(0x1044e80)
        /var/lib/jenkins/workspace/Earlybird-build-binaries/earlybird/pkg/core/core.go:176 +0x2ef
    main.main()
        /var/lib/jenkins/workspace/Earlybird-build-binaries/earlybird/go-earlybird.go:47 +0x35d

    question 
    opened by spaluchowski 6
  • Build fails on macOS

    When running build.sh, the build fails with the following log:

    Running Unit Tests
    go: downloading golang.org/x/net v0.0.0-20200324143707-d3edc9973b7e
    go: downloading github.com/gorilla/mux v1.7.4
    go: downloading github.com/gocarina/gocsv v0.0.0-20200330101823-46266ca37bd3
    go: downloading gopkg.in/src-d/go-git.v4 v4.13.1
    go: downloading github.com/dghubble/sling v1.3.0
    go: downloading github.com/howeyc/gopass v0.0.0-20190910152052-7cb4b85ec19c
    go: downloading github.com/google/go-github v17.0.0+incompatible
    go: downloading golang.org/x/text v0.3.2
    go: downloading golang.org/x/crypto v0.0.0-20190701094942-4def268fd1a4
    go: downloading github.com/google/go-querystring v1.0.0
    go: downloading github.com/sergi/go-diff v1.0.0
    go: downloading gopkg.in/src-d/go-billy.v4 v4.3.2
    go: downloading github.com/kevinburke/ssh_config v0.0.0-20190725054713-01f96b0aa0cd
    go: downloading github.com/xanzy/ssh-agent v0.2.1
    go: downloading github.com/mitchellh/go-homedir v1.1.0
    go: downloading github.com/emirpasic/gods v1.12.0
    go: downloading golang.org/x/sys v0.0.0-20200323222414-85ca7c5b95cd
    go: downloading github.com/jbenet/go-context v0.0.0-20150711004518-d14ea06fba99
    go: downloading github.com/src-d/gcfg v1.4.0
    go: downloading gopkg.in/warnings.v0 v0.1.2
    # github.com/americanexpress/earlybird/pkg/api
    pkg/api/api.go:72:19: conversion from Duration (int64) to string yields a string of one rune, not a string of digits (did you mean fmt.Sprint(x)?)
    pkg/api/api.go:141:19: conversion from Duration (int64) to string yields a string of one rune, not a string of digits (did you mean fmt.Sprint(x)?)
    # github.com/americanexpress/earlybird/pkg/core
    pkg/core/core.go:261:19: conversion from Duration (int64) to string yields a string of one rune, not a string of digits (did you mean fmt.Sprint(x)?)
    # github.com/americanexpress/earlybird/pkg/writers
    pkg/writers/jsonout_test.go:55:17: conversion from Duration (int64) to string yields a string of one rune, not a string of digits (did you mean fmt.Sprint(x)?)
    Unit Tests FAILED!
    FAIL	github.com/americanexpress/earlybird/pkg/api [build failed]
    ok  	github.com/americanexpress/earlybird/pkg/config	0.325s
    FAIL	github.com/americanexpress/earlybird/pkg/core [build failed]
    Failed to open ignore file open .ge_ignore: no such file or directory
    Failed to open ignore file open /Users/phil/.ge_ignore: no such file or directory
    Failed to open ignore file open .ge_ignore: no such file or directory
    --- FAIL: Test_isIgnoredFile (0.00s)
        --- FAIL: Test_isIgnoredFile/Check_if_file_is_ignored (0.00s)
            fileUtil_test.go:148: isIgnoredFile() = false, want true
    FAIL
    FAIL	github.com/americanexpress/earlybird/pkg/file	0.222s
    ok  	github.com/americanexpress/earlybird/pkg/git	0.487s
    ok  	github.com/americanexpress/earlybird/pkg/postprocess	0.209s
    ok  	github.com/americanexpress/earlybird/pkg/scan	0.254s
    ok  	github.com/americanexpress/earlybird/pkg/update	0.468s
    ok  	github.com/americanexpress/earlybird/pkg/utils	0.252s
    ok  	github.com/americanexpress/earlybird/pkg/wildcard	0.184s
    FAIL	github.com/americanexpress/earlybird/pkg/writers [build failed]
    FAIL
    

    This occurs on macOS version 10.15.5, and Go version go1.15.2 darwin/amd64.

    opened by Phuurl 6
  • Cloning of git repository leaves files in temporary directory

    After a repository is downloaded to a temporary directory using the -git flag, it is not removed once the scan has finished. I think the expected behavior is that the data is removed from the temporary location after it has been checked for sensitive data.

    The problem is that when I run checks against many code repositories, the temp directory grows significantly, and I need additional monitoring to erase the temporary data.

    Version 1.24.6

    enhancement question 
    opened by spaluchowski 4
  • Configuration in default path is mandatory to launch the tool

    Even when a configuration path is set on the command line, earlybird still tries to load the default configuration path. If that path does not exist, earlybird fails to launch with the error: "Failed to load Earlybird configopen /var/lib/jenkins/.go-earlybird/earlybird.json: no such file or directory"

    Expected: If configuration path is provided in command line, the default configuration path should not be checked.

    Version: 1.24.6

    bug 
    opened by spaluchowski 3
  • meta: Comparison to gitleaks?

    👋 Cool project! I was curious how earlybird compares to other projects like gitleaks. I see earlybird can scan a few more types of targets, but are the patterns that both tools recognize the same? I've used gitleaks for a while and am curious about adopting both tools.

    Project: https://github.com/zricethezav/gitleaks

    opened by adamdecaf 3
  • Ignorefile case sensitivity is broken

    Case sensitivity in the .ge_ignore file works in a strange way, at least on a Windows machine: entries in the ignore file have to be written in lower case to match anything.

    When the ignore file contains the entry '*.txt', it matches any .txt file regardless of case (e.g. Readme.txt, readme.TXT), which is good behavior. But the entry "*.TXT" does not match anything, not even the expected readme.TXT. I noticed this when I tried to exclude a path containing the word "Libs" (e.g. "Libs/sweet.lib"). None of "*/Libs/*", "*Libs*" or "Libs*" worked; only "*/libs/*" matched.

    This is not a huge problem, but it is very counterintuitive behavior.

    Version: 1.24.6

    good first issue 
    opened by spaluchowski 2
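
    One way to get the case-insensitive behavior this issue expects is to normalize both the pattern and the path before matching. The sketch below uses the standard library's path.Match as a stand-in for EarlyBird's own wildcard package, so the glob semantics are an assumption; matchIgnoreCase is a hypothetical name.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchIgnoreCase is a hypothetical helper (not EarlyBird's matcher).
// It lower-cases both sides so that "*.TXT" and "*.txt" behave the same.
func matchIgnoreCase(pattern, name string) (bool, error) {
	return path.Match(strings.ToLower(pattern), strings.ToLower(name))
}

func main() {
	for _, tc := range [][2]string{
		{"*.TXT", "readme.TXT"},
		{"*/Libs/*", "src/Libs/sweet.lib"},
		{"*.txt", "Readme.md"},
	} {
		ok, err := matchIgnoreCase(tc[0], tc[1])
		fmt.Println(tc[0], tc[1], ok, err)
	}
}
```

    Note that path.Match does not let "*" cross a "/" boundary, which may differ from how the tool's own patterns behave.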
  • Fix/version issue

    • Introduced version injection via ldflags during the build, to avoid maintaining the version by hand
    • Switched to using semantic-release
    • semantic-release dynamically generates the next release version and generates CHANGELOG.md based on the commit messages
    • Updated the Go version to 1.18
    • Fixed a few failing unit tests
    released 
    opened by grinish21 1
  • feat: add keepAlive flag and fix worker flag read

    • Added a disableKeepAlive flag to address the large number of open sockets when running in HTTP mode under very high load on Kubernetes clusters
    • Fixed the issue where workerSize was not read from the provided flags, with 100 as the default
    released 
    opened by grinish21 1
  • docs: fix typo in readme

    I found this typo in the readme, it did get me thinking though: should it be EarlyBird consistently across the docs or Earlybird? I see both usages in the Readme.

    released 
    opened by anescobar1991 1
  • Wrong verbose information about scanned files

    When I scan a folder with 4 files, I see this in the log:

    Reading file
    Reading file /mnt/c/git-repo/earlybird/verify/.ge_ignore
    Reading file /mnt/c/git-repo/earlybird/verify/checkfile.properties
    Reading file /mnt/c/git-repo/earlybird/verify/checkfile2.properties
    ***** Total issues found *****
    0 TOTAL ISSUES

    4 files scanned in 18.7711ms

    The first entry is "Reading file" with no file name after it.

    bug 
    opened by spaluchowski 1
  • Base directory should be ignored during ignore matching

    This is more of an enhancement than a bug, but it would make things clearer.

    Currently the whole path is checked against the ignore file. But imagine this situation:

    1. The ignore file contains the entry "/test/"
    2. The user downloads their projects into the /var/lib/test/projects/ directory

    Then, when Earlybird is executed with the parameter -path " /var/lib/test/projects/", it ignores all project files and nothing is scanned.

    Proposed remediation: the base directory path should be removed before matching against the ignore patterns.

    enhancement 
    opened by spaluchowski 1
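
    The remediation proposed in this issue can be sketched by making each file path relative to the scan root before comparing it with the ignore entries. This is an illustration under assumptions: isIgnored is a hypothetical helper, and plain substring matching stands in for the tool's real pattern matching.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isIgnored is a hypothetical helper (not EarlyBird's matcher). It strips the
// scan root from file and reports whether the remainder contains fragment.
func isIgnored(base, file, fragment string) (bool, error) {
	rel, err := filepath.Rel(base, file)
	if err != nil {
		return false, err
	}
	// Use forward slashes and a leading slash so an entry such as "/test/"
	// can still match a directory directly under the scan root.
	rel = "/" + filepath.ToSlash(rel)
	return strings.Contains(rel, fragment), nil
}

func main() {
	base := "/var/lib/test/projects"
	for _, f := range []string{
		"/var/lib/test/projects/app/main.go",
		"/var/lib/test/projects/app/test/fixture.go",
	} {
		ignored, err := isIgnored(base, f, "/test/")
		fmt.Println(f, ignored, err)
	}
}
```

    Here the "test" segment in the base directory no longer causes every file to be skipped, while a test directory inside the project is still ignored.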
  • Ignore false positive string

    I see that we have a way of ignoring a file. Can we introduce a way to ignore a string as well?

    example: in my .env.example I have placeholders

    my_secret=ThisIsASecretToReplace

    I want them to see this in the first run and then add "ThisIsASecretToReplace" to an exception list. By doing this, it still forces them to think about the data they are putting in .env.example and will always require the initial review of the finding. Currently I have to ignore the file .env.example altogether which means if someone actually puts a secret in there that is valid then no one will be monitoring (outside of the PR review).

    question 
    opened by Cr0n1c 1
  • Feature Request: GitHub Action for EarlyBird

    Couldn't find a Discord/Slack/etc to post this to so I will post this here. It would be amazing if this repo could support a GitHub Action so that we can bake this into our CI/CD easily.

    enhancement 
    opened by Cr0n1c 1
  • ignore files issue

    I'm testing earlybird against an Ansible role repository, which should normally be an easy one. I still get many false-positive matches that I'm trying to ignore with ~/.ge_ignore, but the pattern style does not seem to work, or it expects a format I'm not anticipating. I tried file names and shell wildcard patterns, as well as double wildcards like https://github.com/aschaef19/earlybird/blob/main/.ge_ignore

    $ go-earlybird -path=. -verbose
    2022/04/30 10:08:10 Go-EarlyBird version:  2.0.0
    Severity Fail threshold (at or above):  low
    Confidence Fail threshold (at or above):  low
    Severity Display threshold (at or above):  low
    Confidence Display threshold (at or above):  low
    Max file size to scan:  10240000  bytes
    2022/04/30 10:08:10 loading module:  ccnumber
    2022/04/30 10:08:10 loading module:  content
    2022/04/30 10:08:10 loading module:  filename
    2022/04/30 10:08:10 loading module:  inclusivity-rules
    2022/04/30 10:08:10 loading module:  password-secret
    2022/04/30 10:08:10 Scanning directory:  .
    2022/04/30 10:08:10 Ignore pattern:  *.git/*, .vagrant/, *.retry, .kitchen, inspec.lock, [._]*.s[a-v][a-z], [._]*.sw[a-p], [._]s[a-v][a-z], [._]sw[a-p], Session.vim, Sessionx.vim, .netrwhist, *~, tags, [._]*.un~, __pycache__/, *.py[cod], *$py.class, *.so, .Python, build/, develop-eggs/, dist/, downloads/, eggs/, .eggs/, lib/, lib64/, parts/, sdist/, var/, wheels/, *.egg-info/, .installed.cfg, *.egg, MANIFEST, *.manifest, *.spec, pip-log.txt, pip-delete-this-directory.txt, htmlcov/, .tox/, .coverage, .coverage.*, .cache, nosetests.xml, coverage.xml, *.cover, .hypothesis/, *.mo, *.pot, *.log, .static_storage/, .media/, local_settings.py, instance/, .webassets-cache, .scrapy, docs/_build/, target/, .ipynb_checkpoints, .python-version, celerybeat-schedule, *.sage.py, .env, .venv, env/, venv/, ENV/, env.bak/, venv.bak/, .spyderproject, .spyproject, .ropeproject, /site, .mypy_cache/, .DS_Store, .AppleDouble, .LSOverride, Icon, ._*, .DocumentRevisions-V100, .fseventsd, .Spotlight-V100, .TemporaryItems, .Trashes, .VolumeIcon.icns, .com.apple.timemachine.donotpresent, .AppleDB, .AppleDesktop, Network Trash Folder, Temporary Items, .apdisk, *~, .fuse_hidden*, .directory, .Trash-*, .nfs*, Thumbs.db, ehthumbs.db, ehthumbs_vista.db, *.stackdump, [Dd]esktop.ini, $RECYCLE.BIN/, *.cab, *.msi, *.msm, *.msp, *.lnk, secring.*, *.ca, *.crt, *.csr, *.der, *.kdb, *.org, *.p12, *.pem, *.rnd, *.ssleay, *.smime, **/.git, **/.gitignore, **/.github/workflows/galaxy.yml, **/.secrets.baseline, **/.pre-commit-config.yaml, **/test/earlybird/falsepositives-ansible.yaml, .git, .gitignore, .github/workflows/galaxy.yml, galaxy.yml, .secrets.baseline, .pre-commit-config.yaml, test/earlybird/falsepositives-ansible.yaml, falsepositives-ansible.yaml, */.git, */.gitignore, /.git, /.gitignore, */.git*, */.gitignore*, ./.git, ./.gitignore
    2022/04/30 10:08:10 Reading file  .ansible-lint
    2022/04/30 10:08:10 Reading file  .codespellignore
    2022/04/30 10:08:10 Reading file  .git
    2022/04/30 10:08:10 Reading file  .github/stale.yml
    2022/04/30 10:08:10 Reading file  .github/workflows/codespell.yml
    2022/04/30 10:08:10 Reading file  .github/workflows/default.yml
    2022/04/30 10:08:10 Reading file  .github/workflows/dryrun-bare.yml
    2022/04/30 10:08:10 Reading file  .github/workflows/earlybird.yml
    2022/04/30 10:08:10 Reading file  .github/workflows/galaxy.yml
    2022/04/30 10:08:10 Reading file  .github/workflows/lint.yml
    2022/04/30 10:08:10 Reading file  .gitignore
    [...]
    

    From my reading of the code, "Reading file " should not appear if a file is correctly ignored: https://github.com/americanexpress/earlybird/blob/main/pkg/file/fileUtil.go#L196. The matching appears to be a custom character-by-character implementation, as per https://github.com/americanexpress/earlybird/blob/main/pkg/wildcard/patternMatch.go

    Note that even if it says " Go-EarlyBird version: 2.0.0", this is from latest download aka https://github.com/americanexpress/earlybird/releases/download/v3.12.0/go-earlybird-linux

    Example run in https://github.com/juju4/ansible-adduser/runs/6142207431?check_suite_focus=true#step:6:1

    Thanks for sharing your work

    bug 
    opened by juju4 1
  • Fix install.sh if clause

    Using == with a wildcard only works in bash, and [[ is also bash-only; neither is available in the Bourne shell that this script uses. To compare against a wildcard in the Bourne shell, we can use the case statement.

    opened by sebix 3
  • Earlybird tests

    Hello, I didn't know where to ask this question, so I raised this issue. I tried earlybird on the following deliberately poor test C source code:

    #include<stdio.h>
    #include<string.h>
    
    int main(void) {
        char enteredPass[30];
        char password[30]="MyPassw0rd";
        printf("Enter Password:\n");
        scanf("%s", enteredPass);
        if (strcmp(enteredPass, password) == 0) {
            printf("%s is my Password!\nOops\n", password);
            return 0;
        } else {
            printf("You didn't found it!\n");
            return -1;
        }
    }
    

    and nothing is detected by earlybird.

    I got:

    1 files scanned in 2.048829ms
    2021/10/08 11:42:22 144 rules observed
    ***** Total issues found *****
    0 TOTAL ISSUES

    How is this possible?

    This is almost exactly what is described as C example in CWE-798. Thanks for the help.

    good first issue 
    opened by ggi-cetic 1
Releases(v3.13.0)
Owner
American Express
Ah shhgit! Find secrets in your code. Secrets detection for your GitHub, GitLab and Bitbucket repositories: www.shhgit.com

shhgit helps secure forward-thinking development, operations, and security teams by finding secrets across their code before it leads to a security br

Paul 3.5k Sep 29, 2022
The dynamic infrastructure framework for everybody! Distribute the workload of many different scanning tools with ease, including nmap, ffuf, masscan, nuclei, meg and many more!

Axiom is a dynamic infrastructure framework to efficiently work with multi-cloud environments, build and deploy repeatable infrastructure focussed on

pry0cc 2.9k Sep 23, 2022
WIP. Converts Azure Container Scan Action output to SARIF, for an easier integration with GitHub Code Scanning

container-scan-to-sarif container-scan-to-sarif converts Azure Container Scan Action output to Static Analysis Results Interchange Format (SARIF), for

Armel Soro 2 Jan 25, 2022
Secure software enclave for storage of sensitive information in memory.

MemGuard Software enclave for storage of sensitive information in memory. This package attempts to reduce the likelihood of sensitive data being expos

Awn 2.2k Sep 24, 2022
Sensitive information protection toolkit

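godlp 1. Introduction: To safeguard enterprise data security and privacy, godlp provides a set of identification and handling solutions for sensitive data, including sensitive data recognition algorithms, data masking methods, business-customizable configuration options, and the ability to process data at massive scale. godlp can apply multiple privacy compliance standards to classify and label raw data, determine its sensitivity level, and apply the corresponding masking. In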

Bytedance Inc. 588 Sep 15, 2022
Cossack Labs 1.1k Sep 27, 2022
:key: Idiotproof golang password validation library inspired by Python's passlib

passlib for go Python's passlib is quite an amazing library. I'm not sure there's a password library in existence with more thought put into it, or wi

Hugo Landau 271 Sep 27, 2022
Nuclei is a fast tool for configurable targeted vulnerability scanning based on templates offering massive extensibility and ease of use.

Fast and customisable vulnerability scanner based on simple YAML based DSL. How • Install • For Security Engineers • For Developers • Documentation •

ProjectDiscovery 10k Sep 28, 2022
🌘🦊 DalFox(Finder Of XSS) / Parameter Analysis and XSS Scanning tool based on golang

Finder Of XSS, and Dal(달) is the Korean pronunciation of moon. What is DalFox? DalFox is a fast, powerful parameter analysis and XSS scanner, bas

HAHWUL 2k Sep 30, 2022
Naabu - a port scanning tool written in Go that allows you to enumerate valid ports for hosts in a fast and reliable manner

Naabu is a port scanning tool written in Go that allows you to enumerate valid ports for hosts in a fast and reliable manner. It is a really simple tool that does fast SYN/CONNECT scans on the host/list of hosts and lists all ports that return a reply.

null 0 Jan 2, 2022
Portmantool - Port scanning and monitoring tool

portmantool Port scanning and monitoring tool Components runner while true do r

Thomann Bits & Beats 0 Feb 14, 2022
A Go Module to interact with Passbolt, a Open source Password Manager for Teams

go-passbolt A Go Module to interact with Passbolt, a Open source Password Manager for Teams This Module tries to Support the Latest Passbolt Community

Samuel Lorch 10 May 13, 2022
Driftwood is a tool that can enable you to lookup whether a private key is used for things like TLS or as a GitHub SSH key for a user.

Driftwood is a tool that can enable you to lookup whether a private key is used for things like TLS or as a GitHub SSH key for a user. Drift

Truffle Security 304 Sep 20, 2022
A FreeSWITCH specific scanning and exploitation toolkit for CVE-2021-37624 and CVE-2021-41157.

PewSWITCH A FreeSWITCH specific scanning and exploitation toolkit for CVE-2021-37624 and CVE-2021-41157. Related blog: https://0xinfection.github.io/p

Pinaki 23 Jun 23, 2022
erchive is a go program that compresses and encrypts files and entire directories into .zep files (encrypted zip files).

erchive/zep erchive is a go program that compresses and encrypts files and entire directories into .zep files (encrypted zip files). it compresses usi

Christopher Walters 1 May 16, 2022
DockerSlim (docker-slim): Don't change anything in your Docker container image and minify it by up to 30x (and for compiled languages even more) making it secure too! (free and open source)

Minify and Secure Docker containers (free and open source!) Don't change anything in your Docker container image and minify it by up to 30x making it

docker-slim 15.1k Oct 1, 2022