Query git repositories with SQL. Generate reports, perform status checks, analyze codebases. 🔍 📊

Overview

askgit is a command-line tool for running SQL queries on git repositories. It's meant for ad-hoc querying of git repositories on disk through a common interface (SQL), as an alternative to patching together various shell commands. It can execute queries that look like:

-- how many commits have been authored by [email protected]?
SELECT count(*) FROM commits WHERE author_email = '[email protected]'

You can try queries on public git repositories without installing anything at https://try.askgit.com/

More in-depth examples and documentation can be found below. Also check out our newsletter to stay up to date with feature releases and interesting queries and use cases.

Installation

Homebrew

brew tap askgitdev/askgit
brew install askgit

Pre-Built Binaries

The latest releases should have pre-built binaries for Mac and Linux. You can download and add the askgit binary somewhere on your $PATH to use. libaskgit.so is also available to be loaded as a SQLite run-time extension.

Go

libgit2 is a build dependency (used via git2go) and must be available on your system for linking.

The following (long 😬) go install commands can be used to install a binary via the Go toolchain.

On Mac:

CGO_CFLAGS=-DUSE_LIBSQLITE3 CGO_LDFLAGS=-Wl,-undefined,dynamic_lookup go install -tags="sqlite_vtable,vtable,sqlite_json1,static,system_libgit2" github.com/askgitdev/[email protected]

On Linux:

CGO_CFLAGS=-DUSE_LIBSQLITE3 CGO_LDFLAGS=-Wl,--unresolved-symbols=ignore-in-object-files go install -tags="sqlite_vtable,vtable,sqlite_json1,static,system_libgit2" github.com/askgitdev/[email protected]

See the Makefile for more context. Checking out this repository and running make in the root will produce two files in the .build directory:

  1. askgit - the CLI binary (which can then be moved into your $PATH for use)
  2. libaskgit.so - a shared object file SQLite extension that can be used by SQLite directly

Using Docker

Build an image locally using Docker:

docker build -t askgit:latest .

Or use an official image from Docker Hub:

docker pull augmentable/askgit:latest

Running commands

askgit operates on a git repository, which must be attached to the container as a volume. This example uses the shell command pwd, which prints the absolute pathname of the current working directory, to mount the current directory at /repo (read-only):

docker run --rm -v `pwd`:/repo:ro augmentable/askgit "SELECT * FROM commits"

Running commands from STDIN

For piping commands via STDIN, the docker command needs the -i flag so that STDIN stays open, and the repository still needs to be attached at /repo.

cat query.sql | docker run --rm -i -v `pwd`:/repo:ro augmentable/askgit

Public API

We maintain a free to use, public API for running queries (executed in an AWS Lambda function). See this page for more information.

Usage

askgit -h

This prints the most up-to-date usage instructions for your version of the CLI. Typically the first argument is a SQL query string:

askgit "SELECT * FROM commits"

Your current working directory will be used as the path to the git repository to query by default. Use the --repo flag to specify an alternate path, or even a remote repository reference (http(s) or ssh). askgit will clone the remote repository to a temporary directory before executing a query.

You can also pass a query in via stdin:

cat query.sql | askgit

By default, output will be an ASCII table. Use --format json or --format csv for alternatives. Use -v to print execution logs to stderr. This can be useful for understanding what API calls a query may be making, or similar runtime information. See -h for all the options.

Tables and Functions

Local Git Repository

The following tables access a git repository in the current directory by default. If the --repo flag is specified, they will use the path provided there instead. A parameter (usually the first) can also be provided to any of the tables below to override the default repo path. For instance, SELECT * FROM commits('https://github.com/askgitdev/askgit') will clone this repo to a temporary directory on disk and return its commits.

commits

Similar to git log, the commits table includes all commits in the history of the currently checked out commit.

Column Type
hash TEXT
message TEXT
author_name TEXT
author_email TEXT
author_when DATETIME
committer_name TEXT
committer_email TEXT
committer_when DATETIME
parents INT

Params:

  1. repository - path to a local (on disk) or remote (http(s)) repository
  2. rev - return commits starting at this revision (i.e. branch name or SHA), defaults to HEAD
-- return all commits starting at HEAD
SELECT * FROM commits

-- specify an alternative repo on disk
SELECT * FROM commits('/some/path/to/repo')

-- clone a remote repo and use it
SELECT * FROM commits('https://github.com/askgitdev/askgit')

-- use the default repo, but provide an alternate branch
SELECT * FROM commits('', 'some-ref')
refs
Column Type
name TEXT
type TEXT
remote TEXT
full_name TEXT
hash TEXT
target TEXT

Params:

  1. repository - path to a local (on disk) or remote (http(s)) repository
stats
Column Type
file_path TEXT
additions INT
deletions INT

Params:

  1. repository - path to a local (on disk) or remote (http(s)) repository
  2. rev - commit hash (or branch/tag name) to use for retrieving stats, defaults to HEAD
  3. to_rev - commit hash to calculate stats relative to
-- return stats of HEAD
SELECT * FROM stats

-- return stats of a specific commit
SELECT * FROM stats('', 'COMMIT_HASH')

-- return stats for every commit in the current history
SELECT commits.hash, stats.* FROM commits, stats('', commits.hash)
files
Column Type
path TEXT
executable BOOL
contents TEXT

Params:

  1. repository - path to a local (on disk) or remote (http(s)) repository
  2. rev - commit hash (or branch/tag name) to use for retrieving files in, defaults to HEAD
blame

Similar to git blame, the blame table includes blame information for all files in the current HEAD.

Column Type
line_no INT
commit_hash TEXT

Params:

  1. repository - path to a local (on disk) or remote (http(s)) repository
  2. rev - commit hash (or branch/tag name) to use for retrieving blame information from, defaults to HEAD
  3. file_path - path of file to blame

Utilities

JSON

The SQLite JSON1 extension is included for working with JSON data.
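
As a minimal sketch of what the JSON1 functions do, the Python snippet below runs json_extract and json_each against an in-memory SQLite database. The JSON document is made up for illustration and does not come from askgit. The sketch assumes that the SQLite library bundled with your Python includes the JSON functions, which is the case for most recent builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical JSON document, similar in shape to what a *_to_json helper returns.
doc = '{"hello":"world","tags":["a","b"]}'

# json_extract pulls a single value out of a JSON document by path.
value = conn.execute("SELECT json_extract(?, '$.hello')", (doc,)).fetchone()[0]

# json_each expands a JSON array into one row per element.
tags = [row[0] for row in conn.execute("SELECT value FROM json_each(?, '$.tags')", (doc,))]

print(value, tags)
```

The same two functions can be applied in askgit queries to the output of the *_to_json helpers described below.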

toml_to_json

Scalar function that converts toml to json.

SELECT toml_to_json('[some-toml]')

-- +-----------------------------+
-- | TOML_TO_JSON('[SOME-TOML]') |
-- +-----------------------------+
-- | {"some-toml":{}}            |
-- +-----------------------------+
xml_to_json

Scalar function that converts xml to json.

SELECT xml_to_json('<some-xml>hello</some-xml>')

-- +-------------------------------------------+
-- | XML_TO_JSON('<SOME-XML>HELLO</SOME-XML>') |
-- +-------------------------------------------+
-- | {"some-xml":"hello"}                      |
-- +-------------------------------------------+
yaml_to_json and yml_to_json

Scalar function that converts yaml to json.

SELECT yaml_to_json('hello: world')

-- +------------------------------+
-- | YAML_TO_JSON('HELLO: WORLD') |
-- +------------------------------+
-- | {"hello":"world"}            |
-- +------------------------------+
go_mod_to_json

Scalar function that parses a go.mod file and returns a JSON representation of it.

SELECT go_mod_to_json('<contents of go.mod file>')
str_split

Helper for splitting strings on some separator.

SELECT str_split('hello,world', ',', 0)

-- +----------------------------------+
-- | STR_SPLIT('HELLO,WORLD', ',', 0) |
-- +----------------------------------+
-- | hello                            |
-- +----------------------------------+
SELECT str_split('hello,world', ',', 1)

-- +----------------------------------+
-- | STR_SPLIT('HELLO,WORLD', ',', 1) |
-- +----------------------------------+
-- | world                            |
-- +----------------------------------+
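
To illustrate the semantics shown above, here is a small Python stand-in registered as a SQLite scalar function. It is not askgit's implementation: the function body is a sketch, and returning NULL for an out-of-range index is an assumption, since the behavior in that case is not documented here.

```python
import sqlite3

def str_split(value, separator, index):
    """Return the piece of `value` at position `index` after splitting on `separator`."""
    parts = value.split(separator)
    # Assumption: an out-of-range index yields NULL (None) rather than an error.
    return parts[index] if 0 <= index < len(parts) else None

conn = sqlite3.connect(":memory:")
# Register the Python function as a 3-argument SQL scalar function.
conn.create_function("str_split", 3, str_split)

first, second = conn.execute(
    "SELECT str_split('hello,world', ',', 0), str_split('hello,world', ',', 1)"
).fetchone()
print(first, second)
```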

Enry Functions

Functions from the enry project are also available as SQL scalar functions

enry_detect_language

Supply a file path and some source code to detect the language.

SELECT enry_detect_language('some/path/to/file.go', '<contents of file>')
enry_is_binary

Given a blob, determine if it's a binary file or not (returns 1 or 0).

SELECT enry_is_binary('<contents of file>')
enry_is_configuration

Detect whether a file path is to a configuration file (returns 1 or 0).

SELECT enry_is_configuration('some/path/to/file/config.json')
enry_is_documentation

Detect whether a file path is to a documentation file (returns 1 or 0).

SELECT enry_is_documentation('some/path/to/file/README.md')
enry_is_dot_file

Detect whether a file path is to a dot file (returns 1 or 0).

SELECT enry_is_dot_file('some/path/to/file/.gitignore')
enry_is_generated

Detect whether a file path is generated (returns 1 or 0).

SELECT enry_is_generated('some/path/to/file/generated.go', '<contents of file>')
enry_is_image

Detect whether a file path is to an image (returns 1 or 0).

SELECT enry_is_image('some/path/to/file/image.png')
enry_is_test

Detect whether a file path is to a test file (returns 1 or 0).

SELECT enry_is_test('some/path/to/file/file_test.go')
enry_is_vendor

Detect whether a file path is to a vendored file (returns 1 or 0).

SELECT enry_is_vendor('vendor/file.go')

GitHub API

You can use askgit to query the GitHub API (v4). Constraints in your SQL query are pushed to the GitHub API as much as possible. For instance, if your query includes an ORDER BY clause and if items can be ordered in the GitHub API response (on the specified column), your query can avoid doing a full table scan and rely on the ordering returned by the API.

Authenticating

You must provide an authentication token in order to use the GitHub API tables. You can create a personal access token following these instructions. askgit will look for a GITHUB_TOKEN environment variable when executing, to use for authentication. This is also true if running as a runtime loadable extension.

Rate Limiting

All API requests to GitHub are rate limited. The following tables make use of the GitHub GraphQL API (v4), which rate limits additionally based on the "complexity" of GraphQL queries. Generally speaking, the more fields/relations in your GraphQL query, the higher the "cost" of a single API request, and the faster you may reach a rate limit. Depending on your SQL query, it's hard to know ahead of time what a good client-side rate limit is. By default, each of the tables below will fetch 100 items per page and permit 2 API requests per second. You can override both of these parameters by setting the following environment variables:

  1. GITHUB_PER_PAGE - expects an integer between 1 and 100, sets how many items are fetched per-page in API calls that paginate results.
  2. GITHUB_RATE_LIMIT - expressed in the form (number of requests) / (number of seconds) (e.g. 1/3 means at most 1 request per 3 seconds)
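
The GITHUB_RATE_LIMIT form can be read as a minimum spacing between requests. The helper below is a hypothetical sketch of that arithmetic, not how askgit parses or enforces the value. It assumes both parts are positive integers.

```python
def min_interval_seconds(rate_limit: str) -> float:
    """Convert 'requests/seconds' into the minimum number of seconds between requests."""
    requests_part, seconds_part = rate_limit.split("/")
    requests, seconds = int(requests_part), int(seconds_part)
    if requests <= 0 or seconds <= 0:
        raise ValueError("both parts must be positive integers")
    return seconds / requests

# 1/3: at most 1 request per 3 seconds, so requests are at least 3 seconds apart.
slow = min_interval_seconds("1/3")
# 2/1: the documented default of 2 requests per second, written in the same form.
default = min_interval_seconds("2/1")
print(slow, default)
```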

If you encounter a rate limit error that looks like You have exceeded a secondary rate limit, consider setting the GITHUB_PER_PAGE value to a lower number. If you have a large number of items to scan in your query, it may take longer, but you should avoid hitting a rate limit error.

github_stargazers

Table-valued-function that returns a list of users who have starred a repository.

Column Type
login TEXT
email TEXT
name TEXT
bio TEXT
company TEXT
avatar_url TEXT
created_at DATETIME
updated_at DATETIME
twitter TEXT
website TEXT
location TEXT
starred_at DATETIME

Params:

  1. fullNameOrOwner - either the full repo name askgitdev/askgit or just the owner askgitdev (which would require the second argument)
  2. name - optional if the first argument is a "full" name, otherwise required - the name of the repo
SELECT * FROM github_stargazers('askgitdev', 'askgit');
SELECT * FROM github_stargazers('askgitdev/askgit'); -- both are equivalent
github_starred_repos

Table-valued-function that returns a list of repositories a user has starred.

Column Type
name TEXT
url TEXT
description TEXT
created_at DATETIME
pushed_at DATETIME
updated_at DATETIME
stargazer_count INT
name_with_owner TEXT
starred_at DATETIME

Params:

  1. login - the login of a GitHub user
SELECT * FROM github_starred_repos('patrickdevivo')
github_stargazer_count

Scalar function that returns the number of stars a GitHub repository has.

Params:

  1. fullNameOrOwner - either the full repo name askgitdev/askgit or just the owner askgitdev (which would require the second argument)
  2. name - optional if the first argument is a "full" name, otherwise required - the name of the repo
SELECT github_stargazer_count('askgitdev', 'askgit');
SELECT github_stargazer_count('askgitdev/askgit'); -- both are equivalent
github_user_repos and github_org_repos

Table-valued function that returns all the repositories belonging to a user or an organization.

Column Type
created_at DATETIME
database_id INT
default_branch_ref_name TEXT
default_branch_ref_prefix TEXT
description TEXT
disk_usage INT
fork_count INT
homepage_url TEXT
is_archived BOOLEAN
is_disabled BOOLEAN
is_fork BOOLEAN
is_mirror BOOLEAN
is_private BOOLEAN
issue_count INT
latest_release_author TEXT
latest_release_created_at DATETIME
latest_release_name TEXT
latest_release_published_at DATETIME
license_key TEXT
license_name TEXT
name TEXT
open_graph_image_url TEXT
primary_language TEXT
pull_request_count INT
pushed_at DATETIME
release_count INT
stargazer_count INT
updated_at DATETIME
watcher_count INT

Params:

  1. login - the login of a GitHub user or organization
SELECT * FROM github_user_repos('patrickdevivo')
SELECT * FROM github_org_repos('askgitdev')
github_repo_issues

Table-valued-function that returns all the issues of a GitHub repository.

Column Type
owner TEXT
reponame TEXT
author_login TEXT
body TEXT
closed BOOLEAN
closed_at DATETIME
comment_count INT
created_at DATETIME
created_via_email BOOLEAN
database_id TEXT
editor_login TEXT
includes_created_edit BOOLEAN
label_count INT
last_edited_at DATETIME
locked BOOLEAN
milestone_count INT
number INT
participant_count INT
published_at DATETIME
reaction_count INT
state TEXT
title TEXT
updated_at DATETIME
url TEXT

Params:

  1. fullNameOrOwner - either the full repo name askgitdev/askgit or just the owner askgitdev (which would require the second argument)
  2. name - optional if the first argument is a "full" name, otherwise required - the name of the repo
SELECT * FROM github_repo_issues('askgitdev/askgit');
SELECT * FROM github_repo_issues('askgitdev', 'askgit'); -- both are equivalent
github_repo_prs

Table-valued-function that returns all the pull requests of a GitHub repository.

Column Type
additions INT
author_login TEXT
author_association TEXT
base_ref_oid TEXT
base_ref_name TEXT
base_repository_name TEXT
body TEXT
changed_files INT
closed BOOLEAN
closed_at DATETIME
comment_count INT
commit_count INT
created_at TEXT
created_via_email BOOLEAN
database_id INT
deletions INT
editor_login TEXT
head_ref_name TEXT
head_ref_oid TEXT
head_repository_name TEXT
is_draft INT
label_count INT
last_edited_at DATETIME
locked BOOLEAN
maintainer_can_modify BOOLEAN
mergeable TEXT
merged BOOLEAN
merged_at DATETIME
merged_by TEXT
number INT
participant_count INT
published_at DATETIME
review_decision TEXT
state TEXT
title TEXT
updated_at DATETIME
url TEXT

Params:

  1. fullNameOrOwner - either the full repo name askgitdev/askgit or just the owner askgitdev (which would require the second argument)
  2. name - optional if the first argument is a "full" name, otherwise required - the name of the repo
SELECT * FROM github_repo_prs('askgitdev/askgit');
SELECT * FROM github_repo_prs('askgitdev', 'askgit'); -- both are equivalent
github_repo_file_content

Scalar function that returns the contents of a file in a GitHub repository

Params:

  1. fullNameOrOwner - either the full repo name askgitdev/askgit or just the owner askgitdev (which would require the second argument)
  2. name - optional if the first argument is a "full" name, otherwise required - the name of the repo
  3. expression - either a simple file path (README.md) or a rev-parse suitable expression that includes a path (HEAD:README.md or :README.md)
SELECT github_repo_file_content('askgitdev', 'askgit', 'README.md');
SELECT github_repo_file_content('askgitdev/askgit', 'README.md'); -- both are equivalent

Sourcegraph API (experimental!)

You can use askgit to query the Sourcegraph API.

Authenticating

You must provide an authentication token in order to use the Sourcegraph API tables. You can create a personal access token following these instructions. askgit will look for a SOURCEGRAPH_TOKEN environment variable when executing, to use for authentication. This is also true if running as a runtime loadable extension.

sourcegraph_search

Table-valued-function that returns results from a Sourcegraph search.

Column Type
__typename TEXT
results TEXT

__typename will be one of Repository, CommitSearchResult, or FileMatch. results will be the JSON value of a search result (will match what's returned from the API)

Params:

  1. query - a sourcegraph search query (docs)
SELECT * FROM sourcegraph_search('askgit');

NPM Registry

askgit can also query the NPM registry API.

npm_get_package

Scalar function that queries https://registry.npmjs.org/<package> or https://registry.npmjs.org/<package>/<version> (depending on the number of params) and returns the JSON response.

Params:

  1. package - name of the NPM package
  2. version - (optional) package version
SELECT npm_get_package('jquery')
SELECT npm_get_package('jquery', 'latest')

Example Queries

This will return all commits in the history of the currently checked out branch/commit of the repo.

SELECT * FROM commits

Return the (de-duplicated) email addresses of commit authors:

SELECT DISTINCT author_email FROM commits

Return the commit counts of every author (by email):

SELECT author_email, count(*) FROM commits GROUP BY author_email ORDER BY count(*) DESC

Same as above, but excluding merge commits:

SELECT author_email, count(*) FROM commits WHERE parents < 2 GROUP BY author_email ORDER BY count(*) DESC

Outputs the set of files in the current tree:

SELECT * FROM files

Returns author emails with lines added/removed, ordered by total number of commits in the history (excluding merges):

SELECT count(DISTINCT commits.hash) AS commits, SUM(additions) AS additions, SUM(deletions) AS deletions, author_email
FROM commits LEFT JOIN stats('', commits.hash)
WHERE commits.parents < 2
GROUP BY author_email ORDER BY commits

Returns commit counts by author, broken out by day of the week:

SELECT
    count(*) AS commits,
    count(CASE WHEN strftime('%w',author_when)='0' THEN 1 END) AS sunday,
    count(CASE WHEN strftime('%w',author_when)='1' THEN 1 END) AS monday,
    count(CASE WHEN strftime('%w',author_when)='2' THEN 1 END) AS tuesday,
    count(CASE WHEN strftime('%w',author_when)='3' THEN 1 END) AS wednesday,
    count(CASE WHEN strftime('%w',author_when)='4' THEN 1 END) AS thursday,
    count(CASE WHEN strftime('%w',author_when)='5' THEN 1 END) AS friday,
    count(CASE WHEN strftime('%w',author_when)='6' THEN 1 END) AS saturday,
    author_email
FROM commits GROUP BY author_email ORDER BY commits
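
The day-of-week breakdown relies on SQLite's strftime('%w', ...), which returns '0' for Sunday through '6' for Saturday. The Python sketch below shows the same pattern on a hand-made commits table. The rows and the timestamp format are assumptions for illustration; they are not askgit output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the commits table, reduced to the two columns the query needs.
conn.execute("CREATE TABLE commits (author_email TEXT, author_when DATETIME)")
conn.executemany(
    "INSERT INTO commits VALUES (?, ?)",
    [
        ("a@example.com", "2021-12-19 10:00:00"),  # a Sunday
        ("a@example.com", "2021-12-20 09:30:00"),  # a Monday
        ("b@example.com", "2021-12-20 18:45:00"),  # a Monday
    ],
)

# count() ignores NULLs, so each CASE counts only the commits made on that weekday.
rows = conn.execute("""
    SELECT author_email,
           count(*) AS commits,
           count(CASE WHEN strftime('%w', author_when) = '0' THEN 1 END) AS sunday,
           count(CASE WHEN strftime('%w', author_when) = '1' THEN 1 END) AS monday
    FROM commits GROUP BY author_email ORDER BY author_email
""").fetchall()
print(rows)
```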

Exporting

You can use the askgit export subcommand to save the output of queries into a SQLite database file. The command expects a path to a db file (which will be created if it doesn't already exist) and a variable number of "export pairs," specified by the -e flag. Each pair represents the name of a table to create and a query to generate its contents.

askgit export my-export-file -e commits -e "SELECT * FROM commits" -e files -e "SELECT * FROM files"

This can be useful if you're looking to use another tool to examine the data emitted by askgit. Since the exported file is a plain SQLite database, queries should be much faster (as the original git repository is no longer traversed) and you should be able to use any tool that supports querying SQLite database files.

Issues
  • Following installation instructions doesn't work?

    I'm not very familiar with Go, so perhaps I'm doing something wrong?

    $ go install -v -tags=sqlite_vtable github.com/augmentable-dev/gitqlite
    can't load package: package github.com/augmentable-dev/gitqlite: cannot find package "github.com/augmentable-dev/gitqlite" in any of:
            /usr/lib/go-1.10/src/github.com/augmentable-dev/gitqlite (from $GOROOT)
            /home/erez/go/src/github.com/augmentable-dev/gitqlite (from $GOPATH)
    
    opened by erezsh 12
  • Is there a query to extract file content on a specific date?

    Hi all, imagine I have a repo in which I update a txt file day by day.

    Is there a way to have the version of this file on 2021-12-21?

    A query like SELECT * FROM myFile.txt AS OF TIMESTAMP('2021-12-21'); that outputs the file as it was on that date?

    Thank you

    opened by aborruso 8
  • Took very long time on the first run

    Is it expected that the basic command from README is so heavy? I initially thought that there's something wrong with my invocation, I did this:

    docker run --rm -v `pwd`:/repo:ro augmentable/askgit "SELECT * FROM commits"
    

    Then my computer just seemed to be stuck. I was seeing resource usage like this for over a minute:

    [Screenshot of resource usage]

    Then it eventually finished after about 2.5 minutes but I was seriously worried that I'm doing something wrong, e.g., not escaping the SQL query correctly.

    What does it do on the first run? Is it building some sort of database behind the scenes? Would even "simpler" queries like SELECT count(*) FROM commits take similarly long?

    opened by borekb 6
  • Installation is broken with Homebrew

    Upgrading askgitdev/askgit/askgit with Homebrew on Linux (Ubuntu 20.04) is failing:

    > uname -srm
    Linux 5.14.11-051411-generic x86_64
    
    > lsb_release -d
    Description:    Ubuntu 20.04.3 LTS
    
    > brew outdated
    askgitdev/askgit/askgit (v0.4.7) < v0.4.8
    
    > brew upgrade
    ==> Auto-updated Homebrew!
    Updated 1 tap (homebrew/cask).
    ==> Updated Casks
    Updated 1 cask.
    
    Updating Homebrew...
    ==> Upgrading 1 outdated package:
    askgitdev/askgit/askgit v0.4.7 -> v0.4.8
    ==> Downloading https://github.com/askgitdev/askgit/archive/v0.4.8.tar.gz
    Already downloaded: /home/giermulnik/.cache/Homebrew/downloads/bc83f30eb7ec1aa03e0e8e020c5cd9006e5ccb1da98eb05d36d61777e2d14864--askgit-0.4.8.tar.gz
    ==> Upgrading askgitdev/askgit/askgit
      v0.4.7 -> v0.4.8
    
    ==> make
    Last 15 lines from /home/giermulnik/.cache/Homebrew/Logs/askgit/01.make:
    
    -- nuking .build/
    -- building .build/libaskgit.so
    -- building .build/askgit
    # github.com/libgit2/git2go/v32
    /home/giermulnik/.cache/Homebrew/go_mod_cache/pkg/mod/github.com/libgit2/git2go/[email protected]/Build_system_dynamic.go:12:3: error: #error "Invalid libgit2 version; this git2go supports libgit2 between v1.2.0 and v1.2.0"
       12 | # error "Invalid libgit2 version; this git2go supports libgit2 between v1.2.0 and v1.2.0"
          |   ^~~~~
    # github.com/libgit2/git2go/v32
    /home/giermulnik/.cache/Homebrew/go_mod_cache/pkg/mod/github.com/libgit2/git2go/[email protected]/Build_system_static.go:12:3: error: #error "Invalid libgit2 version; this git2go supports libgit2 between v1.2.0 and v1.2.0"
       12 | # error "Invalid libgit2 version; this git2go supports libgit2 between v1.2.0 and v1.2.0"
          |   ^~~~~
    make: *** [Makefile:17: .build/libaskgit.so] Error 2
    make: *** Waiting for unfinished jobs....
    make: *** [Makefile:23: .build/askgit] Error 2
    
    If reporting this issue please do so at (not Homebrew/brew or Homebrew/core):
      https://github.com/askgitdev/homebrew-askgit/issues
    
    > brew info libgit2
    libgit2: stable 1.3.0 (bottled), HEAD
    C library of Git core methods that is re-entrant and linkable
    https://libgit2.github.com/
    /home/linuxbrew/.linuxbrew/Cellar/libgit2/1.3.0 (102 files, 4.8MB) *
      Poured from bottle on 2021-10-01 at 00:16:35
    From: https://github.com/Homebrew/linuxbrew-core/blob/HEAD/Formula/libgit2.rb
    License: GPL-2.0-only
    ==> Dependencies
    Build: cmake ✔, pkg-config ✔
    Required: libssh2 ✔
    ==> Options
    --HEAD
            Install HEAD version
    ==> Analytics
    install: 1,052 (30 days), 1,994 (90 days), 6,492 (365 days)
    install-on-request: 140 (30 days), 193 (90 days), 598 (365 days)
    build-error: 0 (30 days)
    
    opened by yermulnik 5
  • Issues with repository directories with special characters

    askgit has trouble with repository directories that contain characters that are either special to Go's %q string encoding or special to sqlite. Some characters cause askgit to exit with "unrecognized token" trying to create the virtual table, while others make it further and fail (or produce no data) when executing SQL statements.

    Some special characters do not render well on github, so I've included equivalent shell commands for making these directories.

    The following directories behave the same. select count(*) from commits returns no results (not the number 0 - it returns an empty resultset), while select count(*) from files panics with "panic: invalid handle":

    • back\slash (mkdir 'back\slash')
    • thing﷐ (mkdir 'thing'$'\357\267\220')
    • new line (mkdir 'new'$'\n''line')
    • doublequotes");--injection (mkdir 'doublequotes");--injection')

    The following directories all fail without running the SQL query, with an error like unrecognized token: "");":

    • quotation"marks (mkdir 'quotation"marks')
    • comma",separated" (mkdir 'comma",separated"')

    This seems to be caused by building a SQL string using fmt.Sprintf and %q, which quotes/escapes strings in a way that Go understands rather than in a way that sqlite understands. Go will format " in the middle of a string as \", which Sqlite considers to be a literal backslash character followed by the end of a string, which is why most directories with the double quote character result in "unrecognized token". For other characters it seems like Go will escape them (e.g. \ becomes \\, newline becomes \n), and Sqlite happily passes the escaped versions to the modules' Create functions, which presumably try and fail to open a directory named e.g. back\\slash instead of back\slash

    This can be reproduced in the tests by changing the fixture repo from "repo" to e.g. "repo\\" or "repo\""

    opened by nhinds 5
  • Add support for `.mailmap` files

    See here for context. It would be useful to be able to use the mappings in a repo's .mailmap to de-duplicate authors in queries.

    I'm not entirely sure how we add support for it - maybe as a helper function that takes the contents of a .mailmap and an email address, and returns the associated name.

    Something like SELECT mailmap(<mailmap-contents>, '[email protected]')

    enhancement 
    opened by patrickdevivo 4
  • Install error: cannot find package "github.com/libgit2/git2go/v30"

    When trying to install by running this:

    go get -v -tags=sqlite_vtable github.com/augmentable-dev/askgit
    

    I get this:

    ➜ go get -v -tags=sqlite_vtable github.com/augmentable-dev/askgit
    
    github.com/libgit2/git2go (download)
    cannot find package "github.com/libgit2/git2go/v30" in any of:
    	/usr/local/go/src/github.com/libgit2/git2go/v30 (from $GOROOT)
    	/home/duncan/.go/src/github.com/libgit2/git2go/v30 (from $GOPATH)
    

    My environment is as follows:

     ➜ go version
    go version go1.15.5 linux/amd64
    
    ➜ go env
    GO111MODULE=""
    GOARCH="amd64"
    GOBIN=""
    GOCACHE="/home/duncan/.cache/go-build"
    GOENV="/home/duncan/.config/go/env"
    GOEXE=""
    GOFLAGS=""
    GOHOSTARCH="amd64"
    GOHOSTOS="linux"
    GOINSECURE=""
    GOMODCACHE="/home/duncan/.go/pkg/mod"
    GONOPROXY=""
    GONOSUMDB=""
    GOOS="linux"
    GOPATH="/home/duncan/.go"
    GOPRIVATE=""
    GOPROXY="https://proxy.golang.org,direct"
    GOROOT="/usr/local/go"
    GOSUMDB="sum.golang.org"
    GOTMPDIR=""
    GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
    GCCGO="gccgo"
    AR="ar"
    CC="gcc"
    CXX="g++"
    CGO_ENABLED="1"
    GOMOD=""
    CGO_CFLAGS="-g -O2"
    CGO_CPPFLAGS=""
    CGO_CXXFLAGS="-g -O2"
    CGO_FFLAGS="-g -O2"
    CGO_LDFLAGS="-g -O2"
    PKG_CONFIG="pkg-config"
    GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build228718343=/tmp/go-build -gno-record-gcc-switches"
    
     ➜ neofetch --backend off
    
    OS: Ubuntu 20.04.1 LTS x86_64 
    Kernel: 5.4.0-52-generic 
    Shell: bash 5.0.17 
    DE: Xfce 
    Memory: 38419MiB / 64206MiB 
    
    opened by dflock 4
  • Build fails looking for 'github.com/go-git/go-billy/v5/osfs'

    Build log as follows:

    [email protected]:~/tmp$ go get -v -tags=sqlite_vtable github.com/augmentable-dev/askgit
    github.com/augmentable-dev/askgit (download)
    github.com/go-git/go-git (download)
    github.com/go-git/go-billy (download)
    Fetching https://golang.org/x/sys/unix?go-get=1
    Parsing meta tags from https://golang.org/x/sys/unix?go-get=1 (status code 200)
    get "golang.org/x/sys/unix": found meta tag get.metaImport{Prefix:"golang.org/x/sys", VCS:"git", RepoRoot:"https://go.googlesource.com/sys"} at https://golang.org/x/sys/unix?go-get=1
    get "golang.org/x/sys/unix": verifying non-authoritative meta tag
    Fetching https://golang.org/x/sys?go-get=1
    Parsing meta tags from https://golang.org/x/sys?go-get=1 (status code 200)
    golang.org/x/sys (download)
    github.com/go-git/gcfg (download)
    Fetching https://gopkg.in/warnings.v0?go-get=1
    Parsing meta tags from https://gopkg.in/warnings.v0?go-get=1 (status code 200)
    get "gopkg.in/warnings.v0": found meta tag get.metaImport{Prefix:"gopkg.in/warnings.v0", VCS:"git", RepoRoot:"https://gopkg.in/warnings.v0"} at https://gopkg.in/warnings.v0?go-get=1
    gopkg.in/warnings.v0 (download)
    github.com/mitchellh/go-homedir (download)
    github.com/jbenet/go-context (download)
    Fetching https://golang.org/x/net/context?go-get=1
    Parsing meta tags from https://golang.org/x/net/context?go-get=1 (status code 200)
    get "golang.org/x/net/context": found meta tag get.metaImport{Prefix:"golang.org/x/net", VCS:"git", RepoRoot:"https://go.googlesource.com/net"} at https://golang.org/x/net/context?go-get=1
    get "golang.org/x/net/context": verifying non-authoritative meta tag
    Fetching https://golang.org/x/net?go-get=1
    Parsing meta tags from https://golang.org/x/net?go-get=1 (status code 200)
    golang.org/x/net (download)
    github.com/emirpasic/gods (download)
    github.com/sergi/go-diff (download)
    Fetching https://golang.org/x/crypto/openpgp?go-get=1
    Parsing meta tags from https://golang.org/x/crypto/openpgp?go-get=1 (status code 200)
    get "golang.org/x/crypto/openpgp": found meta tag get.metaImport{Prefix:"golang.org/x/crypto", VCS:"git", RepoRoot:"https://go.googlesource.com/crypto"} at https://golang.org/x/crypto/openpgp?go-get=1
    get "golang.org/x/crypto/openpgp": verifying non-authoritative meta tag
    Fetching https://golang.org/x/crypto?go-get=1
    Parsing meta tags from https://golang.org/x/crypto?go-get=1 (status code 200)
    golang.org/x/crypto (download)
    github.com/kevinburke/ssh_config (download)
    github.com/xanzy/ssh-agent (download)
    Fetching https://golang.org/x/crypto/ssh?go-get=1
    Parsing meta tags from https://golang.org/x/crypto/ssh?go-get=1 (status code 200)
    get "golang.org/x/crypto/ssh": found meta tag get.metaImport{Prefix:"golang.org/x/crypto", VCS:"git", RepoRoot:"https://go.googlesource.com/crypto"} at https://golang.org/x/crypto/ssh?go-get=1
    get "golang.org/x/crypto/ssh": verifying non-authoritative meta tag
    Fetching https://golang.org/x/crypto/ssh/knownhosts?go-get=1
    Parsing meta tags from https://golang.org/x/crypto/ssh/knownhosts?go-get=1 (status code 200)
    get "golang.org/x/crypto/ssh/knownhosts": found meta tag get.metaImport{Prefix:"golang.org/x/crypto", VCS:"git", RepoRoot:"https://go.googlesource.com/crypto"} at https://golang.org/x/crypto/ssh/knownhosts?go-get=1
    get "golang.org/x/crypto/ssh/knownhosts": verifying non-authoritative meta tag
    Fetching https://golang.org/x/net/proxy?go-get=1
    Parsing meta tags from https://golang.org/x/net/proxy?go-get=1 (status code 200)
    get "golang.org/x/net/proxy": found meta tag get.metaImport{Prefix:"golang.org/x/net", VCS:"git", RepoRoot:"https://go.googlesource.com/net"} at https://golang.org/x/net/proxy?go-get=1
    get "golang.org/x/net/proxy": verifying non-authoritative meta tag
    github.com/imdario/mergo (download)
    github.com/mattn/go-sqlite3 (download)
    github.com/gitsight/go-vcsurl (download)
    github.com/olekukonko/tablewriter (download)
    github.com/mattn/go-runewidth (download)
    github.com/spf13/cobra (download)
    github.com/spf13/pflag (download)
    ../go/src/github.com/go-git/go-git/remote.go:9:2: code in directory /home/simon/go/src/github.com/go-git/go-billy/osfs expects import "github.com/go-git/go-billy/v5/osfs"
    

    On investigation, https://github.com/go-git/go-billy/v5/osfs does not exist but https://github.com/go-git/go-billy/osfs does. Suggest this is bit-rot caused by the upstream package changing its directory structure?

    opened by simon-brooke 4
  • provide releases for non-Go developers

    It would be nice to release this as a downloadable set of binaries. I have had some experience with GoReleaser and I have to say it's a pretty nice little tool, especially if you're working in Go. I'd prefer to be able to just brew install gitqlite, which GoReleaser has good support for: https://goreleaser.com/customization/homebrew/

    opened by klauern 4
  • fix panic when querying at the empty repository

    This PR fixes a panic when querying an empty repository. This is my first work in Go :) I wish I could add a test, but after digging for a few hours I gave up on writing the test case :( Sorry for not adding one!

    master branch's behavior:

    $ mkdir empty-git
    $ cd empty-git
    $ git init
    Initialized empty Git repository in /home/youngminz/dist/empty-git-dir/.git/
    
    $ gitqlite "select * from commits"
    panic: invalid handle
            panic: invalid handle
    
    goroutine 1 [running]:
    github.com/mattn/go-sqlite3.lookupHandleVal(0x0, 0x0, 0x0, 0x0)
            /home/youngminz/go/pkg/mod/github.com/mattn/[email protected]+incompatible/callback.go:128 +0x13d
    github.com/mattn/go-sqlite3.lookupHandle(...)
            /home/youngminz/go/pkg/mod/github.com/mattn/[email protected]+incompatible/callback.go:135
    github.com/mattn/go-sqlite3.goVClose(0x0, 0x435a61)
            /home/youngminz/go/pkg/mod/github.com/mattn/[email protected]+incompatible/sqlite3_opt_vtable.go:448 +0x2f
    github.com/mattn/go-sqlite3._cgoexpwrap_7ec2bdc2f5b0_goVClose(0x0, 0x0)
            _cgo_gotypes.go:1506 +0x64
    github.com/mattn/go-sqlite3._Cfunc_sqlite3_finalize(0x29449d8, 0x0)
            _cgo_gotypes.go:962 +0x49
    github.com/mattn/go-sqlite3.(*SQLiteStmt).Close.func1(0xc0001227b0, 0x1)
            /home/youngminz/go/pkg/mod/github.com/mattn/[email protected]+incompatible/sqlite3.go:1767 +0x5f
    github.com/mattn/go-sqlite3.(*SQLiteStmt).Close(0xc0001227b0, 0x0, 0x0)
            /home/youngminz/go/pkg/mod/github.com/mattn/[email protected]+incompatible/sqlite3.go:1767 +0xac
    github.com/mattn/go-sqlite3.(*SQLiteRows).Close(0xc0000aaa20, 0x43520a, 0x7f44b710c6d0)
            /home/youngminz/go/pkg/mod/github.com/mattn/[email protected]+incompatible/sqlite3.go:1956 +0xa7
    database/sql.(*Rows).close.func1()
            /usr/lib/go-1.13/src/database/sql/sql.go:3076 +0x3c
    database/sql.withLock(0xc45a40, 0xc0000e8280, 0xc0000f3288)
            /usr/lib/go-1.13/src/database/sql/sql.go:3184 +0x6d
    database/sql.(*Rows).close(0xc0000e8300, 0x0, 0x0, 0x0, 0x0)
            /usr/lib/go-1.13/src/database/sql/sql.go:3075 +0x129
    database/sql.(*Rows).Close(0xc0000e8300, 0xc000124640, 0xc00009d8c0)
            /usr/lib/go-1.13/src/database/sql/sql.go:3059 +0x33
    panic(0xa7f040, 0xc31020)
            /usr/lib/go-1.13/src/runtime/panic.go:679 +0x1b2
    github.com/mattn/go-sqlite3.lookupHandleVal(0x0, 0x0, 0x0, 0x0)
            /home/youngminz/go/pkg/mod/github.com/mattn/[email protected]+incompatible/callback.go:128 +0x13d
    github.com/mattn/go-sqlite3.lookupHandle(...)
            /home/youngminz/go/pkg/mod/github.com/mattn/[email protected]+incompatible/callback.go:135
    github.com/mattn/go-sqlite3.goVFilter(0x0, 0x0, 0x2950630, 0x0, 0x29457e8, 0x2)
            /home/youngminz/go/pkg/mod/github.com/mattn/[email protected]+incompatible/sqlite3_opt_vtable.go:464 +0x40
    github.com/mattn/go-sqlite3._cgoexpwrap_7ec2bdc2f5b0_goVFilter(0x0, 0x0, 0x2950630, 0x0, 0x29457e8, 0x0)
            _cgo_gotypes.go:1535 +0x9e
    github.com/mattn/go-sqlite3._Cfunc__sqlite3_step_internal(0x29449d8, 0x0)
            _cgo_gotypes.go:414 +0x49
    github.com/mattn/go-sqlite3.(*SQLiteRows).nextSyncLocked.func1(0xc0000aaa20, 0xc00014a0c0)
            /home/youngminz/go/pkg/mod/github.com/mattn/[email protected]+incompatible/sqlite3.go:2030 +0x62
    github.com/mattn/go-sqlite3.(*SQLiteRows).nextSyncLocked(0xc0000aaa20, 0xc00013c1c0, 0xe, 0xe, 0xc00013c1c0, 0xe0)
            /home/youngminz/go/pkg/mod/github.com/mattn/[email protected]+incompatible/sqlite3.go:2030 +0x43
    github.com/mattn/go-sqlite3.(*SQLiteRows).Next(0xc0000aaa20, 0xc00013c1c0, 0xe, 0xe, 0x0, 0x0)
            /home/youngminz/go/pkg/mod/github.com/mattn/[email protected]+incompatible/sqlite3.go:2007 +0x2fb
    database/sql.(*Rows).nextLocked(0xc0000e8300, 0x430000)
            /usr/lib/go-1.13/src/database/sql/sql.go:2767 +0xd5
    database/sql.(*Rows).Next.func1()
            /usr/lib/go-1.13/src/database/sql/sql.go:2745 +0x3c
    database/sql.withLock(0xc47100, 0xc0000e8330, 0xc0000f3ad0)
            /usr/lib/go-1.13/src/database/sql/sql.go:3184 +0x6d
    database/sql.(*Rows).Next(0xc0000e8300, 0xc00013c000)
            /usr/lib/go-1.13/src/database/sql/sql.go:2744 +0x87
    github.com/augmentable-dev/gitqlite/cmd.tableDisplay(0xc0000e8300, 0x43520a, 0xc0000e8300)
            /home/youngminz/dist/gitqlite/cmd/root.go:233 +0x1ba
    github.com/augmentable-dev/gitqlite/cmd.displayDB(0xc0000e8300, 0x0, 0xc0000a4000)
            /home/youngminz/dist/gitqlite/cmd/root.go:136 +0x7d
    github.com/augmentable-dev/gitqlite/cmd.glob..func1(0x1031420, 0xc00010eaa0, 0x1, 0x1)
            /home/youngminz/dist/gitqlite/cmd/root.go:104 +0x2a6
    github.com/spf13/cobra.(*Command).execute(0x1031420, 0xc00009c030, 0x1, 0x1, 0x1031420, 0xc00009c030)
            /home/youngminz/go/pkg/mod/github.com/spf13/[email protected]/command.go:846 +0x2aa
    github.com/spf13/cobra.(*Command).ExecuteC(0x1031420, 0x0, 0x0, 0x0)
            /home/youngminz/go/pkg/mod/github.com/spf13/[email protected]/command.go:950 +0x349
    github.com/spf13/cobra.(*Command).Execute(...)
            /home/youngminz/go/pkg/mod/github.com/spf13/[email protected]/command.go:887
    github.com/augmentable-dev/gitqlite/cmd.Execute()
            /home/youngminz/dist/gitqlite/cmd/root.go:111 +0x2d
    main.main()
            /home/youngminz/dist/gitqlite/gitqlite.go:8 +0x20
    

    After my fix:

    $ gitqlite "select * from commits"
    repository is empty
    
    opened by youngminz 4
  • FEAT: improve git tables interface

    This pull request improves the git virtual tables, building upon previous functionality and making the following changes:

    • Make all git modules into table-valued functions [1]. This switch allows us to support multi-repository queries.
    • Replace branches and tags with a unified refs table. See PRAGMA table_info(refs) for more info.
    • Drop the blame table
    • Add support for services.RepoLocator. Implementations of this interface could locate repositories in different places while keeping the core agnostic of where a repository lives.

    It also switches to go-git (away from libgit2) as the underlying git library. The switch is justified because go-git provides more Go-like access to git's data and is relatively easier to extend. Being written in pure Go, it also simplifies the build process.
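
The locator idea from the list above might look roughly like the sketch below. It is only an illustration: the method signature, the `Repo` type, and both locator implementations are hypothetical names introduced here, not the actual `services.RepoLocator` definition from this pull request.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Repo is a hypothetical stand-in for an opened git repository handle.
type Repo struct {
	Location string // where the repository was found
}

// RepoLocator is a hypothetical sketch of the interface idea: given a
// user-supplied identifier, return an opened repository.
type RepoLocator interface {
	Open(id string) (*Repo, error)
}

// diskLocator treats the identifier as a path on the local disk.
type diskLocator struct{}

func (diskLocator) Open(id string) (*Repo, error) {
	if id == "" {
		return nil, errors.New("empty repository path")
	}
	return &Repo{Location: "disk:" + id}, nil
}

// remoteLocator handles identifiers that look like remote URLs and
// defers everything else to a fallback locator.
type remoteLocator struct {
	fallback RepoLocator
}

func (l remoteLocator) Open(id string) (*Repo, error) {
	if strings.HasPrefix(id, "https://") {
		// A real implementation would clone or fetch the repository here.
		return &Repo{Location: "remote:" + id}, nil
	}
	return l.fallback.Open(id)
}

func main() {
	var loc RepoLocator = remoteLocator{fallback: diskLocator{}}
	for _, id := range []string{"/tmp/repo", "https://example.com/org/repo"} {
		r, err := loc.Open(id)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(r.Location)
	}
}
```

Because callers only see the interface, the query layer does not need to know whether a repository came from disk or from a remote.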

    opened by riyaz-ali 3
  • HTTP 502 error querying private repo with GITHUB_TOKEN

    I'm trying to get information about our pull requests from an internal repo and I'm getting a 502 error and I'm not sure how to debug what the issue is:

    mergestat "SELECT count(*) from github_repo_prs('private-org/private-repo');" -v
    Apr 19 12:57:06 INF starting GitHub repo_pull_requests iterator for private-org/private-repo name=private-repo owner=private-org per-page=100
    Apr 19 12:57:06 INF fetching page of repo_pull_requests for private-org/private-repo cursor=null name=private-repo owner=private-org per-page=100
    Apr 19 12:57:13 INF fetching page of repo_pull_requests for private-org/private-repo cursor=Y3Vyc29yOnYyOpHOGcCNUA== name=private-repo owner=private-org per-page=100
    +----------+
    | COUNT(*) |
    +----------+
    +----------+
    Apr 19 12:57:28 ERR failed to output resultset: non-200 OK status code: 502 Bad Gateway body: "{\n   \"data\": null,\n   \"errors\":[\n      {\n         \"message\":\"Something went wrong while executing your query. This may be the result of a timeout, or it could be a GitHub bug. Please include `51EA:050C:152958:3CEDAD:625EF7F3` when reporting this issue.\"\n      }\n   ]\n}\n"
    
    opened by klauern 5
  • Hardening binary & shared library

    I'm one of the package maintainers for Arch Linux and I also maintain a few packages on the AUR, which mergestat can be found on.

    Just wondering if there's any interest in RELRO/PIE being applied to the binary & shared library?

    I generally try and apply these to all the Go-related packages that I maintain due to our Go package guidelines. I've found that mergestat seems to be working fine with these applied, as per this commit.

    opened by grawlinson 2
  • Finding file paths affected by a commit

    I've been trying to work out how to filter file paths affected by a specific commit. Or in other words, show me the files modified by a particular commit, or maybe all the files affected when a particular merge happens.

    As far as I can tell, each commit hash in the files table has a record for every file in the tree at that time. I'm struggling to see where in the data model the file paths changed in each commit might be available.

    Thanks!
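
One way to think about the question above: if the files table holds a full snapshot of the tree for every commit, then the paths a commit changed are the difference between its snapshot and its parent's. The sketch below illustrates that comparison in plain Go. It assumes each snapshot has already been loaded into a map from path to content hash; the function name and the sample data are hypothetical and are not part of the tool.

```go
package main

import (
	"fmt"
	"sort"
)

// changedPaths compares two tree snapshots (path -> content hash) and
// returns the sorted paths that were added, removed, or modified.
func changedPaths(parent, child map[string]string) []string {
	var changed []string
	// Added or modified: present in child with a new or different hash.
	for path, hash := range child {
		if old, ok := parent[path]; !ok || old != hash {
			changed = append(changed, path)
		}
	}
	// Removed: present in parent but missing from child.
	for path := range parent {
		if _, ok := child[path]; !ok {
			changed = append(changed, path)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	parent := map[string]string{"README.md": "a1", "main.go": "b1", "old.txt": "c1"}
	child := map[string]string{"README.md": "a1", "main.go": "b2", "new.go": "d1"}
	fmt.Println(changedPaths(parent, child)) // prints [main.go new.go old.txt]
}
```

For a merge commit there is more than one parent, so the comparison would have to be repeated against each parent snapshot.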

    opened by eddiesholl 1
  • handle null values in `summarize` queries better

    https://github.com/mergestat/mergestat/blob/main/cmd/summarize/commits/commits.go#L63-L70

    It's possible for author_name / author_email to be null, in which case we have a Scan error - should use sql.NullString in these cases

    bug 
    opened by patrickdevivo 0
  • support for github enterprise

    Hi

    Interesting project! Currently trying to run https://docs.mergestat.com/miscellaneous/cloning-all-org-repos against github enterprise, but failing on auth. Is there any way to set the custom github enterprise url so I can auth to that instead of github.com?

    enhancement 
    opened by andaag 1
Releases(v0.5.6)
A better way to clone, organize and manage multiple git repositories

git-get git-get is a better way to clone, organize and manage multiple git repositories. git-get Description Installation macOS Linux Windows Usage gi

Greg Dlugoszewski 56 Apr 23, 2022
git-xargs is a command-line tool (CLI) for making updates across multiple GitHub repositories with a single command

git-xargs is a command-line tool (CLI) for making updates across multiple GitHub repositories with a single command. You give git-xargs:

Maxar Infrastructure 1 Feb 5, 2022
💊 A git query language

Gitql Gitql is a Git query language. In a repository path... See more here Reading the code ⚠️ This project was created in 2014 as my first go project

Claudson Oliveira 5.9k May 17, 2022
Go Coverage in Shell: a tool for exploring Go Coverage reports from the command line

Go Coverage in Shell: a tool for exploring Go Coverage reports from the command line

Yury Fedorov 140 May 18, 2022
A CLI to replace your git commit command, so your git message can partially follow the Conventional Changelog ecosystem

COMMIT CLI A CLI to replace your git commit command, so your git message can partially follow the Conventional Changelog ecosystem. And yes, it is bui

Hisam Fahri 1 Feb 9, 2022
git-glimpse is a command-line tool that is aimed at generating a git prompt like the one from zsh-vcs-prompt.

Git GoGlimpse git-glimpse is a command-line tool that is aimed at generating a git prompt like the one from zsh-vcs-prompt. The particularity of this

Corentin de Boisset 0 Jan 27, 2022
cross-platform, cli app to perform various operations on string

sttr is command line software that allows you to quickly run various transformation operations on the string.

Abhimanyu Sharma 369 May 13, 2022
Tnbassist - A CLI tool for thenewboston blockchain to perform various mundane tasks like taking daily accounts backup

TNB Assist is a CLI (Command Line Interface) tool for thenewboston blockchain to perform various mundane tasks like taking daily accounts backup, computing statistics, etc easier.

Open blockchain explorer 1 Feb 14, 2022
Architecture checks for Go projects

Arch-Go Architecture checks for Go projects Supported rules Dependencies Checks Supports defining import rules Allowed dependencies Not allowed depend

Francisco Daines 46 Apr 30, 2022
A simple go program which checks if your websites are running and runs forever (stop it with ctrl+c). It takes two optional arguments, comma separated string with urls and an interval.

uptime A simple go program which checks if your websites are running and runs forever (stop it with ctrl+c). It takes two optional arguments: -interva

Markus Tenghamn 6 Jan 6, 2022
🍫 A customisable, universally compatible terminal status bar

Shox: Terminal Status Bar A customisable terminal status bar with universal shell/terminal compatibility. Currently works on Mac/Linux. Installation N

Liam Galvin 672 May 11, 2022
A set of Go scripts to monitor YAGPDB status via the command-line.

A set of Go scripts to monitor YAGPDB status by making GET requests to the YAGPDB status endpoint.

Joe 2 Apr 20, 2022
A simple visualization from the terminal of tintin++ bot status.

Kalterm A simple visualization from the terminal of tintin++ bot status. It uses kalterm.tin (in the tintin directory) to create a #port session on 95

Tom Allen 1 Dec 15, 2021
Battery - cross-platform get battery status

battery Cross-platform get battery status. Tested on Arch Linux, Debian, Ubuntu, Windows, macOS. import "github.com/caiguanhao/battery" battery.GetSt

CGH 0 Dec 31, 2021
Dwmstatus - Simple modular dwm status thing made in go

dwm status simple modular dwm status command made in go that has drop in plugins

null 5 Feb 23, 2022
⚡ boilerplate template manager that generates files or directories from template repositories

Boilr Are you doing the same steps over and over again every time you start a new programming project? Boilr is here to help you create projects from

Tamer Tas 1.5k May 19, 2022
CLI tool for manipulating GitHub Labels across multiple repositories

takolabel Installation Mac $ brew install tommy6073/tap/takolabel Other platforms Download from Releases page in this repository. Usage Set variables

Takayuki NAGATOMI 9 Feb 16, 2022
Grab is a tool that downloads source code repositories into a convenient directory layout created from the repo's URL's domain and path

Grab is a tool that downloads source code repositories into a convenient directory layout created from the repo's URL's domain and path. It supports Git, Mercurial (hg), Subversion, and Bazaar repositories.

Jeff Hodges 18 Feb 25, 2022
Go terminal app listing open pull requests in chosen GitHub repositories

go-pr-watcher About Shows open pull requests on configured GitHub repositories. Getting started Create GitHub personal token with read permissions Cre

Oleg 0 Oct 29, 2021