Go sqlite3 http vfs: query sqlite databases over http with range headers

Overview

sqlite3vfshttp: a Go sqlite VFS for querying databases over http(s)

sqlite3vfshttp is a sqlite3 VFS for querying remote databases over http(s). This allows you to perform queries without needing to download the complete database first.

Your database must be hosted on a webserver that supports HTTP range requests (such as Amazon S3).

Example

See sqlitehttpcli/sqlitehttpcli.go for a simple CLI tool that is able to query a remotely hosted sqlite database.

Usage

	vfs := sqlite3vfshttp.New(*url)

	err := sqlite3vfs.RegisterVFS("httpvfs", vfs)
	if err != nil {
		log.Fatalf("register vfs err: %s", err)
	}

	db, err := sql.Open("sqlite3", "not_a_read_name.db?vfs=httpvfs&mode=ro")
	if err != nil {
		log.Fatalf("open db err: %s", err)
	}

Querying a database in S3

The original purpose of this library was to allow an AWS Lambda function to be able to query a sqlite database stored in S3 without downloading the entire database.

This is possible even for private files stored in S3 by generating a presigned URL and passing that to this library. That allows the client to make HTTP Get range requests without it needing to know how to sign S3 requests.

Building a loadable extension for the sqlite3 cli

The sqlite3 cli supports runtime loadable extensions. We can build sqlite3vfshttp as a shared library, and then load it from the sqlite3 cli to interactively query sqlite databases over http connections. The shared library code is located in the sqlite3http-ext directory. See the sqlite3http-ext/README.md for more details.

Demo

I've uploaded a 30MB sqlite database to a publicly accessible webserver for testing, based on "Balance of payments international investment position: March 2021 quarter – CSV" from https://www.stats.govt.nz/large-datasets/csv-files-for-download/. The schema is:

CREATE TABLE csv (series_reference,
period,
data_value,
suppressed,
status,
units,
magntude,
subject,
grp,
series_title_1);

You can query this dataset from the sqlite3 cli tool using the shared library extension:

$ cd sqlite3http-ext

# build httpvfs.so shared library
$ make
go build -tags SQLITE3VFS_LOADABLE_EXT -o sqlite3http_ext.a -buildmode=c-archive sqlite3http_ext.go
gcc -g -fPIC -shared -o httpvfs.so sqlite3http_ext.c sqlite3http_ext.a

# set url of sqlite3 db as environment variable SQLITE3VFSHTTP_URL:
$ export SQLITE3VFSHTTP_URL='https://www.sanford.io/demo.db'

$ sqlite3
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> -- load extention
sqlite> .load ./httpvfs
sqlite> -- open db using vfs=httpvfs, note you must use the sqlite uri syntax which starts with file://
sqlite> .open file:///foo.db?vfs=httpvfs
sqlite> -- query from remote db
sqlite> SELECT * from csv where period > '2010' limit 10;
series_reference      period      data_value  suppressed  status      units       magntude    subject                    grp                                                   series_title_1
--------------------  ----------  ----------  ----------  ----------  ----------  ----------  -------------------------  ----------------------------------------------------  --------------
BOPQ.S06AC000000000A  2010.03     17463                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2010.06     17260                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2010.09     15419                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2010.12     17088                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2011.03     18516                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2011.06     18835                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2011.09     16390                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2011.12     18748                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2012.03     18477                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual
BOPQ.S06AC000000000A  2012.06     18270                   F           Dollars     6           Balance of Payments - BOP  BPM6 Quarterly, Balance of payments major components  Actual

Alternatively, you can query this dataset using the sqlitehttpcli example tool:

# query the sqlite schema table
$ ./sqlitehttpcli -url 'https://www.sanford.io/demo.db' -query 'select * from main.sqlite_master'
row: [table csv csv 2 CREATE TABLE csv (series_reference,
period,
data_value,
suppressed,
status,
units,
magntude,
subject,
grp,
series_title_1)]

# get 10 rows from the dataset
./sqlitehttpcli -url 'https://www.sanford.io/demo.db' -query "select * from csv limit 10"
row: [BOPQ.S06AC000000000A 1971.06 426  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1971.09 435  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1971.12 360  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1972.03 417  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1972.06 528  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1972.09 471  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1972.12 437  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1973.03 607  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1973.06 666  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 1973.09 578  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]

# get 10 rows where the period is after 2010
$ ./sqlitehttpcli -url 'https://www.sanford.io/demo.db' -query "select * from csv where period > '2010' limit 10"
row: [BOPQ.S06AC000000000A 2010.03 17463  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2010.06 17260  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2010.09 15419  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2010.12 17088  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2011.03 18516  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2011.06 18835  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2011.09 16390  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2011.12 18748  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2012.03 18477  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]
row: [BOPQ.S06AC000000000A 2012.06 18270  F Dollars 6 Balance of Payments - BOP BPM6 Quarterly, Balance of payments major components Actual]

Issues
  • Update roundTripper in sqlitehttpcli.go to support file:// URIs

    Update roundTripper in sqlitehttpcli.go to support file:// URIs

    For example:

    > ~/go/bin/go1.17rc2 run sqlitehttpcli/sqlitehttpcli.go -url file:///usr/local/data/sfomuseum-architecture.db -query 'SELECT id FROM geojson LIMIT 10'
    row: [1159157037]
    row: [1159157039]
    row: [1159157041]
    row: [1159157045]
    row: [1159157047]
    row: [1159157049]
    row: [1159157051]
    row: [1159157053]
    row: [1159157055]
    row: [1159157057]
    

    I considered moving roundTripper in to its own public-facing package but decided against it for this PR. "One thing at a time" and all that :-)

    opened by thisisaaronland 1
  • Improve gh workflow logic

    Improve gh workflow logic

    Use the same build tag that sqlite3vfs uses for extension builds for the loadable module. This allows us to ignore this package for normal builds via go build ./....

    opened by psanford 0
Owner
Peter Sanford
Peter Sanford
SQLite extension for accessing other SQL databases

dblite SQLite extension for accessing other SQL databases, in SQLite. Similar to how Postgres Foreign Data Wrappers enable access to other databases i

MergeStat 9 May 20, 2022
BQB is a lightweight and easy to use query builder that works with sqlite, mysql, mariadb, postgres, and others.

Basic Query Builder Why Simple, lightweight, and fast Supports any and all syntax by the nature of how it works Doesn't require learning special synta

Aaron M 37 Jun 11, 2022
Sqlair - SQLite Query Layer With Golang

sqlair SQLite Query Layer Creates an abstract over the go sql package to provide

Simon Richardson 1 Feb 18, 2022
OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.

OctoSQL OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases, streaming sources and file formats using

Jacob Martin 3.1k Jun 27, 2022
Use SQL to instantly query instances, networks, databases, and more from Scaleway. Open source CLI. No DB required.

Scaleway Plugin for Steampipe Use SQL to query infrastructure servers, networks, databases and more from your Scaleway project. Get started → Document

Turbot 5 Dec 16, 2021
Use SQL to query databases, logs and more from PlanetScale

Use SQL to instantly query PlanetScale databases, branches and more. Open source CLI. No DB required.

Turbot 1 Feb 13, 2022
This is the code example how to use SQL to query data from any relational databases in Go programming language.

Go with SQL example This is the code example how to use SQL to query data from any relational databases in Go programming language. To start, please m

Muhammad Uzair Mohd Faizul 1 Mar 12, 2022
Create key value sqlite3 database from tabular data, fast.

Turn tabular data into a lookup table using sqlite3. This is a working PROTOTYPE with limitations, e.g. no customizations, the table definition is fixed, etc.

Martin Czygan 5 Apr 2, 2022
Make a sqlite3 database from tabular data, fast.

MAKTA make a database from tabular data Turn tabular data into a lookup table using sqlite3. This is a working PROTOTYPE with limitations, e.g. no cus

Martin Czygan 5 Apr 2, 2022
A tool I made to quickly store bug bounty program scopes in a local sqlite3 database

GoScope A tool I made to quickly store bug bounty program scopes in a local sqlite3 database. Download or copy a Burpsuite configuration file from the

null 3 Nov 18, 2021
Go-Postgresql-Query-Builder - A query builder for Postgresql in Go

Postgresql Query Builder for Go This query builder aims to make complex queries

Samuel Banks 4 May 24, 2022
SQL API is designed to be able to run queries on databases without any configuration by simple HTTP call.

SQL API SQL API is designed to be able to run queries on databases without any configuration by simple HTTP call. The request contains the DB credenti

Çiçeksepeti Tech 24 Jun 21, 2022
Convert data exports from various services to a single SQLite database

Bionic Bionic is a tool to convert data exports from web apps to a single SQLite database. Bionic currently supports data exports from Google, Apple H

Bionic 138 Jun 29, 2022
Pure Go SQLite file reader

Package SQLittle provides pure Go, read-only, access to SQLite (version 3) database files. What SQLittle reads SQLite3 tables and indexes. It iterates

Harmen 182 Jul 1, 2022
Low-level Go interface to SQLite 3

zombiezen.com/go/sqlite This package provides a low-level Go interface to SQLite 3. It is a fork of crawshaw.io/sqlite that uses modernc.org/sqlite, a

Ross Light 265 Jul 1, 2022
Streaming replication for SQLite.

Litestream Litestream is a standalone streaming replication tool for SQLite. It runs as a background process and safely replicates changes incremental

Ben Johnson 6.9k Jun 30, 2022
Experimental implementation of a SQLite backend for go-mysql-server

go-mysql-sqlite-server This is an experimental implementation of a SQLite backend for go-mysql-server from DoltHub. The go-mysql-server is a "frontend

MergeStat 5 May 8, 2022
Golang database driver for SQLite

go-sqlite Golang database driver for SQLite. Does not use cgo. This driver is based on pure-Go SQLite implementation (https://gitlab.com/cznic/sqlite)

glebarez 48 Jul 3, 2022