Crossjoin joins together your data from anywhere.
- Supports PostgreSQL, Redshift, and CSV data sources
- Zero dependency CLI, or a single Docker container
Example
In the example directory, there are two CSVs (adapted from this AWS blog post) representing orders and returns data.
The config defines a data set that joins both CSVs on the Order ID field. This example joins two CSVs, but you can mix and match data sources. For example, you can join a PostgreSQL data source with a Redshift data source and a CSV.
data_sets:
  - name: joined
    data_source:
      name: orders
      type: csv
      path: ./orders.csv
    joins:
      - type: JOIN
        columns:
          - left_column: Order ID
            right_column: Order ID
        data_source:
          name: returns
          type: csv
          path: ./returns.csv
$ crossjoin --config ./config.yaml
2021/10/14 18:08:06 using config file path config.yaml
2021/10/14 18:08:06 starting crossjoin
2021/10/14 18:08:06 creating data set `joined`
2021/10/14 18:08:06 querying `orders`
2021/10/14 18:08:06 querying `returns`
2021/10/14 18:08:06 joining data
2021/10/14 18:08:06 finished crossjoin
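To make the join in the config above concrete, here is a minimal Python sketch of what joining two CSVs on Order ID could look like. It is not crossjoin's implementation: it assumes that `type: JOIN` behaves like a SQL inner join (the source does not confirm this), and the sample rows, the `read_rows` helper, and the `inner_join` helper are hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical sample data; these are NOT the rows from the example directory.
ORDERS_CSV = """Order ID,Product
CA-1001,Chair
CA-1002,Desk
CA-1003,Lamp
"""

RETURNS_CSV = """Order ID,Returned
CA-1002,Yes
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def inner_join(left_rows, right_rows, left_column, right_column):
    """Keep only rows whose join values match on both sides (assumed inner-join semantics)."""
    # Index the right-hand rows by their join value for fast lookup.
    index = {}
    for row in right_rows:
        index.setdefault(row[right_column], []).append(row)

    joined = []
    for left in left_rows:
        for right in index.get(left[left_column], []):
            merged = dict(left)
            # Copy the right-hand columns except the join column itself;
            # on a name collision the right-hand value overwrites the left.
            for key, value in right.items():
                if key != right_column:
                    merged[key] = value
            joined.append(merged)
    return joined


joined = inner_join(
    read_rows(ORDERS_CSV), read_rows(RETURNS_CSV), "Order ID", "Order ID"
)
print(joined)
```

With the sample rows above, only order CA-1002 appears in both inputs, so it is the only row in the result. The `left_column` and `right_column` arguments mirror the keys of the same name in the config.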