xyr [WIP]

xyr is a very lightweight, simple, and powerful data ETL platform that helps you query your data sources using SQL.

Supported Drivers

  • local+json: for extracting, transforming, and loading JSON documents from the local filesystem into xyr.
  • local+csv: for extracting, transforming, and loading CSV documents from the local filesystem into xyr.
  • s3+json: for extracting, transforming, and loading JSON documents from S3.
  • s3+csv: for extracting, transforming, and loading CSV documents from S3.
  • postgresql: for extracting, transforming, and loading PostgreSQL query results.
  • clickhouse: for extracting, transforming, and loading ClickHouse query results.
  • redis: for extracting, transforming, and loading Redis data structures.

Use Cases

  • A simple Presto alternative.
  • A simple AWS Athena alternative.
  • Convert your JSON documents into a SQL database.
  • Query your CSV files easily and join them with other data.
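The last two use cases boil down to one idea: once flat files land in a SQL store, they can be joined like any other table. A minimal sketch of that idea using Python's stdlib (not xyr itself; the table names and sample data here are invented for illustration):

```python
import csv
import io
import sqlite3

# A tiny in-memory CSV, standing in for a real file on disk.
CSV_DATA = "id,name,plan\n1,alice,free\n2,bob,pro\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, plan TEXT)")

# Load each CSV row as a dict and bind it by column name.
reader = csv.DictReader(io.StringIO(CSV_DATA))
conn.executemany(
    "INSERT INTO users (id, name, plan) VALUES (:id, :name, :plan)",
    reader,
)

# Join the CSV-backed table against other data.
conn.execute("CREATE TABLE prices (plan TEXT, cents INTEGER)")
conn.executemany("INSERT INTO prices VALUES (?, ?)", [("free", 0), ("pro", 900)])

rows = conn.execute(
    "SELECT u.name, p.cents FROM users u JOIN prices p ON u.plan = p.plan"
    " ORDER BY u.id"
).fetchall()
print(rows)  # [('alice', 0), ('bob', 900)]
```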

How it works

Internally, xyr utilizes SQLite as an embedded SQL datastore (this may change in the future, and multiple datastores may be added). Once you define a table in the XYRCONFIG file and run $ xyr import, all defined tables are imported and can be queried via $ xyr exec "SELECT * FROM TABLE_NAME_HERE".
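The import/exec flow above can be sketched with Python's stdlib sqlite3: documents are flattened into rows in an embedded SQLite database, after which arbitrary SQL runs against them. This is an illustration of the mechanism, not xyr's actual code, and the sample documents and table name are invented:

```python
import sqlite3

# JSON-style documents, standing in for what a local+json driver would read.
DOCS = [
    {"id": 1, "event": "signup", "country": "DE"},
    {"id": 2, "event": "login", "country": "US"},
    {"id": 3, "event": "login", "country": "DE"},
]

db = sqlite3.connect(":memory:")  # the embedded datastore
db.execute("CREATE TABLE events (id INTEGER, event TEXT, country TEXT)")

# "import" step: flatten each document into a row.
db.executemany("INSERT INTO events VALUES (:id, :event, :country)", DOCS)

# "exec" step: any SQL now works against the imported documents.
result = db.execute(
    "SELECT country, COUNT(*) FROM events GROUP BY country ORDER BY country"
).fetchall()
print(result)  # [('DE', 2), ('US', 1)]
```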

Plan

  • Building the initial core.
  • Add the basic import command for importing the tables into xyr.
  • Add the exec command to execute SQL queries.
  • Expose an API besides the CLI to enable external apps to query xyr.
    • JSON Endpoint?
    • MySQL Protocol?
    • Redis Protocol?
  • Improving the code base (iteration 1).
Owner

Mohammed Al Ashaal, a software engineer 🤓