A Go native tabular data extraction package. Currently supports .xls, .xlsx, .csv, .tsv formats.

Overview

grate

Why?

Grate focuses on speed and stability first, and makes no attempt to parse charts, figures, or other content types that may be embedded in the input files. It tries to perform as few allocations as possible and errs on the side of caution.

There are certainly still some bugs and edge cases, but we have run it successfully on a set of 400k .xls and .xlsx files to catch many bugs and error conditions. Please file an issue with any feedback and additional problem files.

Usage

Grate provides a simple standard interface for all supported filetypes, allowing access to both named worksheets in spreadsheets and single tables in plaintext formats.

package main

import (
    "fmt"
    "log"
    "os"
    "strings"

    "github.com/pbnjay/grate"
    _ "github.com/pbnjay/grate/simple" // tsv and csv support
    _ "github.com/pbnjay/grate/xls"
    _ "github.com/pbnjay/grate/xlsx"
)

func main() {
    if len(os.Args) < 2 {
        log.Fatal("expected the path of the file to read as the first argument")
    }
    wb, err := grate.Open(os.Args[1]) // open the file
    if err != nil {
        log.Fatal(err)
    }
    sheets, err := wb.List() // list available sheets
    if err != nil {
        log.Fatal(err)
    }
    for _, s := range sheets { // enumerate each sheet name
        sheet, err := wb.Get(s) // open the sheet
        if err != nil {
            log.Fatal(err)
        }
        for sheet.Next() { // enumerate each row of data
            row := sheet.Strings() // get the row's content as []string
            fmt.Println(strings.Join(row, "\t"))
        }
    }
    if err := wb.Close(); err != nil {
        log.Fatal(err)
    }
}

License

All source code is licensed under the GNU GPLv3.

Comments
  • Date column prints as days from epoch

    grater xlsm_date_hataly_hatar.xlsm prints

    F_MODKOD        F_TIPUS F_ERTEK F_HATALY        F_HATAR F_TERITO
    11622   E       4.5     43983   44347   T
    F_MODKOD        F_DIJFIZGYAK    F_DIJFIZMOD     F_ERTEK F_HATALY        F_HATAR F
    13101   E       C       496     43983   44317   T
    F_MODKOD        F_TARTAMTOL     F_TARTAMIG      F_KEZD_MULT     F_NYK_MULT      F_EXTRA_MULT    F_BEF_MULT      F_BEF_MULT2     F_HATALY        F_HATALYIG      F_FL    F_MINIMALIS_POOL        F_INIT_KOCK_MULT        F_LAST_KOCK_MULT      F_RESZVISSZA_KTSG
    13103   1       99      50      3       3       0.12    0.08    43983   44347   0       7144    1       1       1744
    F_MODK  F_TAGSZAM       F_EVESDIJ       F_HATALYTOL     F_HATALYIG
    12410   1       1111    43983   44347
    F_HATALY        F_MODKOD        F_BEKOD F_BOSSZEG 2014  F_TERITO        F_SZAZTOL       F_SZAZIG
    43983   12410   E31001  123456  T       100     100
    

    Here, "F_HATALY", "F_HATAR", "F_TARTAMTOL", "F_TARTAMIG", "F_HATALYTOL", "F_HATALYIG" columns are dates.

    xlsm_date_hataly_hatar.xlsm.gz

    opened by tgulacsi 4
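
The values 43983, 44347, and 44317 in the output above look like spreadsheet serial day numbers. As a minimal sketch, assuming the workbook uses the 1900 date system (day 0 taken as 1899-12-30, which is valid for serials after the fictitious 1900-02-29), such a value can be converted with the standard `time` package. `serialToTime` is a hypothetical helper for illustration, not grate's implementation.

```go
package main

import (
	"fmt"
	"time"
)

// excelEpoch is day 0 of the 1900 date system for serials of 61 and above.
// Assumption: the workbook does not use the 1904 date system.
var excelEpoch = time.Date(1899, time.December, 30, 0, 0, 0, 0, time.UTC)

// serialToTime converts a positive serial day number to a time.Time.
// The integer part counts days; the fractional part is the time of day.
func serialToTime(serial float64) time.Time {
	days := int(serial)
	frac := serial - float64(days)
	return excelEpoch.AddDate(0, 0, days).
		Add(time.Duration(frac * 24 * float64(time.Hour)))
}

func main() {
	for _, s := range []float64{43983, 44347, 44317} {
		fmt.Println(s, "->", serialToTime(s).Format("2006-01-02"))
	}
}
```

Under that assumption, 43983 maps to 2020-06-01, 44347 to 2021-05-31, and 44317 to 2021-05-01.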
  • xls reads "0" on rows with many integer values

    • Attached is testing.xls test case
    • testing.tsv (filetype not supported by github) was created by copying testing.xls data into testing.tsv file
    • grate/xls/simple_test.go TestBasic was edited to use testing.xls and testing.tsv and to log all mismatches
    
    func TestBasic(t *testing.T) {
    	trueFile, err := os.ReadFile("../testdata/testing.tsv")
    	if err != nil {
    		t.Skip()
    	}
    	lines := strings.Split(string(trueFile), "\n")
    
    	fn := "../testdata/testing.xls"
    	wb, err := Open(fn)
    	if err != nil {
    		t.Fatal(err)
    	}
    
    	sheets, err := wb.List()
    	if err != nil {
    		t.Fatal(err)
    	}
    	for _, s := range sheets {
    		sheet, err := wb.Get(s)
    		if err != nil {
    			t.Fatal(err)
    		}
    
    
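    		i := 0
    		for sheet.Next() {
    			row := strings.Join(sheet.Strings(), "\t")
    			// Stop if the sheet has more rows than the reference file has lines.
    			if i >= len(lines) {
    				t.Fatalf("row %d has no matching line in testing.tsv", i)
    			}
    			if lines[i] != row {
    				t.Logf("line %d mismatch: '%s' <> '%s'", i, row, lines[i])
    			}
    			i++
    		}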
    	}
    
    	err = wb.Close()
    	if err != nil {
    		t.Fatal(err)
    	}
    }
    

    --- FAIL: TestBasic (0.00s)
    /Users/zeke/Programming/grate/xls/simple_test.go:71: line 2 mismatch: 'b	0	0	0' <> 'b	2	3	4'
    /Users/zeke/Programming/grate/xls/simple_test.go:71: line 4 mismatch: 'b	0	0	0' <> 'b	1	2	1'
    /Users/zeke/Programming/grate/xls/simple_test.go:71: line 5 mismatch: 'b	0	0	0' <> 'b	4	3	2'
    /Users/zeke/Programming/grate/xls/simple_test.go:71: line 6 mismatch: '0	0	0	0' <> '1	1	1   1'
    

    testing.xls

    opened by zvandehy 3
  • date formatting weekdays

    Needs some backtracking in makeFormatter: currently "dddd" becomes "Sunday", but then "d" is applied to that result, producing "Sun11ay".

    opened by pbnjay 0
Owner
Jeremy Jay