csvutil provides fast and idiomatic mapping between CSV and Go (golang) values.

Overview

This package does not provide a CSV parser itself; it is built on top of Reader and Writer interfaces, which are implemented by e.g. the std Go encoding/csv package. This makes it possible to choose any other CSV reader or writer that may be more performant.

Installation

go get github.com/jszwec/csvutil

Requirements

  • Go 1.7+

Index

  1. Examples
    1. Unmarshal
    2. Marshal
    3. Unmarshal and metadata
    4. But my CSV file has no header...
    5. Decoder.Map - data normalization
    6. Different separator/delimiter
    7. Custom Types
    8. Custom time.Time format
    9. Custom struct tags
    10. Slice and Map fields
    11. Nested/Embedded structs
    12. Inline tag
  2. Performance
    1. Unmarshal
    2. Marshal

Example

Unmarshal

Unmarshal uses the Go std csv.Reader with its default options, making it nice and easy for the common case. Use Decoder for streaming and more advanced use cases.

	var csvInput = []byte(`
name,age,CreatedAt
jacek,26,2012-04-01T15:00:00Z
john,,0001-01-01T00:00:00Z`,
	)

	type User struct {
		Name      string `csv:"name"`
		Age       int    `csv:"age,omitempty"`
		CreatedAt time.Time
	}

	var users []User
	if err := csvutil.Unmarshal(csvInput, &users); err != nil {
		fmt.Println("error:", err)
	}

	for _, u := range users {
		fmt.Printf("%+v\n", u)
	}

	// Output:
	// {Name:jacek Age:26 CreatedAt:2012-04-01 15:00:00 +0000 UTC}
	// {Name:john Age:0 CreatedAt:0001-01-01 00:00:00 +0000 UTC}

Marshal

Marshal uses the Go std csv.Writer with its default options. Use Encoder for streaming or to write to a different Writer.

	type Address struct {
		City    string
		Country string
	}

	type User struct {
		Name string
		Address
		Age       int `csv:"age,omitempty"`
		CreatedAt time.Time
	}

	users := []User{
		{
			Name:      "John",
			Address:   Address{"Boston", "USA"},
			Age:       26,
			CreatedAt: time.Date(2010, 6, 2, 12, 0, 0, 0, time.UTC),
		},
		{
			Name:    "Alice",
			Address: Address{"SF", "USA"},
		},
	}

	b, err := csvutil.Marshal(users)
	if err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println(string(b))

	// Output:
	// Name,City,Country,age,CreatedAt
	// John,Boston,USA,26,2010-06-02T12:00:00Z
	// Alice,SF,USA,,0001-01-01T00:00:00Z

Unmarshal and metadata

It may happen that your CSV input does not always have the same header. In addition to your base fields, you may get extra metadata that you would still like to store. Decoder provides the Unused method, which, after each call to Decode, reports which header indexes were not used during decoding. Based on that, it is possible to handle and store all these extra values.

	type User struct {
		Name      string            `csv:"name"`
		City      string            `csv:"city"`
		Age       int               `csv:"age"`
		OtherData map[string]string `csv:"-"`
	}

	csvReader := csv.NewReader(strings.NewReader(`
name,age,city,zip
alice,25,la,90005
bob,30,ny,10005`))

	dec, err := csvutil.NewDecoder(csvReader)
	if err != nil {
		log.Fatal(err)
	}

	header := dec.Header()
	var users []User
	for {
		u := User{OtherData: make(map[string]string)}

		if err := dec.Decode(&u); err == io.EOF {
			break
		} else if err != nil {
			log.Fatal(err)
		}

		for _, i := range dec.Unused() {
			u.OtherData[header[i]] = dec.Record()[i]
		}
		users = append(users, u)
	}

	fmt.Println(users)

	// Output:
	// [{alice la 25 map[zip:90005]} {bob ny 30 map[zip:10005]}]

But my CSV file has no header...

Some CSV files have no header, but if you know what it should look like, it is possible to define a struct and generate the header from it. All that is left to do is to pass it to a decoder.

	type User struct {
		ID   int
		Name string
		Age  int `csv:",omitempty"`
		City string
	}

	csvReader := csv.NewReader(strings.NewReader(`
1,John,27,la
2,Bob,,ny`))

	// in a real application this should be done once, e.g. in an init function.
	userHeader, err := csvutil.Header(User{}, "csv")
	if err != nil {
		log.Fatal(err)
	}

	dec, err := csvutil.NewDecoder(csvReader, userHeader...)
	if err != nil {
		log.Fatal(err)
	}

	var users []User
	for {
		var u User
		if err := dec.Decode(&u); err == io.EOF {
			break
		} else if err != nil {
			log.Fatal(err)
		}
		users = append(users, u)
	}

	fmt.Printf("%+v", users)

	// Output:
	// [{ID:1 Name:John Age:27 City:la} {ID:2 Name:Bob Age:0 City:ny}]

Decoder.Map - data normalization

The Decoder's Map function is a powerful tool that can help clean up or normalize the incoming data before the actual decoding takes place.

Let's say we want to decode some floats and the CSV input contains some NaN values, but these values are represented by the string 'n/a'. An attempt to decode 'n/a' into a float will result in an error, because strconv.ParseFloat expects 'NaN'. Knowing that, we can implement a Map function that normalizes the 'n/a' string into 'NaN', but only for float types.

	dec, err := csvutil.NewDecoder(r)
	if err != nil {
		log.Fatal(err)
	}

	dec.Map = func(field, column string, v interface{}) string {
		if _, ok := v.(float64); ok && field == "n/a" {
			return "NaN"
		}
		return field
	}

Now our float64 fields will be decoded properly into NaN. What about float32, float type aliases and other NaN formats? Look at the full example here.

Different separator/delimiter

Some files may use different value separators, for example TSV files would use \t. The following examples show how to set up a Decoder and Encoder for such use case.

Decoder:

	csvReader := csv.NewReader(r)
	csvReader.Comma = '\t'

	dec, err := csvutil.NewDecoder(csvReader)
	if err != nil {
		log.Fatal(err)
	}

	var users []User
	for {
		var u User
		if err := dec.Decode(&u); err == io.EOF {
			break
		} else if err != nil {
			log.Fatal(err)
		}
		users = append(users, u)
	}

Encoder:

	var buf bytes.Buffer

	w := csv.NewWriter(&buf)
	w.Comma = '\t'
	enc := csvutil.NewEncoder(w)

	for _, u := range users {
		if err := enc.Encode(u); err != nil {
			log.Fatal(err)
		}
	}

	w.Flush()
	if err := w.Error(); err != nil {
		log.Fatal(err)
	}

Custom Types and Overrides

There are multiple ways to customize or override your type's behavior.

  1. a type implements csvutil.Marshaler and/or csvutil.Unmarshaler
type Foo int64

func (f Foo) MarshalCSV() ([]byte, error) {
	return strconv.AppendInt(nil, int64(f), 16), nil
}

func (f *Foo) UnmarshalCSV(data []byte) error {
	i, err := strconv.ParseInt(string(data), 16, 64)
	if err != nil {
		return err
	}
	*f = Foo(i)
	return nil
}
  2. a type implements encoding.TextUnmarshaler and/or encoding.TextMarshaler
type Foo int64

func (f Foo) MarshalText() ([]byte, error) {
	return strconv.AppendInt(nil, int64(f), 16), nil
}

func (f *Foo) UnmarshalText(data []byte) error {
	i, err := strconv.ParseInt(string(data), 16, 64)
	if err != nil {
		return err
	}
	*f = Foo(i)
	return nil
}
  3. a type is registered using Encoder.Register and/or Decoder.Register
type Foo int64

enc.Register(func(f Foo) ([]byte, error) {
	return strconv.AppendInt(nil, int64(f), 16), nil
})

dec.Register(func(data []byte, f *Foo) error {
	v, err := strconv.ParseInt(string(data), 16, 64)
	if err != nil {
		return err
	}
	*f = Foo(v)
	return nil
})
  4. a type implements an interface that was registered using Encoder.Register and/or Decoder.Register
type Foo int64

func (f Foo) String() string {
	return strconv.FormatInt(int64(f), 16)
}

func (f *Foo) Scan(state fmt.ScanState, verb rune) error {
	// too long; look here: https://github.com/jszwec/csvutil/blob/master/example_decoder_register_test.go#L19
}

enc.Register(func(s fmt.Stringer) ([]byte, error) {
	return []byte(s.String()), nil
})

dec.Register(func(data []byte, s fmt.Scanner) error {
	_, err := fmt.Sscan(string(data), s)
	return err
})

The order of precedence for both Encoder and Decoder is:

  1. type is registered
  2. type implements an interface that was registered
  3. csvutil.{Un,M}arshaler
  4. encoding.Text{Un,M}arshaler

For more examples look here

Custom time.Time format

Type time.Time can be used as is in struct fields by both Decoder and Encoder because both have built-in support for encoding.TextUnmarshaler and encoding.TextMarshaler. This means that by default Time has a specific format; look at MarshalText and UnmarshalText. There are two ways to override it; which one you choose depends on your use case:

  1. Via Register func (based on encoding/json)
const format = "2006/01/02 15:04:05"

marshalTime := func(t time.Time) ([]byte, error) {
	return t.AppendFormat(nil, format), nil
}

unmarshalTime := func(data []byte, t *time.Time) error {
	tt, err := time.Parse(format, string(data))
	if err != nil {
		return err
	}
	*t = tt
	return nil
}

enc := csvutil.NewEncoder(w)
enc.Register(marshalTime)

dec, err := csvutil.NewDecoder(r)
if err != nil {
	return err
}
dec.Register(unmarshalTime)
  2. With custom type:
type Time struct {
	time.Time
}

const format = "2006/01/02 15:04:05"

func (t Time) MarshalCSV() ([]byte, error) {
	var b [len(format)]byte
	return t.AppendFormat(b[:0], format), nil
}

func (t *Time) UnmarshalCSV(data []byte) error {
	tt, err := time.Parse(format, string(data))
	if err != nil {
		return err
	}
	*t = Time{Time: tt}
	return nil
}

Custom struct tags

Like in other Go encoding packages, struct field tags can be used to set custom names or options. By default, encoders and decoders look at the csv tag. However, this can be overridden by manually setting the Tag field.

	type Foo struct {
		Bar int `custom:"bar"`
	}
	dec, err := csvutil.NewDecoder(r)
	if err != nil {
		log.Fatal(err)
	}
	dec.Tag = "custom"
	enc := csvutil.NewEncoder(w)
	enc.Tag = "custom"

Slice and Map fields

There is no default encoding/decoding support for slice and map fields because there is no CSV spec for such values. In such cases, it is recommended to create a custom named type and implement the Marshaler and Unmarshaler interfaces. Please note that slice and map types behave differently than other named types - there is no need for type casting.

	type Strings []string

	func (s Strings) MarshalCSV() ([]byte, error) {
		return []byte(strings.Join(s, ",")), nil // strings.Join takes []string but it will also accept Strings
	}

	type StringMap map[string]string

	func (sm StringMap) MarshalCSV() ([]byte, error) {
		return []byte(fmt.Sprint(sm)), nil
	}

	func main() {
		b, err := csvutil.Marshal([]struct {
			Strings Strings   `csv:"strings"`
			Map     StringMap `csv:"map"`
		}{
			{[]string{"a", "b"}, map[string]string{"a": "1"}}, // no type casting is required for slice and map aliases
			{Strings{"c", "d"}, StringMap{"b": "1"}},
		})

		if err != nil {
			log.Fatal(err)
		}

		fmt.Printf("%s\n", b)

		// Output:
		// strings,map
		// "a,b",map[a:1]
		// "c,d",map[b:1]
	}

Nested/Embedded structs

Both Encoder and Decoder support nested or embedded structs.

Playground: https://play.golang.org/p/ZySjdVkovbf

package main

import (
	"fmt"

	"github.com/jszwec/csvutil"
)

type Address struct {
	Street string `csv:"street"`
	City   string `csv:"city"`
}

type User struct {
	Name string `csv:"name"`
	Address
}

func main() {
	users := []User{
		{
			Name: "John",
			Address: Address{
				Street: "Boylston",
				City:   "Boston",
			},
		},
	}

	b, err := csvutil.Marshal(users)
	if err != nil {
		panic(err)
	}

	fmt.Printf("%s\n", b)

	var out []User
	if err := csvutil.Unmarshal(b, &out); err != nil {
		panic(err)
	}

	fmt.Printf("%+v\n", out)

	// Output:
	//
	// name,street,city
	// John,Boylston,Boston
	//
	// [{Name:John Address:{Street:Boylston City:Boston}}]
}

Inline tag

Fields with the inline tag behave similarly to embedded struct fields. However, they make it possible to specify a prefix for all underlying fields. This can be useful when a single struct defines multiple CSV columns that differ from each other only by a certain prefix. Look at the example below.

Playground: https://play.golang.org/p/jyEzeskSnj7

package main

import (
	"fmt"

	"github.com/jszwec/csvutil"
)

func main() {
	type Address struct {
		Street string `csv:"street"`
		City   string `csv:"city"`
	}

	type User struct {
		Name        string  `csv:"name"`
		Address     Address `csv:",inline"`
		HomeAddress Address `csv:"home_address_,inline"`
		WorkAddress Address `csv:"work_address_,inline"`
		Age         int     `csv:"age,omitempty"`
	}

	users := []User{
		{
			Name:        "John",
			Address:     Address{"Washington", "Boston"},
			HomeAddress: Address{"Boylston", "Boston"},
			WorkAddress: Address{"River St", "Cambridge"},
			Age:         26,
		},
	}

	b, err := csvutil.Marshal(users)
	if err != nil {
		fmt.Println("error:", err)
	}

	fmt.Printf("%s\n", b)

	// Output:
	// name,street,city,home_address_street,home_address_city,work_address_street,work_address_city,age
	// John,Washington,Boston,Boylston,Boston,River St,Cambridge,26
}

Performance

Among the packages compared below, csvutil provides the best encoding and decoding performance with the lowest memory usage.

Unmarshal

benchmark code

csvutil:

BenchmarkUnmarshal/csvutil.Unmarshal/1_record-12         	  280696	      4516 ns/op	    7332 B/op	      26 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/10_records-12       	   95750	     11517 ns/op	    8356 B/op	      35 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/100_records-12      	   14997	     83146 ns/op	   18532 B/op	     125 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/1000_records-12     	    1485	    750143 ns/op	  121094 B/op	    1025 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/10000_records-12    	     154	   7587205 ns/op	 1136662 B/op	   10025 allocs/op
BenchmarkUnmarshal/csvutil.Unmarshal/100000_records-12   	      14	  76126616 ns/op	11808744 B/op	  100025 allocs/op

gocsv:

BenchmarkUnmarshal/gocsv.Unmarshal/1_record-12           	  141330	      7499 ns/op	    7795 B/op	      97 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/10_records-12         	   54252	     21664 ns/op	   13891 B/op	     307 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/100_records-12        	    6920	    159662 ns/op	   72644 B/op	    2380 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/1000_records-12       	     752	   1556083 ns/op	  650248 B/op	   23083 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/10000_records-12      	      72	  17086623 ns/op	 7017469 B/op	  230092 allocs/op
BenchmarkUnmarshal/gocsv.Unmarshal/100000_records-12     	       7	 163610749 ns/op	75004923 B/op	 2300105 allocs/op

easycsv:

BenchmarkUnmarshal/easycsv.ReadAll/1_record-12           	  101527	     10662 ns/op	    8855 B/op	      81 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/10_records-12         	   23325	     51437 ns/op	   24072 B/op	     391 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/100_records-12        	    2402	    447296 ns/op	  170538 B/op	    3454 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/1000_records-12       	     272	   4370854 ns/op	 1595683 B/op	   34057 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/10000_records-12      	      24	  47502457 ns/op	18861808 B/op	  340068 allocs/op
BenchmarkUnmarshal/easycsv.ReadAll/100000_records-12     	       3	 468974170 ns/op	189427066 B/op	 3400082 allocs/op

Marshal

benchmark code

csvutil:

BenchmarkMarshal/csvutil.Marshal/1_record-12         	  279558	      4390 ns/op	    9952 B/op	      12 allocs/op
BenchmarkMarshal/csvutil.Marshal/10_records-12       	   82478	     15608 ns/op	   10800 B/op	      21 allocs/op
BenchmarkMarshal/csvutil.Marshal/100_records-12      	   10275	    117288 ns/op	   28208 B/op	     112 allocs/op
BenchmarkMarshal/csvutil.Marshal/1000_records-12     	    1075	   1147473 ns/op	  168508 B/op	    1014 allocs/op
BenchmarkMarshal/csvutil.Marshal/10000_records-12    	     100	  11985382 ns/op	 1525973 B/op	   10017 allocs/op
BenchmarkMarshal/csvutil.Marshal/100000_records-12   	       9	 113640813 ns/op	22455873 B/op	  100021 allocs/op

gocsv:

BenchmarkMarshal/gocsv.Marshal/1_record-12           	  203052	      6077 ns/op	    5914 B/op	      81 allocs/op
BenchmarkMarshal/gocsv.Marshal/10_records-12         	   50132	     24585 ns/op	    9284 B/op	     360 allocs/op
BenchmarkMarshal/gocsv.Marshal/100_records-12        	    5480	    212008 ns/op	   51916 B/op	    3151 allocs/op
BenchmarkMarshal/gocsv.Marshal/1000_records-12       	     514	   2053919 ns/op	  444506 B/op	   31053 allocs/op
BenchmarkMarshal/gocsv.Marshal/10000_records-12      	      52	  21066666 ns/op	 4332377 B/op	  310064 allocs/op
BenchmarkMarshal/gocsv.Marshal/100000_records-12     	       5	 207408929 ns/op	51169419 B/op	 3100077 allocs/op

Issues

  • Add support for splitting fields with MarshalCSV

    I already implemented a proof of concept in https://github.com/diegommm/csvutil. It allows using a custom marshaler for a type to create any number of fields from it. Example:

    package main
    
    import (
    	"fmt"
    
    	"github.com/diegommm/csvutil"
    )
    
    type Embedded struct {
    	Field1 string
    	Field2 string
    	Field3 string
    }
    
    func (o Embedded) MarshalCSVMulti(key string) ([]byte, error) {
    	switch key {
    	case "outputName1":
    		return []byte(o.Field1 + o.Field2), nil
    	case "outputName2":
    		return []byte(o.Field3), nil
    	}
    	return []byte{}, fmt.Errorf("Output field not found: %v", key)
    }
    
    type Outer struct {
    	OuterField string
    	Embedded   `csv:",multi=outputName1 outputName2"`
    }
    
    func main() {
    	s := Outer{
    		OuterField: "Lorem Ipsum",
    		Embedded: Embedded{
    			Field1: "Dolor sit ",
    			Field2: "amet",
    			Field3: "consectetuer",
    		},
    	}
    
    	b, err := csvutil.Marshal([]Outer{s})
    	if err != nil {
    		fmt.Printf("ERROR: %v\n", err)
    	}
    	fmt.Printf("CSV OUTPUT: %v\n", string(b))
    
    	return
    }
    

    Example output:

    $ GO111MODULE=off go run main.go
    CSV OUTPUT: OuterField,outputName1,outputName2
    Lorem Ipsum,Dolor sit amet,consectetuer
    
    

    TODO:

    • Add tests
    • Add documentation
    • Add support for pointer receiving methods

    If this is a desired feature then I might add those.

    feature request 
    opened by diegommm 9
  • Proposal: make `decodeError` public

    Description

    In v1.6 and later, it is possible to get more detailed errors. However, if you want to use information such as the Line or Column fields of decodeError (for example, when you want to customize error messages, e.g. for i18n), you cannot easily retrieve them. How about making decodeError public for such cases?

    Or if you already have a case where you can handle decodeError properly, please let me know 🙏

    workaround for now

    This is how we currently get it.

    var ute *csvutil.UnmarshalTypeError
    if !errors.As(err, &ute) {
      // skip
    }
    var idx int
    for i, c := range decoder.Record() {
      if c == ute.Value {
        idx = i
      }
    }
    line, col := csvReader.FieldPos(idx)
    

    related: #43

    feature request proposal-accepted 
    opened by kanata2 7
  • Would this work with inconsistently delimited files?

    Hey there, I have a folder full of files, and unfortunately the application that generates them doesn't keep things very consistent. Examples are below. From what I can tell, the first 3 lines are always comments; then comes a section starting with HCONTEXT, of which there might be just one or several. There are not always additional comments before you get to the sets of data, but in the second example there are. The sets of data are always laid out the same, with the first column being an application symbol, then a label, a description, and finally a list of 0 to N keys delimited by a space.

    The main issue is that the delimiters between the four columns are not consistent. The layout of the data is always the same, but to delimit the text some lines might use a single tab (\t), some two, some three, or a single \t and a space (\s), two spaces and a tab (\s\t\s or \s\s\t), etc.

    I saw that this library allows you to set the delimiter you want to search for and use, but does it have any capability to search for multiple types of delimiters, or allow for specific delimiters within a field? An example of that would be the 4th field with the key combinations; from what I have seen so far, each one is always delimited by a single space (\s).

    If you would not mind letting me know whether this library is able to help me out with this, I would greatly appreciate it. If not, do you happen to know of one that might? I was not exactly sure what search terms to use when looking; I tried "parse text", "csv", "multiple delimiters", and various other things, but this library so far is the only one that looks like it might help. Unless I need to just use multiple libraries and do it in different steps - I am hoping to keep it as performant as possible at runtime.

    Thanks! -MH

    //
    // Desktop manager (separate app)
    //
    
    HCONTEXT deskmgr "Desktop Manager" "These keys are used in the Desktop Manager dialog."
    
    deskmgr.new		"New"		"Create a new desktop"		Alt+N N
    deskmgr.add		"Add"		"Add a desktop"			Alt+D D
    deskmgr.apply		"Apply"		"Apply current changes"
    deskmgr.accept		"Accept"	"Accept current changes"
    deskmgr.discard		"Discard"	"Discard current changes"
    deskmgr.reload		"Reload"	"Reload the desktops"
    deskmgr.refresh		"Refresh"	"Refresh the desktops"
    deskmgr.save		"Save"		"Save current changes"		Alt+S S
    deskmgr.cancel		"Cancel"	"Cancel current changes"	Esc
    
    
    //
    // Gplay hotkeys
    //
    
    HCONTEXT gplay "GPLAY Geometry Viewer" "These keys apply to the Geometry Viewer application."
    
    // File menu
    gplay.open		"Open"			"Open"			Alt+O Ctrl+O
    gplay.quit		"Quit"			"Quit"			Alt+Q Ctrl+Q
    
    // Display menu
    gplay.display_info	"Geometry Info"		"Geometry Info"		Alt+I
    gplay.unpack		"Unpack Geometry"	"Unpack Geometry"	Alt+U
    gplay.display_ssheet	"Geometry Spreadsheet"	"Geometry Speadsheet"	Alt+S
    gplay.flipbook		"Flipbook Current Viewport" "Flipbook the currently selected viewport"	Alt+F
    gplay.display_prefs	"Preferences"		"Preferences"		
    
    // Help menu
    gplay.help_menu		"Help Menu"		"Help Menu"		Alt+H
    
    // Commands not in menus
    gplay.quick_quit	"Quick Quit"		"Quick Quit"		Q
    gplay.next_geo		"Next Geometry"		"Next Geometry"		N
    gplay.prev_geo		"Previous Geometry"	"Previous Geometry"	P
    gplay.stop_play		"Stop Play"		"Stop Play"		Space
    
    
    question 
    opened by MostHated 7
  • feature request: "inline" array elements (similar to inline tag for structs)

    Hi Jacek,

    First off, I would like to take my hat of and applaud the effort you've been undertaking! The capabilities of csvutil are really quite impressive, especially when it comes to handling custom fields/tags/marshaling requirements, etc.

    Even given the extensive support for custom marshaling/unmarshaling in csvutil, there is still one situation I would like to explore.

    Lets say we have a struct T containing a few fields, one of which is an array; e.g.

    type T1 struct {
       A int
       B string
       C [3]int `csv:"C_,inline"`
    }
    
    var data1 = T1{A: 123, B: "foo", C: [3]int{1, 2, 3}}
    

    Marshaling the above struct with csvutil would give the following error: unsupported type: [3]int.

    However, a very similar situation is supported with inline struct fields, namely:

    type T2 struct {
       A int
       B string
       C X `csv:"C_,inline"`
    }
    
    type X struct {
       F1 int
       F2 int
       F3 int
    }
    
    var data2 = T2{A: 123, B: "foo", C: X{F1: 1, F2: 2, F3: 3}}
    

    Which would marshal into:

    A   , B     , C_F1 , C_F2 , C_F3
    123 , "foo" , 1    , 2    , 3
    

    Feature request

    It would be great if arrays could be handled in a similar fashion to "inline" struct fields: inlining the respective elements of the array, assigning name prefixes analogously to how it's done for inlined structs, and using the array element index as a name suffix.

    Given the example above, data1 would marshal into (using 1-indexed name suffixes for array element indices):

    A   , B     , C_1 , C_2 , C_3
    123 , "foo" , 1   , 2   , 3
    

    The support for "inlined" arrays would preferably be implemented in an orthogonal way such that it naturally supports arrays with struct element types.

    One such example would be:

    type T3 struct {
       A int
       B string
       C [3]Y `csv:"C_,inline"`
    }
    
    type Y struct {
       D int
       E string
    }
    
    var data3 = T3{
       A: 123,
       B: "foo",
       C: [3]Y{
          Y{D: 1, E: "bar"},
          Y{D: 2, E: "baz"},
          Y{D: 3, E: "qux"},
       },
    }
    

    Given the example above, data3 would marshal into:

    A   , B     , C_D1 , C_E1 , C_D2 , C_E2 , C_D3 , C_E3
    123 , "foo" , 1   , "bar" , 2   , "baz" , 3   , "qux"
    

    Any thoughts or ideas? I'd be happy to bounce ideas and discuss any unforeseen issues with the proposal.

    Wish you happy coding and a most lovely Autumn.

    Cheerful regards, Robin

    feature request 
    opened by mewmew 6
  • Support TextMarshaler and TextUnmarshaler interfaces

    Go types being serialized often implement the TextMarshaler and TextUnmarshaler interfaces. These can be leveraged the same way as Stringer to provide the values needed to generate a CSV without having to register your type or implement MarshalCSV and UnmarshalCSV, especially when types come from third-party libraries that you cannot change.

    Suggested order: MarshalCSV() -> MarshalText() -> String()

    Example

    opened by cbelsole 6
  • Question - Optional Fields

    Hi Jacek, I was wondering if it's possible to Unmarshal the following type of struct(s) using csvutil?

    type Device struct {
    	DeviceID       string          `json:"device-id" yaml:"device-id"`
    	Host           string          `json:"host"`
    	SystemID       string          `json:"system-id,omitempty" yaml:"system-id,omitempty"`
    	Authentication *Authentication `json:"authentication,omitempty" yaml:"authentication,omitempty"`
    	IAgent         *IAgent         `json:"iAgent,omitempty" yaml:"iAgent,omitempty"`
    	OpenConfig     *OpenConfig     `json:"open-config,omitempty" yaml:"open-config,omitempty"`
    	Snmp           *Snmp           `json:"snmp,omitempty" yaml:"snmp,omitempty"`
    	Vendor         *Vendor         `json:"vendor,omitempty" yaml:"vendor,omitempty"`
    }
    

    Where I'm using pointers to indicate optional content when unmarshaling from a variety of file types.

    Thanks in advance, Damian.

    question 
    opened by damianoneill 6
  • Suggestion for data normalization

    Hi

    I've done some patching to improve the performance of data normalization. It spawned from the fact that I need to unmarshal dates from other formats, and I found it strange that when I normalize a field I have to convert it back to a string, so I added a way to do it directly.

    One thing I don't know is whether this would mess with your cached part.

    When benchmarking with the patch, the allocs/op are around 3 times lower after the 100 records mark.

    BenchmarkUnmarshal/csvutildk.Unmarshal/1_record-8         	  200000	     11239 ns/op	    7902 B/op	      41 allocs/op
    BenchmarkUnmarshal/csvutildk.Unmarshal/10_records-8       	   50000	     25840 ns/op	   15054 B/op	      72 allocs/op
    BenchmarkUnmarshal/csvutildk.Unmarshal/100_records-8      	   10000	    168801 ns/op	   76848 B/op	     345 allocs/op
    BenchmarkUnmarshal/csvutildk.Unmarshal/1000_records-8     	    1000	   1628027 ns/op	  637536 B/op	    3048 allocs/op
    BenchmarkUnmarshal/csvutildk.Unmarshal/10000_records-8    	      50	  20540432 ns/op	11187253 B/op	   30059 allocs/op
    BenchmarkUnmarshal/csvutildk.Unmarshal/100000_records-8   	       5	 207199960 ns/op	113993987 B/op	  300072 allocs/op
    BenchmarkUnmarshal/csvutil.Unmarshal/1_record-8           	  100000	     13039 ns/op	    8238 B/op	      53 allocs/op
    BenchmarkUnmarshal/csvutil.Unmarshal/10_records-8         	   30000	     40134 ns/op	   17118 B/op	     138 allocs/op
    BenchmarkUnmarshal/csvutil.Unmarshal/100_records-8        	    5000	    315599 ns/op	   96193 B/op	     951 allocs/op
    BenchmarkUnmarshal/csvutil.Unmarshal/1000_records-8       	     500	   3046062 ns/op	  829794 B/op	    9054 allocs/op
    BenchmarkUnmarshal/csvutil.Unmarshal/10000_records-8      	      50	  33160620 ns/op	13107352 B/op	   90065 allocs/op
    BenchmarkUnmarshal/csvutil.Unmarshal/100000_records-8     	       3	 352344833 ns/op	135428960 B/op	  900080 allocs/op
    

    Diff: https://github.com/KalleDK/csvutil/commit/787aaf6cc96fb03839d704366d25d528ebfc6cf6 Test: https://gist.github.com/KalleDK/6616d6634a787fa31cfff02c47110f2c

    feature request 
    opened by KalleDK 6
  • Read csv with an unknown header and a different number of fields on each line

    I have a file like this one:

    2021-07-14T17:49:48.837,FE, 6, 0, 0, D,FF,BE,    42,mavlink_request_data_stream_t,req_message_rate,4,target_system,0,target_component,0,req_stream_id,3,start_stop,1,,sig ,Len,14,crc16,15714
    2021-07-14T17:49:48.869,FE,1C, 0, 0,F0, 1, 1,    1E,mavlink_attitude_t,time_boot_ms,114029,roll,0.04656032,pitch,0.0197014,yaw,2.916162,rollspeed,0.0042041,pitchspeed,-0.00257009,yawspeed,-0.001730594,,sig ,Len,36,crc16,34405
    2021-07-14T17:49:48.869,FE,1C, 0, 0,F1, 1, 1,    21,mavlink_global_position_int_t,time_boot_ms,114029,lat,357658398,lon,1274153234,alt,375210,relative_alt,17,vx,-27,vy,-28,vz,12,hdg,16708,,sig ,Len,36,crc16,35248
    2021-07-14T17:49:48.874,FE,1F, 0, 0,F2, 1, 1,     1,mavlink_sys_status_t,onboard_control_sensors_present,325188655,onboard_control_sensors_enabled,308411439,onboard_control_sensors_health,326237231,load,160,voltage_battery,23569,current_battery,0,drop_rate_comm,0,errors_comm,0,errors_count1,0,errors_count2,0,errors_count3,0,errors_count4,0,battery_remaining,99,,sig ,Len,39,crc16,10226
    2021-07-14T17:49:48.875,FE, 6, 0, 0,F3, 1, 1,    7D,mavlink_power_status_t,Vcc,5112,Vservo,2,flags,3,,sig ,Len,14,crc16,32691
    2021-07-14T17:49:48.876,FE, 4, 0, 0,F4, 1, 1,    98,mavlink_meminfo_t,brkval,0,freemem,65535,freemem32,0,,sig ,Len,12,crc16,44665
    2021-07-14T17:49:48.877,FE,1A, 0, 0,F5, 1, 1,    3E,mavlink_nav_controller_output_t,nav_roll,2.662952,nav_pitch,1.120603,alt_error,0,aspd_error,0,xtrack_error,0,nav_bearing,167,target_bearing,0,wp_dist,0,,sig ,Len,34,crc16,42532
    2021-07-14T17:49:48.877,FE, 2, 0, 0,F6, 1, 1,    2A,mavlink_mission_current_t,seq,0,,sig ,Len,10,crc16,15872
    2021-07-14T17:49:48.878,FE,14, 0, 0,F7, 1, 1,    4A,mavlink_vfr_hud_t,airspeed,0.006447488,groundspeed,0.3970219,alt,375.21,climb,-0.1277524,heading,167,throttle,0,,sig ,Len,28,crc16,63070
    2021-07-14T17:49:48.878,FE,15, 0, 0,F8, 1, 1,    24,mavlink_servo_output_raw_t,time_usec,114029578,servo1_raw,982,servo2_raw,982,servo3_raw,982,servo4_raw,982,servo5_raw,982,servo6_raw,982,servo7_raw,0,servo8_raw,0,port,0,servo9_raw,0,servo10_raw,0,servo11_raw,0,servo12_raw,0,servo13_raw,0,servo14_raw,0,servo15_raw,0,servo16_raw,0,,sig ,Len,29,crc16,10946
    2021-07-14T17:49:48.879,FE,2A, 0, 0,F9, 1, 1,    41,mavlink_rc_channels_t,time_boot_ms,114029,chan1_raw,1494,chan2_raw,1492,chan3_raw,982,chan4_raw,1494,chan5_raw,1299,chan6_raw,1488,chan7_raw,1493,chan8_raw,1486,chan9_raw,1494,chan10_raw,1494,chan11_raw,1494,chan12_raw,1494,chan13_raw,1494,chan14_raw,1494,chan15_raw,1494,chan16_raw,1494,chan17_raw,0,chan18_raw,0,chancount,16,rssi,0,,sig ,Len,50,crc16,42490
    2021-07-14T17:49:48.879,FE,16, 0, 0,FA, 1, 1,    23,mavlink_rc_channels_raw_t,time_boot_ms,114029,chan1_raw,1494,chan2_raw,1492,chan3_raw,982,chan4_raw,1494,chan5_raw,1299,chan6_raw,1488,chan7_raw,1493,chan8_raw,1486,port,0,rssi,0,,sig ,Len,30,crc16,50926
    2021-07-14T17:49:48.879,FE,1A, 0, 0,FB, 1, 1,    1B,mavlink_raw_imu_t,time_usec,114029629,xacc,48,yacc,-31,zacc,-997,xgyro,15,ygyro,-7,zgyro,-1,xmag,-342,ymag,-1,zmag,337,id,0,temperature,0,,sig ,Len,34,crc16,61209
    2021-07-14T17:49:48.880,FE,16, 0, 0,FC, 1, 1,    74,mavlink_scaled_imu2_t,time_boot_ms,114029,xacc,40,yacc,-30,zacc,-997,xgyro,0,ygyro,-2,zgyro,-1,xmag,-302,ymag,-18,zmag,341,temperature,0,,sig ,Len,30,crc16,27531
    2021-07-14T17:49:48.880,FE, E, 0, 0,FD, 1, 1,    1D,mavlink_scaled_pressure_t,time_boot_ms,114029,press_abs,970.946,press_diff,0,temperature,3822,temperature_press_diff,0,,sig ,Len,22,crc16,12037
    2021-07-14T17:49:48.881,FE,1E, 0, 0,FE, 1, 1,    18,mavlink_gps_raw_int_t,time_usec,113820000,lat,357658394,lon,1274153298,alt,381980,eph,86,epv,132,vel,0,cog,25646,fix_type,3,satellites_visible,14,alt_ellipsoid,0,h_acc,0,v_acc,0,vel_acc,0,hdg_acc,0,yaw,0,,sig ,Len,38,crc16,56474
    2021-07-14T17:49:48.882,FE,23, 0, 0,FF, 1, 1,    7C,mavlink_gps2_raw_t,time_usec,113820000,lat,357658428,lon,1274153213,alt,383180,dgps_age,0,eph,86,epv,132,vel,2,cog,31181,fix_type,3,satellites_visible,14,dgps_numch,0,yaw,0,alt_ellipsoid,0,h_acc,0,v_acc,0,vel_acc,0,hdg_acc,0,,sig ,Len,43,crc16,20400
    2021-07-14T17:49:48.882,FE, C, 0, 0, 0, 1, 1,     2,mavlink_system_time_t,time_unix_usec,1626252579757269,time_boot_ms,114032,,sig ,Len,20,crc16,39761
    2021-07-14T17:49:48.883,FE,1C, 0, 0, 1, 1, 1,    A3,mavlink_ahrs_t,omegaIx,-0.01148541,omegaIy,0.005339885,omegaIz,-0.0001667932,accel_weight,0,renorm_val,0,error_rp,0.003915358,error_yaw,0.003721569,,sig ,Len,36,crc16,12478
    2021-07-14T17:49:48.883,FE,18, 0, 0, 2, 1, 1,    B2,mavlink_ahrs2_t,roll,0.02967097,pitch,0.02159006,yaw,2.961208,altitude,0,lat,0,lng,0,,sig ,Len,32,crc16,6036
    2021-07-14T17:49:48.883,FE,28, 0, 0, 3, 1, 1,    B6,mavlink_ahrs3_t,roll,0.0465561,pitch,0.01969914,yaw,2.916158,altitude,375.21,lat,357658398,lng,1274153234,v1,0,v2,0,v3,0,v4,0,,sig ,Len,48,crc16,19848
    2021-07-14T17:49:48.883,FE, 3, 0, 0, 4, 1, 1,    A5,mavlink_hwstatus_t,Vcc,5111,I2Cerr,0,,sig ,Len,11,crc16,54392
    2021-07-14T17:49:48.883,FE,16, 0, 0, 5, 1, 1,    88,mavlink_terrain_report_t,lat,357658398,lon,1274153234,terrain_height,0,current_height,0,spacing,0,pending,504,loaded,112,,sig ,Len,30,crc16,63659
    2021-07-14T17:49:48.883,FE, E, 0, 0, 6, 1, 1,    9E,mavlink_mount_status_t,pointing_a,0,pointing_b,-70,pointing_c,-52,target_system,0,target_component,0,,sig ,Len,22,crc16,53991
    2021-07-14T17:49:48.884,FE,16, 0, 0, 7, 1, 1,    C1,mavlink_ekf_status_report_t,velocity_variance,0.08776879,pos_horiz_variance,0.02859282,pos_vert_variance,0.023573,compass_variance,0.02226898,terrain_alt_variance,0,flags,831,airspeed_variance,0,,sig ,Len,30,crc16,37206
    2021-07-14T17:49:48.884,FE,1C, 0, 0, 8, 1, 1,    20,mavlink_local_position_ned_t,time_boot_ms,114035,x,-0.5938861,y,0.5027716,z,-0.01835009,vx,-0.2745549,vy,-0.2851341,vz,0.1275563,,sig ,Len,36,crc16,35136
    2021-07-14T17:49:48.884,FE,20, 0, 0, 9, 1, 1,    F1,mavlink_vibration_t,time_usec,114035034,vibration_x,0.01895511,vibration_y,0.01639727,vibration_z,0.01650113,clipping_0,0,clipping_1,0,clipping_2,0,,sig ,Len,40,crc16,17895
    2021-07-14T17:49:48.884,FE,24, 0, 0, A, 1, 1,    93,mavlink_battery_status_t,current_consumed,0,energy_consumed,0,temperature,32767,voltages,,current_battery,0,id,0,battery_function,0,type,0,battery_remaining,99,time_remaining,0,charge_state,0,voltages_ext,,,sig ,Len,44,crc16,27062
    2021-07-14T17:49:48.885,FE,1C, 0, 0, B, 1, 1,    1E,mavlink_attitude_t,time_boot_ms,114279,roll,0.04652789,pitch,0.01981729,yaw,2.916214,rollspeed,0.004798831,pitchspeed,-0.002019421,yawspeed,-0.001487666,,sig ,Len,36,crc16,6788
    2021-07-14T17:49:48.885,FE,1C, 0, 0, C, 1, 1,    21,mavlink_global_position_int_t,time_boot_ms,114279,lat,357658397,lon,1274153234,alt,375220,relative_alt,21,vx,-27,vy,-28,vz,12,hdg,16708,,sig ,Len,36,crc16,47284
    2021-07-14T17:49:48.885,FE,1F, 0, 0, D, 1, 1,     1,mavlink_sys_status_t,onboard_control_sensors_present,325188655,onboard_control_sensors_enabled,308411439,onboard_control_sensors_health,326237231,load,155,voltage_battery,23572,current_battery,0,drop_rate_comm,0,errors_comm,0,errors_count1,0,errors_count2,0,errors_count3,0,errors_count4,0,battery_remaining,99,,sig ,Len,39,crc16,18819
    2021-07-14T17:49:48.885,FE, 6, 0, 0, E, 1, 1,    7D,mavlink_power_status_t,Vcc,5107,Vservo,0,flags,3,,sig ,Len,14,crc16,25606
    2021-07-14T17:49:48.885,FE, 4, 0, 0, F, 1, 1,    98,mavlink_meminfo_t,brkval,0,freemem,65535,freemem32,0,,sig ,Len,12,crc16,11693
    2021-07-14T17:49:48.885,FE,1A, 0, 0,10, 1, 1,    3E,mavlink_nav_controller_output_t,nav_roll,2.66055,nav_pitch,1.128154,alt_error,0,aspd_error,0,xtrack_error,0,nav_bearing,167,target_bearing,0,wp_dist,0,,sig ,Len,34,crc16,11749
    2021-07-14T17:49:48.885,FE, 2, 0, 0,11, 1, 1,    2A,mavlink_mission_current_t,seq,0,,sig ,Len,10,crc16,56192
    2021-07-14T17:49:48.885,FE,14, 0, 0,12, 1, 1,    4A,mavlink_vfr_hud_t,airspeed,0.009835538,groundspeed,0.3958453,alt,375.22,climb,-0.126612,heading,167,throttle,0,,sig ,Len,28,crc16,52521
    2021-07-14T17:49:48.885,FE,15, 0, 0,13, 1, 1,    24,mavlink_servo_output_raw_t,time_usec,114279570,servo1_raw,982,servo2_raw,982,servo3_raw,982,servo4_raw,982,servo5_raw,982,servo6_raw,982,servo7_raw,0,servo8_raw,0,port,0,servo9_raw,0,servo10_raw,0,servo11_raw,0,servo12_raw,0,servo13_raw,0,servo14_raw,0,servo15_raw,0,servo16_raw,0,,sig ,Len,29,crc16,26468
    2021-07-14T17:49:48.886,FE,2A, 0, 0,14, 1, 1,    41,mavlink_rc_channels_t,time_boot_ms,114279,chan1_raw,1494,chan2_raw,1492,chan3_raw,982,chan4_raw,1494,chan5_raw,1299,chan6_raw,1488,chan7_raw,1493,chan8_raw,1486,chan9_raw,1494,chan10_raw,1494,chan11_raw,1494,chan12_raw,1494,chan13_raw,1494,chan14_raw,1494,chan15_raw,1494,chan16_raw,1494,chan17_raw,0,chan18_raw,0,chancount,16,rssi,0,,sig ,Len,50,crc16,60861
    2021-07-14T17:49:48.886,FE,16, 0, 0,15, 1, 1,    23,mavlink_rc_channels_raw_t,time_boot_ms,114279,chan1_raw,1494,chan2_raw,1492,chan3_raw,982,chan4_raw,1494,chan5_raw,1299,chan6_raw,1488,chan7_raw,1493,chan8_raw,1486,port,0,rssi,0,,sig ,Len,30,crc16,14666
    2021-07-14T17:49:48.886,FE,1A, 0, 0,16, 1, 1,    1B,mavlink_raw_imu_t,time_usec,114279615,xacc,48,yacc,-29,zacc,-998,xgyro,16,ygyro,-7,zgyro,-1,xmag,-341,ymag,-1,zmag,338,id,0,temperature,0,,sig ,Len,34,crc16,28862
    2021-07-14T17:49:48.886,FE,16, 0, 0,17, 1, 1,    74,mavlink_scaled_imu2_t,time_boot_ms,114279,xacc,43,yacc,-29,zacc,-999,xgyro,-1,ygyro,-2,zgyro,0,xmag,-300,ymag,-23,zmag,341,temperature,0,,sig ,Len,30,crc16,65170
    2021-07-14T17:49:48.886,FE, E, 0, 0,18, 1, 1,    1D,mavlink_scaled_pressure_t,time_boot_ms,114279,press_abs,970.9694,press_diff,0,temperature,3822,temperature_press_diff,0,,sig ,Len,22,crc16,37211
    2021-07-14T17:49:48.886,FE,1E, 0, 0,19, 1, 1,    18,mavlink_gps_raw_int_t,time_usec,114220000,lat,357658391,lon,1274153298,alt,381980,eph,86,epv,132,vel,2,cog,25646,fix_type,3,satellites_visible,14,alt_ellipsoid,0,h_acc,0,v_acc,0,vel_acc,0,hdg_acc,0,yaw,0,,sig ,Len,38,crc16,36449
    2021-07-14T17:49:48.886,FE,23, 0, 0,1A, 1, 1,    7C,mavlink_gps2_raw_t,time_usec,114220000,lat,357658426,lon,1274153210,alt,383150,dgps_age,0,eph,86,epv,132,vel,3,cog,31181,fix_type,3,satellites_visible,14,dgps_numch,0,yaw,0,alt_ellipsoid,0,h_acc,0,v_acc,0,vel_acc,0,hdg_acc,0,,sig ,Len,43,crc16,38274
    2021-07-14T17:49:48.886,FE, C, 0, 0,1B, 1, 1,     2,mavlink_system_time_t,time_unix_usec,1626252580007199,time_boot_ms,114281,,sig ,Len,20,crc16,28243
    2021-07-14T17:49:48.886,FE,1C, 0, 0,1C, 1, 1,    A3,mavlink_ahrs_t,omegaIx,-0.01150851,omegaIy,0.005350895,omegaIz,-0.0001637726,accel_weight,0,renorm_val,0,error_rp,0.003739818,error_yaw,0.003369438,,sig ,Len,36,crc16,53172
    2021-07-14T17:49:48.886,FE,18, 0, 0,1D, 1, 1,    B2,mavlink_ahrs2_t,roll,0.02958553,pitch,0.0215758,yaw,2.961349,altitude,0,lat,0,lng,0,,sig ,Len,32,crc16,64559
    2021-07-14T17:49:48.886,FE,28, 0, 0,1E, 1, 1,    B6,mavlink_ahrs3_t,roll,0.04652454,pitch,0.01981941,yaw,2.916214,altitude,375.22,lat,357658397,lng,1274153234,v1,0,v2,0,v3,0,v4,0,,sig ,Len,48,crc16,11276
    2021-07-14T17:49:48.887,FE, 3, 0, 0,1F, 1, 1,    A5,mavlink_hwstatus_t,Vcc,5107,I2Cerr,0,,sig ,Len,11,crc16,23168
    

    The structure of the file has a clear pattern. For example, in this line: 2021-07-14T17:49:48.883,FE, 3, 0, 0, 4, 1, 1,    A5,mavlink_hwstatus_t,Vcc,5111,I2Cerr,0,,sig ,Len,11,crc16,54392 the first field is the date, which we will call GPSTime. It is followed by seven hex fields whose meaning I don't know and don't want to capture; let's name them Str02 through Str08. Then comes the MessageID, a hex value padded with four or five leading spaces, then the MessageName, and finally the MessageData, which is what should be parsed: Vcc,5111,I2Cerr,0,,sig ,Len,11,crc16,54392
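    For reference, a record with this layout can be split by hand with only the standard library. This is a minimal sketch: the names GPSTime, MessageID and MessageName and the skipped-fields rule come from the description above; everything else is illustrative.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseLine splits one record into the parts described above:
// field 0 is GPSTime, fields 1-7 are skipped, field 8 is MessageID,
// field 9 is MessageName, and the remainder is key,value pairs.
func parseLine(fields []string) (gpsTime, msgID, msgName string, data map[string]string) {
	gpsTime = strings.TrimSpace(fields[0])
	msgID = strings.TrimSpace(fields[8])
	msgName = strings.TrimSpace(fields[9])
	data = map[string]string{}
	rest := fields[10:]
	for i := 0; i+1 < len(rest); i += 2 {
		k := strings.TrimSpace(rest[i])
		if k == "" {
			continue // skip the empty separator column that precedes "sig"
		}
		data[k] = strings.TrimSpace(rest[i+1])
	}
	return gpsTime, msgID, msgName, data
}

func main() {
	line := `2021-07-14T17:49:48.883,FE, 3, 0, 0, 4, 1, 1,    A5,mavlink_hwstatus_t,Vcc,5111,I2Cerr,0,,sig ,Len,11,crc16,54392`
	r := csv.NewReader(strings.NewReader(line))
	r.FieldsPerRecord = -1 // records in this log vary in length
	fields, err := r.Read()
	if err != nil {
		panic(err)
	}
	gpsTime, id, name, data := parseLine(fields)
	fmt.Println(gpsTime, id, name, data["Vcc"], data["I2Cerr"])
}
```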

    This is the struct I made

    type RawCSVData struct {
    	GPSTime     string            `json:"gps_time" csv:"gps_time"`
    	Str02       string            `json:"str_02" csv:"str_02"`
    	Str03       string            `json:"str_03" csv:"str_03"`
    	Str04       string            `json:"str_04" csv:"str_04"`
    	Str05       string            `json:"str_05" csv:"str_05"`
    	Str06       string            `json:"str_06" csv:"str_06"`
    	Str07       string            `json:"str_07" csv:"str_07"`
    	Str08       string            `json:"str_08" csv:"str_08"`
    	MessageID   string            `json:"msg_id" csv:"msg_id"`
    	MessageName string            `json:"msg_name" csv:"msg_name"`
    	MessageData map[string]string `json:"-" csv:"-"`
    }
    

    I followed your example:

    var csvHeader []string
    
    func init() {
    	h, err := csvutil.Header(RawCSVData{}, "csv")
    	if err != nil {
    		log.Fatal(err)
    	}
    	csvHeader = h
    }
    
    func main() {
    	input := []byte(`
    2021-07-14T17:49:48.883,FE, 3, 0, 0, 4, 1, 1,    A5,mavlink_hwstatus_t,Vcc,5111,I2Cerr,0,,sig ,Len,11,crc16,54392
    2021-07-14T17:49:48.885,FE, 6, 0, 0, E, 1, 1,    7D,mavlink_power_status_t,Vcc,5107,Vservo,0,flags,3,,sig ,Len,14,crc16,25606`)
    
    	r := csv.NewReader(bytes.NewReader(input))
    
    	dec, err := csvutil.NewDecoder(r, csvHeader...)
    	if err != nil {
    		log.Fatal(err)
    	}
    
    	var records []RawCSVData
    	for {
    		var u RawCSVData
    
    		if err := dec.Decode(&u); err == io.EOF {
    			break
    		} else if err != nil {
    			log.Fatal(err)
    		}
    
    		for _, i := range dec.Unused() {
    			fmt.Println(i)
    		}
    
    		records = append(records, u)
    	}
    
    	fmt.Printf("%+v", records)
    }
    

    When I run this code, I get this error:

    2021/10/20 14:09:52 wrong number of fields in record
    

    Is there any way to process this kind of CSV file using this library? There are three features I need in order to process this file:

    • Process each line based on the RawCSVData struct: that is, the first 10 fields or columns.
    • Columns not defined in the struct should be omitted, or surfaced through dec.Unused() so we can process them as needed.
    • A tag to omit consecutive columns, something like a regex-style omit{7}. This case is a simple struct of 15 fields, but imagine a file with 100 columns where you want to omit 90 of them: you would need to define a struct with 100 fields and put csv:"-" on 90 of them, which I think is unreasonable. It would also be nice to drop data that is not defined in the struct, for example in a file with rows like this:
    1478,2021-08-25 13:10:07.643,POS,38829326,35.7299019,127.353324,296.3,-0.2403892,-0.3103892,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
    

    I also got 2021/10/20 14:09:52 wrong number of fields in record. In the end I created a new file and deleted all the ,,,,,,,,,,,,, sequences, but this is not good practice; that kind of normalization might compromise the data.

    question 
    opened by teocci 5
  • Question: not possible to not write an entire column if all values empty?

    Question: not possible to not write an entire column if all values empty?

    Hello, thanks for providing this useful csv reader/writer.

    Is there a tag I can use to indicate that I don't want a column written if all its values are empty?

    I tested out "omitempty", and with or without the tag it doesn't affect anything when marshaling.
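    Assuming no tag provides this, one workaround is to post-process the rows before writing and drop any column that is empty in every record. A stdlib-only sketch of that filtering step (the header and row values are made up for illustration):

```go
package main

import (
	"encoding/csv"
	"os"
)

// dropEmptyColumns removes every column whose value is "" in all
// data rows. rows[0] is assumed to be the header.
func dropEmptyColumns(rows [][]string) [][]string {
	if len(rows) < 2 {
		return rows
	}
	keep := make([]bool, len(rows[0]))
	for _, row := range rows[1:] {
		for i, v := range row {
			if v != "" {
				keep[i] = true
			}
		}
	}
	out := make([][]string, 0, len(rows))
	for _, row := range rows {
		var filtered []string
		for i, v := range row {
			if keep[i] {
				filtered = append(filtered, v)
			}
		}
		out = append(out, filtered)
	}
	return out
}

func main() {
	rows := [][]string{
		{"name", "nickname", "age"},
		{"alice", "", "30"},
		{"bob", "", "25"},
	}
	w := csv.NewWriter(os.Stdout)
	// The all-empty "nickname" column is dropped before writing.
	if err := w.WriteAll(dropEmptyColumns(rows)); err != nil {
		panic(err)
	}
}
```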

    question 
    opened by foob26uk 5
  • Decode nil into pointers for blank fields

    Decode nil into pointers for blank fields

    The encoder encodes nil values to blank, but later decoding those values will result in an error unless omitempty is used. Adjust decoding to interpret blank values as nil for pointer fields, like the encoder.

    Description

    A pointer may be used for a struct field when you want both nil and the zero value of a field to have meaning. For example, in

    type UserInput struct {
      Agree *bool  `csv:"bool"`
      Other string `csv:"other"`
    }
    

    Agree's true and false values mean the user has input agreement, while nil may mean agreement has not yet been provided.

    As the decoder is currently written, blank values are not allowed in decoding unless omitempty is specified on a field, in which case the zero values aren't marshaled.

    For example:

    type UserInput struct {
    	Agree *bool  `csv:"bool,omitempty"`
    	Other string `csv:"other,omitempty"`
    }
    
    func main() {
    
    	b := false
    
    	exp := &UserInput{
    		Agree: &b,
    	}
    
    	buf := bytes.NewBuffer(nil)
    	w := csv.NewWriter(buf)
    	enc := csvutil.NewEncoder(w)
    
    	if err := enc.Encode(exp); err != nil {
    		panic(err)
    	}
    
    	w.Flush()
    
    	fmt.Println(buf.String())
    
    	tst := &UserInput{}
    
    	r := csv.NewReader(buf)
    	dec, err := csvutil.NewDecoder(r)
    	if err != nil {
    		panic(err)
    	}
    
    	if err := dec.Decode(tst); err != nil {
    		panic(err)
    	}
    
    	if reflect.DeepEqual(tst.Agree, exp.Agree) {
    		fmt.Println("equal")
    	} else {
    		fmt.Println("not equal")
    	}
    }
    

    prints:

    bool,other
    ,
    
    not equal
    

    The omitted false value gets decoded back into Agree as nil, so it's not equal. To keep false as a valid value, remove omitempty, and you get:

    bool,other
    false,
    
    equal
    

    Now set Agree to nil instead of &b. That yields:

    bool,other
    ,
    
    panic: csvutil: cannot unmarshal  into Go value of type bool
    

    You can't use the zero value if omitempty is specified, but you can't decode blanks if it's not. Also, an unmarshal/parse error makes sense for a value field, but for a pointer field the expectation would be to get nil.

    These changes make that happen.

    There is an ambiguous case when decoding to a string pointer. For consistency, that will also now remain nil. If the zero value is desired, you can just use a value field.

    We've run into needing this change because we are using this to load data changes from delimited files, and we interpret no value in a field as "do not change," whereas false is interpreted as "set to false." It's no problem doing this with the JSON marshaler, because null and false are distinct, but here blank is either invalid or the same as false.

    By the way, this is a great package! Thanks for putting this together.

    Checklist

    • [x] Code compiles without errors
    • [x] Added new tests for the provided functionality
    • [x] All tests are passing
    • [x] Updated the README and/or documentation, if necessary (there is no section about pointer handling, so I didn't update anything)
    feature request 
    opened by andrewmostello 5
  • Header with multiple lines

    Header with multiple lines

    Hi,

    I have a CSV with a 2-line header: the first line holds labels, the second holds values. In my case the important value is 23548480455000. After that, I need to parse all the data rows (timestamp, value).

    Is it possible to do this?

    I'm trying it this way, but it doesn't seem to work:

    func parseCSV(filename string) []models.CsvMeasure {
    	csvFile, err := os.Open(filename)
    	if err != nil {
    		log.Fatal(err)
    	}
    	defer csvFile.Close()
    	csvReader := csv.NewReader(bufio.NewReader(csvFile))
    
    	dec, err := csvutil.NewDecoder(csvReader)
    	if err != nil {
    		log.Fatal("erreur", err)
    	}
    
    	header := dec.Header()
    	var measures []models.CsvMeasure
    	for {
    		measure := models.CsvMeasure{Header: make(map[string]string)}
    
    		if err := dec.Decode(&measure); err == io.EOF {
    			break
    		} else if err != nil {
    			log.Fatal(err)
    		}
    
    		for _, i := range dec.Unused() {
    			measure.Header[header[i]] = dec.Record()[i]
    		}
    		for key, value := range measure.Header {
    			fmt.Println("Key:", key, "Value:", value)
    		}
    		measures = append(measures, measure)
    	}
    
    	return measures
    }
    

    Here is original CSV

    Identifiant PRM | Type de donnees | Date de debut | Date de fin | Grandeur physique | Grandeur metier | Etape metier | Unite | Pas en minutes
    -- | -- | -- | -- | -- | -- | -- | -- | --
    23548480455000 | Courbe de charge | 08/07/2019 | 09/08/2019 | Energie active | Consommation | Comptage Brut | W |  
    Horodate | Valeur |   |   |   |   |   |   |  
    2019-07-08T00:30:00+02:00 | 216 |   |   |   |   |   |   |  
    2019-07-08T01:00:00+02:00 | 256 |   |   |   |   |   |   |  
    2019-07-08T01:30:00+02:00 | 220 |   |   |   |   |   |   |  
    2019-07-08T02:00:00+02:00 | 230 |   |   |   |   |   |   |  
    2019-07-08T02:30:00+02:00 | 250 |   |   |   |   |   |   |  
    2019-07-08T03:00:00+02:00 | 146 |   |   |   |   |   |   |  
    2019-07-08T03:30:00+02:00 | 140 |   |   |   |   |   |   |  
    2019-07-08T04:00:00+02:00 | 172 |   |   |   |   |   |   |  
    2019-07-08T04:30:00+02:00 | 128 |   |   |   |   |   |   |  
    2019-07-08T05:00:00+02:00 | 134 |   |   |   |   |   |   |  
    2019-07-08T05:30:00+02:00 | 116 |   |   |   |   |   |   |  
    2019-07-08T06:00:00+02:00 | 106 |   |   |   |   |   |   |  
    

    Can I manage this with csvutil?

    question 
    opened by xoco70 5
  • fix: nested prefix order with inline

    fix: nested prefix order with inline

    Description

    Fix nested prefix order.

    	type Owner struct {
    		Name string `csv:"name"`
    	}
    
    	type Address struct {
    		Owner  Owner  `csv:"owner_,inline"`
    	}
    
    	type User struct {
    		Address     Address `csv:"address,inline"`
    	}
    

    got owner_address_name; want address_owner_name

    Checklist

    • [x] Code compiles without errors
    • [x] Added new tests for the provided functionality
    • [x] All tests are passing
    • [x] Updated the README and/or documentation, if necessary
    opened by hori-ryota 0
Releases(v1.7.0)
  • v1.7.0(May 27, 2022)

  • v1.6.0(Nov 22, 2021)

    Highlights

    • Add more context info when error happens during Unmarshal
    • Add SetHeader to Encoder for header overrides
    • Add NormalizeHeader to Decoder for normalizing header/column names
    • Minimal Go version is now Go1.8
    Source code(tar.gz)
    Source code(zip)
  • v1.5.1(Aug 29, 2021)

  • v1.5.0(Feb 18, 2021)

    Highlights

    • Add Decoder option DisallowMissingColumns
    • Add MissingColumnsError type
    • MarshalerError now implements Unwrap method for errors package
    Source code(tar.gz)
    Source code(zip)
  • v1.4.0(Jul 24, 2020)

    Highlights

    • Added Encoder.Register method
    • Added Decoder.Register method
    • Value receiver Unmarshalers/TextUnmarshalers are now properly called if they are under interface value
    • Pointer receiver Marshalers/TextMarshalers are now properly called if they are under interface value
    Source code(tar.gz)
    Source code(zip)
  • v1.3.0(Feb 26, 2020)

    Highlights

    • Added inline tag
    • Decoder now sets pointer fields to nil on blank values
    • Encode and Decode now accept slices and arrays
    • Marshal and Unmarshal now accept arrays
    • Fixed omitempty for initialized pointers with default values
    • Fixed omitempty for initialized interface with default values
    • Improved UnmarshalTypeError error message
    Source code(tar.gz)
    Source code(zip)
  • v1.2.2(Jan 20, 2020)

  • v1.2.1(Sep 9, 2018)

    Highlights

    • Fixed panic on encoding interface fields that contain pointer values
    • Added tests for potential data races on cached resources
    • Updated travis and appveyor to run with Go1.11
    Source code(tar.gz)
    Source code(zip)
  • v1.2.0(Jul 31, 2018)

    Highlights

    • Added support for older Go versions (minimum version is Go1.7)
    • Added Decoder.Map for data normalization (example)
    • Decoder can now properly handle interface values that are initialized pointers - it decodes data into these values instead of creating a string (example)
    • Added go.mod file
    • Fixed the issue where Header and EncoderHeader were not recognizing the type properly if the value was wrapped in additional interfaces
    • Improved internal code
    • Improved documentation
    Source code(tar.gz)
    Source code(zip)
  • v1.1.1(May 28, 2018)

  • v1.1.0(Apr 1, 2018)

Owner
Jacek Szwec