Arbitrary transformations of JSON in Golang

Overview

kazaam

Description

Kazaam was created with the goal of supporting easy and fast transformations of JSON data with Golang. This functionality provides us with an easy mechanism for taking intermediate JSON message representations and transforming them to formats required by arbitrary third-party APIs.

Inspired by Jolt, Kazaam supports JSON to JSON transformation via a transform "specification" also defined in JSON. A specification consists of one or more "operations". See Specification Support, below, for more details.

Documentation

API Documentation is available at http://godoc.org/gopkg.in/qntfy/kazaam.v3.

Features

Kazaam is primarily designed to be used as a library for transforming arbitrary JSON. It ships with nine built-in transform types, described below, which provide significant flexibility in reshaping JSON data.

Also included when you go get Kazaam is a binary, kazaam, that can be used for development and testing of new transform specifications.

Finally, Kazaam supports the implementation of custom transform types. We encourage and appreciate pull requests for new transform types so that they can be incorporated into the Kazaam distribution, but we understand that time constraints or licensing issues sometimes prevent this. See the API documentation for details on how to write and register custom transforms.

Due to performance considerations, Kazaam does not fully validate that input data is valid JSON. The IsJson() function is provided for convenience if this validation is needed, but calling it may significantly slow down use of Kazaam.
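
Where validation is wanted, it can be done once at the boundary, before any transform runs. The following is a minimal sketch that uses the standard library's json.Valid rather than Kazaam's own IsJson(), so it runs without the library installed; the helper name checkInput is hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// checkInput is a hypothetical helper: it rejects payloads that are not
// syntactically valid JSON before they are handed to a transform.
func checkInput(payload []byte) error {
	if !json.Valid(payload) {
		return errors.New("input is not valid JSON")
	}
	return nil
}

func main() {
	// A well-formed document passes the check.
	fmt.Println(checkInput([]byte(`{"doc": {"uid": 12345}}`)))
	// A truncated document (missing closing brace) is rejected.
	fmt.Println(checkInput([]byte(`{"doc": {"uid": 12345}`)))
}
```

Validating up front costs a full extra pass over the payload, which is the trade-off the paragraph above describes.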

Specification Support

Kazaam currently supports the following transforms:

  • shift
  • concat
  • coalesce
  • extract
  • timestamp
  • uuid
  • default
  • pass
  • delete

Shift

The shift transform is the current Kazaam workhorse, used for remapping fields. The specification supports jsonpath-esque JSON accesses and sets. Concretely,

{
  "operation": "shift",
  "spec": {
    "object.id": "doc.uid",
    "gid2": "doc.guid[1]",
    "allGuids": "doc.guidObjects[*].id"
  }
}

executed on a JSON message with format

{
  "doc": {
    "uid": 12345,
    "guid": ["guid0", "guid2", "guid4"],
    "guidObjects": [{"id": "guid0"}, {"id": "guid2"}, {"id": "guid4"}]
  },
  "top-level-key": null
}

would result in

{
  "object": {
    "id": 12345
  },
  "gid2": "guid2",
  "allGuids": ["guid0", "guid2", "guid4"]
}

The jsonpath implementation supports a few special cases:

  • Array accesses: Retrieve nth element from array
  • Array wildcarding: indexing an array with [*] will return every matching element in an array
  • Top-level object capture: Mapping $ into a field will nest the entire original object under the requested key
  • Array append/prepend and set: Append and prepend an array with [+] and [-]. Attempting to write an array element that does not exist results in null padding as needed to add that element at the specified index (useful with "inplace").

The shift transform also supports a "require" field. When set to true, Kazaam will throw an error if any of the paths in the source JSON are not present.

Finally, shift by default is destructive. For in-place operation, an optional "inplace" field may be set.

Concat

The concat transform allows the combination of fields and literal strings into a single string value.

{
    "operation": "concat",
    "spec": {
        "sources": [{
            "value": "TEST"
        }, {
            "path": "a.timestamp"
        }],
        "targetPath": "a.timestamp",
        "delim": ","
    }
}

executed on a JSON message with format

{
    "a": {
        "timestamp": 1481305274
    }
}

would result in

{
    "a": {
        "timestamp": "TEST,1481305274"
    }
}

Notes:

  • sources: list of items to combine (in the order listed)
    • literal values are specified via value
    • field values are specified via path (supports the same addressing as shift)
  • targetPath: where to place the resulting string
    • if this is an existing path, the result will replace the current value.
  • delim: Optional delimiter

The concat transform also supports a "require" field. When set to true, Kazaam will throw an error if any of the paths in the source JSON are not present.

Coalesce

A coalesce transform provides the ability to check multiple possible keys to find a desired value. The first matching key found of those provided is returned.

{
  "operation": "coalesce",
  "spec": {
    "firstObjectId": ["doc.guidObjects[0].uid", "doc.guidObjects[0].id"]
  }
}

executed on a json message with format

{
  "doc": {
    "uid": 12345,
    "guid": ["guid0", "guid2", "guid4"],
    "guidObjects": [{"id": "guid0"}, {"id": "guid2"}, {"id": "guid4"}]
  }
}

would result in

{
  "doc": {
    "uid": 12345,
    "guid": ["guid0", "guid2", "guid4"],
    "guidObjects": [{"id": "guid0"}, {"id": "guid2"}, {"id": "guid4"}]
  },
  "firstObjectId": "guid0"
}

Coalesce also supports an ignore array in the spec. If an otherwise matching key has a value in ignore, it is not considered a match. This is useful, e.g., for empty strings:

{
  "operation": "coalesce",
  "spec": {
    "ignore": [""],
    "firstObjectId": ["doc.guidObjects[0].uid", "doc.guidObjects[0].id"]
  }
}
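
The selection rule (first path that exists and whose value is not in ignore) can be sketched in a few lines. This is an illustration under stated assumptions, not Kazaam's code: the get, coalesce, and contains helpers are hypothetical, and get handles plain object keys only, with no array indexing.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// get resolves a dot-separated path of object keys. Array indexing is
// deliberately left out to keep the sketch short.
func get(doc map[string]interface{}, path string) (interface{}, bool) {
	var cur interface{} = doc
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false
		}
		if cur, ok = obj[key]; !ok {
			return nil, false
		}
	}
	return cur, true
}

// contains reports whether v equals any entry of list.
func contains(list []interface{}, v interface{}) bool {
	for _, item := range list {
		if reflect.DeepEqual(item, v) {
			return true
		}
	}
	return false
}

// coalesce returns the value at the first path that exists and whose value
// is not listed in ignore.
func coalesce(doc map[string]interface{}, paths []string, ignore []interface{}) (interface{}, bool) {
	for _, p := range paths {
		v, ok := get(doc, p)
		if !ok || contains(ignore, v) {
			continue
		}
		return v, true
	}
	return nil, false
}

func main() {
	doc := map[string]interface{}{
		"doc": map[string]interface{}{"uid": "", "id": "guid0"},
	}
	// "doc.missing" does not exist and "doc.uid" is an ignored empty string,
	// so the value of "doc.id" is selected.
	v, ok := coalesce(doc, []string{"doc.missing", "doc.uid", "doc.id"}, []interface{}{""})
	fmt.Println(v, ok)
}
```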

Extract

An extract transform provides the ability to select a sub-object and have kazaam return that sub-object as the top-level object. For example

{
  "operation": "extract",
  "spec": {
    "path": "doc.guidObjects[0].path.to.subobject"
  }
}

executed on a json message with format

{
  "doc": {
    "uid": 12345,
    "guid": ["guid0", "guid2", "guid4"],
    "guidObjects": [{"path": {"to": {"subobject": {"name": "the.subobject", "field": "field.in.subobject"}}}}, {"id": "guid2"}, {"id": "guid4"}]
  }
}

would result in

{
  "name": "the.subobject",
  "field": "field.in.subobject"
}

Timestamp

A timestamp transform parses and formats time strings using the golang syntax. Note: this operation is done in-place. If you want to preserve the original string(s), pair the transform with shift. This transform also supports the $now operator for inputFormat, which will set the current timestamp at the specified path, formatted according to the outputFormat. $unix is supported for both input and output formats as a Unix time, the number of seconds elapsed since January 1, 1970 UTC as an integer string.

{
  "operation": "timestamp",
  "spec": {
    "timestamp[0]": {
      "inputFormat": "Mon Jan _2 15:04:05 -0700 2006",
      "outputFormat": "2006-01-02T15:04:05-0700"
    },
    "nowTimestamp": {
      "inputFormat": "$now",
      "outputFormat": "2006-01-02T15:04:05-0700"
    },
    "epochTimestamp": {
      "inputFormat": "2006-01-02T15:04:05-0700",
      "outputFormat": "$unix"
    }
  }
}

executed on a json message with format

{
  "timestamp": [
    "Sat Jul 22 08:15:27 +0000 2017",
    "Sun Jul 23 08:15:27 +0000 2017",
    "Mon Jul 24 08:15:27 +0000 2017"
  ]
}

would result in

{
  "timestamp": [
    "2017-07-22T08:15:27+0000",
    "Sun Jul 23 08:15:27 +0000 2017",
    "Mon Jul 24 08:15:27 +0000 2017"
  ],
  "nowTimestamp": "2017-09-08T19:15:27+0000"
}
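
The inputFormat and outputFormat strings are Go reference-time layouts, as used by the standard time package. The sketch below reproduces the first conversion from the example with time.Parse and Time.Format. The reformat helper is hypothetical, and its handling of "$unix" follows the description in this section rather than Kazaam's source.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// reformat parses value using the Go layout in and renders it using the
// layout out. The special output "$unix" yields seconds since the Unix
// epoch as an integer string.
func reformat(value, in, out string) (string, error) {
	t, err := time.Parse(in, value)
	if err != nil {
		return "", err
	}
	if out == "$unix" {
		return strconv.FormatInt(t.Unix(), 10), nil
	}
	return t.Format(out), nil
}

func main() {
	const in = "Mon Jan _2 15:04:05 -0700 2006"
	const out = "2006-01-02T15:04:05-0700"

	s, err := reformat("Sat Jul 22 08:15:27 +0000 2017", in, out)
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // 2017-07-22T08:15:27+0000

	u, err := reformat(s, out, "$unix")
	if err != nil {
		panic(err)
	}
	fmt.Println(u) // 1500711327
}
```

Every layout is written as a rendering of the same reference instant, Mon Jan 2 15:04:05 -0700 2006, which is why the formats in the spec all contain those particular digits.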

UUID

A uuid transform generates a UUID based on the spec. It currently supports UUIDv3, UUIDv4, and UUIDv5.

Version 4 uses a very simple spec; only the version field is required:

{
    "operation": "uuid",
    "spec": {
        "doc.uuid": {
            "version": 4
        }
    }
}

executed on a json message with format

{
  "doc": {
    "author_id": 11122112,
    "document_id": 223323,
    "meta": {
      "id": 23
    }
  }
}

would result in

{
  "doc": {
    "author_id": 11122112,
    "document_id": 223323,
    "meta": {
      "id": 23
    },
    "uuid": "f03bacc1-f4e0-4371-a5c5-e8160d3d6c0c"
  }
}

UUIDv3 and UUIDv5 are a bit more complex. These require a namespace, which must itself be a valid UUID, and a set of paths; the UUID is generated from the values at those paths. If a path doesn't exist in the incoming document, its default field is used instead. Note that both of these fields must be strings. For the namespace field you can use one of the four predefined namespaces (DNS, URL, OID, or X500) or pass your own UUID.

{
   "operation":"uuid",
   "spec":{
      "doc.uuid":{
         "version":5,
         "namespace":"DNS",
         "names":[
            {"path":"doc.author_name", "default":"some string"},
            {"path":"doc.type", "default":"another string"}
         ]
      }
   }
}

executed on a json message with format

{
  "doc": {
    "author_name": "jason",
    "type": "secret-document",
    "document_id": 223323,
    "meta": {
      "id": 23
    }
  }
}

would result in

{
  "doc": {
    "author_name": "jason",
    "type": "secret-document",
    "document_id": 223323,
    "meta": {
      "id": 23
    },
    "uuid": "f03bacc1-f4e0-4371-a7c5-e8160d3d6c0c"
  }
}
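
Unlike version 4, name-based UUIDs are deterministic: the same namespace and name always yield the same UUID. The sketch below derives a version 5 UUID as RFC 4122 describes, using only the standard library. The uuidV5 function is hypothetical, and it hashes a single name; how Kazaam combines several names entries into one input is not specified here, so that part is left out.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// namespaceDNS is the predefined DNS namespace UUID from RFC 4122
// (6ba7b810-9dad-11d1-80b4-00c04fd430c8).
var namespaceDNS = [16]byte{
	0x6b, 0xa7, 0xb8, 0x10, 0x9d, 0xad, 0x11, 0xd1,
	0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8,
}

// uuidV5 derives a name-based UUID: SHA-1 over namespace followed by name,
// truncated to 16 bytes, with the version and variant bits overwritten.
func uuidV5(namespace [16]byte, name string) string {
	h := sha1.New()
	h.Write(namespace[:])
	h.Write([]byte(name))

	var u [16]byte
	copy(u[:], h.Sum(nil))      // keep the first 16 of the 20 hash bytes
	u[6] = (u[6] & 0x0f) | 0x50 // set the version field to 5
	u[8] = (u[8] & 0x3f) | 0x80 // set the RFC 4122 variant

	s := hex.EncodeToString(u[:])
	return s[0:8] + "-" + s[8:12] + "-" + s[12:16] + "-" + s[16:20] + "-" + s[20:32]
}

func main() {
	// The same inputs always produce the same UUID.
	fmt.Println(uuidV5(namespaceDNS, "python.org"))
	fmt.Println(uuidV5(namespaceDNS, "python.org"))
}
```

Version 3 follows the same scheme with MD5 in place of SHA-1 and a version field of 3.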

Default

A default transform provides the ability to set a key's value explicitly. For example

{
  "operation": "default",
  "spec": {
    "type": "message"
  }
}

would ensure that the output JSON message includes {"type": "message"}.

Delete

A delete transform provides the ability to delete keys in place.

{
  "operation": "delete",
  "spec": {
    "paths": ["doc.uid", "doc.guidObjects[1]"]
  }
}

executed on a json message with format

{
  "doc": {
    "uid": 12345,
    "guid": ["guid0", "guid2", "guid4"],
    "guidObjects": [{"id": "guid0"}, {"id": "guid2"}, {"id": "guid4"}]
  }
}

would result in

{
  "doc": {
    "guid": ["guid0", "guid2", "guid4"],
    "guidObjects": [{"id": "guid0"}, {"id": "guid4"}]
  }
}

Pass

A pass transform, as the name implies, passes the input data unchanged to the output. This is used internally when a null transform spec is specified, but may also be useful for testing.

Usage

To start, go get the versioned repository:

go get gopkg.in/qntfy/kazaam.v3

Using as an executable program

If you want to create an executable binary from this project, follow these steps (you'll need go installed and $GOPATH set):

go get gopkg.in/qntfy/kazaam.v3
cd $GOPATH/src/gopkg.in/qntfy/kazaam.v3/kazaam
go install

This will create an executable in $GOPATH/bin, as you would expect from the normal go install behavior.

Examples

See godoc examples.

Issues
  • added uuid transform

    Added UUID transforms. In the current iteration you define the same structure as a concat spec, and the transform tries to generate a UUID based on those fields. If those fields do not exist, it will generate a UUID using the satori/uuid library.

    Would love feedback on this implementation.

    opened by Nearhan 10
  • Help with transform spec

    Hi, thx for the great work. I'm struggling to define a spec to transform the first element (labels) of an array in an object.

    from: { "metadata": [ { "label": "Amount", "labels": [ { "en": "Amount", "ptBR": "Quantidade" } ] }, { "label": "Value", "labels": [ { "en": "Value", "ptBR": "Valor" } ] } ] }

    into

    { "metadata": [ { "label": "Amount", "labels": { "en": "Amount", "ptBR": "Quantidade" } }, { "label": "Value", "labels": { "en": "Value", "ptBR": "Valor" } } ] }

    Can someone please guide me on how to achieve this? Thx!

    question 
    opened by rodolfoag 7
  • different result of transformation each time on same content

    I have an example of a complex transformation where kazaam behaves inconsistently. I ran https://go.dev/play/p/yVCIkyphvNJ several times, and with the same ruleset and input data it sometimes drops sections, sometimes returns them to their correct positions, and then loses them again.

    I guess this is caused by random ordering somewhere inside the library; the only source I can think of is map iteration, whose order is undetermined.

    You can run the code several times yourself and see that the event.payload.order.delivery: { expectedTo:.., expectedFrom:...} section is either in place, partially in place with only one of its key-value pairs, or completely lost. All of these outcomes occur with the same static input.

    opened by mainpart 4
  • Fix tests

    Tests often fail due to the random order of JSON keys. This may be due to golang's randomized map iteration. All of the tests should be reviewed to ensure they account for this fact and that go test doesn't fail at random.

    opened by chilland 4
  • Add string concatenation operation

    It would be nice to be able to specify a string to be concatenated to the contents of a given field. This could potentially be extended to regex more generally.

    opened by chilland 4
  • Create an array from existing keys

    Hi,

    I am wondering if there's a way to create an array from existing keys using existing transform functions only ? something like

    converting the following JSON

    { 
       "key1": "field1",
       "key2": "field2",
       "key3": "field3"
    }
    
    

    to something like:

    { 
       "arr": ["field1", "field2", "field3"]
    }
    

    Another followup would be, Is it possible to create an array at top level ? for example, converting the above JSON to something like:

    ["field1", "field2", "field3"]

    Thanks

    opened by bhardwajdb 3
  • `Default` overwrites existing value

    The transformer name default to me implies that the key is set to the value only if no value is already set. Instead, as far as I can tell, right now it simply means "always set this value, no matter what".

    I'm writing my own transformer to override this behaviour, but I figured I'd report this issue, in case you agree with me that the current behaviour is unexpected, based on the name.

    opened by JeanMertz 3
  • Check parsing of long ints

    Ensure that json that is kazaam'd doesn't modify how long integers are handled.

    See: http://stackoverflow.com/questions/22343083/json-marshaling-with-long-numbers-in-golang-gives-floating-point-number https://go-review.googlesource.com/#/c/30371/

    JSON decoding may need to change to something along these lines:

            var src map[string]interface{}
            decoder := json.NewDecoder(bytes.NewReader(payload))
            decoder.UseNumber()
            err := decoder.Decode(&src)
    
    opened by yelskiy 3
  • Add timestamp operation

    It would be great if a timestamp field could be specified and a transformation applied to the string. For example you could transform a Unix timestamp to standard GMT. Probably could use the built-in golang date formatting as a template.

    opened by chilland 3
  • Documentation regarding RegisterTransform

    Using: go version go1.11.1 linux/amd64 in a known-working environment.

    On the page/section: https://godoc.org/gopkg.in/qntfy/kazaam.v3#Config.RegisterTransform

    The example block shown, when copied and used verbatim, produces:

    cannot use func literal (type func(*"gopkg.in/qntfy/kazaam.v3/transform".Config, []byte) ([]byte, error)) as type kazaam.TransformFunc in argument to kc.RegisterTransform
    
    opened by subcon42 2
  • README uses wrong path in Usage section

    Under the "Usage" section of the readme, the reader is instructed to cd into the Kazaam directory after running go get: cd $GOPATH/src/gopkg.in/qntfy.kazaam.v3/kazaam

    But this is the wrong path. The correct command with the correct path is: cd $GOPATH/src/gopkg.in/qntfy/kazaam.v3/kazaam

    opened by tuptaker 2
  • Merge nested objects in default transformation for same key

    Spec Config

    [
      {
        "operation": "default",
        "spec": {
          "data.amount": "200",
          "data.receipt": "transaction2",
          "data.currency": "INR",
          "data.notes": {
            "beta": "transaction2",
            "alpha": "test notes"
          },
          "data.notes.fund_account_id": "a_JMjPtaaaaaaaaaa"
        }
      }
    ]
    

    Input {}

    Expected Output

    {
      "data": {
        "amount": "200",
        "receipt": "transaction2",
        "currency": "INR",
        "notes": {
          "alpha": "test notes",
          "beta": "transaction2",
          "fund_account_id": "fa_JMjPtaaaaaaaaaa"
        }
      }
    }
    

    Facing intermittent merge failures for data.notes child objects. Sometimes we get output as expected and sometimes it's overriding the child object during merge and resolution for same key.

    Intermittent incorrect output

    {
      "data": {
        "amount": "200",
        "receipt": "transaction2",
        "currency": "INR",
        "notes": {
          "alpha": "test notes",
          "beta": "transaction2"
        }
      }
    }
    

    data.notes.fund_account_id is getting overwritten by data.notes, if it gets iterated first. This is possibly happening due to random order of map (*spec.Spec) while iterating in default.go.

    opened by riyaagrahari 0
  • Array inside Array shift issue

    shift.json

    [
    		{"operation": "shift", "over": "monitor.details", "spec": {"list[*].myname":"list[*].name", "id":"id"}}
    ]
    

    doc.json

    {
    		"monitor" :{
    			"name": "hello", 
    			"details" : [
    			{
    				"list": [
    					{
    					"name": "abc",
    					"id": "1"
    					},
    					{
    					"name": "xyz",
    					"id": "2"
    					}
    				] 
    				
    			}
    			]
    		}
    	}
    

    Expected Result:-

    {
    		"monitor" :{
    			"name": "hello", 
    			"details" : [
    			{
    				"list": [
    					{
    					"myname": "abc",
    					"id": "1"
    					},
    					{
    					"myname": "xyz",
    					"id": "2"
    					}
    				] 
    				
    			}
    			]
    		}
    	}
    

    But Actual result,

    {
    		"monitor" :{
    			"name": "hello", 
    			"details" : [
    			{
    				"list": [
    					{
    					"name": "abc",
    					"id": "1"
    					},
    					{
    					"name": "xyz",
    					"id": "2"
    					}
    				] 
    				
    			}
    			]
    		}
    	}
    

    Can you please correct my spec.json? Not sure, what I am missing.

    opened by srajesh-elisity 1
  • Ability to grab values from outside `over` context.

    I'm attempting to transform a document like this:

    {
      "first_name": "John",
      "last_name": "Doe",
      "children": [{ "first_name": "Jack" }, { "first_name": "Jane" }]
    }
    

    Into something like this:

    [
      {
        "first_name": "Jack",
        "last_name": "Doe"
      },
      {
        "first_name": "Jane",
        "last_name": "Doe"
      }
    ]
    

    This would take the children's first names and combine them with the parent's last name. I'm not sure if this can be done currently without some way of escaping the context created by using over, like this (borrowing from syntax I've seen in JSONata):

    {
      "operation": "shift",
      "spec": {
        "first_name": "first_name",
        "last_name": "$$.last_name"
      },
      "over": "children"
    }
    

    This would be a nice feature to have, unless some other solution is possible.

    opened by jmcnevin 1
  • Shift  Spec for Root Array

    Is it possible to use shift when the root is an array?

    Given the json

    [{
      "firstName": "John",
      "lastName" : "doe",
      "age"      : 26,
      "address"  : {
    	"streetAddress": "naist street",
    	"city"         : "Nara",
    	"postalCode"   : "630-0192"
    },
    "phoneNumbers": [
    	{
    		"type"  : "iPhone",
    		"number": "0123-4567-8888"
    	},
    	{
    		"type"  : "home",
    		"number": "0123-4567-8910"
    	}
    ]
    }]
    

    And the spec

    [{"operation":"shift","spec":{"[*].streetAddress":"[*].address.streetAddress"},"require":false}]
    

    The result is

    {"[*]":{"streetAddress":null}}
    

    It should be

    [{
     "streetAddress":  "naist street"
    }]
    
    opened by austinarbor 2
  • Issues #101, #102, #103, #104

    This pull request resolves the following issues:

      • #101 timestamp documentation is not correct: corrected.
      • #102 timestamp $unix as inputFormat does not work with an integer value: integer values are now accepted by the transform.
      • #103 timestamp $unix as output format creates a quoted integer value: $unix (and $unixext) now produce a JSON integer as the output value.
      • #104 Feature: timestamp operation $unix with millisecond support: introduces the $unixext format as an input and output parameter for processing Unix timestamps as milliseconds since the epoch. Tests for $unixext are added as well.

    opened by willie68 0
Releases(v4.0.1)
Owner
Qntfy
Empowering mental health with data.