A CUE-based framework for portable, evolvable schema

Overview

Scuemata

Scuemata is a system for writing schemas. Like JSON Schema or OpenAPI, it is general-purpose, and most obviously useful as an IDL.

Unlike JSON Schema, OpenAPI, or any other extant schema system, Scuemata's chief focus is on the evolution of schema. Rather than "one file/logical structure, one schema," Scuemata is "one file/logical structure, all the schema for a given kind of object, and logic for translating between them."

The effect of encapsulating schema definition, evolution, and translation into a single, portable, machine-verifiable logical structure is transformative. Taken together, these pieces allow systems that rely on schemas as the contracts for their communication to decouple and evolve independently - even across breaking changes to those schema.

Learn more in our (TODO) docs, or in this overview video!

Maturity

Scuemata is in early adolescence: it's mostly formed, but there are still some crucial undeveloped parts. Specifically, there are two planned changes that we are almost certain will cause breakages for users of scuemata:

Once these changes are finalized, however, we aim to treat the CUE and Go APIs as stable, scrupulously avoiding any breaking changes.

Comments
  • testing framework

    Hello,

    In my research about Scuemata, I have not seen a test framework being mentioned.

    It would be great to be able to provide some data and validate that schema migrations and lacunas are generated accordingly. Instead of everyone implementing it themselves, an opinionated framework built into scuemata would be useful.

    opened by roidelapluie 4
  • Convert `thema.Library` to `*thema.Runtime`

    thema.Library should be renamed to thema.Context. Its methods should also be converted to requiring a pointer, thereby requiring all existing function signatures to switch to *thema.Context.

    I originally named it Library because i thought this was a nice nod to how we basically load a bunch of CUE "functions" in from the thema CUE package, then call them from the various Go methods within thema. And i wanted it to be a value, not a pointer, because i was hoping to make the zero value useful, similar to e.g. sync.Mutex.

    However:

    • Library contains a cue.Context, and shuffling that thing around has ended up feeling like the main job of Library when actually using the Thema go package.
    • I'm not thrilled with calling it thema.Context because it sorta muddies the water by not having cancellation capabilities. But my real objection there is with how Go stdlib already muddied those waters by conflating variable bags together with cancellation. Asking people to learn a new term, Library, is a heavy lift for just resolving that ambiguity. Better to just fall in line with CUE itself and use thema.Context.
    • Implementing thema within Grafana has made it pretty clear that allowing an explicit nil is preferable for centralizing use of a single shared context. Adding nil to the accepted value space of a function signature allows the caller to make it clear that they are not specifying a value, and are expecting the func impl to grab the central one instead. This could be accomplished without pointers by passing the zero value - thema.Library{} - but that feels like undocumented magic; expressing the absence of a value is, unfortunately, one of the main use cases for nil pointers in Go.

    Thus: thema.Library -> *thema.Context.
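The nil-means-shared convention argued for above can be sketched roughly as follows. This is a minimal illustration, not the thema API: the `Context` struct, the `shared` variable, and the `resolve` helper are all hypothetical names, and the real type would wrap a cue.Context.

```go
package main

import "fmt"

// Context is a hypothetical stand-in for the proposed *thema.Context.
// The real type would carry a cue.Context; a name is enough for this sketch.
type Context struct {
	name string
}

// shared is a hypothetical, centrally managed instance.
var shared = &Context{name: "shared"}

// resolve returns ctx unless the caller passed nil, in which case it falls
// back to the shared instance. Passing nil is the caller's explicit way of
// saying "I am not specifying a value".
func resolve(ctx *Context) *Context {
	if ctx == nil {
		return shared
	}
	return ctx
}

func main() {
	fmt.Println(resolve(nil).name)
	fmt.Println(resolve(&Context{name: "custom"}).name)
}
```

With a value type, the same intent would have to be signalled by passing a zero value, which is harder to tell apart from an accidentally uninitialized one.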

    enhancement breaking change 
    opened by sdboyer 2
  • thema: Introduce [de]hydrate API

    First pass at a de/hydration API.

    Initially, i'd been thinking that it would work well to turn hydration into a formatting option. And that still might be worth doing. But i think this is cleanest and simplest for now - just have Hydrate() and Dehydrate() methods on Instance that apply the transformation and return a new instance. (Much category, very morphism!)
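As a rough sketch of the "apply the transformation and return a new instance" shape, assuming hydration means filling in schema defaults and dehydration means stripping values equal to those defaults (the PR text does not spell this out). The `Instance` struct below is a hypothetical, flattened stand-in and not thema.Instance.

```go
package main

import (
	"fmt"
	"reflect"
)

// Instance is a hypothetical, simplified stand-in for thema.Instance:
// flat data plus the defaults declared by its schema.
type Instance struct {
	Data     map[string]any
	Defaults map[string]any
}

// Hydrate returns a new Instance in which every field missing from Data is
// filled in from Defaults. The receiver is left untouched.
func (i Instance) Hydrate() Instance {
	out := make(map[string]any, len(i.Data)+len(i.Defaults))
	for k, v := range i.Defaults {
		out[k] = v
	}
	for k, v := range i.Data {
		out[k] = v
	}
	return Instance{Data: out, Defaults: i.Defaults}
}

// Dehydrate returns a new Instance with every field whose value equals its
// default removed. The receiver is left untouched.
func (i Instance) Dehydrate() Instance {
	out := make(map[string]any, len(i.Data))
	for k, v := range i.Data {
		if d, ok := i.Defaults[k]; ok && reflect.DeepEqual(d, v) {
			continue
		}
		out[k] = v
	}
	return Instance{Data: out, Defaults: i.Defaults}
}

func main() {
	in := Instance{
		Data:     map[string]any{"title": "CPU"},
		Defaults: map[string]any{"refresh": "5s"},
	}
	hydrated := in.Hydrate()
	fmt.Println(len(hydrated.Data), len(hydrated.Dehydrate().Data))
}
```

Because both methods return a fresh value, a caller can keep the original around for error reporting against the unmodified input.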

    (cc @ying-jeanne - really this just takes her code and adapts it to this new environment! 🎉 )

    Fixes #43


    edit: hydration is basically working, but i've futzed something up with dehydration. Going to circle back on it a little later.

    Also, it may be preferable to append a suffix to the name of the input src/file (e.g. input.json becomes input-hydrated.json) when calling Hydrate() or Dehydrate(). That way future errors (if any) on that instance can be made clearer that they're happening against a modified object.

    enhancement 
    opened by sdboyer 2
  • Ensure quoted fields when formatting to cue.Instance

    This is basically working around an upstream CUE issue with this:

    foo: {
      bar: number
      number: string
    }
    

    This is valid CUE, with the LHS number being interpreted as a label. But currently, the OpenAPI encoder in CUE stdlib behaves as though the label shadows the type keyword number. That is, it interprets the value of bar as a reference to the field labeled number, rather than as a primitive type. (TODO: open an issue on this)

    Now, using number as a field name is kind of an antipattern anyway - it's good practice to quote such a label field:

    foo: {
      bar: number
      "number": string
    }
    

    And, when exporting to openapi (cue def -o openapi: with whatever params), this is fine, because cue stdlib still can actually work with instances directly. We can't, though, and we have to rely on the hack of calling cue.Value.Syntax() to get what we want - and the output of that method does not preserve label quotes (TODO: open an issue on this).

    So, this introduces a short term hacky hack hack to just always ensure certain keyword fields are quoted.
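The "always quote certain labels" workaround might look something like the sketch below. The `quoteLabel` helper and the `predeclared` set are hypothetical and only illustrate the idea; the set is a partial list of CUE's predeclared type identifiers, and the actual change operates on CUE syntax rather than plain strings.

```go
package main

import (
	"fmt"
	"strconv"
)

// predeclared is an illustrative, non-exhaustive set of CUE predeclared
// identifiers that can be confused with a field label of the same name.
var predeclared = map[string]bool{
	"bool":   true,
	"bytes":  true,
	"float":  true,
	"int":    true,
	"null":   true,
	"number": true,
	"string": true,
}

// quoteLabel returns the label wrapped in double quotes when it collides
// with a predeclared identifier, and unchanged otherwise.
func quoteLabel(label string) string {
	if predeclared[label] {
		return strconv.Quote(label)
	}
	return label
}

func main() {
	for _, l := range []string{"bar", "number"} {
		fmt.Printf("%s: string\n", quoteLabel(l))
	}
}
```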

    bug 
    opened by sdboyer 1
  • Add lineage printing and generation support

    This introduces:

    • [x] New package for manipulating lineages at the AST level, encoding/cue (don't really like this name, but :shrug:)
    • [x] CLI command for generating empty lineage
    • [x] CLI command for generating lineage from openapi, json schema
    • ~CLI Options for generating lineages at subpaths~ (deferred)

    skipping tests again 😭 best test for this is just golden files, and i'm not gonna one-off that when thema's needs really merit a framework

    Fixes #55

    enhancement 
    opened by sdboyer 1
  • encoding/jsonschema: Add JSON Schema exporter

    This introduces an exporter that converts a Thema schema into JSON Schema (Wright draft 4).

    What this actually does is run the OpenAPI exporter (which produces OpenAPI 3.0) and then does a bunch of AST hocus pocus to turn it into JSON Schema. I based this implementation on an annoyingly large amount of reading, this JS implementation, and these docs.

    There are some odds and ends to clean up:

    • [x] Handle format oddities between OpenAPI and JSON Schema
    • [x] Finish blacklisting all the oapi-specific keywords

    However, with the AST manipulation pattern in place, these are pretty trivial additions.

    This could possibly eventually be contributed back to CUE directly upstream. The preferable implementation there, though, is probably to update the encoding/openapi exporter to support oapi 3.1, which was specifically designed to be == to JSON Schema.
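To give a feel for the kind of rewriting involved in going from OpenAPI 3.0 to JSON Schema, here is a small sketch. It is not the exporter's code: it works on decoded JSON maps rather than on the CUE AST, the `toJSONSchema` function and `oapiOnly` list are hypothetical, and it covers only two transformations (folding `nullable` into `type`, and dropping a few OpenAPI-only keywords).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oapiOnly is an illustrative subset of Schema Object keywords that
// OpenAPI 3.0 defines but older JSON Schema drafts do not.
var oapiOnly = []string{"discriminator", "xml", "externalDocs", "example"}

// toJSONSchema returns a converted copy of an OpenAPI 3.0 schema node.
func toJSONSchema(in map[string]any) map[string]any {
	out := make(map[string]any, len(in))
	for k, v := range in {
		out[k] = v
	}
	// OpenAPI 3.0 expresses nullability with a boolean flag; JSON Schema
	// expresses it by adding "null" to the type.
	if nullable, _ := out["nullable"].(bool); nullable {
		if t, ok := out["type"].(string); ok {
			out["type"] = []any{t, "null"}
		}
	}
	delete(out, "nullable")
	for _, k := range oapiOnly {
		delete(out, k)
	}
	// Recurse into object properties and array items.
	if props, ok := out["properties"].(map[string]any); ok {
		converted := make(map[string]any, len(props))
		for name, p := range props {
			if sub, ok := p.(map[string]any); ok {
				converted[name] = toJSONSchema(sub)
			} else {
				converted[name] = p
			}
		}
		out["properties"] = converted
	}
	if items, ok := out["items"].(map[string]any); ok {
		out["items"] = toJSONSchema(items)
	}
	return out
}

func main() {
	oapi := map[string]any{
		"type": "object",
		"properties": map[string]any{
			"name": map[string]any{"type": "string", "nullable": true, "example": "x"},
		},
	}
	b, _ := json.MarshalIndent(toJSONSchema(oapi), "", "  ")
	fmt.Println(string(b))
}
```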

    Fixes #49

    enhancement 
    opened by sdboyer 1
  • Basic error reprocessing

    Adds very basic reprocessing of errors from their raw CUE form to something more Thema-specific. This basically covers scalars, and works well enough down into nested fields. It's basically the 80% of value for 20% of effort.

    This is really bare bones - i omitted putting much in the way of tests in because that's a larger TODO for Thema in general.

    Part of #44.

    opened by sdboyer 1
  • thema: Introduce `AssignableTo` func and use it to fix `NewInputKernel()` type checking

    This change introduces a reasonably well thought-out definition of "assignability" of a Thema schema to a Go type, adding it as a func:

    thema.AssignableTo(sch thema.Schema, T interface{}) error
    

    The definition of assignability needs formalization, but i see no reason the Go interface type signature would change.

    Fixes #23

    cc @mpvl @myitcv - this seems (?) similar to but somewhat different from what's in gocodec. Really, this is kinda intended to be complementary to, or even a stopgap for, proper Go type codegen - "given that we hand-wrote a Go struct type, is it equivalent to what proper codegen would produce?"
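One way to picture the "is this hand-written struct equivalent to the schema?" question is the toy check below. It is only a sketch under heavy assumptions: the schema is reduced to a hypothetical `fieldKinds` map of field names to Go kinds, and `assignableTo` is not thema.AssignableTo, whose definition of assignability the PR says still needs formalization.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// fieldKinds is a hypothetical, flattened view of a schema:
// field name -> the Go kind that field is expected to have.
type fieldKinds map[string]reflect.Kind

// assignableTo reports an error unless the struct type of goType has
// exactly the fields named in sch, each with the expected kind.
func assignableTo(sch fieldKinds, goType any) error {
	t := reflect.TypeOf(goType)
	if t == nil {
		return fmt.Errorf("nil is not a struct type")
	}
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	if t.Kind() != reflect.Struct {
		return fmt.Errorf("expected a struct, got %s", t.Kind())
	}
	seen := make(map[string]bool, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		// Prefer the json tag name, falling back to the Go field name.
		name := strings.Split(f.Tag.Get("json"), ",")[0]
		if name == "" {
			name = f.Name
		}
		want, ok := sch[name]
		if !ok {
			return fmt.Errorf("field %q is not in the schema", name)
		}
		if f.Type.Kind() != want {
			return fmt.Errorf("field %q is %s, schema wants %s", name, f.Type.Kind(), want)
		}
		seen[name] = true
	}
	for name := range sch {
		if !seen[name] {
			return fmt.Errorf("schema field %q is missing from %s", name, t)
		}
	}
	return nil
}

// dashboardSchema, dashboard and partial are illustrative fixtures.
var dashboardSchema = fieldKinds{"title": reflect.String, "rows": reflect.Int64}

type dashboard struct {
	Title string `json:"title"`
	Rows  int64  `json:"rows"`
}

type partial struct {
	Title string `json:"title"`
}

func main() {
	fmt.Println(assignableTo(dashboardSchema, dashboard{}))
	fmt.Println(assignableTo(dashboardSchema, partial{}))
}
```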

    bug enhancement 
    opened by sdboyer 1
  • Create standard approach to migrating to Thema from other schema systems

    A common case that's repeatedly come up is the need to gracefully go from some other schema system into Thema - an onramp of sorts.

    Technically, this can be done with the 0.0 schema, and the "real" Thema schemas start on 1.0, but that's not great, as you still have to write a reverse lens back to 0.0, people could be confused and try to add a 0.1, and everyone has to just know in perpetuity to never try to translate back to that version.

    It seems reasonable to me that lineages allocate a special, optional space for creating an optional onramp key, which need not follow joinSchema, and can be used to capture all the weird and warty former versions of an object - and then provide a forward-only lens. (Once the data's in, there's no going back.)

    opened by sdboyer 1
  • Make the input kernel return validation errors based on the `To` schema

    Currently, when data to InputKernel.Converge() fails validation, it will simply return a bland validation failed error. This isn't helpful in any real scenario. Instead, let's have it return the validation error that came back from the To schema, as we can be reasonably certain that that's the schema version the author is thinking about, anyway.

    i didn't do this initially because i wasn't quite sure it would fit the user's mental model, but on seeing folks actually use it, it seems clear that this is the assumption they'd make.
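The proposed behavior could be sketched like this: try every schema, and if none accepts the data, report the error produced by the To schema instead of a generic failure. The `schema` struct, `converge` function, and `requireField` helper are hypothetical simplifications and do not reflect the InputKernel implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// schema is a hypothetical pairing of a version string with a validator.
type schema struct {
	version  string
	validate func(map[string]any) error
}

// requireField builds a toy validator that only checks a field is present.
func requireField(name string) func(map[string]any) error {
	return func(data map[string]any) error {
		if _, ok := data[name]; !ok {
			return fmt.Errorf("missing field %q", name)
		}
		return nil
	}
}

// converge returns the version of the first schema that accepts data. If
// none does, it returns the validation error from the schema whose version
// equals to, since that is the one the caller is most likely thinking about.
func converge(data map[string]any, schemas []schema, to string) (string, error) {
	var toErr error
	for _, s := range schemas {
		err := s.validate(data)
		if err == nil {
			return s.version, nil
		}
		if s.version == to {
			toErr = err
		}
	}
	if toErr == nil {
		toErr = errors.New("validation failed")
	}
	return "", toErr
}

// demoSchemas is an illustrative two-version lineage.
var demoSchemas = []schema{
	{version: "0.0", validate: requireField("one")},
	{version: "0.1", validate: requireField("two")},
}

func main() {
	_, err := converge(map[string]any{}, demoSchemas, "0.1")
	fmt.Println(err)
}
```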

    enhancement 
    opened by sdboyer 1
  • Panic when attempting backwards compatibility verification between schemas

    Currently, CUE (v0.4.0) panics on attempting to call Subsume to verify backwards [in]compatibility between schemas as part of BindLineage. This means we can't do our backwards compatibility verification within/between sequences - the most basic check Thema promises.

    It seems to be tied to the cue.Value created by the Go CUE library's iterators, but i haven't narrowed it down yet. I'm putting up a CUE issue once i've isolated it at least a bit.

    bug invariants 
    opened by sdboyer 1
  • Change `UnwrapCUE()` to `Underlying()`

    It doesn't feel like there's an ideal name here, but UnwrapCUE() is using "wrap" in a way that just doesn't feel right.

    Also, i don't like staring at CUE all over my code. It's yelly.

    Underlying() seems better, and vaguely reminiscent of Go's notion of the type underlying an interface, which at least feels less wrong than Unwrap. Open to suggestions, though. For the limited window before i just do this :)

    @IfSentient this change is coming, and it's obviously breaking. It'll be a trivial refactor though, just a method rename.

    opened by sdboyer 0
  • `BindLineage()` does not consider field removals to be backwards incompatible

    BindLineage() will accept the following seqs:

    seqs: [
    	{
    		schemas: [
    			{ // 0.0
    				one: int64
    				two: string
    			},
    			{ // 0.1
    				two: string
    			},
    		]
    	},
    ]
    

    Clearly, v0.1 is backwards incompatible with v0.0, and this should not be allowed.
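The invariant being asked for can be illustrated with a deliberately naive check: every field in the earlier schema must still exist, with the same type, in the later one. This is a sketch only. The `schema` map type and `checkBackwardsCompatible` are hypothetical, and the real check is meant to be done by CUE itself rather than by comparing type names.

```go
package main

import (
	"fmt"
	"sort"
)

// schema is a hypothetical flattened schema: field name -> type name.
type schema map[string]string

// checkBackwardsCompatible returns an error if next removes a field that
// prev declares, or changes its type. Adding fields is allowed.
func checkBackwardsCompatible(prev, next schema) error {
	names := make([]string, 0, len(prev))
	for name := range prev {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic error reporting
	for _, name := range names {
		nextType, ok := next[name]
		if !ok {
			return fmt.Errorf("field %q was removed", name)
		}
		if nextType != prev[name] {
			return fmt.Errorf("field %q changed type from %s to %s", name, prev[name], nextType)
		}
	}
	return nil
}

func main() {
	v00 := schema{"one": "int64", "two": "string"}
	v01 := schema{"two": "string"}
	fmt.Println(checkBackwardsCompatible(v00, v01)) // field "one" was removed
}
```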

    bug invariants 
    opened by sdboyer 0
  • Add constraints to `#Lineage.name`

    At minimum, we want:

    • strings.MinRunes(1)
    • =~"^[a-zA-Z0-9_]+$"

    Probably the ideal target is the set of valid characters for an unquoted CUE label. That covers another known, future use case by disallowing slashes - important to be able to introduce e.g. an optional #Lineage.uri property later, and enforce that its trailing element is == #Lineage.name and be able to know a priori that there is only one trailing element.

    Again, we can open this up more later. Opting for restrictiveness initially gives us more options later.
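For reference, the two minimum constraints correspond roughly to the following check, written in Go instead of CUE. It assumes the character class is meant to apply to the whole name; `validateLineageName` is a hypothetical helper and not part of thema.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"unicode/utf8"
)

// nameRE assumes the character class must cover the entire name.
var nameRE = regexp.MustCompile(`^[a-zA-Z0-9_]+$`)

// validateLineageName applies the two minimum constraints: at least one
// rune, and only ASCII letters, digits, and underscores.
func validateLineageName(name string) error {
	if utf8.RuneCountInString(name) < 1 {
		return errors.New("name must contain at least one rune")
	}
	if !nameRE.MatchString(name) {
		return fmt.Errorf("name %q must match %s", name, nameRE)
	}
	return nil
}

func main() {
	for _, n := range []string{"dashboard_v2", "", "team/dashboard"} {
		fmt.Printf("%q: %v\n", n, validateLineageName(n))
	}
}
```

Note that the slash in the last name is rejected, which is the property the future URI use case depends on.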

    invariants 
    opened by sdboyer 2
  • Disallow certain CUE constructs within lineage (schema) declarations

    There are some logical constructs in CUE that make analysis more complicated, and we'll probably have a much easier time creating the invariants if we just disallow their use within schema declarations. Here's a preliminary list:

    • if statements
    • comprehensions? def yes if composition logic (#8) can be kept entirely outside the schema itself and unified in when called
    • aliases? no specific reason to do this apart from it being suggestive that people are being too fancy

    Cats don't like going back in bags. Better to err on the side of being restrictive initially, then open up later.

    enhancement invariants 
    opened by sdboyer 1
  • Make `thema` CLI dynamically support `github.com/grafana/thema` imports

    Currently, running thema anywhere other than inside the thema repo itself, or inside a cue.mod module context that has the github.com/grafana/thema codebase within its cue.mod/pkg dir, will result in

    Error: import failed: cannot find package "github.com/grafana/thema"
    

    This is clearly a big problem, as it means the thema CLI only works when the user has already set up their fs "correctly"...in a way that is difficult to even explain how to do correctly.

    bug 
    opened by sdboyer 0
  • Replace `kernel` package with `vmux`; introduce generics

    This introduces generics in the base thema package, allowing the pairing of a Go type with a Schema (TypedSchema) and its corresponding Instance (TypedInstance).

    Building on these generics, we replace the previous kernel approach with a new vmux (version multiplexer) package. This package has the same basic goal as the original kernels, but is less work and more elegantly shaped in its final product.

    Rather than having two methods (Converge, ConvergeJSON) that the user calls to pass bytes through a pipeline, vmux contains four muxers, each of which is itself a callable function following the same basic muxing pattern (accept all versions->see one), mapping from bytes to...

    • Untyped: []byte -> *thema.Instance
    • Byte: []byte -> []byte
    • Typed: []byte -> *thema.TypedInstance[T]
    • Value: []byte -> T

    The pattern is also extensible for "middleware" in the future. The known case there is stepwise interception of Translate on the basis of lacuna that may have been emitted in each translation step.
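To make the "accept all versions, see one" idea concrete, here is a toy version of the Value shape ([]byte -> T) using Go generics. Everything in it is an assumption made for illustration: the payload carries an explicit `version` number, translation is a slice of hypothetical `upgrade` functions, and none of the names are taken from the vmux package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// upgrade is a hypothetical single translation step from version i to i+1.
type upgrade func(map[string]any) map[string]any

// ValueMux mirrors the Value muxer shape: []byte -> T.
type ValueMux[T any] func([]byte) (T, error)

// NewValueMux returns a function that accepts JSON at any known version,
// applies the remaining upgrades, and decodes the result into T.
func NewValueMux[T any](upgrades []upgrade) ValueMux[T] {
	return func(b []byte) (T, error) {
		var out T
		var raw map[string]any
		if err := json.Unmarshal(b, &raw); err != nil {
			return out, err
		}
		v, _ := raw["version"].(float64) // a missing version is treated as 0
		from := int(v)
		if from < 0 || from > len(upgrades) {
			return out, fmt.Errorf("unknown version %d", from)
		}
		for _, up := range upgrades[from:] {
			raw = up(raw)
		}
		latest, err := json.Marshal(raw)
		if err != nil {
			return out, err
		}
		err = json.Unmarshal(latest, &out)
		return out, err
	}
}

// Dashboard is the Go type paired with the latest (version 1) schema.
type Dashboard struct {
	Title string `json:"title"`
}

// renameNameToTitle upgrades version 0 (field "name") to version 1 ("title").
func renameNameToTitle(m map[string]any) map[string]any {
	out := make(map[string]any, len(m))
	for k, v := range m {
		out[k] = v
	}
	if n, ok := out["name"]; ok {
		out["title"] = n
		delete(out, "name")
	}
	out["version"] = 1
	return out
}

func main() {
	mux := NewValueMux[Dashboard]([]upgrade{renameNameToTitle})
	d, err := mux([]byte(`{"version": 0, "name": "CPU"}`))
	fmt.Println(d.Title, err)
}
```

The caller only ever sees `Dashboard`, regardless of which version the bytes were written against.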

    Also started using some of the go1.19 godoc hotlinking conventions.

    • [ ] Move tests over from kernel
    • [ ] Remove kernel package
    • [ ] Clean up docs on helper funcs for generic binding
    • [ ] Rewrite relevant tutorials
    • [ ] Add error taxonomy for differentiating error classes

    Fixes #53 Fixes #57

    enhancement 
    opened by sdboyer 0
Owner
Grafana Labs
Grafana Labs is behind leading open source projects Grafana and Loki, and the creator of the first open & composable observability platform.