A MySQL-compatible relational database with a storage-agnostic query engine. Implemented in pure Go.

Overview

go-mysql-server

go-mysql-server is a SQL engine which parses standard SQL (based on MySQL syntax) and executes queries on data sources of your choice. A simple in-memory database and table implementation are provided, and you can query any data source you want by implementing a few interfaces.

go-mysql-server also provides a server implementation compatible with the MySQL wire protocol. That means it is compatible with MySQL ODBC, JDBC, or the default MySQL client shell interface.

Dolt, a SQL database with Git-style versioning, is the main database implementation of this package. Check out that project for reference implementations.

Scope of this project

These are the goals of go-mysql-server:

  • Be a generic extensible SQL engine that performs queries on your data sources.
  • Provide a simple database implementation suitable for use in tests.
  • Define interfaces you can implement to query your own data sources.
  • Provide a runnable server speaking the MySQL wire protocol, connected to data sources of your choice.
  • Optimize query plans.
  • Allow implementors to add their own analysis steps and optimizations.
  • Support indexed lookups and joins on data tables that support them.
  • Support external index driver implementations such as pilosa.
  • With few caveats and using a full database implementation, be a drop-in MySQL database replacement.

Non-goals of go-mysql-server:

  • Be an application/server you can use directly.
  • Provide any kind of backend implementation (other than the memory one used for testing) such as json, csv, yaml. That's for clients to implement and use.

What's the use case of go-mysql-server?

go-mysql-server has two primary use cases:

  1. A stand-in for MySQL in a Go test environment, using the built-in memory database implementation (see the sketch after this list).

  2. Providing SQL query access to arbitrary data sources by implementing a handful of interfaces. The most complete real-world implementation is Dolt.
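For the first use case, here is a minimal sketch of querying the built-in memory database directly from Go, with no server in between. It assumes the engine API used in the server example below; exact signatures (for instance, whether RowIter.Next takes a *sql.Context) vary between releases:

package main

import (
    "fmt"
    "io"

    sqle "github.com/dolthub/go-mysql-server"
    "github.com/dolthub/go-mysql-server/memory"
    "github.com/dolthub/go-mysql-server/sql"
)

func main() {
    engine := sqle.NewDefault()
    engine.AddDatabase(memory.NewDatabase("test"))

    ctx := sql.NewEmptyContext()

    // Query returns the result schema and an iterator over the rows.
    _, iter, err := engine.Query(ctx, "SELECT 1 + 1")
    if err != nil {
        panic(err)
    }
    for {
        row, err := iter.Next() // newer releases: iter.Next(ctx)
        if err == io.EOF {
            break
        }
        if err != nil {
            panic(err)
        }
        fmt.Println(row)
    }
}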

Installation

The import path for the package is github.com/dolthub/go-mysql-server.

To install it, run:

go get github.com/dolthub/go-mysql-server

SQL syntax

The goal of go-mysql-server is to support 100% of the statements that MySQL does. We are continuously adding more functionality to the engine, but not everything is supported yet. To see what is currently included check the SUPPORTED file.

Third-party clients

We support and actively test against certain third-party clients to ensure compatibility between them and go-mysql-server. You can check out the list of supported third party clients in the SUPPORTED_CLIENTS file along with some examples on how to connect to go-mysql-server using them.
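As an illustration, a Go program can talk to a running go-mysql-server through the standard database/sql package and the go-sql-driver/mysql client, one of the supported clients. The address and credentials below assume the server example shown later in this README:

package main

import (
    "database/sql"
    "fmt"

    _ "github.com/go-sql-driver/mysql" // registers the "mysql" driver
)

func main() {
    db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/test")
    if err != nil {
        panic(err)
    }
    defer db.Close()

    var count int
    if err := db.QueryRow("SELECT COUNT(*) FROM mytable").Scan(&count); err != nil {
        panic(err)
    }
    fmt.Println("rows in mytable:", count)
}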

Available functions

Name Description
ABS(expr) returns the absolute value of an expression
ACOS(expr) returns the arccos of an expression
ARRAY_LENGTH(json) if the json representation is an array, this function returns its size.
ASIN(expr) returns the arcsin of an expression
ATAN(expr) returns the arctan of an expression
AVG(expr) returns the average value of expr in all rows.
CEIL(number) returns the smallest integer value that is greater than or equal to number.
CEILING(number) returns the smallest integer value that is greater than or equal to number.
CHARACTER_LENGTH(str) returns the length of the string in characters.
CHAR_LENGTH(str) returns the length of the string in characters.
COALESCE(...) returns the first non-null value in a list.
CONCAT(...) concatenates any group of fields into a single string.
CONCAT_WS(sep, ...) concatenates any group of fields into a single string. The first argument is the separator for the rest of the arguments. The separator is added between the strings to be concatenated. The separator can be a string, as can the rest of the arguments. If the separator is NULL, the result is NULL.
CONNECTION_ID() returns the current connection ID.
COS(expr) returns the cosine of an expression.
COT(expr) returns the cotangent of an expression.
COUNT(expr) returns a count of the number of non-NULL values of expr in the rows retrieved by a SELECT statement.
CURRENT_USER() returns the current user
DATE(date) returns the date part of the given date.
DATETIME(expr) returns a DATETIME value for the expression given (e.g. the string '2020-01-02').
DATE_ADD(date, interval) adds the interval to the given date.
DATE_SUB(date, interval) subtracts the interval from the given date.
DAY(date) is a synonym for DAYOFMONTH().
DAYOFMONTH(date) returns the day of the month (0-31).
DAYOFWEEK(date) returns the day of the week of the given date.
DAYOFYEAR(date) returns the day of the year of the given date.
DEGREES(expr) returns the number of degrees in the radian expression given.
EXPLODE(...) generates a new row in the result set for each element in the expressions provided.
FIRST(expr) returns the first value in a sequence of elements of an aggregation.
FLOOR(number) returns the largest integer value that is less than or equal to number.
FROM_BASE64(str) decodes the base64-encoded string str.
GREATEST(...) returns the greatest numeric or string value.
HOUR(date) returns the hours of the given date.
IFNULL(expr1, expr2) if expr1 is not NULL, it returns expr1; otherwise it returns expr2.
IF(expr1, expr2, expr3) if expr1 evaluates to true, returns expr2. Otherwise returns expr3.
INSTR(str1, str2) returns the 1-based index of the first occurrence of str2 in str1, or 0 if it does not occur.
IS_BINARY(blob) returns whether a blob is a binary file or not.
JSON_EXTRACT(json_doc, path, ...) extracts data from a json document using json paths. Extracting a string will result in that string being quoted. To avoid this, use JSON_UNQUOTE(JSON_EXTRACT(json_doc, path, ...)).
JSON_UNQUOTE(json) unquotes JSON value and returns the result as a utf8mb4 string.
LAST(expr) returns the last value in a sequence of elements of an aggregation.
LEAST(...) returns the smallest numeric or string value.
LEFT(str, int) returns the first N characters in the string given.
LENGTH(str) returns the length of the string in bytes.
LN(X) returns the natural logarithm of X.
LOG(X), LOG(B, X) if called with one parameter, this function returns the natural logarithm of X. If called with two parameters, this function returns the logarithm of X to the base B. If X is less than or equal to 0, or if B is less than or equal to 1, then NULL is returned.
LOG10(X) returns the base-10 logarithm of X.
LOG2(X) returns the base-2 logarithm of X.
LOWER(str) returns the string str with all characters in lower case.
LPAD(str, len, padstr) returns the string str, left-padded with the string padstr to a length of len characters.
LTRIM(str) returns the string str with leading space characters removed.
MAX(expr) returns the maximum value of expr in all rows.
MID(str, pos, [len]) returns a substring from the provided string starting at pos with a length of len characters. If no len is provided, all characters from pos until the end will be taken.
MIN(expr) returns the minimum value of expr in all rows.
MINUTE(date) returns the minutes of the given date.
MONTH(date) returns the month of the given date.
NOW() returns the current timestamp.
NULLIF(expr1, expr2) returns NULL if expr1 = expr2 is true, otherwise returns expr1.
POW(X, Y) returns the value of X raised to the power of Y.
POWER(X, Y) synonym for POW.
RADIANS(expr) returns the radian value of the degrees argument given.
RAND(expr?) returns a random number in the range 0 <= x < 1. If an argument is given, it is used to seed the random number generator.
REGEXP_MATCHES(text, pattern, [flags]) returns an array with the matches of the pattern in the given text. Flags can be given to control certain behaviours of the regular expression. Currently, only the i flag is supported, to make the comparison case insensitive.
REPEAT(str, count) returns a string consisting of the string str repeated count times.
REPLACE(str,from_str,to_str) returns the string str with all occurrences of the string from_str replaced by the string to_str.
REVERSE(str) returns the string str with the order of the characters reversed.
ROUND(number, decimals) rounds the number to decimals decimal places.
RPAD(str, len, padstr) returns the string str, right-padded with the string padstr to a length of len characters.
RTRIM(str) returns the string str with trailing space characters removed.
SECOND(date) returns the seconds of the given date.
SIN(expr) returns the sine of the expression given.
SLEEP(seconds) waits for the specified number of seconds (can be fractional).
SOUNDEX(str) returns the soundex of a string.
SPLIT(str,sep) returns the parts of the string str split by the separator sep as a JSON array of strings.
SQRT(X) returns the square root of a nonnegative number X.
SUBSTR(str, pos, [len]) returns a substring from the string str starting at pos with a length of len characters. If no len is provided, all characters from pos until the end will be taken.
SUBSTRING(str, pos, [len]) returns a substring from the string str starting at pos with a length of len characters. If no len is provided, all characters from pos until the end will be taken.
SUBSTRING_INDEX(str, delim, count) returns a substring after count appearances of delim. If count is negative, counts from the right side of the string.
SUM(expr) returns the sum of expr in all rows.
TAN(expr) returns the tangent of the expression given.
TIMEDIFF(expr1, expr2) returns expr1 − expr2 expressed as a time value. expr1 and expr2 are time or date-and-time expressions, but both must be of the same type.
TIMESTAMP(expr) returns a timestamp value for the expression given (e.g. the string '2020-01-02').
TO_BASE64(str) encodes the string str in base64 format.
TRIM(str) returns the string str with leading and trailing spaces removed.
UNIX_TIMESTAMP(expr?) converts the datetime argument to the number of seconds since the Unix epoch. With no argument, returns the number of seconds since the Unix epoch for the current time.
UPPER(str) returns the string str with all characters in upper case.
USER() returns the current user name.
UTC_TIMESTAMP() returns the current UTC timestamp.
WEEKDAY(date) returns the weekday of the given date.
YEAR(date) returns the year of the given date.
YEARWEEK(date, mode) returns year and week for a date. The year in the result may be different from the year in the date argument for the first and the last week of the year.

Configuration

The behaviour of certain parts of go-mysql-server can be configured using either environment variables or session variables.

Session variables are set using the following SQL queries:

SET <variable name> = <value>
Name Type Description
INMEMORY_JOINS environment If set it will perform all joins in memory. Default is off.
inmemory_joins session If set it will perform all joins in memory. Default is off. This has precedence over INMEMORY_JOINS.
MAX_MEMORY environment The maximum amount of memory, in megabytes, that can be consumed by go-mysql-server. Any in-memory caches or computations will no longer try to use memory when the limit is reached. Note that this may cause certain queries to fail if there is not enough memory available, such as queries using DISTINCT, ORDER BY or GROUP BY with groupings.
DEBUG_ANALYZER environment If set, the analyzer will print debug messages. Default is off.
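For example, to enable in-memory joins for the current session only (a sketch; the session variable is read per connection):

SET inmemory_joins = 1

The environment variables, by contrast, must be present in the server's process environment before the engine starts, e.g. by launching your server binary with MAX_MEMORY=512 set.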

Example

go-mysql-server contains both a SQL engine and a server implementation. If you want to start a server, first instantiate the engine and pass it your sql.Database implementation.

The database implementation will be in charge of all the logic to retrieve the data from your source. Here is an example using the in-memory database implementation:

package main

import (
    "time"

    "github.com/dolthub/go-mysql-server/auth"
    "github.com/dolthub/go-mysql-server/memory"
    "github.com/dolthub/go-mysql-server/server"
    "github.com/dolthub/go-mysql-server/sql"
    sqle "github.com/dolthub/go-mysql-server"
)

func main() {
    driver := sqle.NewDefault()
    driver.AddDatabase(createTestDatabase())

    config := server.Config{
        Protocol: "tcp",
        Address:  "localhost:3306",
        Auth:     auth.NewNativeSingle("user", "pass", auth.AllPermissions),
    }

    s, err := server.NewDefaultServer(config, driver)
    if err != nil {
        panic(err)
    }

    s.Start()
}

func createTestDatabase() *memory.Database {
    const (
        dbName    = "test"
        tableName = "mytable"
    )

    db := memory.NewDatabase(dbName)
    table := memory.NewTable(tableName, sql.Schema{
        {Name: "name", Type: sql.Text, Nullable: false, Source: tableName},
        {Name: "email", Type: sql.Text, Nullable: false, Source: tableName},
        {Name: "phone_numbers", Type: sql.JSON, Nullable: false, Source: tableName},
        {Name: "created_at", Type: sql.Timestamp, Nullable: false, Source: tableName},
    })

    db.AddTable(tableName, table)
    ctx := sql.NewEmptyContext()

    rows := []sql.Row{
        sql.NewRow("John Doe", "[email protected]", []string{"555-555-555"}, time.Now()),
        sql.NewRow("John Doe", "[email protected]", []string{}, time.Now()),
        sql.NewRow("Jane Doe", "[email protected]", []string{}, time.Now()),
        sql.NewRow("Evil Bob", "[email protected]", []string{"555-666-555", "666-666-666"}, time.Now()),
    }

    for _, row := range rows {
        table.Insert(ctx, row)
    }

    return db
}

Then, you can connect to the server with any MySQL client:

> mysql --host=127.0.0.1 --port=3306 -u user -ppass test -e "SELECT * FROM mytable"
+----------+-------------------+-------------------------------+---------------------+
| name     | email             | phone_numbers                 | created_at          |
+----------+-------------------+-------------------------------+---------------------+
| John Doe | [email protected]      | ["555-555-555"]               | 2018-04-18 10:42:58 |
| John Doe | [email protected]   | []                            | 2018-04-18 10:42:58 |
| Jane Doe | [email protected]      | []                            | 2018-04-18 10:42:58 |
| Evil Bob | [email protected] | ["555-666-555","666-666-666"] | 2018-04-18 10:42:58 |
+----------+-------------------+-------------------------------+---------------------+

See the complete example here.

Query examples

SELECT count(name) FROM mytable
+---------------------+
| COUNT(mytable.name) |
+---------------------+
|                   4 |
+---------------------+

SELECT name,year(created_at) FROM mytable
+----------+--------------------------+
| name     | YEAR(mytable.created_at) |
+----------+--------------------------+
| John Doe |                     2018 |
| John Doe |                     2018 |
| Jane Doe |                     2018 |
| Evil Bob |                     2018 |
+----------+--------------------------+

SELECT email FROM mytable WHERE name = 'Evil Bob'
+-------------------+
| email             |
+-------------------+
| [email protected] |
+-------------------+

Custom data source implementation

To create your own data source implementation you need to implement the following interfaces:

  • sql.Database interface. This interface will provide tables from your data source. You can also implement other interfaces on your database to unlock additional functionality:

    • sql.TableCreator to support creating new tables
    • sql.TableDropper to support dropping tables
    • sql.TableRenamer to support renaming tables
    • sql.ViewCreator to support creating persisted views on your tables
    • sql.ViewDropper to support dropping persisted views
  • sql.Table interface. This interface will provide rows of values from your data source. You can also implement other interfaces on your table to unlock additional functionality:

    • sql.InsertableTable to allow your data source to be updated with INSERT statements.
    • sql.UpdatableTable to allow your data source to be updated with UPDATE statements.
    • sql.DeletableTable to allow your data source to be updated with DELETE statements.
    • sql.ReplaceableTable to allow your data source to be updated with REPLACE statements.
    • sql.AlterableTable to allow your data source to have its schema modified by adding, dropping, and altering columns.
    • sql.IndexedTable to declare your table's native indexes to speed up query execution.
    • sql.IndexAlterableTable to accept the creation of new native indexes.
    • sql.ForeignKeyAlterableTable to signal your support of foreign key constraints in your table's schema and data.
    • sql.ProjectedTable to return rows that only contain a subset of the columns in the table. This can make query execution faster.
    • sql.FilteredTable to filter the rows returned by your table to those matching a given expression. This can make query execution faster (if your table implementation can filter rows more efficiently than checking an expression on every row in a table).

You can see a really simple data source implementation in the memory package.
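To make the shape of these interfaces concrete, here is a bare-bones read-only sketch. Treat it as illustrative only: the exact method sets and signatures (in particular where *sql.Context is threaded through the iterators) have changed between releases, so check the sql package of your version. All names here (mysource, mydb, mytable) are hypothetical:

package mysource

import (
    "io"

    "github.com/dolthub/go-mysql-server/sql"
)

// Database exposes a single fixed table by name.
type Database struct {
    table *Table
}

func (d *Database) Name() string { return "mydb" }

func (d *Database) GetTableInsensitive(ctx *sql.Context, name string) (sql.Table, bool, error) {
    if name == d.table.Name() {
        return d.table, true, nil
    }
    return nil, false, nil
}

func (d *Database) GetTableNames(ctx *sql.Context) ([]string, error) {
    return []string{d.table.Name()}, nil
}

// Table serves rows from a plain slice; a real implementation would
// read from its backing store instead.
type Table struct {
    schema sql.Schema
    rows   []sql.Row
}

func (t *Table) Name() string       { return "mytable" }
func (t *Table) String() string     { return "mytable" }
func (t *Table) Schema() sql.Schema { return t.schema }

// Partitions returns a single partition. Sources with natural splits
// (files, shards, key ranges) can return one partition per split so
// the engine can scan them independently.
func (t *Table) Partitions(ctx *sql.Context) (sql.PartitionIter, error) {
    return &partitionIter{}, nil
}

func (t *Table) PartitionRows(ctx *sql.Context, _ sql.Partition) (sql.RowIter, error) {
    return sql.RowsToRowIter(t.rows...), nil
}

type partition struct{}

func (partition) Key() []byte { return []byte("single") }

// partitionIter yields exactly one partition, then io.EOF.
type partitionIter struct{ done bool }

func (i *partitionIter) Next() (sql.Partition, error) {
    if i.done {
        return nil, io.EOF
    }
    i.done = true
    return partition{}, nil
}

func (i *partitionIter) Close() error { return nil }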

Testing your data source implementation

go-mysql-server provides a suite of engine tests that you can use to validate that your implementation works as expected. See the enginetest package for details and examples.

Indexes

go-mysql-server exposes a series of interfaces to allow you to implement your own indexes so you can speed up your queries.

Native indexes

Tables can declare that they support native indexes, which means that they support efficiently returning a subset of their rows that match an expression. The memory package contains an example of this behavior, but please note that it is only for example purposes and doesn't actually make queries faster (although we could change this in the future).

Integrators should implement the sql.IndexedTable interface to declare which indexes their tables support and provide a means of returning a subset of the rows based on an sql.IndexLookup provided by their sql.Index implementation. There are a variety of extensions to sql.Index that can be implemented, each of which unlocks additional capabilities:

  • sql.Index. Base-level interface, supporting equality lookups for an index.
  • sql.AscendIndex. Adds support for > and >= indexed lookups.
  • sql.DescendIndex. Adds support for < and <= indexed lookups.
  • sql.NegateIndex. Adds support for negating other index lookups.
  • sql.MergeableIndexLookup. Adds support for merging two sql.IndexLookups together to create a new one, representing AND and OR expressions on indexed columns.

Custom index driver implementation

Index drivers provide different backends for storing and querying indexes, without the need for a table to store and query its own native indexes. To implement a custom index driver you need to implement a few things:

  • sql.IndexDriver interface, which will be the driver itself. Note that your driver must return a unique ID from the ID method. This ID is unique to your driver and should not clash with any other registered driver. It's the driver's responsibility to be fault tolerant and be able to automatically detect and recover from corruption in indexes.
  • sql.Index interface, returned by your driver when an index is loaded or created.
  • sql.IndexValueIter interface, which will be returned by your sql.IndexLookup and should return the values of the index.
  • Don't forget to register the index driver in your sql.Context using context.RegisterIndexDriver(mydriver) to be able to use it.

To create indexes using your custom index driver you need to use the extension syntax USING driverid in the index creation statement. For example:

CREATE INDEX foo ON table USING driverid (col1, col2)

go-mysql-server does not provide a production index driver implementation. We previously provided a pilosa implementation, but removed it due to the difficulty of supporting it on all platforms (pilosa doesn't work on Windows).

You can see an example of a driver implementation in the memory package.

Metrics

go-mysql-server uses the github.com/go-kit/kit/metrics module to expose metrics (counters, gauges, histograms) for certain packages (so far: engine, analyzer, regex). If you already have a metrics server (Prometheus, statsd/statsite, InfluxDB, etc.) and want to gather metrics from go-mysql-server components as well, you need to initialize some global variables with implementations that satisfy the following interfaces:

// Counter describes a metric that accumulates values monotonically.
type Counter interface {
	With(labelValues ...string) Counter
	Add(delta float64)
}

// Gauge describes a metric that takes specific values over time.
type Gauge interface {
	With(labelValues ...string) Gauge
	Set(value float64)
	Add(delta float64)
}

// Histogram describes a metric that takes repeated observations of the same
// kind of thing, and produces a statistical summary of those observations,
// typically expressed as quantiles or buckets.
type Histogram interface {
	With(labelValues ...string) Histogram
	Observe(value float64)
}

You can use one of the go-kit implementations or write your own. For instance, to expose metrics to a Prometheus server, set up the following variables before starting the MySQL engine:

import (
    "github.com/go-kit/kit/metrics/prometheus"
    promopts "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

//....

// engine metrics
sqle.QueryCounter = prometheus.NewCounterFrom(promopts.CounterOpts{
    Namespace: "go_mysql_server",
    Subsystem: "engine",
    Name:      "query_counter",
}, []string{
    "query",
})
sqle.QueryErrorCounter = prometheus.NewCounterFrom(promopts.CounterOpts{
    Namespace: "go_mysql_server",
    Subsystem: "engine",
    Name:      "query_error_counter",
}, []string{
    "query",
    "error",
})
sqle.QueryHistogram = prometheus.NewHistogramFrom(promopts.HistogramOpts{
    Namespace: "go_mysql_server",
    Subsystem: "engine",
    Name:      "query_histogram",
}, []string{
    "query",
    "duration",
})

// analyzer metrics
analyzer.ParallelQueryCounter = prometheus.NewCounterFrom(promopts.CounterOpts{
    Namespace: "go_mysql_server",
    Subsystem: "analyzer",
    Name:      "parallel_query_counter",
}, []string{
    "parallelism",
})

// regex metrics
regex.CompileHistogram = prometheus.NewHistogramFrom(promopts.HistogramOpts{
    Namespace: "go_mysql_server",
    Subsystem: "regex",
    Name:      "compile_histogram",
}, []string{
    "regex",
    "duration",
})
regex.MatchHistogram = prometheus.NewHistogramFrom(promopts.HistogramOpts{
    Namespace: "go_mysql_server",
    Subsystem: "regex",
    Name:      "match_histogram",
}, []string{
    "string",
    "duration",
})

One important note: internally we set some labels for metrics, which is why you have to pass keys like "duration", "query", "driver", ... when registering metrics in Prometheus. Other systems may have different requirements.
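
The promhttp package imported above is what actually serves the collected metrics to Prometheus; a typical setup (an assumption on our part, not something go-mysql-server does for you) exposes them on a scrape endpoint alongside the SQL server:

// Serve the Prometheus scrape endpoint (requires the standard
// library's net/http in addition to the imports above).
http.Handle("/metrics", promhttp.Handler())
go func() {
    if err := http.ListenAndServe(":2112", nil); err != nil {
        panic(err)
    }
}()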

Acknowledgements

go-mysql-server was originally developed by the {source-d} organization, and this repository was originally forked from src-d. We want to thank the entire {source-d} development team for their work on this project, especially Miguel Molina (@erizocosmico) and Juanjo Álvarez Martinez (@juanjux).

License

Apache License 2.0, see LICENSE

Issues
  • LastInsertId always returns 0

    When running several insert commands like

    result, err := db.Exec("INSERT INTO mytable SET number = 18;")
    id, err := result.LastInsertId()
    

    Against a simple table like

    CREATE TABLE IF NOT EXISTS mytable (
    id int unsigned NOT NULL AUTO_INCREMENT,
    number int unsigned DEFAULT NULL,
    PRIMARY KEY (id),
    ) DEFAULT CHARSET=utf8;
    

    the returned id is always 0. While the go-mysql-driver returns the correct pkey.

    Used libraries: github.com/dolthub/go-mysql-server v0.6.1-0.20201228192939-415fc40f3a71, github.com/go-sql-driver/mysql v1.5.0

    opened by eqinox76 12
  • Support for prepared statements

    This is more important than we have been treating it, because there are many drivers that do this under the hood without clients explicitly asking for it. From https://github.com/liquidata-inc/go-mysql-server/issues/169

    opened by zachmu 11
  • Question/Feature Request: How can I increase the parallelism of expression evaluation?

    I have a situation where I have a custom SQL function that is a bit slow (like a single network request slow). Because rows are demanded one at a time from a RowIter, these expressions are evaluated one at a time, meaning we run these network requests one at a time. I would like some way to evaluate these rows in parallel as this would greatly improve the speed of my queries. I can't prefetch everything because I do not have all the data needed for all the network requests until query execution. A while back I tried making a custom SQL plan node which wraps another node and prefetches rows from its child node in parallel, but I ran into some issues where the RowIter implementation I was calling misbehaved as it was not threadsafe. Do you have any suggestions for me? Was the parallel prefetch node a good/bad idea? I really appreciate your help with this, thanks.

    opened by andremarianiello 9
  • Cast equivalent float64s to int64 rather than failing with ErrInvalidValue

    When working with Pandas (both with and without doltpy) I ran into this issue multiple times - integer columns are converted to floats (due to the way Python handles nan values), which then causes dolt table import to complain with the following error despite the floats being "integral":

    Rows Processed: 0, Additions: 0, Modifications: 0, Had No Effect: 0
    
    A bad row was encountered while moving data.
    Bad Row: 
    error: '10.0' is not a valid value for 'INT'
    These can be ignored using the '--continue'
    

    This PR converts "integral"/equivalent floats to int64s to prevent this from happening. This still prevents non-integral floats from being imported, e.g.:

    Rows Processed: 0, Additions: 0, Modifications: 0, Had No Effect: 0
    
    A bad row was encountered while moving data.
    Bad Row: 
    error: '10.1' is not a valid value for 'INT'
    These can be ignored using the '--continue'
    

    I'm not sure how/where this should be tested, but if it is an acceptable PR I'll be happy to write the tests for it too.

    P.S. The isIntegral function can be removed and used in the if-statement as a condition if that's more preferable, though I think it should be documented (perhaps in a comment) since it's purpose may not be immediately obvious.

    opened by abmyii 8
  • Index error lost in parent call

    Hi,

    First, thank you for the great package!

    I'm not sure if this is intentional, an error reported by a custom index implementation is not handled. The code is here: https://github.com/dolthub/go-mysql-server/blob/main/sql/analyzer/indexes.go#L71

    Should errInAnalysis = err be added here (same as the previous if line 61) so the error is reported to the caller?

    If so, I can send a PR for the fix. If not, how should a custom index implementation handle errors that should stop the process?

    Thanks.

    opened by jfrabaute 7
  • Aggregate Partition Window Rows beyond 127

    The number of rows referenced in a partition "ROWS BETWEEN" clause seem to be stored in a short int, because up to and including 127 works, but 128 and above do not.

    stocks> select date, act_symbol, avg(close) OVER (PARTITION BY act_symbol ORDER BY date ROWS BETWEEN 127 PRECEDING AND CURRENT ROW) AS ma200 FROM ohlcv WHERE act_symbol='AAPL' having date = '2022-02-11';
    +-------------------------------+------------+--------------------+
    | date                          | act_symbol | ma200              |
    +-------------------------------+------------+--------------------+
    | 2022-02-11 00:00:00 +0000 UTC | AAPL       | 158.39554687499958 |
    +-------------------------------+------------+--------------------+

    stocks> select date, act_symbol, avg(close) OVER (PARTITION BY act_symbol ORDER BY date ROWS BETWEEN 128 PRECEDING AND CURRENT ROW) AS ma200 FROM ohlcv WHERE act_symbol='AAPL' having date = '2022-02-11';
    offset must be a non-negative integer; found: 128

    bug 
    opened by inversewd2 6
  • Proxy support?

    Hello,

    Has there been any investigation or proof of concepts around implementing a MySQL proxy based on go-mysql-server?

    I have prototyped a few different storage backend options such as S3 and CSV and go-mysql-server has worked out very well. I am thinking it could be very beneficial as a generic caching solution for MySQL.

    If there has been any work in this area is there any documentation or branches you can share? Any information you can provide would be helpful.

    Thanks Michael

    opened by mgale 6
  • Add support for read-only transactions

    Running "START TRANSACTION READ ONLY" on latest master gives me:

     Error 1105: syntax error at position 23 near 'READ'
    

    So I am assuming that read-only transactions are not supported.

    Would it make sense to at least swallow the error and treat the transaction as a read-write one?

    opened by bojanz 6
  • Complex query with ABS, !=, REGEXP, and CONVERT fails.

    I created a query that does something like the following. Table atable: c1 = long, c2 = int, c3 = int. Table btable: c1 = long.

    select a.c1, a.c2, a.c3 from atable a JOIN btable b on a.c1=b.c1 WHERE (ABS(a.c2) = 1) AND a.c3 != 10 AND a.c1 REGEXP '^[-]?[0-9]+$' AND CONVERT(a.c1, SIGNED) != 0

    Expected: 1 row in the result.

    Actual: no results.

    Note: If I remove the ABS, the above query works! Also, if I do a.c2 = 1 OR a.c2 = -1, it also works in a standard mysql client.

    I'm a little puzzled because I don't see that the CONVERT is officially supported, however the ABS is supported?

    opened by joel-rieke 5
  • FilteredTable uses wrong schema with alias

    When FilteredTable interface is activated on join with table alias, the error field <field> is not on schema shows incorrectly. This only happens with table aliases.

    The problem is that the FixFieldIndexes uses the alias table schema and not the raw table schema. Therefore the field cannot be found and above error occur.

    To Reproduce

    Activate FilteredTable on memory store by setting WithFilters on to Table instead of FilterTable. This activates the implementation. Then run test case pushdown_filters_to_under_join_node for example.

    Changing from tableNode.Schema() to table.Schema() in pushdown.go did solve the issue for me but unit tests still don't look too encouraging.

    bug 
    opened by Allam76 5
  • "ORDER BY" expressions behaviour compatible with MySQL?

    Running the following on MySQL 8.0.25 (on a Mac) works, but fails on the in-memory server:

    mysql> CREATE DATABASE test;
    Query OK, 1 row affected (0.00 sec)
    
    mysql> USE test;
    Database changed
    
    mysql> CREATE TABLE test (
        ->     time TIMESTAMP,
        ->     value DOUBLE
        -> );
    Query OK, 0 rows affected (0.01 sec)
    
    mysql> INSERT INTO test VALUES
        ->    ("2021-07-04 10:00:00", 1.0),
        ->    ("2021-07-03 10:00:00", 2.0),
        ->    ("2021-07-02 10:00:00", 3.0),
        ->    ("2021-07-01 10:00:00", 4.0);
    Query OK, 4 rows affected (0.00 sec)
    Records: 4  Duplicates: 0  Warnings: 0
    
    mysql> SELECT
        ->   UNIX_TIMESTAMP(time) DIV 60 * 60 AS "time",
        ->   avg(value) AS "value"
        -> FROM test
        -> GROUP BY 1
        -> ORDER BY UNIX_TIMESTAMP(time) DIV 60 * 60;
    +------------+-------+
    | time       | value |
    +------------+-------+
    | 1625130000 |     4 |
    | 1625216400 |     3 |
    | 1625302800 |     2 |
    | 1625389200 |     1 |
    +------------+-------+
    4 rows in set (0.01 sec)
    

    Running on 8148809cd5cfc1c5fc23c49353afc16ab16a03dc:

    mysql> CREATE DATABASE test;
    Query OK, 1 row affected (0.00 sec)
    
    mysql> USE test;
    Database changed
    
    mysql> CREATE TABLE test (
        ->     time TIMESTAMP,
        ->     value DOUBLE
        -> );
    Empty set (0.00 sec)
    
    mysql> INSERT INTO test VALUES
        ->    ("2021-07-04 10:00:00", 1.0),
        ->    ("2021-07-03 10:00:00", 2.0),
        ->    ("2021-07-02 10:00:00", 3.0),
        ->    ("2021-07-01 10:00:00", 4.0);
    Query OK, 4 rows affected (0.00 sec)
    
    mysql> SELECT
        ->   UNIX_TIMESTAMP(time) DIV 60 * 60 AS "time",
        ->   avg(value) AS "value"
        -> FROM test
        -> GROUP BY 1
        -> ORDER BY UNIX_TIMESTAMP(time) DIV 60 * 60;
    ERROR 1105 (HY000): unable to sort: incompatible conversion to SQL type: DATETIME
    

    (For background this query is generated by the Grafana MySQL datasource).

    Now it seems go-mysql-server is trying to apply UNIX_TIMESTAMP(time) DIV 60 * 60 in the ORDER BY clause to the time column aliased out of the SELECT, and not noticing that the definition of the ORDER BY expression is the same. This is failing because UNIX_TIMESTAMP expects a time.Time and not an int.

    If you try and use a different column name and alias it to time (as Grafana expects) the problem might be more obvious:

    (On MySQL)

    mysql> CREATE DATABASE test;
    Query OK, 1 row affected (0.00 sec)
    
    mysql> USE test;
    Database changed
    
    mysql> CREATE TABLE test (
        ->     timestamp TIMESTAMP,
        ->     value DOUBLE
        -> );
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> INSERT INTO test VALUES
        ->    ("2021-07-04 10:00:00", 1.0),
        ->    ("2021-07-03 10:00:00", 2.0),
        ->    ("2021-07-02 10:00:00", 3.0),
        ->    ("2021-07-01 10:00:00", 4.0);
    Query OK, 4 rows affected (0.00 sec)
    Records: 4  Duplicates: 0  Warnings: 0
    
    mysql> SELECT
        ->   UNIX_TIMESTAMP(timestamp) DIV 60 * 60 AS "time",
        ->   avg(value) AS "value"
        -> FROM test
        -> GROUP BY 1
        -> ORDER BY UNIX_TIMESTAMP(timestamp) DIV 60 * 60;
    +------------+-------+
    | time       | value |
    +------------+-------+
    | 1625130000 |     4 |
    | 1625216400 |     3 |
    | 1625302800 |     2 |
    | 1625389200 |     1 |
    +------------+-------+
    4 rows in set (0.00 sec)
    

    (On go-mysql-server):

    mysql> CREATE DATABASE test;
    Query OK, 1 row affected (0.00 sec)
    
    mysql> USE test;
    Database changed
    mysql> CREATE TABLE test (
        ->     timestamp TIMESTAMP,
        ->     value DOUBLE
        -> );
    Empty set (0.00 sec)
    
    mysql> INSERT INTO test VALUES
        ->    ("2021-07-04 10:00:00", 1.0),
        ->    ("2021-07-03 10:00:00", 2.0),
        ->    ("2021-07-02 10:00:00", 3.0),
        ->    ("2021-07-01 10:00:00", 4.0);
    Query OK, 4 rows affected (0.00 sec)
    
    mysql> SELECT
        ->   UNIX_TIMESTAMP(timestamp) DIV 60 * 60 AS "time",
        ->   avg(value) AS "value"
        -> FROM test
        -> GROUP BY 1
        -> ORDER BY UNIX_TIMESTAMP(timestamp) DIV 60 * 60
        -> ;
    ERROR 1105 (HY000): column "timestamp" could not be found in any table in scope
    

    It's failing as the output of the SELECT doesn't have a timestamp column (it's been renamed to time).

    I'm not sure what the "correct" semantics should be - the reference docs seem to imply it should allow only column names, aliases and positions:

    Columns selected for output can be referred to in ORDER BY and GROUP BY clauses using column names, column aliases, or column positions. Column positions are integers and begin with 1:

    But then go on to say expressions are allowed:

    MySQL resolves unqualified column or alias references in ORDER BY clauses by searching in the select_expr values, then in the columns of the tables in the FROM clause. For GROUP BY or HAVING clauses, it searches the FROM clause before searching in the select_expr values. (For GROUP BY and HAVING, this differs from the pre-MySQL 5.0 behavior that used the same rules as for ORDER BY.)

    I wonder if, for Sort nodes, the order the columns are indexes in the resolver needs to be reversed?

    https://github.com/dolthub/go-mysql-server/blob/master/sql/analyzer/resolve_columns.go#L482

    opened by tomwilkie 5
  • Select changes from the collation rework

    Most of the code is plumbing CollatedString everywhere so it's not terribly interesting. I'd say these are the 4 files (and I included the deleted charsetcollation.go file too) that can give you a glimpse of how it's being used everywhere. Similar to my previous type changes, rather than pass string and []byte as string values, we pass CollatedString everywhere. This is no different than how DECIMAL and TIME currently work (returning a custom struct as their value type), except that we use strings in a million different places rather than 5.

    One thing to note is that I have chosen to reimplement a lot of the string functionality that we take for granted in Dolt. Sure for the utf8mb4 character set we (I think) can reuse most of the Go functions, but as soon as you use any other character set then you're presented with two options:

    1. What I'm currently doing, which is reimplementing everything. This allows us to maximize performance for different collations while increasing workload (by how much, I don't yet know).
    2. Convert different character sets to utf8mb4, run them through Go's standard library, and then convert back. Least amount of work but we're doing two conversions for literally every operation which is bound to be super slow and allocation-heavy.

    One thing to note is that I did consider something I'll call "transparent character sets". That is, we'd only have one actual character set that stores data, utf8mb4 which maps to Go's strings, and then each different character set is used for validation logic and such, and doesn't actually reflect on the storage side. This would fully allow the Go's standard library to be used for everything. This doesn't really work with foreign data, as it's safe to assume that clients and integrators who use a non-default character set expect the data to be in that encoding. This also heavily complicates collations, as we'd have to not only implement the collation sorting logic itself, but also how it maps against a different encoding. I think this is theoretically possible, but may actually end up being the largest amount of work (imagine collations mapping their unicode diacritics-equivalent across character sets, sounds like a nightmare).

    opened by Hydrocharged 0
  • Describe for default value expressions is formatted incorrectly

    Consider the following example

    CREATE TABLE t(pk int primary key, val int DEFAULT (pk * 2))
    

    MySQL with describe t will print

    mysql> desc t;
    +-------+------+------+-----+------------+-------------------+
    | Field | Type | Null | Key | Default    | Extra             |
    +-------+------+------+-----+------------+-------------------+
    | pk    | int  | NO   | PRI | NULL       |                   |
    | val   | int  | YES  |     | (`pk` * 2) | DEFAULT_GENERATED |
    +-------+------+------+-----+------------+-------------------+
    2 rows in set (0.02 sec)
    

    GMS will instead return ((pk * 2)) for val

    enhancement 
    opened by VinaiRachakonda 0
  • sql.JsonDocument ToString returns strings without whitespaces

    The sql.JsonDocument ToString method uses json.Marshal. This returns a string without any extraneous whitespace. MySQL, on the other hand, uses whitespace for legibility:

    mysql> CREATE TABLE t1 (c1 JSON);
    
    mysql> INSERT INTO t1 VALUES
         >     ('{"x": 17, "x": "red"}'),
         >     ('{"x": 17, "x": "red", "x": [3, 5, 7]}');
    
    mysql> SELECT c1 FROM t1;
    +------------------+
    | c1               |
    +------------------+
    | {"x": "red"}     |
    | {"x": [3, 5, 7]} |
    +------------------+
    

    Notice that there is additional whitespace after the colon of a key/value pair and after the comma between array elements.

    In Dolt's new storage format, we use sql.JsonDocument for JSON columns. Previously we used json.NomsJson which implements ToString correctly.

    good first issue 
    opened by druvv 0
  • Incorrect Group by syntax passes validation and executes

    Hi,

    it appears that GMS doesn't correctly validate incorrect GROUP BY queries. For example, the SQL below is executed by GMS but it's incorrect SQL syntax (i.e. it's missing the group by clause).

    select C_MKTSEGMENT, count(*) from customer;

    explain plan gives:

    +--------------------------------------------------------+
    | GroupBy                                                |
    |  ├─ SelectedExprs(customer.C_MKTSEGMENT, COUNT(*))     |
    |  ├─ Grouping()                                         |
    |  └─ Projected table access on [C_MKTSEGMENT]           |
    |      └─ Table(customer)                                |
    +--------------------------------------------------------+

    FYI the underlying table implements both sql.ProjectedTable and sql.FilteredTable. That probably doesn't matter because the syntax checker should catch the incorrect syntax before it hits the table.

    Cheers.

    opened by osawyerr 6
  • Support all MySQL logical operators

    We currently have support for some, but not all of MySQL's logical operators.

    For example, select 1 OR 1; works, but not select 1 XOR 1;. We should review MySQL's logical operators and ensure we support all of them.

    Found by: go-sqlsmith

    good first issue sql-coverage 
    opened by fulghum 6