This is a very perlimenary PR for the query module so far. Much things have to be considered further.
Since it would be a principle module, I just want to discuss the current design/implementation ASAP to avoid improper design and find some better ideas to proceed.
A Logical Plan is a DAG (Directed Acyclic Graph) of Params.
Param defines necessary parameters for query execution. The parameters are well prepared during the logical plan composing in order to reduce extra cost while executing physical plan.
the logical plan can be plotted as Dot graph. For example,
We can leverage the online toolkit to visualize the logical plan,
A Physical Plan contains the logical plan and the
Transform(s) corresponding to each
While the plan is triggered to run, a reversed topology-sorted slices (
Future as items) will be generated from the logical plan.
- [x] client utils for building
- [x] logical plan: Ops such as Sort, OffsetAndLimit, ChunkIDMerge, TableScan and IndexScan
- [x] physical plan: topology sort,
- [ ] complete API to connect with
Liaison (Add handlers) (Maybe next PR)
To be discussed
Index selection and optimization stage
For now, I only use single-value indexes. But in the current implementation, we may be able to improve index selection during the process of generating Logical Plan.
Any better idea? Since for the traditional databases, normally they have optimization stages (usually after generating hierarchical logical plan?) for indexes selection. How can we fit this optimization stage in our implementations?
Sort and field orderliness
I believe we have to impose stronger preconditions to sort-field since it is not possible to sort on a sparse field.
And Sort requires the Field to be arranged in a strict order, i.e. we have to use
fieldIndex number to access the field that is needed to be sorted quickly. Otherwise, it may cost much resources to find the specific field every time.