Curvature
is a high performance query engine that integrates lots of advanced techniques, like Vectorization, Morsel-Driven Parallelism, etc. Curvature
's execution model are adapted from Duckdb and its Array
/Vector
model are adapted from Arrow and Velox, which is implemented with the style described in type-exercise-in-urst.
I have been using datafusion for a period of time. It is a high performance query engine based on Arrow. It is great, but I am not satisfied with it. Because:
-
Tokio. Tokio is an awesome runtime! But, in my view, it is not suited for OLAP. Firstly, as we all know, linux does not support async io for filesystem(You may say
AIO
, Linux Torvalds said it is a horrible ad-hoc design). Read the file system asynchronously is achieved by put it in a thread pool, it is pretty inefficient.Io-Uring
can solve the problem perfectly. We will use themonoio
to implementCurvature
and schedule theTask
manually. -
Pull based model: Datafusion use the pull based model to execute the query. According to the issue, push based model is more flexible.
-
Datafusion will create a lot of tasks and schedule them on the Tokio, these tasks have dependencies but tokio can not be aware of it. Therefore, the tokio will pull them independently and the task, whose children is not ready, will be pending. Other threads whose task queue is empty can steal these tasks. Which means that the execution is not NUMA aware, which may cause lots of cache miss. Using the thread-per-core model and assign the tasks manually could solve this problem. It is the key point of the Morsel-Driven Parallelism
PhysicalPlan
: A tree, each node is aPhysicalOperator
PhysicalOpertor
: Operator likeFilter
,Projection
, etc. Note that it is NOT EXECUTABLE. To executePhysicalOperator
,PhysicalOperatorState
is needed. It contains the states that each operator should be aware of. For example, the global state forTableScan
, such that each parallelism unit(Task
) ofTableScan
can be synchronized.Expression
: expression used inwhere
condition,select
arguments, etcExpressionExecutor
: AExpression
is not executable,ExpressionExecutor
is used to execute it. It will take expression and DataChunk, produce the result to output DataChunk. It consists ofExpressionState
that used to keep track of the intermediate states of the execution process.Pipeline
: A fragment ofPhysicalPlan
. ThePhysicalPlan
can be break into multiple fragments callPipeline
viaPipeline Breaker
(likeHashAggregator
,Join
,Sort
, etc).Pipeline Breaker
can not flow the individual DataChunk, it has to collect all of the DataChunks, then It produce DataChunks to otherPhysicalOperator
. APipeline
has three roles ofPhysicalOperator
. ASink
inChild Pipeline
is also theSource
inParent Pipeline
, which means that aPhysicalOperator
can belong to differentPipelines
.Source
: First operator in thePipeline
, it emit data to resultNormal Operator
: Internal operator in thePipeline
, consume DataChunk produced by other operator, and produce result.Sink
: Last operator in thePipeline
, it only consumes the data, does not produce any result.
Event
: Split the execution intoEvents
, such that Sink's finalize could be only called once. Likeunion
, the sink is shared by different pipeline. If we execute the pipeline with finalize method, the finalize will be called multiple times.
This crate is guaranteed to compile on the latest stable Rust
- Code should follow the style.md
- Zero
cargo check
warning - Zero
cargo clippy
warning - Zero
FAILED
incargo test
- Zero
FAILED
incargo +nightly miri test
especially when you have unsafe functions!
Curvature
requires the x86-64 CPUs must new than x86-64-v2
microarchitecture.
Because we will use sse4.2
by default. And x86_64 CPU should support AES
, we
will also use it by default