Query Compilation 101

Prerequisite videos

Current Compiler Pipeline

Representations:

  • SQL: source language.
  • AST: abstract syntax tree, the parsed form of a SQL query.
  • HIR: high-level intermediate representation.
  • MIR: mid-level intermediate representation.
  • LIR: low-level intermediate representation.
  • TDO: target language (timely & differential operators).

Transformations in the compile-time lifecycle of a dataflow.

For a one-off query, we run every transformation up to the MIR stage. We then decide whether the query has to be served on the "slow path", which means creating a temporary dataflow and deleting it afterwards. If the slow path is not needed, we skip the MIR ⇒ LIR and LIR ⇒ TDO steps. Existing "fast paths" include:

Currently, the optimization team is mostly concerned with the HIR ⇒ MIR and MIR ⇒ MIR stages.
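
The one-off query flow described above can be sketched as a small model. This is an illustration only, not Materialize's actual code: the names `Repr`, `PIPELINE`, and `steps_for_one_off_query` are hypothetical, and the SQL ⇒ AST and AST ⇒ HIR steps are inferred from the order of the representations list rather than stated explicitly.

```python
from enum import Enum
from typing import List


class Repr(Enum):
    """Representations, in the order listed under "Representations"."""
    SQL = "SQL"
    AST = "AST"
    HIR = "HIR"
    MIR = "MIR"
    LIR = "LIR"
    TDO = "TDO"


# Each step is a (source, target) pair; MIR => MIR stands for optimization.
# Assumption: SQL => AST and AST => HIR follow from the list order.
PIPELINE = [
    (Repr.SQL, Repr.AST),
    (Repr.AST, Repr.HIR),
    (Repr.HIR, Repr.MIR),
    (Repr.MIR, Repr.MIR),
    (Repr.MIR, Repr.LIR),
    (Repr.LIR, Repr.TDO),
]


def steps_for_one_off_query(needs_slow_path: bool) -> List[str]:
    """Return the steps that run for a one-off query.

    Steps that end at or before MIR always run. Steps that lower past MIR
    (MIR => LIR, LIR => TDO) run only when the slow path is needed.
    """
    steps = []
    for src, dst in PIPELINE:
        lowers_past_mir = dst in (Repr.LIR, Repr.TDO)
        if lowers_past_mir and not needs_slow_path:
            continue
        steps.append(f"{src.value} => {dst.value}")
    return steps
```

Calling `steps_for_one_off_query(False)` yields only the steps ending at MIR, while `steps_for_one_off_query(True)` additionally includes MIR ⇒ LIR and LIR ⇒ TDO. How the real system decides whether the slow path is needed is not modeled here.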

Testing

Integration tests

  • Sqllogictest
    • Philip’s RQG tests will be in this format.
      • Add Philip to any PR where query plans may change.
    • A PR can be merged if it passes Fast SLT.
    • A PR does not need to pass Full SLT tests (test/sqllogictest/sqlite) to be merged.
      • Full SLT tests take 2-3 hours.
      • You can manually initiate full SLT tests on your branch here.
  • Testdrive

Unit tests

Performance tests

Tooling

  • mzt: creates repositories of plans and helps you write up markdown that explains something based on those plans (see Alexander’s mzt-repos for an example).