feat(core): implement TypePlanner to compatible common SQL types #1314

Merged
douenergy merged 3 commits into Canner:main from goldmedal:feat/compitable-type
Sep 11, 2025

Contributor

@goldmedal goldmedal commented Sep 10, 2025

Description

Some users (e.g. those coming from BigQuery) are used to writing BigQuery's type names directly. The SQL would look like this:

select cast(1 as int64)

int64 isn't a valid type name in DataFusion by default. This PR uses a TypePlanner to customize type planning so that common SQL type names are accepted.
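
As a rough illustration of the mapping idea (not the PR's actual code), the sketch below models the alias lookup with standard-library Rust only. `PlannedType` and `plan_type_alias` are hypothetical names standing in for Arrow's `DataType` and the planner's type-planning method, and the case-insensitive match is an assumption. Returning `None` is meant to mirror the path where the default planner handles the type.

```rust
/// Hypothetical stand-in for an Arrow-style data type (not the real `arrow` enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlannedType {
    Int32,
    Int64,
    Float32,
    Float64,
}

/// Maps a BigQuery-style type name to a planned type.
/// Returns `None` for names this sketch does not recognise, so that a caller
/// could fall back to its default type planning.
/// Assumption: matching is case-insensitive.
pub fn plan_type_alias(name: &str) -> Option<PlannedType> {
    match name.to_ascii_lowercase().as_str() {
        "int32" => Some(PlannedType::Int32),
        "int64" => Some(PlannedType::Int64),
        "float32" => Some(PlannedType::Float32),
        "float64" => Some(PlannedType::Float64),
        _ => None,
    }
}
```

Under these assumptions, `plan_type_alias("int64")` yields `Some(PlannedType::Int64)`, while a standard name such as `bigint` yields `None` and would be left to the default planner.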

Summary by CodeRabbit

  • New Features
    • Improved SQL type planning for casts (int32/int64, float32/float64) and explicit support for datetime precisions (s, ms, µs, ns), improving SQL transformation accuracy.
  • Tests
    • Added SQL logic tests for integer, float, and datetime casting and a type compatibility test verifying transformed SQL.
  • Chores
    • Enabled type-planning integration during context creation; test harness now auto-creates a default runtime context for unknown test files.

Contributor

coderabbitai bot commented Sep 10, 2025

Walkthrough

Integrates a new WrenTypePlanner into MDL context creation, exports it as mdl::type_planner, and adds tests validating SQL type planning and casting. The test harness now constructs a default runtime context for unknown files. Adds sqllogictest cases for int/float/datetime casts.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Type planner integration: wren-core/core/src/mdl/context.rs, wren-core/core/src/mdl/type_planner.rs, wren-core/core/src/mdl/mod.rs | Adds a WrenTypePlanner module and wires an Arc<WrenTypePlanner> into create_ctx_with_mdl via SessionStateBuilder::with_type_planner. Implements DataFusion TypePlanner mapping for int32/int64, float32/float64, and datetime precisions to Arrow types. Exports pub mod type_planner. |
| Core tests update: wren-core/core/src/mdl/mod.rs | Adds/integrates SessionPropertiesRef in tests and introduces test_compatible_type asserting SQL transform: SELECT CAST(1 AS BIGINT) expected for cast(1 as int64). |
| Test context defaulting: wren-core/sqllogictest/src/test_context.rs | For unknown test files, constructs a default AnalyzedWrenMDL and initializes context with SessionPropertiesRef::default() and Mode::LocalRuntime (calls updated create_ctx_with_mdl signature) instead of returning None. |
| sqllogictest cases: wren-core/sqllogictest/test_files/type.slt | New tests covering casts: int32/int64, float32/float64, and datetime with 0/3/6/9 fractional precisions. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Test as Test/Caller
  participant Ctx as create_ctx_with_mdl
  participant SSB as SessionStateBuilder
  participant TP as WrenTypePlanner
  participant DF as DataFusion SessionContext

  Test->>Ctx: create_ctx_with_mdl(&ctx, mdl, session_props, mode)
  Ctx->>SSB: new(...).with_type_planner(Arc<WrenTypePlanner>)
  Note right of SSB: Type planner registered
  SSB->>DF: build()
  DF-->>Ctx: SessionContext
  Ctx-->>Test: SessionContext

  rect rgba(200,230,255,0.25)
  Note over TP,DF: During SQL planning
  DF->>TP: plan_type(SQLDataType)
  TP-->>DF: Option<DataType> (int/float/datetime)
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • douenergy

Pre-merge checks (3 passed)

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | Though slightly awkward in grammar, the title clearly summarizes the PR’s primary change (introducing a TypePlanner to enable compatibility with common SQL types), which accurately reflects the core implementation details outlined in the changeset. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

Poem

A hop and a plan in the typing land,
I mapped your ints with a steady hand.
Datetime ticks at 0,3,6,9—so grand!
Tests now nibble where contexts stand.
With whiskers twitching, I press “run”—🥕


@github-actions github-actions bot added the core and rust (Pull requests that update Rust code) labels Sep 10, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
wren-core/core/src/mdl/context.rs (1)

51-76: Make timezone lookup case-insensitive by lowercasing properties before reading it.

You lowercase headers later (Lines 90-99) but read x-wren-timezone earlier (Lines 51-54). If callers send "X-Wren-Timezone", the session TZ won’t be applied. Lowercase first, then read the timezone, and drop the later lowercase block.

Apply:

@@
-pub async fn create_ctx_with_mdl(
+pub async fn create_ctx_with_mdl(
@@
-) -> Result<SessionContext> {
-    let session_timezone = properties
-        .get("x-wren-timezone")
-        .map(|v| v.as_ref().map(|s| s.as_str()).unwrap_or("UTC").to_string());
+) -> Result<SessionContext> {
+    // ensure all the key in properties is lowercase (do this first)
+    let properties = Arc::new(
+        properties
+            .iter()
+            .map(|(k, v)| (k.to_lowercase(), v.clone()))
+            .collect::<HashMap<_, _>>(),
+    );
+    let session_timezone = properties
+        .get("x-wren-timezone")
+        .map(|v| v.as_ref().map(|s| s.as_str()).unwrap_or("UTC").to_string());
@@
-    // ensure all the key in properties is lowercase
-    let properties = Arc::new(
-        properties
-            .iter()
-            .map(|(k, v)| {
-                let k = k.to_lowercase();
-                (k, v.clone())
-            })
-            .collect::<HashMap<_, _>>(),
-    );
+    // properties already lowercased above

Also applies to: 90-99
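
To see why the ordering matters, here is a minimal standard-library sketch of the lookup. It assumes the properties are a `HashMap<String, Option<String>>` (inferred from the diff above, not confirmed), and `lowercase_keys` and `session_timezone` are hypothetical helper names used only for illustration.

```rust
use std::collections::HashMap;

/// Returns a copy of the properties with every key lowercased.
pub fn lowercase_keys(
    properties: &HashMap<String, Option<String>>,
) -> HashMap<String, Option<String>> {
    properties
        .iter()
        .map(|(k, v)| (k.to_lowercase(), v.clone()))
        .collect()
}

/// Reads the timezone header using an exact, lowercase key.
/// A present key with no value falls back to "UTC"; a missing key yields `None`.
pub fn session_timezone(properties: &HashMap<String, Option<String>>) -> Option<String> {
    properties
        .get("x-wren-timezone")
        .map(|v| v.as_deref().unwrap_or("UTC").to_string())
}
```

With a map that holds the key `X-Wren-Timezone`, calling `session_timezone` directly finds nothing, whereas calling it on the result of `lowercase_keys` returns the configured value. That is the behaviour the suggested reordering is meant to guarantee.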

🧹 Nitpick comments (5)
wren-core/core/src/mdl/context.rs (1)

55-71: Minor: avoid setting the same config twice unless needed.

We set with_config(config.clone()) for reset_default_catalog_schema and again with_config(config) before build. If there’s no mutation of config between, the second call is redundant. If intentional, add a short comment.

Also applies to: 112-114

wren-core/sqllogictest/test_files/type.slt (1)

12-19: Consider adding a negative DATETIME precision case.

To lock intended behavior when precision ∉ {0,3,6,9} (e.g., 2 or 8), add a case that asserts planning fails (or is unchanged), matching the planner’s Ok(None) path.
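
The precision rule under discussion can be sketched as follows, using only the standard library. `TimeUnit` here is a local stand-in enum and `unit_for_precision` is a hypothetical helper, not a function from the PR. The case where no precision is given is left out because the thread does not say which unit it maps to.

```rust
/// Local stand-in for a timestamp unit (not the real Arrow `TimeUnit`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TimeUnit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

/// Maps a fractional-second precision to a timestamp unit.
/// Only 0, 3, 6 and 9 are recognised; anything else returns `None`,
/// which corresponds to the "not handled by this planner" path.
pub fn unit_for_precision(precision: u64) -> Option<TimeUnit> {
    match precision {
        0 => Some(TimeUnit::Second),
        3 => Some(TimeUnit::Millisecond),
        6 => Some(TimeUnit::Microsecond),
        9 => Some(TimeUnit::Nanosecond),
        _ => None,
    }
}
```

A negative test of the kind suggested here would then check a value such as 2 or 8 and expect the unhandled result.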

wren-core/sqllogictest/src/test_context.rs (1)

80-91: Preserve error context: avoid .ok().unwrap() on Result.

Use expect on the Result to keep the error message if context creation fails.

Apply:

-                let ctx = create_ctx_with_mdl(
+                let ctx = create_ctx_with_mdl(
                     &ctx,
                     mdl.clone(),
                     SessionPropertiesRef::default(),
                     Mode::LocalRuntime,
                 )
-                .await
-                .ok()
-                .unwrap();
+                .await
+                .expect("failed to create default runtime SessionContext");
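
The difference can be shown in isolation with a small standard-library sketch. `create_context` is a hypothetical fallible function standing in for context creation, and its error text is invented for the example.

```rust
/// Hypothetical fallible constructor standing in for context creation.
pub fn create_context(fail: bool) -> Result<&'static str, String> {
    if fail {
        Err("invalid manifest".to_string())
    } else {
        Ok("context")
    }
}

/// `.ok()` converts `Err(e)` into `None`, so the error message is discarded
/// before any later `.unwrap()` gets a chance to report it.
pub fn context_via_ok(fail: bool) -> Option<&'static str> {
    create_context(fail).ok()
}
```

Calling `.unwrap()` on the `None` from `context_via_ok(true)` would panic without mentioning "invalid manifest". Calling `.expect(...)` directly on the `Result` panics with the supplied message followed by the error's `Debug` output, so the cause stays visible.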
wren-core/core/src/mdl/mod.rs (1)

3530-3547: Typo: rename test to test_compatible_type.

Small naming fix.

Apply:

-    async fn test_compitable_type() -> Result<()> {
+    async fn test_compatible_type() -> Result<()> {
wren-core/core/src/mdl/type_planner.rs (1)

8-33: Planner logic is tight and safe; consider extending mappings incrementally.

  • Int64/Int32/Float32/Float64 and DATETIME(0|3|6|9|None) → Timestamp are mapped cleanly; the guard prevents an unreachable branch.
  • Optional: add more common aliases over time (e.g., BOOL/BOOLEAN→Boolean, STRING→Utf8, BYTES→Binary, NUMERIC/BIGNUMERIC→Decimal128/256 defaults) to broaden compatibility, keeping default planner as fallback for standard types.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1f220aa and c5cdb33.

📒 Files selected for processing (5)
  • wren-core/core/src/mdl/context.rs (2 hunks)
  • wren-core/core/src/mdl/mod.rs (4 hunks)
  • wren-core/core/src/mdl/type_planner.rs (1 hunks)
  • wren-core/sqllogictest/src/test_context.rs (2 hunks)
  • wren-core/sqllogictest/test_files/type.slt (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
wren-core/sqllogictest/src/test_context.rs (3)
wren-core/core/src/mdl/mod.rs (4)
  • mdl (196-231)
  • new (127-182)
  • new (535-537)
  • default (63-71)
wren-core/core/src/mdl/context.rs (2)
  • create_ctx_with_mdl (45-116)
  • new (334-364)
wren-core-py/src/context.rs (2)
  • new (81-203)
  • default (62-70)
wren-core/core/src/mdl/mod.rs (2)
wren-core/core/src/mdl/context.rs (3)
  • create_ctx_with_mdl (45-116)
  • properties (92-98)
  • new (334-364)
wren-core-py/src/context.rs (2)
  • new (81-203)
  • default (62-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: cargo check
  • GitHub Check: cargo test (macos-aarch64)
  • GitHub Check: test
  • GitHub Check: cargo test (win64)
  • GitHub Check: cargo test (macos)
  • GitHub Check: ci
🔇 Additional comments (3)
wren-core/core/src/mdl/context.rs (1)

78-83: Type planner wiring looks correct.

Instantiating WrenTypePlanner and injecting it via SessionStateBuilder.with_type_planner is the right place; this will influence parsing/planning early.

wren-core/sqllogictest/test_files/type.slt (1)

1-9: Good coverage for int/float aliases.

Covers INT64/INT32 and FLOAT64/FLOAT32 casts through the planner.

wren-core/core/src/mdl/mod.rs (1)

46-46: Publicly exporting type_planner is appropriate.

Keeps WrenTypePlanner consumable where needed.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
wren-core/core/src/mdl/mod.rs (2)

46-46: Re-export the planner type for ergonomic imports.

Make WrenTypePlanner available at crate::mdl::WrenTypePlanner to simplify downstream imports.

 pub mod type_planner;
+pub use type_planner::WrenTypePlanner;

3529-3547: Great smoke test; add a couple more cases to harden coverage.

Include case-insensitive variant, a float mapping, and one negative case to guard unknown types.

 #[tokio::test]
 async fn test_compatible_type() -> Result<()> {
     let ctx = SessionContext::new();

     let manifest = ManifestBuilder::default().build();
     let properties = SessionPropertiesRef::default();
     let mdl = Arc::new(AnalyzedWrenMDL::analyze(
         manifest,
         Arc::clone(&properties),
         Mode::Unparse,
     )?);
     let sql = "select cast(1 as int64)";
     assert_snapshot!(
         transform_sql_with_ctx(&ctx, Arc::clone(&mdl), &[], Arc::clone(&properties), sql).await?,
         @"SELECT CAST(1 AS BIGINT)"
     );
+    // Case-insensitive
+    let sql = "select cast(1 as INT64)";
+    assert_snapshot!(
+        transform_sql_with_ctx(&ctx, Arc::clone(&mdl), &[], Arc::clone(&properties), sql).await?,
+        @"SELECT CAST(1 AS BIGINT)"
+    );
+    // Float mapping
+    let sql = "select cast(1 as float64)";
+    assert_snapshot!(
+        transform_sql_with_ctx(&ctx, Arc::clone(&mdl), &[], Arc::clone(&properties), sql).await?,
+        @"SELECT CAST(1 AS DOUBLE)"
+    );
+    // Unknown type should fail
+    let sql = "select cast(1 as foobar)";
+    let res = transform_sql_with_ctx(&ctx, Arc::clone(&mdl), &[], Arc::clone(&properties), sql).await;
+    assert!(res.is_err());
     Ok(())
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c5cdb33 and 4bbd61c.

📒 Files selected for processing (1)
  • wren-core/core/src/mdl/mod.rs (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
wren-core/core/src/mdl/mod.rs (2)
wren-core/core/src/mdl/context.rs (3)
  • create_ctx_with_mdl (45-116)
  • new (334-364)
  • properties (92-98)
wren-core-py/src/context.rs (2)
  • new (81-203)
  • default (62-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: check Cargo.toml formatting
  • GitHub Check: cargo test (amd64)
  • GitHub Check: clippy
  • GitHub Check: cargo test (macos-aarch64)
  • GitHub Check: cargo test (win64)
  • GitHub Check: cargo test (macos)
  • GitHub Check: test
  • GitHub Check: ci
🔇 Additional comments (1)
wren-core/core/src/mdl/mod.rs (1)

553-553: Import looks good.

Consistent with other uses and avoids ambiguous context::... paths.

@goldmedal goldmedal requested a review from douenergy September 10, 2025 06:01
@douenergy
Contributor

Thanks @goldmedal

@douenergy douenergy merged commit ff51c9d into Canner:main Sep 11, 2025
14 checks passed
nhaluc1005 pushed a commit to nhaluc1005/text2sql-practice that referenced this pull request Apr 3, 2026