feat(core): implement TypePlanner to compatible common SQL types#1314
feat(core): implement TypePlanner to compatible common SQL types#1314douenergy merged 3 commits intoCanner:mainfrom
Conversation
WalkthroughIntegrates a new WrenTypePlanner into MDL context creation, exports it as Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Test as Test/Caller
participant Ctx as create_ctx_with_mdl
participant SSB as SessionStateBuilder
participant TP as WrenTypePlanner
participant DF as DataFusion SessionContext
Test->>Ctx: create_ctx_with_mdl(&ctx, mdl, session_props, mode)
Ctx->>SSB: new(...).with_type_planner(Arc<WrenTypePlanner>)
Note right of SSB: Type planner registered
SSB->>DF: build()
DF-->>Ctx: SessionContext
Ctx-->>Test: SessionContext
rect rgba(200,230,255,0.25)
Note over TP,DF: During SQL planning
DF->>TP: plan_type(SQLDataType)
TP-->>DF: Option<DataType> (int/float/datetime)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Pre-merge checks (3 passed)✅ Passed checks (3 passed)
Poem
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).Please share your feedback with us on this Discord post. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
wren-core/core/src/mdl/context.rs (1)
51-76: Make timezone lookup case-insensitive by lowercasing properties before reading it.You lowercase headers later (Lines 90-99) but read x-wren-timezone earlier (Lines 51-54). If callers send "X-Wren-Timezone", the session TZ won’t be applied. Lowercase first, then read the timezone, and drop the later lowercase block.
Apply:
@@ -pub async fn create_ctx_with_mdl( +pub async fn create_ctx_with_mdl( @@ -) -> Result<SessionContext> { - let session_timezone = properties - .get("x-wren-timezone") - .map(|v| v.as_ref().map(|s| s.as_str()).unwrap_or("UTC").to_string()); +) -> Result<SessionContext> { + // ensure all the key in properties is lowercase (do this first) + let properties = Arc::new( + properties + .iter() + .map(|(k, v)| (k.to_lowercase(), v.clone())) + .collect::<HashMap<_, _>>(), + ); + let session_timezone = properties + .get("x-wren-timezone") + .map(|v| v.as_ref().map(|s| s.as_str()).unwrap_or("UTC").to_string()); @@ - // ensure all the key in properties is lowercase - let properties = Arc::new( - properties - .iter() - .map(|(k, v)| { - let k = k.to_lowercase(); - (k, v.clone()) - }) - .collect::<HashMap<_, _>>(), - ); + // properties already lowercased aboveAlso applies to: 90-99
🧹 Nitpick comments (5)
wren-core/core/src/mdl/context.rs (1)
55-71: Minor: avoid setting the same config twice unless needed.We set with_config(config.clone()) for reset_default_catalog_schema and again with_config(config) before build. If there’s no mutation of config between, the second call is redundant. If intentional, add a short comment.
Also applies to: 112-114
wren-core/sqllogictest/test_files/type.slt (1)
12-19: Consider adding a negative DATETIME precision case.To lock intended behavior when precision ∉ {0,3,6,9} (e.g., 2 or 8), add a case that asserts planning fails (or is unchanged), matching the planner’s Ok(None) path.
wren-core/sqllogictest/src/test_context.rs (1)
80-91: Preserve error context: avoid.ok().unwrap()on Result.Use
expecton the Result to keep the error message if context creation fails.Apply:
- let ctx = create_ctx_with_mdl( + let ctx = create_ctx_with_mdl( &ctx, mdl.clone(), SessionPropertiesRef::default(), Mode::LocalRuntime, ) - .await - .ok() - .unwrap(); + .await + .expect("failed to create default runtime SessionContext");wren-core/core/src/mdl/mod.rs (1)
3530-3547: Typo: rename test totest_compatible_type.Small naming fix.
Apply:
- async fn test_compitable_type() -> Result<()> { + async fn test_compatible_type() -> Result<()> {wren-core/core/src/mdl/type_planner.rs (1)
8-33: Planner logic is tight and safe; consider extending mappings incrementally.
- Int64/Int32/Float32/Float64 and DATETIME(0|3|6|9|None) → Timestamp are mapped cleanly; the guard prevents unreachable branch.
- Optional: add more common aliases over time (e.g., BOOL/BOOLEAN→Boolean, STRING→Utf8, BYTES→Binary, NUMERIC/BIGNUMERIC→Decimal128/256 defaults) to broaden compatibility, keeping default planner as fallback for standard types.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
wren-core/core/src/mdl/context.rs(2 hunks)wren-core/core/src/mdl/mod.rs(4 hunks)wren-core/core/src/mdl/type_planner.rs(1 hunks)wren-core/sqllogictest/src/test_context.rs(2 hunks)wren-core/sqllogictest/test_files/type.slt(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
wren-core/sqllogictest/src/test_context.rs (3)
wren-core/core/src/mdl/mod.rs (4)
mdl(196-231)new(127-182)new(535-537)default(63-71)wren-core/core/src/mdl/context.rs (2)
create_ctx_with_mdl(45-116)new(334-364)wren-core-py/src/context.rs (2)
new(81-203)default(62-70)
wren-core/core/src/mdl/mod.rs (2)
wren-core/core/src/mdl/context.rs (3)
create_ctx_with_mdl(45-116)properties(92-98)new(334-364)wren-core-py/src/context.rs (2)
new(81-203)default(62-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: cargo check
- GitHub Check: cargo test (macos-aarch64)
- GitHub Check: test
- GitHub Check: cargo test (win64)
- GitHub Check: cargo test (macos)
- GitHub Check: ci
🔇 Additional comments (3)
wren-core/core/src/mdl/context.rs (1)
78-83: Type planner wiring looks correct.Instantiating WrenTypePlanner and injecting it via SessionStateBuilder.with_type_planner is the right place; this will influence parsing/planning early.
wren-core/sqllogictest/test_files/type.slt (1)
1-9: Good coverage for int/float aliases.Covers INT64/INT32 and FLOAT64/FLOAT32 casts through the planner.
wren-core/core/src/mdl/mod.rs (1)
46-46: Publicly exportingtype_planneris appropriate.Keeps
WrenTypePlannerconsumable where needed.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
wren-core/core/src/mdl/mod.rs (2)
46-46: Re-export the planner type for ergonomic imports.Make
WrenTypePlanneravailable atcrate::mdl::WrenTypePlannerto simplify downstream imports.pub mod type_planner; +pub use type_planner::WrenTypePlanner;
3529-3547: Great smoke test; add a couple more cases to harden coverage.Include case-insensitive variant, a float mapping, and one negative case to guard unknown types.
#[tokio::test] async fn test_compatible_type() -> Result<()> { let ctx = SessionContext::new(); let manifest = ManifestBuilder::default().build(); let properties = SessionPropertiesRef::default(); let mdl = Arc::new(AnalyzedWrenMDL::analyze( manifest, Arc::clone(&properties), Mode::Unparse, )?); let sql = "select cast(1 as int64)"; assert_snapshot!( transform_sql_with_ctx(&ctx, Arc::clone(&mdl), &[], Arc::clone(&properties), sql).await?, @"SELECT CAST(1 AS BIGINT)" ); + // Case-insensitive + let sql = "select cast(1 as INT64)"; + assert_snapshot!( + transform_sql_with_ctx(&ctx, Arc::clone(&mdl), &[], Arc::clone(&properties), sql).await?, + @"SELECT CAST(1 AS BIGINT)" + ); + // Float mapping + let sql = "select cast(1 as float64)"; + assert_snapshot!( + transform_sql_with_ctx(&ctx, Arc::clone(&mdl), &[], Arc::clone(&properties), sql).await?, + @"SELECT CAST(1 AS DOUBLE)" + ); + // Unknown type should fail + let sql = "select cast(1 as foobar)"; + let res = transform_sql_with_ctx(&ctx, Arc::clone(&mdl), &[], Arc::clone(&properties), sql).await; + assert!(res.is_err()); Ok(()) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
wren-core/core/src/mdl/mod.rs(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
wren-core/core/src/mdl/mod.rs (2)
wren-core/core/src/mdl/context.rs (3)
create_ctx_with_mdl(45-116)new(334-364)properties(92-98)wren-core-py/src/context.rs (2)
new(81-203)default(62-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: check Cargo.toml formatting
- GitHub Check: cargo test (amd64)
- GitHub Check: clippy
- GitHub Check: cargo test (macos-aarch64)
- GitHub Check: cargo test (win64)
- GitHub Check: cargo test (macos)
- GitHub Check: test
- GitHub Check: ci
🔇 Additional comments (1)
wren-core/core/src/mdl/mod.rs (1)
553-553: Import looks good.Consistent with other uses and avoids ambiguous
context::...paths.
|
Thanks @goldmedal |
Description
For some specific user (e.g. BigQuery), they are used to use the type name of BigQuery directly. The SQL would be like
int64isn't valid for DataFusion by default. This PR invokesTypePlannerto customize the type planning to compatible common SQL types.Summary by CodeRabbit