fix(core): introduce DATE_DIFF function#1303
Conversation
|
Warning Rate limit exceeded@goldmedal has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 5 minutes and 33 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (1)
WalkthroughAdds BigQuery DATE_DIFF/DATEDIFF translation and supported date-part mappings, updates tests (unit/integration) to cover valid/invalid granularities and types, renames one bypass UDF test, adjusts function-count expectations, and adds MDL transform tests (duplicate block present). Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Client
participant Server as Ibis Server
participant MDL as MDL Analyzer
participant Dialect as BigQuery Dialect
participant AST as SQL AST Builder
Client->>Server: send query with date_diff/datediff(...)
Server->>MDL: analyze & transform expression
MDL->>Dialect: request translation (args[0]=PART, args[1], args[2])
alt valid PART and 3 args
Dialect->>AST: build DATE_DIFF(args[2], args[1], PART)
AST-->>MDL: transformed SQL with DATE_DIFF(...)
MDL-->>Server: SQL ready
Server-->>Client: 200 + results
else invalid PART
Dialect-->>MDL: raise unsupported-part error (lists valid values)
MDL-->>Server: propagate error
Server-->>Client: 422 Unprocessable Entity
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~40 minutes Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (6)
ibis-server/tests/routers/v3/connector/bigquery/test_functions.py (2)
16-32: Manifest should declare o_orderdate to match queries.Queries reference orders.o_orderdate but the manifest declares only o_orderkey. Declare o_orderdate to keep the test self-consistent and reduce surprises if schema validation tightens.
Apply:
@@ { "name": "orders", "tableReference": { "schema": "tpch_tiny", "table": "orders", }, "columns": [ - {"name": "o_orderkey", "type": "integer"}, + {"name": "o_orderkey", "type": "integer"}, + {"name": "o_orderdate", "type": "date"}, ], },
165-245: Solid coverage; add arity error cases and a mixed-case part to harden it.
- Add a negative test for wrong arity (e.g., DATE_DIFF(DAY, CURRENT_DATE()) and DATE_DIFF(DAY, a, b, c)) to assert the 3-arg contract.
- Add a case with mixed/lowercase part (e.g., DATEDIFF('day', ...)) to verify case-insensitive handling end-to-end.
Happy to push a patch with the extra assertions if you want.
wren-core/core/src/mdl/mod.rs (4)
3119-3137: Nice coverage of DATE_DIFF/DATEDIFF variants; add a lowercase date part case.PR text mentions support for date_diff('day', …). Consider adding one assertion for 'day' (lowercase) to lock in case-insensitivity.
Apply this diff to add a check:
+ let sql = "select date_diff('day', date_1, date_2) from date_table"; + assert_snapshot!( + transform_sql_with_ctx(&ctx, Arc::clone(&analyzed_mdl), &[], Arc::clone(&headers), sql).await?, + @"SELECT DATE_DIFF(date_table.date_2, date_table.date_1, DAY) FROM (SELECT date_table.date_1, date_table.date_2 FROM (SELECT __source.date_1 AS date_1, __source.date_2 AS date_2 FROM date_table AS __source) AS date_table) AS date_table" + );
3146-3164: Error message: consider including WEEK() or clarify unsupported subparts.BigQuery allows WEEK(weekday) (e.g., WEEK(MONDAY)) in DIFF functions. If the dialect supports it (as with EXTRACT tests), list that form in “Valid values” or explicitly state WEEK(weekday) is unsupported here to avoid user confusion.
3166-3171: Prefer TIMESTAMP_DIFF for TIMESTAMP inputs (readability/idiomatic BigQuery).DATE_DIFF works across datetime family per BigQuery docs, but TIMESTAMP_DIFF with TIMESTAMP args is clearer. Consider emitting TIMESTAMP_DIFF for TIMESTAMP types and updating the expectation.
Apply this diff if you switch the dialect to type-specific functions:
- @"SELECT DATE_DIFF(timestamp_table.ts_2, timestamp_table.ts_1, HOUR) FROM (SELECT timestamp_table.ts_1, timestamp_table.ts_2 FROM (SELECT __source.ts_1 AS ts_1, __source.ts_2 AS ts_2 FROM timestamp_table AS __source) AS timestamp_table) AS timestamp_table" + @"SELECT TIMESTAMP_DIFF(timestamp_table.ts_2, timestamp_table.ts_1, HOUR) FROM (SELECT timestamp_table.ts_1, timestamp_table.ts_2 FROM (SELECT __source.ts_1 AS ts_1, __source.ts_2 AS ts_2 FROM timestamp_table AS __source) AS timestamp_table) AS timestamp_table"For reference on function names/semantics, see BigQuery TIMESTAMP_DIFF and DATE_DIFF docs. (cloud.google.com)
3119-3171: Minor: reuse a single headers Arc instead of redeclaring.Multiple let headers = Arc::new(HashMap::default()) re-decls add noise. Define once and reuse.
Apply once near the first assertion and drop subsequent redeclarations:
- let headers: Arc<HashMap<String, Option<String>>> = Arc::new(HashMap::default()); + let headers: Arc<HashMap<String, Option<String>>> = Arc::new(HashMap::default()); ... - let headers: Arc<HashMap<String, Option<String>>> = Arc::new(HashMap::default()); ... - let headers: Arc<HashMap<String, Option<String>>> = Arc::new(HashMap::default());
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
wren-core-py/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (4)
ibis-server/tests/routers/v3/connector/bigquery/test_functions.py(1 hunks)wren-core/core/src/mdl/dialect/inner_dialect.rs(3 hunks)wren-core/core/src/mdl/function.rs(1 hunks)wren-core/core/src/mdl/mod.rs(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
ibis-server/tests/routers/v3/connector/bigquery/test_functions.py (2)
ibis-server/tests/conftest.py (1)
client(18-23)ibis-server/tests/routers/v3/connector/bigquery/conftest.py (1)
connection_info(25-30)
wren-core/core/src/mdl/dialect/inner_dialect.rs (2)
wren-core/core/src/mdl/dialect/utils.rs (1)
args(30-48)wren-core/core/src/logical_plan/analyze/access_control.rs (1)
expr_to_sql(658-661)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: cargo test (win64)
- GitHub Check: cargo check
- GitHub Check: cargo test (macos-aarch64)
- GitHub Check: cargo test (macos)
- GitHub Check: ci
- GitHub Check: test
🔇 Additional comments (4)
wren-core/core/src/mdl/function.rs (1)
390-399: Rename avoids collision with new date_diff handling — good change.Switching the bypass UDF used in this unit test from date_diff to date_test prevents accidental coupling with the dialect rewrite. Expected plan string updated accordingly.
wren-core/core/src/mdl/dialect/inner_dialect.rs (2)
274-281: Nice: expanded date parts align with BigQuery’s supported units.MICROSECOND through HOUR inclusion is helpful for TIMESTAMP diffs; error message lists valid values.
150-193: Confirm DATEDIFF alias normalization
No direct handling ofDATEDIFF(no underscore) ininner_dialect.rs; verify that BigQueryDialect or the SQL parser normalizesDATEDIFFtodate_diffto avoid backend‐specific surprises.wren-core/core/src/mdl/mod.rs (1)
3091-3117: Good BigQuery-focused test scaffold.Manifest setup and Unparse mode context look correct for exercising dialect rewrites without execution.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
wren-core-py/tests/test_modeling_core.py (2)
140-140: Strengthen the assertion to verify new functions explicitly.
Counting is brittle. Also assert the presence of the newly added functions to catch catalog drift earlier.- assert len(functions) == 292 + assert len(functions) == 292 + names = {f.name for f in functions} + assert {"date_diff", "date_test"}.issubset(names)
152-152: Also assert that DATE_DIFF is available in the default catalog.
Helps validate the DataFusion fork wiring beyond a raw count.- assert len(functions) == 285 + assert len(functions) == 285 + assert any(f.name == "date_diff" for f in functions)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
ibis-server/tests/conftest.py(1 hunks)wren-core-py/tests/test_modeling_core.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
wren-core-py/tests/test_modeling_core.py (2)
ibis-server/app/routers/v3/connector.py (1)
functions(405-430)wren-core-py/src/context.rs (1)
functions(100-103)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: cargo test (amd64)
- GitHub Check: check Cargo.toml formatting
- GitHub Check: clippy
- GitHub Check: test
- GitHub Check: ci
- GitHub Check: cargo test (macos-aarch64)
- GitHub Check: cargo test (macos)
- GitHub Check: cargo test (win64)
🔇 Additional comments (1)
ibis-server/tests/conftest.py (1)
14-14: LGTM: function count bumped to 285 matches the new DATE_DIFF support.
No issues spotted.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (2)
ibis-server/resources/function_list/bigquery.csvis excluded by!**/*.csvibis-server/resources/white_function_list/bigquery.csvis excluded by!**/*.csv
📒 Files selected for processing (2)
ibis-server/tests/routers/v3/connector/bigquery/test_functions.py(3 hunks)ibis-server/tests/routers/v3/connector/local_file/test_functions.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- ibis-server/tests/routers/v3/connector/bigquery/test_functions.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: check Cargo.toml formatting
- GitHub Check: clippy
- GitHub Check: cargo test (amd64)
- GitHub Check: cargo test (macos)
- GitHub Check: cargo test (macos-aarch64)
- GitHub Check: cargo test (win64)
- GitHub Check: test
- GitHub Check: ci
|
BigQuery has been tested locally |
|
Thanks @goldmedal |
Description
Use the Canner DataFusion fork for support the
DATE_DIFFfunction:Now, we support the following SQL syntax:
and
BigQuery dialect
BigQuery have their own usage for DATE_DIFF. We need to change the ordering when converting for BigQuery.
Summary by CodeRabbit
New Features
Bug Fixes
Tests