fix(core): fix count statement for BigQuery #1274
Walkthrough
A new test for counting rows in the BigQuery connector was added. The BigQuery dialect now includes logic to override column aliases that contain unsupported special characters, encoding them for compatibility. The Wren dialect delegates the alias override logic to its inner dialect. Minor comment removals were made in the test file. A test snapshot was updated to reflect alias mangling in the SQL output.
Sequence Diagram(s)
sequenceDiagram
participant Test as test_count
participant Client as BigQuery Client
participant API as /query endpoint
participant DB as BigQuery
Test->>API: POST /query (count rows in orders)
API->>Client: Prepare SQL with column aliases
Client->>DB: Execute query
DB-->>Client: Return result
Client-->>API: Response with row count
API-->>Test: Response (status 200, count = 15000)
sequenceDiagram
participant User as User Code
participant Wren as WrenDialect
participant Inner as InnerDialect (BigQueryDialect)
User->>Wren: col_alias_overrides(alias)
Wren->>Inner: col_alias_overrides(alias)
Inner-->>Wren: Encoded or unchanged alias
Wren-->>User: Result
Estimated code review effort
🎯 2 (Simple) | ⏱️ ~8 minutes
Actionable comments posted: 0
🧹 Nitpick comments (1)
wren-core/core/src/mdl/dialect/inner_dialect.rs (1)
97-118: Improve efficiency by avoiding unnecessary string allocation.
The implementation correctly handles BigQuery's special character restrictions, but there is a minor inefficiency on line 116, where alias.to_string() is called even when no transformation is needed.
Apply this diff to avoid the unnecessary allocation:
 if alias.chars().any(|c| special_chars.contains(&c)) {
     let mut encoded_name = String::new();
     for c in alias.chars() {
         if special_chars.contains(&c) {
             encoded_name.push_str(&format!("_{}", c as u32));
         } else {
             encoded_name.push(c);
         }
     }
     Ok(Some(encoded_name))
 } else {
-    Ok(Some(alias.to_string()))
+    Ok(None)
 }
This change aligns with the trait's design, where None indicates that no override is needed, and it avoids the allocation when the original alias is acceptable.
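To illustrate why returning None can save an allocation, here is a minimal sketch of how a caller might consume the result. The SPECIAL_CHARS list and the resolve_alias helper are hypothetical and introduced only for illustration; the real character list and the real call site are not shown in this review.

```rust
use std::borrow::Cow;

// Hypothetical stand-in for the dialect's special-character list (an assumption;
// the actual list lives in inner_dialect.rs and is not reproduced here).
const SPECIAL_CHARS: [char; 3] = ['(', '*', ')'];

/// Returns Some(encoded) only when the alias has to change, and None otherwise,
/// following the suggested diff.
fn col_alias_overrides(alias: &str) -> Option<String> {
    if !alias.chars().any(|c| SPECIAL_CHARS.contains(&c)) {
        return None; // nothing to encode, so nothing is allocated
    }
    let mut encoded = String::with_capacity(alias.len());
    for c in alias.chars() {
        if SPECIAL_CHARS.contains(&c) {
            // Replace the character with an underscore plus its decimal code point.
            encoded.push_str(&format!("_{}", c as u32));
        } else {
            encoded.push(c);
        }
    }
    Some(encoded)
}

/// Hypothetical caller: borrows the original alias when no override is returned.
fn resolve_alias(alias: &str) -> Cow<'_, str> {
    match col_alias_overrides(alias) {
        Some(encoded) => Cow::Owned(encoded),
        None => Cow::Borrowed(alias),
    }
}
```

With this shape, an alias that needs no encoding is passed through as a borrowed string, and only aliases that contain special characters pay for a new String.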
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
ibis-server/tests/routers/v3/connector/bigquery/test_query.py (4 hunks)
wren-core/core/src/mdl/dialect/inner_dialect.rs (2 hunks)
wren-core/core/src/mdl/dialect/wren_dialect.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
ibis-server/tests/routers/v3/connector/bigquery/test_query.py (1)
Learnt from: goldmedal
PR: #1029
File: ibis-server/app/model/metadata/object_storage.py:44-44
Timestamp: 2025-01-07T03:56:21.741Z
Learning: When working with DuckDB in Python, use conn.execute("DESCRIBE SELECT * FROM table").fetchall() to get column types instead of accessing DataFrame-style attributes like dtype or dtypes.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: cargo check
- GitHub Check: cargo test (macos)
- GitHub Check: cargo test (win64)
- GitHub Check: ci
- GitHub Check: cargo test (macos-aarch64)
- GitHub Check: test
🔇 Additional comments (5)
wren-core/core/src/mdl/dialect/inner_dialect.rs (1)
52-54: LGTM! Well-designed trait extension.
The addition of the col_alias_overrides method to the InnerDialect trait follows good design principles, with a sensible default implementation returning Ok(None).
wren-core/core/src/mdl/dialect/wren_dialect.rs (1)
95-97: LGTM! Consistent delegation pattern.
The delegation to the inner dialect follows the established pattern used by other methods in this implementation, maintaining clean separation of concerns.
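The default-plus-delegation shape described in these two comments can be sketched roughly as follows. This is a simplified stand-in, not the project's code: the error type is a plain String, and the DefaultDialect, BigQueryLikeDialect, and WrapperDialect types are hypothetical names used only for this example.

```rust
/// Simplified stand-in for the InnerDialect trait (the error type is an assumption).
trait InnerDialect {
    /// Default: no override, so dialects without restrictions need no extra code.
    fn col_alias_overrides(&self, _alias: &str) -> Result<Option<String>, String> {
        Ok(None)
    }
}

/// Hypothetical dialect that keeps the default behavior.
struct DefaultDialect;
impl InnerDialect for DefaultDialect {}

/// Hypothetical dialect that encodes a small, assumed set of special characters.
struct BigQueryLikeDialect;
impl InnerDialect for BigQueryLikeDialect {
    fn col_alias_overrides(&self, alias: &str) -> Result<Option<String>, String> {
        let special = ['(', '*', ')']; // assumed list, for illustration only
        if !alias.chars().any(|c| special.contains(&c)) {
            return Ok(None);
        }
        let encoded: String = alias
            .chars()
            .map(|c| {
                if special.contains(&c) {
                    format!("_{}", c as u32)
                } else {
                    c.to_string()
                }
            })
            .collect();
        Ok(Some(encoded))
    }
}

/// Hypothetical outer dialect that forwards the call to whichever inner dialect it holds.
struct WrapperDialect {
    inner: Box<dyn InnerDialect>,
}

impl WrapperDialect {
    fn col_alias_overrides(&self, alias: &str) -> Result<Option<String>, String> {
        self.inner.col_alias_overrides(alias)
    }
}
```

Because the wrapper only forwards the call, a data-source-specific rule stays in one place, and a dialect that does not override the method falls back to Ok(None).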
ibis-server/tests/routers/v3/connector/bigquery/test_query.py (3)
13-13: LGTM! Appropriate manifest configuration.
Adding the BigQuery data source specification ensures the correct dialect handling for these tests.
363-363: LGTM! Clean header specification.
The removal of extraneous comments leaves clean, focused header configuration.
Also applies to: 381-381
392-410: LGTM! Excellent validation of alias encoding.
The test effectively validates the BigQuery column alias encoding functionality. The expected column name "count_40_42_41" correctly represents the encoded form of "count(*)", where:
( → _40
* → _42
) → _41
This confirms the special character encoding logic is working as intended.
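The mapping above follows from the decimal code points of the three characters. A tiny check, using a hypothetical encode_char helper that mirrors the format!("_{}", c as u32) expression from the reviewed diff, might look like this:

```rust
/// Hypothetical helper: encodes one character as an underscore followed by
/// its decimal Unicode code point.
fn encode_char(c: char) -> String {
    format!("_{}", c as u32)
}
```

Concatenating "count" with the encodings of '(', '*', and ')' gives the column name the test expects.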
Thanks @goldmedal
Description
We fixed the BigQuery alias naming in Canner/datafusion#1.
However, we didn't apply that fix in our inner dialect.