feat(ibis): Add safe rounding for Decimal columns to support PyArrow Decimal128 limits#1243
Conversation
WalkthroughThe changes introduce a utility function to round decimal columns in Ibis tables to a consistent scale of 9, update connector logic to apply this rounding before returning results, and adjust test cases and their assertions to expect the new decimal precision. Additionally, the Ruff pre-commit hook version is updated. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Connector
participant Util
Client->>Connector: query(sql, limit)
Connector->>Connector: ibis_table = sql_query_with_limit
Connector->>Util: round_decimal_columns(ibis_table, scale=9)
Util-->>Connector: rounded_ibis_table
Connector->>Connector: convert to PyArrow table
Connector-->>Client: return result
Suggested reviewers
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (2)
⏰ Context from checks skipped due to timeout of 90000ms (1)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
ibis-server/app/util.py (1)
191-202: Fix typo in comment and approve the implementation.The function correctly implements decimal rounding for PyArrow Decimal128 compatibility. However, there's a typo in the comment.
- # maxinum precision for pyarrow decimal is 38 + # maximum precision for pyarrow decimal is 38The implementation is solid:
- Correctly uses precision=38 to align with PyArrow Decimal128 limits
- Applies rounding with configurable scale (default 9)
- Preserves non-decimal columns unchanged
- Follows immutable pattern by returning a new table
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
ibis-server/app/model/connector.py(6 hunks)ibis-server/app/query_cache/__init__.py(2 hunks)ibis-server/app/util.py(2 hunks)ibis-server/tests/routers/v2/connector/test_postgres.py(1 hunks)ibis-server/tests/routers/v3/connector/bigquery/test_query.py(1 hunks)ibis-server/tests/routers/v3/connector/postgres/test_query.py(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
ibis-server/app/model/connector.py (4)
Learnt from: goldmedal
PR: Canner/wren-engine#1224
File: ibis-server/app/util.py:49-56
Timestamp: 2025-06-18T02:23:34.040Z
Learning: DuckDB supports querying PyArrow Tables directly in SQL queries without needing to register them. When a pa.Table object is referenced in a FROM clause (e.g., "SELECT ... FROM df" where df is a pa.Table), DuckDB automatically handles the PyArrow object without requiring conn.register().
Learnt from: goldmedal
PR: Canner/wren-engine#1224
File: ibis-server/app/util.py:49-56
Timestamp: 2025-06-18T02:23:34.040Z
Learning: DuckDB supports querying PyArrow Tables directly in SQL queries without needing to register them. When a pa.Table object is referenced in a FROM clause (e.g., "SELECT ... FROM df" where df is a pa.Table), DuckDB automatically handles the PyArrow object via its "replacement scan" mechanism that recognizes Python variables referencing Arrow objects as SQL tables. No conn.register() call is required.
Learnt from: goldmedal
PR: Canner/wren-engine#1029
File: ibis-server/app/model/metadata/object_storage.py:44-44
Timestamp: 2025-01-07T03:56:21.741Z
Learning: When working with DuckDB in Python, use `conn.execute("DESCRIBE SELECT * FROM table").fetchall()` to get column types instead of accessing DataFrame-style attributes like `dtype` or `dtypes`.
Learnt from: goldmedal
PR: Canner/wren-engine#1224
File: ibis-server/app/util.py:50-57
Timestamp: 2025-06-18T02:12:43.570Z
Learning: In the `to_json` function in `ibis-server/app/util.py`, the code intentionally uses `fetch_df()` to get a pandas DataFrame and then calls `to_dict(orient='split')` because this specific format is required for `orjson` serialization. The pandas conversion step is necessary to generate the correct dictionary structure for orjson.
ibis-server/app/util.py (2)
Learnt from: goldmedal
PR: Canner/wren-engine#1224
File: ibis-server/app/util.py:49-56
Timestamp: 2025-06-18T02:23:34.040Z
Learning: DuckDB supports querying PyArrow Tables directly in SQL queries without needing to register them. When a pa.Table object is referenced in a FROM clause (e.g., "SELECT ... FROM df" where df is a pa.Table), DuckDB automatically handles the PyArrow object without requiring conn.register().
Learnt from: goldmedal
PR: Canner/wren-engine#1224
File: ibis-server/app/util.py:49-56
Timestamp: 2025-06-18T02:23:34.040Z
Learning: DuckDB supports querying PyArrow Tables directly in SQL queries without needing to register them. When a pa.Table object is referenced in a FROM clause (e.g., "SELECT ... FROM df" where df is a pa.Table), DuckDB automatically handles the PyArrow object via its "replacement scan" mechanism that recognizes Python variables referencing Arrow objects as SQL tables. No conn.register() call is required.
ibis-server/app/query_cache/__init__.py (2)
Learnt from: goldmedal
PR: Canner/wren-engine#1224
File: ibis-server/app/util.py:49-56
Timestamp: 2025-06-18T02:23:34.040Z
Learning: DuckDB supports querying PyArrow Tables directly in SQL queries without needing to register them. When a pa.Table object is referenced in a FROM clause (e.g., "SELECT ... FROM df" where df is a pa.Table), DuckDB automatically handles the PyArrow object without requiring conn.register().
Learnt from: goldmedal
PR: Canner/wren-engine#1224
File: ibis-server/app/util.py:49-56
Timestamp: 2025-06-18T02:23:34.040Z
Learning: DuckDB supports querying PyArrow Tables directly in SQL queries without needing to register them. When a pa.Table object is referenced in a FROM clause (e.g., "SELECT ... FROM df" where df is a pa.Table), DuckDB automatically handles the PyArrow object via its "replacement scan" mechanism that recognizes Python variables referencing Arrow objects as SQL tables. No conn.register() call is required.
🧬 Code Graph Analysis (2)
ibis-server/tests/routers/v3/connector/bigquery/test_query.py (4)
ibis-server/tests/routers/v3/connector/postgres/test_query.py (2)
test_decimal_precision(702-714)manifest_str(123-124)ibis-server/tests/conftest.py (1)
client(18-23)ibis-server/tests/routers/v3/connector/bigquery/test_functions.py (1)
manifest_str(31-32)ibis-server/tests/routers/v3/connector/bigquery/conftest.py (1)
connection_info(24-29)
ibis-server/app/query_cache/__init__.py (1)
ibis-server/app/util.py (1)
round_decimal_columns(191-202)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: ci
🔇 Additional comments (8)
ibis-server/tests/routers/v3/connector/bigquery/test_query.py (1)
326-338: LGTM! Decimal precision test correctly implemented.The test effectively verifies that decimal division results are properly rounded to 9 decimal places, which aligns with the default scale used in the
round_decimal_columnsfunction. The use of high precisiondecimal(38, 8)appropriately tests edge cases for PyArrow Decimal128 limits.ibis-server/tests/routers/v2/connector/test_postgres.py (1)
1013-1026: LGTM! Consistent decimal precision testing across connectors.The test implementation maintains consistency with similar tests in other connector modules and correctly verifies the decimal rounding behavior for the Postgres v2 connector.
ibis-server/tests/routers/v3/connector/postgres/test_query.py (1)
702-714: LGTM! Comprehensive decimal precision test coverage.The test ensures decimal precision handling works consistently across API versions and connector types, providing valuable regression protection for the new rounding functionality.
ibis-server/app/util.py (1)
9-10: LGTM! Correct imports for decimal column handling.The imports for
DecimalandTabletypes are necessary for the newround_decimal_columnsfunction implementation.ibis-server/app/model/connector.py (3)
34-34: LGTM! Correct import for decimal column utility.The import of
round_decimal_columnsis necessary for the new decimal precision handling in query methods.
80-82: LGTM! Proper integration of decimal rounding in SimpleConnector.The
round_decimal_columnsfunction is correctly applied to the Ibis table before conversion to PyArrow, ensuring consistent decimal precision handling across all simple connectors.
121-126: LGTM! Consistent decimal handling and correct return type.The changes properly:
- Apply
round_decimal_columnsbefore PyArrow conversion- Update the return type annotation from
pd.DataFrametopa.Tableto reflect the actual return type- Maintain consistency with the SimpleConnector implementation
This ensures decimal precision is handled uniformly across all connector types.
ibis-server/app/query_cache/__init__.py (1)
12-12: LGTM: Import statement is correct.The import of
round_decimal_columnsutility function is properly added and necessary for the decimal precision standardization functionality.
We ensures decimal columns are cast and rounded to a fixed scale (default: 9) with a maximum precision of 36, aligning with the pyarrow Decimal128 limit (max 38).
It prevents overflow or unsupported precision issues when exporting or operating on decimal columns using pyarrow backends.
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Chores