
feat(ibis): Add safe rounding for Decimal columns to support PyArrow Decimal128 limits #1243

Merged
goldmedal merged 6 commits into Canner:main from douenergy:decimal-round
Jul 2, 2025

@douenergy
Contributor

@douenergy douenergy commented Jul 2, 2025

This change ensures decimal columns are cast and rounded to a fixed scale (default: 9) with a maximum precision of 36, which stays within the PyArrow Decimal128 limit (max precision 38).

It prevents overflow or unsupported precision issues when exporting or operating on decimal columns using pyarrow backends.
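A minimal sketch of the constraint described above, using only Python's standard `decimal` module rather than Ibis or PyArrow. The helper names `significant_digits` and `round_to_scale` are hypothetical, and the half-up rounding mode is an assumption; the actual rounding behavior depends on the backend database.

```python
from decimal import Decimal, ROUND_HALF_UP, localcontext

# Arrow's Decimal128 type holds at most 38 significant digits.
DECIMAL128_MAX_PRECISION = 38


def significant_digits(value: Decimal) -> int:
    """Return the number of digits stored in the decimal's coefficient."""
    return len(value.as_tuple().digits)


def round_to_scale(value: Decimal, scale: int = 9) -> Decimal:
    """Round to `scale` fractional digits (hypothetical helper, half-up assumed)."""
    quantum = Decimal(1).scaleb(-scale)  # e.g. 1E-9 for scale=9
    with localcontext() as ctx:
        ctx.prec = DECIMAL128_MAX_PRECISION
        return value.quantize(quantum, rounding=ROUND_HALF_UP)


# A division carried out at 50 digits yields a value too wide for Decimal128.
with localcontext() as ctx:
    ctx.prec = 50
    raw = Decimal(1) / Decimal(3)

rounded = round_to_scale(raw)
print(significant_digits(raw), rounded, significant_digits(rounded))
# prints: 50 0.333333333 9
```

The unrounded quotient carries 50 significant digits, more than Decimal128 can hold, while the value rounded to scale 9 fits comfortably.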

Summary by CodeRabbit

  • New Features

    • Improved handling of decimal columns by ensuring consistent precision and rounding in query results.
  • Bug Fixes

    • Adjusted decimal precision and scale expectations in query results across supported databases.
  • Tests

    • Added new tests to validate decimal precision and rounding behavior for query results.
    • Updated existing tests to reflect changes in decimal precision and scale.
  • Chores

    • Upgraded the pre-commit configuration to use a newer version of Ruff for code linting.

@coderabbitai
Contributor

coderabbitai bot commented Jul 2, 2025

Walkthrough

The changes introduce a utility function to round decimal columns in Ibis tables to a consistent scale of 9, update connector logic to apply this rounding before returning results, and adjust test cases and their assertions to expect the new decimal precision. Additionally, the Ruff pre-commit hook version is updated.

Changes

File(s) | Change Summary
app/model/connector.py | Apply round_decimal_columns to query results; update return type annotation for CannerConnector.
app/util.py | Add round_decimal_columns function for rounding decimal columns in Ibis tables.
tests/routers/v2/connector/test_postgres.py, tests/routers/v3/connector/bigquery/test_query.py, tests/routers/v3/connector/postgres/test_query.py | Add new test for decimal division precision.
tests/routers/v2/connector/test_clickhouse.py, tests/routers/v2/connector/test_oracle.py, tests/routers/v2/connector/test_s3_file.py, tests/routers/v3/connector/oracle/test_query.py | Update expected decimal precision/scale in test assertions and manifests.
.pre-commit-config.yaml | Update Ruff pre-commit hook version from v0.4.8 to v0.12.0.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Connector
    participant Util

    Client->>Connector: query(sql, limit)
    Connector->>Connector: ibis_table = sql_query_with_limit
    Connector->>Util: round_decimal_columns(ibis_table, scale=9)
    Util-->>Connector: rounded_ibis_table
    Connector->>Connector: convert to PyArrow table
    Connector-->>Client: return result

Suggested reviewers

  • goldmedal

Poem

In fields of decimals, rabbits leap,
Rounding numbers, digits deep.
Precision nine, the new decree,
For every query, accuracy!
Tests now pass with happy cheer—
Hopping forward, code is clear.
🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6944134 and b15ac08.

📒 Files selected for processing (3)
  • ibis-server/.pre-commit-config.yaml (1 hunks)
  • ibis-server/app/model/connector.py (3 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/test_query.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • ibis-server/.pre-commit-config.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • ibis-server/tests/routers/v3/connector/postgres/test_query.py
  • ibis-server/app/model/connector.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci

@github-actions github-actions bot added the ibis and python (Pull requests that update Python code) labels Jul 2, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
ibis-server/app/util.py (1)

191-202: Fix typo in comment and approve the implementation.

The function correctly implements decimal rounding for PyArrow Decimal128 compatibility. However, there's a typo in the comment.

-            # maxinum precision for pyarrow decimal is 38
+            # maximum precision for pyarrow decimal is 38

The implementation is solid:

  • Correctly uses precision=38 to align with PyArrow Decimal128 limits
  • Applies rounding with configurable scale (default 9)
  • Preserves non-decimal columns unchanged
  • Follows immutable pattern by returning a new table
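
The four properties listed above can be sketched with a toy, pure-Python model. This is not the PR's Ibis implementation: `ToyTable`, `DecimalType`, and `round_decimal_columns_sketch` are hypothetical stand-ins, and half-up rounding is an assumption.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP, localcontext


@dataclass(frozen=True)
class DecimalType:
    """Hypothetical stand-in for a decimal column type."""
    precision: int
    scale: int


@dataclass(frozen=True)
class ToyTable:
    """Hypothetical stand-in for a table: a schema plus row dicts."""
    schema: dict  # column name -> DecimalType, or a plain type name like "int64"
    rows: tuple   # tuple of dicts, one per row


def round_decimal_columns_sketch(table: ToyTable, scale: int = 9) -> ToyTable:
    """Return a new table whose decimal columns are decimal(38, scale), rounded."""
    target = DecimalType(precision=38, scale=scale)
    quantum = Decimal(1).scaleb(-scale)

    # Decimal columns get the target type; every other column keeps its type.
    new_schema = {
        name: (target if isinstance(dtype, DecimalType) else dtype)
        for name, dtype in table.schema.items()
    }

    def convert(name, value):
        # Only non-null values in decimal columns are rounded.
        if isinstance(table.schema[name], DecimalType) and value is not None:
            return value.quantize(quantum, rounding=ROUND_HALF_UP)
        return value

    with localcontext() as ctx:
        ctx.prec = 38  # allow results up to the Decimal128 digit limit
        new_rows = tuple(
            {name: convert(name, value) for name, value in row.items()}
            for row in table.rows
        )
    return ToyTable(schema=new_schema, rows=new_rows)


table = ToyTable(
    schema={"id": "int64", "ratio": DecimalType(38, 20)},
    rows=({"id": 1, "ratio": Decimal("0.12345678949999999999")},),
)
result = round_decimal_columns_sketch(table)
```

The input table is left untouched; only the returned copy carries the narrowed decimal type and the rounded values.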
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a3dd04d and 6be7dfa.

📒 Files selected for processing (6)
  • ibis-server/app/model/connector.py (6 hunks)
  • ibis-server/app/query_cache/__init__.py (2 hunks)
  • ibis-server/app/util.py (2 hunks)
  • ibis-server/tests/routers/v2/connector/test_postgres.py (1 hunks)
  • ibis-server/tests/routers/v3/connector/bigquery/test_query.py (1 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/test_query.py (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
ibis-server/app/model/connector.py (4)
Learnt from: goldmedal
PR: Canner/wren-engine#1224
File: ibis-server/app/util.py:49-56
Timestamp: 2025-06-18T02:23:34.040Z
Learning: DuckDB supports querying PyArrow Tables directly in SQL queries without needing to register them. When a pa.Table object is referenced in a FROM clause (e.g., "SELECT ... FROM df" where df is a pa.Table), DuckDB automatically handles the PyArrow object without requiring conn.register().
Learnt from: goldmedal
PR: Canner/wren-engine#1224
File: ibis-server/app/util.py:49-56
Timestamp: 2025-06-18T02:23:34.040Z
Learning: DuckDB supports querying PyArrow Tables directly in SQL queries without needing to register them. When a pa.Table object is referenced in a FROM clause (e.g., "SELECT ... FROM df" where df is a pa.Table), DuckDB automatically handles the PyArrow object via its "replacement scan" mechanism that recognizes Python variables referencing Arrow objects as SQL tables. No conn.register() call is required.
Learnt from: goldmedal
PR: Canner/wren-engine#1029
File: ibis-server/app/model/metadata/object_storage.py:44-44
Timestamp: 2025-01-07T03:56:21.741Z
Learning: When working with DuckDB in Python, use `conn.execute("DESCRIBE SELECT * FROM table").fetchall()` to get column types instead of accessing DataFrame-style attributes like `dtype` or `dtypes`.
Learnt from: goldmedal
PR: Canner/wren-engine#1224
File: ibis-server/app/util.py:50-57
Timestamp: 2025-06-18T02:12:43.570Z
Learning: In the `to_json` function in `ibis-server/app/util.py`, the code intentionally uses `fetch_df()` to get a pandas DataFrame and then calls `to_dict(orient='split')` because this specific format is required for `orjson` serialization. The pandas conversion step is necessary to generate the correct dictionary structure for orjson.
🧬 Code Graph Analysis (2)
ibis-server/tests/routers/v3/connector/bigquery/test_query.py (4)
ibis-server/tests/routers/v3/connector/postgres/test_query.py (2)
  • test_decimal_precision (702-714)
  • manifest_str (123-124)
ibis-server/tests/conftest.py (1)
  • client (18-23)
ibis-server/tests/routers/v3/connector/bigquery/test_functions.py (1)
  • manifest_str (31-32)
ibis-server/tests/routers/v3/connector/bigquery/conftest.py (1)
  • connection_info (24-29)
ibis-server/app/query_cache/__init__.py (1)
ibis-server/app/util.py (1)
  • round_decimal_columns (191-202)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci
🔇 Additional comments (8)
ibis-server/tests/routers/v3/connector/bigquery/test_query.py (1)

326-338: LGTM! Decimal precision test correctly implemented.

The test effectively verifies that decimal division results are properly rounded to 9 decimal places, which aligns with the default scale used in the round_decimal_columns function. The use of high precision decimal(38, 8) appropriately tests edge cases for PyArrow Decimal128 limits.
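
The kind of check such a test performs can be sketched as follows. The real test's query and response shape are not shown in this conversation, so `fractional_digits` and `all_within_scale` are hypothetical helpers that operate on plain decimal strings.

```python
from decimal import Decimal


def fractional_digits(text: str) -> int:
    """Count digits after the decimal point of a finite decimal string."""
    exponent = Decimal(text).as_tuple().exponent
    return max(0, -exponent)


def all_within_scale(values, scale: int = 9) -> bool:
    """True if every value has at most `scale` fractional digits."""
    return all(fractional_digits(v) <= scale for v in values)


# A value rounded to 9 places passes; one with 10 places does not.
print(all_within_scale(["0.333333333", "2"]))  # prints: True
print(all_within_scale(["0.3333333333"]))      # prints: False
```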

ibis-server/tests/routers/v2/connector/test_postgres.py (1)

1013-1026: LGTM! Consistent decimal precision testing across connectors.

The test implementation maintains consistency with similar tests in other connector modules and correctly verifies the decimal rounding behavior for the Postgres v2 connector.

ibis-server/tests/routers/v3/connector/postgres/test_query.py (1)

702-714: LGTM! Comprehensive decimal precision test coverage.

The test ensures decimal precision handling works consistently across API versions and connector types, providing valuable regression protection for the new rounding functionality.

ibis-server/app/util.py (1)

9-10: LGTM! Correct imports for decimal column handling.

The imports for Decimal and Table types are necessary for the new round_decimal_columns function implementation.

ibis-server/app/model/connector.py (3)

34-34: LGTM! Correct import for decimal column utility.

The import of round_decimal_columns is necessary for the new decimal precision handling in query methods.


80-82: LGTM! Proper integration of decimal rounding in SimpleConnector.

The round_decimal_columns function is correctly applied to the Ibis table before conversion to PyArrow, ensuring consistent decimal precision handling across all simple connectors.


121-126: LGTM! Consistent decimal handling and correct return type.

The changes properly:

  • Apply round_decimal_columns before PyArrow conversion
  • Update the return type annotation from pd.DataFrame to pa.Table to reflect the actual return type
  • Maintain consistency with the SimpleConnector implementation

This ensures decimal precision is handled uniformly across all connector types.
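
The ordering that matters here, rounding before conversion, can be sketched with injected callables. All names below are hypothetical placeholders, not the connector's actual methods.

```python
from typing import Any, Callable, List


def query_with_rounding(
    sql: str,
    limit: int,
    build_table: Callable[[str, int], Any],
    round_decimals: Callable[[Any], Any],
    to_arrow: Callable[[Any], Any],
) -> Any:
    """Build the limited table, round decimal columns, then convert."""
    table = build_table(sql, limit)
    rounded = round_decimals(table)  # rounding happens before conversion
    return to_arrow(rounded)


# Fakes that record the call order, standing in for the real steps.
calls: List[str] = []


def fake_build(sql: str, limit: int) -> dict:
    calls.append("build")
    return {"sql": sql, "limit": limit}


def fake_round(table: dict) -> dict:
    calls.append("round")
    return {**table, "rounded": True}


def fake_to_arrow(table: dict) -> dict:
    calls.append("to_arrow")
    return table


result = query_with_rounding("SELECT 1", 10, fake_build, fake_round, fake_to_arrow)
print(calls)  # prints: ['build', 'round', 'to_arrow']
```

Applying the rounding step while the data is still a lazy table expression means the conversion step only ever sees decimal types that fit the Decimal128 limit.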

ibis-server/app/query_cache/__init__.py (1)

12-12: LGTM: Import statement is correct.

The import of round_decimal_columns utility function is properly added and necessary for the decimal precision standardization functionality.

Contributor

@goldmedal goldmedal left a comment

Thanks @douenergy 👍

@goldmedal goldmedal merged commit eac04eb into Canner:main Jul 2, 2025
5 checks passed