
feat(ibis): add statement_timeout for Clickhouse #1266

Merged
douenergy merged 6 commits into Canner:main from goldmedal:feat/ch-statement-timeout
Jul 25, 2025

Conversation

@goldmedal
Contributor

@goldmedal goldmedal commented Jul 22, 2025

part of #1267

Description

Set max_execution_time on the ClickHouse connection by default.

Other changes

  • Enhance the DTO handling

Summary by CodeRabbit

  • New Features

    • Added support for configuring ClickHouse query timeouts via request headers.
    • Server now cancels ClickHouse queries that exceed the specified timeout and returns a clear error message.
    • Enhanced connection configuration to accept additional ClickHouse session settings.
    • Connection info can now be provided via connection strings for more flexible setup.
  • Bug Fixes

    • Improved exception handling for ClickHouse query cancellations to provide more informative timeout errors.
  • Tests

    • Introduced a test to verify that ClickHouse queries exceeding the timeout are properly cancelled and reported.

@coderabbitai
Contributor

coderabbitai bot commented Jul 22, 2025

Walkthrough

The changes add support for statement-level timeout settings for ClickHouse connections by introducing a settings field to the connection info model, updating connection logic to pass timeout values from headers, handling ClickHouse query cancellation exceptions, and adding a test to verify timeout enforcement. The validation of connection info was also adjusted to parse connection info rather than reconstruct DTOs.

Changes

| File(s) | Change Summary |
|---|---|
| ibis-server/app/model/__init__.py | Added optional settings and secure fields to ClickHouseConnectionInfo; updated ConnectionInfo union; broadened connection_info field types in DTOs; changed ssl_mode default to SecretStr. |
| ibis-server/app/model/data_source.py | Refined type annotations; added logic to set ClickHouse timeout from headers; enabled connection URL parsing for ClickHouse; passed settings to ClickHouse connect call; added helper methods for URL parsing and bool conversion. |
| ibis-server/app/util.py | Added ClickHouse exception handling for query cancellation in execute_with_timeout. |
| ibis-server/app/dependencies.py | Modified verify_query_dto to validate connection info via get_connection_info instead of DTO reconstruction. |
| ibis-server/app/model/metadata/dto.py | Broadened connection_info field type in MetadataDTO to accept dict or ConnectionInfo. |
| ibis-server/app/routers/v3/connector.py | Added explicit handling to re-raise DatabaseTimeoutError in async endpoint functions to prevent fallback. |
| ibis-server/tests/routers/v2/connector/test_clickhouse.py | Added async test to verify ClickHouse query timeout and cancellation via HTTP 504 response. |
| ibis-server/tests/routers/v3/connector/clickhouse/test_functions.py | Updated expected function count assertion in test_function_list by +1. |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API_Server
    participant ClickHouse

    Client->>API_Server: POST /query (with x-wren-db-statement_timeout header)
    API_Server->>API_Server: Parse header, set settings['max_execution_time']
    API_Server->>ClickHouse: Connect with settings
    API_Server->>ClickHouse: Execute query
    alt Query exceeds timeout
        ClickHouse-->>API_Server: QueryCancelled/DatabaseError
        API_Server-->>Client: HTTP 504 with cancellation message
    else Query completes
        ClickHouse-->>API_Server: Query result
        API_Server-->>Client: Query result
    end
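
The timeout branch of the diagram can be sketched in Python roughly as follows. This is an illustration only, not the code from this PR: QueryCancelled is a stand-in class rather than the driver's real exception type, and execute, handle_query, and the (status, body) return shape are hypothetical. Only the DatabaseTimeoutError name, the 180-second default, and the 504 status are taken from this PR.

```python
class QueryCancelled(Exception):
    """Stand-in for the driver-level cancellation error (hypothetical)."""


class DatabaseTimeoutError(Exception):
    """Name taken from the PR; this definition is an assumption."""


def execute(run_query, timeout_seconds: int):
    # Translate the driver's cancellation into a timeout-specific error.
    try:
        return run_query()
    except QueryCancelled as exc:
        raise DatabaseTimeoutError(
            f"Query was cancelled: exceeded the {timeout_seconds}s timeout"
        ) from exc


def handle_query(run_query, timeout_seconds: int = 180):
    # Map the timeout error to HTTP 504; everything else returns 200 here.
    try:
        return 200, execute(run_query, timeout_seconds)
    except DatabaseTimeoutError as exc:
        return 504, str(exc)
```

Translating the driver error once, close to the database call, lets the HTTP layer react to a single exception type regardless of which backend raised it.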

Estimated code review effort

2 (~15 minutes)

Poem

In burrows deep where queries race,
A rabbit set a timeout pace—
With headers read and settings set,
ClickHouse queries break a sweat.
If they run long, a hop—abort!
504 replies with a clever retort.
Fast or slow, the code’s on track—
The timeout bunny has your back! 🐇⏱️


@github-actions github-actions bot added the ibis and python labels Jul 22, 2025
@github-actions github-actions bot added the dependencies label Jul 22, 2025
@goldmedal goldmedal force-pushed the feat/ch-statement-timeout branch from 97868f7 to 8078e6d Compare July 22, 2025 07:31
@github-actions github-actions bot added the ci label Jul 22, 2025
@goldmedal goldmedal force-pushed the feat/ch-statement-timeout branch from 75fe9df to 8490ec6 Compare July 22, 2025 09:40
@goldmedal goldmedal force-pushed the feat/ch-statement-timeout branch from bad1ba5 to c29d922 Compare July 23, 2025 07:14
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
ibis-server/app/model/data_source.py (1)

185-186: Consider using a more robust boolean conversion.

The current implementation handles common truthy values, but consider using Python's built-in distutils.util.strtobool or a more comprehensive approach for better reliability.

-    def _safe_strtobool(self, val: str) -> bool:
-        return val.lower() in {"1", "true", "yes", "y"}
+    def _safe_strtobool(self, val: str) -> bool:
+        from distutils.util import strtobool
+        try:
+            return bool(strtobool(val))
+        except ValueError:
+            return False
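
One caveat on the suggestion above: distutils was deprecated by PEP 632 and removed from the standard library in Python 3.12, so the import fails on current interpreters. A dependency-free alternative might look like the sketch below. The function name, the default parameter, and the choice to return the default for unrecognized input are assumptions for illustration; the accepted strings mirror the ones distutils.util.strtobool recognized.

```python
# Strings that distutils.util.strtobool treated as true / false.
_TRUE_VALUES = {"y", "yes", "t", "true", "on", "1"}
_FALSE_VALUES = {"n", "no", "f", "false", "off", "0"}


def safe_strtobool(val: str, default: bool = False) -> bool:
    """Convert a string to bool without importing distutils.

    Unrecognized input returns `default` instead of raising (an assumption
    chosen to match the try/except in the suggestion above).
    """
    normalized = val.strip().lower()
    if normalized in _TRUE_VALUES:
        return True
    if normalized in _FALSE_VALUES:
        return False
    return default
```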
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bad1ba5 and c29d922.

⛔ Files ignored due to path filters (1)
  • ibis-server/resources/function_list/clickhouse.csv is excluded by !**/*.csv
📒 Files selected for processing (8)
  • ibis-server/app/dependencies.py (1 hunks)
  • ibis-server/app/model/__init__.py (7 hunks)
  • ibis-server/app/model/data_source.py (6 hunks)
  • ibis-server/app/model/metadata/dto.py (1 hunks)
  • ibis-server/app/routers/v3/connector.py (4 hunks)
  • ibis-server/app/util.py (2 hunks)
  • ibis-server/tests/routers/v2/connector/test_clickhouse.py (2 hunks)
  • ibis-server/tests/routers/v3/connector/clickhouse/test_functions.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • ibis-server/app/util.py
  • ibis-server/app/dependencies.py
  • ibis-server/app/model/metadata/dto.py
  • ibis-server/tests/routers/v2/connector/test_clickhouse.py
  • ibis-server/app/model/__init__.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: ci
🔇 Additional comments (11)
ibis-server/tests/routers/v3/connector/clickhouse/test_functions.py (1)

56-56: Verify that the function count change aligns with PR objectives.

The test expects one additional function (DATAFUSION_FUNCTION_COUNT + 6 instead of + 5), but the PR is focused on adding statement timeout functionality for ClickHouse. Please confirm that this change is intentional and related to the timeout feature implementation.

#!/bin/bash
# Description: Check if any new functions were added to the ClickHouse function list
# Expected: Find evidence of new timeout-related or other functions

# Search for function definitions or imports that might explain the count increase
rg -A 5 -B 5 "function.*timeout|timeout.*function" 

# Check for recent changes in function list files
fd "function_list" --type f | head -5 | xargs -I {} sh -c 'echo "=== {} ===" && head -20 "{}"'

Likely an incorrect or invalid review comment.

ibis-server/app/routers/v3/connector.py (4)

25-25: Good addition of timeout-specific exception handling.

The import of DatabaseTimeoutError properly supports the new timeout functionality for ClickHouse connections.


177-179: Appropriate timeout exception handling prevents inappropriate fallback.

The specific handling of DatabaseTimeoutError correctly prevents fallback to v2 API when a timeout occurs. This ensures that timeout errors are properly propagated to the client rather than being masked by fallback logic.
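
The re-raise pattern described here can be illustrated with a minimal sketch. It is synchronous for brevity (the real endpoints are async), and run_v3, run_v2, and run_with_fallback are hypothetical names; only DatabaseTimeoutError comes from this PR.

```python
class DatabaseTimeoutError(Exception):
    """Name taken from the PR; this definition is an assumption."""


def run_with_fallback(run_v3, run_v2):
    try:
        return run_v3()
    except DatabaseTimeoutError:
        # A timeout would very likely repeat on the fallback path,
        # so surface it to the caller instead of retrying.
        raise
    except Exception:
        # Any other failure falls back to the older code path.
        return run_v2()
```

The order of the except clauses matters: the specific timeout handler has to come before the broad Exception handler, otherwise the fallback would swallow the timeout.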


337-339: Consistent timeout handling across validation endpoint.

The timeout exception handling here mirrors the query endpoint implementation, maintaining consistency across the API.


420-422: Complete timeout handling coverage for model substitution.

The timeout exception handling is consistently applied to the model substitution endpoint as well, ensuring comprehensive timeout behavior across all affected endpoints.

ibis-server/app/model/data_source.py (6)

5-9: Good addition of required imports for URL parsing and type annotations.

The new imports support URL parsing functionality and improved type safety with more specific type annotations.


85-87: Improved type annotations enhance code clarity.

The refined type annotations (dict[str, Any] for data and dict[str, str] for headers) provide better type safety and code documentation.


106-111: Well-implemented ClickHouse timeout configuration.

The timeout handling correctly:

  • Uses the header value with a sensible default of 180 seconds
  • Only sets max_execution_time if not already configured
  • Initializes settings dict if needed
  • Converts the timeout to integer as required by ClickHouse
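
The four points above could be sketched as follows. This is a minimal illustration, not the code under review: apply_statement_timeout and the plain-dict connection info are assumptions, while the header name is the one shown in the sequence diagram and the 180-second default is the one noted above.

```python
from typing import Any

# Header name as it appears in the sequence diagram of this PR.
TIMEOUT_HEADER = "x-wren-db-statement_timeout"
DEFAULT_TIMEOUT_SECONDS = 180


def apply_statement_timeout(
    info: dict[str, Any], headers: dict[str, str]
) -> dict[str, Any]:
    """Return a copy of `info` with settings['max_execution_time'] filled in."""
    # Initialize the settings dict if it is missing or None.
    settings = dict(info.get("settings") or {})
    # Respect a value the caller already configured.
    if "max_execution_time" not in settings:
        raw = headers.get(TIMEOUT_HEADER, DEFAULT_TIMEOUT_SECONDS)
        # Header values arrive as strings; store the timeout as an integer.
        settings["max_execution_time"] = int(raw)
    return {**info, "settings": settings}
```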

116-124: Robust connection URL detection and handling.

The logic properly detects connection URL formats and delegates ClickHouse URL parsing to a specialized method while falling back to generic ConnectionUrl for other data sources.


162-183: Comprehensive ClickHouse URL parsing implementation.

The URL parsing method handles all necessary components:

  • Schema validation
  • User/password extraction with proper URL decoding
  • Host/port/database extraction
  • Query parameter parsing with special handling for secure boolean
  • Proper error handling for invalid URLs
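
For readers unfamiliar with this kind of parsing, a rough sketch using only urllib.parse is shown below. It is not the method from this PR: the function name, the "clickhouse" scheme, the returned dict shape, and routing unknown query parameters into settings are all assumptions made for illustration.

```python
from typing import Any
from urllib.parse import parse_qsl, unquote, urlparse


def parse_clickhouse_url(url: str) -> dict[str, Any]:
    parsed = urlparse(url)
    # Scheme validation (the expected scheme is an assumption).
    if parsed.scheme != "clickhouse":
        raise ValueError(f"Unsupported scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("Connection URL has no host")

    info: dict[str, Any] = {
        "host": parsed.hostname,
        "port": parsed.port,  # None when the URL omits the port
        "database": parsed.path.lstrip("/"),
        # urlparse leaves userinfo percent-encoded, so decode it explicitly.
        "user": unquote(parsed.username) if parsed.username else None,
        "password": unquote(parsed.password) if parsed.password else None,
    }

    settings: dict[str, str] = {}
    for key, value in parse_qsl(parsed.query):
        if key == "secure":
            # Special-case the boolean flag; other values stay as strings.
            info["secure"] = value.lower() in {"1", "true", "yes", "y"}
        else:
            settings[key] = value
    if settings:
        info["settings"] = settings
    return info
```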

270-270: Proper settings propagation to ClickHouse connection.

The explicit passing of settings ensures that timeout configurations are properly applied to the ClickHouse connection.

@goldmedal goldmedal requested a review from douenergy July 23, 2025 08:35
@douenergy douenergy merged commit 671c9b5 into Canner:main Jul 25, 2025
5 checks passed
@douenergy
Contributor

Thanks @goldmedal

@goldmedal goldmedal deleted the feat/ch-statement-timeout branch July 25, 2025 06:21

Labels

ci, dependencies, ibis, python


2 participants