feat(ibis): add statement level timeout for BigQuery#1339
feat(ibis): add statement level timeout for BigQuery#1339douenergy merged 2 commits intoCanner:mainfrom
Conversation
WalkthroughAdds optional BigQueryConnectionInfo.job_timeout_ms, derives it from X_WREN_DB_STATEMENT_TIMEOUT, initializes BigQuery via google.cloud.bigquery.Client with default QueryJobConfig including timeout, and adds a disconnect() fallback in SimpleConnector.close. No public API signature changes. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Caller
participant DataSource
participant GCP as bigquery.Client
participant Ibis as ibis.bigquery backend
Caller->>DataSource: get_bigquery_connection(info, headers)
Note over DataSource: derive job_timeout_ms from X_WREN_DB_STATEMENT_TIMEOUT if unset
DataSource->>GCP: Client(project, credentials, location)
DataSource->>GCP: set default_query_job_config (job_timeout_ms)
DataSource->>Ibis: ibis.bigquery.connect(client=GCP)
Ibis-->>Caller: backend instance
sequenceDiagram
autonumber
participant User
participant Connector as SimpleConnector
participant Conn as UnderlyingConnection
User->>Connector: close()
alt Conn has close()
Connector->>Conn: close()
else Conn has disconnect()
Connector->>Conn: disconnect()
else Conn has con with close()
Connector->>Conn: con.close()
else
Note over Connector: log / noop as existing behavior
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
ibis-server/app/model/__init__.py(1 hunks)ibis-server/app/model/connector.py(1 hunks)ibis-server/app/model/data_source.py(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: validate-pull-request-title
- GitHub Check: ci
🔇 Additional comments (4)
ibis-server/app/model/__init__.py (1)
106-109: LGTM!The optional
job_timeout_msfield is well-defined with appropriate type annotation and clear documentation. The default ofNoneallows backward compatibility while enabling timeout configuration when needed.ibis-server/app/model/connector.py (1)
236-238: LGTM!The
disconnect()fallback extends the existing close pattern appropriately, ensuring compatibility with backends that usedisconnect()instead ofclose(). The comment clearly explains the rationale, and the implementation preserves existing error handling.ibis-server/app/model/data_source.py (2)
12-12: LGTM!The import is necessary to support the new client-based BigQuery connection initialization.
128-131: LGTM with minor observation.The timeout handling logic correctly converts the header value from seconds to milliseconds and only sets
job_timeout_mswhen not already configured.Minor note: Since
job_timeout_msis a defined field onBigQueryConnectionInfo(with defaultNone), thehasattr(info, "job_timeout_ms")check will always beTrue. Theor info.job_timeout_ms is Nonecondition handles the actual case, making thehasattrcheck redundant but harmless.
| bq_client = bigquery.Client( | ||
| project=info.project_id.get_secret_value(), credentials=credentials | ||
| ) | ||
| job_config = bigquery.QueryJobConfig() | ||
| bq_client.default_query_job_config = job_config | ||
| backend = ibis.bigquery.connect(client=bq_client, credentials=credentials) | ||
| return backend |
There was a problem hiding this comment.
Apply the timeout configuration to QueryJobConfig.
The QueryJobConfig is created but not configured with the job_timeout_ms value from info. This means statement-level timeouts will not be enforced, which is the primary objective of this PR.
Apply this diff to configure the timeout:
bq_client = bigquery.Client(
project=info.project_id.get_secret_value(), credentials=credentials
)
job_config = bigquery.QueryJobConfig()
+if info.job_timeout_ms is not None:
+ job_config.job_timeout_ms = info.job_timeout_ms
bq_client.default_query_job_config = job_config
backend = ibis.bigquery.connect(client=bq_client, credentials=credentials)
return backend📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| bq_client = bigquery.Client( | |
| project=info.project_id.get_secret_value(), credentials=credentials | |
| ) | |
| job_config = bigquery.QueryJobConfig() | |
| bq_client.default_query_job_config = job_config | |
| backend = ibis.bigquery.connect(client=bq_client, credentials=credentials) | |
| return backend | |
| bq_client = bigquery.Client( | |
| project=info.project_id.get_secret_value(), credentials=credentials | |
| ) | |
| job_config = bigquery.QueryJobConfig() | |
| if info.job_timeout_ms is not None: | |
| job_config.job_timeout_ms = info.job_timeout_ms | |
| bq_client.default_query_job_config = job_config | |
| backend = ibis.bigquery.connect(client=bq_client, credentials=credentials) | |
| return backend |
🤖 Prompt for AI Agents
In ibis-server/app/model/data_source.py around lines 273 to 279, the
QueryJobConfig is created but not assigned the statement timeout from info; set
the timeout on the config by assigning job_config.job_timeout_ms to the value
from info (e.g., job_config.job_timeout_ms = int(info.job_timeout_ms or
info.job_timeout_ms.get_secret_value() as appropriate) or use the correct
accessor used elsewhere for secrets), then proceed to set
bq_client.default_query_job_config = job_config and return the backend so
statement-level timeouts are enforced.
Description
job_timeout_msfor BigQueryConnectionInfoX_WREN_DB_STATEMENT_TIMEOUTto BigQuerydisconnectto close the BigQuery connection.How to test
BigQuery has no
sleeporwaitfunction. To avoid making flaky tests, I only tested locally. If a timeout occurs, the error message would be{ "errorCode": "GENERIC_USER_ERROR", "message": "499 GET https://bigquery.googleapis.com/bigquery/v2/projects/....: Job execution was cancelled: Job timed out after 1 sec\n\nLocation: asia-east1\nJob ID: 4357235c-c72b-4398-96fb-2f3af803cd17\n", "metadata": { "dialectSql": "SELECT COUNT(1) AS count_40_42_41 FROM (SELECT events.`_TABLE_SUFFIX` FROM (SELECT events.`_TABLE_SUFFIX` FROM (SELECT __source.`_TABLE_SUFFIX` AS `_TABLE_SUFFIX` FROM project.analytics_446447784.`events_*` AS __source) AS events) AS events WHERE regexp_contains(events.`_TABLE_SUFFIX`, '^[0-9]+')) AS events" }, "phase": "SQL_EXECUTION", "timestamp": "2025-09-26T17:53:33.590191", "correlationId": "dd043b2a-8f82-46b9-a7f6-2708c0a3bc1e" }Summary by CodeRabbit
New Features
Bug Fixes