
feat(ibis): Introduce WrenError for the structured ErrorResponse #1299

Merged
goldmedal merged 12 commits into Canner:main from goldmedal:feat/enhance-error-handle
Aug 27, 2025
Conversation


@goldmedal goldmedal commented Aug 26, 2025

Description

WrenError is an error type that wraps the original exception to unify the format of the error response. It includes an error code, the error phase, and some debugging information. The response now looks like the following:

{
  "errorCode": "INVALID_SQL",
  "message": "Error during planning: table 'wren.public.orders1' not found",
  "metadata": null,
  "phase": "SQL_PLANNING",
  "timestamp": "2025-08-25T14:05:34.573940",
  "correlationId": "55b4302c-9c45-4956-9060-bb0da82fbf62"
}
  • errorCode classifies the error.
  • message carries the original error message.
  • metadata can collect debugging information (e.g., the dialect SQL or the planned SQL).
  • phase indicates when the error occurred.
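A minimal sketch of how such a structured error could be modeled (field names mirror the JSON payload above; this is illustrative only, and the actual implementation lives in app/model/error.py):

```python
import uuid
from datetime import datetime, timezone


class WrenError(Exception):
    """Sketch: wraps an underlying failure with a code, phase, and metadata."""

    def __init__(self, error_code, message, phase=None, metadata=None):
        super().__init__(message)
        self.error_code = error_code
        self.message = message
        self.phase = phase
        self.metadata = metadata
        # Timezone-aware timestamp, serialized as ISO-8601.
        self.timestamp = datetime.now(timezone.utc).isoformat()

    def to_response(self, correlation_id=None):
        # Shape matches the example payload shown above.
        return {
            "errorCode": self.error_code,
            "message": self.message,
            "metadata": self.metadata,
            "phase": self.phase,
            "timestamp": self.timestamp,
            "correlationId": correlation_id or str(uuid.uuid4()),
        }


err = WrenError(
    "INVALID_SQL",
    "Error during planning: table 'wren.public.orders1' not found",
    phase="SQL_PLANNING",
    metadata=None,
)
payload = err.to_response()
```

Because WrenError subclasses Exception, it can be raised anywhere in the request path and converted to the JSON payload in a single exception handler.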

Summary by CodeRabbit

  • New Features

    • Consistent JSON error responses (errorCode, message, timestamp, correlationId).
    • Support for X-Correlation-ID header in requests/responses.
    • New SSL mode: verify_ca.
    • BigQuery: auto-derive schema for certain empty-result cases.
  • Improvements

    • Standardized HTTP statuses (e.g., 404, 501, 504) for clearer API behavior.
    • More contextual, backend-aware error messages with processing-phase info.
  • Bug Fixes

    • Several error responses changed from plain text to structured JSON.
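The X-Correlation-ID behavior summarized above can be sketched as a small helper: reuse the client-supplied header value when present, otherwise mint a fresh ID. This is a simplified stand-in for what correlation-ID middleware (such as asgi-correlation-id) typically does; the function name is hypothetical.

```python
import uuid


def resolve_correlation_id(request_headers: dict) -> str:
    """Return the client's X-Correlation-ID, or generate a new UUID."""
    # ASGI servers expose header names in lowercase.
    incoming = request_headers.get("x-correlation-id")
    return incoming if incoming else str(uuid.uuid4())


# The resolved ID is echoed in both the response header and the
# errorCode payload's correlationId field, so client and server
# logs can be joined on one value.
cid = resolve_correlation_id(
    {"x-correlation-id": "55b4302c-9c45-4956-9060-bb0da82fbf62"}
)
```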


coderabbitai bot commented Aug 26, 2025

Walkthrough

Centralizes error handling by adding app/model/error.py (WrenError, ErrorCode, ErrorPhase, ErrorResponse), replaces legacy custom exceptions across many modules, changes FastAPI error responses to JSON (ORJSONResponse) with X-Correlation-ID header support, and updates tests to assert JSON error payloads while removing many /validate tests.

Changes

  • Core error model & exports (ibis-server/app/model/error.py, ibis-server/app/model/__init__.py):
    Added DIALECT_SQL, PLANNED_SQL, ErrorCode, ErrorPhase, ErrorResponse, WrenError, DatabaseTimeoutError; removed the legacy CustomHttpError hierarchy; added SSLMode.VERIFY_CA.
  • App entrypoint, middleware & headers (ibis-server/app/main.py, ibis-server/app/dependencies.py):
    Added the X_CORRELATION_ID constant; switched exception handlers to WrenError; error responses use ORJSONResponse with an ErrorResponse payload including correlation ID and timestamp.
  • MDL, rewriter & planning (ibis-server/app/mdl/analyzer.py, ibis-server/app/mdl/rewriter.py, ibis-server/app/mdl/substitute.py, ibis-server/app/mdl/java_engine.py):
    Replaced local/legacy exceptions with WrenError and ErrorCode/ErrorPhase metadata; removed AnalyzeError/RewriteError/WrenEngineError/SubstituteError; added PLANNED_SQL where applicable.
  • Connectors, data sources & metadata (ibis-server/app/model/connector.py, ibis-server/app/model/data_source.py, ibis-server/app/model/metadata/canner.py, ibis-server/app/model/metadata/object_storage.py):
    Unified error wrapping into WrenError (INVALID_SQL, GET_CONNECTION_ERROR, etc.), attached SQL metadata, refined DuckDB/BigQuery flows, and replaced ValueError and other exceptions with coded WrenError.
  • Utilities & timeouts (ibis-server/app/util.py):
    Added a ClickHouseDbError compatibility alias; moved the DatabaseTimeoutError import to the new module; removed execute_sample_with_timeout.
  • Wren CLI & session (ibis-server/wren/__init__.py, ibis-server/wren/__main__.py, ibis-server/wren/session/__init__.py):
    Replaced ValueError raises with WrenError(GENERIC_USER_ERROR); improved connection_info handling; preserved public signatures.
  • Tests — error shape & validation removal (ibis-server/tests/routers/**, ibis-server/tests/routers/v3/connector/postgres/*):
    Updated many tests to expect a JSON ErrorResponse (errorCode/message/phase/correlationId); switched imports to app.model.error; removed numerous /validate rule tests and adjusted some status expectations (notably 404 for certain not-found cases).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant App as FastAPI App
  participant Endpoint as Route Logic
  participant Service as Service / DB / MDL
  participant Handler as WrenError Handler
  participant Resp as ORJSONResponse

  Client->>App: HTTP request (may include X-Correlation-ID)
  App->>Endpoint: dispatch
  Endpoint->>Service: perform operation
  alt Success
    Service-->>Endpoint: result
    Endpoint-->>Client: 2xx JSON
  else WrenError raised
    Service-->>Endpoint: raise WrenError(code, phase, metadata)
    Endpoint->>Handler: build ErrorResponse (errorCode, message, phase, metadata, correlationId, timestamp)
    Handler->>Resp: serialize JSON
    Resp-->>Client: status = WrenError.get_http_status_code()
  else Unexpected Exception
    Service-->>Endpoint: raise Exception
    Endpoint->>Handler: build generic ErrorResponse (GENERIC_INTERNAL_ERROR, correlationId, timestamp)
    Resp-->>Client: 500 JSON
  end
  note right of App: Correlation header: X-Correlation-ID

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • onlyjackfrost
  • douenergy

Poem

In burrows of bytes a rabbit hops light,
WrenError carrots gleam through the night.
Headers aligned, correlation bright,
JSON crumbs guide each request's flight.
Hop on, dear stack — errors made polite. 🥕


@github-actions github-actions bot added bigquery ibis python Pull requests that update Python code labels Aug 26, 2025
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
ibis-server/app/mdl/java_engine.py (1)

37-42: Bug: r.raise_for_status() returns None — chaining .text will crash on success.

httpx.Response.raise_for_status() returns None (and raises on non-2xx). As written, successful responses attempt None.text and fail.

-        return r.raise_for_status().text.replace("\n", " ")
+        r.raise_for_status()
+        return r.text.replace("\n", " ")
ibis-server/app/util.py (1)

252-264: Exceptions other than “timeout” are swallowed — function may return None silently.

In both ClickHouse and Trino handlers, if the error is not a timeout, the except block falls through without re-raising, potentially returning None and hiding real failures.

-    except TimeoutError:
+    except asyncio.TimeoutError:
         raise DatabaseTimeoutError(
             f"{operation_name} timeout after {app_timeout_seconds} seconds"
         )
     except clickhouse_connect.driver.exceptions.DatabaseError as e:
         if "TIMEOUT_EXCEEDED" in str(e):
             raise DatabaseTimeoutError(f"{operation_name} was cancelled: {e}")
+        raise
     except trino.exceptions.TrinoQueryError as e:
         if e.error_name == "EXCEEDED_TIME_LIMIT":
             raise DatabaseTimeoutError(f"{operation_name} was cancelled: {e}")
+        raise
     except psycopg.errors.QueryCanceled as e:
         raise DatabaseTimeoutError(f"{operation_name} was cancelled: {e}")
ibis-server/wren/__init__.py (1)

57-74: Use specific error codes and simplify file handling for MDL file

  • File readability check after a successful open is unnecessary; catch FileNotFoundError instead and map to MDL_NOT_FOUND (404).
  • Malformed JSON should map to INVALID_MDL.
  • Keep GENERIC_USER_ERROR only for generic 4xx conditions.
-    with open(mdl_path) as f:
-        if not f.readable():
-            raise WrenError(
-                ErrorCode.GENERIC_USER_ERROR, f"Cannot read MDL file at {mdl_path}"
-            )
-        try:
-            manifest = json.load(f)
-            manifest_base64 = (
-                base64.b64encode(json.dumps(manifest).encode("utf-8")).decode("utf-8")
-                if manifest
-                else None
-            )
-        except json.JSONDecodeError as e:
-            raise WrenError(
-                ErrorCode.GENERIC_USER_ERROR,
-                f"Invalid JSON in MDL file at {mdl_path}: {e}",
-            ) from e
+    try:
+        with open(mdl_path) as f:
+            manifest = json.load(f)
+        manifest_base64 = (
+            base64.b64encode(json.dumps(manifest).encode("utf-8")).decode("utf-8")
+            if manifest
+            else None
+        )
+    except FileNotFoundError as e:
+        raise WrenError(
+            ErrorCode.MDL_NOT_FOUND,
+            f"MDL file not found at {mdl_path}",
+        ) from e
+    except json.JSONDecodeError as e:
+        raise WrenError(
+            ErrorCode.INVALID_MDL,
+            f"Invalid JSON in MDL file at {mdl_path}: {e}",
+        ) from e
ibis-server/app/model/validator.py (2)

93-96: Avoid naive split on '=' when parsing relationship condition.

Use a single split to tolerate expressions containing '=' (e.g., CAST, JSON ops). Trim both sides safely.

-        columns = condition.split("=")
-        left_column = columns[0].strip().split(".")[1]
-        right_column = columns[1].strip().split(".")[1]
+        left_expr, right_expr = [p.strip() for p in condition.split("=", 1)]
+        left_column = left_expr.split(".")[1]
+        right_column = right_expr.split(".")[1]

97-104: Quote column identifiers in DISTINCT to avoid case/reserved-word issues.

The join SQL quotes columns; do the same for the uniqueness check for consistency and correctness.

-        def generate_column_is_unique_sql(model_name, column_name):
-            return f'SELECT count(*) = count(distinct {column_name}) AS result FROM "{model_name}"'
+        def generate_column_is_unique_sql(model_name, column_name):
+            return (
+                f'SELECT count(*) = count(distinct "{column_name}") AS result '
+                f'FROM "{model_name}"'
+            )
ibis-server/app/main.py (1)

105-114: And for 501 handler: use the runtime correlation ID.

-    return ORJSONResponse(
+    from asgi_correlation_id import correlation_id  # local import
+    corr_id = correlation_id.get() or request.headers.get(X_CORRELATION_ID)
+    return ORJSONResponse(
         status_code=501,
         content=ErrorResponse(
             error_code=ErrorCode.NOT_IMPLEMENTED.name,
             message=str(exc),
-            timestamp=datetime.now().isoformat(),
-            correlation_id=request.headers.get(X_CORRELATION_ID),
+            timestamp=datetime.now().isoformat(),
+            correlation_id=corr_id,
         ).model_dump(by_alias=True),
     )
ibis-server/app/model/connector.py (1)

399-441: BigQuery: return a PyArrow table for the empty-result workaround.

The method contract returns pa.Table. Returning a pandas.DataFrame here will break upstream code that expects .to_pandas()/.to_pyarrow().

-                    ibis_fields = ibis_schema_mapper.to_ibis(bq_fields.schema)
-                    return pd.DataFrame(columns=ibis_fields.names)
+                    ibis_fields = ibis_schema_mapper.to_ibis(bq_fields.schema)
+                    empty_df = pd.DataFrame(columns=ibis_fields.names)
+                    return pa.Table.from_pandas(empty_df)
🧹 Nitpick comments (60)
ibis-server/tests/routers/v2/connector/test_local_file.py (1)

248-251: Migrate raw-text error assertions to structured JSON checks across all tests

The API now consistently returns a WrenError/ErrorResponse JSON payload rather than plain text. Beyond the single snippet in test_local_file.py, dozens of other tests still assert on response.text or do exact-match on response.json()["message"]. To improve robustness and validate the full error schema, update each test to:

  • Parse the JSON once (e.g. err = response.json())
  • Assert the "message" key exists, is a string, and uses startswith or in rather than exact equality
  • Verify the presence (and string type) of the structured fields: "code", "phase", and "correlationId"

Key locations needing refactoring include (but are not limited to):

  • ibis-server/tests/routers/v2/test_relationship_valid.py
    • lines 121, 134, 150: raw response.text == …

  • ibis-server/tests/routers/v2/connector/test_minio_file.py
    • line 351: exact response.text == "Failed to list files: Unsupported format: unsupported"

  • ibis-server/tests/routers/v2/connector/test_local_file.py
    • lines 248–251: exact response.json()["message"] == "Failed to list files: Unsupported format: unsupported"

  • ibis-server/tests/routers/v2/connector/test_s3_file.py & test_gcs_file.py
    • lines 296–297: exact response.json()["message"] == "Failed to list files: Unsupported format: unsupported"

  • ibis-server/tests/routers/v2/connector/test_postgres.py
    • lines 598–599, 686–687, 954–955, 966–968: exact response.json()["message"] == …

  • ibis-server/tests/routers/v2/connector/test_oracle.py
    • lines 420–421: exact response.json()["message"] == 'Ambiguous model: …'

  • ibis-server/tests/routers/v3/connector/postgres/test_validate.py
    • lines 43–44, 111–112, 122–123, 177–178: exact response.json()["message"] == …

  • ibis-server/tests/routers/v3/connector/postgres/test_model_substitute.py
    • lines 271–272, 286–288: exact response.json()["message"] == …

…and all other tests under ibis-server/tests/routers/{v2,v3}/connector that still reference response.text or do exact JSON-message matches.

Example refactor for test_local_file.py:

-    assert (
-        response.json()["message"]
-        == "Failed to list files: Unsupported format: unsupported"
-    )
+    err = response.json()
+    # Verify message shape without brittle exact match
+    assert "message" in err and isinstance(err["message"], str)
+    assert err["message"].startswith("Failed to list files: Unsupported format")
+    # Ensure structured error fields exist
+    for key in ("code", "phase", "correlationId"):
+        assert key in err and isinstance(err[key], str)

Please apply similar changes across all identified tests to fully migrate to the structured ErrorResponse schema.

ibis-server/app/dependencies.py (2)

42-43: Use the constant and exact match for filtering correlation header.

Request headers from ASGI are lowercase; prefer comparing against X_CORRELATION_ID.lower() and use equality instead of startswith to avoid unintentionally passing through similarly prefixed headers.

Apply this diff:

-    elif header_string.startswith("x-correlation-id"):
-        return True
+    elif header_string == X_CORRELATION_ID.lower():
+        return True

44-49: Optional: allow W3C baggage propagation.

You already pass traceparent/tracestate/sentry-trace. If you use OpenTelemetry or Sentry, consider also allowing the standardized "baggage" header for context propagation consistency.

Apply this diff:

-    elif header_string == "sentry-trace":
+    elif header_string == "sentry-trace":
         return True
+    elif header_string == "baggage":
+        return True
ibis-server/tests/routers/v2/connector/test_s3_file.py (1)

295-298: Strengthen assertions to validate structured ErrorResponse.

Since the API now returns a structured error payload (ErrorResponse), assert presence of errorCode and correlationId, and that the correlationId matches the response header. Keeps tests aligned with the new contract and catches serialization regressions.

Apply this diff:

-    assert (
-        response.json()["message"]
-        == "Failed to list files: Unsupported format: unsupported"
-    )
+    err = response.json()
+    assert err["message"] == "Failed to list files: Unsupported format: unsupported"
+    # Structured fields
+    assert "errorCode" in err
+    assert "correlationId" in err
+    # Correlation header is propagated and matches payload
+    cid = response.headers.get("X-Correlation-ID")
+    assert cid and err["correlationId"] == cid
ibis-server/tests/routers/v2/connector/test_gcs_file.py (1)

295-298: Mirror S3 test improvements for GCS error shape.

Validate errorCode and correlationId presence, and assert correlation header parity with payload.

Apply this diff:

-    assert (
-        response.json()["message"]
-        == "Failed to list files: Unsupported format: unsupported"
-    )
+    err = response.json()
+    assert err["message"] == "Failed to list files: Unsupported format: unsupported"
+    # Structured fields
+    assert "errorCode" in err
+    assert "correlationId" in err
+    # Correlation header is propagated and matches payload
+    cid = response.headers.get("X-Correlation-ID")
+    assert cid and err["correlationId"] == cid
ibis-server/app/model/error.py (4)

72-85: Make timestamps timezone-aware (UTC).

Naive local timestamps complicate cross-service correlation. Use UTC and include the offset in ISO-8601.

Apply this diff:

-from datetime import datetime
+from datetime import datetime, timezone
@@
-        self.timestamp = datetime.now().isoformat()
+        self.timestamp = datetime.now(timezone.utc).isoformat()

72-80: Either use the 'cause' parameter or drop it.

The constructor accepts cause but never stores or exposes it. Persist it for logging/observability, or remove the parameter to avoid confusion.

Apply this diff:

     def __init__(
         self,
         error_code: ErrorCode,
         message: str,
         phase: ErrorPhase | None = None,
         metadata: dict[str, Any] | None = None,
         cause: Exception | None = None,
     ):
         self.error_code = error_code
         self.message = message
         self.phase = phase
         self.metadata = metadata
-        self.timestamp = datetime.now().isoformat()
+        self.timestamp = datetime.now(timezone.utc).isoformat()
+        self.cause = cause  # retained for logs/diagnostics

96-114: Map DUCKDB_FILE_NOT_FOUND to 404.

A file-not-found condition semantically aligns with HTTP 404 alongside other "not found" codes.

Apply this diff:

         match self.error_code:
             case (
                 ErrorCode.NOT_FOUND
                 | ErrorCode.MDL_NOT_FOUND
                 | ErrorCode.VALIDATION_RULE_NOT_FOUND
+                | ErrorCode.DUCKDB_FILE_NOT_FOUND
             ):
                 return HTTP_404_NOT_FOUND

116-125: Optional: include structured metadata for timeouts.

Consider adding metadata (e.g., timeoutMs, engine/resource info) to aid users in remediation without parsing the message string.

Example (outside current diff scope):

class DatabaseTimeoutError(WrenError):
    def __init__(self, message: str, *, timeout_ms: int | None = None, resource: str | None = None):
        meta = {"timeoutMs": timeout_ms, "resource": resource}
        super().__init__(error_code=ErrorCode.DATABASE_TIMEOUT, message=message, metadata=meta)
ibis-server/app/model/metadata/canner.py (1)

72-74: Attach phase and helpful metadata to WrenError.

This enriches diagnostics and keeps errors consistent with the new model.

Apply this diff:

-            raise WrenError(
-                ErrorCode.INVALID_CONNECTION_INFO, f"Workspace {ws_sql_name} not found"
-            )
+            from app.model.error import ErrorPhase
+            raise WrenError(
+                ErrorCode.INVALID_CONNECTION_INFO,
+                f"Workspace {ws_sql_name} not found",
+                phase=ErrorPhase.METADATA_FETCHING,
+                metadata={"workspaceSqlName": ws_sql_name},
+            )
ibis-server/app/mdl/java_engine.py (1)

31-35: Use a more precise phase and enrich metadata for observability.

Since this path guards dry-plan calls, SQL_DRY_RUN is a closer fit than SQL_PLANNING. Also consider attaching minimal metadata (e.g., which endpoint was missing) to aid debugging.

-            raise WrenError(
-                ErrorCode.GENERIC_INTERNAL_ERROR,
-                "WREN_ENGINE_ENDPOINT is not set. Cannot call dry_plan without a valid endpoint.",
-                phase=ErrorPhase.SQL_PLANNING,
-            )
+            raise WrenError(
+                ErrorCode.GENERIC_INTERNAL_ERROR,
+                "WREN_ENGINE_ENDPOINT is not set. Cannot call dry_plan without a valid endpoint.",
+                phase=ErrorPhase.SQL_DRY_RUN,
+                metadata={"reason": "missing_wren_engine_endpoint"},
+            )
ibis-server/app/util.py (4)

41-43: Polish the migration banner grammar (tiny nit).

-MIGRATION_MESSAGE = "Wren engine is migrating to Rust version now. \
-    Wren AI team are appreciate if you can provide the error messages and related logs for us."
+MIGRATION_MESSAGE = "Wren engine is migrating to the Rust version now. \
+    The Wren AI team would appreciate receiving error messages and related logs."

64-71: Typos in debug log — make log lines easier to grep.

-    logger.debug(f"formmated_sql: {formatted_sql}")
+    logger.debug(f"formatted_sql: {formatted_sql}")

213-234: Rename _formater → _formatter for consistency; update call site.

Minor naming nit; improves readability and greppability.

-def _formater(field: pa.Field) -> str:
+def _formatter(field: pa.Field) -> str:
@@
-        + ", ".join([_formater(field) for field in df.schema])
+        + ", ".join([_formatter(field) for field in df.schema])

248-256: Optional: handle the general case where operation is a Task vs. bare awaitable consistently.

Current usage is fine, but if you foresee passing coroutines directly more often, consider normalizing to a Task internally to simplify cancellation paths.

ibis-server/tests/routers/v2/connector/test_clickhouse.py (1)

471-499: Assert structured error payload for timeouts, not just substrings.

Given the unified ErrorResponse, assert errorCode and message in JSON. Keep substring check if desired.

-    assert response.status_code == 504  # Gateway Timeout
-    assert "Query was cancelled:" in response.text
+    assert response.status_code == 504  # Gateway Timeout
+    payload = response.json()
+    assert payload["errorCode"] == ErrorCode.DATABASE_TIMEOUT.name
+    assert "Query was cancelled:" in payload["message"]
@@
-    assert response.status_code == 504  # Gateway Timeout
-    assert "Query was cancelled:" in response.text
+    assert response.status_code == 504  # Gateway Timeout
+    payload = response.json()
+    assert payload["errorCode"] == ErrorCode.DATABASE_TIMEOUT.name
+    assert "Query was cancelled:" in payload["message"]
ibis-server/app/mdl/analyzer.py (3)

5-5: Consider importing ErrorPhase to tag where failures occur.

Adding phases improves troubleshooting and aligns with other modules.

-from app.model.error import ErrorCode, WrenError
+from app.model.error import ErrorCode, WrenError, ErrorPhase

20-25: Attach phase and minimal metadata to WrenError for better diagnostics.

Connect failures and HTTP errors benefit from phase and context (endpoint/path, status).

-        raise WrenError(
-            ErrorCode.LEGACY_ENGINE_ERROR, f"Can not connect to Java Engine: {e}"
-        ) from e
+        raise WrenError(
+            ErrorCode.LEGACY_ENGINE_ERROR,
+            f"Cannot connect to Java Engine: {e}",
+            phase=ErrorPhase.SQL_PLANNING,
+            metadata={"endpoint": wren_engine_endpoint, "path": "/v2/analysis/sql"},
+        ) from e
@@
-        raise WrenError(ErrorCode.GENERIC_USER_ERROR, e.response.text)
+        raise WrenError(
+            ErrorCode.GENERIC_USER_ERROR,
+            e.response.text,
+            phase=ErrorPhase.SQL_PLANNING,
+            metadata={
+                "endpoint": wren_engine_endpoint,
+                "path": "/v2/analysis/sql",
+                "status_code": e.response.status_code,
+            },
+        )

37-41: Mirror the same phase/metadata for analyze_batch.

-        raise WrenError(
-            ErrorCode.LEGACY_ENGINE_ERROR, f"Can not connect to Java Engine: {e}"
-        ) from e
+        raise WrenError(
+            ErrorCode.LEGACY_ENGINE_ERROR,
+            f"Cannot connect to Java Engine: {e}",
+            phase=ErrorPhase.SQL_PLANNING,
+            metadata={"endpoint": wren_engine_endpoint, "path": "/v2/analysis/sqls"},
+        ) from e
@@
-        raise WrenError(ErrorCode.GENERIC_USER_ERROR, e.response.text)
+        raise WrenError(
+            ErrorCode.GENERIC_USER_ERROR,
+            e.response.text,
+            phase=ErrorPhase.SQL_PLANNING,
+            metadata={
+                "endpoint": wren_engine_endpoint,
+                "path": "/v2/analysis/sqls",
+                "status_code": e.response.status_code,
+            },
+        )
ibis-server/app/model/data_source.py (2)

181-185: Expose non-sensitive metadata on invalid ClickHouse URL to speed up debugging.

Including the provided scheme (only) helps identify typical mistakes (e.g., “http” vs “clickhouse”) without leaking secrets.

-            raise WrenError(
-                ErrorCode.INVALID_CONNECTION_INFO,
-                "Invalid connection URL for ClickHouse",
-            )
+            raise WrenError(
+                ErrorCode.INVALID_CONNECTION_INFO,
+                "Invalid connection URL for ClickHouse",
+                metadata={"scheme": parsed.scheme or None},
+            )

411-416: SSL CA requirement is correct; consider clarifying the mode name.

Message is clear. Optionally add ssl_mode value to metadata for clarity.

-            raise WrenError(
-                ErrorCode.INVALID_CONNECTION_INFO,
-                "SSL CA must be provided when SSL mode is VERIFY CA",
-            )
+            raise WrenError(
+                ErrorCode.INVALID_CONNECTION_INFO,
+                "SSL CA must be provided when SSL mode is VERIFY CA",
+                metadata={"ssl_mode": "VERIFY_CA"},
+            )
ibis-server/app/model/metadata/object_storage.py (2)

84-88: Avoid leaking raw exception details; include cause and keep the message generic.

Returning the raw exception text in the end-user message risks exposing internal paths or provider messages. Keep the message stable, attach the underlying exception via cause, and rely on logs/observability for details. Also keeps phase metadata intact.

Apply this diff:

-            raise WrenError(
-                ErrorCode.GENERIC_USER_ERROR,
-                f"Failed to list files: {e!s}",
-                phase=ErrorPhase.METADATA_FETCHING,
-            )
+            raise WrenError(
+                ErrorCode.GENERIC_USER_ERROR,
+                "Failed to list files",
+                phase=ErrorPhase.METADATA_FETCHING,
+                cause=e,
+            )

350-355: Add phase metadata for consistency in get_version().

Other branches use ErrorPhase.METADATA_FETCHING; aligning here improves tracing and alerting. Message content is fine.

-            raise WrenError(
-                ErrorCode.GENERIC_USER_ERROR, "Failed to get DuckDB version"
-            )
+            raise WrenError(
+                ErrorCode.GENERIC_USER_ERROR,
+                "Failed to get DuckDB version",
+                phase=ErrorPhase.METADATA_FETCHING,
+            )
@@
-            raise WrenError(ErrorCode.GENERIC_USER_ERROR, "DuckDB version is empty")
+            raise WrenError(
+                ErrorCode.GENERIC_USER_ERROR,
+                "DuckDB version is empty",
+                phase=ErrorPhase.METADATA_FETCHING,
+            )
ibis-server/app/mdl/rewriter.py (5)

50-60: Capture dialect metadata and attach cause for SQLGlot failures.

Adding read/write to metadata materially improves debuggability in telemetry, and cause preserves the original stack for logs while keeping the client message concise.

         try:
             read = self._get_read_dialect(self.experiment)
             write = self._get_write_dialect(self.data_source)
             return sqlglot.transpile(planned_sql, read=read, write=write)[0]
         except Exception as e:
-            raise WrenError(
-                ErrorCode.SQLGLOT_ERROR,
-                str(e),
-                phase=ErrorPhase.SQL_TRANSPILE,
-                metadata={PLANNED_SQL: planned_sql},
-            )
+            raise WrenError(
+                ErrorCode.SQLGLOT_ERROR,
+                str(e),
+                phase=ErrorPhase.SQL_TRANSPILE,
+                metadata={PLANNED_SQL: planned_sql, "readDialect": read, "writeDialect": write},
+                cause=e,
+            )

112-125: Consider classifying engine connectivity/timeouts as external errors and set phase.

Using LEGACY_ENGINE_ERROR (101) maps to HTTP 500. For network connectivity/timeouts to an external service, GENERIC_EXTERNAL_ERROR (maps to 502) may better communicate the nature of the failure to clients. Also attach phase=SQL_PLANNING and cause for parity with other error sites.

If API consumers depend on 500 today, confirm before changing. Proposed diff:

-        except httpx.ConnectError as e:
-            raise WrenError(
-                ErrorCode.LEGACY_ENGINE_ERROR, f"Can not connect to Java Engine: {e}"
-            )
+        except httpx.ConnectError as e:
+            raise WrenError(
+                ErrorCode.GENERIC_EXTERNAL_ERROR,
+                f"Cannot connect to Java Engine: {e}",
+                phase=ErrorPhase.SQL_PLANNING,
+                cause=e,
+            )
         except httpx.TimeoutException as e:
-            raise WrenError(
-                ErrorCode.LEGACY_ENGINE_ERROR,
-                f"Timeout when connecting to Java Engine: {e}",
-            )
+            raise WrenError(
+                ErrorCode.GENERIC_EXTERNAL_ERROR,
+                f"Timeout when connecting to Java Engine: {e}",
+                phase=ErrorPhase.SQL_PLANNING,
+                cause=e,
+            )
         except httpx.HTTPStatusError as e:
-            raise WrenError(
-                ErrorCode.INVALID_SQL, e.response.text, ErrorPhase.SQL_PLANNING
-            )
+            raise WrenError(
+                ErrorCode.INVALID_SQL, e.response.text, ErrorPhase.SQL_PLANNING, cause=e
+            )

149-150: Attach cause to preserve stack on embedded planning errors.

Minor enhancement; the client message stays stable while logs retain the full failure context.

-            raise WrenError(ErrorCode.INVALID_SQL, str(e), ErrorPhase.SQL_PLANNING)
+            raise WrenError(
+                ErrorCode.INVALID_SQL, str(e), ErrorPhase.SQL_PLANNING, cause=e
+            )

162-163: Same as above for sync path — include cause.

-            raise WrenError(ErrorCode.INVALID_SQL, str(e), ErrorPhase.SQL_PLANNING)
+            raise WrenError(
+                ErrorCode.INVALID_SQL, str(e), ErrorPhase.SQL_PLANNING, cause=e
+            )

180-181: Include cause for extraction errors.

Consistent with other error sites; retains original exception context in logs/traces.

-        raise WrenError(ErrorCode.INVALID_MDL, str(e), ErrorPhase.MDL_EXTRACTION)
+        raise WrenError(
+            ErrorCode.INVALID_MDL, str(e), ErrorPhase.MDL_EXTRACTION, cause=e
+        )
ibis-server/tests/routers/v2/connector/test_postgres.py (2)

598-600: Make the assertion robust; avoid stringifying rules in full.

The full rules mapping can be order/version-dependent; assert on a stable substring instead.

-    assert (
-        response.json()["message"]
-        == f"The rule `unknown_rule` is not in the rules, rules: {rules}"
-    )
+    msg = response.json()["message"]
+    assert "The rule `unknown_rule` is not in the rules" in msg

685-687: Add errorCode/phase checks for consistency with preceding test.

Keeps all validation-parameter tests aligned on the same invariants.

-    assert response.status_code == 422
-    assert response.json()["message"] == "modelName is required"
+    assert response.status_code == 422
+    res = response.json()
+    assert res["errorCode"] == ErrorCode.VALIDATION_PARAMETER_ERROR.name
+    assert res["phase"] == ErrorPhase.VALIDATION.name
+    assert res["message"] == "modelName is required"
ibis-server/tests/routers/v2/connector/test_mysql.py (1)

364-368: Use substring matching for driver error messages.

Exact MySQL driver error strings can vary across versions/platforms. Keep the check resilient.

-    assert response.status_code == 422
-    result = response.json()
-    assert result["errorCode"] == error_code.name
-    assert result["message"] == expected_error
+    assert response.status_code == 422
+    result = response.json()
+    assert result["errorCode"] == error_code.name
+    # Driver messages vary by version; assert containment for stability.
+    assert expected_error in result["message"]
ibis-server/tests/routers/v3/connector/postgres/test_validate.py (4)

43-45: Stabilize assertion: prefer errorCode and substring checks over full message equality

Comparing the entire message string (including the dynamic rules dict) is brittle and can fail on minor formatting/order changes. Since the API now returns structured errors, assert the errorCode and a stable substring instead. Also consider checking the presence of the correlation ID header.

-    assert (
-        response.json()["message"]
-        == f"The rule `unknown_rule` is not in the rules, rules: {rules}"
-    )
+    result = response.json()
+    assert result["errorCode"] == "VALIDATION_RULE_NOT_FOUND"
+    assert "The rule `unknown_rule` is not in the rules" in result["message"]
+    # Optional: ensure correlation id is propagated
+    assert response.headers.get("x-correlation-id")

111-111: Leverage structured payload: assert errorCode for parameter errors

You can strengthen this test by asserting the specific errorCode that maps to parameter issues.

-    assert response.json()["message"] == "columnName is required"
+    result = response.json()
+    assert result["errorCode"] == "VALIDATION_PARAMETER_ERROR"
+    assert result["message"] == "columnName is required"

122-122: Likewise, assert the parameter errorCode for modelName

Same reasoning as above.

-    assert response.json()["message"] == "modelName is required"
+    result = response.json()
+    assert result["errorCode"] == "VALIDATION_PARAMETER_ERROR"
+    assert result["message"] == "modelName is required"

175-179: Prefer structured checks for RLAC validation failure

This is a good, clear message. To make the test more resilient (and aligned with the new WrenError contract), assert the errorCode and a stable substring.

-    assert (
-        response.json()["message"]
-        == "Error during planning: The session property @session_not_found is used for `rlac_validation` rule, but not found in the session properties"
-    )
+    result = response.json()
+    assert result["errorCode"] == "VALIDATION_ERROR"
+    assert "@session_not_found" in result["message"]
+    # Optional: ensure correlation id is propagated
+    assert response.headers.get("x-correlation-id")
ibis-server/wren/session/__init__.py (6)

12-12: Import ErrorPhase to enrich error context in Task errors

Since the error model supports phases and other modules already set them, import ErrorPhase here so Task can annotate failures with the appropriate phase.

-from app.model.error import ErrorCode, WrenError
+from app.model.error import ErrorCode, ErrorPhase, WrenError

139-143: Use phase and correct guidance in dry_run precondition error

“transpile()” isn’t a method on Task; users should call plan() (or Context.sql(...).plan/execute). Also add phase to help downstream tooling.

-            raise WrenError(
-                ErrorCode.GENERIC_USER_ERROR,
-                "Dialect SQL is not set. Call transpile() first.",
-            )
+            raise WrenError(
+                ErrorCode.GENERIC_USER_ERROR,
+                "Dialect SQL is not set. Call plan() first.",
+                phase=ErrorPhase.SQL_DRY_RUN,
+            )

153-158: Return a more precise code for missing connection info and set phase

INVALID_CONNECTION_INFO is available and communicates the issue better than GENERIC_USER_ERROR. Also set the execution phase.

-        if self.context.connection_info is None:
-            raise WrenError(
-                ErrorCode.GENERIC_USER_ERROR,
-                "Connection info is not set. Cannot execute without connection info.",
-            )
+        if self.context.connection_info is None:
+            raise WrenError(
+                ErrorCode.INVALID_CONNECTION_INFO,
+                "Connection info is not set. Cannot execute without connection info.",
+                phase=ErrorPhase.SQL_EXECUTION,
+            )

158-163: Align execute precondition with phase and correct guidance

Same as dry_run: update guidance to “plan()” and include phase.

-        if self.dialect_sql is None:
-            raise WrenError(
-                ErrorCode.GENERIC_USER_ERROR,
-                "Dialect SQL is not set. Call transpile() first.",
-            )
+        if self.dialect_sql is None:
+            raise WrenError(
+                ErrorCode.GENERIC_USER_ERROR,
+                "Dialect SQL is not set. Call plan() first.",
+                phase=ErrorPhase.SQL_EXECUTION,
+            )

168-173: Optionally, include phase for result formatting precondition

Minor consistency tweak: carry the SQL_EXECUTION phase into this guard too.

-        if self.results is None:
-            raise WrenError(
-                ErrorCode.GENERIC_USER_ERROR,
-                "Results are not set. Call execute() first.",
-            )
+        if self.results is None:
+            raise WrenError(
+                ErrorCode.GENERIC_USER_ERROR,
+                "Results are not set. Call execute() first.",
+                phase=ErrorPhase.SQL_EXECUTION,
+            )

76-88: Typo: _lowrcase_properties → _lowercase_properties

Renaming improves readability and avoids leaking typos into your public API if this method ever becomes referenced elsewhere. Low-impact change but requires updating the call site.

-        self.properties: dict = self._lowrcase_properties(properties)
+        self.properties: dict = self._lowercase_properties(properties)
@@
-    def _lowrcase_properties(self, properties: dict | None) -> dict:
+    def _lowercase_properties(self, properties: dict | None) -> dict:
ibis-server/wren/__main__.py (4)

15-17: Enforce both required CLI args in usage check

The CLI requires <data_source> and <mdl_path>. Adjust the guard to require at least two args beyond the module name.

-    if len(sys.argv) < 2:
+    if len(sys.argv) < 3:
         print("Usage: python -m wren <data_source> <mdl_path> [connection_info_path]")  # noqa: T201
         sys.exit(1)

26-33: Replace unreadable-file check with structured exception handling; validate JSON

f.readable() will always be True for a successfully opened file in read mode. Prefer catching FileNotFoundError and JSONDecodeError, and raise richer WrenErrors (use INVALID_CONNECTION_INFO for malformed JSON).

-        with open(connection_info_path := sys.argv[3]) as f:
-            if not f.readable():
-                raise WrenError(
-                    ErrorCode.GENERIC_USER_ERROR,
-                    f"Cannot read connection info file at {connection_info_path}",
-                )
-
-            connection_info = json.load(f)
+        try:
+            with open(connection_info_path := sys.argv[3]) as f:
+                connection_info = json.load(f)
+        except FileNotFoundError as e:
+            raise WrenError(
+                ErrorCode.GENERIC_USER_ERROR,
+                f"Cannot read connection info file at {connection_info_path}",
+            ) from e
+        except json.JSONDecodeError as e:
+            raise WrenError(
+                ErrorCode.INVALID_CONNECTION_INFO,
+                f"Invalid JSON in connection info file at {connection_info_path}: {e}",
+            ) from e
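
A standalone sketch of the suggested pattern, assuming a pared-down WrenError (the real class in app/model/error.py takes an ErrorCode enum member, not a string):

```python
import json


class WrenError(Exception):
    """Simplified stand-in; the real class takes an ErrorCode enum member."""

    def __init__(self, code: str, message: str):
        super().__init__(message)
        self.code = code
        self.message = message


def load_connection_info(path: str) -> dict:
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError as e:
        raise WrenError(
            "GENERIC_USER_ERROR", f"Cannot read connection info file at {path}"
        ) from e
    except json.JSONDecodeError as e:
        raise WrenError(
            "INVALID_CONNECTION_INFO",
            f"Invalid JSON in connection info file at {path}: {e}",
        ) from e


try:
    load_connection_info("/nonexistent/conn.json")
except WrenError as err:
    caught = err
```

Note that json.load already raises JSONDecodeError for malformed input, so no separate readability check is needed.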

21-22: Harden invalid data_source handling

Constructing the DataSource enum with an invalid value raises ValueError and will produce an unfriendly traceback in a CLI. Wrap and re-raise as WrenError to keep UX consistent.

-    data_source = DataSource(data_source)
+    try:
+        data_source = DataSource(data_source)
+    except ValueError as e:
+        raise WrenError(
+            ErrorCode.GENERIC_USER_ERROR,
+            f"Unknown data_source: {data_source}",
+        ) from e

46-60: Graceful CLI error handling for WrenError

Catching WrenError at the CLI boundary avoids stack traces and prints a clean message before exiting with a non-zero code.

-    session = create_session_context(
-        data_source=data_source.name,
-        connection_info=connection_info,
-        mdl_path=mdl_path,
-    )
-    print(f"Session created: {session}")  # noqa: T201
-    print("You can now interact with the Wren session using the 'wren' variable:")  # noqa: T201
-    print("> task = wren.sql('SELECT * FROM your_table').execute()")  # noqa: T201
-    print("> print(task.results)")  # noqa: T201
-    print("> print(task.formatted_result())")  # noqa: T201
-    code.interact(local={"wren": session})
+    try:
+        session = create_session_context(
+            data_source=data_source.name,
+            connection_info=connection_info,
+            mdl_path=mdl_path,
+        )
+    except WrenError as e:
+        print(f"[{e.error_code.name}] {e.message}", file=sys.stderr)  # noqa: T201
+        sys.exit(2)
+    print(f"Session created: {session}")  # noqa: T201
+    print("You can now interact with the Wren session using the 'wren' variable:")  # noqa: T201
+    print("> task = wren.sql('SELECT * FROM your_table').execute()")  # noqa: T201
+    print("> print(task.results)")  # noqa: T201
+    print("> print(task.formatted_result())")  # noqa: T201
+    code.interact(local={"wren": session})
ibis-server/wren/__init__.py (1)

55-56: Optionally normalize invalid data_source early

Creating the DataSource enum can raise ValueError. Consider wrapping this to produce a WrenError with a clear message.

-    data_source = DataSource(data_source)
+    try:
+        data_source = DataSource(data_source)
+    except ValueError as e:
+        raise WrenError(
+            ErrorCode.GENERIC_USER_ERROR,
+            f"Unknown data_source: {data_source}",
+        ) from e
ibis-server/tests/routers/v3/connector/postgres/test_model_substitute.py (4)

106-106: Consider asserting the error payload (message/error_code) for x-user-catalog-only case

You already validate the 404. Tighten this by asserting the structured error fields to prevent regressions in error mapping.

-    assert response.status_code == 404
+    assert response.status_code == 404
+    payload = response.json()
+    assert payload["errorCode"] in ("NOT_FOUND", "MDL_NOT_FOUND")
+    assert response.headers.get("x-correlation-id")

138-138: Likewise, assert structured error fields for x-user-schema-only case

Mirror the above for symmetry and stronger guarantees.

-    assert response.status_code == 404
+    assert response.status_code == 404
+    payload = response.json()
+    assert payload["errorCode"] in ("NOT_FOUND", "MDL_NOT_FOUND")
+    assert response.headers.get("x-correlation-id")

286-288: Mirror the structured assertions for the headers-based request

Same enhancement as above for consistency.

-    assert response.status_code == 404
-    assert response.json()["message"] == 'Model not found: "Nation"'
+    assert response.status_code == 404
+    payload = response.json()
+    assert payload["message"] == 'Model not found: "Nation"'
+    assert payload["errorCode"] in ("NOT_FOUND", "MDL_NOT_FOUND")  # set exact one after verifying
+    assert response.headers.get("x-correlation-id")

270-272: Augment “Model not found” test with precise errorCode and correlationId

The substitute implementation always raises WrenError with ErrorCode.NOT_FOUND for missing models, and the response model uses the alias errorCode and correlationId. Let’s update the assertions accordingly:

tests/routers/v3/connector/postgres/test_model_substitute.py
@@ async def test_model_substitute_non_existent_model(...):
-    assert response.status_code == 404
-    assert response.json()["message"] == 'Model not found: "Nation"'
+    assert response.status_code == 404
+    payload = response.json()
+    # message remains unchanged
+    assert payload["message"] == 'Model not found: "Nation"'
+    # implementation always uses NOT_FOUND for missing models
+    assert payload["errorCode"] == "NOT_FOUND"  # ErrorCode.NOT_FOUND 
+    # correlationId is populated via the exception handler 
+    assert "correlationId" in payload and payload["correlationId"] is not None
+    # the CorrelationIdMiddleware should also set the header
+    assert response.headers.get("x-correlation-id")
ibis-server/tests/routers/v2/connector/test_oracle.py (1)

419-422: Strengthen JSON error assertions (also assert errorCode, phase, correlationId).

You’ve switched to structured WrenError JSON; let’s assert the key fields too for better coverage and to prevent regressions.

     assert response.status_code == 422
-    assert (
-        response.json()["message"]
-        == 'Ambiguous model: found multiple matches for "ORDERS"'
-    )
+    data = response.json()
+    assert data["message"] == 'Ambiguous model: found multiple matches for "ORDERS"'
+    assert data["errorCode"] == "GENERIC_USER_ERROR"
+    assert data.get("phase") == "SQL_SUBSTITUTE"
+    assert data.get("correlationId") is not None
ibis-server/app/mdl/substitute.py (2)

43-47: Add useful metadata to WrenError for ambiguous model to aid debugging.

Include the offending source and duplicate keys in the error metadata. This improves observability without changing behavior.

-                    raise WrenError(
-                        ErrorCode.GENERIC_USER_ERROR,
-                        f"Ambiguous model: found multiple matches for {source}",
-                        phase=ErrorPhase.SQL_SUBSTITUTE,
-                    )
+                    raise WrenError(
+                        ErrorCode.GENERIC_USER_ERROR,
+                        f"Ambiguous model: found multiple matches for {source}",
+                        phase=ErrorPhase.SQL_SUBSTITUTE,
+                        metadata={"source": str(source), "duplicateKeys": duplicate_keys},
+                    )

50-55: Attach context metadata for “Model not found”.

Carrying the source that failed resolution in metadata makes operational debugging faster and keeps client messages stable.

-                    raise WrenError(
-                        ErrorCode.NOT_FOUND,
-                        f"Model not found: {source}",
-                        phase=ErrorPhase.SQL_SUBSTITUTE,
-                    )
+                    raise WrenError(
+                        ErrorCode.NOT_FOUND,
+                        f"Model not found: {source}",
+                        phase=ErrorPhase.SQL_SUBSTITUTE,
+                        metadata={"source": str(source)},
+                    )
ibis-server/app/model/validator.py (4)

25-29: Set phase=VALIDATION when the rule is unknown.

This aligns with the rest of the validator and improves downstream telemetry consistency.

-            raise WrenError(
-                ErrorCode.VALIDATION_RULE_NOT_FOUND,
-                f"The rule `{rule}` is not in the rules, rules: {rules}",
-            )
+            raise WrenError(
+                ErrorCode.VALIDATION_RULE_NOT_FOUND,
+                f"The rule `{rule}` is not in the rules, rules: {rules}",
+                phase=ErrorPhase.VALIDATION,
+            )

147-160: Prefer deterministic DataFrame accessors over Series.get().

Series.get(0) depends on the index label and can be brittle. Use .iat for positional access.

-            output["result"] = str(result.get("result").get(0))
-            output["is_related"] = str(result.get("is_related").get(0))
+            output["result"] = str(result["result"].iat[0])
+            output["is_related"] = str(result["is_related"].iat[0])
             if result.get("left_table_unique") is not None:
-                output["left_table_unique"] = str(
-                    result.get("left_table_unique").get(0)
-                )
+                output["left_table_unique"] = str(result["left_table_unique"].iat[0])
             if result.get("right_table_unique") is not None:
-                output["right_table_unique"] = str(
-                    result.get("right_table_unique").get(0)
-                )
+                output["right_table_unique"] = str(result["right_table_unique"].iat[0])

171-176: Include validation details in “relationship is not valid” metadata.

This keeps the message stable and transports machine-readable diagnostics.

-                raise WrenError(
-                    ErrorCode.VALIDATION_ERROR,
-                    f"Relationship {relationship_name} is not valid: {format_result(result)}",
-                    phase=ErrorPhase.VALIDATION,
-                )
+                raise WrenError(
+                    ErrorCode.VALIDATION_ERROR,
+                    f"Relationship {relationship_name} is not valid",
+                    phase=ErrorPhase.VALIDATION,
+                    metadata={
+                        "relationship": relationship_name,
+                        "details": format_result(result),
+                    },
+                )

237-239: Preserve exception chaining for RLAC syntax validation.

Add “from e” to keep the original traceback for ops while maintaining user-friendly payloads.

-            raise WrenError(
-                ErrorCode.VALIDATION_ERROR, str(e), phase=ErrorPhase.VALIDATION
-            )
+            raise WrenError(
+                ErrorCode.VALIDATION_ERROR, str(e), phase=ErrorPhase.VALIDATION
+            ) from e
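
Why `from e` matters in these handlers: it sets `__cause__` on the new exception, so tracebacks and structured logs keep the original failure even though the client only sees the WrenError payload. A stdlib-only illustration (names are hypothetical):

```python
class ValidationError(Exception):
    pass


def check_positive_int(raw: str) -> int:
    try:
        value = int(raw)
    except ValueError as e:
        # "from e" records the original parser error as __cause__
        raise ValidationError(f"not an integer: {raw!r}") from e
    if value <= 0:
        raise ValidationError(f"must be positive, got {value}")
    return value


try:
    check_positive_int("abc")
except ValidationError as err:
    chained = err
```

Without `from e`, Python still links the exceptions via `__context__`, but `from e` makes the chaining explicit and survives re-raising from helper functions.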
ibis-server/app/main.py (1)

48-51: Centralize “X-Correlation-ID” constant usage

We’ve defined X_CORRELATION_ID in app/dependencies.py and wired it into CorrelationIdMiddleware, but elsewhere the literal "X-Correlation-ID" is still used. To avoid duplication and potential drift, import and use the constant in all places:

• app/middleware/__init__.py (line 12):
• Change request.headers.get("X-Correlation-ID")request.headers.get(X_CORRELATION_ID)
• app/util.py (lines 145–146):
• Change if "X-Correlation-ID" in headerif X_CORRELATION_ID in header
• Change header["X-Correlation-ID"]header[X_CORRELATION_ID]

Example diff:

--- a/ibis-server/app/middleware/__init__.py
+++ b/ibis-server/app/middleware/__init__.py
@@
-from starlette.requests import Request
+from starlette.requests import Request
+from app.dependencies import X_CORRELATION_ID
@@ async def dispatch(self, request: Request, call_next) -> Response:
-        correlation_id = request.headers.get("X-Correlation-ID")
+        correlation_id = request.headers.get(X_CORRELATION_ID)
--- a/ibis-server/app/util.py
+++ b/ibis-server/app/util.py
@@
 from opentelemetry.trace import Span
 from app.dependencies import X_CORRELATION_ID
@@ def extract_and_set_correlation_id(header: Mapping[str, str], span: Span) -> None:
-    if "X-Correlation-ID" in header:
-        span.set_attribute("correlation_id", header["X-Correlation-ID"])
+    if X_CORRELATION_ID in header:
+        span.set_attribute("correlation_id", header[X_CORRELATION_ID])

This ensures a single source of truth for the header name and prevents accidental mismatches down the road.
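
Independent of where the constant lives, the get-or-generate behavior is small enough to sketch with the stdlib (the helper name is hypothetical; only the header name comes from this PR):

```python
import uuid

X_CORRELATION_ID = "X-Correlation-ID"


def resolve_correlation_id(headers: dict) -> str:
    """Return the incoming correlation ID, or mint a fresh one for this request."""
    # HTTP header names are case-insensitive, so normalize before lookup
    normalized = {k.lower(): v for k, v in headers.items()}
    return normalized.get(X_CORRELATION_ID.lower()) or str(uuid.uuid4())


incoming = resolve_correlation_id({"x-correlation-id": "55b4302c"})
minted = resolve_correlation_id({})
```

Clients that pass their own ID get it echoed back; everyone else gets a server-generated UUID that ties the response, logs, and trace attributes together.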

ibis-server/app/model/connector.py (2)

461-470: Push down LIMIT into DuckDB instead of slicing in memory.

Slicing fetches the full result first, which is wasteful for large datasets. Compose a limited SQL instead.

-        if limit is None:
-            # If no limit is specified, we return the full result
-            return self.connection.execute(sql).fetch_arrow_table()
-        else:
-            # If a limit is specified, we slice the result
-            # DuckDB does not support LIMIT in fetch_arrow_table, so we use slice
-            # to limit the number of rows returned
-            return self.connection.execute(sql).fetch_arrow_table().slice(length=limit)
+        if limit is None:
+            return self.connection.execute(sql).fetch_arrow_table()
+        # Push down limit to the engine to avoid large transfers
+        limited_sql = f"SELECT * FROM ({sql}) AS sub LIMIT {int(limit)}"
+        return self.connection.execute(limited_sql).fetch_arrow_table()
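
The pushdown itself is plain SQL composition; a sketch of the wrapping helper (function name hypothetical, subquery alias and int() coercion as in the diff above):

```python
def apply_limit(sql: str, limit=None) -> str:
    """Wrap sql in a subquery with LIMIT so the engine, not the client, trims rows."""
    if limit is None:
        return sql
    # int() ensures only an integer literal is spliced into the SQL text
    return f"SELECT * FROM ({sql}) AS sub LIMIT {int(limit)}"


full = apply_limit("SELECT * FROM orders")
limited = apply_limit("SELECT * FROM orders", 10)
```

The subquery wrapper keeps the original statement untouched, so ORDER BY and other clauses inside it still apply before the limit.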

511-513: Set an error phase for DuckDB file listing failures.

Phase=METADATA_FETCHING clarifies where the error occurred.

-            raise WrenError(
-                ErrorCode.GENERIC_USER_ERROR, f"Failed to list files: {e!s}"
-            )
+            raise WrenError(
+                ErrorCode.GENERIC_USER_ERROR,
+                f"Failed to list files: {e!s}",
+                phase=ErrorPhase.METADATA_FETCHING,
+            )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between f1347e3 and eebf1dc.

📒 Files selected for processing (34)
  • ibis-server/app/dependencies.py (1 hunks)
  • ibis-server/app/main.py (3 hunks)
  • ibis-server/app/mdl/analyzer.py (3 hunks)
  • ibis-server/app/mdl/java_engine.py (2 hunks)
  • ibis-server/app/mdl/rewriter.py (6 hunks)
  • ibis-server/app/mdl/substitute.py (2 hunks)
  • ibis-server/app/model/__init__.py (0 hunks)
  • ibis-server/app/model/connector.py (8 hunks)
  • ibis-server/app/model/data_source.py (4 hunks)
  • ibis-server/app/model/error.py (1 hunks)
  • ibis-server/app/model/metadata/canner.py (2 hunks)
  • ibis-server/app/model/metadata/object_storage.py (3 hunks)
  • ibis-server/app/model/validator.py (6 hunks)
  • ibis-server/app/routers/v3/connector.py (1 hunks)
  • ibis-server/app/util.py (1 hunks)
  • ibis-server/tests/routers/v2/connector/test_athena.py (0 hunks)
  • ibis-server/tests/routers/v2/connector/test_bigquery.py (0 hunks)
  • ibis-server/tests/routers/v2/connector/test_canner.py (0 hunks)
  • ibis-server/tests/routers/v2/connector/test_clickhouse.py (2 hunks)
  • ibis-server/tests/routers/v2/connector/test_gcs_file.py (1 hunks)
  • ibis-server/tests/routers/v2/connector/test_local_file.py (1 hunks)
  • ibis-server/tests/routers/v2/connector/test_mssql.py (0 hunks)
  • ibis-server/tests/routers/v2/connector/test_mysql.py (2 hunks)
  • ibis-server/tests/routers/v2/connector/test_oracle.py (1 hunks)
  • ibis-server/tests/routers/v2/connector/test_postgres.py (12 hunks)
  • ibis-server/tests/routers/v2/connector/test_redshift.py (0 hunks)
  • ibis-server/tests/routers/v2/connector/test_s3_file.py (1 hunks)
  • ibis-server/tests/routers/v2/connector/test_snowflake.py (0 hunks)
  • ibis-server/tests/routers/v2/connector/test_trino.py (0 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/test_model_substitute.py (4 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/test_validate.py (4 hunks)
  • ibis-server/wren/__init__.py (3 hunks)
  • ibis-server/wren/__main__.py (2 hunks)
  • ibis-server/wren/session/__init__.py (3 hunks)
💤 Files with no reviewable changes (8)
  • ibis-server/tests/routers/v2/connector/test_bigquery.py
  • ibis-server/tests/routers/v2/connector/test_canner.py
  • ibis-server/tests/routers/v2/connector/test_trino.py
  • ibis-server/tests/routers/v2/connector/test_snowflake.py
  • ibis-server/tests/routers/v2/connector/test_athena.py
  • ibis-server/tests/routers/v2/connector/test_redshift.py
  • ibis-server/app/model/__init__.py
  • ibis-server/tests/routers/v2/connector/test_mssql.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-18T02:23:34.040Z
Learnt from: goldmedal
PR: Canner/wren-engine#1224
File: ibis-server/app/util.py:49-56
Timestamp: 2025-06-18T02:23:34.040Z
Learning: DuckDB supports querying PyArrow Tables directly in SQL queries without needing to register them. When a pa.Table object is referenced in a FROM clause (e.g., "SELECT ... FROM df" where df is a pa.Table), DuckDB automatically handles the PyArrow object without requiring conn.register().

Applied to files:

  • ibis-server/app/model/connector.py
🧬 Code graph analysis (17)
ibis-server/app/model/metadata/canner.py (1)
ibis-server/app/model/error.py (2)
  • ErrorCode (19-38)
  • WrenError (65-113)
ibis-server/wren/__init__.py (1)
ibis-server/app/model/error.py (2)
  • ErrorCode (19-38)
  • WrenError (65-113)
ibis-server/app/model/data_source.py (1)
ibis-server/app/model/error.py (2)
  • ErrorCode (19-38)
  • WrenError (65-113)
ibis-server/app/routers/v3/connector.py (1)
ibis-server/app/model/error.py (1)
  • DatabaseTimeoutError (116-125)
ibis-server/wren/session/__init__.py (2)
ibis-server/app/model/error.py (2)
  • ErrorCode (19-38)
  • WrenError (65-113)
ibis-server/app/model/connector.py (7)
  • query (85-110)
  • query (155-160)
  • query (273-277)
  • query (340-347)
  • query (403-441)
  • query (461-469)
  • query (558-566)
ibis-server/wren/__main__.py (1)
ibis-server/app/model/error.py (2)
  • ErrorCode (19-38)
  • WrenError (65-113)
ibis-server/app/mdl/substitute.py (1)
ibis-server/app/model/error.py (3)
  • ErrorCode (19-38)
  • ErrorPhase (41-52)
  • WrenError (65-113)
ibis-server/app/util.py (1)
ibis-server/app/model/error.py (1)
  • DatabaseTimeoutError (116-125)
ibis-server/app/model/metadata/object_storage.py (1)
ibis-server/app/model/error.py (3)
  • ErrorCode (19-38)
  • ErrorPhase (41-52)
  • WrenError (65-113)
ibis-server/tests/routers/v2/connector/test_postgres.py (2)
ibis-server/app/model/error.py (2)
  • ErrorCode (19-38)
  • ErrorPhase (41-52)
ibis-server/tests/conftest.py (1)
  • client (18-23)
ibis-server/tests/routers/v2/connector/test_clickhouse.py (1)
ibis-server/app/model/error.py (1)
  • ErrorCode (19-38)
ibis-server/app/mdl/analyzer.py (1)
ibis-server/app/model/error.py (2)
  • ErrorCode (19-38)
  • WrenError (65-113)
ibis-server/app/mdl/rewriter.py (2)
ibis-server/app/model/error.py (3)
  • ErrorCode (19-38)
  • ErrorPhase (41-52)
  • WrenError (65-113)
ibis-server/wren/session/__init__.py (2)
  • _get_read_dialect (119-120)
  • _get_write_dialect (122-134)
ibis-server/app/model/connector.py (2)
ibis-server/app/model/error.py (3)
  • ErrorCode (19-38)
  • ErrorPhase (41-52)
  • WrenError (65-113)
ibis-server/wren/session/__init__.py (3)
  • sql (37-52)
  • dry_run (136-143)
  • execute (145-164)
ibis-server/tests/routers/v2/connector/test_mysql.py (2)
ibis-server/app/model/error.py (1)
  • ErrorCode (19-38)
ibis-server/app/model/__init__.py (1)
  • SSLMode (523-526)
ibis-server/app/model/validator.py (3)
ibis-server/app/model/error.py (3)
  • ErrorCode (19-38)
  • ErrorPhase (41-52)
  • WrenError (65-113)
ibis-server/app/mdl/rewriter.py (3)
  • rewrite (63-72)
  • rewrite (107-124)
  • rewrite (136-149)
ibis-server/app/model/connector.py (6)
  • dry_run (112-136)
  • dry_run (194-195)
  • dry_run (303-320)
  • dry_run (366-368)
  • dry_run (472-473)
  • dry_run (569-571)
ibis-server/app/main.py (3)
ibis-server/app/mdl/java_engine.py (1)
  • JavaEngineConnector (13-68)
ibis-server/app/middleware/__init__.py (1)
  • RequestLogMiddleware (10-44)
ibis-server/app/model/error.py (5)
  • ErrorCode (19-38)
  • ErrorResponse (55-62)
  • WrenError (65-113)
  • get_http_status_code (96-113)
  • get_response (86-94)
⏰ Context from checks skipped due to timeout of 90000ms. (1)
  • GitHub Check: ci

@goldmedal goldmedal requested a review from douenergy August 26, 2025 02:23

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
ibis-server/app/model/connector.py (1)

412-451: Bug: Returns pandas DataFrame where pyarrow.Table is expected; enforce Arrow return with schema.

In the “Must pass schema” branch, query currently returns a pandas DataFrame, violating the declared return type (pa.Table) and the contract used by other connectors. Convert the empty result into a correctly-typed Arrow table.

Apply:

         except ValueError as e:
             # Import here to avoid override the custom datatypes
             import ibis.backends.bigquery  # noqa: PLC0415
@@
             if "Must pass schema" in str(e):
                 with tracer.start_as_current_span(
                     "get_schema", kind=trace.SpanKind.CLIENT
                 ):
                     credits_json = loads(
                         base64.b64decode(
                             self.connection_info.credentials.get_secret_value()
                         ).decode("utf-8")
                     )
                     credentials = service_account.Credentials.from_service_account_info(
                         credits_json
                     )
                     credentials = credentials.with_scopes(
                         [
                             "https://www.googleapis.com/auth/drive",
                             "https://www.googleapis.com/auth/cloud-platform",
                         ]
                     )
                     client = bigquery.Client(credentials=credentials)
                     ibis_schema_mapper = ibis.backends.bigquery.BigQuerySchema()
                     bq_fields = client.query(sql).result()
-                    ibis_fields = ibis_schema_mapper.to_ibis(bq_fields.schema)
-                    return pd.DataFrame(columns=ibis_fields.names)
+                    ibis_fields = ibis_schema_mapper.to_ibis(bq_fields.schema)
+                    # Build an empty Arrow table with the correct schema to avoid the ibis empty-result bug
+                    pa_fields = [
+                        pa.field(name, dtype.to_pyarrow(), nullable=True)
+                        for name, dtype in zip(ibis_fields.names, ibis_fields.types)
+                    ]
+                    pa_schema = pa.schema(pa_fields)
+                    empty_cols = [pa.array([], type=f.type) for f in pa_schema]
+                    return pa.Table.from_arrays(empty_cols, schema=pa_schema)
                 else:
                     raise e

Follow-up: consider caching the schema mapping for repeated “dry” queries of the same SQL if this path is hot.

ibis-server/app/util.py (1)

257-272: Exceptions are being swallowed if they’re not recognized as timeouts

In execute_with_timeout, non-timeout ClickHouseDbError and TrinoQueryError are caught but not re-raised when they don’t match the timeout condition. This causes the function to return None silently, masking real errors and breaking callers that expect a result. The timeout branch for asyncio.wait_for should also explicitly catch asyncio.TimeoutError to be unambiguous.

Apply this diff to ensure non-timeout errors propagate and to robustly catch asyncio timeouts:

 async def execute_with_timeout(operation, operation_name: str):
     """Asynchronously execute an operation with a timeout."""
     try:
         return await asyncio.wait_for(operation, timeout=app_timeout_seconds)
-    except TimeoutError:
+    except (asyncio.TimeoutError, TimeoutError):
         raise DatabaseTimeoutError(
             f"{operation_name} timeout after {app_timeout_seconds} seconds"
         )
     except ClickHouseDbError as e:
         if "TIMEOUT_EXCEEDED" in str(e):
             raise DatabaseTimeoutError(f"{operation_name} was cancelled: {e}")
+        raise
     except trino.exceptions.TrinoQueryError as e:
         if e.error_name == "EXCEEDED_TIME_LIMIT":
             raise DatabaseTimeoutError(f"{operation_name} was cancelled: {e}")
+        raise
     except psycopg.errors.QueryCanceled as e:
         raise DatabaseTimeoutError(f"{operation_name} was cancelled: {e}")
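The failure mode described above can be reproduced in isolation: an `except` clause that only re-raises on a matched condition silently turns every other error into a `None` return. A minimal stdlib sketch of the bug and the fix (function names are illustrative, not taken from the PR):

```python
def fragile(op):
    """Catches ValueError but only re-raises the timeout case."""
    try:
        return op()
    except ValueError as e:
        if "TIMEOUT" in str(e):
            raise TimeoutError(str(e))
        # BUG: no bare `raise` here, so non-timeout ValueErrors are
        # swallowed and the caller receives None.


def fixed(op):
    try:
        return op()
    except ValueError as e:
        if "TIMEOUT" in str(e):
            raise TimeoutError(str(e))
        raise  # propagate non-timeout errors unchanged


def failing_query():
    raise ValueError("syntax error near 'FROM'")


print(fragile(failing_query))  # None: the real error disappears
try:
    fixed(failing_query)
except ValueError as e:
    print("propagated:", e)
```

The bare `raise` is the whole fix: it re-raises the in-flight exception with its original traceback intact.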
♻️ Duplicate comments (2)
ibis-server/app/model/connector.py (2)

11-19: Good guard: ClickHouse made optional with a sentinel error type.

This addresses the “optional ClickHouse dependency” concern and avoids import-time failures on non-ClickHouse deployments.

If you want to be stricter, narrow the exception to ImportError to avoid masking unrelated import side effects:

-try:
+try:
     import clickhouse_connect
     ClickHouseDbError = clickhouse_connect.driver.exceptions.DatabaseError
-except Exception:  # pragma: no cover
+except ImportError:  # pragma: no cover
     class ClickHouseDbError(Exception):
         pass
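The sentinel pattern above generalizes to any optional driver. A runnable sketch (here `clickhouse_connect` stands in for the optional dependency; the `classify` helper is illustrative, not from the repo):

```python
try:
    import clickhouse_connect  # optional dependency; may be absent

    ClickHouseDbError = clickhouse_connect.driver.exceptions.DatabaseError
except ImportError:

    class ClickHouseDbError(Exception):
        """Fallback sentinel: never raised by us, but keeps handlers valid."""


# With the sentinel in place, handlers referencing ClickHouseDbError
# always compile and simply match nothing when the driver is missing.
def classify(exc: Exception) -> str:
    try:
        raise exc
    except ClickHouseDbError:
        return "clickhouse"
    except Exception:
        return "other"


result = classify(ValueError("boom"))  # "other" either way
```

Whether or not the driver is installed, non-ClickHouse exceptions fall through to the generic handler, which is exactly the behavior the guarded import buys.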

95-120: ClickHouse execution-phase fix is correct; verify upstream timeout mapping.

  • The phase for ClickHouse non-timeout errors is now SQL_EXECUTION — this fixes the earlier mislabel (was SQL_DRY_RUN). Good.
  • TIMEOUT_EXCEEDED is re-raised raw (plus Trino/QueryCanceled). Ensure the API layer (FastAPI error handler) maps these to WrenError with ErrorCode.DATABASE_TIMEOUT; otherwise clients will see backend-specific exceptions.

Run to confirm there is a central handler translating these exceptions to WrenError.DATABASE_TIMEOUT:

#!/bin/bash
# Find FastAPI exception handlers or middleware mapping DB timeouts to WrenError
rg -n -C3 -P 'TrinoQueryError|QueryCanceled|TIMEOUT_EXCEEDED|DATABASE_TIMEOUT|exception_handler' --type=py
🧹 Nitpick comments (9)
ibis-server/tests/routers/v2/test_relationship_valid.py (3)

134-134: Validate standardized error payload and relax exact message assertion

To make this test more robust against future wording changes and to ensure the full error schema is correct, update the assertion as follows:

-    assert response.json()["message"] == "relationshipName is required"
+    # parse the JSON payload once
+    payload = response.json()
+
+    # assert the canonical error code and phase for missing parameters
+    assert payload["errorCode"] == ErrorCode.VALIDATION_PARAMETER_ERROR.name
+    assert payload["phase"] == ErrorPhase.VALIDATION.name
+
+    # relax the message check to a substring match
+    assert "relationshipName is required" in payload["message"]
+
+    # validate correlation id in both payload and response header
+    assert "correlationId" in payload
+    assert response.headers.get("X-Correlation-ID") == payload["correlationId"]

Key updates:

  • Use .name on ErrorCode.VALIDATION_PARAMETER_ERROR and ErrorPhase.VALIDATION to lock in the right error semantics.
  • Change the message assertion to a substring check so minor wording tweaks won’t break this test.
  • Confirm the presence of correlationId in the payload and that it matches the X-Correlation-ID header.

150-151: Stabilize test assertion on relationship validation error message

This test is currently tied to the exact formatting of Python’s dict-to-str conversion, which makes it brittle (key order, quotes, spacing). Instead, verify only the stable prefix and the presence of important tokens. For example, update the snippet in test_relationship_valid.py at lines 150–151 as follows:

-    # brittle exact match on stringified dict
-    assert response.json()["message"] \
-        == "Relationship invalid_t1_many_t2_id is not valid: {'result': 'False', 'is_related': 'True', 'left_table_unique': 'False', 'right_table_unique': 'True'}"
+    # check only the stable prefix…
+    msg = response.json()["message"]
+    assert msg.startswith("Relationship invalid_t1_many_t2_id is not valid:")
+
+    # …and verify key tokens without hard-coding the dict formatting
+    for key in ["left_table_unique", "right_table_unique", "is_related"]:
+        assert key in msg

If you’d like to surface the structured error code and phase (rather than overloading message), you can also assert on the top-level fields emitted in the JSON payload:

payload = response.json()
assert payload["errorCode"] == ErrorCode.VALIDATION_ERROR.name
assert payload["phase"]     == ErrorPhase.VALIDATION.name

121-121: Update test to assert structured error fields and relax message check

Please update ibis-server/tests/routers/v2/test_relationship_valid.py around line 121 as follows:

• Replace the exact message assertion with an “in” check:

-    assert response.json()["message"] == "Relationship not_found not found in manifest"
+    payload = response.json()
+    # message contains relationship name and not-found text
+    assert "Relationship not_found not found in manifest" in payload["message"]

• Ensure you import the enums at the top of the file if not already present:

from app.model.error import ErrorCode, ErrorPhase

• Add assertions for the structured error fields and correlation ID:

# structured fields
assert payload["errorCode"] == ErrorCode.VALIDATION_PARAMETER_ERROR.name
assert payload["phase"] == ErrorPhase.VALIDATION.name

# correlation ID present and matches header
assert "correlationId" in payload
assert response.headers.get("X-Correlation-ID") == payload["correlationId"]

These changes make the test more resilient to minor wording tweaks and lock in the new error contract.

ibis-server/app/model/connector.py (3)

471-479: Avoid fetching full result then slicing in-memory for DuckDB LIMIT.

Current approach executes the full query and slices the Arrow table, which is wasteful for large datasets. Push the LIMIT into SQL.

Apply:

     def query(self, sql: str, limit: int | None) -> pa.Table:
-        if limit is None:
-            # If no limit is specified, we return the full result
-            return self.connection.execute(sql).fetch_arrow_table()
-        else:
-            # If a limit is specified, we slice the result
-            # DuckDB does not support LIMIT in fetch_arrow_table, so we use slice
-            # to limit the number of rows returned
-            return self.connection.execute(sql).fetch_arrow_table().slice(length=limit)
+        if limit is None:
+            return self.connection.execute(sql).fetch_arrow_table()
+        # Push-down LIMIT to DuckDB to reduce compute and memory
+        limited_sql = f"SELECT * FROM ({sql}) AS _sub LIMIT {int(limit)}"
+        return self.connection.execute(limited_sql).fetch_arrow_table()

487-490: Clear DUCKDB_FILE_NOT_FOUND classification.

Good use of a specific code instead of a generic error. Consider adding a phase (e.g., METADATA_FETCHING) for observability, though optional.


557-560: Invalid Redshift connection info → GENERIC_INTERNAL_ERROR is appropriate.

Clear signal for a programming/config error as opposed to user input.

If RedshiftConnectionInfo.port arrives as a string (for example via a SecretStr), consider casting it with int() at the call site to avoid implicit coercions in the driver.

ibis-server/tests/routers/v2/connector/test_minio_file.py (3)

250-252: Align dry-run negative case with structured JSON errors

This still checks only text presence. Assert JSON structure and correlation header to stay consistent with WrenError responses.

Apply this diff:

     assert response.status_code == 422
-    assert response.text is not None
+    payload = response.json()
+    assert "errorCode" in payload and "message" in payload
+    assert "correlationId" in payload and payload["correlationId"]
+    assert response.headers.get("X-Correlation-ID") == payload["correlationId"]

270-270: Also validate structured errors for invalid connection info (first request)

Mirror the same structured checks here for consistency across tests.

Apply this diff:

     assert response.status_code == 422
+    payload = response.json()
+    assert "errorCode" in payload and "message" in payload
+    assert "correlationId" in payload and payload["correlationId"]
+    assert response.headers.get("X-Correlation-ID") == payload["correlationId"]

285-285: Also validate structured errors for invalid connection info (tables metadata request)

Same rationale: assert body fields and header correlation.

Apply this diff:

     assert response.status_code == 422
+    payload = response.json()
+    assert "errorCode" in payload and "message" in payload
+    assert "correlationId" in payload and payload["correlationId"]
+    assert response.headers.get("X-Correlation-ID") == payload["correlationId"]
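The header/payload contract these assertions lock in reduces to one rule: echo the client's X-Correlation-ID when present, otherwise mint a fresh one. A minimal sketch of that rule (the function name and dict-based headers are illustrative, not the PR's actual middleware):

```python
import uuid


def resolve_correlation_id(headers: dict[str, str]) -> str:
    """Return the client-supplied X-Correlation-ID, or a fresh UUID4."""
    supplied = headers.get("X-Correlation-ID")
    return supplied if supplied else str(uuid.uuid4())


# Echo path: the client's id round-trips into payload and response header.
cid = resolve_correlation_id({"X-Correlation-ID": "55b4302c"})

# Mint path: no header supplied, so a new id is generated for the payload.
minted = resolve_correlation_id({})
```

Tests can then assert equality between the payload's `correlationId` and the response header without caring which path produced the value.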
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between eebf1dc and 63f1fa0.

📒 Files selected for processing (4)
  • ibis-server/app/model/connector.py (8 hunks)
  • ibis-server/app/util.py (3 hunks)
  • ibis-server/tests/routers/v2/connector/test_minio_file.py (1 hunks)
  • ibis-server/tests/routers/v2/test_relationship_valid.py (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-18T02:23:34.040Z
Learnt from: goldmedal
PR: Canner/wren-engine#1224
File: ibis-server/app/util.py:49-56
Timestamp: 2025-06-18T02:23:34.040Z
Learning: DuckDB supports querying PyArrow Tables directly in SQL queries without needing to register them. When a pa.Table object is referenced in a FROM clause (e.g., "SELECT ... FROM df" where df is a pa.Table), DuckDB automatically handles the PyArrow object without requiring conn.register().

Applied to files:

  • ibis-server/app/model/connector.py
🧬 Code graph analysis (2)
ibis-server/app/model/connector.py (3)
ibis-server/app/util.py (1)
  • ClickHouseDbError (11-12)
ibis-server/app/model/error.py (3)
  • ErrorCode (19-38)
  • ErrorPhase (41-52)
  • WrenError (65-113)
ibis-server/wren/session/__init__.py (3)
  • sql (37-52)
  • dry_run (136-143)
  • execute (145-164)
ibis-server/app/util.py (2)
ibis-server/app/model/connector.py (1)
  • ClickHouseDbError (17-18)
ibis-server/app/model/error.py (1)
  • DatabaseTimeoutError (116-125)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: ci
🔇 Additional comments (9)
ibis-server/app/model/connector.py (7)

26-26: Imports align with downstream usage.

psycopg and trino exceptions are referenced later in the centralized error wrapper; import locations look good.

Also applies to: 29-29


51-56: Consistent structured errors (WrenError) integration looks solid.

Centralizing DIALECT_SQL, ErrorCode, ErrorPhase, WrenError here sets up clear, uniform error payloads.


121-146: Dry-run path mirrors query error policy; confirm consistent timeout translation.

The SQL_DRY_RUN phase for ClickHouse non-timeout errors is correct. Same upstream timeout mapping verification applies here.

Use the same search from the previous comment to verify dry-run timeouts are converted to WrenError.DATABASE_TIMEOUT centrally.


319-329: MSSql dry-run workaround with structured errors is appropriate.

  • INVALID_SQL for the known ibis AttributeError quirk, with a descriptive message derived via sys.dm_exec_describe_first_result_set — good call.
  • IBIS_PROJECT_ERROR for other AttributeError cases with SQL_DRY_RUN phase — reasonable classification.

482-482: Direct DuckDB dry-run via execute is fine.

Letting outer layers wrap exceptions into WrenError keeps concerns separated.


498-505: ATTACH errors mapped to ATTACH_DUCKDB_ERROR — looks right.

Both IOException and HTTPException are covered; message is concise.


520-523: File listing failures mapped to GENERIC_USER_ERROR.

Reasonable default. Ensure that e!s does not include sensitive paths for remote filesystems; if it does, consider redacting in the future.

ibis-server/tests/routers/v2/connector/test_minio_file.py (1)

351-354: Once we confirm how WrenError is serialized (i.e. which fields are included in the JSON payload and how the correlation ID header is mapped), we can refine the test assertion. Please share the output of the above inspection so we can finalize the strengthened assertion snippet.

ibis-server/app/util.py (1)

44-44: Import path update looks correct and aligns with the new error model

Using DatabaseTimeoutError from app.model.error is consistent with the WrenError-based centralized error handling introduced in this PR.

Contributor

@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ibis-server/app/util.py (1)

257-268: Use asyncio.TimeoutError instead of built-in TimeoutError

On Python 3.10 and earlier, asyncio.wait_for raises asyncio.TimeoutError, which is not the built-in TimeoutError; the two classes were only unified (as aliases) in Python 3.11. On those older versions the existing except TimeoutError clauses will miss real timeouts and won’t translate them into DatabaseTimeoutError. Update all three occurrences in ibis-server/app/util.py:

  • At the primary execute_with_timeout wrapper (around line 261)
  • In the query-task cancellation block (around line 293)
  • In the connection‐cleanup block (around line 305)
--- a/ibis-server/app/util.py
@@ 258,264
-    except TimeoutError:
+    except asyncio.TimeoutError:
         raise DatabaseTimeoutError(
             f"{operation_name} timeout after {app_timeout_seconds} seconds"
         )
@@ 290,296
-            except (TimeoutError, asyncio.CancelledError):
+            except (asyncio.TimeoutError, asyncio.CancelledError):
                 # Task didn't cancel in time or was cancelled
                 logger.warning(
                     f"{operation_name} task cancellation timed out or was cancelled"
                 )
@@ 302,308
-        except TimeoutError:
+        except asyncio.TimeoutError:
             logger.warning("Connection cleanup timed out")
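Which exception class actually fires is version-dependent: on Python 3.11+ `asyncio.TimeoutError` is an alias of the builtin `TimeoutError`, while on 3.10 and earlier they are distinct classes. A small probe that works on either side of the split:

```python
import asyncio


async def slow():
    await asyncio.sleep(10)


async def main() -> str:
    try:
        await asyncio.wait_for(slow(), timeout=0.01)
    except TimeoutError:            # matches on Python 3.11+
        return "builtin TimeoutError"
    except asyncio.TimeoutError:    # needed on Python <= 3.10
        return "asyncio.TimeoutError"
    return "no timeout"


caught = asyncio.run(main())
```

Catching `(asyncio.TimeoutError, TimeoutError)` in one tuple, as the earlier review comment suggested, is the simplest way to stay correct across versions.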
♻️ Duplicate comments (1)
ibis-server/app/util.py (1)

5-12: Centralize ClickHouseDbError; avoid duplicate class definitions across modules.

This file defines a fallback ClickHouseDbError while app/model/connector.py also defines one (lines 16–17 in that file). Split-brain exceptions can break except handlers when different imports are used. Move the guarded alias into a single shared module and import it here.

Apply in this file:

-try:
-    import clickhouse_connect
-
-    ClickHouseDbError = clickhouse_connect.driver.exceptions.DatabaseError
-except ImportError:  # pragma: no cover
-
-    class ClickHouseDbError(Exception):
-        pass
+from app.exceptions import ClickHouseDbError

New module (add file app/exceptions.py):

try:
    import clickhouse_connect
    ClickHouseDbError = clickhouse_connect.driver.exceptions.DatabaseError
except ImportError:  # pragma: no cover
    class ClickHouseDbError(Exception):
        pass
🧹 Nitpick comments (1)
ibis-server/app/util.py (1)

265-272: Prefer structured timeout detection and preserve cause chaining.

String matching ("TIMEOUT_EXCEEDED") is brittle. If the driver exposes structured fields (e.g., code/error_name), check those first; keep the string check as a fallback. Also chain the original exception for better diagnostics.

-    except ClickHouseDbError as e:
-        if "TIMEOUT_EXCEEDED" in str(e):
-            raise DatabaseTimeoutError(f"{operation_name} was cancelled: {e}")
+    except ClickHouseDbError as e:
+        code = getattr(e, "code", None) or getattr(e, "error_code", None)
+        if code == 160 or "TIMEOUT_EXCEEDED" in str(e):
+            raise DatabaseTimeoutError(f"{operation_name} was cancelled: {e}") from e
         raise
-    except trino.exceptions.TrinoQueryError as e:
-        if e.error_name == "EXCEEDED_TIME_LIMIT":
-            raise DatabaseTimeoutError(f"{operation_name} was cancelled: {e}")
+    except trino.exceptions.TrinoQueryError as e:
+        if getattr(e, "error_name", "") == "EXCEEDED_TIME_LIMIT":
+            raise DatabaseTimeoutError(f"{operation_name} was cancelled: {e}") from e
         raise
-    except psycopg.errors.QueryCanceled as e:
-        raise DatabaseTimeoutError(f"{operation_name} was cancelled: {e}")
+    except psycopg.errors.QueryCanceled as e:
+        raise DatabaseTimeoutError(f"{operation_name} was cancelled: {e}") from e
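The `raise ... from e` suffix matters for diagnostics: it preserves the driver exception on `__cause__`, so tracebacks show both layers. A stdlib illustration (`DatabaseTimeoutError` here is a local stand-in for the class in app.model.error):

```python
class DatabaseTimeoutError(Exception):
    """Stand-in for app.model.error.DatabaseTimeoutError."""


def translate(operation_name: str, e: Exception) -> None:
    # Chaining keeps the original driver error reachable via __cause__.
    raise DatabaseTimeoutError(f"{operation_name} was cancelled: {e}") from e


try:
    try:
        raise RuntimeError("TIMEOUT_EXCEEDED: query ran past 30s")
    except RuntimeError as inner:
        translate("Query", inner)
except DatabaseTimeoutError as outer:
    cause = outer.__cause__  # the original RuntimeError
```

Without `from e`, Python still records the in-flight exception on `__context__`, but `from e` makes the causal link explicit and survives re-raising from helper functions.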
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 63f1fa0 and 6993622.

📒 Files selected for processing (2)
  • ibis-server/app/model/connector.py (8 hunks)
  • ibis-server/app/util.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • ibis-server/app/model/connector.py
🧰 Additional context used
🧬 Code graph analysis (1)
ibis-server/app/util.py (2)
ibis-server/app/model/connector.py (1)
  • ClickHouseDbError (17-18)
ibis-server/app/model/error.py (1)
  • DatabaseTimeoutError (116-125)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: ci
🔇 Additional comments (1)
ibis-server/app/util.py (1)

44-44: Import path switch to WrenError-based DatabaseTimeoutError — LGTM.

Correctly imports the unified error from app.model.error, aligning with the new structured ErrorResponse.

@goldmedal
Contributor Author

BigQuery has been tested locally:

poetry run pytest -m 'bigquery'
============================================================================================================================================ test session starts ============================================================================================================================================
platform darwin -- Python 3.11.11, pytest-8.4.1, pluggy-1.5.0
rootdir: /Users/jax/git/wren-engine/ibis-server
configfile: pyproject.toml
plugins: anyio-4.9.0
collected 369 items / 330 deselected / 39 selected                                                                                                                                                                                                                                                          

tests/routers/v2/connector/test_bigquery.py ..................                                                                                                                                                                                                                                        [ 46%]
tests/routers/v3/connector/bigquery/test_functions.py ....                                                                                                                                                                                                                                            [ 56%]
tests/routers/v3/connector/bigquery/test_query.py .................                                                                                                                                                                                                                                   [100%]

@goldmedal goldmedal merged commit e3d2d21 into Canner:main Aug 27, 2025
4 of 5 checks passed
@goldmedal goldmedal deleted the feat/enhance-error-handle branch August 27, 2025 03:25

Labels

bigquery ibis python Pull requests that update Python code
