Skip to content

fix(core): fix access controls case-insensitive issue and disable fallback for the query with access control#1295

Merged
goldmedal merged 2 commits intoCanner:mainfrom
goldmedal:fix/access-controls-case-insensitive
Aug 22, 2025
Merged

fix(core): fix access controls case-insensitive issue and disable fallback for the query with access control#1295
goldmedal merged 2 commits intoCanner:mainfrom
goldmedal:fix/access-controls-case-insensitive

Conversation

@goldmedal
Copy link
Copy Markdown
Contributor

@goldmedal goldmedal commented Aug 22, 2025

  • Disable Fallback when the MDL contains any access control rules.
  • Fix csae insensitive issue for rlac validation.

Summary by CodeRabbit

  • New Features
    • Added manifest backward-compatibility check to decide when v3 requests may fall back to v2.
  • Changes
    • Fallback now depends on manifest compatibility (e.g., presence of RLAC/CLAC) rather than request headers.
  • Bug Fixes
    • Session property names in RLAC are treated case-insensitively.
    • Access-control error messages standardized.
  • Tests
    • Expanded RLAC/CLAC and backward-compatibility test coverage; function-list test wiring improved.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Aug 22, 2025

Walkthrough

Replaces header-based fallback gating with a manifest compatibility check exposed from wren_core; v3 endpoints now consult is_backward_compatible(manifest_str) before falling back to v2. Adds PyO3 binding and Rust implementation for is_backward_compatible, normalizes session property case in access-control analysis, and updates tests and function-list test wiring.

Changes

Cohort / File(s) Summary of changes
Fallback gating refactor
ibis-server/app/dependencies.py, ibis-server/app/routers/v3/connector.py
Removed header-based exist_wren_variables_header; added is_backward_compatible(manifest_str) delegating to wren_core. v3 endpoints (query, dry_plan, dry_plan_for_data_source, validate, model_substitute) now gate v2 fallback on is_backward_compatible(dto.manifest_str) and existing fallback settings.
Postgres connector tests
ibis-server/tests/routers/v3/connector/postgres/conftest.py, .../test_fallback_v2.py, .../test_functions.py, .../test_query.py
Added function_list_path and autouse fixture to set/reset remote function list path in conftest; moved function-list toggling into tests; added RLAC/CLAC manifest cases asserting 422 under fallback; reordered few assertions in timeout test.
wren-core Python binding & manifest check
wren-core-py/src/lib.rs, wren-core-py/src/manifest.rs, wren-core-py/tests/test_modeling_core.py
Exposed manifest::is_backward_compatible to Python via PyO3; implemented is_backward_compatible to decode base64 MDL and return false if any model has RLAC or any column has CLAC; expanded tests to exercise RLAC/CLAC semantics and backward-compatibility checks.
Context formatting (no semantic change)
wren-core-py/src/context.rs
Reformatted destructuring/assignment in pushdown limit handling; behavior unchanged.
Access-control normalization
wren-core/core/src/logical_plan/analyze/access_control.rs
Lowercases session property identifiers (strip '@' then to_ascii_lowercase) when collecting RLAC session properties.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant V3 as ibis-server v3 Router
  participant J as Java Engine
  participant WC as wren_core.is_backward_compatible
  participant V2 as ibis-server v2 Fallback

  C->>V3: POST /v3/query (manifest_str, SQL)
  V3->>J: Execute request
  J-->>V3: Error (unavailable/feature)
  alt Fallback enabled
    V3->>WC: is_backward_compatible(manifest_str)
    WC-->>V3: true / false
    alt compatible (true)
      V3->>V2: Route to v2 execution
      V2-->>V3: v2 Response
      V3-->>C: v2 Response
    else not compatible (false)
      V3-->>C: Propagate original error
    end
  else Fallback disabled
    V3-->>C: Propagate original error
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • douenergy
  • wwwy3y3

Poem

A rabbit reads the manifest scroll,
Sniffs for RLAC, CLAC—control!
"Backward or not?" the whiskers twitch,
To v2 we hop if rules don't stitch.
Lowercase keys and bindings bright—thump! 🐇✨

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added core ibis rust Pull requests that update Rust code python Pull requests that update Python code labels Aug 22, 2025
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
wren-core/core/src/logical_plan/analyze/access_control.rs (1)

302-315: CLAC evaluation still does a case-sensitive property lookup — can bypass column rules.

validate_clac_rule retrieves the property value using the manifest-declared property.name verbatim. If the manifest uses mixed case (e.g., "Session_level") but headers/properties were normalized to lowercase elsewhere, properties.get(&property.name) returns None, and the function defaults to Ok(true) — effectively allowing the column when it should be evaluated/possibly hidden. Align this with your RLAC handling by lowercasing the key before lookup.

Apply this diff:

-    let property = &clac.required_properties[0];
-    let value_opt = properties.get(&property.name);
+    let property = &clac.required_properties[0];
+    // Case-insensitive: properties are normalized to lowercase upstream
+    let key = property.name.to_ascii_lowercase();
+    let value_opt = properties.get(&key);
🧹 Nitpick comments (19)
wren-core/core/src/logical_plan/analyze/access_control.rs (5)

139-146: Make RLAC property-name normalization consistent (use to_ascii_lowercase).

Elsewhere you now use to_ascii_lowercase(); continue that here to avoid locale-sensitive casing issues.

Apply this diff:

-                let property_name =
-                    value.trim_start_matches("@").to_string().to_lowercase();
+                let property_name = value.trim_start_matches("@").to_ascii_lowercase();

93-101: Avoid O(n^2) membership checks and unify ASCII-lowercasing for required properties.

Use a HashSet for required properties to speed up validation when property lists grow, and prefer to_ascii_lowercase() for consistency.

Apply this diff:

-    let required_properties: Vec<_> = required_properties
-        .iter()
-        .map(|property| property.name.to_lowercase())
-        .collect();
-
-    let missed_properties: Vec<_> = session_properties
-        .iter()
-        .filter(|property| !required_properties.contains(property))
-        .collect();
+    let required: std::collections::HashSet<_> = required_properties
+        .iter()
+        .map(|p| p.name.to_ascii_lowercase())
+        .collect();
+
+    let missed_properties: Vec<_> = session_properties
+        .iter()
+        .filter(|p| !required.contains(*p))
+        .collect();

324-326: Normalize with to_ascii_lowercase in header presence checks.

Use ASCII-lowercasing here as well to match other sites and prevent locale pitfalls.

Apply this diff:

-        .get(&property_name.to_lowercase())
+        .get(&property_name.to_ascii_lowercase())

235-236: Clarify and fix grammar in the error message for invalid session-property expressions.

Short, actionable error helps users. Also avoid implying you know the property name here.

Apply this diff:

-        _ => plan_err!("The session property {} allow only literal value", expr),
+        _ => plan_err!("Only literal values are allowed in session-property expressions: {}", expr),

63-65: Minor: drop redundant contains() check before insert.

HashSet::insert already returns whether the value was newly inserted. You can simplify.

Apply this diff:

-                if !session_properties.contains(&session_property) {
-                    session_properties.insert(session_property);
-                }
+                session_properties.insert(session_property);
wren-core-py/src/context.rs (1)

243-260: Limit pushdown patterning looks correct; consider two edge cases (optional).

  • If LIMIT is a non-literal (e.g., parameterized), current logic skips changes — fine, just noting the limitation.
  • When inserting a new limit, you force is to false. If your AST flag encodes “long”/“approximate” or similar semantics, consider preserving style consistently; otherwise this is OK.
wren-core-py/src/manifest.rs (1)

25-41: Broaden compatibility check and improve naming clarity.

  • Naming: ralc_exist/clac_exist read like “exists” but hold the negation (they’re all(... is_empty)/all(... is_none)). Rename for clarity.
  • Coverage: Consider also failing compatibility when deprecated ACL fields (rls/cls) are still present in columns. This guards older manifests that still use legacy fields.

Apply this diff:

-/// Check if the MDL can be used by the v2 wren core. If there are any access controls rules,
+/// Check if the MDL can be used by the v2 wren core. If there are any access-control rules,
 /// the MDL should be used by the v3 wren core only.
 #[pyfunction]
 pub fn is_backward_compatible(mdl_base64: &str) -> Result<bool, CoreError> {
     let manifest = to_manifest(mdl_base64)?;
-    let ralc_exist = manifest
+    let no_rlac = manifest
         .models
         .iter()
         .all(|model| model.row_level_access_controls().is_empty());
-    let clac_exist = manifest.models.iter().all(|model| {
+    let no_clac = manifest.models.iter().all(|model| {
         model
             .columns
             .iter()
             .all(|column| column.column_level_access_control().is_none())
     });
-    Ok(ralc_exist && clac_exist)
+    // Also treat legacy (deprecated) ACL fields as incompatible with v2
+    let no_legacy_rls = manifest.models.iter().all(|m| {
+        m.columns.iter().all(|c| c.rls.is_none())
+    });
+    let no_legacy_cls = manifest.models.iter().all(|m| {
+        m.columns.iter().all(|c| c.cls.is_none())
+    });
+    Ok(no_rlac && no_clac && no_legacy_rls && no_legacy_cls)
 }

If you prefer not to touch deprecated fields directly, add a unit test to ensure a manifest with legacy rls/cls returns false, then adjust the implementation accordingly.

ibis-server/app/dependencies.py (2)

1-1: Prefer lazy import to avoid hard failures at import time

Importing the extension module at top-level can break app startup if the native module isn’t present/initialized. Lazily import inside the helper to fail closed per-call (as you already do) without crashing module import.

Apply:

-import wren_core
+import logging

And update the function below (see next comment) to import from wren_core inside the try block.


52-56: Fail-closed is good; add lazy import + debug logging for diagnosability

Returning False on errors aligns with “disable fallback if uncertain.” Add lazy import and a debug log so we can trace why fallback was disabled in prod when needed.

-def is_backward_compatible(manifest_str: str) -> bool:
-    try:
-        return wren_core.is_backward_compatible(manifest_str)
-    except Exception:
-        return False
+def is_backward_compatible(manifest_str: str) -> bool:
+    try:
+        # Lazy import to avoid hard failures at module import time if extension is unavailable.
+        from wren_core import is_backward_compatible as _core_is_backward_compatible
+        return _core_is_backward_compatible(manifest_str)
+    except Exception as e:
+        logging.getLogger(__name__).debug(
+            "is_backward_compatible() failed; disabling fallback. Error: %s", e
+        )
+        return False
ibis-server/tests/routers/v3/connector/postgres/conftest.py (2)

23-24: Resolve the function list path once per module

Looks good. Consider making the autouse fixture module-scoped to avoid resetting this value for every test function.


65-70: Scope the autouse fixture to 'module' to reduce overhead and ensure consistent setup/teardown

Per-test flipping of a global configuration can be noisy. Module scope matches how the resources are shared in this directory.

-@pytest.fixture(autouse=True)
-def set_remote_function_list_path():
+@pytest.fixture(scope="module", autouse=True)
+def set_remote_function_list_path():
     config = get_config()
     config.set_remote_function_list_path(function_list_path)
     yield
     config.set_remote_function_list_path(None)
ibis-server/app/routers/v3/connector.py (2)

185-196: Outdated comment still mentions “header include row-level access control”

The guard now checks the manifest for RLAC/CLAC; headers are no longer the deciding factor. Update the inline comment to avoid confusion for future maintainers.

Example fix (apply similarly in each block):

-            # because the v2 API doesn't support row-level access control,
-            # we don't fallback to v2 if the header include row-level access control properties.
+            # v2 API doesn't support access control (RLAC/CLAC),
+            # so we don't fallback to v2 if the manifest includes any access control rules.

Also applies to: 231-241, 279-290, 345-355, 442-451


185-196: Optional: de-duplicate the fallback decision into a small helper

The same triage logic is repeated in five places. Consider extracting to a local helper to reduce future drift.

Proposed helper (add near top of this module):

def _should_fallback(java_engine_connector, headers: Headers, manifest_str: str) -> bool:
    is_fallback_disable = bool(
        headers.get(X_WREN_FALLBACK_DISABLE)
        and safe_strtobool(headers.get(X_WREN_FALLBACK_DISABLE, "false"))
    )
    return (
        java_engine_connector.client is not None
        and not is_fallback_disable
        and is_backward_compatible(manifest_str)
    )

Then replace the repeated if-block with:

if not _should_fallback(java_engine_connector, headers, dto.manifest_str):
    raise e
ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (1)

88-89: Avoid hard-coded epoch thresholds for cache timestamps

Using a fixed epoch like 1743984000 can become brittle. Consider comparing to the test start time or simply asserting presence/format.

Example:

-assert int(response2.headers["X-Cache-Create-At"]) > 1743984000  # 2025.04.07
+assert int(response2.headers["X-Cache-Create-At"]) > 0

Or capture t0 = int(time.time()) before the request and assert the timestamp is >= t0.

wren-core-py/tests/test_modeling_core.py (2)

389-391: Relax brittle string-equality checks for error messages

Exact string comparisons on error messages tend to be fragile. Prefer substring checks or custom error types to avoid test flakes if wording changes.

-assert (
-    str(e) == 'Permission Denied: No permission to access "customer"."c_name"'
-)
+assert "Permission Denied" in str(e)
+assert '"customer"."c_name"' in str(e)

393-446: Optional CLAC-with-default test looks good; consider using pytest.raises for clarity

You’re manually try/except-ing. Using pytest.raises keeps the happy-path readable.

-try:
-    session_context.transform_sql(sql)
-except Exception as e:
-    assert (
-        str(e) == 'Permission Denied: No permission to access "orders"."o_orderkey"'
-    )
+with pytest.raises(Exception) as e:
+    session_context.transform_sql(sql)
+assert "Permission Denied" in str(e.value)
+assert '"orders"."o_orderkey"' in str(e.value)
ibis-server/tests/routers/v3/connector/postgres/test_functions.py (3)

7-8: Prefer not importing symbols from conftest directly

Directly importing constants from a conftest.py couples tests to pytest’s discovery file. Consider moving stable constants to a small tests/routers/v3/connector/postgres/constants.py (or exposing them as fixtures) to decouple imports from pytest internals. Optional, but improves maintainability.

If you choose to move constants, the import change here would look like:

-from tests.conftest import DATAFUSION_FUNCTION_COUNT
-from tests.routers.v3.connector.postgres.conftest import base_url, function_list_path
+from tests.conftest import DATAFUSION_FUNCTION_COUNT
+from tests.routers.v3.connector.postgres.constants import base_url, function_list_path

And you’d add a small constants module (outside the current line range, shown for completeness):

# tests/routers/v3/connector/postgres/constants.py
from pathlib import Path

# Keep these definitions in one place for reuse across tests
base_url = "/api/v3/connector/postgres"
function_list_path = Path(__file__).parent / "resources" / "function_list.json"

35-63: Ensure config cleanup on failures to avoid cross-test leakage

This test mutates global config and tries to reset it, but if an assertion fails before the final reset, later tests may inherit a polluted state. Wrap the body in try/finally to guarantee cleanup.

Apply this diff:

 async def test_function_list(client):
-    config = get_config()
-
-    config.set_remote_function_list_path(None)
-    response = await client.get(url=f"{base_url}/functions")
-    assert response.status_code == 200
-    result = response.json()
-    assert len(result) == DATAFUSION_FUNCTION_COUNT
-
-    config.set_remote_function_list_path(function_list_path)
-    response = await client.get(url=f"{base_url}/functions")
-    assert response.status_code == 200
-    result = response.json()
-    assert len(result) == DATAFUSION_FUNCTION_COUNT + 35
-    the_func = next(filter(lambda x: x["name"] == "extract", result))
-    assert the_func == {
-        "name": "extract",
-        "description": "Get subfield from date/time",
-        "function_type": "scalar",
-        "param_names": None,
-        "param_types": None,
-        "return_type": None,
-    }
-
-    config.set_remote_function_list_path(None)
-    response = await client.get(url=f"{base_url}/functions")
-    assert response.status_code == 200
-    result = response.json()
-    assert len(result) == DATAFUSION_FUNCTION_COUNT
+    config = get_config()
+    try:
+        config.set_remote_function_list_path(None)
+        response = await client.get(url=f"{base_url}/functions")
+        assert response.status_code == 200
+        result = response.json()
+        assert len(result) == DATAFUSION_FUNCTION_COUNT
+
+        config.set_remote_function_list_path(function_list_path)
+        response = await client.get(url=f"{base_url}/functions")
+        assert response.status_code == 200
+        result = response.json()
+        assert len(result) == DATAFUSION_FUNCTION_COUNT + 35
+        the_func = next(filter(lambda x: x["name"] == "extract", result))
+        assert the_func == {
+            "name": "extract",
+            "description": "Get subfield from date/time",
+            "function_type": "scalar",
+            "param_names": None,
+            "param_types": None,
+            "return_type": None,
+        }
+
+        config.set_remote_function_list_path(None)
+        response = await client.get(url=f"{base_url}/functions")
+        assert response.status_code == 200
+        result = response.json()
+        assert len(result) == DATAFUSION_FUNCTION_COUNT
+    finally:
+        # Ensure cleanup even if an assertion above fails
+        config.set_remote_function_list_path(None)

49-57: Make the assertion failure friendlier if “extract” isn’t present

Using next(filter(...)) raises StopIteration, which hides context. Prefer a guarded lookup to produce a clearer error.

Apply this diff:

-    the_func = next(filter(lambda x: x["name"] == "extract", result))
-    assert the_func == {
+    extract_func = next((f for f in result if f["name"] == "extract"), None)
+    assert extract_func is not None, "extract function not found in function list"
+    assert extract_func == {
         "name": "extract",
         "description": "Get subfield from date/time",
         "function_type": "scalar",
         "param_names": None,
         "param_types": None,
         "return_type": None,
     }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 92a1c7d and 3802559.

⛔ Files ignored due to path filters (1)
  • ibis-server/resources/function_list/postgres.csv is excluded by !**/*.csv
📒 Files selected for processing (11)
  • ibis-server/app/dependencies.py (2 hunks)
  • ibis-server/app/routers/v3/connector.py (6 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/conftest.py (3 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (1 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/test_functions.py (1 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/test_query.py (2 hunks)
  • wren-core-py/src/context.rs (2 hunks)
  • wren-core-py/src/lib.rs (1 hunks)
  • wren-core-py/src/manifest.rs (1 hunks)
  • wren-core-py/tests/test_modeling_core.py (3 hunks)
  • wren-core/core/src/logical_plan/analyze/access_control.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-05-05T02:27:29.829Z
Learnt from: goldmedal
PR: Canner/wren-engine#1161
File: ibis-server/app/routers/v3/connector.py:78-83
Timestamp: 2025-05-05T02:27:29.829Z
Learning: The row-level access control implementation in Wren Engine filters headers with the prefix `X_WREN_VARIABLE_PREFIX` in `EmbeddedEngineRewriter.get_session_properties` and validates session property expressions in `access_control.rs` to ensure they only contain literal values, preventing SQL injection.

Applied to files:

  • wren-core/core/src/logical_plan/analyze/access_control.rs
🧬 Code graph analysis (6)
wren-core-py/src/lib.rs (1)
wren-core-py/src/manifest.rs (1)
  • is_backward_compatible (28-41)
wren-core-py/src/manifest.rs (1)
wren-core-base/manifest-macro/src/lib.rs (3)
  • manifest (26-56)
  • model (119-154)
  • column (159-197)
wren-core/core/src/logical_plan/analyze/access_control.rs (1)
wren-core-base/manifest-macro/src/lib.rs (1)
  • session_property (396-416)
ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (1)
ibis-server/tests/conftest.py (1)
  • client (18-23)
wren-core-py/src/context.rs (1)
wren-core/core/src/mdl/context.rs (1)
  • create_ctx_with_mdl (46-107)
wren-core-py/tests/test_modeling_core.py (2)
wren-core-py/src/manifest.rs (1)
  • is_backward_compatible (28-41)
wren-core-py/src/validation.rs (1)
  • validate_rlac_rule (7-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: cargo test (macos-aarch64)
  • GitHub Check: cargo test (macos)
  • GitHub Check: cargo test (win64)
  • GitHub Check: cargo check
  • GitHub Check: test
  • GitHub Check: ci
🔇 Additional comments (19)
wren-core/core/src/logical_plan/analyze/access_control.rs (1)

59-62: Good fix: normalize session-property identifiers to ASCII lowercase.

Lowercasing here eliminates case-mismatch bugs when collecting properties from RLAC conditions.

wren-core-py/src/context.rs (1)

169-179: LGTM: clearer local naming while preserving behavior.

Renaming the intermediate ctx to unparser_ctx improves readability; no functional change.

ibis-server/tests/routers/v3/connector/postgres/test_query.py (1)

697-702: LGTM: assert status code after checking the timeout message.

Reordering keeps the more informative assertion first (message), then status code 504 — good for debugging flaky timeouts.

Also applies to: 713-718

wren-core-py/src/lib.rs (1)

26-26: Expose is_backward_compatible to Python — looks good.

Binding is correct and matches the function signature; no issues.

ibis-server/tests/routers/v3/connector/postgres/conftest.py (1)

8-8: LGTM: centralizing config access via get_config

This removes duplication in tests and keeps configuration setup in one place.

ibis-server/app/routers/v3/connector.py (6)

18-19: Import swap to manifest-based compatibility gate

Replacing the old header-based signal with is_backward_compatible is the right direction to couple fallback strictly to manifest semantics.


185-196: Fallback gating reads clearly and matches PR intent

The combined conditions correctly block fallback when:

  • Java client is unavailable,
  • explicit fallback disable header is set, or
  • manifest isn’t backward compatible (has RLAC/CLAC).

This is aligned with “disable fallback when access control is configured.”


231-241: Same gating applied to dry-plan endpoint—good consistency

The same manifest-based rule is uniformly applied here.


279-290: Same gating applied to dry-plan-for-data-source—good

Uniform behavior across endpoints reduces surprises.


345-355: Same gating applied to validate—good

Consistent with query and planning endpoints.


442-451: Same gating applied to model-substitute—good

Consistent fallback policy across all v3 endpoints.

ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (3)

383-386: Asserting 200 with only session header and no RLAC in manifest is correct

This confirms that headers alone no longer suppress fallback—now manifest content dictates behavior.


385-426: RLAC manifest case returning 422 validates “no fallback when access control exists”

Good addition. It precisely captures the intended policy change.


427-471: CLAC manifest case returning 422 rounds out coverage

Nice complement to the RLAC case; confirms both access-control types disable fallback.

wren-core-py/tests/test_modeling_core.py (3)

11-11: Public API use of is_backward_compatible

Good to exercise the Python binding directly in tests.


336-354: RLAC validation cases: case-insensitivity and fully-qualified column reference

These assertions align with the PR intent to fix case-insensitive RLAC validation and fully-qualify columns.


447-550: Backward-compatibility checks comprehensively cover RLAC, CLAC, and plain manifests

This is the critical contract for server-side fallback gating; tests read clean and assert the right behavior.

ibis-server/tests/routers/v3/connector/postgres/test_functions.py (2)

7-8: Centralized imports from shared conftest — LGTM

Importing DATAFUSION_FUNCTION_COUNT, base_url, and function_list_path from shared test utilities avoids duplication and keeps this test lean.


7-8: Imports Verified in postgres conftest

All imported symbols are defined as expected:

  • ibis-server/tests/routers/v3/connector/postgres/conftest.py
    base_url = "/v3/connector/postgres" (line 13)
    function_list_path = file_path("../resources/function_list") (line 23)
  • ibis-server/tests/conftest.py
    DATAFUSION_FUNCTION_COUNT = 283 (line 14)

No further action needed—these imports are stable and correctly exported.

@goldmedal goldmedal requested a review from douenergy August 22, 2025 09:26
@goldmedal goldmedal force-pushed the fix/access-controls-case-insensitive branch from 3802559 to 300c9ec Compare August 22, 2025 09:50
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (6)
ibis-server/app/dependencies.py (2)

1-1: Move wren_core import into the wrapper to avoid import-time failures and reduce module side effects

Top-level importing wren_core will crash the app at import time if the extension is unavailable/misconfigured. Given we already treat errors as “no fallback,” lazy-importing inside the helper keeps the process resilient and aligns with the “safe by default” behavior.

Apply this diff to drop the top-level import:

-import wren_core

I’ve proposed the corresponding in-function import in the next comment.


52-56: Behavior is correct; harden the wrapper with lazy import and debug logging (no manifest leakage)

Returning False on any error is the right safety default. To avoid import-time crashes and to aid debugging without leaking the manifest, prefer a lazy import and log at debug level.

Apply this diff:

-def is_backward_compatible(manifest_str: str) -> bool:
-    try:
-        return wren_core.is_backward_compatible(manifest_str)
-    except Exception:
-        return False
+def is_backward_compatible(manifest_str: str) -> bool:
+    """
+    Returns True only if the manifest (base64-encoded MDL JSON) contains no RLAC/CLAC rules.
+    On any error (invalid base64, parse failure, missing extension), returns False to disable fallback.
+    """
+    # Lazy import to prevent module import failures from taking down the process.
+    try:
+        from wren_core import is_backward_compatible as _core_is_backward_compatible
+    except Exception as e:  # ImportError or runtime binding error
+        import logging
+        logging.getLogger(__name__).debug(
+            "wren_core import failed (%s); treating as not backward-compatible", e
+        )
+        return False
+
+    try:
+        return bool(_core_is_backward_compatible(manifest_str))
+    except Exception as e:
+        import logging
+        logging.getLogger(__name__).debug(
+            "wren_core.is_backward_compatible failed (%s); treating as not backward-compatible",
+            e,
+        )
+        return False
wren-core-py/tests/test_modeling_core.py (4)

341-351: Keep column qualification consistent in the case-insensitivity test

The first RLAC test uses customer.c_name, while this one uses bare c_name. To avoid ambiguity and make intent clear, qualify the column here as well.

-        condition="c_name = @SEssion_user",
+        condition="customer.c_name = @SEssion_user",

383-391: Prefer pytest.raises over try/except in tests

Using pytest.raises yields cleaner failures and better tracebacks.

-    try:
-        session_context.transform_sql(sql)
-    except Exception as e:
-        assert (
-            str(e) == 'Permission Denied: No permission to access "customer"."c_name"'
-        )
+    with pytest.raises(Exception) as e:
+        session_context.transform_sql(sql)
+    assert 'Permission Denied: No permission to access "customer"."c_name"' in str(e.value)

393-446: Use pytest.raises in opt-CLAC test for clearer intent

Same reasoning as above; this also avoids swallowing unexpected exceptions.

-    try:
-        session_context.transform_sql(sql)
-    except Exception as e:
-        assert (
-            str(e) == 'Permission Denied: No permission to access "orders"."o_orderkey"'
-        )
+    with pytest.raises(Exception) as e:
+        session_context.transform_sql(sql)
+    assert 'Permission Denied: No permission to access "orders"."o_orderkey"' in str(e.value)

447-549: Add one negative test for invalid base64 to harden the contract

This ensures the Python binding’s behavior on malformed input is explicit (raise vs. coerce), and keeps parity with the server-side wrapper that returns False on errors.

Add this test near the other compatibility checks:

def test_backward_compatible_check_invalid_manifest():
    with pytest.raises(Exception):
        is_backward_compatible("not-base64")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3802559 and 300c9ec.

⛔ Files ignored due to path filters (1)
  • ibis-server/resources/function_list/postgres.csv is excluded by !**/*.csv
📒 Files selected for processing (11)
  • ibis-server/app/dependencies.py (2 hunks)
  • ibis-server/app/routers/v3/connector.py (6 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/conftest.py (3 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (1 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/test_functions.py (1 hunks)
  • ibis-server/tests/routers/v3/connector/postgres/test_query.py (2 hunks)
  • wren-core-py/src/context.rs (1 hunks)
  • wren-core-py/src/lib.rs (1 hunks)
  • wren-core-py/src/manifest.rs (1 hunks)
  • wren-core-py/tests/test_modeling_core.py (3 hunks)
  • wren-core/core/src/logical_plan/analyze/access_control.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • wren-core-py/src/lib.rs
  • wren-core-py/src/manifest.rs
  • wren-core/core/src/logical_plan/analyze/access_control.rs
  • ibis-server/app/routers/v3/connector.py
  • ibis-server/tests/routers/v3/connector/postgres/conftest.py
  • wren-core-py/src/context.rs
  • ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py
  • ibis-server/tests/routers/v3/connector/postgres/test_functions.py
  • ibis-server/tests/routers/v3/connector/postgres/test_query.py
🧰 Additional context used
🧬 Code graph analysis (2)
ibis-server/app/dependencies.py (2)
wren-core-py/src/manifest.rs (1)
  • is_backward_compatible (28-41)
ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (1)
  • manifest_str (30-31)
wren-core-py/tests/test_modeling_core.py (6)
ibis-server/app/dependencies.py (1)
  • is_backward_compatible (52-56)
wren-core-py/src/manifest.rs (1)
  • is_backward_compatible (28-41)
wren-core/core/src/logical_plan/analyze/access_control.rs (2)
  • validate_rlac_rule (85-114)
  • required_properties (252-278)
wren-core-py/src/validation.rs (1)
  • validate_rlac_rule (7-13)
wren-core-base/manifest-macro/src/lib.rs (2)
  • model (119-154)
  • manifest (26-56)
wren-core-py/src/context.rs (1)
  • transform_sql (207-223)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: check Cargo.toml formatting
  • GitHub Check: cargo test (amd64)
  • GitHub Check: clippy
  • GitHub Check: test
  • GitHub Check: ci
  • GitHub Check: cargo test (win64)
  • GitHub Check: cargo test (macos)
  • GitHub Check: cargo test (macos-aarch64)
🔇 Additional comments (3)
wren-core-py/tests/test_modeling_core.py (3)

11-11: LGTM: importing is_backward_compatible into the test module

The public API is exercised by the new tests later in the file. No issues.


336-340: LGTM: qualify column in RLAC validation test

Using customer.c_name = @session_user aligns with model-scoped column references and reduces ambiguity during rule validation.


447-549: Backward-compatibility checks: good coverage

Positive/negative cases for CLAC, RLAC, and a clean manifest look solid and mirror the Rust-side semantics.

@goldmedal goldmedal merged commit f04cee7 into Canner:main Aug 22, 2025
14 checks passed
@goldmedal goldmedal deleted the fix/access-controls-case-insensitive branch August 22, 2025 10:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core ibis python Pull requests that update Python code rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants