feat(core): introduce the row-level access control for the model #1161
douenergy merged 30 commits into Canner:main from
Walkthrough
This set of changes introduces comprehensive support for row-level access control (RLAC) based on session properties throughout the Wren modeling and query stack. The Rust modeling core, procedural macros, and builder patterns are extended to allow models to declare RLAC rules with required or optional session properties and filter conditions. The logical plan analysis and generation layers are refactored to propagate session properties, validate RLAC rules, and inject filter expressions into query plans. Python bindings and server-side components are updated to pass header-derived session properties into the rewriting and transformation pipeline. Extensive tests and new examples are added to verify RLAC behavior, including error cases, property case-insensitivity, and integration with DataFusion. Supporting utilities and constants are introduced in both Python and Rust codebases.
Sequence Diagram(s)
sequenceDiagram
participant Client
participant API_Server
participant Rewriter
participant EmbeddedEngine
participant SessionContext
participant WrenEngine
Client->>API_Server: POST /query with headers (session props)
API_Server->>Rewriter: rewrite(sql, properties=dict(headers))
Rewriter->>EmbeddedEngine: rewrite(manifest, sql, properties)
EmbeddedEngine->>SessionContext: transform_sql(sql, session_properties)
SessionContext->>WrenEngine: transform_sql_with_ctx(..., session_properties)
WrenEngine-->>SessionContext: rewritten SQL (with RLAC filter)
SessionContext-->>EmbeddedEngine: rewritten SQL
EmbeddedEngine-->>Rewriter: rewritten SQL
Rewriter-->>API_Server: rewritten SQL
API_Server-->>Client: Query result
Actionable comments posted: 4
🔭 Outside diff range comments (4)
wren-core/core/src/mdl/context.rs (1)
67-69: ⚠️ Potential issue
Fix cargo fmt error.
The pipeline reports a formatting issue on this line.
Run cargo fmt on this file to fix the formatting issue.
🧰 Tools
🪛 GitHub Actions: Rust
[error] 69-69: cargo fmt check failed due to formatting differences. Code style does not match expected formatting.
ibis-server/app/routers/v3/connector.py (1)
304-313: 🛠️ Refactor suggestion
Missing RLAC fallback check in model_substitute endpoint.
Unlike other endpoints, the model_substitute endpoint doesn't check for RLAC headers before falling back to v2.
Add the same RLAC fallback check logic to maintain consistency:
     try:
         sql = ModelSubstitute(data_source, dto.manifest_str, headers).substitute(
             dto.sql
         )
         Connector(data_source, dto.connection_info).dry_run(
             await Rewriter(
                 dto.manifest_str,
                 data_source=data_source,
                 experiment=True,
+                properties=dict(headers),
             ).rewrite(sql)
         )
         return sql
     except Exception as e:
+        # Because the v2 API doesn't support row-level access control,
+        # we don't fall back to v2 if the headers include row-level access control properties.
+        if exist_wren_variables_header(headers):
+            raise e
         logger.warning(
             "Failed to execute v3 model-substitute, fallback to v2: {}", str(e)
         )
         return await v2.connector.model_substitute(
             data_source, dto, headers, java_engine_connector
         )
wren-core-base/manifest-macro/src/lib.rs (1)
1-555: ⚠️ Potential issue
cargo fmt is failing: please format the macro crate
The Rust workflow reports a cargo fmt mismatch.
Run cargo fmt --all before merging to unblock the pipeline.
🧰 Tools
🪛 GitHub Actions: Rust
[error] 358-360: cargo fmt check failed due to formatting differences. Code style does not match expected formatting.
wren-core/core/src/logical_plan/analyze/access_control.rs (1)
1-544: ⚠️ Potential issue
cargo fmt is failing for this file
CI indicates formatting differences.
Run cargo fmt --all to keep the style consistent.
🧰 Tools
🪛 GitHub Actions: Rust
[error] 93-93: cargo fmt check failed due to formatting differences. Code style does not match expected formatting.
🧹 Nitpick comments (16)
wren-core/wren-example/examples/plan-sql.rs (1)
17-24: Updated function call to support session properties.
The transform_sql_with_ctx call has been properly updated to include the new HashMap parameter for session properties, which is required for the row-level access control feature. This empty map is appropriate for this example since no specific session properties are needed.
Would it be helpful to add an example that demonstrates the use of session properties with row-level access control? This would provide developers with a clear usage pattern for the new RLAC feature.
wren-core-py/tests/test_modeling_core.py (1)
290-300: Test provides good coverage of RLAC functionality.
This test validates that row-level access control conditions are correctly applied during SQL transformation, verifying that the condition "c_name = 'test_user'" is added to the WHERE clause. The test confirms RLAC is properly integrated with the SQL transformation pipeline.
However, consider adding a negative test case where the session_user header is missing to verify the behavior when optional RLAC properties are not provided.
+ def test_rlac_without_properties():
+     session_context = SessionContext(manifest_str, None)
+     sql = "SELECT * FROM my_catalog.my_schema.customer"
+     rewritten_sql = session_context.transform_sql(sql, {})
+     assert (
+         rewritten_sql
+         == "SELECT customer.c_custkey, customer.c_name FROM (SELECT customer.c_custkey, customer.c_name FROM (SELECT __source.c_custkey AS c_custkey, __source.c_name AS c_name FROM main.customer AS __source) AS customer) AS customer"
+     )
ibis-server/tests/routers/v3/connector/postgres/test_query.py (1)
432-463: Thorough RLAC integration testing with header case sensitivity checks.
The test effectively validates that:
- RLAC filters are properly applied when querying data via the API
- The headers are case-insensitive (testing both lowercase and uppercase variants)
- The filtered data contains only the matching rows
This provides excellent coverage of the RLAC functionality at the API level.
Consider adding a test case for when the session variable is not provided to verify the system's behavior with optional RLAC properties.
+ async def test_rlac_query_without_session(client, manifest_str, connection_info):
+     response = await client.post(
+         url=f"{base_url}/query",
+         json={
+             "connectionInfo": connection_info,
+             "manifestStr": manifest_str,
+             "sql": "SELECT c_name FROM customer",
+         },
+     )
+     # Since session_user is optional, the query should return all rows
+     assert response.status_code == 200
+     result = response.json()
+     assert len(result["data"]) > 1
wren-core/wren-example/examples/row_level_access_control.rs.rs (3)
75-85: Consider simplifying session property setup.
The session properties setup is verbose with repeated insertion patterns.
Consider using a more concise approach:
-    let mut properties = HashMap::new();
-    properties.insert(
-        "session_tenant_id".to_string(),
-        Some("'1acdef01-aaaa-aaaa-aaaa-aaaaaaaaaaaa'".to_string()),
-    );
-    properties.insert(
-        "session_department".to_string(),
-        Some("'engineering'".to_string()),
-    );
-    properties.insert("session_user_id".to_string(), Some("'1003-u3'".to_string()));
-    properties.insert("session_role".to_string(), Some("'ADMIN'".to_string()));
+    let properties: HashMap<String, Option<String>> = [
+        ("session_tenant_id", "'1acdef01-aaaa-aaaa-aaaa-aaaaaaaaaaaa'"),
+        ("session_department", "'engineering'"),
+        ("session_user_id", "'1003-u3'"),
+        ("session_role", "'ADMIN'"),
+    ]
+    .iter()
+    .map(|(k, v)| (k.to_string(), Some(v.to_string())))
+    .collect();
87-111: Extract property printing to a helper function.
The code for printing session properties is repetitive.
Consider refactoring to a helper function:
-    println!("#####################");
-    println!(
-        "session_tenant_id: {}",
-        &properties
-            .get("session_tenant_id")
-            .unwrap()
-            .clone()
-            .unwrap()
-    );
-    println!(
-        "session_department: {}",
-        &properties
-            .get("session_department")
-            .unwrap()
-            .clone()
-            .unwrap()
-    );
-    println!(
-        "session_user_id: {}",
-        &properties.get("session_user_id").unwrap().clone().unwrap()
-    );
-    println!(
-        "session_role: {}",
-        &properties.get("session_role").unwrap().clone().unwrap()
-    );
+    println!("#####################");
+    for key in &["session_tenant_id", "session_department", "session_user_id", "session_role"] {
+        println!(
+            "{}: {}",
+            key,
+            properties.get(*key).unwrap().clone().unwrap()
+        );
+    }
1-1: File name has duplicate extension.
The file is named "row_level_access_control.rs.rs" with a duplicate ".rs" extension.
Rename the file to "row_level_access_control.rs" to follow standard Rust naming conventions.
wren-core/core/src/logical_plan/analyze/model_generation.rs (1)
86-108: Combine RLAC filters with Iterator::flatten().reduce() for clearer intent
The current reduce chain works, but the nested Option handling and the unwrap() call make the control-flow a bit hard to follow and slightly brittle. A more idiomatic approach is to:
- collect into a Result first, so that ? still propagates any error,
- flatten away the Nones,
- then call reduce(|acc, f| acc.and(f)).
That avoids the manual is_none/unwrap juggling and the risk of accidentally moving values out of the accumulator.
-let filters: Vec<Option<Expr>> = model_plan
-    .model
-    .row_level_access_controls()
-    .iter()
-    .map(|rule| {
-        self.generate_row_level_access_control_filter(
-            Arc::clone(&model_plan.model),
-            rule,
-        )
-    })
-    .collect::<Result<_>>()?;
-let rls_filter = filters
-    .into_iter()
-    .reduce(|acc, filter| {
-        if acc.is_none() {
-            filter
-        } else if let Some(filter) = filter {
-            Some(acc.unwrap().and(filter))
-        } else {
-            acc
-        }
-    })
-    .flatten();
+let rls_filter = model_plan
+    .model
+    .row_level_access_controls()
+    .iter()
+    .map(|rule| {
+        self.generate_row_level_access_control_filter(
+            Arc::clone(&model_plan.model),
+            rule,
+        )
+    })
+    // `?` propagates the first error; `flatten()` then drops the `None`s
+    .collect::<Result<Vec<Option<Expr>>>>()?
+    .into_iter()
+    .flatten()
+    .reduce(|acc, f| acc.and(f));
Cleaner, easier to reason about, and no unwrap().
wren-core/core/src/mdl/mod.rs (2)
358-366: Property key normalisation happens late: guard against mismatched cases up-front
properties_ref is constructed and passed into create_ctx_with_mdl, which performs normalisation. Any code paths that access SessionPropertiesRef before this normalisation (now or in future refactors) would silently fail look-ups due to case mismatch.
Mitigation idea: normalise the HashMap right here (e.g. convert all keys to lowercase) so every subsequent consumer gets a consistent view regardless of call site.
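The mitigation can be illustrated outside the Rust code base. The following is a minimal Python sketch, not the project's implementation: `normalize_properties` is a hypothetical helper name, and raising on keys that collide after lowercasing is an assumption about the desired behaviour rather than something the PR specifies.

```python
from typing import Dict, Optional

SessionProperties = Dict[str, Optional[str]]


def normalize_properties(properties: SessionProperties) -> SessionProperties:
    """Return a copy of `properties` whose keys are all lowercase.

    Hypothetical helper: normalising once at the entry point lets every
    later lookup assume lowercase keys.
    """
    normalized: SessionProperties = {}
    for key, value in properties.items():
        lowered = key.lower()
        if lowered in normalized and normalized[lowered] != value:
            # Assumption: two keys differing only by case with different
            # values is a caller error, so fail loudly instead of guessing.
            raise ValueError(f"conflicting values for session property {lowered!r}")
        normalized[lowered] = value
    return normalized


props = normalize_properties({"Session_User": "'alice'", "SESSION_ROLE": None})
print(props)
```

Doing this at the call site keeps later consumers free of case handling, at the cost of one extra map copy per request.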
2125-2133: Tiny helper can be simplified with collect()
build_headers manually populates a HashMap. Using iterator tooling keeps it concise and avoids an intermediate mutable map:
-fn build_headers(
-    field: &[(String, Option<String>)],
-) -> HashMap<String, Option<String>> {
-    let mut headers = HashMap::new();
-    for (key, value) in field {
-        headers.insert(key.clone(), value.clone());
-    }
-    headers
-}
+fn build_headers(
+    fields: &[(String, Option<String>)],
+) -> HashMap<String, Option<String>> {
+    fields
+        .iter()
+        .cloned()
+        .collect()
+}
Nit-level, but reduces repetitive code.
wren-core/core/src/logical_plan/analyze/plan.rs (1)
376-392: Duplicate required_fields entries may inflate planning time
add_required_columns_from_session_properties blindly extends the vector with every column mentioned in RLAC conditions. Although later deduplication via a BTreeSet eventually removes duplicates, the intermediate vectors can grow quadratically when many overlapping RLAC rules exist.
A low-cost improvement is to insert into a HashSet first or to retain only new expressions before extending, e.g.:
let mut seen = HashSet::new();
for expr in collect_condition(model, &rule.condition)?.0 {
    if seen.insert(expr.to_string()) { // or a custom key
        required_fields.push(expr);
    }
}
This shaves unnecessary Expr cloning and stringification work during large manifests.
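The same "seen set" idea can be shown in isolation. This Python sketch is only an illustration of order-preserving deduplication; `extend_unique` is a hypothetical name, and plain column-name strings stand in for the Rust Expr values.

```python
from typing import Hashable, Iterable, List, TypeVar

T = TypeVar("T", bound=Hashable)


def extend_unique(target: List[T], candidates: Iterable[T]) -> None:
    """Append each candidate to `target` unless it is already present.

    The set gives O(1) membership checks while the list keeps first-seen order.
    """
    seen = set(target)
    for item in candidates:
        if item not in seen:
            seen.add(item)
            target.append(item)


# Column names stand in for the expressions collected from RLAC conditions.
required_fields = ["c_custkey"]
extend_unique(required_fields, ["c_name", "c_custkey", "c_name"])
print(required_fields)
```

Seeding the set from the existing list is a small variation on the snippet above: it also filters out columns that were required before any RLAC rule was visited.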
360-383: Missing serde defaults & potential deserialization pitfalls
- RowLevelAccessControl.required_properties already has #[serde(default)], great. However, in many manifests the condition field will be the only mandatory one. To guard against accidental omission of required_properties you may also want
    #[serde(default)]
    pub required_properties: Vec<SessionProperty>,
  in the macro for completeness (mirrors other structs).
- SessionProperty struct (see below) does not mark default_expr as #[serde(default)]. Deserialising a manifest that omits that field will fail unless the user provides null. Consider:
    -pub default_expr: Option<String>,
    +#[serde(default)]
    +pub default_expr: Option<String>,
- Remember to re-run cargo fmt; the CI formatting check is currently failing.
386-406: Expose helper ctors inside the macro output
The tests rely on SessionProperty::new_required/optional, but these helpers are not generated here.
If they live elsewhere, ignore this note; otherwise, add simple inherent impls to make authoring manifests easier:
impl SessionProperty {
    pub fn new_required(name: impl Into<String>) -> Self {
        Self { name: name.into(), required: true, default_expr: None }
    }
    pub fn new_optional(name: impl Into<String>, default_expr: Option<String>) -> Self {
        Self { name: name.into(), required: false, default_expr }
    }
}
This keeps the public API symmetric with the newly introduced struct.
wren-core/core/src/logical_plan/analyze/access_control.rs (4)
31-32: Typo in variable name
seesion_properties → session_properties. While harmless, typos hinder readability and can propagate to log messages.
-let mut seesion_properties: HashSet<String> = HashSet::new();
+let mut session_properties: HashSet<String> = HashSet::new();
49-54: Potential duplicate column expressions
When the same column appears multiple times in a condition, it is pushed repeatedly into conditions.
Using a HashSet avoids redundant work later:
-let mut conditions = vec![];
+let mut seen = HashSet::new();
+let mut conditions = Vec::new();
 ...
-if !value.starts_with("@") {
-    ...
-    conditions.push(Expr::Column(...));
+if !value.starts_with("@") {
+    ...
+    if seen.insert(value) {
+        conditions.push(Expr::Column(...));
+    }
93-140: Repeated parsing of identical session values
parse_expr(property_value) is invoked for every occurrence of the same @property token, causing unnecessary work.
Caching the parsed Expr once per property will speed up large expressions:
-visit_expressions_mut(&mut expr, |expr| {
+let mut cache = HashMap::new();
+visit_expressions_mut(&mut expr, |expr| {
 ...
-    match parse_expr(property_value) {
+    let parsed = cache.entry(property_name.clone())
+        .or_insert_with(|| parse_expr(property_value));
+    match parsed {
🧰 Tools
🪛 GitHub Actions: Rust
[error] 93-93: cargo fmt check failed due to formatting differences. Code style does not match expected formatting.
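The caching suggestion for repeated @property tokens can be sketched in Python. This is a simplified stand-in, not the Rust code: it works on the condition text with a regular expression rather than on a parsed AST, the `@name` token pattern and the `substitute_properties` name are assumptions, and `parse` is whatever expensive per-value step the caller supplies.

```python
import re
from typing import Callable, Dict

# Assumed token syntax: "@name" marks a session property inside a condition.
PROPERTY_TOKEN = re.compile(r"@(\w+)")


def substitute_properties(
    condition: str,
    properties: Dict[str, str],
    parse: Callable[[str], str],
) -> str:
    """Replace each @property token, calling `parse` once per distinct property."""
    cache: Dict[str, str] = {}

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1).lower()
        if name not in cache:
            # First occurrence: parse and remember the result.
            # A property missing from `properties` raises KeyError here.
            cache[name] = parse(properties[name])
        return cache[name]

    return PROPERTY_TOKEN.sub(replace, condition)


calls = []


def counting_parse(raw: str) -> str:
    calls.append(raw)
    return raw.strip()


result = substitute_properties(
    "c_name = @session_user OR c_owner = @session_user",
    {"session_user": " 'alice' "},
    counting_parse,
)
print(result, len(calls))
```

A text-level regex would also rewrite tokens inside string literals, which is one reason an AST visitor like the one in the PR is the safer place for the real substitution.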
150-152: Double parsing round-trip may be avoidable
expr.to_string() → create_logical_expr() re-parses the SQL.
If datafusion::sql::planner::SqlToRel exposes a method that converts an AST Expr directly, you could pass the already-parsed tree and skip one parse cycle, improving performance.
⛔ Files ignored due to path filters (4)
- wren-core-py/Cargo.lock is excluded by !**/*.lock
- wren-core/wren-example/data/company/documents.csv is excluded by !**/*.csv
- wren-core/wren-example/data/company/tenants.csv is excluded by !**/*.csv
- wren-core/wren-example/data/company/users.csv is excluded by !**/*.csv
📒 Files selected for processing (30)
- ibis-server/app/mdl/rewriter.py (4 hunks)
- ibis-server/app/routers/v3/connector.py (6 hunks)
- ibis-server/app/util.py (1 hunks)
- ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (1 hunks)
- ibis-server/tests/routers/v3/connector/postgres/test_query.py (2 hunks)
- wren-core-base/manifest-macro/src/lib.rs (6 hunks)
- wren-core-base/src/mdl/builder.rs (8 hunks)
- wren-core-base/src/mdl/manifest.rs (5 hunks)
- wren-core-base/tests/data/mdl.json (1 hunks)
- wren-core-py/src/context.rs (3 hunks)
- wren-core-py/src/manifest.rs (2 hunks)
- wren-core-py/tests/test_modeling_core.py (2 hunks)
- wren-core/Cargo.toml (1 hunks)
- wren-core/benchmarks/src/tpch/run.rs (2 hunks)
- wren-core/core/Cargo.toml (1 hunks)
- wren-core/core/src/logical_plan/analyze/access_control.rs (1 hunks)
- wren-core/core/src/logical_plan/analyze/mod.rs (1 hunks)
- wren-core/core/src/logical_plan/analyze/model_anlayze.rs (5 hunks)
- wren-core/core/src/logical_plan/analyze/model_generation.rs (7 hunks)
- wren-core/core/src/logical_plan/analyze/plan.rs (15 hunks)
- wren-core/core/src/logical_plan/analyze/relation_chain.rs (6 hunks)
- wren-core/core/src/mdl/context.rs (7 hunks)
- wren-core/core/src/mdl/mod.rs (39 hunks)
- wren-core/sqllogictest/src/test_context.rs (2 hunks)
- wren-core/wren-example/examples/calculation-invoke-calculation.rs (2 hunks)
- wren-core/wren-example/examples/datafusion-apply.rs (1 hunks)
- wren-core/wren-example/examples/plan-sql.rs (2 hunks)
- wren-core/wren-example/examples/row_level_access_control.rs.rs (1 hunks)
- wren-core/wren-example/examples/to-many-calculation.rs (1 hunks)
- wren-core/wren-example/examples/view.rs (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (9)
wren-core-py/src/manifest.rs (1)
- wren-core-base/src/mdl/manifest.rs (1)
  - row_level_access_controls (271-273)
wren-core/benchmarks/src/tpch/run.rs (1)
- wren-core/core/src/mdl/mod.rs (3)
  - transform_sql_with_ctx (350-386)
  - new (117-172)
  - new (426-428)
wren-core/wren-example/examples/view.rs (1)
- wren-core/core/src/mdl/mod.rs (3)
  - transform_sql_with_ctx (350-386)
  - new (117-172)
  - new (426-428)
ibis-server/tests/routers/v3/connector/postgres/test_query.py (2)
- ibis-server/tests/conftest.py (1)
  - client (18-23)
- ibis-server/tests/routers/v3/connector/postgres/conftest.py (1)
  - connection_info (37-44)
wren-core/core/src/logical_plan/analyze/model_anlayze.rs (2)
- wren-core/core/src/mdl/context.rs (1)
  - properties (72-78)
- wren-core/sqllogictest/src/test_context.rs (1)
  - analyzed_wren_mdl (104-106)
wren-core/core/src/logical_plan/analyze/relation_chain.rs (4)
- wren-core/core/src/mdl/context.rs (1)
  - properties (72-78)
- wren-core/core/src/mdl/utils.rs (1)
  - quoted (209-211)
- wren-core/core/src/logical_plan/analyze/model_generation.rs (1)
  - model_plan (86-96)
- wren-core/core/src/logical_plan/analyze/plan.rs (1)
  - plan_name (84-86)
wren-core-base/src/mdl/builder.rs (3)
- wren-core-base/src/mdl/manifest.rs (4)
  - name (245-247)
  - name (278-280)
  - name (289-291)
  - name (295-297)
- wren-core/core/src/logical_plan/analyze/access_control.rs (1)
  - required_properties (173-190)
- wren-core-base/src/mdl/cls.rs (1)
  - new (41-55)
wren-core/core/src/mdl/context.rs (3)
- wren-core/core/src/logical_plan/analyze/model_generation.rs (1)
  - new (38-48)
- wren-core/core/src/logical_plan/analyze/plan.rs (6)
  - new (69-82)
  - new (109-124)
  - new (649-651)
  - new (766-860)
  - new (914-961)
  - new (1020-1022)
- wren-core/core/src/logical_plan/analyze/model_anlayze.rs (2)
  - new (69-79)
  - new (794-802)
wren-core-base/manifest-macro/src/lib.rs (2)
- wren-core-base/src/mdl/manifest.rs (5)
  - row_level_access_controls (271-273)
  - name (245-247)
  - name (278-280)
  - name (289-291)
  - name (295-297)
- wren-core/core/src/logical_plan/analyze/access_control.rs (1)
  - required_properties (173-190)
🪛 GitHub Actions: Rust
wren-core/core/src/mdl/context.rs
[error] 69-69: cargo fmt check failed due to formatting differences. Code style does not match expected formatting.
wren-core/core/src/logical_plan/analyze/access_control.rs
[error] 93-93: cargo fmt check failed due to formatting differences. Code style does not match expected formatting.
wren-core-base/manifest-macro/src/lib.rs
[error] 358-360: cargo fmt check failed due to formatting differences. Code style does not match expected formatting.
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: test
- GitHub Check: ci
🔇 Additional comments (61)
wren-core/core/Cargo.toml (1)
27-27: Good addition of snapshot testing capability.
Adding the insta crate is appropriate for implementing snapshot tests for the new row-level access control features. This will enable more robust testing of the RLAC functionality.
wren-core/core/src/logical_plan/analyze/mod.rs (1)
1-1: Appropriate module addition for row-level access control.
Adding the access_control module is a clean way to encapsulate the new row-level access control functionality within the logical plan analysis system.
wren-core-py/src/manifest.rs (1)
46-47: Updated test models with row-level access control field.
The test models have been correctly updated to include empty vectors for the new row_level_access_controls field, ensuring the tests remain compatible with the updated Model struct definition.
Also applies to: 57-58
wren-core/wren-example/examples/plan-sql.rs (1)
2-2: Added HashMap import for session properties support.
Good addition of the HashMap import to support the new session properties parameter in RLAC-related functions.
ibis-server/app/util.py (1)
110-115: Well-implemented utility function for RLAC header detection
The new exist_wren_variables_header function correctly checks for the presence of row-level access control variable headers. The implementation is simple and efficient, first checking if headers are None before using a generator expression with any() to search for keys with the "x-wren-variables-" prefix.
wren-core/wren-example/examples/calculation-invoke-calculation.rs (2)
84-84: Correctly updated function call to support RLAC
The function call has been properly updated to include the new empty HashMap parameter required for the row-level access control feature. This matches the updated signature of transform_sql_with_ctx as shown in the relevant code snippets.
110-110: Correctly updated function call to support RLAC
The function call has been properly updated to include the new empty HashMap parameter required for the row-level access control feature. This matches the updated signature of transform_sql_with_ctx.
wren-core/benchmarks/src/tpch/run.rs (2)
6-6: Correctly added HashMap import
The import for HashMap has been properly added to support the usage of HashMap::new() in the updated function call.
61-68: Correctly updated transform_sql_with_ctx call to support RLAC
The function call has been properly updated to include the new empty HashMap parameter required for the row-level access control feature. The update maintains the existing functionality while adapting to the new function signature.
wren-core/wren-example/examples/datafusion-apply.rs (1)
81-82: Correctly updated transform_sql_with_ctx call to support RLAC
The function call has been properly updated to include the new empty HashMap parameter required for the row-level access control feature. The code has been properly reformatted across two lines to maintain readability.
wren-core/wren-example/examples/to-many-calculation.rs (1)
79-80: Function call updated to support row-level access control.
The call to create_ctx_with_mdl has been properly modified to pass an empty HashMap of session properties, which is required for the new row-level access control (RLAC) functionality. This change ensures the example code aligns with the updated API while maintaining its original behavior.
ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (1)
160-170: LGTM - Good test coverage for RLAC headers.
This test effectively verifies that fallback to v2 API is prevented when RLAC headers are present, confirming proper isolation of row-level access control functionality. The test structure is consistent with other tests in the file and correctly asserts the expected 422 status code.
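The guarded fallback this test exercises can be sketched as follows. This is a minimal illustration under stated assumptions, not the server code: `run_with_fallback`, `run_v3`, and `run_v2` are hypothetical names, and lowercasing the header keys before the prefix check is an assumption (the review only says the real helper checks for None and then uses any() over the keys).

```python
from typing import Callable, Mapping, Optional, TypeVar

T = TypeVar("T")

# Prefix named in the review for RLAC session variables.
WREN_VARIABLES_PREFIX = "x-wren-variables-"


def exist_wren_variables_header(headers: Optional[Mapping[str, str]]) -> bool:
    """Return True if any header key carries the RLAC variable prefix."""
    if headers is None:
        return False
    # Assumption: compare case-insensitively, since header casing can vary.
    return any(key.lower().startswith(WREN_VARIABLES_PREFIX) for key in headers)


def run_with_fallback(
    headers: Optional[Mapping[str, str]],
    run_v3: Callable[[], T],
    run_v2: Callable[[], T],
) -> T:
    """Try the v3 path; fall back to v2 only when no RLAC header is present."""
    try:
        return run_v3()
    except Exception:
        if exist_wren_variables_header(headers):
            # v2 cannot enforce RLAC, so surface the v3 failure instead.
            raise
        return run_v2()
```

Re-raising keeps a failed v3 request from being answered by a path that would ignore the row-level filter and return unfiltered rows.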
wren-core/wren-example/examples/view.rs (2)
1-1: Appropriate import added for new HashMap parameter.
The HashMap import is correctly added to support the updated function call signature.
19-26: Function call updated to support row-level access control.
The call to transform_sql_with_ctx has been properly updated to include the new HashMap parameter required for row-level access control support. This change aligns with the function signature modification shown in the relevant code snippet from wren-core/core/src/mdl/mod.rs.
wren-core/sqllogictest/src/test_context.rs (2)
304-310: Function call updated to support row-level access control.
The call to create_ctx_with_mdl in the register_ecommerce_mdl function has been properly updated with an empty Arc<HashMap> as the session properties parameter. This modification correctly implements the interface changes for RLAC support while preserving test behavior.
540-546: Function call updated to support row-level access control.
The call to create_ctx_with_mdl in the register_tpch_mdl function has been properly updated with an empty Arc<HashMap> as the session properties parameter, consistent with the changes throughout the codebase for RLAC support.
wren-core-py/tests/test_modeling_core.py (1)
28-39: Row-level access controls are well configured in the manifest.
The RLAC configuration properly defines a named rule "customer_access" with a condition that filters rows based on the session_user variable. The required property is correctly marked as optional (required: False), allowing queries to execute even when session_user isn't provided.
wren-core-base/tests/data/mdl.json (1)
37-69: Comprehensive RLAC rule definitions with good test coverage.
The implementation includes three different RLAC rules demonstrating various configurations:
- Rule1: With a required property (mandatory)
- Rule2: With an optional property
- Rule3: With an optional property and a default expression
This provides excellent test coverage for different RLAC scenarios, particularly the default expression handling which is important for fallback behavior.
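How the three configurations might resolve at query time can be sketched in Python. The semantics below are assumptions inferred from this review, not confirmed behaviour of the engine: a missing required property is an error, a missing optional property without a default causes the rule to be skipped, and a missing optional property with a default falls back to that default. `resolve_rule` is a hypothetical name, and the dataclass only mirrors the field names of the Rust SessionProperty struct.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SessionProperty:
    # Field names mirror the Rust struct discussed in this review.
    name: str
    required: bool
    default_expr: Optional[str] = None


def resolve_rule(
    required_properties: List[SessionProperty],
    session: Dict[str, Optional[str]],
) -> Optional[Dict[str, str]]:
    """Return the value to use for each property, or None to skip the rule.

    Assumes `session` keys were already normalised to lowercase.
    """
    resolved: Dict[str, str] = {}
    for prop in required_properties:
        value = session.get(prop.name.lower())
        if value is None:
            value = prop.default_expr  # assumed fallback for absent values
        if value is None:
            if prop.required:
                raise ValueError(f"session property {prop.name} is required")
            return None  # optional and absent: assume the rule is not applied
        resolved[prop.name] = value
    return resolved


# Hypothetical property names, one per configuration listed above.
required = [SessionProperty("session_level", required=True)]
optional = [SessionProperty("session_role", required=False)]
defaulted = [SessionProperty("session_user", required=False, default_expr="'guest'")]

print(resolve_rule(required, {"session_level": "1"}))
print(resolve_rule(optional, {}))
print(resolve_rule(defaulted, {}))
```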
wren-core-base/src/mdl/manifest.rs (4)
23-24: Appropriate deprecation handling with compiler directives.
The #[allow(deprecated)] annotations are correctly applied to suppress warnings when using deprecated features. This is good practice during transitions while maintaining backward compatibility.
Also applies to: 60-61
29-31: Macro imports properly organized.
The imports of the new macros row_level_access_control and session_property are well-organized alongside related macros, making the code organized and maintainable.
Also applies to: 66-68
49-51: Consistent macro invocation pattern.
The macro invocations follow the same pattern as other macro calls, with boolean parameters indicating whether Python bindings should be generated. This maintains consistency with the codebase's established patterns.
Also applies to: 88-90
271-273: Public accessor for RLAC follows established patterns.
The row_level_access_controls() method adheres to the existing accessor pattern in the Model implementation, providing a consistent API for retrieving RLAC rules. This ensures proper encapsulation while exposing the necessary functionality.
ibis-server/tests/routers/v3/connector/postgres/test_query.py (1)
81-92: Well-defined RLAC configuration for integration testing.
The RLAC configuration in the test manifest properly defines a rule with a condition that filters customer rows based on the session_user variable. This provides a good foundation for testing the end-to-end RLAC functionality with the Postgres connector.
wren-core/core/src/logical_plan/analyze/model_anlayze.rs (5)
3-3: Adding SessionPropertiesRef for row-level access control support
This import introduces the SessionPropertiesRef type, which is a key component for implementing row-level access control (RLAC) in the modeling engine.
37-37: New properties field for session-level RLAC
Adding the properties field to ModelAnalyzeRule enables the rule to access and apply session-specific access control policies during model analysis.
72-72: Constructor updated to accept session properties
The constructor now correctly initializes the properties field from the provided parameter, maintaining the pattern of using Arc for shared ownership.
Also applies to: 77-77
456-457: Passing session properties to ModelPlanNode in table scan analysis
Session properties are now passed to ModelPlanNode during table scan analysis, ensuring that row-level access controls will be applied.
513-514: Passing session properties to ModelPlanNode in subquery alias model analysis
Session properties are consistently passed to ModelPlanNode in the subquery alias analysis path, ensuring that row-level access controls will be applied regardless of how the model is referenced.
wren-core-py/src/context.rs (4)
23-23: Add HashMap import for session properties
This import supports the new session properties functionality which uses a HashMap to store key-value pairs.
127-134: Initialize context with empty session properties
The context creation now includes an empty HashMap wrapped in an Arc for session properties, aligning with the updated core API.
145-153: Update transform_sql signature to support session properties
The method now accepts optional SQL and session properties, with proper null checking for the SQL parameter. This enables passing RLAC properties for SQL transformation.
161-162: Pass session properties to transform_sql_with_ctx
Session properties are now correctly passed to the core transformation function, with a sensible default to an empty map when not provided.
wren-core/core/src/logical_plan/analyze/relation_chain.rs (5)
9-9: Add SessionPropertiesRef import for RLAC support
This import adds the session properties reference type needed for row-level access control.
67-67: Suppress too_many_arguments warning
Appropriate use of the Clippy allowance for a method with many parameters that are all necessary for the function's purpose.
76-76: Add SessionPropertiesRef parameter to with_chain method
This parameter enables passing session properties through the relation chain construction, necessary for RLAC enforcement.
108-109: Pass session properties to ModelPlanNode
Session properties are now correctly passed when constructing model plan nodes in the relation chain, ensuring RLAC rules will be applied.
203-204: Update to use plan_name() method instead of direct field access
This reflects an architectural improvement where ModelPlanNode now holds a reference to the full Model object rather than just storing its name as a string.
Also applies to: 253-254
wren-core-base/src/mdl/builder.rs (6)
25-29: Add deprecated annotations for security-related imports
Appropriately marking deprecated security components while transitioning to the new RLAC model.
114-114: Initialize row_level_access_controls in Model struct
Adding the new field for storing RLAC rules in the Model, initialized as an empty vector.
154-167: Add row-level access control builder method
This method enables adding RLAC rules to a model with a clear API for specifying the rule name, required properties, and condition.
174-189: Implement SessionProperty constructors
These constructors provide a clean API for creating required and optional session properties, with support for default expressions.
495-512: Add RLAC rules to test model
The test now includes multiple RLAC rules demonstrating different property configurations: required, optional without default, and optional with default.
700-717: Add RLAC rules to JSON serialization test
The test now verifies that RLAC rules with different property types are correctly serialized and deserialized in the JSON format.
wren-core/core/src/mdl/context.rs (5)
45-45: Proper type alias definition for session properties.
This new type definition appropriately uses Arc to enable thread-safe sharing of session properties across contexts. The Option value allows representing both present and absent values.
71-78: Good practice to normalize property keys to lowercase.
This ensures case-insensitive lookups for session properties, which is important for consistent user experience when dealing with HTTP headers that might arrive with different casing.
48-53: API extension to support session properties.
Well-structured function signature update to include the new session properties parameter for row-level access control.
104-108: Consistent propagation of session properties.
The updated function signature ensures that session properties are passed through the analyzer rule creation pipeline.
136-140: Consistent parameter extension across related functions.
Good consistency in propagating the session properties to both local runtime and unparsing analyzer rules.
ibis-server/app/mdl/rewriter.py (4)
27-41: Store session properties in the Rewriter class.
Good addition of the properties parameter to the constructor. This allows for session context to be carried through the rewriting process.
59-59: Pass properties to the underlying rewriter.
The Rewriter.rewrite method correctly forwards the properties to the underlying rewriter implementation.
120-129: EmbeddedEngineRewriter correctly handles session properties.
The function properly passes filtered session properties to the transform_sql method.
133-143: Well-implemented session properties filtering.
Good implementation of property filtering to extract RLAC-specific headers. The method handles null properties and focuses only on headers with the "x-wren-variables-" prefix.
It would be helpful to add a docstring explaining that this method:
- Filters headers starting with "x-wren-variables-"
- Removes this prefix from the keys before passing them to the engine
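The filtering behaviour described in this comment can be sketched as follows. This is a minimal illustration, not the code from rewriter.py: the function name get_session_properties is hypothetical, and only the "x-wren-variables-" prefix is taken from the review text.

```python
from typing import Dict, Mapping, Optional

# Prefix quoted in the review comment above.
X_WREN_VARIABLE_PREFIX = "x-wren-variables-"


def get_session_properties(headers: Optional[Mapping[str, str]]) -> Dict[str, str]:
    """Extract session properties from request headers (hypothetical helper).

    - Keeps only headers whose name starts with "x-wren-variables-".
    - Removes that prefix from each key before the result is passed on.
    - Returns an empty dict when no headers are given.
    """
    if headers is None:
        return {}
    return {
        key[len(X_WREN_VARIABLE_PREFIX):]: value
        for key, value in headers.items()
        if key.startswith(X_WREN_VARIABLE_PREFIX)
    }
```

A docstring of this shape would state both the filter and the prefix stripping, which is what the comment asks for.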
wren-core/wren-example/examples/row_level_access_control.rs.rs (2)
1-131: Well-structured example demonstrating row-level access control.
This example clearly shows how to implement RLAC in a multi-tenant scenario. It demonstrates:
- Setting up session properties for tenant, department, user, and role
- Applying RLAC rules based on these properties
- Transforming and executing SQL with access controls applied
167-177: Clear row-level access control rule definitions.
The RLAC rules are well-defined with clear comments explaining their purpose:
- Multi-tenant isolation ensuring users only see documents in their tenant
- Role-based access control with different permissions for members vs. admins
The use of SessionProperty objects to define required and optional parameters is well-structured.
ibis-server/app/routers/v3/connector.py (5)
71-75: Pass request headers as properties to Rewriter.
The Rewriter class now receives the request headers as properties, enabling session context to influence query rewriting.
142-162: Prevent fallback to v2 API when RLAC headers are present.
Good conditional logic to prevent fallback to v2 API when row-level access control headers are present, as the v2 API doesn't support this feature.
175-177: Consistent properties propagation across endpoints.
The dry_plan endpoint correctly passes headers as properties to the Rewriter.
207-212: Properties passed to Rewriter in dry_plan_for_data_source.
The endpoint properly initializes Rewriter with session properties from headers.
246-251: Properties passed to Rewriter in validate endpoint.
The validate endpoint correctly initializes the Rewriter with properties from headers.
wren-core/core/src/logical_plan/analyze/model_generation.rs (1)
110-119: Potential double-aliasing: confirm that duplicate aliases can’t leak upstream
builder.alias(model_plan.plan_name())? is only executed when an RLAC filter exists.
If the same model is referenced multiple times (e.g. in sub-queries that also add an alias with the same plan_name()), DataFusion may raise “duplicate relation” errors.
Please double-check complex query plans where:
- the same model appears multiple times and
- at least one occurrence carries an RLAC rule.
If you spot any collisions, consider appending a monotonically-increasing suffix from AliasGenerator.
wren-core/core/src/logical_plan/analyze/access_control.rs (1)
127-140: No sanitisation of session-supplied SQL fragments
property_value is fed straight into the SQL parser.
While DataFusion’s parser will error on multiple statements, malicious input could still lead to confusing planner errors.
If possible, restrict property_value to literals (numbers, quoted strings) before parsing, e.g.:
    if !matches!(property_value.chars().next(), Some('\'' | '\"' | '0'..='9')) {
        error = Some(plan_err!("Only literal values are allowed for session property {property_name}"));
        return ControlFlow::Break(());
    }
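For readers following the Python side of the stack, the same first-character check can be sketched as below. The function name looks_like_literal is hypothetical. The check is deliberately coarse: like the snippet in the comment, it rejects negative numbers and booleans and only inspects the first character, so it complements a real parser check rather than replacing one.

```python
def looks_like_literal(property_value: str) -> bool:
    """Return True if the value starts like a quoted string or a number.

    Mirrors the first-character test from the review comment: the value must
    begin with a single quote, a double quote, or an ASCII digit.
    """
    first = property_value[:1]  # empty string when the value is empty
    return first in ("'", '"') or first in tuple("0123456789")
```

An empty value is rejected because the empty string is neither a quote nor one of the ten digit entries in the tuple.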
Actionable comments posted: 4
🧹 Nitpick comments (4)
ibis-server/app/dependencies.py (2)
19-24: Consider renaming the function for grammatical correctness.
The function implementation correctly and efficiently checks for the presence of RLAC headers with proper null-safety. However, the function name exist_wren_variables_header should be exists_wren_variables_header (with an 's') to be grammatically correct since it's checking if at least one header with the prefix exists.
    -def exist_wren_variables_header(
    +def exists_wren_variables_header(
         headers: Headers,
     ) -> bool:
         if headers is None:
             return False
         return any(key.startswith(X_WREN_VARIABLE_PREFIX) for key in headers.keys())
22-24: Consider adding case-insensitive header matching.
Since HTTP headers are case-insensitive, it would be more robust to perform case-insensitive matching when checking for headers with the RLAC prefix. The current implementation assumes headers are already lowercase, which might not always be the case depending on how clients send them.
         if headers is None:
             return False
    -    return any(key.startswith(X_WREN_VARIABLE_PREFIX) for key in headers.keys())
    +    return any(key.lower().startswith(X_WREN_VARIABLE_PREFIX) for key in headers.keys())
wren-core/wren-example/examples/row-level-access-control.rs (1)
87-112: Reduce repetitive println! blocks
The four back-to-back println! calls repeat the same lookup / clone / unwrap pattern and will panic if the map changes in the future.
    -println!("session_tenant_id: {}", properties.get("session_tenant_id").unwrap().as_deref().unwrap());
    -println!("session_department: {}", properties.get("session_department").unwrap().as_deref().unwrap());
    -println!("session_user_id: {}", properties.get("session_user_id").unwrap().as_deref().unwrap());
    -println!("session_role: {}", properties.get("session_role").unwrap().as_deref().unwrap());
    +for (k, v) in &properties {
    +    println!("{k}: {}", v.as_deref().unwrap_or("<unset>"));
    +}
Benefits:
• Eliminates four panics → one safe path.
• Printing automatically adapts to added/removed properties.
wren-core/core/src/logical_plan/analyze/access_control.rs (1)
154-156: Parsing expression via expr.to_string() causes double parsing & loses SQL dialect nuances
create_logical_expr(&expr.to_string(), …) serialises the already-parsed AST back to SQL only to parse it again. This:
• adds unnecessary CPU cost,
• may re-introduce quoting/dialect edge-cases (the unparsers are lossy for some dialect-specific constructs).
Instead pass the Expr directly:
    - session_state.read().create_logical_expr(&expr.to_string(), &df_schema)
    + session_state.read().create_logical_expr_from_ast(expr.clone(), &df_schema)
(DataFusion exposes create_logical_expr_from_ast behind the planner module; if unavailable, consider extending the API or keeping the original Expr until plan creation.)
This keeps the pipeline purely AST-based and avoids fragile round-tripping.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- .github/actions/rust/setup-rust-runtime/action.yaml (0 hunks)
- ibis-server/app/dependencies.py (2 hunks)
- ibis-server/app/mdl/rewriter.py (5 hunks)
- ibis-server/app/routers/v3/connector.py (6 hunks)
- ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (2 hunks)
- ibis-server/tests/routers/v3/connector/postgres/test_query.py (3 hunks)
- wren-core-base/manifest-macro/src/lib.rs (6 hunks)
- wren-core-base/src/mdl/builder.rs (8 hunks)
- wren-core-base/src/mdl/manifest.rs (5 hunks)
- wren-core/core/Cargo.toml (1 hunks)
- wren-core/core/src/logical_plan/analyze/access_control.rs (1 hunks)
- wren-core/core/src/mdl/context.rs (7 hunks)
- wren-core/wren-example/examples/row-level-access-control.rs (1 hunks)
💤 Files with no reviewable changes (1)
- .github/actions/rust/setup-rust-runtime/action.yaml
✅ Files skipped from review due to trivial changes (1)
- wren-core/core/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (8)
- ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py
- wren-core-base/src/mdl/manifest.rs
- ibis-server/tests/routers/v3/connector/postgres/test_query.py
- ibis-server/app/mdl/rewriter.py
- ibis-server/app/routers/v3/connector.py
- wren-core-base/src/mdl/builder.rs
- wren-core-base/manifest-macro/src/lib.rs
- wren-core/core/src/mdl/context.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
wren-core/wren-example/examples/row-level-access-control.rs (3)
wren-core/core/src/mdl/mod.rs (2)
- transform_sql_with_ctx (350-386)
- analyze_with_tables (77-90)
wren-core/core/src/mdl/context.rs (1)
- properties (73-79)
wren-core-base/src/mdl/builder.rs (2)
- new_required (175-181)
- new_optional (182-188)
🔇 Additional comments (1)
ibis-server/app/dependencies.py (1)
7-7: Good choice for header prefix constant.
Using a dedicated constant for the session property header prefix is a good practice for maintainability. Since HTTP headers are case-insensitive, using lowercase for the constant value aligns with common practices in HTTP header handling.
Actionable comments posted: 0
♻️ Duplicate comments (1)
wren-core/core/src/logical_plan/analyze/access_control.rs (1)
102-108: Optional property fallback ignores empty default values
The fallback fetches default_expr for optional properties but does not verify that the default itself is Some(...)/non-empty before passing it further.
If the manifest mistakenly sets default_expr: None, the rule silently behaves as required and triggers the not found error – surprising for users.
    - .map(|r| &r.default_expr)
    + .filter_map(|r| r.default_expr.as_ref())
This guarantees we only replace missing properties with existing defaults.
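The intended fallback order can be sketched in Python as follows. This is an illustration under assumptions, not a port of access_control.rs: the SessionProperty dataclass and its required flag are hypothetical stand-ins, while default_expr and the lowercase key lookup follow names and behaviour mentioned in the review.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class SessionProperty:
    # Hypothetical stand-in for the manifest's session property definition.
    name: str
    required: bool
    default_expr: Optional[str] = None


def resolve_value(
    prop: SessionProperty, properties: Mapping[str, Optional[str]]
) -> Optional[str]:
    """Return the supplied value, else an existing default, else None."""
    supplied = properties.get(prop.name.lower())
    if supplied is not None:
        return supplied
    if not prop.required:
        # Only an existing default replaces a missing property; a missing
        # default yields None so the caller can skip the rule explicitly.
        return prop.default_expr
    return None
```

Returning None for "no value and no default" keeps the decision (skip the rule, or raise for a required property) in one place in the caller instead of surfacing later as a "not found" error.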
🧹 Nitpick comments (4)
wren-core/core/src/logical_plan/analyze/access_control.rs (4)
34-34: Typo: seesion_properties should be session_properties
There's a typo in the variable name that might cause confusion when reading the code.
    -let mut seesion_properties: HashSet<String> = HashSet::new();
    +let mut session_properties: HashSet<String> = HashSet::new();
60-62: Same typo in variable name used consistently
The same typo appears here. Make sure to fix it in all occurrences.
    - if !seesion_properties.contains(&session_property) {
    -     seesion_properties.insert(session_property);
    + if !session_properties.contains(&session_property) {
    +     session_properties.insert(session_property);
74-74: Typo in variable name
The final reference to the misspelled variable name needs to be fixed as well.
    - seesion_properties.into_iter().collect::<Vec<_>>(),
    + session_properties.into_iter().collect::<Vec<_>>(),
217-222: Potential inconsistency in optional property validation
The validation for optional properties checks if the property exists in headers or if it has a default expression, but doesn't verify if the default expression itself is a valid, non-empty string. This could lead to inconsistent behavior if a default expression is set to Some("") or similar.
For better consistency, consider enhancing the validation to also check the content of the default expression:
    - if exist || property.default_expr.is_some() {
    + if exist || property.default_expr.as_ref().is_some_and(|expr| !expr.is_empty()) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- wren-core/core/src/logical_plan/analyze/access_control.rs (1 hunks)
- wren-core/wren-example/examples/row-level-access-control.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- wren-core/wren-example/examples/row-level-access-control.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
wren-core/core/src/logical_plan/analyze/access_control.rs (7)
wren-core-base/src/mdl/builder.rs (3)
- condition (298-301)
- new_required (175-181)
- new_optional (182-188)
wren-core/core/src/logical_plan/analyze/model_generation.rs (3)
- new (38-48)
- name (303-305)
- analyze (290-301)
wren-core/core/src/mdl/context.rs (2)
- new (255-258)
- properties (73-79)
wren-core/core/src/logical_plan/analyze/plan.rs (11)
- new (69-82)
- new (109-124)
- new (649-651)
- new (766-860)
- new (914-961)
- new (1020-1022)
- name (43-47)
- name (710-712)
- name (870-872)
- name (971-973)
- name (1032-1034)
wren-core/core/src/mdl/mod.rs (5)
- new (117-172)
- new (426-428)
- analyze (71-75)
- build_headers (2125-2133)
- default (59-67)
wren-core/core/src/logical_plan/analyze/model_anlayze.rs (2)
- expr (589-599)
- session_state (81-83)
wren-core-base/src/mdl/manifest.rs (4)
- name (245-247)
- name (278-280)
- name (289-291)
- name (295-297)
⏰ Context from checks skipped due to timeout of 90000ms (6)
- GitHub Check: cargo test (macos-aarch64)
- GitHub Check: cargo test (macos)
- GitHub Check: test
- GitHub Check: cargo test (win64)
- GitHub Check: ci
- GitHub Check: cargo check
🔇 Additional comments (6)
wren-core/core/src/logical_plan/analyze/access_control.rs (6)
167-193: Good security practice for session property validation
The prevent_invalid_expr function is a strong security measure that restricts session property values to only literals and structured data like arrays and maps, preventing injection of arbitrary code or functions. This is an excellent approach for maintaining the security boundaries of the system.
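The allow-list idea behind this kind of validation (walk a parsed expression and accept only literals, recursing into arrays and maps) can be sketched in Python. This is an analogy only: it uses Python's own ast module on Python literal syntax as a stand-in for the SQL AST, and the function names are hypothetical, so it does not reproduce what prevent_invalid_expr accepts.

```python
import ast

_ALLOWED_SIGNS = (ast.UAdd, ast.USub)


def is_literal_only(source: str) -> bool:
    """Return True if `source` parses to literals, lists, tuples, or dicts only."""
    try:
        tree = ast.parse(source, mode="eval")
    except (SyntaxError, ValueError):
        return False
    return _is_literal(tree.body)


def _is_literal(node: ast.AST) -> bool:
    if isinstance(node, ast.Constant):
        return True
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, _ALLOWED_SIGNS):
        # Allow signed numbers such as -3, but nothing more complex.
        operand = node.operand
        return isinstance(operand, ast.Constant) and isinstance(
            operand.value, (int, float)
        )
    if isinstance(node, (ast.List, ast.Tuple)):
        return all(_is_literal(element) for element in node.elts)
    if isinstance(node, ast.Dict):
        # A None key marks dict unpacking (**x), which is not a literal.
        return all(
            key is not None and _is_literal(key) and _is_literal(value)
            for key, value in zip(node.keys, node.values)
        )
    # Function calls, names, operators, comprehensions, etc. are rejected.
    return False
```

The default branch rejects everything not explicitly allowed, which is the property that makes an allow-list safer than trying to enumerate dangerous constructs.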
229-239: Good case-insensitive property matching
The implementation properly handles case-insensitive property matching by converting property names to lowercase before lookup. This is a good user experience enhancement that makes the system more forgiving of case variations in property names.
593-613: Comprehensive validation of valid property values
The test covers a wide range of valid property value types including integers, strings, floats, booleans, arrays, maps, and intervals. This thorough testing ensures the system correctly handles various data types that might be used as session properties.
615-641: Strong security validation in tests
The test thoroughly validates that potentially dangerous expressions like function calls, subqueries, operations, and other non-literal values are properly rejected. This is crucial for preventing SQL injection and maintaining the security of the system.
551-576: Effective case-insensitive property matching test
The test verifies that property names are matched case-insensitively, which is important for user experience and consistency. The system correctly handles properties regardless of the case used in the condition or in the headers.
42-43: ✅ Verification successful
TODO comment about CompoundIdentifier
The current implementation only handles simple identifiers and not compound identifiers or field access patterns. While the TODO comment acknowledges this limitation, it's worth noting that this might cause issues with more complex SQL expressions that reference fields from nested structures.
Consider addressing this in a follow-up task to ensure proper handling of all identifier types.
🏁 Script executed:
    #!/bin/bash
    # Check if there are any compound identifiers used in existing model conditions
    # that would not be correctly handled by the current implementation
    echo "Searching for potential compound identifiers in model definitions..."
    rg -A 3 "condition:" --type-add 'rust:*.rs' --type rust | rg -A 2 "CompoundIdentifier|CompoundFieldAccess|\."
Length of output: 5492
Acknowledge simple-identifier limitation
The current matcher only handles ast::Expr::Identifier and won’t cover CompoundIdentifier or CompoundFieldAccess patterns (e.g. table.column). We searched existing model conditions and found no compound identifiers in use today, so this does not break any current rules.
• File: wren-core/core/src/logical_plan/analyze/access_control.rs at the if let ast::Expr::Identifier… check
• No occurrences of CompoundIdentifier or CompoundFieldAccess found in model definitions
Consider scheduling a follow-up task to add full support for nested/compound identifiers when needed.
Actionable comments posted: 0
♻️ Duplicate comments (1)
wren-core/core/src/logical_plan/analyze/access_control.rs (1)
102-108: 🛠️ Refactor suggestion
Optional property handling improvement needed
The fallback for optional properties has a potential issue. When a property is missing, you're trying to get its default_expr, but you're not checking if the default itself exists before passing it further. If default_expr is None for optional properties, the code will proceed and then fail at line 116 with a confusing "property not found" error rather than correctly skipping the rule.
    - .map(|r| &r.default_expr)
    + .filter_map(|r| r.default_expr.as_ref())
This ensures we only pass through non-null default values.
🧹 Nitpick comments (1)
wren-core/core/src/logical_plan/analyze/access_control.rs (1)
216-223: Rule validation logic could be more explicit
The code on line 218 combines two conditions (property existence vs default value existence) in a way that might be confusing to future maintainers.
    - if exist || property.default_expr.as_ref().is_some_and(|expr| !expr.is_empty()) {
    + // If property exists in headers or has a valid default value, the rule applies
    + let has_valid_default = property.default_expr.as_ref().is_some_and(|expr| !expr.is_empty());
    + if exist || has_valid_default {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- wren-core/core/src/logical_plan/analyze/access_control.rs (1 hunks)
- wren-core/core/src/mdl/mod.rs (39 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (6)
- GitHub Check: cargo test (win64)
- GitHub Check: cargo test (macos-aarch64)
- GitHub Check: cargo test (macos)
- GitHub Check: cargo check
- GitHub Check: ci
- GitHub Check: test
🔇 Additional comments (8)
wren-core/core/src/logical_plan/analyze/access_control.rs (3)
169-193: Security validation looks good
The function correctly restricts session property expressions to literal values only, preventing SQL injection and other security risks through user-provided values. The recursive validation ensures even nested literals in arrays and objects are safe.
229-239: Clean property presence check with case-insensitive handling
The function effectively handles case-insensitive property matching and correctly validates both the existence and non-emptiness of values.
551-576: Comprehensive case-insensitivity testing
The test correctly verifies that property names are matched case-insensitively, which is important for a robust user experience when working with headers that might come from different systems with varying casing conventions.
wren-core/core/src/mdl/mod.rs (5)
362-365: Clean property passing implementation
The implementation effectively wraps the properties in an Arc and passes them to the context creation function, which is a clean way to integrate RLAC support without disrupting the existing code flow.
1590-1782: Comprehensive RLAC testing with required properties
The test suite thoroughly covers multiple scenarios for required properties, including:
- Header presence verification
- Multiple rule application
- Cross-model conditions
- Error cases for missing properties
- Case sensitivity handling
This ensures the RLAC implementation is robust and behaves as expected across a variety of use cases.
1784-1945: Well-structured optional property RLAC testing
The tests comprehensively verify optional property behavior including:
- Default value fallback when properties are missing
- Correct handling when defaults are missing or empty
- Mixed required and optional property scenarios
This ensures that users have flexibility in defining access rules with varying levels of strictness.
1947-2068: Good testing of complex calculated field scenarios
The tests verify RLAC works correctly with calculated fields in relationships, which is a more complex scenario. The TODO comments around line 2014 correctly note future improvements for propagating RLAC rules through relationships.
Consider creating a tracking issue for the TODO comment about applying RLAC rules to models used in calculated fields via relationships, as this would enhance the security model's completeness.
2173-2181: Useful test helper functions
These functions make the tests more readable and maintainable by abstracting common operations.
c6bbe2d to
2cb2e26
Compare
Actionable comments posted: 3
🧹 Nitpick comments (3)
wren-core/core/src/mdl/mod.rs (1)
2251-2267: Consider extracting build_headers / batches_to_string into a reusable helper to avoid test-only duplication
Both helpers are now implemented here and in logical_plan/analyze/access_control.rs test-suite.
Duplicating even small utilities makes maintenance harder (one side may evolve silently and the other side will miss bug-fixes, e.g. the lower-case normalisation logic).
Suggestion:
    -// repeated in two different test modules
    -fn build_headers(...) -> HashMap<_, _> { ... }
    -fn batches_to_string(...) -> String { ... }
    +// tests/common.rs (or crate::test_util)
    +pub(crate) fn build_headers(...) -> HashMap<_, _> { ... }
    +pub(crate) fn batches_to_string(...) -> String { ... }
Then #[cfg(test)]-modules can simply use crate::test_util::*;.
wren-core/core/src/logical_plan/analyze/access_control.rs (2)
#[cfg(test)]-modules can simplyuse crate::test_util::*;.wren-core/core/src/logical_plan/analyze/access_control.rs (2)
169-193: prevent_invalid_expr error message prints the entire expression – can leak PII
When a user supplies an invalid literal (e.g. a phone number or e-mail) the error returned embeds the literal verbatim:
    The session property allow only literal value
Two follow-ups:
- Include the property name but not the full literal to avoid accidental PII leakage in logs.
- Consider a dedicated AccessControlError::InvalidLiteral enum so callers can pattern-match instead of string-matching error text.
453-461: Helper duplication again (build_headers) – extract to shared test util
Same note as in mdl/mod.rs: the exact same build_headers helper now lives in two test modules. Moving it to a central tests/util.rs reduces drift and keeps test code DRY.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)
- wren-core/wren-example/data/company/documents.csv is excluded by !**/*.csv
- wren-core/wren-example/data/company/tenants.csv is excluded by !**/*.csv
- wren-core/wren-example/data/company/users.csv is excluded by !**/*.csv
📒 Files selected for processing (30)
- ibis-server/app/dependencies.py (2 hunks)
- ibis-server/app/mdl/rewriter.py (5 hunks)
- ibis-server/app/routers/v3/connector.py (6 hunks)
- ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (2 hunks)
- ibis-server/tests/routers/v3/connector/postgres/test_query.py (3 hunks)
- wren-core-base/manifest-macro/src/lib.rs (6 hunks)
- wren-core-base/src/mdl/builder.rs (8 hunks)
- wren-core-base/src/mdl/manifest.rs (5 hunks)
- wren-core-base/tests/data/mdl.json (1 hunks)
- wren-core-py/src/context.rs (3 hunks)
- wren-core-py/src/manifest.rs (2 hunks)
- wren-core-py/tests/test_modeling_core.py (2 hunks)
- wren-core/Cargo.toml (1 hunks)
- wren-core/benchmarks/src/tpch/run.rs (2 hunks)
- wren-core/core/Cargo.toml (1 hunks)
- wren-core/core/src/logical_plan/analyze/access_control.rs (1 hunks)
- wren-core/core/src/logical_plan/analyze/mod.rs (1 hunks)
- wren-core/core/src/logical_plan/analyze/model_anlayze.rs (5 hunks)
- wren-core/core/src/logical_plan/analyze/model_generation.rs (7 hunks)
- wren-core/core/src/logical_plan/analyze/plan.rs (15 hunks)
- wren-core/core/src/logical_plan/analyze/relation_chain.rs (6 hunks)
- wren-core/core/src/mdl/context.rs (7 hunks)
- wren-core/core/src/mdl/mod.rs (40 hunks)
- wren-core/sqllogictest/src/test_context.rs (2 hunks)
- wren-core/wren-example/examples/calculation-invoke-calculation.rs (2 hunks)
- wren-core/wren-example/examples/datafusion-apply.rs (1 hunks)
- wren-core/wren-example/examples/plan-sql.rs (2 hunks)
- wren-core/wren-example/examples/row-level-access-control.rs (1 hunks)
- wren-core/wren-example/examples/to-many-calculation.rs (1 hunks)
- wren-core/wren-example/examples/view.rs (2 hunks)
✅ Files skipped from review due to trivial changes (2)
- wren-core/Cargo.toml
- wren-core/core/src/logical_plan/analyze/model_anlayze.rs
🚧 Files skipped from review as they are similar to previous changes (26)
- wren-core/core/Cargo.toml
- wren-core-py/src/manifest.rs
- wren-core/wren-example/examples/plan-sql.rs
- wren-core/wren-example/examples/datafusion-apply.rs
- wren-core-py/src/context.rs
- wren-core/wren-example/examples/to-many-calculation.rs
- wren-core/wren-example/examples/calculation-invoke-calculation.rs
- wren-core-base/src/mdl/manifest.rs
- wren-core/core/src/logical_plan/analyze/model_generation.rs
- wren-core-base/tests/data/mdl.json
- wren-core/benchmarks/src/tpch/run.rs
- wren-core-base/src/mdl/builder.rs
- ibis-server/tests/routers/v3/connector/postgres/test_query.py
- ibis-server/app/dependencies.py
- wren-core/core/src/logical_plan/analyze/relation_chain.rs
- wren-core/wren-example/examples/view.rs
- wren-core/core/src/logical_plan/analyze/mod.rs
- wren-core/sqllogictest/src/test_context.rs
- ibis-server/app/routers/v3/connector.py
- ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py
- ibis-server/app/mdl/rewriter.py
- wren-core/core/src/mdl/context.rs
- wren-core/core/src/logical_plan/analyze/plan.rs
- wren-core-py/tests/test_modeling_core.py
- wren-core/wren-example/examples/row-level-access-control.rs
- wren-core-base/manifest-macro/src/lib.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
wren-core/core/src/mdl/mod.rs (8)
wren-core/core/src/mdl/context.rs (3)
- properties (73-79)
- new (255-258)
- create_ctx_with_mdl (48-103)
wren-core-py/src/context.rs (5)
- runtime (169-175)
- new (75-142)
- remote_functions (81-84)
- functions (93-96)
- default (58-64)
wren-core/core/src/logical_plan/analyze/model_generation.rs (2)
- new (38-48)
- analyze (290-301)
wren-core/core/src/logical_plan/analyze/plan.rs (6)
- new (69-82)
- new (109-124)
- new (649-651)
- new (766-860)
- new (914-961)
- new (1020-1022)
wren-core/core/src/logical_plan/analyze/model_anlayze.rs (3)
- new (69-79)
- new (794-802)
- analyze (47-61)
wren-core/core/src/mdl/dialect/wren_dialect.rs (2)
- new (99-103)
- default (93-95)
wren-core-base/src/mdl/builder.rs (3)
- default (38-40)
- new_required (175-181)
- new_optional (182-188)
wren-core/core/src/logical_plan/analyze/access_control.rs (1)
- build_headers (453-461)
⏰ Context from checks skipped due to timeout of 90000ms (6)
- GitHub Check: cargo test (win64)
- GitHub Check: cargo test (macos-aarch64)
- GitHub Check: cargo test (macos)
- GitHub Check: test
- GitHub Check: cargo check
- GitHub Check: ci
Actionable comments posted: 2
♻️ Duplicate comments (2)
wren-core/core/src/logical_plan/analyze/access_control.rs (2)
102-126: Nested-Option handling remains hard to read (see previous review)
The double let Some(property_value) pattern and two-level Option type make this block difficult to follow and easy to miss error paths.
A flatter flow also allows a single ownership of the value and eliminates the second match.
The same refactor was suggested earlier and is still applicable.
41-66: Compound identifiers and field access still unsupported
collect_condition only matches ast::Expr::Identifier, meaning conditions such as nation.id = @x or "user"."role" = @role are silently ignored.
This can lead to incorrect “column not found” errors or, worse, missed column references.
A TODO comment exists, but given RLAC is security-sensitive, consider handling CompoundIdentifier and CompoundFieldAccess sooner rather than later.
🧹 Nitpick comments (3)
wren-core/core/src/mdl/mod.rs (2)
330-345: Consider accepting &HashMap / Arc<HashMap> instead of taking ownership
transform_sql / transform_sql_with_ctx now take properties: HashMap<String, Option<String>>, which is immediately wrapped into an Arc (Arc::new(properties)).
For a call-site this means:
- A caller that wants to keep its map has to clone the whole map for each call (potentially large).
- The API is now move-only, forcing callers to relinquish ownership or clone.
Passing impl Into<Arc<HashMap<_, _>>> or simply Arc<HashMap<_, _>> keeps the allocation cost at one place and avoids an extra copy per invocation.
    -pub fn transform_sql(
    -    analyzed_mdl: Arc<AnalyzedWrenMDL>,
    -    remote_functions: &[RemoteFunction],
    -    properties: HashMap<String, Option<String>>,
    +pub fn transform_sql(
    +    analyzed_mdl: Arc<AnalyzedWrenMDL>,
    +    remote_functions: &[RemoteFunction],
    +    properties: Arc<HashMap<String, Option<String>>>,
Call-sites that have a plain HashMap can still call Arc::new(hmap).
363-365: Redundant allocation when wrapping properties into Arc
    let properties_ref = Arc::new(properties);
Because HashMap was moved in, Arc::new(...) is unavoidable, but keep in mind that the entire table is cloned earlier if the caller already owns an Arc<HashMap<_,_>> (see previous comment).
Adopting an Arc in the function signature will eliminate both the clone and this extra allocation.
wren-core/core/src/logical_plan/analyze/access_control.rs (1)
Adopting anArcin the function signature will eliminate both the clone and this extra allocation.wren-core/core/src/logical_plan/analyze/access_control.rs (1)
187-193: Nit: typo & wording in error message
fileds → fields
"The session property {} allow only literal value" should be “allows”.
    - ast::Expr::Dictionary(fileds) => {
    + ast::Expr::Dictionary(fields) => {
          for field in fields {
    -         prevent_invalid_expr(&field.value)?;
    +         prevent_invalid_expr(&field.value)?;
          }
      …
    - _ => plan_err!("The session property {} allow only literal value", expr),
    + _ => plan_err!("The session property {} allows only literal values", expr),
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- wren-core/core/src/logical_plan/analyze/access_control.rs (1 hunks)
- wren-core/core/src/mdl/mod.rs (42 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (6)
- GitHub Check: cargo test (win64)
- GitHub Check: cargo test (macos-aarch64)
- GitHub Check: cargo test (macos)
- GitHub Check: cargo check
- GitHub Check: test
- GitHub Check: ci
dede1c4 to
1c7c615
Compare
Actionable comments posted: 5
🔭 Outside diff range comments (2)
ibis-server/app/routers/v3/connector.py (2)
346-351: ⚠️ Potential issue
Missing properties parameter in model_substitute.
Unlike all other endpoints, the Rewriter in model_substitute doesn't pass the properties from headers, which will prevent RLAC from working here.
Add the properties parameter to be consistent with other endpoints:
     Connector(data_source, dto.connection_info).dry_run(
         await Rewriter(
             dto.manifest_str,
             data_source=data_source,
             experiment=True,
    +        properties=dict(headers),
         ).rewrite(sql)
     )
358-359: ⚠️ Potential issue
Missing RLAC fallback check in model_substitute.
Unlike all other endpoints, the model_substitute endpoint is missing the check for RLAC headers when determining whether to fall back to v2 API. This will cause inconsistent behavior.
Add the check for RLAC headers:
     is_fallback_disable = bool(
         headers.get(X_WREN_FALLBACK_DISABLE)
         and safe_strtobool(headers.get(X_WREN_FALLBACK_DISABLE, "false"))
     )
    -if is_fallback_disable:
    +# because the v2 API doesn't support row-level access control,
    +# we don't fallback to v2 if the header include row-level access control properties.
    +if is_fallback_disable or exist_wren_variables_header(headers):
         raise e
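The fallback decision described here can be isolated into a small, testable function. The sketch below is illustrative only: the header name assigned to X_WREN_FALLBACK_DISABLE is a placeholder value, and _to_bool is a simplified stand-in for the project's safe_strtobool, whose exact behaviour is not shown in this review.

```python
from typing import Mapping, Optional

X_WREN_VARIABLE_PREFIX = "x-wren-variables-"
# Placeholder value; the real constant lives in the server code.
X_WREN_FALLBACK_DISABLE = "x-wren-fallback-disable-placeholder"


def _to_bool(value: Optional[str]) -> bool:
    """Simplified stand-in for safe_strtobool (assumption)."""
    return value is not None and value.strip().lower() in {"1", "true", "yes", "on"}


def has_wren_variables(headers: Optional[Mapping[str, str]]) -> bool:
    """True when at least one header carries a session property."""
    if headers is None:
        return False
    return any(key.lower().startswith(X_WREN_VARIABLE_PREFIX) for key in headers)


def should_fallback_to_v2(headers: Optional[Mapping[str, str]]) -> bool:
    """Fall back only if it is not disabled and no RLAC headers are present."""
    if headers is None:
        return True
    if _to_bool(headers.get(X_WREN_FALLBACK_DISABLE)):
        return False
    # The v2 API does not support row-level access control, so a request
    # carrying session properties must not be silently rerouted to it.
    return not has_wren_variables(headers)
```

Sharing one such predicate across endpoints would make it harder for a single endpoint, such as model_substitute, to drift from the others.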
♻️ Duplicate comments (2)
wren-core/core/src/logical_plan/analyze/access_control.rs (2)
234-241: ⚠️ Potential issue: Case-insensitive lookup relies on pre-lowercased header keys.
The function only lowercases the property name but not the keys in the headers map. This means if headers contain uppercase keys, the case-insensitive lookup will fail.
- .get(&property_name.to_lowercase())
+ .iter()
+ .find(|(k, _)| k.eq_ignore_ascii_case(&property_name))
+ .map(|(_, v)| v)
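An alternative to scanning the map on every lookup is to normalize the keys once, when the map is built. The sketch below shows that idea in Python for brevity; `normalize_headers` and `lookup_property` are hypothetical names and do not mirror the Rust code in access_control.rs.

```python
from typing import Dict, Optional


def normalize_headers(headers: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Lowercase every key once so later lookups can use a plain dict access."""
    return {name.lower(): value for name, value in headers.items()}


def lookup_property(
    normalized: Dict[str, Optional[str]], property_name: str
) -> Optional[str]:
    """Case-insensitive lookup against a map whose keys are already lowercased."""
    return normalized.get(property_name.lower())
```

Normalizing at creation time keeps each lookup constant-time, at the cost of requiring every producer of the map to go through the same normalization step.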
126-133: ⚠️ Potential issue: Inconsistent whitespace handling between validation functions.
This function uses `trim().is_empty()` while `is_property_present` on line 239 uses just `is_empty()`. This inconsistency means a property with only whitespace will be rejected here but might pass validation in `validate_rule`.
// Also update is_property_present on line 239 to match:
- .map(|v| v.as_ref().is_some_and(|value| !value.is_empty()))
+ .map(|v| v.as_ref().is_some_and(|value| !value.trim().is_empty()))
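One way to rule out this kind of drift is to route both code paths through a single predicate. The following Python sketch illustrates the idea under that assumption; `has_value` and `value_for_filter` are hypothetical names, and `is_property_present` only borrows the name of the Rust function discussed above.

```python
from typing import Dict, Optional


def has_value(value: Optional[str]) -> bool:
    """A value is present only if it is non-null and not blank after trimming."""
    return value is not None and value.strip() != ""


def is_property_present(headers: Dict[str, Optional[str]], property_name: str) -> bool:
    # Validation path: decides whether the rule can be applied at all.
    return has_value(headers.get(property_name.lower()))


def value_for_filter(
    headers: Dict[str, Optional[str]], property_name: str
) -> Optional[str]:
    # Filter-construction path: uses the same predicate, so both paths agree.
    value = headers.get(property_name.lower())
    return value if has_value(value) else None
```

Because both functions share `has_value`, a whitespace-only header cannot pass validation and then fail during filter construction.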
🧹 Nitpick comments (1)
wren-core/core/src/logical_plan/analyze/access_control.rs (1)
102-116: Consider additional error handling for session property variables in SQL injection scenarios.
The code correctly handles missing properties by checking against both direct properties and default values from required_properties. However, consider adding a safelist validation for property names to prevent potential SQL injection vectors through user-controlled session property names.
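A safelist check of the kind suggested here could look like the sketch below. The accepted pattern (ASCII letters, digits, and underscores, not starting with a digit) is an assumption chosen for illustration; the PR does not specify which names should be allowed.

```python
import re

# Assumed safelist: identifier-like names only.
_PROPERTY_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_safe_property_name(name: str) -> bool:
    """Accept a session property name only if the whole string matches the safelist."""
    return _PROPERTY_NAME.fullmatch(name) is not None
```

Using `fullmatch` rather than `match` matters here: a name such as `SESSION_LEVEL; DROP TABLE t` starts with a valid identifier but must still be rejected.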
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)
- wren-core/wren-example/data/company/documents.csv is excluded by !**/*.csv
- wren-core/wren-example/data/company/tenants.csv is excluded by !**/*.csv
- wren-core/wren-example/data/company/users.csv is excluded by !**/*.csv
📒 Files selected for processing (31)
- ibis-server/app/dependencies.py (2 hunks)
- ibis-server/app/mdl/rewriter.py (5 hunks)
- ibis-server/app/routers/v3/connector.py (8 hunks)
- ibis-server/app/util.py (1 hunks)
- ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (2 hunks)
- ibis-server/tests/routers/v3/connector/postgres/test_query.py (3 hunks)
- wren-core-base/manifest-macro/src/lib.rs (6 hunks)
- wren-core-base/src/mdl/builder.rs (8 hunks)
- wren-core-base/src/mdl/manifest.rs (5 hunks)
- wren-core-base/tests/data/mdl.json (1 hunks)
- wren-core-py/src/context.rs (3 hunks)
- wren-core-py/src/manifest.rs (2 hunks)
- wren-core-py/tests/test_modeling_core.py (2 hunks)
- wren-core/Cargo.toml (1 hunks)
- wren-core/benchmarks/src/tpch/run.rs (2 hunks)
- wren-core/core/Cargo.toml (1 hunks)
- wren-core/core/src/logical_plan/analyze/access_control.rs (1 hunks)
- wren-core/core/src/logical_plan/analyze/mod.rs (1 hunks)
- wren-core/core/src/logical_plan/analyze/model_anlayze.rs (5 hunks)
- wren-core/core/src/logical_plan/analyze/model_generation.rs (7 hunks)
- wren-core/core/src/logical_plan/analyze/plan.rs (15 hunks)
- wren-core/core/src/logical_plan/analyze/relation_chain.rs (6 hunks)
- wren-core/core/src/mdl/context.rs (7 hunks)
- wren-core/core/src/mdl/mod.rs (42 hunks)
- wren-core/sqllogictest/src/test_context.rs (2 hunks)
- wren-core/wren-example/examples/calculation-invoke-calculation.rs (2 hunks)
- wren-core/wren-example/examples/datafusion-apply.rs (1 hunks)
- wren-core/wren-example/examples/plan-sql.rs (2 hunks)
- wren-core/wren-example/examples/row-level-access-control.rs (1 hunks)
- wren-core/wren-example/examples/to-many-calculation.rs (1 hunks)
- wren-core/wren-example/examples/view.rs (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- wren-core/core/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (25)
- wren-core/core/src/logical_plan/analyze/mod.rs
- wren-core/wren-example/examples/plan-sql.rs
- wren-core/Cargo.toml
- wren-core/benchmarks/src/tpch/run.rs
- wren-core/wren-example/examples/to-many-calculation.rs
- wren-core/wren-example/examples/calculation-invoke-calculation.rs
- wren-core/wren-example/examples/view.rs
- wren-core-py/src/manifest.rs
- wren-core/core/src/logical_plan/analyze/model_anlayze.rs
- wren-core/sqllogictest/src/test_context.rs
- wren-core/wren-example/examples/datafusion-apply.rs
- ibis-server/tests/routers/v3/connector/postgres/test_query.py
- ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py
- wren-core-base/src/mdl/manifest.rs
- wren-core-base/tests/data/mdl.json
- wren-core-py/src/context.rs
- wren-core-base/src/mdl/builder.rs
- wren-core/core/src/logical_plan/analyze/relation_chain.rs
- wren-core/core/src/logical_plan/analyze/model_generation.rs
- ibis-server/app/mdl/rewriter.py
- wren-core/core/src/logical_plan/analyze/plan.rs
- wren-core/core/src/mdl/context.rs
- wren-core/wren-example/examples/row-level-access-control.rs
- wren-core-py/tests/test_modeling_core.py
- wren-core-base/manifest-macro/src/lib.rs
🧰 Additional context used
🧠 Learnings (1)
wren-core/core/src/logical_plan/analyze/access_control.rs (2)
Learnt from: goldmedal
PR: Canner/wren-engine#1161
File: wren-core/core/src/logical_plan/analyze/access_control.rs:0-0
Timestamp: 2025-04-30T01:15:15.009Z
Learning: In the row-level access control implementation, separate error checks are maintained for different failure modes (missing property vs null vs empty) to provide more precise and actionable error messages, even if it means slightly more verbose code with multiple Option checks.
Learnt from: goldmedal
PR: Canner/wren-engine#1161
File: wren-core/core/src/logical_plan/analyze/access_control.rs:0-0
Timestamp: 2025-04-30T01:18:21.776Z
Learning: In the `collect_condition` function of the row-level access control implementation, compound identifiers are intentionally ignored rather than causing failures when processing expressions. This is by design as confirmed by the team.
🧬 Code Graph Analysis (2)
ibis-server/app/util.py (1)
ibis-server/app/dependencies.py (1)
exist_wren_variables_header(42-47)
ibis-server/app/dependencies.py (1)
ibis-server/app/util.py (1)
exist_wren_variables_header(152-157)
🪛 Ruff (0.8.2)
ibis-server/app/dependencies.py
39-39: Undefined name request
(F821)
ibis-server/app/routers/v3/connector.py
35-35: Redefinition of unused exist_wren_variables_header from line 12
(F811)
⏰ Context from checks skipped due to timeout of 90000ms (6)
- GitHub Check: cargo test (macos-aarch64)
- GitHub Check: cargo test (win64)
- GitHub Check: cargo test (macos)
- GitHub Check: cargo check
- GitHub Check: ci
- GitHub Check: test
🔇 Additional comments (23)
wren-core/core/src/logical_plan/analyze/access_control.rs (8)
42-62: Well-implemented column and session property extraction.
The code effectively extracts and validates model columns while collecting session properties from the condition expression. Good use of a HashSet to avoid duplicate entries for session properties.
161-167: Good security practice with expression validation.
Parsing the expressions and then validating them via `prevent_invalid_expr` is an excellent security practice that prevents injection of arbitrary code through session properties.
171-195: Well-structured expression validation with recursion for complex data types.
The implementation properly handles validation for nested data structures (arrays, maps, dictionaries) by recursively checking each element. This thorough validation ensures that only literal values are accepted for session properties.
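The recursive literal-only check described above can be illustrated on a toy expression tree. This is a sketch of the technique only: the `Literal`, `Array`, `Map`, and `FunctionCall` classes are hypothetical stand-ins, not the SQL AST types that `prevent_invalid_expr` operates on.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Literal:
    value: object


@dataclass
class Array:
    elements: List["Expr"]


@dataclass
class Map:
    entries: List[Tuple["Expr", "Expr"]]


@dataclass
class FunctionCall:
    name: str
    args: List["Expr"]


Expr = Union[Literal, Array, Map, FunctionCall]


def is_literal_only(expr: Expr) -> bool:
    """Accept literals and containers of literals; reject everything else."""
    if isinstance(expr, Literal):
        return True
    if isinstance(expr, Array):
        # Every element must itself be literal-only.
        return all(is_literal_only(e) for e in expr.elements)
    if isinstance(expr, Map):
        # Both keys and values are checked recursively.
        return all(is_literal_only(k) and is_literal_only(v) for k, v in expr.entries)
    # Function calls (and any other node kind) are rejected.
    return False
```

The default-deny final branch is the important design choice: any node type that is not explicitly recognized as a literal or a literal container is refused.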
209-220: Comprehensive rule validation with clear handling of optional properties.
The logic for checking required vs. optional properties with default expressions is well-structured. It's particularly good to see the explicit check for empty default expressions.
266-295: Thorough test case for session property collection.
The test properly verifies extraction of session properties from various condition formats, including with comments and newlines. Good verification of model column extraction as well.
308-462: Comprehensive test suite for rule validation.
These tests cover all important scenarios:
- Required properties (present, missing, empty)
- Optional properties with defaults (present, missing)
- Optional properties without defaults
- Multiple property combinations
This level of test coverage gives high confidence in the implementation.
474-554: Good test cases for filter expression building.
The tests cover the key scenarios of building filter expressions with both required and optional properties, including error cases. The snapshot testing provides a clear expected output for regression testing.
589-654: Thorough validation of property value restrictions.
The tests comprehensively verify that only literal values are allowed in session properties, preventing any form of code injection. Good to see both valid and invalid cases tested.
wren-core/core/src/mdl/mod.rs (7)
330-345: Good addition of session properties parameter.
The `transform_sql` function has been properly updated to accept session properties, maintaining backward compatibility by not making it a required parameter with a default value.
347-366: Efficient handling of session properties.
The properties are properly wrapped in an `Arc` for efficient sharing across the async context. The approach ensures session properties are propagated appropriately through the transformation pipeline.
1688-1802: Comprehensive test coverage for required RLAC properties.
The tests thoroughly verify all aspects of required property handling:
- Property presence validation
- Filter application
- Multiple rules
- Join queries
- Error handling for missing properties
This gives good confidence in the implementation's correctness.
1883-2043: Good testing of optional properties with different default scenarios.
The tests cover important cases for optional properties:
- With values provided
- With default values when not provided
- Without default values
- Empty default values
- Multiple optional properties
This testing ensures the feature works reliably across different configurations.
2169-2198: Case insensitivity test ensures property name flexibility.
The test verifies that session property names are matched case-insensitively, making the API more user-friendly. This is a good usability feature.
2046-2166: Test for RLAC with calculated fields and join relationship.
Testing RLAC on a calculated field that relates to another table through a relationship is important for complex query scenarios. The commented TODO section indicates awareness of additional scenarios to be handled in the future.
2271-2279: Useful helper function for creating test headers.
The `build_headers` function simplifies test setup by providing a clean way to construct the session property map for different test scenarios.
ibis-server/app/dependencies.py (1)
8-8: New constant for RLAC header prefix added.
The constant `X_WREN_VARIABLE_PREFIX` defines the prefix for headers that will carry session properties for row-level access control (RLAC).
ibis-server/app/routers/v3/connector.py (7)
155-158: RLAC header check added to prevent fallback to v2 API.
This addition prevents fallback to the v2 API when RLAC headers are present, as the v2 API doesn't support row-level access control. This is an important safety check.
188-190: Passing header properties to Rewriter for dry-plan.
Consistent with the other endpoints, properties from headers are now passed to the Rewriter. This ensures RLAC is applied consistently across all API functions.
196-199: Consistent RLAC header check pattern.
The same pattern to prevent v2 API fallback when RLAC headers exist is applied here, ensuring consistent behavior across endpoints.
228-233: Properties from headers passed to Rewriter for dry-plan with data source.
Consistent with the other endpoints, properties from headers are passed to the Rewriter for data source-specific dry plans.
239-242: RLAC header check for data source-specific validation.
The consistent pattern to prevent v2 API fallback when RLAC headers exist is applied here as well.
275-280: Properties from headers passed to Rewriter for validation.
Session properties are now passed to the Rewriter for validation functions, ensuring RLAC is consistently applied.
289-292: RLAC header check for validation endpoint.
The consistent pattern to prevent v2 API fallback when RLAC headers exist is maintained here.
dd689c7 to
15e307b
Actionable comments posted: 0
♻️ Duplicate comments (3)
wren-core/core/src/logical_plan/analyze/access_control.rs (3)
42-43: Consider implementing CompoundIdentifier handling as mentioned in TODO
The TODO comment correctly identifies a limitation in the current implementation. Not handling CompoundIdentifier means qualified column references like `table.column` will be ignored in the condition parsing.
Based on retrieved learnings, I understand this is intentional behavior as confirmed by the team. This approach of ignoring compound identifiers rather than failing is a deliberate design choice.
126-133: Inconsistency between validation and filter construction whitespace handling
The function uses `trim().is_empty()` to check for whitespace-only values, while `is_property_present` at line 239 uses `!value.is_empty()` without trimming.
This creates a potential issue where a whitespace-only header value (`" "`) would pass validation in `validate_rule` but fail here during filter construction.
234-241: Case-insensitive lookup relies on header keys already being lowercased
The function converts `property_name` to lowercase but performs a direct HashMap lookup, which assumes the header keys are already normalized to lowercase elsewhere.
To ensure truly case-insensitive matching, either normalize headers at creation time or use an approach that isn't dependent on exact case matching in the HashMap keys.
🧹 Nitpick comments (1)
wren-core/core/src/mdl/mod.rs (1)
2112-2165: Consider implementing relationship inheritance for RLAC rules
The commented-out test case suggests a future enhancement where RLAC rules would automatically apply to related models when accessed through calculated fields. This would be a valuable addition for consistent security enforcement across related entities.
When implemented, this would allow a rule on a parent model (e.g., customer) to automatically apply when that model is accessed through a relationship (e.g., orders.customer), which would create more consistent access control.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)
- wren-core/wren-example/data/company/documents.csv is excluded by !**/*.csv
- wren-core/wren-example/data/company/tenants.csv is excluded by !**/*.csv
- wren-core/wren-example/data/company/users.csv is excluded by !**/*.csv
📒 Files selected for processing (30)
- ibis-server/app/dependencies.py (2 hunks)
- ibis-server/app/mdl/rewriter.py (5 hunks)
- ibis-server/app/routers/v3/connector.py (7 hunks)
- ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py (2 hunks)
- ibis-server/tests/routers/v3/connector/postgres/test_query.py (3 hunks)
- wren-core-base/manifest-macro/src/lib.rs (6 hunks)
- wren-core-base/src/mdl/builder.rs (8 hunks)
- wren-core-base/src/mdl/manifest.rs (5 hunks)
- wren-core-base/tests/data/mdl.json (1 hunks)
- wren-core-py/src/context.rs (3 hunks)
- wren-core-py/src/manifest.rs (2 hunks)
- wren-core-py/tests/test_modeling_core.py (2 hunks)
- wren-core/Cargo.toml (1 hunks)
- wren-core/benchmarks/src/tpch/run.rs (2 hunks)
- wren-core/core/Cargo.toml (1 hunks)
- wren-core/core/src/logical_plan/analyze/access_control.rs (1 hunks)
- wren-core/core/src/logical_plan/analyze/mod.rs (1 hunks)
- wren-core/core/src/logical_plan/analyze/model_anlayze.rs (5 hunks)
- wren-core/core/src/logical_plan/analyze/model_generation.rs (7 hunks)
- wren-core/core/src/logical_plan/analyze/plan.rs (15 hunks)
- wren-core/core/src/logical_plan/analyze/relation_chain.rs (6 hunks)
- wren-core/core/src/mdl/context.rs (7 hunks)
- wren-core/core/src/mdl/mod.rs (42 hunks)
- wren-core/sqllogictest/src/test_context.rs (2 hunks)
- wren-core/wren-example/examples/calculation-invoke-calculation.rs (2 hunks)
- wren-core/wren-example/examples/datafusion-apply.rs (1 hunks)
- wren-core/wren-example/examples/plan-sql.rs (2 hunks)
- wren-core/wren-example/examples/row-level-access-control.rs (1 hunks)
- wren-core/wren-example/examples/to-many-calculation.rs (1 hunks)
- wren-core/wren-example/examples/view.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (28)
- wren-core-py/tests/test_modeling_core.py
- wren-core/core/Cargo.toml
- wren-core-py/src/manifest.rs
- wren-core/core/src/logical_plan/analyze/mod.rs
- wren-core/Cargo.toml
- wren-core/benchmarks/src/tpch/run.rs
- wren-core/wren-example/examples/to-many-calculation.rs
- ibis-server/app/dependencies.py
- wren-core/wren-example/examples/plan-sql.rs
- wren-core/wren-example/examples/datafusion-apply.rs
- wren-core/sqllogictest/src/test_context.rs
- wren-core/wren-example/examples/calculation-invoke-calculation.rs
- ibis-server/tests/routers/v3/connector/postgres/test_fallback_v2.py
- wren-core-base/tests/data/mdl.json
- wren-core/wren-example/examples/view.rs
- wren-core-base/src/mdl/manifest.rs
- ibis-server/tests/routers/v3/connector/postgres/test_query.py
- wren-core/core/src/logical_plan/analyze/model_anlayze.rs
- wren-core/core/src/logical_plan/analyze/relation_chain.rs
- ibis-server/app/routers/v3/connector.py
- wren-core-base/src/mdl/builder.rs
- wren-core/wren-example/examples/row-level-access-control.rs
- wren-core/core/src/logical_plan/analyze/model_generation.rs
- wren-core-py/src/context.rs
- wren-core/core/src/mdl/context.rs
- ibis-server/app/mdl/rewriter.py
- wren-core-base/manifest-macro/src/lib.rs
- wren-core/core/src/logical_plan/analyze/plan.rs
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: goldmedal
PR: Canner/wren-engine#1161
File: ibis-server/app/routers/v3/connector.py:78-83
Timestamp: 2025-05-05T02:27:29.829Z
Learning: The row-level access control implementation in Wren Engine filters headers with the prefix `X_WREN_VARIABLE_PREFIX` in `EmbeddedEngineRewriter.get_session_properties` and validates session property expressions in `access_control.rs` to ensure they only contain literal values, preventing SQL injection.
wren-core/core/src/logical_plan/analyze/access_control.rs (2)
Learnt from: goldmedal
PR: Canner/wren-engine#1161
File: wren-core/core/src/logical_plan/analyze/access_control.rs:0-0
Timestamp: 2025-04-30T01:15:15.009Z
Learning: In the row-level access control implementation, separate error checks are maintained for different failure modes (missing property vs null vs empty) to provide more precise and actionable error messages, even if it means slightly more verbose code with multiple Option checks.
Learnt from: goldmedal
PR: Canner/wren-engine#1161
File: wren-core/core/src/logical_plan/analyze/access_control.rs:0-0
Timestamp: 2025-04-30T01:18:21.776Z
Learning: In the `collect_condition` function of the row-level access control implementation, compound identifiers are intentionally ignored rather than causing failures when processing expressions. This is by design as confirmed by the team.
⏰ Context from checks skipped due to timeout of 90000ms (8)
- GitHub Check: cargo test (amd64)
- GitHub Check: check Cargo.toml formatting
- GitHub Check: clippy
- GitHub Check: cargo test (win64)
- GitHub Check: cargo test (macos-aarch64)
- GitHub Check: cargo test (macos)
- GitHub Check: ci
- GitHub Check: test
🔇 Additional comments (13)
wren-core/core/src/logical_plan/analyze/access_control.rs (6)
33-34: Good use of HashSet for de-duplication
Using HashSet for both conditions and session_properties ensures unique entries are collected, avoiding redundant query processing and making the code more efficient.
102-110: Property fallback logic handles case-insensitive matching correctly
The property name handling converts to lowercase and uses `eq_ignore_ascii_case` when matching against required properties, ensuring case-insensitive matching works properly.
171-195: Strong security protection against SQL injection
The `prevent_invalid_expr` function is well-designed to allow only literal values in session properties, properly preventing SQL injection attacks by rejecting any dynamic or computed expressions.
209-225: Well-structured rule validation with clear semantics
The validation logic clearly differentiates between required and optional properties, with comprehensive checks that align with the documented behavior in the function comment. The error messages are specific and actionable.
585-587: Comprehensive testing of case insensitivity
The test verifies that session property matching works correctly with mixed case in properties, both in conditions (`@SESSION_NAME`) and required properties (`SESSION_ID`), ensuring the feature is robust against case variations.
599-654: Thorough validation of property value security constraints
The tests comprehensively verify that only literal expressions are accepted as session property values, with proper rejection of any potentially dangerous expressions like functions or subqueries that could enable SQL injection.
wren-core/core/src/mdl/mod.rs (7)
332-345: Modified transform_sql to include session properties parameter
The transform_sql function now accepts session properties as a parameter, propagating them to the underlying asynchronous implementation. This maintains the existing synchronous API while adding RLAC support.
351-366: Property integration with context creation
The function now wraps properties in an Arc and passes them to context creation, ensuring session properties are available throughout the query analysis and execution pipeline.
1689-1822: Comprehensive testing of required property validation
The tests thoroughly verify that RLAC rules with required properties:
- Correctly filter when properties are present
- Fail appropriately when required properties are missing
- Work with multiple rules and properties
- Apply correctly in JOIN scenarios
This ensures the feature is robust for real-world usage.
1883-2043: Thorough testing of optional properties with fallback behavior
The tests verify that optional property handling works correctly for:
- Properties with default values
- Properties without default values
- Mixed required and optional properties
- Empty string default values
This ensures the feature supports flexible access control policies.
2046-2108: Testing RLAC on calculated fields with relationships
The tests verify that RLAC works correctly when applied to calculated fields involving JOINs, ensuring that the feature integrates properly with Wren's relationship model.
2169-2198: Case-insensitive property name matching verified
The test confirms that property names are matched case-insensitively, allowing HTTP headers like "SESSION_NATION" to match properties defined as "session_nation", which improves usability across different client environments.
2272-2279: Utility function for test header creation
The `build_headers` utility function creates a consistent format for session properties used in tests, simplifying test setup and making the code more maintainable.
Description
This PR introduces row-level access control (RLAC) for models in the Rust implementation. RLAC applies filters to a model, and the filter condition changes dynamically according to the session properties carried with the request.
Spec
[
  {
    // model object
    "name": "user",
    "columns": [ .. ],
    "rowLevelAccessControls": [
      {
        "name": "level role region rule",
        "requiredProperties": [
          { "name": "SESSION_LEVEL", "required": false, "default_expr": "1" },
          { "name": "SESSION_ROLE", "required": true },
          { "name": "SESSION_REGION", "required": false }
        ],
        "condition": "level = @SESSION_LEVEL AND role = @SESSION_ROLE OR region = @SESSION_REGION"
      },
    ]
  }
]
- name: the display name of the rule.
- requiredProperties: the properties this rule requires.
  - name: the name of the property. It is case-insensitive.
  - required: whether the property is required.
  - default_expr: the default value of the property. For an optional property, the default value is applied if the session property is missing.
- condition: the condition applied to the model. It must be a boolean expression and is appended to the WHERE clause of the model. The @ prefix marks a session property; the engine replaces it with the value from the corresponding header.
Session Property Header
Session property headers use the prefix X-Wren-Variable-. For example, given a required session property called SESSION_LEVEL, the input header should be named X-WREN-VARIABLE-SESSION_LEVEL.
Example
The following cases show how it works:
All properties provided
Missing optional property with a default
The default value is applied.
Missing optional property without a default
Missing the required property
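The four cases above can be walked through with a small Python sketch. It is an illustration under stated assumptions, not the engine's logic: `session_properties`, `resolve_rule`, and the lowercase `PREFIX` value are hypothetical, and the sketch assumes that a missing optional property without a default causes the rule to be skipped and that a missing required property raises an error.

```python
from typing import Dict, List, Optional

PREFIX = "x-wren-variable-"  # assumed lowercase form of the header prefix


def session_properties(headers: Dict[str, str]) -> Dict[str, str]:
    """Keep only prefixed headers; strip the prefix and lowercase the name."""
    props = {}
    for name, value in headers.items():
        lowered = name.lower()
        if lowered.startswith(PREFIX):
            props[lowered[len(PREFIX):]] = value
    return props


def resolve_rule(
    required_properties: List[dict], props: Dict[str, str]
) -> Optional[Dict[str, str]]:
    """Return the values to bind, or None when the rule is skipped (assumption)."""
    bound = {}
    for prop in required_properties:
        key = prop["name"].lower()
        value = props.get(key)
        if value is not None and value.strip():
            bound[key] = value  # property supplied by the request
        elif prop.get("required"):
            raise ValueError(f"session property {prop['name']} is required")
        elif prop.get("default_expr"):
            bound[key] = prop["default_expr"]  # optional, default applied
        else:
            return None  # optional, no default: assumed to skip the rule
    return bound


# Properties taken from the spec in this description.
RULE = [
    {"name": "SESSION_LEVEL", "required": False, "default_expr": "1"},
    {"name": "SESSION_ROLE", "required": True},
    {"name": "SESSION_REGION", "required": False},
]

props = session_properties({
    "X-Wren-Variable-Session_Role": "'admin'",
    "X-Wren-Variable-Session_Region": "'US'",
})
bound = resolve_rule(RULE, props)  # SESSION_LEVEL falls back to its default
```

The bound values would then replace the `@`-prefixed names in the rule's condition before it is added as a filter on the model.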
The Rust example
Check wren-example/examples/row-level-access-control.rs for an executable example.
Known Issues
CompoundIdentifier and CompoundFieldAccess
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation
Chores