fix(embedded): prevent double RLS application in virtual datasets#37395
YuriyKrasilnikov wants to merge 3 commits into apache:master
Fixes apache#37359: Guest users in embedded dashboards experienced double RLS application when using virtual datasets, causing SQL errors.

Problem:
- get_sqla_row_level_filters() includes guest RLS for ALL calls
- For virtual datasets, it was called twice:
  1. For underlying tables via get_predicates_for_table()
  2. For the virtual dataset itself via get_sqla_query()
- Global guest RLS rules were applied to BOTH, causing double filtering

Solution:
- Refactored get_sqla_row_level_filters() using Separate Method pattern
- Created _get_sqla_row_level_filters_internal() with include_guest_rls param
- Public API unchanged (backwards compatible)
- In get_predicates_for_table(), call internal method with include_guest_rls=False
- Guest RLS now applied only to outer query, not underlying tables

Security:
- Regular (non-guest) RLS still applied to underlying tables
- Guest RLS still applied to outer query
- No RLS bypass possible

Testing:
- Added 6 unit tests covering all scenarios
- All tests pass
Code Review Agent Run #a80900
Actionable Suggestions - 0
Additional Suggestions - 1
Renamed tests to follow Superset's test naming convention:
- test_public_api_includes_guest_rls → test_rls_filters_include_guest_when_enabled
- test_internal_api_excludes_guest_rls_when_requested → test_rls_filters_exclude_guest_when_requested
- test_internal_api_includes_guest_rls_by_default → test_rls_filters_include_guest_by_default

Superset pattern: test_<functionality>_<scenario>
No "public_api" or "internal_api" in test names.

Ref: bito-code-review suggestion on PR apache#37395
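Response to bito-code-review suggestion

Analysis of the suggestion
The bot noted that

Investigation of Superset test naming patterns:

```
# Examples from codebase:
test_csv_reader_cast_column_types_function  # tests _cast_column_types
test_get_sqla_engine_user_impersonation     # tests internal behavior
test_query_context_modified_tampered        # describes scenario
```

Superset pattern: test_<functionality>_<scenario>

Changes made
Renamed tests to follow Superset naming convention:
- test_public_api_includes_guest_rls → test_rls_filters_include_guest_when_enabled
- test_internal_api_excludes_guest_rls_when_requested → test_rls_filters_exclude_guest_when_requested
- test_internal_api_includes_guest_rls_by_default → test_rls_filters_include_guest_by_default

Method naming verification
The internal method name

All 6 tests pass after renaming.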
Code Review Agent Run #cc567f
Actionable Suggestions - 0
Codecov Report
✅ All modified and coverable lines are covered by tests.

```diff
@@             Coverage Diff             @@
##           master   #37395       +/-   ##
===========================================
+ Coverage   60.48%   66.60%    +6.11%
===========================================
  Files        1931      642     -1289
  Lines       76236    48999    -27237
  Branches     8568     5491     -3077
===========================================
- Hits        46114    32636    -13478
+ Misses      28017    15069    -12948
+ Partials     2105     1294      -811
```
Update test_get_predicates_for_table to mock _get_sqla_row_level_filters_internal instead of get_sqla_row_level_filters, matching the implementation change in get_predicates_for_table that uses the internal method with include_guest_rls=False.
CI Fix
Fixed the failing
Root cause: The test was mocking
Note on pre-commit failure: The
Response to codeant-ai bot suggestions

1. rls.py:114 - "Logic/security regression"
The bot's concern is incorrect. The change is intentional and does NOT disable guest RLS globally.
Architecture:
Flow:
This prevents double application of guest RLS (the bug described in #37359), while ensuring guest RLS is still applied exactly once at the correct level.

2-4. test_double_rls_virtual_dataset.py - "with (patch(...),) tuple syntax"
The bot is incorrect. The syntax
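For background on the disputed syntax: since Python 3.10, a `with` statement may wrap its context managers in parentheses, and a trailing comma is allowed, so the parenthesized group is read as a list of context managers rather than as a tuple expression. The sketch below is not taken from the PR's test file; the patched targets (`os.getcwd`, `os.getpid`) and the function name are chosen only for illustration.

```python
import os
from unittest.mock import patch


def read_patched_values():
    # Parenthesized context managers (documented from Python 3.10).
    # The trailing comma is permitted and does not create a tuple.
    with (
        patch("os.getcwd", return_value="/fake/dir"),
        patch("os.getpid", return_value=4242),
    ):
        # Both patches are active inside the block.
        return os.getcwd(), os.getpid()


patched = read_patched_values()
```

On interpreters that predate this grammar, the same parenthesized group would be evaluated as a tuple, which is presumably what the bot's warning assumed.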
Code Review Agent Run #b25117
Actionable Suggestions - 0
Summary
Fixes #37359: Guest users in embedded dashboards can now use virtual datasets without SQL errors.
Before: Guest RLS applied twice → SQL errors like
Unknown expression or function identifier 'my_table.tenant_id'

After: Guest RLS applied once → Virtual datasets work correctly
Problem Analysis
Root Cause (Verified by Code Trace)
get_sqla_row_level_filters() includes guest RLS for every call, but for virtual datasets it's called twice:
1. For the underlying tables, via get_predicates_for_table()
2. For the virtual dataset itself, via get_sqla_query()

Key Code Locations
- superset/models/helpers.py: apply_rls() call in get_from_clause()
- superset/utils/rls.py: dataset.get_sqla_row_level_filters() call
- superset/connectors/sqla/models.py
- superset/models/helpers.py

Why This Happens
Global guest RLS rules (without dataset ID) match both the underlying tables and the virtual dataset itself.
Both get guest RLS added → double application.
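To make the matching concrete, the sketch below models guest rules as plain dictionaries, where a rule without a `dataset` key is treated as global. It is a simplified illustration of the behaviour described above, not Superset's code; the function name `matching_guest_rules` and the dataset IDs are hypothetical.

```python
from typing import Any


def matching_guest_rules(
    rules: list[dict[str, Any]], dataset_id: int
) -> list[dict[str, Any]]:
    """Return rules that are global (no dataset key) or target this dataset."""
    return [
        rule
        for rule in rules
        if rule.get("dataset") is None or str(rule["dataset"]) == str(dataset_id)
    ]


# One global rule, as in the manual test scenario (no dataset ID).
guest_rules = [{"clause": "tenant_id = 123"}]

# Hypothetical IDs for the physical table's dataset and the virtual dataset.
PHYSICAL_ID, VIRTUAL_ID = 1, 2

for_physical = matching_guest_rules(guest_rules, PHYSICAL_ID)
for_virtual = matching_guest_rules(guest_rules, VIRTUAL_ID)
```

Under this model the same clause is selected for both datasets, which is the condition that leads to the filter being added at both levels.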
Solution: Separate Method Pattern
Architecture Decision
Evaluated 3 options:
Implementation
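1. Created _get_sqla_row_level_filters_internal() with an include_guest_rls parameter.
2. In get_predicates_for_table(), call the internal method with include_guest_rls=False.

Why Separate Method Pattern?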
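The shape of this refactoring can be sketched as follows. This is a minimal model under stated assumptions, not the code from the PR: `DatasetSketch` and `filters_for_virtual_query` are hypothetical names, and filters are represented as plain strings rather than SQLAlchemy clauses. Only the two method names and the `include_guest_rls` parameter come from the PR description.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetSketch:
    """Hypothetical stand-in for a dataset; not Superset's model class."""

    regular_rls: list[str] = field(default_factory=list)
    guest_rls: list[str] = field(default_factory=list)

    def get_sqla_row_level_filters(self) -> list[str]:
        # Public API: signature and behaviour unchanged, guest RLS included.
        return self._get_sqla_row_level_filters_internal(include_guest_rls=True)

    def _get_sqla_row_level_filters_internal(
        self, include_guest_rls: bool = True
    ) -> list[str]:
        # Regular (non-guest) RLS is always returned.
        filters = list(self.regular_rls)
        if include_guest_rls:
            filters.extend(self.guest_rls)
        return filters


def filters_for_virtual_query(
    underlying: DatasetSketch, virtual: DatasetSketch
) -> tuple[list[str], list[str]]:
    # Underlying table: skip guest RLS, as get_predicates_for_table() now does.
    inner = underlying._get_sqla_row_level_filters_internal(include_guest_rls=False)
    # Outer query on the virtual dataset: guest RLS applied here, once.
    outer = virtual.get_sqla_row_level_filters()
    return inner, outer
```

Keeping the flag on a private method means existing callers of the public method see no signature change, which matches the backwards-compatibility goal stated in the commit message.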
Security Analysis
Mathematical model unchanged:
RLS(query) = RLS(underlying) AND RLS(outer)

Related Issues/PRs
Testing
New Tests (6 tests, all pass)
- test_public_api_includes_guest_rls - Backwards compatibility
- test_internal_api_excludes_guest_rls_when_requested - Core fix
- test_internal_api_includes_guest_rls_by_default - Default behavior
- test_regular_rls_always_included - Security
- test_guest_rls_skipped_when_feature_disabled - Feature flag
- test_filter_grouping_preserved - Edge case

Files Changed
- superset/connectors/sqla/models.py - Refactor method
- superset/utils/rls.py - Call internal method
- tests/unit_tests/models/test_double_rls_virtual_dataset.py - New tests

How To Test Manually
SELECT * FROM physical_table
{"clause": "tenant_id = 123"}

Checklist