Skip to content

Comments

fix(embedded): prevent double RLS application in virtual datasets#37395

Open
YuriyKrasilnikov wants to merge 3 commits intoapache:masterfrom
YuriyKrasilnikov:fix/double-rls-virtual-datasets-37359
Open

fix(embedded): prevent double RLS application in virtual datasets#37395
YuriyKrasilnikov wants to merge 3 commits intoapache:masterfrom
YuriyKrasilnikov:fix/double-rls-virtual-datasets-37359

Conversation

@YuriyKrasilnikov
Copy link

Summary

Fixes #37359: Guest users in embedded dashboards can now use virtual datasets without SQL errors.

Before: Guest RLS applied twice → SQL errors like Unknown expression or function identifier 'my_table.tenant_id'
After: Guest RLS applied once → Virtual datasets work correctly

Problem Analysis

Root Cause (Verified by Code Trace)

get_sqla_row_level_filters() includes guest RLS for every call, but for virtual datasets it's called twice:

1. get_sqla_query() for virtual dataset
   ↓
2. get_from_clause() → apply_rls() → get_predicates_for_table()
   ↓
3. dataset.get_sqla_row_level_filters() ← INCLUDES GUEST RLS (models.py:758-763)
   ↓
4. Guest RLS applied to underlying tables in virtual dataset SQL
   ↓
5. Return to get_sqla_query()
   ↓
6. self.get_sqla_row_level_filters() ← INCLUDES GUEST RLS AGAIN (helpers.py:3198)
   ↓
7. Guest RLS applied to outer WHERE clause
   ↓
8. DOUBLE APPLICATION!

Key Code Locations

File Lines Description
superset/models/helpers.py 2052-2057 apply_rls() call in get_from_clause()
superset/utils/rls.py 108 dataset.get_sqla_row_level_filters() call
superset/connectors/sqla/models.py 758-763 Guest RLS addition
superset/models/helpers.py 3198 Outer query RLS call

Why This Happens

Global guest RLS rules (without dataset ID) match both:

  • The underlying physical table dataset
  • The virtual dataset itself

Both get guest RLS added → double application.

Solution: Separate Method Pattern

Architecture Decision

Evaluated 3 options:

Option Approach Risk Chosen
A Add parameter to public API Breaks plugins, 36 test mocks
B Tracking mechanism (like PR #35890) Complex, many files
C Separate internal method Safe, backwards compatible

Implementation

  1. Extract internal method with include_guest_rls parameter:
def get_sqla_row_level_filters(self, template_processor=None):
    """Public API - always includes guest RLS (backwards compatible)"""
    return self._get_sqla_row_level_filters_internal(
        template_processor, include_guest_rls=True
    )

def _get_sqla_row_level_filters_internal(
    self, template_processor=None, include_guest_rls=True
):
    """Internal method with optional guest RLS exclusion"""
    # ... implementation with include_guest_rls check ...
  1. Call internal method from get_predicates_for_table():
# Exclude guest RLS for underlying tables to prevent double application
for predicate in dataset._get_sqla_row_level_filters_internal(
    include_guest_rls=False
)

Why Separate Method Pattern?

Risk Separate Method Parameter in Public API
Breaking plugins ✅ None ❌ Breaks
Accidental RLS bypass ✅ Protected ❌ Possible
API backwards compatibility ✅ 100% ⚠️ Partial
Test mock updates ✅ Minimal ❌ 36 locations

Security Analysis

Aspect Status
Regular RLS on underlying tables ✅ Still applied
Guest RLS on outer query ✅ Still applied
Guest RLS bypass possible ❌ No - internal method protected

Mathematical model unchanged: RLS(query) = RLS(underlying) AND RLS(outer)

Related Issues/PRs

Testing

New Tests (6 tests, all pass)

  • test_public_api_includes_guest_rls - Backwards compatibility
  • test_internal_api_excludes_guest_rls_when_requested - Core fix
  • test_internal_api_includes_guest_rls_by_default - Default behavior
  • test_regular_rls_always_included - Security
  • test_guest_rls_skipped_when_feature_disabled - Feature flag
  • test_filter_grouping_preserved - Edge case

Files Changed

  • superset/connectors/sqla/models.py - Refactor method
  • superset/utils/rls.py - Call internal method
  • tests/unit_tests/models/test_double_rls_virtual_dataset.py - New tests

How To Test Manually

  1. Create virtual dataset: SELECT * FROM physical_table
  2. Create dashboard with chart using virtual dataset
  3. Get guest token with global RLS rule: {"clause": "tenant_id = 123"}
  4. Open embedded dashboard
  5. Before fix: SQL error with alias mismatch
  6. After fix: Chart renders correctly

Checklist

  • Refactored to Separate Method pattern (safe, backwards compatible)
  • 6 unit tests covering all scenarios
  • All tests pass
  • MyPy type checking passes
  • No changes to public API

Fixes apache#37359: Guest users in embedded dashboards experienced double
RLS application when using virtual datasets, causing SQL errors.

Problem:
- get_sqla_row_level_filters() includes guest RLS for ALL calls
- For virtual datasets, it was called twice:
  1. For underlying tables via get_predicates_for_table()
  2. For the virtual dataset itself via get_sqla_query()
- Global guest RLS rules were applied to BOTH, causing double filtering

Solution:
- Refactored get_sqla_row_level_filters() using Separate Method pattern
- Created _get_sqla_row_level_filters_internal() with include_guest_rls param
- Public API unchanged (backwards compatible)
- In get_predicates_for_table(), call internal method with include_guest_rls=False
- Guest RLS now applied only to outer query, not underlying tables

Security:
- Regular (non-guest) RLS still applied to underlying tables
- Guest RLS still applied to outer query
- No RLS bypass possible

Testing:
- Added 6 unit tests covering all scenarios
- All tests pass
@bito-code-review
Copy link
Contributor

bito-code-review bot commented Jan 23, 2026

Code Review Agent Run #a80900

Actionable Suggestions - 0
Additional Suggestions - 1
  • tests/unit_tests/models/test_double_rls_virtual_dataset.py - 1
    • Test intent mismatch · Line 46-86
      The test function name and docstring claim to test the public API get_sqla_row_level_filters(), but the code calls the internal _get_sqla_row_level_filters_internal() method instead. This mismatch could mislead reviewers and hide issues in the public wrapper.
Review Details
  • Files reviewed - 3 · Commit Range: b8c3448..b8c3448
    • superset/connectors/sqla/models.py
    • superset/utils/rls.py
    • tests/unit_tests/models/test_double_rls_virtual_dataset.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@dosubot dosubot bot added the authentication:row-level-security Related to Row Level Security label Jan 23, 2026
@netlify
Copy link

netlify bot commented Jan 23, 2026

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit b8c3448
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/69737b0634e27f00080c6c18
😎 Deploy Preview https://deploy-preview-37395--superset-docs-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

Renamed tests to follow Superset's test naming convention:
- test_public_api_includes_guest_rls → test_rls_filters_include_guest_when_enabled
- test_internal_api_excludes_guest_rls_when_requested → test_rls_filters_exclude_guest_when_requested
- test_internal_api_includes_guest_rls_by_default → test_rls_filters_include_guest_by_default

Superset pattern: test_<functionality>_<scenario>
No "public_api" or "internal_api" in test names.

Ref: bito-code-review suggestion on PR apache#37395
@YuriyKrasilnikov
Copy link
Author

Response to bito-code-review suggestion

Analysis of the suggestion

The bot noted that test_public_api_includes_guest_rls name suggests testing public API, but the code calls internal method.

Investigation of Superset test naming patterns:

# Examples from codebase:
test_csv_reader_cast_column_types_function  # tests _cast_column_types
test_get_sqla_engine_user_impersonation     # tests internal behavior
test_query_context_modified_tampered        # describes scenario

Superset pattern: test_<functionality>_<scenario> — no "public_api" or "internal_api" prefixes.

Changes made

Renamed tests to follow Superset naming convention:

Old Name New Name
test_public_api_includes_guest_rls test_rls_filters_include_guest_when_enabled
test_internal_api_excludes_guest_rls_when_requested test_rls_filters_exclude_guest_when_requested
test_internal_api_includes_guest_rls_by_default test_rls_filters_include_guest_by_default

Method naming verification

The internal method name _get_sqla_row_level_filters_internal follows existing Superset patterns:

  • _get_chart_preview_internal (mcp_service/chart/tool/get_chart_preview.py)
  • _get_screenshot_internal (mcp_service/screenshot/pooled_screenshot.py)

All 6 tests pass after renaming.

@bito-code-review
Copy link
Contributor

bito-code-review bot commented Jan 23, 2026

Code Review Agent Run #cc567f

Actionable Suggestions - 0
Review Details
  • Files reviewed - 3 · Commit Range: b8c3448..01932f6
    • superset/connectors/sqla/models.py
    • superset/utils/rls.py
    • tests/unit_tests/models/test_double_rls_virtual_dataset.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@rusackas rusackas requested review from dpgaspar and villebro January 23, 2026 19:17
@codecov
Copy link

codecov bot commented Jan 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 66.60%. Comparing base (76d897e) to head (68c8087).
⚠️ Report is 3480 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #37395      +/-   ##
==========================================
+ Coverage   60.48%   66.60%   +6.11%     
==========================================
  Files        1931      642    -1289     
  Lines       76236    48999   -27237     
  Branches     8568     5491    -3077     
==========================================
- Hits        46114    32636   -13478     
+ Misses      28017    15069   -12948     
+ Partials     2105     1294     -811     
Flag Coverage Δ
hive 41.91% <33.33%> (-7.24%) ⬇️
javascript ?
mysql 64.66% <100.00%> (?)
postgres 64.74% <100.00%> (?)
presto 41.93% <33.33%> (-11.88%) ⬇️
python 66.57% <100.00%> (+3.07%) ⬆️
sqlite 64.43% <100.00%> (?)
unit 100.00% <ø> (+42.36%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Update test_get_predicates_for_table to mock _get_sqla_row_level_filters_internal
instead of get_sqla_row_level_filters, matching the implementation change in
get_predicates_for_table that uses the internal method with include_guest_rls=False.
@YuriyKrasilnikov
Copy link
Author

CI Fix

Fixed the failing test_get_predicates_for_table unit test.

Root cause: The test was mocking get_sqla_row_level_filters() but the implementation now calls _get_sqla_row_level_filters_internal(include_guest_rls=False). Updated the mock to use the correct internal method.


Note on pre-commit failure: The pre-commit (previous) check failed due to an infrastructure issue (Broken pipe during helm-docs installation via brew), not related to this PR's changes.

@YuriyKrasilnikov
Copy link
Author

Response to codeant-ai bot suggestions

1. rls.py:114 - "Logic/security regression"

The bot's concern is incorrect. The change is intentional and does NOT disable guest RLS globally.

Architecture:

  • get_predicates_for_table() is called only for underlying tables in virtual dataset SQL (via apply_rls())
  • Guest RLS for the virtual dataset itself is applied separately in get_sqla_query() line 3198 via get_sqla_row_level_filters() which always calls _get_sqla_row_level_filters_internal(include_guest_rls=True)

Flow:

  1. get_from_clause()apply_rls() → underlying tables get regular RLS only (no guest RLS)
  2. get_sqla_query()get_sqla_row_level_filters() → virtual dataset gets guest RLS

This prevents double application of guest RLS (the bug described in #37359), while ensuring guest RLS is still applied exactly once at the correct level.


2-4. test_double_rls_virtual_dataset.py - "with (patch(...),) tuple syntax"

The bot is incorrect. The syntax with (patch(...), patch(...)): is valid Python 3.10+ syntax per PEP 617. Superset CI uses Python 3.10/3.11 where this is fully supported.

@bito-code-review
Copy link
Contributor

bito-code-review bot commented Jan 23, 2026

Code Review Agent Run #b25117

Actionable Suggestions - 0
Review Details
  • Files reviewed - 4 · Commit Range: 01932f6..68c8087
    • superset/connectors/sqla/models.py
    • superset/utils/rls.py
    • tests/unit_tests/models/test_double_rls_virtual_dataset.py
    • tests/unit_tests/sql_lab_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

YuriyKrasilnikov added a commit to YuriyKrasilnikov/superset that referenced this pull request Feb 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

authentication:row-level-security Related to Row Level Security size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Double RLS application in virtual datasets for guest users (regression from PR #36061)

1 participant