Skip to content

fix: cache warmup using WebDriver for reliable authentication#38449

Open
rusackas wants to merge 6 commits into
masterfrom
adopt-pr-34525-fix-cache-warmup
Open

fix: cache warmup using WebDriver for reliable authentication#38449
rusackas wants to merge 6 commits into
masterfrom
adopt-pr-34525-fix-cache-warmup

Conversation

@rusackas
Copy link
Copy Markdown
Member

@rusackas rusackas commented Mar 5, 2026

Summary

Adopts and rebases PR #34525 (originally PR #20387 by @ensky) onto
current master with conflict resolution.

  • Replaced HTTP/API-based warmup with WebDriver-based dashboard rendering
  • Added SUPERSET_CACHE_WARMUP_USER config (default: "admin")
  • Added persistent WebDriver instance support for efficiency
  • Warm up entire dashboards instead of individual charts
  • Added Celery beat configuration documentation

Attribution: Originally by @ensky (PR #20387), adopted by @rusackas (PR
#34525)

Test plan

  • Verify strategy tests pass with new URL-based assertions
  • Verify thumbnail tests pass with updated WebDriverSelenium API
  • Test cache warmup with configured SUPERSET_CACHE_WARMUP_USER

🤖 Generated with Claude Code

  Adopted from PR #34525 by @rusackas (originally PR #20387 by @ensky).
  Rebased on master with conflict resolution.

  Changes:
  - Use WebDriver (Selenium) to render dashboards for cache warmup
  - Add SUPERSET_CACHE_WARMUP_USER config for specifying the warmup user
  - Support persistent WebDriver instances for efficiency
  - Warm up entire dashboards instead of individual charts
  - Add Celery beat configuration documentation
  - Remove obsolete HTTP-based cache warmup tests

  Co-Authored-By: Evan Rusackas <evan@rusackas.com>
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dosubot dosubot Bot added change:backend Requires changing the backend doc Namespace | Anything related to documentation labels Mar 5, 2026
@netlify
Copy link
Copy Markdown

netlify Bot commented Mar 5, 2026

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit 776ab3c
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/69d573567f84d30008a2d12c
😎 Deploy Preview https://deploy-preview-38449--superset-docs-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

Copy link
Copy Markdown
Contributor

@bito-code-review bito-code-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Agent Run #f86fd5

Actionable Suggestions - 2
  • superset/utils/webdriver.py - 2
Review Details
  • Files reviewed - 9 · Commit Range: 2ac03e4..2ac03e4
    • docs/admin_docs/configuration/cache.mdx
    • superset/config.py
    • superset/tasks/cache.py
    • superset/utils/screenshots.py
    • superset/utils/webdriver.py
    • tests/integration_tests/strategy_tests.py
    • tests/integration_tests/tasks/test_cache.py
    • tests/integration_tests/tasks/test_utils.py
    • tests/integration_tests/thumbnails_tests.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Comment thread superset/utils/webdriver.py
Comment thread superset/utils/webdriver.py
@sadpandajoe
Copy link
Copy Markdown
Member

@rusackas I see we deleted 2 test files, are those tests covered in the new files that are created?

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR rebases and modernizes Superset’s cache warmup feature by switching from API-based chart warmup to WebDriver-driven dashboard rendering, aiming to make authentication and cache population more reliable (and closer to real user behavior).

Changes:

  • Replaced API/CSRF-based warmup flow with Selenium-based dashboard rendering (URL-based warmup).
  • Introduced SUPERSET_CACHE_WARMUP_USER config and updated warmup strategies/tests to return dashboard URLs.
  • Added “persistent” Selenium driver lifecycle hooks and documented Celery beat configuration for scheduling warmups.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
superset/tasks/cache.py Switches warmup to WebDriver-driven dashboard URL loading and updates strategies accordingly.
superset/utils/webdriver.py Adds persistent Selenium driver support and changes driver/auth/destroy behavior.
superset/utils/screenshots.py Updates screenshot driver creation/calls to align with new WebDriver API shape.
superset/config.py Adds SUPERSET_CACHE_WARMUP_USER default configuration.
docs/admin_docs/configuration/cache.mdx Documents Celery beat scheduling for cache warmup and the warmup user config.
tests/integration_tests/strategy_tests.py Updates strategy tests from task payload assertions to URL assertions.
tests/integration_tests/thumbnails_tests.py Adjusts Selenium screenshot tests for new WebDriverSelenium API usage.
tests/integration_tests/tasks/test_cache.py Removes tests for the deleted API-based warmup implementation.
tests/integration_tests/tasks/test_utils.py Removes tests for the deleted CSRF-token fetching helper usage.
Comments suppressed due to low confidence (1)

superset/utils/screenshots.py:199

  • When PLAYWRIGHT_REPORTS_AND_THUMBNAILS is enabled, driver() returns WebDriverPlaywright without any user context, and get_screenshot() calls driver.get_screenshot(...) without passing user. Since Playwright auth is now conditional on the user argument, this will likely generate unauthenticated screenshots (login page). Pass user through for the Playwright path (either at driver creation or call time).
        if feature_flag_manager.is_feature_enabled("PLAYWRIGHT_REPORTS_AND_THUMBNAILS"):
            # Try to use Playwright if available (supports WebGL/DeckGL, unlike Cypress)
            if PLAYWRIGHT_AVAILABLE:
                return WebDriverPlaywright(self.driver_type, window_size)


You can also share your feedback on Copilot code review. Take the survey.

Comment thread superset/utils/screenshots.py Outdated
Comment thread superset/tasks/cache.py Outdated
Comment thread superset/tasks/cache.py
Comment thread superset/config.py Outdated
Comment thread superset/utils/webdriver.py Outdated
Comment thread superset/utils/webdriver.py
Comment thread superset/utils/webdriver.py
…nges

- Fix _auth() to authenticate self._driver in-place instead of creating
  a second, leaked driver (critical bug: persistent driver was never authenticated)
- Replace assert with explicit RuntimeError for driver creation failure
- Fix get_dash_url() to strip trailing slash from WEBDRIVER_BASEURL to
  avoid double-slash URLs (e.g. http://host//superset/dashboard/1/)
- Fix BaseScreenshot.get_screenshot() to call driver.destroy() in a
  try/finally block, preventing Selenium process leaks for one-off screenshots
- Fix webdriver_pool._destroy_driver() to directly call close()/quit()
  on the raw WebDriver since destroy() is now an instance method, not static

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 45.05495% with 50 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.57%. Comparing base (0dbd4c5) to head (6fae0ff).
⚠️ Report is 563 commits behind head on master.

Files with missing lines Patch % Lines
superset/tasks/cache.py 41.02% 23 Missing ⚠️
superset/utils/webdriver.py 58.97% 10 Missing and 6 partials ⚠️
superset/utils/screenshots.py 14.28% 6 Missing ⚠️
superset/mcp_service/screenshot/webdriver_pool.py 0.00% 5 Missing ⚠️

❌ Your project check has failed because the head coverage (99.24%) is below the target coverage (100.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #38449      +/-   ##
==========================================
- Coverage   64.82%   63.57%   -1.25%     
==========================================
  Files        1815     2542     +727     
  Lines       71917   131146   +59229     
  Branches    22915    30070    +7155     
==========================================
+ Hits        46618    83376   +36758     
- Misses      25299    46278   +20979     
- Partials        0     1492    +1492     
Flag Coverage Δ
hive 39.87% <19.78%> (?)
mysql 60.37% <45.05%> (?)
postgres 60.46% <45.05%> (?)
presto 41.65% <19.78%> (?)
python 62.03% <45.05%> (?)
sqlite 60.09% <45.05%> (?)
unit 100.00% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment thread tests/integration_tests/strategy_tests.py Outdated
Comment thread tests/integration_tests/strategy_tests.py Outdated
Comment thread superset/utils/screenshots.py Outdated
Copy link
Copy Markdown
Contributor

@bito-code-review bito-code-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Agent Run #6f0812

Actionable Suggestions - 1
  • superset/utils/webdriver.py - 1
Review Details
  • Files reviewed - 4 · Commit Range: 2ac03e4..776ab3c
    • superset/mcp_service/screenshot/webdriver_pool.py
    • superset/tasks/cache.py
    • superset/utils/screenshots.py
    • superset/utils/webdriver.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Comment thread superset/utils/webdriver.py
rusackas pushed a commit that referenced this pull request Apr 17, 2026
- theming.mdx: document brandAppName theme token (PR #37370) — controls
  app name in browser title/nav/emails, takes precedence over APP_NAME config
- cache.mdx: document SUPERSET_CACHE_WARMUP_USER config key (PR #38449) —
  controls the user account Selenium WebDriver authenticates as for thumbnail
  rendering and cache warmup; update selenium → Selenium capitalization
- security.mdx: document missing SQL Lab RBAC permissions (PR #36263) —
  can_estimate_query_cost and can_format_sql must be explicitly granted
- sql-templating.mdx: document Jinja support in calculated columns (PR #37791)
  with examples; add tip that "Format SQL" is Jinja-aware and dialect-specific
  (PRs #36277, #39393)
- creating-your-first-dashboard.mdx: document dashboard tab URLs (#38660),
  auto-refresh (#37459), "Last queried at" timestamp (#36934), and tab
  selection when saving charts to dashboards (#36332)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
rusackas pushed a commit that referenced this pull request Apr 22, 2026
- theming.mdx: document brandAppName theme token (PR #37370) — controls
  app name in browser title/nav/emails, takes precedence over APP_NAME config
- cache.mdx: document SUPERSET_CACHE_WARMUP_USER config key (PR #38449) —
  controls the user account Selenium WebDriver authenticates as for thumbnail
  rendering and cache warmup; update selenium → Selenium capitalization
- security.mdx: document missing SQL Lab RBAC permissions (PR #36263) —
  can_estimate_query_cost and can_format_sql must be explicitly granted
- sql-templating.mdx: document Jinja support in calculated columns (PR #37791)
  with examples; add tip that "Format SQL" is Jinja-aware and dialect-specific
  (PRs #36277, #39393)
- creating-your-first-dashboard.mdx: document dashboard tab URLs (#38660),
  auto-refresh (#37459), "Last queried at" timestamp (#36934), and tab
  selection when saving charts to dashboards (#36332)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
rusackas and others added 3 commits April 22, 2026 12:35
Assertions can be disabled at runtime, so use an explicit check and
raise instead — matches how driver creation failure is already handled.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
WebDriverPlaywright's get_screenshot still needs the user argument to
authenticate its browser context; without it the Playwright path renders
private dashboards as unauthenticated pages. WebDriverSelenium already
accepts the optional user kwarg and re-authenticates in-place if it
differs from the stored one.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
get_dash_url() now rstrips the trailing slash from WEBDRIVER_BASEURL, so
the test expectations need the same treatment — otherwise a baseurl that
ends in / produces double-slash URLs that no longer match strategy
output. Fixes both test_top_n_dashboards_strategy and
test_dashboard_tags_strategy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@bito-code-review
Copy link
Copy Markdown
Contributor

bito-code-review Bot commented Apr 22, 2026

Code Review Agent Run #7bac54

Actionable Suggestions - 0
Review Details
  • Files reviewed - 3 · Commit Range: 776ab3c..c5475c3
    • superset/utils/webdriver.py
    • superset/utils/screenshots.py
    • tests/integration_tests/strategy_tests.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Warmup ran as "admin" by default, which is the highest-privilege user
in a fresh install. If an operator enables the cache-warmup Celery beat
without explicit configuration, that default silently renders dashboards
as admin — larger blast radius than needed.

Now the default is None, and cache_warmup() returns a clear error
message pointing operators at SUPERSET_CACHE_WARMUP_USER before it
even tries to look up a user. Matches the reviewer's least-privilege
suggestion.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@bito-code-review
Copy link
Copy Markdown
Contributor

bito-code-review Bot commented Apr 23, 2026

Code Review Agent Run #7aff56

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: c5475c3..6fae0ff
    • superset/config.py
    • superset/tasks/cache.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@rusackas
Copy link
Copy Markdown
Member Author

@michael-s-molina — gentle ping when you have a moment: CI is green and I've worked through the open review thread. Happy to make further changes if anything stands out.

1 similar comment
@rusackas
Copy link
Copy Markdown
Member Author

@michael-s-molina — gentle ping when you have a moment: CI is green and I've worked through the open review thread. Happy to make further changes if anything stands out.

@rusackas
Copy link
Copy Markdown
Member Author

@sadpandajoe Good catch, here's the breakdown:

  • tests/integration_tests/tasks/test_cache.py tested fetch_url() in superset/tasks/cache.py. That function is gone in this PR — the WebDriver flow replaces urllib.request-based URL fetching entirely (the dashboard is rendered through a real authenticated browser session, no manual CSRF dance). So the test is correctly removed along with the function it was testing.

  • tests/integration_tests/tasks/test_utils.py tested fetch_csrf_token() in superset/tasks/utils.py. The function still exists in utils.py in this PR — only its sole internal caller (fetch_url) was removed. I'll restore the test in the next push since the function is still part of the codebase / potentially public surface. (Open question for a follow-up: whether fetch_csrf_token should also be removed once we're confident nothing external relies on it — but that's a deprecation conversation, not this PR.)

While digging in I also noticed this branch needs a non-trivial rebase: superset/tasks/cache.py has changed substantially on master since this PR was opened — DashboardTagsStrategy.get_urls() / cache_warmup's URL-based dispatch has been refactored to a chart-level get_tasks() model. Reconciling the WebDriver-auth approach with master's per-chart task dispatch is more than a mechanical merge; want to flag it before doing the architecture call myself. I'll plan to rebase + restore the test_fetch_csrf_token file in the next push once the cache.py reconciliation direction is settled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

change:backend Requires changing the backend doc Namespace | Anything related to documentation preset-io size/XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants