
feat(db): add dynamic schema support for athena #36003

Merged: rusackas merged 2 commits into apache:master from ishmulyan:feature/athena-dynamic-schema on Jan 12, 2026

ishmulyan (Contributor) commented Nov 5, 2025

SUMMARY

This PR adds dynamic schema support to AWS Athena connections.

When an Athena connection is configured with a schema, selecting a different schema in SQL Lab or in the dataset configuration doesn't affect the query.
As a result, the query looks for the table in the schema configured on the connection (not the selected one) and fails with a table-not-found error.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Before:
Screenshot From 2025-11-05 15-42-28

After:
Screenshot From 2025-11-05 15-42-43

TESTING INSTRUCTIONS

Unit tests added.

To manually test:

  • Install the PyAthena module and connect to AWS Athena.
  • Check that you can run a SQL query without specifying the schema in the SQL statement, as long as the schema is selected in the dropdown.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

bito-code-review Bot commented Nov 5, 2025

Code Review Agent Run #4a009f

Actionable Suggestions - 0
Additional Suggestions - 5
  • superset/db_engine_specs/athena.py - 2
    • Unused class method argument catalog parameter · Line 103-103
      The `catalog` parameter in `adjust_engine_params` method is unused. Consider removing it or prefixing with underscore if it's part of an interface.
      Code suggestion
       @@ -102,2 +102,2 @@
                uri: URL,
                connect_args: dict[str, Any],
      -        catalog: str | None = None,
      +        _catalog: str | None = None,
                schema: str | None = None,
    • Unused class method argument connect_args parameter · Line 123-123
      The `connect_args` parameter in `get_schema_from_engine_params` method is unused. Consider removing it or prefixing with underscore if it's part of an interface.
      Code suggestion
       @@ -122,2 +122,2 @@
                sqlalchemy_uri: URL,
      -        connect_args: dict[str, Any],
      +        _connect_args: dict[str, Any],
            ) -> str | None:
  • tests/unit_tests/db_engine_specs/test_athena.py - 3
    • Multi-line docstring summary should start first line · Line 127-127
      The docstring summary should start on the first line instead of having an empty line. This follows PEP 257 conventions for multi-line docstrings.
      Code suggestion
       @@ -126,5 +126,4 @@
       def test_adjust_engine_params() -> None:
      -    """
      -    Test `adjust_engine_params`.
      -
      -    The method can be used to adjust the schema dynamically.
      -    """
      +    """Test `adjust_engine_params`.
      +
      +    The method can be used to adjust the schema dynamically.
      +    """
    • Missing trailing comma in function call · Line 135-135
      Missing trailing comma after the URL argument in `make_url()` call. Multiple similar trailing comma issues exist on lines 164 and 174.
      Code suggestion
       @@ -134,2 +134,2 @@
      -    url = make_url(
      -        "awsathena+rest://athena.us-east-1.amazonaws.com:443/default?s3_staging_dir=s3%3A%2F%2Fathena-staging"
      -    )
      +    url = make_url(
      +        "awsathena+rest://athena.us-east-1.amazonaws.com:443/default?s3_staging_dir=s3%3A%2F%2Fathena-staging",
      +    )
    • One-line docstring should fit single line · Line 156-156
      The docstring should be written as a single line since it's short enough. This addresses both `D200` and `D212` rules for docstring formatting.
      Code suggestion
       @@ -155,4 +155,1 @@
       def test_get_schema_from_engine_params() -> None:
      -    """
      -    Test the ``get_schema_from_engine_params`` method.
      -    """
      +    """Test the ``get_schema_from_engine_params`` method."""
Review Details
  • Files reviewed - 2 · Commit Range: 0930e25..0930e25
    • superset/db_engine_specs/athena.py
    • tests/unit_tests/db_engine_specs/test_athena.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

dosubot Bot added the data:connect:athena (Related to Athena) label Nov 5, 2025

@korbit-ai korbit-ai Bot left a comment

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category: Functionality
Issue: Catalog parameter ignored in URI adjustment

Files scanned:
  • superset/db_engine_specs/athena.py

Comment thread superset/db_engine_specs/athena.py Outdated

Copilot AI left a comment

Pull Request Overview

This PR adds support for dynamic schema selection in AWS Athena by implementing the adjust_engine_params and get_schema_from_engine_params methods in the AthenaEngineSpec class.

  • Adds the supports_dynamic_schema flag to enable dynamic schema changes
  • Implements adjust_engine_params to modify the SQLAlchemy URI based on provided catalog and schema parameters
  • Implements get_schema_from_engine_params to extract the schema from the SQLAlchemy URI
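
The idea behind the two methods listed above can be sketched with the standard library alone. This is a minimal illustration under assumptions, not the code merged in this PR: it uses `urllib.parse` in place of SQLAlchemy's `URL` object, the function names `with_schema` and `schema_from_uri` are hypothetical, and it assumes the Athena URI shape in which the path component holds the schema name.

```python
from typing import Optional
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def with_schema(uri: str, schema: Optional[str]) -> str:
    """Return `uri` with its path component replaced by `schema`.

    Hypothetical stand-in for the idea behind `adjust_engine_params`:
    the host, port, and query string are left untouched.
    """
    if not schema:
        return uri  # nothing selected, keep the configured schema
    parts = urlsplit(uri)
    return urlunsplit(parts._replace(path="/" + quote(schema, safe="")))


def schema_from_uri(uri: str) -> Optional[str]:
    """Return the schema stored in the path component, or None if absent.

    Hypothetical stand-in for the idea behind `get_schema_from_engine_params`.
    urlsplit yields an empty string (never None) for a missing path,
    so this sketch maps the empty case to None itself.
    """
    path = urlsplit(uri).path.lstrip("/")
    return unquote(path) or None


configured = (
    "awsathena+rest://athena.us-east-1.amazonaws.com:443/default"
    "?s3_staging_dir=s3%3A%2F%2Fathena-staging"
)
adjusted = with_schema(configured, "analytics")
print(schema_from_uri(configured))  # default
print(schema_from_uri(adjusted))    # analytics
```

Only the path segment changes, so the staging directory and any other query parameters on the configured connection carry over to the adjusted URI.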

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File / Description
  • superset/db_engine_specs/athena.py: Adds dynamic schema support by implementing adjust_engine_params and get_schema_from_engine_params methods, and setting the supports_dynamic_schema flag
  • tests/unit_tests/db_engine_specs/test_athena.py: Adds comprehensive test coverage for the new methods with various URL scenarios


awsathena+rest://athena.{region_name}.amazonaws.com:443/{schema_name}?catalog_name={catalog_name}&s3_staging_dir={s3_staging_dir}
"""
return sqlalchemy_uri.database

Copilot AI Nov 5, 2025

The get_schema_from_engine_params method should handle the case where sqlalchemy_uri.database is None or an empty string. Without this check, accessing .database on a URL without a database component could lead to issues. Consider adding a guard to return None if database is falsy, similar to how Presto and Snowflake handle this (they check if "/" not in database).

Suggested change
-        return sqlalchemy_uri.database
+        database = sqlalchemy_uri.database
+        if not database:
+            return None
+        return database

Contributor Author

If sqlalchemy_uri.database is None, None is already returned, so there is no need to return None explicitly. No additional checks or guards are needed.
Presto and Snowflake check whether database contains / because they use different SQLAlchemy connection strings, so their database can look like {catalog}/{schema}. If sqlalchemy_uri.database doesn't contain /, no schema was provided and None is returned.
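
The distinction drawn in this reply can be illustrated with a small sketch. It is a simplified stand-in, not Superset's code: both function names are hypothetical, and the functions take the URL's database component as a plain string rather than a SQLAlchemy `URL`.

```python
from typing import Optional


def schema_from_plain_database(database: Optional[str]) -> Optional[str]:
    """Athena-style URI: the database component is the schema itself."""
    return database  # None passes through unchanged


def schema_from_catalog_path(database: Optional[str]) -> Optional[str]:
    """Presto/Snowflake-style URI: the database looks like catalog/schema."""
    if database is None or "/" not in database:
        return None  # only a catalog (or nothing) was given
    return database.split("/", 1)[1]


print(schema_from_plain_database("default"))     # default
print(schema_from_plain_database(None))          # None
print(schema_from_catalog_path("hive/default"))  # default
print(schema_from_catalog_path("hive"))          # None
```

In the catalog/schema layout the `/` check is what tells a bare catalog apart from a catalog plus schema; in the Athena layout there is nothing to split, which is the point the reply makes.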


awsathena+rest://athena.{region_name}.amazonaws.com:443/{schema_name}?catalog_name={catalog_name}&s3_staging_dir={s3_staging_dir}
"""
return sqlalchemy_uri.database

Copilot AI Nov 5, 2025

Consider adding explicit handling for empty strings to match the pattern used in similar implementations. The current code returns the database attribute directly, but if the database is an empty string, it should return None instead. Add: return sqlalchemy_uri.database or None to ensure consistent behavior.

Suggested change
-        return sqlalchemy_uri.database
+        return sqlalchemy_uri.database or None
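
For reference, the `or None` idiom in this suggestion behaves as follows on plain Python values. The helper name `normalize` is hypothetical, and the sketch says nothing about whether a parsed URL can actually carry an empty database component.

```python
from typing import Optional


def normalize(database: Optional[str]) -> Optional[str]:
    """Map an empty string to None; leave other values as they are."""
    return database or None


print(repr(normalize("default")))  # 'default'
print(repr(normalize("")))         # None
print(repr(normalize(None)))       # None
```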

Contributor Author

I don't see that pattern elsewhere in the codebase and don't think it's needed. Could you point me to the code?

codecov Bot commented Nov 5, 2025

Codecov Report

❌ Patch coverage is 50.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.73%. Comparing base (04231c8) to head (6848dd4).
⚠️ Report is 479 commits behind head on master.

Files with missing lines Patch % Lines
superset/db_engine_specs/athena.py 50.00% 6 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #36003       +/-   ##
===========================================
+ Coverage        0   68.73%   +68.73%     
===========================================
  Files           0      622      +622     
  Lines           0    45730    +45730     
  Branches        0     4977     +4977     
===========================================
+ Hits            0    31434    +31434     
- Misses          0    13051    +13051     
- Partials        0     1245     +1245     
Flag Coverage Δ
hive 44.20% <50.00%> (?)
mysql 67.81% <50.00%> (?)
postgres 67.86% <50.00%> (?)
presto 47.76% <50.00%> (?)
python 68.70% <50.00%> (?)
sqlite 67.48% <50.00%> (?)
unit 100.00% <ø> (?)

rusackas (Member) commented

CC @betodealmeida

betodealmeida (Member) left a comment

Awesome!

@rusackas rusackas merged commit fac5d2b into apache:master Jan 12, 2026
73 of 74 checks passed
jesperct pushed a commit to jesperct/superset that referenced this pull request Jan 19, 2026
qfcwell pushed a commit to qfcwell/superset that referenced this pull request May 12, 2026

4 participants