feat(db): add dynamic schema support for athena#36003
Conversation
Code Review Agent Run #4a009fActionable Suggestions - 0Additional Suggestions - 5
Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
There was a problem hiding this comment.
Review by Korbit AI
Korbit automatically attempts to detect when you fix issues in new commits.
| Category | Issue | Status |
|---|---|---|
| Catalog parameter ignored in URI adjustment ▹ view |
Files scanned
| File Path | Reviewed |
|---|---|
| superset/db_engine_specs/athena.py | ✅ |
Explore our documentation to understand the languages and file types we support and the files we ignore.
Check out our docs on how you can make Korbit work best for you and your team.
There was a problem hiding this comment.
Pull Request Overview
This PR adds support for dynamic schema selection in AWS Athena by implementing the adjust_engine_params and get_schema_from_engine_params methods in the AthenaEngineSpec class.
- Adds the
supports_dynamic_schemaflag to enable dynamic schema changes - Implements
adjust_engine_paramsto modify the SQLAlchemy URI based on provided catalog and schema parameters - Implements
get_schema_from_engine_paramsto extract the schema from the SQLAlchemy URI
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| superset/db_engine_specs/athena.py | Adds dynamic schema support by implementing adjust_engine_params and get_schema_from_engine_params methods, and setting the supports_dynamic_schema flag |
| tests/unit_tests/db_engine_specs/test_athena.py | Adds comprehensive test coverage for the new methods with various URL scenarios |
|
|
||
| awsathena+rest://athena.{region_name}.amazonaws.com:443/{schema_name}?catalog_name={catalog_name}&s3_staging_dir={s3_staging_dir} | ||
| """ | ||
| return sqlalchemy_uri.database |
There was a problem hiding this comment.
The get_schema_from_engine_params method should handle the case where sqlalchemy_uri.database is None or an empty string. Without this check, accessing .database on a URL without a database component could lead to issues. Consider adding a guard to return None if database is falsy, similar to how Presto and Snowflake handle this (they check if \"/\" not in database).
| return sqlalchemy_uri.database | |
| database = sqlalchemy_uri.database | |
| if not database: | |
| return None | |
| return database |
There was a problem hiding this comment.
In case sqlalchemy_uri.database is None the None is returned, no need to return None explicitly. No additional checks or guards needed.
Presto and Snowflake checks whether database contains /. Because they have different sqlalchemy connection strings and as a result a database could look like {caatalog}/{schema}. In case sqlalchemy_uri.database doesn't contain / it means no schemas provided and None is returned.
|
|
||
| awsathena+rest://athena.{region_name}.amazonaws.com:443/{schema_name}?catalog_name={catalog_name}&s3_staging_dir={s3_staging_dir} | ||
| """ | ||
| return sqlalchemy_uri.database |
There was a problem hiding this comment.
Consider adding explicit handling for empty strings to match the pattern used in similar implementations. The current code returns the database attribute directly, but if the database is an empty string, it should return None instead. Add: return sqlalchemy_uri.database or None to ensure consistent behavior.
| return sqlalchemy_uri.database | |
| return sqlalchemy_uri.database or None |
There was a problem hiding this comment.
I do not see the same pattern over the codebase and don't think it's needed. Could you point me out to the code?
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #36003 +/- ##
===========================================
+ Coverage 0 68.73% +68.73%
===========================================
Files 0 622 +622
Lines 0 45730 +45730
Branches 0 4977 +4977
===========================================
+ Hits 0 31434 +31434
- Misses 0 13051 +13051
- Partials 0 1245 +1245
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
SUMMARY
This PR adds dynamic schema support to AWS Athena connection.
When Athena connection is configured to some schema selecting different schema in SQL Lab or Dataset configuration doesn't affect the query.
As a result it ends up looking for the table in the connection configured schema (not in selected) and fails with table not found error.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Before:

After:

TESTING INSTRUCTIONS
Unit tests added.
To manually test:
ADDITIONAL INFORMATION