feat(sqllab): use sqlglot instead of sqlparse #33542

Merged
betodealmeida merged 4 commits into master from sqlglot-in-sqllab
May 30, 2025
Conversation

@betodealmeida
Member

@betodealmeida betodealmeida commented May 20, 2025

SUMMARY

Part of #26786, stacked on:

Hooks SQL Lab up to the new SQLScript instead of ParsedQuery. I also cleaned up the execute_sql_statement function, which did far more than execute a SQL statement: it also handled RLS modifications, adding limits, CTAS/CVAS, etc.
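
To illustrate the kind of separation described above, here is a minimal sketch in which each rewrite is its own small step and the execute function only executes. It is not the code from this PR: every name (add_limit, as_ctas, prepare, execute) is hypothetical, and plain string wrapping against an in-memory SQLite database is used only to keep the sketch dependency-free, whereas the PR works on statements parsed by sqlglot.

```python
import sqlite3
from typing import Callable, Iterable

# Each step takes SQL text and returns new SQL text.
# All names here are hypothetical and exist only for this illustration.
Transform = Callable[[str], str]


def add_limit(limit: int) -> Transform:
    """Wrap the query in a subquery so a row limit is always enforced."""
    return lambda sql: f"SELECT * FROM ({sql}) AS limited LIMIT {limit}"


def as_ctas(table: str) -> Transform:
    """Turn a SELECT into a CREATE TABLE ... AS SELECT."""
    return lambda sql: f"CREATE TABLE {table} AS {sql}"


def prepare(sql: str, transforms: Iterable[Transform]) -> str:
    """Apply every rewrite step in order, before anything is executed."""
    for transform in transforms:
        sql = transform(sql)
    return sql


def execute(conn: sqlite3.Connection, sql: str) -> list:
    """Only executes; it performs no rewriting of its own."""
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE orders (order_id INTEGER);"
    "INSERT INTO orders VALUES (101), (102), (103);"
)

# A plain SELECT with a limit applied as a separate step.
final_sql = prepare("SELECT order_id FROM orders", [add_limit(2)])
rows = execute(conn, final_sql)

# The same SELECT, limited and then materialized as a table.
ctas_sql = prepare(
    "SELECT order_id FROM orders", [add_limit(2), as_ctas("tmp_orders")]
)
execute(conn, ctas_sql)
ctas_count = execute(conn, "SELECT COUNT(*) FROM tmp_orders")[0][0]
```

Keeping the rewrites outside execute makes each one testable on its own, which is the readability benefit the cleanup is after.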

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@betodealmeida betodealmeida marked this pull request as draft May 20, 2025 14:35
@korbit-ai korbit-ai bot left a comment

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category | Issue | Status
Functionality | Missing Predicate Parsing for Kusto KQL | 🧠 Not in scope
Performance | Uncached RLS Predicate Lookups | 🧠 Incorrect
Performance | Inefficient Repeated Predicate Parsing | ✅ Fix detected
Files scanned
File Path Reviewed
superset/exceptions.py
superset/sql/parse.py
superset/sql_lab.py

@betodealmeida betodealmeida force-pushed the sqlglot-in-sqllab branch 7 times, most recently from e913e28 to 71eed5a Compare May 21, 2025 23:20
@betodealmeida betodealmeida changed the title from "WIP" to "chore(sqllab): use sqlglot instead of sqlparse" May 21, 2025
@betodealmeida betodealmeida marked this pull request as ready for review May 21, 2025 23:26
@dosubot dosubot bot added the sqllab Namespace | Anything related to the SQL Lab label May 21, 2025
@betodealmeida betodealmeida changed the title from "chore(sqllab): use sqlglot instead of sqlparse" to "feat(sqllab): use sqlglot instead of sqlparse" May 21, 2025

@korbit-ai korbit-ai bot left a comment

I've completed my review and didn't find any issues.

Files scanned
File Path Reviewed
superset/sqllab/sqllab_execution_context.py
superset/exceptions.py
superset/models/sql_lab.py
superset/sql/parse.py
superset/sql_lab.py

Base automatically changed from sqlglot-ctas-cvas to master May 28, 2025 13:46
Comment on lines +467 to +469
if is_feature_enabled("RLS_IN_SQLLAB"):
    for statement in parsed_script.statements:
        apply_rls(query, statement)
Contributor

Is RLS going to be applied to a CTAS/CVAS statement that's executed later on as well? I didn't know we would inject the predicates into those as well.

Contributor

(Not saying this shouldn't happen, just genuinely asking)

Member Author

It should, right? Otherwise a user could bypass RLS by running a CTAS and looking at the table they created.
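
The bypass described here can be reproduced with a small, self-contained sketch. It uses an in-memory SQLite database rather than Superset itself, and the table, the predicate, and the with_rls helper are all hypothetical; the naive string replacement stands in for the parser-based rewriting and is not how the PR implements apply_rls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, region TEXT);
    INSERT INTO orders VALUES (101, 'EU'), (102, 'US'), (103, 'US');
    """
)

# Hypothetical RLS rule: this user may only see EU orders.
RLS_PREDICATE = "region = 'EU'"


def with_rls(select_sql: str) -> str:
    """Naive illustration: swap the table for a filtered subquery.

    A real implementation would rewrite the parsed statement instead of
    doing string replacement; this is only enough for the demo query.
    """
    filtered = f"FROM (SELECT * FROM orders WHERE {RLS_PREDICATE}) AS orders"
    return select_sql.replace("FROM orders", filtered)


inner = "SELECT * FROM orders"

# CTAS without RLS: the new table holds rows the user should never see.
conn.execute(f"CREATE TABLE leaked AS {inner}")

# CTAS with RLS applied to the inner SELECT: only permitted rows are copied.
conn.execute(f"CREATE TABLE filtered AS {with_rls(inner)}")

leaked_rows = conn.execute("SELECT COUNT(*) FROM leaked").fetchone()[0]
filtered_rows = conn.execute("SELECT COUNT(*) FROM filtered").fetchone()[0]
```

In this sketch the unfiltered CTAS copies all three rows while the filtered one copies a single row, which is why the predicate has to be injected before the CTAS runs.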

Member

@mistercrunch mistercrunch left a comment

LGTM, improving coverage is great in this area!

@Vitor-Avila
Contributor

hey @betodealmeida I've been manually testing this and noticed one issue (could be specific to MS SQL Server). I created an RLS rule that's 1=1 and applied it to my account (with the RLS_IN_SQLLAB feature flag enabled), and I'm getting an error when trying to do a simple select *:

image

It does work on master, but could be MSSQL specific

@Vitor-Avila
Contributor

Ok, it's not only with 1=1. I just created a new RLS rule with OrderID = 101 and the same error happens:

image

Full stack trace:

2025-05-30 15:47:38,584:WARNING:superset.views.error_handling:SupersetErrorsException
Traceback (most recent call last):
  File "/Users/vitoravila/.pyenv/versions/3.11.10/envs/superset-oss/lib/python3.11/site-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vitoravila/.pyenv/versions/3.11.10/envs/superset-oss/lib/python3.11/site-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vitoravila/.pyenv/versions/3.11.10/envs/superset-oss/lib/python3.11/site-packages/flask_appbuilder/security/decorators.py", line 109, in wraps
    return f(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vitoravila/code/superset/superset/views/base_api.py", line 120, in wraps
    duration, response = time_function(f, self, *args, **kwargs)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vitoravila/code/superset/superset/utils/core.py", line 1371, in time_function
    response = func(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vitoravila/code/superset/superset/views/base_api.py", line 92, in wraps
    return f(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vitoravila/code/superset/superset/utils/log.py", line 304, in wrapper
    value = f(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^
  File "/Users/vitoravila/code/superset/superset/sqllab/api.py", line 406, in execute_sql_query
    command_result: CommandResult = command.run()
                                    ^^^^^^^^^^^^^
  File "/Users/vitoravila/code/superset/superset/utils/decorators.py", line 271, in wrapped
    return on_error(ex)
           ^^^^^^^^^^^^
  File "/Users/vitoravila/code/superset/superset/utils/decorators.py", line 236, in on_error
    raise ex
  File "/Users/vitoravila/code/superset/superset/utils/decorators.py", line 264, in wrapped
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vitoravila/code/superset/superset/commands/sql_lab/execute.py", line 105, in run
    status = self._run_sql_json_exec_from_scratch()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vitoravila/code/superset/superset/commands/sql_lab/execute.py", line 158, in _run_sql_json_exec_from_scratch
    return self._sql_json_executor.execute(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vitoravila/code/superset/superset/sqllab/sql_json_executer.py", line 110, in execute
    raise SupersetErrorsException(
superset.exceptions.SupersetErrorsException: [SupersetError(message='Boolean value of this clause is not defined', error_type=<SupersetErrorType.GENERIC_DB_ENGINE_ERROR: 'GENERIC_DB_ENGINE_ERROR'>, level=<ErrorLevel.ERROR: 'error'>, extra={'engine_name': 'Microsoft SQL Server', 'issue_codes': [{'code': 1002, 'message': 'Issue 1002 - The database returned an unexpected error.'}]})]
2025-05-30 15:47:38,586:INFO:werkzeug:127.0.0.1 - - [30/May/2025 15:47:38] "POST /api/v1/sqllab/execute/ HTTP/1.1" 500 -

Screenshot from master:
image

@betodealmeida
Member Author

@Vitor-Avila nice catch! Just fixed it and added 2 unit tests to cover it.

@betodealmeida betodealmeida merged commit cf31538 into master May 30, 2025
46 checks passed
@betodealmeida betodealmeida deleted the sqlglot-in-sqllab branch May 30, 2025 21:08
@Vitor-Avila
Contributor

that's awesome! Thank you so much for these improvements 🙏 the flow is much easier to follow now

LevisNgigi pushed a commit to LevisNgigi/superset that referenced this pull request Jun 18, 2025
@github-actions github-actions bot added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 6.0.0 First shipped in 6.0.0 labels Dec 18, 2025
Labels

🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels preset-io size/XXL sqllab Namespace | Anything related to the SQL Lab 🚢 6.0.0 First shipped in 6.0.0

3 participants