Skip to content

fix: trino thread app missing full context#29981

Merged
dpgaspar merged 2 commits intoapache:masterfrom
preset-io:danielgaspar/sc-82301/it-s-not-possible-to-use-trino-db-connections
Aug 22, 2024
Merged

fix: trino thread app missing full context#29981
dpgaspar merged 2 commits intoapache:masterfrom
preset-io:danielgaspar/sc-82301/it-s-not-possible-to-use-trino-db-connections

Conversation

@dpgaspar
Copy link
Member

@dpgaspar dpgaspar commented Aug 21, 2024

SUMMARY

Fixes the issue where Trino SQL execution was missing Flask's g values in the context.
Flask's requests and g are local to the request thread or greenlet, by creating a new thread we loose this context.

This PR passes a copy of g into the thread, g is then recreated inside the thread, this g will be local to the thread but will contain all the necessary context of the current flask request

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@codecov
Copy link

codecov bot commented Aug 21, 2024

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.64%. Comparing base (76d897e) to head (6ed299d).
⚠️ Report is 2419 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #29981       +/-   ##
===========================================
+ Coverage   60.48%   83.64%   +23.16%     
===========================================
  Files        1931      528     -1403     
  Lines       76236    38150    -38086     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    31912    -14202     
+ Misses      28017     6238    -21779     
+ Partials     2105        0     -2105     
Flag Coverage Δ
hive ∅ <33.33%> (∅)
javascript ?
mysql ∅ <33.33%> (?)
postgres ∅ <33.33%> (?)
presto ∅ <33.33%> (∅)
python ∅ <100.00%> (∅)
sqlite ∅ <33.33%> (?)
unit ∅ <100.00%> (∅)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@dpgaspar dpgaspar marked this pull request as ready for review August 21, 2024 12:04
@dosubot dosubot bot added data:connect Namespace | Anything related to db connections / integrations data:connect:trino Related to Trino labels Aug 21, 2024
# from the parent thread,
# meaning the g object and other context-bound variables are not
# accessible
with app.app_context():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dpgaspar I did a quick GPT search.

from flask import current_app
import threading

def my_thread():
    with current_app.app_context():
        # Access Flask context within the thread
        app = current_app._get_current_object()
        # Do something with the app

# Create and start the thread
thread = threading.Thread(target=my_thread)
thread.start()

Would something like this work for our use cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems basically similar, except I'm passing the current_app into the thread, also we need to recreate g inside the thread

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we use copy_current_request_context?

@copy_current_request_context
def worker():
    # Access Flask's g object here
    # Your code goes here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only the request context is copied according to: pallets/flask#3306

Copy link
Contributor

@joaoferrao joaoferrao Aug 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed not including @ copy_current_request_context makes the thread's downstream calls url_for function calls fail with the screenshot here #27631 (comment)

I found this out in the context adapting #27631 for Trino. My understanding is that, when using OAuth2 config, the class OAuth2ClientConfigSchema will be called in a separated thread and has this missing information.

@dpgaspar
Copy link
Member Author

Thank you for the review @michael-s-molina

@sadpandajoe sadpandajoe added the v4.1 Label added by the release manager to track PRs to be included in the 4.1 branch label Aug 22, 2024
@dpgaspar dpgaspar merged commit 4d821f4 into apache:master Aug 22, 2024
@dpgaspar dpgaspar deleted the danielgaspar/sc-82301/it-s-not-possible-to-use-trino-db-connections branch August 22, 2024 17:01
sadpandajoe pushed a commit that referenced this pull request Aug 22, 2024
@github-actions github-actions bot added 🍒 4.1.0 Cherry-picked to 4.1.0 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels labels Nov 14, 2024
@mistercrunch mistercrunch added the 🍒 4.1.1 Cherry-picked to 4.1.1 label Nov 27, 2024
@github-actions github-actions bot added the 🍒 4.1.2 Cherry-picked to 4.1.2 label Apr 1, 2025
@mistercrunch mistercrunch added 🍒 4.1.3 Cherry-picked to 4.1.3 🚢 5.0.0 First shipped in 5.0.0 labels Jul 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels data:connect:trino Related to Trino data:connect Namespace | Anything related to db connections / integrations size/M v4.1 Label added by the release manager to track PRs to be included in the 4.1 branch 🍒 4.1.0 Cherry-picked to 4.1.0 🍒 4.1.1 Cherry-picked to 4.1.1 🍒 4.1.2 Cherry-picked to 4.1.2 🍒 4.1.3 Cherry-picked to 4.1.3 🍒 4.1.4 🚢 5.0.0 First shipped in 5.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants