-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use selectinload
in trigger
#40487
Use selectinload
in trigger
#40487
Conversation
3105adb
to
5a85601
Compare
5a85601
to
06ba5bb
Compare
The spellcheck is not happy with "selectin". Could you add it to |
Looks fantastic otherwise! |
6f5e60a
to
0a2fe73
Compare
Is there a metric where we can write a test against to ensure this query is optimised? |
Our Airflow cluster is hit pretty badly by this issue, are there any blockers to merging this? Any way we can contribute? |
0a2fe73
to
2787b5b
Compare
@thirtyseven We have been using this fix in our Airflow deployments for a while now, have not seen any issues. Provided the database plan comparison above, but we need some guidance on how to implement unit test for performance fixes like this. Would like to get this PR reviewed and merged too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am good with this. We don't really have a performance testing framework in OSS. But we have some amount of QA downstream. And I looked at the queries and I'm not surprised that the "before" one causes trouble.
Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions. |
Thanks for merging, any chance this can be cherry picked into the next patch release? |
@ephraimbuddy What do you think? Can we consider it as a bug fix? It is an performance improvement fix |
Ok. So, I rushed in and created a new label for things we need to backport to Airflow 2. However, we shouldn't backport this one to Airflow 2. It's an improvement change. |
From some comments, should we could consider it as a bug fix? #40487 (comment) |
The issue makes deferred tasks unusable when the task_instance table grows too large, I'd personally consider it a bug fix. |
I'm sceptical about getting it into Airflow 2, even with the comments, but if we feel sure about it, let's backport it. cc @jedcunningham |
I'd say bugfix for this one. It fixes a real problem vs just optimizing something that isn't problematic. |
+1 on requesting this as a bugfix - It has caused our Trigger-based processes to become unusable in production on a few occasions already, and while we do have a workaround (regularly running |
@ephraimbuddy from the feedbacks from others I added back the label "needs backport to 2". How's that work? We no longer do the backport now but the release manager will do it before the next 2.10 release? |
Hi @vincbeck , we do the backport. Feel free to backport it. I want to keep a tab of things we should have backported with the label. That's the only purpose of the label, and we will use it in auto-backporting bot, too |
Sounds good! I'll do it :) |
(cherry picked from commit 46b41e3)
(cherry picked from commit 46b41e3) Co-authored-by: Joseph Ang <[email protected]>
(cherry picked from commit 46b41e3) Co-authored-by: Joseph Ang <[email protected]>
(cherry picked from commit 46b41e3) Co-authored-by: Joseph Ang <[email protected]>
closes: #33647
As mentioned by @arunravimv in #33647, we have added this patch to our own Airflow deployment and have noticed improvements in triggerer performance.
Following are the
Explain Analyze
outputs for the two SQL Alchemy relationship loading strategiesTriggerer Process
Using
joinedload
in bulk_fetch methodUsing
selectinload
in bulk_fetch methodtriggerview/list API
Using
joined
for relationship between TaskInstance and TriggerUsing
selectin
for relationship between TaskInstance and TriggerFrom above
Explain Analyze
results, we can see that usingselectinload
is gives more optimal performance for triggerer process as well as the triggerview list api.