Ensure taskInfo finalized on permanent worker failure #20021

Merged
losipiuk merged 1 commit into trinodb:master from losipiuk:lo/fix-stuck-aborting
Dec 5, 2023

Conversation

Member

@losipiuk losipiuk commented Dec 5, 2023

Description

Even though the state in taskStatus is already Done, the same may not yet be true for taskInfo.
We could have received a valid response from taskStatusFetcher just before the worker went
down, while taskInfoFetcher still holds the old state.
Update taskInfo so that the task does not get stuck in FTE execution mode, which depends on
the final task info being delivered.
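
The race described above can be illustrated with a minimal, self-contained sketch. This is not the code from this PR or Trino's actual `HttpRemoteTask`; the class `FinalTaskInfoSketch`, its nested types, and its method names are all hypothetical and exist only to show the idea: two independently updated views of a task (status and info), and a failure handler that makes sure the info view also reaches a final state.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified model; none of these names come from Trino.
class FinalTaskInfoSketch {
    enum TaskState {
        RUNNING, FINISHED, FAILED;

        boolean isDone() {
            return this != RUNNING;
        }
    }

    // Stand-in for the richer task info snapshot held by the info fetcher.
    static final class TaskInfo {
        final TaskState state;

        TaskInfo(TaskState state) {
            this.state = state;
        }
    }

    static final class RemoteTask {
        // Updated by the (hypothetical) status fetcher.
        private final AtomicReference<TaskState> taskStatus =
                new AtomicReference<>(TaskState.RUNNING);
        // Updated by the (hypothetical) info fetcher; may lag behind taskStatus.
        private final AtomicReference<TaskInfo> taskInfo =
                new AtomicReference<>(new TaskInfo(TaskState.RUNNING));

        void onStatusResponse(TaskState state) {
            taskStatus.set(state);
        }

        // Called when the worker is considered permanently gone.
        void onPermanentWorkerFailure() {
            TaskState status = taskStatus.get();
            if (!status.isDone()) {
                // The worker died before the task finished: mark it failed.
                status = TaskState.FAILED;
                taskStatus.set(status);
            }
            // Status may already be done while info is stale, so finalize
            // info as well; no further responses will arrive to do it.
            if (!taskInfo.get().state.isDone()) {
                taskInfo.set(new TaskInfo(status));
            }
        }

        boolean hasFinalTaskInfo() {
            return taskInfo.get().state.isDone();
        }

        TaskState taskInfoState() {
            return taskInfo.get().state;
        }
    }

    public static void main(String[] args) {
        RemoteTask task = new RemoteTask();
        task.onStatusResponse(TaskState.FINISHED); // status arrives, info does not
        System.out.println("final info before failure handling: " + task.hasFinalTaskInfo());
        task.onPermanentWorkerFailure();
        System.out.println("final info after failure handling: " + task.hasFinalTaskInfo());
    }
}
```

In this sketch a consumer that waits on `hasFinalTaskInfo()` would wait forever without the second `if` in `onPermanentWorkerFailure()`, which mirrors the stuck-task symptom described above.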

Fixes #18603

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Section
* Fix a bug where a query could hang with `retry.policy` set to `TASK` when a Trino worker node died. ({issue}`18603`)

Member

@sopel39 sopel39 left a comment

nit: maybe add a test in TestHttpRemoteTask?

Development

Successfully merging this pull request may close these issues.

FTE stage gets stuck in Pending

3 participants