Add diagnostics for stuck FTE scheduler#19879
Merged
losipiuk merged 7 commits intotrinodb:masterfrom Nov 28, 2023
findepi
approved these changes
Nov 24, 2023
.../java/io/trino/execution/scheduler/faulttolerant/EventDrivenFaultTolerantQueryScheduler.java
67fb915 to
c4ab707
Compare
Member
Author
Updated. @findepi, @wweiss-starburst PTAL
c4ab707 to
885f11a
Compare
findepi
reviewed
Nov 27, 2023
 {
     try {
-        Event event = eventQueue.poll(1, MINUTES);
+        Event event = eventQueue.poll(EVENT_PROCESSING_ENFORCED_FREQUENCY.toMillis(), MILLISECONDS);
Member
or call it EVENT_PROCESSING_ENFORCED_FREQUENCY_MILLIS
9100c98 to
903b8ea
Compare
Member
Author
Some updates: unfortunately, we cannot simply filter out empty SplitAssignmentEvents.
losipiuk
commented
Nov 27, 2023
Comment on lines 1417 to 1673
903b8ea to
94bbc08
Compare
findepi
reviewed
Nov 28, 2023
.../java/io/trino/execution/scheduler/faulttolerant/EventDrivenFaultTolerantQueryScheduler.java
core/trino-main/src/main/java/io/trino/execution/scheduler/faulttolerant/SplitAssigner.java
findepi
approved these changes
Nov 28, 2023
.../java/io/trino/execution/scheduler/faulttolerant/EventDrivenFaultTolerantQueryScheduler.java
core/trino-spi/src/main/java/io/trino/spi/exchange/ExchangeSourceOutputSelector.java
findepi
reviewed
Nov 28, 2023
.../java/io/trino/execution/scheduler/faulttolerant/EventDrivenFaultTolerantQueryScheduler.java
Also add default implementations for EventListener methods, each of which delegates to the appropriate intermediate method according to the class hierarchy.
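A minimal sketch of that delegation pattern, not the code from this PR: the event classes and method names below are hypothetical, chosen only to show how a default method for a leaf event type can fall back to the method for its parent type, so a listener may override at whichever level it cares about.

```java
import java.util.ArrayList;
import java.util.List;

public class EventListenerSketch
{
    // Hypothetical event hierarchy; these names are illustrative only.
    public interface Event {}

    public static class SplitAssignmentEvent
            implements Event {}

    public static class RemoteTaskEvent
            implements Event {}

    public static class RemoteTaskCompletedEvent
            extends RemoteTaskEvent {}

    public interface EventListener
    {
        // Catch-all for any event that no more specific method handles.
        default void onEvent(Event event) {}

        // Intermediate level: falls back to the catch-all.
        default void onRemoteTaskEvent(RemoteTaskEvent event)
        {
            onEvent(event);
        }

        // Leaf level: falls back to the intermediate method.
        default void onRemoteTaskCompleted(RemoteTaskCompletedEvent event)
        {
            onRemoteTaskEvent(event);
        }

        // Leaf level directly under Event: falls back to the catch-all.
        default void onSplitAssignment(SplitAssignmentEvent event)
        {
            onEvent(event);
        }
    }

    // A listener that overrides only the intermediate method.
    public static class RecordingListener
            implements EventListener
    {
        public final List<String> seen = new ArrayList<>();

        @Override
        public void onRemoteTaskEvent(RemoteTaskEvent event)
        {
            seen.add(event.getClass().getSimpleName());
        }
    }
}
```

With this shape, a listener that overrides only `onRemoteTaskEvent` still receives completed-task events, while split-assignment events fall through to the no-op `onEvent`.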
04b1d30 to
c6eb8c6
Compare
Add code that dumps debug information to the log when the FTE scheduler has not received any events for 10 minutes. This is to track down a rare bug where queries running with retry_policy set to TASK sometimes get stuck.
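The idea can be sketched as follows. This is an illustration under stated assumptions, not the scheduler's actual code: the class, its fields, the string event type, and `dumpDiagnostics` are all hypothetical. The loop polls the queue with a bounded timeout so that it wakes up even when idle, and dumps diagnostics once the time since the last event reaches a threshold.

```java
import java.time.Duration;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.LongSupplier;

import static java.util.concurrent.TimeUnit.MILLISECONDS;

public class StuckSchedulerSketch
{
    private final BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();
    private final Duration pollTimeout;     // how long a single poll may block
    private final Duration stuckThreshold;  // idle time after which we dump diagnostics
    private final LongSupplier clockMillis; // injectable clock, to make the sketch testable
    private long lastEventMillis;
    private int diagnosticsDumped;

    public StuckSchedulerSketch(Duration pollTimeout, Duration stuckThreshold, LongSupplier clockMillis)
    {
        this.pollTimeout = pollTimeout;
        this.stuckThreshold = stuckThreshold;
        this.clockMillis = clockMillis;
        this.lastEventMillis = clockMillis.getAsLong();
    }

    public void submit(String event)
    {
        eventQueue.add(event);
    }

    public int getDiagnosticsDumped()
    {
        return diagnosticsDumped;
    }

    // Runs one loop iteration; returns true if an event was processed.
    public boolean processOnce()
    {
        String event;
        try {
            event = eventQueue.poll(pollTimeout.toMillis(), MILLISECONDS);
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        long now = clockMillis.getAsLong();
        if (event != null) {
            lastEventMillis = now;
            return true;
        }
        long idleMillis = now - lastEventMillis;
        if (idleMillis >= stuckThreshold.toMillis()) {
            dumpDiagnostics(idleMillis);
            lastEventMillis = now; // reset so we do not dump on every iteration
        }
        return false;
    }

    // Hypothetical hook; a real scheduler would log its internal state here.
    private void dumpDiagnostics(long idleMillis)
    {
        diagnosticsDumped++;
        System.err.println("No events for " + idleMillis + " ms; dumping scheduler state");
    }
}
```

Resetting `lastEventMillis` after a dump is one design choice among several; it limits output to one dump per threshold period while the scheduler stays idle.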
c6eb8c6 to
f4d34fc
Compare
Add code that dumps debug information to the log when the FTE scheduler has not received any events for 10 minutes. This is to track down a rare bug where queries running with retry_policy set to TASK sometimes get stuck.
TODO: