Add progress bar support for fault tolerant execution#16036
Add progress bar support for fault tolerant execution#16036losipiuk merged 1 commit intotrinodb:masterfrom
Conversation
There was a problem hiding this comment.
This feels like a bit to much for this single usecase. Maybe instead let's have a flag if query is using FTE and a counter how many stages are already scheduled?
There was a problem hiding this comment.
I have thought about that. However, that may cause confusion to the end users as we won't be able to know whether we are using FTE here
There was a problem hiding this comment.
I do not think it is a big deal that we always return false here.
We can also make value OptionalBoolean.
Or we can have a String retryMode field where we can put
NONE, QUERY, TASK and UNKNOWN. We can then use UNKNOWN in this context
b60577b to
27ecf31
Compare
losipiuk
left a comment
There was a problem hiding this comment.
The formulas looks like. Let's settle on how we mark query as FTE in stats.
d8a47d9 to
66958fb
Compare
There was a problem hiding this comment.
nit: maybe extract the implementation into a helper method (you can pass a boolean flag on what to compute)
There was a problem hiding this comment.
nit: maybe have stats#getRunningPercentage that has a condition based on what execution mode is used?
There was a problem hiding this comment.
Not really feasible if we don't want to change the signature of formatProgressBar:
if (stats.isFaultTolerantExecution()) {
progressBar = formatProgressBar(progressWidth, progressPercentage, stats.getFaultTolerantExecutionRunningPercentage(), 100);
}
else {
progressBar = formatProgressBar(progressWidth,
stats.getCompletedSplits(),
max(0, stats.getRunningSplits()),
stats.getTotalSplits());
}
core/trino-main/src/main/java/io/trino/execution/QueryStateMachine.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/execution/QueryStateMachine.java
Outdated
Show resolved
Hide resolved
66958fb to
42ebf47
Compare
There was a problem hiding this comment.
does this one guarantee that progress will not decrease? totalSplits can grow for stage, so I do not believe it is guaranteed. Am I missing something, or we just not care?
There was a problem hiding this comment.
Oh I didn't know totalSplits can grow per stage. But still, seems this is the best thing we can do now.
42ebf47 to
69180a1
Compare
|
I tested it out on tcph.sf10 with: For quite a few iterations the running splits are reported as 0 which make progress bar not hav '>' at the end. It is kinda accurate but does not look pretty. |
Interesting. I don't really know how is it possible to have 0 running splits for prolonged period. Did we check what do running tasks report? |
69180a1 to
54f4cf0
Compare
electrum
left a comment
There was a problem hiding this comment.
We should take this opportunity to move the progress percentage logic into the server. Rather than including a faultTolerantExecution flag, we simply add a progressPercentage property to the StatementStats constructor.
Many years ago, we added the getProgressPercentage() method with @JsonProperty, so the server returns it and this is used by the web UI and maybe other clients as well. We should complete this transition by moving the computation into the constructor, so that StatementStats directly accepts the computed value.
This will mean that any clients using the property will automatically get the FTE behavior.
|
Looking at this closer, seems like we need to add a Does the if (stats.isScheduled()) {but we could change that to if (stats.getProgressPercentage().isPresent()) {since that already includes the |
|
@electrum : we can't really. The logic for formatting a progress bar is different for streaming mode and FTE mode (to prevent progress bar from going back and forth): FTE mode uses |
|
What if we add The goal here is to keep as much of the logic on the server as possible. We don't want every client to have to understand all of these details. Does this make sense? |
|
OK, I will address this some time later |
54f4cf0 to
d003de8
Compare
core/trino-main/src/main/java/io/trino/server/protocol/ProtocolUtil.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/server/protocol/ProtocolUtil.java
Outdated
Show resolved
Hide resolved
client/trino-client/src/main/java/io/trino/client/StatementStats.java
Outdated
Show resolved
Hide resolved
d003de8 to
0fe651d
Compare
arhimondr
left a comment
There was a problem hiding this comment.
LGTM % There's a chance something may break we don't know of once we remove the scheduled field. Unless there's a good reason why to remove it I would recommend keeping it.
client/trino-client/src/main/java/io/trino/client/StatementStats.java
Outdated
Show resolved
Hide resolved
598df8b to
6dbe38e
Compare
fe73abe to
be73172
Compare
be73172 to
28150ed
Compare
Description
Unlike pipelined execution, fault tolerant execution doesn't execute stages all at once. This means in the middle of execution, some stages will be in PLANNED state, which is seen as not scheduled. Therefore, no progress will be displayed until the very end for fault tolerant execution.
This PR adds progress bar support for fault tolerant execution.
Additional context and related issues
Fix #13072
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: