Skip to content

Add progress bar support for fault tolerant execution#12275

Closed
linzebing wants to merge 1 commit intotrinodb:masterfrom
linzebing:tardigrade-progress-bar
Closed

Add progress bar support for fault tolerant execution#12275
linzebing wants to merge 1 commit intotrinodb:masterfrom
linzebing:tardigrade-progress-bar

Conversation

@linzebing
Copy link
Copy Markdown
Member

Unlike pipelined execution, fault tolerant execution doesn't execute stages all at once. This means in the middle of execution, some stages will be in PLANNED state, which is seen as not scheduled. Therefore, we don't know the number of total splits upfront for fault tolerant execution.

The way we deal with this is, we calculate the percentages of each stage (if it's not scheduled, percentage is 0), and average them as the progress of fault tolerant execution.

Description

Is this change a fix, improvement, new feature, refactoring, or other?

Improvement

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

Client library

How would you describe this change to a non-technical end user or system administrator?

This adds progress bar support for queries executing in fault-tolerant mode.

Related issues, pull requests, and links

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

(x) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label May 6, 2022
@linzebing linzebing requested review from arhimondr and losipiuk May 6, 2022 21:26
@findepi
Copy link
Copy Markdown
Member

findepi commented May 9, 2022

cc @electrum

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does it impact progress bar rendering for non-fault tolerant execution? Does it impact anything else than progress bar rendering?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked all text occurrences of isScheduled, seems it's only used in progress bar rendering

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend to refrain from changing the isScheduled semantics. If it's only used for progress bar rendering can we implement a check that we need there? (or maybe add a separate boolean filed, isAnyStageScheduled)?

Copy link
Copy Markdown
Member Author

@linzebing linzebing May 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue that anyMatch is the right semantics here. As long as any of the stage is scheduled, then the whole query should be seen as scheduled, right?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a single stage the SCHEDULED status indicates that all splits have been scheduled. That's why I'm a little worried that it might be not intuitive for isScheduled to return true as soon as only a single stage has all it's split scheduled.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then for fault tolerant execution, it's not scheduled until the last stage executes, which doesn't make sense. Plus, isScheduled is only used in progress bar, so if we add a isAnyStageScheduled, then will leave isScheduled effectively useless (and therefore should be removed).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we can simply rename it?

@linzebing linzebing requested a review from electrum May 9, 2022 16:25
@linzebing linzebing force-pushed the tardigrade-progress-bar branch from e33b1e4 to 339f9b7 Compare May 9, 2022 16:39
@linzebing
Copy link
Copy Markdown
Member Author

CI: #12298

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend to refrain from changing the isScheduled semantics. If it's only used for progress bar rendering can we implement a check that we need there? (or maybe add a separate boolean filed, isAnyStageScheduled)?

@linzebing linzebing force-pushed the tardigrade-progress-bar branch from 339f9b7 to dde9a8c Compare May 9, 2022 23:51
@linzebing
Copy link
Copy Markdown
Member Author

linzebing commented Jul 1, 2022

Closing as this is low-pri and I don't plan to work on it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants