Set initial number of tasks for scaled writer with HBO #20901
Get the number of tasks for the stage, and record it.
Add table writer stats to estimate
Resolved review thread (outdated): presto-main/src/main/java/com/facebook/presto/sql/planner/planPrinter/TextRenderer.java
Add table writer node statistics to plan statistics
Add a field that specifies the number of tasks a scaled writer starts with
Start from 1 if no initial task number specified
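A minimal sketch of that fallback, assuming the initial task number is carried as an `Optional<Integer>`. The class name `InitialWriterTasks`, the method `resolve`, and the treatment of non-positive values are hypothetical and are not taken from the PR:

```java
import java.util.Optional;

public final class InitialWriterTasks {
    private InitialWriterTasks() {}

    /**
     * Returns the number of tasks a scaled writer should start with.
     * Falls back to 1 when no initial task number is specified.
     * Assumption: non-positive values are treated as "not specified".
     */
    public static int resolve(Optional<Integer> initialTaskNumber) {
        return initialTaskNumber.filter(n -> n > 0).orElse(1);
    }
}
```

Keeping the default in one place means callers never need to special-case a missing value.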
Get the suggested number of table writer tasks from the query plan: find the TableWriterNode and read its task number field, which is set for scaled writers.
Add a rule to set the initial number of tasks for a table writer
Resolved review thread (outdated): presto-main/src/main/java/com/facebook/presto/sql/planner/planPrinter/PlanPrinter.java
Add a field to specify the number of tasks to begin with if it's a scaled writer
Get the preferred number of tasks from table writer nodes in the plan
Is TableWriterMergeNode relevant here?
No. The scaled writer optimization applies only to the table writer node (TableWriterNode), not to the table writer merge node (TableWriterMergeNode).
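The lookup discussed in this thread can be sketched roughly as a depth-first search over the plan. The types below (`Node`, `Writer`, `Other`) are hypothetical stand-ins rather than Presto's plan classes, and the accessor name is an assumption. Merge-style nodes are modelled as `Other`, so the search passes through them without reading a task count:

```java
import java.util.List;
import java.util.Optional;

public final class WriterTaskLookup {
    private WriterTaskLookup() {}

    /** Hypothetical stand-in for a plan node; not Presto's PlanNode. */
    public interface Node {
        List<Node> sources();
    }

    /** Stand-in for a table writer node that may carry a task count. */
    public static final class Writer implements Node {
        private final Optional<Integer> taskCountIfScaledWriter;
        private final List<Node> sources;

        public Writer(Optional<Integer> taskCountIfScaledWriter, List<Node> sources) {
            this.taskCountIfScaledWriter = taskCountIfScaledWriter;
            this.sources = sources;
        }

        public Optional<Integer> taskCountIfScaledWriter() {
            return taskCountIfScaledWriter;
        }

        @Override
        public List<Node> sources() {
            return sources;
        }
    }

    /** Any other node, including a merge-style node; it is only traversed. */
    public static final class Other implements Node {
        private final List<Node> sources;

        public Other(List<Node> sources) {
            this.sources = sources;
        }

        @Override
        public List<Node> sources() {
            return sources;
        }
    }

    /** Depth-first search for the first writer node that has a task count set. */
    public static Optional<Integer> findWriterTaskCount(Node root) {
        if (root instanceof Writer) {
            Optional<Integer> count = ((Writer) root).taskCountIfScaledWriter();
            if (count.isPresent()) {
                return count;
            }
        }
        for (Node source : root.sources()) {
            Optional<Integer> found = findWriterTaskCount(source);
            if (found.isPresent()) {
                return found;
            }
        }
        return Optional.empty();
    }
}
```

Returning `Optional.empty()` when no writer node carries a count leaves the choice of default to the caller.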
Resolved review threads (outdated):
presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/ScaledWriterRule.java
presto-main/src/main/java/com/facebook/presto/sql/planner/plan/TableWriterNode.java
presto-spi/src/main/java/com/facebook/presto/spi/statistics/TableWriterNodeStatistics.java
presto-main/src/main/java/com/facebook/presto/cost/TableWriterNodeStatsEstimate.java
presto-main/src/main/java/com/facebook/presto/execution/scheduler/ScaledWriterScheduler.java
presto-main/src/main/java/com/facebook/presto/sql/planner/planPrinter/PlanPrinter.java
presto-main/src/main/java/com/facebook/presto/sql/planner/planPrinter/TextRenderer.java
presto-main/src/main/java/com/facebook/presto/SystemSessionProperties.java
I just merged #20990 that tracks which optimizers were cost-based and the source of stats used (CBO/HBO).
Can you override functions isCostBased and getStatsSource so this optimizer also gets tracked?
Added in a separate PR #21120
Description
Addresses #20355
Record the number of tasks used in scaled writers in HBO, and use HBO to set the initial number of writers to begin with for scaled writers.
Motivation and Context
Scaled writers initially use a single task to write data and add tasks as needed when the source is throttled. With this PR, a scaled writer instead starts with a task count based on the number of tasks used in previous runs, which gives it more parallelism from the start and so improves latency.
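To illustrate why a larger starting count helps, here is a small sketch that counts the scale-up rounds needed to reach a target parallelism. It assumes, purely for illustration, that the scheduler adds one task per round; the actual scale-up policy is not described here, and `ScaledWriterRampUp` and `roundsToReach` are hypothetical names:

```java
public final class ScaledWriterRampUp {
    private ScaledWriterRampUp() {}

    /**
     * Number of scale-up rounds needed to grow from initialTasks to targetTasks,
     * under the illustrative assumption that each round adds exactly one task.
     */
    public static int roundsToReach(int initialTasks, int targetTasks) {
        if (initialTasks < 1 || targetTasks < 1) {
            throw new IllegalArgumentException("task counts must be at least 1");
        }
        return Math.max(0, targetTasks - initialTasks);
    }
}
```

Under that assumption, a writer that eventually needs 8 tasks takes 7 rounds when starting from 1, but only 2 rounds when history suggests starting from 6.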
Impact
Latency improvement for scaled writer pipelines
Test Plan
Test query
Also run with verifier suite
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.