Adaptive task sizing for fault tolerant execution #16719
losipiuk merged 4 commits into trinodb:master
dd49fb4 to f65a27b
core/trino-main/src/main/java/io/trino/execution/QueryManagerConfig.java
.../trino-testing/src/main/java/io/trino/testing/FaultTolerantExecutionConnectorTestHelper.java
nit: It would be nice to ensure here that this translates to a number of splits that is a multiple of task.max-drivers-per-task.
One option, maybe good enough, would be to assume that minTargetPartitionSizeInBytes satisfies this requirement and then ensure that the calculated targetPartitionSizeInBytes is a multiple of minTargetPartitionSizeInBytes.
Then we can use an adaptiveGrowthFactor of less than 2.0 to get smoother growth. 2.0 is pretty aggressive.
cc: @arhimondr
> One option, maybe good enough, would be to assume that minTargetPartitionSizeInBytes satisfies this requirement and then ensure that the calculated targetPartitionSizeInBytes is a multiple of minTargetPartitionSizeInBytes.

I was also thinking about something along these lines: basically, round to the closest multiple of minTargetPartitionSizeInBytes.
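
The rounding idea from this thread can be sketched roughly as below. This is only an illustration, not the code merged in this PR: the class `TargetSizeRounding` and the method `roundToMultipleOfMin` are hypothetical names, and the sketch assumes (as the comment does) that the minimum target size already corresponds to a whole multiple of task.max-drivers-per-task splits.

```java
// Hypothetical helper, not part of Trino: rounds a candidate target size
// to the nearest multiple of the minimum target size.
final class TargetSizeRounding
{
    private TargetSizeRounding() {}

    static long roundToMultipleOfMin(long candidateInBytes, long minTargetSizeInBytes, long maxTargetSizeInBytes)
    {
        if (minTargetSizeInBytes <= 0 || maxTargetSizeInBytes < minTargetSizeInBytes) {
            throw new IllegalArgumentException("expected 0 < min <= max");
        }
        // Nearest whole number of "min-sized" units in the candidate
        long multiples = Math.round((double) candidateInBytes / minTargetSizeInBytes);
        // Largest number of units that still fits under the maximum
        long maxMultiples = maxTargetSizeInBytes / minTargetSizeInBytes;
        // Never go below one unit, never exceed the maximum
        multiples = Math.max(1, Math.min(multiples, maxMultiples));
        return multiples * minTargetSizeInBytes;
    }
}
```

One consequence worth noting: with a small growth factor, the first few grown values round back down to the minimum, so a sketch like this would have to keep the unrounded size as its running state and round only when the value is used.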
core/trino-main/src/test/java/io/trino/execution/TestQueryManagerConfig.java
losipiuk left a comment
LGTM % comments.
@arhimondr PTAL
core/trino-main/src/main/java/io/trino/execution/QueryManagerConfig.java
f65a27b to 60816cb
core/trino-main/src/main/java/io/trino/execution/QueryManagerConfig.java
Let's discuss what would be good defaults offline
Changed defaults to:
private int faultTolerantExecutionArbitraryDistributionComputeTaskTargetSizeGrowthPeriod = 64;
private double faultTolerantExecutionArbitraryDistributionComputeTaskTargetSizeGrowthFactor = 1.2;
private DataSize faultTolerantExecutionArbitraryDistributionComputeTaskTargetSizeMin = DataSize.of(512, MEGABYTE);
private DataSize faultTolerantExecutionArbitraryDistributionComputeTaskTargetSizeMax = DataSize.of(50, GIGABYTE);
private int faultTolerantExecutionArbitraryDistributionWriteTaskTargetSizeGrowthPeriod = 64;
private double faultTolerantExecutionArbitraryDistributionWriteTaskTargetSizeGrowthFactor = 1.2;
private DataSize faultTolerantExecutionArbitraryDistributionWriteTaskTargetSizeMin = DataSize.of(4, GIGABYTE);
private DataSize faultTolerantExecutionArbitraryDistributionWriteTaskTargetSizeMax = DataSize.of(50, GIGABYTE);
private DataSize faultTolerantExecutionHashDistributionComputeTaskTargetSize = DataSize.of(512, MEGABYTE);
private DataSize faultTolerantExecutionHashDistributionWriteTaskTargetSize = DataSize.of(4, GIGABYTE);
private int faultTolerantExecutionHashDistributionWriteTaskTargetMaxCount = 2000;
Let me know your thoughts.
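
To get a feel for these defaults, here is a minimal sketch of how a target size could evolve within one stage. It assumes that the target starts at the configured minimum, is multiplied by the growth factor once per growth period of created tasks, and is capped at the maximum. The class `AdaptiveTaskSizeSketch` and the method `targetSizeForTask` are hypothetical and are not taken from the PR's code.

```java
// Hypothetical illustration of adaptive growth; names are not from Trino.
final class AdaptiveTaskSizeSketch
{
    private AdaptiveTaskSizeSketch() {}

    // Returns the assumed target size, in bytes, for the task with the given
    // zero-based index within a stage.
    static long targetSizeForTask(int taskIndex, int growthPeriod, double growthFactor, long minBytes, long maxBytes)
    {
        if (taskIndex < 0 || growthPeriod <= 0 || growthFactor < 1.0 || minBytes <= 0 || maxBytes < minBytes) {
            throw new IllegalArgumentException("invalid sizing parameters");
        }
        // One size increase per completed growth period
        int increases = taskIndex / growthPeriod;
        double size = minBytes;
        for (int i = 0; i < increases && size < maxBytes; i++) {
            size *= growthFactor;
        }
        // Cap at the configured maximum
        return (long) Math.min(size, (double) maxBytes);
    }
}
```

Under these assumptions and the compute defaults above (period 64, factor 1.2, min 512 MB, max 50 GB), the first 64 tasks target 512 MB, the next 64 target about 614 MB, and the cap of 50 GB is reached after 26 increases, that is, from roughly task 1,664 onward.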
60816cb to 1f1082d
This would not guarantee that the number of splits per task is actually a multiple of
…onSizeInBytes For adaptive task sizing in ArbitraryDistributionSplitAssigner
1f1082d to 4561f0c
@jhlodin: can you (or someone else) help me update the documentation for this PR?
Description
This PR adds adaptive task sizing for fault tolerant execution. Specifically:
This change dramatically improves the latency of small queries under fault-tolerant execution. Preliminary testing on tpcds-sf100 shows a latency reduction of more than 40%.
Additional context and related issues
Fix #16103
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: