Improve exchange interfaces to support speculative execution#14448
Improve exchange interfaces to support speculative execution#14448arhimondr merged 7 commits intotrinodb:masterfrom
Conversation
core/trino-spi/src/main/java/io/trino/spi/exchange/ExchangeSource.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Is there a contract that we limit task to have at most 127 attempts. Feels reasonable but it should probably be enforced somewhere. In TaskId constructor?
There was a problem hiding this comment.
Yeah, that makes sense. Let's address it as a separate PR: #14459
There was a problem hiding this comment.
unknown
Also you should probably provide totalNumberOfPartitions and verify that partitionCount is equal to that if you are building final selector (or do you validate it somewhere?)
There was a problem hiding this comment.
Actually yeah, i think it's a good idea to make the builder more restrictive and apply more validations. Improved.
...e-filesystem/src/main/java/io/trino/plugin/exchange/filesystem/FileSystemExchangeSource.java
Outdated
Show resolved
Hide resolved
...system/src/main/java/io/trino/plugin/exchange/filesystem/FileSystemExchangeSourceHandle.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/execution/scheduler/TestFaultTolerantStageScheduler.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/exchange/TestExchangeSourceOutputSelector.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/execution/scheduler/FaultTolerantStageScheduler.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/execution/scheduler/FaultTolerantQueryScheduler.java
Outdated
Show resolved
Hide resolved
losipiuk
left a comment
There was a problem hiding this comment.
LGTM (I did not read super deeply)
489fe40 to
3edb9db
Compare
|
@losipiuk Updated |
There was a problem hiding this comment.
We may need to enforce upper bound to be Short.MAX_VALUE. Or if you think we may exceed this limit, then we need to change it to Integer https://github.com/starburstdata/trino-buffer-service/blob/cdbbc6c08a11f5e472951e6b53ff10c7dbc25b83/data-server/src/main/java/io/starburst/stargate/buffer/data/execution/Partition.java#L152
There was a problem hiding this comment.
Currently it's an integer everywhere in Trino. We can change it if needed, but probably as a follow up PR.
core/trino-spi/src/main/java/io/trino/spi/exchange/ExchangeSourceOutputSelector.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
When a certain partition has to be split (e.g.: due to high memory pressure). In such scenario the original partition id (all attempts) must be excluded and new partition ids (created during split) must be included.
There was a problem hiding this comment.
It's not super clear to me on when we need such a class instead of constructing a final map which maps taskId to attemptId. For memory efficiency?
There was a problem hiding this comment.
Memory efficiency + validations
There was a problem hiding this comment.
I wonder if we allow transition from a final selector to another final selector
There was a problem hiding this comment.
Good point. I don't think we should. Let me forbid it.
3edb9db to
fd13dfe
Compare
|
Updated |
fd13dfe to
b45ab72
Compare
There was a problem hiding this comment.
at this commit this code is weird. Why do we have outer loop as we are always exiting from it in frist iteration. I clearly miss sth
There was a problem hiding this comment.
Oh - it used to be findFirst. Maybe drop for in this commit and only add it in the next one?
There was a problem hiding this comment.
Good point, changed
There was a problem hiding this comment.
can this be record? Or you worried about getRetainedSizeInBytes?
There was a problem hiding this comment.
Yeah, I didn't want to implement getRetainedSizeInBytes myself.
There was a problem hiding this comment.
why? Some sinks may still be written, right?
There was a problem hiding this comment.
Currently we don't split tasks and don't "forget" entire partitions. Once we have this functionally we will remove this check. For now I wanted to keep it in place for verification purposes.
Rename blockedOnSourceHandles to blockedOnFiles
…Handle To allow post filtering at the FileSystemExchangeSource level
The engine may decide that not all the sinks are necessary. It should be up to the engine to provide a signal to the exchange when all useful data has already been written.
To suppress Intellij warning
b45ab72 to
80b5c99
Compare
|
Updated |
Description
Non-technical explanation
N/A
Release notes
(X) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: