Hard-fork join operator and remove spilling from it #12618
raunaqmorarka merged 9 commits into trinodb:master
core/trino-main/src/main/java/io/trino/operator/join/DefaultPageJoiner.java
We can skip OperatorFactories for "new" join. Won't serve any purpose
@raunaqmorarka comments addressed. Added a commit.
@raunaqmorarka Added 2 commits to resolve the remaining issues.
Added a session property and a config option to disable the forked operator.
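A minimal sketch of how such a toggle might be resolved, assuming the session property overrides the config default and that the forked operator is only eligible when spilling is disabled. The class, method, and property names below are hypothetical and do not come from this PR or from Trino.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: none of these names come from the PR or from Trino.
final class JoinOperatorToggle {
    // Hypothetical session property name, used only for illustration.
    static final String FORKED_JOIN_PROPERTY = "hypothetical_forked_join_enabled";

    private JoinOperatorToggle() {}

    // A session-level value, when present, overrides the server-wide config default.
    static boolean isForkedJoinEnabled(boolean configDefault, Map<String, String> sessionProperties) {
        return Optional.ofNullable(sessionProperties.get(FORKED_JOIN_PROPERTY))
                .map(Boolean::parseBoolean)
                .orElse(configDefault);
    }

    // Assumption: the forked operator cannot spill, so it is chosen only when spill is off.
    static String chooseJoinOperator(boolean forkedEnabled, boolean spillEnabled) {
        return (forkedEnabled && !spillEnabled) ? "unspilled" : "original";
    }
}
```

With this shape, disabling the property for a single session falls back to the original operator without changing the server configuration.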
TPC benchmark results: roughly 1%-1.5% gain.
This commit clones most of the classes in the join operator without introducing any major changes. The fork will stop supporting spilling in later commits, but at this point it is identical to the original.
This commit mainly removes code that is dead once spilling is not supported. No major logical changes are introduced.
After the removal of spilling, this class is fairly basic and has a 1:1 relation to LookupJoinOperator, so it can be easily inlined.
Only the write lock is used at this point, so it can be safely replaced with synchronized.
There is only one implementation: DefaultPageJoiner, which is now called PageJoiner.
There is only one implementation: PartitionedLookupSourceFactory.
This is a rollback of changes made in the 'hard-fork join operator' commit. The changes were necessary for the fork, but they can be rolled back now that spilling is removed.
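The lock replacement described in the commit messages above can be illustrated with a small sketch. It assumes a class that only ever takes the write side of a `ReentrantReadWriteLock` and uses no features such as `tryLock` or fairness; the class and field names are hypothetical stand-ins, not the actual Trino classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Before (hypothetical): every access takes the write lock, so the read side is unused.
class WriteLockedRegistry {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> partitions = new ArrayList<>();

    void add(String partition) {
        lock.writeLock().lock();
        try {
            partitions.add(partition);
        }
        finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        lock.writeLock().lock();
        try {
            return partitions.size();
        }
        finally {
            lock.writeLock().unlock();
        }
    }
}

// After (hypothetical): the same mutual exclusion expressed with synchronized methods.
class SynchronizedRegistry {
    private final List<String> partitions = new ArrayList<>();

    synchronized void add(String partition) {
        partitions.add(partition);
    }

    synchronized int size() {
        return partitions.size();
    }
}
```

Both variants are exclusive and reentrant, so they behave the same as long as no caller relies on shared read access or on lock-specific operations.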
sopel39 left a comment
Some retrospective comments. Please address them as follow-ups.
 * This page builder creates pages with dictionary blocks:
 * normal dictionary blocks for the probe side and the original blocks for the build side.
 * <p>
 * TODO use dictionary blocks (probably extended kind) to avoid data copying for build side
Missing reference to a GitHub issue.
This is not my TODO; it comes from the forked code.
for (int index : probe.getOutputChannels()) {
    Block block = probe.getPage().getBlock(index);
    // Estimate the size of the probe row
    // TODO: improve estimation for unloaded blocks by making it similar as in PageProcessor
Missing reference to a GitHub issue. You should probably create an umbrella issue.
return new LookupJoinOperatorFactory(
        operatorId,
        planNodeId,
        (JoinBridgeManager<? extends LookupSourceFactory>) lookupSourceFactoryManager,
You should create a new type hierarchy or new operator factory methods rather than casting explicitly here.
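One way to read this suggestion is sketched below, under the assumption that each join variant gets its own typed factory method so the compiler checks the bridge type. All type and method names here are hypothetical simplifications, not the real Trino signatures.

```java
// Hypothetical, simplified stand-ins for the join bridge types.
interface Bridge {}

final class SpillingBridge implements Bridge {
    String describe() {
        return "spilling";
    }
}

final class UnspilledBridge implements Bridge {
    String describe() {
        return "unspilled";
    }
}

final class BridgeManager<T extends Bridge> {
    private final T bridge;

    BridgeManager(T bridge) {
        this.bridge = bridge;
    }

    T getBridge() {
        return bridge;
    }
}

final class JoinFactories {
    private JoinFactories() {}

    // Cast-based style: compiles with an unchecked warning and can fail only at runtime.
    @SuppressWarnings("unchecked")
    static String createWithCast(BridgeManager<? extends Bridge> manager) {
        return ((BridgeManager<SpillingBridge>) manager).getBridge().describe();
    }

    // Typed style: a manager of the wrong bridge type is rejected at compile time.
    static String createSpilling(BridgeManager<SpillingBridge> manager) {
        return manager.getBridge().describe();
    }

    static String createUnspilled(BridgeManager<UnspilledBridge> manager) {
        return manager.getBridge().describe();
    }
}
```

The typed methods move the decision about which operator to build to the call site, where the concrete manager type is already known.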
import static java.util.Objects.requireNonNull;

public final class PartitionedLookupSourceFactory
        implements JoinBridge
This pretends to be a LookupSourceFactory but doesn't implement the interface. The name is wrong and confusing.
        probeHashChannel,
        partitioningSpillerFactory);
}
else {
Description
This PR hard-forks the entire join operator. Some interfaces remain shared because they form an external interface. Minor changes have been made to the original operator to make the fork possible.
The later commits remove spilling from the forked operator. No observable gain will come from that right now.
This is preparation for optimization work in the non-spilling join operator.
Some dead or suboptimal code still exists in the forked operator after this PR. Cleanup and further optimizations will follow later.
Refactoring to simplify join operator for non-spilled cases
core query engine
Code refactoring to simplify join operator for non-spilled cases
Documentation
(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
( ) No release notes entries required.
(x) Release notes entries required with the following suggested text: