Hash join buffering on probe side #19761
Conversation
run benchmarks

🤖 Hi @gabotechs, thanks for the request (#19761 (comment)).

run benchmarks

🤖: Benchmark completed

run benchmark tpcds tpch10

🤖: Benchmark script failed with exit code 1.

run benchmark tpch10

🤖: Benchmark completed

run benchmark tpch

🤖: Benchmark completed

run benchmark tpcds

🤖: Benchmark script failed with exit code 1.
🤔 the tpcds benchmark command seems broken |
Potentially yes, although I do see value in having this be its own node; that way we can:
This is actually something we (DataDog) have had for a long time; we have a similar
run benchmark tpcds tpch

🤖: Benchmark completed

🤖: Benchmark completed

run benchmark tpch10

🤖: Benchmark completed
Which issue does this PR close?
It does not close any issue, but it's related to:
Rationale for this change
This is a PR from a batch of PRs that attempt to improve performance in hash joins:
It adds the new BufferExec node at the top of the probe side of hash joins, so that some work is eagerly performed before the build side of the hash join is completely finished.
Why should this speed up joins?
In order to better understand the impact of this PR, it's useful to understand how streams work in Rust: creating a stream does not perform any work; progress is only made when the stream gets polled.
This means that whenever we call .execute() on an ExecutionPlan (like the probe side of a join), nothing happens: not even the most basic TCP connections or system calls are performed. Instead, all of that work is delayed as much as possible, until the first poll is made on the stream, losing the opportunity to make some early progress.
This gets worse when multiple hash joins are chained together: they execute in cascade, like falling dominoes, which has the benefit of a small memory footprint but underutilizes the resources the machine could use to execute the query faster.
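To make the laziness claim concrete, here is a minimal standalone sketch (using the futures and tokio crates; this is illustrative only and not code from this PR or from DataFusion) showing that constructing a stream does no work until it is polled:

```rust
use futures::stream::{self, StreamExt};

#[tokio::main]
async fn main() {
    // Building the stream performs no work: the closure below only runs
    // when the stream is polled.
    let mut work = Box::pin(stream::unfold(0u32, |n| async move {
        println!("producing item {n}"); // side effect so we can see when work happens
        if n < 3 { Some((n, n + 1)) } else { None }
    }));

    println!("stream created, nothing produced yet");

    // Only now, as `.next().await` polls the stream, does any work happen.
    while let Some(item) = work.next().await {
        println!("got item {item}");
    }
}
```

The stream returned by ExecutionPlan::execute() behaves the same way: until something polls it, the probe side does no I/O at all, and that idle window is what the new BufferExec node aims to put to use.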
NOTE: I still don't know whether this improves the benchmarks; just experimenting for now.
What changes are included in this PR?
Adds a new HashJoinBuffering physical optimizer rule that idempotently places BufferExec nodes on the probe side of hash joins.
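For illustration, here is a rough sketch of the general shape such a rule could take. This is not the code from this PR: the BufferExec module path and the BufferExec::new constructor shown below are hypothetical, and the real rule may detect and wrap the probe side differently. The sketch only assumes DataFusion's standard PhysicalOptimizerRule and TreeNode APIs, and the fact that HashJoinExec builds from its left input and probes with its right input.

```rust
use std::sync::Arc;

use datafusion::common::tree_node::{Transformed, TreeNode};
use datafusion::config::ConfigOptions;
use datafusion::error::Result;
use datafusion::physical_optimizer::PhysicalOptimizerRule;
use datafusion::physical_plan::joins::HashJoinExec;
use datafusion::physical_plan::ExecutionPlan;

// Hypothetical import: BufferExec is the node added by this PR; its real
// module path and constructor may differ.
use datafusion::physical_plan::buffer::BufferExec;

#[derive(Debug, Default)]
struct HashJoinBuffering;

impl PhysicalOptimizerRule for HashJoinBuffering {
    fn optimize(
        &self,
        plan: Arc<dyn ExecutionPlan>,
        _config: &ConfigOptions,
    ) -> Result<Arc<dyn ExecutionPlan>> {
        plan.transform_up(|node| {
            if let Some(join) = node.as_any().downcast_ref::<HashJoinExec>() {
                // In HashJoinExec the left input is the build side and the
                // right input is the probe side.
                let probe = join.right();
                // Idempotency: don't wrap a probe side that is already buffered,
                // so running the rule twice leaves the plan unchanged.
                if probe.as_any().downcast_ref::<BufferExec>().is_none() {
                    let buffered: Arc<dyn ExecutionPlan> =
                        Arc::new(BufferExec::new(Arc::clone(probe))); // hypothetical constructor
                    let new_join = node
                        .clone()
                        .with_new_children(vec![Arc::clone(join.left()), buffered])?;
                    return Ok(Transformed::yes(new_join));
                }
            }
            Ok(Transformed::no(node))
        })
        .map(|t| t.data)
    }

    fn name(&self) -> &str {
        "HashJoinBuffering"
    }

    fn schema_check(&self) -> bool {
        // Wrapping a child in a pass-through buffer does not change the schema.
        true
    }
}
```

Presumably the actual rule is registered alongside DataFusion's default physical optimizer rules so the wrapping happens automatically during physical planning; whether it should be enabled by default is still open, as noted below.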
Are these changes tested?
Yes, by existing tests.
Are there any user-facing changes?
Yes, users will see a new BufferExec placed on top of the probe side of each hash join. (Still unsure about whether this should be enabled by default.)
Results
Warning
I'm very skeptical about these benchmark runs from my laptop; take them with a grain of salt. They should be repeated in a more controlled environment.