Conversation
|
fyi @martint |
|
@skrzypo987 could you share perf results? |
8ea23c9 to
2cb6340
Compare
I messed up some things so there is a regression. Will get back with the results once I fix it. |
|
before: after |
eb12136 to
f5ae9a7
Compare
|
I got rid of the biggest commit as it showed some regressions that were difficult to spot. I'll return to it in different PR. |
f5ae9a7 to
406a41b
Compare
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
why iterate over channels first? we then update combinations multiple times but we could compute it once per position
There was a problem hiding this comment.
So that we iterate over a single block at a time. Microbenchmarks show better results that way.
There was a problem hiding this comment.
is this correct that we take dict channels.length - j?
Im not sure if I understand correctly but e.g.
dict size 1, 2, 3
values 0, 1, 0
(((0 * 2) + 1 ) * 2) + 0 = 2
values 0, 0, 2
(((0 * 2) + 0 ) * 4) + 2 = 2
There was a problem hiding this comment.
You are absolutely right. This is one of those "how could that even worked" kind of bug.
I fixed it and added some tests.
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
7589e02 to
25eca66
Compare
|
Got rid of the commit introducting tpch aggregation benchmark as its newer version exists in #11031 |
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
you should call page = page.getLoadedPage() before dictionary checks (same for getGroupIds). Same for BigintGroupByHash
There was a problem hiding this comment.
This is unrelated to the PR
core/trino-main/src/test/java/io/trino/operator/TestGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/operator/TestGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/operator/TestGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/operator/TestGroupByHash.java
Outdated
Show resolved
Hide resolved
|
Is |
8d185a4 to
ef28b26
Compare
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
nit: I would remove cardinality > SMALL_DICTIONARIES_MAX_CARDINALITY and only leave ratio unless there is strong reason to keep this check
There was a problem hiding this comment.
Fine. Made some benchmarks and there is indeed little chance for regression
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/operator/TestGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/operator/TestGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/operator/TestGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/operator/TestGroupByHash.java
Outdated
Show resolved
Hide resolved
e9ed1ba to
1e161fa
Compare
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/test/java/io/trino/operator/TestGroupByHash.java
Outdated
Show resolved
Hide resolved
1e161fa to
59f8dfb
Compare
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
59f8dfb to
f78d524
Compare
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/operator/MultiChannelGroupByHash.java
Outdated
Show resolved
Hide resolved
If the number of combinations of all dictionaries in a page is below certain number, we can store the results in a small array and reuse found groups
f78d524 to
ffd1ee8
Compare
Few neat tricks to speed up aggregation a bit