Replace GroupIdBlock with int[] by dain · Pull Request #17688 · trinodb/trino

dain · 2023-05-29T22:04:34Z

Group id block was used to carry the max group id to the grouped accumulator, but the max group id is now set directly with the setGroupCount method.
An int[] is used because the GroupByHash implementations only create int group ids.
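
As a rough illustration of the shape described above, here is a minimal sketch of a grouped accumulator that receives the group count through a separate call and the group ids as a plain `int[]`. The class `GroupedLongSumSketch` and its method signatures are hypothetical and are not copied from Trino's interfaces; only the name `setGroupCount` comes from the description.

```java
import java.util.Arrays;

// Hypothetical sketch; not Trino's actual accumulator interface.
final class GroupedLongSumSketch
{
    private long[] sums = new long[0];

    // The group count is passed explicitly, so the group ids themselves
    // no longer need a wrapper that carries the max group id.
    void setGroupCount(int groupCount)
    {
        if (groupCount > sums.length) {
            sums = Arrays.copyOf(sums, groupCount);
        }
    }

    // groupIds[position] is the group that values[position] belongs to.
    void addInput(int[] groupIds, long[] values)
    {
        for (int position = 0; position < groupIds.length; position++) {
            sums[groupIds[position]] += values[position];
        }
    }

    long getSum(int groupId)
    {
        return sums[groupId];
    }
}
```

The inner loop indexes a primitive array directly, which is the kind of code a JIT compiler tends to handle well.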

Release notes

(x) This is not user-visible or docs only and no release notes are required.

lukasz-stec

Nice! This should bring some performance improvement. I have seen a 20% perf drop in some queries because the group id code was not compiled optimally by the JIT. Using int[] will likely help with that.

lukasz-stec · 2023-05-30T07:43:00Z

core/trino-main/src/main/java/io/trino/operator/BigintGroupByHash.java

Would it be beneficial to handle this case better?
We could still have a GroupIdBlock with two states, an int[] or a single value plus a count (similar to SelectedPositions), and then do

```
if (groupIdsBlock.isArray()) {
    // handle array
}
else {
    // handle rle
}
```

in the `Accumulator`.

Maybe. I'm not sure the complexity will result in better performance in the 99% case where we don't have RLEs. My thought is that we switch to int[] and establish a new baseline. Then, in follow-up work, we can investigate whether adding a wrapper with RLE support helps.
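
For readers following this thread, the two-state idea under discussion could look roughly like the sketch below. It is not part of this PR. `GroupIdsSketch`, `CountAccumulatorSketch`, and all of their methods are hypothetical names chosen for illustration.

```java
// Hypothetical sketch of a two-state group id holder: either an explicit
// int[] of ids, or one id repeated `count` times (the RLE case).
final class GroupIdsSketch
{
    private final int[] ids;     // non-null only in the array state
    private final int singleId;  // meaningful only in the run-length state
    private final int count;

    private GroupIdsSketch(int[] ids, int singleId, int count)
    {
        this.ids = ids;
        this.singleId = singleId;
        this.count = count;
    }

    static GroupIdsSketch ofArray(int[] ids)
    {
        return new GroupIdsSketch(ids, -1, ids.length);
    }

    static GroupIdsSketch ofRun(int groupId, int count)
    {
        return new GroupIdsSketch(null, groupId, count);
    }

    boolean isArray()
    {
        return ids != null;
    }

    int[] array()
    {
        return ids;
    }

    int singleId()
    {
        return singleId;
    }

    int count()
    {
        return count;
    }
}

// Hypothetical count accumulator that branches on the two states.
final class CountAccumulatorSketch
{
    private final long[] counts;

    CountAccumulatorSketch(int groupCount)
    {
        counts = new long[groupCount];
    }

    void addInput(GroupIdsSketch groupIds)
    {
        if (groupIds.isArray()) {
            // array state: one increment per position
            for (int groupId : groupIds.array()) {
                counts[groupId]++;
            }
        }
        else {
            // run-length state: a single bulk update
            counts[groupIds.singleId()] += groupIds.count();
        }
    }

    long getCount(int groupId)
    {
        return counts[groupId];
    }
}
```

The extra branch is the complexity being weighed here: it only pays off when run-length inputs are common.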

lukasz-stec · 2023-05-30T07:45:09Z

core/trino-main/src/main/java/io/trino/operator/aggregation/AccumulatorCompiler.java

Does it make sense to also make io.trino.spi.function.GroupedAccumulatorState#setGroupId accept int instead of long?

Yes! That will be in some follow-up work.

dain requested a review from martint May 29, 2023 22:04

cla-bot bot added the cla-signed label May 29, 2023

dain mentioned this pull request May 29, 2023

Project Hummingbird #14237

Open

19 tasks

dain force-pushed the remove-group-id-block branch from 796cf6c to a730466 Compare May 30, 2023 00:55

lukasz-stec reviewed May 30, 2023

dain force-pushed the remove-group-id-block branch from a730466 to 1fc7432 Compare June 1, 2023 23:00

martint approved these changes Jun 2, 2023

Replace GroupIdBlock with int[]

ca61901

dain force-pushed the remove-group-id-block branch from 1fc7432 to ca61901 Compare June 5, 2023 02:54

dain merged commit 1657e52 into trinodb:master Jun 6, 2023

dain deleted the remove-group-id-block branch June 6, 2023 03:47

github-actions bot added this to the 420 milestone Jun 6, 2023

colebow mentioned this pull request Jun 21, 2023

Add Trino 420 release notes #17997

Merged

lukasz-stec mentioned this pull request Jun 26, 2023

Drop unused consecutive method #18043

Merged
