Vectorize Hashing in FlatGroupByHash#19302
Conversation
test-other-modules failure is unrelated. I just fixed that in master. Sorry :) |
09ac658 to 68f7df6 (force-pushed)
No worries, thanks for the heads-up; just rebased to pick up the fix.
Started benchmark workflow for this PR.
3bbddf6 to 9fd7533 (force-pushed)
Started benchmark workflow for this PR.
Benchmark Summary: Top 5 duration differences
9fd7533 to 8df5f3a (force-pushed)
martint left a comment:
Squash the first three commits, as they are all essentially incremental parts of the same change.
core/trino-main/src/main/java/io/trino/operator/FlatHashStrategyCompiler.java
8df5f3a to 1a75e64 (force-pushed)
Please note that we added a release notes entry for the performance improvement, @pettyjamesm, for future reference and as an example.
Description
Implements column-wise hash calculations in FlatHashCompiler and changes FlatGroupByHash to use it to implement a batched approach to first computing a range of position hashes and then attempting to insert those positions using the hashes that were precomputed.
By calculating position hashes in a columnar traversal, we can avoid repeated expensive bounds checking per access, allow the JIT to emit more efficient unrolled loops, and access memory in a way that is friendlier and more predictable for CPU caches.
Additionally, when attempting to insert the positions into the FlatGroupByHash after precomputing the hash codes, we can start loading the relevant portion of the hash table into memory sooner, since we aren't intermixing the hash computation with the table's memory accesses.
Since re-hashing is still performed row-at-a-time, the performance improvement is most significant when the number of groups is small and when the number of columns in the hash calculation is larger.
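The two-phase scheme described above can be sketched as follows. This is an illustrative example only, not Trino's actual FlatGroupByHash or FlatHashStrategyCompiler code: the class name, method names, and the particular hash mixing constants are all hypothetical, and the real implementation generates this code via bytecode compilation over typed blocks rather than plain long arrays.

```java
// Illustrative sketch (hypothetical names, not Trino's actual code) of the
// two-phase approach: first compute all position hashes column-by-column,
// then probe the hash table using the precomputed hashes.
public class ColumnarHashBatch
{
    // Mix one column's values into the running per-position hashes.
    // Iterating a single flat array per column keeps the inner loop
    // branch-free and easy for the JIT to unroll.
    static void combineColumnHashes(long[] hashes, long[] columnValues)
    {
        for (int position = 0; position < hashes.length; position++) {
            long valueHash = Long.rotateLeft(columnValues[position] * 0xC2B2AE3D27D4EB4FL, 31) * 0x9E3779B97F4A7C15L;
            hashes[position] = hashes[position] * 31 + valueHash;
        }
    }

    // Phase 1: columnar traversal producing one hash per input position.
    static long[] computeHashes(long[][] columns, int positionCount)
    {
        long[] hashes = new long[positionCount];
        for (long[] column : columns) {
            combineColumnHashes(hashes, column);
        }
        return hashes;
    }

    public static void main(String[] args)
    {
        long[][] columns = {
                {1, 2, 1},    // column A values for positions 0..2
                {10, 20, 10}, // column B values for positions 0..2
        };
        long[] hashes = computeHashes(columns, 3);
        // Phase 2 (not shown): probe the group-by table with the ready-made
        // hashes, so table memory loads aren't interleaved with hashing.
        for (long hash : hashes) {
            System.out.println(Long.toHexString(hash));
        }
    }
}
```

The key point is the loop shape in phase 1: each inner loop touches exactly one column array, so bounds checks and value loads are predictable, and positions 0 and 2 (identical rows in this toy input) deterministically produce the same hash for the phase-2 probe.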
BenchmarkGroupByHash.addPages results generally improve between 5% and 50%, depending on the scenario.
Release notes
(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: