Vectorize hashing for Exchange and Local Exchange Operators #27610

Closed
pgandhi999 wants to merge 3 commits into trinodb:master from pgandhi999:vectorize-hashing-for-exchange

@pgandhi999
Member

Description

Previously, @pettyjamesm implemented a vectorized approach to combined columnar hash calculation via codegen (PR #19302) for the FlatGroupByHash operator in OSS Trino. This PR extends that work to the Partitioned Exchange and Local Exchange operators.
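
To make the idea concrete, here is a minimal sketch of the difference between row-at-a-time and column-at-a-time hash combination. It is not the code from this PR or from #19302: `ColumnarHashSketch`, `hashValue`, and `combine` are hypothetical names, the columns are plain `long[]` arrays rather than Trino blocks, and the hash and combine functions are assumed placeholders.

```java
import java.util.Arrays;

// Illustrative sketch only: class and method names are hypothetical, not Trino APIs.
final class ColumnarHashSketch {
    // Assumed per-value hash for a long-valued column (a simple multiplicative mix).
    static long hashValue(long value) {
        return value * 0x9E3779B97F4A7C15L;
    }

    // Assumed combine step; the combiner used in Trino may differ.
    static long combine(long previous, long valueHash) {
        return 31 * previous + valueHash;
    }

    // Row-at-a-time: the inner loop walks across columns for a single position.
    static long[] rowWise(long[][] columns, int positionCount) {
        long[] hashes = new long[positionCount];
        for (int position = 0; position < positionCount; position++) {
            long hash = 0;
            for (long[] column : columns) {
                hash = combine(hash, hashValue(column[position]));
            }
            hashes[position] = hash;
        }
        return hashes;
    }

    // Column-at-a-time: the inner loop is a tight pass over one column's values.
    static long[] columnar(long[][] columns, int positionCount) {
        long[] hashes = new long[positionCount];
        for (long[] column : columns) {
            for (int position = 0; position < positionCount; position++) {
                hashes[position] = combine(hashes[position], hashValue(column[position]));
            }
        }
        return hashes;
    }

    public static void main(String[] args) {
        long[][] columns = {
            {1, 2, 3, 4},
            {10, 20, 30, 40},
        };
        // Both loop orders yield the same hashes; only the memory access pattern differs.
        System.out.println(Arrays.equals(rowWise(columns, 4), columnar(columns, 4)));
    }
}
```

Both methods compute the same per-position hash. The columnar form keeps the inner loop over a single array with no per-row dispatch across column types, which is generally the shape that a JIT compiler can optimize more readily.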

The results from running BenchmarkPartitionedOutputOperator.verifyAddPage with a varying number of columns are summarized below:

Environment: macOS local machine
Partition Count: 256
Runtime JDK: Java 25
Page Count: 5000

| Number of Columns | Baseline (InterpretedHashGenerator), seconds | Prototype (codegen-based vectorized HashGenerator), seconds | % Gain |
| --- | --- | --- | --- |
| 1 | 3.1 | 2.6 | 16.12903 |
| 5 | 5.4 | 3.8 | 29.62963 |
| 10 | 8 | 6.3 | 21.25 |
| 50 | 40 | 28 | 30 |
| 100 | 93 | 69 | 25.80645 |
| 500 | 603 | 343 | 43.11774 |
| 1000 | 1065 | 623 | 41.50235 |
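
The "% Gain" values are consistent with the relative reduction in runtime versus the baseline, i.e. (baseline - prototype) / baseline * 100; for the 50-column row that is (40 - 28) / 40 = 30%. The PR does not state the formula, so treat this as an inferred definition. A small sketch that recomputes the column, with `GainSketch` and `percentGain` as hypothetical names:

```java
// Illustrative sketch: GainSketch and percentGain are hypothetical names.
final class GainSketch {
    // Assumed definition of "% Gain": relative reduction in runtime versus the baseline.
    static double percentGain(double baselineSeconds, double prototypeSeconds) {
        return (baselineSeconds - prototypeSeconds) / baselineSeconds * 100.0;
    }

    public static void main(String[] args) {
        // {columns, baseline seconds, prototype seconds}, copied from the results above.
        double[][] rows = {
            {1, 3.1, 2.6}, {5, 5.4, 3.8}, {10, 8, 6.3}, {50, 40, 28},
            {100, 93, 69}, {500, 603, 343}, {1000, 1065, 623},
        };
        for (double[] row : rows) {
            System.out.printf("%d columns: %.5f%% gain%n", (int) row[0], percentGain(row[1], row[2]));
        }
    }
}
```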
       

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`issuenumber`)

@pgandhi999
Member Author

The old PR (#27607) was accidentally closed when I force-pushed my branch, so I had to create a new pull request.

@starburstdata-automation

starburstdata-automation commented Dec 11, 2025

Started benchmark workflow for this PR with test type = iceberg/sf1000_parquet_part.

Building Trino finished with status: success
Benchmark finished with status: success
Comparing results to the static baseline values, follow above workflow link for more details/logs.
Status message: NO Regression found.
Benchmark Comparison to the closest run from Master: Report

@starburstdata-automation

starburstdata-automation commented Dec 11, 2025

Started benchmark workflow for this PR with test type = iceberg/sf1000_parquet_unpart.

Building Trino finished with status: failure
Building Trino finished with status: success
Benchmark finished with status: success
Comparing results to the static baseline values, follow above workflow link for more details/logs.
Status message: NO Regression found.
Benchmark Comparison to the closest run from Master: Report

@pgandhi999
Member Author

Related correspondence and discussion for this work can be found in PR #27657.

@pettyjamesm
Member

pettyjamesm commented Dec 17, 2025

Probably close in favor of #27657?

@pgandhi999
Member Author

Closing the PR in favor of #27657, thank you.

pgandhi999 closed this Dec 17, 2025