
Optimize hash code generation for exchanges #27657

Merged
raunaqmorarka merged 4 commits into trinodb:master from raunaqmorarka:raunaq/hashing-opt on Dec 17, 2025

Conversation

@raunaqmorarka
Member

@raunaqmorarka raunaqmorarka commented Dec 15, 2025

Description

Optimize InterpretedHashGenerator using bytecode generation and batched loops.
This also brings optimized handling of dictionaries to FlatGroupByHash.

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## General
* Improve performance of queries with data exchanges or aggregations. ({issue}`27657`)

@cla-bot cla-bot bot added the cla-signed label Dec 15, 2025
@github-actions github-actions bot added the delta-lake Delta Lake connector label Dec 15, 2025
@github-actions github-actions bot added hive Hive connector postgresql PostgreSQL connector labels Dec 15, 2025

Copilot AI left a comment


Pull request overview

This PR optimizes hash code generation for data exchanges and aggregations by introducing bytecode generation and batched loop processing through a new NullSafeHashCompiler class. The optimization focuses on improving the performance of hash computation across multiple positions in a single operation rather than computing hashes one position at a time.

Key changes:

  • Introduced NullSafeHashCompiler and NullSafeHash classes for bytecode-based hash generation with batched operations
  • Enhanced InterpretedHashGenerator to use batched hash computation with specialized handling for RLE and dictionary blocks
  • Added batched methods (getBuckets, getPartitions) to BucketFunction and PartitionFunction interfaces with default implementations
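The batched-API idea in the last two bullets can be sketched with a simplified stand-in. This is not Trino's actual SPI: plain `long[]` columns replace `Page`/`Block`, and `SimplePartitionFunction` plus the modulo partitioning are hypothetical names used only for illustration. The point is the shape of the API: a per-position method, plus a default batched method that implementations can override with a tighter loop.

```java
import java.util.Arrays;

// Simplified sketch of the batched-API idea (assumed, not Trino's SPI).
interface SimplePartitionFunction {
    int getPartition(long[] column, int position);

    // Default batched variant: one partition per position in a single loop.
    // Implementations can override this to avoid per-position virtual calls.
    default void getPartitions(long[] column, int positionOffset, int length, int[] partitions) {
        for (int i = 0; i < length; i++) {
            partitions[i] = getPartition(column, positionOffset + i);
        }
    }
}

public class BatchedPartitionDemo {
    public static void main(String[] args) {
        // Toy partition function: value modulo 4.
        SimplePartitionFunction mod4 = (column, position) -> (int) Math.floorMod(column[position], 4);
        long[] values = {10, 11, 12, 13, 14};
        int[] partitions = new int[values.length];
        mod4.getPartitions(values, 0, values.length, partitions);
        System.out.println(Arrays.toString(partitions)); // [2, 3, 0, 1, 2]
    }
}
```

The batched default keeps every existing implementation working, while hot paths can supply a specialized override that the JIT can compile as a monomorphic loop.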

Reviewed changes

Copilot reviewed 57 out of 57 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| core/trino-main/src/main/java/io/trino/operator/NullSafeHashCompiler.java | New compiler class that generates bytecode for efficient batched hash computation |
| core/trino-main/src/main/java/io/trino/operator/NullSafeHash.java | New interface defining batched hash methods for single blocks |
| core/trino-main/src/main/java/io/trino/operator/InterpretedHashGenerator.java | Refactored to use batched hash operations with optimized RLE and dictionary handling |
| core/trino-main/src/main/java/io/trino/operator/FlatHashStrategyCompiler.java | Updated to delegate batched hash operations to InterpretedHashGenerator |
| core/trino-spi/src/main/java/io/trino/spi/connector/BucketFunction.java | Added getBuckets method for batch bucket computation |
| core/trino-main/src/main/java/io/trino/operator/PartitionFunction.java | Added getPartitions method for batch partition computation |
| core/trino-main/src/main/java/io/trino/operator/HashGenerator.java | Added hash and getPartitions methods for batch operations |
| core/trino-main/src/main/java/io/trino/operator/output/PagePartitioner.java | Simplified to use batched partition computation, removing dictionary-specific optimization |
| core/trino-main/src/main/java/io/trino/sql/planner/SystemPartitioningHandle.java | Updated to use NullSafeHashCompiler instead of TypeOperators |
| plugin/trino-hive/src/main/java/io/trino/plugin/hive/HivePageSink.java | Updated to use batched bucket computation |
| Various test files | Updated test setup to provide NullSafeHashCompiler instances |
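A rough illustration of the dictionary-block handling mentioned for InterpretedHashGenerator: hash each distinct dictionary value once, then project those hashes through the id array, instead of rehashing every position. This is an assumed sketch with a toy hash function and plain arrays, not the merged code, which works on Trino's Block abstractions and type hash operators.

```java
import java.util.Arrays;

// Sketch (assumed, not Trino's code) of the dictionary-block hashing optimization.
public class DictionaryHashDemo {
    // Cheap stand-in hash; real code would use the type's hash operator.
    static long hashValue(long value) {
        long h = value * 0x9E3779B97F4A7C15L;
        return h ^ (h >>> 32);
    }

    static long[] hashDictionaryBlock(long[] dictionary, int[] ids) {
        // Hash only the distinct dictionary values...
        long[] dictionaryHashes = new long[dictionary.length];
        for (int i = 0; i < dictionary.length; i++) {
            dictionaryHashes[i] = hashValue(dictionary[i]);
        }
        // ...then expand them to per-position hashes via the id mapping,
        // which is a plain array lookup rather than a hash computation.
        long[] hashes = new long[ids.length];
        for (int position = 0; position < ids.length; position++) {
            hashes[position] = dictionaryHashes[ids[position]];
        }
        return hashes;
    }

    public static void main(String[] args) {
        long[] dictionary = {100, 200};
        int[] ids = {0, 1, 1, 0, 0};
        System.out.println(Arrays.toString(hashDictionaryBlock(dictionary, ids)));
    }
}
```

For RLE blocks the same idea degenerates further: hash the single repeated value once and fill the output array. The win grows with the ratio of positions to distinct values.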


@starburstdata-automation

starburstdata-automation commented Dec 15, 2025

Started benchmark workflow for this PR with test type = iceberg/sf1000_parquet_part.

Building Trino finished with status: success
Benchmark finished with status: failure
Comparing results to the static baseline values, follow above workflow link for more details/logs.
Status message: Found regressions for: (presto/tpcds, q09, totalCpuTime, over by 24.6%)
Benchmark Comparison to the closest run from Master: Report

@starburstdata-automation

starburstdata-automation commented Dec 15, 2025

Started benchmark workflow for this PR with test type = iceberg/sf1000_parquet_unpart.

Building Trino finished with status: success
Benchmark finished with status: success
Comparing results to the static baseline values, follow above workflow link for more details/logs.
Status message: NO Regression found.
Benchmark Comparison to the closest run from Master: Report

@raunaqmorarka
Member Author

[Screenshots attached: Screenshot 2025-12-16 at 11 52 52 AM, Screenshot 2025-12-16 at 11 51 28 AM]

Member

@pettyjamesm pettyjamesm left a comment


Mostly LGTM, some small comments and I do think there are some places where the temporary allocations might be ripe for further improvements.

```java
for (int position = 0; position < partitionPage.getPositionCount(); position++) {
    int partition = partitionFunction.getPartition(partitionPage, position);
    partitionAssignments[partition].add(position);
    partitionAssignments[partitions[position]].add(position);
```
Member


There's probably an improvement to be gained by fusing this loop with partitionAssignments.getPartitions, but we can worry about that as a follow up.

Member Author


That is better explored as a follow-up; I expect it might need a different method in PartitionFunction.
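The fusion being discussed could look roughly like the following toy version: compute the partition and append the position to its per-partition buffer in one pass, skipping the intermediate `int[] partitions` array entirely. `partitionAndAssign` is a hypothetical name, and plain arrays and lists stand in for Trino's `Page` and position builders.

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of fusing partition computation with assignment (assumed, not the PR's code).
public class FusedPartitionDemo {
    static void partitionAndAssign(long[] column, int partitionCount, List<List<Integer>> assignments) {
        for (int position = 0; position < column.length; position++) {
            // Partition and assign in the same loop iteration: no intermediate
            // partitions array is materialized between the two steps.
            int partition = (int) Math.floorMod(column[position], partitionCount);
            assignments.get(partition).add(position);
        }
    }

    public static void main(String[] args) {
        int partitionCount = 2;
        List<List<Integer>> assignments = new ArrayList<>();
        for (int i = 0; i < partitionCount; i++) {
            assignments.add(new ArrayList<>());
        }
        partitionAndAssign(new long[]{5, 6, 7, 8}, partitionCount, assignments);
        System.out.println(assignments); // [[1, 3], [0, 2]]
    }
}
```

As the author notes, exposing this in the real code would likely mean a new method on PartitionFunction rather than a change to the existing batched one.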


```java
default void getPartitions(int partitionCount, int positionOffset, Page page, int length, int[] partitions)
{
    long[] hashes = new long[length];
```
Member


Reusing this array is likely to be beneficial for most operator use cases, since allocating a new instance on each invocation is non-trivial allocation pressure. Having this default implementation seems like a performance hazard.

Member Author


The allocation here is very short-lived, and the JVM is pretty good at optimizing for that. Trying to reuse the array adds some complexity, as it has to be passed down from the calling operator, where it potentially needs to be tracked as a retained memory allocation. Since we haven't observed a problem with this in production for a while, I'm inclined to keep it simple for now and explore reuse as a follow-up.
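The array-reuse alternative being weighed here might look roughly like this sketch (an assumption, not the merged code): the caller retains a scratch buffer and grows it only when a batch is larger than anything seen before, trading one retained allocation for many short-lived ones.

```java
// Sketch of the array-reuse idea under discussion (hypothetical, not the PR's code).
public class ScratchBufferDemo {
    // Retained scratch buffer; in an operator this would need to be
    // accounted for as retained memory.
    private long[] hashScratch = new long[0];

    long[] scratchFor(int length) {
        // Grow only when a larger batch arrives; otherwise reuse.
        if (hashScratch.length < length) {
            hashScratch = new long[length];
        }
        return hashScratch;
    }

    public static void main(String[] args) {
        ScratchBufferDemo demo = new ScratchBufferDemo();
        long[] first = demo.scratchFor(1024);
        long[] second = demo.scratchFor(512);   // smaller request reuses the same buffer
        System.out.println(first == second);    // true
        System.out.println(demo.scratchFor(2048).length); // 2048
    }
}
```

The trade-off matches the thread above: less allocation pressure per invocation, at the cost of threading the buffer through callers and tracking it as retained memory.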

@pgandhi999
Member

pgandhi999 commented Dec 16, 2025

I also got a chance to run the micro benchmark that I had performed for my PR (#27610) on your PR, and the results align with what was stated there. Thank you @raunaqmorarka.

Introduce batched implementations for hashing pages for exchange
Use byte code generation to avoid megamorphic call sites in hash code generation
@raunaqmorarka raunaqmorarka merged commit d8c8057 into trinodb:master Dec 17, 2025
102 checks passed
@raunaqmorarka raunaqmorarka deleted the raunaq/hashing-opt branch December 17, 2025 15:26
@github-actions github-actions bot added this to the 480 milestone Dec 17, 2025
@starburstdata-automation

starburstdata-automation commented Dec 17, 2025

Started benchmark workflow for this PR with test type = iceberg/sf1000_parquet_part.

Building Trino finished with status: success
Benchmark finished with status: failure
Comparing results to the static baseline values, follow above workflow link for more details/logs.
Status message: Found regressions for: (presto/tpcds, q09, totalCpuTime, over by 31.1%)
Benchmark Comparison to the closest run from Master: Report

@starburstdata-automation

starburstdata-automation commented Dec 17, 2025

Started benchmark workflow for this PR with test type = iceberg/sf1000_parquet_unpart.

Building Trino finished with status: success
Benchmark finished with status: success
Comparing results to the static baseline values, follow above workflow link for more details/logs.
Status message: NO Regression found.
Benchmark Comparison to the closest run from Master: Report


Labels

cla-signed · delta-lake (Delta Lake connector) · hive (Hive connector) · performance · postgresql (PostgreSQL connector)


6 participants