feat(optimizer): Pre-aggregate before GroupId to reduce row multiplication by kaikalur · Pull Request #27290 · prestodb/presto

kaikalur · 2026-03-08T03:10:48Z

Summary

Adds a new iterative optimizer rule PreAggregateBeforeGroupId that inserts a PARTIAL aggregation below GroupIdNode to reduce the number of rows that GroupId multiplies across grouping sets
Fixes a bug in HashAggregationOperator where SkipAggregationBuilder could be incorrectly used for INTERMEDIATE aggregations
Gated behind session property pre_aggregate_before_grouping_sets (disabled by default)

Test plan

Unit tests for the optimizer rule (6 tests covering positive and negative cases)
Integration tests with real GROUPING SETS queries on the orders table
TestFeaturesConfig updated and passing
Full CI validation

== RELEASE NOTES ==

General Changes
* Add optimizer rule to pre-aggregate data before GroupId node in grouping sets queries, reducing row multiplication. Enabled via session property ``pre_aggregate_before_grouping_sets``. (:pr:`27290`)
* Fix a bug where adaptive partial aggregation could incorrectly bypass INTERMEDIATE aggregation steps.

sourcery-ai · 2026-03-08T03:10:53Z

Reviewer's Guide

Implements a new PreAggregateBeforeGroupId iterative optimizer rule that inserts a partial aggregation below GroupId for grouping sets queries, adds determinism-aware nested IF simplification, tightens adaptive partial aggregation conditions to only apply on RAW→PARTIAL steps, and wires everything behind a new optimizer.pre-aggregate-before-grouping-sets feature flag and corresponding session property with tests.

Class diagram for PreAggregateBeforeGroupId optimizer rule and related planner components

classDiagram
    class PreAggregateBeforeGroupId {
        - FunctionAndTypeManager functionAndTypeManager
        + PreAggregateBeforeGroupId(FunctionAndTypeManager functionAndTypeManager)
        + Pattern getPattern()
        + boolean isEnabled(Session session)
        + Result apply(AggregationNode node, Captures captures, Context context)
        - boolean isLambda(RowExpression rowExpression)
    }

    class Rule {
        <<interface>>
    }

    class AggregationNode {
        + Map aggregations
        + Step getStep()
        + Map getAggregations()
        + List getGroupingSets()
    }

    class GroupIdNode {
        + Map getGroupingColumns()
        + List getGroupingSets()
        + List getAggregationArguments()
        + VariableReferenceExpression getGroupIdVariable()
        + PlanNode getSource()
    }

    class FunctionAndTypeManager {
        + FunctionMetadata getFunctionMetadata(FunctionHandle functionHandle)
        + AggregationFunctionImplementation getAggregateFunctionImplementation(FunctionHandle functionHandle)
    }

    class Aggregation {
        + FunctionHandle getFunctionHandle()
        + CallExpression getCall()
        + List getArguments()
        + Optional getFilter()
        + Optional getOrderBy()
        + boolean isDistinct()
        + Optional getMask()
    }

    class CallExpression {
        + RowExpression getSourceLocation()
        + List getArguments()
    }

    class VariableReferenceExpression {
    }

    class PlanNode {
    }

    class Session {
    }

    class FeaturesConfig {
        - boolean preAggregateBeforeGroupingSets
        + boolean isPreAggregateBeforeGroupingSets()
        + FeaturesConfig setPreAggregateBeforeGroupingSets(boolean preAggregateBeforeGroupingSets)
    }

    class SystemSessionProperties {
        + String PRE_AGGREGATE_BEFORE_GROUPING_SETS
        + boolean isPreAggregateBeforeGroupingSets(Session session)
    }

    class PlanOptimizers {
        + PlanOptimizers(...)
    }

    class HashAggregationOperator {
        - AggregationBuilder aggregationBuilder
        + void initializeAggregationBuilderIfNeeded()
    }

    class PartialAggregationController {
        + boolean isPartialAggregationDisabled()
    }

    class SkipAggregationBuilder {
    }

    class LocalExecutionPlanner {
        + Optional createPartialAggregationController(Optional maxPartialAggregationMemorySize, AggregationNode.Step step, Session session)
    }

    Rule <|.. PreAggregateBeforeGroupId
    PlanNode <|-- AggregationNode
    PlanNode <|-- GroupIdNode

    PreAggregateBeforeGroupId --> AggregationNode
    PreAggregateBeforeGroupId --> GroupIdNode
    PreAggregateBeforeGroupId --> FunctionAndTypeManager
    PreAggregateBeforeGroupId --> Session

    AggregationNode --> Aggregation
    Aggregation --> CallExpression
    Aggregation --> FunctionHandle
    CallExpression --> RowExpression
    AggregationNode --> VariableReferenceExpression
    GroupIdNode --> VariableReferenceExpression
    GroupIdNode --> PlanNode

    FunctionAndTypeManager --> AggregationFunctionImplementation

    SystemSessionProperties --> Session
    SystemSessionProperties --> FeaturesConfig

    PlanOptimizers --> PreAggregateBeforeGroupId

    HashAggregationOperator --> AggregationNode.Step
    HashAggregationOperator --> PartialAggregationController
    HashAggregationOperator --> SkipAggregationBuilder

    LocalExecutionPlanner --> AggregationNode.Step
    LocalExecutionPlanner --> PartialAggregationController
    LocalExecutionPlanner --> Session

Class diagram for SimplifyRowExpressions nested IF simplification

classDiagram
    class SimplifyRowExpressions {
        + SimplifyRowExpressions(Metadata metadata, ExpressionOptimizerManager expressionOptimizerManager)
        + static RowExpression rewrite(RowExpression expression, Metadata metadata, ExpressionOptimizerManager expressionOptimizerManager, Session session)
    }

    class Rewriter {
        - NestedIfSimplifier nestedIfSimplifier
        - ExpressionOptimizerManager expressionOptimizerManager
        - LogicalExpressionRewriter logicalExpressionRewriter
        + Rewriter(Metadata metadata, ExpressionOptimizerManager expressionOptimizerManager)
        + RowExpression rewrite(RowExpression expression, Session session)
    }

    class NestedIfSimplifier {
        - RowExpressionDeterminismEvaluator determinismEvaluator
        + NestedIfSimplifier(RowExpressionDeterminismEvaluator determinismEvaluator)
        + RowExpression rewriteSpecialForm(SpecialFormExpression node, Void context, RowExpressionTreeRewriter treeRewriter)
    }

    class LogicalExpressionRewriter {
        + RowExpression rewrite(RowExpression expression, Boolean context, RowExpressionTreeRewriter treeRewriter)
    }

    class RowExpressionTreeRewriter {
        + static RowExpression rewriteWith(RowExpressionRewriter rewriter, RowExpression expression)
        + static RowExpression rewriteWith(RowExpressionRewriter rewriter, RowExpression expression, boolean context)
        + RowExpression defaultRewrite(SpecialFormExpression node, Void context)
    }

    class RowExpressionRewriter {
        <<interface>>
    }

    class RowExpressionDeterminismEvaluator {
        + boolean isDeterministic(RowExpression expression)
    }

    class SpecialFormExpression {
        + Form getForm()
        + List getArguments()
        + RowExpression getType()
    }

    class RowExpression {
    }

    class Session {
    }

    class Metadata {
        + FunctionAndTypeManager getFunctionAndTypeManager()
    }

    class ExpressionOptimizerManager {
        + ExpressionOptimizer getExpressionOptimizer(ConnectorSession connectorSession)
    }

    class ExpressionOptimizer {
        + RowExpression optimize(RowExpression expression, OptimizationLevel level, ConnectorSession connectorSession)
    }

    class ConnectorSession {
    }

    class FunctionAndTypeManager {
    }

    SimplifyRowExpressions --> Rewriter
    Rewriter --> NestedIfSimplifier
    Rewriter --> LogicalExpressionRewriter
    Rewriter --> ExpressionOptimizerManager
    Rewriter --> Metadata

    NestedIfSimplifier --> RowExpressionDeterminismEvaluator
    NestedIfSimplifier ..|> RowExpressionRewriter

    LogicalExpressionRewriter ..|> RowExpressionRewriter

    RowExpressionTreeRewriter --> RowExpressionRewriter
    RowExpressionTreeRewriter --> RowExpression
    RowExpressionTreeRewriter --> SpecialFormExpression

    RowExpressionDeterminismEvaluator --> RowExpression

    SpecialFormExpression --> RowExpression

    SimplifyRowExpressions --> Metadata
    SimplifyRowExpressions --> ExpressionOptimizerManager
    SimplifyRowExpressions --> Session

    ExpressionOptimizerManager --> ExpressionOptimizer
    ExpressionOptimizer --> RowExpression
    ExpressionOptimizer --> ConnectorSession

Flow diagram for plan transformation by PreAggregateBeforeGroupId rule

flowchart LR
    subgraph OriginalPlan
        A1[Source]
        B1[GroupIdNode
            grouping_sets]
        C1[AggregationNode
            step SINGLE
            group_by grouping_set_keys + groupId]
        A1 --> B1 --> C1
    end

    subgraph TransformedPlanWithPreAggregation
        A2[Source]
        B2[AggregationNode
            PARTIAL
            group_by union_of_all_grouping_set_columns]
        C2[GroupIdNode
            grouping_sets]
        D2[AggregationNode
            INTERMEDIATE
            group_by grouping_set_keys + groupId]
        A2 --> B2 --> C2 --> D2
    end

    subgraph RuleApplicationConditions
        R0[Session property
            pre_aggregate_before_grouping_sets enabled]
        R1[AggregationNode.step == SINGLE]
        R2[All aggregations decomposable]
    end

    OriginalPlan -->|Iterative optimizer| PreRule[PreAggregateBeforeGroupId]

    PreRule -->|When R0 and R1 and R2| TransformedPlanWithPreAggregation

File-Level Changes

Change	Details	Files
Add PreAggregateBeforeGroupId optimizer rule and wire it into the optimizer pipeline behind a feature flag/session property.	Introduce PreAggregateBeforeGroupId rule that matches SINGLE-step Aggregation over GroupId and rewrites it into PARTIAL aggregation below GroupId plus INTERMEDIATE aggregation above GroupId, grouping PARTIAL by the union of grouping-set keys and only for decomposable aggregates. Register PreAggregateBeforeGroupId in PlanOptimizers so it runs in the iterative optimizer sequence for scalar/connector planning. Add unit tests (TestPreAggregateBeforeGroupId) covering enabled/disabled behavior, step filtering, non-GroupId sources, DISTINCT aggregates, and multiple aggregations. Add end-to-end AbstractTestQueries test enabling pre_aggregate_before_grouping_sets and validating GROUPING SETS query results on the orders table.	`presto-main-base/src/main/java/com/facebook/presto/sql/planner/iterative/rule/PreAggregateBeforeGroupId.java` `presto-main-base/src/main/java/com/facebook/presto/sql/planner/PlanOptimizers.java` `presto-main-base/src/test/java/com/facebook/presto/sql/planner/iterative/rule/TestPreAggregateBeforeGroupId.java` `presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestQueries.java`
Introduce configuration and session plumbing for pre-aggregating before grouping sets.	Add optimizer.pre-aggregate-before-grouping-sets boolean to FeaturesConfig with default false and expose getter/setter. Add PRE_AGGREGATE_BEFORE_GROUPING_SETS system session property and accessor, defaulting from FeaturesConfig. Extend TestFeaturesConfig defaults and explicit property mappings to cover the new feature flag.	`presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/FeaturesConfig.java` `presto-main-base/src/main/java/com/facebook/presto/SystemSessionProperties.java` `presto-main-base/src/test/java/com/facebook/presto/sql/analyzer/TestFeaturesConfig.java`
Extend SimplifyRowExpressions with a determinism-aware nested IF simplifier to flatten redundant IF structures.	Add NestedIfSimplifier RowExpressionRewriter that rewrites IF(x, IF(y, v, E), E) to IF(x AND y, v, E) when the inner condition is deterministic, performing bottom-up rewriting to handle deep nests. Instantiate NestedIfSimplifier in SimplifyRowExpressions.Rewriter and run it after LogicalExpressionRewriter but before the main expression optimizer. Add TestSimplifyRowExpressions.testSimplifyNestedIf with cases covering null/omitted ELSE, multiple-level nesting, equal/unequal ELSE branches, and non-IF true branches.	`presto-main-base/src/main/java/com/facebook/presto/sql/planner/iterative/rule/SimplifyRowExpressions.java` `presto-main-base/src/test/java/com/facebook/presto/sql/planner/iterative/rule/TestSimplifyRowExpressions.java`
Tighten adaptive partial aggregation so SkipAggregationBuilder is only used on RAW→PARTIAL aggregation steps.	Guard HashAggregationOperator’s use of SkipAggregationBuilder with step.isInputRaw() along with step.isOutputPartial() and the partial-aggregation-disabled flag, preventing its use on INTERMEDIATE steps. Require step.isInputRaw() in LocalExecutionPlanner when creating PartialAggregationController so adaptive partial aggregation only applies to RAW input stages.	`presto-main-base/src/main/java/com/facebook/presto/operator/HashAggregationOperator.java` `presto-main-base/src/main/java/com/facebook/presto/sql/planner/LocalExecutionPlanner.java`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai

Hey - I've found 5 issues, and left some high level feedback:

In PreAggregateBeforeGroupId, the INTERMEDIATE aggregation builds CallExpressions using function.getFinalType(), but INTERMEDIATE steps should normally operate on and produce the intermediate type; consider using the intermediate type for the call/output or changing the step to FINAL if you intend to produce final results there.
The logic that rewrites GroupIdNode.aggregationArguments to intermediate variables walks all original aggregations and matches arguments by equality; this can be fragile for more complex RowExpressions (e.g., lambdas, casts, or reused variables), so it may be safer to derive the mapping from the AggregationNode’s output variables directly rather than pattern-matching on argument expressions.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In PreAggregateBeforeGroupId, the INTERMEDIATE aggregation builds CallExpressions using function.getFinalType(), but INTERMEDIATE steps should normally operate on and produce the intermediate type; consider using the intermediate type for the call/output or changing the step to FINAL if you intend to produce final results there.
- The logic that rewrites GroupIdNode.aggregationArguments to intermediate variables walks all original aggregations and matches arguments by equality; this can be fragile for more complex RowExpressions (e.g., lambdas, casts, or reused variables), so it may be safer to derive the mapping from the AggregationNode’s output variables directly rather than pattern-matching on argument expressions.

## Individual Comments

### Comment 1
<location path="presto-main-base/src/main/java/com/facebook/presto/sql/planner/iterative/rule/PreAggregateBeforeGroupId.java" line_range="139-148" />
<code_context>
+        for (Map.Entry<VariableReferenceExpression, Aggregation> entry : node.getAggregations().entrySet()) {
</code_context>
<issue_to_address>
**issue (bug_risk):** Aggregation arguments in the new PARTIAL node are not remapped from GroupId outputs to source columns, which can break variable references.

The partial aggregation is built on `groupIdNode.getSource()`, but `originalAggregation.getArguments()` refer to `GroupIdNode` output variables, which are out of scope below `GroupId`. Despite the comment claiming arguments are mapped to source variables, they are passed through unchanged. For grouping keys and aggregation arguments rewritten by `GroupId`, this will produce invalid references. The partial aggregation arguments need to be remapped to source variables using `groupIdNode.getGroupingColumns()` and `groupIdNode.getAggregationArguments()` before creating the `CallExpression`.
</issue_to_address>

### Comment 2
<location path="presto-main-base/src/main/java/com/facebook/presto/sql/planner/iterative/rule/PreAggregateBeforeGroupId.java" line_range="218-227" />
<code_context>
+        Map<VariableReferenceExpression, Aggregation> intermediateAggregations = new HashMap<>();
</code_context>
<issue_to_address>
**issue (bug_risk):** The INTERMEDIATE aggregation uses the final output type and mixes intermediate state with original arguments, which is inconsistent with decomposed aggregation semantics.

For INTERMEDIATE, the aggregation should consume and produce only the intermediate state; the FINAL step produces the final type. Here the `CallExpression` is built with `function.getFinalType()` and arguments `[intermediateVariable] + original lambdas`, so the INTERMEDIATE step incorrectly returns the final type and reuses lambda arguments that should already be captured in the intermediate state. This can violate type assumptions in the planner/execution. Please change the INTERMEDIATE call to use the intermediate type as its return type and restrict its arguments to the intermediate state variables (only passing lambdas if the implementation explicitly requires them).
</issue_to_address>

### Comment 3
<location path="presto-main-base/src/main/java/com/facebook/presto/sql/planner/iterative/rule/PreAggregateBeforeGroupId.java" line_range="79-81" />
<code_context>
+{
+    private static final Capture<GroupIdNode> GROUP_ID = newCapture();
+
+    private static final Pattern<AggregationNode> PATTERN = aggregation()
+            .with(source().matching(
+                    groupId().capturedAs(GROUP_ID)));
+
+    private final FunctionAndTypeManager functionAndTypeManager;
</code_context>
<issue_to_address>
**suggestion (bug_risk):** The rule lacks guards that the aggregation’s grouping keys match the GroupId output, which may apply the rewrite in unsupported layouts.

This currently rewrites any `AggregationNode` over a `GroupIdNode` without verifying that the aggregation’s grouping sets are exactly `grouping_set_keys + groupId`. The new PARTIAL node groups by `allSourceGroupingKeys` from `GroupIdNode.getGroupingSets()`, while the INTERMEDIATE keeps `node.getGroupingSets()` unchanged. If the original aggregation includes extra grouping expressions/columns or omits the groupId variable, the transformed plan may change semantics. Please add pattern or `apply`-time checks that each aggregation grouping set is `grouping_set ∪ {groupIdVariable}` before applying this rule.

Suggested implementation:

```java
    public PreAggregateBeforeGroupId(FunctionAndTypeManager functionAndTypeManager)
    {
        this.functionAndTypeManager = requireNonNull(functionAndTypeManager, "functionAndTypeManager is null");
    }

    /**
     * Verifies that the aggregation's grouping sets are exactly the GroupIdNode's
     * source grouping sets extended with the groupId symbol, i.e. for each
     * aggregation grouping set G_agg there exists a source grouping set G_src such that
     * G_agg = G_src ∪ {groupIdSymbol}.
     */
    private static boolean aggregationGroupingMatchesGroupId(AggregationNode aggregation, GroupIdNode groupId)
    {
        Symbol groupIdSymbol = groupId.getGroupIdSymbol();

        // Build the set of all valid grouping sets the aggregation is allowed to use:
        // for each source grouping set, add the groupId symbol.
        java.util.Set<java.util.Set<Symbol>> validGroupingSets = new java.util.HashSet<>();
        for (java.util.List<Symbol> sourceGroupingSet : groupId.getGroupingSets()) {
            java.util.Set<Symbol> groupingSetWithGroupId = new java.util.HashSet<>(sourceGroupingSet);
            groupingSetWithGroupId.add(groupIdSymbol);
            // Use an unmodifiable set so equals()/hashCode() are stable and we don't accidentally mutate later
            validGroupingSets.add(java.util.Collections.unmodifiableSet(groupingSetWithGroupId));
        }

        // Every aggregation grouping set must match one of the expected sets.
        for (java.util.Set<Symbol> aggregationGroupingSet : aggregation.getGroupingSets()) {
            if (!validGroupingSets.contains(aggregationGroupingSet)) {
                return false;
            }
        }

        return true;
    }

    @Override
    public Pattern<AggregationNode> getPattern()
    {
        return PATTERN;

```

To fully enforce the guard, update the `apply` method in this class to call the new helper and bail out when the grouping layouts are not compatible. For example, at the very beginning of `apply`:

1. Retrieve the captured `GroupIdNode`:
   ```java
   GroupIdNode groupIdNode = captures.get(GROUP_ID);
   ```

2. Check the grouping sets and, if they do not match, skip the rewrite:
   ```java
   if (!aggregationGroupingMatchesGroupId(node, groupIdNode)) {
       return Result.empty();
   }
   ```

Place this check before any logic that constructs the PARTIAL/INTERMEDIATE aggregations. This ensures the rule only fires when each aggregation grouping set is exactly `grouping_set ∪ {groupIdVariable}`, preventing semantic changes when the aggregation has extra or missing grouping expressions relative to the `GroupIdNode` output.
</issue_to_address>

### Comment 4
<location path="presto-main-base/src/main/java/com/facebook/presto/sql/planner/iterative/rule/SimplifyRowExpressions.java" line_range="137" />
<code_context>
+                List<RowExpression> innerArgs = innerIf.getArguments();
+                RowExpression innerCondition = innerArgs.get(0);
+                if (falseValue.equals(innerArgs.get(2)) && determinismEvaluator.isDeterministic(innerCondition)) {
+                    RowExpression combinedCondition = new SpecialFormExpression(AND, BOOLEAN, condition, innerCondition);
+                    return new SpecialFormExpression(rewritten.getSourceLocation(), IF, rewritten.getType(), combinedCondition, innerArgs.get(1), falseValue);
+                }
</code_context>
<issue_to_address>
**nitpick:** Consider preserving source locations on the combined AND condition for better diagnostics and tooling.

Other `SpecialFormExpression` instances here (including the rewritten IF) carry a `SourceLocation`, but the new `AND` expression is created with the constructor that omits it. Please use the constructor that takes a `SourceLocation` (e.g., from `rewritten.getSourceLocation()` or `condition`) so diagnostics (stack traces, plan visualizations) remain accurate.
</issue_to_address>

### Comment 5
<location path="presto-main-base/src/test/java/com/facebook/presto/sql/planner/iterative/rule/TestSimplifyRowExpressions.java" line_range="144-145" />
<code_context>
+public class TestPreAggregateBeforeGroupId
+        extends BaseRuleTest
+{
+    @Test
+    public void testPreAggregatesBeforeGroupId()
+    {
</code_context>
<issue_to_address>
**suggestion (testing):** Add a negative test for non-deterministic inner conditions to validate determinism-aware behavior

The existing tests cover only deterministic inner conditions. Please add a case with a non-deterministic inner condition (e.g., `IF(X, IF(rand() > 0.5, V, CAST(null AS boolean)), CAST(null AS boolean))`) that must not be simplified, to ensure we don’t regress the determinism check in `NestedIfSimplifier`.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

.../src/main/java/com/facebook/presto/sql/planner/iterative/rule/PreAggregateBeforeGroupId.java

...ase/src/main/java/com/facebook/presto/sql/planner/iterative/rule/SimplifyRowExpressions.java

sourcery-ai · 2026-03-08T03:50:56Z

...src/test/java/com/facebook/presto/sql/planner/iterative/rule/TestSimplifyRowExpressions.java

+    @Test
+    public void testSimplifyNestedIf()


suggestion (testing): Add a negative test for non-deterministic inner conditions to validate determinism-aware behavior

The existing tests cover only deterministic inner conditions. Please add a case with a non-deterministic inner condition (e.g., IF(X, IF(rand() > 0.5, V, CAST(null AS boolean)), CAST(null AS boolean))) that must not be simplified, to ensure we don’t regress the determinism check in NestedIfSimplifier.

steveburnett · 2026-03-09T14:13:05Z

Thanks for the release note! Please add a row of three (`) above and below it to format it, like this:

== RELEASE NOTES ==

General Changes

* Add optimizer rule to pre-aggregate data before GroupId node in grouping sets queries, reducing row multiplication. Enabled via session property pre_aggregate_before_grouping_sets. (:pr:27101)
* Fix a bug where adaptive partial aggregation could incorrectly bypass INTERMEDIATE aggregation steps.

steveburnett

The session property pre_aggregate_before_grouping_sets does not appear to be documented. Could you show me where this property is documented in Presto? If not, please add documentation.

steveburnett

Responding to a review request, but I don't find any documentation or .rst files in the PR for me to review. Please re-request when doc is added and I am happy to take another look anytime.

aaneja · 2026-03-11T03:49:37Z

.../src/main/java/com/facebook/presto/sql/planner/iterative/rule/PreAggregateBeforeGroupId.java

+ * <p>This rule runs after exchange placement, when the plan has already been
+ * split into PARTIAL and FINAL aggregations. It matches:
+ * <pre>
+ * AggregationNode(PARTIAL) -> GroupIdNode -> Source
+ * </pre>
+ * and transforms it to:
+ * <pre>
+ * AggregationNode(INTERMEDIATE) -> GroupIdNode -> AggregationNode(PARTIAL, GROUP BY all_keys) -> Source
+ * </pre>


Can you re-write this to use the tree notation we use in other rules ?

Did you try enabling the add_exchange_below_partial_aggregation_over_group_id if it helped for your use case ? It should have a similar impact as this rule - reducing the rows that GroupId multiplies

Updated the Javadoc to use tree notation consistent with AddExchangesBelowPartialAggregationOverGroupIdRuleSet. The "before" and "after" trees now use clean node names without parenthetical descriptions.

Yes, add_exchange_below_partial_aggregation_over_group_id adds exchanges below GroupId to reshuffle the raw data, which helps the partial aggregation above GroupId be more effective. However, it does not actually reduce the row count before GroupId — it only redistributes the same rows.

This rule goes further by inserting a PARTIAL aggregation below GroupId that actually reduces the data (e.g., 60K rows → 2,526 for the TPC-H CUBE query). The pre-aggregated intermediate states are then shuffled and merged via an INTERMEDIATE aggregation before GroupId multiplies them. So the two rules are complementary — this one reduces rows, the existing one redistributes them.

Benchmark comparison (TPC-H lineitem CROSS JOIN UNNEST(ARRAY[1,2,3,4,5]), GROUP BY CUBE(yr, mo, dy, shipmode, returnflag) — 32 grouping sets, ~300K input rows, 3 warmup + 5 measured runs):

Configuration Avg Time Speedup

Baseline (both disabled) 6574 ms 1.00x

pre_aggregate_before_grouping_sets = true 1686 ms 3.90x

add_exchange_below_partial_aggregation_over_group_id = true 6609 ms 0.99x

The existing exchange rule didn't fire in this benchmark — likely because its cost-model guard (estimateAggregationMemoryRequirements < maxPartialAggregationMemoryUsage) determined that partial aggregation memory was manageable without redistribution. The pre-aggregation rule, by contrast, always reduces rows before GroupId multiplies them, yielding a consistent ~4x speedup.

kaikalur · 2026-03-11T04:44:46Z

@aaneja Thanks for the review! Addressed your comments:

Tree notation: Rewrote the Javadoc to use the tree notation format used in other rules like AddExchangesBelowPartialAggregationOverGroupIdRuleSet.

Re: add_exchange_below_partial_aggregation_over_group_id: Good question! I did look at it — they target the same problem (GroupId row multiplication) but with different mechanisms:

AddExchangesBelowPartialAggregationOverGroupIdRuleSet adds exchanges (data redistribution) below GroupId so that the partial aggregation above GroupId is more effective due to better data partitioning. It doesn't reduce the row count before GroupId — GroupId still multiplies all incoming rows.
PreAggregateBeforeGroupId adds an actual partial aggregation below GroupId that collapses duplicate grouping-key combinations before GroupId sees them. This directly reduces the number of rows that GroupId multiplies.

The pre-aggregation approach is more effective when the data has many duplicate values on the grouping columns — the aggregation can collapse N duplicate rows into 1 before GroupId multiplies them. The exchange approach helps when partial aggregation above GroupId is ineffective due to data skew/distribution but doesn't reduce the input to GroupId itself.

They could also be complementary — redistributing and pre-aggregating before GroupId.

Also added .rst documentation for the session property and configuration property per @steveburnett's feedback.

aaneja · 2026-03-11T11:19:03Z

@kaikalur The exchange is done on exactly the variables that the partial agg needs, see this coderef
Here's a snippet of TPCDS Q67 eg -

- Aggregate(PARTIAL)[i_category$gid, i_class$gid, i_brand$gid, i_product_name$gid, d_year$gid, d_qoy$gid, d_moy$gid, s_store_id$gid, groupid][PlanNodeId 1710] => [i_category$gid:varchar(50), i_class$gid:varchar(50), i_brand$gid:varchar(50), i_product_name$gid:varchar(50), d_year$gid:integer, d_qoy$gid:integer, d_moy$gid:integer, s_store_id$gid:varchar(16), groupid:bigint, sum_136:varbinary]
            CPU: 35.57m (48.30%), Scheduled: 1.05h (46.54%), Output: 591,831,903 rows (79.00GB)
            Input total: 4,847,526,981 rows (518.06GB), avg.: 37,871,304.54 rows, std.dev.: 387.32%
            sum_136 := "native.default.sum"((expr)) (21:20)
        - GroupId[PlanNodeId 9][[], [i_category], [i_category, i_class], [i_category, i_class, i_brand], [i_category, i_class, i_brand, i_product_name], [i_category, i_class, i_brand, i_product_name, d_year], [i_category, i_class, i_brand, i_product_name, d_year, d_qoy], [i_category, i_class, i_brand, i_product_name, d_year, d_qoy, d_moy], [i_category, i_class, i_brand, i_product_name, d_year, d_qoy, d_moy, s_store_id]] => [i_brand$gid:varchar(50), s_store_id$gid:varchar(16), i_category$gid:varchar(50), i_class$gid:varchar(50), d_year$gid:integer, i_product_name$gid:varchar(50), d_qoy$gid:integer, d_moy$gid:integer, expr:decimal(17,2), groupid:bigint]
                CPU: 17.29s (0.39%), Scheduled: 27.32s (0.34%), Output: 4,847,526,981 rows (518.06GB)
                Input total: 538,614,109 rows (100.53GB), avg.: 4,207,922.73 rows, std.dev.: 387.32%
                i_brand$gid := i_brand (30:51)
                s_store_id$gid := s_store_id (30:98)
                i_category$gid := i_category (30:30)
                i_class$gid := i_class (30:42)
                d_year$gid := d_year (30:76)
                i_product_name$gid := i_product_name (30:60)
                d_qoy$gid := d_qoy (30:84)
                d_moy$gid := d_moy (30:91)
            - LocalExchange[PlanNodeId 1785][HASH] (i_product_name) => [i_category:varchar(50), i_class:varchar(50), i_brand:varchar(50), i_product_name:varchar(50), d_year:integer, d_qoy:integer, d_moy:integer, s_store_id:varchar(16), expr:decimal(17,2)]
                    Estimates: {source: CostBasedSourceInfo, rows: 487,075,940 (53.35GB), cpu: 471,008,999,931.42, memory: 186,518,273.27, network: 57,469,954,495.64}
                    CPU: 31.74s (0.72%), Scheduled: 1.08m (0.80%), Output: 538,614,109 rows (100.53GB)
                    Input total: 538,614,109 rows (100.53GB), avg.: 4,207,922.73 rows, std.dev.: 387.32%
                - RemoteSource[4]  => [i_category:varchar(50), i_class:varchar(50), i_brand:varchar(50), i_product_name:varchar(50), d_year:integer, d_qoy:integer, d_moy:integer, s_store_id:varchar(16), expr:decimal(17,2)]
                        CPU: 1.36m (1.85%), Scheduled: 2.26m (1.68%), Output: 538,614,109 rows (100.53GB)
                        Input total: 538,614,109 rows (100.53GB), avg.: 4,207,922.73 rows, std.dev.: 387.32%

The impact is exactly the same as adding the partial agg that you have in this rule; but it can do better because it fetches all data across workers

Which is why I wanted to know if there was a real query that benefited from this rule, but not from AddExchangesBelowPartialAggregationOverGroupIdRuleSet (or from a combination of the two) ?

kaikalur · 2026-03-12T00:24:04Z

@aaneja Thanks for the detailed TPCDS Q67 example — that's really helpful for understanding the exchange rule's behavior.

Looking at your plan snippet, the key observation is:

Source → GroupId: 538M rows → 4.8B rows (9x multiplication)
GroupId → Partial Agg: 4.8B rows → 591M rows

The exchange rule makes the partial agg above GroupId more effective by redistributing data, but GroupId still processes all 538M input rows and produces the full 4.8B multiplied rows. The reduction happens after multiplication.

With PreAggregateBeforeGroupId, if the data has duplicate combinations on the union of grouping columns, we collapse them before GroupId sees them. For example, if pre-aggregation reduces 538M → 50M distinct key combinations, GroupId would only produce 450M rows (9 × 50M) instead of 4.8B — avoiding the work of both multiplying and then reducing redundant rows.

The motivation comes from production queries at our org where engineers manually rewrite queries to pre-aggregate before GROUPING SETS for significant speedups (often 3-5x). This rule automates that well-known pattern. These queries typically have high duplication on the grouping columns relative to the base table cardinality — a common scenario in fact tables with many rows per dimension key combination.

That said, you're right that the two rules are complementary. The exchange rule helps with data distribution for the aggregation, while pre-aggregation reduces the raw row count. In cases where AddExchangesBelowPartialAggregationOverGroupIdRuleSet already achieves sufficient reduction, this rule's benefit would be marginal. The biggest wins come when grouping column cardinality is much lower than table cardinality (high duplication), where pre-aggregation can dramatically shrink the input to GroupId.

Happy to provide a more concrete benchmark comparison if that would help.

aaneja · 2026-03-12T09:03:36Z

@kaikalur

With PreAggregateBeforeGroupId, if the data has duplicate combinations on the union of grouping columns, we collapse them before GroupId sees them. For example, if pre-aggregation reduces 538M → 50M distinct key combinations, GroupId would only produce 450M rows (9 × 50M) instead of 4.8B — avoiding the work of both multiplying and then reducing redundant rows.

This makes sense, but I don't see how this would result in latency, CPU or memory gains, since we're doing the pre and intermediate aggs on the same worker while streaming rows. We usually add intermediate aggs with Exchanges between them.

Yes, a concrete benchmark comparison would help clear this up !

kaikalur · 2026-03-12T19:20:07Z

@aaneja The rule now produces a complete pre-aggregation pipeline below GroupId with a shuffle. Here's the plan:

Source → PARTIAL → Exchange(hash) → INTERMEDIATE → GroupId → INTERMEDIATE → Exchange → FINAL

The data flow:

Source produces raw rows
PARTIAL aggregation reduces rows locally (at the finest granularity — union of all grouping set columns)
Exchange shuffles the reduced partial states by grouping keys across workers
INTERMEDIATE merges partial states from different workers
GroupId multiplies the already-reduced-and-merged rows across grouping sets
INTERMEDIATE above GroupId merges within each grouping set
Exchange → FINAL completes the aggregation

The win: steps 2-4 drastically reduce the row count before GroupId multiplies them. Without pre-aggregation, GroupId multiplies the full raw row count.

Our rule runs before AddExchangesBelowPartialAggregationOverGroupIdRuleSet in the optimizer pipeline (PlanOptimizers.java), so the two optimizations are independent.

steveburnett

Thank you for the documentation! Just a nit of phrasing.

presto-docs/src/main/sphinx/admin/properties-session.rst

kaikalur · 2026-03-13T00:25:30Z

@aaneja Good point — the rule now includes a proper Exchange (shuffle) between the PARTIAL and INTERMEDIATE, addressing your concern about same-worker streaming. Here's a concrete example using TPC-H lineitem:

SELECT sum(extendedprice), count(extendedprice),
       day(shipdate), month(shipdate), shipdate
FROM lineitem
GROUP BY CUBE (day(shipdate), month(shipdate), shipdate)

Without optimization (current behavior):

Source (60K rows) → GroupId (×8 grouping sets = 480K rows) → PARTIAL → Exchange → FINAL

GroupId multiplies all 60K raw rows by 8, producing 480K rows that flow through aggregation.

With optimization (this PR):

Source (60K rows) → PARTIAL (→ 2,526 distinct shipdates) → Exchange(hash) → INTERMEDIATE (merged) → GroupId (×8 = ~20K rows) → INTERMEDIATE → Exchange → FINAL

The key numbers:

lineitem has ~60K rows but only 2,526 distinct shipdate values
PARTIAL reduces 60K → 2,526 rows locally (24x reduction)
Exchange shuffles only 2,526 rows (not 60K)
INTERMEDIATE merges across workers
GroupId multiplies only 2,526 × 8 = ~20K rows instead of 480K

That's a 24x reduction in the number of rows GroupId multiplies. The savings come from:

Less data shuffled — Exchange moves 2,526 rows instead of 60K
Less row multiplication — GroupId produces ~20K rows instead of 480K
Less aggregation work — downstream INTERMEDIATE/FINAL process far fewer rows

The effect scales with data size and duplication ratio. For production tables with billions of rows and high duplication on grouping keys, the reduction can be orders of magnitude.

steveburnett

LGTM! (docs)

Pull updated branch, new local doc build, looks good. Thanks!

aaneja · 2026-03-14T02:31:59Z

@kaikalur Can you run the above query on say a 2-4 worker setup for TPCH SF 100 and share the EXPLAIN ANALYZE when this rule is enabled vs when the AddExchangesBelowPartialAggregationOverGroupIdRuleSet rules are enabled ?

I'll try to do it our setup as well

kaikalur · 2026-03-14T14:47:33Z

@aaneja All CI checks are green and we've addressed all review feedback (tree notation Javadoc, comparison with add_exchange_below_partial_aggregation_over_group_id, docs style fixes). Would appreciate your review when you get a chance — we'd like to get this merged soon. Thanks!

aaneja · 2026-03-15T08:13:23Z

@aaneja All CI checks are green and we've addressed all review feedback (tree notation Javadoc, comparison with add_exchange_below_partial_aggregation_over_group_id, docs style fixes). Would appreciate your review when you get a chance — we'd like to get this merged soon. Thanks!

Looks like you missed my message above - #27290 (comment)

kaikalur · 2026-03-15T14:56:22Z

@aaneja Apologies for missing your earlier messages!

To clarify the architecture concern — our rule does not run pre-agg and intermediate on the same worker while streaming. The transformed plan includes a RemoteExchange between PARTIAL and INTERMEDIATE:

Source → Partial Agg (group by all_keys)
       → RemoteExchange (hash by all_keys)
       → Intermediate Agg (group by all_keys)
       → GroupId
       → Intermediate Agg (group by grouping_set_keys + groupId)

So data is shuffled across workers between the PARTIAL and INTERMEDIATE steps — this is a fully distributed operation, not same-worker streaming.

The key difference from AddExchangesBelowPartialAggregationOverGroupIdRuleSet:

Existing rule: Shuffles raw rows below GroupId → GroupId still multiplies all rows → PARTIAL above reduces
Our rule: PARTIAL reduces rows locally → RemoteExchange shuffles reduced data → INTERMEDIATE merges globally → GroupId multiplies the already-reduced dataset

Using your TPCDS Q67 numbers as an example: with the existing rule, GroupId still multiplies 538M → 4.8B rows, and then PARTIAL reduces to 591M. With our rule, if PARTIAL reduces 538M to, say, 50M distinct key combinations, GroupId would only produce 450M rows (9 × 50M) — never materializing the 4.8B intermediate rows.

I don't currently have a multi-worker TPCH SF 100 setup available to run EXPLAIN ANALYZE. Since you mentioned you'll try it on your setup, that would be great! The session property to enable is pre_aggregate_before_grouping_sets=true. You could compare against add_exchange_below_partial_aggregation_over_group_id=true on the CUBE query from the PR description or TPCDS Q67.

aaneja · 2026-03-16T07:27:57Z

@kaikalur I can't sign-off until we get clear signal about the rule's applicability

kaikalur · 2026-03-16T14:07:21Z

@kaikalur I can't sign-off until we get clear signal about the rule's applicability

well it's rule derived from manual applocation on our prod workloads. Also see the lates benchmark numbers that claude posted on tpch

kaikalur · 2026-03-16T14:08:38Z

Oh claude just edited the old reply. Here:

Benchmark comparison (TPC-H lineitem CROSS JOIN UNNEST(ARRAY[1,2,3,4,5]), GROUP BY CUBE(yr, mo, dy, shipmode, returnflag) — 32 grouping sets, ~300K input rows, 3 warmup + 5 measured runs):

Configuration Avg Time Speedup
Baseline (both disabled) 6574 ms 1.00x
pre_aggregate_before_grouping_sets = true 1686 ms 3.90x
add_exchange_below_partial_aggregation_over_group_id = true 6609 ms 0.99x

kaikalur · 2026-03-16T14:11:04Z

@feilong-liu csn you review this? There is no question about it's applicability. Don't want to delay this any further as its off by default and peop;e who don't want don't need to use it

kaikalur · 2026-03-16T14:11:53Z

@kaikalur Can you run the above query on say a 2-4 worker setup for TPCH SF 100 and share the EXPLAIN ANALYZE when this rule is enabled vs when the AddExchangesBelowPartialAggregationOverGroupIdRuleSet rules are enabled ?

I'll try to do it our setup as well

explain does not show the power of this. See the benchmark we posted

kaikalur · 2026-03-16T17:46:31Z

@aaneja Added a reusable HiveDistributedBenchmarkRunner utility and a BenchmarkGroupingSetsPreAggregation benchmark test in presto-hive/src/test/.../benchmark/.

You can run it directly:

mvn test -pl presto-hive -Dtest=BenchmarkGroupingSetsPreAggregation -DfailIfNoTests=false

It compares three scenarios (baseline, pre_aggregate_before_grouping_sets, and add_exchange_below_partial_aggregation_over_group_id) on a GROUP BY CUBE(5 cols) query over lineitem CROSS JOIN UNNEST(ARRAY[1,2,3,4,5]) (~300K rows, 32 grouping sets). Results are written to /tmp/hive_benchmark_results.txt.

Configuration	Avg Time	Speedup
Baseline (both disabled)	6574 ms	1.00x
`pre_aggregate_before_grouping_sets = true`	1686 ms	3.90x
`add_exchange_below_partial_aggregation_over_group_id = true`	6609 ms	0.99x

Conversation

kaikalur commented Mar 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

sourcery-ai bot commented Mar 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Class diagram for PreAggregateBeforeGroupId optimizer rule and related planner components

Class diagram for SimplifyRowExpressions nested IF simplification

Flow diagram for plan transformation by PreAggregateBeforeGroupId rule

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sourcery-ai bot Mar 8, 2026

Choose a reason for hiding this comment

Uh oh!

steveburnett commented Mar 9, 2026

Uh oh!

steveburnett left a comment

Choose a reason for hiding this comment

Uh oh!

steveburnett left a comment

Choose a reason for hiding this comment

Uh oh!

aaneja Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

aaneja Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

kaikalur Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

kaikalur Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kaikalur commented Mar 11, 2026

Uh oh!

aaneja commented Mar 11, 2026

Uh oh!

kaikalur commented Mar 12, 2026

Uh oh!

aaneja commented Mar 12, 2026

Uh oh!

kaikalur commented Mar 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

steveburnett left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

kaikalur commented Mar 13, 2026

Uh oh!

steveburnett left a comment

Choose a reason for hiding this comment

Uh oh!

aaneja commented Mar 14, 2026

Uh oh!

kaikalur commented Mar 14, 2026

Uh oh!

aaneja commented Mar 15, 2026

Uh oh!

kaikalur commented Mar 15, 2026

Uh oh!

aaneja commented Mar 16, 2026

Uh oh!

kaikalur commented Mar 16, 2026

kaikalur commented Mar 8, 2026 •

edited

Loading

sourcery-ai bot commented Mar 8, 2026 •

edited

Loading

kaikalur Mar 13, 2026 •

edited

Loading

kaikalur commented Mar 12, 2026 •

edited

Loading

kaikalur commented Mar 16, 2026 •

edited

Loading

kaikalur commented Mar 17, 2026 •

edited

Loading