feat(optimizer): Simplify COALESCE over equi-join keys based on join type #27250
Merged
kaikalur merged 1 commit into prestodb:master on Mar 6, 2026
Contributor
Reviewer's Guide

Introduces a new iterative optimizer rule that rewrites redundant two-argument COALESCE expressions over equi-join key pairs in ProjectNodes above JoinNodes based on join type. The rule is guarded by a configurable feature flag exposed via config and session properties, and is covered by unit and end-to-end query tests.

Sequence diagram for applying SimplifyCoalesceOverJoinKeys during planning

sequenceDiagram
participant Config as FeaturesConfig
participant SSP as SystemSessionProperties
participant Sess as Session
participant Planner as PlanOptimizers
participant Optimizer as IterativeOptimizer
participant Rule as SimplifyCoalesceOverJoinKeys
Config->>SSP: construct SystemSessionProperties(featuresConfig)
SSP->>Sess: register system properties
Sess->>Planner: create query session
Planner->>Optimizer: build IterativeOptimizer with rule set {SimplifyCoalesceOverJoinKeys, ...}
Optimizer->>Rule: isEnabled(session)
Rule->>SSP: isSimplifyCoalesceOverJoinKeys(session)
SSP-->>Rule: boolean enabled
alt feature flag enabled
Optimizer->>Rule: apply(projectNode, captures, context)
Rule->>Rule: match ProjectNode over JoinNode via PATTERN
Rule->>Rule: inspect JoinNode.getType() and criteria
Rule->>Rule: trySimplifyCoalesce on COALESCE expressions
Rule-->>Optimizer: Result with rewritten ProjectNode
else feature flag disabled
Rule-->>Optimizer: Result.empty
end
Optimizer-->>Planner: optimized plan

Class diagram for SimplifyCoalesceOverJoinKeys rule and related planner wiring

classDiagram
class SimplifyCoalesceOverJoinKeys {
+SimplifyCoalesceOverJoinKeys()
+Pattern getPattern()
+boolean isEnabled(Session session)
+Result apply(ProjectNode project, Captures captures, Context context)
-RowExpression trySimplifyCoalesce(RowExpression expression, JoinType joinType, Set leftVariables, Set rightVariables, Map leftToRight, Map rightToLeft)
-static Capture JOIN
-static Pattern PATTERN
}
class Rule {
<<interface>>
+Pattern getPattern()
+boolean isEnabled(Session session)
+Result apply(Node node, Captures captures, Context context)
}
class ProjectNode {
+Assignments getAssignments()
+PlanNode getSource()
+Map getOutputVariables()
}
class JoinNode {
+JoinType getType()
+List getCriteria()
+PlanNode getLeft()
+PlanNode getRight()
+List getOutputVariables()
}
class EquiJoinClause {
+VariableReferenceExpression getLeft()
+VariableReferenceExpression getRight()
}
class Assignments {
+Map getMap()
+static Builder builder()
}
class SpecialFormExpression {
+Form getForm()
+List getArguments()
enum Form
}
class VariableReferenceExpression {
}
class JoinType {
<<enum>>
INNER
LEFT
RIGHT
FULL
}
class FeaturesConfig {
-boolean simplifyCoalesceOverJoinKeys
+boolean isSimplifyCoalesceOverJoinKeys()
+FeaturesConfig setSimplifyCoalesceOverJoinKeys(boolean simplifyCoalesceOverJoinKeys)
}
class SystemSessionProperties {
+static String SIMPLIFY_COALESCE_OVER_JOIN_KEYS
+static boolean isSimplifyCoalesceOverJoinKeys(Session session)
-SystemSessionProperties(FeaturesConfig featuresConfig)
}
class PlanOptimizers {
-PlanOptimizers(..., RuleStats ruleStats, StatsCalculator statsCalculator, EstimatedExchangesCostCalculator estimatedExchangesCostCalculator, Set rules)
}
class IterativeOptimizer {
-Set rules
}
class Session {
+Object getSystemProperty(String key, Class type)
}
SimplifyCoalesceOverJoinKeys ..|> Rule
SimplifyCoalesceOverJoinKeys --> ProjectNode
SimplifyCoalesceOverJoinKeys --> JoinNode
SimplifyCoalesceOverJoinKeys --> Assignments
SimplifyCoalesceOverJoinKeys --> EquiJoinClause
SimplifyCoalesceOverJoinKeys --> SpecialFormExpression
SimplifyCoalesceOverJoinKeys --> VariableReferenceExpression
SimplifyCoalesceOverJoinKeys --> JoinType
SimplifyCoalesceOverJoinKeys --> Session
JoinNode --> EquiJoinClause
JoinNode --> JoinType
PlanOptimizers --> IterativeOptimizer
IterativeOptimizer --> Rule
IterativeOptimizer --> SimplifyCoalesceOverJoinKeys
SystemSessionProperties --> FeaturesConfig
SystemSessionProperties --> Session
SimplifyCoalesceOverJoinKeys ..> SystemSessionProperties
…type

Add SimplifyCoalesceOverJoinKeys optimizer rule that eliminates redundant COALESCE expressions over equi-join key pairs.

For equi-join condition l.x = r.y, COALESCE(l.x, r.y) can be simplified based on join type:
- LEFT JOIN: always l.x (left key guaranteed non-null)
- RIGHT JOIN: always r.y (right key guaranteed non-null)
- INNER JOIN: first argument (both non-null)
- FULL JOIN: cannot simplify (either side may be null)

This optimization is important for tool-generated queries that produce patterns like SELECT COALESCE(l.x, r.y) FROM l LEFT JOIN r ON l.x = r.y, where the COALESCE prevents bucketed join optimizations.

Fixes: prestodb#26984
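
The per-join-type rules in the commit message above can be sketched as a small decision function. This is a minimal illustration and not the code from the PR: the class name `CoalesceOverJoinKeysSketch`, the use of plain strings for columns, and the `leftToRight` map are all assumptions made for the example.

```java
import java.util.Map;
import java.util.Optional;

// Illustrative only: every name here is hypothetical, not the PR's actual code.
final class CoalesceOverJoinKeysSketch {
    enum JoinType { INNER, LEFT, RIGHT, FULL }

    /**
     * Returns the column that COALESCE(first, second) reduces to, or empty
     * when no rewrite applies. leftToRight maps each left-side join key to
     * the right-side key it is equated with in the join condition.
     */
    static Optional<String> simplify(JoinType type, String first, String second, Map<String, String> leftToRight) {
        String leftKey;
        String rightKey;
        if (second.equals(leftToRight.get(first))) {
            leftKey = first;
            rightKey = second;
        }
        else if (first.equals(leftToRight.get(second))) {
            leftKey = second;
            rightKey = first;
        }
        else {
            return Optional.empty(); // arguments are not an equi-join key pair
        }
        switch (type) {
            case LEFT:
                return Optional.of(leftKey);  // argument order does not matter
            case RIGHT:
                return Optional.of(rightKey); // argument order does not matter
            case INNER:
                return Optional.of(first);    // keep whichever argument came first
            default:
                return Optional.empty();      // FULL: either side may be NULL
        }
    }
}
```

With `leftToRight = {l.x -> r.y}`, calling `simplify(JoinType.LEFT, "r.y", "l.x", leftToRight)` yields `l.x`, while the same call with `JoinType.FULL` yields an empty result.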
7777f53 to 4275c21
Contributor
Author
Friendly ping @jaystarshot @feilong-liu @elharo: CI is all green on this PR. Would appreciate a review when you get a chance. Thanks!
feilong-liu approved these changes on Mar 6, 2026

Comment on lines +63 to +64

```java
    private static final Pattern<ProjectNode> PATTERN = project()
            .with(source().matching(join().capturedAs(JOIN)));
```
Contributor
Nit: I remember we can put the join type check within the Pattern here too?
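
One way to read this suggestion is that the join-type check becomes a predicate evaluated while matching, so the rule is never applied to joins it cannot simplify. The sketch below models only that predicate with toy types. `PatternFilterSketch`, `Join`, and `SIMPLIFIABLE` are hypothetical names, and whether the real pattern would attach such a predicate through a `matching(...)` call on the `join()` pattern is an assumption that this thread does not confirm.

```java
import java.util.function.Predicate;

// Toy model of the reviewer's idea; none of these names come from the PR.
final class PatternFilterSketch {
    enum JoinType { INNER, LEFT, RIGHT, FULL }

    // Hypothetical stand-in for a join plan node; only the join type matters here.
    static final class Join {
        final JoinType type;

        Join(JoinType type) {
            this.type = type;
        }
    }

    // Predicate a pattern could apply so that FULL joins never reach apply().
    static final Predicate<Join> SIMPLIFIABLE = join -> join.type != JoinType.FULL;
}
```

Filtering in the pattern keeps the body of `apply` focused on the rewrite itself, at the cost of splitting the join-type logic between the pattern and the rule.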
garimauttam pushed a commit to garimauttam/presto that referenced this pull request on Mar 9, 2026
…type (prestodb#27250)

## Summary
- Adds `SimplifyCoalesceOverJoinKeys` optimizer rule that eliminates redundant `COALESCE` expressions over equi-join key pairs based on join type
- For equi-join condition `l.x = r.y`:
  - **LEFT JOIN**: `COALESCE(l.x, r.y)` or `COALESCE(r.y, l.x)` → `l.x` (left key guaranteed non-null)
  - **RIGHT JOIN**: `COALESCE(l.x, r.y)` or `COALESCE(r.y, l.x)` → `r.y` (right key guaranteed non-null)
  - **INNER JOIN**: `COALESCE(first, second)` → `first` (both sides non-null, pick first argument)
  - **FULL JOIN**: cannot simplify (either side may be null)
- This is important for tool-generated queries that produce patterns like `SELECT COALESCE(l.x, r.y) FROM l LEFT JOIN r ON l.x = r.y`, where the COALESCE prevents bucketed join optimizations

## Changes
- **`SimplifyCoalesceOverJoinKeys.java`**: New optimizer rule matching `ProjectNode` over `JoinNode`, simplifying COALESCE expressions
- **`FeaturesConfig.java`**: Added `optimizer.simplify-coalesce-over-join-keys` config (default: disabled)
- **`SystemSessionProperties.java`**: Added `simplify_coalesce_over_join_keys` session property
- **`PlanOptimizers.java`**: Registered the rule
- **`TestSimplifyCoalesceOverJoinKeys.java`**: 14 unit tests covering all join types and edge cases
- **`TestFeaturesConfig.java`**: Config validation tests
- **`AbstractTestQueries.java`**: End-to-end tests with SQL queries

Fixes: prestodb#26984

## Test plan
- [x] `TestSimplifyCoalesceOverJoinKeys`: 14 unit tests (all pass)
- [x] `TestFeaturesConfig`: config property tests (passes)
- [x] `TestReorderJoins`: verified no regression
- [x] End-to-end SQL tests in `AbstractTestQueries`: LEFT, RIGHT, INNER, FULL joins with COALESCE

## Contributor checklist
- [x] Please make sure your submission complies with our [contributing guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md), in particular [code style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style) and [commit standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards).
- [x] PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
- [x] Adequate tests were added if applicable.
- [x] CI passed.

## Release Notes
Please follow [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines) and fill in the release notes below.

```
== RELEASE NOTES ==

General Changes
* Add optimizer rule ``SimplifyCoalesceOverJoinKeys`` that simplifies redundant ``COALESCE`` expressions over equi-join key pairs based on join type, enabling bucketed join optimizations for tool-generated queries. Controlled by the ``simplify_coalesce_over_join_keys`` session property (disabled by default).
```
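
The safety of the rewrite can be checked on a toy data set by simulating each join type and comparing `COALESCE(l.x, r.y)` with the proposed replacement row by row. The sketch below is a hypothetical, self-contained model (the class `JoinKeyCoalesceCheck` and its nested-loop join are assumptions for illustration, not Presto code). It includes NULL keys on both sides: an outer join still preserves a row whose key is NULL, but such a row never matches, so both COALESCE arguments are NULL and the rewrite still holds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical model of l JOIN r ON l.x = r.y; null stands for SQL NULL.
final class JoinKeyCoalesceCheck {
    static final class Row {
        final Integer lx;
        final Integer ry;

        Row(Integer lx, Integer ry) {
            this.lx = lx;
            this.ry = ry;
        }
    }

    static Integer coalesce(Integer a, Integer b) {
        return a != null ? a : b;
    }

    // Nested-loop join. keepLeft/keepRight control whether unmatched rows of
    // each side are preserved: (true, false) = LEFT, (false, true) = RIGHT,
    // (false, false) = INNER, (true, true) = FULL.
    static List<Row> join(List<Integer> left, List<Integer> right, boolean keepLeft, boolean keepRight) {
        List<Row> out = new ArrayList<>();
        boolean[] rightMatched = new boolean[right.size()];
        for (Integer l : left) {
            boolean matched = false;
            for (int i = 0; i < right.size(); i++) {
                Integer r = right.get(i);
                // SQL equality: NULL never matches anything, including NULL.
                if (l != null && r != null && l.equals(r)) {
                    out.add(new Row(l, r));
                    matched = true;
                    rightMatched[i] = true;
                }
            }
            if (!matched && keepLeft) {
                out.add(new Row(l, null));
            }
        }
        for (int i = 0; i < right.size(); i++) {
            if (!rightMatched[i] && keepRight) {
                out.add(new Row(null, right.get(i)));
            }
        }
        return out;
    }

    // True when COALESCE(l.x, r.y) equals l.x on every output row.
    static boolean coalesceEqualsLeftKey(List<Row> rows) {
        return rows.stream().allMatch(row -> Objects.equals(coalesce(row.lx, row.ry), row.lx));
    }

    // True when COALESCE(l.x, r.y) equals r.y on every output row.
    static boolean coalesceEqualsRightKey(List<Row> rows) {
        return rows.stream().allMatch(row -> Objects.equals(coalesce(row.lx, row.ry), row.ry));
    }
}
```

For left keys `[1, 2, NULL]` and right keys `[2, 3, NULL]`, the left-key check holds for LEFT and INNER joins, the right-key check holds for RIGHT and INNER joins, and neither holds for a FULL join, which matches the four cases listed in the summary.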