Skip to content

feat(analyzer): Add support for SELECT alias references in HAVING clause#27199

Merged
hantangwangd merged 1 commit intoprestodb:masterfrom
mehradpk:analyzer-alias-support
Mar 11, 2026
Merged

feat(analyzer): Add support for SELECT alias references in HAVING clause#27199
hantangwangd merged 1 commit intoprestodb:masterfrom
mehradpk:analyzer-alias-support

Conversation

@mehradpk
Copy link
Copy Markdown
Contributor

@mehradpk mehradpk commented Feb 24, 2026

Description

This change enables the HAVING clause to reference SELECT output aliases.

Previously, HAVING expressions were analyzed without resolving SELECT output aliases, which caused queries referencing projected aliases to fail semantic analysis.

This change rewrites the HAVING predicate by using existing OrderByExpressionRewriter prior to expression analysis, aligning alias resolution semantics with existing ORDER BY behavior.

Example:

Before (this query would fail with a semantic analysis error):

HAVING alias 'total' unresolved

SELECT sum(a) AS total
FROM t1
GROUP BY b
HAVING total > 10;
-- Error: Column 'total' cannot be resolved

After (this query works correctly)

HAVING can reference SELECT alias 'total'

SELECT sum(a) AS total
FROM t1
GROUP BY b
HAVING total > 10;

Motivation and Context

Queries that reference SELECT output aliases in the HAVING clause currently fail during analysis due to unresolved aliases. This behavior is inconsistent with ORDER BY, which supports alias resolution.

This change improves SQL compatibility.

Impact

HAVING clause can now reference SELECT output aliases.

No changes to public APIs.
No expected performance impact.
No changes to planning or execution logic.

Test Plan

Updated test case which now also includes negative case. Local testing performed.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Add support for SELECT alias references in HAVING clause

Summary by Sourcery

Enable semantic analysis of HAVING clauses that reference SELECT output aliases by reusing the existing alias-rewriting logic shared with ORDER BY.

New Features:

  • Allow HAVING clauses to reference SELECT output aliases for improved SQL compatibility.

Enhancements:

  • Generalize the ORDER BY expression rewriter to be parameterized by clause name and reuse it for HAVING alias resolution, including clearer ambiguity error messages.
  • Include rewritten HAVING expressions in grouping and aggregation analysis to keep semantics consistent.

Tests:

  • Expand analyzer tests to cover successful HAVING references to SELECT aliases across grouped and aggregated queries.

@mehradpk mehradpk requested review from a team, feilong-liu and jaystarshot as code owners February 24, 2026 09:58
@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Feb 24, 2026
@prestodb-ci prestodb-ci requested review from a team, bibith4 and infvg and removed request for a team February 24, 2026 09:58
@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai bot commented Feb 24, 2026

Reviewer's Guide

Adds support for resolving SELECT output aliases in HAVING clause by reusing the existing OrderByExpressionRewriter, ensuring HAVING predicates are rewritten and analyzed consistently with ORDER BY, and updates tests accordingly.

Sequence diagram for HAVING clause analysis with SELECT alias resolution

sequenceDiagram
    actor Client
    participant StatementAnalyzer
    participant Analysis
    participant OrderByExpressionRewriter

    Client->>StatementAnalyzer: analyze QuerySpecification
    activate StatementAnalyzer

    Note over StatementAnalyzer: analyzeHaving(node, scope)
    StatementAnalyzer->>StatementAnalyzer: extractNamedOutputExpressions(select)
    StatementAnalyzer->>OrderByExpressionRewriter: new OrderByExpressionRewriter(namedOutputExpressions, HAVING)
    activate OrderByExpressionRewriter
    StatementAnalyzer->>OrderByExpressionRewriter: rewrite(predicate)
    OrderByExpressionRewriter-->>StatementAnalyzer: rewrittenPredicate
    deactivate OrderByExpressionRewriter

    StatementAnalyzer->>StatementAnalyzer: analyzeExpression(rewrittenPredicate, scope)
    StatementAnalyzer->>Analysis: recordSubqueries(node, expressionAnalysis)
    StatementAnalyzer->>StatementAnalyzer: getType(rewrittenPredicate)
    StatementAnalyzer->>Analysis: setHaving(node, rewrittenPredicate)

    Note over StatementAnalyzer: visitQuerySpecification(node, scope)
    StatementAnalyzer->>Analysis: getHaving(node)
    Analysis-->>StatementAnalyzer: havingExpression
    alt havingExpression not null
        StatementAnalyzer->>StatementAnalyzer: add havingExpression to sourceExpressions
    else
        StatementAnalyzer->>StatementAnalyzer: add original node.getHaving() to sourceExpressions
    end

    StatementAnalyzer->>StatementAnalyzer: analyzeGroupingOperations(node, sourceExpressions, orderByExpressions)
    StatementAnalyzer->>StatementAnalyzer: analyzeAggregations(node, sourceExpressions, orderByExpressions)
    deactivate StatementAnalyzer
Loading

Updated class diagram for StatementAnalyzer HAVING alias resolution

classDiagram
    class StatementAnalyzer {
        +visitQuerySpecification(node, scope) Scope
        -analyzeHaving(node, scope) void
    }

    class Analysis {
        +setHaving(node, expression) void
        +getHaving(node) Expression
        +setOrderByExpressions(node, expressions) void
        +setOrderByExpressionsFrom(node, sourceExpressions) void
        +recordSubqueries(node, expressionAnalysis) void
    }

    class OrderByExpressionRewriter {
        -assignments Multimap~QualifiedName,Expression~
        -clauseName String
        +OrderByExpressionRewriter(assignments)
        +OrderByExpressionRewriter(assignments, clauseName)
        +rewriteIdentifier(reference, context, treeRewriter) Expression
    }

    class ExpressionAnalysis {
        +getType(expression) Type
        +getWindowFunctions() Collection~FunctionCall~
    }

    class QuerySpecification {
        +getSelect() Select
        +getHaving() Optional~Expression~
    }

    class Select {
    }

    class Expression {
    }

    class Type {
    }

    class QualifiedName {
    }

    class FunctionCall {
    }

    class Multimap~QualifiedName,Expression~ {
    }

    class Scope {
    }

    StatementAnalyzer --> Analysis : uses
    StatementAnalyzer ..> OrderByExpressionRewriter : creates and uses
    StatementAnalyzer --> ExpressionAnalysis : uses
    StatementAnalyzer --> QuerySpecification : analyzes
    QuerySpecification --> Select : contains
    Analysis --> Expression : stores HAVING
    OrderByExpressionRewriter --> Expression : rewrites
    ExpressionAnalysis --> Type : returns
    ExpressionAnalysis --> FunctionCall : returns
    OrderByExpressionRewriter --> Multimap~QualifiedName,Expression~ : uses
    Multimap~QualifiedName,Expression~ --> QualifiedName : keys
    Multimap~QualifiedName,Expression~ --> Expression : values
Loading

Flow diagram for rewritten HAVING predicate analysis

flowchart TD
    A[Start analyzeHaving for QuerySpecification] --> B{node.hasHaving?}
    B -->|No| Z[Return without HAVING analysis]
    B -->|Yes| C[Read original HAVING predicate]
    C --> D[extractNamedOutputExpressions from SELECT]
    D --> E[Create OrderByExpressionRewriter with clauseName HAVING]
    E --> F[Rewrite HAVING predicate using named output aliases]
    F --> G[Analyze rewrittenPredicate with analyzeExpression]
    G --> H[Record subqueries in Analysis]
    H --> I[Determine type of rewrittenPredicate]
    I --> J{Type is BOOLEAN or UNKNOWN?}
    J -->|No| K[Throw SemanticException for non boolean HAVING]
    J -->|Yes| L[Store rewrittenPredicate in Analysis as HAVING expression]
    L --> M[Return to caller]
    K --> M
    Z --> M
Loading

File-Level Changes

Change Details Files
Rewrite HAVING predicates using the same alias-resolution logic as ORDER BY and ensure analysis uses the rewritten expression.
  • In analyzeHaving, construct namedOutputExpressions from the SELECT list and rewrite the HAVING predicate via OrderByExpressionRewriter before expression analysis.
  • Use the rewritten HAVING predicate for expression analysis, type checking, error reporting, and store it into Analysis via analysis.setHaving.
  • Adjust type retrieval and SemanticException locations to reference the rewritten predicate instead of the original expression.
presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java
Ensure grouping and aggregation analysis sees the rewritten HAVING predicate rather than the original expression when available.
  • In visitQuerySpecification, when building sourceExpressions, prefer the HAVING expression stored in Analysis if present; fall back to the original HAVING expression otherwise.
  • Pass this updated sourceExpressions list into analyzeGroupingOperations and analyzeAggregations so they consider alias-resolved HAVING expressions.
presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java
Generalize OrderByExpressionRewriter to support different clause names for better error messages and reuse in HAVING.
  • Extend OrderByExpressionRewriter with a clauseName field and a constructor overload that defaults to "ORDER BY" for existing usages.
  • Update ambiguous attribute error messages to include the generic clause name (e.g., ORDER BY or HAVING) instead of hard-coded ORDER BY.
  • Use the new constructor when rewriting HAVING and pass the clause name string (note: verify spelling of the clause name constant).
presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java
Update analyzer tests to assert successful analysis of HAVING clauses referencing SELECT aliases.
  • Replace the previous negative test expecting MISSING_ATTRIBUTE for HAVING alias usage with multiple positive analyze() calls that reference SELECT aliases in HAVING.
  • Add coverage for multiple aliases, GROUP BY with aliases, and simple aggregate-alias comparisons in HAVING.
presto-main-base/src/test/java/com/facebook/presto/sql/analyzer/TestAnalyzer.java

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 4 issues, and left some high level feedback:

  • The OrderByExpressionRewriter usage in analyzeHaving passes the clause name as "HAVNG" instead of "HAVING", which will lead to a misspelled clause name in ambiguity error messages and should be corrected.
  • Now that OrderByExpressionRewriter is reused for HAVING predicates, consider renaming it (and its factory usage) to a more generic alias-resolution name to better reflect its broader purpose.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `OrderByExpressionRewriter` usage in `analyzeHaving` passes the clause name as "HAVNG" instead of "HAVING", which will lead to a misspelled clause name in ambiguity error messages and should be corrected.
- Now that `OrderByExpressionRewriter` is reused for HAVING predicates, consider renaming it (and its factory usage) to a more generic alias-resolution name to better reflect its broader purpose.

## Individual Comments

### Comment 1
<location path="presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java" line_range="3823-3825" />
<code_context>
-                ExpressionAnalysis expressionAnalysis = analyzeExpression(predicate, scope);
+                // Reuse OrderByExpressionRewriter to resolve SELECT aliases in HAVING
+                Multimap<QualifiedName, Expression> namedOutputExpressions = extractNamedOutputExpressions(node.getSelect());
+                Expression rewrittenPredicate = ExpressionTreeRewriter.rewriteWith(new OrderByExpressionRewriter(namedOutputExpressions, "HAVNG"), predicate);
+
+                // Analyze the rewritten expression
</code_context>
<issue_to_address>
**issue (typo):** Fix the typo in the clause name string passed to the rewriter ("HAVNG" → "HAVING").

This string is used in the error message `"'%s' in '%s' is ambiguous"`, so keeping it as `"HAVNG"` will produce confusing output like `in 'HAVNG'`. It should be corrected to `"HAVING"` to match the actual clause name.

```suggestion
                // Reuse OrderByExpressionRewriter to resolve SELECT aliases in HAVING
                Multimap<QualifiedName, Expression> namedOutputExpressions = extractNamedOutputExpressions(node.getSelect());
                Expression rewrittenPredicate = ExpressionTreeRewriter.rewriteWith(new OrderByExpressionRewriter(namedOutputExpressions, "HAVING"), predicate);
```
</issue_to_address>

### Comment 2
<location path="presto-main-base/src/test/java/com/facebook/presto/sql/analyzer/TestAnalyzer.java" line_range="305-308" />
<code_context>
     {
-        assertFails(MISSING_ATTRIBUTE, "SELECT sum(a) x FROM t1 HAVING x > 5");
+        // HAVING now support referencing SELECT aliases for improved SQL compatibility
+        analyze("SELECT sum(a) x FROM t1 HAVING x > 5");
+        analyze("SELECT sum(a) AS total FROM t1 GROUP BY b HAVING total > 10");
+        analyze("SELECT count(*) AS cnt, sum(a) AS total FROM t1 GROUP BY b HAVING cnt > 5 AND total > 100");
+        analyze("SELECT sum(a) as sum_a FROM t1 GROUP BY b HAVING sum_a > 1");
     }

</code_context>
<issue_to_address>
**suggestion (testing):** Add negative tests for ambiguous alias references in HAVING

Since HAVING now supports SELECT aliases, please also add negative tests for ambiguous aliases, similar to ORDER BY. For instance:

```java
assertFails(AMBIGUOUS_ATTRIBUTE, "SELECT sum(a) x, sum(b) x FROM t1 HAVING x > 5");
```

This ensures multiple definitions of the same alias in the SELECT list cause the expected ambiguity error when referenced from HAVING, and validates the new alias resolution logic and error message for this case.
</issue_to_address>

### Comment 3
<location path="presto-main-base/src/test/java/com/facebook/presto/sql/analyzer/TestAnalyzer.java" line_range="306-308" />
<code_context>
-        assertFails(MISSING_ATTRIBUTE, "SELECT sum(a) x FROM t1 HAVING x > 5");
+        // HAVING now support referencing SELECT aliases for improved SQL compatibility
+        analyze("SELECT sum(a) x FROM t1 HAVING x > 5");
+        analyze("SELECT sum(a) AS total FROM t1 GROUP BY b HAVING total > 10");
+        analyze("SELECT count(*) AS cnt, sum(a) AS total FROM t1 GROUP BY b HAVING cnt > 5 AND total > 100");
+        analyze("SELECT sum(a) as sum_a FROM t1 GROUP BY b HAVING sum_a > 1");
     }

</code_context>
<issue_to_address>
**suggestion (testing):** Add a negative test for HAVING referencing a non-existent or non-select alias

With the new positive cases, we should also keep a negative one where HAVING references a non-existent, non-resolvable name (not a SELECT alias), e.g.:

```java
assertFails(MISSING_ATTRIBUTE, "SELECT sum(a) AS total FROM t1 HAVING unknown_alias > 5");
```

This ensures unresolved references in HAVING still produce the expected semantic error.
</issue_to_address>

### Comment 4
<location path="presto-main-base/src/test/java/com/facebook/presto/sql/analyzer/TestAnalyzer.java" line_range="307-308" />
<code_context>
+        // HAVING now support referencing SELECT aliases for improved SQL compatibility
+        analyze("SELECT sum(a) x FROM t1 HAVING x > 5");
+        analyze("SELECT sum(a) AS total FROM t1 GROUP BY b HAVING total > 10");
+        analyze("SELECT count(*) AS cnt, sum(a) AS total FROM t1 GROUP BY b HAVING cnt > 5 AND total > 100");
+        analyze("SELECT sum(a) as sum_a FROM t1 GROUP BY b HAVING sum_a > 1");
     }

</code_context>
<issue_to_address>
**suggestion (testing):** Add tests to ensure HAVING cannot indirectly introduce window functions via aliases

Since HAVING now rewrites expressions using SELECT aliases, please add a test that window functions are still rejected when referenced via an alias, e.g.:

```java
assertFails(INVALID_WINDOW_FUNCTION,
        "SELECT row_number() OVER () AS rn FROM t1 GROUP BY b HAVING rn > 1");
```
(or using the same error code used elsewhere in `TestAnalyzer` for window functions in HAVING). This verifies that alias rewriting does not bypass the HAVING window-function restriction.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +305 to +308
analyze("SELECT sum(a) x FROM t1 HAVING x > 5");
analyze("SELECT sum(a) AS total FROM t1 GROUP BY b HAVING total > 10");
analyze("SELECT count(*) AS cnt, sum(a) AS total FROM t1 GROUP BY b HAVING cnt > 5 AND total > 100");
analyze("SELECT sum(a) as sum_a FROM t1 GROUP BY b HAVING sum_a > 1");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add negative tests for ambiguous alias references in HAVING

Since HAVING now supports SELECT aliases, please also add negative tests for ambiguous aliases, similar to ORDER BY. For instance:

assertFails(AMBIGUOUS_ATTRIBUTE, "SELECT sum(a) x, sum(b) x FROM t1 HAVING x > 5");

This ensures multiple definitions of the same alias in the SELECT list cause the expected ambiguity error when referenced from HAVING, and validates the new alias resolution logic and error message for this case.

Comment on lines +306 to +308
analyze("SELECT sum(a) AS total FROM t1 GROUP BY b HAVING total > 10");
analyze("SELECT count(*) AS cnt, sum(a) AS total FROM t1 GROUP BY b HAVING cnt > 5 AND total > 100");
analyze("SELECT sum(a) as sum_a FROM t1 GROUP BY b HAVING sum_a > 1");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add a negative test for HAVING referencing a non-existent or non-select alias

With the new positive cases, we should also keep a negative one where HAVING references a non-existent, non-resolvable name (not a SELECT alias), e.g.:

assertFails(MISSING_ATTRIBUTE, "SELECT sum(a) AS total FROM t1 HAVING unknown_alias > 5");

This ensures unresolved references in HAVING still produce the expected semantic error.

Comment on lines +307 to +308
analyze("SELECT count(*) AS cnt, sum(a) AS total FROM t1 GROUP BY b HAVING cnt > 5 AND total > 100");
analyze("SELECT sum(a) as sum_a FROM t1 GROUP BY b HAVING sum_a > 1");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add tests to ensure HAVING cannot indirectly introduce window functions via aliases

Since HAVING now rewrites expressions using SELECT aliases, please add a test that window functions are still rejected when referenced via an alias, e.g.:

assertFails(INVALID_WINDOW_FUNCTION,
        "SELECT row_number() OVER () AS rn FROM t1 GROUP BY b HAVING rn > 1");

(or using the same error code used elsewhere in TestAnalyzer for window functions in HAVING). This verifies that alias rewriting does not bypass the HAVING window-function restriction.

@mehradpk mehradpk force-pushed the analyzer-alias-support branch 2 times, most recently from 58a8fc0 to 2d4559c Compare March 2, 2026 10:21
@mehradpk mehradpk force-pushed the analyzer-alias-support branch from 2d4559c to 7a83182 Compare March 6, 2026 01:43
node.getHaving().ifPresent(sourceExpressions::add);
// Use the rewritten HAVING expression if present (to resolve SELECT alias references)
if (node.getHaving().isPresent()) {
Expression havingExpression = analysis.getHaving(node);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

analyzeHaving() is already called earlier than this. So when would this below else happen? Isn't that unreachable?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, that else branch is unreachable since analyzeHaving() always calls analysis.setHaving() when a HAVING clause is present. I've simplified the code to remove the unnecessary null check. Thanks for pointing out!

Allow HAVING clause to reference SELECT aliases, implementation reuses existing OrderByExpressionRewriter for consistent alias resolution.
@mehradpk mehradpk force-pushed the analyzer-alias-support branch from 7a83182 to fd34be5 Compare March 9, 2026 05:49
@mehradpk mehradpk requested a review from agrawalreetika March 9, 2026 05:55
Copy link
Copy Markdown
Member

@agrawalreetika agrawalreetika left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Copy Markdown
Member

@hantangwangd hantangwangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mehradpk, lgtm!

@hantangwangd hantangwangd merged commit 626c09d into prestodb:master Mar 11, 2026
113 of 115 checks passed
shelton408 pushed a commit to shelton408/presto that referenced this pull request Mar 19, 2026
Summary:
This reverts the following PR
GitHub Pull Request: prestodb#27199

Differential Revision: D97181706
@shelton408
Copy link
Copy Markdown
Contributor

shelton408 commented Mar 19, 2026

Will be reverting in #27372 , this change has 2 bugs that are breaking existing queries. Testing on the revert has confirmed that this is the cause of 2 errors, see the PR for more details

shelton408 pushed a commit to shelton408/presto that referenced this pull request Mar 19, 2026
Summary:
This reverts the following PR
GitHub Pull Request: prestodb#27199

Differential Revision: D97181706
@shelton408 shelton408 mentioned this pull request Mar 19, 2026
7 tasks
stevechuck pushed a commit that referenced this pull request Mar 19, 2026
feilong-liu pushed a commit that referenced this pull request Mar 19, 2026
## Summary:
This reverts the following PR
GitHub Pull Request: #27199
GitHub Author: Deepak Mehra mehradeeeepak@gmail.com
Meta internal revision:D96106074

Fully reverts the HAVING alias expansion feature, which threw
7 failed queries (2 different errors) on meta internal test suites. I
have verified that these errors are all caused by this change by
attempting to fix them and testing that the revert clears the errors.

## Errors:
1. java.sql.SQLException: Query failed (#20260316_083518_04833_pwriw):
type of variable 'expr_189' is expected 2. to be varchar(25), but the
actual type is varchar
java.sql.SQLException: Query failed (#20260316_083713_05663_pwriw):
Cannot nest aggregations inside aggregation 'sum':
["sum"(CTX_pe_revenue)]

## Root cause:
The PR added SELECT alias resolution in HAVING by rewriting
the HAVING predicate with OrderByExpressionRewriter. This caused two
bugs:
1. Type mismatches (5 failures): (AI reasoning) When a column name
matches a SELECT alias
(e.g., SELECT CASE...END AS category), the rewriter expands the column
reference in HAVING to the full CASE expression, changing its type from
varchar(25) to unbounded varchar. QueryPlanner.filter() uses the
rewritten expression via analysis.getHaving(), creating plan nodes with
mismatched types that TypeValidator rejects.
2. Nested aggregation (1 failure): (confirmed) When a SELECT alias
references an
aggregation and HAVING uses that alias inside another aggregation
(e.g., HAVING sum(total) where total = sum(x)), expansion creates
invalid nested aggregations sum(sum(x))
3. bigint vs double (1 failure): Same mechanism as issue 1 but with
numeric
type coercion differences.

## Fix:
I tried to fix these issues, but only managed to clear the error for #2.
Reverting the change due to these errors.
According to AI:
The HAVING alias resolution feature needs a different implementation
that
doesn't change expression types in the plan — likely by resolving
aliases
during planning rather than during analysis.

## Reproduction:
I can't share the reproducing queries since they reference meta internal
data, but these should give you an idea of how to reproduce the issues

Bug 1: Type mismatch (varchar(25) vs varchar)

```
  -- The SELECT aliases `category` to a CASE expression returning varchar strings.
  -- HAVING references `category` which the rewriter expands to the CASE expression,
  -- changing its type from the subquery's varchar(25) to unbounded varchar.
  -- Triggers: "type of variable 'expr_N' is expected to be varchar(25), but the actual type is varchar"

  WITH data AS (
      SELECT
          CAST('cpu' AS varchar(25)) AS category,
          true AS is_cpu,
          100.0 AS value
      UNION ALL
      SELECT CAST('gpu' AS varchar(25)), false, 200.0
  )
  SELECT
      (CASE
          WHEN COALESCE(category, CAST(is_cpu AS varchar)) IS NULL THEN 'total'
          WHEN is_cpu THEN 'cpu_total'
          ELSE category
      END) category,
      sum(value) AS total_value
  FROM data
  GROUP BY GROUPING SETS ((), (category), (is_cpu))
  HAVING (CASE
      WHEN COALESCE(category, CAST(is_cpu AS varchar)) IS NULL THEN 'total'
      WHEN is_cpu THEN 'cpu_total'
      ELSE category
  END) IS NOT NULL
```

  Bug 2: Nested aggregation (SYNTAX_ERROR)

```
  -- SELECT defines `total` as sum(revenue). HAVING uses sum(total).
  -- The rewriter expands `total` to `sum(revenue)`, creating sum(sum(revenue)).
  -- Triggers: "Cannot nest aggregations inside aggregation 'sum': ["sum"(revenue)]"

  SELECT
      region,
      sum(revenue) AS total
  FROM (
      VALUES ('us', 100), ('eu', 200), ('us', 150), ('eu', 50)
  ) AS t(region, revenue)
  GROUP BY region
  HAVING sum(total) > 100
```

Differential Revision: D97181706

## Summary by Sourcery

Revert support for resolving SELECT output aliases in HAVING clauses and
restore previous validation behavior.

Enhancements:
- Simplify OrderByExpressionRewriter by removing clause-specific error
context now that it is only used for ORDER BY.

Tests:
- Update analyzer tests to again treat alias references in HAVING as
invalid and remove related positive and error-coverage cases.


## Release Notes
Please follow [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines)
and fill in the release notes below.

```
== RELEASE NOTES ==

General Changes
* Fix 2 bugs caused by Select Alias references in Having clause

Co-authored-by: mehradeeeepak@gmail.com <mehradeeeepak@gmail.com>
shelton408 added a commit that referenced this pull request Mar 20, 2026
## Summary:
This reverts the following PR
GitHub Pull Request: #27199
GitHub Author: Deepak Mehra mehradeeeepak@gmail.com
Meta internal revision:D96106074

Fully reverts the HAVING alias expansion feature, which threw
7 failed queries (2 different errors) on meta internal test suites. I
have verified that these errors are all caused by this change by
attempting to fix them and testing that the revert clears the errors.

## Errors:
1. java.sql.SQLException: Query failed (#20260316_083518_04833_pwriw):
type of variable 'expr_189' is expected 2. to be varchar(25), but the
actual type is varchar
java.sql.SQLException: Query failed (#20260316_083713_05663_pwriw):
Cannot nest aggregations inside aggregation 'sum':
["sum"(CTX_pe_revenue)]

## Root cause:
The PR added SELECT alias resolution in HAVING by rewriting
the HAVING predicate with OrderByExpressionRewriter. This caused two
bugs:
1. Type mismatches (5 failures): (AI reasoning) When a column name
matches a SELECT alias
(e.g., SELECT CASE...END AS category), the rewriter expands the column
reference in HAVING to the full CASE expression, changing its type from
varchar(25) to unbounded varchar. QueryPlanner.filter() uses the
rewritten expression via analysis.getHaving(), creating plan nodes with
mismatched types that TypeValidator rejects.
2. Nested aggregation (1 failure): (confirmed) When a SELECT alias
references an
aggregation and HAVING uses that alias inside another aggregation
(e.g., HAVING sum(total) where total = sum(x)), expansion creates
invalid nested aggregations sum(sum(x))
3. bigint vs double (1 failure): Same mechanism as issue 1 but with
numeric
type coercion differences.

## Fix:
I tried to fix these issues, but only managed to clear the error for #2.
Reverting the change due to these errors.
According to AI:
The HAVING alias resolution feature needs a different implementation
that
doesn't change expression types in the plan — likely by resolving
aliases
during planning rather than during analysis.

## Reproduction:
I can't share the reproducing queries since they reference meta internal
data, but these should give you an idea of how to reproduce the issues

Bug 1: Type mismatch (varchar(25) vs varchar)

```
  -- The SELECT aliases `category` to a CASE expression returning varchar strings.
  -- HAVING references `category` which the rewriter expands to the CASE expression,
  -- changing its type from the subquery's varchar(25) to unbounded varchar.
  -- Triggers: "type of variable 'expr_N' is expected to be varchar(25), but the actual type is varchar"

  WITH data AS (
      SELECT
          CAST('cpu' AS varchar(25)) AS category,
          true AS is_cpu,
          100.0 AS value
      UNION ALL
      SELECT CAST('gpu' AS varchar(25)), false, 200.0
  )
  SELECT
      (CASE
          WHEN COALESCE(category, CAST(is_cpu AS varchar)) IS NULL THEN 'total'
          WHEN is_cpu THEN 'cpu_total'
          ELSE category
      END) category,
      sum(value) AS total_value
  FROM data
  GROUP BY GROUPING SETS ((), (category), (is_cpu))
  HAVING (CASE
      WHEN COALESCE(category, CAST(is_cpu AS varchar)) IS NULL THEN 'total'
      WHEN is_cpu THEN 'cpu_total'
      ELSE category
  END) IS NOT NULL
```

  Bug 2: Nested aggregation (SYNTAX_ERROR)

```
  -- SELECT defines `total` as sum(revenue). HAVING uses sum(total).
  -- The rewriter expands `total` to `sum(revenue)`, creating sum(sum(revenue)).
  -- Triggers: "Cannot nest aggregations inside aggregation 'sum': ["sum"(revenue)]"

  SELECT
      region,
      sum(revenue) AS total
  FROM (
      VALUES ('us', 100), ('eu', 200), ('us', 150), ('eu', 50)
  ) AS t(region, revenue)
  GROUP BY region
  HAVING sum(total) > 100
```

Differential Revision: D97181706

## Summary by Sourcery

Revert support for resolving SELECT output aliases in HAVING clauses and
restore previous validation behavior.

Enhancements:
- Simplify OrderByExpressionRewriter by removing clause-specific error
context now that it is only used for ORDER BY.

Tests:
- Update analyzer tests to again treat alias references in HAVING as
invalid and remove related positive and error-coverage cases.


## Release Notes
Please follow [release notes
guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines)
and fill in the release notes below.

```
== RELEASE NOTES ==

General Changes
* Fix 2 bugs caused by Select Alias references in Having clause

Co-authored-by: mehradeeeepak@gmail.com <mehradeeeepak@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:IBM PR from IBM

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants