fix(optimizer): Fix infinite loop in UnaliasSymbolReferences.canonicalize() #27428

Merged
feilong-liu merged 1 commit into prestodb:master from kaikalur:fix-unalias-canonicalize-infinite-loop
Mar 25, 2026

Conversation

@kaikalur kaikalur commented Mar 24, 2026

Summary

  • Fix infinite loop in UnaliasSymbolReferences.canonicalize() when the alias mapping contains a cycle
  • Queries with redundant GROUP BY over constant expressions hang indefinitely during planning because this non-iterative optimizer has no timeout protection
  • Add cycle detection using a visited set; when a cycle is found, remove the cyclic mapping entry and break

Fixes #27427

Root Cause

The canonicalize() method follows alias chains via a while loop:

while (mapping.containsKey(canonical)) {
    canonical = mapping.get(canonical);
}

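When multiple variables map to the same constant expression across different ProjectNodes (e.g., ds = '2026-01-01' and report_date = '2026-01-01'), the mapping can end up containing a cycle (ds → report_date → ds). Every name in the cycle is itself a key in the mapping, so the loop condition stays true and the loop never terminates.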
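To make the failure mode concrete, here is a minimal, self-contained sketch of the same loop shape over a plain `Map<String, String>`. The class name `AliasCycleDemo`, the helper `hopsToCanonical`, and the `maxSteps` budget are hypothetical and exist only so the demo terminates; the real method works on `VariableReferenceExpression` and has no such budget.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of Presto.
final class AliasCycleDemo
{
    private AliasCycleDemo() {}

    // Follows the alias chain the same way the original while loop does, but
    // gives up after maxSteps hops so that this demo always terminates.
    // Returns the number of hops taken, or -1 if the budget was exhausted.
    static int hopsToCanonical(Map<String, String> mapping, String start, int maxSteps)
    {
        String canonical = start;
        int hops = 0;
        while (mapping.containsKey(canonical)) {
            if (hops == maxSteps) {
                return -1;
            }
            canonical = mapping.get(canonical);
            hops++;
        }
        return hops;
    }

    public static void main(String[] args)
    {
        Map<String, String> acyclic = new HashMap<>();
        acyclic.put("ds", "report_date");
        System.out.println(hopsToCanonical(acyclic, "ds", 1000)); // prints 1

        Map<String, String> cyclic = new HashMap<>(acyclic);
        cyclic.put("report_date", "ds"); // closes the cycle ds -> report_date -> ds
        System.out.println(hopsToCanonical(cyclic, "ds", 1000)); // prints -1
    }
}
```

Without the step budget, the second call would spin forever, which matches the hang described above.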

Reproduction (TPC-H)

SELECT report_date, total_price, order_cnt
FROM (
    SELECT ds, report_date,
        SUM(total_price) AS total_price,
        SUM(order_cnt) AS order_cnt
    FROM (
        SELECT report_date, '2026-01-01' AS ds,
            SUM(totalprice) AS total_price, COUNT(1) AS order_cnt
        FROM (
            SELECT '2026-01-01' AS report_date, orderstatus,
                SUM(totalprice) AS totalprice
            FROM orders GROUP BY 1, 2
        ) GROUP BY 1, 2
    ) GROUP BY ds, report_date  -- redundant GROUP BY over two constants
)

Test plan

  • Repro query: infinite hang → completes in 3.3s
  • TestUnaliasSymbolReferences: 2/2 pass
  • TestLocalQueries: 523/523 pass, 0 regressions

Release Notes

== RELEASE NOTES ==

General Changes
* Fix infinite loop in ``UnaliasSymbolReferences`` when alias mapping contains a cycle caused by multiple variables mapped to the same constant expression across different ProjectNodes.

…ze() (prestodb#27427)

The canonicalize() method follows alias chains via a while loop but has
no cycle detection. When multiple variables map to the same constant
expression across different ProjectNodes, a cycle can form in the
mapping (e.g., ds → report_date → ds), causing the loop to spin forever.

Since UnaliasSymbolReferences is a non-iterative optimizer with no
timeout protection, this hangs the query indefinitely during planning.

Fix: add a visited set to detect cycles. When a cycle is found, remove
the cyclic mapping entry and break.
@kaikalur kaikalur requested review from a team, feilong-liu and jaystarshot as code owners March 24, 2026 22:21
sourcery-ai bot commented Mar 24, 2026

Reviewer's Guide

Adds cycle detection to UnaliasSymbolReferences.canonicalize() to prevent infinite loops when the alias mapping contains cycles by tracking visited symbols and breaking/removing the cyclic mapping entry.

Class diagram for updated canonicalize logic in UnaliasSymbolReferences

classDiagram
    class UnaliasSymbolReferences {
    }

    class VariableReferenceExpression {
        +getName() String
        +getSourceLocation() SourceLocation
    }

    class SymbolReference {
        <<value object>>
    }

    class SourceLocation {
    }

    class Map_String_String {
        +containsKey(key String) boolean
        +get(key String) String
        +remove(key String) void
    }

    class Map_SymbolReference_Type {
        +get(key SymbolReference) Type
    }

    class Set_String {
        +add(value String) boolean
    }

    class Type {
    }

    UnaliasSymbolReferences o-- Map_String_String : mapping
    UnaliasSymbolReferences o-- Map_SymbolReference_Type : types

    UnaliasSymbolReferences : -mapping Map_String_String
    UnaliasSymbolReferences : -types Map_SymbolReference_Type
    UnaliasSymbolReferences : +canonicalize(variable VariableReferenceExpression) VariableReferenceExpression

    UnaliasSymbolReferences ..> Set_String : uses for visited
    UnaliasSymbolReferences ..> VariableReferenceExpression
    UnaliasSymbolReferences ..> SymbolReference
    UnaliasSymbolReferences ..> Type
    UnaliasSymbolReferences ..> SourceLocation

Flow diagram for cycle detection in canonicalize method

flowchart TD
    A["Start canonicalize(variable)"] --> B["canonical = variable.getName()"]
    B --> C["visited = new HashSet"]
    C --> D["visited.add(canonical)"]
    D --> E{"mapping.containsKey(canonical)?"}
    E -- "No" --> F["Create SymbolReference from canonical and source location"]
    F --> G["Lookup Type from types map"]
    G --> H["Return new VariableReferenceExpression"]
    E -- "Yes" --> I["canonical = mapping.get(canonical)"]
    I --> J{"visited.add(canonical) succeeds?"}
    J -- "Yes (not seen)" --> E
    J -- "No (already seen)" --> K["Cycle detected"]
    K --> L["mapping.remove(canonical)"]
    L --> F

File-Level Changes

Change: Add cycle detection in alias canonicalization to prevent infinite loops when alias mappings contain cycles.
Details:
  • Initialize a visited set with the starting variable name before following alias mappings.
  • During canonicalization, after each alias resolution, attempt to add the new canonical name to the visited set and detect cycles when the add fails.
  • When a cycle is detected, remove the mapping entry for the symbol that would close the loop and break out of the resolution loop, leaving the current canonical name as the final one.
Files: presto-main-base/src/main/java/com/facebook/presto/sql/planner/optimizations/UnaliasSymbolReferences.java
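The change described above can be sketched roughly as follows. This is a simplified illustration based on the flow diagram, not the merged code: it works on plain strings instead of `VariableReferenceExpression`, and the class name `VisitedSetCanonicalize` and the static method signature are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for the patched canonicalize() logic.
final class VisitedSetCanonicalize
{
    private VisitedSetCanonicalize() {}

    static String canonicalize(Map<String, String> mapping, String name)
    {
        String canonical = name;
        Set<String> visited = new HashSet<>();
        visited.add(canonical);
        while (mapping.containsKey(canonical)) {
            canonical = mapping.get(canonical);
            // Set.add returns false when the element is already present,
            // which here means the chain has looped back on itself.
            if (!visited.add(canonical)) {
                mapping.remove(canonical); // drop one entry of the cycle so later calls terminate
                break;
            }
        }
        return canonical;
    }

    public static void main(String[] args)
    {
        Map<String, String> mapping = new HashMap<>();
        mapping.put("ds", "report_date");
        mapping.put("report_date", "ds");
        System.out.println(canonicalize(mapping, "ds"));          // prints ds
        System.out.println(canonicalize(mapping, "report_date")); // prints ds
    }
}
```

Because the cyclic entry is removed from the map, the mapping is acyclic after the first call and later lookups resolve to the same name.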

Assessment against linked issues

Issue Objective Addressed Explanation
#27427 Prevent infinite loops in UnaliasSymbolReferences.canonicalize() by adding cycle detection to the alias mapping traversal.
#27427 Ensure queries with redundant GROUP BY over constant expressions (which previously caused canonicalize() to loop indefinitely) complete successfully instead of hanging during planning.

@sourcery-ai sourcery-ai bot left a comment

Hey - I've left some high level feedback:

  • The new visited set is allocated on every canonicalize call even for variables that don’t follow any alias chain; consider creating it lazily only after the first successful mapping.get(...) to avoid overhead in the common case.
  • When a cycle is detected you mutate mapping by removing canonical, which breaks the cycle but isn’t actually the edge that closed the loop; consider clarifying the comment or restructuring this so the specific back-edge is removed (e.g., by tracking the predecessor) to better match the intention.
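For illustration only, the two suggestions could be combined along these lines. This is a hypothetical sketch of the reviewer's ideas, not code from the PR: the class name `LazyCanonicalize` is invented, names are plain strings, and the mapping is assumed to contain no null values.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical variant illustrating the review suggestions; not the merged code.
final class LazyCanonicalize
{
    private LazyCanonicalize() {}

    static String canonicalize(Map<String, String> mapping, String name)
    {
        String canonical = name;
        Set<String> visited = null; // allocated only once an alias is actually followed
        String next = mapping.get(canonical); // assumes the mapping holds no null values
        while (next != null) {
            if (visited == null) {
                visited = new HashSet<>();
                visited.add(canonical);
            }
            if (!visited.add(next)) {
                // canonical -> next is the back-edge that closes the loop,
                // so remove the entry keyed by the predecessor.
                mapping.remove(canonical);
                break;
            }
            canonical = next;
            next = mapping.get(canonical);
        }
        return canonical;
    }

    public static void main(String[] args)
    {
        Map<String, String> mapping = new HashMap<>();
        mapping.put("ds", "report_date");
        mapping.put("report_date", "ds");
        System.out.println(canonicalize(mapping, "ds")); // prints report_date
        System.out.println(mapping);                     // prints {ds=report_date}
    }
}
```

In this variant the walk from `ds` keeps the edge `ds → report_date` and drops `report_date → ds`, and a variable with no alias never allocates a set. Whether that trade-off is worth the extra bookkeeping is a judgment call for the maintainers.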

@kaikalur kaikalur changed the title from "Fix infinite loop in UnaliasSymbolReferences.canonicalize()" to "fix(optimizer): Fix infinite loop in UnaliasSymbolReferences.canonicalize()" Mar 24, 2026
@feilong-liu feilong-liu merged commit 2c26196 into prestodb:master Mar 25, 2026
83 of 89 checks passed
@ethanyzhang ethanyzhang added the from:Meta PR from Meta label Mar 25, 2026
@prestodb-ci
Saved that user @kaikalur is from Meta


Development

Successfully merging this pull request may close these issues.

Infinite loop in UnaliasSymbolReferences.canonicalize() with redundant GROUP BY over constants

4 participants