
fix(optimizer): Fix PlanRemoteProjections to keep JsonPath arguments inline for local functions #27323

Merged
feilong-liu merged 1 commit into prestodb:master from feilong-liu:export-D96397113 on Mar 13, 2026

Conversation

@feilong-liu
Contributor

@feilong-liu feilong-liu commented Mar 13, 2026

Description

JsonPath type is not serializable (its createBlockBuilder throws). When
PlanRemoteProjections processes a local function that has a JsonPath-typed
argument (e.g., json_extract_scalar(remote_func(), '$.key')), it previously
extracted the JsonPath argument into a separate ProjectNode output variable.
This caused a runtime failure when the engine tried to materialize the
JsonPath-typed block.

The fix keeps JsonPath-typed arguments inline within the local function call
when argumentProjection is empty (i.e., the argument is a simple leaf
expression that doesn't need further decomposition). This mirrors the existing
behavior for ConstantExpression arguments.

Motivation and Context

Remote function calls involving json_extract_scalar with JsonPath arguments
would fail at runtime because JsonPath type cannot be serialized into a block.
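
To make the failure mode concrete, here is a minimal, self-contained sketch. `SketchType`, `VARCHAR`, `JSON_PATH`, and `canMaterialize` are hypothetical stand-ins written for illustration, not Presto's actual `Type` or `JsonPathType` classes. The only behavior taken from the description is that the JsonPath type's `createBlockBuilder` throws, so any plan that needs a block of that type (for example, a ProjectNode output) cannot run.

```java
import java.util.ArrayList;
import java.util.List;

public class JsonPathMaterializationSketch
{
    // Hypothetical stand-in for an engine type; not Presto's real Type interface.
    interface SketchType
    {
        String name();

        // Returns a buffer that can hold materialized values of this type.
        List<Object> createBlockBuilder();
    }

    // A serializable type: creating a block builder succeeds.
    static final SketchType VARCHAR = new SketchType()
    {
        public String name()
        {
            return "varchar";
        }

        public List<Object> createBlockBuilder()
        {
            return new ArrayList<>();
        }
    };

    // Assumed model of a non-serializable type: creating a block builder throws.
    static final SketchType JSON_PATH = new SketchType()
    {
        public String name()
        {
            return "JsonPath";
        }

        public List<Object> createBlockBuilder()
        {
            throw new UnsupportedOperationException("JsonPath type cannot be serialized");
        }
    };

    // Models a projection trying to materialize one output column of the given type.
    static boolean canMaterialize(SketchType type)
    {
        try {
            type.createBlockBuilder();
            return true;
        }
        catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args)
    {
        System.out.println(VARCHAR.name() + " materializable: " + canMaterialize(VARCHAR));
        System.out.println(JSON_PATH.name() + " materializable: " + canMaterialize(JSON_PATH));
    }
}
```

In this model, extracting `'$.key'` into its own output variable corresponds to calling `createBlockBuilder` on the JsonPath type, which is why keeping the argument inside the function call sidesteps the problem.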

Impact

No public API changes. Fixes runtime failures for queries using
json_extract_scalar (or similar JsonPath-consuming functions) on results of
remote functions.

Test Plan

Added testJsonPathArgumentKeptInline in TestPlanRemoteProjections that
verifies json_extract_scalar(CAST(remote_foo() AS VARCHAR), '$.key') is
rewritten without extracting any JsonPath-typed variable.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

== NO RELEASE NOTE == 

Summary by Sourcery

Adjust PlanRemoteProjections handling of JsonPath-typed arguments for local functions to avoid creating non-serializable project outputs and add coverage for this behavior.

Bug Fixes:

  • Prevent JsonPath-typed arguments to local functions from being extracted into separate ProjectNode outputs, avoiding runtime failures when materializing JsonPath blocks.

Tests:

  • Add a planner rule test ensuring JsonPath-typed arguments to local functions are kept inline and never projected as JsonPath-typed variables.

…nline for local functions

Summary:
JsonPath type is not serializable (its `createBlockBuilder` throws). When
`PlanRemoteProjections` processes a local function that has a JsonPath-typed
argument (e.g., `json_extract_scalar(remote_func(), '$.key')`), it previously
extracted the JsonPath argument into a separate ProjectNode output variable.
This caused a runtime failure when the engine tried to materialize the
JsonPath-typed block.

The fix keeps JsonPath-typed arguments inline within the local function call
when `argumentProjection` is empty (i.e., the argument is a simple leaf
expression that doesn't need further decomposition). This mirrors the existing
behavior for ConstantExpression arguments.

Differential Revision: D96397113
@feilong-liu feilong-liu requested review from a team and jaystarshot as code owners March 13, 2026 00:06
@prestodb-ci prestodb-ci added the from:Meta PR from Meta label Mar 13, 2026
@sourcery-ai
Contributor

sourcery-ai bot commented Mar 13, 2026

Reviewer's Guide

Adjusts PlanRemoteProjections so that JsonPath-typed arguments to local functions are kept inline rather than being extracted into separate projection variables, and adds a regression test to verify JsonPath arguments are not materialized as ProjectNode outputs.

Updated class diagram for PlanRemoteProjections argument handling

classDiagram
  class PlanRemoteProjections {
    +List~ProjectionContext~ processArguments(List~RowExpression~ arguments, boolean local)
  }

  class RowExpression {
    <<interface>>
    +Type getType()
    +List~ProjectionContext~ accept(RowExpressionVisitor visitor, Object context)
  }

  class ProjectionContext {
    +Map~VariableReferenceExpression, RowExpression~ projections
    +boolean remote
    +ProjectionContext(Map~VariableReferenceExpression, RowExpression~ projections, boolean remote)
  }

  class VariableReferenceExpression {
    +Type getType()
  }

  class Type {
  }

  class JsonPathType {
  }

  class VariableAllocator {
    +VariableReferenceExpression newVariable(RowExpression expression)
  }

  PlanRemoteProjections --> VariableAllocator : uses
  PlanRemoteProjections --> ProjectionContext : creates
  PlanRemoteProjections --> RowExpression : processes
  RowExpression --> ProjectionContext : returns from accept
  VariableReferenceExpression --> Type : has
  RowExpression --> Type : has
  JsonPathType --|> Type

  %% Key logic in PlanRemoteProjections.processArguments
  class ProcessArgumentsLogic {
    +handleArgument(RowExpression argument, boolean local)
  }

  PlanRemoteProjections ..> ProcessArgumentsLogic : contains logic

  %% Relevant conditional branches
  class ProcessArgumentsLogic {
    +if argumentProjection not empty: useProjection()
    +if argumentProjection empty and local and type == JSON_PATH: keepInline()
    +if argumentProjection empty and not JSON_PATH: createVariableProjection()
  }

Flow diagram for JsonPath handling in PlanRemoteProjections.processArguments

flowchart TD
  A["Start processing function arguments"] --> B["Take next argument"]
  B --> C{More arguments?}
  C -- No --> Z["Done"]
  C -- Yes --> D["Call argument.accept(visitor, null)"]
  D --> E{argumentProjection is empty?}
  E -- No --> F["Use existing argumentProjection
(newArguments add variable from projection)"]
  F --> B
  E -- Yes --> G{local function call?}
  G -- No --> H["Allocate new VariableReferenceExpression
from variableAllocator"]
  H --> I["Wrap in ProjectionContext
and add to projections"]
  I --> J["newArguments add variable"]
  J --> B
  G -- Yes --> K{argument type is JSON_PATH?}
  K -- No --> H
  K -- Yes --> L["Keep argument inline:
newArguments add argument directly"]
  L --> B
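
The branching in the flow diagram above can be sketched as plain Java. This is a simplified model under stated assumptions, not the code in `PlanRemoteProjections`: `Argument`, `Result`, the string-based type names, and the `expr_N` variable naming are all hypothetical, and the sketch follows only the branches shown in the diagram (it omits any other existing handling, such as the ConstantExpression case mentioned in the description).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProcessArgumentsSketch
{
    // Hypothetical, simplified model of one function argument.
    static final class Argument
    {
        final String expression;        // e.g. "'$.key'"
        final String type;              // e.g. "JsonPath" or "varchar"
        final String projectedVariable; // non-null when visiting the argument already produced a projection

        Argument(String expression, String type, String projectedVariable)
        {
            this.expression = expression;
            this.type = type;
            this.projectedVariable = projectedVariable;
        }
    }

    // Collects the rewritten argument list and any newly extracted projections.
    static final class Result
    {
        final List<String> newArguments = new ArrayList<>();
        final List<String> extracted = new ArrayList<>();
    }

    static Result processArguments(List<Argument> arguments, boolean local)
    {
        Result result = new Result();
        int counter = 0;
        for (Argument argument : arguments) {
            if (argument.projectedVariable != null) {
                // argumentProjection is not empty: reuse the variable it produced.
                result.newArguments.add(argument.projectedVariable);
            }
            else if (local && argument.type.equals("JsonPath")) {
                // Local call with a JsonPath-typed leaf: keep the argument inline.
                result.newArguments.add(argument.expression);
            }
            else {
                // Otherwise allocate a new variable and project the argument into it.
                String variable = "expr_" + counter++;
                result.extracted.add(variable + " := " + argument.expression);
                result.newArguments.add(variable);
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        Result result = processArguments(Arrays.asList(
                new Argument("CAST(remote_foo() AS VARCHAR)", "varchar", "cast_0"),
                new Argument("'$.key'", "JsonPath", null)), true);
        System.out.println("arguments: " + result.newArguments);
        System.out.println("extracted: " + result.extracted);
    }
}
```

With a local call, the JsonPath literal stays in the argument list and nothing of JsonPath type is added to the extracted projections, which is the property the regression test checks.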

File-Level Changes

Change: Prevent extraction of JsonPath-typed arguments for local functions when argument projections are empty, keeping them inline in the function call instead.
Details:
  • In processArguments, when argumentProjection is empty, detect local JsonPath-typed arguments and append the original argument directly to newArguments instead of allocating a new variable and projection
  • Preserve existing behavior for other argument types by only short-circuiting for local JsonPath arguments
Files: presto-main-base/src/main/java/com/facebook/presto/sql/planner/iterative/rule/PlanRemoteProjections.java

Change: Add a regression test ensuring JsonPath-typed arguments are not extracted into ProjectNode outputs during remote projection planning.
Details:
  • Import JsonPathType.JSON_PATH and Assert.assertFalse in the test class
  • Add testJsonPathArgumentKeptInline to build a plan with json_extract_scalar over a remote function and assert that none of the projection variables have JSON_PATH type
Files: presto-main-base/src/test/java/com/facebook/presto/sql/planner/iterative/rule/TestPlanRemoteProjections.java

@feilong-liu feilong-liu changed the title from "[presto-trunk] Fix PlanRemoteProjections to keep JsonPath arguments inline for local functions" to "fix(Optimizer): Fix PlanRemoteProjections to keep JsonPath arguments inline for local functions" Mar 13, 2026
@feilong-liu feilong-liu changed the title from "fix(Optimizer): Fix PlanRemoteProjections to keep JsonPath arguments inline for local functions" to "fix(optimizer): Fix PlanRemoteProjections to keep JsonPath arguments inline for local functions" Mar 13, 2026
@feilong-liu feilong-liu requested a review from kaikalur March 13, 2026 00:06
Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 1 issue, and left some high level feedback:

  • The special-case check for local && argument.getType().equals(JSON_PATH) hardcodes a single non-serializable type; consider centralizing this as a predicate (e.g., isInlineOnlyType(Type)) so other non-serializable types can reuse the behavior without further branching here.
  • In testJsonPathArgumentKeptInline, you could tighten the assertion by deriving the JSON path variables from the planned ProjectNode outputs (rather than all projection variables) to ensure the test is specifically validating that no JSON path is ever materialized as a project output.
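
As a rough illustration of the first suggestion, a centralized predicate might look like the sketch below. The class name `InlineOnlyTypes`, the `shouldKeepInline` helper, and the use of type-name strings are assumptions made for this example; real code would more likely compare `Type` instances, and only the `isInlineOnlyType` name comes from the review comment.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class InlineOnlyTypes
{
    // Hypothetical registry of types that must never become projection outputs.
    // Further non-serializable types could be added here without touching callers.
    private static final Set<String> INLINE_ONLY_TYPE_NAMES =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("JsonPath")));

    private InlineOnlyTypes() {}

    // True when values of this type cannot be materialized into a block.
    static boolean isInlineOnlyType(String typeName)
    {
        return INLINE_ONLY_TYPE_NAMES.contains(typeName);
    }

    // Mirrors the check under review: only local calls keep such arguments inline.
    static boolean shouldKeepInline(boolean local, String typeName)
    {
        return local && isInlineOnlyType(typeName);
    }

    public static void main(String[] args)
    {
        System.out.println(shouldKeepInline(true, "JsonPath"));
        System.out.println(shouldKeepInline(true, "varchar"));
    }
}
```

The call site would then read as a single predicate call rather than a type comparison, so supporting another non-serializable type becomes a one-line change to the registry.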
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The special-case check for `local && argument.getType().equals(JSON_PATH)` hardcodes a single non-serializable type; consider centralizing this as a predicate (e.g., `isInlineOnlyType(Type)`) so other non-serializable types can reuse the behavior without further branching here.
- In `testJsonPathArgumentKeptInline`, you could tighten the assertion by deriving the JSON path variables from the planned `ProjectNode` outputs (rather than all projection variables) to ensure the test is specifically validating that no JSON path is ever materialized as a project output.

## Individual Comments

### Comment 1
<location path="presto-main-base/src/test/java/com/facebook/presto/sql/planner/iterative/rule/TestPlanRemoteProjections.java" line_range="193-202" />
<code_context>
         assertEquals(rewritten.get(3).getProjections().size(), 5);
     }

+    @Test
+    void testJsonPathArgumentKeptInline()
+    {
+        PlanBuilder planBuilder = new PlanBuilder(TEST_SESSION, new PlanNodeIdAllocator(), getMetadata());
+
+        PlanRemoteProjections rule = new PlanRemoteProjections(getFunctionAndTypeManager());
+        List<ProjectionContext> rewritten = rule.planRemoteAssignments(Assignments.builder()
+                .put(planBuilder.variable("a"), planBuilder.rowExpression("json_extract_scalar(CAST(unittest.memory.remote_foo() AS VARCHAR), '$.key')"))
+                .build(), new VariableAllocator(planBuilder.getTypes().allVariables()));
+        assertEquals(rewritten.size(), 2);
+        // Verify no projection context extracts a JsonPath-typed variable
+        for (ProjectionContext context : rewritten) {
</code_context>
<issue_to_address>
**suggestion (testing):** Consider asserting the structure of the rewritten projections to prove the JsonPath argument is actually kept inline

The current checks (no `JsonPath`-typed variables and exactly two contexts) are good for guarding against the regression. To more directly validate the behavior, consider also asserting that the rewritten projection still contains the `json_extract_scalar(CAST(unittest.memory.remote_foo() AS VARCHAR), '$.key')` expression inline, e.g. via `PlanMatchPattern` or by inspecting `ProjectionContext.getProjections().values()`. That would detect cases where the rule drops or unexpectedly restructures this expression while still avoiding a JsonPath-typed variable projection.

Suggested implementation:

```java
import static com.facebook.presto.sql.planner.iterative.rule.PlanRemoteProjections.ProjectionContext;
import static com.facebook.presto.type.JsonPathType.JSON_PATH;
import static org.testng.Assert.assertEquals;
import static org.testng.Assert.assertFalse;
import static org.testng.Assert.assertTrue;

```

```java
    @Test
    void testJsonPathArgumentKeptInline()
    {
        PlanBuilder planBuilder = new PlanBuilder(TEST_SESSION, new PlanNodeIdAllocator(), getMetadata());

        PlanRemoteProjections rule = new PlanRemoteProjections(getFunctionAndTypeManager());
        List<ProjectionContext> rewritten = rule.planRemoteAssignments(Assignments.builder()
                .put(planBuilder.variable("a"), planBuilder.rowExpression("json_extract_scalar(CAST(unittest.memory.remote_foo() AS VARCHAR), '$.key')"))
                .build(), new VariableAllocator(planBuilder.getTypes().allVariables()));
        assertEquals(rewritten.size(), 2);

        // Verify no projection context extracts a JsonPath-typed variable
        for (ProjectionContext context : rewritten) {
            context.getProjections().keySet().forEach(variable ->
                    assertFalse(variable.getType().equals(JSON_PATH),
                            "JsonPath type should not be extracted as a ProjectNode output"));
        }

        // Verify the json_extract_scalar(...) expression with the JsonPath argument is still kept inline
        boolean foundInlineJsonExtract = false;
        for (ProjectionContext context : rewritten) {
            for (RowExpression projection : context.getProjections().values()) {
                String projectionString = projection.toString();
                if (projectionString.contains("json_extract_scalar")
                        && projectionString.contains("unittest.memory.remote_foo")
                        && projectionString.contains("'$.key'")) {
                    foundInlineJsonExtract = true;
                    break;
                }
            }
            if (foundInlineJsonExtract) {
                break;
            }
        }
        assertTrue(foundInlineJsonExtract,
                "Expected json_extract_scalar(CAST(unittest.memory.remote_foo() AS VARCHAR), '$.key') to be kept inline in rewritten projections");
    }

```

1. Ensure `RowExpression` is imported if not already present in the file:
   `import com.facebook.presto.spi.relation.RowExpression;`
2. If the existing tests use a different way to render or inspect `RowExpression` (e.g. a utility for formatting row expressions), you may want to replace the `toString()`-based checks with that preferred mechanism to make the assertion more robust.
</issue_to_address>



Member

@skyelves skyelves left a comment


LGTM

@feilong-liu
Contributor Author

@arhimondr @jaystarshot @NikhilCollooru Can I get a committer approval for it? Thanks!

@feilong-liu feilong-liu merged commit 980f1c8 into prestodb:master Mar 13, 2026
83 of 93 checks passed
@feilong-liu feilong-liu deleted the export-D96397113 branch March 13, 2026 18:38