Skip to content

Conversation

@pdabre12
Copy link
Contributor

@pdabre12 pdabre12 commented Feb 20, 2025

Description

Add a ExpressionOptimizer which delegates to the native sidecar process to evaluate expressions with Velox.

Motivation and Context

#26475 added support for an endpoint in the sidecar for constant folding expressions. This follows up on that by adding an expression interpreter to call that endpoint.

For more context: RFC-0006.

Impact

No impact by default as the old in-memory evaluation is the default.

Test Plan

Tests have been added.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

Prestissimo (Native Execution) Changes
* Add a native expression optimizer for optimizing expressions in the sidecar.

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Feb 20, 2025
@pdabre12 pdabre12 force-pushed the expression-optimizer branch 3 times, most recently from baa80c5 to 5bf64bf Compare March 3, 2025 21:27
@pdabre12 pdabre12 force-pushed the expression-optimizer branch from 5bf64bf to 02e81b9 Compare March 3, 2025 22:56
@pdabre12 pdabre12 force-pushed the expression-optimizer branch 2 times, most recently from 5c75b74 to fc9942d Compare April 9, 2025 19:02
@pdabre12 pdabre12 force-pushed the expression-optimizer branch 5 times, most recently from ccbf3ac to e7e90f6 Compare April 21, 2025 22:59
@pdabre12 pdabre12 force-pushed the expression-optimizer branch 2 times, most recently from dd590aa to 241f562 Compare December 4, 2025 21:21
@pdabre12 pdabre12 changed the title [WIP] Add native row expression optimizer feat(plugin-native-sidecar): Add native row expression optimizer Dec 4, 2025
@pdabre12 pdabre12 force-pushed the expression-optimizer branch 2 times, most recently from 9b1e4f6 to 47fe096 Compare December 8, 2025 19:37
@pdabre12 pdabre12 marked this pull request as ready for review December 8, 2025 19:48
@prestodb-ci prestodb-ci requested review from a team, Dilli-Babu-Godari and ScrapCodes and removed request for a team December 8, 2025 19:48
@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Dec 8, 2025

Reviewer's Guide

Adds a native expression optimizer that delegates row expression optimization to the native sidecar via a new ExpressionOptimizer implementation, wires it into the expression manager and plugin infrastructure, and refactors tests and existing optimizers/serde to use the new path and shared RowExpression serialization.

Sequence diagram for native expression optimization via sidecar

sequenceDiagram
    participant Coordinator as Coordinator
    participant ExpressionOptimizerManager as ExpressionOptimizerManager
    participant NativeExpressionOptimizer as NativeExpressionOptimizer
    participant NativeSidecarExpressionInterpreter as NativeSidecarExpressionInterpreter
    participant Sidecar as NativeSidecar

    Coordinator->>ExpressionOptimizerManager: optimize(rowExpression, level, session)
    ExpressionOptimizerManager->>NativeExpressionOptimizer: ExpressionOptimizer.optimize(expression, level, session, variableResolver)

    NativeExpressionOptimizer->>NativeExpressionOptimizer: CollectingVisitor.visitExpression
    NativeExpressionOptimizer->>NativeExpressionOptimizer: Collect expressionsToOptimize
    NativeExpressionOptimizer->>NativeExpressionOptimizer: ReplacingVisitor(variableResolver)
    NativeExpressionOptimizer->>NativeSidecarExpressionInterpreter: optimizeBatch(session, expressions, level)

    NativeSidecarExpressionInterpreter->>NodeManager: getSidecarNode()
    NodeManager-->>NativeSidecarExpressionInterpreter: sidecarNode
    NativeSidecarExpressionInterpreter->>NativeSidecarExpressionInterpreter: build Request
    NativeSidecarExpressionInterpreter->>Sidecar: HTTP POST /v1/expressions
    Sidecar-->>NativeSidecarExpressionInterpreter: List<RowExpressionOptimizationResult>

    NativeSidecarExpressionInterpreter-->>NativeExpressionOptimizer: Map<RowExpression,RowExpression> replacements
    NativeExpressionOptimizer->>NativeExpressionOptimizer: ReplacingVisitor(replacements)
    NativeExpressionOptimizer-->>ExpressionOptimizerManager: optimized RowExpression
    ExpressionOptimizerManager-->>Coordinator: optimized RowExpression
Loading

Class diagram for native expression optimizer and context wiring

classDiagram
    class ExpressionOptimizerManager {
        <<service>>
        -NodeManager nodeManager
        -FunctionAndTypeManager functionAndTypeManager
        -RowExpressionSerde rowExpressionSerde
        -FunctionResolution functionResolution
        -File configurationDirectory
        -Map~String,ExpressionOptimizerFactory~ expressionOptimizerFactories
        -Map~String,ExpressionOptimizer~ expressionOptimizers
        +ExpressionOptimizerManager(PluginNodeManager nodeManager, FunctionAndTypeManager functionAndTypeManager, RowExpressionSerde rowExpressionSerde)
        +ExpressionOptimizerManager(PluginNodeManager nodeManager, FunctionAndTypeManager functionAndTypeManager, RowExpressionSerde rowExpressionSerde, File configurationDirectory)
        +void loadExpressionOptimizerFactories()
        +void loadExpressionOptimizerFactory(File configurationFile)
        +void loadExpressionOptimizerFactory(String factoryName, String optimizerName, Map~String,String~ properties)
        +void addExpressionOptimizerFactory(ExpressionOptimizerFactory factory)
    }

    class ExpressionOptimizerContext {
        -NodeManager nodeManager
        -RowExpressionSerde rowExpressionSerde
        -FunctionMetadataManager functionMetadataManager
        -StandardFunctionResolution functionResolution
        +ExpressionOptimizerContext(NodeManager nodeManager, RowExpressionSerde rowExpressionSerde, FunctionMetadataManager functionMetadataManager, StandardFunctionResolution functionResolution)
        +NodeManager getNodeManager()
        +RowExpressionSerde getRowExpressionSerde()
        +FunctionMetadataManager getFunctionMetadataManager()
        +StandardFunctionResolution getFunctionResolution()
    }

    class RowExpressionSerde {
        <<interface>>
        +String serialize(RowExpression expression)
        +RowExpression deserialize(String value)
    }

    class JsonCodecRowExpressionSerde {
        -JsonCodec~RowExpression~ codec
        +JsonCodecRowExpressionSerde(JsonCodec~RowExpression~ codec)
        +String serialize(RowExpression expression)
        +RowExpression deserialize(String data)
    }

    RowExpressionSerde <|.. JsonCodecRowExpressionSerde
    ExpressionOptimizerManager --> RowExpressionSerde : uses
    ExpressionOptimizerManager --> ExpressionOptimizerContext : creates
    ExpressionOptimizerContext --> RowExpressionSerde : exposes

    class NativeExpressionOptimizerFactory {
        <<factory>>
        +static String NAME
        -ClassLoader classLoader
        +NativeExpressionOptimizerFactory(ClassLoader classLoader)
        +String getName()
        +ExpressionOptimizer createOptimizer(Map~String,String~ config, ExpressionOptimizerContext context)
    }

    class NativeExpressionsModule {
        <<module>>
        -NodeManager nodeManager
        -RowExpressionSerde rowExpressionSerde
        -FunctionMetadataManager functionMetadataManager
        -StandardFunctionResolution functionResolution
        +NativeExpressionsModule(NodeManager nodeManager, RowExpressionSerde rowExpressionSerde, FunctionMetadataManager functionMetadataManager, StandardFunctionResolution functionResolution)
        +void configure(Binder binder)
    }

    class NativeExpressionOptimizer {
        <<service>>
        -FunctionMetadataManager functionMetadataManager
        -StandardFunctionResolution resolution
        -NativeSidecarExpressionInterpreter rowExpressionInterpreterService
        +NativeExpressionOptimizer(NativeSidecarExpressionInterpreter rowExpressionInterpreterService, FunctionMetadataManager functionMetadataManager, StandardFunctionResolution resolution)
        +RowExpression optimize(RowExpression expression, ExpressionOptimizer.Level level, ConnectorSession session, Function~VariableReferenceExpression,Object~ variableResolver)
    }

    class CollectingVisitor {
        -FunctionMetadataManager functionMetadataManager
        -ExpressionOptimizer.Level optimizationLevel
        -StandardFunctionResolution resolution
        -Set~RowExpression~ expressionsToOptimize
        -Set~RowExpression~ hasOptimizedChildren
        +CollectingVisitor(FunctionMetadataManager functionMetadataManager, ExpressionOptimizer.Level optimizationLevel, StandardFunctionResolution resolution)
        +boolean canBeOptimized(RowExpression rowExpression)
        +List~RowExpression~ getExpressionsToOptimize()
    }

    class ReplacingVisitor {
        -Function~RowExpression,RowExpression~ resolver
        +ReplacingVisitor(Map~RowExpression,RowExpression~ replacements)
        +ReplacingVisitor(Function~VariableReferenceExpression,RowExpression~ variableResolver)
    }

    class NativeSidecarExpressionInterpreter {
        <<service>>
        +static String PRESTO_TIME_ZONE_HEADER
        +static String PRESTO_USER_HEADER
        +static String PRESTO_EXPRESSION_OPTIMIZER_LEVEL_HEADER
        -NodeManager nodeManager
        -HttpClient httpClient
        -JsonCodec~List~RowExpression~~ rowExpressionCodec
        -JsonCodec~List~RowExpressionOptimizationResult~~ rowExpressionOptimizationResultJsonCodec
        +NativeSidecarExpressionInterpreter(HttpClient httpClient, NodeManager nodeManager, JsonCodec~List~RowExpressionOptimizationResult~~ rowExpressionOptimizationResultJsonCodec, JsonCodec~List~RowExpression~~ rowExpressionCodec)
        +Map~RowExpression,RowExpression~ optimizeBatch(ConnectorSession session, Map~RowExpression,RowExpression~ expressions, ExpressionOptimizer.Level level)
        +List~RowExpressionOptimizationResult~ optimize(ConnectorSession session, ExpressionOptimizer.Level level, List~RowExpression~ resolvedExpressions)
    }

    class RowExpressionSerializer {
        -RowExpressionSerde rowExpressionSerde
        +RowExpressionSerializer(RowExpressionSerde rowExpressionSerde)
        +void serialize(RowExpression rowExpression, JsonGenerator jsonGenerator, SerializerProvider serializerProvider)
    }

    class RowExpressionDeserializer {
        -RowExpressionSerde rowExpressionSerde
        +RowExpressionDeserializer(RowExpressionSerde rowExpressionSerde)
        +RowExpression deserialize(JsonParser jsonParser, DeserializationContext context)
    }

    NativeExpressionOptimizerFactory --> NativeExpressionsModule : installs
    NativeExpressionsModule --> NativeExpressionOptimizer : binds
    NativeExpressionsModule --> NativeSidecarExpressionInterpreter : binds
    NativeExpressionsModule --> RowExpressionSerializer : binds
    NativeExpressionsModule --> RowExpressionDeserializer : binds

    NativeExpressionOptimizer ..> CollectingVisitor : uses
    NativeExpressionOptimizer ..> ReplacingVisitor : uses
    NativeExpressionOptimizer --> NativeSidecarExpressionInterpreter : uses

    RowExpressionSerializer --> RowExpressionSerde : uses
    RowExpressionDeserializer --> RowExpressionSerde : uses
Loading

File-Level Changes

Change Details Files
Introduce a native expression optimizer and sidecar interpreter that batch-optimizes RowExpressions via the sidecar /v1/expressions endpoint, including JSON (de)serialization helpers and module wiring.
  • Add NativeExpressionOptimizer implementing ExpressionOptimizer, which discovers optimizable subexpressions, resolves variables, calls NativeSidecarExpressionInterpreter.optimizeBatch, and rewrites the original tree with optimized results.
  • Add NativeSidecarExpressionInterpreter to construct HTTP POST requests to the sidecar /v1/expressions endpoint, handle headers (user, timezone, optimizer level), decode RowExpressionOptimizationResult, and repackage sidecar failures into a single PrestoException.
  • Introduce NativeExpressionsModule and NativeExpressionOptimizerFactory to bind NodeManager, RowExpressionSerde, function metadata/resolution, JSON codecs for RowExpression and RowExpressionOptimizationResult, and expose a named optimizer factory "native" from NativeSidecarPlugin.
  • Add RowExpressionSerializer/RowExpressionDeserializer and a RowExpressionSerde SPI (JsonCodecRowExpressionSerde implementation) to provide pluggable JSON serialization for RowExpression used by the optimizer and sidecar communication.
presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/NativeExpressionOptimizer.java
presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/NativeSidecarExpressionInterpreter.java
presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/NativeExpressionsModule.java
presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/NativeExpressionOptimizerFactory.java
presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/RowExpressionDeserializer.java
presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/RowExpressionSerializer.java
presto-main-base/src/main/java/com/facebook/presto/sql/expressions/JsonCodecRowExpressionSerde.java
presto-spi/src/main/java/com/facebook/presto/spi/RowExpressionSerde.java
presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/NativeSidecarPlugin.java
presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/NativeSidecarPluginQueryRunnerUtils.java
Refactor RowExpressionOptimizer and ExpressionOptimizerManager to remove the old built-in-namespace hack and to support pluggable optimizers and RowExpressionSerde, then wire this through server, Spark, tests, and LocalQueryRunner.
  • Simplify RowExpressionOptimizer to directly use RowExpressionInterpreter without rewriting function namespaces and delete the inner BuiltInNamespaceRewriter logic.
  • Extend ExpressionOptimizerContext to carry RowExpressionSerde and adjust ExpressionOptimizerManager constructors and loadExpressionOptimizerFactory to pass it into factories; add logging and make file-based loading public for tests.
  • Bind RowExpressionSerde to JsonCodecRowExpressionSerde in server and Spark modules, and register RowExpression JSON codecs where needed (e.g., ServerMainModule, PrestoSparkModule).
  • Update all call sites and tests (AbstractTestQueryFramework, OptimizerAssert, TestPrestoNativeBuiltInFunctions, TestSimplifyRowExpressions, TestExpressionOptimizerManager, LocalQueryRunner) to construct ExpressionOptimizerManager with the new RowExpressionSerde argument.
  • Remove namespace-switching tests and helper methods from TestRowExpressionOptimizer that depended on the old Java/builtin namespace hack.
  • Expose FunctionAndTypeManager.getHandleResolver() and TestingPrestoServer.getFunctionAndTypeManager(), and add MetadataManager.createTestMetadataManager(FunctionAndTypeManager) for reuse in tests and modules.
  • Make HandleJsonModule optionally accept an injected HandleResolver instead of always binding a new singleton.
presto-main-base/src/main/java/com/facebook/presto/sql/relational/RowExpressionOptimizer.java
presto-spi/src/main/java/com/facebook/presto/spi/sql/planner/ExpressionOptimizerContext.java
presto-main-base/src/main/java/com/facebook/presto/sql/expressions/ExpressionOptimizerManager.java
presto-main/src/main/java/com/facebook/presto/server/ServerMainModule.java
presto-spark-base/src/main/java/com/facebook/presto/spark/PrestoSparkModule.java
presto-main-base/src/main/java/com/facebook/presto/testing/LocalQueryRunner.java
presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestQueryFramework.java
presto-main-base/src/test/java/com/facebook/presto/sql/planner/assertions/OptimizerAssert.java
presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoNativeBuiltInFunctions.java
presto-main-base/src/test/java/com/facebook/presto/sql/planner/iterative/rule/TestSimplifyRowExpressions.java
presto-main-base/src/test/java/com/facebook/presto/sql/expressions/TestExpressionOptimizerManager.java
presto-main-base/src/main/java/com/facebook/presto/metadata/HandleJsonModule.java
presto-main-base/src/main/java/com/facebook/presto/metadata/FunctionAndTypeManager.java
presto-main/src/main/java/com/facebook/presto/server/testing/TestingPrestoServer.java
presto-main-base/src/main/java/com/facebook/presto/metadata/MetadataManager.java
presto-native-execution/pom.xml
Rework native sidecar tests to use the new NativeExpressionOptimizer/NativeSidecarExpressionInterpreter path and shared infrastructure rather than manually spawning a sidecar process and doing low-level HTTP/JSON handling.
  • Change TestNativeExpressionInterpreter to obtain a DistributedQueryRunner via NativeSidecarPluginQueryRunner.getQueryRunner, build test MetadataManager and TestingRowExpressionTranslator, and retrieve a NativeSidecarExpressionInterpreter via Guice wiring instead of manual HTTP clients and JSON decoding.
  • Replace custom JSON/HTTP plumbing in TestNativeExpressionInterpreter (manual codecs, HTTP calls, and sidecar startup logic) with calls to NativeSidecarExpressionInterpreter.optimize and assertions on RowExpressionOptimizationResult objects.
  • Update regex expectations in testFailedExpressionOptimization to account for new native function namespace prefixes (native.default.fail).
  • Refactor NativeSidecarPluginQueryRunner to expose a reusable getQueryRunner helper and use it in main.
  • Register the native expression optimizer factory in NativeSidecarPluginQueryRunnerUtils.setupNativeSidecarPlugin so tests use the native optimizer through the ExpressionOptimizerManager.
  • Add missing test-scope dependencies (e.g., javax.ws.rs-api and airlift json) in relevant POMs, and relax dependency analyzer configuration in presto-tests to ignore airlift:node.
  • Add JSON codec for RowExpression to server and Spark JSON bindings to support the new serde in tests and runtime.
  • Clean up now-unused helpers in TestNativeExpressionInterpreter (sidecar process creation, random port selection, manual JSON codec).
presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java
presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/NativeSidecarPluginQueryRunner.java
presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/NativeSidecarPluginQueryRunnerUtils.java
presto-native-execution/pom.xml
presto-native-sidecar-plugin/pom.xml
presto-tests/pom.xml
presto-main/src/main/java/com/facebook/presto/server/ServerMainModule.java
presto-spark-base/src/main/java/com/facebook/presto/spark/PrestoSparkModule.java

Possibly linked issues

  • #: PR’s new native expression optimizer and RowExpressionOptimizer changes address missing rewriting/folding of nested CallExpressions.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • In TestNativeExpressionInterpreter.optimizeRowExpression you use a bare Java assert to check getExpressionFailureInfo().getMessage() != null; consider replacing this with a TestNG assertion (e.g., assertNotNull) so the check is always enforced regardless of JVM -ea flags.
  • NativeExpressionsModule sets up its own RowExpression (de)serialization and codecs, while ServerMainModule/PrestoSparkModule also bind RowExpression JSON codecs; it may be worth consolidating these bindings or documenting their intended separation to avoid subtle mismatches between server and sidecar serialization.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In TestNativeExpressionInterpreter.optimizeRowExpression you use a bare Java `assert` to check `getExpressionFailureInfo().getMessage() != null`; consider replacing this with a TestNG assertion (e.g., `assertNotNull`) so the check is always enforced regardless of JVM `-ea` flags.
- NativeExpressionsModule sets up its own RowExpression (de)serialization and codecs, while ServerMainModule/PrestoSparkModule also bind RowExpression JSON codecs; it may be worth consolidating these bindings or documenting their intended separation to avoid subtle mismatches between server and sidecar serialization.

## Individual Comments

### Comment 1
<location> `presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/NativeExpressionOptimizer.java:218-227` </location>
<code_context>
+            if (node.getForm() == COALESCE) {
</code_context>

<issue_to_address>
**suggestion (bug_risk):** The COALESCE optimization condition is always true for the `size <= arguments.size()` part, so the logic doesn’t match the comment and may over-mark expressions as optimizable.

In the COALESCE branch, `builder` only contains arguments where `canBeOptimized(argument)` is true, and this block only executes when `anyArgumentsOptimizable` is false, so `builder.build().size()` is always 0. As a result, `builder.build().size() <= node.getArguments().size()` is always true for any non-empty COALESCE, making `canBeOptimized` true regardless of duplicates. This contradicts the de‑duplication comment and likely sends many more COALESCEs to the sidecar than intended. If you’re trying to detect duplicates, you probably need a strict `<` and/or to compare the original argument count to the size of a set containing all arguments, not just the optimizable ones.

Suggested implementation:

```java
            // If the special form is COALESCE, then we can optimize it if there are any duplicate arguments
            if (node.getForm() == COALESCE) {
                // Detect duplicates across all arguments, not just the optimizable ones
                int originalArgumentCount = node.getArguments().size();
                ImmutableSet<RowExpression> uniqueArguments = ImmutableSet.copyOf(node.getArguments());
                boolean hasDuplicates = uniqueArguments.size() < originalArgumentCount;

                // If there were any duplicates, or if there's no arguments (cancel out), or if there's only one argument (just return it),
                // then it's also constant foldable

```

```java
                // If there were any duplicates, or if there's no arguments (cancel out), or if there's only one argument (just return it),
                // then it's also constant foldable
                if (hasDuplicates || originalArgumentCount == 0 || originalArgumentCount == 1) {
                    visitNode(node, true);
                    return null;
                }

```

If there is an existing `if` block immediately after this COALESCE section that calls `visitNode(node, true)` based on `builder.build().size() <= node.getArguments().size()`, it should be removed, because the new `hasDuplicates` logic fully replaces that condition. Adjust imports if `ImmutableSet` is not already statically imported in this file (it appears to be, given the existing code).
</issue_to_address>

### Comment 2
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java:397-401` </location>
<code_context>
+        RowExpressionOptimizationResult response = optimize(expression, level);
</code_context>

<issue_to_address>
**issue (testing):** Avoid using Java `assert` in tests; use TestNG assertions so the checks always run

This uses a plain Java `assert` to check `response.getExpressionFailureInfo().getMessage()` for non-null, which may be disabled in CI or production runs and therefore never execute. Replace it with a TestNG assertion, e.g. `assertNotNull(response.getExpressionFailureInfo().getMessage());`, before the `assertTrue` call so the check always runs.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@pdabre12
Copy link
Contributor Author

@sourcery-ai review

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • In CollectingVisitor.visitSpecialForm, the COALESCE optimization check builder.build().size() <= node.getArguments().size() is always true (since a set cannot be larger than the list), so any COALESCE will be treated as optimizable regardless of duplicates; consider fixing this logic to actually detect and rely on duplicate arguments.
  • In NativeExpressionOptimizer.optimize, the mapping from generic Level to the sidecar-supported levels is done via ordinal() comparison; using an explicit switch or mapping on the enum values (e.g., SERIALIZABLE → OPTIMIZED, OPTIMIZED/EVALUATED → themselves) would avoid brittle ordinal-based coupling.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `CollectingVisitor.visitSpecialForm`, the COALESCE optimization check `builder.build().size() <= node.getArguments().size()` is always true (since a set cannot be larger than the list), so any COALESCE will be treated as optimizable regardless of duplicates; consider fixing this logic to actually detect and rely on duplicate arguments.
- In `NativeExpressionOptimizer.optimize`, the mapping from generic `Level` to the sidecar-supported levels is done via `ordinal()` comparison; using an explicit switch or mapping on the enum values (e.g., SERIALIZABLE → OPTIMIZED, OPTIMIZED/EVALUATED → themselves) would avoid brittle ordinal-based coupling.

## Individual Comments

### Comment 1
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java:360-362` </location>
<code_context>
-            log.error(e, "Failed to decode RowExpression from sidecar response: %s.", e.getMessage());
-            throw new RuntimeException(e);
-        }
+        RowExpressionOptimizationResult response = optimize(rowExpression, ExpressionOptimizer.Level.EVALUATED);
+        assertNotNull(response.getExpressionFailureInfo().getMessage());
+        assertTrue(response.getExpressionFailureInfo().getMessage().contains(errorMessage), format("Sidecar response: %s did not contain expected error message: %s.", response, errorMessage));
     }

</code_context>

<issue_to_address>
**suggestion (testing):** Also assert that `optimizedExpression` is null when a failure is returned

In `assertEvaluateFails`, the failure contract for `RowExpressionOptimizationResult` is that `expressionFailureInfo` is set and `optimizedExpression` is null. Please also assert that `response.getOptimizedExpression()` is null so tests will catch any regression where a non-null optimized expression is returned alongside a failure.

```suggestion
        RowExpressionOptimizationResult response = optimize(rowExpression, ExpressionOptimizer.Level.EVALUATED);
        assertNotNull(response.getExpressionFailureInfo().getMessage());
        assertNull(response.getOptimizedExpression());
        assertTrue(response.getExpressionFailureInfo().getMessage().contains(errorMessage), format("Sidecar response: %s did not contain expected error message: %s.", response, errorMessage));
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment, otherwise LGTM

@pdabre12 pdabre12 requested a review from tdcmeehan December 11, 2025 18:16
@pdabre12
Copy link
Contributor Author

@sourcery-ai review

@pdabre12
Copy link
Contributor Author

pdabre12 commented Dec 11, 2025

Hey there - I've reviewed your changes - here's some feedback:

  • In CollectingVisitor.visitSpecialForm, the COALESCE optimization check builder.build().size() <= node.getArguments().size() is always true (since a set cannot be larger than the list), so any COALESCE will be treated as optimizable regardless of duplicates; consider fixing this logic to actually detect and rely on duplicate arguments.
  • In NativeExpressionOptimizer.optimize, the mapping from generic Level to the sidecar-supported levels is done via ordinal() comparison; using an explicit switch or mapping on the enum values (e.g., SERIALIZABLE → OPTIMIZED, OPTIMIZED/EVALUATED → themselves) would avoid brittle ordinal-based coupling.

oridinal() based comparison should be okay.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • NativeSidecarExpressionInterpreter.optimizeBatch assumes every RowExpressionOptimizationResult contains a non-null optimizedExpression and ignores expressionFailureInfo; consider validating the response and propagating or surfacing failures explicitly instead of blindly dereferencing getOptimizedExpression().
  • NativeExpressionOptimizer maps arbitrary levels via ordinal comparisons (level.ordinal() < OPTIMIZED.ordinal() ? OPTIMIZED : level); to avoid subtle bugs if new levels are added or the enum order changes, it would be safer to handle supported levels explicitly (e.g., switch on Level and reject or map only known cases like SERIALIZABLE).
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- NativeSidecarExpressionInterpreter.optimizeBatch assumes every RowExpressionOptimizationResult contains a non-null optimizedExpression and ignores expressionFailureInfo; consider validating the response and propagating or surfacing failures explicitly instead of blindly dereferencing getOptimizedExpression().
- NativeExpressionOptimizer maps arbitrary levels via ordinal comparisons (level.ordinal() < OPTIMIZED.ordinal() ? OPTIMIZED : level); to avoid subtle bugs if new levels are added or the enum order changes, it would be safer to handle supported levels explicitly (e.g., switch on Level and reject or map only known cases like SERIALIZABLE).

## Individual Comments

### Comment 1
<location> `presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/NativeExpressionOptimizer.java:181-183` </location>
<code_context>
+                    optimizationLevel.ordinal() == EVALUATED.ordinal();
+
+            // All arguments must be optimizable in order to evaluate the function
+            boolean allConstantFoldable = node.getArguments().stream()
+                    .peek(child -> child.accept(this, context))
+                    .allMatch(this::canBeOptimized);
+            if (canBeEvaluated && allConstantFoldable) {
+                visitNode(node, true);
</code_context>

<issue_to_address>
**issue (performance):** Child arguments of a CallExpression may not be fully visited, which can miss optimizable sub-expressions.

In `visitCall`, `allConstantFoldable` is computed via `node.getArguments().stream().peek(child -> child.accept(this, context)).allMatch(this::canBeOptimized);`. Because `allMatch` short-circuits, once an argument is not optimizable, later arguments (and their subtrees) are never visited, so their optimizable sub-expressions never reach `expressionsToOptimize`. To avoid this, visit all arguments unconditionally first (e.g., a simple loop calling `accept` on each child), then in a separate pass compute `allConstantFoldable` with `allMatch(this::canBeOptimized)` over the already-visited children.
</issue_to_address>

### Comment 2
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java:360-362` </location>
<code_context>
-            log.error(e, "Failed to decode RowExpression from sidecar response: %s.", e.getMessage());
-            throw new RuntimeException(e);
-        }
+        RowExpressionOptimizationResult response = optimize(rowExpression, ExpressionOptimizer.Level.EVALUATED);
+        assertNotNull(response.getExpressionFailureInfo().getMessage());
+        assertTrue(response.getExpressionFailureInfo().getMessage().contains(errorMessage), format("Sidecar response: %s did not contain expected error message: %s.", response, errorMessage));
     }

</code_context>

<issue_to_address>
**suggestion (testing):** The failure-path assertion no longer checks that the optimized expression is null, weakening the contract being tested.

The previous `assertEvaluateFails` also verified that `optimizedExpression` was absent (null) on failure, while the new test only checks the error message. To preserve the endpoint contract, please also assert that `optimizedExpression` is null, for example:

```java
RowExpressionOptimizationResult response = optimize(rowExpression, ExpressionOptimizer.Level.EVALUATED);
assertNull(response.getOptimizedExpression(), "Expected optimizedExpression to be null on failure");
assertNotNull(response.getExpressionFailureInfo().getMessage());
assertTrue(response.getExpressionFailureInfo().getMessage().contains(errorMessage),
        format("Sidecar response: %s did not contain expected error message: %s.", response, errorMessage));
```

This ensures we catch cases where a failure incorrectly includes a non-null optimized expression.

Suggested implementation:

```java
        RowExpressionOptimizationResult response = optimize(rowExpression, ExpressionOptimizer.Level.EVALUATED);
        assertNull(response.getOptimizedExpression(), "Expected optimizedExpression to be null on failure");
        assertNotNull(response.getExpressionFailureInfo().getMessage());
        assertTrue(response.getExpressionFailureInfo().getMessage().contains(errorMessage), format("Sidecar response: %s did not contain expected error message: %s.", response, errorMessage));

```

If this test class does not already have a static import for `assertNull` (for example, via `import static org.testng.Assert.*;` or `import static org.testng.Assert.assertNull;`), add the appropriate static import alongside the existing TestNG assertions.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@pdabre12
Copy link
Contributor Author

@sourcery-ai review

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `presto-main-base/src/main/java/com/facebook/presto/sql/relational/RowExpressionOptimizer.java:57-58` </location>
<code_context>
-                return null;
-            }
-        }
+        RowExpressionInterpreter interpreter = new RowExpressionInterpreter(expression, functionAndTypeManager, session, level);
+        return toRowExpression(expression.getSourceLocation(), interpreter.optimize(variableResolver::apply), expression.getType());
     }
 }
</code_context>

<issue_to_address>
**issue (bug_risk):** Guard against a null `variableResolver` to avoid potential NPEs.

The earlier version safely handled a nullable `variableResolver`, but this change always calls `variableResolver::apply`. If `variableResolver` can be null, this will throw an NPE. Either guarantee non-null at all call sites (and document it) or add a null check here and fall back to `interpreter.optimize()` when no resolver is provided.
</issue_to_address>

### Comment 2
<location> `presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/NativeExpressionOptimizer.java:73-74` </location>
<code_context>
+    public RowExpression optimize(RowExpression expression, Level level, ConnectorSession session, Function<VariableReferenceExpression, Object> variableResolver)
+    {
+        // Collect expressions to optimize
+        CollectingVisitor collectingVisitor = new CollectingVisitor(functionMetadataManager, level, resolution);
+        expression.accept(collectingVisitor, variableResolver);
+        List<RowExpression> expressionsToOptimize = collectingVisitor.getExpressionsToOptimize();
+
</code_context>

<issue_to_address>
**issue (bug_risk):** Handling of `variableResolver` in `CollectingVisitor` assumes it is non-null, which may not always hold.

`CollectingVisitor.visitVariableReference` casts `context` to `Function<VariableReferenceExpression, Object>` and calls it directly. If `variableResolver` is null or a different context type is passed in recursive calls, this will cause a runtime failure. Please either require a non-null resolver at the `optimize` entry point (e.g., via a precondition) or make `CollectingVisitor` handle a null context safely (for example, treating it as `variable -> null`).
</issue_to_address>

### Comment 3
<location> `presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/NativeSidecarExpressionInterpreter.java:110-116` </location>
<code_context>
-
-                    // Replace the namespace with the Java built-in namespace
-                    FunctionHandle javaNamespaceFunctionHandle;
-                    try {
-                        javaNamespaceFunctionHandle = functionAndTypeManager.lookupFunction(
-                                QualifiedObjectName.valueOf(JAVA_BUILTIN_NAMESPACE, call.getDisplayName()),
</code_context>

<issue_to_address>
**suggestion:** The error code `INVALID_ARGUMENTS` for failed sidecar communication may be misleading.

Treating `httpClient.execute` failures (network/serialization/5xx) as `INVALID_ARGUMENTS` misrepresents an internal/system error as a user input issue. Consider using a more suitable code (e.g., generic internal error or sidecar-specific error) so operational failures aren’t surfaced as argument validation problems.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@pdabre12 pdabre12 force-pushed the expression-optimizer branch from 69f9864 to f99b31d Compare December 11, 2025 19:55
@pdabre12
Copy link
Contributor Author

@sourcery-ai review

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and found some issues that need to be addressed.

  • In NativeExpressionOptimizer.CollectingVisitor the canBeEvaluated condition based on optimizationLevel.ordinal() means Level.SERIALIZABLE will never allow any function evaluation (neither deterministic nor non-deterministic), effectively disabling optimization at that level even though the sidecar call later maps SERIALIZABLE to OPTIMIZED; consider making the level check explicit rather than ordinal-based so SERIALIZABLE behaves as intended.
  • In the sidecar flow, success and failure on RowExpressionOptimizationResult are inferred implicitly from expressionFailureInfo/message (e.g. rePackageExceptions treats any non-null failure info as an exception, and tests check for getMessage().isEmpty()), which is easy to misinterpret; it would be clearer and safer to treat an empty/absent message as success explicitly (e.g., by short‑circuiting in rePackageExceptions when the message is empty) and update the test assertions to check expressionFailureInfo nullness separately from message content.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In NativeExpressionOptimizer.CollectingVisitor the `canBeEvaluated` condition based on `optimizationLevel.ordinal()` means `Level.SERIALIZABLE` will never allow any function evaluation (neither deterministic nor non-deterministic), effectively disabling optimization at that level even though the sidecar call later maps SERIALIZABLE to OPTIMIZED; consider making the level check explicit rather than ordinal-based so SERIALIZABLE behaves as intended.
- In the sidecar flow, success and failure on `RowExpressionOptimizationResult` are inferred implicitly from `expressionFailureInfo`/`message` (e.g. `rePackageExceptions` treats any non-null failure info as an exception, and tests check for `getMessage().isEmpty()`), which is easy to misinterpret; it would be clearer and safer to treat an empty/absent message as success explicitly (e.g., by short‑circuiting in `rePackageExceptions` when the message is empty) and update the test assertions to check `expressionFailureInfo` nullness separately from message content.

## Individual Comments

### Comment 1
<location> `presto-main-base/src/main/java/com/facebook/presto/sql/relational/RowExpressionOptimizer.java:57-58` </location>
<code_context>
-                return null;
-            }
-        }
+        RowExpressionInterpreter interpreter = new RowExpressionInterpreter(expression, functionAndTypeManager, session, level);
+        return toRowExpression(expression.getSourceLocation(), interpreter.optimize(variableResolver::apply), expression.getType());
     }
 }
</code_context>

<issue_to_address>
**issue (bug_risk):** Guard against null variableResolver to preserve previous behavior and avoid NPEs.

The old code allowed a null `variableResolver` by passing `null` into `RowExpressionInterpreter.optimize`. The new version always calls `variableResolver::apply`, which will NPE if `variableResolver` is null (it’s not marked `@Nullable`, so this is easy to miss). Either preserve the null check and call `optimize(null)` when it’s null, or enforce non-null with `requireNonNull(variableResolver, ...)` and update callers accordingly. The first option avoids a behavior change.
</issue_to_address>

### Comment 2
<location> `presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/NativeExpressionOptimizer.java:70-78` </location>
<code_context>
     @Override
     public RowExpression optimize(RowExpression expression, Level level, ConnectorSession session, Function<VariableReferenceExpression, Object> variableResolver)
     {
-        return getRowExpression(expression, level, session, variableResolver);
</code_context>

<issue_to_address>
**issue (bug_risk):** Handle possible null variableResolver consistently when resolving variables.

This now assumes `variableResolver` is non-null and passes it into `ReplacingVisitor` as `variable -> variableResolver.apply(variable)`, which will throw an NPE if callers pass `null` (previously allowed by `RowExpressionOptimizer`). To preserve that behavior, either skip the replacement pass when `variableResolver` is null or wrap it in a null-safe adapter before passing it to `ReplacingVisitor`.
</issue_to_address>

### Comment 3
<location> `presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/NativeSidecarExpressionInterpreter.java:110-116` </location>
<code_context>
-
-                    // Replace the namespace with the Java built-in namespace
-                    FunctionHandle javaNamespaceFunctionHandle;
-                    try {
-                        javaNamespaceFunctionHandle = functionAndTypeManager.lookupFunction(
-                                QualifiedObjectName.valueOf(JAVA_BUILTIN_NAMESPACE, call.getDisplayName()),
</code_context>

<issue_to_address>
**suggestion:** Consider using a more appropriate error code for sidecar communication failures.

Treating all HTTP exceptions as `INVALID_ARGUMENTS` conflates user input issues with connectivity or server-side failures. Prefer an internal/generic error code (e.g., `GENERIC_INTERNAL_ERROR`) for sidecar/infrastructure problems, and reserve `INVALID_ARGUMENTS` for real user errors, potentially by checking the HTTP status or response payload.
</issue_to_address>

### Comment 4
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java:75` </location>
<code_context>
-import static org.testng.Assert.assertNull;
 import static org.testng.Assert.assertTrue;

 public class TestNativeExpressionInterpreter
         extends AbstractTestExpressionInterpreter
 {
</code_context>

<issue_to_address>
**suggestion (testing):** There are no dedicated tests covering the new `NativeExpressionOptimizer` and sidecar optimizer wiring; key behaviors are currently untested.

This class only covers the native sidecar constant folding endpoint via `NativeSidecarExpressionInterpreter`; it doesn’t directly exercise `NativeExpressionOptimizer` or its interaction with `ExpressionOptimizerManager`.

Given the optimizer’s complexity (expression collection, COALESCE de-duplication, short-circuit handling, lambda optimization, variable resolver usage) and the behavior in `NativeSidecarExpressionInterpreter.optimizeBatch`/`rePackageExceptions` (batch mapping and error aggregation), please add targeted unit tests, for example:

* A `TestNativeExpressionOptimizer` suite that verifies:
  * Only eligible subexpressions are sent to the sidecar (deterministic vs non-deterministic, no-op casts, COALESCE with duplicate args, lambda/nested lambda bodies, AND/OR special forms).
  * Correct application of the variable resolver (constants vs unresolved variables).
  * No double-optimization of nested expressions (subtree de-dup in `CollectingVisitor`).
* Tests for `optimizeBatch` and `rePackageExceptions` that cover:
  * Successful optimization of all expressions and correct mapping back to originals.
  * Failures in one or more `RowExpressionOptimizationResult` entries, asserting a `PrestoException` with multiple suppressed causes.

These would ensure the optimizer/sidecar wiring can be safely refactored or extended without regressions.

Suggested implementation:

```java
import static com.facebook.presto.metadata.MetadataManager.createTestMetadataManager;
import static com.google.inject.multibindings.Multibinder.newSetBinder;
import static java.lang.String.format;
import static org.testng.Assert.assertEquals;
import static org.testng.Assert.assertNotNull;
import static org.testng.Assert.assertTrue;
import static org.testng.Assert.assertNull;

```

I only see the top of `TestNativeExpressionInterpreter`, so I can’t safely patch in the new tests at the correct location. Below is what you should add to this file (after the existing tests, before the final `}` of the class) to implement the comment:

1. **Add fields / helper wiring for the optimizer + sidecar interpreter**

Inside `TestNativeExpressionInterpreter` (as private fields, likely next to `metadata`, `translator`, `queryRunner`), add:

```java
private final ExpressionOptimizerManager optimizerManager;
private final NativeExpressionOptimizer nativeExpressionOptimizer;
private final NativeSidecarExpressionInterpreter sidecarExpressionInterpreter;
```

Initialize them in the existing constructor / `@BeforeClass` (wherever the current `metadata`, `translator`, `queryRunner` are initialized), roughly like this (adjust ctor signatures to match actual code):

```java
this.optimizerManager = new ExpressionOptimizerManager(
        metadata,
        /* typeManager */ metadata.getTypeManager(),
        /* functionResolution */ new FunctionResolution(metadata.getFunctionAndTypeManager()),
        /* deterministicEvaluator */ new DeterminismEvaluator(metadata.getFunctionAndTypeManager()),
        /* features/config */ new FeaturesConfig());

this.nativeExpressionOptimizer = new NativeExpressionOptimizer(
        optimizerManager,
        /* rowExpressionDomainTranslator */ new RowExpressionDomainTranslator(metadata.getFunctionAndTypeManager()),
        /* variableResolver */ new VariableResolver(/* if you have a concrete impl, e.g., ConstantSymbolResolver */),
        /* metadata */ metadata);

this.sidecarExpressionInterpreter = new NativeSidecarExpressionInterpreter(
        nativeExpressionOptimizer,
        /* sidecar client/service: */ new TestingNativeSidecarExpressionClient(),
        /* batch size / config */ new NativeSidecarConfig().setMaxBatchSize(16));
```

Use the real constructor parameters/types from your codebase; the above is a sketch.

2. **Add tests for `NativeExpressionOptimizer` expression collection & eligibility**

Add a new `@Test` method that verifies only eligible subexpressions are collected and optimized. Example:

```java
@Test
public void testNativeExpressionOptimizerEligibleSubexpressions()
{
    // Example: AND with deterministic & non-deterministic expressions, COALESCE with duplicates,
    // and a lambda expression.
    RowExpression deterministic = translator.translate("1 + 2", metadata);
    RowExpression nonDeterministic = translator.translate("rand()", metadata);
    RowExpression coalesceWithDuplicates = translator.translate("coalesce(a, a, 3)", metadata);
    RowExpression lambdaExpression = translator.translate("transform(array[1,2,3], x -> x + 1)", metadata);

    RowExpression compound = translator.translate(
            format("(%s AND %s) OR %s", "1 + 2", "rand()", "coalesce(a, a, 3)"),
            metadata);

    // Run through optimizer
    RowExpression optimized = nativeExpressionOptimizer.optimize(compound);

    // Assert that:
    // - deterministic subexpressions are folded (e.g., 1 + 2 -> 3)
    // - non-deterministic ones (rand()) are preserved
    // - COALESCE duplicates are de-duplicated
    // - lambda body is optimized but not double-optimized or duplicated
    assertTrue(containsLiteral(optimized, 3L), "Expected 1 + 2 to be constant folded to 3");
    assertTrue(containsCall(optimized, "rand"), "rand() should not be folded away");
    assertTrue(coalesceHasNoDuplicateArgs(optimized), "COALESCE duplicate arguments should be removed");
    assertTrue(lambdaBodyOptimized(optimized), "Lambda body should be optimized");
    assertFalse(hasDuplicateSubtrees(optimized), "Subtrees should be de-duplicated");
}
```

Implement the helper methods (`containsLiteral`, `containsCall`, `coalesceHasNoDuplicateArgs`, `lambdaBodyOptimized`, `hasDuplicateSubtrees`) as private methods in this test class, using the existing row-expression visitor utilities from the test base (`AbstractTestExpressionInterpreter` usually has helpers to walk `RowExpression` trees).

3. **Add tests for variable resolver behavior**

Add a test ensuring constants are resolved and unknown variables are left unresolved:

```java
@Test
public void testNativeExpressionOptimizerVariableResolver()
{
    // Assume `x` is bound to a constant and `y` is unbound
    Map<String, Object> bindings = ImmutableMap.of("x", 5);
    VariableResolver resolver = new MapBackedVariableResolver(bindings);

    NativeExpressionOptimizer optimizerWithResolver = new NativeExpressionOptimizer(
            optimizerManager,
            /* domain translator */ new RowExpressionDomainTranslator(metadata.getFunctionAndTypeManager()),
            resolver,
            metadata);

    RowExpression expression = translator.translate("x + y + 1", metadata);

    RowExpression optimized = optimizerWithResolver.optimize(expression);

    // Expect `x` and literal `1` to be folded, but `y` stays a variable
    assertTrue(containsVariable(optimized, "y"), "Unresolved variable y should remain");
    assertFalse(containsVariable(optimized, "x"), "Resolved variable x should be folded away");
    assertTrue(containsLiteral(optimized, 6L), "x + 1 should fold to 6");
}
```

Again, implement `MapBackedVariableResolver` and `containsVariable` as private helpers in the test (or re-use existing infrastructure if there is one in the module).

4. **Add tests for no double-optimization / subtree de-duplication**

Add a test that constructs an expression with repeated subtrees and verifies optimizer sees/optimizes them once and doesn’t re-insert duplicates:

```java
@Test
public void testNativeExpressionOptimizerNoDoubleOptimization()
{
    // repeated subtree: (1 + 2) appears twice
    RowExpression expression = translator.translate("(1 + 2) + (1 + 2)", metadata);

    RowExpression optimized = nativeExpressionOptimizer.optimize(expression);

    // Optimizer should fold (1 + 2) once and reuse the result instead of re-optimizing
    assertTrue(countLiteral(optimized, 3L) <= 2, "3 should not appear excessively due to double-optimization");
    assertFalse(hasDuplicateSubtrees(optimized), "Subtrees should be de-duplicated");
}
```

`countLiteral` and `hasDuplicateSubtrees` reuse the visitors from step 2.

5. **Add tests for `NativeSidecarExpressionInterpreter.optimizeBatch`**

Add a test to verify successful batch optimization and mapping back to the original expressions:

```java
@Test
public void testOptimizeBatchSuccess()
{
    List<RowExpression> expressions = ImmutableList.of(
            translator.translate("1 + 2", metadata),
            translator.translate("3 * 4", metadata),
            translator.translate("coalesce(1, 2)", metadata));

    List<RowExpressionOptimizationResult> results =
            sidecarExpressionInterpreter.optimizeBatch(expressions);

    assertEquals(results.size(), expressions.size());

    // All entries should be successful and match original order
    for (int i = 0; i < expressions.size(); i++) {
        RowExpressionOptimizationResult result = results.get(i);
        assertTrue(result.getException().isEmpty(), "Expected no exception at index " + i);
        assertNotNull(result.getOptimizedExpression(), "Optimized expression should be present at index " + i);
        assertTrue(isConstant(result.getOptimizedExpression()),
                "Expressions should be constant folded at index " + i);
    }
}
```

Implement or reuse `isConstant` (true if `RowExpression` is a literal) using existing test utilities.

6. **Add tests for `NativeSidecarExpressionInterpreter.rePackageExceptions`**

Add a test that simulates multiple failures in `RowExpressionOptimizationResult` and verifies they are wrapped into a `PrestoException` with suppressed causes:

```java
@Test
public void testRePackageExceptionsAggregatesFailures()
{
    RowExpression expr1 = translator.translate("1 / 0", metadata);
    RowExpression expr2 = translator.translate("unknown_function()", metadata);
    RowExpression expr3 = translator.translate("1 + 1", metadata);

    RuntimeException ex1 = new RuntimeException("division by zero");
    RuntimeException ex2 = new RuntimeException("unknown function");

    List<RowExpressionOptimizationResult> results = ImmutableList.of(
            RowExpressionOptimizationResult.failure(expr1, ex1),
            RowExpressionOptimizationResult.failure(expr2, ex2),
            RowExpressionOptimizationResult.success(expr3, expr3));

    PrestoException aggregated = sidecarExpressionInterpreter.rePackageExceptions(results);

    assertNotNull(aggregated);
    assertEquals(aggregated.getErrorCode(), SOME_EXPECTED_ERROR_CODE); // use the real error code constant
    Throwable[] suppressed = aggregated.getSuppressed();
    assertEquals(suppressed.length, 2);
    assertEquals(suppressed[0].getMessage(), "division by zero");
    assertEquals(suppressed[1].getMessage(), "unknown function");
}
```

If `rePackageExceptions` returns `null`/`Optional`/throws directly instead, adjust assertions accordingly. The important part is: multiple `RowExpressionOptimizationResult` failures lead to a single `PrestoException` that contains the individual failures.

7. **Add a test for the mixed success/failure path in `optimizeBatch` + `rePackageExceptions`**

Add a test that runs `optimizeBatch` with a `TestingNativeSidecarExpressionClient` that intentionally fails on some expressions and succeeds on others, then calls `rePackageExceptions` and asserts:

- successes are still usable,
- failures are aggregated into a single `PrestoException` with multiple suppressed exceptions.

For example:

```java
@Test
public void testOptimizeBatchWithPartialFailures()
{
    TestingNativeSidecarExpressionClient client = new TestingNativeSidecarExpressionClient()
            .failOnIndex(1)
            .failOnIndex(3);

    NativeSidecarExpressionInterpreter interpreter = new NativeSidecarExpressionInterpreter(
            nativeExpressionOptimizer,
            client,
            new NativeSidecarConfig().setMaxBatchSize(16));

    List<RowExpression> expressions = ImmutableList.of(
            translator.translate("1 + 2", metadata),
            translator.translate("1 / 0", metadata),
            translator.translate("3 * 4", metadata),
            translator.translate("unknown_function()", metadata));

    List<RowExpressionOptimizationResult> results = interpreter.optimizeBatch(expressions);

    // Validate successes / failures by index
    assertTrue(results.get(0).getException().isEmpty());
    assertTrue(results.get(2).getException().isEmpty());
    assertTrue(results.get(1).getException().isPresent());
    assertTrue(results.get(3).getException().isPresent());

    PrestoException aggregated = interpreter.rePackageExceptions(results);

    assertNotNull(aggregated);
    assertEquals(aggregated.getSuppressed().length, 2);
}
```

You will need a small `TestingNativeSidecarExpressionClient` in the test code (either in this file as a private static class or in a dedicated test helper file) that implements whatever sidecar client interface `NativeSidecarExpressionInterpreter` uses and allows configuring which indexes fail.

Once you wire these tests into `TestNativeExpressionInterpreter` with the real constructor parameters and helper utilities from your codebase, you will have coverage for:

- `NativeExpressionOptimizer` expression collection and eligibility (deterministic vs non-deterministic, COALESCE de-duplication, logical short-circuit structure, lambdas, nested lambdas).
- Variable resolver behavior.
- Subtree de-duplication (no double-optimization).
- `NativeSidecarExpressionInterpreter.optimizeBatch` success and mixed success/failure batches.
- `NativeSidecarExpressionInterpreter.rePackageExceptions` and the aggregated `PrestoException` with suppressed causes.

Adjust names, constructors, and error-code constants to the actual classes in `com.facebook.presto.sidecar.expressions`.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@pdabre12
Copy link
Contributor Author

pdabre12 commented Dec 11, 2025

Hey there - I've reviewed your changes and found some issues that need to be addressed.

  • In NativeExpressionOptimizer.CollectingVisitor the canBeEvaluated condition based on optimizationLevel.ordinal() means Level.SERIALIZABLE will never allow any function evaluation (neither deterministic nor non-deterministic), effectively disabling optimization at that level even though the sidecar call later maps SERIALIZABLE to OPTIMIZED; consider making the level check explicit rather than ordinal-based so SERIALIZABLE behaves as intended.

The check is simple, if level is EVALUATED, allow all. If check is less than EVALUATED(SERIALIZABLE or OPTIMIZED) only allow deterministic functions.

@pdabre12
Copy link
Contributor Author

@sourcery-ai review

Copy link

@SourceryAI SourceryAI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `presto-main-base/src/main/java/com/facebook/presto/sql/relational/RowExpressionOptimizer.java:57-58` </location>
<code_context>
-                return null;
-            }
-        }
+        RowExpressionInterpreter interpreter = new RowExpressionInterpreter(expression, functionAndTypeManager, session, level);
+        return toRowExpression(expression.getSourceLocation(), interpreter.optimize(variableResolver::apply), expression.getType());
     }
 }
</code_context>

<issue_to_address>
**issue (bug_risk):** Guard against null variableResolver in optimize(...) overload to avoid NPEs.

The previous implementation tolerated a nullable `variableResolver` and forwarded `null` to `RowExpressionInterpreter.optimize(...)`. The new code always calls `variableResolver::apply`, which will NPE if any existing caller still passes `null` (likely, given the prior behavior and lack of `@Nonnull`). Please either fail fast with `requireNonNull(variableResolver, ...)` or restore null-safe handling (e.g., conditional lambda vs `null`) to avoid a regression.
</issue_to_address>

### Comment 2
<location> `presto-native-sidecar-plugin/src/main/java/com/facebook/presto/sidecar/expressions/NativeExpressionOptimizer.java:113-117` </location>
<code_context>
+            // The native endpoint only supports optimizer levels OPTIMIZED or EVALUATED.
+            // In the sidecar, SERIALIZABLE is effectively the same as OPTIMIZED,
+            // so if SERIALIZABLE is requested, we use OPTIMIZED instead.
+            replacements.putAll(
+                    rowExpressionInterpreterService.optimizeBatch(
+                            session,
+                            expressions,
+                            level.ordinal() < OPTIMIZED.ordinal() ? OPTIMIZED : level));
+        }
+
</code_context>

<issue_to_address>
**issue (bug_risk):** The optimizer level mapping to the sidecar doesn’t match the comment and may pass unsupported levels.

The mapping only adjusts levels *below* `OPTIMIZED`, so any level like `SERIALIZABLE` with an ordinal above `OPTIMIZED` will be forwarded unchanged, which contradicts the comment and may violate the sidecar’s supported levels. An explicit mapping would be safer, e.g. treat `SERIALIZABLE` and anything below `OPTIMIZED` as `OPTIMIZED`, allow only `OPTIMIZED`/`EVALUATED` as-is, and reject anything else.
</issue_to_address>

### Comment 3
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java:397-400` </location>
<code_context>
+        RowExpressionOptimizationResult response = optimize(expression, level);
</code_context>

<issue_to_address>
**suggestion (testing):** Test should assert `expressionFailureInfo` is non-null instead of its message being non-null

In `optimizeRowExpression`, these assertions are meant to verify the successful optimization path has no failure. Asserting on `response.getExpressionFailureInfo().getMessage()` being non-null and then empty is misleading, and the earlier comment referenced the struct itself as non-null. To better reflect the protocol and match `assertEvaluateFails`, assert that `getExpressionFailureInfo()` is non-null, and then that its message is empty:

```java
assertNotNull(response.getExpressionFailureInfo());
assertTrue(response.getExpressionFailureInfo().getMessage().isEmpty());
```

Suggested implementation:

```java
    {
        expression = expression.accept(visitor, null);
        RowExpressionOptimizationResult response = optimize(expression, level);
        assertNotNull(response.getExpressionFailureInfo());
        assertTrue(response.getExpressionFailureInfo().getMessage().isEmpty());

```

1. If this test method already contains assertions about `response.getExpressionFailureInfo().getMessage()` being non-null, those should be removed to avoid contradictory expectations.
2. Ensure that `assertNotNull` and `assertTrue` are imported from the same assertion framework used elsewhere in the file (likely `org.testng.Assert.*` for TestNG or `org.junit.jupiter.api.Assertions.*` for JUnit). If they are not already statically imported, add the appropriate static imports at the top of the file.
</issue_to_address>

### Comment 4
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java:360-362` </location>
<code_context>
-            log.error(e, "Failed to decode RowExpression from sidecar response: %s.", e.getMessage());
-            throw new RuntimeException(e);
-        }
+        RowExpressionOptimizationResult response = optimize(rowExpression, ExpressionOptimizer.Level.EVALUATED);
+        assertNotNull(response.getExpressionFailureInfo().getMessage());
+        assertTrue(response.getExpressionFailureInfo().getMessage().contains(errorMessage), format("Sidecar response: %s did not contain expected error message: %s.", response, errorMessage));
     }

</code_context>

<issue_to_address>
**suggestion (testing):** `assertEvaluateFails` no longer verifies that `optimizedExpression` is null on failure

After the refactor this helper only checks the failure message and never asserts anything about `getOptimizedExpression()`. To keep validating `NativeSidecarExpressionInterpreter`’s failure contract, please also assert that `response.getOptimizedExpression()` is null (e.g., `assertNull(response.getOptimizedExpression(), ...)`) so tests will catch any regression where a partially optimized expression is returned on failure.

Suggested implementation:

```java
        RowExpressionOptimizationResult response = optimize(rowExpression, ExpressionOptimizer.Level.EVALUATED);
        assertNotNull(response.getExpressionFailureInfo().getMessage());
        assertNull(response.getOptimizedExpression(), format("Expected no optimized expression on failure, but got: %s", response.getOptimizedExpression()));
        assertTrue(response.getExpressionFailureInfo().getMessage().contains(errorMessage), format("Sidecar response: %s did not contain expected error message: %s.", response, errorMessage));

```

To compile, ensure that `assertNull` is available:
1. If using TestNG static imports, add `import static org.testng.Assert.assertNull;` at the top of the file (or `import static org.testng.Assert.*;` if that pattern is already used).
2. If JUnit is used instead, add the corresponding static import for `assertNull` from the correct Assert class, consistent with the existing assertions.
</issue_to_address>

### Comment 5
<location> `presto-main-base/src/test/java/com/facebook/presto/sql/expressions/TestExpressionOptimizerManager.java:66-70` </location>
<code_context>
         manager = new ExpressionOptimizerManager(
                 pluginNodeManager,
                 METADATA.getFunctionAndTypeManager(),
+                new JsonCodecRowExpressionSerde(jsonCodec(RowExpression.class)),
                 directory);
     }

</code_context>

<issue_to_address>
**suggestion (testing):** Missing tests for wiring and behavior of the new native expression optimizer factory

`ExpressionOptimizerManager` is now constructed with a `RowExpressionSerde`, but there’s no test that exercises `loadExpressionOptimizerFactory(...)` with an actual factory or verifies that this serde is propagated into the `ExpressionOptimizerContext`.

Please add tests that:
- register a test `ExpressionOptimizerFactory` that captures the `ExpressionOptimizerContext`,
- call `manager.loadExpressionOptimizerFactory(...)`, and
- assert that `context.getRowExpressionSerde()` is non-null and is the same instance passed to `ExpressionOptimizerManager`.

It would also be valuable to add an end-to-end test that registers the real `NativeExpressionOptimizerFactory` and confirms it can be instantiated and used to optimize a simple expression.

Suggested implementation:

```java
package com.facebook.presto.sql.expressions;

import com.facebook.airlift.json.JsonCodec;
import com.facebook.presto.Session;
import com.facebook.presto.metadata.FunctionAndTypeManager;
import com.facebook.presto.metadata.InternalNodeManager;
import com.facebook.presto.spi.Plugin;
import com.facebook.presto.spi.relation.ExpressionOptimizer;
import com.facebook.presto.spi.relation.ExpressionOptimizerContext;
import com.facebook.presto.spi.relation.ExpressionOptimizerFactory;
import com.facebook.presto.spi.relation.RowExpression;
import com.facebook.presto.sql.relational.JsonCodecRowExpressionSerde;
import com.facebook.presto.sql.relational.RowExpressionSerde;
import com.facebook.presto.testing.LocalQueryRunner;
import com.facebook.presto.testing.TestingSession;
import org.testng.annotations.AfterClass;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;

import java.nio.file.Path;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicReference;

import static com.facebook.airlift.json.JsonCodec.jsonCodec;
import static com.facebook.presto.spi.relation.ExpressionOptimizer.Level.OPTIMIZED;
import static com.facebook.presto.sql.relational.Expressions.constant;
import static com.facebook.presto.testing.TestingSession.testSessionBuilder;
import static java.nio.file.Files.createTempDirectory;
import static org.testng.Assert.assertNotNull;
import static org.testng.Assert.assertSame;

public class TestExpressionOptimizerManager
{
    private LocalQueryRunner queryRunner;
    private InternalNodeManager pluginNodeManager;
    private FunctionAndTypeManager functionAndTypeManager;
    private Path directory;
    private RowExpressionSerde rowExpressionSerde;
    private ExpressionOptimizerManager manager;

    @BeforeClass
    public void setUp()
            throws Exception
    {
        Session session = testSessionBuilder()
                .setCatalog("tpch")
                .setSchema("tiny")
                .build();

        this.queryRunner = LocalQueryRunner.create(session);
        this.pluginNodeManager = queryRunner.getNodeManager();
        this.functionAndTypeManager = queryRunner.getMetadata().getFunctionAndTypeManager();
        this.directory = createTempDirectory("expression-optimizer-manager-test");
        this.rowExpressionSerde = new JsonCodecRowExpressionSerde(jsonCodec(RowExpression.class));

        this.manager = new ExpressionOptimizerManager(
                pluginNodeManager,
                functionAndTypeManager,
                rowExpressionSerde,
                directory.toFile());
    }

    @AfterClass(alwaysRun = true)
    public void tearDown()
    {
        if (queryRunner != null) {
            queryRunner.close();
        }
    }

    @Test
    public void testRowExpressionSerdePropagatedToContext()
    {
        // Arrange: capture the context passed into the factory
        AtomicReference<ExpressionOptimizerContext> capturedContext = new AtomicReference<>();

        ExpressionOptimizerFactory capturingFactory = new ExpressionOptimizerFactory()
        {
            @Override
            public String getName()
            {
                return "capturing";
            }

            @Override
            public ExpressionOptimizer create(Session session, ExpressionOptimizerContext context)
            {
                capturedContext.set(context);

                // return a no-op optimizer; the behavior is not important for this test
                return (sess, expression, expectedType, level) -> expression;
            }
        };

        // Act: register the factory and trigger creation of an optimizer
        manager.loadExpressionOptimizerFactory(capturingFactory);

        Session session = TestingSession.testSessionBuilder()
                .setCatalog("tpch")
                .setSchema("tiny")
                .build();

        // This call should cause the manager to create an optimizer using the factory,
        // which in turn should capture the ExpressionOptimizerContext.
        ExpressionOptimizer optimizer = manager.getExpressionOptimizer("capturing", session);

        assertNotNull(optimizer, "ExpressionOptimizer should be created");

        ExpressionOptimizerContext context = capturedContext.get();
        assertNotNull(context, "ExpressionOptimizerContext should be captured");
        assertNotNull(context.getRowExpressionSerde(), "RowExpressionSerde in context should not be null");
        assertSame(
                context.getRowExpressionSerde(),
                rowExpressionSerde,
                "RowExpressionSerde in context should be the same instance passed to ExpressionOptimizerManager");
    }

    @Test
    public void testNativeExpressionOptimizerFactoryEndToEnd()
    {
        // Arrange: register the real native factory
        NativeExpressionOptimizerFactory nativeFactory = new NativeExpressionOptimizerFactory();
        manager.loadExpressionOptimizerFactory(nativeFactory);

        Session session = TestingSession.testSessionBuilder()
                .setCatalog("tpch")
                .setSchema("tiny")
                .build();

        // A very simple expression: a constant boolean
        RowExpression expression = constant(true, com.facebook.presto.spi.type.BooleanType.BOOLEAN);

        // Act: obtain an optimizer from the native factory and optimize the expression
        ExpressionOptimizer optimizer = manager.getExpressionOptimizer(nativeFactory.getName(), session);
        RowExpression optimized = optimizer.optimize(
                session,
                expression,
                com.facebook.presto.spi.type.BooleanType.BOOLEAN,
                OPTIMIZED);

        // Assert: we at least get a non-null result that is usable; the specific
        // shape is less important than verifying that wiring and instantiation work.
        assertNotNull(optimized, "Optimized expression should not be null");
    }
}

```

The exact APIs of `ExpressionOptimizerManager`, `ExpressionOptimizerFactory`, `ExpressionOptimizerContext`, and `NativeExpressionOptimizerFactory` may differ slightly in your codebase. To adapt this:

1. Update the `loadExpressionOptimizerFactory(...)` and `getExpressionOptimizer(...)` invocations to match the actual method signatures used in `ExpressionOptimizerManager`.
2. Adjust the `ExpressionOptimizerFactory` anonymous implementation to implement the correct methods and signatures (for example, if `create(...)` has different parameters).
3. Fix imports if the classes live in slightly different packages (e.g., `NativeExpressionOptimizerFactory`, `ExpressionOptimizerManager`, or `ExpressionOptimizerContext`).
4. If `ExpressionOptimizerContext` exposes the row-expression serde via a different accessor than `getRowExpressionSerde()`, update the assertion accordingly.
5. If the manager constructor takes slightly different arguments (e.g., a `File` vs. `Path`, or additional dependencies), adjust the `setUp()` method to pass the correct values while still constructing the `JsonCodecRowExpressionSerde` once and reusing the same instance in the assertions.
</issue_to_address>

### Comment 6
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/NativeSidecarPluginQueryRunnerUtils.java:53-55` </location>
<code_context>

         queryRunner.loadTypeManager(NativeTypeManagerFactory.NAME);
         queryRunner.loadPlanCheckerProviderManager("native", ImmutableMap.of());
+        queryRunner.getExpressionManager().loadExpressionOptimizerFactory(NativeExpressionOptimizerFactory.NAME, "native", ImmutableMap.of());
         queryRunner.installPlugin(new NativeSqlInvokedFunctionsPlugin());
     }
</code_context>

<issue_to_address>
**suggestion (testing):** Add an integration-style test that verifies the native expression optimizer is actually used by the engine

The helper now loads the native sidecar expression optimizer, but there’s no test confirming it’s actually used in end-to-end planning/optimization.

Please add an integration-style test (here or in a native-sidecar test) that builds a `QueryRunner` via `NativeSidecarPluginQueryRunner` with the native optimizer loaded, runs a query or planning pipeline on an expression the native sidecar optimizes (e.g., a constant-folding case from `TestNativeExpressionInterpreter`), and asserts that the native optimizer is selected (by name) or that the plan/expression reflects the native-optimized form. This will verify the configuration has the intended effect.

Suggested implementation:

```java
package com.facebook.presto.sidecar;

import com.facebook.presto.scalar.sql.NativeSqlInvokedFunctionsPlugin;
import com.facebook.presto.sidecar.expressions.NativeExpressionOptimizerFactory;
import com.facebook.presto.sidecar.functionNamespace.NativeFunctionNamespaceManagerFactory;
import com.facebook.presto.sidecar.sessionpropertyproviders.NativeSystemSessionPropertyProviderFactory;
import com.facebook.presto.sidecar.typemanager.NativeTypeManagerFactory;
import com.facebook.presto.testing.QueryRunner;
import org.testng.annotations.Test;

import static org.testng.Assert.assertTrue;

```

```java
        queryRunner.loadTypeManager(NativeTypeManagerFactory.NAME);
        queryRunner.loadPlanCheckerProviderManager("native", ImmutableMap.of());
        queryRunner.getExpressionManager().loadExpressionOptimizerFactory(NativeExpressionOptimizerFactory.NAME, "native", ImmutableMap.of());
        queryRunner.installPlugin(new NativeSqlInvokedFunctionsPlugin());
    }

    /**
     * Integration-style test that verifies the native expression optimizer is actually
     * registered and visible through the engine when the QueryRunner is created via
     * NativeSidecarPluginQueryRunnerUtils.
     *
     * The test intentionally goes through the same helper used by other tests instead
     * of constructing its own QueryRunner to ensure we are validating the real wiring.
     */
    @Test
    public void testNativeExpressionOptimizerIsLoadedAndVisible()
            throws Exception
    {
        // Build a QueryRunner via the helper so we exercise the same configuration
        // path that production-style tests use.
        try (QueryRunner queryRunner = createNativeSidecarQueryRunner()) {
            // Verify that the native expression optimizer factory is registered with the
            // ExpressionManager under the expected name. This confirms that the optimizer
            // is visible to the engine's planning / optimization pipeline, not just
            // that the factory class exists on the classpath.
            assertTrue(
                    queryRunner.getExpressionManager()
                            .getExpressionOptimizerFactoryNames()
                            .contains(NativeExpressionOptimizerFactory.NAME),
                    "Native expression optimizer factory should be registered and visible to the engine");
        }
    }
}

```

The above edit assumes the following existing structure and APIs, which you may need to adjust to match the actual codebase:

1. **Helper method name and signature**
   - The test calls `createNativeSidecarQueryRunner()`. If your helper method has a different name or signature (e.g., `createNativeSidecarPluginQueryRunner()` or it takes parameters), update the call accordingly.

2. **ExpressionManager API**
   - The test uses `queryRunner.getExpressionManager().getExpressionOptimizerFactoryNames()` which is a placeholder for a method that exposes the registered optimizer factory names as a `Set<String>` or similar.
   - If your `ExpressionManager` exposes this information differently (e.g., `getExpressionOptimizerFactories().keySet()` or another accessor), change the assertion to use the real method while still checking that `NativeExpressionOptimizerFactory.NAME` is present.

3. **Imports**
   - Ensure `ImmutableMap` is already imported in this file (it appears to be used in the snippet). If not, add `import com.google.common.collect.ImmutableMap;`.
   - If your `QueryRunner` type resides in a different package than `com.facebook.presto.testing.QueryRunner`, adjust the import accordingly.

4. **Test style / framework**
   - The test uses TestNG (`@Test`, `assertTrue` from `org.testng.Assert`). If this module uses JUnit instead, switch the annotations and assertions to the appropriate framework.

5. **Stronger behavioral assertion (optional)**
   - To more directly assert that the native optimizer is *used* (not just registered), you can extend the test to:
     - Run a query or planning pipeline on a constant-folding expression (e.g., `SELECT 1 + 2`).
     - Inspect the resulting plan or expression to confirm it is in the optimized form (e.g., `3` instead of `1 + 2`), using the same patterns as in `TestNativeExpressionInterpreter`.
   - Implement this by:
     - Importing the plan / expression inspection utilities already used in other tests.
     - Adding assertions after the registration check to verify the optimized form is present in the plan.
</issue_to_address>

Hi @pdabre12! 👋

Thanks for trying out Sourcery by commenting with @sourcery-ai review! 🚀

Install the sourcery-ai bot to get automatic code reviews on every pull request ✨

Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and found some issues that need to be addressed.

  • NativeExpressionOptimizer implements ExpressionOptimizer but only overrides the 4-argument optimize method; you should also implement the 3-argument optimize(RowExpression, Level, ConnectorSession) and delegate to the 4-argument version to avoid compilation issues and ensure consistent behavior when callers use the simpler API.
  • In TestNativeExpressionInterpreter.optimizeRowExpression you assert that expressionFailureInfo.getMessage() is both non-null and empty; if an empty string is the success indicator, consider dropping the assertNotNull or explicitly documenting this convention to make the intent clearer to future readers.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- NativeExpressionOptimizer implements ExpressionOptimizer but only overrides the 4-argument optimize method; you should also implement the 3-argument optimize(RowExpression, Level, ConnectorSession) and delegate to the 4-argument version to avoid compilation issues and ensure consistent behavior when callers use the simpler API.
- In TestNativeExpressionInterpreter.optimizeRowExpression you assert that expressionFailureInfo.getMessage() is both non-null and empty; if an empty string is the success indicator, consider dropping the assertNotNull or explicitly documenting this convention to make the intent clearer to future readers.

## Individual Comments

### Comment 1
<location> `presto-main-base/src/main/java/com/facebook/presto/sql/relational/RowExpressionOptimizer.java:57-58` </location>
<code_context>
-                return null;
-            }
-        }
+        RowExpressionInterpreter interpreter = new RowExpressionInterpreter(expression, functionAndTypeManager, session, level);
+        return toRowExpression(expression.getSourceLocation(), interpreter.optimize(variableResolver::apply), expression.getType());
     }
 }
</code_context>

<issue_to_address>
**issue (bug_risk):** Guard against null variableResolver to avoid NullPointerException.

The `ExpressionOptimizer` contract doesn’t ensure `variableResolver` is non-null, and the previous `getRowExpression` handled a nullable resolver. With the new `variableResolver::apply` call, a null `variableResolver` will cause an NPE. Either enforce non-null with a precondition or preserve explicit handling of the null case (e.g., call `interpreter.optimize(null)` when the resolver is null).
</issue_to_address>

### Comment 2
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java:360-362` </location>
<code_context>
-            log.error(e, "Failed to decode RowExpression from sidecar response: %s.", e.getMessage());
-            throw new RuntimeException(e);
-        }
+        RowExpressionOptimizationResult response = optimize(rowExpression, ExpressionOptimizer.Level.EVALUATED);
+        assertNotNull(response.getExpressionFailureInfo().getMessage());
+        assertTrue(response.getExpressionFailureInfo().getMessage().contains(errorMessage), format("Sidecar response: %s did not contain expected error message: %s.", response, errorMessage));
     }

</code_context>

<issue_to_address>
**suggestion (testing):** Also assert that failed optimizations do not produce an optimized expression

The earlier helper verified both the error and the absence of an optimized expression. The new version only asserts on the error message. To preserve that contract and catch regressions in sidecar error handling, also assert that `response.getOptimizedExpression()` is null (or equivalent) here.

```suggestion
        RowExpressionOptimizationResult response = optimize(rowExpression, ExpressionOptimizer.Level.EVALUATED);
        assertNull(response.getOptimizedExpression(), format("Expected no optimized expression for failed optimization, but got: %s", response.getOptimizedExpression()));
        assertNotNull(response.getExpressionFailureInfo().getMessage());
        assertTrue(response.getExpressionFailureInfo().getMessage().contains(errorMessage), format("Sidecar response: %s did not contain expected error message: %s.", response, errorMessage));
```
</issue_to_address>

### Comment 3
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java:397-401` </location>
<code_context>
+        RowExpressionOptimizationResult response = optimize(expression, level);
</code_context>

<issue_to_address>
**nitpick (testing):** Assertion on `expressionFailureInfo` looks inverted and may be misleading

You’re asserting `assertNotNull(response.getExpressionFailureInfo().getMessage());` and then immediately asserting that the message is empty. Per the previous comment that an empty message means "no failure", the key condition is that the message is empty, not that it’s non-null. The null check adds little value and can be misleading. Consider removing `assertNotNull`, or instead asserting on the entire `expressionFailureInfo` object (e.g., object is non-null and its `message` is empty) to better express the test’s intent.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@pdabre12 pdabre12 force-pushed the expression-optimizer branch from 3846462 to 380776d Compare December 11, 2025 20:39
@pdabre12 pdabre12 requested a review from SourceryAI December 11, 2025 20:41
for (int i = 0; i < rowExpressionOptimizationResults.size(); i++) {
result.put(originalExpressions.get(i), rowExpressionOptimizationResults.get(i).getOptimizedExpression());
}
return result.build();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the row expressions inside of originalExpressions unique? If not, you can use result.buildKeepingLast()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are unique, these originalExpressions come from a set variable expressionsToOptimize


queryRunner.loadTypeManager(NativeTypeManagerFactory.NAME);
queryRunner.loadPlanCheckerProviderManager("native", ImmutableMap.of());
queryRunner.getExpressionManager().loadExpressionOptimizerFactory(NativeExpressionOptimizerFactory.NAME, "native", ImmutableMap.of());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of hardcoding "native", you can use NativeExpressionOptimizerFactory.NAME

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, just went with the above structure, wherein they are hardcoding "native".

@pdabre12
Copy link
Contributor Author

pdabre12 commented Dec 12, 2025

Hey there - I've reviewed your changes and found some issues that need to be addressed.

  • NativeExpressionOptimizer implements ExpressionOptimizer but only overrides the 4-argument optimize method; you should also implement the 3-argument optimize(RowExpression, Level, ConnectorSession) and delegate to the 4-argument version to avoid compilation issues and ensure consistent behavior when callers use the simpler API.

The 3-argument optimize method is default, so it automatically delegate to the 4-argument optimize method.

  • In TestNativeExpressionInterpreter.optimizeRowExpression you assert that expressionFailureInfo.getMessage() is both non-null and empty; if an empty string is the success indicator, consider dropping the assertNotNull or explicitly documenting this convention to make the intent clearer to future readers.

That is a success indicator only for the test, if no message is present we can assume the optimization worked as intended.

@pdabre12 pdabre12 merged commit a37eab2 into prestodb:master Dec 12, 2025
81 of 82 checks passed
@pdabre12 pdabre12 deleted the expression-optimizer branch December 12, 2025 19:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:IBM PR from IBM

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants