
fix(native): Enable Velox to Presto lambda expression conversion #26913

Merged
pramodsatya merged 1 commit into prestodb:master from
pramodsatya:fix_vtop_expr
Jan 9, 2026

Conversation

@pramodsatya
Contributor

@pramodsatya pramodsatya commented Jan 7, 2026

Description

Enables Velox to Presto expression conversion for lambda expressions.

Motivation and Context

Velox to Presto expression conversion currently returns the input RowExpression for Velox expressions of kinds kConcat, kInput, and kLambda. This is incorrect for lambda expressions: the Velox ExprOptimizer optimizes the body of core::LambdaTypedExpr, but the result isn't translated back to Presto. This PR adds the missing wiring for Velox to Presto lambda expression conversion, which is required to complete constant folding.

The getRowExpression API in VeloxToPrestoExprConverter takes an optional RowExpression argument that serves as the default result. This default was being returned for lambda, concat, and input Velox expressions. With this change, lambda expressions are converted, and the Velox ExprOptimizer never returns concat or input expressions, so the default result parameter is unused and can be removed.
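
The change in failure semantics described above can be sketched roughly as follows. This is a minimal illustration, not the converter's actual code: `ExprKind`, `Expr`, `convertWithFallback`, and `convertStrict` are hypothetical stand-ins, strings stand in for RowExpression values, and `std::runtime_error` stands in for the exception raised by VELOX_FAIL.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the Velox expression kinds named in this PR.
enum class ExprKind { kConstant, kCall, kLambda, kConcat, kInput };

// Hypothetical, simplified expression: a kind plus some printable text.
struct Expr {
  ExprKind kind;
  std::string text;
};

// Sketch of the old behavior: kinds without a conversion path silently
// return the caller-supplied default (the input expression).
std::string convertWithFallback(
    const Expr& expr,
    const std::string& defaultResult) {
  switch (expr.kind) {
    case ExprKind::kConstant:
    case ExprKind::kCall:
      return "presto(" + expr.text + ")";
    default:
      return defaultResult;
  }
}

// Sketch of the new behavior: lambdas are converted like other supported
// kinds, and anything unsupported throws instead of falling back.
std::string convertStrict(const Expr& expr) {
  switch (expr.kind) {
    case ExprKind::kConstant:
    case ExprKind::kCall:
    case ExprKind::kLambda:
      return "presto(" + expr.text + ")";
    default:
      throw std::runtime_error("Unsupported Velox expression kind");
  }
}
```

With the fallback variant, an optimized lambda body would be discarded in favor of the unoptimized input; the strict variant either returns the converted result or surfaces the failure to the caller.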

Impact

Fixes errors uncovered by array_sort in #26903.

Test Plan

Added e2e tests.

== NO RELEASE NOTE ==

@pramodsatya pramodsatya requested review from a team and pdabre12 as code owners January 7, 2026 03:20
@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Jan 7, 2026
@prestodb-ci prestodb-ci requested a review from a team January 7, 2026 03:20
@sourcery-ai
Contributor

sourcery-ai bot commented Jan 7, 2026

Reviewer's Guide

Enables Velox-to-Presto conversion for lambda expressions by adding explicit LambdaDefinitionExpression construction, simplifying the converter API to always convert or fail, and wiring the optimizer and tests to exercise lambda-heavy array_sort expressions.

Sequence diagram for optimizeExpression using VeloxToPrestoExprConverter with lambda handling

sequenceDiagram
  participant JavaEngine
  participant ExpressionOptimizer
  participant VeloxExprOptimizer
  participant VeloxToPrestoExprConverter
  participant PrestoProtocol

  JavaEngine->>ExpressionOptimizer: optimizeExpression(input RowExpression)
  ExpressionOptimizer->>VeloxExprOptimizer: optimize(input Velox expression)
  VeloxExprOptimizer-->>ExpressionOptimizer: optimized velox_core_TypedExprPtr optimized

  ExpressionOptimizer->>VeloxToPrestoExprConverter: getRowExpression(optimized)
  alt kind is kLambda
    VeloxToPrestoExprConverter->>VeloxToPrestoExprConverter: getLambdaExpression(velox_core_LambdaTypedExpr)
    VeloxToPrestoExprConverter->>PrestoProtocol: construct protocol_LambdaDefinitionExpression
    PrestoProtocol-->>VeloxToPrestoExprConverter: LambdaDefinitionExpressionPtr
  else kind is supported non_lambda
    VeloxToPrestoExprConverter->>PrestoProtocol: construct appropriate RowExpression
    PrestoProtocol-->>VeloxToPrestoExprConverter: RowExpressionPtr
  else kind is unsupported (kConcat, kInput, other)
    VeloxToPrestoExprConverter-->>ExpressionOptimizer: throw VELOX_FAIL exception
  end

  VeloxToPrestoExprConverter-->>ExpressionOptimizer: RowExpressionPtr optimizedExpression
  ExpressionOptimizer-->>JavaEngine: RowExpressionOptimizationResult with optimizedExpression

Class diagram for updated VeloxToPrestoExprConverter with lambda support

classDiagram
  class VeloxToPrestoExprConverter {
    - memory_Pool_ptr pool_
    +VeloxToPrestoExprConverter(memory_Pool_ptr pool)
    +RowExpressionPtr getRowExpression(velox_core_TypedExprPtr expr) const
    +ConstantExpressionPtr getConstantExpression(velox_core_ConstantTypedExpr constantExpr) const
    +FieldDereferenceExpressionPtr getFieldReferenceExpression(velox_core_FieldAccessTypedExpr fieldAccessExpr) const
    +SpecialFormExpressionPtr getSpecialFormExpression(velox_core_SpecialFormSpecialForm specialFormExpr) const
    +SpecialFormExpressionPtr getDereferenceExpression(velox_core_DereferenceTypedExpr dereferenceExpr) const
    +LambdaDefinitionExpressionPtr getLambdaExpression(velox_core_LambdaTypedExpr lambdaExpr) const
    +CallExpressionPtr getCallExpression(velox_core_CallTypedExpr callExpr) const
  }

  class RowExpressionPtr
  class ConstantExpressionPtr
  class CallExpressionPtr
  class LambdaDefinitionExpressionPtr
  class SpecialFormExpressionPtr
  class FieldDereferenceExpressionPtr

  class velox_core_TypedExprPtr
  class velox_core_LambdaTypedExpr {
    +velox_core_TypePtr signature() const
    +velox_core_TypedExprPtr body() const
  }
  class velox_core_CallTypedExpr

  class protocol_TypeSignature

  VeloxToPrestoExprConverter ..> RowExpressionPtr : returns
  VeloxToPrestoExprConverter ..> ConstantExpressionPtr : returns
  VeloxToPrestoExprConverter ..> CallExpressionPtr : returns
  VeloxToPrestoExprConverter ..> LambdaDefinitionExpressionPtr : returns
  VeloxToPrestoExprConverter ..> SpecialFormExpressionPtr : returns
  VeloxToPrestoExprConverter ..> FieldDereferenceExpressionPtr : returns

  VeloxToPrestoExprConverter ..> velox_core_TypedExprPtr : consumes
  VeloxToPrestoExprConverter ..> velox_core_LambdaTypedExpr : consumes
  VeloxToPrestoExprConverter ..> velox_core_CallTypedExpr : consumes
  VeloxToPrestoExprConverter ..> protocol_TypeSignature : builds_argumentTypes

File-Level Changes

Change: Add Velox lambda to Presto LambdaDefinitionExpression conversion
Details:
  • Introduce LambdaDefinitionExpressionPtr alias in the converter header
  • Implement getLambdaExpression helper that serializes LambdaTypedExpr signature, argument types, argument names, and body to Presto protocol JSON
  • Wire kLambda handling into getRowExpression to produce a LambdaDefinitionExpression instead of falling back
Files:
  • presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.h
  • presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp

Change: Make VeloxToPrestoExprConverter.getRowExpression a strict converter that no longer takes a default RowExpression
Details:
  • Change getRowExpression signature to remove the optional defaultResult/input RowExpression parameter and update its documentation to state it throws on failure
  • Replace the previous error-logging-and-fallback behavior for unsupported expression kinds with VELOX_FAIL to surface conversion errors immediately
  • Update ExpressionOptimizer to call the new getRowExpression signature without passing the input expression as a default
Files:
  • presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.h
  • presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp
  • presto-native-execution/presto_cpp/main/types/ExpressionOptimizer.cpp

Change: Extend native expression interpreter tests to cover lambda usage via array_sort
Details:
  • Add a testLambda test method that runs optimize on array_sort expressions with scalar and row-typed lambdas to validate end-to-end conversion and optimization
  • Ensure these tests exercise the new lambda conversion path in the native expression interpreter
Files:
  • presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java


@pramodsatya
Contributor Author

@aditi-pandit, @pdabre12, could you help review this fix?

Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've left some high level feedback:

  • In getLambdaExpression, the second reserve call uses argumentTypes.reserve(signature->names().size()) instead of reserving capacity on the arguments vector, which means the arguments vector still reallocates and you end up over-reserving argumentTypes.
  • In the default case of getRowExpression, VELOX_FAIL will throw, so the subsequent return nullptr; is dead code; consider removing it or restructuring the flow to make the control path explicit.
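
The first point above can be illustrated with a small sketch in which each vector reserves its own capacity before it is filled. The names here are hypothetical: `LambdaArguments` and `buildLambdaArguments` are not part of the converter, and plain strings stand in for protocol::TypeSignature values and for the children and names of the lambda signature.

```cpp
#include <string>
#include <vector>

// Hypothetical container for the two vectors built from a lambda signature.
struct LambdaArguments {
  std::vector<std::string> argumentTypes;
  std::vector<std::string> arguments;
};

// Builds both vectors, reserving capacity on the vector that is about to
// be filled rather than on its sibling.
LambdaArguments buildLambdaArguments(
    const std::vector<std::string>& childTypes,
    const std::vector<std::string>& names) {
  LambdaArguments result;

  result.argumentTypes.reserve(childTypes.size());
  for (const auto& type : childTypes) {
    result.argumentTypes.emplace_back(type);
  }

  // Reserve on `arguments` itself, sized by the names it will hold.
  result.arguments.reserve(names.size());
  for (const auto& name : names) {
    result.arguments.emplace_back(name);
  }
  return result;
}
```

Reserving on the wrong vector does not change the result, only the number of reallocations, which is why this kind of slip is easy to miss in tests.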


Copilot AI left a comment


Pull request overview

This PR enables Velox to Presto expression conversion for lambda expressions, which was previously unsupported. The key changes include:

  • Implements lambda expression conversion from Velox to Presto protocol format
  • Removes the optional fallback parameter from getRowExpression API, replacing fallback behavior with exception throwing
  • Updates error handling to fail fast rather than logging and returning default values

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File: Description
  • presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.h: Adds LambdaDefinitionExpressionPtr type alias, declares getLambdaExpression method, and updates getRowExpression signature to remove optional default result parameter
  • presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp: Implements getLambdaExpression method to convert Velox lambda expressions to Presto format, adds lambda case handling in getRowExpression, and changes error handling from logging to throwing exceptions
  • presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java: Adds end-to-end tests for lambda expression optimization using array_sort function


@pramodsatya
Contributor Author

@sourcery-ai review

Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 2 issues, and left some high level feedback:

  • In getLambdaExpression, consider adding explicit checks (e.g., VELOX_CHECK_NOT_NULL(signature) and verifying signature->children().size() == signature->names().size()) to guard against malformed lambda signatures before building argumentTypes and arguments.
  • The change from logging and returning the input RowExpression to throwing via VELOX_FAIL in getRowExpression changes failure semantics; it might be worth confirming that all callers (particularly the optimizer path) are prepared for exceptions here and won’t rely on a best-effort fallback.

## Individual Comments

### Comment 1
<location> `presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp:237-246` </location>
<code_context>
+
+  json result;
+  result["@type"] = kLambda;
+  const auto& signature = lambdaExpr->signature();
+  std::vector<protocol::TypeSignature> argumentTypes;
+  argumentTypes.reserve(signature->children().size());
+  for (const auto& type : signature->children()) {
+    argumentTypes.emplace_back(getTypeSignature(type));
+  }
+  result["argumentTypes"] = argumentTypes;
+
+  std::vector<std::string> arguments;
+  arguments.reserve(signature->names().size());
+  for (const auto& name : signature->names()) {
+    arguments.emplace_back(name);
+  }
</code_context>

<issue_to_address>
**suggestion:** Guard against mismatch between lambda argument names and types.

This assumes `signature->children()` and `signature->names()` are the same size and aligned. To fail fast on malformed signatures, add a defensive `VELOX_CHECK_EQ` / `VELOX_DCHECK_EQ` on `children().size()` vs `names().size()` before building these vectors.
</issue_to_address>
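
A defensive check along the lines suggested in Comment 1 might look like the sketch below. It is an assumption-laden illustration: `LambdaSignature` and `pairLambdaArguments` are hypothetical, strings stand in for types, and `std::invalid_argument` stands in for the failure that a `VELOX_CHECK_EQ` would raise in the real code.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified lambda signature: parallel lists of argument
// types and argument names.
struct LambdaSignature {
  std::vector<std::string> childTypes;
  std::vector<std::string> names;
};

// Pairs each argument name with its type, failing fast when the two lists
// are not the same length.
std::vector<std::pair<std::string, std::string>> pairLambdaArguments(
    const LambdaSignature& signature) {
  if (signature.childTypes.size() != signature.names.size()) {
    throw std::invalid_argument(
        "Lambda signature has " + std::to_string(signature.childTypes.size()) +
        " types but " + std::to_string(signature.names.size()) + " names");
  }

  std::vector<std::pair<std::string, std::string>> result;
  result.reserve(signature.names.size());
  for (std::size_t i = 0; i < signature.names.size(); ++i) {
    result.emplace_back(signature.names[i], signature.childTypes[i]);
  }
  return result;
}
```

Checking the sizes once, before either vector is built, keeps a malformed signature from producing a LambdaDefinitionExpression whose argument names and types are misaligned.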

### Comment 2
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java:104-105` </location>
<code_context>
+    @Test
+    public void testLambda()
+    {
+        optimize("array_sort(ARRAY['apple', 'banana', 'pear'], x -> IF(x = 'banana', NULL, length(x)))");
+        optimize("array_sort(ARRAY[ROW('a', 3), ROW('b', 1), ROW('c', 2)], x -> x[2])");
+    }
+
</code_context>

<issue_to_address>
**suggestion (testing):** Consider expanding lambda coverage to additional shapes and edge cases

These `array_sort` cases validate the original failure, but the converter now supports Lambdas more broadly. Please add a few more scenarios to harden coverage: lambdas capturing outer variables, returning complex types (arrays/rows), nested or higher-order-function lambdas, and inputs with empty arrays or all-NULL results. This will better exercise the Velox-to-Presto lambda wiring and guard against regressions as the code evolves.
</issue_to_address>



Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.



Contributor

@pdabre12 pdabre12 left a comment


@pramodsatya Thanks for this quick fix!

Contributor

@aditi-pandit aditi-pandit left a comment


Thanks @pramodsatya

@pramodsatya pramodsatya merged commit 8dece5d into prestodb:master Jan 9, 2026
153 of 159 checks passed
@pramodsatya pramodsatya deleted the fix_vtop_expr branch January 9, 2026 20:11
tdcmeehan pushed a commit to rdtr/presto that referenced this pull request Jan 14, 2026
…stodb#26913)
