fix(native): Enable Velox to Presto lambda expression conversion#26913
fix(native): Enable Velox to Presto lambda expression conversion#26913pramodsatya merged 1 commit intoprestodb:masterfrom
Conversation
Reviewer's GuideEnables Velox-to-Presto conversion for lambda expressions by adding explicit LambdaDefinitionExpression construction, simplifying the converter API to always convert or fail, and wiring the optimizer and tests to exercise lambda-heavy array_sort expressions. Sequence diagram for optimizeExpression using VeloxToPrestoExprConverter with lambda handlingsequenceDiagram
participant JavaEngine
participant ExpressionOptimizer
participant VeloxExprOptimizer
participant VeloxToPrestoExprConverter
participant PrestoProtocol
JavaEngine->>ExpressionOptimizer: optimizeExpression(input RowExpression)
ExpressionOptimizer->>VeloxExprOptimizer: optimize(input Velox expression)
VeloxExprOptimizer-->>ExpressionOptimizer: optimized velox_core_TypedExprPtr optimized
ExpressionOptimizer->>VeloxToPrestoExprConverter: getRowExpression(optimized)
alt kind is kLambda
VeloxToPrestoExprConverter->>VeloxToPrestoExprConverter: getLambdaExpression(velox_core_LambdaTypedExpr)
VeloxToPrestoExprConverter->>PrestoProtocol: construct protocol_LambdaDefinitionExpression
PrestoProtocol-->>VeloxToPrestoExprConverter: LambdaDefinitionExpressionPtr
else kind is supported non_lambda
VeloxToPrestoExprConverter->>PrestoProtocol: construct appropriate RowExpression
PrestoProtocol-->>VeloxToPrestoExprConverter: RowExpressionPtr
else kind is unsupported (kConcat, kInput, other)
VeloxToPrestoExprConverter-->>ExpressionOptimizer: throw VELOX_FAIL exception
end
VeloxToPrestoExprConverter-->>ExpressionOptimizer: RowExpressionPtr optimizedExpression
ExpressionOptimizer-->>JavaEngine: RowExpressionOptimizationResult with optimizedExpression
Class diagram for updated VeloxToPrestoExprConverter with lambda supportclassDiagram
class VeloxToPrestoExprConverter {
- memory_Pool_ptr pool_
+VeloxToPrestoExprConverter(memory_Pool_ptr pool)
+RowExpressionPtr getRowExpression(velox_core_TypedExprPtr expr) const
+ConstantExpressionPtr getConstantExpression(velox_core_ConstantTypedExpr constantExpr) const
+FieldDereferenceExpressionPtr getFieldReferenceExpression(velox_core_FieldAccessTypedExpr fieldAccessExpr) const
+SpecialFormExpressionPtr getSpecialFormExpression(velox_core_SpecialFormSpecialForm specialFormExpr) const
+SpecialFormExpressionPtr getDereferenceExpression(velox_core_DereferenceTypedExpr dereferenceExpr) const
+LambdaDefinitionExpressionPtr getLambdaExpression(velox_core_LambdaTypedExpr lambdaExpr) const
+CallExpressionPtr getCallExpression(velox_core_CallTypedExpr callExpr) const
}
class RowExpressionPtr
class ConstantExpressionPtr
class CallExpressionPtr
class LambdaDefinitionExpressionPtr
class SpecialFormExpressionPtr
class FieldDereferenceExpressionPtr
class velox_core_TypedExprPtr
class velox_core_LambdaTypedExpr {
+velox_core_TypePtr signature() const
+velox_core_TypedExprPtr body() const
}
class velox_core_CallTypedExpr
class protocol_TypeSignature
VeloxToPrestoExprConverter ..> RowExpressionPtr : returns
VeloxToPrestoExprConverter ..> ConstantExpressionPtr : returns
VeloxToPrestoExprConverter ..> CallExpressionPtr : returns
VeloxToPrestoExprConverter ..> LambdaDefinitionExpressionPtr : returns
VeloxToPrestoExprConverter ..> SpecialFormExpressionPtr : returns
VeloxToPrestoExprConverter ..> FieldDereferenceExpressionPtr : returns
VeloxToPrestoExprConverter ..> velox_core_TypedExprPtr : consumes
VeloxToPrestoExprConverter ..> velox_core_LambdaTypedExpr : consumes
VeloxToPrestoExprConverter ..> velox_core_CallTypedExpr : consumes
VeloxToPrestoExprConverter ..> protocol_TypeSignature : builds_argumentTypes
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
@aditi-pandit, @pdabre12, could you help review this fix? |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- In
getLambdaExpression, the secondreservecall usesargumentTypes.reserve(signature->names().size())instead of reserving capacity on theargumentsvector, which means theargumentsvector still reallocates and you end up over-reservingargumentTypes. - In the
defaultcase ofgetRowExpression,VELOX_FAILwill throw, so the subsequentreturn nullptr;is dead code; consider removing it or restructuring the flow to make the control path explicit.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `getLambdaExpression`, the second `reserve` call uses `argumentTypes.reserve(signature->names().size())` instead of reserving capacity on the `arguments` vector, which means the `arguments` vector still reallocates and you end up over-reserving `argumentTypes`.
- In the `default` case of `getRowExpression`, `VELOX_FAIL` will throw, so the subsequent `return nullptr;` is dead code; consider removing it or restructuring the flow to make the control path explicit.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Pull request overview
This PR enables Velox to Presto expression conversion for lambda expressions, which was previously unsupported. The key changes include:
- Implements lambda expression conversion from Velox to Presto protocol format
- Removes the optional fallback parameter from
getRowExpressionAPI, replacing fallback behavior with exception throwing - Updates error handling to fail fast rather than logging and returning default values
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.h | Adds LambdaDefinitionExpressionPtr type alias, declares getLambdaExpression method, and updates getRowExpression signature to remove optional default result parameter |
| presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp | Implements getLambdaExpression method to convert Velox lambda expressions to Presto format, adds lambda case handling in getRowExpression, and changes error handling from logging to throwing exceptions |
| presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java | Adds end-to-end tests for lambda expression optimization using array_sort function |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.h
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp
Outdated
Show resolved
Hide resolved
193768f to
6f087a8
Compare
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
getLambdaExpression, consider adding explicit checks (e.g.,VELOX_CHECK_NOT_NULL(signature)and verifyingsignature->children().size() == signature->names().size()) to guard against malformed lambda signatures before buildingargumentTypesandarguments. - The change from logging and returning the input
RowExpressionto throwing viaVELOX_FAILingetRowExpressionchanges failure semantics; it might be worth confirming that all callers (particularly the optimizer path) are prepared for exceptions here and won’t rely on a best-effort fallback.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `getLambdaExpression`, consider adding explicit checks (e.g., `VELOX_CHECK_NOT_NULL(signature)` and verifying `signature->children().size() == signature->names().size()`) to guard against malformed lambda signatures before building `argumentTypes` and `arguments`.
- The change from logging and returning the input `RowExpression` to throwing via `VELOX_FAIL` in `getRowExpression` changes failure semantics; it might be worth confirming that all callers (particularly the optimizer path) are prepared for exceptions here and won’t rely on a best-effort fallback.
## Individual Comments
### Comment 1
<location> `presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp:237-246` </location>
<code_context>
+
+ json result;
+ result["@type"] = kLambda;
+ const auto& signature = lambdaExpr->signature();
+ std::vector<protocol::TypeSignature> argumentTypes;
+ argumentTypes.reserve(signature->children().size());
+ for (const auto& type : signature->children()) {
+ argumentTypes.emplace_back(getTypeSignature(type));
+ }
+ result["argumentTypes"] = argumentTypes;
+
+ std::vector<std::string> arguments;
+ arguments.reserve(signature->names().size());
+ for (const auto& name : signature->names()) {
+ arguments.emplace_back(name);
+ }
</code_context>
<issue_to_address>
**suggestion:** Guard against mismatch between lambda argument names and types.
This assumes `signature->children()` and `signature->names()` are the same size and aligned. To fail fast on malformed signatures, add a defensive `VELOX_CHECK_EQ` / `VELOX_DCHECK_EQ` on `children().size()` vs `names().size()` before building these vectors.
</issue_to_address>
### Comment 2
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/expressions/TestNativeExpressionInterpreter.java:104-105` </location>
<code_context>
+ @Test
+ public void testLambda()
+ {
+ optimize("array_sort(ARRAY['apple', 'banana', 'pear'], x -> IF(x = 'banana', NULL, length(x)))");
+ optimize("array_sort(ARRAY[ROW('a', 3), ROW('b', 1), ROW('c', 2)], x -> x[2])");
+ }
+
</code_context>
<issue_to_address>
**suggestion (testing):** Consider expanding lambda coverage to additional shapes and edge cases
These `array_sort` cases validate the original failure, but the converter now supports Lambdas more broadly. Please add a few more scenarios to harden coverage: lambdas capturing outer variables, returning complex types (arrays/rows), nested or higher-order-function lambdas, and inputs with empty arrays or all-NULL results. This will better exercise the Velox-to-Presto lambda wiring and guard against regressions as the code evolves.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pdabre12
left a comment
There was a problem hiding this comment.
@pramodsatya Thanks for this quick fix !
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @pramodsatya
…stodb#26913) ## Description Enables Velox to Presto expression conversion for lambda expressions. ## Motivation and Context Velox to Presto expression conversion currently returns the input `RowExpression` for Velox expressions of types `kConcat`, `kInput`, and `kLambda`. This is incorrect for Lambda expressions because Velox `ExprOptimizer` optimizes the body of `core::LambdaTypedExpr` and the result isn't translated back to Presto. This PR adds the missing wiring for Velox to Presto lambda expression conversion required to complete the constant folding. `getRowExpression` API in `VeloxToPrestoExprConverter` takes an optional `RowExpression` for the default result as and argument. This was being returned for lambda, concat and input Velox expressions. With this change lambda expressions are converted and the Velox ExprOptimizer never returns concat or input expressions. So the default result parameter is unused and can be removed. ## Impact Fixes errors uncovered by `array_sort` in prestodb#26903. ## Test Plan Added e2e tests. ``` == NO RELEASE NOTE == ```
Description
Enables Velox to Presto expression conversion for lambda expressions.
Motivation and Context
Velox to Presto expression conversion currently returns the input
RowExpressionfor Velox expressions of typeskConcat,kInput, andkLambda. This is incorrect for Lambda expressions because VeloxExprOptimizeroptimizes the body ofcore::LambdaTypedExprand the result isn't translated back to Presto. This PR adds the missing wiring for Velox to Presto lambda expression conversion required to complete the constant folding.getRowExpressionAPI inVeloxToPrestoExprConvertertakes an optionalRowExpressionfor the default result as and argument. This was being returned for lambda, concat and input Velox expressions. With this change lambda expressions are converted and the Velox ExprOptimizer never returns concat or input expressions. So the default result parameter is unused and can be removed.Impact
Fixes errors uncovered by
array_sortin #26903.Test Plan
Added e2e tests.