fix(native): Fix Velox to Presto IN expression conversion#26951
fix(native): Fix Velox to Presto IN expression conversion#26951pramodsatya merged 3 commits intoprestodb:masterfrom
Conversation
Reviewer's GuideAdjusts Velox-to-Presto Sequence diagram for Velox constant IN expression conversionsequenceDiagram
actor QueryEngine
participant VeloxToPrestoExprConverter as Converter
participant CallTypedExpr as InCallExpr
participant ConstantTypedExpr as InListExpr
participant ConstantVector as ConstVector
participant ArrayVector as InListArray
participant PrestoSpecialFormExpression as PrestoInExpr
QueryEngine->>Converter: getSpecialFormExpression(InCallExpr)
Converter->>InCallExpr: inputs()
InCallExpr-->>Converter: lhsExpr, inListExpr
Converter->>Converter: getInSpecialFormExpressionArgs(InCallExpr)
Converter->>InListExpr: isConstantKind()
InListExpr-->>Converter: true
Converter->>InListExpr: toConstantVector(pool_)
InListExpr-->>Converter: ConstVector
Converter->>ConstVector: wrappedVector()
ConstVector-->>Converter: InListArray
Converter->>InListArray: sizeAt(wrappedIdx), offsetAt(wrappedIdx)
InListArray-->>Converter: size, offset
Converter->>InListArray: elements()
InListArray-->>Converter: elementsVector
loop for each element in IN list
Converter->>ConstVector: wrappedIndex(0)
Converter->>Converter: elementIndex = offset + i
Converter->>ConstantVector: wrapInConstant(size, elementIndex, elementsVector)
ConstantVector-->>Converter: elementConstant
Converter->>Converter: create ConstantTypedExpr(elementConstant)
Converter->>Converter: getRowExpression(elementConstantExpr)
Converter-->>Converter: append argument
end
Converter-->>Converter: result.arguments = [lhsExpr, expandedArgs]
Converter-->>QueryEngine: PrestoInExpr
Updated class diagram for VeloxToPrestoExprConverter IN handlingclassDiagram
class VeloxToPrestoExprConverter {
+velox::memory::MemoryPool* pool_
+RowExpressionPtr getRowExpression(velox::core::ITypedExpr* expr)
+std::vector~RowExpressionPtr~ getSwitchSpecialFormExpressionArgs(const velox::core::CallTypedExpr* switchExpr)
+std::vector~RowExpressionPtr~ getInSpecialFormExpressionArgs(const velox::core::CallTypedExpr* inExpr)
+protocol::SpecialFormExpressionPtr getSpecialFormExpression(const velox::core::CallTypedExpr* expr)
}
class CallTypedExpr {
+std::vector~velox::core::ITypedExprPtr~ inputs()
+std::shared_ptr~velox::Type~ type()
}
class ConstantTypedExpr {
+bool isConstantKind()
+std::shared_ptr~velox::Vector~ toConstantVector(velox::memory::MemoryPool* pool)
}
class ConstantVector {
+velox::vector_size_t size()
+velox::vector_size_t wrappedIndex(velox::vector_size_t index)
+std::shared_ptr~velox::BaseVector~ wrappedVector()
+static std::shared_ptr~velox::BaseVector~ wrapInConstant(velox::vector_size_t size, velox::vector_size_t index, std::shared_ptr~velox::BaseVector~ elementsVector)
}
class ArrayVector {
+velox::vector_size_t sizeAt(velox::vector_size_t index)
+velox::vector_size_t offsetAt(velox::vector_size_t index)
+std::shared_ptr~velox::BaseVector~ elements()
}
class SpecialFormExpression {
+std::string form
+std::vector~RowExpressionPtr~ arguments
}
VeloxToPrestoExprConverter --> CallTypedExpr : converts inputs
VeloxToPrestoExprConverter --> ConstantTypedExpr : casts IN list input
ConstantTypedExpr --> ConstantVector : toConstantVector
ConstantVector --> ArrayVector : wrappedVector as ArrayVector
VeloxToPrestoExprConverter --> ConstantVector : wrapInConstant
VeloxToPrestoExprConverter --> SpecialFormExpression : build IN special form
ArrayVector --> ConstantVector : provides element indices
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
1f859b2 to
34fb636
Compare
34fb636 to
8a62c7e
Compare
|
@aditi-pandit, could you please help review this change? |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
getInSpecialFormExpressionArgs, the constant-array branch assumes a non-nullArrayVectorand elements; consider explicitly handling or guarding against nullwrappedVector()/elements()/ null array elements to avoid unexpected crashes when the IN-list or its entries contain nulls. - The error message in
VELOX_CHECK_GE(numInputs, 2, ...)has a small typo (atleast); consider correcting it toat leastto keep diagnostics clear and polished.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `getInSpecialFormExpressionArgs`, the constant-array branch assumes a non-null `ArrayVector` and elements; consider explicitly handling or guarding against null `wrappedVector()` / `elements()` / null array elements to avoid unexpected crashes when the IN-list or its entries contain nulls.
- The error message in `VELOX_CHECK_GE(numInputs, 2, ...)` has a small typo (`atleast`); consider correcting it to `at least` to keep diagnostics clear and polished.
## Individual Comments
### Comment 1
<location> `presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp:171` </location>
<code_context>
+ std::vector<RowExpressionPtr> result;
+ const auto& inputs = inExpr->inputs();
+ const auto numInputs = inputs.size();
+ VELOX_CHECK_GE(numInputs, 2, "IN expression should have atleast 2 inputs");
+ result.push_back(getRowExpression(inputs.at(0)));
+
</code_context>
<issue_to_address>
**nitpick (typo):** Fix minor typo in the IN-arity check message.
The error message uses "atleast"; please change this to "at least" for consistency with other messages.
```suggestion
VELOX_CHECK_GE(numInputs, 2, "IN expression should have at least 2 inputs");
```
</issue_to_address>
### Comment 2
<location> `presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/TestNativeSidecarPlugin.java:671-678` </location>
<code_context>
assertQuerySucceeds(session, "SELECT array_sort_desc(ARRAY[-25, 20000, -17, 3672], x -> IF(x = 5, NULL, abs(x)))");
}
+ // This test case verifies the IN expression is handled correctly by the native expression optimizer.
+ @Test
+ public void testInExpression()
+ {
+ Session session = Session.builder(getQueryRunner().getDefaultSession())
+ .setSystemProperty(EXPRESSION_OPTIMIZER_NAME, "native")
+ .build();
+ assertQuerySucceeds(session, "SELECT table_name, COALESCE(abs(ordinal_position), 0) as abs_pos FROM information_schema.columns WHERE table_catalog = 'hive' AND table_name IN ('nation', 'region') ORDER BY table_name, ordinal_position");
+ }
+
</code_context>
<issue_to_address>
**issue (testing):** This test only verifies that the query does not fail, but not that the optimized IN expression produces correct results.
The current assertion only checks for absence of errors. Since the original bug was about incorrect IN conversion, this test should validate results, not just success.
Recommend comparing outputs with and without the native optimizer, e.g.:
- Run the same query with the default optimizer, and use `assertQuery` or `assertQuery(sessionWithNative, sql)` to ensure both result sets match.
This will catch regressions where the query returns wrong rows but still succeeds.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp
Outdated
Show resolved
Hide resolved
...native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/TestNativeSidecarPlugin.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Fixes native expression optimizer failures for IN predicates by correctly converting Velox constant IN lists (stored as a constant array vector) into Presto IN special-form arguments.
Changes:
- Add
VeloxToPrestoExprConverter::getInSpecialFormExpressionArgsto expand constant array IN-lists into distinct constant arguments. - Route
INspecial-form conversion through the new helper ingetSpecialFormExpression. - Add an e2e test covering
IN ('nation','region')under the native expression optimizer.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| presto-native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/TestNativeSidecarPlugin.java | Adds an e2e regression test for IN predicate handling under the native optimizer. |
| presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.h | Declares a new helper to construct Presto IN special-form arguments from Velox IN. |
| presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp | Implements constant-array IN-list expansion and wires it into special-form conversion. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp
Outdated
Show resolved
Hide resolved
8a62c7e to
0f22c46
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
...native-sidecar-plugin/src/test/java/com/facebook/presto/sidecar/TestNativeSidecarPlugin.java
Outdated
Show resolved
Hide resolved
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @pramodsatya. Your code looks okay overall, but I would be great to explain with an example the structural transformations you are doing.
presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp
Outdated
Show resolved
Hide resolved
0f22c46 to
4b5cc70
Compare
pramodsatya
left a comment
There was a problem hiding this comment.
Thanks @aditi-pandit, this conversion unwraps the array vector for constant IN-list, performing the inverse operation of convertInExpr:
Added comments to elaborate on the rationale behind this conversion, could you please take another look?
presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp
Outdated
Show resolved
Hide resolved
pdabre12
left a comment
There was a problem hiding this comment.
Thanks @pramodsatya.
One comment, can we add the failing test queries from the issue as test cases here?
4b5cc70 to
98f50a3
Compare
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @pramodsatya
presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp
Outdated
Show resolved
Hide resolved
presto-native-execution/presto_cpp/main/types/VeloxToPrestoExpr.cpp
Outdated
Show resolved
Hide resolved
| result.push_back(getRowExpression(constantExpr)); | ||
| } | ||
| } else { | ||
| for (auto i = 1; i < numInputs; i++) { |
There was a problem hiding this comment.
This is the non-constant IN expression. Please add a comment describing the conversion (with an example)
| const auto& inList = inputs.at(1); | ||
| // Check if IN-list is a constant expression with values represented by a | ||
| // constant array vector. | ||
| if (numInputs == 2 && inList->isConstantKind() && inList->type()->isArray()) { |
There was a problem hiding this comment.
Can we abstract a function for this if condition part of the code ?
| // expression in both of these forms. | ||
| std::vector<RowExpressionPtr> | ||
| VeloxToPrestoExprConverter::getInSpecialFormExpressionArgs( | ||
| const velox::core::CallTypedExpr* inExpr) const { |
There was a problem hiding this comment.
Why is this a const raw pointer ? Why not use a const ref parameter ?
There was a problem hiding this comment.
The input expression here is immutable so a raw pointer is passed, the input expression is also passed on to the caller, getSpecialFormExpression, as a raw pointer. Is it fine to retain this as a raw pointer?
pramodsatya
left a comment
There was a problem hiding this comment.
Thanks for the feedback @aditi-pandit, addressed the review comments. Could you please take another look?
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @pramodsatya
Description
Fixes Velox to Presto
INexpression conversion. When theIN-listis constant, the Velox expression representation uses a constant expression with an array vector to store the list (see conversion here). The PrestoINexpression expects the values from constantIN-listto be distinct arguments to theSpecialFormExpression. TheVeloxToPrestoExpris modified accordingly.Motivation and Context
Resolves #26921.
Impact
Fixes bug with
INexpression in native expression optimizer.Test Plan
Added e2e test.
Summary by Sourcery
Fix Velox-to-Presto conversion of IN expressions to correctly construct Presto special form arguments and add coverage for the native expression optimizer.
Bug Fixes:
Tests: