fix(native): Avoid removing valid CASEs in switch expression conversion#27031
fix(native): Avoid removing valid CASEs in switch expression conversion#27031pramodsatya merged 3 commits intoprestodb:masterfrom
Conversation
Reviewer's GuideAdjusts Presto-to-Velox SWITCH/CASE conversion to correctly distinguish between simple and searched forms so iterative native expression optimization doesn’t drop the first WHEN clause, and adds regression tests covering complex CASE expressions with unbound_long in the expression interpreter. Sequence diagram for iterative SWITCH/CASE conversion and optimizationsequenceDiagram
participant PrestoPlanner
participant PrestoToVeloxExpr
participant NativeExpressionOptimizer
participant VeloxToPrestoExprConverter
PrestoPlanner->>PrestoToVeloxExpr: convertSwitchExpr(simple_or_searched_CASE)
PrestoToVeloxExpr->>PrestoToVeloxExpr: detect_simple_vs_searched
PrestoToVeloxExpr->>PrestoPlanner: Velox_SWITCH_expression
PrestoPlanner->>NativeExpressionOptimizer: optimize(Velox_SWITCH_expression)
NativeExpressionOptimizer->>NativeExpressionOptimizer: remove_synthetic_leading_constant_if_present
NativeExpressionOptimizer->>PrestoToVeloxExpr: convertSwitchExpr(optimized_SWITCH_args)
PrestoToVeloxExpr->>PrestoToVeloxExpr: check_first_arg_isWhenClause
alt searched_form
PrestoToVeloxExpr->>PrestoToVeloxExpr: keep_first_WHEN_clause
else simple_form
PrestoToVeloxExpr->>PrestoToVeloxExpr: erase_valueExpr
end
PrestoToVeloxExpr->>NativeExpressionOptimizer: re_converted_SWITCH_expression
NativeExpressionOptimizer->>VeloxToPrestoExprConverter: convert_back_to_Presto
VeloxToPrestoExprConverter->>VeloxToPrestoExprConverter: reconstruct_Presto_SWITCH
VeloxToPrestoExprConverter->>PrestoPlanner: well_formed_SWITCH_with_all_WHEN_clauses
Flow diagram for updated convertSwitchExpr SWITCH/CASE handlingflowchart TD
start((Start))
start --> checkArgs
checkArgs{"args empty?"}
checkArgs -- Yes --> error[VELOX_CHECK failure: SWITCH arguments cannot be empty]
checkArgs -- No --> detectWhen
detectWhen["kWhen = when
isWhenClause = ExprUtils.isCall(*args.begin(), kWhen)"]
detectWhen --> branchWhen
branchWhen{"isWhenClause?"}
branchWhen -- Yes --> setSearched
branchWhen -- No --> setSimple
setSearched["valueExpr = null
valueIsTrue = false"]
setSearched --> initInputs
setSimple["valueExpr = args.front()
remove first arg from args
valueIsTrue = isTrueConstant(valueExpr)"]
setSimple --> initInputs
initInputs["inputs.reserve((args.size() - 1) * 2)"]
initInputs --> loopArgs
loopArgs{{"for each arg in args"}}
loopArgs --> isCallWhen
isCallWhen{"arg is CallTypedExpr and call.name() == kWhen?"}
isCallWhen -- No --> addElseOrDefault["handle else or default branch as before"]
addElseOrDefault --> nextArg
isCallWhen -- Yes --> handleWhen
handleWhen["condition = call.inputs()[0]"]
handleWhen --> condBranch
condBranch{"isWhenClause or valueIsTrue?"}
condBranch -- Yes --> addCondition["inputs.emplace_back(condition)"]
condBranch -- No --> compareTypes{"condition.type == valueExpr.type?"}
compareTypes -- Yes --> addEqualCondition["inputs.emplace_back(eq(valueExpr, condition))"]
compareTypes -- No --> castValueExpr["cast valueExpr to condition.type
inputs.emplace_back(eq(castedValueExpr, condition))"]
addCondition --> nextArg
addEqualCondition --> nextArg
castValueExpr --> nextArg
nextArg{{"more args?"}}
nextArg -- Yes --> loopArgs
nextArg -- No --> buildCall
buildCall["return CallTypedExpr(name = switch, type = returnType, inputs)"]
buildCall --> endNode((End))
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
b8e183c to
0bdd1ae
Compare
|
@aditi-pandit, could you please help review this fix? |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
convertSwitchExpr, consider guarding against an emptyargsvector before calling*args.begin()to avoid potential undefined behavior if the function is ever called with no arguments.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `convertSwitchExpr`, consider guarding against an empty `args` vector before calling `*args.begin()` to avoid potential undefined behavior if the function is ever called with no arguments.
## Individual Comments
### Comment 1
<location> `presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.cpp:620` </location>
<code_context>
- args.erase(args.begin());
+ static constexpr const char* kWhen = "when";
+
+ bool isSearchedForm = velox::expression::utils::isCall(*args.begin(), kWhen);
+ TypedExprPtr valueExpr = nullptr;
+ bool valueIsTrue = false;
</code_context>
<issue_to_address>
**issue:** Guard the `isCall` check against empty `args` to avoid UB or crashes in edge cases.
This code dereferences `args.begin()` without ensuring `args` is non-empty, which is undefined behavior if a malformed expression (e.g., SWITCH with no arguments) reaches this path. Add a defensive check (such as `VELOX_CHECK(!args.empty(), ...)` or an early return) before calling `isCall` to fail fast with a clear error instead of crashing.
</issue_to_address>
### Comment 2
<location> `presto-main-base/src/test/java/com/facebook/presto/sql/expressions/AbstractTestExpressionInterpreter.java:1194-1196` </location>
<code_context>
"when true then 2.2 " +
"end",
"2.2");
+ assertOptimizedEquals("case unbound_long " +
+ "when 1 then unbound_long " +
+ "else 1 " +
+ "end",
+ "case unbound_long " +
</code_context>
<issue_to_address>
**suggestion (testing):** Add a complementary searched CASE test to directly cover the bug scenario
This assertion exercises the simple CASE form with `unbound_long`, which is good for regression coverage. Since the original bug involved the searched CASE form, please also add a corresponding test using searched CASE (e.g., `case when unbound_long = 1 then unbound_long else 1 end`) so the optimizer/converter behavior for that specific scenario is directly covered.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.cpp
Outdated
Show resolved
Hide resolved
...ase/src/test/java/com/facebook/presto/sql/expressions/AbstractTestExpressionInterpreter.java
Show resolved
Hide resolved
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @pramodsatya for this code.
presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.cpp
Outdated
Show resolved
Hide resolved
...ase/src/test/java/com/facebook/presto/sql/expressions/AbstractTestExpressionInterpreter.java
Show resolved
Hide resolved
|
Thanks for the feedback @aditi-pandit, apologies for the delay in addressing this. I have simplified the check now to make it more readable, added code-comments, and more exhaustive tests to validate this fix. Could you please take another look? |
|
@sourcery-ai review. |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- In
convertSwitchExpr, usingvalueExpr == nullptrinside the loop to distinguish searched vs simple CASE makes the control flow a bit opaque; consider introducing an explicitisSearchedCase/hasValueExprflag to make the intent clearer and avoid overloadingvalueExpr's nullness for logic. - The comment above
kWheninconvertSwitchExprstates that the constant expression is always boolean, but for simple CASE it can be a non-boolean value; consider tightening that wording so it matches the actual semantics and avoids confusion for future readers.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `convertSwitchExpr`, using `valueExpr == nullptr` inside the loop to distinguish searched vs simple CASE makes the control flow a bit opaque; consider introducing an explicit `isSearchedCase` / `hasValueExpr` flag to make the intent clearer and avoid overloading `valueExpr`'s nullness for logic.
- The comment above `kWhen` in `convertSwitchExpr` states that the constant expression is always boolean, but for simple CASE it can be a non-boolean value; consider tightening that wording so it matches the actual semantics and avoids confusion for future readers.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@sourcery-ai review. |
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @pramodsatya
presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
It might be easier to use an else for if (call->name() == kWhen) condition to do inputs.emplace_back(arg); instead of using continue.
There was a problem hiding this comment.
Updated, thanks.
There was a problem hiding this comment.
Simplified this with velox expression util function: if (velox::expression::utils::isCall(*arg, kWhen)) {.
Is this alright?
pramodsatya
left a comment
There was a problem hiding this comment.
Thanks for the feedback @aditi-pandit, updated accordingly, could you PTAL?
presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Updated, thanks.
presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.cpp
Outdated
Show resolved
Hide resolved
|
@aditi-pandit could you please take another look at this fix? |
presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.cpp
Outdated
Show resolved
Hide resolved
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @pramodsatya
presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.cpp
Outdated
Show resolved
Hide resolved
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @pramodsatya
|
@tdcmeehan could you please help review this change? |
…on (prestodb#27031) ## Description Velox only supports the searched form of `SWITCH` expression, reference: https://prestodb.io/docs/current/functions/conditional.html#case. The simple and searched forms of SWITCH are converted to the Velox representation of `SWITCH` in: https://github.com/prestodb/presto/blob/32207b9573ecab181b61eebc4754a7f545554b27/presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.cpp#L610 The first `expression` of `SWITCH` conditional should only be removed from the simple form of `SWITCH` during PrestoToVelox expression conversion. This fix checks if the first argument is a `WHEN` clause before erasing it, if it's not a `WHEN` call, it's the value expression in simple form and should be erased. This prevents loss of valid `WHEN` clauses when the `NativeExpressionOptimizer` iteratively optimizes expressions. ## Motivation and Context Without this change, the first expression from the Velox representation of searched form of `SWITCH` can be removed in the second iteration of expression optimization with the `NativeExpressionOptimizer`. This can result in the `VeloxToPrestoExprConverter` returning a malformed Presto `SWITCH` expression. This issue only manifests with the native expression optimizer enabled (sidecar execution path), as the converter is called iteratively on previously-optimized expressions that no longer have the synthetic leading constant. ## Impact Fix `SWITCH` expression handling in `NativeExpressionOptimizer` and prevent `WHEN` clause loss during iterative optimization. ## Test Plan Added comprehensive e2e tests in `AbstractTestExpressionInterpreter.java`. ``` == NO RELEASE NOTE == ``` ## Summary by Sourcery Handle both simple and searched CASE expressions correctly when converting Presto CASE/SWITCH to Velox expressions. Bug Fixes: - Fix SWITCH/CASE conversion to avoid incorrectly stripping the first argument for searched CASE expressions, preserving correct semantics. Tests: - Add an expression interpreter test to cover CASE with unbound_long to prevent regressions in CASE optimization and conversion. ## Summary by Sourcery Fix conversion of Presto SWITCH/CASE expressions to Velox to correctly handle searched and simple forms. Bug Fixes: - Preserve the first expression for searched CASE/SWITCH during Presto-to-Velox conversion so that conditions are not lost and the reconstructed Presto expression remains well-formed. Tests: - Extend expression interpreter tests to cover CASE with unbound_long and ensure CASE optimization preserves semantics. ## Summary by Sourcery Ensure SWITCH/CASE expression conversion correctly handles both simple and searched forms during native execution expression optimization. Bug Fixes: - Prevent removal of the first WHEN clause when converting already-optimized searched SWITCH/CASE expressions from Presto to Velox. - Handle SWITCH/CASE expressions without a leading value expression so that reconstructed Presto expressions remain well-formed during iterative optimization. Tests: - Extend expression interpreter tests to cover simple and searched CASE expressions with unbound_long and verify iterative optimization preserves semantics.
Description
Velox only supports the searched form of
SWITCHexpression, reference: https://prestodb.io/docs/current/functions/conditional.html#case. The simple and searched forms of SWITCH are converted to the Velox representation ofSWITCHin:presto/presto-native-execution/presto_cpp/main/types/PrestoToVeloxExpr.cpp
Line 610 in 32207b9
The first
expressionofSWITCHconditional should only be removed from the simple form ofSWITCHduring PrestoToVelox expression conversion.This fix checks if the first argument is a
WHENclause before erasing it, if it's not aWHENcall, it's the value expression in simple form and should be erased. This prevents loss of validWHENclauses when theNativeExpressionOptimizeriteratively optimizes expressions.Motivation and Context
Without this change, the first expression from the Velox representation of searched form of
SWITCHcan be removed in the second iteration of expression optimization with theNativeExpressionOptimizer. This can result in theVeloxToPrestoExprConverterreturning a malformed PrestoSWITCHexpression.This issue only manifests with the native expression optimizer enabled (sidecar execution path), as the converter is called iteratively on previously-optimized expressions that no longer have the synthetic leading constant.
Impact
Fix
SWITCHexpression handling inNativeExpressionOptimizerand preventWHENclause loss during iterative optimization.Test Plan
Added comprehensive e2e tests in
AbstractTestExpressionInterpreter.java.Summary by Sourcery
Handle both simple and searched CASE expressions correctly when converting Presto CASE/SWITCH to Velox expressions.
Bug Fixes:
Tests:
Summary by Sourcery
Fix conversion of Presto SWITCH/CASE expressions to Velox to correctly handle searched and simple forms.
Bug Fixes:
Tests:
Summary by Sourcery
Ensure SWITCH/CASE expression conversion correctly handles both simple and searched forms during native execution expression optimization.
Bug Fixes:
Tests: