Optimize effectively literal values #10663
514dc39 to 4bbc72b
Thanks @losipiuk for a quick review. Added more testing (https://github.com/trinodb/trino/compare/514dc390b4e1c1afbe6158b406b838e596c32568..4bbc72b9c2dea585e6c2dee7bc92b6c9a0324171) and this uncovered a small glitch, also fixed.
4bbc72b to eacde51
core/trino-main/src/main/java/io/trino/sql/ExpressionUtils.java
core/trino-main/src/main/java/io/trino/sql/planner/ExpressionInterpreter.java
core/trino-main/src/main/java/io/trino/sql/planner/iterative/rule/InlineProjections.java
core/trino-main/src/main/java/io/trino/sql/ExpressionUtils.java
eacde51 to 7fc353e
This addresses a code TODO comment.
Remove unnecessary code copies.
Will be used soon.
7fc353e to dc2ed0c
AC
dc2ed0c to 0bc4936
Here & below. Previously, we projected a CAST(DECIMAL '2.1' AS decimal(2, 1)) column and then constructed ST_Point(x, x) from it. Now the cast is recognized as effectively literal and inlined. Thus the ST_Point() function call is replaced with a literal (point21x21Literal), and the subsequent Project nodes are merged into one.
Where is this case handled? Is it SimplifyExpressions?
In FilterStatsCalculator: trino/core/trino-main/src/main/java/io/trino/cost/FilterStatsCalculator.java, lines 102 to 120 in 8e40a44.
Why is it important that the cast doesn't fail?
I guess, we don't want the query to fail in the stats calculator, or in SimplifyCountOverConstant, trying to evaluate the constant value.
However, in PredicatePushDown, we don't evaluate the expression, so we could consider the potentially unsafe expression as a constant. And in LocalExecutionPlanner we shouldn't care if it fails, but take the opportunity to use the shortcut for trivial projections.
Also, Literal is not safe. It is possible to create a GenericLiteral with mismatching value, and that also fails to evaluate.
I suggest a different approach on handling failures: we could skip the validation here, and if the "effectively literal" value is to be evaluated during the optimization phase, then we could use the safe ExpressionInterpreter.optimize method.
> Why is it important that the cast doesn't fail?
the isEffectivelyLiteral is a boolean-returning method intended to be used wherever we inspect whether an expression can be considered "a literal". It's not intended to throw for a valid input, as it would make the usage harder.
Note that we don't need to be 'smart' about failing cast case. In a typical case, ExpressionInterpreter will fold it to a fail call (which isn't recognized as a literal).
> Also, Literal is not safe. It is possible to create a GenericLiteral with mismatching value, and that also fails to evaluate.
True. However, no optimizer / optimizer rule should do this. Literals coming from parser should be validated in ExpressionAnalyzer.
> I suggest a different approach on handling failures: we could skip the validation here, and if the "effectively literal" value is to be evaluated during the optimization phase, then we could use the safe ExpressionInterpreter.optimize method.
That would work. Note, however, that my intention was to capture literals, and other things produced by LiteralEncoder. Hence the method name & javadoc, and that's also why I chose not to fail.
Do you feel strongly about this?
> the isEffectivelyLiteral is a boolean-returning method intended to be used wherever we inspect whether an expression can be considered "a literal". It's not intended to throw for a valid input, as it would make the usage harder.
My question was: why do we try to filter out failing casts instead of just reporting them as "effectively literals"? I wasn't trying to suggest that isEffectivelyLiteral should throw. Sorry for not being clear.
> Note that we don't need to be 'smart' about failing cast case. In a typical case, ExpressionInterpreter will fold it to a fail call (which isn't recognized as a literal).
Yes, but we shouldn't depend on the rule order, especially since isEffectivelyLiteral can be reused in the future.
> that my intention was to capture literals, and other things produced by LiteralEncoder.
If we drop the failure check in isEffectivelyLiteral, which is my suggestion, that intention is still satisfied. It would then depend on the caller whether they care about errors (and if they do, to use ExpressionInterpreter.optimize).
Generally, I think that using ExpressionInterpreter.optimize instead of ExpressionInterpreter.evaluate should be the rule anywhere before the execution.
If we drop the failure check in isEffectivelyLiteral, we are also consistent wrt failing casts and failing GenericLiterals (until we fix them).
On the other hand, if isEffectivelyLiteral reports failing cast as a non-literal, we miss out in PPD and LocalExecutionPlanner, as I mentioned before.
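To make the distinction in this comment concrete, here is a minimal sketch of an "evaluate" that throws on a failing cast next to an "optimize" that folds the failure into a fail node instead. The classes `Literal`, `Cast`, `Fail` and both static methods are toy stand-ins invented for illustration; they are assumptions and do not reproduce Trino's `ExpressionInterpreter` API.

```java
// Toy model only: these types are hypothetical and are not Trino's AST or interpreter.
final class OptimizeVersusEvaluateSketch {
    interface Expression {}

    static final class Literal implements Expression {
        final Object value;
        Literal(Object value) { this.value = value; }
    }

    static final class Cast implements Expression {
        final Expression operand;
        final String targetType;
        Cast(Expression operand, String targetType) {
            this.operand = operand;
            this.targetType = targetType;
        }
    }

    // Stand-in for a "fail" call that a safe optimizer leaves in the plan.
    static final class Fail implements Expression {
        final String message;
        Fail(String message) { this.message = message; }
    }

    // Evaluates eagerly and propagates any failure to the caller.
    static Object evaluate(Expression expression) {
        if (expression instanceof Literal) {
            return ((Literal) expression).value;
        }
        if (expression instanceof Cast) {
            Cast cast = (Cast) expression;
            Object value = evaluate(cast.operand);
            if (cast.targetType.equals("integer")) {
                // Throws NumberFormatException for input such as "abc".
                return Integer.parseInt(String.valueOf(value));
            }
            throw new IllegalArgumentException("unsupported target type: " + cast.targetType);
        }
        throw new IllegalArgumentException("cannot evaluate: " + expression.getClass().getSimpleName());
    }

    // Never throws: a failure is folded into a Fail node and deferred to execution time.
    static Expression optimize(Expression expression) {
        try {
            return new Literal(evaluate(expression));
        }
        catch (RuntimeException e) {
            return new Fail(e.getMessage());
        }
    }
}
```

In this toy model, a caller that only needs to know whether an expression is constant can ignore failures, while a caller that needs the value during planning can call `optimize` and check whether it got a `Literal` back.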
Considering LocalExecutionPlanner, are there tests for the case with Cast(Literal) being handled as a trivial projection? It was not possible before.
As discussed offline, the intended contract of this method is to filter simple constant expressions which do not fail, so that the caller does not need to deal with potential failure.
I consider this a fair decision, as the majority of failing expressions should never reach this point (provided that they go through the ExpressionInterpreter first). So, we don't lose many optimization opportunities by filtering out failing expressions, while the usage of the method is considerably simplified.
However, for the contract to hold, we need to validate Literals also, not only the casts. Maybe add the validation for Literals for now, and a TODO that user-provided Literals should be validated in the ExpressionAnalyzer?
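The contract described above (a boolean answer that never throws and excludes constant expressions that would fail) can be sketched as follows. This is a simplified illustration under stated assumptions: the `Literal` and `Cast` classes, the string-only literal values, and the `evaluateCast` helper are hypothetical and do not mirror the actual `ExpressionUtils.isEffectivelyLiteral` implementation.

```java
// Toy model only: hypothetical types, not Trino's AST or ExpressionUtils.
final class NonFailingLiteralCheckSketch {
    interface Expression {}

    static final class Literal implements Expression {
        final String value;
        Literal(String value) { this.value = value; }
    }

    static final class Cast implements Expression {
        final Expression operand;
        final String targetType;
        Cast(Expression operand, String targetType) {
            this.operand = operand;
            this.targetType = targetType;
        }
    }

    // Returns true only for a literal, or for a cast of a literal that evaluates successfully.
    // Failing or unsupported casts are reported as "not a literal" rather than raising.
    static boolean isEffectivelyLiteral(Expression expression) {
        if (expression instanceof Literal) {
            return true;
        }
        if (expression instanceof Cast && ((Cast) expression).operand instanceof Literal) {
            Cast cast = (Cast) expression;
            try {
                // Cheap by construction: the operand is a plain literal.
                evaluateCast((Literal) cast.operand, cast.targetType);
                return true;
            }
            catch (RuntimeException e) {
                return false;
            }
        }
        return false;
    }

    // Hypothetical cast evaluation supporting two target types.
    static Object evaluateCast(Literal operand, String targetType) {
        if (targetType.equals("integer")) {
            return Integer.parseInt(operand.value);
        }
        if (targetType.equals("varchar")) {
            return operand.value;
        }
        throw new IllegalArgumentException("unsupported target type: " + targetType);
    }
}
```

The trade-off matches the discussion: callers can treat a `true` result as safe to evaluate, at the cost of not recognizing failing casts as constants.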
> However, for the contract to hold, we need to validate Literals also, not only the casts.
Done in #10720, so let me skip adding it here.
0bc4936 to 1de9a35
Pushed some clarification checks in
core/trino-main/src/main/java/io/trino/sql/ExpressionUtils.java
core/trino-main/src/main/java/io/trino/sql/ExpressionUtils.java
core/trino-main/src/main/java/io/trino/sql/ExpressionUtils.java
1de9a35 to 3eec633
3eec633 to 43e2101
After validating expressions of the form Cast(Literal) in the isEffectivelyLiteral() method, and expressions being Literals as in #10720, we can skip the validation here, and just use interpreter.evaluate().
Switched to evaluateConstantExpression; it calls ExpressionInterpreter.evaluate behind the scenes.
With isEffectivelyLiteral, this is not quite true any more.
How about this: #10663 (comment) ?
Added tests in TestLogicalPlanner and TestInlineProjections.
> With isEffectivelyLiteral, this is not quite true any more.
It's still true: ExpressionInterpreter.optimize() is not cheap in general.
While ExpressionInterpreter.optimize() is used internally in isEffectivelyLiteral, that use is guarded by a check ensuring the call would be cheap.
In many places in the code, we have special paths for expressions that are literals. Detecting literals with `instanceof Literal` is not enough, as some types, or some (value, type) combinations, do not have a direct literal form. The `LiteralEncoder` encodes them as constant, terse expressions. Throughout the optimizer, the "is literal" detections should treat them the same way.
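As an illustration of why `instanceof Literal` alone misses such encoded constants, the sketch below contrasts a naive check with a broader structural one. All classes here (`Literal`, `Cast`, `FunctionCall`) and both methods are toy stand-ins assumed for the example; in particular, treating `Cast(Literal)` as the encoded form is a simplification of what `LiteralEncoder` may produce.

```java
import java.util.List;

// Toy model only: hypothetical types, not Trino's AST or LiteralEncoder output.
final class LiteralDetectionSketch {
    interface Expression {}

    static final class Literal implements Expression {
        final Object value;
        Literal(Object value) { this.value = value; }
    }

    static final class Cast implements Expression {
        final Expression operand;
        final String targetType;
        Cast(Expression operand, String targetType) {
            this.operand = operand;
            this.targetType = targetType;
        }
    }

    static final class FunctionCall implements Expression {
        final String name;
        final List<Expression> arguments;
        FunctionCall(String name, List<Expression> arguments) {
            this.name = name;
            this.arguments = arguments;
        }
    }

    // Naive detection: misses constants that have no direct literal form.
    static boolean isLiteralNaive(Expression expression) {
        return expression instanceof Literal;
    }

    // Broader detection: also accepts a cast applied directly to a literal.
    static boolean isEffectivelyLiteral(Expression expression) {
        if (expression instanceof Literal) {
            return true;
        }
        if (expression instanceof Cast) {
            return ((Cast) expression).operand instanceof Literal;
        }
        return false;
    }
}
```

With these definitions, an expression shaped like `CAST(DECIMAL '2.1' AS decimal(2, 1))` is rejected by the naive check but accepted by the broader one, while an arbitrary function call is rejected by both.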
43e2101 to 71045b5
In many code places, we have special paths for expressions that are
literals. Detecting literals with `instanceof Literal` is not enough, as
some types, or some (value, type) combinations, do not have direct
literal form. The `LiteralEncoder` encodes them as constant, terse
expressions. Throughout the optimizer, the "is literal" detections
should treat them same way.
Follows discussion under #10499 (comment)