Optimize effectively literal values by findepi · Pull Request #10663 · trinodb/trino

findepi · 2022-01-18T16:04:15Z

In many places in the code, we have special paths for expressions that are
literals. Detecting literals with instanceof Literal is not enough, as
some types, or some (value, type) combinations, do not have a direct
literal form. The LiteralEncoder encodes them as constant, terse
expressions. Throughout the optimizer, the "is literal" detections
should treat them the same way.
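
To make the idea concrete, here is a minimal sketch of what an "effectively literal" check could look like. It uses a toy expression tree, not Trino's AST: `Lit`, `DecimalCast`, `Call`, and `castSucceeds` are hypothetical names invented for illustration, and the real `isEffectivelyLiteral` in `ExpressionUtils` may differ in detail.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class EffectiveLiterals {
    // True for a plain literal, or for a cast of a literal that is known to succeed.
    static boolean isEffectivelyLiteral(Expr e) {
        if (e instanceof Lit) {
            return true;
        }
        if (e instanceof DecimalCast && ((DecimalCast) e).operand instanceof Lit) {
            DecimalCast cast = (DecimalCast) e;
            return castSucceeds(((Lit) cast.operand).text, cast.precision, cast.scale);
        }
        return false;
    }

    // Hypothetical stand-in for evaluating the cast:
    // the text must parse and fit into decimal(precision, scale).
    static boolean castSucceeds(String text, int precision, int scale) {
        try {
            BigDecimal value = new BigDecimal(text).setScale(scale, RoundingMode.HALF_UP);
            return value.precision() <= precision;
        }
        catch (NumberFormatException ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        // CAST('2.1' AS decimal(2, 1)) fits, so it counts as effectively literal.
        System.out.println(isEffectivelyLiteral(new DecimalCast(new Lit("2.1"), 2, 1)));
        // CAST('123.45' AS decimal(2, 1)) overflows, so it does not.
        System.out.println(isEffectivelyLiteral(new DecimalCast(new Lit("123.45"), 2, 1)));
    }
}

// Toy expression tree. These classes are illustrative stand-ins, not Trino's AST.
interface Expr {}

class Lit implements Expr {
    final String text;

    Lit(String text) { this.text = text; }
}

class DecimalCast implements Expr {
    final Expr operand;
    final int precision;
    final int scale;

    DecimalCast(Expr operand, int precision, int scale) {
        this.operand = operand;
        this.precision = precision;
        this.scale = scale;
    }
}

class Call implements Expr {
    final String name;

    Call(String name) { this.name = name; }
}
```

A plain `instanceof Lit` test would reject the first case in `main`, which is the gap this PR describes.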

Follows discussion under #10499 (comment)

losipiuk

LGTM

findepi · 2022-01-18T16:24:14Z

Thanks @losipiuk for a quick review.

Added more testing (https://github.com/trinodb/trino/compare/514dc390b4e1c1afbe6158b406b838e596c32568..4bbc72b9c2dea585e6c2dee7bc92b6c9a0324171) and this uncovered a small glitch, also fixed.

findepi · 2022-01-18T17:27:18Z

pushed fix for code style issues (https://github.com/trinodb/trino/compare/4bbc72b9c2dea585e6c2dee7bc92b6c9a0324171..eacde514c86298e047d844b5deeb65ccd0488d9d)

core/trino-main/src/main/java/io/trino/sql/ExpressionUtils.java

core/trino-main/src/main/java/io/trino/sql/planner/ExpressionInterpreter.java

core/trino-main/src/main/java/io/trino/sql/planner/iterative/rule/InlineProjections.java

core/trino-main/src/main/java/io/trino/sql/ExpressionUtils.java

This addresses a code TODO comment.

Remove unnecessary code copies.

Will be used soon.

findepi · 2022-01-19T15:41:55Z

AC

findepi · 2022-01-20T10:30:50Z

plugin/trino-geospatial/src/test/java/io/trino/plugin/geospatial/TestSpatialJoinPlanning.java

Here & below. Previously we were projecting CAST(DECIMAL '2.1' AS decimal(2, 1)) column to then construct ST_Point(x,x) from it. Now this got recognized as effectively literal and inlined. Thus ST_Point() function call is replaced with literal (point21x21Literal), and subsequent Project nodes are merged into one.

kasiafi · 2022-01-20T11:03:39Z

core/trino-main/src/main/java/io/trino/cost/FilterStatsCalculator.java

Where is this case handled? Is it SimplifyExpressions?

in FilterStatsCalculator,

trino/core/trino-main/src/main/java/io/trino/cost/FilterStatsCalculator.java

Lines 102 to 120 in 8e40a44

        Expression simplifiedExpression = simplifyExpression(session, predicate, types);
        return new FilterExpressionStatsCalculatingVisitor(statsEstimate, session, types)
                .process(simplifiedExpression);
    }

    private Expression simplifyExpression(Session session, Expression predicate, TypeProvider types)
    {
        // TODO reuse io.trino.sql.planner.iterative.rule.SimplifyExpressions.rewrite

        Map<NodeRef<Expression>, Type> expressionTypes = getExpressionTypes(session, predicate, types);
        ExpressionInterpreter interpreter = new ExpressionInterpreter(predicate, plannerContext, session, expressionTypes);
        Object value = interpreter.optimize(NoOpSymbolResolver.INSTANCE);

        if (value == null) {
            // Expression evaluates to SQL null, which in Filter is equivalent to false. This assumes the expression is a top-level expression (eg. not in NOT).
            value = false;
        }
        return new LiteralEncoder(plannerContext).toExpression(session, value, BOOLEAN);
    }
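
The comment in the snippet about SQL null being equivalent to false only at the top level can be illustrated with a small three-valued-logic sketch. This is not Trino code: a null `Boolean` stands in for SQL NULL, and all names are made up for the example.

```java
// Three-valued logic sketch: a null Boolean stands for SQL NULL.
class NullPredicateDemo {
    // A Filter keeps a row only when its predicate is TRUE; FALSE and NULL both drop it.
    static boolean filterKeepsRow(Boolean predicate) {
        return Boolean.TRUE.equals(predicate);
    }

    // SQL NOT: NULL stays NULL.
    static Boolean not(Boolean value) {
        return value == null ? null : Boolean.valueOf(!value);
    }

    // The substitution made in simplifyExpression: NULL becomes FALSE.
    static Boolean nullToFalse(Boolean value) {
        return value == null ? Boolean.FALSE : value;
    }

    public static void main(String[] args) {
        Boolean sqlNull = null;
        // Top level: replacing NULL with FALSE does not change which rows pass.
        System.out.println(filterKeepsRow(sqlNull) == filterKeepsRow(nullToFalse(sqlNull)));
        // Under NOT: NOT NULL is NULL (row dropped), but NOT FALSE is TRUE (row kept).
        System.out.println(filterKeepsRow(not(sqlNull)) == filterKeepsRow(not(nullToFalse(sqlNull))));
    }
}
```

The second comparison differs, which is why the substitution is only safe when the expression is the whole filter predicate.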

kasiafi · 2022-01-20T11:28:00Z

core/trino-main/src/main/java/io/trino/sql/ExpressionUtils.java

Why is it important that the cast doesn't fail?

I guess, we don't want the query to fail in the stats calculator, or in SimplifyCountOverConstant, trying to evaluate the constant value.

However, in PredicatePushDown, we don't evaluate the expression, so we could consider the potentially unsafe expression as a constant. And in LocalExecutionPlanner we shouldn't care if it fails, but take the opportunity to use the shortcut for trivial projections.

Also, Literal is not safe. It is possible to create a GenericLiteral with mismatching value, and that also fails to evaluate.

I suggest a different approach on handling failures: we could skip the validation here, and if the "effectively literal" value is to be evaluated during the optimization phase, then we could use the safe ExpressionInterpreter.optimize method.

> Why is it important that the cast doesn't fail?

the isEffectivelyLiteral is a boolean-returning method intended to be used wherever we inspect whether an expression can be considered "a literal". It's not intended to throw for a valid input, as it would make the usage harder.

Note that we don't need to be 'smart' about failing cast case. In a typical case, ExpressionInterpreter will fold it to a fail call (which isn't recognized as a literal).

> Also, Literal is not safe. It is possible to create a GenericLiteral with mismatching value, and that also fails to evaluate.

True. However, no optimizer / optimizer rule should do this. Literals coming from the parser should be validated in ExpressionAnalyzer.

> I suggest a different approach on handling failures: we could skip the validation here, and if the "effectively literal" value is to be evaluated during the optimization phase, then we could use the safe ExpressionInterpreter.optimize method.

That would work. Note however that my intention was to capture literals, and other things produced by LiteralEncoder. Hence the method name & javadoc, and that's also why I chose not to fail.
Do you feel strongly about this?

> the isEffectivelyLiteral is a boolean-returning method intended to be used wherever we inspect whether an expression can be considered "a literal". It's not intended to throw for a valid input, as it would make the usage harder.

My question was: why do we try to filter out failing casts instead of just reporting them as "effectively literals"? I wasn't trying to suggest that isEffectivelyLiteral should throw. Sorry for not being clear.

> Note that we don't need to be 'smart' about failing cast case. In a typical case, ExpressionInterpreter will fold it to a fail call (which isn't recognized as a literal).

Yes, but we shouldn't depend on the rule order, especially since isEffectivelyLiteral can be reused in the future.

> that my intention was to capture literals, and other things produced by LiteralEncoder.

If we drop the failure check in isEffectivelyLiteral, which is my suggestion, that intention is still satisfied. It would then depend on the caller whether they care about errors (and if they do, to use ExpressionInterpreter.optimize).
Generally, I think that using ExpressionInterpreter.optimize instead of ExpressionInterpreter.evaluate should be the rule anywhere before the execution.
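
The difference between a strict and a "safe" evaluation mode, as discussed here, can be sketched as follows. This is a toy model under stated assumptions, not the actual `ExpressionInterpreter`: `ToyInterpreter` and `DeferredFailure` are hypothetical, and integer parsing stands in for a cast that may fail.

```java
class ToyInterpreter {
    // Marker for an expression whose evaluation is known to fail; the failure is kept for later.
    static class DeferredFailure {
        final String message;

        DeferredFailure(String message) { this.message = message; }
    }

    // Strict mode: a failing cast throws immediately.
    static int evaluate(String integerLiteral) {
        return Integer.parseInt(integerLiteral);
    }

    // Lenient mode: a failing cast does not throw; a placeholder is returned instead,
    // so the error would only surface if the expression is actually executed.
    static Object optimize(String integerLiteral) {
        try {
            return evaluate(integerLiteral);
        }
        catch (NumberFormatException e) {
            return new DeferredFailure(e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println(optimize("42"));
        System.out.println(optimize("4x") instanceof DeferredFailure);
    }
}
```

With this split, a caller in the optimization phase can look at a possibly failing constant without failing the query itself.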

If we drop the failure check in isEffectivelyLiteral, we are also consistent wrt failing casts and failing GenericLiterals (until we fix them).

On the other hand, if isEffectivelyLiteral reports failing cast as a non-literal, we miss out in PPD and LocalExecutionPlanner, as I mentioned before.

Considering LocalExecutionPlanner, are there tests for the case with Cast(Literal) being handled as a trivial projection? It was not possible before.

As discussed offline, the intended contract of this method is to filter simple constant expressions which do not fail, so that the caller does not need to deal with potential failure.

I consider this a fair decision, as the majority of failing expressions should never reach this point (provided that they go through the ExpressionInterpreter first). So, we don't lose many optimization opportunities by filtering out failing expressions, while the usage of the method is considerably simplified.

However, for the contract to hold, we need to validate Literals also, not only the casts. Maybe add the validation for Literals for now, and a TODO that user-provided Literals should be validated in the ExpressionAnalyzer?

> However, for the contract to hold, we need to validate Literals also, not only the casts.

Done in #10720, so let me skip adding it here.

findepi · 2022-01-20T13:55:43Z

Pushed some clarification checks in io.trino.sql.planner.ExpressionInterpreter.Visitor#processOperands, as I spent some time thinking about why the recursive coalesce processing is actually awesome.

core/trino-main/src/main/java/io/trino/sql/ExpressionUtils.java

kasiafi · 2022-01-21T09:31:50Z

core/trino-main/src/main/java/io/trino/sql/ExpressionUtils.java

As discussed offline, the intended contract of this method is to filter simple constant expressions which do not fail, so that the caller does not need to deal with potential failure.

I consider this a fair decision, as the majority of failing expressions should never reach this point (provided that they go through the ExpressionInterpreter first). So, we don't lose many optimization opportunities by filtering out failing expressions, while the usage of the method is considerably simplified.

However, for the contract to hold, we need to validate Literals also, not only the casts. Maybe add the validation for Literals for now, and a TODO that user-provided Literals should be validated in the ExpressionAnalyzer?

core/trino-main/src/main/java/io/trino/sql/ExpressionUtils.java

findepi · 2022-01-21T13:10:02Z

AC (https://github.com/trinodb/trino/compare/1de9a353fcbf25c2d4db0e5407632de513fa610b..3eec63339856bdee56ec9d41a4a93819cfd0fc6b)
plus removed unused leftover var (https://github.com/trinodb/trino/compare/3eec63339856bdee56ec9d41a4a93819cfd0fc6b..43e2101d7c9c5acc6d5be2236c6ac3ee65c04021)

kasiafi

LGTM

kasiafi · 2022-01-24T13:35:54Z

core/trino-main/src/main/java/io/trino/cost/FilterStatsCalculator.java

After validating expressions of the form Cast(Literal) in the isEffectivelyLiteral() method, and expressions being Literals as in #10720, we can skip the validation here, and just use interpreter.evaluate().

Switched to evaluateConstantExpression; it calls `ExpressionInterpreter.evaluate` behind the scenes.

kasiafi · 2022-01-24T13:41:20Z

...o-main/src/main/java/io/trino/sql/planner/iterative/rule/CanonicalizeExpressionRewriter.java

With isEffectivelyLiteral, this is not quite true any more.

How about this: #10663 (comment) ?

> How about this: #10663 (comment) ?

Added tests in TestLogicalPlanner and TestInlineProjections.

> With isEffectivelyLiteral, this is not quite true any more.

It's still true: ExpressionInterpreter.optimize() is not cheap in general.
While ExpressionInterpreter.optimize() is used internally in isEffectivelyLiteral, the use is guarded with a check ensuring the call would be cheap.
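
A rough sketch of such a guard, assuming it is a shape check (only a cast directly over a literal reaches the interpreter). The `Node` class, the `interpreterCalls` counter, and `evaluatesWithoutFailure` are hypothetical and exist only to show that arbitrary expressions are rejected without an interpreter pass.

```java
class GuardedLiteralCheck {
    // Minimal expression node: kind is "literal", "cast", or "call"; child may be null.
    static class Node {
        final String kind;
        final Node child;

        Node(String kind, Node child) {
            this.kind = kind;
            this.child = child;
        }
    }

    static int interpreterCalls = 0;

    // Stand-in for an expensive interpreter pass; here it only counts invocations.
    static boolean evaluatesWithoutFailure(Node node) {
        interpreterCalls++;
        return true;
    }

    static boolean isEffectivelyLiteral(Node node) {
        if (node.kind.equals("literal")) {
            return true; // no interpreter needed
        }
        // Guard: only a cast directly over a literal reaches the interpreter,
        // so the interpreter never sees a large expression tree.
        if (node.kind.equals("cast") && node.child != null && node.child.kind.equals("literal")) {
            return evaluatesWithoutFailure(node);
        }
        return false; // everything else is rejected by shape alone
    }

    public static void main(String[] args) {
        Node deep = new Node("literal", null);
        for (int i = 0; i < 1000; i++) {
            deep = new Node("call", deep);
        }
        System.out.println(isEffectivelyLiteral(deep) + " after " + interpreterCalls + " interpreter calls");
    }
}
```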


findepi added the enhancement and performance labels Jan 18, 2022

findepi requested review from kasiafi, losipiuk, martint and sopel39 January 18, 2022 16:04

cla-bot bot added the cla-signed label Jan 18, 2022

findepi mentioned this pull request Jan 18, 2022

Remove unsound varchar->char implicit coercion #10499

Closed

losipiuk reviewed Jan 18, 2022


findepi force-pushed the findepi/generalized-literal branch from 514dc39 to 4bbc72b Compare January 18, 2022 16:23

findepi force-pushed the findepi/generalized-literal branch from 4bbc72b to eacde51 Compare January 18, 2022 17:22

martint reviewed Jan 18, 2022


findepi force-pushed the findepi/generalized-literal branch from eacde51 to 7fc353e Compare January 19, 2022 15:40

findepi added 6 commits January 19, 2022 16:40

Test count(NULL) aggregation

ca7acde

This addresses a code TODO comment.

Remove unused method in SpatialJoinUtils

c9cbf80

Unify code for getting types in expression

7d4d5c0

Remove unnecessary code copies.

Pass PlannerContext to CanonicalizeExpressionRewriter

3de0c9b

Will be used soon.

Pass PlannerContext to InlineProjections

e562eb6

Will be used soon.

Pass PlannerContext to SimplifyCountOverConstant

4af8552

Will be used soon.

findepi force-pushed the findepi/generalized-literal branch from 7fc353e to dc2ed0c Compare January 19, 2022 15:41

findepi requested a review from martint January 19, 2022 15:41

findepi mentioned this pull request Jan 19, 2022

Test count(NULL) aggregation #10695

Closed

findepi force-pushed the findepi/generalized-literal branch from dc2ed0c to 0bc4936 Compare January 20, 2022 10:28

findepi commented Jan 20, 2022


losipiuk approved these changes Jan 20, 2022


kasiafi reviewed Jan 20, 2022


findepi force-pushed the findepi/generalized-literal branch from 0bc4936 to 1de9a35 Compare January 20, 2022 13:54

findepi mentioned this pull request Jan 21, 2022

Literals should be validated before optimizer #10719

Closed

kasiafi reviewed Jan 21, 2022


findepi force-pushed the findepi/generalized-literal branch from 1de9a35 to 3eec633 Compare January 21, 2022 13:01

findepi added 3 commits January 21, 2022 14:06

Skip redundant toExpression conversion

2b5f035

Add null-related assertion in coalesce processing

2936ce0

Add @language annotation to PlanMatchPattern.expression

9c05db6

findepi force-pushed the findepi/generalized-literal branch from 3eec633 to 43e2101 Compare January 21, 2022 13:06

findepi requested a review from kasiafi January 21, 2022 13:10

kasiafi reviewed Jan 24, 2022


kasiafi mentioned this pull request Jan 25, 2022

Optimize filter condition with case expression predicate #10580

Closed

findepi force-pushed the findepi/generalized-literal branch from 43e2101 to 71045b5 Compare January 27, 2022 10:10

kasiafi approved these changes Jan 27, 2022


findepi merged commit ab6072d into trinodb:master Jan 27, 2022

findepi deleted the findepi/generalized-literal branch January 27, 2022 13:02

github-actions bot added this to the 370 milestone Jan 27, 2022

This was referenced Jan 27, 2022

Add Trino 370 release notes #10793

Merged

Release notes for 370 #10794

Closed

