Load lazy blocks before using them in page filter/projection #10322
martint merged 1 commit into trinodb:master
skrzypo987 left a comment:
This seems way beyond my Trino capabilities.
At least I found a missing requireNonNull, which makes me a proper reviewer.
I'll try to have another look tomorrow but you should seek feedback from professionals.
core/trino-main/src/main/java/io/trino/operator/project/InputChannels.java
This really depends on the function being applied. It's possible that not all of the references in the lambda expression are used.
I see. I've updated visitCall to assume that all the inputs to a lambda expression can be conditionally evaluated (i.e. it keeps the existing behaviour of all those blocks being lazily loaded).
This really depends on the function being applied. It's possible that not all of the references in the lambda expression are used.
@martint could you put an example here?
Should we just assume that the lambda body has unconditionallyEvaluated=false and add a TODO?
.../trino-main/src/main/java/io/trino/operator/project/PageFieldsToInputParametersRewriter.java
I don't fully understand why the call expression has special handling of lambdas. Shouldn't lambdas be fully covered via visitLambda?
When the visitor is traversing a lambda expression, the leaf nodes always seem to be ConstantExpression or VariableReferenceExpression. So setting any value for unconditionallyEvaluated in visitLambda seems to be a no-op.
I think a lambda expression is always supposed to be part of a function call like transform, reduce, etc. The InputReferenceExpression nodes for the fields potentially accessed by the lambda are found in the initial arguments of the CallExpression.
I will leave this one for @martint to look at.
Could you search for the code that creates a CallExpression for lambdas and see what the arguments are? According to the ANTLR grammar, lambdas are primary expressions, so I'm not sure where the CallExpression comes from.
the leaf nodes always seem to be ConstantExpression or VariableReferenceExpression. So setting any value for unconditionallyEvaluated in visitLambda seems to be a no-op.
You would need to link those references to the corresponding arguments of the call to the higher-order function.
For example, in a hypothetical apply2 function: apply2(a, b, (x, y) -> x AND y), b is conditionally loaded, but a isn't due to the conditional nature of x AND y.
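To make the apply2 example concrete, here is a minimal sketch in plain Java. It is not Trino code: apply2 is the hypothetical function from the comment above, and LazyValue is an invented stand-in for a lazily loaded channel that simply counts how often it is read. The sketch only shows that the short-circuiting x AND y always reads its first argument and reads the second one only when the first is true.

```java
import java.util.function.BiFunction;
import java.util.function.Supplier;

// Hypothetical illustration only: none of these names exist in Trino.
final class Apply2Sketch
{
    // Stand-in for a lazily loaded channel; counts how often it is read.
    static final class LazyValue
            implements Supplier<Boolean>
    {
        private final boolean value;
        int loads;

        LazyValue(boolean value)
        {
            this.value = value;
        }

        @Override
        public Boolean get()
        {
            loads++;
            return value;
        }
    }

    // Hypothetical higher-order function: passes both arguments to the lambda unevaluated.
    static boolean apply2(
            Supplier<Boolean> a,
            Supplier<Boolean> b,
            BiFunction<Supplier<Boolean>, Supplier<Boolean>, Boolean> lambda)
    {
        return lambda.apply(a, b);
    }

    public static void main(String[] args)
    {
        LazyValue a = new LazyValue(false);
        LazyValue b = new LazyValue(true);
        // x AND y short-circuits: y is read only when x is true.
        boolean result = apply2(a, b, (x, y) -> x.get() && y.get());
        // prints "false a.loads=1 b.loads=0"
        System.out.println(result + " a.loads=" + a.loads + " b.loads=" + b.loads);
    }
}
```

An analysis that wanted to exploit this would have to map x and y back to the call arguments a and b, which is the linking step described above.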
@martint in the current code my intent was to fall back to the existing behaviour for lambda expressions. The motivating scenario for this PR was to improve efficiency for simple filters and projections (e.g. tpch/q01). Can we skip optimising this scenario, or would you still consider this a blocker?
Yes, we can skip it for now. But add a TODO so we know the implementation is not yet complete.
I've added a TODO here now
I'll take a look in the next couple of days.
Helps to avoid calls to LazyData#getTopLevelBlock in generated page filter and projection methods. PageFieldsToInputParametersRewriter now also records which channels are evaluated unconditionally so that LazyBlock can be loaded for those channels before expression evaluation.
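As a rough illustration of the idea in the commit message, the sketch below walks a toy expression tree and records, for each referenced channel, whether it is evaluated unconditionally. It is a simplified model and not the real PageFieldsToInputParametersRewriter: the Expr, InputRef, Call and ShortCircuit types and the analyze method are invented for this example, and treating only the first argument of a short-circuiting form as unconditional is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical, simplified model; not the real PageFieldsToInputParametersRewriter.
final class EagerChannelsSketch
{
    interface Expr {}

    // Reference to an input channel (column) of the page.
    static final class InputRef implements Expr
    {
        final int channel;
        InputRef(int channel) { this.channel = channel; }
    }

    // Ordinary function call: every argument is evaluated.
    static final class Call implements Expr
    {
        final List<Expr> args;
        Call(Expr... args) { this.args = Arrays.asList(args); }
    }

    // Short-circuiting form (assumed here for AND, OR, IF): only the first argument is always evaluated.
    static final class ShortCircuit implements Expr
    {
        final List<Expr> args;
        ShortCircuit(Expr... args) { this.args = Arrays.asList(args); }
    }

    static final class Result
    {
        final SortedSet<Integer> allChannels = new TreeSet<>();
        // Channels whose lazy blocks could be loaded before evaluation starts.
        final SortedSet<Integer> eagerChannels = new TreeSet<>();
    }

    static Result analyze(Expr root)
    {
        Result result = new Result();
        visit(root, true, result);
        return result;
    }

    private static void visit(Expr expr, boolean unconditionallyEvaluated, Result result)
    {
        if (expr instanceof InputRef) {
            int channel = ((InputRef) expr).channel;
            result.allChannels.add(channel);
            if (unconditionallyEvaluated) {
                result.eagerChannels.add(channel);
            }
        }
        else if (expr instanceof Call) {
            for (Expr arg : ((Call) expr).args) {
                visit(arg, unconditionallyEvaluated, result);
            }
        }
        else if (expr instanceof ShortCircuit) {
            List<Expr> args = ((ShortCircuit) expr).args;
            for (int i = 0; i < args.size(); i++) {
                // Arguments after the first may be skipped at runtime.
                visit(args.get(i), unconditionallyEvaluated && i == 0, result);
            }
        }
    }

    public static void main(String[] args)
    {
        // f(#0) AND g(#1, #2): channel 0 is always read; 1 and 2 only if f(#0) is true.
        Expr filter = new ShortCircuit(
                new Call(new InputRef(0)),
                new Call(new InputRef(1), new InputRef(2)));
        Result result = analyze(filter);
        // prints "all=[0, 1, 2] eager=[0]"
        System.out.println("all=" + result.allChannels + " eager=" + result.eagerChannels);
    }
}
```

In this model a caller would load the lazy blocks for eagerChannels up front and leave the rest lazy. Per the review thread, calls that take lambda arguments are not optimised yet and keep the existing lazy behaviour, which in this sketch would correspond to visiting their arguments with unconditionallyEvaluated set to false.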
Load lazy blocks TPCH unpartitioned ORC sf1000.pdf
@martint I've attached results from the TPCH unpartitioned ORC sf1000 run.