[native] Add expression optimizer #22927
822f79f to 2352cd4
aditi-pandit
left a comment
@pramodsatya : Have done a first round of comments. Will read your tests more closely once you address the comments here.
2352cd4 to 496bbc5
pramodsatya
left a comment
Thanks for the feedback @aditi-pandit, addressed the comments. Could you please take another look?
496bbc5 to d9c3dcd
d9c3dcd to e19ae51
e19ae51 to 747ef08
Summary: prestodb/presto#23331 adds a native expression optimizer to delegate expression evaluation to the native sidecar. This is used to constant fold expressions on the Presto native sidecar instead of on the Presto Java coordinator (which is the current behavior). prestodb/presto#22927 implements a Proxygen endpoint to accept `RowExpression`s from `NativeSidecarExpressionInterpreter`, optimize them if possible (rewrite special form expressions), and compile the `RowExpression` to a Velox expression with constant folding enabled. This Velox expression is then converted back to a `RowExpression` and returned by the sidecar to the coordinator.

When the constant folded Velox expression is of type `velox::exec::ConstantExpr`, we need to return a `RowExpression` of type `ConstantExpression`. This requires us to serialize the constant value from `velox::exec::ConstantExpr` into `protocol::ConstantExpression::valueBlock`. This can be done by serializing the constant value vector to the Presto `SerializedPage::column` format and then Base64 encoding the result (reverse engineering the logic from `Base64Util.cpp::readBlock`).

This PR adds a new function, `serializeSingleColumn`, to `PrestoVectorSerde`. It can be used to serialize input data from vectors containing a single element into a single PrestoPage column format (without the PrestoPage header). The function is not added to `PrestoBatchVectorSerializer` alongside the existing `serialize` function, since that would require adding it as a virtual function in `BatchVectorSerializer` as well, which is not desirable because the `PrestoPage` format is not relevant in this base class. There is an existing function, `deserializeSingleColumn`, in `PrestoVectorSerde` that is used to deserialize data from a single column. Since `serializeSingleColumn` performs the inverse of that operation, it is added alongside it in `PrestoVectorSerde`.

Pull Request resolved: #10657
Reviewed By: amitkdutta
Differential Revision: D66044754
Pulled By: pedroerp
fbshipit-source-id: e509605067920f8207e5a3fa67552badc2ce0eba
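The second half of the encoding step described in the summary above (Base64 encoding the serialized column bytes) can be sketched in plain C++. This is only an illustration under stated assumptions: `base64Encode` is a hypothetical helper written for this example, not the Velox or Presto utility, and the input is an arbitrary byte buffer standing in for the output of `serializeSingleColumn`, whose real byte layout is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of Velox or Presto): standard Base64
// (RFC 4648 alphabet) with '=' padding over an arbitrary byte buffer.
std::string base64Encode(const std::vector<uint8_t>& bytes) {
  static const char kAlphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  out.reserve(((bytes.size() + 2) / 3) * 4);

  // Encode every full group of 3 input bytes as 4 output characters.
  size_t i = 0;
  for (; i + 3 <= bytes.size(); i += 3) {
    const uint32_t n = (static_cast<uint32_t>(bytes[i]) << 16) |
        (static_cast<uint32_t>(bytes[i + 1]) << 8) |
        static_cast<uint32_t>(bytes[i + 2]);
    out.push_back(kAlphabet[(n >> 18) & 63]);
    out.push_back(kAlphabet[(n >> 12) & 63]);
    out.push_back(kAlphabet[(n >> 6) & 63]);
    out.push_back(kAlphabet[n & 63]);
  }

  // Encode the 1 or 2 trailing bytes, padding the group to 4 characters.
  const size_t remaining = bytes.size() - i;
  if (remaining == 1) {
    const uint32_t n = static_cast<uint32_t>(bytes[i]) << 16;
    out.push_back(kAlphabet[(n >> 18) & 63]);
    out.push_back(kAlphabet[(n >> 12) & 63]);
    out += "==";
  } else if (remaining == 2) {
    const uint32_t n = (static_cast<uint32_t>(bytes[i]) << 16) |
        (static_cast<uint32_t>(bytes[i + 1]) << 8);
    out.push_back(kAlphabet[(n >> 18) & 63]);
    out.push_back(kAlphabet[(n >> 12) & 63]);
    out.push_back(kAlphabet[(n >> 6) & 63]);
    out.push_back('=');
  }
  return out;
}
```

In the flow described above, the resulting string would be what ends up in `protocol::ConstantExpression::valueBlock`; the bytes fed in would come from the single-column serializer rather than from a hand-built buffer.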
747ef08 to 10bfc08
10bfc08 to 37ef905
pramodsatya
left a comment
Thanks for the feedback @aditi-pandit, addressed the review comments. Could you please take another look?
37ef905 to 7425558
@aditi-pandit, @czentgr, could you please help review this PR? Added an API to constant fold
steveburnett
left a comment
Thanks for the doc! Some minor nits of formatting.
steveburnett
left a comment
LGTM! (docs)
Pull updated branch, new local doc build, looks good. Thanks!
aditi-pandit
left a comment
@pramodsatya : Haven't read all the code in detail but have one high level comment about the API in VeloxToPrestoExpr
     public:
      explicit VeloxToPrestoExprConverter(memory::MemoryPool* pool) : pool_(pool) {}

      /// Converts a velox expression `expr` to a Presto protocol RowExpression.
Didn't follow this... What is the output json ?
        return getCallExpression(call, input);
      }

      LOG(ERROR) << "Unable to convert Velox expression: {}" << expr->toString();
This function signature could be changed to return an optional json. If there is an error here, you can return nullopt. The input expression wouldn't be needed then.
The caller of this function can then decide to return the input expression if the conversion was not successful.
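The shape suggested in this comment could look roughly like the sketch below. It is a toy under stated assumptions, not the PR's code: `ToyExpr`, `tryConvert`, and `convertOrKeepInput` are hypothetical names, a `std::string` stands in for the JSON value, and the JSON text it produces is made up for illustration rather than being the Presto protocol format.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for a Velox expression node; not a real Velox type.
struct ToyExpr {
  enum class Kind { kConstant, kCall, kUnsupported };
  Kind kind;
  std::string text;
};

// Converts `expr` to (made-up) JSON text. Returns std::nullopt when the
// expression kind is not supported, so no input expression has to be
// passed in just to serve as a fallback.
std::optional<std::string> tryConvert(const ToyExpr& expr) {
  switch (expr.kind) {
    case ToyExpr::Kind::kConstant:
      return "{\"kind\":\"constant\",\"value\":\"" + expr.text + "\"}";
    case ToyExpr::Kind::kCall:
      return "{\"kind\":\"call\",\"name\":\"" + expr.text + "\"}";
    case ToyExpr::Kind::kUnsupported:
      return std::nullopt;
  }
  return std::nullopt;
}

// The caller owns the fallback policy: if the conversion did not succeed,
// it returns the original input expression unchanged.
std::string convertOrKeepInput(
    const ToyExpr& expr,
    const std::string& inputJson) {
  return tryConvert(expr).value_or(inputJson);
}
```

With this split, the converter only reports success or failure, and the decision to fall back to the input expression is made in one place by the caller.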
          return getSpecialFormExpression(callTypedExpr, input);
        }
        return getCallExpression(callTypedExpr, input);
      } else if (
Maybe move Cast logic before Call.
Co-authored-by: Pratik Joseph Dabre
Co-authored-by: Pramod Satya
7581285 to 4f203b2
Description
To support constant folding and consistent semantics between the Presto coordinator (Java) and the Presto C++ worker, expressions must be evaluated consistently on both sides. To that end, a native expression evaluation endpoint, `v1/expressions`, has been added to the Presto C++ sidecar, and a plugin has been created that can use Velox expression evaluation behind a standard `ExpressionOptimizer`.
Depends on Velox changes from facebookincubator/velox#13424, which add an `ExpressionOptimizer` for optimizing and constant folding `TypedExpr`s. The optimized Velox `core::TypedExpr` is converted to a Presto `protocol::RowExpression` in the Presto native sidecar with the helper class `VeloxToPrestoExprConverter`. The end-to-end flow between the coordinator and sidecar looks like:
[Figure: end-to-end flow between the coordinator and sidecar]
Please refer to this document for sidecar implementation details.
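As a reminder of what constant folding means in this context, here is a toy sketch over a tiny expression tree. It is an assumption-laden illustration only: `Expr`, `constant`, `variable`, `add`, and `fold` are hypothetical names, the tree supports a single integer addition node, and none of it reflects how the Velox `ExpressionOptimizer` is actually implemented.

```cpp
#include <cstdint>
#include <memory>
#include <string>

struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

// Toy expression node: an integer constant, a named variable, or an addition.
struct Expr {
  enum class Kind { kConstant, kVariable, kAdd };
  Kind kind = Kind::kConstant;
  int64_t value = 0;  // Used by kConstant.
  std::string name;   // Used by kVariable.
  ExprPtr left;       // Used by kAdd.
  ExprPtr right;      // Used by kAdd.
};

ExprPtr constant(int64_t value) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Kind::kConstant;
  e->value = value;
  return e;
}

ExprPtr variable(const std::string& name) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Kind::kVariable;
  e->name = name;
  return e;
}

ExprPtr add(ExprPtr left, ExprPtr right) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Kind::kAdd;
  e->left = std::move(left);
  e->right = std::move(right);
  return e;
}

// Folds bottom-up: an addition whose folded children are both constants is
// replaced by a single constant; anything else is rebuilt from its folded
// children and left for evaluation at run time.
ExprPtr fold(const ExprPtr& e) {
  if (e->kind != Expr::Kind::kAdd) {
    return e;
  }
  ExprPtr l = fold(e->left);
  ExprPtr r = fold(e->right);
  if (l->kind == Expr::Kind::kConstant && r->kind == Expr::Kind::kConstant) {
    return constant(l->value + r->value);
  }
  return add(l, r);
}
```

For example, folding `x + (2 + 3)` in this toy model yields `x + 5`: the constant subtree collapses while the part that depends on a variable is kept. In the PR, the analogous work happens on the sidecar, and the folded expression is then converted back to a `RowExpression` for the coordinator.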
Motivation and Context
Consistency between C++ and Java semantics. Support for using C++ functions during constant folding of expressions in the planner. Please refer to RFC-0006.
Test Plan
Tests have been added by extending the `TestRowExpressionInterpreter` class to also test native expression evaluation in `TestNativeExpressionOptimizer.java`. However, this feature is still in Beta; to support production workloads with confidence, a fuzzer will be created at a later time to surface any remaining bugs in the integration.
Unit tests for simple expression conversions are added in `VeloxToPrestoExprConverter.cpp`.
Release Notes