fix: Expand operator with constants specified using ConstantVector#15684
fix: Expand operator with constants specified using ConstantVector#15684jinchengchenghh wants to merge 4 commits intofacebookincubator:mainfrom
Conversation
✅ Deploy Preview for meta-velox canceled.
|
d3e3537 to
c1bbbee
Compare
| @@ -86,16 +86,7 @@ RowVectorPtr Expand::getOutput() { | |||
|
|
|||
| for (auto i = 0; i < numColumns; ++i) { | |||
| if (rowProjection[i] == kConstantChannel) { | |||
| const auto& constantExpr = constantProjection[i]; | |||
| if (constantExpr->value().isNull()) { | |||
There was a problem hiding this comment.
This check is not correct
JkSelf
left a comment
There was a problem hiding this comment.
LGTM. Thanks for your fix.
|
Hi, @mbasmanova , could you help review this PR? Thanks |
velox/exec/tests/ExpandTest.cpp
Outdated
| auto constantExpr = std::make_shared<core::ConstantTypedExpr>(arrayVector); | ||
|
|
||
| auto plan = | ||
| PlanBuilder().values({data}).expand({{constantExpr}}, {"c1"}).planNode(); |
There was a problem hiding this comment.
Can we do something like
.expand({
{"k1", "k2", "a", "b", "[1, 2, 3] as c"},
})
There was a problem hiding this comment.
I receive the exception after updated to this
unknown file: Failure
C++ exception with description "Exception: VeloxUserError
Error Source: USER
Error Code: INVALID_ARGUMENT
Reason: Scalar function doesn't exist: list_value.
Retriable: False
Function: resolveScalarFunctionType
File: /Users/chengchengjin/code/velox/velox/parse/TypeResolver.cpp
Line: 97
TEST_F(ExpandTest, complexConstant) {
auto data = makeRowVectorData(1);
auto arrayVector = makeArrayVector<int32_t>({{1, 2, 3}});
// auto constantExpr = std::make_shared<core::ConstantTypedExpr>(arrayVector);
auto plan = PlanBuilder()
.values({data})
.expand({
{"k1", "k2", "a", "b", "[1, 2, 3] as c"},
})
.planNode();
auto children = data->children();
children.push_back(arrayVector);
auto expected = makeRowVector(children);
assertQuery(plan, expected);
}
There was a problem hiding this comment.
Can not use array_constructor also, the list_value is parsed as array_constructor in here, so even if we resolve the parse issue, we still cannot use the projection to test expand node.
velox/velox/parse/QueryPlanner.cpp
Line 87 in a886644
Error Code: INVALID_ARGUMENT
Reason: Unsupported projection expression in Expand plan node. Expected field reference or constant. Got: array_constructor(1,2,3)
Retriable: False
Expression: TypedExprs::isFieldAccess(columnProjection) || TypedExprs::isConstant(columnProjection)
Function: ExpandNode
File: /Users/chengchengjin/code/velox/velox/core/PlanNode.cpp
There was a problem hiding this comment.
This can work well, I will update to this, thanks for your suggestion!
TEST_F(ExpandTest, complexConstant) {
auto data = makeRowVectorData(1);
auto children = data->children();
auto arrayVector = makeArrayVector<int32_t>({{1, 2, 3}});
auto complexConstants = makeRowVector({arrayVector});
children.push_back(arrayVector);
auto expected = makeRowVector({"k1", "k2", "a", "b", "c0"}, children);
// auto constantExpr = std::make_shared<core::ConstantTypedExpr>(arrayVector);
auto plan = PlanBuilder()
.values({expected})
.expand({
{"k1", "k2", "a", "b", "__complex_constant(c0) as c"}
}, complexConstants)
.planNode();
assertQuery(plan, expected);
}
There was a problem hiding this comment.
I think I missed 'ARRAY' keyword in the original example. Something like this should work:
PlanBuilder(pool())
...
.expand({
{"k1", "k2", "a", "b", "ARRAY[1, 2, 3] as c"},
})
There was a problem hiding this comment.
Thanks! Updated and works well.
velox/exec/Expand.cpp
Outdated
| outputColumns[i] = BaseVector::createConstant( | ||
| constantExpr->type(), constantExpr->value(), numInput, pool()); | ||
| } | ||
| outputColumns[i] = constantProjection[i]->toConstantVector(pool()); |
There was a problem hiding this comment.
This fix looks good. I wonder if we can avoid making new vector for each output batch and instead create these once and reuse.
d9a2a1a to
e0d0b86
Compare
e0d0b86 to
b8c5993
Compare
velox/exec/tests/ExpandTest.cpp
Outdated
| auto children = otherColumns->children(); | ||
| auto arrayVector = makeArrayVector<int32_t>({{1, 2, 3}}); | ||
| children.push_back(arrayVector); | ||
| std::vector<VectorPtr> complexConstants(children.size()); |
There was a problem hiding this comment.
complexConstants not needed anymore
| auto plan = PlanBuilder(pool()) | ||
| .values({expected}) | ||
| .expand( | ||
| {{"k1", "k2", "a", "b", "ARRAY[1, 2, 3] as c"}}) | ||
| .planNode(); | ||
|
|
||
| assertQuery(plan, expected); |
There was a problem hiding this comment.
It is a bit strange that 'expected' is used as both the input and output of the plan; is this intentional?
|
|
||
| const auto& rowProjection = fieldProjections_[rowIndex_]; | ||
| const auto& constantProjection = constantProjections_[rowIndex_]; | ||
| const auto& constantProjection = constantOutputs_[rowIndex_]; |
There was a problem hiding this comment.
constantOutputs_ contains vectors of size 1, but here we need to produce a vector of size numInput
I just realized that we cannot reuse the vector and may need to call BaseVector::wrapInConstant(numInput, 0, constantOutputs_[rowIndex_]).
velox/exec/tests/ExpandTest.cpp
Outdated
| } // anonymous namespace | ||
|
|
||
| TEST_F(ExpandTest, complexConstant) { | ||
| auto otherColumns = makeRowVectorData(1); |
There was a problem hiding this comment.
Let's use size > 1. Otherwise, we won't be able to catch a bug mentioned above.
velox/exec/tests/ExpandTest.cpp
Outdated
|
|
||
| auto plan = PlanBuilder(pool()) | ||
| .values({data}) | ||
| .expand({{"k1", "k2", "a", "b", "ARRAY[1, 2, 3] as c"}}) |
There was a problem hiding this comment.
Should we also test null value?
null::integer[]
|
After add more tests, I receive the exception which requires not allocate memory in Operator constructor, so I move the memory allocation to initialize CC @mbasmanova |
|
@jinchengchenghh I just imported the PR to merge and while I wait for all approvals and signals to pass, can you please take a look at the test failures from the CI signals above and fix if relevant? |
|
Could you help import it? I have resolved the failed CI. Thanks! @kKPulla |
…15684)
Summary:
The constantExpr->value() only exists when the constant is of scalar type, consider the value vector for complex data type.
```
-- Expand[1][[1, null, null, n0_1, n0_2, n0_3, n0_4], [4, n0_0, {-1, {{-1}, {k => v}}}, null, null, null, null]] -> n1_0:INTEGER, n1_1:INTEGER, n1_2:ROW<c1:INTEGER,c2:ROW<a:ARRAY<INTEGER>,m:MAP<VARCHAR,VARCHAR>>>, n1_3:VARCHAR, n1_4:BIGINT, n1_5:INTEGER, n1_6:ROW<>
-- ValueStream[0][] -> n0_0:INTEGER, n0_1:VARCHAR, n0_2:BIGINT, n0_3:INTEGER, n0_4:ROW<>
```
Returns null for complex constant before this fix.
```
VeloxColumnarToRowConverter 0: {4, 1, null, null, null, ...2 more}
```
Create the constant vector in Expand operator constructor to reuse it.
Pull Request resolved: facebookincubator/velox#15684
Reviewed By: kagamiori
Differential Revision: D89665504
Pulled By: kKPulla
fbshipit-source-id: d0167e35a1ee2a8a8e0230a27dfe9fd0e424f923
The constantExpr->value() only exists when the constant is of scalar type, consider the value vector for complex data type.
Returns null for complex constant before this fix.
Create the constant vector in Expand operator constructor to reuse it.