
fix: Expand operator with constants specified using ConstantVector #15684

Closed
jinchengchenghh wants to merge 4 commits into facebookincubator:main from jinchengchenghh:fixExpand


@jinchengchenghh jinchengchenghh commented Dec 3, 2025

constantExpr->value() is only set when the constant is of scalar type; for complex data types, the value vector must be used instead.

-- Expand[1][[1, null, null, n0_1, n0_2, n0_3, n0_4], [4, n0_0, {-1, {{-1}, {k => v}}}, null, null, null, null]] -> n1_0:INTEGER, n1_1:INTEGER, n1_2:ROW<c1:INTEGER,c2:ROW<a:ARRAY<INTEGER>,m:MAP<VARCHAR,VARCHAR>>>, n1_3:VARCHAR, n1_4:BIGINT, n1_5:INTEGER, n1_6:ROW<>
  -- ValueStream[0][] -> n0_0:INTEGER, n0_1:VARCHAR, n0_2:BIGINT, n0_3:INTEGER, n0_4:ROW<>

Before this fix, Expand returned null for a complex constant.

VeloxColumnarToRowConverter 0: {4, 1, null, null, null, ...2 more}

Create the constant vectors in the Expand operator constructor so that they can be reused.
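
The failure mode described above can be illustrated with a minimal, self-contained sketch. It does not use Velox types: `ConstantExprSketch`, `isNullScalarOnly`, and `isNullChecked` are hypothetical names invented for this illustration, and the model assumes only what the description states, namely that a complex constant has no scalar value and carries a value vector instead.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <vector>

// Hypothetical stand-in for a constant expression (not a Velox type).
// A scalar constant sets `scalar`; a complex constant (for example an
// array) leaves `scalar` empty and sets `valueVector` instead.
struct ConstantExprSketch {
  std::optional<int> scalar;
  std::shared_ptr<std::vector<int>> valueVector;
};

// Pre-fix style check: looks only at the scalar slot, so every complex
// constant is misreported as null.
inline bool isNullScalarOnly(const ConstantExprSketch& expr) {
  return !expr.scalar.has_value();
}

// Post-fix style check: a constant is null only when neither the value
// vector nor the scalar is present.
inline bool isNullChecked(const ConstantExprSketch& expr) {
  if (expr.valueVector != nullptr) {
    return false;
  }
  return !expr.scalar.has_value();
}
```

With an array constant, `isNullScalarOnly` reports null while `isNullChecked` does not, which mirrors the null column shown in the output above.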


netlify bot commented Dec 3, 2025

Deploy Preview for meta-velox canceled.

Latest commit: c9e3e87
Latest deploy log: https://app.netlify.com/projects/meta-velox/deploys/694a1660b6c0390008714d66

@meta-cla meta-cla bot added the CLA Signed label Dec 3, 2025
@@ -86,16 +86,7 @@ RowVectorPtr Expand::getOutput() {

for (auto i = 0; i < numColumns; ++i) {
if (rowProjection[i] == kConstantChannel) {
const auto& constantExpr = constantProjection[i];
if (constantExpr->value().isNull()) {
Collaborator Author

This check is not correct

Collaborator

@JkSelf JkSelf left a comment

LGTM. Thanks for your fix.

@jinchengchenghh
Collaborator Author

Hi @mbasmanova, could you help review this PR? Thanks.

auto constantExpr = std::make_shared<core::ConstantTypedExpr>(arrayVector);

auto plan =
PlanBuilder().values({data}).expand({{constantExpr}}, {"c1"}).planNode();
Contributor

Can we do something like

.expand({
   {"k1", "k2", "a", "b", "[1, 2, 3] as c"},
}) 

Collaborator Author

I get the following exception after updating the test to this:

unknown file: Failure
C++ exception with description "Exception: VeloxUserError
Error Source: USER
Error Code: INVALID_ARGUMENT
Reason: Scalar function doesn't exist: list_value.
Retriable: False
Function: resolveScalarFunctionType
File: /Users/chengchengjin/code/velox/velox/parse/TypeResolver.cpp
Line: 97
TEST_F(ExpandTest, complexConstant) {
  auto data = makeRowVectorData(1);

  auto arrayVector = makeArrayVector<int32_t>({{1, 2, 3}});
//   auto constantExpr = std::make_shared<core::ConstantTypedExpr>(arrayVector);

  auto plan = PlanBuilder()
                  .values({data})
                  .expand({
                      {"k1", "k2", "a", "b", "[1, 2, 3] as c"},
                  })
                  .planNode();
  auto children = data->children();
  children.push_back(arrayVector);
  auto expected = makeRowVector(children);
  assertQuery(plan, expected);
}

Collaborator Author

@jinchengchenghh jinchengchenghh Dec 22, 2025

array_constructor cannot be used either. list_value is mapped to array_constructor here, so even if we resolve the parsing issue, we still cannot use a projection to test the Expand node.

{"list_value", "array_constructor"},

Error Code: INVALID_ARGUMENT
Reason: Unsupported projection expression in Expand plan node. Expected field reference or constant. Got: array_constructor(1,2,3) 
Retriable: False
Expression: TypedExprs::isFieldAccess(columnProjection) || TypedExprs::isConstant(columnProjection)
Function: ExpandNode
File: /Users/chengchengjin/code/velox/velox/core/PlanNode.cpp

Collaborator Author

This works well. I will update to this; thanks for your suggestion!

TEST_F(ExpandTest, complexConstant) {
  auto data = makeRowVectorData(1);
  auto children = data->children();
  auto arrayVector = makeArrayVector<int32_t>({{1, 2, 3}});
  auto complexConstants = makeRowVector({arrayVector});
  children.push_back(arrayVector);
  auto expected = makeRowVector({"k1", "k2", "a", "b", "c0"}, children);

//   auto constantExpr = std::make_shared<core::ConstantTypedExpr>(arrayVector);

  auto plan = PlanBuilder()
                  .values({expected})
                  .expand({
                      {"k1", "k2", "a", "b", "__complex_constant(c0) as c"}
                  }, complexConstants)
                  .planNode();

  assertQuery(plan, expected);
}

Contributor

I think I missed the 'ARRAY' keyword in the original example. Something like this should work:

PlanBuilder(pool())
...
.expand({
   {"k1", "k2", "a", "b", "ARRAY[1, 2, 3] as c"},
}) 

Collaborator Author

Thanks! Updated and works well.

outputColumns[i] = BaseVector::createConstant(
constantExpr->type(), constantExpr->value(), numInput, pool());
}
outputColumns[i] = constantProjection[i]->toConstantVector(pool());
Contributor

This fix looks good. I wonder if we can avoid making a new vector for each output batch and instead create these once and reuse them.

auto children = otherColumns->children();
auto arrayVector = makeArrayVector<int32_t>({{1, 2, 3}});
children.push_back(arrayVector);
std::vector<VectorPtr> complexConstants(children.size());
Contributor

complexConstants is not needed anymore.

Comment on lines +49 to +55
auto plan = PlanBuilder(pool())
.values({expected})
.expand(
{{"k1", "k2", "a", "b", "ARRAY[1, 2, 3] as c"}})
.planNode();

assertQuery(plan, expected);
Contributor

It is a bit strange that 'expected' is used as both the input and output of the plan; is this intentional?


const auto& rowProjection = fieldProjections_[rowIndex_];
const auto& constantProjection = constantProjections_[rowIndex_];
const auto& constantProjection = constantOutputs_[rowIndex_];
Contributor

constantOutputs_ contains vectors of size 1, but here we need to produce a vector of size numInput.

I just realized that we cannot reuse the vector and may need to call BaseVector::wrapInConstant(numInput, 0, constantOutputs_[rowIndex_]).
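
The idea behind wrapping a size-1 vector as a constant of size numInput can be sketched without Velox. The following is only an illustrative model: `ConstantViewSketch` and `wrapInConstantSketch` are hypothetical names, and the sketch assumes that the wrapper makes every logical row refer to one row of a shared base without copying it. It is not the implementation of BaseVector::wrapInConstant.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

using ArrayValue = std::vector<int>;

// Hypothetical constant wrapper (not a Velox type): logically `size` rows,
// all of which refer to row `index` of a shared base. The base itself is
// never copied or resized.
struct ConstantViewSketch {
  std::size_t size;
  std::size_t index;
  std::shared_ptr<const std::vector<ArrayValue>> base;

  // Every valid row resolves to the same element of the base.
  const ArrayValue& valueAt(std::size_t row) const {
    assert(row < size);
    return (*base)[index];
  }
};

// Builds a view of `size` rows over base[index]. The base can be created
// once and shared across batches; only the cheap view is built per batch.
inline ConstantViewSketch wrapInConstantSketch(
    std::size_t size,
    std::size_t index,
    std::shared_ptr<const std::vector<ArrayValue>> base) {
  return ConstantViewSketch{size, index, std::move(base)};
}
```

In this model, a test that feeds only one input row cannot distinguish the size-1 base from a correctly sized view, which is why an input with more than one row is needed to expose the size mismatch.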

} // anonymous namespace

TEST_F(ExpandTest, complexConstant) {
auto otherColumns = makeRowVectorData(1);
Contributor

Let's use size > 1. Otherwise, we won't be able to catch the bug mentioned above.


auto plan = PlanBuilder(pool())
.values({data})
.expand({{"k1", "k2", "a", "b", "ARRAY[1, 2, 3] as c"}})
Contributor

@mbasmanova mbasmanova Dec 22, 2025

Should we also test a null value?

null::integer[]

Contributor

@mbasmanova mbasmanova left a comment

Thanks.

@mbasmanova mbasmanova added the ready-to-merge label Dec 22, 2025
@mbasmanova mbasmanova changed the title from "fix: Fix Expand operator for constant vector with value vector" to "fix: Expand operator with constants specified using ConstantVector" Dec 22, 2025
@jinchengchenghh
Collaborator Author

After adding more tests, I got the exception below, which requires that no memory be allocated in the Operator constructor, so I moved the memory allocation to initialize. CC @mbasmanova

E20251222 16:23:01.206250 24729043 Exceptions.h:53] Line: /Users/chengchengjin/code/velox/velox/exec/Task.cpp:1054, Function:createAndStartDrivers, Expression:  Unexpected memory pool allocations during task[test_cursor_1] driver initialization: 
Top 1 leaf memory pool usages:
    op.1.0.0.Expand usage 768B reserved 1.00MB peak 768B

query.TaskCursorQuery_0.0 usage 768B reserved 1.00MB peak 1.00MB
    task.test_cursor_1 usage 768B reserved 1.00MB peak 1.00MB
        node.1 usage 768B reserved 1.00MB peak 1.00MB
            op.1.0.0.Expand usage 768B reserved 1.00MB peak 768B

, Source: RUNTIME, ErrorCode: INVALID_STATE
unknown file: Failure
C++ exception with description "Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Unexpected memory pool allocations during task[test_cursor_1] driver initialization: 
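
The change described here, deferring allocation from the constructor to an initialization step, can be sketched generically as follows. `ExpandSketch` is a hypothetical class written for illustration only; it assumes a framework that constructs operators first and calls `initialize()` once afterwards, and it does not reproduce the Velox Operator interface or its memory pools.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical operator (not the Velox Expand class). The constructor only
// records what must be built; buffers are allocated in initialize(), which
// this sketch assumes the framework calls once after construction.
class ExpandSketch {
 public:
  explicit ExpandSketch(std::size_t numConstants)
      : numConstants_(numConstants) {}

  // Allocates the constant buffers on the first call; later calls are no-ops.
  void initialize() {
    if (initialized_) {
      return;
    }
    constants_.assign(numConstants_, std::vector<int>{1, 2, 3});
    initialized_ = true;
  }

  // True once any constant buffer has been allocated.
  bool hasAllocated() const {
    return !constants_.empty();
  }

  std::size_t numConstants() const {
    return constants_.size();
  }

 private:
  std::size_t numConstants_;
  bool initialized_{false};
  std::vector<std::vector<int>> constants_;
};
```

Constructing the object allocates nothing for the constants, so a check that forbids allocations during construction would pass in this model; the buffers appear only after `initialize()` runs.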


meta-codesync bot commented Dec 22, 2025

@kKPulla has imported this pull request. If you are a Meta employee, you can view this in D89665504.

Contributor

kKPulla commented Dec 22, 2025

@jinchengchenghh I just imported the PR to merge and while I wait for all approvals and signals to pass, can you please take a look at the test failures from the CI signals above and fix if relevant?

@jinchengchenghh
Collaborator Author

Could you help import it? I have resolved the failed CI. Thanks! @kKPulla

@meta-codesync meta-codesync bot closed this in 45093a7 Dec 23, 2025

meta-codesync bot commented Dec 23, 2025

@kKPulla merged this pull request in 45093a7.

WangGuangxin pushed a commit to WangGuangxin/bolt that referenced this pull request Mar 13, 2026
…15684)

Summary:
The constantExpr->value() only exists when the constant is of scalar type, consider the value vector for complex data type.
```
-- Expand[1][[1, null, null, n0_1, n0_2, n0_3, n0_4], [4, n0_0, {-1, {{-1}, {k => v}}}, null, null, null, null]] -> n1_0:INTEGER, n1_1:INTEGER, n1_2:ROW<c1:INTEGER,c2:ROW<a:ARRAY<INTEGER>,m:MAP<VARCHAR,VARCHAR>>>, n1_3:VARCHAR, n1_4:BIGINT, n1_5:INTEGER, n1_6:ROW<>
  -- ValueStream[0][] -> n0_0:INTEGER, n0_1:VARCHAR, n0_2:BIGINT, n0_3:INTEGER, n0_4:ROW<>
```
Returns null for complex constant before this fix.
```
VeloxColumnarToRowConverter 0: {4, 1, null, null, null, ...2 more}
```

Create the constant vector in Expand operator constructor to reuse it.

Pull Request resolved: facebookincubator/velox#15684

Reviewed By: kagamiori

Differential Revision: D89665504

Pulled By: kKPulla

fbshipit-source-id: d0167e35a1ee2a8a8e0230a27dfe9fd0e424f923