@kagamiori : Most of the common logic in those classes could have even broader applicability, so I moved it to the FuzzerUtil class. What remains are the flags etc. I notice that AggregationFuzzerBase covers flags as well. We can try something like that. Will modify the code shortly.
@kagamiori : Have added a RowNumberFuzzerBase class now. PTAL. Thanks!
kagamiori left a comment
Hi @aditi-pandit, the code looks good to me, except for a few comments on small refactorings. Thanks!
velox/exec/fuzzer/FuzzerUtil.cpp
Outdated
std::vector<RowVectorPtr> flatten(const std::vector<RowVectorPtr>& vectors) {
  std::vector<RowVectorPtr> flatVectors;
  for (const auto& vector : vectors) {
    auto flat = BaseVector::create<RowVector>(
        vector->type(), vector->size(), vector->pool());
    flat->copy(vector.get(), 0, 0, vector->size());
    flatVectors.push_back(flat);
  }

  return flatVectors;
}
nit: Can this method be replaced with BaseVector::flattenVector()?
@kagamiori : BaseVector::flattenVector flattens the original input vector, while this flatten method returns a separate output vector for the flattened list of input vectors. The original vector is retained and we want to run the fuzzer on the original input vectors only.
An option could be to copy the input vector and then use BaseVector::flattenVector(...) on the copied vector. But the rest of the stubbing around it makes it seem simpler to use the current flatten method. wdyt?
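To make the distinction in this reply concrete, here is a minimal sketch of "flatten in place" versus "flatten into a separate copy". It deliberately does not use Velox: `ToyVector`, `flattenInPlace`, and `flattenCopies` are hypothetical stand-ins invented for illustration, and the dictionary encoding is a simplified assumption, not the real `BaseVector` layout.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <vector>

// Hypothetical stand-in for an encoded vector; NOT the Velox BaseVector API.
struct ToyVector {
  std::vector<int> values; // underlying values
  std::optional<std::vector<std::size_t>> indices; // set => dictionary-encoded

  bool isFlat() const { return !indices.has_value(); }
  std::size_t size() const { return indices ? indices->size() : values.size(); }
  int valueAt(std::size_t i) const {
    return indices ? values[(*indices)[i]] : values[i];
  }
};
using ToyVectorPtr = std::shared_ptr<ToyVector>;

// Builds a flat vector holding the decoded values of 'vector'.
ToyVectorPtr makeFlatCopy(const ToyVector& vector) {
  auto flat = std::make_shared<ToyVector>();
  flat->values.reserve(vector.size());
  for (std::size_t i = 0; i < vector.size(); ++i) {
    flat->values.push_back(vector.valueAt(i));
  }
  return flat;
}

// In-place variant: the caller's pointer is replaced by the flat form,
// so the original encoding is no longer reachable through it.
void flattenInPlace(ToyVectorPtr& vector) {
  if (!vector->isFlat()) {
    vector = makeFlatCopy(*vector);
  }
}

// Copying variant: the inputs keep their encoding; flat copies are returned.
std::vector<ToyVectorPtr> flattenCopies(const std::vector<ToyVectorPtr>& vectors) {
  std::vector<ToyVectorPtr> flatVectors;
  flatVectors.reserve(vectors.size());
  for (const auto& vector : vectors) {
    flatVectors.push_back(makeFlatCopy(*vector));
  }
  return flatVectors;
}
```

With the copying variant, the encoded inputs can still be fed to the fuzzer while the flat copies are used only for logging, which is the property the reply above is after.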
@kagamiori : Nvm... After switching to the common logVectors method from AggregationFuzzerBase, I realize that this method isn't needed. See #12300.
velox/exec/fuzzer/FuzzerUtil.cpp
Outdated
// Disable testing with TableScan when input contains TIMESTAMP type, due to
// the issue #8127.
if (type->kind() == TypeKind::TIMESTAMP) {
  return false;
}
Hi @aditi-pandit, I remember that for running fuzzers with PQR, the issue #8127 can be resolved by adding -Duser.timezone=America/Los_Angeles to etc/jvm.config in the directory where you installed the Presto server (e.g., presto-server-0.284/etc/jvm.config). Could you please give it a try and let me know if it works?
@kagamiori : Have tried with the user.timezone config, and I don't see the errors either.
But of course it still fails in Velox CI here, so we'll have to change that separately and let this PR remain as is.
wdyt?
Hi @aditi-pandit, I believe we already set -Duser.timezone=America/Los_Angeles in the CI jobs here, so if you still see an error when enabling Timestamp type, that needs to be looked into.
For now, could we move if (type->kind() == TypeKind::TIMESTAMP) { return false; } to a local method only used in TopNRowNumberFuzzer.cpp, so that this won't affect other fuzzers that use this isTableScanSupported() utility method? Then you can look into the error caused by Timestamp type after this PR is merged.
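The suggestion above can be sketched roughly as follows: keep the shared predicate free of the TIMESTAMP restriction, and let one fuzzer wrap it with its own local check. This is only an illustration under stated assumptions. `ToyType`, `makeType`, `isTableScanSupportedShared`, `containsTimestamp`, and `isTableScanSupportedForTopN` are hypothetical names, the shared predicate is stubbed to accept everything, and the recursion over child types is an assumption about how nested types would be handled.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified type tree; NOT the Velox Type API.
enum class ToyKind { BIGINT, VARCHAR, TIMESTAMP, ARRAY, ROW };

struct ToyType {
  ToyKind kind;
  std::vector<std::shared_ptr<const ToyType>> children;
};
using ToyTypePtr = std::shared_ptr<const ToyType>;

ToyTypePtr makeType(ToyKind kind, std::vector<ToyTypePtr> children = {}) {
  return std::make_shared<const ToyType>(ToyType{kind, std::move(children)});
}

// Stand-in for the shared utility used by several fuzzers. Assumed here to
// accept every type, so other fuzzers are unaffected by the local check.
bool isTableScanSupportedShared(const ToyTypePtr& /*type*/) {
  return true;
}

// Local helper: true if the type or any nested child is TIMESTAMP.
bool containsTimestamp(const ToyTypePtr& type) {
  if (type->kind == ToyKind::TIMESTAMP) {
    return true;
  }
  for (const auto& child : type->children) {
    if (containsTimestamp(child)) {
      return true;
    }
  }
  return false;
}

// Fuzzer-local predicate: applies the TIMESTAMP restriction first, then
// defers to the shared utility.
bool isTableScanSupportedForTopN(const ToyTypePtr& type) {
  return !containsTimestamp(type) && isTableScanSupportedShared(type);
}
```

Once the Timestamp error is understood, only the local wrapper would need to change; callers of the shared predicate would not be touched.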
void RowNumberFuzzerBase::logInput(const std::vector<RowVectorPtr>& input) {
  if (VLOG_IS_ON(1)) {
    // Flatten inputs.
    const auto flatInput = test::flatten(input);
    VLOG(1) << "Input: " << input[0]->toString();
    for (const auto& v : flatInput) {
      VLOG(1) << std::endl << v->toString(0, v->size());
    }
  }
}
nit: AggregationFuzzerBase also has a logVectors() method. We can move that to a common place to reuse the code.
Check #12300. I'll update this PR after it is merged.
@kagamiori : Have updated this PR now that the refactoring PR is merged. PTAL.
LGTM. Let's look into the error caused by Timestamp type and enable Timestamp in a separate PR.
@kagamiori has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@kagamiori : Have moved the timestamp check code from FuzzerUtil to RowNumberFuzzerBase now. PTAL.
kagamiori left a comment
Hi @aditi-pandit, I got a few linter errors internally. Could you take a look and address them? Thanks!
@kagamiori : Thanks for sharing the warnings. Have fixed them. PTAL.
@kagamiori merged this pull request in 6632054.
The topNRowNumber node is an optimized plan node for SQL queries that use ranking window functions but keep only the top N results of each partition. This PR adds a TopNRowNumberFuzzer for plans with this plan node.
This fuzzer is closely modeled after the RowNumberFuzzer, so the common code is abstracted into a RowNumberFuzzerBase class, which is the parent class of both RowNumberFuzzer and TopNRowNumberFuzzer.
The fuzzer generates plans only for the row_number function right now. It will be enhanced to support the rank and dense_rank functions after #11554.
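For readers unfamiliar with the operator, the result such a plan is expected to produce can be sketched as: number the rows of each partition by sort order with row_number, then keep only rows whose number is at most N. The snippet below is a standalone reference computation, not Velox code. `Row`, `NumberedRow`, and `topNRowNumber` are hypothetical names, and a single integer partition key with a single ascending sort key is assumed for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical row shapes used only for this illustration.
struct Row {
  int64_t partitionKey;
  int64_t sortKey;
};

struct NumberedRow {
  int64_t partitionKey;
  int64_t sortKey;
  int64_t rowNumber; // 1-based position within the partition
};

// Reference semantics: row_number() over (partition by partitionKey
// order by sortKey), keeping only rows with rowNumber <= limit.
std::vector<NumberedRow> topNRowNumber(const std::vector<Row>& input, int64_t limit) {
  // Group sort keys by partition; std::map keeps partitions in key order.
  std::map<int64_t, std::vector<int64_t>> partitions;
  for (const auto& row : input) {
    partitions[row.partitionKey].push_back(row.sortKey);
  }

  std::vector<NumberedRow> result;
  for (auto& [key, sortKeys] : partitions) {
    std::sort(sortKeys.begin(), sortKeys.end());
    const auto keep =
        std::min<int64_t>(limit, static_cast<int64_t>(sortKeys.size()));
    for (int64_t i = 0; i < keep; ++i) {
      result.push_back({key, sortKeys[i], i + 1});
    }
  }
  return result;
}
```

A fuzzer could compare an optimized plan against a computation of this kind. A dedicated operator never needs to retain more than N rows per partition, whereas the sketch sorts each partition in full purely for clarity.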