[native] Add ANALYZE STATS support #20055
@karteekmurthys: Do you have an e2e test?
I will merge analyze tests with this PR. We cannot directly call internal functions as a user.
Yes, please add the analyze tests.
d6df1fd to
d2ad340
Nit : Please add a newline between the tests.
Nit : Please add a newline here.
I think this comment can be removed. Please just add a comment noting that the query returns a table with the columns (column_name, data_size, distinct_values_count, nulls_fraction, row_count, low_value, high_value).
...ecution/src/test/java/com/facebook/presto/nativeworker/AbstractTestNativeGeneralQueries.java
This comment is not needed.
Why is this ignored? Would you add a comment and perhaps refer to a GitHub issue that explains the problem?
@aditi-pandit @karteekmurthys Would you take a look at this question? I'm still wondering about this.
I have opened an issue for the problem I faced during testing. Please refer to L73.
mbasmanova
left a comment
@karteekmurthys Please make sure each commit represents a standalone change and has a [native] or [native pos] prefix. Make sure there are no commits like "address review comments". Please start commit titles with active verbs.
kAggregateFunctionsMap -> kFunctionNames for readability; 'Aggregate' is already part of the encoding function name.
/* Expected number of rows updated */
Please remove this comment. Adding such comments to each assertUpdate call will make the code unreadable.
What does this do? How do you verify the results of the SHOW STATS command?
CMIW, assertQuery compares the results of Presto and Prestissimo. Do I need to explicitly check the column values?
https://prestodb.io/docs/current/sql/show-stats.html
I'm thinking that you run ANALYZE on Prestissimo and therefore we don't know if it works or not. If it doesn't work, then SHOW STATS will show nothing for both Prestissimo and Presto and the test will pass. That's not what we want, is it?
Also, if ANALYZE works but computes stats incorrectly, the tests won't detect that, will they?
@mbasmanova: We have been thinking about what a more complete test would look like here to cover the situations you outlined.
i) One option is to compare the results of SHOW STATS with values expected in a static list. This list is populated with what we consider correct results.
ii) The Java side also does Statement object level tests. We don't have that framework in Prestissimo testing.
Right now the only guarantee is that the sequence of operations on either side is consistent.
But i) would make it more complete.
Wdyt?
@mbasmanova Aditi is talking about something like this: https://github.com/prestodb/presto/blob/master/presto-hive/src/test/java/com/facebook/presto/hive/TestHiveIntegrationSmokeTest.java#L4426
> i) One option is to compare the results of SHOW STATS with values expected in a static list. This list is populated with what we consider correct results.
Sounds reasonable to me. Thanks.
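A minimal sketch of option i), to show why comparing against a static list catches the failure modes raised above. Everything here is hypothetical: the class `ShowStatsCheck`, its helper methods, and the expected values are placeholders for illustration only, not part of the Presto test framework and not real ANALYZE results.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the Presto test framework.
final class ShowStatsCheck {
    // Placeholder expected rows in the SHOW STATS column order:
    // (column_name, data_size, distinct_values_count, nulls_fraction,
    //  row_count, low_value, high_value).
    // The values are illustrative only, not real results.
    static List<List<Object>> expectedStats() {
        return Arrays.asList(
                Arrays.<Object>asList("orderkey", null, 3.0, 0.0, null, "1", "3"),
                Arrays.<Object>asList(null, null, null, null, 3.0, null, null));
    }

    // Weak check: only verifies that the two engines agree with each other.
    static boolean enginesAgree(List<List<Object>> presto, List<List<Object>> prestissimo) {
        return presto.equals(prestissimo);
    }

    // Stronger check: verifies the actual rows against the static list.
    static boolean matchesExpected(List<List<Object>> actual) {
        return expectedStats().equals(actual);
    }

    public static void main(String[] args) {
        // Simulates ANALYZE silently doing nothing: SHOW STATS is empty on both sides.
        List<List<Object>> empty = Collections.emptyList();
        System.out.println(enginesAgree(empty, empty));        // the weak check passes
        System.out.println(matchesExpected(empty));            // the static check fails
        System.out.println(matchesExpected(expectedStats()));  // correct stats pass
    }
}
```

With empty results on both sides, the engine-to-engine comparison still passes, while the comparison against the static list fails. The same holds if ANALYZE computes wrong values consistently.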
7aab88c to
4d17b1f
mbasmanova
left a comment
@karteekmurthys Please review the list of commits and (1) decide whether you need 4 commits or whether 1 commit would be sufficient; (2) add a [native] or [native pos] prefix to all commit titles; (3) check the changes in each commit to make sure they match the commit message.
4d17b1f to
24879ed
I have squashed it down to a single commit now. Please review.
@mbasmanova would you please review this?
mbasmanova
left a comment
@karteekmurthys Looks good to me, but CI is red.
Curious, why doesn't this work for PoS? CC: @vermapratyush
mbasmanova
left a comment
@karteekmurthys It would be nice to update the commit message to include more details about this change.
ANALYZE requires the functions max_data_size_for_stats and sum_data_size_for_stats, which are implemented in Velox. In Presto, however, these are internal functions mapped to the presto.default.$internal$ namespace.
Translate the function names correctly during fragment translation to Velox to support "ANALYZE STATS".
Resolves: facebookincubator/velox#5447 and facebookincubator/velox#5484
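The name translation described in the commit message can be sketched as a simple lookup. This is only an illustration in Java: the actual change lives in the native worker's fragment translation code, and the class `InternalFunctionNames`, the method `toVeloxName`, and the target names on the right-hand side of the map are assumptions, not taken from the PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the mapping idea, not the PR's implementation.
final class InternalFunctionNames {
    private static final String INTERNAL_PREFIX = "presto.default.$internal$";

    // Presto-internal name -> name the function is assumed to be registered under in Velox.
    private static final Map<String, String> FUNCTION_NAMES = new HashMap<>();

    static {
        FUNCTION_NAMES.put(INTERNAL_PREFIX + "max_data_size_for_stats",
                "presto.default.max_data_size_for_stats"); // assumed target name
        FUNCTION_NAMES.put(INTERNAL_PREFIX + "sum_data_size_for_stats",
                "presto.default.sum_data_size_for_stats"); // assumed target name
    }

    // Returns the mapped name if one exists; otherwise the name is passed through unchanged.
    static String toVeloxName(String prestoName) {
        return FUNCTION_NAMES.getOrDefault(prestoName, prestoName);
    }

    public static void main(String[] args) {
        System.out.println(toVeloxName("presto.default.$internal$max_data_size_for_stats"));
        System.out.println(toVeloxName("presto.default.sum"));
    }
}
```

Names without an entry in the map pass through unchanged, so ordinary functions are unaffected by the translation step.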