HIVE-28207: NullPointerException is thrown when checking column uniqueness #5207
why is it returning
@ayushtkn Calcite can express three states: "definitely unique", "definitely not unique", or unknown. Null stands for the last case. When the result is null, this PR assumes the more general case, i.e. not unique.
I may also check why and when it returns null, in case that is unexpected.
zabetak
left a comment
Overall LGTM. I think it makes sense to add checks for nulls when necessary since from an API perspective it is a valid value. I left some minor comments in .q files but other than that I am fine to merge this change.
Independently (maybe as a separate JIRA ticket) it may be worth checking why the null value appears in the first place and tackle that problem as well since it could lead to better query plans. In many cases the NPE raised in metadata can be treated by finding the origin of the null value.
ql/src/test/queries/clientpositive/cbo_row_count_non_simple_filter.q
I attached a remote debugger and found
I will follow zabetak's comments and update the PR.
ayushtkn
left a comment
LGTM
CI succeeded 💯
…eness (Shohei Okumiya reviewed by Stamatis Zampetakis, Ayush Saxena) Close apache/hive#5207




What changes were proposed in this pull request?
Add a null check when using RelMetadataQuery#areColumnsUnique.
https://issues.apache.org/jira/browse/HIVE-28207
Why are the changes needed?
RelMetadataQuery#areColumnsUnique returns true, false, or null. Some of our implementations skip testing whether it is null and throw NPE.
Does this PR introduce any user-facing change?
No. It is a bug fix to resolve NPE.
Is the change a dependency upgrade?
No.
How was this patch tested?
I added integration tests.
Without this patch, cbo_join_constraints.q fails with NPE thrown by HiveJoinConstraintsRule. cbo_row_count_non_simple_filter.q hits the problem of HiveJoinConstraintsRule and HiveRelMdRowCount.
I have not found a real case where the issue happens with HiveAggregateSplitRule, but I modified it as it is likely to hit the same problem.
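For illustration, the kind of null check described above can be sketched in plain Java. This is not the code from the patch: the class `UniquenessCheck` and the method `isDefinitelyUnique` are hypothetical names, and the `Boolean` argument only stands in for a three-state result such as the one returned by RelMetadataQuery#areColumnsUnique. The sketch treats null (unknown) the same way as false, which matches the "assume not unique" choice discussed in the conversation.

```java
// Minimal sketch, assuming a three-state Boolean result (TRUE, FALSE, or null).
// UniquenessCheck and isDefinitelyUnique are hypothetical names for illustration.
final class UniquenessCheck {

    // Returns true only when the metadata result is exactly TRUE.
    // Boolean.TRUE.equals(null) is false, so null (unknown) is handled
    // like FALSE and no auto-unboxing of null ever happens.
    static boolean isDefinitelyUnique(Boolean areColumnsUnique) {
        return Boolean.TRUE.equals(areColumnsUnique);
    }

    public static void main(String[] args) {
        Boolean unknown = null; // stand-in for an "unknown" metadata answer

        // Unsafe pattern: `if (unknown) { ... }` would auto-unbox null
        // and throw a NullPointerException.

        // Null-safe pattern:
        if (isDefinitelyUnique(unknown)) {
            System.out.println("columns are unique");
        } else {
            System.out.println("columns are not known to be unique");
        }
    }
}
```

Comparing with `Boolean.TRUE.equals(...)` keeps the check to a single expression and makes the conservative default explicit at each call site.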