Prepare broken column statistics table using Hive metastore database directly #13949
plugin/trino-hive/src/test/java/io/trino/plugin/hive/TestHiveAnalyzeCorruptStatistics.java
prepareBrokenColumnStatisticsTable may stop functioning and we will never find out.
Perhaps retry the method instead?
Also, why is ANALYZE expected to fail? Eventually it should work. As a user, if I have some bad stats, I'd expect ANALYZE to fix them.
> Perhaps retry the method instead?
The test already has invocationCount = 3. I would rather rely on that option than add a retry.
> Also, why is ANALYZE expected to fail? Eventually it should work. As a user, if I have some bad stats, I'd expect ANALYZE to fix them.
I think we agreed to fix only the read path in case of broken statistics. Sorry if I misunderstood the conclusion. That's why there's assertThatThrownBy(() -> query("ANALYZE " + tableName)) at L72. As far as I can tell from what I tried, the Thrift Hive metastore client has no option for updating or dropping column statistics when they have duplicated entries.
> I think we agreed to fix only the read path in case of broken statistics
That's fine, but we may want to fix ANALYZE too, so I wouldn't want to use ANALYZE in the test as a way to check that stats are broken.
> The test already has invocationCount = 3.
This means that it's attempted 3 times and all 3 attempts need to pass. It doesn't yet handle the case where setup didn't succeed in producing the erroneous situation.
Why not retry prepareBrokenColumnStatisticsTable until stats are broken?
Also, can we inject broken stats into the HMS database directly? That could free us from any concurrency in the test.
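A minimal sketch of the "retry until stats are broken" idea, not code from this PR. The class RetryUntilBroken and the method prepareUntil are hypothetical names. The caller supplies the check for the broken state, so the check does not have to depend on ANALYZE failing.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not part of Trino or this PR.
final class RetryUntilBroken
{
    private RetryUntilBroken() {}

    /**
     * Runs {@code setup}, then evaluates {@code isBroken}. Repeats until the
     * check passes or {@code maxAttempts} is exhausted.
     *
     * @return the number of attempts that were needed
     * @throws IllegalStateException if the broken state was never produced,
     *         so a setup that silently stops working fails the test loudly
     */
    static int prepareUntil(Runnable setup, BooleanSupplier isBroken, int maxAttempts)
    {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be positive");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            setup.run();
            if (isBroken.getAsBoolean()) {
                return attempt;
            }
        }
        throw new IllegalStateException(
                "Setup did not produce the broken state after " + maxAttempts + " attempts");
    }
}
```

In the test, setup would wrap prepareBrokenColumnStatisticsTable and isBroken would inspect the stored statistics by some means other than ANALYZE; both are left to the caller here.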
Yes, we can break the MySQL database directly. I will switch to that approach.
The reason I avoided it is that a future HMS may not have the duplicated stats issue, and direct database modification may hide such changes.
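One way to picture direct modification is a statement that copies the existing statistics rows back into the same table under a new id. The sketch below only builds that SQL text; it is not the code in this PR. DuplicateRowSql and duplicateRows are hypothetical names, and all table and column names are supplied by the caller because the metastore schema varies between HMS versions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Trino or this PR.
final class DuplicateRowSql
{
    private DuplicateRowSql() {}

    /**
     * Builds an INSERT ... SELECT that re-inserts every row matching the filter
     * with its id increased by one, producing duplicated entries.
     * ASSUMPTION: id + 1 is not already in use; a real test must verify that.
     * The filter value is left as a '?' placeholder for a PreparedStatement.
     */
    static String duplicateRows(String table, String idColumn, List<String> copiedColumns, String filterColumn)
    {
        List<String> targets = new ArrayList<>();
        targets.add(idColumn);
        targets.addAll(copiedColumns);

        List<String> sources = new ArrayList<>();
        sources.add(idColumn + " + 1");
        sources.addAll(copiedColumns);

        return "INSERT INTO " + table + " (" + String.join(", ", targets) + ") "
                + "SELECT " + String.join(", ", sources) + " FROM " + table
                + " WHERE " + filterColumn + " = ?";
    }
}
```

A call such as duplicateRows("TAB_COL_STATS", "CS_ID", List.of("DB_NAME", "TABLE_NAME"), "TABLE_NAME") shows the shape. Those names are assumptions about the metastore schema, and a real insert would have to list every required column for the HMS version in use.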
I agree that the previous approach is generally good. However, if it's not reliable, we may need to change the approach. I prefer to change the approach rather than to disable the test.
Description
Fixes #13889
Documentation
(x) No documentation is needed.
Release notes
(x) No release notes entries required.