Drop the check regarding the presence of all computed partition statistics#15945

Merged
findepi merged 1 commit into trinodb:master from findinpath:findinpath/hive-stats
Mar 7, 2023
Conversation

@findinpath
Contributor

Description

While the statistics are being computed, the number of partitions in the table may change (partitions can be added or dropped). Avoid failing the ANALYZE operation when the number of computed partition statistics differs from the number of existing table partitions.

Additional context and related issues

In some situations, an ANALYZE operation on a table fails with the message

All computed statistics must be used

This exception is likely to appear if the number of partitions in the table changes between the start and the end of the partition statistics computation.
Avoid the failure by dropping the verification that the number of computed partition statistics equals the number of existing table partitions.
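
The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the connector code: the class `PartitionStatsMergeSketch`, the method `merge`, the `requireAllUsed` flag, and the use of plain row counts keyed by partition name are all hypothetical simplifications. It only assumes that computed statistics are matched against the partitions that exist when ANALYZE finishes, and that the old check required every computed entry to find a match.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical, simplified model of merging computed statistics into the
// partitions that currently exist. Names and types are illustrative only.
final class PartitionStatsMergeSketch {
    private PartitionStatsMergeSketch() {}

    static Map<String, OptionalLong> merge(
            List<String> currentPartitions,
            Map<String, Long> computedRowCounts,
            boolean requireAllUsed) {
        Map<String, OptionalLong> merged = new LinkedHashMap<>();
        int usedComputedStatistics = 0;
        for (String partition : currentPartitions) {
            Long rowCount = computedRowCounts.get(partition);
            if (rowCount == null) {
                // Partition was added after the computation started: no statistics yet.
                merged.put(partition, OptionalLong.empty());
            }
            else {
                usedComputedStatistics++;
                merged.put(partition, OptionalLong.of(rowCount));
            }
        }
        // Strict mode models the removed check: statistics computed for a
        // partition that was dropped in the meantime are never "used".
        if (requireAllUsed && usedComputedStatistics != computedRowCounts.size()) {
            throw new IllegalStateException("All computed statistics must be used");
        }
        return merged;
    }

    public static void main(String[] args) {
        // ds=2023-01-02 was dropped and ds=2023-01-03 was added while ANALYZE ran.
        List<String> current = List.of("ds=2023-01-01", "ds=2023-01-03");
        Map<String, Long> computed = Map.of("ds=2023-01-01", 10L, "ds=2023-01-02", 20L);

        System.out.println(merge(current, computed, false));
        try {
            merge(current, computed, true);
        }
        catch (IllegalStateException e) {
            System.out.println("strict merge failed: " + e.getMessage());
        }
    }
}
```

In this sketch the lenient call simply ignores the statistics computed for the dropped partition and leaves the newly added partition without statistics, which corresponds to the behavior this PR aims for.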

Release notes

(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

@cla-bot cla-bot bot added the cla-signed label Feb 2, 2023
@findinpath findinpath added the no-release-notes This pull request does not require release notes entry label Feb 2, 2023
@findinpath
Contributor Author

Please do not merge this one yet.
It is not yet clear whether the removed check is actually causing problems.

@findinpath findinpath closed this Feb 6, 2023
@findinpath findinpath reopened this Feb 7, 2023
@findinpath
Contributor Author

I'll continue with this PR after landing #15995

…stics

While computing the statistics, it may very well happen that the number
of partitions in the table will change (partitions get added/dropped).
Avoid failing the `ANALYZE` operation because of inequality of
the number of computed partition statistics vs existing table partitions.
@findinpath findinpath force-pushed the findinpath/hive-stats branch from 7245a15 to b16fd74 Compare February 28, 2023 06:00
@findinpath findinpath requested review from ebyhr and findepi February 28, 2023 06:01
Supplier<PartitionStatistics> emptyPartitionStatistics = Suppliers.memoize(() -> createEmptyPartitionStatistics(columnTypes, columnStatisticTypes));

int usedComputedStatistics = 0;
List<Type> partitionTypes = handle.getPartitionColumns().stream()
Contributor Author

The PR is not accompanied by corresponding tests because I don't see a way to consistently simulate a situation where partitions are actually removed during the computation of statistics.

@findepi
Member

findepi commented Feb 28, 2023

@ebyhr ptal

@findepi findepi merged commit 41c9336 into trinodb:master Mar 7, 2023
@github-actions github-actions bot added this to the 410 milestone Mar 7, 2023
Labels

cla-signed no-release-notes This pull request does not require release notes entry

3 participants