[HUDI-6389] Fix instant time check against the active timeline in meta sync #8991
Change Logs
#7561 introduced a bug where, in some cases, partition changes may be missed by meta sync. For example, consider the following active timeline:
and ts49.commit and ts48.commit are archived.
If ts47 is the last sync commit time, and either ts48.commit or ts49.commit contains partition changes, meta sync misses those changes.
The issue above is addressed by a separate PR, #8388, in which the Hive sync client returns the correct timeline, one that contains only write commits, for the check.
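To illustrate why checking against write commits only matters, here is a minimal, self-contained sketch. It does not use Hudi classes: `TimelineCheckSketch`, `Instant`, and both check methods are hypothetical names, and the active timeline in `main` (a `ts46` clean followed by a `ts50` commit) is invented for illustration rather than taken from this PR. The sketch assumes instants are sorted by timestamp and that a non-write instant older than the last sync time can remain in the active timeline.

```java
import java.util.List;

// Simplified, hypothetical model of a timeline; none of these types are Hudi classes.
class TimelineCheckSketch {

    // An instant is a timestamp plus an action such as "commit" or "clean".
    static final class Instant {
        final String ts;
        final String action;

        Instant(String ts, String action) {
            this.ts = ts;
            this.action = action;
        }

        // Assumption: only these actions count as write commits in this sketch.
        boolean isWriteCommit() {
            return action.equals("commit")
                || action.equals("deltacommit")
                || action.equals("replacecommit");
        }
    }

    // True if the last sync time is earlier than the first active instant of ANY action.
    static boolean isBeforeFirstInstant(List<Instant> active, String lastSyncTs) {
        return !active.isEmpty() && lastSyncTs.compareTo(active.get(0).ts) < 0;
    }

    // True if the last sync time is earlier than the first active WRITE commit.
    // Sketch choice: returns false when the active timeline has no write commits.
    static boolean isBeforeFirstWriteCommit(List<Instant> active, String lastSyncTs) {
        return active.stream()
            .filter(Instant::isWriteCommit)
            .findFirst()
            .map(first -> lastSyncTs.compareTo(first.ts) < 0)
            .orElse(false);
    }

    public static void main(String[] args) {
        // Invented active timeline; ts48.commit and ts49.commit are assumed archived.
        List<Instant> active = List.of(
            new Instant("ts46", "clean"),
            new Instant("ts50", "commit"));
        String lastSyncTs = "ts47";

        // false: the older clean instant hides the gap, so archived commits are skipped.
        System.out.println(isBeforeFirstInstant(active, lastSyncTs));
        // true: the archived timeline must also be consulted to find ts48 and ts49.
        System.out.println(isBeforeFirstWriteCommit(active, lastSyncTs));
    }
}
```

In this model, a `true` result signals that commits after the last sync time may have been archived, so the active timeline alone is not enough to detect partition changes.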
This PR makes sure the problematic API implementation (TimelineUtils.getCommitsTimelineAfter) is also fixed.
Existing tests are enhanced:
TestTimelineUtils.testGetCommitsTimelineAfter: this test fails before this PR and passes after it.
TestHiveSyncTool.testBasicSync: this test passes both before and after this PR on master. Note that without #8388 ([HUDI-5816] List all partitions as the fallback mechanism in Hive and Glue Sync), this test would fail without this PR.
Impact
Fixes a bug.
Risk level
low
Documentation Update
We need to update release notes on the regression.
Contributor's checklist