[HUDI-4424] add new compaction trigger strategy: NUM_COMMITS_AFTER_REQ… #6144
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -140,6 +140,17 @@ private Option<Pair<Integer, String>> getLatestDeltaCommitInfo() { | |
| return Option.empty(); | ||
| } | ||
|
|
||
| private Option<Pair<Integer, String>> getLatestDeltaCommitInfoSinceLastCompactionRequest() { | ||
| Option<Pair<HoodieTimeline, HoodieInstant>> deltaCommitsInfo = | ||
| CompactionUtils.getDeltaCommitsSinceLatestCompactionRequest(table.getActiveTimeline()); | ||
| if (deltaCommitsInfo.isPresent()) { | ||
| return Option.of(Pair.of( | ||
| deltaCommitsInfo.get().getLeft().countInstants(), | ||
|
Contributor
So the compaction scheduling would be affected by the progress of the compaction executions? Is that reasonable? I don't think so.
Contributor
Author
Currently, the NUM_COMMITS and TIME_ELAPSED compaction trigger strategies check the number of delta commits, or the time elapsed, since the last successful compaction. So if the offline compaction application crashes for a while (or async compaction is very slow), many compaction requests (one request per delta commit) accumulate in the timeline, and that hurts performance. This PR therefore provides a new strategy that checks against the last compaction request, if possible, instead of the last successful compaction.
Contributor
@danny0405 Any input or question? |
||
| deltaCommitsInfo.get().getRight().getTimestamp())); | ||
| } | ||
| return Option.empty(); | ||
| } | ||
|
|
||
| private boolean needCompact(CompactionTriggerStrategy compactionTriggerStrategy) { | ||
| boolean compactable; | ||
| // get deltaCommitsSinceLastCompaction and lastCompactionTs | ||
|
|
@@ -157,6 +168,18 @@ private boolean needCompact(CompactionTriggerStrategy compactionTriggerStrategy) | |
| LOG.info(String.format("The delta commits >= %s, trigger compaction scheduler.", inlineCompactDeltaCommitMax)); | ||
| } | ||
| break; | ||
| case NUM_COMMITS_AFTER_LAST_REQUEST: | ||
| latestDeltaCommitInfoOption = getLatestDeltaCommitInfoSinceLastCompactionRequest(); | ||
|
|
||
| if (!latestDeltaCommitInfoOption.isPresent()) { | ||
| return false; | ||
| } | ||
| latestDeltaCommitInfo = latestDeltaCommitInfoOption.get(); | ||
| compactable = inlineCompactDeltaCommitMax <= latestDeltaCommitInfo.getLeft(); | ||
| if (compactable) { | ||
| LOG.info(String.format("The delta commits >= %s since the last compaction request, trigger compaction scheduler.", inlineCompactDeltaCommitMax)); | ||
| } | ||
| break; | ||
| case TIME_ELAPSED: | ||
| compactable = inlineCompactDeltaSecondsMax <= parsedToSeconds(instantTime) - parsedToSeconds(latestDeltaCommitInfo.getRight()); | ||
| if (compactable) { | ||
|
|
||
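To make the author's argument in the review thread concrete, here is a minimal, self-contained sketch of the two counting rules. It is a toy model, not Hudi code: the `Kind` enum, `deltaCommitsSince`, and `simulate` are hypothetical names introduced for illustration, and the timeline is a plain list rather than a `HoodieTimeline`. It assumes, as described in the thread, that the compactor is stalled and never completes a compaction, and it simplifies the "if possible" fallback by counting every delta commit when no marker exists yet.

```java
import java.util.ArrayList;
import java.util.List;

public class CompactionTriggerSketch {

    /** Toy timeline event kinds (hypothetical; these are not Hudi classes). */
    enum Kind { DELTA_COMMIT, COMPACTION_REQUESTED, COMPACTION_COMPLETED }

    /**
     * Counts delta commits that come after the most recent event of the given
     * marker kind. If no such event exists, every delta commit is counted.
     */
    static int deltaCommitsSince(List<Kind> timeline, Kind marker) {
        int count = 0;
        for (int i = timeline.size() - 1; i >= 0; i--) {
            Kind kind = timeline.get(i);
            if (kind == marker) {
                break;
            }
            if (kind == Kind.DELTA_COMMIT) {
                count++;
            }
        }
        return count;
    }

    /**
     * Simulates a writer producing delta commits while the compactor is stalled
     * (no compaction ever completes) and returns how many compaction requests
     * end up scheduled on the timeline.
     */
    static int simulate(int deltaCommits, int maxDeltaCommits, boolean countFromLastRequest) {
        List<Kind> timeline = new ArrayList<>();
        // Assumption: the existing strategy counts from the last completed compaction,
        // the proposed one counts from the last compaction request.
        Kind marker = countFromLastRequest ? Kind.COMPACTION_REQUESTED : Kind.COMPACTION_COMPLETED;
        int requests = 0;
        for (int i = 0; i < deltaCommits; i++) {
            timeline.add(Kind.DELTA_COMMIT);
            if (deltaCommitsSince(timeline, marker) >= maxDeltaCommits) {
                timeline.add(Kind.COMPACTION_REQUESTED);
                requests++;
            }
        }
        return requests;
    }

    public static void main(String[] args) {
        // Counting from the last completed compaction: a request per delta commit once the threshold is hit.
        System.out.println(simulate(10, 3, false)); // prints 8
        // Counting from the last request: one request per three delta commits.
        System.out.println(simulate(10, 3, true));  // prints 3
    }
}
```

With ten delta commits and a threshold of three, the toy model schedules eight requests when counting from the last completed compaction and three when counting from the last request, which is the pile-up the author is trying to avoid.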