-
Notifications
You must be signed in to change notification settings - Fork 3k
Core, API: Support cleanupMode in snapshot expiration #14287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
fda14eb
ddd45ce
dc0d38d
4ffe632
96288f0
22ad3ad
61dbd6f
e27f60c
6f08f99
bf8f4c1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -69,7 +69,6 @@ class RemoveSnapshots implements ExpireSnapshots { | |
| private final Set<Long> idsToRemove = Sets.newHashSet(); | ||
| private final long now; | ||
| private final long defaultMaxRefAgeMs; | ||
| private boolean cleanExpiredFiles = true; | ||
| private TableMetadata base; | ||
| private long defaultExpireOlderThan; | ||
| private int defaultMinNumSnapshots; | ||
|
|
@@ -79,6 +78,8 @@ class RemoveSnapshots implements ExpireSnapshots { | |
| private Boolean incrementalCleanup; | ||
| private boolean specifiedSnapshotId = false; | ||
| private boolean cleanExpiredMetadata = false; | ||
| private boolean cleanExpiredFiles = true; | ||
| private CleanupLevel cleanupLevel = CleanupLevel.ALL; | ||
|
|
||
| RemoveSnapshots(TableOperations ops) { | ||
| this.ops = ops; | ||
|
|
@@ -103,7 +104,12 @@ class RemoveSnapshots implements ExpireSnapshots { | |
|
|
||
| @Override | ||
| public ExpireSnapshots cleanExpiredFiles(boolean clean) { | ||
| Preconditions.checkArgument( | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure about throwing here, why not just override whatever the cleanup mode was set to?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks Eduard, I added such validation mainly want to prevent double intention where one set both the cleanExpired files as well as cleanupMode, so intention is not clear and override might be risky, this could lead to some unexpected results so throw early maybe helpful. This can be found more in my unit test named
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it seems ok to throw an exception here because we don't want to users to call both setters. only one should be used.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. but the condition may not be precise/perfect. e.g., if the client called the other way is to default maybe the complexity is not worth it. I am wondering if it is simpler to just go with what @nastra suggested. just rely on API deprecation to move users away from the old API.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think there's some nuance here and value to keep: Take what I included in the unit test for an example here, use preconditions check here prevent the case where both Although unlikely, for the corner case discussed here when
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the current impleemntation is reasonable so long as there's no breakages when someone upgrades to 1.11 and using the deprecated API, which it looks like there's not since the condition is based on whether someone additionally set the new API. I also prefer the approach of failing if a non-default cleanup level is set and the old one is also set because it's pretty unlikely a user intended to do that and it forces them to resolve that ambiguity that @dramaticlly mentioned. |
||
| cleanupLevel == CleanupLevel.ALL, | ||
| "Cannot set cleanExpiredFiles when cleanupLevel has already been set to: %s", | ||
| cleanupLevel); | ||
| this.cleanExpiredFiles = clean; | ||
| this.cleanupLevel = clean ? CleanupLevel.ALL : CleanupLevel.NONE; | ||
| return this; | ||
| } | ||
|
|
||
|
|
@@ -167,6 +173,17 @@ public ExpireSnapshots cleanExpiredMetadata(boolean clean) { | |
| return this; | ||
| } | ||
|
|
||
| @Override | ||
| public ExpireSnapshots cleanupLevel(CleanupLevel level) { | ||
| Preconditions.checkArgument(null != level, "Invalid cleanup level: null"); | ||
| Preconditions.checkArgument( | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same as mentioned earlier about throwing
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unresolving this one, I do think this is a case where we should just be consistent with what's done in the other options in this API, which is just override what was previously set. e.g. we can set cleanExpiredFiles multiple times, and it'll just take the last. Any reason why this particular one should be different and throw @dramaticlly ?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added this mostly want to avoid ambiguous intention when user set both options together regardless of which option is chained first. i.e both shall fail with exception to ask user for further clarification table.expireSnapshots()
.cleanupLevel(ExpireSnapshots.CleanupLevel.METADATA_ONLY)
.cleanExpiredFiles(false)and table
.expireSnapshots()
.cleanExpiredFiles(false)
.cleanupLevel(ExpireSnapshots.CleanupLevel.METADATA_ONLY)Updated preconditions to better assertion condition and message |
||
| cleanExpiredFiles || level == CleanupLevel.NONE, | ||
| "Cannot set cleanupLevel to %s when cleanExpiredFiles was explicitly set to false", | ||
| level); | ||
| this.cleanupLevel = level; | ||
| return this; | ||
| } | ||
|
|
||
| @Override | ||
| public List<Snapshot> apply() { | ||
| TableMetadata updated = internalApply(); | ||
|
|
@@ -350,9 +367,11 @@ public void commit() { | |
| TableMetadata updated = internalApply(); | ||
| ops.commit(base, updated); | ||
| }); | ||
| LOG.info("Committed snapshot changes"); | ||
| LOG.info( | ||
| "Committed snapshot changes and prepare to clean up files at level={}", | ||
| cleanupLevel.name()); | ||
|
|
||
| if (cleanExpiredFiles && !base.snapshots().isEmpty()) { | ||
dramaticlly marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| if (CleanupLevel.NONE != cleanupLevel && !base.snapshots().isEmpty()) { | ||
| cleanExpiredSnapshots(); | ||
| } | ||
| } | ||
|
|
@@ -384,7 +403,7 @@ private void cleanExpiredSnapshots() { | |
| : new ReachableFileCleanup( | ||
| ops.io(), deleteExecutorService, planExecutorService(), deleteFunc); | ||
|
|
||
| cleanupStrategy.cleanFiles(base, current); | ||
| cleanupStrategy.cleanFiles(base, current, cleanupLevel); | ||
| } | ||
|
|
||
| private void validateCleanupCanBeIncremental(TableMetadata current) { | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.