-
Notifications
You must be signed in to change notification settings - Fork 3k
Core: Replace setCurrentSnapshot with setBranchSnapshot in metadata builder #4089
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Core: Replace setCurrentSnapshot with setBranchSnapshot in metadata builder #4089
Conversation
|
@amogh-jahagirdar and @jackye1995, this is a first attempt at updating the table metadata builder. It doesn't support everything yet but I want to get this up for early feedback so we can compare the most direct changes with what you were proposing on the other PRs. |
|
|
||
| class SetCurrentSnapshot implements MetadataUpdate { | ||
| private final Long snapshotId; | ||
| class RemoveSnapshotRef implements MetadataUpdate { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RemoveSnapshotRef is used for dropping a branch instead of SetCurrentSnapshot(null). Now the snapshot can be required instead of nullable.
| if (!snapshotsById.containsKey(currentSnapshotId)) { | ||
| setCurrentSnapshot(null, System.currentTimeMillis()); | ||
|
|
||
| // remove any refs that are no longer valid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this fail if there are still refs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the current usage, we are only calling this when expiring snapshots. At that time the snapshots send in here should already be the ones that have to be removed after evaluating all retention policies. I think that's why we directly remove the refs instead of failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is in the scope for removeSnapshots(snapshots) to remove the references itself for any snapshots we are removing. Otherwise we would have to have the caller do a separate removeReferences and then do a removeSnapshots which seems more cumbersome.
|
|
||
| if (SnapshotRef.MAIN_BRANCH.equals(branch)) { | ||
| this.currentSnapshotId = snapshot.snapshotId(); | ||
| snapshotLog.add(new SnapshotLogEntry(lastUpdatedMillis, snapshot.snapshotId())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The snapshot log is only for main right now. We may want to update and add logs for branches? Probably not for tags because tags are labels that we probably don't track history for (they should be static).
| snapshotLog.add(new SnapshotLogEntry(lastUpdatedMillis, snapshot.snapshotId())); | ||
| } | ||
|
|
||
| SnapshotRef.Builder refBuilder = SnapshotRef.branchBuilder(snapshot.snapshotId()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can just call SnapshotRef.builderFrom here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that ref may be null. If it is null, then we create a new one with defaults. If it is non-null, then we copy it. I'll make this a bit more clear.
| TableMetadata.Builder update = TableMetadata.buildFrom(base); | ||
| if (base.snapshot(newSnapshot.snapshotId()) != null) { | ||
| // this is a rollback operation | ||
| update.setBranchSnapshot(newSnapshot.snapshotId(), SnapshotRef.MAIN_BRANCH); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wypoon and @sririshindra, I think this fixes the problem from #4088. The previous call to replaceCurrentSnapshot always used setCurrentSnapshot(Snapshot, String) rather than setCurrentSnapshot(long, String). This detects when the snapshot already exists and calls the correct method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that makes sense.
I cherry-picked some prerequisite commits, and then applied this change to our 0.13.0 branch, and reran @sririshindra's rollback test, and it passed.
| TestHelpers.assertEqualsSafe(historyTable.schema().asStruct(), expected.get(0), actual.get(0)); | ||
| TestHelpers.assertEqualsSafe(historyTable.schema().asStruct(), expected.get(1), actual.get(1)); | ||
| TestHelpers.assertEqualsSafe(historyTable.schema().asStruct(), expected.get(2), actual.get(2)); | ||
| TestHelpers.assertEqualsSafe(historyTable.schema().asStruct(), expected.get(3), actual.get(3)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was failing, I think due to the problem in #4088. While looking at it, I realized that the last record was not being checked as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wrong about the cause of the history test failure. It turns out that I called the wrong method from setBranchSnapshot(Snapshot, String).
jackye1995
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me, just 1 nit comment, up to you if you think it's worth adding a util method for that.
| snapshotLog.add(new SnapshotLogEntry(lastUpdatedMillis, snapshot.snapshotId())); | ||
| changes.add(new MetadataUpdate.SetCurrentSnapshot(snapshot.snapshotId())); | ||
|
|
||
| if (SnapshotRef.MAIN_BRANCH.equals(branch)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like the check for main branch is done in many places, should we also have a method for ref.isMainBranch()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the problem is that there aren't many places where we actually have a SnapshotRef instance. Here, for example, we don't necessarily have one because ref might be null. So we actually want to use the string branch name.
Below, we have a SetSnapshotRef instance, which is a change. And in places like removeBranch, we do the check before loading the ref. If there is no main ref to remove, we should still ensure that we set current-snapshot-id appropriately.
|
Thanks for reviewing, @jackye1995 and @amogh-jahagirdar! |
…uilder (apache#4089) (cherry picked from commit a3a112d)
…uilder (apache#4089) (cherry picked from commit a3a112d)
This updates the
TableMetadata.Buildermethods and replaces the previoussetCurrentSnapshotwith a branch-awaresetBranchSnapshot.I'm posting this for early feedback. This is the most straight-forward change, but it doesn't support modifying the ref settings like max age and it doesn't support creating/removing tags. We may want to change the builder API to be more generic and if so this is a step toward that API.