Core: rewrite should drop delete files by data sequence number partition wise #9454
Closed
The existing approach only looks at manifest metadata to determine the min data sequence number across all partitions. This is really cheap as we don't have to open manifests (which can be a really expensive operation). The problem is that if one partition is significantly behind, it prevents garbage collection of delete files in other partitions. We have solved that for position deletes via the `rewritePositionDeletes` action but it still remains open for equality deletes.
I am not convinced opening these manifests during commits is a good idea. Can we explore the option of leveraging the partition stats spec added recently? We are still building an action to generate those stats but let's think through whether it can help us. One option would be to check if the partition stats file is present and use it to populate the min data sequence numbers, opening just a single Parquet file vs potentially tons of manifests.
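To make the difference concrete, here is a minimal Python sketch (not Iceberg code) contrasting a single table-wide threshold with a per-partition one. The `DataFile` and `DeleteFile` tuples and both function names are hypothetical. It assumes every delete file is scoped to exactly one partition, and it uses a conservative rule: a delete file is treated as droppable when its data sequence number is below the min data sequence number of the data files it could apply to.

```python
from collections import namedtuple

# Hypothetical, simplified stand-ins for manifest entries.
DataFile = namedtuple("DataFile", ["partition", "data_seq"])
DeleteFile = namedtuple("DeleteFile", ["path", "partition", "data_seq"])


def droppable_globally(data_files, delete_files):
    """One threshold for the whole table: the min data sequence number overall."""
    min_seq = min((f.data_seq for f in data_files), default=None)
    if min_seq is None:
        # No data files at all, so no delete file can apply to anything.
        return [d.path for d in delete_files]
    return [d.path for d in delete_files if d.data_seq < min_seq]


def droppable_per_partition(data_files, delete_files):
    """One threshold per partition, so a lagging partition does not block the others."""
    min_seq = {}
    for f in data_files:
        min_seq[f.partition] = min(f.data_seq, min_seq.get(f.partition, f.data_seq))
    return [
        d.path
        for d in delete_files
        # A partition without data files cannot have live deletes (assumption above).
        if d.partition not in min_seq or d.data_seq < min_seq[d.partition]
    ]


# "p=old" has not been rewritten for a long time; "p=new" was compacted recently.
data_files = [DataFile("p=old", 2), DataFile("p=new", 10)]
delete_files = [
    DeleteFile("d1", "p=new", 5),
    DeleteFile("d2", "p=old", 1),
    DeleteFile("d3", "p=old", 3),
]

global_result = droppable_globally(data_files, delete_files)
partition_result = droppable_per_partition(data_files, delete_files)
```

With this input the table-wide threshold is pinned at 2 by `p=old`, so `d1` in `p=new` survives even though no data file in its partition is old enough for it to apply to. The per-partition variant drops it.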
Thoughts, @szehon-ho @ajantha-bhat @zinking?
Wow. That sounds like a nice use case 👍
For each partition, we do keep the snapshot id that last updated that partition. Using the snapshot id we can extract the data sequence numbers from the snapshot.
@zinking: The current status of the partition stats project can be tracked from this: #8450
Another alternative approach is to convert equality deletes to position deletes (this work is pending), so we can reuse `rewritePositionDeletes`. But it is a long route.
Agree with the above. I think there were some other attempts to do this before too, but the concern here is that you don't want to do a lot of things in the commit critical path (here, potentially opening an unlimited number of manifest files). Yeah, if it's something cheaper, like reading one partition stats file, it may be better. Also yes, the plan has always been to implement converting eq-deletes to pos-deletes (which can then be cleaned up by `rewritePositionDeletes`), though I'm not sure if any progress is being made there.
Agree that adding this on the general write path is too heavy, so that's why I prefer it enabled during rewrite, or probably just some of the rewrites.
On the other hand, it sounds reasonable to track this in partition metadata, but it has to be calculated somewhere anyway.
I think what Szehon meant is to call the cleanup from the finally block of the rewrite action, not to expose it as a new cleanup action.
I'll need to think about this a bit more but I do like the idea of using partition stats in one or another way. I'll get back next week.
Yeah, for me @zinking raised a good question about using partition stats. They're optional, so if the user hasn't analyzed the table, whether dangling deletes are removed or not will differ, which may not be so obvious to users.
Okay, I think we all agree that we should use partition stats if they are available and read manifests otherwise. We may think about extending our regular writes to check if there is a partition stats file available and drop the delete files per partition rather than globally, as is done today. We shouldn't open manifests during writes. We can only do that in a distributed fashion, meaning it has to be part of an action. There, we can either add a new action or integrate this logic into an existing action.
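The decision described above could be sketched roughly as follows. This is Python pseudologic rather than Iceberg code: `min_seq_by_partition`, the two reader callables, and the `allow_manifest_scan` switch are all hypothetical names, and the sketch assumes a partition stats file can yield a min data sequence number per partition, which this thread only proposes.

```python
from typing import Callable, Dict, Optional

MinSeqByPartition = Dict[str, int]


def min_seq_by_partition(
    read_partition_stats: Callable[[], Optional[MinSeqByPartition]],
    scan_manifests: Callable[[], MinSeqByPartition],
    allow_manifest_scan: bool,
) -> Optional[MinSeqByPartition]:
    """Prefer the cheap source; open manifests only where that is acceptable.

    allow_manifest_scan would be True inside a distributed action and False on
    the regular commit path, where opening manifests is considered too costly.
    """
    stats = read_partition_stats()  # returns None when no stats file exists
    if stats is not None:
        return stats
    if allow_manifest_scan:
        return scan_manifests()
    # No per-partition information: the caller keeps the global threshold.
    return None
```

Returning `None` on the commit path keeps today's behaviour for tables that have never had partition stats generated, which is the user-visible difference raised earlier in the thread.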
@szehon-ho and I talked a little bit about this offline. I think we can try extending the action for rewriting data files to also attempt to remove dangling deletes in partitions that were successfully compacted. Separately, we may integrate the cleanup using partition stats during regular commits under a flag (off by default).
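As a rough illustration of the first half of that plan, the sketch below runs compaction per partition and then attempts dangling-delete removal only where compaction succeeded. It is plain Python, not the actual rewrite action: `rewrite_and_clean` and the `compact` and `remove_dangling_deletes` callables are hypothetical placeholders.

```python
def rewrite_and_clean(partitions, compact, remove_dangling_deletes):
    """Compact each partition, then clean up deletes only in the compacted ones.

    compact(partition) is assumed to raise on failure;
    remove_dangling_deletes(partition) is assumed to return the removed paths.
    """
    compacted, failed = [], []
    for partition in partitions:
        try:
            compact(partition)
            compacted.append(partition)
        except Exception:
            # Leave failed partitions untouched; their deletes may still be live.
            failed.append(partition)

    removed = {p: remove_dangling_deletes(p) for p in compacted}
    return compacted, failed, removed


def _compact(partition):
    # Toy compaction that fails for one partition.
    if partition == "p=bad":
        raise RuntimeError("compaction failed")


result = rewrite_and_clean(
    ["p=a", "p=bad", "p=c"],
    _compact,
    lambda p: [p + "/eq-delete"],
)
```

Skipping failed partitions is the point of the design: a partition that was not rewritten may still contain old data files that its delete files apply to.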