-
Notifications
You must be signed in to change notification settings - Fork 3k
Core: Add new rolling file writers #3158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| * A rolling equality delete writer that splits incoming deletes into multiple files within one spec/partition | ||
| * based on the target file size. | ||
| */ | ||
| public class RollingEqualityDeleteWriter<T> extends RollingFileWriter<T, EqualityDeleteWriter<T>, DeleteWriteResult> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete writers are in org.apache.iceberg.deletes while all other writers are in org.apache.iceberg.io.
I think it makes sense to have writer-related classes in the io package so I added rolling writers there.
| return partition; | ||
| } | ||
|
|
||
| public CharSequence currentFilePath() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the precondition here that the current file is not null. We will call this method for every single row while writing CDC records. Right now, currentFile is never null as we init it in the constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me. You mean the CDC writer constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean all classes that extend RollingFileWriter init the writer immediately so we shouldn't worry about the current file being null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I saw that. I don't have a good way around this, but at least we know that it will fail quickly without that init call.
| if (partition == null) { | ||
| return fileFactory.newOutputFile(); | ||
| } else { | ||
| return fileFactory.newOutputFile(spec, partition); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rdblue, I've updated this place to pass spec like we discussed in the original PR.
core/src/main/java/org/apache/iceberg/io/RollingFileWriter.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/iceberg/io/RollingFileWriter.java
Outdated
Show resolved
Hide resolved
|
Looks good to me. I had a couple comments, but nothing blocking. |
|
Thanks for reviewing, @rdblue! |
This PR adds new rolling writers and contains a subset of changes in PR #2945.