@aokolnychyi
Contributor

This PR adds new rolling writers and contains a subset of the changes in PR #2945.

/**
 * A rolling equality delete writer that splits incoming deletes into multiple files within one spec/partition
 * based on the target file size.
 */
public class RollingEqualityDeleteWriter<T> extends RollingFileWriter<T, EqualityDeleteWriter<T>, DeleteWriteResult> {
Contributor Author

@aokolnychyi Sep 20, 2021

Delete writers are in org.apache.iceberg.deletes, while all other writers are in org.apache.iceberg.io.
I think it makes sense to keep writer-related classes in the io package, so I added the rolling writers there.
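
For readers following the thread, the rolling behavior described in the Javadoc above (split incoming records into multiple files once a target size is reached) can be sketched roughly as follows. This is a minimal illustration and not the Iceberg implementation: SketchRollingWriter, its methods, the file names, and the character-count size estimate are all hypothetical, and the sketch checks the size after every row purely for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a rolling writer; NOT the Iceberg RollingFileWriter API.
class SketchRollingWriter {
  private final long targetFileSizeBytes;
  private final List<String> completedFiles = new ArrayList<>();
  private int fileCount = 0;
  private String currentFile;
  private long currentSizeBytes;

  SketchRollingWriter(long targetFileSizeBytes) {
    this.targetFileSizeBytes = targetFileSizeBytes;
    openCurrentFile(); // the first file is opened eagerly
  }

  void write(String row) {
    // Assumption: the size is approximated by the character count of the row.
    currentSizeBytes += row.length();
    if (currentSizeBytes >= targetFileSizeBytes) {
      // Target reached: finish the current file and roll to a new one.
      closeCurrentFile();
      openCurrentFile();
    }
  }

  void close() {
    closeCurrentFile();
  }

  List<String> completedFiles() {
    return completedFiles;
  }

  private void openCurrentFile() {
    currentFile = "file-" + fileCount++;
    currentSizeBytes = 0;
  }

  private void closeCurrentFile() {
    // Empty files are dropped rather than reported as results.
    if (currentFile != null && currentSizeBytes > 0) {
      completedFiles.add(currentFile);
    }
    currentFile = null;
  }
}
```

With a target of 10 and five 5-character rows, this sketch rolls twice and reports three files once closed.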

@aokolnychyi
Contributor Author

return partition;
}

public CharSequence currentFilePath() {
Contributor Author

I removed the precondition here that the current file is not null. We will call this method for every single row while writing CDC records. Right now, currentFile is never null because we initialize it in the constructor.

Contributor

Sounds good to me. You mean the CDC writer constructor?

Contributor Author

I mean that all classes that extend RollingFileWriter initialize the writer immediately, so we shouldn't need to worry about the current file being null.

Contributor

Yeah, I saw that. I don't have a good way around this, but at least we know that it will fail quickly without that init call.
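
A rough sketch of the trade-off discussed in this thread: the accessor skips the null check because it runs once per row, and correctness relies on each subclass opening its first file in the constructor. All class and method names below (SketchRollingBase, openCurrent, SketchDataWriter, SketchForgetfulWriter) are hypothetical and only illustrate the pattern; they are not the classes in this PR.

```java
// Hypothetical base class illustrating the pattern; NOT the Iceberg RollingFileWriter.
abstract class SketchRollingBase {
  private String currentFilePath; // stays null until openCurrent() runs

  protected abstract String newFilePath();

  protected void openCurrent() {
    this.currentFilePath = newFilePath();
  }

  // No precondition here: this accessor is assumed to be called once per row.
  CharSequence currentFilePath() {
    return currentFilePath;
  }
}

// A subclass that follows the convention and opens its first file immediately.
class SketchDataWriter extends SketchRollingBase {
  private int count = 0;

  SketchDataWriter() {
    openCurrent();
  }

  @Override
  protected String newFilePath() {
    return "data-" + count++ + ".parquet";
  }
}

// A subclass that forgets the init call; the first use of the path fails right away.
class SketchForgetfulWriter extends SketchRollingBase {
  @Override
  protected String newFilePath() {
    return "never-opened";
  }
}
```

In this sketch, the forgetful subclass returns null from currentFilePath(), so the first caller that dereferences the path hits a NullPointerException immediately, which is the "fail quickly" behavior mentioned above.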

if (partition == null) {
return fileFactory.newOutputFile();
} else {
return fileFactory.newOutputFile(spec, partition);
Contributor Author

@rdblue, I've updated this place to pass spec like we discussed in the original PR.
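
One way to picture why the spec is passed along with the partition: the partition values alone do not say which fields they belong to. The sketch below is an assumption-laden illustration, not the Iceberg OutputFileFactory; SketchOutputFileFactory, SketchWriterHelper, the list-based spec and partition types, and the path layout are all hypothetical.

```java
import java.util.List;

// Hypothetical factory; the real one works with Iceberg spec and partition types.
class SketchOutputFileFactory {
  private int fileCount = 0;

  // Unpartitioned case: the file goes directly under the data directory.
  String newOutputFile() {
    return "data/" + nextFileName();
  }

  // Partitioned case: the spec fields give meaning to the partition values.
  String newOutputFile(List<String> specFields, List<String> partitionValues) {
    StringBuilder path = new StringBuilder("data/");
    for (int i = 0; i < specFields.size(); i++) {
      path.append(specFields.get(i)).append('=').append(partitionValues.get(i)).append('/');
    }
    return path.append(nextFileName()).toString();
  }

  private String nextFileName() {
    return "file-" + fileCount++ + ".parquet";
  }
}

class SketchWriterHelper {
  // Mirrors the null check in the diff above: no partition means an unpartitioned file.
  static String newFile(SketchOutputFileFactory factory, List<String> spec, List<String> partition) {
    if (partition == null) {
      return factory.newOutputFile();
    } else {
      return factory.newOutputFile(spec, partition);
    }
  }
}
```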

@rdblue
Contributor

rdblue commented Sep 20, 2021

Looks good to me. I had a couple comments, but nothing blocking.

@aokolnychyi merged commit 7eeeada into apache:master on Sep 20, 2021
@aokolnychyi
Contributor Author

Thanks for reviewing, @rdblue!
