@amogh-jahagirdar
Contributor

No description provided.

@github-actions github-actions bot added the core label Aug 21, 2022
@amogh-jahagirdar amogh-jahagirdar changed the title Implement performing row-level updates and deletes to branches Implement performing row-level delta to branches Aug 21, 2022
@amogh-jahagirdar
Contributor Author

It may make more sense to come back to row-level delta after implementing branch writing for the other operations. There is more validation logic to modify, and row-level delta is hard to test meaningfully until the other operations support branch writes.

/**
* Validates that no delete files matching a filter have been added to the table since a starting
* snapshot.
* ToDo: Remove after branch writing implementation complete
Contributor Author

@amogh-jahagirdar amogh-jahagirdar Aug 23, 2022

We can remove the existing validate method once the different write operations are implemented, because we will then always have an ending snapshot. That is safe because the validation methods are all protected or private.

AssertHelpers.assertThrows(
"Should fail to commit when validating from non-ancestor snapshot",
ValidationException.class,
"Cannot commit, missing data files",
Contributor Author

Need to fix the message.

@amogh-jahagirdar amogh-jahagirdar marked this pull request as ready for review August 23, 2022 07:13
@amogh-jahagirdar
Contributor Author

Looks like @namrathamyske also has a PR for this (https://github.com/apache/iceberg/pull/5234/files/). The approach is fundamentally the same, but I was thinking it makes more sense to iterate on one operation at a time rather than changing all the merging snapshot producers at once. We can have copies of the validate methods that accept an ending snapshot. That way we can validate the code and tests for a single operation at a time, which seems easier to review IMO, but I'm open to discussion.

@namrathamyske @rdblue @jackye1995
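
To illustrate the idea of validate methods that accept an ending snapshot, here is a minimal, self-contained sketch. It does not use the Iceberg API: `Snap`, `deleteFilesAddedBetween`, `validateNoNewDeleteFiles`, and the sample history are all hypothetical names made up for this example. The assumption is that validation walks parent pointers from the ending snapshot (for example, a branch head) back to the starting snapshot, and fails if the starting snapshot is not an ancestor.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BranchValidationSketch {
  // Hypothetical stand-in for a table snapshot; this is NOT the Iceberg Snapshot interface.
  static final class Snap {
    final long id;
    final Long parentId; // null for the first snapshot in the history
    final List<String> addedDeleteFiles;

    Snap(long id, Long parentId, List<String> addedDeleteFiles) {
      this.id = id;
      this.parentId = parentId;
      this.addedDeleteFiles = addedDeleteFiles;
    }
  }

  // Walks from the ending snapshot back to (but not including) the starting snapshot
  // and collects the delete files added along that lineage only.
  static List<String> deleteFilesAddedBetween(Map<Long, Snap> snapshots, Long startId, long endId) {
    List<String> added = new ArrayList<>();
    Long current = endId;
    while (current != null && !current.equals(startId)) {
      Snap snap = snapshots.get(current);
      if (snap == null) {
        throw new IllegalStateException("Unknown snapshot: " + current);
      }
      added.addAll(snap.addedDeleteFiles);
      current = snap.parentId;
    }
    if (startId != null && current == null) {
      // The walk reached the root without meeting the starting snapshot.
      throw new IllegalStateException(
          "Cannot validate: snapshot " + startId + " is not an ancestor of " + endId);
    }
    return added;
  }

  // Fails if any delete file was added between the starting and ending snapshots.
  static void validateNoNewDeleteFiles(Map<Long, Snap> snapshots, Long startId, long endId) {
    List<String> added = deleteFilesAddedBetween(snapshots, startId, endId);
    if (!added.isEmpty()) {
      throw new IllegalStateException("Found new conflicting delete files: " + added);
    }
  }

  // Sample history (made up): main is 1 <- 2 <- 3, and a branch forks at 2 with head 4.
  static Map<Long, Snap> exampleHistory() {
    Map<Long, Snap> snapshots = new HashMap<>();
    snapshots.put(1L, new Snap(1L, null, List.of()));
    snapshots.put(2L, new Snap(2L, 1L, List.of()));
    snapshots.put(3L, new Snap(3L, 2L, List.of("d1.parquet"))); // added on main only
    snapshots.put(4L, new Snap(4L, 2L, List.of())); // branch head
    return snapshots;
  }

  public static void main(String[] args) {
    Map<Long, Snap> history = exampleHistory();
    // Ending at the branch head ignores the delete file that was added on main.
    validateNoNewDeleteFiles(history, 2L, 4L);
    System.out.println(deleteFilesAddedBetween(history, 2L, 3L));
  }
}
```

In this sketch, validating from snapshot 2 to the branch head 4 passes, while validating from 2 to the main head 3 would report `d1.parquet`. Passing the ending snapshot explicitly is what keeps changes on other lineages from affecting the check.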

@amogh-jahagirdar
Contributor Author

amogh-jahagirdar commented Aug 23, 2022

Actually, the approach in @namrathamyske's PR works, because branch writes wouldn't be supported for the other operations yet; it just passes the end snapshot through. So that PR makes sense, and we can continue the discussion there.
