[HUDI-802] Fixing deletes for inserts in same batch in write path #1792
2877f12 to 2826493
@vinothchandar : awaiting your review. If this is not doable (checking for delete in getInsertValue()) for some reason, we might have to make some changes to #1819 as well, so I would appreciate your review.
@nsivabalan this does seem good to me. Can we add a Unit test specifically for |
715e665 to d18b409
nsivabalan left a comment
Added tests
d18b409 to fa21405
vinothchandar left a comment
LGTM. Please go ahead and merge once CI is happy.
@nsivabalan this was a recent test, right? It seems to be failing here. Maybe run it locally until it fails and go from there?
What is the purpose of the pull request
Fixing deletes for inserts in same write batch.
Brief change log
If the same record is both inserted and deleted in the same batch (possible when listening to events via deltastreamer), precombine will favor the delete record. However, in our OverwriteWithLatestAvroPayload, only combineAndGetUpdateValue checks for the delete flag, whereas getInsertValue does not. As a result, in these cases the deleted records are still visible to read queries.
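The behaviour described above can be illustrated with a minimal, self-contained sketch. This is not the actual Hudi code: `SimplePayload`, `getInsertValueBeforeFix`, the `payload` helper, and the use of a `Map` in place of an Avro record are all hypothetical stand-ins, and the delete marker field name `_hoodie_is_deleted` as well as the "larger ordering value wins" precombine rule are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class SameBatchDeleteSketch {
    // Assumed name of the boolean delete marker field.
    static final String DELETE_FIELD = "_hoodie_is_deleted";

    /** Hypothetical, simplified payload; a Map stands in for the Avro record. */
    static final class SimplePayload {
        final Map<String, Object> record;
        final long orderingVal;

        SimplePayload(Map<String, Object> record, long orderingVal) {
            this.record = record;
            this.orderingVal = orderingVal;
        }

        // Assumption: the payload with the larger ordering value wins.
        SimplePayload preCombine(SimplePayload other) {
            return other.orderingVal > this.orderingVal ? other : this;
        }

        private boolean isDeleteRecord() {
            return Boolean.TRUE.equals(record.get(DELETE_FIELD));
        }

        // Behaviour described as the bug: the delete flag is ignored on insert.
        Optional<Map<String, Object>> getInsertValueBeforeFix() {
            return Optional.of(record);
        }

        // Behaviour described as the fix: a delete yields no value to insert.
        Optional<Map<String, Object>> getInsertValue() {
            return isDeleteRecord() ? Optional.empty() : Optional.of(record);
        }
    }

    // Hypothetical helper that builds a payload for one key.
    static SimplePayload payload(String key, boolean deleted, long orderingVal) {
        Map<String, Object> record = new HashMap<>();
        record.put("key", key);
        record.put(DELETE_FIELD, deleted);
        return new SimplePayload(record, orderingVal);
    }

    public static void main(String[] args) {
        // The same key is inserted and then deleted within one batch.
        SimplePayload insert = payload("k1", false, 1L);
        SimplePayload delete = payload("k1", true, 2L);

        // Precombine keeps the delete record.
        SimplePayload winner = insert.preCombine(delete);

        System.out.println("before fix, written: " + winner.getInsertValueBeforeFix().isPresent());
        System.out.println("after fix, written:  " + winner.getInsertValue().isPresent());
    }
}
```

In this sketch the key has no prior version in storage, so the write goes through the insert path. Without the delete check the surviving delete record is written as if it were a normal row, which is why it would still show up in reads.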
Verify this pull request
This change added tests and can be verified as follows:
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.