Skip to content

Conversation

@nsivabalan
Copy link
Contributor

@nsivabalan nsivabalan commented Jul 4, 2020

What is the purpose of the pull request

Fixing deletes for inserts in same write batch.

Brief change log

If same record is both inserted and deleted in same batch(possible when listening to events via deltastreamer), precombine will favor delete record. But then, in our OverwriteWithLatestAvroPayload, only combineAndGetUpdateValue checks for delete flag, where as getInsertValue does not. Hence the issue is that, for these cases, deleted records are still seen in read query.

Verify this pull request

This change added tests and can be verified as follows:

  • TestHoodieClientOnCopyOnWriteStorage#testDeletesForInsertsInSameBatch

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@nsivabalan nsivabalan force-pushed the deleteForInsertsInSameWriteBatch branch from 2877f12 to 2826493 Compare July 10, 2020 11:47
@nsivabalan
Copy link
Contributor Author

@vinothchandar : awaiting your review. if this is not doable (checking delete in getInsertValue()) for some reason, we might have to make some changes to #1819 as well. so would appreciate your review.

@vinothchandar
Copy link
Member

@nsivabalan this does seem good to me. Can we add a Unit test specifically for OverwriteWithLatestAvroPayload which just tests these scenarios at the single class elvel?

@nsivabalan nsivabalan force-pushed the deleteForInsertsInSameWriteBatch branch from 715e665 to d18b409 Compare July 22, 2020 03:24
Copy link
Contributor Author

@nsivabalan nsivabalan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added tests

@nsivabalan nsivabalan force-pushed the deleteForInsertsInSameWriteBatch branch from d18b409 to fa21405 Compare July 22, 2020 11:15
Copy link
Member

@vinothchandar vinothchandar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. please go ahead and merge once CI is happy

@vinothchandar
Copy link
Member

[ERROR] Failures: 
2552[ERROR] org.apache.hudi.client.TestHoodieClientOnCopyOnWriteStorage.testUpsertsUpdatePartitionPathGlobalBloom(HoodieIndex$IndexType)
2553[INFO]   Run 1: PASS
2554[ERROR]   Run 2: TestHoodieClientOnCopyOnWriteStorage.testUpsertsUpdatePartitionPathGlobalBloom:448->testUpsertsUpdatePartitionPath:506 expected: <1> but was: <0>

@nsivabalan this was a recent test right? this seems to be failing here. may be run this until failiure locally and go from there?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants