[HUDI-5238] Fixing HoodieMergeHandle shutdown sequence
#7245
…handles w/in the executor's consumers, as soon as writing is completed
@xushiyan / @alexeykudinkin / @nsivabalan I am facing this exact issue with 12.0 (#7234). One question: why does this happen selectively? In my case, upserts for many tables run fine with 12.0, except for one table. Why? Also, how should I fix this? Do I need to migrate to a later version, or is there a better way to fix it?
Change Logs
This PR addresses #7234, which relates to the `HoodieMergeHandle` shutdown sequence: #4264 changed the order in which we shut down the handle relative to the executor.
Before, the handle was closed first and the executor was shut down afterwards.
After, the executor is shut down first and the handle is closed afterwards.
The order was switched to handle the case where, when an exception is thrown, the executor might still be writing out records; closing the handle before the executor was leaving some of the produced Parquet files corrupted.
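The ordering concern can be illustrated with a minimal, self-contained sketch. The `Handle` and `Executor` classes below are hypothetical stand-ins, not Hudi's actual classes, and the event log exists only to make the order visible. The sketch assumes an idempotent `close()`, so the handle can be closed right after a successful write, while the failure path still stops the executor before closing the handle.

```java
import java.util.ArrayList;
import java.util.List;

public class ShutdownOrderSketch {

    // Hypothetical stand-in for a merge handle that owns an output file.
    static class Handle implements AutoCloseable {
        private final List<String> log;
        private boolean closed = false;

        Handle(List<String> log) {
            this.log = log;
        }

        void write(String record) {
            if (closed) {
                throw new IllegalStateException("write after close");
            }
            log.add("write:" + record);
        }

        @Override
        public void close() {
            // Idempotent: a second close is a no-op.
            if (!closed) {
                closed = true;
                log.add("handle.close");
            }
        }
    }

    // Hypothetical stand-in for the executor that feeds records to the handle.
    static class Executor {
        private final List<String> log;

        Executor(List<String> log) {
            this.log = log;
        }

        void execute(Handle handle, List<String> records, boolean fail) {
            for (String record : records) {
                handle.write(record);
            }
            if (fail) {
                throw new RuntimeException("simulated failure");
            }
        }

        void shutdownNow() {
            log.add("executor.shutdown");
        }
    }

    // Runs one merge and returns the ordered list of events.
    public static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        Handle handle = new Handle(log);
        Executor executor = new Executor(log);
        try {
            executor.execute(handle, List.of("a", "b"), fail);
            // Successful path: close the handle as soon as writing is done.
            handle.close();
        } catch (RuntimeException e) {
            log.add("error");
        } finally {
            // Failure path: stop the executor first, then close the handle.
            executor.shutdownNow();
            handle.close();
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

In this sketch, the successful run records `handle.close` before `executor.shutdown`, while the failing run records `executor.shutdown` before `handle.close`.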
This PR addresses the issue by making sure that, on the successful path, we close the handle as soon as writing has finished (before we shut down the executor), so that the shutdown does not result in any `PipeBroken` exceptions in GCS.
Impact
No impact
Risk level (write none, low, medium or high below)
Low
Documentation Update
N/A
Contributor's checklist