[HUDI-840]Clean blank file created by HoodieLogFormatWriter #1567
@yanghua please review when you have time.
@n3nash could you also review this, since it touches the common log format writer as well?
@hddong: Is there a known reason why the blank file is created in the first place?
A blank file is created for appendBlock whenever a new HoodieLogFormatWriter is constructed:
https://github.com/apache/incubator-hudi/blob/f1592be629c3f9762f62d4e1dbf3be54f213d92d/hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFormatWriter.java#L105-L108
But there is a special case: when rollover is triggered (the block size is past the threshold), we close the old writer and create a new one. If the new writer created by rolloverIfNeeded is then closed, a blank file is left behind.
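To make the corner case concrete, here is a minimal sketch on the local filesystem. It is not Hudi's code: `RollingLogWriterSketch`, the `log.N` file naming, and the `deleteEmptyOnClose` flag are all hypothetical names chosen for illustration, and `java.nio.file` stands in for the Hadoop filesystem. It only assumes what is described above, namely that constructing a writer creates its file and that a rollover constructs a new writer.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for a rolling log writer; NOT Hudi's HoodieLogFormatWriter.
class RollingLogWriterSketch implements AutoCloseable {
    private final Path dir;
    private final long sizeThreshold;
    private final boolean deleteEmptyOnClose; // illustrative switch for the cleanup
    private int version = 1;
    private Path current;
    private OutputStream out;

    RollingLogWriterSketch(Path dir, long sizeThreshold, boolean deleteEmptyOnClose)
            throws IOException {
        this.dir = dir;
        this.sizeThreshold = sizeThreshold;
        this.deleteEmptyOnClose = deleteEmptyOnClose;
        openCurrent();
    }

    // Opening the stream creates the file immediately, even if nothing is ever written.
    private void openCurrent() throws IOException {
        current = dir.resolve("log." + version);
        out = Files.newOutputStream(current,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    void appendBlock(byte[] block) throws IOException {
        out.write(block);
        out.flush();
        rolloverIfNeeded();
    }

    // Past the threshold: close the old file and eagerly open the next version.
    private void rolloverIfNeeded() throws IOException {
        if (Files.size(current) > sizeThreshold) {
            out.close();
            version++;
            openCurrent();
        }
    }

    @Override
    public void close() throws IOException {
        out.close();
        // Cleanup under discussion: one extra size lookup on every close.
        if (deleteEmptyOnClose && Files.size(current) == 0) {
            Files.delete(current);
        }
    }
}
```

With `deleteEmptyOnClose` set to false, appending one block larger than the threshold and then closing leaves an empty `log.2` next to `log.1`. With it set to true, the empty file is removed, at the cost of the size check that runs on every close.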
Fair enough, but this effectively adds an extra RPC on every close just to handle this corner case. What is the actual issue caused by the empty file? Does the query or write fail or get stuck?
@hddong This PR has conflicting files.
f1592be to fb2bf16
@yanghua I have rebased this.
hudi/hudi-common/src/main/java/org/apache/hudi/common/table/TableSchemaResolver.java Line 382 in 0cb24e4
readSchemaFromLogFile may read the blank file. The blank file was also read previously in HoodieLogFileCommand, which has been modified to avoid reading blank files.
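As a rough illustration of the reader-side alternative (skipping blank files instead of deleting them), the sketch below filters out zero-length files before anything tries to parse them. `NonEmptyLogFiles` and its `list` method are hypothetical names, not Hudi APIs, and the local filesystem via `java.nio.file` is used in place of the Hadoop filesystem.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper; not part of Hudi.
final class NonEmptyLogFiles {
    private NonEmptyLogFiles() {}

    // Returns the regular files in dir that contain at least one byte, sorted by path.
    static List<Path> list(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (Stream<Path> files = Files.list(dir)) {
            for (Path p : (Iterable<Path>) files.sorted()::iterator) {
                // A zero-length file has no header or blocks to read, so skip it.
                if (Files.isRegularFile(p) && Files.size(p) > 0) {
                    result.add(p);
                }
            }
        }
        return result;
    }
}
```

A guard like this has to be repeated in every reader, which is one argument for removing the blank file at the writer instead.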
fb2bf16 to 54b0ed4
@hddong: I went ahead and redid this change in the interest of time :)
54b0ed4 to 96adac4
LGTM
…ased lock provider (apache#1567)
What is the purpose of the pull request
When rollover is triggered, HoodieLogFormatWriter creates the next-version log file, but it always leaves a blank file behind on close.
Brief change log
(for example:)
Verify this pull request
This pull request is a trivial rework / code cleanup without any test coverage.
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.