[HUDI-2994] Add judgement to existed partitionPath in the catch code block for HU… #4294
| Original file line number | Diff line number | Diff line change | ||
|---|---|---|---|---|
|
|
@@ -336,9 +336,11 @@ protected FileStatus[] listPartition(Path partitionPath) throws IOException { | |||
| if (!metaClient.getFs().exists(partitionPath)) { | ||||
| metaClient.getFs().mkdirs(partitionPath); | ||||
| return new FileStatus[0]; | ||||
| } else { | ||||
| // in case the partition path was created by another caller | ||||
| return metaClient.getFs().listStatus(partitionPath); | ||||
|
Contributor
Can we add a one-line comment here: // in case the partition path was created by another caller.
Contributor
+1
Contributor
Also, you may want to add the fix to
Contributor
In general, I thought listing a given partition can't have concurrent callers. We parallelize across partitions, but for a given partition I wasn't aware that there could be concurrent calls. Can you shed some light on when this could happen?
Contributor
In a Flink pipeline, there are cases where we need to look into the partition files through the fs view with this method call: the bucket assigner checks the small files under the partition, and the partition may be new. Multiple bucket assigners may operate on the same partition at the same time.
||||
| } | ||||
| } | ||||
| throw new HoodieIOException(String.format("Failed to list partition path: %s", partitionPath)); | ||||
| } | ||||
|
|
||||
| /** | ||||
|
|
||||
If the partition path was created by another process/thread, we should invoke #listStatus again.
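The retry discussed in this thread can be illustrated with a minimal, self-contained sketch. It is not the Hudi implementation: it uses `java.nio.file` as a stand-in for the Hadoop `FileSystem` calls in the diff, and the class name `PartitionListingSketch` and the helper `list` are hypothetical names introduced only for this example. The idea shown is the one from the diff: when the first listing fails, create the directory if it is still missing, otherwise assume another caller created it in the meantime and list it again.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the listPartition logic in the diff, built on java.nio.file.
public class PartitionListingSketch {

  /** Lists the entries of a partition directory, creating the directory if it is absent. */
  static List<Path> listPartition(Path partitionPath) throws IOException {
    try {
      return list(partitionPath);
    } catch (IOException e) {
      if (!Files.exists(partitionPath)) {
        // The partition does not exist yet: create it and report it as empty.
        // createDirectories does not throw if the directory already exists,
        // so losing a creation race to another caller is harmless here.
        Files.createDirectories(partitionPath);
        return new ArrayList<>();
      } else {
        // In case the partition path was created by another caller
        // between the failed listing and the exists() check: list again.
        return list(partitionPath);
      }
    }
  }

  /** Collects the direct children of a directory; throws IOException if it is missing. */
  private static List<Path> list(Path dir) throws IOException {
    List<Path> entries = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
      for (Path entry : stream) {
        entries.add(entry);
      }
    }
    return entries;
  }
}
```

Calling `PartitionListingSketch.listPartition` on a path that does not exist creates the directory and returns an empty list; a second call after a file has been added returns that file. With several callers working on the same new partition, as in the Flink bucket assigner case described above, the `else` branch is what keeps a late caller from failing just because another caller created the directory first.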