Conversation

@turboFei (Member) commented Jul 3, 2020

What changes were proposed in this pull request?

For dynamic partition overwrite, the working dir is .spark-staging-{jobId}.
Task file names are formatted as part-$taskId-$jobId$ext, regardless of the task attempt id.
Each task writes its output to:

  • .spark-staging-{jobId}/partitionPath1/taskFileName1
  • .spark-staging-{jobId}/partitionPath2/taskFileName2
  • ...
  • .spark-staging-{jobId}/partitionPathN/taskFileNameN

If speculation is enabled, several task attempts with the same taskId but different attemptIds may write to the same file concurrently.
On a DistributedFileSystem (HDFS), only one client may hold the lease to write a file; if two tasks try to write the same file, an exception like "no lease on inode" is thrown. The sketch below shows why the paths collide.
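
A minimal sketch of the path construction described above (the helper name and parameters are illustrative, not the PR's actual code); note that nothing in the path depends on the attempt id:

```scala
// Hypothetical helper: builds a task's output path under the dynamic
// partition overwrite staging dir, following the naming described above.
def stagingTaskFilePath(
    outputDir: String,
    jobId: String,          // UUID of the write job
    partitionPath: String,  // e.g. "part1=2/part2=2"
    taskId: Int,
    ext: String): String = {
  // The file name encodes only taskId and jobId -- no attempt id -- so two
  // attempts of the same task (e.g. a speculative copy) target the same path.
  val fileName = f"part-$taskId%05d-$jobId$ext"
  s"$outputDir/.spark-staging-$jobId/$partitionPath/$fileName"
}
```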

Even when speculation is disabled, if a task is aborted due to an executor OOM, its output is not cleaned up.
When a new task attempt is then launched to write the same file, a FileAlreadyExistsException is thrown, because Parquet disallows overwriting:

Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: /user/hive/warehouse/t2/.spark-staging-1f1efbfd-7e20-4e0f-a49c-a7fa3eae4cb1/part1=2/part2=2/part-00000-1f1efbfd-7e20-4e0f-a49c-a7fa3eae4cb1.c000.snappy.parquet for client 127.0.0.1 already exists
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2578)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2465)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2349)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:624)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:398)

This is a critical issue and causes job failures.

In this PR, we define a Spark staging output committer to fix this issue (a rough sketch follows the list):

  1. Set a working path under the staging dir, named partitionPath-attemptId.
  2. After the task completes, rename partitionPath-attemptId/fileName to partitionPath/fileName.
  3. Leverage the OutputCommitCoordinator to coordinate the task commits.
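
A minimal sketch of the task-side flow, assuming commit permission comes from the OutputCommitCoordinator (class and method names here are illustrative, not the PR's actual SparkStagingOutputCommitter):

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

// Illustrative sketch only: the real committer integrates with Hadoop's
// OutputCommitter API and Spark's OutputCommitCoordinator.
class StagingCommitterSketch(stagingDir: Path, attemptId: Int) {

  // Step 1: each attempt writes under its own directory,
  // .spark-staging-{jobId}/partitionPath-attemptId, so two attempts of the
  // same task never touch the same file.
  def workPath(partitionPath: String): Path =
    new Path(stagingDir, s"$partitionPath-$attemptId")

  // Step 2: on commit, move this attempt's files into the shared partition
  // directory. Step 3 (the OutputCommitCoordinator) ensures only one attempt
  // per task reaches this point.
  def commitTask(fs: FileSystem, partitionPath: String): Unit = {
    val src = workPath(partitionPath)
    val dst = new Path(stagingDir, partitionPath)
    fs.mkdirs(dst)
    fs.listStatus(src).foreach { status =>
      fs.rename(status.getPath, new Path(dst, status.getPath.getName))
    }
    fs.delete(src, true)
  }

  // A failed or aborted attempt just drops its own directory; it cannot
  // corrupt another attempt's output.
  def abortTask(fs: FileSystem, partitionPath: String): Unit =
    fs.delete(workPath(partitionPath), true)
}
```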

Why are the changes needed?

Without this PR, dynamic partition overwrite operations might fail.
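
For example, a hypothetical repro in spark-shell (the table names t2/src and the speculative failure are assumptions for illustration):

```scala
// Repro sketch; assumes spark-shell was started with
//   --conf spark.speculation=true
// and that a partitioned table t2 and a source table src already exist.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

// If a speculative (or OOM-retried) attempt writes the same staging file as
// the original attempt, this can fail with FileAlreadyExistsException or
// "no lease on inode".
spark.sql("INSERT OVERWRITE TABLE t2 PARTITION (part1, part2) SELECT * FROM src")
```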

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added UT.

turboFei changed the title [SPARK-27194][SPARK-29302][SQL] Define a spark staing committer to resolve FileAlreadyExistingException → [WIP][SPARK-27194][SPARK-29302][SQL] Define a spark staing committer to resolve FileAlreadyExistingException on Jul 3, 2020
turboFei changed the title [WIP][SPARK-27194][SPARK-29302][SQL] Define a spark staing committer to resolve FileAlreadyExistingException → [SPARK-27194][SPARK-29302][SQL] Define a spark staging committer to resolve FileAlreadyExistingException on Jul 3, 2020
@turboFei (Member, Author) commented Jul 3, 2020

gentle ping @cloud-fan

Hi, we found a new solution to fix these issues when dynamic partition overwrite is enabled:

  1. FileAlreadyExistsException when an executor crashed
  2. a task conflicting with its speculative attempt

In this PR, we define a new type of OutputCommitter and leverage the OutputCommitCoordinator to coordinate the task commits.
Could you kindly give some suggestions?

turboFei force-pushed the SPARK-27194-custom-committer branch 3 times, most recently from 07066e7 to 8f202c9 on July 3, 2020 04:08
@turboFei (Member, Author) commented Jul 3, 2020

also cc @Ngone51

@turboFei (Member, Author) commented Jul 6, 2020

turboFei force-pushed the SPARK-27194-custom-committer branch 2 times, most recently from cd42f42 to b09a665 on July 6, 2020 06:10
@dongjoon-hyun (Member) commented

ok to test

@SparkQA commented Jul 6, 2020

Test build #125132 has started for PR 28989 at commit b09a665.

@shaneknapp (Contributor) commented

test this please

@SparkQA commented Jul 7, 2020

Test build #125143 has finished for PR 28989 at commit b09a665.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class SparkStagingOutputCommitter(

@turboFei (Member, Author) commented Jul 7, 2020

Will try to fix it.

turboFei changed the title [SPARK-27194][SPARK-29302][SQL] Define a spark staging committer to resolve FileAlreadyExistingException → [WIP][SPARK-27194][SPARK-29302][SQL] Define a spark staging committer to resolve FileAlreadyExistingException on Jul 7, 2020
turboFei force-pushed the SPARK-27194-custom-committer branch from b09a665 to 0492977 on July 7, 2020 08:11
@SparkQA commented Jul 7, 2020

Test build #125196 has finished for PR 28989 at commit 0492977.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class SparkStagingOutputCommitter(

@SparkQA commented Jul 7, 2020

Test build #125205 has finished for PR 28989 at commit bc7d2b5.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Jul 7, 2020

Test build #125212 has finished for PR 28989 at commit a7a8d4b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

turboFei changed the title [WIP][SPARK-27194][SPARK-29302][SQL] Define a spark staging committer to resolve FileAlreadyExistingException → [SPARK-27194][SPARK-29302][SQL] Define a spark staging committer to resolve FileAlreadyExistingException on Jul 8, 2020
@SparkQA commented Jul 8, 2020

Test build #125275 has finished for PR 28989 at commit 0d722a9.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@turboFei (Member, Author) commented Jul 9, 2020

Closing this.
I was wrong about that: I thought taskAttemptContext.getTaskAttemptID.getId was the same as Spark's per-task attempt number, so it would create at most a few (up to the largest attempt number) staging partition dirs per task.
But taskAttemptContext.getTaskAttemptID.getId is also a globally unique id, so this approach would create multiple staging partition dirs for each task.
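
A small illustration of the distinction (the values are made up; this is not code from this PR):

```scala
import org.apache.hadoop.mapreduce.{JobID, TaskAttemptID, TaskID, TaskType}

// Two attempts of the same task. Per the realization above, the trailing id
// Spark passes in comes from a globally unique attempt counter, not a small
// per-task retry number, so partitionPath-<id> staging dirs differ for every
// attempt.
val job  = new JobID("20200709", 1)
val task = new TaskID(job, TaskType.MAP, 0)
println(new TaskAttemptID(task, 3)) // attempt_20200709_0001_m_000000_3
println(new TaskAttemptID(task, 4)) // attempt_20200709_0001_m_000000_4
// Distinct ids => distinct partitionPath-3 and partitionPath-4 directories.
```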

Prefer #29000 instead.

turboFei closed this Jul 9, 2020