
Conversation

@WinkerDu (Contributor) commented Dec 20, 2019

What changes were proposed in this pull request?

This PR fixes an insert-overwrite-table error that occurs when multiple speculative task attempts run. It appends the task attempt id to the dynamic partition staging dir, so each speculative task attempt gets its own output dir instead of all attempts sharing the same one. commitTask in HadoopMapReduceCommitProtocol carries the attempt id in the TaskCommitMessage.
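A minimal sketch of the idea, assuming hypothetical names (`stagingDirFor` and the directory layout are illustrative, not the actual HadoopMapReduceCommitProtocol code): keying the staging directory by task attempt id gives each speculative attempt its own output path.

```scala
// Sketch only: per-attempt staging dirs; `stagingDirFor` and the layout
// below are assumptions for illustration, not the real Spark implementation.
object StagingDirSketch {
  // Staging dir keyed by task attempt id, so attempts never share a path.
  def stagingDirFor(basePath: String, attemptId: Int): String =
    s"$basePath/.spark-staging/$attemptId"

  def main(args: Array[String]): Unit = {
    val regular     = stagingDirFor("/warehouse/t", 0)
    val speculative = stagingDirFor("/warehouse/t", 1)
    // Distinct dirs mean the speculative attempt cannot hit
    // FileAlreadyExistsException on the regular attempt's output.
    assert(regular != speculative)
    println(s"$regular vs $speculative")
  }
}
```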

Why are the changes needed?

Insert overwrite into a DataSource table with dynamic partitions fails when multiple task attempts run. With one regular task attempt and one speculative attempt, the speculative attempt raises FileAlreadyExistsException because both attempts commit into the same staging dir.

Does this PR introduce any user-facing change?

How was this patch tested?

Added UT

@WinkerDu (Contributor, Author)

@xuanyuanking @LuciferYang @LinhongLiu please review.

@AmplabJenkins

Can one of the admins verify this patch?

@WinkerDu (Contributor, Author)

cc @cloud-fan

```diff
 if (hasValidPath) {
-  val (allAbsPathFiles, allPartitionPaths) =
-    taskCommits.map(_.obj.asInstanceOf[(Map[String, String], Set[String])]).unzip
+  val (allAbsPathFiles, allPartitionPaths, successAttemptIDs) =
```
@LuciferYang (Contributor) commented Dec 25, 2019
Should we add an implicit function like

`implicit def asPair(x: (Map[String, String], Set[String], String)) = (x._1, (x._2, x._3))`

before line 172? Then we could unzip taskCommits as (Map[String, String], (Set[String], String)) and eliminate the re-zip operation at line 188.

Is that right?
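A self-contained sketch of this suggestion, under assumed types (the real payload comes from TaskCommitMessage): with an implicit view from the 3-tuple to a nested pair in scope, unzip splits the commits in one pass and avoids a separate re-zip.

```scala
import scala.language.implicitConversions

object UnzipSketch {
  // Assumed commit payload shape: (absPathFiles, partitionPaths, attemptId).
  type Commit = (Map[String, String], Set[String], String)

  // The suggested implicit view: 3-tuple => nested pair, so `unzip` applies.
  implicit def asPair(x: Commit): (Map[String, String], (Set[String], String)) =
    (x._1, (x._2, x._3))

  def split(commits: Seq[Commit])
      : (Seq[Map[String, String]], Seq[Set[String]], Seq[String]) = {
    // `unzip` takes an implicit A => (A1, A2); `asPair` satisfies it.
    val (allAbsPathFiles, rest) = commits.unzip
    val (allPartitionPaths, successAttemptIDs) = rest.unzip
    (allAbsPathFiles, allPartitionPaths, successAttemptIDs)
  }

  def main(args: Array[String]): Unit = {
    val commits: Seq[Commit] = Seq(
      (Map("f1" -> "/abs/f1"), Set("p=1"), "attempt_0"),
      (Map("f2" -> "/abs/f2"), Set("p=2"), "attempt_1"))
    val (_, _, ids) = split(commits)
    assert(ids == Seq("attempt_0", "attempt_1"))
    println(ids.mkString(","))
  }
}
```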

@WinkerDu (Contributor, Author)

Sounds reasonable, I'll try the implicit conversion.

```scala
    .createOptional

  private[spark] val MAX_LOCAL_TASK_FAILURES = ConfigBuilder("spark.task.local.maxFailures")
    .doc("The max failure times for a task while SparkContext running in Local mode, " +
```
Contributor

How could you launch a speculative task when running in local mode?

@WinkerDu (Contributor, Author)

In the UT class InsertWithMultipleTaskAttemptSuite, I don't expect to launch a speculative task in local mode. Instead, I made a customized commit protocol named InsertExceptionCommitProtocol in InsertWithMultipleTaskAttemptSuite, which overrides the commitTask method to fail the first task commit on purpose and behave normally for subsequent commits. This scenario is similar to what happens with speculative tasks when one attempt fails.
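A hedged sketch of that test strategy (FlakyCommitProtocol is an illustrative stand-in, not the actual InsertExceptionCommitProtocol): a commit that fails on the first attempt and succeeds afterward exercises the retry path without launching real speculative tasks.

```scala
object RetryCommitSketch {
  // Illustrative stand-in for a commit protocol whose first commitTask fails.
  class FlakyCommitProtocol {
    private var failedOnce = false

    def commitTask(attemptId: Int): String =
      if (!failedOnce) {
        failedOnce = true
        throw new RuntimeException(s"injected failure for attempt $attemptId")
      } else {
        s"committed attempt $attemptId"
      }
  }

  def main(args: Array[String]): Unit = {
    val protocol = new FlakyCommitProtocol
    // First attempt fails by design; the retry commits cleanly, mimicking a
    // failed attempt followed by a successful second attempt.
    val result =
      try protocol.commitTask(0)
      catch { case _: RuntimeException => protocol.commitTask(1) }
    assert(result == "committed attempt 1")
    println(result)
  }
}
```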

@WinkerDu (Contributor, Author) commented Mar 5, 2020

cc @dongjoon-hyun, please review.

```scala
      // on the rename.
      fs.mkdirs(finalPartPath.getParent)
    }
    fs.rename(new Path(s"$stagingDir/$successAttemptID", part), finalPartPath)
```
Contributor

Do I understand correctly that `part` here is a directory (e.g. x=1/y=2), not a file, so a whole directory of files is being moved?
If so, couldn't multiple tasks write to the same partition, and wouldn't these moves then conflict with each other?

@ramesh-muthusamy (Contributor)

@WinkerDu I think the same issue is being worked on in the PR for ticket SPARK-27194 (https://issues.apache.org/jira/browse/SPARK-27194), #26339. Please validate.
