[SPARK-8167] Tasks that fail from YARN preemption should not fail job #13
Conversation
The architecture is as follows: in YARN mode, when the driver detects that an executor has disconnected, it asks the ApplicationMaster why the executor died. If the ApplicationMaster reports that the executor died because of preemption, the tasks associated with that executor are not marked as failed. The executor is still removed from the driver's list of available executors, however.
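That flow can be pictured with a minimal, self-contained Scala sketch. The names below (AMClient, DriverSketch, the loss-reason cases) are illustrative stand-ins for this example, not the actual Spark classes touched by this PR.

```scala
// Illustrative sketch: the driver asks the AM why an executor died before deciding
// whether its tasks should count as failures.
sealed trait ExecutorLossReason
case object Preempted extends ExecutorLossReason              // container taken back by YARN
case class ExecutorExited(code: Int) extends ExecutorLossReason

trait AMClient {
  // In YARN mode the driver can ask the ApplicationMaster why a container finished.
  def lossReasonFor(executorId: String): ExecutorLossReason
}

class DriverSketch(am: AMClient) {
  private val executors = scala.collection.mutable.Set("exec-1", "exec-2")

  def onDisconnected(executorId: String): Unit = {
    val reason = am.lossReasonFor(executorId)   // ask the AM before deciding
    executors -= executorId                     // the executor is removed either way
    reason match {
      case Preempted =>
        println(s"$executorId was preempted: do NOT count its tasks as failures")
      case ExecutorExited(code) =>
        println(s"$executorId exited with code $code: tasks count toward failure limits")
    }
  }
}
```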
…happen immediately. Also begin unit tests.
Not sure how I feel about creating the subclass of DriverEndpoint here.
This change only affects behavior in YARN mode. One could conceivably want to do something similar in standalone mode, e.g. ask the Spark master why an executor terminated, but to be safe and to minimize the places this change impacts, I tried to localize everything to YARN mode.
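One way to see how the change stays localized: the generic scheduler backend exposes a factory method for its driver endpoint, and only the YARN backend overrides it. The class and method names in this sketch are illustrative, not the exact Spark internals.

```scala
// Sketch of keeping the behavior change YARN-only via a DriverEndpoint subclass.
class GenericDriverEndpoint {
  def onDisconnected(executorId: String): Unit =
    println(s"$executorId lost for an unknown reason")      // default behavior everywhere
}

class GenericSchedulerBackend {
  protected def createDriverEndpoint(): GenericDriverEndpoint = new GenericDriverEndpoint
}

class YarnSchedulerBackendSketch extends GenericSchedulerBackend {
  // Only YARN mode swaps in an endpoint that first asks the AM for the loss reason.
  override protected def createDriverEndpoint(): GenericDriverEndpoint =
    new GenericDriverEndpoint {
      override def onDisconnected(executorId: String): Unit =
        println(s"$executorId lost; querying the ApplicationMaster for the reason")
    }
}
```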
…" into true or false directly SQL ``` select key from src where 3 in (4, 5); ``` Before ``` == Optimized Logical Plan == Project [key#12] Filter 3 INSET (5,4) MetastoreRelation default, src, None ``` After ``` == Optimized Logical Plan == LocalRelation [key#228], [] ``` Author: Zhongshuai Pei <[email protected]> Author: DoingDone9 <[email protected]> Closes apache#5972 from DoingDone9/InToFalse and squashes the following commits: 4c722a2 [Zhongshuai Pei] Update predicates.scala abe2bbb [Zhongshuai Pei] Update Optimizer.scala fa461a5 [Zhongshuai Pei] Update Optimizer.scala e34c28a [Zhongshuai Pei] Update predicates.scala 24739bd [Zhongshuai Pei] Update ConstantFoldingSuite.scala f4dbf50 [Zhongshuai Pei] Update ConstantFoldingSuite.scala 35ceb7a [Zhongshuai Pei] Update Optimizer.scala 36c194e [Zhongshuai Pei] Update Optimizer.scala 2e8f6ca [Zhongshuai Pei] Update Optimizer.scala 14952e2 [Zhongshuai Pei] Merge pull request #13 from apache/master f03fe7f [Zhongshuai Pei] Merge pull request #12 from apache/master f12fa50 [Zhongshuai Pei] Merge pull request #10 from apache/master f61210c [Zhongshuai Pei] Merge pull request #9 from apache/master 34b1a9a [Zhongshuai Pei] Merge pull request #8 from apache/master 802261c [DoingDone9] Merge pull request #7 from apache/master d00303b [DoingDone9] Merge pull request #6 from apache/master 98b134f [DoingDone9] Merge pull request #5 from apache/master 161cae3 [DoingDone9] Merge pull request #4 from apache/master c87e8b6 [DoingDone9] Merge pull request #3 from apache/master cb1852d [DoingDone9] Merge pull request #2 from apache/master c3f046f [DoingDone9] Merge pull request #1 from apache/master (cherry picked from commit 4b5e1fe) Signed-off-by: Michael Armbrust <[email protected]>
Ping?
Do you need to keep this hashset? Is it possible for the same executor to appear twice in the onDisconnected() callback?
Happened to me in local testing, but it's still not clear why. Might be a weird Akka thing.
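For illustration, a guard like the hashset discussed here could look like the following; the class and names are hypothetical, not the code in this PR.

```scala
// Hypothetical dedup guard: process each executor's disconnect at most once.
class DisconnectTracker {
  private val alreadyHandled = scala.collection.mutable.HashSet[String]()

  def onDisconnected(executorId: String): Unit = {
    // add() returns false if the id was already present, so duplicates are dropped.
    if (alreadyHandled.add(executorId)) {
      println(s"handling loss of $executorId")
    } else {
      println(s"ignoring duplicate disconnect for $executorId")
    }
  }
}
```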
ok
OK, I'll just submit this PR to upstream.
Conflicts:
	core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
	core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala
	yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
	yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
I also changed the semantics of YarnAllocator.getExecutorLossReason() to better clean up the completed-executor map, and processCompletedContainers() now always adds an entry to the completed-executor exit-reason map regardless of exit status, since the client is expected to always call getExecutorLossReason().
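A rough sketch of those semantics, with illustrative names rather than the real YarnAllocator code: every completed container records a reason, and reading the reason removes the map entry, since the caller is expected to ask exactly once.

```scala
// Illustrative allocator-side bookkeeping for executor loss reasons.
sealed trait LossReason
case object PreemptedByScheduler extends LossReason
case class NonZeroExit(status: Int) extends LossReason

class AllocatorSketch {
  private val completedReasons = scala.collection.mutable.HashMap[String, LossReason]()

  def processCompletedContainer(executorId: String, exitStatus: Int, preempted: Boolean): Unit = {
    // A reason is recorded regardless of exit status.
    val reason = if (preempted) PreemptedByScheduler else NonZeroExit(exitStatus)
    completedReasons(executorId) = reason
  }

  def getExecutorLossReason(executorId: String): Option[LossReason] =
    completedReasons.remove(executorId)   // returning the reason also cleans it up
}
```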
I lied. Master's updates are making me rethink this implementation and what needs to change.
PR is made against upstream instead.
Previously, when tasks failed on executors that were preempted, those tasks counted as failed tasks. If the number of failures reached spark.task.maxFailures, the entire stage would abort, likely causing the active job to crash. However, preemption is a normal occurrence, and tasks that fail due to preemption should not fail the job.
The solution is to find out why an executor disconnected when it does. Previously, when an executor disconnected, the onDisconnected event was fired to the Spark driver, and the driver simply removed the executor without first finding out why it was terminated. Now, in YARN mode specifically, the driver queries the ApplicationMaster for the reason the executor was terminated and forwards that information to the scheduler. The TaskSetManager takes the executor loss reason into consideration when determining whether the job should fail.
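A simplified sketch of the scheduler-side decision, assuming a preemption loss reason should not count toward spark.task.maxFailures; the names and failure bookkeeping below are illustrative, not the real TaskSetManager.

```scala
// Illustrative TaskSetManager-side logic: re-run tasks from a lost executor, but only
// count the failures toward maxFailures when the loss was caused by the application.
sealed trait Loss { def countsAsTaskFailure: Boolean }
case object PreemptionLoss extends Loss { val countsAsTaskFailure = false }
case object AppCausedLoss  extends Loss { val countsAsTaskFailure = true }

class TaskSetManagerSketch(maxFailures: Int) {
  private val failureCounts = scala.collection.mutable.HashMap[Int, Int]().withDefaultValue(0)

  def executorLost(taskIdsOnExecutor: Seq[Int], reason: Loss): Unit = {
    for (task <- taskIdsOnExecutor) {
      if (reason.countsAsTaskFailure) {
        failureCounts(task) += 1
        if (failureCounts(task) >= maxFailures) {
          println(s"task $task failed $maxFailures times: aborting the stage")
        }
      }
      println(s"re-queueing task $task")   // the task is rescheduled in either case
    }
  }
}
```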
Reviewers requested: @mingyukim @punya @ash211 - this PR is only for review. It's a big one, so I'd like to get it reviewed internally before submitting upstream for review.