[SPARK-37060][CORE] Handle driver status response from backup masters #34331
Closed
testsgmr wants to merge 1 commit into apache:master from testsgmr:fix-bug-in-report-driver-status
Conversation
Can one of the admins verify this patch?
Ngone51
reviewed
Nov 12, 2021
Ngone51
reviewed
Nov 12, 2021
Member
srowen
reviewed
Nov 12, 2021
Member
srowen
left a comment
I don't know enough to review this. Can you CC the author of the change before this?
Contributor
Author
Could someone please review these changes? CC: @cloud-fan @Ngone51 @HeartSaVioR @jiangxb1987
Ngone51
reviewed
Dec 10, 2021
Member
@mohamadrezarostami you may need to rebase your branch to pass GA.
…sponse from backup masters
Contributor
Author
Done!
Ngone51
approved these changes
Dec 15, 2021
Member
Thanks, merged to master. @mohamadrezarostami Could you create PRs to backport this to branch-3.2/branch-3.1?
Contributor
Author
@Ngone51
This was referenced Dec 15, 2021
What changes were proposed in this pull request?
After the improvement in SPARK-31486, the client uses the 'asyncSendToMasterAndForwardReply' method instead of 'activeMasterEndpoint.askSync' to get the status of the driver. Because the driver's status is only available on the active master, and 'asyncSendToMasterAndForwardReply' iterates over all of the masters, the client has to handle the responses from the backup masters, which the SPARK-31486 change did not consider. As a result, drivers running in cluster mode on a cluster with multiple masters are affected by this bug.
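The handling described above can be sketched roughly as follows. This is a minimal, self-contained Scala model and not the actual Spark code: 'StatusReply', its 'fromActiveMaster' flag, 'Decision', and 'decide' are hypothetical names introduced only for illustration, and how the real client tells a backup master's reply apart is an assumption here.

```scala
// Hypothetical, simplified model of a driver status reply.
// None of these types are Spark classes; they only illustrate the idea.
final case class StatusReply(
    found: Boolean,            // whether the master knows the driver
    state: Option[String],     // driver state, if known
    fromActiveMaster: Boolean  // assumption: the reply reveals who sent it
)

// What the client should do with a single reply.
sealed trait Decision
case object Ignore extends Decision                 // reply came from a backup master
final case class Report(state: String) extends Decision
case object Fail extends Decision                   // active master does not know the driver

object DriverStatusSketch {
  // The request is sent to every master, so backup masters also answer.
  // Only the active master's answer may decide the outcome.
  def decide(reply: StatusReply): Decision =
    if (!reply.fromActiveMaster) {
      Ignore
    } else {
      reply.state match {
        case Some(s) if reply.found => Report(s)
        case _                      => Fail
      }
    }

  def main(args: Array[String]): Unit = {
    val replies = Seq(
      StatusReply(found = false, state = None, fromActiveMaster = false),
      StatusReply(found = true, state = Some("RUNNING"), fromActiveMaster = true)
    )
    // The backup master's "not found" reply is skipped instead of being
    // treated as a failure; the active master's reply is reported.
    replies.map(decide).foreach(println)
  }
}
```

Without the 'Ignore' branch, the first reply in the example would be treated the same as an active master that does not recognize the driver, which matches the failure described for multi-master clusters.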
Why are the changes needed?
The client needs to detect whether a response was received from a backup master and, if so, ignore it.
Does this PR introduce any user-facing change?
No. It only fixes a bug and restores the ability to deploy in cluster mode on multi-master clusters.
How was this patch tested?