Skip to content

Conversation

@otterc
Copy link
Contributor

@otterc otterc commented Apr 13, 2021

What changes were proposed in this pull request?

This is the shuffle fetch side change where executors can fetch local/remote push-merged shuffle data from shuffle services. This is needed for push-based shuffle - SPIP SPARK-30602.
The change adds support to the ShuffleBlockFetchIterator to fetch push-merged block meta and shuffle chunks from local and remote ESS. If the fetch of any of these fails, then the iterator fallsback to fetch the original shuffle blocks that belonged to the push-merged block.

Why are the changes needed?

These changes are needed for push-based shuffle. Refer to the SPIP in SPARK-30602.

Does this PR introduce any user-facing change?

When push-based shuffle is turned on then that will fetch push-merged blocks from the remote shuffle service. The client logs will indicate this.

How was this patch tested?

Added unit tests.
The reference PR with the consolidated changes covering the complete implementation is also provided in SPARK-30602.
We have already verified the functionality and the improved performance as documented in the SPIP doc.

Lead-authored-by: Chandni Singh [email protected]
Co-authored-by: Min Shen [email protected]
Co-authored-by: Ye Zhou [email protected]

@otterc
Copy link
Contributor Author

otterc commented Apr 13, 2021

Adding @Victsm @mridulm @zhouyejoe

@otterc
Copy link
Contributor Author

otterc commented May 5, 2021

Will resolve conflicts when this PR is merged as.

@otterc
Copy link
Contributor Author

otterc commented May 27, 2021

I have rebased the changes against the latest master. It only depends on #32140 but it interfaces at just 1 place with that PR so it can still be reviewed in parallel.
Given the size of this change and the feature freeze in early July, would really appreciate if folks can help with the review.
@mridulm @Victsm @Ngone51 @tgravescs @attilapiros

Copy link
Contributor Author

@otterc otterc May 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: Most of this is as before. Have added conditions for shuffleChunks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewers: Both of these are created locally in the initialize and fetchFallbackBlocks. With push-based shuffle, partitionBlocksByFetchMode can be called multiple times because failure to fetch merged shuffle blocks/chunks (fallback) finds original blocks that made up the failed merged block/chunk and then we need to create new FetchRequests from these.

@otterc
Copy link
Contributor Author

otterc commented Jun 3, 2021

Gentle ping to help review this PR @tgravescs @attilapiros @Ngone51 @mridulm @Victsm @zhouyejoe

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took an initial pass, yet to look at ShuffleBlockFetcherIterator or test suites.
I am wondering, given the volume, whether we want to split between ESS side and client side. Thoughts ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getNumBlocks makes this code cleaner.

@otterc
Copy link
Contributor Author

otterc commented Jun 4, 2021

Took an initial pass, yet to look at ShuffleBlockFetcherIterator or test suites.
I am wondering, given the volume, whether we want to split between ESS side and client side. Thoughts ?

Thanks Mridul for reviewing!
My thoughts about splitting this change is that this PR completely encapsulates the fetch-side changes so it is easier to understand how the new messages introduced on the client side are being handled on the server-side. One of the feedbacks we got last year was that we broke things up in a way that made it difficult to understand.
On the server side, this PR mostly adds the wiring needed in ExternalBlockHandler to server the merged meta/data requests from RemoteBlockPushResolver.

That being said, I am still okay to break this change into client/sever PRs if that makes the review easier for the reviewers.
cc. @mridulm @Ngone51 @Victsm @tgravescs @attilapiros

@otterc
Copy link
Contributor Author

otterc commented Jun 28, 2021

@mridulm Resolved the conflict.

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had a minor nit - can you fix it pls ?
Will merge it once you are done.

@Ngone51
Copy link
Member

Ngone51 commented Jun 29, 2021

Sorry for the delay. I'll do a review today. BTW, are there any other necessary magnet PRs that have to be merged for the 3.2 cut/release?

@otterc
Copy link
Contributor Author

otterc commented Jun 29, 2021

Sorry for the delay. I'll do a review today. BTW, are there any other necessary magnet PRs that have to be merged for the 3.2 cut/release?

There are 2 pending tasks which are necessary for Magnet:

@otterc
Copy link
Contributor Author

otterc commented Jun 29, 2021

The test failures are unrelated.

s"(${Utils.bytesToString(pushMergedLocalBlockBytes)}) " +
s"local push-merged and $numRemoteBlocks (${Utils.bytesToString(remoteBlockBytes)}) " +
s"remote blocks")
this.hostLocalBlocks ++= hostLocalBlocksCurrentIteration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we reuse hostLocalBlocksByExecutor here? e.g.,

this.hostLocalBlocks ++= hostLocalBlocksByExecutor.values
      .flatMap { infos => infos.map(info => (info._1, info._3)) }

so we can get rid of hostLocalBlocksCurrentIteration.

Copy link
Contributor Author

@otterc otterc Jun 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would need to do something for finding out the number of hostLocalBlocks in the assertions before as well.

Copy link
Contributor Author

@otterc otterc Jun 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made this change but also added a var for counting num of hostLocalBlocks which is needed for the assertions. PTAL

@Ngone51
Copy link
Member

Ngone51 commented Jun 29, 2021

I left some minor comments. I think we're ready to merge after addressing these comments.

@SparkQA
Copy link

SparkQA commented Jun 29, 2021

Test build #140388 has started for PR 32140 at commit ad89a02.

@asfgit asfgit closed this in 9a5cd15 Jun 29, 2021
@mridulm
Copy link
Contributor

mridulm commented Jun 29, 2021

Thanks for working on this @otterc !
Thanks for the detailed reviews @Ngone51 :-)

@mridulm
Copy link
Contributor

mridulm commented Jun 29, 2021

Merged to master

@otterc
Copy link
Contributor Author

otterc commented Jun 29, 2021

Thanks @mridulm and @Ngone51 for the thorough reviews!

asfgit pushed a commit that referenced this pull request Jul 17, 2021
…utor tries to fetch push-merged blocks

### What changes were proposed in this pull request?
Below 2 bugs were introduced with #32140
1. Instead of requesting the local-dirs for push-merged-local blocks from the ESS, `PushBasedFetchHelper` requests it from other executors. Push-based shuffle is only enabled when the ESS is enabled so it should always fetch the dirs from the ESS and not from other executors which is not yet supported.
2. The size of the push-merged blocks is logged incorrectly.

### Why are the changes needed?
This fixes the above mentioned bugs and is needed for push-based shuffle to work properly.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tested this by running an application on the cluster. The UTs mock the call `hostLocalDirManager.getHostLocalDirs` which is why didn't catch (1) with the UT. However, the fix is trivial and checking this in the UT will require a lot more effort so I haven't modified it in the UT.
Logs of the executor with the bug
```
21/07/15 15:42:46 WARN ExternalBlockStoreClient: Error while trying to get the host local dirs for [shuffle-push-merger]
21/07/15 15:42:46 WARN PushBasedFetchHelper: Error while fetching the merged dirs for push-merged-local blocks: shuffle_0_-1_13. Fetch the original blocks instead
java.lang.RuntimeException: java.lang.IllegalStateException: Invalid executor id: shuffle-push-merger, expected 92.
	at org.apache.spark.network.netty.NettyBlockRpcServer.receive(NettyBlockRpcServer.scala:130)
	at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:163)
```
After the fix, the executors were able to fetch the local push-merged blocks.

Closes #33378 from otterc/SPARK-32922-followup.

Authored-by: Chandni Singh <[email protected]>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
asfgit pushed a commit that referenced this pull request Jul 17, 2021
…utor tries to fetch push-merged blocks

### What changes were proposed in this pull request?
Below 2 bugs were introduced with #32140
1. Instead of requesting the local-dirs for push-merged-local blocks from the ESS, `PushBasedFetchHelper` requests it from other executors. Push-based shuffle is only enabled when the ESS is enabled so it should always fetch the dirs from the ESS and not from other executors which is not yet supported.
2. The size of the push-merged blocks is logged incorrectly.

### Why are the changes needed?
This fixes the above mentioned bugs and is needed for push-based shuffle to work properly.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tested this by running an application on the cluster. The UTs mock the call `hostLocalDirManager.getHostLocalDirs` which is why didn't catch (1) with the UT. However, the fix is trivial and checking this in the UT will require a lot more effort so I haven't modified it in the UT.
Logs of the executor with the bug
```
21/07/15 15:42:46 WARN ExternalBlockStoreClient: Error while trying to get the host local dirs for [shuffle-push-merger]
21/07/15 15:42:46 WARN PushBasedFetchHelper: Error while fetching the merged dirs for push-merged-local blocks: shuffle_0_-1_13. Fetch the original blocks instead
java.lang.RuntimeException: java.lang.IllegalStateException: Invalid executor id: shuffle-push-merger, expected 92.
	at org.apache.spark.network.netty.NettyBlockRpcServer.receive(NettyBlockRpcServer.scala:130)
	at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:163)
```
After the fix, the executors were able to fetch the local push-merged blocks.

Closes #33378 from otterc/SPARK-32922-followup.

Authored-by: Chandni Singh <[email protected]>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
(cherry picked from commit 6d2cbad)
Signed-off-by: Mridul Muralidharan <mridulatgmail.com>
domybest11 pushed a commit to domybest11/spark that referenced this pull request Jun 15, 2022
…l and remote merged shuffle data

### What changes were proposed in this pull request?
This is the shuffle fetch side change where executors can fetch local/remote push-merged shuffle data from shuffle services. This is needed for push-based shuffle - SPIP [SPARK-30602](https://issues.apache.org/jira/browse/SPARK-30602).
The change adds support to the `ShuffleBlockFetchIterator` to fetch push-merged block meta and shuffle chunks from local and remote ESS. If the fetch of any of these fails, then the iterator fallsback to fetch the original shuffle blocks that belonged to the push-merged block.

### Why are the changes needed?
These changes are needed for push-based shuffle. Refer to the SPIP in [SPARK-30602](https://issues.apache.org/jira/browse/SPARK-30602).

### Does this PR introduce _any_ user-facing change?
When push-based shuffle is turned on then that will fetch push-merged blocks from the remote shuffle service. The client logs will indicate this.

### How was this patch tested?
Added unit tests.
The reference PR with the consolidated changes covering the complete implementation is also provided in [SPARK-30602](https://issues.apache.org/jira/browse/SPARK-30602).
We have already verified the functionality and the improved performance as documented in the SPIP doc.

Lead-authored-by: Chandni Singh chsinghlinkedin.com
Co-authored-by: Min Shen mshenlinkedin.com
Co-authored-by: Ye Zhou yezhoulinkedin.com

Closes apache#32140 from otterc/SPARK-32922.

Lead-authored-by: Chandni Singh <[email protected]>
Co-authored-by: Chandni Singh <[email protected]>
Co-authored-by: Min Shen <[email protected]>
Co-authored-by: otterc <[email protected]>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
wangyum pushed a commit that referenced this pull request May 26, 2023
…l and remote merged shuffle data

This is the shuffle fetch side change where executors can fetch local/remote push-merged shuffle data from shuffle services. This is needed for push-based shuffle - SPIP [SPARK-30602](https://issues.apache.org/jira/browse/SPARK-30602).
The change adds support to the `ShuffleBlockFetchIterator` to fetch push-merged block meta and shuffle chunks from local and remote ESS. If the fetch of any of these fails, then the iterator fallsback to fetch the original shuffle blocks that belonged to the push-merged block.

These changes are needed for push-based shuffle. Refer to the SPIP in [SPARK-30602](https://issues.apache.org/jira/browse/SPARK-30602).

When push-based shuffle is turned on then that will fetch push-merged blocks from the remote shuffle service. The client logs will indicate this.

Added unit tests.
The reference PR with the consolidated changes covering the complete implementation is also provided in [SPARK-30602](https://issues.apache.org/jira/browse/SPARK-30602).
We have already verified the functionality and the improved performance as documented in the SPIP doc.

Lead-authored-by: Chandni Singh chsinghlinkedin.com
Co-authored-by: Min Shen mshenlinkedin.com
Co-authored-by: Ye Zhou yezhoulinkedin.com

Closes #32140 from otterc/SPARK-32922.

Lead-authored-by: Chandni Singh <[email protected]>
Co-authored-by: Chandni Singh <[email protected]>
Co-authored-by: Min Shen <[email protected]>
Co-authored-by: otterc <[email protected]>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
wangyum pushed a commit that referenced this pull request May 26, 2023
…utor tries to fetch push-merged blocks

### What changes were proposed in this pull request?
Below 2 bugs were introduced with #32140
1. Instead of requesting the local-dirs for push-merged-local blocks from the ESS, `PushBasedFetchHelper` requests it from other executors. Push-based shuffle is only enabled when the ESS is enabled so it should always fetch the dirs from the ESS and not from other executors which is not yet supported.
2. The size of the push-merged blocks is logged incorrectly.

### Why are the changes needed?
This fixes the above mentioned bugs and is needed for push-based shuffle to work properly.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tested this by running an application on the cluster. The UTs mock the call `hostLocalDirManager.getHostLocalDirs` which is why didn't catch (1) with the UT. However, the fix is trivial and checking this in the UT will require a lot more effort so I haven't modified it in the UT.
Logs of the executor with the bug
```
21/07/15 15:42:46 WARN ExternalBlockStoreClient: Error while trying to get the host local dirs for [shuffle-push-merger]
21/07/15 15:42:46 WARN PushBasedFetchHelper: Error while fetching the merged dirs for push-merged-local blocks: shuffle_0_-1_13. Fetch the original blocks instead
java.lang.RuntimeException: java.lang.IllegalStateException: Invalid executor id: shuffle-push-merger, expected 92.
	at org.apache.spark.network.netty.NettyBlockRpcServer.receive(NettyBlockRpcServer.scala:130)
	at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:163)
```
After the fix, the executors were able to fetch the local push-merged blocks.

Closes #33378 from otterc/SPARK-32922-followup.

Authored-by: Chandni Singh <[email protected]>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants