[SPARK-29330][CORE][YARN] Allow users to chose the name of Spark Shuffle service #26000
Changes from 2 commits
786f24a
30f8c1d
27e5c87
60795d4
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -136,7 +136,11 @@ public class YarnShuffleService extends AuxiliaryService {` |
| 136 | 136 | `   private DB db;` |
| 137 | 137 | |
| 138 | 138 | `   public YarnShuffleService() {` |
| 139 | | `-    super("spark_shuffle");` |
| | 139 | `+    this("spark_shuffle");` |
| | 140 | `+  }` |
| | 141 | `+` |
| | 142 | `+  public YarnShuffleService(String serviceName) {` |
| | 143 | `+    super(serviceName);` |
| 140 | 144 | `     logger.info("Initializing YARN shuffle service for Spark");` |
| 141 | 145 | `     instance = this;` |
| 142 | 146 | `   }` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -443,6 +443,9 @@ package object config {` |
| 443 | 443 | `   private[spark] val SHUFFLE_SERVICE_PORT =` |
| 444 | 444 | `     ConfigBuilder("spark.shuffle.service.port").intConf.createWithDefault(7337)` |
| 445 | 445 | |
| | 446 | `+  private[spark] val SHUFFLE_SERVICE_NAME =` |
| | 447 | `+    ConfigBuilder("spark.shuffle.service.name").stringConf.createWithDefault("spark_shuffle")` |
| | 448 | `+` |
| 446 | 449 | `   private[spark] val KEYTAB = ConfigBuilder("spark.kerberos.keytab")` |
| 447 | 450 | `     .doc("Location of user's keytab.")` |
| 448 | 451 | `     .stringConf.createOptional` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -865,6 +865,13 @@ Apart from these, the following properties are also available, and may be useful` |
| 865 | 865 | `   configuration and setup documentation</a> for more information.` |
| 866 | 866 | `   </td>` |
| 867 | 867 | ` </tr>` |
| | 868 | `+<tr>` |
| | 869 | `+  <td><code>spark.shuffle.service.name</code></td>` |
| | 870 | `+  <td>spark_shuffle</td>` |
| | 871 | `+  <td>` |
| | 872 | `+    Name of the external shuffle service.` |
| | 873 | `+  </td>` |
| | 874 | `+</tr>` |
| 868 | 875 | ` <tr>` |
| 869 | 876 | `   <td><code>spark.shuffle.service.port</code></td>` |
| 870 | 877 | `   <td>7337</td>` |
Is this still hardcoded? Should we use the configured `SHUFFLE_SERVICE_NAME` instead?
It is still hardcoded. I haven't found a way to access the Spark configuration from that constructor, and `org.apache.hadoop.yarn.server.api.AuxiliaryService` requires the name. Do you have a suggestion for how that could be done?
As I commented below (#26000 (comment)), if this is just for YARN, why not put it in YarnShuffleService, like "spark.yarn.shuffle.stopOnFailure"?
It is hardcoded here. Once the shuffle service name is configured, won't the two names mismatch? Will that cause a problem?
It is hardcoded here. HDP hardcodes another value, though (`spark2_shuffle`). Vanilla Spark would keep working as is and would use the name `spark_shuffle`, while the new configuration option would allow users to point Spark to a non-vanilla shuffle service.

The changes to that class are there only to test that changing the name of the service and changing the name in the configuration play nicely together.
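To make the client-side role of the option concrete, here is a minimal sketch of how a caller might resolve the service name and key its payload by it. This is not the code in this PR: the class `ShuffleServiceNameLookup`, the helper names, and the use of a plain `Map` in place of the Spark configuration are all assumptions for illustration. Only the key `spark.shuffle.service.name` and the default `spark_shuffle` come from the diff above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; a plain Map stands in for the Spark configuration.
class ShuffleServiceNameLookup {
  static final String NAME_KEY = "spark.shuffle.service.name";
  static final String DEFAULT_NAME = "spark_shuffle";

  // Resolve the auxiliary service name, falling back to the vanilla default.
  static String serviceName(Map<String, String> sparkConf) {
    return sparkConf.getOrDefault(NAME_KEY, DEFAULT_NAME);
  }

  // Build a single-entry map keyed by the resolved service name
  // (assumption: the client addresses the service by this key).
  static Map<String, ByteBuffer> serviceData(Map<String, String> sparkConf, String secret) {
    ByteBuffer payload = ByteBuffer.wrap(secret.getBytes(StandardCharsets.UTF_8));
    return Collections.singletonMap(serviceName(sparkConf), payload);
  }

  public static void main(String[] args) {
    Map<String, String> vanilla = new HashMap<>();
    Map<String, String> hdp = new HashMap<>();
    hdp.put(NAME_KEY, "spark2_shuffle");

    System.out.println(serviceData(vanilla, "secret").keySet());
    System.out.println(serviceData(hdp, "secret").keySet());
  }
}
```

With no override the key stays `spark_shuffle`, so existing deployments are unaffected. Setting the option only changes which service name the client asks for.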
It seems impossible to register the service with the name passed in the configuration because the configuration is passed after the class is instantiated.
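The ordering problem can be shown with a small self-contained sketch. `FakeAuxiliaryService` and `FakeShuffleService` are hypothetical stand-ins, not the Hadoop or Spark classes, and a plain `Map` replaces the real configuration object. The sketch assumes the container creates the service through its no-argument constructor and hands over the configuration only afterwards.

```java
import java.util.HashMap;
import java.util.Map;

class AuxServiceLifecycleDemo {
  // Hypothetical stand-in for AuxiliaryService: the name is fixed at construction.
  static abstract class FakeAuxiliaryService {
    private final String name;

    protected FakeAuxiliaryService(String name) {
      this.name = name;
    }

    final String getName() {
      return name;
    }

    // Configuration only becomes visible here, after the constructor has run.
    abstract void serviceInit(Map<String, String> conf);
  }

  static class FakeShuffleService extends FakeAuxiliaryService {
    String requestedName; // what the configuration asked for, known only after init

    FakeShuffleService() {
      this("spark_shuffle");
    }

    FakeShuffleService(String serviceName) {
      super(serviceName);
    }

    @Override
    void serviceInit(Map<String, String> conf) {
      requestedName = conf.getOrDefault("spark.shuffle.service.name", "spark_shuffle");
    }
  }

  public static void main(String[] args) {
    // Step 1: instantiate with the no-arg constructor, so the name is already set.
    FakeShuffleService service = new FakeShuffleService();

    // Step 2: the configuration arrives, too late to influence the name.
    Map<String, String> conf = new HashMap<>();
    conf.put("spark.shuffle.service.name", "spark2_shuffle");
    service.serviceInit(conf);

    System.out.println("registered as: " + service.getName());
    System.out.println("config asked for: " + service.requestedName);
  }
}
```

The registered name stays `spark_shuffle` even though the configuration requests `spark2_shuffle`, which is the mismatch raised earlier in this thread.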
I see. So this config can only be used to let Spark choose which service to connect to. It cannot change the name of the shuffle service itself.
Yes. I guess I could implement a workaround that reads the config setting from the default `Configuration`, but that, at least theoretically, wouldn't guarantee that the same configuration is the one passed during service initialization.
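A rough sketch of that workaround and its risk, under stated assumptions: `defaultConf` is a hypothetical stand-in for a default configuration loaded independently of the container, `WorkaroundShuffleService` is not the real service class, and returning a boolean from `serviceInit` is purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class DefaultConfWorkaroundDemo {
  static final String NAME_KEY = "spark.shuffle.service.name";
  static final String DEFAULT_NAME = "spark_shuffle";

  // Hypothetical stand-in for a default configuration read at construction time.
  static Map<String, String> defaultConf = new HashMap<>();

  static class WorkaroundShuffleService {
    final String name;

    // Workaround: pick the name from the default configuration in the no-arg constructor.
    WorkaroundShuffleService() {
      this.name = defaultConf.getOrDefault(NAME_KEY, DEFAULT_NAME);
    }

    // Returns true only if the configuration passed at init agrees with the chosen name.
    boolean serviceInit(Map<String, String> passedConf) {
      String expected = passedConf.getOrDefault(NAME_KEY, DEFAULT_NAME);
      return expected.equals(name);
    }
  }

  public static void main(String[] args) {
    defaultConf.put(NAME_KEY, "spark2_shuffle");
    WorkaroundShuffleService service = new WorkaroundShuffleService();

    Map<String, String> sameConf = new HashMap<>();
    sameConf.put(NAME_KEY, "spark2_shuffle");
    Map<String, String> otherConf = new HashMap<>(); // falls back to spark_shuffle

    System.out.println("consistent with same conf: " + service.serviceInit(sameConf));
    System.out.println("consistent with other conf: " + service.serviceInit(otherConf));
  }
}
```

The second call reports an inconsistency because nothing ties the default configuration to the one passed at initialization, which is the theoretical gap described above.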