[SPARK-17321][YARN] Avoid writing shuffle metadata to disk if NM recovery is disabled #19032
Conversation
Change-Id: Id062d71589f46052706058c151c706dae38b1e6e
CC @LiShuMing, please take a look at another approach to fix the bad disk issue. Also pinging @tgravescs to review the PR. Thanks a lot.
Test build #81067 has finished for PR 19032 at commit
vanzin left a comment:
Can you add @Override to setRecoveryPath? It took some head scratching to figure out where that method was being called from without it. While you're at it, you can update the javadoc for that method.
You also don't need the "recoveryEnabled" check; you can use the recovery path for that, since it's only set by YARN when recovery is enabled.
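To make that suggestion concrete, here is a minimal sketch of the shape being described (field and method names such as _recoveryPath are illustrative, not necessarily the PR's final code); since YARN only calls setRecoveryPath when NM recovery is enabled, a null path can stand in for the boolean flag:

// Sketch only: gate recovery behavior on the recovery path itself rather than a
// separate recoveryEnabled boolean. Uses org.apache.hadoop.fs.Path.
private Path _recoveryPath = null;

@Override
public void setRecoveryPath(Path recoveryPath) {
  _recoveryPath = recoveryPath;
}

private boolean isRecoveryEnabled() {
  // Non-null only when YARN has enabled NM recovery and handed us a path.
  return _recoveryPath != null;
}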
public class YarnShuffleService extends AuxiliaryService {

  private static final Logger logger = LoggerFactory.getLogger(YarnShuffleService.class);

  private static final boolean DEFAULT_NM_RECOVERY_ENABLED = false;
Isn't this in YarnConfiguration?
Let me check the yarn code.
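For reference, on Hadoop 2.5+ the flag and its default already live on YarnConfiguration, so a sketch of reading it without redefining the default locally (assuming that Hadoop version is available at build time) would be:

// conf is the Hadoop Configuration passed to the auxiliary service.
boolean recoveryEnabled = conf.getBoolean(
    YarnConfiguration.NM_RECOVERY_ENABLED,           // "yarn.nodemanager.recovery.enabled"
    YarnConfiguration.DEFAULT_NM_RECOVERY_ENABLED);  // false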
  boolean authEnabled = conf.getBoolean(SPARK_AUTHENTICATE_KEY, DEFAULT_SPARK_AUTHENTICATE);
  if (authEnabled) {
-   createSecretManager();
+   createSecretManager(recoveryEnabled);
I think at this point it would be cleaner to do:
secretManager = new ShuffleSecretManager();
if (recoveryEnabled) {
loadSecretsFromDb();
}
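In context, the suggested call-site shape would look roughly like the sketch below; loadSecretsFromDb() is the helper named in the snippet above and stands in for whatever DB-backed initialization the service does when recovery is on:

boolean authEnabled = conf.getBoolean(SPARK_AUTHENTICATE_KEY, DEFAULT_SPARK_AUTHENTICATE);
if (authEnabled) {
  // Always create the secret manager when authentication is on...
  secretManager = new ShuffleSecretManager();
  // ...but only touch the on-disk recovery state when NM recovery is enabled.
  if (recoveryEnabled) {
    loadSecretsFromDb();
  }
}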
Change-Id: I06866813d24af5cd6ae64f45df1c7a4ebaf2b12d
  /**
   * Figure out the recovery path and handle moving the DB if YARN NM recovery gets enabled
   * when it previously was not. If YARN NM recovery is enabled it uses that path, otherwise
   * it will uses a YARN local dir.
Need to update the comment; probably just remove the last sentence.
Change-Id: I90ae354f840534fad0b448392cab5713eb7c7171
lgtm
Test build #81180 has finished for PR 19032 at commit
Jenkins, retest this please.
@vanzin @tgravescs do you have any further comments?
Test build #81236 has finished for PR 19032 at commit
+1 go ahead and commit
  /**
   * Set the recovery path for shuffle service recovery when NM is restarted. The method will be
   * overrode and called when Hadoop version is 2.5+ and NM recovery is enabled, otherwise we
Could you update this comment since it's out of date now?
Sure I will.
Change-Id: I85129842468e74c3a91232991595916d50b206fc
Test build #81271 has finished for PR 19032 at commit
Merged to master branch.
…ervice.db.backend` in `running-on-yarn.md`

### What changes were proposed in this pull request?
From the context of [pr](#19032) for [SPARK-17321](https://issues.apache.org/jira/browse/SPARK-17321), `YarnShuffleService` will persist data into LevelDB/RocksDB when YARN NM recovery is enabled. So this PR adds a precondition description, tied to YARN NM recovery being enabled, for `spark.shuffle.service.db.backend` in `running-on-yarn.md`.

### Why are the changes needed?
Add a precondition description for `spark.shuffle.service.db.backend` in `running-on-yarn.md`.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GitHub Actions.

Closes #37853 from LuciferYang/SPARK-40404.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
In the current code, if NM recovery is not enabled then `YarnShuffleService` will write shuffle metadata to NM local dir-1; if this local dir-1 is on a bad disk, `YarnShuffleService` will fail to start. To solve this issue, on the Spark side, if NM recovery is not enabled then Spark will not persist data into leveldb. In that case the YARN shuffle service can still serve requests, but loses the ability to recover, which is fine because a failure of the NM kills the containers as well as the applications.

How was this patch tested?
Tested in a local cluster with NM recovery off and on to see whether the folder is created or not. A MiniCluster UT isn't added because in MiniCluster the NM will always set the port to 0, but NM recovery requires a non-ephemeral port.
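As a rough illustration of the approach described above (a sketch, not the literal patch; _recoveryPath, initRecoveryDb, and the DB file name are stand-in names), the recovery DB is only created when YARN has provided a recovery path, i.e. when NM recovery is enabled; otherwise nothing is written to the NM local dirs and the service runs without restart recovery:

@Override
protected void serviceInit(Configuration conf) throws Exception {
  File registeredExecutorFile = null;      // stays null when NM recovery is off
  if (_recoveryPath != null) {             // set by YARN only when NM recovery is enabled
    // Persist executor registrations so they survive an NM restart.
    registeredExecutorFile = initRecoveryDb("registeredExecutors.ldb");
  }
  // With a null file the shuffle service still serves blocks; it just cannot
  // recover state after an NM restart, which is acceptable because an NM failure
  // kills the containers and their applications anyway.
  blockHandler = new ExternalShuffleBlockHandler(transportConf, registeredExecutorFile);
}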