From 0c549703fe8173f7be17bf9b386489dfba695080 Mon Sep 17 00:00:00 2001
From: Cheng Pan
Date: Mon, 17 Jan 2022 12:14:52 +0800
Subject: [PATCH 1/4] [SPARK-37925] Update document to mention the workaround for YARN-11053

---
 docs/running-on-yarn.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index c55ce86531da..549fe6542c83 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -913,7 +913,11 @@ with a mixed workload of applications running multiple Spark versions, since a g
 the shuffle service is not always compatible with other versions of Spark. YARN versions since 2.9.0
 support the ability to run shuffle services within an isolated classloader
 (see [YARN-4577](https://issues.apache.org/jira/browse/YARN-4577)), meaning multiple Spark versions
-can coexist within a single NodeManager. The
+can coexist within a single NodeManager. There is an issue
+[YARN-11053](https://issues.apache.org/jira/browse/YARN-11053) of YARN isolated classloader which has
+been addressed in YARN 3.3.2/3.4.0, you need to explicitly set
+`yarn.nodemanager.aux-services.<service-name>.system-classes` to some classes other than Spark shuffle
+service classes as a workaround for the former versions. The
 `yarn.nodemanager.aux-services.<service-name>.classpath` and, starting from YARN 2.10.2/3.1.1/3.2.0,
 `yarn.nodemanager.aux-services.<service-name>.remote-classpath` options can be used to configure
 this. In addition to setting up separate classpaths, it's necessary to ensure the two versions
@@ -923,7 +927,11 @@ above. For example, you may have configuration like:
 ```properties
   yarn.nodemanager.aux-services = spark_shuffle_x,spark_shuffle_y
   yarn.nodemanager.aux-services.spark_shuffle_x.classpath = /path/to/spark-x-yarn-shuffle.jar,/path/to/spark-x-config
+  # workaround for YARN-11053
+  yarn.nodemanager.aux-services.spark_shuffle_x.system-classes = classes.other.than.spark.shuffle.service.classes
   yarn.nodemanager.aux-services.spark_shuffle_y.classpath = /path/to/spark-y-yarn-shuffle.jar,/path/to/spark-y-config
+  # workaround for YARN-11053
+  yarn.nodemanager.aux-services.spark_shuffle_y.system-classes = classes.other.than.spark.shuffle.service.classes
 ```
 
 The two `spark-*-config` directories each contain one file, `spark-shuffle-site.xml`. These are XML

From 3dbf2a5c864ff0662c680d62a9a028f3150b6b7d Mon Sep 17 00:00:00 2001
From: Cheng Pan
Date: Wed, 2 Feb 2022 04:37:20 +0800
Subject: [PATCH 2/4] Revert "[SPARK-37925] Update document to mention the workaround for YARN-11053"

This reverts commit d60d899f37eca27b4653d204a31d4770babdef82.
---
 docs/running-on-yarn.md | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index 549fe6542c83..c55ce86531da 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -913,11 +913,7 @@ with a mixed workload of applications running multiple Spark versions, since a g
 the shuffle service is not always compatible with other versions of Spark. YARN versions since 2.9.0
 support the ability to run shuffle services within an isolated classloader
 (see [YARN-4577](https://issues.apache.org/jira/browse/YARN-4577)), meaning multiple Spark versions
-can coexist within a single NodeManager. There is an issue
-[YARN-11053](https://issues.apache.org/jira/browse/YARN-11053) of YARN isolated classloader which has
-been addressed in YARN 3.3.2/3.4.0, you need to explicitly set
-`yarn.nodemanager.aux-services.<service-name>.system-classes` to some classes other than Spark shuffle
-service classes as a workaround for the former versions. The
+can coexist within a single NodeManager. The
 `yarn.nodemanager.aux-services.<service-name>.classpath` and, starting from YARN 2.10.2/3.1.1/3.2.0,
 `yarn.nodemanager.aux-services.<service-name>.remote-classpath` options can be used to configure
 this. In addition to setting up separate classpaths, it's necessary to ensure the two versions
@@ -927,11 +923,7 @@ above. For example, you may have configuration like:
 ```properties
   yarn.nodemanager.aux-services = spark_shuffle_x,spark_shuffle_y
   yarn.nodemanager.aux-services.spark_shuffle_x.classpath = /path/to/spark-x-yarn-shuffle.jar,/path/to/spark-x-config
-  # workaround for YARN-11053
-  yarn.nodemanager.aux-services.spark_shuffle_x.system-classes = classes.other.than.spark.shuffle.service.classes
   yarn.nodemanager.aux-services.spark_shuffle_y.classpath = /path/to/spark-y-yarn-shuffle.jar,/path/to/spark-y-config
-  # workaround for YARN-11053
-  yarn.nodemanager.aux-services.spark_shuffle_y.system-classes = classes.other.than.spark.shuffle.service.classes
 ```
 
 The two `spark-*-config` directories each contain one file, `spark-shuffle-site.xml`. These are XML

From a684e0621ddada4181374809157184bb3af55059 Mon Sep 17 00:00:00 2001
From: Cheng Pan
Date: Wed, 2 Feb 2022 04:46:42 +0800
Subject: [PATCH 3/4] [SPARK-37925] Update document to mention the workaround for YARN-11053

---
 docs/running-on-yarn.md | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index c55ce86531da..c231515581e5 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -916,9 +916,12 @@ support the ability to run shuffle services within an isolated classloader
 can coexist within a single NodeManager. The
 `yarn.nodemanager.aux-services.<service-name>.classpath` and, starting from YARN 2.10.2/3.1.1/3.2.0,
 `yarn.nodemanager.aux-services.<service-name>.remote-classpath` options can be used to configure
-this. In addition to setting up separate classpaths, it's necessary to ensure the two versions
-advertise to different ports. This can be achieved using the `spark-shuffle-site.xml` file described
-above. For example, you may have configuration like:
+this. Notes that YARN 3.3.0/3.3.1 have an issue which requires setting
+`yarn.nodemanager.aux-services.<service-name>.system-classes` as a workaround, see
+[YARN-11053](https://issues.apache.org/jira/browse/YARN-11053) for details. In addition to setting
+up separate classpaths, it's necessary to ensure the two versions advertise to different ports.
+This can be achieved using the `spark-shuffle-site.xml` file described above. For example, you may
+have configuration like:
 
 ```properties
   yarn.nodemanager.aux-services = spark_shuffle_x,spark_shuffle_y

From d21e8fab224c66672b9e8706ccd2e77fd0dc5104 Mon Sep 17 00:00:00 2001
From: Cheng Pan
Date: Wed, 2 Feb 2022 13:18:54 +0800
Subject: [PATCH 4/4] update

---
 docs/running-on-yarn.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/running-on-yarn.md b/docs/running-on-yarn.md
index c231515581e5..63c03760b8be 100644
--- a/docs/running-on-yarn.md
+++ b/docs/running-on-yarn.md
@@ -916,8 +916,8 @@ support the ability to run shuffle services within an isolated classloader
 can coexist within a single NodeManager. The
 `yarn.nodemanager.aux-services.<service-name>.classpath` and, starting from YARN 2.10.2/3.1.1/3.2.0,
 `yarn.nodemanager.aux-services.<service-name>.remote-classpath` options can be used to configure
-this. Notes that YARN 3.3.0/3.3.1 have an issue which requires setting
-`yarn.nodemanager.aux-services.<service-name>.system-classes` as a workaround, see
+this. Note that YARN 3.3.0/3.3.1 have an issue which requires setting
+`yarn.nodemanager.aux-services.<service-name>.system-classes` as a workaround. See
 [YARN-11053](https://issues.apache.org/jira/browse/YARN-11053) for details. In addition to setting
 up separate classpaths, it's necessary to ensure the two versions advertise to different ports.
 This can be achieved using the `spark-shuffle-site.xml` file described above. For example, you may