[SPARK-27180][BUILD][YARN] Fix testing issues with yarn module in Hadoop-3 #24115
<properties>
  <sbt.project.name>yarn</sbt.project.name>
- <jersey-1.version>1.9</jersey-1.version>
+ <jersey-1.version>1.19</jersey-1.version>
Upgrade Jersey to 1.19; otherwise:
[info] Cause: java.lang.NoClassDefFoundError: com/sun/jersey/spi/container/servlet/ServletContainer
[info] at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:1103)
[info] at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1271)
[info] at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
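A `NoClassDefFoundError` like the one above means the class was not loadable at run time. As a minimal sketch (the `ClasspathProbe` helper is hypothetical and not part of Spark or Hadoop), a test could check up front whether a class is visible on the current classpath:

```java
// Hypothetical helper for illustration only; not part of Spark or Hadoop.
final class ClasspathProbe {
    private ClasspathProbe() {}

    /** Returns true if the named class can be loaded; static initializers are not run. */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised while resolving the class.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace above.
        String name = "com.sun.jersey.spi.container.servlet.ServletContainer";
        System.out.println(name + " present: " + isPresent(name));
    }
}
```

Running this on the YARN test classpath would show whether the Jersey servlet container class is available there; the result depends entirely on the classpath in use.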
@@ -0,0 +1,132 @@
/*
Add this class; otherwise:
[info] YarnClusterSuite:
[info] org.apache.spark.deploy.yarn.YarnClusterSuite *** ABORTED *** (33 milliseconds)
[info] java.lang.NoClassDefFoundError: org/apache/hadoop/net/ServerSocketUtil
[info] at org.apache.hadoop.yarn.server.MiniYARNCluster.serviceInit(MiniYARNCluster.java:260)
I tried to do this through Maven, but it failed. It seems that sbt-pom-reader does not support test-jar very well:
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>${hadoop.version}</version>
  <type>test-jar</type>
  <scope>test</scope>
  <exclusions>
    <exclusion>
      <groupId>*</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
We do have other test-jar imports; some use type and some use classifier. Have you tried the latter? For example, we have:
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-yarn-server-tests</artifactId>
  <classifier>tests</classifier>
  <scope>test</scope>
</dependency>
Yes, I tried it. It succeeds on hadoop-3.2 but fails to compile on hadoop-2.7 (it seems to exclude hadoop-common):
[error] /Users/yumwang/SPARK-20845/spark/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ApplicationMasterSuite.scala:34: Class org.apache.hadoop.conf.Configuration not found - continuing with a stub.
[error] val yarnConf = new YarnConfiguration()
[error]
pom.xml
Outdated
<hadoop.version>3.1.0</hadoop.version>
<curator.version>2.12.0</curator.version>
<zookeeper.version>3.4.9</zookeeper.version>
<jetty.version>9.3.24.v20180605</jetty.version>
Since HADOOP-10075 (Hadoop 3.0.0), Hadoop has updated its Jetty dependency to version 9.3.x. This version conflicts with 9.4.x:
[info] YarnClusterSuite:
[info] org.apache.spark.deploy.yarn.YarnClusterSuite *** ABORTED *** (177 milliseconds)
[info] org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.getSessionManager()Lorg/eclipse/jetty/server/SessionManager;
Furthermore, there has been some discussion about this change:
https://issues.apache.org/jira/browse/HADOOP-16152
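The `NoSuchMethodError` above indicates that the Jetty version on the classpath does not have the method Hadoop was compiled against. A rough sketch of how one might probe for it reflectively follows; `MethodProbe` is a hypothetical helper, and the check only looks at the method name and an empty parameter list, not the return type that also appears in the error's descriptor.

```java
// Hypothetical helper for illustration only; not part of Spark, Hadoop, or Jetty.
final class MethodProbe {
    private MethodProbe() {}

    /** Returns true if the class is loadable and exposes a public no-arg method with this name. */
    static boolean hasNoArgMethod(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className, false, MethodProbe.class.getClassLoader());
            cls.getMethod(methodName); // throws NoSuchMethodException if absent
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class and method names taken from the error message above.
        String cls = "org.eclipse.jetty.server.session.SessionHandler";
        System.out.println("getSessionManager() available: "
            + hasNoArgMethod(cls, "getSessionManager"));
    }
}
```

The output depends on which Jetty version, if any, is on the classpath when this runs.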
Hm, I'm afraid this might conflict with Java 11 compatibility, as this was updated for that reason: #22993
https://www.eclipse.org/lists/jetty-announce/msg00124.html
I wonder if this can be worked around on the Spark side, or whether it is indeed possible for Hadoop 3 to update?
We shouldn't change Spark's version of Jetty because a YARN test is failing. A Spark application will not hit the same code paths as the YARN tests (which run a YARN server).
Maybe change the version of jetty in the YARN module when testing with hadoop-3, if there's no other way to work around this.
I tried several ways but they all failed:
- Use hadoop-client-minicluster instead of hadoop-yarn-server-tests: https://github.com/wangyum/spark-hadoop-client-minicluster
- Change the version of jetty in the YARN module when testing with hadoop-3
- Upgrade Eclipse Jetty version to 9.4.x for Hadoop: https://issues.apache.org/jira/browse/HADOOP-16152
The ideal outcome is updating Hadoop, as otherwise it seems like Hadoop 3 and Java 11 support are in conflict. Then again, if Hadoop 3.x isn't quite going to work with Java 11 for other reasons, we have larger problems. Hadoop 2.x is currently working OK with Java 11 tests (it's Hive that's the issue) so I'm kind of surprised.
How did you try to override the Jetty version in the YARN module? I'd expect that's entirely possible
I tried to override the Jetty version in the YARN module in 29e583f, but the override gets evicted:
+-org.eclipse.jetty:jetty-servlet:9.3.24.v20180605 (evicted by: 9.4.12.v20180830)
I found that adapting SessionHandler from jetty-9.3.25.v20180904 lets the YARN module be tested with both Hadoop 2 and Hadoop 3. I know this is not a good approach, but it seems to be the only way at the moment.
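The eviction reported here is consistent with a "latest version wins" conflict policy. The sketch below illustrates that idea only: `EvictionSketch` is a hypothetical class, and its comparison (leading numeric segments, ignoring qualifiers such as `v20180830`) is a simplification that is assumed for illustration and is not the resolver's actual algorithm.

```java
// Hypothetical illustration of "latest version wins"; not sbt's or Ivy's real implementation.
final class EvictionSketch {
    private EvictionSketch() {}

    /** Parses the leading all-digit segments, e.g. "9.4.12.v20180830" becomes [9, 4, 12]. */
    static int[] numericPrefix(String version) {
        String[] parts = version.split("\\.");
        int n = 0;
        while (n < parts.length && parts[n].matches("\\d+")) {
            n++;
        }
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    /** Compares two versions segment by segment; missing segments count as 0. */
    static int compare(String a, String b) {
        int[] x = numericPrefix(a);
        int[] y = numericPrefix(b);
        int len = Math.max(x.length, y.length);
        for (int i = 0; i < len; i++) {
            int xi = i < x.length ? x[i] : 0;
            int yi = i < y.length ? y[i] : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Returns the version that survives; the other one is "evicted". */
    static String winner(String a, String b) {
        return compare(a, b) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(winner("9.3.24.v20180605", "9.4.12.v20180830"));
    }
}
```

Under such a policy, declaring 9.3.24 in one module is not enough on its own while 9.4.12 is still requested elsewhere in the same dependency graph.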
Test build #103561 has finished for PR 24115 at commit
# Conflicts: # pom.xml
[info] | | | | +-org.eclipse.jetty:jetty-servlet:9.3.24.v20180605 (evicted by: 9.4.12.v20180830)
Test build #104104 has finished for PR 24115 at commit
retest this please
Test build #104108 has finished for PR 24115 at commit
resource-managers/yarn/src/test/java/org/eclipse/jetty/server/SessionManager.java
Test build #104123 has finished for PR 24115 at commit
Test build #104124 has finished for PR 24115 at commit
retest this please
Test build #104136 has finished for PR 24115 at commit
resource-managers/yarn/src/test/java/org/eclipse/jetty/server/SessionManager.java
srowen left a comment
Looks OK if that's what we have to do here.
kafka-source-initial-offset-version-2.1.0.bin
kafka-source-initial-offset-future-version.bin
vote.tmpl
SessionManager.java
I guess we can't narrow this to the particular file in its particular directory? It's probably fine if not.
Yes. Please see RAT-240 CLI: RAT excludes file does not allow to set a full path
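To illustrate the consequence of name-only excludes discussed above, here is a small sketch. `RatExcludeSketch` is hypothetical and only mimics matching on the final path component; it is not how Apache RAT is implemented, and the second and third example paths are made up.

```java
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of name-only exclusion; not Apache RAT's implementation.
final class RatExcludeSketch {
    private RatExcludeSketch() {}

    // Two of the entries visible in the excludes diff above.
    static final List<String> EXCLUDED_NAMES = Arrays.asList("vote.tmpl", "SessionManager.java");

    /** Returns true if the last path component matches an excluded file name. */
    static boolean isExcluded(String path) {
        String fileName = Paths.get(path).getFileName().toString();
        return EXCLUDED_NAMES.contains(fileName);
    }

    public static void main(String[] args) {
        // The intended target.
        System.out.println(isExcluded(
            "resource-managers/yarn/src/test/java/org/eclipse/jetty/server/SessionManager.java"));
        // A made-up path with the same file name is excluded as well.
        System.out.println(isExcluded("some/other/module/SessionManager.java"));
    }
}
```

With name-only matching, any file called SessionManager.java is skipped regardless of its directory, which is the trade-off raised in the review comment.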
Test build #104149 has finished for PR 24115 at commit
Merged to master
…oop-3
Fix testing issues with `yarn` module in Hadoop-3:
1. Upgrade jersey-1 to `1.19` to fix ```Cause: java.lang.NoClassDefFoundError: com/sun/jersey/spi/container/servlet/ServletContainer```.
2. Copy `ServerSocketUtil` from hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java to fix ```java.lang.NoClassDefFoundError: org/apache/hadoop/net/ServerSocketUtil```.
3. Adapt `SessionHandler` from jetty-9.3.25.v20180904/jetty-server/src/main/java/org/eclipse/jetty/server/session/SessionHandler.java to fix ```java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.getSessionManager()Lorg/eclipse/jetty/server/SessionManager```.
manual tests:
```shell
build/sbt yarn/test -Pyarn
build/sbt yarn/test -Phadoop-3.2 -Pyarn
build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.deploy.yarn.YarnClusterSuite -pl resource-managers/yarn test -Pyarn
build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.deploy.yarn.YarnClusterSuite -pl resource-managers/yarn test -Pyarn -Phadoop-3.2
```
Closes apache#24115 from wangyum/hadoop3-yarn.
Authored-by: Yuming Wang <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
(cherry picked from commit 13c5c1f)
* SNAPSHOT to 3.2.2.0-1095
* [SPARK-27180][BUILD][YARN] Fix testing issues with yarn module in Hadoop-3
  Closes apache#24115 from wangyum/hadoop3-yarn.
  Authored-by: Yuming Wang <[email protected]>
  Signed-off-by: Sean Owen <[email protected]>
  (cherry picked from commit 13c5c1f)
* [SPARK-42463][SPARK-27180][YARN][TESTS] Clean up the third-party Java files copy introduced by
  SPARK-27180 introduced some third-party Java source code to solve Yarn module test failure, but maven and sbt can also test pass without them, so this pr remove these files.
  Clean up the third-party Java source code copy in Spark.
  No
  - Pass GitHub Actions
  - manual check:
  **Maven**
  ```
  build/mvn clean
  build/mvn clean install -DskipTestes -pl resource-managers/yarn -am -Pyarn
  build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.deploy.yarn.YarnClusterSuite -pl resource-managers/yarn test -Pyarn
  build/mvn test -pl resource-managers/yarn -Pyarn -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest
  ```
  Both `YarnClusterSuite` and full module test passed.
  **SBT**
  ```
  build/sbt clean yarn/test -Pyarn -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest
  ```
  All tests passed.
  Closes apache#40052 from LuciferYang/SPARK-42463.
  Authored-by: yangjie01 <[email protected]>
  Signed-off-by: Sean Owen <[email protected]>
  (cherry picked from commit 64e5928)
* ODP-1095: jettison 1.5.4
* Fixed version as per main across all poms
---------
Co-authored-by: kravii <[email protected]>
Co-authored-by: Yuming Wang <[email protected]>
Co-authored-by: yangjie01 <[email protected]>
Co-authored-by: Prabhjyot Singh <[email protected]>
What changes were proposed in this pull request?
Fix testing issues with the `yarn` module in Hadoop-3:
1. Upgrade jersey-1 to `1.19` to fix `Cause: java.lang.NoClassDefFoundError: com/sun/jersey/spi/container/servlet/ServletContainer`.
2. Copy `ServerSocketUtil` from hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java to fix `java.lang.NoClassDefFoundError: org/apache/hadoop/net/ServerSocketUtil`.
3. Adapt `SessionHandler` from jetty-9.3.25.v20180904/jetty-server/src/main/java/org/eclipse/jetty/server/session/SessionHandler.java to fix `java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.getSessionManager()Lorg/eclipse/jetty/server/SessionManager`.
How was this patch tested?
manual tests: