[SPARK-27180][BUILD][YARN] Fix testing issues with yarn module in Hadoop-3 #24115
Changes to the yarn module's pom.xml ("Spark Project YARN"):

```diff
@@ -29,7 +29,7 @@
   <name>Spark Project YARN</name>
   <properties>
     <sbt.project.name>yarn</sbt.project.name>
-    <jersey-1.version>1.9</jersey-1.version>
+    <jersey-1.version>1.19</jersey-1.version>
   </properties>

   <dependencies>
@@ -166,6 +166,12 @@
       <scope>test</scope>
       <version>${jersey-1.version}</version>
     </dependency>
+    <dependency>
+      <groupId>com.sun.jersey</groupId>
+      <artifactId>jersey-servlet</artifactId>
+      <scope>test</scope>
+      <version>${jersey-1.version}</version>
+    </dependency>

     <!-- These dependencies are duplicated from core, because dependencies in the "provided"
          scope are not transitive.-->
```

**Member (Author)**, on the `jersey-1.version` change: Upgrade Jersey to 1.19; otherwise the tests fail.
New file added under the yarn module's test sources: `ServerSocketUtil.java` (`@@ -0,0 +1,132 @@`).

**Member (Author):** Add this class; otherwise the Hadoop-3 tests fail. I tried to do it by Maven instead, but failed. It seems that this does not work:

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>${hadoop.version}</version>
  <type>test-jar</type>
  <scope>test</scope>
  <exclusions>
    <exclusion>
      <groupId>*</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

**Contributor:** We do have other test-jar imports; some use …

**Member (Author):** Yes, I tried it; it succeeds on hadoop-3.2, but throws a compilation exception on hadoop-2.7 (it seems the exclude …):

```
[error] /Users/yumwang/SPARK-20845/spark/resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/ApplicationMasterSuite.scala:34: Class org.apache.hadoop.conf.Configuration not found - continuing with a stub.
[error] val yarnConf = new YarnConfiguration()
[error]
```

The new file:

```java
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.hadoop.net;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;
import java.net.ServerSocket;
import java.util.Random;

/**
 * Copied from
 * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java
 * for Hadoop-3.x testing
 */
public class ServerSocketUtil {

  private static final Logger LOG = LoggerFactory.getLogger(ServerSocketUtil.class);
  private static Random rand = new Random();

  /**
   * Port scan & allocate is how most other apps find ports
   *
   * @param port given port
   * @param retries number of retries
   * @return an available port
   * @throws IOException if no port could be bound within the given retries
   */
  public static int getPort(int port, int retries) throws IOException {
    int tryPort = port;
    int tries = 0;
    while (true) {
      if (tries > 0 || tryPort == 0) {
        tryPort = port + rand.nextInt(65535 - port);
      }
      if (tryPort == 0) {
        continue;
      }
      try (ServerSocket s = new ServerSocket(tryPort)) {
        LOG.info("Using port " + tryPort);
        return tryPort;
      } catch (IOException e) {
        tries++;
        if (tries >= retries) {
          LOG.info("Port is already in use; giving up");
          throw e;
        } else {
          LOG.info("Port is already in use; trying again");
        }
      }
    }
  }

  /**
   * Check whether the port is available or not.
   *
   * @param port given port
   * @return true if the port could be bound
   */
  private static boolean isPortAvailable(int port) {
    try (ServerSocket s = new ServerSocket(port)) {
      return true;
    } catch (IOException e) {
      return false;
    }
  }

  /**
   * Wait till the port is available.
   *
   * @param port given port
   * @param retries number of retries for given port
   * @return the port once it is available
   * @throws InterruptedException if interrupted while sleeping between retries
   * @throws IOException if the port is still in use after the given retries
   */
  public static int waitForPort(int port, int retries)
      throws InterruptedException, IOException {
    int tries = 0;
    while (true) {
      if (isPortAvailable(port)) {
        return port;
      } else {
        tries++;
        if (tries >= retries) {
          throw new IOException(
              "Port is already in use; giving up after " + tries + " times.");
        }
        Thread.sleep(1000);
      }
    }
  }

  /**
   * Find the specified number of unique ports available.
   * The ports are all closed afterwards,
   * so other network services started may grab those same ports.
   *
   * @param numPorts number of required port numbers
   * @return array of available port numbers
   * @throws IOException if a probe socket could not be opened
   */
  public static int[] getPorts(int numPorts) throws IOException {
    ServerSocket[] sockets = new ServerSocket[numPorts];
    int[] ports = new int[numPorts];
    for (int i = 0; i < numPorts; i++) {
      ServerSocket sock = new ServerSocket(0);
      sockets[i] = sock;
      ports[i] = sock.getLocalPort();
    }
    for (ServerSocket sock : sockets) {
      sock.close();
    }
    return ports;
  }
}
```
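For context, a minimal usage sketch of the helper above. The class name `PortDemo` and the argument values are illustrative assumptions, not part of the patch; the `getPort` body restates the patch's retry loop (minus logging) so the snippet is self-contained and runnable:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Random;

public class PortDemo {
    private static final Random rand = new Random();

    // Same retry loop as ServerSocketUtil.getPort above: try the preferred
    // port, and on collision pick a random port in (port, 65535), giving up
    // after `retries` failed attempts.
    static int getPort(int port, int retries) throws IOException {
        int tryPort = port;
        int tries = 0;
        while (true) {
            if (tries > 0 || tryPort == 0) {
                tryPort = port + rand.nextInt(65535 - port);
            }
            if (tryPort == 0) {
                continue;
            }
            try (ServerSocket s = new ServerSocket(tryPort)) {
                return tryPort; // bindable; probe socket closes on return
            } catch (IOException e) {
                tries++;
                if (tries >= retries) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 0 means "no preference": the loop picks a random free port.
        int port = getPort(0, 10);
        System.out.println(port > 0 && port < 65536);
        // The probe socket was closed, so a test server can now bind the port.
        try (ServerSocket s = new ServerSocket(port)) {
            System.out.println(s.getLocalPort() == port);
        }
    }
}
```

Note the inherent race: the probe socket is closed before the caller binds, so another process could grab the port in between, which is why the helper is only used for tests.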
Since HADOOP-10075 (Hadoop 3.0.0), Hadoop has updated its Jetty dependency to 9.3.x. This version conflicts with the 9.4.x line that Spark uses.
Furthermore, there has been some discussion about this change:
https://issues.apache.org/jira/browse/HADOOP-16152
Hm, I'm afraid this might conflict with Java 11 compatibility, as Jetty was updated to 9.4.x for exactly that reason: #22993
https://www.eclipse.org/lists/jetty-announce/msg00124.html
I wonder if this can be worked around on the Spark side, or whether it is indeed possible for Hadoop 3 to update?
We shouldn't change Spark's version of Jetty because a YARN test is failing. A Spark application will not hit the same code paths as the YARN tests (which run a YARN server).
Maybe change the version of jetty in the YARN module when testing with hadoop-3, if there's no other way to work around this.
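If that route were taken, the override would presumably look something like the following sketch in the yarn module's pom.xml. This is a hypothetical illustration, not part of the PR; the profile id and property name are assumptions, and the version is the one the author later reports trying (the author's later comment shows an attempt along these lines being evicted during dependency resolution):

```xml
<!-- Hypothetical sketch: pin the Jetty version used by the yarn module's
     tests only when building against Hadoop 3. Profile id and property
     name are assumptions for illustration. -->
<profile>
  <id>hadoop-3.2</id>
  <properties>
    <jetty.version>9.3.24.v20180605</jetty.version>
  </properties>
</profile>
```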
I tried several ways but they all failed:
- `hadoop-client-minicluster` instead of `hadoop-yarn-server-tests`: https://github.com/wangyum/spark-hadoop-client-minicluster
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ideal outcome is updating Hadoop, as otherwise it seems like Hadoop 3 and Java 11 support are in conflict. Then again, if Hadoop 3.x isn't quite going to work with Java 11 for other reasons, we have larger problems. Hadoop 2.x is currently working OK with Java 11 tests (it's Hive that's the issue) so I'm kind of surprised.
How did you try to override the Jetty version in the YARN module? I'd expect that's entirely possible
I tried to override the Jetty version in the YARN module in 29e583f, but it was evicted:

```
+-org.eclipse.jetty:jetty-servlet:9.3.24.v20180605 (evicted by: 9.4.12.v20180830)
```

I found that an adapted `SessionHandler` from jetty-9.3.25.v20180904 can test the YARN module on both Hadoop 2 and Hadoop 3. I know this is not a good way, but it seems to be the only way at the moment.