ZEPPELIN-1815. Support Spark 2.1 #1787
Conversation
@Leemoonsoo @felixcheung Please help review.
Looks fine. Would you like to get this in before the Spark 2.1 artifacts are in Maven?
And by the way, could you please test with Spark's latest 2.1 RC5 (aka the release build) instead of snapshot builds?
In a different JIRA: should we consider removing this check for 2.1 and 2.2, since it's unlikely that Spark will break common APIs in dot releases? Perhaps the policy could change to adding a check only when a known-incompatible Spark version shows up (until that version gets supported in Zeppelin). Thoughts?
@bikassaha Zeppelin uses the Spark REPL API heavily, and the Spark REPL is not as stable as the RDD/DataFrame APIs (this happened in Spark 1.x), so it is possible for the REPL to change in a Spark feature release. Another consideration is that Zeppelin may need updates to support new Spark features, so we need to check the Spark version.
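For context, the kind of version gate being discussed can be sketched as below. This is a hypothetical illustration, not Zeppelin's actual `SparkVersion` class; the class name, the numeric encoding, and the cutoff are all assumptions for the sake of the example:

```java
// Hypothetical sketch of a Spark version gate; not Zeppelin's real SparkVersion class.
public class SparkVersionGate {

    // Encode "major.minor.patch" as major*10000 + minor*100 + patch so versions
    // compare numerically (e.g. "2.1.0" -> 20100).
    static int encode(String version) {
        String[] parts = version.split("[.-]"); // tolerate suffixes like "2.1.0-SNAPSHOT"
        return Integer.parseInt(parts[0]) * 10000
             + Integer.parseInt(parts[1]) * 100
             + Integer.parseInt(parts[2]);
    }

    // First version assumed NOT yet supported; everything below it passes the check.
    static final int UNSUPPORTED_FROM = encode("2.2.0");

    static boolean isSupported(String version) {
        return encode(version) < UNSUPPORTED_FROM;
    }

    public static void main(String[] args) {
        System.out.println(isSupported("2.1.0")); // true
        System.out.println(isSupported("2.2.0")); // false
    }
}
```

The point of the check is exactly the one made above: the REPL surface can shift between feature releases, so an explicit ceiling fails fast instead of crashing later with an obscure error.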
The 2.1.0 artifacts are on the central repo. @zjffdu, do you want to update this PR to add a build against 2.1?
The 2.1 profile failed; could you check?
@felixcheung The test failure is due to an API change in Spark 2.1.0.
Ah, that's unfortunate. The feedback was that it would be inconsistent and confusing to have multiple default values for the timeout, so instead it was moved to the top of the function. But I can see how a caller of connectBackend (though not officially supported) could be broken by this.
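One common way to add a timeout parameter without breaking existing callers is to keep the old signature as an overload that forwards a single default. This is a generic Java sketch of that pattern, not Zeppelin's actual connectBackend code; the class and method bodies are invented for illustration:

```java
// Hypothetical sketch: evolving an API by adding a timeout parameter
// while keeping the old signature as a forwarding overload.
public class BackendClient {

    // The default lives in exactly one place, avoiding the "multiple
    // default values" inconsistency mentioned in the review feedback.
    private static final int DEFAULT_TIMEOUT_MS = 10000;

    // New signature: callers may pass an explicit timeout.
    public static String connectBackend(String host, int timeoutMs) {
        return "connected to " + host + " (timeout " + timeoutMs + "ms)";
    }

    // Old signature kept so existing (even unofficial) callers keep compiling.
    public static String connectBackend(String host) {
        return connectBackend(host, DEFAULT_TIMEOUT_MS);
    }

    public static void main(String[] args) {
        System.out.println(connectBackend("localhost"));      // uses the default
        System.out.println(connectBackend("localhost", 500)); // explicit timeout
    }
}
```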
The CI failure doesn't seem related. LGTM.
When I compile Zeppelin (from master, including this change) against Spark 2.1.0, I hit the following error:

```
[ERROR] /Volumes/Amazon/oss-workspace/src/zeppelin/spark/src/main/java/org/apache/zeppelin/spark/SparkInterpreter.java:[1454,26] constructor SecurityManager in class org.apache.spark.SecurityManager cannot be applied to given types;
```

It doesn't matter whether I'm using Scala 2.10 or 2.11. I can reproduce this with a command as simple as:

```
mvn clean package -DskipTests -Pspark-2.0 -Dspark.version=2.1.0
```

I'm not all that familiar with Scala, but I don't understand why this is a problem: the `Optional<byte[]>` argument of the `SecurityManager` class (in Spark's source code) has a default of `None`, so I wouldn't think it would be necessary to specify this second constructor parameter in Zeppelin code. Is it not possible for Java code to take advantage of a default value specified in Scala code?

Also, isn't this breaking the Zeppelin CI? I don't know where to find the Zeppelin CI, so could somebody please provide a link?
Answering my own questions: I still haven't found a definitive answer on whether Java classes can take advantage of default parameter values defined in Scala classes, but it's pretty apparent that you cannot. This means that in order to support both Spark < 2.1.0 and Spark >= 2.1.0, we'd need a little more reflection magic to handle the two possible versions of the `SecurityManager` constructor.

As for the CI link, I found https://travis-ci.org/apache/zeppelin. I don't understand, though, how the new Spark 2.1.0 profile is building successfully when I've found it so easy to reproduce this compilation error.
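The "reflection magic" mentioned above would amount to trying one constructor signature and falling back to the other at runtime. Below is a minimal, self-contained sketch of that pattern using `java.lang.reflect`; the classes exercised (`StringBuilder`, `java.awt.Point`) are stand-ins for illustration, not the real Spark `SecurityManager` API, where the fallback would also have to pass an explicit Scala `None` because Scala default arguments are invisible to Java callers:

```java
import java.lang.reflect.Constructor;

public class ConstructorCompat {

    // Try the older constructor signature first; if the loaded class only
    // exposes the newer signature, fall back to it with an explicit extra
    // argument (the value Scala would otherwise supply as a default).
    static Object construct(Class<?> cls,
                            Class<?>[] oldSig, Object[] oldArgs,
                            Class<?>[] newSig, Object[] newArgs) throws Exception {
        try {
            Constructor<?> c = cls.getConstructor(oldSig);
            return c.newInstance(oldArgs);
        } catch (NoSuchMethodException e) {
            Constructor<?> c = cls.getConstructor(newSig);
            return c.newInstance(newArgs);
        }
    }

    public static void main(String[] args) throws Exception {
        // StringBuilder(String) exists, so the "old" branch succeeds.
        Object sb = construct(StringBuilder.class,
                new Class<?>[]{String.class}, new Object[]{"hello"},
                new Class<?>[]{int.class}, new Object[]{16});
        System.out.println(sb); // prints "hello"

        // java.awt.Point has no (int) constructor, so we fall back to (int, int).
        Object p = construct(java.awt.Point.class,
                new Class<?>[]{int.class}, new Object[]{1},
                new Class<?>[]{int.class, int.class}, new Object[]{1, 2});
        System.out.println(p);
    }
}
```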
Thanks @ejono for reporting this; I can also reproduce it.
Yeah, this is a bit puzzling...
Thanks for confirming. The fix would not be too difficult, but yes, I'm puzzled as to why the CI isn't broken, so maybe we need to figure that out first.
@ejono The root cause is that SPARK_VER doesn't take effect in Travis; I created ZEPPELIN-1918. It would be better to fix this issue together with ZEPPELIN-1918. I'd be happy to help if you want to contribute on this. By the way, you might need to create another profile for Spark 2.1.0, because Spark 2.1.0 uses a different version of pyspark.
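Creating a separate profile, as suggested, would look roughly like the fragment below. This is a hypothetical sketch, not the profile actually added to Zeppelin's pom; the profile id, property names, and the py4j version are assumptions:

```xml
<!-- Hypothetical spark-2.1 profile; ids, property names, and versions are illustrative. -->
<profile>
  <id>spark-2.1</id>
  <properties>
    <spark.version>2.1.0</spark.version>
    <!-- Spark 2.1 ships a different py4j/pyspark than 2.0, hence the separate profile. -->
    <py4j.version>0.10.4</py4j.version>
  </properties>
</profile>
```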
Oh, sorry, I also didn't check it carefully. I'll check it manually when you create a new PR for handling the new issue.
This #1856 should help too.
Ping @ejono, do you want to contribute this? If not, I will fix it.
Let's get this fixed ASAP; we are going to cut the 0.7 release in 2 days.
Sure, let me fix it.
Sorry, I was not able to work on this over the past few days. Thank you very much for fixing it, @zjffdu!
### What is this PR for?
This PR updates the max supported Spark version to 2.1, as Spark 2.1 is released. I didn't change the pom file or the Travis script to update the Spark version; I think we can wait until we confirm Spark 2.1 is stable enough.

### What type of PR is it?
[Improvement]

### Todos
* [ ] - Task

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-1815

### How should this be tested?
Tested on Spark 2.1.

### Screenshots (if appropriate)

### Questions:
* Do the license files need updating? No
* Are there breaking changes for older versions? No
* Does this need documentation? No

Author: Jeff Zhang <[email protected]>

Closes apache#1787 from zjffdu/ZEPPELIN-1815 and squashes the following commits:

* b2ebf3a [Jeff Zhang] add timeout for connectBackend
* 0c17cf0 [Jeff Zhang] add build with spark 2.1.0
* f7e72c9 [Jeff Zhang] ZEPPELIN-1815. Support Spark 2.1