Conversation

@zjffdu (Contributor) commented Dec 21, 2016

What is this PR for?

This PR updates the maximum supported Spark version to 2.1, since Spark 2.1 has been released.
I didn't change the pom file or the Travis script to update the Spark version; I think we can wait until we have confirmed that Spark 2.1 is stable enough.

What type of PR is it?

[Improvement]

Todos

  • [ ] Task

What is the Jira issue?

  • https://issues.apache.org/jira/browse/ZEPPELIN-1815

How should this be tested?

Tested on Spark 2.1

Screenshots (if appropriate)

sparkr_progress (animated screenshot: https://cloud.githubusercontent.com/assets/164491/21373523/b380f44a-c75a-11e6-9004-e85009f9143a.gif)

Questions:

  • Do the license files need an update? No
  • Are there breaking changes for older versions? No
  • Does this need documentation? No

@zjffdu (Contributor, Author) commented Dec 21, 2016

@Leemoonsoo @felixcheung Please help review.

@felixcheung (Member) commented:

Looks fine. Would you like to get this in before the Spark 2.1 artifacts are in Maven?
I think we should reword the JIRA if that's the case - I don't think we should say we support 2.1 until we have 2.1 in the build (pom), Travis, etc.

@felixcheung (Member) commented Dec 21, 2016

And by the way, could you please test with Spark's final 2.1 RC5 (aka the release build) instead of the snapshot builds?

@bikassaha commented:

In a different JIRA, should we consider removing this check for 2.1/2.2, since it's unlikely that Spark will break common APIs in dot releases? Perhaps the policy could change to adding a check only when a known-incompatible Spark version shows up (until it gets supported in Zeppelin). Thoughts?

@zjffdu (Contributor, Author) commented Dec 21, 2016

@bikassaha Zeppelin uses the Spark REPL API heavily, and the Spark REPL is not as stable as the RDD/DataFrame API (this happened in Spark 1.x), so changes to the REPL are possible in any Spark feature release. Another consideration is that Zeppelin may need updates to support new Spark features, so we need to check the Spark version.
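
(As a rough illustration of the version gate being discussed - a minimal sketch assuming a simple numeric encoding; this is not Zeppelin's actual SparkVersion implementation:)

public final class SparkVersionCheck {
    // Oldest and newest Spark versions assumed supported; the upper bound
    // is what this PR raises to 2.1.
    private static final int MIN_SUPPORTED = 10100; // 1.1.0
    private static final int MAX_SUPPORTED = 20100; // 2.1.0

    // Convert "major.minor.patch" into a comparable number, e.g. "2.1.0" -> 20100.
    static int toNumber(String version) {
        String[] parts = version.split("[.\\-]");
        return Integer.parseInt(parts[0]) * 10000
                + Integer.parseInt(parts[1]) * 100
                + Integer.parseInt(parts[2]);
    }

    static boolean isSupported(String version) {
        int v = toNumber(version);
        return v >= MIN_SUPPORTED && v <= MAX_SUPPORTED;
    }
}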

@felixcheung (Member) commented:

The 2.1.0 artifacts are on the central repo. @zjffdu, do you want to update this PR to add a build against 2.1?

@felixcheung (Member) commented:

The 2.1 profile failed - could you check?

@zjffdu (Contributor, Author) commented Jan 1, 2017

@felixcheung The test failure is due to an API change in Spark 2.1.0.
SPARK-17919 removed the default value for the timeout; I'm not sure why, and I don't think it needed to be removed.
https://github.com/apache/spark/pull/15471/files. @felixcheung I see you were the reviewer, so you may know the background.

@felixcheung (Member) commented:

Ah, that's unfortunate. The feedback was that having multiple default values for the timeout would be inconsistent and confusing, so it was moved to the top of the function - but I can see that callers of connectBackend (though not officially supported) can be broken by this.

@felixcheung (Member) commented:

The CI failure doesn't seem related:

java.lang.AssertionError: expected:<FINISHED> but was:<ERROR>
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:834)
	at org.junit.Assert.assertEquals(Assert.java:118)
	at org.junit.Assert.assertEquals(Assert.java:144)
	at org.apache.zeppelin.rest.ZeppelinSparkClusterTest.zRunTest(ZeppelinSparkClusterTest.java:347)

LGTM.

@asfgit closed this in c2a91e9 on Jan 4, 2017
@ejono (Contributor) commented Jan 6, 2017

When I compile Zeppelin (from master, including this change) against Spark 2.1.0, I hit the following error:

[ERROR] /Volumes/Amazon/oss-workspace/src/zeppelin/spark/src/main/java/org/apache/zeppelin/spark/SparkInterpreter.java:[1454,26] constructor SecurityManager in class org.apache.spark.SecurityManager cannot be applied to given types;
required: org.apache.spark.SparkConf,scala.Option<byte[]>
found: org.apache.spark.SparkConf
reason: actual and formal argument lists differ in length
[ERROR] /Volumes/Amazon/oss-workspace/src/zeppelin/spark/src/main/java/org/apache/zeppelin/spark/SparkInterpreter.java:[1465,22] constructor SecurityManager in class org.apache.spark.SecurityManager cannot be applied to given types;
required: org.apache.spark.SparkConf,scala.Option<byte[]>
found: org.apache.spark.SparkConf
reason: actual and formal argument lists differ in length

It doesn't matter if I'm using Scala 2.10 or 2.11. I can reproduce this with a command as simple as:

mvn clean package -DskipTests -Pspark-2.0 -Dspark.version=2.1.0

I'm not all that familiar with Scala, but I don't understand why this is a problem: this Option<byte[]> argument of the SecurityManager class (in Spark's source code) has a default of None, so I wouldn't have thought it necessary to specify the second constructor parameter in Zeppelin code. Is it not possible for Java code to take advantage of a default value specified in Scala code?

Also, isn't this breaking the Zeppelin CI? I don't know where to find the Zeppelin CI, so could somebody please provide a link?

@ejono (Contributor) commented Jan 6, 2017

Answering my own questions:

I still haven't found a definitive answer as to whether Java classes can take advantage of default parameter values defined in Scala classes, but it's pretty apparent that you cannot. This means that in order to support both Spark < 2.1.0 and Spark >= 2.1.0, we'd need a little more reflection magic to handle the two possible versions of the SecurityManager constructor, as sketched below.
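
(A minimal sketch of the reflection approach described above; the factory class and method names are hypothetical - only SecurityManager, SparkConf, and scala.Option come from the compile error:)

import org.apache.spark.SecurityManager;
import org.apache.spark.SparkConf;
import scala.Option;

public class SecurityManagerFactory {
    // Try the pre-2.1.0 single-argument constructor first, then fall back to
    // the 2.1.0+ two-argument form, passing Option.empty() explicitly because
    // Java cannot use the Scala default value (None) for the second parameter.
    public static SecurityManager create(SparkConf conf) throws Exception {
        try {
            // Spark < 2.1.0: SecurityManager(SparkConf)
            return SecurityManager.class
                    .getConstructor(SparkConf.class)
                    .newInstance(conf);
        } catch (NoSuchMethodException e) {
            // Spark >= 2.1.0: SecurityManager(SparkConf, Option<byte[]>)
            return SecurityManager.class
                    .getConstructor(SparkConf.class, Option.class)
                    .newInstance(conf, Option.empty());
        }
    }
}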

As for the link to the CI, I found https://travis-ci.org/apache/zeppelin. I don't understand, though, how the new Spark 2.1.0 profile is building successfully when I've found it so easy to reproduce this compilation error.

@zjffdu (Contributor, Author) commented Jan 7, 2017

Thanks @ejono for reporting this - I can also reproduce it.

@felixcheung (Member) commented:

Yeah, this is a bit puzzling...

@ejono (Contributor) commented Jan 7, 2017

Thanks for confirming. The fix would not be too difficult, but yes, I'm puzzled as to why the CI isn't broken, so maybe we need to figure that out first.

@zjffdu (Contributor, Author) commented Jan 7, 2017

@ejono The root cause is that SPARK_VER doesn't take effect in Travis; I created ZEPPELIN-1918 for it. It would be better to fix the issue here together with ZEPPELIN-1918, and I would be happy to help if you want to contribute. BTW, you might need to create another profile for Spark 2.1.0, because it uses a different version of PySpark.
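
(For illustration, once such a profile exists the build might be invoked as below; the profile name spark-2.1 is an assumption, mirroring the existing spark-2.0 profile:)

mvn clean package -DskipTests -Pspark-2.1 -Dspark.version=2.1.0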

@jongyoul (Member) commented Jan 7, 2017

Oh, sorry - I also didn't check it carefully. I'll check it manually when you create a new PR to handle the new issue.

@felixcheung (Member) commented:

This #1856 should help too.

@zjffdu (Contributor, Author) commented Jan 10, 2017

Ping @ejono: do you want to contribute this? If not, I will fix it.

@felixcheung (Member) commented Jan 10, 2017 via email

@zjffdu (Contributor, Author) commented Jan 10, 2017

Sure, let me fix it.

@ejono (Contributor) commented Jan 10, 2017

Sorry, I was not able to work on this over the past few days. Thank you very much for fixing it, @zjffdu!

beriaanirudh pushed a commit to beriaanirudh/zeppelin that referenced this pull request Mar 8, 2017

Author: Jeff Zhang <[email protected]>

Closes apache#1787 from zjffdu/ZEPPELIN-1815 and squashes the following commits:

b2ebf3a [Jeff Zhang] add timeout for connectBackend
0c17cf0 [Jeff Zhang] add build with spark 2.1.0
f7e72c9 [Jeff Zhang] ZEPPELIN-1815. Support Spark 2.1