
Conversation

@zjffdu (Contributor) commented Jun 3, 2019

What is this PR for?

This PR removes the old Spark interpreter. The old interpreter has several issues, and a new Spark interpreter implementation was introduced in 0.8; this ticket removes the old one in 0.9. The issues with the old Spark interpreter:

  • It did not use the native Scala shell API.
  • Dependency management is not applied in yarn-cluster mode.
  • It cannot support Scala 2.12, as a consequence of the first point.

What type of PR is it?

[ Improvement ]

Todos

  • - Task

What is the Jira issue?

How should this be tested?

  • CI pass

Screenshots (if appropriate)

Questions:

  • Do the license files need an update? No
  • Are there breaking changes for older versions? No
  • Does this need documentation? No

// zeppelin.spark.useHiveContext & zeppelin.spark.concurrentSQL are legacy Zeppelin
// properties; convert them to Spark properties here.
if (entry.getKey().toString().equals("zeppelin.spark.useHiveContext")) {
  conf.set("spark.useHiveContext", entry.getValue().toString());
}
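In isolation, the conversion in this snippet can be sketched with a plain map standing in for `SparkConf` (a simplified, hypothetical stand-in, not Zeppelin's actual code; the target name `spark.useHiveContext` simply mirrors the snippet above):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

public class LegacyPropSketch {
    // Map legacy Zeppelin property names onto Spark-style names; unknown
    // keys pass through unchanged.
    static Map<String, String> convertLegacyProps(Properties zeppelinProps) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (Map.Entry<Object, Object> entry : zeppelinProps.entrySet()) {
            String key = entry.getKey().toString();
            String value = entry.getValue().toString();
            if (key.equals("zeppelin.spark.useHiveContext")) {
                conf.put("spark.useHiveContext", value);
            } else {
                conf.put(key, value);
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("zeppelin.spark.useHiveContext", "true");
        System.out.println(convertLegacyProps(p)); // {spark.useHiveContext=true}
    }
}
```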
Member:
I don't recall spark.useHiveContext being a property name in Spark. In any case, isn't this unsupported in Spark today?

Contributor Author:

Yes, it is not supported today; it is kept only for legacy support. I will remove it in the future.

if (entry.getKey().toString().equals("zeppelin.spark.useHiveContext")) {
  conf.set("spark.useHiveContext", entry.getValue().toString());
}
if (entry.getKey().toString().equals("zeppelin.spark.concurrentSQL")
Member:

Should we only set this for the SQL interpreter? It might have unintended effects on non-SQL ones.

Contributor Author:

I'm afraid not, because this sets spark.scheduler.mode, which must be set when the driver starts, and the driver is started in SparkInterpreter.
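For context, the legacy flag maps onto Spark's scheduler mode (`spark.scheduler.mode=FAIR` is a real Spark setting; the helper method and plain map below are illustrative, not Zeppelin's actual code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConcurrentSqlSketch {
    // spark.scheduler.mode must already be in the conf when the driver starts,
    // which is why this conversion lives in SparkInterpreter rather than being
    // applied only for the SQL interpreter.
    static Map<String, String> applyConcurrentSql(Map<String, String> conf,
                                                  boolean concurrentSql) {
        Map<String, String> out = new LinkedHashMap<>(conf);
        if (concurrentSql) {
            out.put("spark.scheduler.mode", "FAIR");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(applyConcurrentSql(new LinkedHashMap<>(), true));
        // {spark.scheduler.mode=FAIR}
    }
}
```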

Member:

Then maybe we should rename and deprecate this one in the next release. IIRC, people have complained about paragraph execution order changing and breaking things, so if this affects all Spark interpreters and not just SQL, the risk of that is higher.

Contributor Author:

This would not affect the execution order of Spark Scala code, because the Spark Scala interpreter uses a FIFOScheduler. Only SparkSqlInterpreter is affected, as it uses a ParallelScheduler: https://github.com/apache/zeppelin/blob/master/spark/interpreter/src/main/java/org/apache/zeppelin/spark/SparkSqlInterpreter.java#L128
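The distinction can be sketched with plain `java.util.concurrent` executors (a simplified stand-in for Zeppelin's FIFOScheduler, not the actual class): a single worker thread guarantees that paragraphs complete in submission order, while a thread pool would not.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SchedulerSketch {
    // FIFO-style scheduling: one worker thread, so jobs finish strictly in
    // the order they were submitted.
    static List<Integer> runFifo(int jobs) {
        ExecutorService fifo = Executors.newSingleThreadExecutor();
        List<Integer> order = new CopyOnWriteArrayList<>();
        for (int i = 0; i < jobs; i++) {
            final int id = i;
            fifo.submit(() -> order.add(id));
        }
        fifo.shutdown();
        try {
            fifo.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order;
    }

    public static void main(String[] args) {
        // With a single worker the completion order matches submission order.
        System.out.println(runFifo(5)); // [0, 1, 2, 3, 4]
    }
}
```

A parallel scheduler would be the same sketch with `Executors.newFixedThreadPool(n)`, where completion order is no longer deterministic.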

this.innerInterpreter.bind("z", z.getClass().getCanonicalName(), z,
Lists.newArrayList("@transient"));
} catch (Exception e) {
LOGGER.error("Fail to open SparkInterpreter", ExceptionUtils.getStackTrace(e));
Member:

Log e directly instead of ExceptionUtils.getStackTrace(e)?

Contributor Author:

Fixed
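The fixed pattern, passing the Throwable itself to the logger, can be sketched with `java.util.logging` (Zeppelin uses SLF4J, where the equivalent call is `LOGGER.error("...", e)`; this stand-alone sketch avoids that dependency):

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.logging.StreamHandler;

public class LoggingSketch {
    static final Logger LOGGER = Logger.getLogger("LoggingSketch");

    // Capture what the logger emits, to show that passing the Throwable is
    // enough: the logging framework renders the stack trace itself, so
    // ExceptionUtils.getStackTrace(e) is unnecessary.
    static String logWithThrowable() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        StreamHandler handler =
            new StreamHandler(new PrintStream(buf), new SimpleFormatter());
        LOGGER.setUseParentHandlers(false);
        LOGGER.addHandler(handler);
        try {
            throw new IllegalStateException("boom");
        } catch (Exception e) {
            // Pass the Throwable as the last argument.
            LOGGER.log(Level.SEVERE, "Fail to open SparkInterpreter", e);
        }
        handler.flush();
        LOGGER.removeHandler(handler);
        return buf.toString();
    }

    public static void main(String[] args) {
        String out = logWithThrowable();
        // The captured record contains both the message and the stack trace.
        System.out.println(out.contains("IllegalStateException")); // true
    }
}
```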

if (scalaVersionString.contains("version 2.10")) {
  return "2.10";
} else {
  return "2.11";
}
Member:

this could break with scala 2.12

Contributor Author:

Right, I will fix it in #3034
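A version check that also recognizes 2.12 could look like the following hypothetical sketch (illustrative only, not the actual fix in #3034):

```java
public class ScalaVersionSketch {
    // Extract the major.minor version from output such as
    // "Scala code runner version 2.12.8 -- ...", and fail loudly on
    // anything unrecognized instead of silently defaulting to 2.11.
    static String detectScalaVersion(String scalaVersionString) {
        if (scalaVersionString.contains("version 2.10")) {
            return "2.10";
        } else if (scalaVersionString.contains("version 2.11")) {
            return "2.11";
        } else if (scalaVersionString.contains("version 2.12")) {
            return "2.12";
        }
        throw new IllegalArgumentException(
            "Unsupported Scala version: " + scalaVersionString);
    }

    public static void main(String[] args) {
        System.out.println(detectScalaVersion("Scala code runner version 2.12.8"));
        // 2.12
    }
}
```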

@felixcheung (Member):

I don't see any blockers other than the comments on 2.12 and fair-scheduler mode.

try {
String keytab = getProperties().getProperty("spark.yarn.keytab");
String principal = getProperties().getProperty("spark.yarn.principal");
UserGroupInformation.loginUserFromKeytab(principal, keytab);
Member:

This sets the Kerberos authentication information according to the configuration. Can it be added to the new Spark interpreter? This is very useful.

Contributor Author:

This is legacy code for OldSparkInterpreter. At that time, we didn't pass the Spark conf via --conf in spark-submit. Now SparkInterpreterLauncher does that, so we don't need to do it again here.
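The forwarding the author describes can be sketched as follows (a hypothetical stand-in for what a SparkInterpreterLauncher-style launcher does, not the actual class; `spark.yarn.keytab` and `spark.yarn.principal` are real Spark options, while the paths and principal below are placeholders):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class LauncherSketch {
    // Every spark.* property becomes a --conf argument on the spark-submit
    // command line, so no in-process loginUserFromKeytab call is needed.
    static String buildSubmitConfArgs(Map<String, String> props) {
        StringJoiner args = new StringJoiner(" ");
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (e.getKey().startsWith("spark.")) {
                args.add("--conf " + e.getKey() + "=" + e.getValue());
            }
        }
        return args.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("spark.yarn.keytab", "/etc/security/zeppelin.keytab"); // placeholder
        props.put("spark.yarn.principal", "zeppelin@EXAMPLE.COM");       // placeholder
        System.out.println(buildSubmitConfArgs(props));
    }
}
```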

@asfgit asfgit closed this in f5ee329 Jun 12, 2019
asfgit pushed a commit that referenced this pull request Sep 23, 2019
### What is this PR for?
When Zeppelin is running in Kubernetes, "View in Spark web UI" gives an internal address instead of the address defined in SERVICE_DOMAIN.

I think this problem is a side effect of #3375; this PR includes the fix and an updated unit test.

### What type of PR is it?
Bug Fix

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-4226

### How should this be tested?
Run Zeppelin on Kubernetes, run a Spark job, and click the "View in Spark web UI" button.

### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No

Author: Lee moon soo <[email protected]>

Closes #3451 from Leemoonsoo/ZEPPELIN-4226 and squashes the following commits:

7e34542 [Lee moon soo] use StringUtils.isBlank
a33c3b2 [Lee moon soo] pickup SparkUI address from zeppelin.spark.uiWebUrl
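The fallback logic described in these commits can be sketched with a plain-Java stand-in for `StringUtils.isBlank` (method and variable names below are illustrative, not the actual patch):

```java
public class UiWebUrlSketch {
    // Minimal stand-in for org.apache.commons.lang3.StringUtils.isBlank.
    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    // Prefer the user-configured zeppelin.spark.uiWebUrl; otherwise fall back
    // to the address Spark itself reports (an internal address on Kubernetes).
    static String resolveSparkUiUrl(String configuredUrl, String sparkReportedUrl) {
        return isBlank(configuredUrl) ? sparkReportedUrl : configuredUrl;
    }

    public static void main(String[] args) {
        // Example addresses are placeholders.
        System.out.println(resolveSparkUiUrl(null, "http://10.0.0.12:4040"));
        System.out.println(resolveSparkUiUrl("https://spark.example.com",
                                             "http://10.0.0.12:4040"));
    }
}
```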
