ZEPPELIN-4176. Remove old spark interpreter #3375
Conversation
// zeppelin.spark.useHiveContext & zeppelin.spark.concurrentSQL are legacy zeppelin
// properties, convert them to spark properties here.
if (entry.getKey().toString().equals("zeppelin.spark.useHiveContext")) {
  conf.set("spark.useHiveContext", entry.getValue().toString());
I don't recall spark.useHiveContext being a property name in Spark. In any case, isn't this unsupported in Spark today?
Yes, it is not supported today; it is kept only for legacy support. I will remove it in the future.
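For context on why this mapping is legacy-only: since Spark 2.x, HiveContext is gone and Hive support is enabled on the SparkSession builder instead, so there is no spark.useHiveContext property to set. A minimal sketch of the modern equivalent (class and app name are illustrative):

```java
import org.apache.spark.sql.SparkSession;

public class HiveSupportExample {
  public static void main(String[] args) {
    // Since Spark 2.x there is no HiveContext (and no spark.useHiveContext
    // property); Hive support is requested on the SparkSession builder.
    SparkSession spark = SparkSession.builder()
        .appName("hive-support-example") // illustrative app name
        .enableHiveSupport()             // modern replacement for HiveContext
        .getOrCreate();
    spark.sql("SHOW TABLES").show();
    spark.stop();
  }
}
```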
if (entry.getKey().toString().equals("zeppelin.spark.useHiveContext")) {
  conf.set("spark.useHiveContext", entry.getValue().toString());
}
if (entry.getKey().toString().equals("zeppelin.spark.concurrentSQL")
Should we only set this for the SQL interpreter? It might have unintended effects on non-SQL ones.
I'm afraid not, because it sets spark.scheduler.mode, which must be set when the driver starts, and the driver is started in SparkInterpreter.
Then maybe we should rename and deprecate this one in the next release. IIRC, people have complained about paragraph execution order changing and breaking things, so if this affects all Spark interpreters and not just SQL, the risk of that is higher.
This would not affect the execution order of Spark Scala code, because the Spark Scala interpreter uses a FIFOScheduler. Only SparkSqlInterpreter is affected, as it uses a ParallelScheduler: https://github.com/apache/zeppelin/blob/master/spark/interpreter/src/main/java/org/apache/zeppelin/spark/SparkSqlInterpreter.java#L128
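For readers following the link: only SparkSqlInterpreter overrides getScheduler to opt into parallel execution, which is why the scheduler mode only changes SQL paragraph ordering. A rough method-level sketch of that pattern, assuming Zeppelin's SchedulerFactory API (createOrGetParallelScheduler / createOrGetFIFOScheduler); illustrative, not the exact code at that line:

```java
import org.apache.zeppelin.scheduler.Scheduler;
import org.apache.zeppelin.scheduler.SchedulerFactory;

// Sketch of the SparkSqlInterpreter#getScheduler pattern: only the SQL
// interpreter opts into a parallel scheduler, and only when
// zeppelin.spark.concurrentSQL is enabled; Scala paragraphs stay FIFO.
@Override
public Scheduler getScheduler() {
  if (Boolean.parseBoolean(getProperty("zeppelin.spark.concurrentSQL"))) {
    int maxConcurrency = 10; // illustrative; the real value is configurable
    return SchedulerFactory.singleton().createOrGetParallelScheduler(
        SparkSqlInterpreter.class.getName() + this.hashCode(), maxConcurrency);
  }
  return SchedulerFactory.singleton().createOrGetFIFOScheduler(
      SparkSqlInterpreter.class.getName() + this.hashCode());
}
```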
this.innerInterpreter.bind("z", z.getClass().getCanonicalName(), z,
    Lists.newArrayList("@transient"));
} catch (Exception e) {
  LOGGER.error("Fail to open SparkInterpreter", ExceptionUtils.getStackTrace(e));
Log e directly instead of ExceptionUtils.getStackTrace(e)?
Fixed
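For anyone wondering why this matters: without a {} placeholder, SLF4J silently drops the stringified trace, whereas a trailing Throwable argument is recognized and logged with its full stack trace. A minimal before/after sketch (class name and the stand-in exception are illustrative):

```java
import org.apache.commons.lang3.exception.ExceptionUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingExample {
  private static final Logger LOGGER = LoggerFactory.getLogger(LoggingExample.class);

  public static void main(String[] args) {
    try {
      throw new IllegalStateException("boom"); // stand-in failure
    } catch (Exception e) {
      // Before: no {} placeholder, so SLF4J silently drops the second
      // argument and the stack trace never reaches the log.
      LOGGER.error("Fail to open SparkInterpreter", ExceptionUtils.getStackTrace(e));
      // After: a trailing Throwable is recognized by SLF4J and logged
      // together with its full stack trace.
      LOGGER.error("Fail to open SparkInterpreter", e);
    }
  }
}
```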
if (scalaVersionString.contains("version 2.10")) {
  return "2.10";
} else {
  return "2.11";
This could break with Scala 2.12.
Right, I will fix it in #3034.
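A sketch of a more future-proof version that parses out the major.minor version instead of assuming everything that is not 2.10 must be 2.11 (assuming the input keeps the "version X.Y.Z" shape shown in the diff; class and method names are illustrative):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScalaVersionUtil {
  // Matches "version 2.10.7", "version 2.11.12", "version 2.12.8", ...
  private static final Pattern VERSION = Pattern.compile("version (\\d+\\.\\d+)");

  // Return the major.minor Scala version instead of assuming that
  // anything other than 2.10 must be 2.11.
  static String extractScalaVersion(String scalaVersionString) {
    Matcher m = VERSION.matcher(scalaVersionString);
    if (m.find()) {
      return m.group(1); // e.g. "2.12"
    }
    throw new IllegalArgumentException(
        "Unrecognized Scala version string: " + scalaVersionString);
  }
}
```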
I don't see any blocker other than the comments on Scala 2.12 and the fair scheduler mode.
try {
  String keytab = getProperties().getProperty("spark.yarn.keytab");
  String principal = getProperties().getProperty("spark.yarn.principal");
  UserGroupInformation.loginUserFromKeytab(principal, keytab);
This sets the Kerberos authentication information from the configuration. Can it be added to the new Spark interpreter? It is very useful.
This is legacy code for OldSparkInterpreter. At that time, we didn't pass the Spark conf via --conf in spark-submit. We have since corrected that in SparkInterpreterLauncher, so we don't need to do it here again.
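For reference, this is roughly what the removed legacy path did: read the keytab and principal from interpreter properties and log in explicitly through Hadoop's UGI. A hedged sketch (class and method names are illustrative); as described above, the modern flow instead forwards these settings to spark-submit via --conf:

```java
import java.io.IOException;
import java.util.Properties;
import org.apache.hadoop.security.UserGroupInformation;

public class LegacyKerberosLogin {
  // Sketch of the legacy behavior: read keytab/principal from interpreter
  // properties and log in explicitly. The launcher-based flow makes this
  // redundant, since spark-submit now receives the same settings via --conf.
  static void loginIfConfigured(Properties props) throws IOException {
    String keytab = props.getProperty("spark.yarn.keytab");
    String principal = props.getProperty("spark.yarn.principal");
    if (keytab != null && principal != null) {
      UserGroupInformation.loginUserFromKeytab(principal, keytab);
    }
  }
}
```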
A later commit referencing this PR (ZEPPELIN-4226, #3451):

### What is this PR for?
When Zeppelin is running in Kubernetes, "View in Spark web UI" gives an internal address instead of the address defined in SERVICE_DOMAIN. I think this problem is a side effect of #3375, and this PR includes a fix and an updated unit test.

### What type of PR is it?
Bug Fix

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-4226

### How should this be tested?
Run Zeppelin on Kubernetes, run a Spark job, and click the "View in Spark web UI" button.

### Questions:
* Do the license files need updating? No
* Are there breaking changes for older versions? No
* Does this need documentation? No

Author: Lee moon soo <[email protected]>

Closes #3451 from Leemoonsoo/ZEPPELIN-4226 and squashes the following commits:

7e34542 [Lee moon soo] use StringUtils.isBlank
a33c3b2 [Lee moon soo] pickup SparkUI address from zeppelin.spark.uiWebUrl
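The fix in those commits gives the configured zeppelin.spark.uiWebUrl precedence over the address the SparkContext reports. A rough sketch of that precedence (SparkContext#uiWebUrl and StringUtils#isBlank are real APIs; the surrounding class and method are illustrative):

```java
import java.util.Properties;
import org.apache.commons.lang3.StringUtils;
import org.apache.spark.SparkContext;
import scala.Option;

public class SparkUiUrlResolver {
  // Prefer the user-configured zeppelin.spark.uiWebUrl (useful when the
  // cluster-internal address is unreachable, e.g. behind a Kubernetes
  // SERVICE_DOMAIN); otherwise fall back to what the SparkContext reports.
  static String resolveUiWebUrl(Properties props, SparkContext sc) {
    String configured = props.getProperty("zeppelin.spark.uiWebUrl");
    if (!StringUtils.isBlank(configured)) {
      return configured;
    }
    Option<String> fromContext = sc.uiWebUrl();
    return fromContext.isDefined() ? fromContext.get() : null;
  }
}
```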
### What is this PR for?
This PR removes the old Spark interpreter. The old Spark interpreter has several issues, and a new Spark interpreter implementation was introduced in 0.8; this ticket removes the old one in 0.9. Here are the issues of the old Spark interpreter.

### What type of PR is it?
[ Improvement ]

### Todos

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-4176

### How should this be tested?

### Screenshots (if appropriate)

### Questions: