21 changes: 13 additions & 8 deletions docs/interpreter/spark.md
@@ -145,6 +145,11 @@ You can also set other Spark properties which are not listed in the table. For a list of additional properties, refer to Spark Available Properties.
  <td>true</td>
  <td>Do not change - developer only setting, not for production use</td>
</tr>
<tr>
  <td>zeppelin.spark.uiWebUrl</td>
  <td></td>
  <td>Overrides the Spark UI default URL. The value should be a full URL (e.g. http://{hostName}/{uniquePath})</td>
</tr>
</table>
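
For instance, to make Zeppelin link to a Spark UI that sits behind a reverse proxy, a hypothetical override (the host and path here are invented for illustration) could look like:

```
zeppelin.spark.uiWebUrl = http://zeppelin-proxy.example.com/spark-ui
```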

Without any configuration, the Spark interpreter works out of the box in local mode. But if you want to connect to your Spark cluster, you'll need to follow the two simple steps below.
@@ -183,7 +188,7 @@ For example,
* **yarn-client** in Yarn client mode
* **mesos://host:5050** in Mesos cluster
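
As a sketch, a `conf/zeppelin-env.sh` pointing Zeppelin at a standalone cluster might contain the following (the Spark path and master host are hypothetical):

```bash
# conf/zeppelin-env.sh — hypothetical values for illustration
export SPARK_HOME=/usr/lib/spark
export MASTER=spark://master-host:7077
```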

That's it. In this way, Zeppelin will work with any version of Spark and any deployment type without rebuilding Zeppelin.
For further information about Spark & Zeppelin version compatibility, refer to the "Available Interpreters" section of the [Zeppelin download page](https://zeppelin.apache.org/download.html).

> Note that without exporting `SPARK_HOME`, it runs in local mode with the included version of Spark. The included version may vary depending on the build profile.
@@ -210,7 +215,7 @@ There are two ways to load external libraries in Spark interpreter. First is using the interpreter setting menu.
Please see [Dependency Management](../usage/interpreter/dependency_management.html) for details.

### 2. Loading Spark Properties
Once `SPARK_HOME` is set in `conf/zeppelin-env.sh`, Zeppelin uses `spark-submit` as the Spark interpreter runner. `spark-submit` supports two ways to load configurations.
The first is command-line options such as `--master`, which Zeppelin can pass to `spark-submit` by exporting `SPARK_SUBMIT_OPTIONS` in `conf/zeppelin-env.sh`. The second is reading configuration options from `SPARK_HOME/conf/spark-defaults.conf`. The Spark properties a user can set to distribute libraries are:

<table class="table-configuration">
@@ -243,7 +248,7 @@ Here are a few examples:
```bash
export SPARK_SUBMIT_OPTIONS="--packages com.databricks:spark-csv_2.10:1.2.0 --jars /path/mylib1.jar,/path/mylib2.jar --files /path/mylib1.py,/path/mylib2.zip,/path/mylib3.egg"
```

* `SPARK_HOME/conf/spark-defaults.conf`

```
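# A hedged sketch of the kind of entries this file would contain for
# distributing libraries; the paths and package coordinates below are
# hypothetical examples.
spark.jars            /path/mylib1.jar,/path/mylib2.jar
spark.jars.packages   com.databricks:spark-csv_2.10:1.2.0
spark.files           /path/mylib1.py,/path/mylib2.zip
```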
@@ -408,17 +413,17 @@ To learn more about dynamic form, check out [Dynamic Form](../usage/dynamic_form/


## Matplotlib Integration (pyspark)
Both the `python` and `pyspark` interpreters have built-in support for inline visualization using `matplotlib`,
a popular plotting library for Python. More details can be found in the [python interpreter documentation](../interpreter/python.html),
since matplotlib support is identical. More advanced interactive plotting can be done with pyspark through
utilizing Zeppelin's built-in [Angular Display System](../usage/display_system/angular_backend.html), as shown below:

<img class="img-responsive" src="{{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/matplotlibAngularExample.gif" />
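
As a minimal sketch (assuming the default inline matplotlib backend is active), a pyspark paragraph that produces an inline plot might look like this:

```python
%pyspark
import matplotlib.pyplot as plt

# A simple line plot; Zeppelin renders the figure inline below the paragraph.
plt.plot([1, 2, 3, 4], [1, 4, 9, 16])
plt.show()
```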

## Interpreter setting option

You can choose one of the `shared`, `scoped` and `isolated` options when you configure the Spark interpreter.
The Spark interpreter creates a separate Scala compiler per notebook but shares a single SparkContext in `scoped` mode (experimental).
It creates a separate SparkContext per notebook in `isolated` mode.

## IPython support
@@ -26,11 +26,9 @@
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Properties;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
@@ -51,7 +49,6 @@
import org.apache.spark.ui.SparkUI;
import org.apache.spark.ui.jobs.JobProgressListener;
import org.apache.zeppelin.interpreter.BaseZeppelinContext;
import org.apache.zeppelin.interpreter.DefaultInterpreterProperty;
import org.apache.zeppelin.interpreter.Interpreter;
import org.apache.zeppelin.interpreter.InterpreterContext;
import org.apache.zeppelin.interpreter.InterpreterException;
@@ -72,7 +69,6 @@
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.google.common.base.Joiner;
import scala.Console;
import scala.Enumeration.Value;
import scala.None;
@@ -206,7 +202,7 @@ public synchronized void onJobStart(SparkListenerJobStart jobStart) {
  private String getJobUrl(int jobId) {
    String jobUrl = null;
    if (sparkUrl != null) {
-      jobUrl = sparkUrl + "/jobs/job?id=" + jobId;
+      jobUrl = sparkUrl + "/jobs/job/?id=" + jobId;
    }
    return jobUrl;
  }
@@ -936,6 +932,11 @@ public String getSparkUIUrl() {
      return sparkUrl;
    }

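    // If the user configured zeppelin.spark.uiWebUrl, prefer that explicit
    // override over any URL discovered from the running SparkContext.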
    String sparkUrlProp = property.getProperty("zeppelin.spark.uiWebUrl", "");
    if (!StringUtils.isBlank(sparkUrlProp)) {
      return sparkUrlProp;
    }

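    // Otherwise, on Spark 2.0+ ask the SparkContext for its UI address via reflection.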
    if (sparkVersion.newerThanEquals(SparkVersion.SPARK_2_0_0)) {
      Option<String> uiWebUrlOption = (Option<String>) Utils.invokeMethod(sc, "uiWebUrl");
      if (uiWebUrlOption.isDefined()) {
7 changes: 7 additions & 0 deletions spark/src/main/resources/interpreter-setting.json
@@ -67,6 +67,13 @@
"defaultValue": true,
"description": "Do not change - developer only setting, not for production use",
"type": "checkbox"
},
"zeppelin.spark.uiWebUrl": {
"envName": null,
"propertyName": "zeppelin.spark.uiWebUrl",
"defaultValue": "",
"description": "Override Spark UI default URL",
"type": "string"
}
},
"editor": {
@@ -347,7 +347,7 @@ public void testParagraphUrls() {
    }
    String sparkUIUrl = repl.getSparkUIUrl();
    assertNotNull(jobUrl);
-    assertTrue(jobUrl.startsWith(sparkUIUrl + "/jobs/job?id="));
+    assertTrue(jobUrl.startsWith(sparkUIUrl + "/jobs/job/?id="));

  }
}