
Commit 07180d0

Author: Michael Gummelt

Merge branch 'master' of github.com:mesosphere/spark-build
2 parents 34f352a + 3f7417f commit 07180d0

File tree: 12 files changed, +56 −48 lines

docker/Dockerfile

Lines changed: 7 additions & 3 deletions
@@ -32,12 +32,16 @@ RUN apt-get update && \
     runit \
     nginx

-RUN add-apt-repository ppa:openjdk-r/ppa
 RUN apt-get update && \
-    apt-get install -y openjdk-8-jdk curl
+    apt-get install -y curl
 RUN apt-get install -y r-base

-ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-amd64
+RUN cd /usr/lib/jvm && \
+    curl -O https://downloads.mesosphere.com/java/jre-8u121-linux-x64.tar.gz && \
+    tar zxf jre-8u121-linux-x64.tar.gz && \
+    rm jre-8u121-linux-x64.tar.gz
+
+ENV JAVA_HOME /usr/lib/jvm/jre1.8.0_121
 ENV MESOS_NATIVE_JAVA_LIBRARY /usr/lib/libmesos.so
 ENV HADOOP_CONF_DIR /etc/hadoop
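This hunk swaps the OpenJDK PPA for a pinned JRE 8u121 tarball unpacked under `/usr/lib/jvm`, with `JAVA_HOME` pointed at the resulting directory. A minimal check of the bundled runtime, assuming you have built or pulled the image (the `mesosphere/spark:latest` tag is illustrative, not from this commit):

    $ docker run --rm mesosphere/spark:latest /usr/lib/jvm/jre1.8.0_121/bin/java -version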

docs/history-server.md

Lines changed: 7 additions & 4 deletions
@@ -22,9 +22,12 @@ your cluster][10] and run:
 configuration file. Here we call it `options.json`:

     {
-        "history-server": {
-            "enabled": true
-        }
+        "history-server": {
+            "enabled": true
+        },
+        "hdfs": {
+            "config-url": "http://hdfs.marathon.mesos:9000/v1/connection"
+        }
     }

 1. Install Spark:

@@ -40,4 +43,4 @@ configuration file. Here we call it `options.json`:
 to the history server entry for that job.

 [3]: http://spark.apache.org/docs/latest/monitoring.html#viewing-after-the-fact
-[10]: https://docs.mesosphere.com/1.8/administration/access-node/sshcluster/
+[10]: https://docs.mesosphere.com/1.9/administration/access-node/sshcluster/
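The added `hdfs` block points the package at an HDFS connection endpoint for event logs. The "Install Spark" step that follows in the doc consumes this file; a sketch using the standard DC/OS CLI `--options` flag (the file name matches the doc's `options.json`):

    $ dcos package install spark --options=options.json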

docs/index.md

Lines changed: 2 additions & 2 deletions
@@ -57,8 +57,8 @@ dispatcher and the history server
 [1]: http://spark.apache.org/documentation.html
 [2]: http://spark.apache.org/docs/latest/running-on-mesos.html#cluster-mode
 [3]: http://spark.apache.org/docs/latest/monitoring.html#viewing-after-the-fact
-[4]: https://docs.mesosphere.com/1.8/usage/service-guides/hdfs/
-[5]: https://docs.mesosphere.com/1.8/usage/service-guides/kafka/
+[4]: https://docs.mesosphere.com/1.9/usage/service-guides/hdfs/
+[5]: https://docs.mesosphere.com/1.9/usage/service-guides/kafka/
 [6]: https://zeppelin.incubator.apache.org/
 [17]: https://github.com/mesosphere/spark
 [18]: https://github.com/mesosphere/spark-build

docs/install.md

Lines changed: 4 additions & 3 deletions
@@ -6,7 +6,7 @@ enterprise: 'no'
 ---

 # About Installing Spark on Enterprise DC/OS
-In Enterprise DC/OS `strict` [security mode](https://docs.mesosphere.com/1.8/administration/installing/custom/configuration-parameters/#security), Spark requires a service account. In `permissive`, a service account is optional. Only someone with `superuser` permission can create the service account. Refer to [Provisioning Spark](https://docs.mesosphere.com/1.8/administration/id-and-access-mgt/service-auth/spark-auth/) for instructions.
+In Enterprise DC/OS `strict` [security mode](https://docs.mesosphere.com/1.9/administration/installing/custom/configuration-parameters/#security), Spark requires a service account. In `permissive`, a service account is optional. Only someone with `superuser` permission can create the service account. Refer to [Provisioning Spark](https://docs.mesosphere.com/1.9/administration/id-and-access-mgt/service-auth/spark-auth/) for instructions.

 # Default Installation

@@ -17,10 +17,11 @@ server.

     $ dcos package install spark

-Go to the **Services** tab of the DC/OS web interface to monitor the deployment. Once it is
+Go to the **Services** > **Deployments** tab of the DC/OS web interface to monitor the deployment. Once it is
 complete, visit Spark at `http://<dcos-url>/service/spark/`.

-You can also [install Spark via the DC/OS web interface](https://docs.mesosphere.com/1.8/usage/webinterface/#universe).
+You can also [install Spark via the DC/OS web interface](https://docs.mesosphere.com/1.9/usage/webinterface/#universe).
+
 **Note:** If you install Spark via the web interface, run the
 following command from the DC/OS CLI to install the Spark CLI:
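The note above is cut off by the hunk boundary before the command it introduces. In DC/OS documentation of this era, installing just the CLI subcommand is presumably done with the `--cli` flag (an assumption; the actual command lies outside this hunk):

    $ dcos package install spark --cli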

docs/limitations.md

Lines changed: 3 additions & 14 deletions
@@ -5,19 +5,8 @@ feature_maturity: stable
 enterprise: 'no'
 ---

-* DC/OS Spark only supports submitting jars and Python scripts. It
-  does not support R.
+* Mesosphere does not provide support for Spark app development, such as writing a Python app to process data from Kafka or writing Scala code to process data from HDFS.

-* Mesosphere does not provide support for Spark app development,
-  such as writing a Python app to process data from Kafka or writing
-  Scala code to process data from HDFS.
+* Spark jobs run in Docker containers. The first time you run a Spark job on a node, it might take longer than you expect because of the `docker pull`.

-* Spark jobs run in Docker containers. The first time you run a
-  Spark job on a node, it might take longer than you expect because of
-  the `docker pull`.
-
-* DC/OS Spark only supports running the Spark shell from within a
-  DC/OS cluster. See the Spark Shell section for more information.
-  For interactive analytics, we
-  recommend Zeppelin, which supports visualizations and dynamic
-  dependency management.
+* DC/OS Spark only supports running the Spark shell from within a DC/OS cluster. See the Spark Shell section for more information. For interactive analytics, we recommend Zeppelin, which supports visualizations and dynamic dependency management.
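Since the first job on each node pays the cost of the `docker pull`, one way to avoid that delay is to pre-pull the executor image on agents ahead of time. A sketch, borrowing the image tag from the spark-shell doc changed in this same commit (substitute whatever tag your installation uses):

    $ docker pull mesosphere/spark:1.0.7-2.1.0-hadoop-2.6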

docs/quick-start.md

Lines changed: 5 additions & 1 deletion
@@ -11,12 +11,16 @@ enterprise: 'no'

 1. Run a Spark job:

-       $ dcos spark run --submit-args="--class org.apache.spark.examples.SparkPi https://s3.amazonaws.com/downloads.mesosphere.io/spark/assets/spark-examples_2.10-1.4.0-SNAPSHOT.jar 30"
+       $ dcos spark run --submit-args="--class org.apache.spark.examples.SparkPi https://downloads.mesosphere.com/spark/assets/spark-examples_2.10-1.4.0-SNAPSHOT.jar 30"

 1. Run a Python Spark job:

        $ dcos spark run --submit-args="https://downloads.mesosphere.com/spark/examples/pi.py 30"

+1. Run an R Spark job:
+
+       $ dcos spark run --submit-args="https://downloads.mesosphere.com/spark/examples/dataframe.R"
+
 1. View your job:

    Visit the Spark cluster dispatcher at
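`dcos spark run` reports a submission ID on success, which can be used to follow the job from the CLI as well as from the dispatcher UI. A sketch using the Spark CLI's status and log subcommands (the submission ID shown is illustrative):

    $ dcos spark status driver-20170101000000-0001
    $ dcos spark log driver-20170101000000-0001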

docs/run-job.md

Lines changed: 3 additions & 2 deletions
@@ -12,9 +12,10 @@ more][13].

     $ dcos spark run --submit-args=`--class MySampleClass http://external.website/mysparkapp.jar 30`

-
     $ dcos spark run --submit-args="--py-files mydependency.py http://external.website/mysparkapp.py 30"

+    $ dcos spark run --submit-args="http://external.website/mysparkapp.R"
+
 `dcos spark run` is a thin wrapper around the standard Spark
 `spark-submit` script. You can submit arbitrary pass-through options
 to this script via the `--submit-args` options.

@@ -64,7 +65,7 @@ To set Spark properties with a configuration file, create a

 # Versioning

-The DC/OS Spark docker image contains OpenJDK 8 and Python 2.7.6.
+The DC/OS Spark Docker image contains OpenJDK 8 and Python 2.7.6.

 DC/OS Spark distributions 1.X are compiled with Scala 2.10. DC/OS
 Spark distributions 2.X are compiled with Scala 2.11. Scala is not
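Because `--submit-args` passes straight through to `spark-submit`, arbitrary Spark properties can ride along with the application reference. A sketch combining a stock Spark property with the doc's placeholder jar (`spark.cores.max` caps the job's total executor cores):

    $ dcos spark run --submit-args="--conf spark.cores.max=4 --class MySampleClass http://external.website/mysparkapp.jar 30"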

docs/runtime-config-change.md

Lines changed: 4 additions & 6 deletions
@@ -7,15 +7,13 @@ enterprise: 'no'

 You can customize DC/OS Spark in-place when it is up and running.

-1. Go to the DC/OS web interface.
+1. Go to the DC/OS GUI.

 1. Click the **Services** tab, then the name of the Spark
    framework to be updated.

-1. Within the Spark instance details view, click **Edit**.
+1. Within the Spark instance details view, click the menu in the upper right, then choose **Edit**.

-1. In the dialog that appears, click the **Environment Variables**
-   tab and update any field(s) to their desired value(s).
+1. In the dialog that appears, click the **Environment** tab and update any field(s) to their desired value(s).

-1. Click **Deploy** to apply any changes and
-   cleanly reload Spark.
+1. Click **REVIEW & RUN** to apply any changes and cleanly reload Spark.

docs/security.md

Lines changed: 5 additions & 8 deletions
@@ -3,7 +3,6 @@ post_title: Security
 menu_order: 40
 enterprise: 'no'
 ---
-
 # Mesos Security

 ## SSL

@@ -23,13 +22,11 @@ enterprise: 'no'

 ## Authentication

-When running in [DC/OS strict security mode](https://docs.mesosphere.com/latest/administration/id-and-access-mgt/), both the dispatcher and jobs must authenticate to Mesos using a [DC/OS Service Account](https://docs.mesosphere.com/1.8/administration/id-and-access-mgt/service-auth/).
+When running in [DC/OS strict security mode](https://docs.mesosphere.com/latest/administration/id-and-access-mgt/), both the dispatcher and jobs must authenticate to Mesos using a [DC/OS Service Account](https://docs.mesosphere.com/1.9/administration/id-and-access-mgt/service-auth/).

 Follow these instructions to authenticate in strict mode:

-1. Create a Service Account
-
-   Instructions [here](https://docs.mesosphere.com/1.8/administration/id-and-access-mgt/service-auth/universe-service-auth/).
+1. Create a service account by following the instructions [here](https://docs.mesosphere.com/1.9/administration/id-and-access-mgt/service-auth/universe-service-auth/).

 1. Assign Permissions

@@ -47,7 +44,7 @@ Follow these instructions to authenticate in strict mode:
        "$(dcos config show core.dcos_url)/acs/api/v1/acls/dcos:mesos:master:task:user:root/users/${SERVICE_ACCOUNT_NAME}/create"
    ```

-   Now you must allow Spark to register under the desired role. This is the value used for `service.role` when installing Spark (default: `*`):
+   Now, you must allow Spark to register under the desired role. This is the value used for `service.role` when installing Spark (default: `*`):

    ```
    $ export ROLE=<service.role value>

@@ -88,7 +85,7 @@ Follow these instructions to authenticate in strict mode:

 1. Submit a Job

-   We've now installed the Spark Dispatcher, which is authenticating itself to the Mesos master. Spark jobs are also frameworks which must authenticate. The dispatcher will pass the secret along to the jobs, so all that's left to do is configure our jobs to use DC/OS authentication:
+   We've now installed the Spark Dispatcher, which is authenticating itself to the Mesos master. Spark jobs are also frameworks that must authenticate. The dispatcher will pass the secret along to the jobs, so all that's left to do is configure our jobs to use DC/OS authentication:

    ```
    $ PROPS="-Dspark.mesos.driverEnv.MESOS_MODULES=file:///opt/mesosphere/etc/mesos-scheduler-modules/dcos_authenticatee_module.json "

@@ -172,5 +169,5 @@ In addition to the described configuration, make sure to connect the DC/OS cluster

     $ dcos config set core.dcos_url https://<dcos-url>

-[11]: https://docs.mesosphere.com/1.8/overview/components/
+[11]: https://docs.mesosphere.com/1.9/overview/architecture/components/
 [12]: http://docs.oracle.com/javase/8/docs/technotes/tools/unix/keytool.html
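The `$PROPS` string assembled in the authentication steps is ultimately handed to the job submission. A hedged sketch of that final step, reusing the placeholder app from run-job.md (the exact continuation lives outside these hunks):

    $ dcos spark run --submit-args="${PROPS} --class MySampleClass http://external.website/mysparkapp.jar"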

docs/spark-shell.md

Lines changed: 10 additions & 1 deletion
@@ -7,7 +7,7 @@ enterprise: 'no'
 # Interactive Spark Shell

 You can run Spark commands interactively in the Spark shell. The Spark shell is available
-in either Scala or Python.
+in either Scala, Python, or R.

 1. SSH into a node in the DC/OS cluster. [Learn how to SSH into your cluster and get the agent node ID](https://dcos.io/docs/latest/administration/access-node/sshcluster/).

@@ -27,6 +27,10 @@ in either Scala or Python.

        $ ./bin/pyspark --master mesos://<internal-master-ip>:5050 --conf spark.mesos.executor.docker.image=mesosphere/spark:1.0.4-2.0.1 --conf spark.mesos.executor.home=/opt/spark/dist

+   Or, run the R Spark shell.
+
+       $ ./bin/sparkR --master mesos://<internal-master-ip>:5050 --conf spark.mesos.executor.docker.image=mesosphere/spark:1.0.7-2.1.0-hadoop-2.6 --conf spark.mesos.executor.home=/opt/spark/dist
+
 1. Run Spark commands interactively.

    In the Scala shell:

@@ -38,3 +42,8 @@ in either Scala or Python.

        $ textFile = sc.textFile("/opt/spark/dist/README.md")
        $ textFile.count()
+
+   In the R shell:
+
+       $ df <- as.DataFrame(faithful)
+       $ head(df)
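The new R example converts the built-in `faithful` dataset into a Spark DataFrame. A possible continuation of that session, assuming the standard SparkR 2.x API shipped with the referenced image (`createOrReplaceTempView` and `sql` are stock SparkR functions, not shown in the doc):

    $ createOrReplaceTempView(df, "faithful")
    $ head(sql("SELECT * FROM faithful WHERE waiting > 70"))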
