
Commit fed6303

Tweaks to Mesos docs
- Mention Apache downloads first
- Shorten some wording

Author: Matei Zaharia <[email protected]>

Closes apache#806 from mateiz/doc-update and squashes the following commits:

d9345cd [Matei Zaharia] typo
a179f8d [Matei Zaharia] Tweaks to Mesos docs
1 parent 40d6acd commit fed6303

1 file changed

docs/running-on-mesos.md

Lines changed: 34 additions & 37 deletions
@@ -3,16 +3,15 @@ layout: global
 title: Running Spark on Mesos
 ---
 
-# Why Mesos
-
 Spark can run on hardware clusters managed by [Apache Mesos](http://mesos.apache.org/).
 
 The advantages of deploying Spark with Mesos include:
+
 - dynamic partitioning between Spark and other
   [frameworks](https://mesos.apache.org/documentation/latest/mesos-frameworks/)
 - scalable partitioning between multiple instances of Spark
 
-# How it works
+# How it Works
 
 In a standalone cluster deployment, the cluster manager in the below diagram is a Spark master
 instance. When using Mesos, the Mesos master replaces the Spark master as the cluster manager.
@@ -37,11 +36,25 @@ require any special patches of Mesos.
 If you already have a Mesos cluster running, you can skip this Mesos installation step.
 
 Otherwise, installing Mesos for Spark is no different than installing Mesos for use by other
-frameworks. You can install Mesos using either prebuilt packages or by compiling from source.
+frameworks. You can install Mesos either from source or using prebuilt packages.
+
+## From Source
+
+To install Apache Mesos from source, follow these steps:
+
+1. Download a Mesos release from a
+   [mirror](http://www.apache.org/dyn/closer.cgi/mesos/{{site.MESOS_VERSION}}/)
+2. Follow the Mesos [Getting Started](http://mesos.apache.org/gettingstarted) page for compiling and
+   installing Mesos
+
+**Note:** If you want to run Mesos without installing it into the default paths on your system
+(e.g., if you lack administrative privileges to install it), pass the
+`--prefix` option to `configure` to tell it where to install. For example, pass
+`--prefix=/home/me/mesos`. By default the prefix is `/usr/local`.
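As a concrete sketch of the steps the new section describes (it assumes a release tarball has already been downloaded from a mirror; the version variable and install path are placeholders):

{% highlight bash %}
# Unpack a downloaded Mesos release (version is a placeholder)
tar -xzf mesos-{{site.MESOS_VERSION}}.tar.gz
cd mesos-{{site.MESOS_VERSION}}

# Configure with a custom prefix to avoid needing root privileges;
# omit --prefix to install to the default /usr/local
./configure --prefix=/home/me/mesos
make
make install
{% endhighlight %}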
 
-## Prebuilt packages
+## Third-Party Packages
 
-The Apache Mesos project only publishes source package releases, no binary releases. But other
+The Apache Mesos project only publishes source releases, not binary packages. But other
 third party projects publish binary releases that may be helpful in setting Mesos up.
 
 One of those is Mesosphere. To install Mesos using the binary releases provided by Mesosphere:
@@ -52,20 +65,6 @@ One of those is Mesosphere. To install Mesos using the binary releases provided
 The Mesosphere installation documents suggest setting up ZooKeeper to handle Mesos master failover,
 but Mesos can be run without ZooKeeper using a single master as well.
 
-## From source
-
-To install Mesos directly from the upstream project rather than a third party, install from source.
-
-1. Download the Mesos distribution from a
-   [mirror](http://www.apache.org/dyn/closer.cgi/mesos/{{site.MESOS_VERSION}}/)
-2. Follow the Mesos [Getting Started](http://mesos.apache.org/gettingstarted) page for compiling and
-   installing Mesos
-
-**Note:** If you want to run Mesos without installing it into the default paths on your system
-(e.g., if you lack administrative privileges to install it), you should also pass the
-`--prefix` option to `configure` to tell it where to install. For example, pass
-`--prefix=/home/user/mesos`. By default the prefix is `/usr/local`.
-
 ## Verification
 
 To verify that the Mesos cluster is ready for Spark, navigate to the Mesos master webui at port
@@ -74,32 +73,30 @@ To verify that the Mesos cluster is ready for Spark, navigate to the Mesos maste
 
 # Connecting Spark to Mesos
 
-To use Mesos from Spark, you need a Spark distribution available in a place accessible by Mesos, and
+To use Mesos from Spark, you need a Spark binary package available in a place accessible by Mesos, and
 a Spark driver program configured to connect to Mesos.
 
-## Uploading Spark Distribution
-
-When Mesos runs a task on a Mesos slave for the first time, that slave must have a distribution of
-Spark available for running the Spark Mesos executor backend. A distribution of Spark is just a
-compiled binary version of Spark.
+## Uploading Spark Package
 
-The Spark distribution can be hosted at any Hadoop URI, including HTTP via `http://`, [Amazon Simple
-Storage Service](http://aws.amazon.com/s3) via `s3://`, or HDFS via `hdfs:///`.
+When Mesos runs a task on a Mesos slave for the first time, that slave must have a Spark binary
+package for running the Spark Mesos executor backend.
+The Spark package can be hosted at any Hadoop-accessible URI, including HTTP via `http://`,
+[Amazon Simple Storage Service](http://aws.amazon.com/s3) via `s3n://`, or HDFS via `hdfs://`.
 
-To use a precompiled distribution:
+To use a precompiled package:
 
-1. Download a Spark distribution from the Spark [download page](https://spark.apache.org/downloads.html)
+1. Download a Spark binary package from the Spark [download page](https://spark.apache.org/downloads.html)
 2. Upload to hdfs/http/s3
 
 To host on HDFS, use the Hadoop fs put command: `hadoop fs -put spark-{{site.SPARK_VERSION}}.tar.gz
 /path/to/spark-{{site.SPARK_VERSION}}.tar.gz`
 
 
-Or if you are using a custom-compiled version of Spark, you will need to create a distribution using
+Or if you are using a custom-compiled version of Spark, you will need to create a package using
 the `make-distribution.sh` script included in a Spark source tarball/checkout.
 
 1. Download and build Spark using the instructions [here](index.html)
-2. Create a Spark distribution using `make-distribution.sh --tgz`.
+2. Create a binary package using `make-distribution.sh --tgz`.
 3. Upload archive to http/s3/hdfs
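Putting the custom-build steps together, a minimal sketch using the commands from the docs themselves (the HDFS destination path is a placeholder):

{% highlight bash %}
# From a Spark source checkout: build a binary package (produces a .tar.gz)
./make-distribution.sh --tgz

# Upload it somewhere every Mesos slave can reach, e.g. HDFS
hadoop fs -put spark-{{site.SPARK_VERSION}}.tar.gz /path/to/spark-{{site.SPARK_VERSION}}.tar.gz
{% endhighlight %}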
@@ -115,8 +112,8 @@ The driver also needs some configuration in `spark-env.sh` to interact properly
 `<prefix>/lib/libmesos.so` where the prefix is `/usr/local` by default. See Mesos installation
 instructions above. On Mac OS X, the library is called `libmesos.dylib` instead of
 `libmesos.so`.
-* `export SPARK_EXECUTOR_URI=<path to spark-{{site.SPARK_VERSION}}.tar.gz uploaded above>`.
-2. Also set `spark.executor.uri` to <path to spark-{{site.SPARK_VERSION}}.tar.gz>
+* `export SPARK_EXECUTOR_URI=<URL of spark-{{site.SPARK_VERSION}}.tar.gz uploaded above>`.
+2. Also set `spark.executor.uri` to `<URL of spark-{{site.SPARK_VERSION}}.tar.gz>`.
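A plausible `spark-env.sh` after these steps might look like the sketch below. The variable name `MESOS_NATIVE_LIBRARY` is assumed from the surrounding docs rather than shown in this diff, and both the library path and the package URL are placeholders:

{% highlight bash %}
# Native Mesos library; use libmesos.dylib instead on Mac OS X
export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so

# URL of the Spark binary package uploaded earlier
export SPARK_EXECUTOR_URI=hdfs://namenode:9000/path/to/spark-{{site.SPARK_VERSION}}.tar.gz
{% endhighlight %}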
 
 Now when starting a Spark application against the cluster, pass a `mesos://`
 or `zk://` URL as the master when creating a `SparkContext`. For example:
@@ -129,7 +126,7 @@ val conf = new SparkConf()
 val sc = new SparkContext(conf)
 {% endhighlight %}
 
-When running a shell the `spark.executor.uri` parameter is inherited from `SPARK_EXECUTOR_URI`, so
+When running a shell, the `spark.executor.uri` parameter is inherited from `SPARK_EXECUTOR_URI`, so
 it does not need to be redundantly passed in as a system property.
 
 {% highlight bash %}
@@ -168,7 +165,7 @@ using `conf.set("spark.cores.max", "10")` (for example).
 # Running Alongside Hadoop
 
 You can run Spark and Mesos alongside your existing Hadoop cluster by just launching them as a
-separate service on the machines. To access Hadoop data from Spark, a full hdfs:// URL is required
+separate service on the machines. To access Hadoop data from Spark, a full `hdfs://` URL is required
 (typically `hdfs://<namenode>:9000/path`, but you can find the right URL on your Hadoop Namenode web
 UI).
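One quick way to confirm that a full `hdfs://` URL is correct (the namenode host and port here are placeholders taken from the text above) is to list it with the Hadoop CLI:

{% highlight bash %}
# Should list the path if the URL is right; adjust host/port to your Namenode
hadoop fs -ls hdfs://namenode:9000/path
{% endhighlight %}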
@@ -195,7 +192,7 @@ A few places to look during debugging:
 And common pitfalls:
 
 - Spark assembly not reachable/accessible
-  - Slaves need to be able to download the distribution
+  - Slaves must be able to download the Spark binary package from the `http://`, `hdfs://` or `s3n://` URL you gave
 - Firewall blocking communications
   - Check for messages about failed connections
   - Temporarily disable firewalls for debugging and then poke appropriate holes
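For the pitfalls above, a couple of hypothetical spot-checks (the hostnames and package URL are placeholders, and 5050 is assumed as the usual Mesos master port):

{% highlight bash %}
# Can a slave actually fetch the Spark package from the URL you configured?
curl -sI http://host/path/spark-{{site.SPARK_VERSION}}.tar.gz | head -n 1

# Is the Mesos master reachable through the firewall?
nc -z mesos-master.example.com 5050 && echo "master port reachable"
{% endhighlight %}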
