@@ -3,16 +3,15 @@ layout: global
 title: Running Spark on Mesos
 ---
 
-# Why Mesos
-
 Spark can run on hardware clusters managed by [Apache Mesos](http://mesos.apache.org/).
 
 The advantages of deploying Spark with Mesos include:
+
 - dynamic partitioning between Spark and other
   [frameworks](https://mesos.apache.org/documentation/latest/mesos-frameworks/)
 - scalable partitioning between multiple instances of Spark
 
-# How it works
+# How it Works
 
 In a standalone cluster deployment, the cluster manager in the below diagram is a Spark master
 instance. When using Mesos, the Mesos master replaces the Spark master as the cluster manager.
@@ -37,11 +36,25 @@ require any special patches of Mesos.
 If you already have a Mesos cluster running, you can skip this Mesos installation step.
 
 Otherwise, installing Mesos for Spark is no different than installing Mesos for use by other
-frameworks. You can install Mesos using either prebuilt packages or by compiling from source.
+frameworks. You can install Mesos either from source or using prebuilt packages.
+
+## From Source
+
+To install Apache Mesos from source, follow these steps:
+
+1. Download a Mesos release from a
+   [mirror](http://www.apache.org/dyn/closer.cgi/mesos/{{site.MESOS_VERSION}}/)
+2. Follow the Mesos [Getting Started](http://mesos.apache.org/gettingstarted) page for compiling and
+   installing Mesos
+
+**Note:** If you want to run Mesos without installing it into the default paths on your system
+(e.g., if you lack administrative privileges to install it), pass the
+`--prefix` option to `configure` to tell it where to install. For example, pass
+`--prefix=/home/me/mesos`. By default the prefix is `/usr/local`.
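+
+As a rough sketch (assuming a Unix-like build host; the download URL, version, and prefix below are
+placeholders rather than required values), the source build might look like:
+
+{% highlight bash %}
+# Download and unpack a Mesos source release (pick a mirror from the link above)
+wget http://archive.apache.org/dist/mesos/{{site.MESOS_VERSION}}/mesos-{{site.MESOS_VERSION}}.tar.gz
+tar -xzf mesos-{{site.MESOS_VERSION}}.tar.gz
+cd mesos-{{site.MESOS_VERSION}}
+
+# Configure with a non-default prefix, then build and install
+mkdir build && cd build
+../configure --prefix=/home/me/mesos
+make
+make install
+{% endhighlight %}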
 
-## Prebuilt packages
+## Third-Party Packages
 
-The Apache Mesos project only publishes source package releases, no binary releases. But other
+The Apache Mesos project only publishes source releases, not binary packages. But other
 third party projects publish binary releases that may be helpful in setting Mesos up.
 
 One of those is Mesosphere. To install Mesos using the binary releases provided by Mesosphere:
@@ -52,20 +65,6 @@ One of those is Mesosphere. To install Mesos using the binary releases provided
 The Mesosphere installation documents suggest setting up ZooKeeper to handle Mesos master failover,
 but Mesos can be run without ZooKeeper using a single master as well.
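+
+For instance, a single-master setup without ZooKeeper can be brought up by hand; the sketch below
+assumes Mesos is already installed on the machines and uses placeholder IP addresses and paths:
+
+{% highlight bash %}
+# On the master machine, start a single Mesos master
+mesos-master --ip=192.168.0.1 --work_dir=/var/lib/mesos
+
+# On each worker machine, start a slave pointed at that master (the default master port is 5050)
+mesos-slave --master=192.168.0.1:5050
+{% endhighlight %}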
 
-## From source
-
-To install Mesos directly from the upstream project rather than a third party, install from source.
-
-1. Download the Mesos distribution from a
-   [mirror](http://www.apache.org/dyn/closer.cgi/mesos/{{site.MESOS_VERSION}}/)
-2. Follow the Mesos [Getting Started](http://mesos.apache.org/gettingstarted) page for compiling and
-   installing Mesos
-
-**Note:** If you want to run Mesos without installing it into the default paths on your system
-(e.g., if you lack administrative privileges to install it), you should also pass the
-`--prefix` option to `configure` to tell it where to install. For example, pass
-`--prefix=/home/user/mesos`. By default the prefix is `/usr/local`.
-
 ## Verification
 
 To verify that the Mesos cluster is ready for Spark, navigate to the Mesos master webui at port
@@ -74,32 +73,30 @@ To verify that the Mesos cluster is ready for Spark, navigate to the Mesos maste
 
 # Connecting Spark to Mesos
 
-To use Mesos from Spark, you need a Spark distribution available in a place accessible by Mesos, and
+To use Mesos from Spark, you need a Spark binary package available in a place accessible by Mesos, and
 a Spark driver program configured to connect to Mesos.
 
-## Uploading Spark Distribution
-
-When Mesos runs a task on a Mesos slave for the first time, that slave must have a distribution of
-Spark available for running the Spark Mesos executor backend. A distribution of Spark is just a
-compiled binary version of Spark.
+## Uploading Spark Package
 
-The Spark distribution can be hosted at any Hadoop URI, including HTTP via `http://`, [Amazon Simple
-Storage Service](http://aws.amazon.com/s3) via `s3://`, or HDFS via `hdfs:///`.
+When Mesos runs a task on a Mesos slave for the first time, that slave must have a Spark binary
+package for running the Spark Mesos executor backend.
+The Spark package can be hosted at any Hadoop-accessible URI, including HTTP via `http://`,
+[Amazon Simple Storage Service](http://aws.amazon.com/s3) via `s3n://`, or HDFS via `hdfs://`.
 
-To use a precompiled distribution:
+To use a precompiled package:
 
-1. Download a Spark distribution from the Spark [download page](https://spark.apache.org/downloads.html)
+1. Download a Spark binary package from the Spark [download page](https://spark.apache.org/downloads.html)
 2. Upload to hdfs/http/s3
 
 To host on HDFS, use the Hadoop fs put command: `hadoop fs -put spark-{{site.SPARK_VERSION}}.tar.gz
 /path/to/spark-{{site.SPARK_VERSION}}.tar.gz`
 
 
-Or if you are using a custom-compiled version of Spark, you will need to create a distribution using
+Or if you are using a custom-compiled version of Spark, you will need to create a package using
 the `make-distribution.sh` script included in a Spark source tarball/checkout.
 
 1. Download and build Spark using the instructions [here](index.html)
-2. Create a Spark distribution using `make-distribution.sh --tgz`.
+2. Create a binary package using `make-distribution.sh --tgz`.
 3. Upload archive to http/s3/hdfs
 
 
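+
+Put together, the custom-build steps above might look roughly like the following (the archive name
+produced by the build and the HDFS path are illustrative and depend on your setup):
+
+{% highlight bash %}
+# From a Spark source checkout, build a binary package (--tgz produces a .tar.gz archive)
+./make-distribution.sh --tgz
+
+# Upload the resulting archive to HDFS so Mesos slaves can fetch it
+hadoop fs -put spark-{{site.SPARK_VERSION}}.tar.gz /path/to/spark-{{site.SPARK_VERSION}}.tar.gz
+{% endhighlight %}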
@@ -115,8 +112,8 @@ The driver also needs some configuration in `spark-env.sh` to interact properly
    `<prefix>/lib/libmesos.so` where the prefix is `/usr/local` by default. See Mesos installation
    instructions above. On Mac OS X, the library is called `libmesos.dylib` instead of
    `libmesos.so`.
- * `export SPARK_EXECUTOR_URI=<path to spark-{{site.SPARK_VERSION}}.tar.gz uploaded above>`.
-2. Also set `spark.executor.uri` to <path to spark-{{site.SPARK_VERSION}}.tar.gz>
+ * `export SPARK_EXECUTOR_URI=<URL of spark-{{site.SPARK_VERSION}}.tar.gz uploaded above>`.
+2. Also set `spark.executor.uri` to `<URL of spark-{{site.SPARK_VERSION}}.tar.gz>`.
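+
+Taken together, the relevant lines in `spark-env.sh` might look like the following minimal sketch.
+It assumes the Mesos native library is exported through `MESOS_NATIVE_LIBRARY` (check the variable
+name in step 1 for your Spark version); the library path and URL are placeholders:
+
+{% highlight bash %}
+# Path to the Mesos native library installed earlier (placeholder path)
+export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
+
+# URL of the Spark binary package uploaded earlier (placeholder URL)
+export SPARK_EXECUTOR_URI=hdfs://namenode:9000/path/to/spark-{{site.SPARK_VERSION}}.tar.gz
+{% endhighlight %}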
 
 Now when starting a Spark application against the cluster, pass a `mesos://`
 or `zk://` URL as the master when creating a `SparkContext`. For example:
@@ -129,7 +126,7 @@ val conf = new SparkConf()
 val sc = new SparkContext(conf)
 {% endhighlight %}
 
-When running a shell the `spark.executor.uri` parameter is inherited from `SPARK_EXECUTOR_URI`, so
+When running a shell, the `spark.executor.uri` parameter is inherited from `SPARK_EXECUTOR_URI`, so
 it does not need to be redundantly passed in as a system property.
 
 {% highlight bash %}
@@ -168,7 +165,7 @@ using `conf.set("spark.cores.max", "10")` (for example).
 # Running Alongside Hadoop
 
 You can run Spark and Mesos alongside your existing Hadoop cluster by just launching them as a
-separate service on the machines. To access Hadoop data from Spark, a full hdfs:// URL is required
+separate service on the machines. To access Hadoop data from Spark, a full `hdfs://` URL is required
 (typically `hdfs://<namenode>:9000/path`, but you can find the right URL on your Hadoop Namenode web
 UI).
 
@@ -195,7 +192,7 @@ A few places to look during debugging:
 And common pitfalls:
 
 - Spark assembly not reachable/accessible
-  - Slaves need to be able to download the distribution
+  - Slaves must be able to download the Spark binary package from the `http://`, `hdfs://` or `s3n://` URL you gave
 - Firewall blocking communications
   - Check for messages about failed connections
   - Temporarily disable firewalls for debugging and then poke appropriate holes