
Commit 0c91927

SPARK-1167: Remove metrics-ganglia from default build due to LGPL issues.

This patch removes Ganglia integration from the default build. It allows users willing to link against LGPL code to use Ganglia by adding build flags or by linking against a new Spark artifact called spark-ganglia-lgpl. This brings Spark in line with the Apache policy on LGPL code enumerated here: https://www.apache.org/legal/3party.html#options-optional

Author: Patrick Wendell <[email protected]>

Closes #108 from pwendell/ganglia and squashes the following commits:

326712a [Patrick Wendell] Responding to review feedback
5f28ee4 [Patrick Wendell] SPARK-1167: Remove metrics-ganglia from default build due to LGPL issues.

(cherry picked from commit 16788a6)

Conflicts:
    core/pom.xml
    dev/audit-release/sbt_app_core/src/main/scala/SparkApp.scala
    dev/create-release/create-release.sh
    pom.xml
    project/SparkBuild.scala
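The build flags referenced in the commit message can be sketched as the following invocations; this is an illustrative fragment, assuming a checkout at the repository root (per project/SparkBuild.scala, the sbt build only checks whether `SPARK_GANGLIA_LGPL` is defined, not its value):

```
# Maven: activate the optional profile that adds extras/spark-ganglia-lgpl
# as a module and pulls it into the assembly.
mvn -Pspark-ganglia-lgpl -DskipTests package

# sbt: export the opt-in variable before building; any value works.
SPARK_GANGLIA_LGPL=true sbt/sbt assembly
```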
1 parent 6f0db0a commit 0c91927

7 files changed: 95 additions, 16 deletions

assembly/pom.xml

Lines changed: 10 additions & 0 deletions

@@ -158,6 +158,16 @@
         </dependency>
       </dependencies>
     </profile>
+    <profile>
+      <id>spark-ganglia-lgpl</id>
+      <dependencies>
+        <dependency>
+          <groupId>org.apache.spark</groupId>
+          <artifactId>spark-ganglia-lgpl_${scala.binary.version}</artifactId>
+          <version>${project.version}</version>
+        </dependency>
+      </dependencies>
+    </profile>
     <profile>
       <id>bigtop-dist</id>
       <!-- This profile uses the assembly plugin to create a special "dist" package for BigTop

core/pom.xml

Lines changed: 0 additions & 4 deletions

@@ -147,10 +147,6 @@
       <groupId>com.codahale.metrics</groupId>
      <artifactId>metrics-json</artifactId>
     </dependency>
-    <dependency>
-      <groupId>com.codahale.metrics</groupId>
-      <artifactId>metrics-ganglia</artifactId>
-    </dependency>
     <dependency>
       <groupId>com.codahale.metrics</groupId>
       <artifactId>metrics-graphite</artifactId>

docs/monitoring.md

Lines changed: 12 additions & 1 deletion

@@ -48,11 +48,22 @@ Each instance can report to zero or more _sinks_. Sinks are contained in the

 * `ConsoleSink`: Logs metrics information to the console.
 * `CSVSink`: Exports metrics data to CSV files at regular intervals.
-* `GangliaSink`: Sends metrics to a Ganglia node or multicast group.
 * `JmxSink`: Registers metrics for viewing in a JMX console.
 * `MetricsServlet`: Adds a servlet within the existing Spark UI to serve metrics data as JSON data.
 * `GraphiteSink`: Sends metrics to a Graphite node.

+Spark also supports a Ganglia sink which is not included in the default build due to
+licensing restrictions:
+
+* `GangliaSink`: Sends metrics to a Ganglia node or multicast group.
+
+To install the `GangliaSink` you'll need to perform a custom build of Spark. _**Note that
+by embedding this library you will include [LGPL](http://www.gnu.org/copyleft/lesser.html)-licensed
+code in your Spark package**_. For sbt users, set the
+`SPARK_GANGLIA_LGPL` environment variable before building. For Maven users, enable
+the `-Pspark-ganglia-lgpl` profile. In addition to modifying the cluster's Spark build,
+user applications will need to link to the `spark-ganglia-lgpl` artifact.
+
 The syntax of the metrics configuration file is defined in an example configuration file,
 `$SPARK_HOME/conf/metrics.conf.template`.
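The metrics configuration file uses Java-properties syntax. As a hypothetical sketch of wiring up the sink once a Ganglia-enabled build is installed (the `host`, `port`, and `period` keys are illustrative assumptions, not shown in this commit; only the `GangliaSink` class name comes from the docs above):

```
*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.host=239.2.11.71
*.sink.ganglia.port=8649
*.sink.ganglia.period=10
```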

extras/spark-ganglia-lgpl/pom.xml

Lines changed: 45 additions & 0 deletions

@@ -0,0 +1,45 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one or more
+  ~ contributor license agreements. See the NOTICE file distributed with
+  ~ this work for additional information regarding copyright ownership.
+  ~ The ASF licenses this file to You under the Apache License, Version 2.0
+  ~ (the "License"); you may not use this file except in compliance with
+  ~ the License. You may obtain a copy of the License at
+  ~
+  ~    http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing, software
+  ~ distributed under the License is distributed on an "AS IS" BASIS,
+  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  ~ See the License for the specific language governing permissions and
+  ~ limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <groupId>org.apache.spark</groupId>
+    <artifactId>spark-parent</artifactId>
+    <version>1.0.0-SNAPSHOT</version>
+    <relativePath>../../pom.xml</relativePath>
+  </parent>
+
+  <!-- Ganglia integration is not included by default due to LGPL-licensed code -->
+  <groupId>org.apache.spark</groupId>
+  <artifactId>spark-ganglia-lgpl_2.10</artifactId>
+  <packaging>jar</packaging>
+  <name>Spark Ganglia Integration</name>
+
+  <dependencies>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-core_${scala.binary.version}</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>com.codahale.metrics</groupId>
+      <artifactId>metrics-ganglia</artifactId>
+    </dependency>
+  </dependencies>
+</project>
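Since the new module is published as its own artifact, a downstream application that wants the Ganglia sink must declare the dependency itself. A minimal sketch of that declaration, assuming the Scala 2.10 artifact above (substitute your actual Spark version for the snapshot shown):

```
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-ganglia-lgpl_2.10</artifactId>
  <version>1.0.0-SNAPSHOT</version>
</dependency>
```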

pom.xml

Lines changed: 9 additions & 3 deletions

@@ -743,11 +743,9 @@
         <hadoop.version>0.23.7</hadoop.version>
         <!--<hadoop.version>2.0.5-alpha</hadoop.version> -->
       </properties>
-
       <modules>
         <module>yarn</module>
       </modules>
-
     </profile>

     <profile>

@@ -760,7 +758,15 @@
       <modules>
         <module>yarn</module>
       </modules>
-
     </profile>
+
+    <!-- Ganglia integration is not included by default due to LGPL-licensed code -->
+    <profile>
+      <id>spark-ganglia-lgpl</id>
+      <modules>
+        <module>extras/spark-ganglia-lgpl</module>
+      </modules>
+    </profile>
+
   </profiles>
 </project>

project/SparkBuild.scala

Lines changed: 19 additions & 8 deletions

@@ -61,7 +61,7 @@ object SparkBuild extends Build {
   lazy val mllib = Project("mllib", file("mllib"), settings = mllibSettings) dependsOn(core)

   lazy val assemblyProj = Project("assembly", file("assembly"), settings = assemblyProjSettings)
-    .dependsOn(core, graphx, bagel, mllib, repl, streaming) dependsOn(maybeYarn: _*)
+    .dependsOn(core, graphx, bagel, mllib, repl, streaming) dependsOn(maybeYarn: _*) dependsOn(maybeGanglia: _*)

   lazy val assembleDeps = TaskKey[Unit]("assemble-deps", "Build assembly of dependencies and packages Spark projects")

@@ -90,14 +90,21 @@ object SparkBuild extends Build {
     case None => DEFAULT_YARN
     case Some(v) => v.toBoolean
   }
-  lazy val hadoopClient = if (hadoopVersion.startsWith("0.20.") || hadoopVersion == "1.0.0") "hadoop-core" else "hadoop-client"
-
-  // Conditionally include the yarn sub-project
+  lazy val hadoopClient = if (hadoopVersion.startsWith("0.20.") || hadoopVersion == "1.0.0") "hadoop-core" else "hadoop-client"
+
+  // Include Ganglia integration if the user has enabled Ganglia
+  // This is isolated from the normal build due to LGPL-licensed code in the library
+  lazy val isGangliaEnabled = Properties.envOrNone("SPARK_GANGLIA_LGPL").isDefined
+  lazy val gangliaProj = Project("spark-ganglia-lgpl", file("extras/spark-ganglia-lgpl"), settings = gangliaSettings).dependsOn(core)
+  val maybeGanglia: Seq[ClasspathDependency] = if (isGangliaEnabled) Seq(gangliaProj) else Seq()
+  val maybeGangliaRef: Seq[ProjectReference] = if (isGangliaEnabled) Seq(gangliaProj) else Seq()
+
+  // Include the YARN project if the user has enabled YARN
   lazy val yarnAlpha = Project("yarn-alpha", file("yarn/alpha"), settings = yarnAlphaSettings) dependsOn(core)
   lazy val yarn = Project("yarn", file("yarn/stable"), settings = yarnSettings) dependsOn(core)

-  lazy val maybeYarn = if (isYarnEnabled) Seq[ClasspathDependency](if (isNewHadoop) yarn else yarnAlpha) else Seq[ClasspathDependency]()
-  lazy val maybeYarnRef = if (isYarnEnabled) Seq[ProjectReference](if (isNewHadoop) yarn else yarnAlpha) else Seq[ProjectReference]()
+  lazy val maybeYarn: Seq[ClasspathDependency] = if (isYarnEnabled) Seq(if (isNewHadoop) yarn else yarnAlpha) else Seq()
+  lazy val maybeYarnRef: Seq[ProjectReference] = if (isYarnEnabled) Seq(if (isNewHadoop) yarn else yarnAlpha) else Seq()

   lazy val externalTwitter = Project("external-twitter", file("external/twitter"), settings = twitterSettings)
     .dependsOn(streaming % "compile->compile;test->test")

@@ -121,7 +128,7 @@ object SparkBuild extends Build {
     .dependsOn(core, mllib, graphx, bagel, streaming, externalTwitter) dependsOn(allExternal: _*)

   // Everything except assembly, tools and examples belong to packageProjects
-  lazy val packageProjects = Seq[ProjectReference](core, repl, bagel, streaming, mllib, graphx) ++ maybeYarnRef
+  lazy val packageProjects = Seq[ProjectReference](core, repl, bagel, streaming, mllib, graphx) ++ maybeYarnRef ++ maybeGangliaRef

   lazy val allProjects = packageProjects ++ allExternalRefs ++ Seq[ProjectReference](examples, tools, assemblyProj)

@@ -281,7 +288,6 @@ object SparkBuild extends Build {
     "com.codahale.metrics" % "metrics-core" % "3.0.0",
     "com.codahale.metrics" % "metrics-jvm" % "3.0.0",
     "com.codahale.metrics" % "metrics-json" % "3.0.0",
-    "com.codahale.metrics" % "metrics-ganglia" % "3.0.0",
     "com.codahale.metrics" % "metrics-graphite" % "3.0.0",
     "com.twitter" %% "chill" % "0.3.1",
     "com.twitter" % "chill-java" % "0.3.1",

@@ -371,6 +377,11 @@ object SparkBuild extends Build {
     name := "spark-yarn"
   )

+  def gangliaSettings = sharedSettings ++ Seq(
+    name := "spark-ganglia-lgpl",
+    libraryDependencies += "com.codahale.metrics" % "metrics-ganglia" % "3.0.0"
+  )
+
   // Conditionally include the YARN dependencies because some tools look at all sub-projects and will complain
   // if we refer to nonexistent dependencies (e.g. hadoop-yarn-api from a Hadoop version without YARN).
   def extraYarnSettings = if(isYarnEnabled) yarnEnabledSettings else Seq()
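The SparkBuild change above gates the optional subproject on the mere presence of an environment variable, never on its value. That idiom can be isolated into a tiny standalone sketch (the `GangliaGate` object and module names here are illustrative, not part of the actual build):

```scala
// Sketch of the gating idiom: the module list grows only when the
// opt-in variable is defined; its value is never inspected.
object GangliaGate {
  def isEnabled(env: Map[String, String]): Boolean =
    env.get("SPARK_GANGLIA_LGPL").isDefined

  def modules(env: Map[String, String]): Seq[String] = {
    val base = Seq("core", "repl", "streaming")
    base ++ (if (isEnabled(env)) Seq("spark-ganglia-lgpl") else Seq())
  }
}
```

In the real build, `Properties.envOrNone("SPARK_GANGLIA_LGPL").isDefined` plays the role of `isEnabled`, so even `SPARK_GANGLIA_LGPL=""` opts in.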
