
Commit 7379b29

Merge remote-tracking branch 'upstream/master'

2 parents: 66d5012 + a2e7e04

176 files changed: 1487 additions & 2022 deletions


.gitignore

Lines changed: 2 additions & 0 deletions
@@ -1,6 +1,8 @@
 *~
 *.swp
+*.ipr
 *.iml
+*.iws
 .idea/
 .settings
 .cache

README.md

Lines changed: 17 additions & 11 deletions
@@ -13,20 +13,20 @@ This README file only contains basic setup instructions.
 ## Building
 
 Spark requires Scala 2.10. The project is built using Simple Build Tool (SBT),
-which is packaged with it. To build Spark and its example programs, run:
+which can be obtained [here](http://www.scala-sbt.org). To build Spark and its example programs, run:
 
-    sbt/sbt assembly
+    sbt assembly
 
 Once you've built Spark, the easiest way to start using it is the shell:
 
-    ./spark-shell
+    ./bin/spark-shell
 
-Or, for the Python API, the Python shell (`./pyspark`).
+Or, for the Python API, the Python shell (`./bin/pyspark`).
 
 Spark also comes with several sample programs in the `examples` directory.
-To run one of them, use `./run-example <class> <params>`. For example:
+To run one of them, use `./bin/run-example <class> <params>`. For example:
 
-    ./run-example org.apache.spark.examples.SparkLR local[2]
+    ./bin/run-example org.apache.spark.examples.SparkLR local[2]
 
 will run the Logistic Regression example locally on 2 CPUs.
 
@@ -36,7 +36,13 @@ All of the Spark samples take a `<master>` parameter that is the cluster URL
 to connect to. This can be a mesos:// or spark:// URL, or "local" to run
 locally with one thread, or "local[N]" to run locally with N threads.
 
+## Running tests
 
+Testing first requires [Building](#Building) Spark. Once Spark is built, tests
+can be run using:
+
+`sbt test`
+
 ## A Note About Hadoop Versions
 
 Spark uses the Hadoop core library to talk to HDFS and other Hadoop-supported
@@ -49,22 +55,22 @@ For Apache Hadoop versions 1.x, Cloudera CDH MRv1, and other Hadoop
 versions without YARN, use:
 
     # Apache Hadoop 1.2.1
-    $ SPARK_HADOOP_VERSION=1.2.1 sbt/sbt assembly
+    $ SPARK_HADOOP_VERSION=1.2.1 sbt assembly
 
     # Cloudera CDH 4.2.0 with MapReduce v1
-    $ SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.2.0 sbt/sbt assembly
+    $ SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.2.0 sbt assembly
 
 For Apache Hadoop 2.2.X, 2.1.X, 2.0.X, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions
 with YARN, also set `SPARK_YARN=true`:
 
    # Apache Hadoop 2.0.5-alpha
-    $ SPARK_HADOOP_VERSION=2.0.5-alpha SPARK_YARN=true sbt/sbt assembly
+    $ SPARK_HADOOP_VERSION=2.0.5-alpha SPARK_YARN=true sbt assembly
 
     # Cloudera CDH 4.2.0 with MapReduce v2
-    $ SPARK_HADOOP_VERSION=2.0.0-cdh4.2.0 SPARK_YARN=true sbt/sbt assembly
+    $ SPARK_HADOOP_VERSION=2.0.0-cdh4.2.0 SPARK_YARN=true sbt assembly
 
     # Apache Hadoop 2.2.X and newer
-    $ SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true sbt/sbt assembly
+    $ SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true sbt assembly
 
 When developing a Spark application, specify the Hadoop version by adding the
 "hadoop-client" artifact to your project's dependencies. For example, if you're

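The README diff above ends mid-sentence, just as it begins the `hadoop-client` example. For readers following along, here is a minimal sketch of such a dependency declaration in an SBT build definition; the project name, Spark version, and Hadoop version are illustrative assumptions, not taken from this commit:

    // build.sbt — hypothetical application build, shown for illustration only.
    // "hadoop-client" pins the Hadoop version the application (and Spark's
    // HDFS client) links against; match it to your cluster's version.
    name := "spark-app-example"

    scalaVersion := "2.10.3"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "0.9.0-incubating",
      "org.apache.hadoop" % "hadoop-client" % "1.2.1"
    )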
assembly/lib/PY4J_LICENSE.txt

Lines changed: 0 additions & 27 deletions
This file was deleted.

assembly/lib/PY4J_VERSION.txt

Lines changed: 0 additions & 1 deletion
This file was deleted.

-101 KB
Binary file not shown.

assembly/lib/net/sf/py4j/py4j/0.7/py4j-0.7.pom

Lines changed: 0 additions & 9 deletions
This file was deleted.

assembly/lib/net/sf/py4j/py4j/maven-metadata-local.xml

Lines changed: 0 additions & 12 deletions
This file was deleted.

assembly/pom.xml

Lines changed: 12 additions & 2 deletions
@@ -67,7 +67,7 @@
     <dependency>
       <groupId>net.sf.py4j</groupId>
       <artifactId>py4j</artifactId>
-      <version>0.7</version>
+      <version>0.8.1</version>
     </dependency>
   </dependencies>
 
@@ -124,7 +124,17 @@
 
   <profiles>
     <profile>
-      <id>hadoop2-yarn</id>
+      <id>yarn-alpha</id>
+      <dependencies>
+        <dependency>
+          <groupId>org.apache.spark</groupId>
+          <artifactId>spark-yarn-alpha_${scala.binary.version}</artifactId>
+          <version>${project.version}</version>
+        </dependency>
+      </dependencies>
+    </profile>
+    <profile>
+      <id>yarn</id>
       <dependencies>
         <dependency>
           <groupId>org.apache.spark</groupId>

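One note on the new `spark-yarn-alpha_${scala.binary.version}` artifact id above: the `_${scala.binary.version}` suffix is the Maven spelling of Scala cross-versioning, and SBT's `%%` operator appends the same suffix automatically. A hedged sketch using the `spark-core` coordinates (a real artifact, though the version shown is illustrative):

    // build.sbt fragment — cross-versioning illustration, not from this commit.
    // With scalaVersion 2.10.x, "%%" appends "_2.10" to the artifact id, so this
    // resolves to spark-core_2.10 — the same convention that yields
    // spark-yarn-alpha_2.10 from the profile above.
    scalaVersion := "2.10.3"

    libraryDependencies += "org.apache.spark" %% "spark-core" % "0.9.0-incubating"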
assembly/src/main/assembly/assembly.xml

Lines changed: 4 additions & 7 deletions
@@ -39,23 +39,20 @@
     </fileSet>
     <fileSet>
       <directory>
-        ${project.parent.basedir}/bin/
+        ${project.parent.basedir}/sbin/
       </directory>
-      <outputDirectory>/bin</outputDirectory>
+      <outputDirectory>/sbin</outputDirectory>
       <includes>
         <include>**/*</include>
       </includes>
     </fileSet>
     <fileSet>
       <directory>
-        ${project.parent.basedir}
+        ${project.parent.basedir}/bin/
       </directory>
       <outputDirectory>/bin</outputDirectory>
       <includes>
-        <include>run-example*</include>
-        <include>spark-class*</include>
-        <include>spark-shell*</include>
-        <include>spark-executor*</include>
+        <include>**/*</include>
       </includes>
     </fileSet>
   </fileSets>

bin/compute-classpath.cmd

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ rem Load environment variables from conf\spark-env.cmd, if it exists
 if exist "%FWDIR%conf\spark-env.cmd" call "%FWDIR%conf\spark-env.cmd"
 
 rem Build up classpath
-set CLASSPATH=%SPARK_CLASSPATH%;%FWDIR%conf
+set CLASSPATH=%FWDIR%conf
 if exist "%FWDIR%RELEASE" (
   for %%d in ("%FWDIR%jars\spark-assembly*.jar") do (
     set ASSEMBLY_JAR=%%d
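The changed line seeds the classpath this Windows script hands to the JVM; dropping `%SPARK_CLASSPATH%` from the front suggests user-supplied entries are now added elsewhere (an inference from this hunk alone; the rest of the script is not shown). A quick way to verify what actually reaches the JVM is to print `java.class.path`; a small diagnostic sketch:

    // ShowClasspath.scala — diagnostic sketch, not part of this commit.
    // Prints one classpath entry per line, exactly as the launch scripts assembled it.
    object ShowClasspath {
      def main(args: Array[String]): Unit = {
        val sep = System.getProperty("path.separator") // ";" on Windows, ":" elsewhere
        System.getProperty("java.class.path").split(sep).foreach(println)
      }
    }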
