Commit ad7cb53

Merge branch 'master' of https://github.com/apache/spark into reduce-locations
Conflicts:
    core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
    core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
2 parents df14cee + cd3176b commit ad7cb53

1,974 files changed, with 161,855 additions and 28,113 deletions

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -63,6 +63,9 @@ ec2/lib/
 rat-results.txt
 scalastyle.txt
 scalastyle-output.xml
+R-unit-tests.log
+R/unit-tests.out
+python/lib/pyspark.zip

 # For Hive
 metastore_db/

.rat-excludes

Lines changed: 18 additions & 0 deletions
@@ -15,6 +15,7 @@ TAGS
 RELEASE
 control
 docs
+docker.properties.template
 fairscheduler.xml.template
 spark-defaults.conf.template
 log4j.properties
@@ -29,7 +30,12 @@ spark-env.sh.template
 log4j-defaults.properties
 bootstrap-tooltip.js
 jquery-1.11.1.min.js
+d3.min.js
+dagre-d3.min.js
+graphlib-dot.min.js
 sorttable.js
+vis.min.js
+vis.min.css
 .*avsc
 .*txt
 .*json
@@ -67,3 +73,15 @@ logs
 .*scalastyle-output.xml
 .*dependency-reduced-pom.xml
 known_translations
+json_expectation
+local-1422981759269/*
+local-1422981780767/*
+local-1425081759269/*
+local-1426533911241/*
+local-1426633911242/*
+local-1430917381534/*
+local-1430917381535_1
+local-1430917381535_2
+DESCRIPTION
+NAMESPACE
+test_support/*

CONTRIBUTING.md

Lines changed: 13 additions & 9 deletions
@@ -1,12 +1,16 @@
 ## Contributing to Spark

-Contributions via GitHub pull requests are gladly accepted from their original
-author. Along with any pull requests, please state that the contribution is
-your original work and that you license the work to the project under the
-project's open source license. Whether or not you state this explicitly, by
-submitting any copyrighted material via pull request, email, or other means
-you agree to license the material under the project's open source license and
-warrant that you have the legal authority to do so.
+*Before opening a pull request*, review the
+[Contributing to Spark wiki](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark).
+It lists steps that are required before creating a PR. In particular, consider:
+
+- Is the change important and ready enough to ask the community to spend time reviewing?
+- Have you searched for existing, related JIRAs and pull requests?
+- Is this a new feature that can stand alone as a package on http://spark-packages.org ?
+- Is the change being proposed clearly explained and motivated?

-Please see the [Contributing to Spark wiki page](https://cwiki.apache.org/SPARK/Contributing+to+Spark)
-for more information.
+When you contribute code, you affirm that the contribution is your original work and that you
+license the work to the project under the project's open source license. Whether or not you
+state this explicitly, by submitting any copyrighted material via pull request, email, or
+other means you agree to license the material under the project's open source license and
+warrant that you have the legal authority to do so.

LICENSE

Lines changed: 94 additions & 1 deletion
@@ -643,6 +643,36 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 THE SOFTWARE.

+========================================================================
+For d3 (core/src/main/resources/org/apache/spark/ui/static/d3.min.js):
+========================================================================
+
+Copyright (c) 2010-2015, Michael Bostock
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+* Redistributions of source code must retain the above copyright notice, this
+list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright notice,
+this list of conditions and the following disclaimer in the documentation
+and/or other materials provided with the distribution.
+
+* The name Michael Bostock may not be used to endorse or promote products
+derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL MICHAEL BOSTOCK BE LIABLE FOR ANY DIRECT,
+INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

 ========================================================================
 For Scala Interpreter classes (all .scala files in repl/src/main/scala
@@ -806,6 +836,68 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.

+========================================================================
+For vis.js (core/src/main/resources/org/apache/spark/ui/static/vis.min.js):
+========================================================================
+Copyright (C) 2010-2015 Almende B.V.
+
+Vis.js is dual licensed under both
+
+* The Apache 2.0 License
+http://www.apache.org/licenses/LICENSE-2.0
+
+and
+
+* The MIT License
+http://opensource.org/licenses/MIT
+
+Vis.js may be distributed under either license.
+
+========================================================================
+For dagre-d3 (core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js):
+========================================================================
+Copyright (c) 2013 Chris Pettitt
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
+
+========================================================================
+For graphlib-dot (core/src/main/resources/org/apache/spark/ui/static/graphlib-dot.min.js):
+========================================================================
+Copyright (c) 2012-2013 Chris Pettitt
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.

 ========================================================================
 BSD-style licenses
@@ -814,7 +906,8 @@ BSD-style licenses
 The following components are provided under a BSD-style license. See project link for details.

 (BSD 3 Clause) core (com.github.fommil.netlib:core:1.1.2 - https://github.com/fommil/netlib-java/core)
-(BSD 3-clause style license) jblas (org.jblas:jblas:1.2.3 - http://jblas.org/)
+(BSD 3 Clause) JPMML-Model (org.jpmml:pmml-model:1.1.15 - https://github.com/jpmml/jpmml-model)
+(BSD 3-clause style license) jblas (org.jblas:jblas:1.2.4 - http://jblas.org/)
 (BSD License) AntLR Parser Generator (antlr:antlr:2.7.7 - http://www.antlr.org/)
 (BSD License) Javolution (javolution:javolution:5.5.1 - http://javolution.org)
 (BSD licence) ANTLR ST4 4.0.4 (org.antlr:ST4:4.0.4 - http://www.stringtemplate.org)

R/.gitignore

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+*.o
+*.so
+*.Rd
+lib
+pkg/man
+pkg/html

R/DOCUMENTATION.md

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+# SparkR Documentation
+
+SparkR documentation is generated using in-source comments annotated using using
+`roxygen2`. After making changes to the documentation, to generate man pages,
+you can run the following from an R console in the SparkR home directory
+
+library(devtools)
+devtools::document(pkg="./pkg", roclets=c("rd"))
+
+You can verify if your changes are good by running
+
+R CMD check pkg/

R/README.md

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
+# R on Spark
+
+SparkR is an R package that provides a light-weight frontend to use Spark from R.
+
+### SparkR development
+
+#### Build Spark
+
+Build Spark with [Maven](http://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn) and include the `-PsparkR` profile to build the R package. For example to use the default Hadoop versions you can run
+```
+build/mvn -DskipTests -Psparkr package
+```
+
+#### Running sparkR
+
+You can start using SparkR by launching the SparkR shell with
+
+./bin/sparkR
+
+The `sparkR` script automatically creates a SparkContext with Spark by default in
+local mode. To specify the Spark master of a cluster for the automatically created
+SparkContext, you can run
+
+./bin/sparkR --master "local[2]"
+
+To set other options like driver memory, executor memory etc. you can pass in the [spark-submit](http://spark.apache.org/docs/latest/submitting-applications.html) arguments to `./bin/sparkR`
+
+#### Using SparkR from RStudio
+
+If you wish to use SparkR from RStudio or other R frontends you will need to set some environment variables which point SparkR to your Spark installation. For example
+```
+# Set this to where Spark is installed
+Sys.setenv(SPARK_HOME="/Users/shivaram/spark")
+# This line loads SparkR from the installed directory
+.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
+library(SparkR)
+sc <- sparkR.init(master="local")
+```
+
+#### Making changes to SparkR
+
+The [instructions](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark) for making contributions to Spark also apply to SparkR.
+If you only make R file changes (i.e. no Scala changes) then you can just re-install the R package using `R/install-dev.sh` and test your changes.
+Once you have made your changes, please include unit tests for them and run existing unit tests using the `run-tests.sh` script as described below.
+
+#### Generating documentation
+
+The SparkR documentation (Rd files and HTML files) are not a part of the source repository. To generate them you can run the script `R/create-docs.sh`. This script uses `devtools` and `knitr` to generate the docs and these packages need to be installed on the machine before using the script.
+
+### Examples, Unit tests
+
+SparkR comes with several sample programs in the `examples/src/main/r` directory.
+To run one of them, use `./bin/sparkR <filename> <args>`. For example:
+
+./bin/sparkR examples/src/main/r/dataframe.R
+
+You can also run the unit-tests for SparkR by running (you need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first):
+
+R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
+./R/run-tests.sh
+
+### Running on YARN
+The `./bin/spark-submit` and `./bin/sparkR` can also be used to submit jobs to YARN clusters. You will need to set YARN conf dir before doing so. For example on CDH you can run
+```
+export YARN_CONF_DIR=/etc/hadoop/conf
+./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
+```
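
The "Running on YARN" step in the README above requires `YARN_CONF_DIR` to be set before submitting. A minimal sketch of a pre-flight check follows; `check_yarn_conf_dir` is a hypothetical helper name introduced here for illustration and is not part of Spark or of this commit.

```shell
#!/bin/bash
# Hypothetical helper (not part of Spark): verify that YARN_CONF_DIR is set
# and points at an existing directory before submitting a job to YARN.
check_yarn_conf_dir() {
  local conf_dir="${YARN_CONF_DIR:-}"
  if [ -z "$conf_dir" ]; then
    echo "YARN_CONF_DIR is not set" >&2
    return 1
  fi
  if [ ! -d "$conf_dir" ]; then
    echo "YARN_CONF_DIR is not a directory: $conf_dir" >&2
    return 1
  fi
  return 0
}
```

Under these assumptions it could be chained in front of the command from the README, for example `check_yarn_conf_dir && ./bin/spark-submit --master yarn examples/src/main/r/dataframe.R`.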

R/WINDOWS.md

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+## Building SparkR on Windows
+
+To build SparkR on Windows, the following steps are required
+
+1. Install R (>= 3.1) and [Rtools](http://cran.r-project.org/bin/windows/Rtools/). Make sure to
+include Rtools and R in `PATH`.
+2. Install
+[JDK7](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html) and set
+`JAVA_HOME` in the system environment variables.
+3. Download and install [Maven](http://maven.apache.org/download.html). Also include the `bin`
+directory in Maven in `PATH`.
+4. Set `MAVEN_OPTS` as described in [Building Spark](http://spark.apache.org/docs/latest/building-spark.html).
+5. Open a command shell (`cmd`) in the Spark directory and run `mvn -DskipTests -Psparkr package`

R/create-docs.sh

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+#!/bin/bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Script to create API docs for SparkR
+# This requires `devtools` and `knitr` to be installed on the machine.
+
+# After running this script the html docs can be found in
+# $SPARK_HOME/R/pkg/html
+
+set -o pipefail
+set -e
+
+# Figure out where the script is
+export FWDIR="$(cd "`dirname "$0"`"; pwd)"
+pushd $FWDIR
+
+# Generate Rd file
+Rscript -e 'library(devtools); devtools::document(pkg="./pkg", roclets=c("rd"))'
+
+# Install the package
+./install-dev.sh
+
+# Now create HTML files
+
+# knit_rd puts html in current working directory
+mkdir -p pkg/html
+pushd pkg/html
+
+Rscript -e 'library(SparkR, lib.loc="../../lib"); library(knitr); knit_rd("SparkR")'
+
+popd
+
+popd
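
The "Figure out where the script is" step in `create-docs.sh` above resolves the script's own directory and then changes into it. The sketch below isolates that idiom for illustration only; `script_dir` is a hypothetical function name and is not part of this commit. It uses `$(...)` rather than backticks and quotes every expansion, since an unquoted `pushd $FWDIR` would be split into several words if the path contained spaces.

```shell
#!/bin/bash
# Hypothetical helper (not part of the commit): print the absolute path of
# the directory that contains the script path passed as the first argument.
script_dir() {
  # Run cd in a subshell so the caller's working directory is unchanged,
  # then print the resulting directory.
  (cd "$(dirname "$1")" && pwd)
}
```

A caller could then write `FWDIR="$(script_dir "$0")"` followed by `pushd "$FWDIR"`, keeping the quotes around the variable.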

R/install-dev.bat

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+@echo off
+
+rem
+rem Licensed to the Apache Software Foundation (ASF) under one or more
+rem contributor license agreements. See the NOTICE file distributed with
+rem this work for additional information regarding copyright ownership.
+rem The ASF licenses this file to You under the Apache License, Version 2.0
+rem (the "License"); you may not use this file except in compliance with
+rem the License. You may obtain a copy of the License at
+rem
+rem http://www.apache.org/licenses/LICENSE-2.0
+rem
+rem Unless required by applicable law or agreed to in writing, software
+rem distributed under the License is distributed on an "AS IS" BASIS,
+rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+rem See the License for the specific language governing permissions and
+rem limitations under the License.
+rem
+
+rem Install development version of SparkR
+rem
+
+set SPARK_HOME=%~dp0..
+
+MKDIR %SPARK_HOME%\R\lib
+
+R.exe CMD INSTALL --library="%SPARK_HOME%\R\lib" %SPARK_HOME%\R\pkg\
