Merged
Changes from all commits
155 commits
6f61e1f
[SPARK-4761][SQL] Enables Kryo by default in Spark SQL Thrift server
liancheng Dec 5, 2014
98a7d09
[SPARK-4005][CORE] handle message replies in receive instead of in th…
liyezhang556520 Dec 5, 2014
6eb1b6f
Streaming doc : do you mean inadvertently?
CrazyJvm Dec 5, 2014
e895e0c
[SPARK-3623][GraphX] GraphX should support the checkpoint operation
witgo Dec 6, 2014
2e6b736
[SPARK-4646] Replace Scala.util.Sorting.quickSort with Sorter(TimSort…
maropu Dec 8, 2014
8817fc7
[SPARK-4620] Add unpersist in Graph and GraphImpl
maropu Dec 8, 2014
ab2abcb
[SPARK-4764] Ensure that files are fetched atomically
Dec 8, 2014
d6a972b
[SPARK-4774] [SQL] Makes HiveFromSpark more portable
Dec 8, 2014
65f929d
[SPARK-4750] Dynamic allocation - synchronize kills
Dec 9, 2014
e829bfa
SPARK-3926 [CORE] Reopened: result of JavaRDD collectAsMap() is not s…
srowen Dec 9, 2014
cda94d1
SPARK-4770. [DOC] [YARN] spark.scheduler.minRegisteredResourcesRatio …
sryza Dec 9, 2014
9443843
[SQL] remove unnecessary import in spark-sql
Dec 9, 2014
51b1fe1
[SPARK-4769] [SQL] CTAS does not work when reading from temporary tables
chenghao-intel Dec 9, 2014
bcb5cda
[SPARK-3154][STREAMING] Replace ConcurrentHashMap with mutable.HashMa…
zsxwing Dec 9, 2014
383c555
[SPARK-4785][SQL] Initilize Hive UDFs on the driver and serialize the…
chenghao-intel Dec 9, 2014
912563a
SPARK-4338. [YARN] Ditch yarn-alpha.
sryza Dec 9, 2014
61f1a70
[SPARK-874] adding a --wait flag
Dec 9, 2014
b310744
[SPARK-4691][shuffle] Restructure a few lines in shuffle code
Dec 9, 2014
1f51106
[SPARK-4765] Make GC time always shown in UI.
kayousterhout Dec 9, 2014
30dca92
[SPARK-4714] BlockManager.dropFromMemory() should check whether block…
suyanNone Dec 9, 2014
5e4c06f
SPARK-4567. Make SparkJobInfo and SparkStageInfo serializable
sryza Dec 10, 2014
d8f84f2
SPARK-4805 [CORE] BlockTransferMessage.toByteArray() trips assertion
srowen Dec 10, 2014
2b9b726
[SPARK-4740] Create multiple concurrent connections between two peer …
rxin Dec 10, 2014
9bd9334
Config updates for the new shuffle transport.
rxin Dec 10, 2014
f79c1cf
[Minor] Use <sup> tag for help icon in web UI page header
JoshRosen Dec 10, 2014
94b377f
[SPARK-4772] Clear local copies of accumulators as soon as we're done…
Dec 10, 2014
742e709
[SPARK-4161]Spark shell class path is not correctly set if "spark.dri…
witgo Dec 10, 2014
0fc637b
[SPARK-4329][WebUI] HistoryPage pagenation
sarutak Dec 10, 2014
5621283
[SPARK-4771][Docs] Document standalone cluster supervise mode
Dec 10, 2014
faa8fd8
[SPARK-4215] Allow requesting / killing executors only in YARN mode
Dec 10, 2014
e230da1
[SPARK-4793] [Deploy] ensure .jar at end of line
adrian-wang Dec 10, 2014
447ae2d
[SPARK-4569] Rename 'externalSorting' in Aggregator
Dec 10, 2014
4f93d0c
[SPARK-4759] Fix driver hanging from coalescing partitions
Dec 10, 2014
36bdb5b
MAINTENANCE: Automated closing of pull requests.
pwendell Dec 10, 2014
652b781
SPARK-3526 Add section about data locality to the tuning guide
ash211 Dec 10, 2014
57d37f9
[CORE]codeStyle: uniform ConcurrentHashMap define in StorageLevel.sca…
liyezhang556520 Dec 11, 2014
2a5b5fd
[SPARK-4791] [sql] Infer schema from case class with multiple constru…
jkbradley Dec 11, 2014
b004150
[SPARK-4806] Streaming doc update for 1.2
tdas Dec 11, 2014
bf40cf8
[SPARK-4713] [SQL] SchemaRDD.unpersist() should not raise exception i…
chenghao-intel Dec 12, 2014
a7f07f5
[SPARK-4662] [SQL] Whitelist more unittest
chenghao-intel Dec 12, 2014
c152dde
[SPARK-4639] [SQL] Pass maxIterations in as a parameter in Analyzer
Dec 12, 2014
3344803
[SPARK-4293][SQL] Make Cast be able to handle complex types.
ueshin Dec 12, 2014
d8cf678
[SQL] Remove unnecessary case in HiveContext.toHiveString
scwf Dec 12, 2014
acb3be6
[SPARK-4828] [SQL] sum and avg on empty table should always return null
adrian-wang Dec 12, 2014
cbb634a
[SQL] enable empty aggr test case
adrian-wang Dec 12, 2014
0abbff2
[SPARK-4825] [SQL] CTAS fails to resolve when created using saveAsTable
chenghao-intel Dec 12, 2014
8091dd6
[SPARK-4742][SQL] The name of Parquet File generated by AppendingParq…
sasakitoa Dec 12, 2014
41a3f93
[SPARK-4829] [SQL] add rule to fold count(expr) if expr is not null
adrian-wang Dec 12, 2014
ef84dab
MAINTENANCE: Automated closing of pull requests.
pwendell Dec 12, 2014
2a2983f
fixed spelling errors in documentation
peterklipfel Dec 14, 2014
4c06738
HOTFIX: Disabling failing block manager test
pwendell Dec 15, 2014
8098fab
[SPARK-4494][mllib] IDFModel.transform() add support for single vector
yu-iskw Dec 15, 2014
f6b8591
[SPARK-4826] Fix generation of temp file names in WAL tests
JoshRosen Dec 15, 2014
38703bb
[SPARK-1037] The name of findTaskFromList & findTask in TaskSetManage…
Dec 15, 2014
8176b7a
[SPARK-4668] Fix some documentation typos.
ryan-williams Dec 15, 2014
2a28bc6
SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions
srowen Dec 16, 2014
5c24759
[Minor][Core] fix comments in MapOutputTracker
scwf Dec 16, 2014
81112e4
SPARK-4814 [CORE] Enable assertions in SBT, Maven tests / AssertionEr…
srowen Dec 16, 2014
c762877
[SPARK-4792] Add error message when making local dir unsuccessfully
XuTingjun Dec 16, 2014
c246b95
[SPARK-4841] fix zip with textFile()
Dec 16, 2014
ed36200
[SPARK-4437] update doc for WholeCombineFileRecordReader
Dec 16, 2014
cb48447
[SPARK-4855][mllib] testing the Chi-squared hypothesis test
Dec 16, 2014
d12c071
[SPARK-3405] add subnet-id and vpc-id options to spark_ec2.py
mvj101 Dec 16, 2014
17688d1
[SQL] SPARK-4700: Add HTTP protocol spark thrift server
judynash Dec 16, 2014
1a9e35e
[DOCS][SQL] Add a Note on jsonFile having separate JSON objects per line
petervandenabeele Dec 16, 2014
dc8280d
[SPARK-4847][SQL]Fix "extraStrategies cannot take effect in SQLContex…
jerryshao Dec 16, 2014
6530243
[SPARK-4812][SQL] Fix the initialization issue of 'codegenEnabled'
zsxwing Dec 16, 2014
b0dfdbd
SPARK-4767: Add support for launching in a specified placement group …
holdenk Dec 16, 2014
ea1315e
[SPARK-4527][SQl]Add BroadcastNestedLoopJoin operator selection tests…
wangxiaojing Dec 16, 2014
30f6b85
[SPARK-4483][SQL]Optimization about reduce memory costs during the Ha…
tianyi Dec 16, 2014
a66c23e
[SPARK-4827][SQL] Fix resolution of deeply nested Project(attr, Proje…
marmbrus Dec 16, 2014
fa66ef6
[SPARK-4269][SQL] make wait time configurable in BroadcastHashJoin
Dec 16, 2014
6f80b74
[Release] Major improvements to generate contributors script
Dec 17, 2014
b85044e
[Release] Cache known author translations locally
Dec 17, 2014
3b395e1
[SPARK-4798][SQL] A new set of Parquet testing API and test suites
liancheng Dec 17, 2014
0aa834a
[SPARK-4744] [SQL] Short circuit evaluation for AND & OR in CodeGen
chenghao-intel Dec 17, 2014
ddc7ba3
[SPARK-4720][SQL] Remainder should also return null if the divider is 0.
ueshin Dec 17, 2014
770d815
[SPARK-4375] [SQL] Add 0 argument support for udf
chenghao-intel Dec 17, 2014
ec5c427
[SPARK-4866] support StructType as key in MapType
Dec 17, 2014
6069880
[SPARK-4618][SQL] Make foreign DDL commands options case-insensitive
scwf Dec 17, 2014
4e1112e
[Release] Update contributors list format and sort it
Dec 17, 2014
3d0c37b
[HOTFIX] Fix RAT exclusion for known_translations file
JoshRosen Dec 17, 2014
cf50631
[SPARK-4595][Core] Fix MetricsServlet not work issue
jerryshao Dec 17, 2014
5fdcbdc
[SPARK-4625] [SQL] Add sort by for DSL & SimpleSqlParser
chenghao-intel Dec 17, 2014
4782def
[SPARK-4694]Fix HiveThriftServer2 cann't stop In Yarn HA mode.
SaintBacchus Dec 17, 2014
7ad579e
[SPARK-3698][SQL] Fix case insensitive resolution of GetField.
marmbrus Dec 17, 2014
6277135
[SPARK-4493][SQL] Don't pushdown Eq, NotEq, Lt, LtEq, Gt and GtEq pre…
liancheng Dec 17, 2014
902e4d5
[SPARK-4755] [SQL] sqrt(negative value) should return null
adrian-wang Dec 17, 2014
636d9fc
[SPARK-3739] [SQL] Update the split num base on block size for table …
chenghao-intel Dec 17, 2014
affc3f4
[SPARK-4821] [mllib] [python] [docs] Fix for pyspark.mllib.rand doc
jkbradley Dec 17, 2014
19c0faa
[HOTFIX][SQL] Fix parquet filter suite
marmbrus Dec 17, 2014
8d0d2a6
[SPARK-4856] [SQL] NullType instead of StringType when sampling again…
chenghao-intel Dec 17, 2014
f33d550
[SPARK-3891][SQL] Add array support to percentile, percentile_approx …
gvramana Dec 17, 2014
ca12608
MAINTENANCE: Automated closing of pull requests.
pwendell Dec 17, 2014
3cd5161
[SPARK-4822] Use sphinx tags for Python doc annotations
Lewuathe Dec 18, 2014
3b76469
[SPARK-4461][YARN] pass extra java options to yarn application master
zhzhan Dec 18, 2014
253b72b
SPARK-3779. yarn spark.yarn.applicationMaster.waitTries config should…
sryza Dec 18, 2014
d9956f8
Add mesos specific configurations into doc
tnachen Dec 18, 2014
3720057
[SPARK-3607] ConnectionManager threads.max configs on the thread pool…
Dec 18, 2014
59a49db
[SPARK-4887][MLlib] Fix a bad unittest in LogisticRegressionSuite
Dec 18, 2014
a7ed6f3
[SPARK-4880] remove spark.locality.wait in Analytics
Earne Dec 18, 2014
d5a596d
[SPARK-4884]: Improve Partition docs
msiddalingaiah Dec 19, 2014
f9f58b9
SPARK-4743 - Use SparkEnv.serializer instead of closureSerializer in …
Dec 19, 2014
105293a
[SPARK-4837] NettyBlockTransferService should use spark.blockManager.…
aarondav Dec 19, 2014
9804a75
[SPARK-4754] Refactor SparkContext into ExecutorAllocationClient
Dec 19, 2014
f728e0f
[SPARK-2663] [SQL] Support the Grouping Set
chenghao-intel Dec 19, 2014
b68bc6d
[SPARK-3928][SQL] Support wildcard matches on Parquet files.
tkyaw Dec 19, 2014
22ddb6e
[SPARK-4756][SQL] FIX: sessionToActivePool grow infinitely, even as s…
guowei2 Dec 19, 2014
e7de7e5
[SPARK-4693] [SQL] PruningPredicates may be wrong if predicates conta…
YanTangZhai Dec 19, 2014
7687415
[SPARK-2554][SQL] Supporting SumDistinct partial aggregation
ravipesala Dec 19, 2014
ae9f128
[SPARK-4573] [SQL] Add SettableStructObjectInspector support in "wrap…
chenghao-intel Dec 19, 2014
c3d91da
[SPARK-4861][SQL] Refactory command in spark sql
scwf Dec 19, 2014
ee1fb97
[SPARK-4728][MLLib] Add exponential, gamma, and log normal sampling t…
rnowling Dec 19, 2014
d7fc69a
[SPARK-4674] Refactor getCallSite
viirya Dec 19, 2014
283263f
SPARK-3428. TaskMetrics for running tasks is missing GC time metrics
sryza Dec 19, 2014
5479450
[SPARK-4901] [SQL] Hot fix for ByteWritables.copyBytes
chenghao-intel Dec 19, 2014
8e253eb
[Build] Remove spark-staging-1038
scwf Dec 19, 2014
336cd34
Small refactoring to pass SparkEnv into Executor rather than creating…
rxin Dec 19, 2014
cdb2c64
[SPARK-4889] update history server example cmds
ryan-williams Dec 19, 2014
7981f96
[SPARK-4896] don’t redundantly overwrite executor JAR deps
ryan-williams Dec 19, 2014
c28083f
[SPARK-4890] Upgrade Boto to 2.34.0; automatically download Boto from…
JoshRosen Dec 20, 2014
4564519
[SPARK-2261] Make event logger use a single file.
Dec 20, 2014
c25c669
change signature of example to match released code
eranation Dec 20, 2014
8d93247
[SPARK-3060] spark-shell.cmd doesn't accept application options in Wi…
tsudukim Dec 20, 2014
1d64812
SPARK-2641: Passing num executors to spark arguments from properties …
Dec 20, 2014
7cb3f54
[SPARK-4831] Do not include SPARK_CLASSPATH if empty
darabos Dec 20, 2014
15c03e1
[SPARK-4140] Document dynamic allocation
Dec 20, 2014
a764960
[Minor] Build Failed: value defaultProperties not found
SaintBacchus Dec 20, 2014
c6a3c0d
SPARK-4910 [CORE] build failed (use of FileStatus.isFile in Hadoop 1.x)
srowen Dec 21, 2014
6ee6aa7
[SPARK-2075][Core] Make the compiler generate same bytes code for Had…
zsxwing Dec 22, 2014
93b2f3a
[SPARK-4918][Core] Reuse Text in saveAsTextFile
zsxwing Dec 22, 2014
96606f6
[SPARK-4915][YARN] Fix classname to be specified for external shuffle…
Dec 22, 2014
39272c8
[SPARK-4870] Add spark version to driver log
liyezhang556520 Dec 22, 2014
8773705
[SPARK-4883][Shuffle] Add a name to the directoryCleaner thread
zsxwing Dec 22, 2014
1d9788e
[Minor] Improve some code in BroadcastTest for short
SaintBacchus Dec 22, 2014
fb8e85e
[SPARK-4733] Add missing prameter comments in ShuffleDependency
maropu Dec 22, 2014
d62da64
SPARK-4447. Remove layers of abstraction in YARN code no longer neede…
sryza Dec 22, 2014
7c0ed13
[SPARK-4079] [CORE] Consolidates Errors if a CompressionCodec is not …
Dec 22, 2014
fbca6b6
[SPARK-4864] Add documentation to Netty-based configs
aarondav Dec 22, 2014
a61aa66
[Minor] Fix scala doc
viirya Dec 22, 2014
de9d7d2
[SPARK-4920][UI]:current spark version in UI is not striking.
uncleGen Dec 22, 2014
c233ab3
[SPARK-4818][Core] Add 'iterator' to reduce memory consumed by join
zsxwing Dec 22, 2014
a96b727
[SPARK-4907][MLlib] Inconsistent loss and gradient in LeastSquaresGra…
Dec 23, 2014
0e532cc
[Docs] Minor typo fixes
nchammas Dec 23, 2014
2823c7f
[SPARK-4890] Ignore downloaded EC2 libs
nchammas Dec 23, 2014
2d215ae
[SPARK-4931][Yarn][Docs] Fix the format of running-on-yarn.md
zsxwing Dec 23, 2014
dd15536
[SPARK-4834] [standalone] Clean up application files after app finishes.
Dec 23, 2014
9c251c5
[SPARK-4932] Add help comments in Analytics
maropu Dec 23, 2014
395b771
[SPARK-4914][Build] Cleans lib_managed before compiling with Hive 0.13.1
liancheng Dec 23, 2014
27c5399
[SPARK-4730][YARN] Warn against deprecated YARN settings
Dec 23, 2014
96281cd
[SPARK-4913] Fix incorrect event log path
viirya Dec 23, 2014
10d69e9
[SPARK-4802] [streaming] Remove receiverInfo once receiver is de-regi…
ilayaperumalg Dec 23, 2014
3f5f4cc
[SPARK-4671][Streaming]Do not replicate streaming block when WAL is e…
jerryshao Dec 23, 2014
7e2deb7
[SPARK-4606] Send EOF to child JVM when there's no more data to read.
Dec 24, 2014
fd41eb9
[SPARK-4860][pyspark][sql] speeding up `sample()` and `takeSample()`
Dec 24, 2014
3 changes: 2 additions & 1 deletion .gitignore
@@ -51,10 +51,11 @@ checkpoint
derby.log
dist/
dev/create-release/*txt
dev/create-release/*new
dev/create-release/*final
spark-*-bin-*.tgz
unit-tests.log
/lib/
ec2/lib/
rat-results.txt
scalastyle.txt
scalastyle-output.xml
1 change: 1 addition & 0 deletions .rat-excludes
@@ -64,3 +64,4 @@ dist/*
logs
.*scalastyle-output.xml
.*dependency-reduced-pom.xml
known_translations
3 changes: 2 additions & 1 deletion LICENSE
@@ -646,7 +646,8 @@ THE SOFTWARE.

========================================================================
For Scala Interpreter classes (all .scala files in repl/src/main/scala
except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala):
except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala),
and for SerializableMapWrapper in JavaUtils.scala:
========================================================================

Copyright (c) 2002-2013 EPFL
10 changes: 0 additions & 10 deletions assembly/pom.xml
@@ -169,16 +169,6 @@
</build>

<profiles>
<profile>
<id>yarn-alpha</id>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-yarn-alpha_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
</profile>
<profile>
<id>yarn</id>
<dependencies>
12 changes: 8 additions & 4 deletions bin/compute-classpath.sh
@@ -25,7 +25,11 @@ FWDIR="$(cd "`dirname "$0"`"/..; pwd)"

. "$FWDIR"/bin/load-spark-env.sh

CLASSPATH="$SPARK_CLASSPATH:$SPARK_SUBMIT_CLASSPATH"
if [ -n "$SPARK_CLASSPATH" ]; then
CLASSPATH="$SPARK_CLASSPATH:$SPARK_SUBMIT_CLASSPATH"
else
CLASSPATH="$SPARK_SUBMIT_CLASSPATH"
fi

# Build up classpath
if [ -n "$SPARK_CONF_DIR" ]; then
@@ -68,14 +72,14 @@ else
assembly_folder="$ASSEMBLY_DIR"
fi

num_jars="$(ls "$assembly_folder" | grep "spark-assembly.*hadoop.*\.jar" | wc -l)"
num_jars="$(ls "$assembly_folder" | grep "spark-assembly.*hadoop.*\.jar$" | wc -l)"
if [ "$num_jars" -eq "0" ]; then
echo "Failed to find Spark assembly in $assembly_folder"
echo "You need to build Spark before running this program."
exit 1
fi
if [ "$num_jars" -gt "1" ]; then
jars_list=$(ls "$assembly_folder" | grep "spark-assembly.*hadoop.*.jar")
jars_list=$(ls "$assembly_folder" | grep "spark-assembly.*hadoop.*.jar$")
echo "Found multiple Spark assembly jars in $assembly_folder:"
echo "$jars_list"
echo "Please remove all but one jar."
@@ -108,7 +112,7 @@ else
datanucleus_dir="$FWDIR"/lib_managed/jars
fi

datanucleus_jars="$(find "$datanucleus_dir" 2>/dev/null | grep "datanucleus-.*\\.jar")"
datanucleus_jars="$(find "$datanucleus_dir" 2>/dev/null | grep "datanucleus-.*\\.jar$")"
datanucleus_jars="$(echo "$datanucleus_jars" | tr "\n" : | sed s/:$//g)"

if [ -n "$datanucleus_jars" ]; then
7 changes: 7 additions & 0 deletions bin/spark-shell
@@ -45,6 +45,13 @@ source "$FWDIR"/bin/utils.sh
SUBMIT_USAGE_FUNCTION=usage
gatherSparkSubmitOpts "$@"

# SPARK-4161: scala does not assume use of the java classpath,
# so we need to add the "-Dscala.usejavacp=true" flag manually. We
# do this specifically for the Spark shell because the scala REPL
# has its own class loader, and any additional classpath specified
# through spark.driver.extraClassPath is not automatically propagated.
SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Dscala.usejavacp=true"

function main() {
if $cygwin; then
# Workaround for issue involving JLine and Cygwin
21 changes: 20 additions & 1 deletion bin/spark-shell2.cmd
@@ -19,4 +19,23 @@ rem

set SPARK_HOME=%~dp0..

cmd /V /E /C %SPARK_HOME%\bin\spark-submit.cmd --class org.apache.spark.repl.Main %* spark-shell
echo "%*" | findstr " --help -h" >nul
if %ERRORLEVEL% equ 0 (
call :usage
exit /b 0
)

call %SPARK_HOME%\bin\windows-utils.cmd %*
if %ERRORLEVEL% equ 1 (
call :usage
exit /b 1
)

cmd /V /E /C %SPARK_HOME%\bin\spark-submit.cmd --class org.apache.spark.repl.Main %SUBMISSION_OPTS% spark-shell %APPLICATION_OPTS%

exit /b 0

:usage
echo "Usage: .\bin\spark-shell.cmd [options]" >&2
%SPARK_HOME%\bin\spark-submit --help 2>&1 | findstr /V "Usage" 1>&2
exit /b 0
59 changes: 59 additions & 0 deletions bin/windows-utils.cmd
@@ -0,0 +1,59 @@
rem
rem Licensed to the Apache Software Foundation (ASF) under one or more
rem contributor license agreements. See the NOTICE file distributed with
rem this work for additional information regarding copyright ownership.
rem The ASF licenses this file to You under the Apache License, Version 2.0
rem (the "License"); you may not use this file except in compliance with
rem the License. You may obtain a copy of the License at
rem
rem http://www.apache.org/licenses/LICENSE-2.0
rem
rem Unless required by applicable law or agreed to in writing, software
rem distributed under the License is distributed on an "AS IS" BASIS,
rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
rem See the License for the specific language governing permissions and
rem limitations under the License.
rem

rem Gather all spark-submit options into SUBMISSION_OPTS

set SUBMISSION_OPTS=
set APPLICATION_OPTS=

rem NOTE: If you add or remove spark-submit options,
rem modify NOT ONLY this script but also SparkSubmitArguments.scala

:OptsLoop
if "x%1"=="x" (
goto :OptsLoopEnd
)

SET opts="\<--master\> \<--deploy-mode\> \<--class\> \<--name\> \<--jars\> \<--py-files\> \<--files\>"
SET opts="%opts:~1,-1% \<--conf\> \<--properties-file\> \<--driver-memory\> \<--driver-java-options\>"
SET opts="%opts:~1,-1% \<--driver-library-path\> \<--driver-class-path\> \<--executor-memory\>"
SET opts="%opts:~1,-1% \<--driver-cores\> \<--total-executor-cores\> \<--executor-cores\> \<--queue\>"
SET opts="%opts:~1,-1% \<--num-executors\> \<--archives\>"

echo %1 | findstr %opts% >nul
if %ERRORLEVEL% equ 0 (
if "x%2"=="x" (
echo "%1" requires an argument. >&2
exit /b 1
)
set SUBMISSION_OPTS=%SUBMISSION_OPTS% %1 %2
shift
shift
goto :OptsLoop
)
echo %1 | findstr "\<--verbose\> \<-v\> \<--supervise\>" >nul
if %ERRORLEVEL% equ 0 (
set SUBMISSION_OPTS=%SUBMISSION_OPTS% %1
shift
goto :OptsLoop
)
set APPLICATION_OPTS=%APPLICATION_OPTS% %1
shift
goto :OptsLoop

:OptsLoopEnd
exit /b 0
4 changes: 2 additions & 2 deletions conf/metrics.properties.template
@@ -77,8 +77,8 @@
# sample false Whether to show entire set of samples for histograms ('false' or 'true')
#
# * Default path is /metrics/json for all instances except the master. The master has two paths:
# /metrics/aplications/json # App information
# /metrics/master/json # Master information
# /metrics/applications/json # App information
# /metrics/master/json # Master information

# org.apache.spark.metrics.sink.GraphiteSink
# Name: Default: Description:
4 changes: 3 additions & 1 deletion core/src/main/java/org/apache/spark/SparkJobInfo.java
@@ -17,13 +17,15 @@

package org.apache.spark;

import java.io.Serializable;

/**
* Exposes information about Spark Jobs.
*
* This interface is not designed to be implemented outside of Spark. We may add additional methods
* which may break binary compatibility with outside implementations.
*/
public interface SparkJobInfo {
public interface SparkJobInfo extends Serializable {
int jobId();
int[] stageIds();
JobExecutionStatus status();
4 changes: 3 additions & 1 deletion core/src/main/java/org/apache/spark/SparkStageInfo.java
@@ -17,13 +17,15 @@

package org.apache.spark;

import java.io.Serializable;

/**
* Exposes information about Spark Stages.
*
* This interface is not designed to be implemented outside of Spark. We may add additional methods
* which may break binary compatibility with outside implementations.
*/
public interface SparkStageInfo {
public interface SparkStageInfo extends Serializable {
int stageId();
int currentAttemptId();
long submissionTime();
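For context, SparkJobInfo and SparkStageInfo snapshots are normally obtained through SparkStatusTracker; making them Serializable lets callers ship or persist what they get back. A minimal sketch, not part of this PR: the local master, app name, and the assumption that the first job has id 0 are illustrative only.

import org.apache.spark.{SparkConf, SparkContext}

object StatusTrackerSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("status-sketch").setMaster("local[2]"))
    sc.parallelize(1 to 1000, 4).map(_ * 2).count()

    // Job ids start at 0 in a fresh context (assumption for this sketch).
    // The returned SparkJobInfo / SparkStageInfo are now java.io.Serializable,
    // so they can be sent over the wire or written out as-is.
    for (job <- sc.statusTracker.getJobInfo(0);
         stageId <- job.stageIds();
         stage <- sc.statusTracker.getStageInfo(stageId)) {
      println(s"job ${job.jobId()} [${job.status()}] stage ${stage.stageId()} " +
        s"attempt ${stage.currentAttemptId()} submitted at ${stage.submissionTime()}")
    }
    sc.stop()
  }
}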
12 changes: 11 additions & 1 deletion core/src/main/resources/org/apache/spark/ui/static/webui.css
@@ -169,8 +169,18 @@ span.additional-metric-title {
display: inline-block;
}

.version {
line-height: 30px;
vertical-align: bottom;
font-size: 12px;
padding: 0;
margin: 0;
font-weight: bold;
color: #777;
}

/* Hide all additional metrics by default. This is done here rather than using JavaScript to
* avoid slow page loads for stage pages with large numbers (e.g., thousands) of tasks. */
.scheduler_delay, .gc_time, .deserialization_time, .serialization_time, .getting_result_time {
.scheduler_delay, .deserialization_time, .serialization_time, .getting_result_time {
display: none;
}
14 changes: 8 additions & 6 deletions core/src/main/scala/org/apache/spark/Accumulators.scala
@@ -19,6 +19,7 @@ package org.apache.spark

import java.io.{ObjectInputStream, Serializable}
import java.util.concurrent.atomic.AtomicLong
import java.lang.ThreadLocal

import scala.collection.generic.Growable
import scala.collection.mutable.Map
@@ -278,10 +279,12 @@ object AccumulatorParam {

// TODO: The multi-thread support in accumulators is kind of lame; check
// if there's a more intuitive way of doing it right
private object Accumulators {
private[spark] object Accumulators {
// TODO: Use soft references? => need to make readObject work properly then
val originals = Map[Long, Accumulable[_, _]]()
val localAccums = Map[Thread, Map[Long, Accumulable[_, _]]]()
val localAccums = new ThreadLocal[Map[Long, Accumulable[_, _]]]() {
override protected def initialValue() = Map[Long, Accumulable[_, _]]()
}
var lastId: Long = 0

def newId(): Long = synchronized {
@@ -293,22 +296,21 @@ private object Accumulators {
if (original) {
originals(a.id) = a
} else {
val accums = localAccums.getOrElseUpdate(Thread.currentThread, Map())
accums(a.id) = a
localAccums.get()(a.id) = a
}
}

// Clear the local (non-original) accumulators for the current thread
def clear() {
synchronized {
localAccums.remove(Thread.currentThread)
localAccums.get.clear
}
}

// Get the values of the local accumulators for the current thread (by ID)
def values: Map[Long, Any] = synchronized {
val ret = Map[Long, Any]()
for ((id, accum) <- localAccums.getOrElse(Thread.currentThread, Map())) {
for ((id, accum) <- localAccums.get) {
ret(id) = accum.localValue
}
return ret
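The change above swaps a Thread-keyed map for a ThreadLocal whose initialValue supplies each thread's own map, so there is no explicit keying by Thread.currentThread and no cross-thread cleanup. A self-contained sketch of that pattern with hypothetical names (PerThreadRegistry is not part of Spark):

import scala.collection.mutable

object PerThreadRegistry {
  // Each thread lazily gets its own mutable map via initialValue.
  private val local = new ThreadLocal[mutable.Map[Long, String]]() {
    override protected def initialValue(): mutable.Map[Long, String] =
      mutable.Map.empty[Long, String]
  }

  def register(id: Long, value: String): Unit = {
    local.get()(id) = value          // same update idiom as localAccums.get()(a.id) = a
  }

  def snapshot(): Map[Long, String] = local.get().toMap

  def clear(): Unit = local.get().clear()
}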
10 changes: 6 additions & 4 deletions core/src/main/scala/org/apache/spark/Aggregator.scala
@@ -34,15 +34,17 @@ case class Aggregator[K, V, C] (
mergeValue: (C, V) => C,
mergeCombiners: (C, C) => C) {

private val externalSorting = SparkEnv.get.conf.getBoolean("spark.shuffle.spill", true)
// When spilling is enabled sorting will happen externally, but not necessarily with an
// ExternalSorter.
private val isSpillEnabled = SparkEnv.get.conf.getBoolean("spark.shuffle.spill", true)

@deprecated("use combineValuesByKey with TaskContext argument", "0.9.0")
def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]]): Iterator[(K, C)] =
combineValuesByKey(iter, null)

def combineValuesByKey(iter: Iterator[_ <: Product2[K, V]],
context: TaskContext): Iterator[(K, C)] = {
if (!externalSorting) {
if (!isSpillEnabled) {
val combiners = new AppendOnlyMap[K,C]
var kv: Product2[K, V] = null
val update = (hadValue: Boolean, oldValue: C) => {
@@ -71,9 +73,9 @@ case class Aggregator[K, V, C] (
combineCombinersByKey(iter, null)

def combineCombinersByKey(iter: Iterator[_ <: Product2[K, C]], context: TaskContext)
: Iterator[(K, C)] =
: Iterator[(K, C)] =
{
if (!externalSorting) {
if (!isSpillEnabled) {
val combiners = new AppendOnlyMap[K,C]
var kc: Product2[K, C] = null
val update = (hadValue: Boolean, oldValue: C) => {
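The isSpillEnabled flag is driven by the spark.shuffle.spill setting. A spark-shell style sketch of turning spilling off so combiners stay in the in-memory AppendOnlyMap (an assumption-laden illustration; the default, and usually the right choice, is true):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("no-spill-sketch")
  .setMaster("local[2]")
  .set("spark.shuffle.spill", "false")   // Aggregator keeps combiners in memory

val sc = new SparkContext(conf)
val counts = sc.parallelize(Seq("a", "b", "a"), 2)
  .map(word => (word, 1))
  .reduceByKey(_ + _)                    // aggregation goes through Aggregator
  .collect()
sc.stop()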
3 changes: 3 additions & 0 deletions core/src/main/scala/org/apache/spark/Dependency.scala
@@ -60,6 +60,9 @@ abstract class NarrowDependency[T](_rdd: RDD[T]) extends Dependency[T] {
* @param serializer [[org.apache.spark.serializer.Serializer Serializer]] to use. If set to None,
* the default serializer, as specified by `spark.serializer` config option, will
* be used.
* @param keyOrdering key ordering for RDD's shuffles
* @param aggregator map/reduce-side aggregator for RDD's shuffle
* @param mapSideCombine whether to perform partial aggregation (also known as map-side combine)
*/
@DeveloperApi
class ShuffleDependency[K, V, C](
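These parameters are usually supplied indirectly by the pair-RDD operations rather than by constructing a ShuffleDependency by hand; for instance combineByKey builds the Aggregator and exposes mapSideCombine directly. A sketch, assuming an existing SparkContext named sc:

import org.apache.spark.HashPartitioner

val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
val sums = pairs.combineByKey(
  (v: Int) => v,                  // createCombiner
  (c: Int, v: Int) => c + v,      // mergeValue
  (c1: Int, c2: Int) => c1 + c2,  // mergeCombiners
  new HashPartitioner(4),
  mapSideCombine = true)          // enables partial (map-side) aggregation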
42 changes: 42 additions & 0 deletions core/src/main/scala/org/apache/spark/ExecutorAllocationClient.scala
@@ -0,0 +1,42 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark

/**
* A client that communicates with the cluster manager to request or kill executors.
*/
private[spark] trait ExecutorAllocationClient {

/**
* Request an additional number of executors from the cluster manager.
* Return whether the request is acknowledged by the cluster manager.
*/
def requestExecutors(numAdditionalExecutors: Int): Boolean

/**
* Request that the cluster manager kill the specified executors.
* Return whether the request is acknowledged by the cluster manager.
*/
def killExecutors(executorIds: Seq[String]): Boolean

/**
* Request that the cluster manager kill the specified executor.
* Return whether the request is acknowledged by the cluster manager.
*/
def killExecutor(executorId: String): Boolean = killExecutors(Seq(executorId))
}
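SparkContext is the main implementation of this trait after the SPARK-4754 refactor, so the same calls are available on an application's context; per SPARK-4215 they are only honored when running on YARN. A sketch, assuming an existing SparkContext named sc (the executor ids are placeholders):

// Ask the cluster manager for two more executors.
val granted: Boolean = sc.requestExecutors(2)

// Later, hand two specific executors back to the cluster manager.
if (granted) {
  sc.killExecutors(Seq("1", "2"))
}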