Merged
4053 commits
6c70a38
[SPARK-19088][SQL] Optimize sequence type deserialization codegen
michalsenkyr Mar 28, 2017
a9abff2
[SPARK-20119][TEST-MAVEN] Fix the test case fail in DataSourceScanExe…
gatorsmile Mar 28, 2017
91559d2
[SPARK-20094][SQL] Preventing push down of IN subquery to Join operator
Mar 28, 2017
4fcc214
[SPARK-20124][SQL] Join reorder should keep the same order of final p…
Mar 28, 2017
f82461f
[SPARK-20126][SQL] Remove HiveSessionState
hvanhovell Mar 28, 2017
17eddb3
[SPARK-19995][YARN] Register tokens to current UGI to avoid re-issuin…
jerryshao Mar 28, 2017
d4fac41
[SPARK-20125][SQL] Dataset of type option of map does not work
cloud-fan Mar 28, 2017
92e385e
[SPARK-19868] conflict TasksetManager lead to spark stopped
Mar 28, 2017
7d432af
[SPARK-20043][ML] DecisionTreeModel: ImpurityCalculator builder fails…
facaiy Mar 28, 2017
a5c8770
[SPARK-20040][ML][PYTHON] pyspark wrapper for ChiSquareTest
MrBago Mar 29, 2017
9712bd3
[SPARK-20134][SQL] SQLMetrics.postDriverMetricUpdates to simplify dri…
rxin Mar 29, 2017
b56ad2b
[SPARK-19556][CORE] Do not encrypt block manager data in memory.
Mar 29, 2017
c622a87
[SPARK-20059][YARN] Use the correct classloader for HBaseCredentialPr…
jerryshao Mar 29, 2017
d6ddfdf
[SPARK-19955][PYSPARK] Jenkins Python Conda based test.
holdenk Mar 29, 2017
142f6d1
[SPARK-20048][SQL] Cloning SessionState does not clone query executio…
kunalkhamar Mar 29, 2017
c400848
[SPARK-20009][SQL] Support DDL strings for defining schema in functio…
maropu Mar 29, 2017
5c8ef37
[SPARK-17075][SQL][FOLLOWUP] Add Estimation of Constant Literal
gatorsmile Mar 29, 2017
fe1d6b0
[SPARK-20120][SQL] spark-sql support silent mode
wangyum Mar 29, 2017
dd2e7d5
[SPARK-19088][SQL] Fix 2.10 build.
ueshin Mar 30, 2017
22f07fe
[SPARK-20146][SQL] fix comment missing issue for thrift server
Mar 30, 2017
6097788
[SPARK-20136][SQL] Add num files and metadata operation timing to sca…
rxin Mar 30, 2017
7963605
[SPARK-20148][SQL] Extend the file commit API to allow subscribing to…
ericl Mar 30, 2017
471de5d
[MINOR][SPARKR] Add run command comment in examples
wangmiao1981 Mar 30, 2017
edc87d7
[SPARK-20107][DOC] Add spark.hadoop.mapreduce.fileoutputcommitter.alg…
wangyum Mar 30, 2017
b454d44
[SPARK-15354][CORE] Topology aware block replication strategies
shubhamchopra Mar 30, 2017
0197262
[DOCS] Docs-only improvements
jaceklaskowski Mar 30, 2017
258bff2
[SPARK-19999] Workaround JDK-8165231 to identify PPC64 architectures …
samelamin Mar 30, 2017
e9d268f
[SPARK-20096][SPARK SUBMIT][MINOR] Expose the right queue name not nu…
yaooqinn Mar 30, 2017
669a11b
[DOCS][MINOR] Fixed a few typos in the Structured Streaming documenta…
Mar 30, 2017
5e00a5d
[SPARK-20127][CORE] few warning have been fixed which Intellij IDEA r…
Mar 30, 2017
c734fc5
[SPARK-20121][SQL] simplify NullPropagation with NullIntolerant
cloud-fan Mar 30, 2017
a8a765b
[SPARK-20151][SQL] Account for partition pruning in scan metadataTime…
rxin Mar 31, 2017
254877c
[SPARK-20164][SQL] AnalysisException not tolerant of null query plan.
kunalkhamar Mar 31, 2017
c4c03ee
[SPARK-20084][CORE] Remove internal.metrics.updatedBlockStatuses from…
rdblue Mar 31, 2017
b2349e6
[SPARK-20160][SQL] Move ParquetConversions and OrcConversions Out Of …
gatorsmile Mar 31, 2017
567a50a
[SPARK-20165][SS] Resolve state encoder's deserializer in driver in F…
tdas Mar 31, 2017
cf5963c
[SPARK-20177] Document about compression way has some little detail ch…
Apr 1, 2017
89d6822
[SPARK-19148][SQL][FOLLOW-UP] do not expose the external table concep…
gatorsmile Apr 1, 2017
2287f3d
[SPARK-20186][SQL] BroadcastHint should use child's stats
Apr 1, 2017
d40cbb8
[SPARK-20143][SQL] DataType.fromJson should throw an exception with b…
HyukjinKwon Apr 2, 2017
76de2d1
[SPARK-20123][BUILD] SPARK_HOME variable might have spaces in it(e.g.…
Apr 2, 2017
657cb95
[SPARK-20173][SQL][HIVE-THRIFTSERVER] Throw NullPointerException when…
Apr 2, 2017
93dbfe7
[SPARK-20159][SPARKR][SQL] Support all catalog API in R
felixcheung Apr 2, 2017
2a903a1
[SPARK-19985][ML] Fixed copy method for some ML Models
BryanCutler Apr 3, 2017
cff11fd
[SPARK-20166][SQL] Use XXX for ISO 8601 timezone instead of ZZ (FastD…
HyukjinKwon Apr 3, 2017
364b0db
[MINOR][DOCS] Replace non-breaking space to normal spaces that breaks…
HyukjinKwon Apr 3, 2017
fb5869f
[SPARK-9002][CORE] KryoSerializer initialization does not include 'Ar…
Apr 3, 2017
4d28e84
[SPARK-19969][ML] Imputer doc and example
YY-OnCall Apr 3, 2017
4fa1a43
[SPARK-19641][SQL] JSON schema inference in DROPMALFORMED mode produc…
HyukjinKwon Apr 3, 2017
703c42c
[SPARK-20194] Add support for partition pruning to in-memory catalog
adrian-ionescu Apr 3, 2017
58c9e6e
[SPARK-20145] Fix range case insensitive bug in SQL
samelamin Apr 4, 2017
e7877fd
[SPARK-19408][SQL] filter estimation on two columns of same table
ron8hu Apr 4, 2017
3bfb639
[SPARK-10364][SQL] Support Parquet logical type TIMESTAMP_MILLIS
dilipbiswal Apr 4, 2017
51d3c85
[SPARK-20067][SQL] Unify and Clean Up Desc Commands Using Catalog Int…
gatorsmile Apr 4, 2017
b34f766
[SPARK-19825][R][ML] spark.ml R API for FPGrowth
zero323 Apr 4, 2017
c95fbea
[SPARK-20190][APP-ID] applications//jobs' in rest api,status should b…
Apr 4, 2017
26e7bca
[SPARK-20198][SQL] Remove the inconsistency in table/function name co…
gatorsmile Apr 4, 2017
11238d4
[SPARK-18278][SCHEDULER] Documentation to point to Kubernetes cluster…
foxish Apr 4, 2017
0736980
[SPARK-20191][YARN] Crate wrapper for RackResolver so tests can overr…
Apr 4, 2017
0e2ee82
[MINOR][R] Reorder `Collate` fields in DESCRIPTION file
HyukjinKwon Apr 4, 2017
402bf2a
[SPARK-20204][SQL] remove SimpleCatalystConf and CatalystConf type alias
cloud-fan Apr 4, 2017
295747e
[SPARK-19716][SQL] support by-name resolution for struct type element…
cloud-fan Apr 4, 2017
a59759e
[SPARK-20183][ML] Added outlierRatio arg to MLTestingUtils.testOutlie…
Apr 5, 2017
b28bbff
[SPARK-20003][ML] FPGrowthModel setMinConfidence should affect rules …
YY-OnCall Apr 5, 2017
c1b8b66
[SPARKR][DOC] update doc for fpgrowth
felixcheung Apr 5, 2017
b6e7103
Small doc fix for ReuseSubquery.
rxin Apr 5, 2017
dad499f
[SPARK-20209][SS] Execute next trigger immediately if previous batch …
tdas Apr 5, 2017
6f09dc7
[SPARK-20042][WEB UI] Fix log page buttons for reverse proxy mode
okoethibm Apr 5, 2017
71c3c48
[SPARK-19807][WEB UI] Add reason for cancellation when a stage is kil…
Apr 5, 2017
a2d8d76
[SPARK-20223][SQL] Fix typo in tpcds q77.sql
Apr 5, 2017
e277399
[SPARK-19454][PYTHON][SQL] DataFrame.replace improvements
zero323 Apr 5, 2017
9543fc0
[SPARK-20224][SS] Updated docs for streaming dropDuplicates and mapGr…
tdas Apr 5, 2017
9d68c67
[SPARK-20204][SQL][FOLLOWUP] SQLConf should react to change in defaul…
dilipbiswal Apr 6, 2017
1220605
[SPARK-20214][ML] Make sure converted csc matrix has sorted indices
viirya Apr 6, 2017
4000f12
[SPARK-20231][SQL] Refactor star schema code for the subsequent star …
ioana-delaney Apr 6, 2017
5142e5d
[SPARK-20217][CORE] Executor should not fail stage if killed task thr…
ericl Apr 6, 2017
e156b5d
[SPARK-19953][ML] Random Forest Models use parent UID when being fit
BryanCutler Apr 6, 2017
c8fc1f3
[SPARK-20085][MESOS] Configurable mesos labels for executors
Apr 6, 2017
d009fb3
[SPARK-20064][PYSPARK] Bump the PySpark verison number to 2.2
rubenljanssen Apr 6, 2017
bccc330
[SPARK-20196][PYTHON][SQL] update doc for catalog functions for all l…
felixcheung Apr 6, 2017
5a693b4
[SPARK-20195][SPARKR][SQL] add createTable catalog API and deprecate …
felixcheung Apr 6, 2017
a449162
[SPARK-17019][CORE] Expose on-heap and off-heap memory usage in vario…
jerryshao Apr 6, 2017
8129d59
[MINOR][DOCS] Fix typo in Hive Examples
Apr 6, 2017
626b4ca
[SPARK-19495][SQL] Make SQLConf slightly more extensible - addendum
rxin Apr 7, 2017
ad3cc13
[SPARK-20245][SQL][MINOR] pass output to LogicalRelation directly
cloud-fan Apr 7, 2017
1a52a62
[SPARK-20076][ML][PYSPARK] Add Python interface for ml.stats.Correlation
viirya Apr 7, 2017
9e0893b
[SPARK-20218][DOC][APP-ID] applications//stages' in REST API,add desc…
Apr 7, 2017
870b9d9
[SPARK-20026][DOC][SPARKR] Add Tweedie example for SparkR in programm…
actuaryzhang Apr 7, 2017
8feb799
[SPARK-20197][SPARKR] CRAN check fail with package installation
felixcheung Apr 7, 2017
1ad73f0
[SPARK-20258][DOC][SPARKR] Fix SparkR logistic regression example in …
actuaryzhang Apr 7, 2017
589f3ed
[SPARK-20255] Move listLeafFiles() to InMemoryFileIndex
adrian-ionescu Apr 7, 2017
7577e9c
[SPARK-20246][SQL] should not push predicate down through aggregate w…
cloud-fan Apr 8, 2017
e1afc4d
[SPARK-20262][SQL] AssertNotNull should throw NullPointerException
rxin Apr 8, 2017
34fc48f
[MINOR] Issue: Change "slice" vs "partition" in exception messages (a…
asmith26 Apr 9, 2017
1f0de3c
[SPARK-19991][CORE][YARN] FileSegmentManagedBuffer performance improv…
srowen Apr 9, 2017
261eaf5
[SPARK-20260][MLLIB] String interpolation required for error message
Apr 9, 2017
7a63f5e
[SPARK-20253][SQL] Remove unnecessary nullchecks of a return value fr…
kiszk Apr 10, 2017
7bfa05e
[SPARK-20264][SQL] asm should be non-test dependency in sql/core
rxin Apr 10, 2017
1a0bc41
[SPARK-20270][SQL] na.fill should not change the values in long or in…
Apr 10, 2017
3d7f201
[SPARK-20229][SQL] add semanticHash to QueryPlan
cloud-fan Apr 10, 2017
4f7d49b
[SPARK-20243][TESTS] DebugFilesystem.assertNoOpenStreams thread race
bogdanrdc Apr 10, 2017
5acaf8c
[SPARK-19518][SQL] IGNORE NULLS in first / last in SQL
HyukjinKwon Apr 10, 2017
fd711ea
[SPARK-20273][SQL] Disallow Non-deterministic Filter push-down into J…
gatorsmile Apr 10, 2017
a26e3ed
[SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String toLowerCase "T…
srowen Apr 10, 2017
f6dd8e0
[SPARK-20280][CORE] FileStatusCache Weigher integer overflow
bogdanrdc Apr 10, 2017
f9a50ba
[SPARK-20285][TESTS] Increase the pyspark streaming test timeout to 3…
zsxwing Apr 10, 2017
a35b9d9
[SPARK-20282][SS][TESTS] Write the commit log first to fix a race con…
zsxwing Apr 10, 2017
379b0b0
[SPARK-20283][SQL] Add preOptimizationBatches
rxin Apr 10, 2017
734dfbf
[SPARK-17564][TESTS] Fix flaky RequestTimeoutIntegrationSuite.further…
zsxwing Apr 11, 2017
0d2b796
[SPARK-20097][ML] Fix visibility discrepancy with numInstances and de…
BenFradet Apr 11, 2017
d11ef3d
Document Master URL format in high availability set up
MirrorZ Apr 11, 2017
c870698
[SPARK-20274][SQL] support compatible array element type in encoder
cloud-fan Apr 11, 2017
cd91f96
[SPARK-20175][SQL] Exists should not be evaluated in Join operator
viirya Apr 11, 2017
123b4fb
[SPARK-20289][SQL] Use StaticInvoke to box primitive types
rxin Apr 11, 2017
6297697
[SPARK-19505][PYTHON] AttributeError on Exception.message in Python3
Apr 11, 2017
cde9e32
[MINOR][DOCS] Update supported versions for Hive Metastore
dongjoon-hyun Apr 12, 2017
8ad63ee
[SPARK-20291][SQL] NaNvl(FloatType, NullType) should not be cast to N…
Apr 12, 2017
b14bfc3
[SPARK-19993][SQL] Caching logical plans containing subquery expressi…
dilipbiswal Apr 12, 2017
b938438
[MINOR][DOCS] Fix spacings in Structured Streaming Programming Guide
dongjinleekr Apr 12, 2017
bca4259
[MINOR][DOCS] JSON APIs related documentation fixes
HyukjinKwon Apr 12, 2017
044f7ec
[SPARK-20298][SPARKR][MINOR] fixed spelling mistake "charactor"
bdwyer2 Apr 12, 2017
ffc57b0
[SPARK-20302][SQL] Short circuit cast when from and to types are stru…
rxin Apr 12, 2017
2e1fd46
[SPARK-20296][TRIVIAL][DOCS] Count distinct error message for streaming
jtoka Apr 12, 2017
ceaf77a
[SPARK-18692][BUILD][DOCS] Test Java 8 unidoc build on Jenkins
HyukjinKwon Apr 12, 2017
504e62e
[SPARK-20303][SQL] Rename createTempFunction to registerFunction
gatorsmile Apr 12, 2017
5408553
[SPARK-20304][SQL] AssertNotNull should not include path in string re…
rxin Apr 12, 2017
99a9473
[SPARK-19570][PYSPARK] Allow to disable hive in pyspark shell
zjffdu Apr 12, 2017
924c424
[SPARK-20301][FLAKY-TEST] Fix Hadoop Shell.runCommand flakiness in St…
brkyvz Apr 12, 2017
a7b430b
[SPARK-15354][FLAKY-TEST] TopologyAwareBlockReplicationPolicyBehavior…
cloud-fan Apr 13, 2017
c5f1cc3
[SPARK-20131][CORE] Don't use `this` lock in StandaloneSchedulerBacke…
zsxwing Apr 13, 2017
ec68d8f
[SPARK-20189][DSTREAM] Fix spark kinesis testcases to remove deprecat…
yashs360 Apr 13, 2017
095d1cb
[SPARK-20265][MLLIB] Improve Prefix'span pre-processing efficiency
Syrux Apr 13, 2017
a4293c2
[SPARK-20284][CORE] Make {Des,S}erializationStream extend Closeable
Apr 13, 2017
fbe4216
[SPARK-20233][SQL] Apply star-join filter heuristics to dynamic progr…
ioana-delaney Apr 13, 2017
8ddf0d2
[SPARK-20232][PYTHON] Improve combineByKey docs
Apr 13, 2017
7536e28
[SPARK-20038][SQL] FileFormatWriter.ExecuteWriteTask.releaseResources…
steveloughran Apr 13, 2017
fb036c4
[SPARK-20318][SQL] Use Catalyst type for min/max in ColumnStat for ea…
Apr 14, 2017
98b41ec
[SPARK-20316][SQL] Val and Var should strictly follow the Scala syntax
Apr 15, 2017
35e5ae4
[SPARK-19716][SQL][FOLLOW-UP] UnresolvedMapObjects should always be s…
cloud-fan Apr 16, 2017
e090f3c
[SPARK-20335][SQL] Children expressions of Hive UDF impacts the deter…
gatorsmile Apr 16, 2017
a888fed
[SPARK-19740][MESOS] Add support in Spark to pass arbitrary parameter…
Apr 16, 2017
ad935f5
[SPARK-20343][BUILD] Add avro dependency in core POM to resolve build…
HyukjinKwon Apr 16, 2017
86d251c
[SPARK-20278][R] Disable 'multiple_dots_linter' lint rule that is aga…
HyukjinKwon Apr 16, 2017
24f09b3
[SPARK-19828][R][FOLLOWUP] Rename asJsonArray to as.json.array in fro…
HyukjinKwon Apr 17, 2017
01ff035
[SPARK-20349][SQL] ListFunctions returns duplicate functions after us…
gatorsmile Apr 17, 2017
e5fee3e
[SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patterns.
jodersky Apr 17, 2017
0075562
Typo fix: distitrbuted -> distributed
ash211 Apr 18, 2017
33ea908
[TEST][MINOR] Replace repartitionBy with distribute in CollapseRepart…
jaceklaskowski Apr 18, 2017
b0a1e93
[SPARK-17647][SQL][FOLLOWUP][MINOR] fix typo
felixcheung Apr 18, 2017
07fd94e
[SPARK-20344][SCHEDULER] Duplicate call in FairSchedulableBuilder.add…
snazy Apr 18, 2017
d4f10cb
[SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to resolve build f…
HyukjinKwon Apr 18, 2017
321b4f0
[SPARK-20366][SQL] Fix recursive join reordering: inside joins are no…
Apr 18, 2017
1f81dda
[SPARK-20354][CORE][REST-API] When I request access to the 'http: //i…
Apr 18, 2017
f654b39
[SPARK-20360][PYTHON] reprs for interpreters
rgbkrk Apr 18, 2017
74aa0df
[SPARK-20377][SS] Fix JavaStructuredSessionization example
tdas Apr 18, 2017
e468a96
[SPARK-20254][SQL] Remove unnecessary data conversion for Dataset wit…
kiszk Apr 19, 2017
702d85a
[SPARK-20208][R][DOCS] Document R fpGrowth support
zero323 Apr 19, 2017
608bf30
[SPARK-20359][SQL] Avoid unnecessary execution in EliminateOuterJoin …
koertkuipers Apr 19, 2017
773754b
[SPARK-20356][SQL] Pruned InMemoryTableScanExec should have correct o…
viirya Apr 19, 2017
3537876
[SPARK-20343][BUILD] Avoid Unidoc build only if Hadoop 2.6 is explici…
HyukjinKwon Apr 19, 2017
71a8e9d
[SPARK-20036][DOC] Note incompatible dependencies on org.apache.kafka…
koeninger Apr 19, 2017
4fea784
[SPARK-20397][SPARKR][SS] Fix flaky test: test_streaming.R.Terminated…
zsxwing Apr 19, 2017
63824b2
[SPARK-20350] Add optimization rules to apply Complementation Laws.
ptkool Apr 20, 2017
39e303a
[MINOR][SS] Fix a missing space in UnsupportedOperationChecker error …
zsxwing Apr 20, 2017
dd6d55d
[SPARK-20398][SQL] range() operator should include cancellation reaso…
ericl Apr 20, 2017
bdc6056
Fixed typos in docs
Apr 20, 2017
46c5749
[SPARK-20375][R] R wrappers for array and map
zero323 Apr 20, 2017
55bea56
[SPARK-20156][SQL][FOLLOW-UP] Java String toLowerCase "Turkish locale…
gatorsmile Apr 20, 2017
c6f62c5
[SPARK-20405][SQL] Dataset.withNewExecutionId should be private
rxin Apr 20, 2017
b91873d
[SPARK-20409][SQL] fail early if aggregate function in GROUP BY
cloud-fan Apr 20, 2017
c5a31d1
[SPARK-20407][TESTS] ParquetQuerySuite 'Enabling/disabling ignoreCorr…
bogdanrdc Apr 20, 2017
b2ebadf
[SPARK-20358][CORE] Executors failing stage on interrupted exception …
ericl Apr 20, 2017
d95e4d9
[SPARK-20334][SQL] Return a better error message when correlated pred…
dilipbiswal Apr 20, 2017
0332063
[SPARK-20410][SQL] Make sparkConf a def in SharedSQLContext
hvanhovell Apr 20, 2017
592f5c8
[SPARK-20172][CORE] Add file permission check when listing files in F…
jerryshao Apr 20, 2017
0368eb9
[SPARK-20367] Properly unescape column names of partitioning columns …
juliuszsompolski Apr 21, 2017
760c8d0
[SPARK-20329][SQL] Make timezone aware expression without timezone un…
hvanhovell Apr 21, 2017
48d760d
[SPARK-20281][SQL] Print the identical Range parameters of SparkConte…
maropu Apr 21, 2017
e2b3d23
[SPARK-20420][SQL] Add events to the external catalog
hvanhovell Apr 21, 2017
3476799
Small rewording about history server use case
dud225 Apr 21, 2017
c9e6035
[SPARK-20412] Throw ParseException from visitNonOptionalPartitionSpec…
juliuszsompolski Apr 21, 2017
a750a59
[SPARK-20341][SQL] Support BigInt's value that does not fit in long v…
kiszk Apr 21, 2017
eb00378
[SPARK-20423][ML] fix MLOR coeffs centering when reg == 0
WeichenXu123 Apr 21, 2017
fd648bf
[SPARK-20371][R] Add wrappers for collect_list and collect_set
zero323 Apr 21, 2017
ad29040
[SPARK-20401][DOC] In the spark official configuration document, the …
Apr 21, 2017
05a4514
[SPARK-20386][SPARK CORE] modify the log info if the block exists on …
eatoncys Apr 22, 2017
b3c572a
[SPARK-20430][SQL] Initialise RangeExec parameters in a driver side
maropu Apr 22, 2017
8765bc1
[SPARK-20132][DOCS] Add documentation for column string functions
map222 Apr 23, 2017
2eaf4f3
[SPARK-20385][WEB-UI] Submitted Time' field, the date format needs to…
Apr 23, 2017
e9f9715
[BUILD] Close stale PRs
maropu Apr 24, 2017
776a2c0
[SPARK-20439][SQL] Fix Catalog API listTables and getTable when faile…
gatorsmile Apr 24, 2017
90264ac
[SPARK-18901][ML] Require in LR LogisticAggregator is redundant
wangmiao1981 Apr 24, 2017
8a272dd
[SPARK-20438][R] SparkR wrappers for split and repeat
zero323 Apr 24, 2017
5280d93
[SPARK-20239][CORE] Improve HistoryServer's ACL mechanism
jerryshao Apr 25, 2017
f44c8a8
[SPARK-20453] Bump master branch version to 2.3.0-SNAPSHOT
JoshRosen Apr 25, 2017
31345fd
[SPARK-20451] Filter out nested mapType datatypes from sort order in …
sameeragarwal Apr 25, 2017
c8f1219
[SPARK-20455][DOCS] Fix Broken Docker IT Docs
original-brownbear Apr 25, 2017
0bc7a90
[SPARK-20404][CORE] Using Option(name) instead of Some(name)
szhem Apr 25, 2017
387565c
[SPARK-18901][FOLLOWUP][ML] Require in LR LogisticAggregator is redun…
wangmiao1981 Apr 25, 2017
67eef47
[SPARK-20449][ML] Upgrade breeze version to 0.13.1
yanboliang Apr 25, 2017
0a7f5f2
[SPARK-5484][GRAPHX] Periodically do checkpoint in Pregel
Apr 25, 2017
caf3920
[SPARK-18127] Add hooks and extension points to Spark
sameeragarwal Apr 26, 2017
57e1da3
[SPARK-16548][SQL] Inconsistent error handling in JSON parsing SQL fu…
Apr 26, 2017
df58a95
[SPARK-20437][R] R wrappers for rollup and cube
zero323 Apr 26, 2017
7a36525
[SPARK-20400][DOCS] Remove References to 3rd Party Vendor Tools
Apr 26, 2017
7fecf51
[SPARK-19812] YARN shuffle service fails to relocate recovery DB acro…
tgravescs Apr 26, 2017
dbb06c6
[MINOR][ML] Fix some PySpark & SparkR flaky tests
yanboliang Apr 26, 2017
66dd5b8
[SPARK-20391][CORE] Rename memory related fields in ExecutorSummay
jerryshao Apr 26, 2017
99c6cf9
[SPARK-20473] Enabling missing types in ColumnVector.Array
michal-databricks Apr 26, 2017
a277ae8
[SPARK-20474] Fixing OnHeapColumnVector reallocation
michal-databricks Apr 26, 2017
2ba1eba
[SPARK-12868][SQL] Allow adding jars from hdfs
weiqingy Apr 26, 2017
66636ef
[SPARK-20435][CORE] More thorough redaction of sensitive information
markgrover Apr 27, 2017
b4724db
[SPARK-20425][SQL] Support a vertical display mode for Dataset.show
maropu Apr 27, 2017
b58cf77
[DOCS][MINOR] Add missing since to SparkR repeat_string note.
zero323 Apr 27, 2017
ba76662
[SPARK-20208][DOCS][FOLLOW-UP] Add FP-Growth to SparkR programming guide
zero323 Apr 27, 2017
7633933
[SPARK-20483] Mesos Coarse mode may starve other Mesos frameworks
dgshep Apr 27, 2017
561e9cc
[SPARK-20421][CORE] Mark internal listeners as deprecated.
Apr 27, 2017
85c6ce6
[SPARK-20426] Lazy initialization of FileSegmentManagedBuffer for shu…
Apr 27, 2017
26ac2ce
[SPARK-20482][SQL] Resolving Casts is too strict on having time zone set
rednaxelafx Apr 27, 2017
a4aa466
[SPARK-20487][SQL] `HiveTableScan` node is quite verbose in explained…
tejasapatil Apr 27, 2017
039e32c
[SPARK-20483][MINOR] Test for Mesos Coarse mode may starve other Meso…
dgshep Apr 27, 2017
606432a
[SPARK-20047][ML] Constrained Logistic Regression
yanboliang Apr 27, 2017
01c999e
[SPARK-20461][CORE][SS] Use UninterruptibleThread for Executor and fi…
zsxwing Apr 27, 2017
823baca
[SPARK-20452][SS][KAFKA] Fix a potential ConcurrentModificationExcept…
zsxwing Apr 27, 2017
b90bf52
[SPARK-12837][CORE] Do not send the name of internal accumulator to e…
cloud-fan Apr 28, 2017
7fe8249
[SPARKR][DOC] Document LinearSVC in R programming guide
wangmiao1981 Apr 28, 2017
e3c8160
[SPARK-20476][SQL] Block users to create a table that use commas in t…
gatorsmile Apr 28, 2017
59e3a56
[SPARK-14471][SQL] Aliases in SELECT could be used in GROUP BY
maropu Apr 28, 2017
8c911ad
[SPARK-20465][CORE] Throws a proper exception when any temp directory…
HyukjinKwon Apr 28, 2017
733b81b
[SPARK-20496][SS] Bug in KafkaWriter Looks at Unanalyzed Plans
Apr 28, 2017
5d71f3d
[SPARK-20514][CORE] Upgrade Jetty to 9.3.11.v20160721
markgrover Apr 28, 2017
ebff519
[SPARK-20471] Remove AggregateBenchmark testsuite warning: Two level …
heary-cao Apr 28, 2017
77bcd77
[SPARK-19525][CORE] Add RDD checkpoint compression support
Apr 28, 2017
814a61a
[SPARK-20487][SQL] Display `serde` for `HiveTableScan` node in explai…
tejasapatil Apr 29, 2017
b28c3bc
[SPARK-20477][SPARKR][DOC] Document R bisecting k-means in R programm…
wangmiao1981 Apr 29, 2017
add9d1b
[SPARK-19791][ML] Add doc and example for fpgrowth
YY-OnCall Apr 29, 2017
ee694cd
[SPARK-20533][SPARKR] SparkR Wrappers Model should be private and val…
wangmiao1981 Apr 29, 2017
70f1bcd
[SPARK-20493][R] De-duplicate parse logics for DDL-like type strings …
HyukjinKwon Apr 29, 2017
d228cd0
[SPARK-20442][PYTHON][DOCS] Fill up documentations for functions in C…
HyukjinKwon Apr 29, 2017
4d99b95
[SPARK-20521][DOC][CORE] The default of 'spark.worker.cleanup.appData…
Apr 30, 2017
1ee494d
[SPARK-20492][SQL] Do not print empty parentheses for invalid primiti…
HyukjinKwon Apr 30, 2017
ae3df4e
[SPARK-20535][SPARKR] R wrappers for explode_outer and posexplode_outer
zero323 Apr 30, 2017
6613046
[MINOR][DOCS][PYTHON] Adding missing boolean type for replacement val…
May 1, 2017
80e9cf1
[SPARK-20490][SPARKR] Add R wrappers for eqNullSafe and ! / not
zero323 May 1, 2017
a355b66
[SPARK-20541][SPARKR][SS] support awaitTermination without timeout
felixcheung May 1, 2017
f0169a1
[SPARK-20290][MINOR][PYTHON][SQL] Add PySpark wrapper for eqNullSafe
zero323 May 1, 2017
6b44c4d
[SPARK-20534][SQL] Make outer generate exec return empty rows
hvanhovell May 1, 2017
ab30590
[SPARK-20517][UI] Fix broken history UI download link
jerryshao May 1, 2017
6fc6cf8
[SPARK-20464][SS] Add a job group and description for streaming queri…
kunalkhamar May 1, 2017
2b2dd08
[SPARK-20540][CORE] Fix unstable executor requests.
rdblue May 1, 2017
The diff you're trying to view is too large. We only load the first 3000 changed files.
4 changes: 1 addition & 3 deletions .github/PULL_REQUEST_TEMPLATE
@@ -2,11 +2,9 @@

(Please fill in changes proposed in this fix)


## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)


(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Please review http://spark.apache.org/contributing.html before opening a pull request.
105 changes: 59 additions & 46 deletions .gitignore
@@ -1,77 +1,90 @@
*~
*.#*
*#*#
*.swp
*.ipr
*.#*
*.iml
*.ipr
*.iws
*.pyc
*.pyo
*.swp
*~
.DS_Store
.cache
.classpath
.ensime
.ensime_cache/
.ensime_lucene
.generated-mima*
.idea/
.idea_modules/
build/*.jar
.project
.pydevproject
.scala_dependencies
.settings
.cache
cache
.generated-mima*
work/
out/
.DS_Store
/lib/
R-unit-tests.log
R/unit-tests.out
R/cran-check.out
R/pkg/vignettes/sparkr-vignettes.html
build/*.jar
build/apache-maven*
build/zinc*
build/scala*
conf/java-opts
conf/*.sh
build/zinc*
cache
checkpoint
conf/*.cmd
conf/*.properties
conf/*.conf
conf/*.properties
conf/*.sh
conf/*.xml
conf/java-opts
conf/slaves
dependency-reduced-pom.xml
derby.log
dev/create-release/*final
dev/create-release/*txt
dev/pr-deps/
dist/
docs/_site
docs/api
target/
reports/
.project
.classpath
.scala_dependencies
lib_managed/
src_managed/
lint-r-report.log
log/
logs/
out/
project/boot/
project/plugins/project/build.properties
project/build/target/
project/plugins/target/
project/plugins/lib_managed/
project/plugins/project/build.properties
project/plugins/src_managed/
logs/
log/
project/plugins/target/
python/lib/pyspark.zip
python/deps
python/pyspark/python
reports/
scalastyle-on-compile.generated.xml
scalastyle-output.xml
scalastyle.txt
spark-*-bin-*.tgz
spark-tests.log
src_managed/
streaming-tests.log
dependency-reduced-pom.xml
.ensime
.ensime_cache/
.ensime_lucene
checkpoint
derby.log
dist/
dev/create-release/*txt
dev/create-release/*final
spark-*-bin-*.tgz
target/
unit-tests.log
/lib/
scalastyle.txt
scalastyle-output.xml
R-unit-tests.log
R/unit-tests.out
python/lib/pyspark.zip
lint-r-report.log
work/

# For Hive
metastore_db/
metastore/
warehouse/
TempStatsStore/
metastore/
metastore_db/
sql/hive-thriftserver/test_warehouses
warehouse/
spark-warehouse/

# For R session data
.RHistory
.RData
.RHistory
.Rhistory
*.Rproj
*.Rproj.*

.Rproj.user
50 changes: 50 additions & 0 deletions .travis.yml
@@ -0,0 +1,50 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Spark provides this Travis CI configuration file to help contributors
# check Scala/Java style conformance and JDK7/8 compilation easily
# during their preparing pull requests.
# - Scalastyle is executed during `maven install` implicitly.
# - Java Checkstyle is executed by `lint-java`.
# See the related discussion here.
# https://github.com/apache/spark/pull/12980

# 1. Choose OS (Ubuntu 14.04.3 LTS Server Edition 64bit, ~2 CORE, 7.5GB RAM)
sudo: required
dist: trusty

# 2. Choose language and target JDKs for parallel builds.
language: java
jdk:
  - oraclejdk8

# 3. Setup cache directory for SBT and Maven.
cache:
  directories:
  - $HOME/.sbt
  - $HOME/.m2

# 4. Turn off notifications.
notifications:
  email: false

# 5. Run maven install before running lint-java.
install:
  - export MAVEN_SKIP_RC=1
  - build/mvn -T 4 -q -DskipTests -Pmesos -Pyarn -Pkinesis-asl -Phive -Phive-thriftserver install

# 6. Run lint-java.
script:
  - dev/lint-java
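
To run the same checks locally before pushing, the install and script steps of this config can be sketched as a shell function. This is a minimal sketch and not part of the patch: the function name `run_travis_steps` and its `spark_root` argument are hypothetical, and it assumes a Spark checkout in which `build/mvn` and `dev/lint-java` exist as referenced in the config.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the Travis "install" and "script" steps locally.
# Assumption: the argument points at the root of a Spark checkout.

run_travis_steps() {
  local spark_root="${1:-.}"   # hypothetical parameter: path to the checkout

  # Both scripts are referenced by .travis.yml; bail out if they are missing.
  if [ ! -x "$spark_root/build/mvn" ] || [ ! -x "$spark_root/dev/lint-java" ]; then
    echo "not a Spark checkout: $spark_root" >&2
    return 1
  fi

  (
    cd "$spark_root" || exit 1
    # Step 5 of the config: install without tests, same profiles as CI.
    export MAVEN_SKIP_RC=1
    build/mvn -T 4 -q -DskipTests -Pmesos -Pyarn -Pkinesis-asl -Phive -Phive-thriftserver install || exit 1
    # Step 6 of the config: Java style check.
    dev/lint-java
  )
}
```

Calling `run_travis_steps /path/to/spark` returns non-zero if the directory is not a checkout or if either step fails, so it can gate a pre-push hook.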
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -1,12 +1,12 @@
## Contributing to Spark

*Before opening a pull request*, review the
[Contributing to Spark wiki](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark).
[Contributing to Spark guide](http://spark.apache.org/contributing.html).
It lists steps that are required before creating a PR. In particular, consider:

- Is the change important and ready enough to ask the community to spend time reviewing?
- Have you searched for existing, related JIRAs and pull requests?
- Is this a new feature that can stand alone as a package on http://spark-packages.org ?
- Is this a new feature that can stand alone as a [third party project](http://spark.apache.org/third-party-projects.html) ?
- Is the change being proposed clearly explained and motivated?

When you contribute code, you affirm that the contribution is your original work and that you
9 changes: 5 additions & 4 deletions LICENSE
@@ -257,14 +257,13 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(BSD-style) scalacheck (org.scalacheck:scalacheck_2.11:1.10.0 - http://www.scalacheck.org)
(BSD-style) spire (org.spire-math:spire_2.11:0.7.1 - http://spire-math.org)
(BSD-style) spire-macros (org.spire-math:spire-macros_2.11:0.7.1 - http://spire-math.org)
(New BSD License) Kryo (com.esotericsoftware.kryo:kryo:2.21 - http://code.google.com/p/kryo/)
(New BSD License) MinLog (com.esotericsoftware.minlog:minlog:1.2 - http://code.google.com/p/minlog/)
(New BSD License) ReflectASM (com.esotericsoftware.reflectasm:reflectasm:1.07 - http://code.google.com/p/reflectasm/)
(New BSD License) Kryo (com.esotericsoftware:kryo:3.0.3 - https://github.com/EsotericSoftware/kryo)
(New BSD License) MinLog (com.esotericsoftware:minlog:1.3.0 - https://github.com/EsotericSoftware/minlog)
(New BSD license) Protocol Buffer Java API (com.google.protobuf:protobuf-java:2.5.0 - http://code.google.com/p/protobuf)
(New BSD license) Protocol Buffer Java API (org.spark-project.protobuf:protobuf-java:2.4.1-shaded - http://code.google.com/p/protobuf)
(The BSD License) Fortran to Java ARPACK (net.sourceforge.f2j:arpack_combined_all:0.1 - http://f2j.sourceforge.net)
(The BSD License) xmlenc Library (xmlenc:xmlenc:0.52 - http://xmlenc.sourceforge.net)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.9.2 - http://py4j.sourceforge.net/)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.10.4 - http://py4j.sourceforge.net/)
(Two-clause BSD-style license) JUnit-Interface (com.novocode:junit-interface:0.10 - https://github.com/szeiger/junit-interface/)
(BSD licence) sbt and sbt-launch-lib.bash
(BSD 3 Clause) d3.min.js (https://github.com/mbostock/d3/blob/master/LICENSE)
@@ -297,3 +296,5 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(MIT License) blockUI (http://jquery.malsup.com/block/)
(MIT License) RowsGroup (http://datatables.net/license/mit)
(MIT License) jsonFormatter (http://www.jqueryscript.net/other/jQuery-Plugin-For-Pretty-JSON-Formatting-jsonFormatter.html)
(MIT License) modernizr (https://github.com/Modernizr/Modernizr/blob/master/LICENSE)
(MIT License) machinist (https://github.com/typelevel/machinist)
16 changes: 5 additions & 11 deletions NOTICE
@@ -1,5 +1,5 @@
Apache Spark
Copyright 2014 The Apache Software Foundation.
Copyright 2014 and onwards The Apache Software Foundation.

This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
@@ -12,7 +12,9 @@ Common Development and Distribution License 1.0
The following components are provided under the Common Development and Distribution License 1.0. See project link for details.

(CDDL 1.0) Glassfish Jasper (org.mortbay.jetty:jsp-2.1:6.1.14 - http://jetty.mortbay.org/project/modules/jsp-2.1)
(CDDL 1.0) JAX-RS (https://jax-rs-spec.java.net/)
(CDDL 1.0) Servlet Specification 2.5 API (org.mortbay.jetty:servlet-api-2.5:6.1.14 - http://jetty.mortbay.org/project/modules/servlet-api-2.5)
(CDDL 1.0) (GPL2 w/ CPE) javax.annotation API (https://glassfish.java.net/nonav/public/CDDL+GPL.html)
(COMMON DEVELOPMENT AND DISTRIBUTION LICENSE (CDDL) Version 1.0) (GNU General Public Library) Streaming API for XML (javax.xml.stream:stax-api:1.0-2 - no url defined)
(Common Development and Distribution License (CDDL) v1.0) JavaBeans Activation Framework (JAF) (javax.activation:activation:1.1 - http://java.sun.com/products/javabeans/jaf/index.jsp)

@@ -22,15 +24,10 @@ Common Development and Distribution License 1.1

The following components are provided under the Common Development and Distribution License 1.1. See project link for details.

(CDDL 1.1) (GPL2 w/ CPE) org.glassfish.hk2 (https://hk2.java.net)
(CDDL 1.1) (GPL2 w/ CPE) JAXB API bundle for GlassFish V3 (javax.xml.bind:jaxb-api:2.2.2 - https://jaxb.dev.java.net/)
(CDDL 1.1) (GPL2 w/ CPE) JAXB RI (com.sun.xml.bind:jaxb-impl:2.2.3-1 - http://jaxb.java.net/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-core (com.sun.jersey:jersey-core:1.8 - https://jersey.dev.java.net/jersey-core/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-core (com.sun.jersey:jersey-core:1.9 - https://jersey.java.net/jersey-core/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-guice (com.sun.jersey.contribs:jersey-guice:1.9 - https://jersey.java.net/jersey-contribs/jersey-guice/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-json (com.sun.jersey:jersey-json:1.8 - https://jersey.dev.java.net/jersey-json/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-json (com.sun.jersey:jersey-json:1.9 - https://jersey.java.net/jersey-json/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-server (com.sun.jersey:jersey-server:1.8 - https://jersey.dev.java.net/jersey-server/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-server (com.sun.jersey:jersey-server:1.9 - https://jersey.java.net/jersey-server/)
(CDDL 1.1) (GPL2 w/ CPE) Jersey 2 (https://jersey.java.net)

========================================================================
Common Public License 1.0
@@ -424,9 +421,6 @@ Copyright (c) 2011, Terrence Parr.
This product includes/uses ASM (http://asm.ow2.org/),
Copyright (c) 2000-2007 INRIA, France Telecom.

This product includes/uses org.json (http://www.json.org/java/index.html),
Copyright (c) 2002 JSON.org

This product includes/uses JLine (http://jline.sourceforge.net/),
Copyright (c) 2002-2006, Marc Prud'hommeaux <[email protected]>.

2 changes: 2 additions & 0 deletions R/.gitignore
@@ -4,3 +4,5 @@
lib
pkg/man
pkg/html
SparkR.Rcheck/
SparkR_*.tar.gz
91 changes: 91 additions & 0 deletions R/CRAN_RELEASE.md
@@ -0,0 +1,91 @@
# SparkR CRAN Release

To release SparkR as a package to CRAN, we would use the `devtools` package. Please work with the
`[email protected]` community and R package maintainer on this.

### Release

First, check that the `Version:` field in the `pkg/DESCRIPTION` file is updated. Also, check for stale files not under source control.
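The two checks above could be scripted roughly as follows. This is a minimal sketch, not part of Spark's release scripts: the function names `description_version` and `stale_files` are hypothetical, and the second one assumes `git` is available and that the directory is inside a git checkout.

```sh
# Print the Version field of an R package DESCRIPTION file.
# DESCRIPTION files consist of "Field: value" lines.
description_version() {
  sed -n 's/^Version:[[:space:]]*//p' "$1"
}

# List files under a directory that are untracked ("??") or ignored ("!!"),
# i.e. candidates for stale files not under source control.
stale_files() {
  git -C "$1" status --porcelain --ignored -- . | grep -E '^(\?\?|!!)' || true
}
```

From the `SPARK_HOME/R` directory, `description_version pkg/DESCRIPTION` prints the version that would be released, and `stale_files .` lists anything that git does not track.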

Note that while `run-tests.sh` runs `check-cran.sh` (which runs `R CMD check`), it does so with `--no-manual --no-vignettes`, which skips some vignette and PDF checks. It is therefore preferable to run `R CMD check` on the manually built source package before uploading a release. Also note that for the CRAN checks on PDF vignettes to succeed, the `qpdf` tool must be installed (e.g. `yum -q -y install qpdf`).
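Running the full check on a manually built source package might look like the sketch below. The function name `check_source_package` and the `R_BIN` override are hypothetical and exist only for illustration; `--as-cran` is a standard `R CMD check` option, and the tarball is assumed to have been built already.

```sh
# Run the full CRAN checks on the newest SparkR source tarball in a directory.
# R_BIN is a hypothetical override (not part of Spark's scripts) that lets the
# R binary be swapped out; it defaults to "R".
check_source_package() {
  dir="${1:-.}"
  # Pick the most recently modified SparkR_*.tar.gz, if any.
  tarball=$(ls -t "$dir"/SparkR_*.tar.gz 2>/dev/null | head -n 1)
  if [ -z "$tarball" ]; then
    echo "no SparkR_*.tar.gz found in $dir" >&2
    return 1
  fi
  # qpdf is needed for the PDF vignette checks; warn early if it is missing.
  command -v qpdf >/dev/null 2>&1 || echo "warning: qpdf not found on PATH" >&2
  # --no-manual and --no-vignettes are left out on purpose, so the manual
  # and vignette checks are included.
  "${R_BIN:-R}" CMD check --as-cran "$tarball"
}
```

Calling `check_source_package .` from the directory that holds the tarball runs the checks that `run-tests.sh` skips.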

To upload a release, we would need to update the `cran-comments.md`. This should generally contain the results from running the `check-cran.sh` script along with comments on the status of every `WARNING` (there should not be any) or `NOTE`. As part of `check-cran.sh` and the release process, the vignettes are built, so make sure `SPARK_HOME` is set and the Spark jars are accessible.

Once everything is in place, run in R under the `SPARK_HOME/R` directory:

```R
paths <- .libPaths(); .libPaths(c("lib", paths)); Sys.setenv(SPARK_HOME=tools::file_path_as_absolute("..")); devtools::release(); .libPaths(paths)
```

For more information please refer to http://r-pkgs.had.co.nz/release.html#release-check

### Testing: build package manually

To build the package manually, for example to inspect the contents of the resulting `.tar.gz` file, we would also use the `devtools` package.

The source package is what gets released to CRAN. CRAN then builds platform-specific binary packages from the source package.

#### Build source package

To build the source package locally without releasing to CRAN, run in R under the `SPARK_HOME/R` directory:

```R
paths <- .libPaths(); .libPaths(c("lib", paths)); Sys.setenv(SPARK_HOME=tools::file_path_as_absolute("..")); devtools::build("pkg"); .libPaths(paths)
```

(http://r-pkgs.had.co.nz/vignettes.html#vignette-workflow-2)

Similarly, the source package is also created by `check-cran.sh` with `R CMD build pkg`.

For example, this should be the content of the source package:

```sh
DESCRIPTION R inst tests
NAMESPACE build man vignettes

inst/doc/
sparkr-vignettes.html
sparkr-vignettes.Rmd
sparkr-vignettes.Rman

build/
vignette.rds

man/
*.Rd files...

vignettes/
sparkr-vignettes.Rmd
```
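To compare a built tarball against the listing above without unpacking it, something like the following sketch could be used. The function name `list_package_entries` is hypothetical, and the sketch assumes the tarball has a single top-level directory named after the package, as source packages produced by `R CMD build` do.

```sh
# Print the distinct entries directly under the top-level package directory
# of a source tarball (e.g. DESCRIPTION, R, man, vignettes).
list_package_entries() {
  # tar -tzf lists paths such as "SparkR/man/foo.Rd"; the second
  # path component is the entry inside the package directory.
  tar -tzf "$1" | awk -F/ 'NF >= 2 && $2 != "" { print $2 }' | sort -u
}
```

For example, `list_package_entries SparkR_2.1.0.tar.gz` prints one name per line, with "2.1.0" standing in for the actual SparkR version.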

#### Test source package

To install, run this:

```sh
R CMD INSTALL SparkR_2.1.0.tar.gz
```

Replace "2.1.0" with the version of SparkR.

This command installs SparkR to the default libPaths. Once that is done, you should be able to start R and run:

```R
library(SparkR)
vignette("sparkr-vignettes", package="SparkR")
```

#### Build binary package

To build the binary package locally, run in R under the `SPARK_HOME/R` directory:

```R
paths <- .libPaths(); .libPaths(c("lib", paths)); Sys.setenv(SPARK_HOME=tools::file_path_as_absolute("..")); devtools::build("pkg", binary = TRUE); .libPaths(paths)
```

For example, this should be the content of the binary package:

```sh
DESCRIPTION Meta R html tests
INDEX NAMESPACE help profile worker
```