Commit 0cc935b

update user guide to position spark.ml better
1 parent 76389c5 commit 0cc935b

2 files changed: 21 additions, 14 deletions

docs/ml-guide.md

Lines changed: 10 additions & 7 deletions
@@ -3,13 +3,16 @@ layout: global
 title: Spark ML Programming Guide
 ---
 
-Spark ML is Spark's new machine learning package. It is currently an alpha component but is potentially a successor to [MLlib](mllib-guide.html). The `spark.ml` package aims to replace the old APIs with a cleaner, more uniform set of APIs which will help users create full machine learning pipelines.
-
-MLlib vs. Spark ML:
-
-* Users can use algorithms from either of the two packages, but APIs may differ. Currently, `spark.ml` offers a subset of the algorithms from `spark.mllib`. Since Spark ML is an alpha component, its API may change in future releases.
-* Developers should contribute new algorithms to `spark.mllib` and can optionally contribute to `spark.ml`. See below for more details.
-* Spark ML only has Scala and Java APIs, whereas MLlib also has a Python API.
+`spark.ml` is a new packaged introduced in Spark 1.2, which aims to provide a uniform set of
+high-level APIs that help users create and tune practical machine learning pipelines.
+It is currently an alpha component, and we would like to hear back from the community about
+how it fits real-world use cases and how it could be improved.
+
+Note that we will keep supporting and adding features to `spark.mllib` along with the
+development of `spark.ml`.
+Users should be comfortable using `spark.mllib` features and expect more features coming.
+Developers should contribute new algorithms to `spark.mllib` and can optionally contribute
+to `spark.ml`.
 
 
 **Table of Contents**

docs/mllib-guide.md

Lines changed: 11 additions & 7 deletions
@@ -35,16 +35,20 @@ MLlib is under active development.
 The APIs marked `Experimental`/`DeveloperApi` may change in future releases,
 and the migration guide below will explain all changes between releases.
 
-# spark.ml: The New ML Package
+# spark.ml: high-level APIs for ML pipelines
 
-Spark 1.2 includes a new machine learning package called `spark.ml`, currently an alpha component but potentially a successor to `spark.mllib`. The `spark.ml` package aims to replace the old APIs with a cleaner, more uniform set of APIs which will help users create full machine learning pipelines.
+Spark 1.2 includes a new package called `spark.ml`, which aims to provide a uniform set of
+high-level APIs that help users create and tune practical machine learning pipelines.
+It is currently an alpha component, and we would like to hear back from the community about
+how it fits real-world use cases and how it could be improved.
 
-See the **[spark.ml programming guide](ml-guide.html)** for more information on this package.
-
-Users can use algorithms from either of the two packages, but APIs may differ. Currently, `spark.ml` offers a subset of the algorithms from `spark.mllib`.
+Note that we will keep supporting and adding features to `spark.mllib` along with the
+development of `spark.ml`.
+Users should be comfortable using `spark.mllib` features and expect more features coming.
+Developers should contribute new algorithms to `spark.mllib` and can optionally contribute
+to `spark.ml`.
 
-Developers should contribute new algorithms to `spark.mllib` and can optionally contribute to `spark.ml`.
-See the `spark.ml` programming guide linked above for more details.
+See the **[spark.ml programming guide](ml-guide.html)** for more information on this package.
 
 
 # Dependencies
