[SPARK-5254][MLLIB] Update the user guide to position spark.ml better #4052
Test build #25572 has started for PR 4052 at commit
Test build #25572 has finished for PR 4052 at commit
Test PASSed.
Test build #25579 has started for PR 4052 at commit
Test build #25579 has finished for PR 4052 at commit
Test PASSed.
LGTM
The current statement in the user guide may send a confusing message to users. spark.ml contains high-level APIs for building ML pipelines, but that doesn't mean that spark.mllib is being deprecated.
First of all, the pipeline API is in its alpha stage, and we need to see more use cases from the community to stabilize it, which may take several releases. Secondly, the components in spark.ml are simple wrappers over spark.mllib implementations. Neither the APIs nor the implementations from spark.mllib are being deprecated. We expect users to use the spark.ml pipeline APIs to build their ML pipelines, but we will keep supporting and adding features to spark.mllib. For example, there are many features in review at https://spark-prs.appspot.com/#mllib. So users should be comfortable using spark.mllib features and can expect more to come. The user guide needs to be updated to make the message clear.
Author: Xiangrui Meng <[email protected]>
Closes #4052 from mengxr/SPARK-5254 and squashes the following commits:
6d5f1d3 [Xiangrui Meng] typo
0cc935b [Xiangrui Meng] update user guide to position spark.ml better
(cherry picked from commit 13d2406)
Signed-off-by: Xiangrui Meng <[email protected]>
Merged into master and branch-1.2.
Forgot to remove this section in #4052.
Author: Xiangrui Meng <[email protected]>
Closes #4053 from mengxr/SPARK-5254-update and squashes the following commits:
f295bde [Xiangrui Meng] remove developers section from spark.ml guide
Forgot to remove this section in #4052.
Author: Xiangrui Meng <[email protected]>
Closes #4053 from mengxr/SPARK-5254-update and squashes the following commits:
f295bde [Xiangrui Meng] remove developers section from spark.ml guide
(cherry picked from commit 6abc45e)
Signed-off-by: Xiangrui Meng <[email protected]>
The current statement in the user guide may send a confusing message to users. spark.ml contains high-level APIs for building ML pipelines, but that doesn't mean that spark.mllib is being deprecated.
First of all, the pipeline API is in its alpha stage, and we need to see more use cases from the community to stabilize it, which may take several releases. Secondly, the components in spark.ml are simple wrappers over spark.mllib implementations. Neither the APIs nor the implementations from spark.mllib are being deprecated. We expect users to use the spark.ml pipeline APIs to build their ML pipelines, but we will keep supporting and adding features to spark.mllib. For example, there are many features in review at https://spark-prs.appspot.com/#mllib. So users should be comfortable using spark.mllib features and can expect more to come. The user guide needs to be updated to make the message clear.
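To make the "simple wrappers" point concrete, here is a toy sketch of the layering described above. It is plain Python and does not use Spark at all; `train_threshold`, `ThresholdModel`, and `ThresholdEstimator` are hypothetical names made up for illustration, not spark.ml or spark.mllib APIs. The only idea it is meant to show is that a high-level fit/transform component can delegate to a lower-level training routine, so both layers stay usable.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Toy stand-in for a labeled point: (feature, label). Hypothetical, not a Spark type.
Row = Tuple[float, int]


def train_threshold(rows: Sequence[Row]) -> float:
    """Low-level routine (hypothetical): midpoint between the two class means."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


@dataclass
class ThresholdModel:
    """Fitted model returned by the high-level estimator."""
    threshold: float

    def transform(self, features: Sequence[float]) -> List[int]:
        # Predict 1 for features strictly above the learned threshold.
        return [1 if x > self.threshold else 0 for x in features]


class ThresholdEstimator:
    """High-level wrapper: exposes fit(), delegates training to the low-level routine."""

    def fit(self, rows: Sequence[Row]) -> ThresholdModel:
        return ThresholdModel(train_threshold(rows))


rows = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]

# Calling the low-level routine directly still works.
low_level = train_threshold(rows)

# The wrapper produces the same result behind a fit/transform interface.
model = ThresholdEstimator().fit(rows)
predictions = model.transform([0.5, 7.5])
```

In this sketch the wrapper adds no new training logic, so the low-level routine keeps working on its own while the wrapper only adds a uniform interface on top.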