@ymwdalex
Contributor

ymwdalex commented Mar 2, 2017

What changes were proposed in this pull request?

The description of the pipeline in this paragraph is incorrect: https://spark.apache.org/docs/latest/ml-pipeline.html#how-it-works

If the Pipeline had more stages, it would call the LogisticRegressionModel’s transform() method on the DataFrame before passing the DataFrame to the next stage.

Reason: a Transformer can also be a stage, but the Pipeline only calls transform() and passes the data to the next stage when another Estimator follows. The current description in the document misleads ML pipeline users.
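
To illustrate the point, here is a minimal sketch in plain Python of the fitting loop as I understand it. It is a toy model, not Spark's code: `ToyTransformer`, `ToyEstimator`, and `fit_pipeline` are hypothetical names made up for this example, and the only behavior modeled is which `fit()`/`transform()` calls happen during `Pipeline.fit()`.

```python
class ToyTransformer:
    """Hypothetical stand-in for a Transformer stage; logs each transform() call."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def transform(self, data):
        self.log.append(f"{self.name}.transform")
        return data


class ToyEstimator:
    """Hypothetical stand-in for an Estimator stage; fit() returns a fitted model."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def fit(self, data):
        self.log.append(f"{self.name}.fit")
        return ToyTransformer(f"{self.name}Model", self.log)


def fit_pipeline(stages, data):
    """Assumed fitting logic: data is transformed only while an Estimator still lies ahead."""
    last_estimator = max(
        (i for i, s in enumerate(stages) if isinstance(s, ToyEstimator)),
        default=-1,
    )
    fitted = []
    for i, stage in enumerate(stages):
        if i <= last_estimator:
            model = stage.fit(data) if isinstance(stage, ToyEstimator) else stage
            if i < last_estimator:
                # A later Estimator needs this stage's output as its training input.
                data = model.transform(data)
            fitted.append(model)
        else:
            # Stages after the last Estimator are kept as-is; nothing is transformed.
            fitted.append(stage)
    return fitted


# Case 1: a Transformer follows the Estimator, so lrModel.transform is never called.
log_a = []
fit_pipeline(
    [ToyTransformer("tokenizer", log_a), ToyEstimator("lr", log_a),
     ToyTransformer("extra", log_a)],
    data=["a b c"],
)
print(log_a)  # ['tokenizer.transform', 'lr.fit']

# Case 2: another Estimator follows, so lrModel.transform runs before the next fit.
log_b = []
fit_pipeline(
    [ToyTransformer("tokenizer", log_b), ToyEstimator("lr", log_b),
     ToyEstimator("est2", log_b)],
    data=["a b c"],
)
print(log_b)  # ['tokenizer.transform', 'lr.fit', 'lrModel.transform', 'est2.fit']
```

In this sketch, adding a trailing Transformer does not trigger the model's transform() during fitting, while adding a trailing Estimator does. That is the distinction the doc wording should capture.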

How was this patch tested?

This is a tiny modification of docs/ml-pipelines.md. I ran jekyll build on the modified docs and checked the compiled document.

@srowen
Member

srowen commented Mar 2, 2017

I don't think this is an essential change (see JIRA) but it's OK. See http://spark.apache.org/contributing.html for how to format the title and description of a PR.

ymwdalex changed the title from "Correct ML pipeline document (https://issues.apache.org/jira/browse/SPARK-19797)" to "[SPARK-19797][DOC] ML pipeline document correction" Mar 2, 2017
@SparkQA

SparkQA commented Mar 2, 2017

Test build #3593 has finished for PR 17137 at commit c9037be.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ymwdalex
Contributor Author

ymwdalex commented Mar 2, 2017

Formatted the title and description of this PR according to contributing.html.

@BryanCutler
Member

LGTM

@srowen
Member

srowen commented Mar 3, 2017

Merged to master/2.1

asfgit pushed a commit that referenced this pull request Mar 3, 2017
## What changes were proposed in this pull request?
The description of the pipeline in this paragraph is incorrect: https://spark.apache.org/docs/latest/ml-pipeline.html#how-it-works

> If the Pipeline had more **stages**, it would call the LogisticRegressionModel’s transform() method on the DataFrame before passing the DataFrame to the next stage.

Reason: a Transformer can also be a stage, but the Pipeline only calls transform() and passes the data to the next stage when another Estimator follows. The current description in the document misleads ML pipeline users.

## How was this patch tested?
This is a tiny modification of **docs/ml-pipelines.md**. I ran `jekyll build` on the modified docs and checked the compiled document.

Author: Zhe Sun <[email protected]>

Closes #17137 from ymwdalex/SPARK-19797-ML-pipeline-document-correction.

(cherry picked from commit 0bac3e4)
Signed-off-by: Sean Owen <[email protected]>
@asfgit asfgit closed this in 0bac3e4 Mar 3, 2017
4 participants