[SPARK-15040][ML][PYSPARK] Add Imputer to PySpark by MLnick · Pull Request #17316 · apache/spark

MLnick · 2017-03-16T14:22:24Z

Add Python wrapper for Imputer feature transformer.

How was this patch tested?

New doctests, plus a tweak to PySpark ML tests.py.

MLnick · 2017-03-16T14:22:41Z

cc @hhbyyh

SparkQA · 2017-03-16T14:29:29Z

Test build #74667 has finished for PR 17316 at commit 5efe889.

This patch fails Python style tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
class Imputer(JavaEstimator, HasInputCols, JavaMLReadable, JavaMLWritable):
class ImputerModel(JavaModel, JavaMLReadable, JavaMLWritable):

SparkQA · 2017-03-16T15:29:30Z

Test build #74669 has finished for PR 17316 at commit 5e53e05.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-03-16T15:34:34Z

Test build #74672 has finished for PR 17316 at commit 325c9cf.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

hhbyyh

Looks good to me.

hhbyyh · 2017-03-16T17:57:17Z

+    """
+    .. note:: Experimental
+
+    Imputation estimator for completing missing values, either using the mean or the median


Nit: Shall we change all the "column" to "columns" since we are supporting multiple columns now...

MLnick · 2017-03-17

Will do for both the Python and Scala docs.
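For readers skimming this thread, here is a rough plain-Python sketch of what per-column mean/median imputation does. It is not the code in this PR: the helper names (`fit_surrogates`, `transform`, `_is_missing`) are made up for illustration, rows are plain dicts rather than a DataFrame, and the median is computed exactly with the `statistics` module, whereas the docs in this PR point to approxQuantile, so results on real data may differ.

```python
import math
import statistics


def _is_missing(value):
    # Assumption for this sketch: None and NaN both count as missing.
    return value is None or (isinstance(value, float) and math.isnan(value))


def fit_surrogates(rows, input_cols, strategy="mean"):
    """Compute one fill value per input column from the non-missing entries."""
    if strategy not in ("mean", "median"):
        raise ValueError(f"unsupported strategy: {strategy}")
    stat = statistics.mean if strategy == "mean" else statistics.median
    surrogates = {}
    for col in input_cols:
        present = [row[col] for row in rows if not _is_missing(row[col])]
        surrogates[col] = stat(present)
    return surrogates


def transform(rows, surrogates, input_cols, output_cols):
    """Return new rows with each output column holding the imputed value."""
    filled = []
    for row in rows:
        new_row = dict(row)
        for in_col, out_col in zip(input_cols, output_cols):
            value = row[in_col]
            new_row[out_col] = surrogates[in_col] if _is_missing(value) else value
        filled.append(new_row)
    return filled


nan = float("nan")
rows = [
    {"a": 1.0, "b": nan},
    {"a": 2.0, "b": nan},
    {"a": nan, "b": 3.0},
    {"a": 4.0, "b": 4.0},
    {"a": 5.0, "b": 5.0},
]
surrogates = fit_surrogates(rows, ["a", "b"], strategy="mean")
filled = transform(rows, surrogates, ["a", "b"], ["out_a", "out_b"])
```

Each input column gets its own fill value, and each value is written to the matching output column, which is why the doc wording moves from "column" to "columns".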

hhbyyh · 2017-03-16T18:04:57Z

+    def getOutputCols(self):
+        """
+        Gets the value of :py:attr:`outputCols` or its default value.
+        """


This reminds me that we should add

    require(get(inputCols).isDefined, "Input cols must be defined first.")
    require(get(outputCols).isDefined, "Output cols must be defined first.")

in transformSchema.

MLnick · 2017-03-17

Do we really need that? The first call to $(inputCols) in validateAndTransformSchema will just throw an error with "Failed to find a default value ..."
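The trade-off in this exchange can be sketched with a plain-Python analogy. Nothing below is Spark's API: `ParamStore`, `get_or_default`, and `validate_schema` are hypothetical names, and the error text only imitates the message quoted above. The sketch shows that a lookup which already fails loudly on an unset param makes a separate "is defined" guard mostly a question of how friendly the message should be.

```python
class ParamStore:
    """Hypothetical stand-in for a params container; not Spark's API."""

    def __init__(self, defaults=None):
        self._defaults = dict(defaults or {})
        self._values = {}

    def set(self, name, value):
        self._values[name] = value
        return self

    def is_defined(self, name):
        return name in self._values or name in self._defaults

    def get_or_default(self, name):
        if name in self._values:
            return self._values[name]
        if name in self._defaults:
            return self._defaults[name]
        # Mirrors the style of failure described in the comment above.
        raise KeyError(f"Failed to find a default value for {name}")


def validate_schema(params, schema_cols):
    # No explicit "is defined" guard here: the first lookup already raises
    # when inputCols has neither an explicit value nor a default.
    input_cols = params.get_or_default("inputCols")
    output_cols = params.get_or_default("outputCols")
    if len(input_cols) != len(output_cols):
        raise ValueError("inputCols and outputCols must have the same length")
    missing = [c for c in input_cols if c not in schema_cols]
    if missing:
        raise ValueError(f"input columns not in schema: {missing}")
    return list(schema_cols) + list(output_cols)


params = ParamStore(defaults={"strategy": "mean"})
try:
    validate_schema(params, ["a", "b"])
    error_message = None
except KeyError as exc:
    error_message = str(exc)
```

An explicit guard would only change the wording of the failure, not whether the unset param is caught.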

SparkQA · 2017-03-20T12:02:21Z

Test build #74874 has finished for PR 17316 at commit 5c272b5.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-03-22T00:42:14Z

Test build #75012 has finished for PR 17316 at commit 7fd17dd.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

MLnick · 2017-03-24T15:01:52Z

Merged to master.

## What changes were proposed in this pull request?

Add docs and examples for spark.ml.feature.Imputer. Currently Scala and Java examples are included. The Python example will be added after #17316.

## How was this patch tested?

Local doc generation and example execution.

Author: Yuhao Yang <yuhao.yang@intel.com>

Closes #17324 from hhbyyh/imputerdoc.

Pyspark Imputer

5efe889

Nick Pentreath added 2 commits March 16, 2017 16:35

fix line length

5e53e05

Add Experimental tags

325c9cf

hhbyyh approved these changes Mar 16, 2017

hhbyyh mentioned this pull request Mar 16, 2017

[SPARK-19969] [ML] Imputer doc and example #17324

Closed

Column -> columns in doc comments

5c272b5

Fix doc for approxQuantile link

7fd17dd

asfgit closed this in d9f4ce6 Mar 24, 2017

MLnick wants to merge 5 commits into apache:master from MLnick:SPARK-15040-pyspark-imputer
