[SPARK-15040][ML][PYSPARK] Add Imputer to PySpark #17316
MLnick wants to merge 5 commits into apache:master
cc @hhbyyh
Test build #74667 has finished for PR 17316 at commit
Test build #74669 has finished for PR 17316 at commit
Test build #74672 has finished for PR 17316 at commit
    """
    .. note:: Experimental

    Imputation estimator for completing missing values, either using the mean or the median
Nit: Shall we change all the "column" to "columns" since we are supporting multiple columns now...
Will do for Python and Scala doc
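For readers unfamiliar with the multi-column behaviour discussed above, here is a minimal pure-Python sketch of mean/median imputation over several columns at once. It is not Spark's implementation: the `impute` function, its parameters, and the choice to treat both `None` and NaN as missing are assumptions made for illustration, and `statistics.median` is an exact median, which may differ from how Spark computes its median.

```python
import math
from statistics import mean, median


def _is_missing(value):
    # Assumption for this sketch: None and NaN both count as missing.
    return value is None or (isinstance(value, float) and math.isnan(value))


def impute(rows, input_cols, output_cols, strategy="mean"):
    """Fill missing entries of each input column with that column's mean or median.

    `rows` is a list of dicts; one output column is written per input column.
    """
    aggregate = {"mean": mean, "median": median}[strategy]

    # "Fit": compute one surrogate value per input column from the present values.
    surrogates = {}
    for col in input_cols:
        present = [row[col] for row in rows if not _is_missing(row[col])]
        surrogates[col] = aggregate(present)

    # "Transform": copy each row and write the imputed value to the paired output column.
    result = []
    for row in rows:
        new_row = dict(row)
        for src, dst in zip(input_cols, output_cols):
            new_row[dst] = surrogates[src] if _is_missing(row[src]) else row[src]
        result.append(new_row)
    return result
```

Each input column gets its own surrogate, which is why the documentation wording refers to "columns" rather than a single "column".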
    def getOutputCols(self):
        """
        Gets the value of :py:attr:`outputCols` or its default value.
        """
This reminds me we should add
require(get(inputCols).isDefined, "Input cols must be defined first.")
require(get(outputCols).isDefined, "Output cols must be defined first.")
in transformSchema
Do we really need that? The first call to $(inputCols) in validateAndTransformSchema will just throw an error with Failed to find a default value ...
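The trade-off in this exchange can be illustrated with a toy sketch. The `ToyParams` class and both validation functions below are hypothetical stand-ins written for this example, not Spark or PySpark classes; they only mimic the idea of a parameter lookup that fails when no value or default exists.

```python
class ToyParams:
    """Hypothetical stand-in for a params holder with explicit values and defaults."""

    def __init__(self):
        self._values = {}
        self._defaults = {}

    def set(self, name, value):
        self._values[name] = value
        return self

    def is_defined(self, name):
        return name in self._values or name in self._defaults

    def get_or_default(self, name):
        if name in self._values:
            return self._values[name]
        if name in self._defaults:
            return self._defaults[name]
        # Mimics the kind of error the reply refers to.
        raise KeyError("Failed to find a default value for " + name)


def validate_eager(params):
    # Explicit up-front checks, in the spirit of the proposed require(...) calls.
    if not params.is_defined("inputCols"):
        raise ValueError("Input cols must be defined first.")
    if not params.is_defined("outputCols"):
        raise ValueError("Output cols must be defined first.")
    return params.get_or_default("inputCols"), params.get_or_default("outputCols")


def validate_lazy(params):
    # No explicit checks: the first lookup of an unset param raises on its own.
    return params.get_or_default("inputCols"), params.get_or_default("outputCols")
```

Both variants reject an unset parameter; in this sketch the only difference is which error type and message the caller sees.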
Test build #74874 has finished for PR 17316 at commit
Test build #75012 has finished for PR 17316 at commit
Merged to master.
## What changes were proposed in this pull request?
Add docs and examples for spark.ml.feature.Imputer. Currently Scala and Java examples are included. Python example will be added after #17316.
## How was this patch tested?
Local doc generation and example execution.
Author: Yuhao Yang <yuhao.yang@intel.com>
Closes #17324 from hhbyyh/imputerdoc.
Add Python wrapper for Imputer feature transformer.
## How was this patch tested?
New doc tests and tweak to PySpark ML tests.py