[SPARK-14615][ML][FOLLOWUP] Fix Python examples to use the new ML Vector and Matrix APIs in the ML pipeline based algorithms #13393
Test build #59588 has finished for PR 13393 at commit
LGTM and cc @mengxr
Hi @mengxr, could you please take a look?
Please let me ping @mengxr again. Thanks!
Hi @yanboliang, could you maybe take a quick look please?
I'll take a look
 # $example on$
 from pyspark.ml.regression import AFTSurvivalRegression
-from pyspark.mllib.linalg import Vectors
+from pyspark.ml.linalg import Vectors
Does this example run for you? It seems broken (not due to your PR though). Would you mind checking to identify the last time it worked?
Traceback (most recent call last):
  File "/Users/josephkb/spark/examples/src/main/python/ml/aft_survival_regression.py", line 49, in <module>
    model = aft.fit(training)
  File "/Users/josephkb/spark/python/lib/pyspark.zip/pyspark/ml/base.py", line 64, in fit
  File "/Users/josephkb/spark/python/lib/pyspark.zip/pyspark/ml/wrapper.py", line 213, in _fit
  File "/Users/josephkb/spark/python/lib/pyspark.zip/pyspark/ml/wrapper.py", line 210, in _fit_java
  File "/Users/josephkb/spark/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py", line 933, in __call__
  File "/Users/josephkb/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 79, in deco
pyspark.sql.utils.IllegalArgumentException: u'requirement failed: The number of instances should be greater than 0.0, but got 0.'
LGTM except for the broken example, but I don't think that's from this PR. I'll rerun tests before merging it.
Test build #3076 has finished for PR 13393 at commit
Well, that test works in 1.6 but fails in branch-2.0. I'll merge your PR. Thanks! I created a JIRA for the bug. Would you have time to look into it? [https://issues.apache.org/jira/browse/SPARK-15892]
…tor and Matrix APIs in the ML pipeline based algorithms

## What changes were proposed in this pull request?

This PR fixes Python examples to use the new ML Vector and Matrix APIs in the ML pipeline based algorithms. I firstly executed this shell command, `grep -r "from pyspark.mllib" .` and then executed them all. Some of tests in `ml` produced the error messages as below:

```
pyspark.sql.utils.IllegalArgumentException: u'requirement failed: Input type must be VectorUDT but got org.apache.spark.mllib.linalg.VectorUDTf71b0bce.'
```

So, I fixed them to use new ones just identically with some Python tests fixed in #12627

## How was this patch tested?

Manually tested for all the examples listed by `grep -r "from pyspark.mllib" .`.

Author: hyukjinkwon <[email protected]>

Closes #13393 from HyukjinKwon/SPARK-14615.

(cherry picked from commit 99f3c82)
Signed-off-by: Joseph K. Bradley <[email protected]>
@jkbradley Sure, thanks!
What changes were proposed in this pull request?
This PR fixes the Python examples to use the new ML Vector and Matrix APIs in the ML pipeline based algorithms.
I first ran the shell command `grep -r "from pyspark.mllib" .` and then executed every example it listed. Some of the examples under `ml` produced error messages like the following:
    pyspark.sql.utils.IllegalArgumentException: u'requirement failed: Input type must be VectorUDT but got org.apache.spark.mllib.linalg.VectorUDTf71b0bce.'
So I changed them to use the new APIs, in the same way as the Python tests fixed in #12627.
How was this patch tested?
Manually tested all the examples listed by `grep -r "from pyspark.mllib" .`.
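The `grep -r "from pyspark.mllib" .` search described above can also be approximated in Python, which may be handy where `grep` is not available. The following is only a minimal sketch and not part of this PR: the function name `find_mllib_imports` is hypothetical, and unlike `grep -r` it assumes that only `.py` files are of interest.

```python
import os


def find_mllib_imports(root, needle="from pyspark.mllib"):
    """Return sorted (path, line_number, line) tuples for matching lines.

    Hypothetical helper: walks `root` recursively and, as an assumption,
    inspects only files ending in .py (grep -r would inspect every file).
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on oddly encoded files.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line:
                        hits.append((path, number, line.strip()))
    return sorted(hits)


if __name__ == "__main__":
    # Print matches in a grep-like "path:line: text" layout.
    for path, number, line in find_mllib_imports("."):
        print(f"{path}:{number}: {line}")
```

Run from the root of a checkout, this lists each file that still imports from `pyspark.mllib`. Whether a given hit needs to move to `pyspark.ml.linalg` still has to be judged per example, since only the ML pipeline based ones are affected.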