[SPARK-14712][ML] LogisticRegressionModel.toString should summarize model #18826
Hi @holdenk, I'm opening this PR to continue the effort in #12491.
On the Python side, using __repr__ looks reasonable (although I would like to see a doctest for this). But we really should get @jkbradley or maybe @dbtsai to take a look on the ML side. Jenkins ok to test. Sorry for the delay on getting to this.
Test build #4183 has finished for PR 18826 at commit
Test build #4187 has finished for PR 18826 at commit
This PR recently got tested, so it drew my attention. Is this something we want to proceed with? @holdenk @yanboliang @jkbradley @dbtsai I don't see how the test failures relate to LogisticRegression. Might they come from other flaky tests? Also tagging @HyukjinKwon, who is active in maintaining stale PRs.
@bravo-zhang, mind if I ask you to rebase it and see if the tests pass? BTW, let's fix the PR title to link the JIRA.
@HyukjinKwon It's ready to test.
retest this please
Test build #91531 has finished for PR 18826 at commit
@HyukjinKwon Can you recommend someone to take a look at this PR, or maybe you can take a look?
ok to test
I think you cc'ed the right ones.
Test build #92157 has finished for PR 18826 at commit
holdenk left a comment:
Sorry for not noticing this earlier, looks really good - quick question on the Python side though. Would love to get this in :)
        java_blr_summary = self._call_java("evaluate", dataset)
        return BinaryLogisticRegressionSummary(java_blr_summary)

    def __repr__(self):
So my question here is why we aren't calling the Java/Scala toString method directly, as we do in the mllib one and, on the ml side, in many of the other models in regression.py?
Test build #92240 has finished for PR 18826 at commit
@holdenk @HyukjinKwon I updated
LGTM, merging to master.
Thanks for the improvement :)
What changes were proposed in this pull request?
SPARK-14712
The spark.mllib LogisticRegressionModel overrides toString to print a little model info. We should do the same in spark.ml and override __repr__ in pyspark.
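A minimal sketch of the delegation idea raised in review, not the code merged in this PR: the Python wrapper's __repr__ forwards to the toString of the object it wraps, so the two summaries stay in sync. FakeJavaModel, ModelWrapper, and the summary format are hypothetical stand-ins chosen so the example runs without Spark or a JVM.

```python
class FakeJavaModel:
    """Hypothetical stand-in for the JVM-side model (no Spark needed)."""

    def __init__(self, uid, num_classes, num_features):
        self.uid = uid
        self.num_classes = num_classes
        self.num_features = num_features

    def toString(self):
        # Illustrative summary format; the real Scala output may differ.
        return "FakeJavaModel: uid=%s, numClasses=%d, numFeatures=%d" % (
            self.uid, self.num_classes, self.num_features)


class ModelWrapper:
    """Hypothetical Python-side wrapper holding a reference to the JVM object."""

    def __init__(self, java_obj):
        self._java_obj = java_obj

    def __repr__(self):
        # Delegate so the Python and JVM-side summaries cannot drift apart.
        return self._java_obj.toString()


model = ModelWrapper(FakeJavaModel("logreg_1", 2, 3))
summary = repr(model)
print(summary)
```

Delegating keeps a single source of truth for the summary text; the alternative of formatting the string separately in Python means every change to the Scala toString has to be mirrored by hand.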
How was this patch tested?
LogisticRegressionSuite.scala
Python doctest in pyspark.ml.classification.py
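As a rough illustration of how a doctest can pin down a __repr__ (this is not the doctest added by the PR), the sketch below uses a hypothetical TinyModel class with an assumed summary format and runs its docstring example through the standard doctest module.

```python
import doctest


class TinyModel:
    """Hypothetical model whose repr is checked by a doctest.

    >>> TinyModel("logreg_1", 2)
    TinyModel: uid=logreg_1, numClasses=2
    """

    def __init__(self, uid, num_classes):
        self.uid = uid
        self.num_classes = num_classes

    def __repr__(self):
        # Assumed format for illustration only.
        return "TinyModel: uid=%s, numClasses=%d" % (self.uid, self.num_classes)


# Parse the class docstring into a doctest and run it directly.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    TinyModel.__doc__, {"TinyModel": TinyModel}, "TinyModel", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results)
```

Because the interactive prompt echoes an expression through repr, the expected-output line in the docstring fails as soon as the summary format changes.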