[SPARK-26754][PYTHON] Add hasTrainingSummary to replace duplicate code in PySpark #23676
Test build #101774 has finished for PR 23676 at commit
python/pyspark/ml/util.py
Outdated
        """

    @property
    @since("3.0.0")
The only issue I see here is that these properties were effectively added to subclasses in 2.0.0 and 2.1.0, so this seems somewhat misleading. Maybe we can declare these "since 2.1.0" as close enough?
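To make the versioning concern concrete, here is a minimal sketch of how a `since`-style decorator can stamp a version note onto a property's docstring. `since_sketch` and `ToyModel` are hypothetical names used only for illustration; this is a simplified stand-in, not PySpark's actual `since` implementation. Because the tag only affects documentation, the choice between "3.0.0" and "2.1.0" changes what readers are told, not how the property behaves.

```python
def since_sketch(version):
    """Hypothetical stand-in for a version-annotating decorator."""
    def decorator(func):
        # Append a Sphinx-style version note to the existing docstring.
        doc = (func.__doc__ or "").rstrip()
        func.__doc__ = doc + "\n\n.. versionadded:: " + version
        return func
    return decorator


class ToyModel:
    """Hypothetical model class, used only to exercise the decorator."""

    @property
    @since_sketch("2.1.0")
    def hasSummary(self):
        """Indicates whether a training summary exists for this model."""
        return False


# The property object picks up the annotated docstring from the getter.
print(ToyModel.hasSummary.__doc__)
```

The decorator runs before `property` wraps the function, so the version note ends up in the property's documentation.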
python/pyspark/ml/util.py
Outdated
        Gets summary of the model trained on the training set. An exception is thrown if
        no summary exists.
        """
        if self.hasSummary:
You could also call "summary" directly, though I suppose that would result in a different, possibly less useful error. I was also wondering whether RuntimeError is the right one, but I see other call sites above throw this when there is no summary.
Can you then remove the other "if self.hasSummary" checks above? When summary() is called, it will now throw this error already.
Test build #101911 has finished for PR 23676 at commit
        Gets summary (e.g. accuracy/precision/recall, objective history, total iterations) of model
        trained on the training set. An exception is thrown if `trainingSummary is None`.
        """
        if self.hasSummary:
Pardon if you were already going to follow up on this, but I think you don't need to check self.hasSummary here and in other classes anymore. The implementation in the new trait can do that in one place.
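The idea of doing the check in one place can be sketched as follows. This is an illustration under stated assumptions, not the code merged in this PR: `HasTrainingSummarySketch`, `_fetch_summary`, and `ToyRegressionModel` are hypothetical names, and the JVM call is replaced by a plain attribute lookup.

```python
class HasTrainingSummarySketch:
    """Hypothetical mixin that centralizes the summary check."""

    def _fetch_summary(self):
        # Placeholder for the JVM call (assumption); None means no summary.
        return getattr(self, "_summary", None)

    @property
    def hasSummary(self):
        """Indicates whether a training summary exists for this model."""
        return self._fetch_summary() is not None

    @property
    def summary(self):
        """Returns the training summary, or raises if none exists."""
        if not self.hasSummary:
            raise RuntimeError(
                "No training summary available for this %s"
                % self.__class__.__name__
            )
        return self._fetch_summary()


class ToyRegressionModel(HasTrainingSummarySketch):
    """Hypothetical model; inherits the check instead of repeating it."""

    def __init__(self, summary=None):
        self._summary = summary


trained = ToyRegressionModel(summary={"totalIterations": 3})
print(trained.summary["totalIterations"])

loaded = ToyRegressionModel()
try:
    loaded.summary
except RuntimeError as err:
    print(err)
```

A subclass that needs to wrap the result in a model-specific summary type would still have to override `summary`, which is where the per-class logic can reappear.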
OK I get it, yeah, the polymorphism doesn't come across, and has to be mirrored in Python.
Merged to master
Thanks a lot for your help! @srowen
…e in PySpark

## What changes were proposed in this pull request?

Python version of apache#17654

## How was this patch tested?

Existing Python unit test

Closes apache#23676 from huaxingao/spark26754.

Authored-by: Huaxin Gao <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
What changes were proposed in this pull request?
Python version of #17654
How was this patch tested?
Existing Python unit test