Skip to content

Conversation

@huaxingao
Copy link
Contributor

What changes were proposed in this pull request?

Python version of #17654

How was this patch tested?

Existing Python unit test

@SparkQA
Copy link

SparkQA commented Jan 28, 2019

Test build #101774 has finished for PR 23676 at commit 1d70ad1.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class LogisticRegressionModel(JavaModel, JavaClassificationModel, JavaMLWritable, JavaMLReadable,
  • class GaussianMixtureModel(JavaModel, JavaMLWritable, JavaMLReadable, HasTrainingSummary):
  • class KMeansModel(JavaModel, GeneralJavaMLWritable, JavaMLReadable, HasTrainingSummary):
  • class BisectingKMeansModel(JavaModel, JavaMLWritable, JavaMLReadable, HasTrainingSummary):
  • class LinearRegressionModel(JavaModel, JavaPredictionModel, GeneralJavaMLWritable, JavaMLReadable,
  • class HasTrainingSummary(object):

"""

@property
@since("3.0.0")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only issue I see here is that these properties were effectively added to subclasses in 2.0.0 and 2.1.0, so this seems somewhat misleading. Maybe we can declare these "since 2.1.0" as close enough?

Gets summary of the model trained on the training set. An exception is thrown if
no summary exists.
"""
if self.hasSummary:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also call "summary" directly though I suppose that would result in a different, possibly less useful error. I was also wondering whether RuntimeError is the right one, but I see other call sites above throw this when there is no summary.

Can you remove other checks for "if self.hasSummary" then, above? because when summary() is called it will now throw this error already.

@SparkQA
Copy link

SparkQA commented Jan 30, 2019

Test build #101911 has finished for PR 23676 at commit 6308e62.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Gets summary (e.g. accuracy/precision/recall, objective history, total iterations) of model
trained on the training set. An exception is thrown if `trainingSummary is None`.
"""
if self.hasSummary:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pardon if you were already going to follow up on this, but I think you don't need to check self.hasSummary here and other classes anymore. The implementation in the new trait can do that in one place.

@srowen
Copy link
Member

srowen commented Feb 1, 2019

OK I get it, yeah, the polymorphism doesn't come across, and has to be mirrored in Python.

@srowen
Copy link
Member

srowen commented Feb 1, 2019

Merged to master

@srowen srowen closed this in 5bb9647 Feb 1, 2019
@huaxingao
Copy link
Contributor Author

Thanks a lot for your help! @srowen

jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…e in PySpark

## What changes were proposed in this pull request?

Python version of apache#17654

## How was this patch tested?

Existing Python unit test

Closes apache#23676 from huaxingao/spark26754.

Authored-by: Huaxin Gao <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants