@mgaido91
Contributor

@mgaido91 mgaido91 commented Nov 6, 2017

What changes were proposed in this pull request?

SPARK-14516 introduced ClusteringEvaluator, but the documentation never referenced it, and the examples still relied on the sum of squared errors to show how to evaluate a clustering model.

This PR adds ClusteringEvaluator to the examples.
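
For readers unfamiliar with the two metrics mentioned above, here is a minimal plain-Java sketch that contrasts the sum of squared errors with a mean Silhouette score based on squared Euclidean distance. It is not Spark's implementation and does not use the Spark API. The class name `ClusteringMetricsSketch`, its method names, and the toy data are all hypothetical and chosen only for illustration. It follows the textbook Silhouette definition and assumes at least two non-empty clusters, with single-point clusters contributing a score of 0.

```java
public class ClusteringMetricsSketch {

    // Squared Euclidean distance between two points of equal dimension.
    public static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Sum of squared distances from each point to the centroid of its cluster.
    public static double sumOfSquaredErrors(double[][] points, int[] labels, int k) {
        int dim = points[0].length;
        double[][] centroids = new double[k][dim];
        int[] counts = new int[k];
        for (int i = 0; i < points.length; i++) {
            counts[labels[i]]++;
            for (int j = 0; j < dim; j++) {
                centroids[labels[i]][j] += points[i][j];
            }
        }
        for (int c = 0; c < k; c++) {
            if (counts[c] == 0) continue; // empty cluster: no centroid needed
            for (int j = 0; j < dim; j++) {
                centroids[c][j] /= counts[c];
            }
        }
        double sse = 0.0;
        for (int i = 0; i < points.length; i++) {
            sse += squaredDistance(points[i], centroids[labels[i]]);
        }
        return sse;
    }

    // Mean Silhouette over all points, using squared Euclidean distance.
    // For point i: a = mean distance to the other points in its own cluster,
    // b = smallest mean distance to the points of any other cluster,
    // s = (b - a) / max(a, b).
    public static double meanSilhouette(double[][] points, int[] labels, int k) {
        int n = points.length;
        int[] counts = new int[k];
        for (int label : labels) {
            counts[label]++;
        }
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            int own = labels[i];
            if (counts[own] == 1) continue; // single-point cluster contributes 0
            double[] sums = new double[k];
            for (int j = 0; j < n; j++) {
                if (j != i) {
                    sums[labels[j]] += squaredDistance(points[i], points[j]);
                }
            }
            double a = sums[own] / (counts[own] - 1);
            double b = Double.POSITIVE_INFINITY;
            for (int c = 0; c < k; c++) {
                if (c != own && counts[c] > 0) {
                    b = Math.min(b, sums[c] / counts[c]);
                }
            }
            if (Double.isInfinite(b)) continue; // no other cluster: undefined, count as 0
            double max = Math.max(a, b);
            total += max > 0.0 ? (b - a) / max : 0.0;
        }
        return total / n;
    }

    public static void main(String[] args) {
        // Toy data: two tight groups far apart on the x-axis.
        double[][] points = {{0, 0}, {1, 0}, {10, 0}, {11, 0}};
        int[] goodLabels = {0, 0, 1, 1}; // groups match the clusters
        int[] badLabels = {0, 1, 0, 1};  // each cluster mixes both groups

        System.out.println("SSE (good) = " + sumOfSquaredErrors(points, goodLabels, 2));
        System.out.println("SSE (bad) = " + sumOfSquaredErrors(points, badLabels, 2));
        System.out.println("Silhouette (good) = " + meanSilhouette(points, goodLabels, 2));
        System.out.println("Silhouette (bad) = " + meanSilhouette(points, badLabels, 2));
    }
}
```

The sum of squared errors depends on the scale of the data and on the number of points, whereas the Silhouette score always lies between -1 and 1, which makes it easier to compare across clusterings. In the toy data the well-separated labeling scores close to 1 and the mixed labeling scores below 0.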

How was this patch tested?

Manual runs of the examples.

@SparkQA

SparkQA commented Nov 6, 2017

Test build #83500 has finished for PR 19676 at commit 4c4f83e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91
Contributor Author

Sorry for pinging you. What do you think about adding ClusteringEvaluator to the examples, @yanboliang? Thanks.

@yanboliang
Contributor

It's good to have this. Sorry for the late response; I will make a pass tomorrow. Thanks.

Contributor

@yanboliang yanboliang left a comment

LGTM except for one minor comment. Thanks.

// Evaluate clustering by computing Silhouette score
ClusteringEvaluator evaluator = new ClusteringEvaluator()
  .setFeaturesCol("features")
  .setPredictionCol("prediction")
Contributor

These are the default values, so there is no need to set them explicitly. We should keep the examples as simple as possible. Thanks.

@SparkQA

SparkQA commented Dec 9, 2017

Test build #84681 has finished for PR 19676 at commit feb619d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

ClusteringEvaluator evaluator = new ClusteringEvaluator();

double silhouette = evaluator.evaluate(predictions);
System.out.println("Silhouette with squared euclidean distance = " + silhouette);
Member

euclidean -> Euclidean, but not important to change unless you're touching the code again anyway

Contributor Author

Thanks. I don't think I will be changing the code again, but I can fix the capitalization if you want.

@srowen
Member

srowen commented Dec 11, 2017

Merged to master
