[SPARK-14516][FOLLOWUP] Adding ClusteringEvaluator to examples #19676
Test build #83500 has finished for PR 19676 at commit
sorry for pinging you, what do you think about adding
It's good to have this. Sorry for the late response; I will make a pass tomorrow. Thanks.
yanboliang
left a comment
LGTM except one minor comment. Thanks.
    // Evaluate clustering by computing Silhouette score
    ClusteringEvaluator evaluator = new ClusteringEvaluator()
      .setFeaturesCol("features")
      .setPredictionCol("prediction")
We use default values here, so it's not necessary to set them explicitly. We should keep examples as simple as possible. Thanks.
Test build #84681 has finished for PR 19676 at commit
    ClusteringEvaluator evaluator = new ClusteringEvaluator();

    double silhouette = evaluator.evaluate(predictions);
    System.out.println("Silhouette with squared euclidean distance = " + silhouette);
euclidean -> Euclidean, but not important to change unless you're touching the code again anyway
Thanks, I don't think I am changing the code again, but I can fix this grammatical error if you want.
Merged to master
What changes were proposed in this pull request?
SPARK-14516 introduced ClusteringEvaluator, but the documentation did not reference it, and the examples still relied on the sum of squared errors to show how to evaluate a clustering model.
This PR adds ClusteringEvaluator to the examples.
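For readers unfamiliar with the metric the updated examples print, the following is a minimal plain-Java sketch of a mean Silhouette score using squared Euclidean distance. It is not Spark's implementation and does not use the Spark API: the class name `SilhouetteSketch` and its methods are hypothetical, the computation is a naive O(n²) loop, and treating singleton clusters as contributing 0 is an assumed convention.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Spark.
public class SilhouetteSketch {

    // Squared Euclidean distance between two vectors of equal length.
    static double squaredEuclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Mean silhouette over all points; labels[i] is the cluster id of points[i].
    static double silhouette(double[][] points, int[] labels) {
        // Count how many points fall in each cluster.
        Map<Integer, Integer> sizes = new HashMap<>();
        for (int label : labels) {
            sizes.merge(label, 1, Integer::sum);
        }
        if (sizes.size() < 2) {
            throw new IllegalArgumentException("Silhouette needs at least two clusters");
        }

        double total = 0.0;
        for (int i = 0; i < points.length; i++) {
            int ownSize = sizes.get(labels[i]);
            if (ownSize == 1) {
                continue; // assumed convention: a singleton cluster contributes 0
            }
            // Sum of distances from point i to every other point, grouped by cluster.
            Map<Integer, Double> sums = new HashMap<>();
            for (int j = 0; j < points.length; j++) {
                if (i != j) {
                    sums.merge(labels[j], squaredEuclidean(points[i], points[j]), Double::sum);
                }
            }
            // a: mean distance to the other points of the same cluster.
            double a = sums.get(labels[i]) / (ownSize - 1);
            // b: smallest mean distance to the points of any other cluster.
            double b = Double.POSITIVE_INFINITY;
            for (Map.Entry<Integer, Double> e : sums.entrySet()) {
                if (e.getKey() != labels[i]) {
                    b = Math.min(b, e.getValue() / sizes.get(e.getKey()));
                }
            }
            double denom = Math.max(a, b);
            if (denom > 0.0) {
                total += (b - a) / denom; // per-point silhouette in [-1, 1]
            }
        }
        return total / points.length;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 1}, {10, 10}, {10, 11}};
        int[] labels = {0, 0, 1, 1};
        System.out.println("Silhouette with squared Euclidean distance = "
            + silhouette(points, labels));
    }
}
```

A score near 1 suggests compact, well-separated clusters, while a score near or below 0 suggests points assigned to the wrong cluster. Unlike the sum of squared errors, the score is bounded, which makes it easier to compare across runs.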
How was this patch tested?
Manual runs of the examples.