[SPARK-19337] [ML] [Doc] Documentation and examples for LinearSVC #16968

hhbyyh · 2017-02-16T22:45:20Z

What changes were proposed in this pull request?

Documentation and examples (Java, Scala, Python, R) for LinearSVC.

How was this patch tested?

local doc generation

SparkQA · 2017-02-16T23:01:14Z

Test build #73020 has finished for PR 16968 at commit 7a0829f.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
public class JavaLinearSVCExample

wangmiao1981

@hhbyyh In R, LinearSVC is exposed as spark.svmLinear. I just created PR #16969 for the R example and vignettes. Can you drop the R example from this PR? After both PRs are merged, I will update the R part of the document section. Alternatively, you can copy the R example into this PR.

hhbyyh · 2017-02-17T01:56:52Z

I see. I will drop the R example here; whichever PR goes in later can finish the document update.

felixcheung · 2017-02-17T04:14:05Z

docs/ml-classification-regression.md

+regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has
+the largest distance to the nearest training-data point of any class (so-called functional margin),
+since in general the larger the margin the lower the generalization error of the classifier. LinearSVC
+in Spark ML supports binary calssification with linear SVM. Internally, it optimizes the 


calssification -> classification

felixcheung · 2017-02-17T04:14:42Z

docs/ml-classification-regression.md

+A [support vector machine](https://en.wikipedia.org/wiki/Support_vector_machine) constructs a hyperplane
+or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification,
+regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has
+the largest distance to the nearest training-data point of any class (so-called functional margin),


"largest distance" -> "longest distance"? I think?

Thanks for the comment. I think both "large" and "long" can describe a distance, whereas "large" is more suitable for describing the numeric margin. Please let me know if you have a strong preference.
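
To make the "distance to the nearest training-data point" in the quoted doc text concrete, here is a minimal sketch in plain Python. It is not taken from the PR or from Spark's implementation; the function name `geometric_margin`, the toy points, and the two candidate hyperplanes are all assumptions chosen for illustration.

```python
import math

def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any labeled point to the hyperplane w.x + b = 0.

    Labels are +1 or -1; a positive result means every point is on its correct side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )

# Hypothetical, linearly separable toy data.
points = [(1.0, 1.0), (2.0, 2.0), (-1.0, -1.0), (-2.0, -1.0)]
labels = [1, 1, -1, -1]

# Two candidate separating hyperplanes (both assumptions for illustration).
margin_a = geometric_margin((1.0, 1.0), 0.0, points, labels)
margin_b = geometric_margin((1.0, 0.0), 0.0, points, labels)

print(margin_a, margin_b)
```

Both hyperplanes separate the toy data, but the first leaves a larger distance to its nearest point, which is the sense of "large" used in the sentence under review.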

felixcheung · 2017-02-17T07:59:44Z

title should say
[SPARK-19337] [ML] [Dcoc]
->
[SPARK-19337] [ML] [Doc]

hhbyyh · 2017-02-17T20:35:49Z

Thanks for the comment @felixcheung

SparkQA · 2017-02-17T20:49:19Z

Test build #73071 has finished for PR 16968 at commit b888f35.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

felixcheung

LGTM.
I'll leave this open for a day in case anyone else wants to comment.

felixcheung · 2017-02-17T21:34:17Z

docs/ml-classification-regression.md

+regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has
+the largest distance to the nearest training-data points of any class (so-called functional margin),
+since in general the larger the margin the lower the generalization error of the classifier. LinearSVC
+in Spark ML supports binomial classification with linear SVM. Internally, it optimizes the 


Actually, is there a reason you changed this to binomial classification?

Just to be consistent with LR, but I'm not sure it's the common expression.

Do you have a link? I think binary classification is more commonly used.

FWIW I have never heard the term binomial classification and it doesn't show up in a Google search. I think it was a typo.

yes, let's fix that

felixcheung

let's change binomial classification

hhbyyh · 2017-02-19T19:31:13Z

Thanks for the review. Updated to binary.
Also added the reference to the R example.

SparkQA · 2017-02-19T19:38:26Z

Test build #73132 has finished for PR 16968 at commit 165fbe4.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

felixcheung

LGTM @wangmiao1981

felixcheung · 2017-02-21T17:38:28Z

merged to master.

## What changes were proposed in this pull request?

Documentation and examples (Java, scala, python, R) for LinearSVC

## How was this patch tested?

local doc generation

Author: Yuhao Yang <[email protected]>

Closes apache#16968 from hhbyyh/mlsvmdoc.

linearsvc doc and example

7a0829f

wangmiao1981 reviewed Feb 17, 2017

felixcheung mentioned this pull request Feb 17, 2017

[SPARK-19639][SPARKR][Example]:Add spark.svmLinear example and update vignettes #16969

Closed

felixcheung reviewed Feb 17, 2017

hhbyyh changed the title ~~[SPARK-19337] [ML] [Dcoc] Documentation and examples for LinearSVC~~ [SPARK-19337] [ML] [Doc] Documentation and examples for LinearSVC Feb 17, 2017

remove r example and some spell correction

b888f35

felixcheung approved these changes Feb 17, 2017

felixcheung reviewed Feb 17, 2017

felixcheung requested changes Feb 19, 2017

YY-OnCall added 2 commits February 19, 2017 10:42

Merge remote-tracking branch 'upstream/master' into mlsvmdoc

38d8665

change to binary and add r

165fbe4

felixcheung approved these changes Feb 19, 2017

asfgit closed this in 280afe0 Feb 21, 2017
