[SPARK-20029][ML] ML LinearRegression supports bound constrained optimization. #17360
Conversation
Test build #74876 has finished for PR 17360 at commit
I don't think this is the best approach. We're further confounding the algorithm API with parameters of the optimizer used to fit the algorithm. I strongly prefer to put more effort into getting this right via SPARK-17136. For what it's worth, I have an initial PR basically ready that provides an API that makes adding this functionality trivial.
Test build #74979 has finished for PR 17360 at commit
@sethah I left some questions on SPARK-17136. I think the main question we should figure out is whether we still expose the optimizer params as estimator params after SPARK-17136. I would prefer to keep these params on the estimators, make the optimizer layer an internal API, and let users implement their own optimizer, similar to Spark SQL's external data source support. I find this more aligned with the original ML pipeline design, which stores params outside a pipeline component.
@yanboliang Thanks for your feedback! The design of the optimizer interface, or even whether it should be included at all, is definitely open for discussion and your suggestions are much appreciated. If SPARK-17136 proceeds as you suggest (an internal optimization API that allows users to register optimizers), then it is possible that this PR does not conflict with that JIRA (though I'm not sure of the details, so even that is uncertain). However, that matter is far from settled. If we end up deciding to provide the external optimizer API as currently suggested in that JIRA, then these two do conflict. If we add the ability to specify parameter bounds on the estimator and then add an optimizer API, we will have added yet more optimizer parameters to the estimator that can conflict with parameters of the optimizer provided to the estimator. My point is that these are two competing approaches, and we should settle on one over the other before we make API changes that cannot be undone. I'm open to potentially changing the design of SPARK-17136, but we need to decide on something first.
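To make the conflict concrete, here is a purely illustrative Scala sketch. Neither the trait nor the field names below come from SPARK-17136 or from this PR; they only show how the same setting could end up configurable in two places.

```scala
import org.apache.spark.ml.linalg.Vector

// Hypothetical pluggable optimizer interface of the kind SPARK-17136 discusses.
// Here the bound constraints live on the optimizer itself.
trait BoundedMinimizer {
  def lowerBounds: Option[Vector]
  def upperBounds: Option[Vector]
  def minimize(initialCoefficients: Vector): Vector
}

// Hypothetical estimator that also carries bound params directly (as this PR proposes).
// With both mechanisms present, fit() would need a precedence rule whenever they disagree.
class BoundedLinearRegressionSketch {
  var lowerBoundsOnCoefficients: Option[Vector] = None  // estimator-level param (this PR)
  var optimizer: Option[BoundedMinimizer] = None        // optimizer-level config (SPARK-17136)
}
```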
Test build #97747 has finished for PR 17360 at commit
Test build #97754 has finished for PR 17360 at commit
Test build #97789 has finished for PR 17360 at commit
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
MLlib LinearRegression should support bound constrained optimization. Users can add bound constraints on the coefficients to make the solver produce a solution within the specified range. Under the hood, we call breeze L-BFGS-B as the solver for bound constrained optimization. Only L2 regularization is supported currently.

Todo:
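Separately from the Todo above (whose items were not filled in here), the following is a minimal usage sketch. The setters setLowerBoundsOnCoefficients / setUpperBoundsOnCoefficients are assumptions modeled on the description in this PR and the analogous constrained LogisticRegression API; they may not match the final API.

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.LinearRegression

// Constrain every coefficient of a 2-feature model to [0.0, 10.0].
// NOTE: the bound setters below are hypothetical; they illustrate the proposal
// rather than a confirmed public API.
val lr = new LinearRegression()
  .setRegParam(0.1)
  .setElasticNetParam(0.0) // only L2 regularization is supported with bounds
  .setLowerBoundsOnCoefficients(Vectors.dense(0.0, 0.0))
  .setUpperBoundsOnCoefficients(Vectors.dense(10.0, 10.0))

// `training` is assumed to be a DataFrame with "features" and "label" columns.
val model = lr.fit(training)
println(model.coefficients) // each value should fall within [0.0, 10.0]
```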
How was this patch tested?
Unit tests.
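As a hedged illustration of the kind of check such a unit test might make (the actual test code in the PR is not reproduced here, and the bound setters remain assumptions):

```scala
// ScalaTest-style sketch; `dataset` is assumed to come from the test fixture.
test("linear regression respects coefficient bounds") {
  val lr = new LinearRegression()
    .setRegParam(0.1)
    .setElasticNetParam(0.0) // L2 only, per the PR description
    .setLowerBoundsOnCoefficients(Vectors.dense(0.0, 0.0)) // hypothetical setter
    .setUpperBoundsOnCoefficients(Vectors.dense(5.0, 5.0)) // hypothetical setter
  val model = lr.fit(dataset)
  // Every fitted coefficient should stay within the requested bounds.
  model.coefficients.toArray.foreach { c =>
    assert(c >= 0.0 && c <= 5.0)
  }
}
```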