Quadratic Minimization solver using ADMM based Proximal Algorithm #321
Changes from 15 commits
@@ -1 +1,13 @@
Breeze is distributed under an Apache License V2.0 (See LICENSE)

===============================================================================

Proximal algorithms outlined in Proximal.scala (package breeze.optimize.quadratic)
are based on https://github.com/cvxgrp/proximal (see LICENSE for details) and distributed with
Copyright (c) 2014 by Debasish Das (Verizon), all rights reserved.

===============================================================================

QuadraticMinimizer class in package breeze.optimize.proximal is distributed with Copyright (c)
2014, Debasish Das (Verizon), all rights reserved.
@@ -51,4 +51,3 @@ pomExtra := (
          <url>http://cs.berkeley.edu/~dlwh/</url>
        </developer>
      </developers>)
@@ -0,0 +1,203 @@
    /*
     * Licensed to the Apache Software Foundation (ASF) under one or more
     * contributor license agreements. See the NOTICE file distributed with
     * this work for additional information regarding copyright ownership.
     * The ASF licenses this file to You under the Apache License, Version 2.0
     * (the "License"); you may not use this file except in compliance with
     * the License. You may obtain a copy of the License at
     *
     *    http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    package breeze.optimize.linear

    import breeze.linalg.{DenseMatrix, DenseVector, axpy}
    import breeze.optimize.proximal.QpGenerator
    import breeze.stats.distributions.Rand
    import breeze.util.Implicits._

    /**
     * Object used to solve nonnegative least squares problems using a modified
     * projected gradient method.
     */
    class NNLS(val maxIters: Int = -1) {
      type BDM = DenseMatrix[Double]
      type BDV = DenseVector[Double]

      case class State(x: BDV, grad: BDV, dir: BDV, lastDir: BDV, res: BDV,
                       lastNorm: Double, lastWall: Int, iter: Int, converged: Boolean)

      // find the optimal unconstrained step
      def steplen(ata: BDM, dir: BDV, res: BDV,
                  tmp: BDV): Double = {
        val top = dir.dot(res)
        tmp := ata * dir
        // Push the denominator upward very slightly to avoid infinities and silliness
        top / (tmp.dot(dir) + 1e-20)
      }

**Member:** Any reason to not make all these helper methods private?

**Contributor (Author):** Added `private` to both NNLS and QuadraticMinimizer.
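The `steplen` helper computes the exact minimizing step along a search direction for a quadratic objective: alpha = (d . r) / (d' (A'A) d), with 1e-20 added to the denominator to guard against division by zero. A plain-array sketch of the same formula (the helper names here are illustrative, not the Breeze API):

```scala
// Optimal unconstrained step along direction d for 0.5 x'(A'A)x - x'(A'b).
def dot(a: Array[Double], b: Array[Double]): Double =
  a.zip(b).map { case (p, q) => p * q }.sum

def matVec(m: Array[Array[Double]], v: Array[Double]): Array[Double] =
  m.map(row => dot(row, v))

def steplen(ata: Array[Array[Double]], dir: Array[Double], res: Array[Double]): Double =
  dot(dir, res) / (dot(matVec(ata, dir), dir) + 1e-20)

// For ata = 2I and dir = res = (1, 1): numerator = 2, d'(A'A)d = 4, so alpha ≈ 0.5
val ata = Array(Array(2.0, 0.0), Array(0.0, 2.0))
val alpha = steplen(ata, Array(1.0, 1.0), Array(1.0, 1.0))
```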
      // stopping condition
      def stop(step: Double, ndir: Double, nx: Double): Boolean = {
        ((step.isNaN) // NaN
          || (step < 1e-7) // too small or negative
          || (step > 1e40) // too large; almost certainly numerical problems
          || (ndir < 1e-12 * nx) // gradient relatively too small
          || (ndir < 1e-32) // gradient absolutely too small; numerical issues may lurk
        )
      }
      /**
       * Solve a least squares problem, possibly with nonnegativity constraints, by a modified
       * projected gradient method. That is, find x minimising ||Ax - b||_2 given A^T A and A^T b.
       *
       * We solve the problem
       *   min_x 1/2 x' ata x - x' atb
       *   subject to x >= 0
       *
       * The method used is similar to one described by Polyak (B. T. Polyak, The conjugate gradient
       * method in extremal problems, Zh. Vychisl. Mat. Mat. Fiz. 9(4)(1969), pp. 94-112) for bound-
       * constrained nonlinear programming. Polyak unconditionally uses a conjugate gradient
       * direction, however, while this method only uses a conjugate gradient direction if the last
       * iteration did not cause a previously-inactive constraint to become active.
       */
      def initialState(ata: DenseMatrix[Double], atb: DenseVector[Double], n: Int): State = {
        require(ata.cols == ata.rows, s"NNLS:iterations gram matrix must be symmetric")
        require(ata.rows == atb.length, s"NNLS:iterations gram matrix rows must equal the length of the linear term")

        val grad = DenseVector.zeros[Double](n)
        val x = DenseVector.zeros[Double](n)
        val dir = DenseVector.zeros[Double](n)
        val lastDir = DenseVector.zeros[Double](n)
        val res = DenseVector.zeros[Double](n)
        val lastNorm = 0.0
        val lastWall = 0
        State(x, grad, dir, lastDir, res, lastNorm, lastWall, 0, false)
      }

**Member:** Can you move this to `iterations` or `minimize` or something?

**Contributor (Author):** This is called from two places in `iterations`, which is why I defined it as a function.

**Member:** I meant the comments.
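The docstring's objective f(x) = 1/2 x' ata x - x' atb has gradient ata * x - atb, which is exactly the residual the iteration loop recomputes each step. A quick plain-Scala check at the unconstrained minimizer x* = (A'A)^-1 (A'b), where the gradient should vanish (arrays and values here are illustrative, not from the PR):

```scala
// Gradient of 0.5 x'(A'A)x - x'(A'b) is (A'A)x - A'b; zero at the minimizer.
val ata = Array(Array(2.0, 0.0), Array(0.0, 4.0))
val atb = Array(1.0, 2.0)
val x = Array(0.5, 0.5) // the unconstrained minimizer for this diagonal system

val grad = ata.zipWithIndex.map { case (row, i) =>
  row.zip(x).map { case (a, v) => a * v }.sum - atb(i)
}
// grad = (0.0, 0.0): the residual vanishes at the minimizer
```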
      def iterations(ata: DenseMatrix[Double],
                     atb: DenseVector[Double]): Iterator[State] =
        Iterator.iterate(initialState(ata, atb, atb.length)) { state =>
          import state._
          val n = atb.length
          val tmp = DenseVector.zeros[Double](atb.length)

          val iterMax = if (maxIters < 0) Math.max(400, 20 * n) else maxIters

          // find the residual
          res := ata * x
          res -= atb
          grad := res

          // project the gradient
          var i = 0
          while (i < n) {
            if (grad.data(i) > 0.0 && x.data(i) == 0.0) {
              grad.data(i) = 0.0
            }
            i = i + 1
          }
          val ngrad = grad.dot(grad)
          dir := grad

**Member:** In general we prefer `grad(i)` or `grad.unsafeValueAt(i)`, unless this is making a noticeable difference. Also, `cforRange` is better than a while loop.

**Contributor (Author):** Added `cforRange`.
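The projection step can be isolated as a small pure function: a positive gradient component at an active bound (x_i == 0) would drive x_i negative on the next update, so it is clamped to zero. A plain-array sketch (the `projectGrad` helper is illustrative, not part of the class):

```scala
// Project the gradient onto the feasible cone of the constraint x >= 0.
def projectGrad(grad: Array[Double], x: Array[Double]): Array[Double] =
  grad.indices.map(i => if (grad(i) > 0.0 && x(i) == 0.0) 0.0 else grad(i)).toArray

val g = projectGrad(Array(1.5, -2.0, 3.0), Array(0.0, 0.0, 1.0))
// g = (0.0, -2.0, 3.0): the first component is clamped (positive gradient at an
// active bound); the others pass through unchanged
```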
          // use a CG direction under certain conditions
          var step = steplen(ata, grad, res, tmp)
          var ndir = 0.0
          val nx = x.dot(x)

          if (iter > lastWall + 1) {
            val alpha = ngrad / lastNorm
            axpy(alpha, lastDir, dir)
            val dstep = steplen(ata, dir, res, tmp)
            ndir = dir.dot(dir)
            if (stop(dstep, ndir, nx)) {
              // reject the CG step if it could lead to premature termination
              dir := grad
              ndir = dir.dot(dir)
            } else {
              step = dstep
            }
          } else {
            ndir = dir.dot(dir)
          }
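When no constraint wall was hit recently, the solver mixes the previous direction into the new one with Polyak's coefficient beta = ||g_k||^2 / ||g_{k-1}||^2 (the `alpha = ngrad / lastNorm` line, applied via `axpy`). A small worked example on plain arrays with illustrative values:

```scala
// Polyak-style conjugate direction: dir = grad + beta * lastDir.
val grad = Array(1.0, 2.0)
val lastDir = Array(0.5, -0.5)
val lastNorm = 2.0 // ||g_{k-1}||^2 carried in the solver's State

val beta = grad.map(v => v * v).sum / lastNorm // (1 + 4) / 2 = 2.5
val dir = grad.zip(lastDir).map { case (g, d) => g + beta * d }
// dir = (1 + 2.5 * 0.5, 2 - 2.5 * 0.5) = (2.25, 0.75)
```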
          // terminate?
          if (stop(step, ndir, nx) || iter > iterMax)
            State(x, grad, dir, lastDir, res, lastNorm, lastWall, iter + 1, true)
          else {
            // don't run through the walls
            i = 0
            while (i < n) {
              if (step * dir.data(i) > x.data(i)) {
                step = x.data(i) / dir.data(i)
              }
              i = i + 1
            }

            var nextWall = lastWall

            // take the step
            i = 0
            while (i < n) {
              if (step * dir.data(i) > x.data(i) * (1 - 1e-14)) {
                x.data(i) = 0
                nextWall = iter
              } else {
                x.data(i) -= step * dir.data(i)
              }
              i = i + 1
            }
            lastDir := dir
            val nextNorm = ngrad
            State(x, grad, dir, lastDir, res, nextNorm, nextWall, iter + 1, false)
          }
        }.takeUpToWhere(_.converged)
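The "don't run through the walls" loop shrinks the step so that the update x -= step * dir never drives a coordinate below zero. Sketched as a standalone function on plain arrays (the `clipStep` name is illustrative):

```scala
// Shrink step so that x - step * dir stays in the nonnegative orthant.
def clipStep(step0: Double, x: Array[Double], dir: Array[Double]): Double =
  x.indices.foldLeft(step0) { (s, i) =>
    if (s * dir(i) > x(i)) x(i) / dir(i) else s
  }

val clipped = clipStep(1.0, Array(0.3, 2.0), Array(0.6, 0.5))
// coordinate 0 would go negative (1.0 * 0.6 > 0.3), so step shrinks to 0.3 / 0.6 = 0.5;
// coordinate 1 then stays feasible (0.5 * 0.5 < 2.0)
```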
      def minimizeAndReturnState(ata: DenseMatrix[Double],
                                 atb: DenseVector[Double]): State = {
        iterations(ata, atb).last
      }

      def minimize(ata: DenseMatrix[Double], atb: DenseVector[Double]): DenseVector[Double] = {
        minimizeAndReturnState(ata, atb).x
      }
    }
    object NNLS {
      /** Compute the objective value */
      def computeObjectiveValue(ata: DenseMatrix[Double], atb: DenseVector[Double], x: DenseVector[Double]): Double = {
        val res = (x.t * ata * x) * 0.5 - atb.dot(x)
        res
      }

      def apply(iters: Int) = new NNLS(iters)

      def main(args: Array[String]) {
        if (args.length < 2) {
          println("Usage: NNLS n s")
          println("Test NNLS with quadratic function of dimension n for s consecutive solves")
          sys.exit(1)
        }

        val problemSize = args(0).toInt
        val numSolves = args(1).toInt
        val nnls = new NNLS()

        var i = 0
        var nnlsTime = 0L
        while (i < numSolves) {
          val ata = QpGenerator.getGram(problemSize)
          val atb = DenseVector.rand[Double](problemSize, Rand.gaussian(0, 1))
          val startTime = System.nanoTime()
          nnls.minimize(ata, atb)
          nnlsTime = nnlsTime + (System.nanoTime() - startTime)
          i = i + 1
        }
        println(s"NNLS problemSize $problemSize solves $numSolves ${nnlsTime / 1e6} ms")
      }
    }
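`computeObjectiveValue` evaluates 0.5 x'(A'A)x - (A'b).x, which is the quantity the solver minimizes. A plain-array version for checking solver output by hand (helper names are hypothetical, not the Breeze API):

```scala
// Quadratic objective 0.5 x'(A'A)x - (A'b) . x on plain arrays.
def objective(ata: Array[Array[Double]], atb: Array[Double], x: Array[Double]): Double = {
  val ax = ata.map(row => row.zip(x).map { case (a, v) => a * v }.sum)
  0.5 * ax.zip(x).map { case (a, v) => a * v }.sum -
    atb.zip(x).map { case (b, v) => b * v }.sum
}

val obj = objective(Array(Array(2.0, 0.0), Array(0.0, 2.0)), Array(1.0, 1.0), Array(1.0, 1.0))
// 0.5 * (2 + 2) - (1 + 1) = 0.0, so (1, 1) is a stationary point of this system
```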
@@ -0,0 +1,78 @@
    /*
     * Licensed to the Apache Software Foundation (ASF) under one or more
     * contributor license agreements. See the NOTICE file distributed with
     * this work for additional information regarding copyright ownership.
     * The ASF licenses this file to You under the Apache License, Version 2.0
     * (the "License"); you may not use this file except in compliance with
     * the License. You may obtain a copy of the License at
     *
     *    http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    package breeze.optimize.linear

    import breeze.linalg.operators.OpMulMatrix
    import breeze.math.MutableInnerProductModule
    import breeze.numerics.abs
    import breeze.util.SerializableLogging
    import breeze.linalg.{DenseMatrix, DenseVector, norm}
    import breeze.util.Implicits._
    import breeze.optimize.proximal.QuadraticMinimizer

    /**
     * Created by debasish83 on 2/3/15.
     */
    class PowerMethod[T, M](maxIterations: Int = 10, tolerance: Double = 1E-5, inverse: Boolean = false)
                           (implicit space: MutableInnerProductModule[T, Double],
                            mult: OpMulMatrix.Impl2[M, T, T]) extends SerializableLogging {

      import space._

      case class State(eigenValue: Double, eigenVector: T, iter: Int, converged: Boolean)
      def initialState(y: T, A: M): State = {
        // Force y to be a vector of unit norm
        val normInit = norm(y)
        y *= 1.0 / normInit
        // TODO: how to fix the asInstanceOf; at least throw an exception/require
        val ay = if (inverse)
          QuadraticMinimizer.solveTriangular(A.asInstanceOf[DenseMatrix[Double]], y.asInstanceOf[DenseVector[Double]]).asInstanceOf[T]
        else mult(A, y)

        val lambda = y dot ay

        y := ay
        val norm1 = norm(ay)
        y *= 1.0 / norm1
        if (lambda < 0.0) y *= -1.0
        State(lambda, y, 0, false)
      }
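`initialState` normalizes y and uses the Rayleigh quotient y . (Ay) as the first eigenvalue estimate. A plain-Scala illustration on a 2x2 symmetric matrix (a standalone sketch, not the generic Breeze class):

```scala
// One Rayleigh-quotient evaluation, mirroring initialState.
val a = Array(Array(2.0, 1.0), Array(1.0, 2.0))
val y0 = Array(1.0, 0.0)

val n = math.sqrt(y0.map(v => v * v).sum) // normalize y to unit norm
val y = y0.map(_ / n)
val ay = a.map(row => row.zip(y).map { case (p, q) => p * q }.sum)
val lambda = y.zip(ay).map { case (p, q) => p * q }.sum
// lambda = 2.0 for this starting vector; iterating drives the estimate
// toward the dominant eigenvalue 3.0
```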
      def iterations(y: T,
                     A: M): Iterator[State] = Iterator.iterate(initialState(y, A)) { state =>
        import state._
        val ay = if (inverse)
          QuadraticMinimizer.solveTriangular(A.asInstanceOf[DenseMatrix[Double]], eigenVector.asInstanceOf[DenseVector[Double]]).asInstanceOf[T]
        else mult(A, eigenVector)
        val lambda = eigenVector dot ay
        val norm1 = norm(ay)
        ay *= 1.0 / norm1
        if (lambda < 0.0) ay *= -1.0

        val valDif = abs(lambda - eigenValue)
        if (valDif <= tolerance || iter > maxIterations) State(lambda, ay, iter + 1, true)
        else State(lambda, ay, iter + 1, false)
      }.takeUpToWhere(_.converged)

**Contributor (Author):** @dlwh For DenseMatrix[Double] I am using the inverse power method here. I can catch ClassCastException and add a WARN that the inverse power method is currently only defined for DenseMatrix[Double]. Is there an elegant way to fix it? Down the line I would like to support SparseMatrix[T] and DenseMatrix[T], but right now LU and Cholesky both use Double, so I could not specialize QuadraticMinimizer on Float/Double.

**Member:** I mean, there's just no reason to make PowerMethod generic in M and T if it actually isn't generic.

**Contributor (Author):** I kept PowerMethod generic in M and T (since I want to use it over LBFGS.ApproximateInverseHessian and PQN.CompactHessian if required). For InversePowerMethod I introduced a new class specialized on DenseMatrix[Double], DenseVector[Double] to be used from QuadraticMinimizer. Now it is clean.

**Member:** Sounds good. Made a few more comments.
      def iterateAndReturnState(y: T, A: M): State = {
        iterations(y, A).last
      }

      def eigen(y: T, A: M): Double = {
        iterateAndReturnState(y, A).eigenValue
      }
    }
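The whole loop can be condensed into a few lines of plain Scala to see the fixed point it converges to; for [[2, 1], [1, 2]] the dominant eigenvalue is 3 with eigenvector (1, 1)/sqrt(2). This is a standalone sketch, not the generic Breeze class above:

```scala
// Plain power iteration on a symmetric 2x2 matrix.
def matVec(m: Array[Array[Double]], v: Array[Double]): Array[Double] =
  m.map(row => row.zip(v).map { case (a, b) => a * b }.sum)

def powerMethod(a: Array[Array[Double]], iters: Int): Double = {
  var y = Array(1.0, 0.0)
  var lambda = 0.0
  for (_ <- 0 until iters) {
    val ay = matVec(a, y)
    // Rayleigh quotient as the current eigenvalue estimate
    lambda = y.zip(ay).map { case (p, q) => p * q }.sum / y.map(v => v * v).sum
    val n = math.sqrt(ay.map(v => v * v).sum)
    y = ay.map(_ / n) // renormalize for the next step
  }
  lambda
}

val top = powerMethod(Array(Array(2.0, 1.0), Array(1.0, 2.0)), 50)
// top ≈ 3.0, the dominant eigenvalue
```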
@@ -0,0 +1,22 @@
    /*
     * Licensed to the Apache Software Foundation (ASF) under one or more
     * contributor license agreements. See the NOTICE file distributed with
     * this work for additional information regarding copyright ownership.
     * The ASF licenses this file to You under the Apache License, Version 2.0
     * (the "License"); you may not use this file except in compliance with
     * the License. You may obtain a copy of the License at
     *
     *    http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    package breeze.optimize.proximal

    object Constraint extends Enumeration {
      type Constraint = Value
      val SMOOTH, POSITIVE, BOX, SPARSE, EQUALITY = Value
    }
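As a Scala `Enumeration`, `Constraint` supports exhaustive pattern matching and lookup by name. A standalone sketch mirroring the definition above (the `describe` strings are illustrative glosses, not definitions from this PR):

```scala
// Mirror of the Constraint enumeration and a sample exhaustive match.
object Constraint extends Enumeration {
  type Constraint = Value
  val SMOOTH, POSITIVE, BOX, SPARSE, EQUALITY = Value
}

import Constraint._

def describe(c: Constraint): String = c match {
  case SMOOTH   => "no projection"
  case POSITIVE => "x >= 0"
  case BOX      => "l <= x <= u"
  case SPARSE   => "L1 regularization"
  case EQUALITY => "Ax = b"
}

val named = Constraint.withName("POSITIVE") // lookup by name
```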
**Member:** Fix the license headers; this is not an Apache project. Just use your name, like I do in the files I've made.

**Contributor (Author):** Fixed all the license headers and added tests for the minimize API that uses updateGram.