Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add test cases, disable flatness check in HZ #180

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from
Draft

Conversation

timholy
Copy link
Contributor

@timholy timholy commented Oct 22, 2024

This may cause problems of its own, but for the time being it's better
than the status quo.

Fixes #173
Closes #174
Fixes #175

I am pretty sure that without the flatness check, I've seen HZ iterate until the linesearchmax limit is hit, which is rather inefficient. This comes nowhere close to hitting that limit, though. So until we have a better test case we should probably disable the flatness check.

Note: this PR should probably not be squash-merged, the three commits are crafted to be independent.

The flatness check in HagerZhang was intended to prevent excessive
iteration in cases where the objective function is flat to within
numerical precision. This test is a poor attempt at capturing the issue.
HZ takes more iterations than any other linesearch algorithm, but it's
still a far cry from real-world cases I've seen where the linesearch
exits from the iteration limit.
This may cause problems of its own, but for the time being it's better
than the status quo.

Fixes #173
Closes #174
Fixes #175
@timholy
Copy link
Contributor Author

timholy commented Oct 22, 2024

Aha, it looks like CI caught a case where the new flatness test (without the flatness check) spent 32 iterations in a flat basin. That's a little closer to the behavior I've seen.

I'll leave this open and wait for some feedback.

@timholy
Copy link
Contributor Author

timholy commented Oct 22, 2024

Oh interesting, it's also the StrongWolfe algorithm. With this diff:

diff --git a/test/issues.jl b/test/issues.jl
index 7b40677..b37a851 100644
--- a/test/issues.jl
+++ b/test/issues.jl
@@ -57,14 +57,15 @@ end
                BackTracking(; cache=cache), BackTracking(; order=2, cache=cache) )

     n = 0
-    while n < 10
+    while n < 1000
         ϕ, dϕ, ϕdϕ = makeϕdϕ(randn(2))
         ϕ0, dϕ0 = ϕdϕ(0)
         dϕ0 < -eps(abs(ϕ0)) || continue    # any "slope" is just roundoff error, but we want roundoff that looks like descent
         n += 1
         for ls in lsalgs
             res = ls(ϕ, dϕ, ϕdϕ, 1.0, ϕ(0.0), dϕ(0.0))
-            @test length(cache.alphas) < 10   # really should be < 5
+            length(cache.alphas) >= 10 && println(typeof(ls), ": ", length(cache.alphas))
+            # @test length(cache.alphas) < 10   # really should be < 5
         end
     end
 end

I get the following printout:

HagerZhang{Float64, Base.RefValue{Bool}}: 10
StrongWolfe{Float64}: 32
HagerZhang{Float64, Base.RefValue{Bool}}: 17
StrongWolfe{Float64}: 32
StrongWolfe{Float64}: 32
HagerZhang{Float64, Base.RefValue{Bool}}: 40
HagerZhang{Float64, Base.RefValue{Bool}}: 15
HagerZhang{Float64, Base.RefValue{Bool}}: 13
HagerZhang{Float64, Base.RefValue{Bool}}: 16
HagerZhang{Float64, Base.RefValue{Bool}}: 33
StrongWolfe{Float64}: 32
HagerZhang{Float64, Base.RefValue{Bool}}: 11
HagerZhang{Float64, Base.RefValue{Bool}}: 35
HagerZhang{Float64, Base.RefValue{Bool}}: 16
HagerZhang{Float64, Base.RefValue{Bool}}: 13
StrongWolfe{Float64}: 32
HagerZhang{Float64, Base.RefValue{Bool}}: 35
HagerZhang{Float64, Base.RefValue{Bool}}: 10
StrongWolfe{Float64}: 32
HagerZhang{Float64, Base.RefValue{Bool}}: 11
HagerZhang{Float64, Base.RefValue{Bool}}: 12
StrongWolfe{Float64}: 32
HagerZhang{Float64, Base.RefValue{Bool}}: 11
HagerZhang{Float64, Base.RefValue{Bool}}: 18
HagerZhang{Float64, Base.RefValue{Bool}}: 15
HagerZhang{Float64, Base.RefValue{Bool}}: 14
StrongWolfe{Float64}: 32
HagerZhang{Float64, Base.RefValue{Bool}}: 14
StrongWolfe{Float64}: 32
BackTracking{Float64, Int64}: 11
BackTracking{Float64, Int64}: 12
StrongWolfe{Float64}: 32
HagerZhang{Float64, Base.RefValue{Bool}}: 11
HagerZhang{Float64, Base.RefValue{Bool}}: 19
HagerZhang{Float64, Base.RefValue{Bool}}: 10
HagerZhang{Float64, Base.RefValue{Bool}}: 12
HagerZhang{Float64, Base.RefValue{Bool}}: 10
HagerZhang{Float64, Base.RefValue{Bool}}: 14
StrongWolfe{Float64}: 32
HagerZhang{Float64, Base.RefValue{Bool}}: 40
HagerZhang{Float64, Base.RefValue{Bool}}: 20
HagerZhang{Float64, Base.RefValue{Bool}}: 14
HagerZhang{Float64, Base.RefValue{Bool}}: 16
HagerZhang{Float64, Base.RefValue{Bool}}: 11
HagerZhang{Float64, Base.RefValue{Bool}}: 11
HagerZhang{Float64, Base.RefValue{Bool}}: 23
HagerZhang{Float64, Base.RefValue{Bool}}: 10
HagerZhang{Float64, Base.RefValue{Bool}}: 10
HagerZhang{Float64, Base.RefValue{Bool}}: 15
HagerZhang{Float64, Base.RefValue{Bool}}: 16
StrongWolfe{Float64}: 11

@timholy
Copy link
Contributor Author

timholy commented Oct 22, 2024

In trying this locally, I got this many passes (out of 1000 attempts):

  • HZ: 963
  • StrongWolfe: 988
  • MoreThuente: 1000
  • Backtracking (both): 999

@timholy timholy marked this pull request as draft October 22, 2024 14:18
@timholy
Copy link
Contributor Author

timholy commented Oct 22, 2024

OK, this was more successful in capturing the purpose of the flatness check than I initially thought. Maybe best to wait to merge this for an actual fix.

This follows the notation in the paper and is much more readable. It
also introduces an external parameter, `epsilonk`, that "solves" the
flatness problem (though it requires external specification).
@ChrisRackauckas
Copy link
Contributor

Are there any ideas from here that @avikpal should consider for the newer LineSearch.jl?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants