Revisiting linesearches and LBFGS. #1133

Merged
merged 1 commit into main from test_694183028 on Nov 12, 2024
Conversation

copybara-service[bot] commented Nov 11, 2024

Revisiting linesearches and LBFGS.

For backtracking linesearch:

  • Add a debugging option to backtracking_linesearch.
  • Add an info entry to BacktrackingLinesearchState to help debugging by inspecting outputs (useful, for example, in a vmap setting; this mirrors the setup of the zoom linesearch).
  • Add a mechanism that prevents the linesearch from taking a step that would produce NaN or infinite function values (see the usage sketch after this list).
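As context for these changes, here is the documented way to chain the backtracking linesearch with SGD in optax. This is a minimal sketch on a toy objective; the new debugging option is deliberately omitted since its exact flag name comes from this change:

```python
import jax
import jax.numpy as jnp
import optax

def f(x):
  return jnp.sum((x - 2.0) ** 2)

# SGD proposes a direction; the backtracking linesearch scales the step.
opt = optax.chain(
    optax.sgd(learning_rate=1.0),
    optax.scale_by_backtracking_linesearch(max_backtracking_steps=15),
)

params = jnp.zeros(3)
state = opt.init(params)
value, grad = jax.value_and_grad(f)(params)
# The linesearch needs the current value/grad and the function itself.
updates, state = opt.update(
    grad, state, params, value=value, grad=grad, value_fn=f)
params = optax.apply_updates(params, updates)
```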

For zoom_linesearch:

  • Simplify the debugging information for the zoom linesearch a bit and add printouts of some relevant values.
  • Add a note in the zoom linesearch that setting curv_tol=inf makes this method an efficient alternative to the backtracking linesearch, thanks to its polynomial interpolation strategies.
  • Most importantly, add an option to define the initial guess for the linesearch. Following Nocedal and Wright, this initial guess should always be one for Newton or quasi-Newton methods. It could be refined for other methods (for now, for methods like gradient descent, we simply keep the previous learning rate). This largely improved performance in the public notebook. A sketch of the curv_tol=inf idea follows this list.
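A hypothetical sketch of the curv_tol=inf note above. The keyword name follows the wording of this description and may differ in the released signature, so treat it as an assumption:

```python
import jax
import jax.numpy as jnp
import optax

def f(x):
  return jnp.sum(jnp.cos(x) + 0.1 * x ** 2)

# With the curvature (strong-Wolfe) check effectively disabled, only the
# sufficient-decrease condition binds, so the zoom linesearch behaves like
# a backtracking linesearch that benefits from polynomial interpolation.
# `curv_tol` follows the description above; check the released API.
opt = optax.chain(
    optax.sgd(learning_rate=1.0),
    optax.scale_by_zoom_linesearch(
        max_linesearch_steps=30, curv_tol=jnp.inf),
)

params = jnp.ones(3)
state = opt.init(params)
value, grad = jax.value_and_grad(f)(params)
updates, state = opt.update(
    grad, state, params, value=value, grad=grad, value_fn=f)
params = optax.apply_updates(params, updates)
```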

For lbfgs:

  • Use a clipped gradient step for the very first step (when scale_init_precond=True). The scale of the preconditioner at the very first iteration is not detailed anywhere in the literature I have seen, but such a clipped gradient step captures approximately the right scale. For example, this made one of the tests pass without any further modification of the objective's default hyperparameters.
  • Revise the notebook in view of these changes. Add some tips and an example benchmark. A sketch of the standard LBFGS loop these changes plug into follows this list.
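For reference, the standard optax LBFGS loop (a minimal sketch on a toy quadratic), using value_and_grad_from_state so that values already computed inside the linesearch are reused:

```python
import jax.numpy as jnp
import optax

def f(x):
  return jnp.sum((x - 1.0) ** 2)

# lbfgs() runs with scale_init_precond=True and a zoom linesearch by default.
opt = optax.lbfgs()
params = jnp.zeros(4)
state = opt.init(params)

# Reuses the value/gradient cached in the state by the linesearch.
value_and_grad = optax.value_and_grad_from_state(f)

for _ in range(10):
  value, grad = value_and_grad(params, state=state)
  updates, state = opt.update(
      grad, state, params, value=value, grad=grad, value_fn=f)
  params = optax.apply_updates(params, updates)
```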

copybara-service[bot] force-pushed the test_694183028 branch 6 times, most recently from 152f44f to c0693b4, on November 12, 2024 16:47
PiperOrigin-RevId: 695757735
copybara-service[bot] merged commit d25e9e5 into main on Nov 12, 2024
copybara-service[bot] deleted the test_694183028 branch on November 12, 2024 17:09