
encapsulate snapshot restoration params in a struct #598

Closed (2 commits)

Conversation

JKSenthil
Contributor

Summary:

Context

Users may want to opt out of restoring certain parts of the app state (for example, optimizer and LR scheduler states when fine-tuning). This is currently not supported in the torchsnapshot saver.

This Diff

  • Adds a `RestoreOptions` dataclass to encapsulate all restoration params
  • Moves `restore_train_progress` and `restore_eval_progress` into the struct
  • Updates all `restore` APIs to take a `RestoreOptions` struct in place of the separate `restore_train_progress` and `restore_eval_progress` arguments
  • Adds optimizer and LR scheduler options to `RestoreOptions`

Differential Revision: D50757494
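The struct described above might look like the following minimal sketch. The exact field names (`restore_optimizers`, `restore_lr_schedulers`) and the `restore` signature are assumptions based on the bullet points, not the actual torchtnt API:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RestoreOptions:
    # All fields default to True so existing restore behavior is unchanged
    # for callers that do not pass an options struct.
    restore_train_progress: bool = True
    restore_eval_progress: bool = True
    restore_optimizers: bool = True
    restore_lr_schedulers: bool = True


def restore(path: str, unit: Any, *, restore_options: Optional[RestoreOptions] = None) -> None:
    # Hypothetical restore entry point: callers pass one options struct
    # instead of separate boolean keyword arguments.
    options = restore_options or RestoreOptions()
    if not options.restore_optimizers:
        # e.g. drop optimizer entries from the app-state dict before
        # handing it to the snapshot restore call (details omitted).
        pass
```

A user fine-tuning from a checkpoint could then opt out of optimizer and LR scheduler restoration with `RestoreOptions(restore_optimizers=False, restore_lr_schedulers=False)` while keeping the rest of the app state.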

Summary:

A few imports of torchsnapshot were outside the try/except block that checks whether torchsnapshot is in the environment. This diff moves them into the try/except block.

Reviewed By: galrotem

Differential Revision: D50754606
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D50757494

@codecov

codecov bot commented Oct 27, 2023

Codecov Report

Merging #598 (1766dad) into master (2794b9a) will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #598      +/-   ##
==========================================
+ Coverage   86.95%   86.99%   +0.04%     
==========================================
  Files          66       66              
  Lines        4131     4144      +13     
==========================================
+ Hits         3592     3605      +13     
  Misses        539      539              
Files                                                 | Coverage Δ
...orchtnt/framework/callbacks/torchsnapshot_saver.py | 92.18% <100.00%> (+0.41%) ⬆️


2 participants