Support passing storage_options in trainer.save_checkpoint() API #11891

Merged: 27 commits into Lightning-AI:master on Mar 9, 2022


jjenniferdai
Contributor

@jjenniferdai jjenniferdai commented Feb 12, 2022

What does this PR do?

Fixes #10629

Does your PR introduce any breaking changes? If yes, please list them.

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

@ananthsub ananthsub added the checkpointing label (Related to checkpointing) Feb 14, 2022
@ananthsub ananthsub added this to the 1.6 milestone Feb 14, 2022
@kaushikb11
Contributor

@jjenniferdai Any updates on the PR?

@jjenniferdai
Contributor Author

fuller picture for how to implement storage options across the codebase.

e.g. like the other filesystem ops in ModelCheckpoint?

  • This is the only entry point that enables different storage_options across different checkpoint instances, right?
  • I think this is also different from fsspec's "storage_options": it's really for passing any checkpoint-instance-specific information for CheckpointIO to use however it likes. Thoughts?
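
To make the two points above concrete, here is a minimal sketch of the idea: the trainer forwards an opaque storage_options value to whichever checkpoint plugin is configured, and the plugin alone decides what it means. Everything here is an assumption for illustration. JsonCheckpointIO, StrictCheckpointIO, SketchTrainer and the "indent" key are made-up names, nothing imports pytorch_lightning, and the real plugin classes may behave differently.

```python
import json
import os
import tempfile
from typing import Any, Dict, Optional


class JsonCheckpointIO:
    """Hypothetical plugin: treats storage_options as per-save settings."""

    def save_checkpoint(
        self, checkpoint: Dict[str, Any], path: str, storage_options: Optional[Any] = None
    ) -> None:
        # The plugin, not the trainer, decides what the options mean.
        options = storage_options or {}
        with open(path, "w") as f:
            json.dump(checkpoint, f, indent=options.get("indent"))


class StrictCheckpointIO(JsonCheckpointIO):
    """Hypothetical plugin that has no use for storage_options and says so."""

    def save_checkpoint(
        self, checkpoint: Dict[str, Any], path: str, storage_options: Optional[Any] = None
    ) -> None:
        if storage_options is not None:
            raise TypeError("storage_options is not supported by StrictCheckpointIO")
        super().save_checkpoint(checkpoint, path)


class SketchTrainer:
    """Hypothetical stand-in for a trainer; it only forwards the argument."""

    def __init__(self, checkpoint_io: JsonCheckpointIO) -> None:
        self.checkpoint_io = checkpoint_io

    def save_checkpoint(self, filepath: str, storage_options: Optional[Any] = None) -> None:
        checkpoint = {"epoch": 3, "global_step": 120}  # placeholder state
        self.checkpoint_io.save_checkpoint(
            checkpoint, filepath, storage_options=storage_options
        )


def demo() -> Dict[str, str]:
    """Save twice with the same plugin but different per-call options."""
    trainer = SketchTrainer(JsonCheckpointIO())
    with tempfile.TemporaryDirectory() as tmp:
        compact_path = os.path.join(tmp, "compact.ckpt")
        pretty_path = os.path.join(tmp, "pretty.ckpt")
        trainer.save_checkpoint(compact_path)
        trainer.save_checkpoint(pretty_path, storage_options={"indent": 2})
        with open(compact_path) as f:
            compact = f.read()
        with open(pretty_path) as f:
            pretty = f.read()
    return {"compact": compact, "pretty": pretty}
```

Because the options travel with each save call rather than with the plugin instance, two checkpoints written by the same trainer can be stored differently. A plugin that cannot honor the options can fail loudly, as StrictCheckpointIO does in this sketch, instead of silently ignoring them.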

@mergify mergify bot added the ready (PRs ready to be merged) and has conflicts labels Feb 27, 2022
@mergify mergify bot removed the has conflicts label Feb 28, 2022
@mergify mergify bot added the has conflicts label Mar 1, 2022
@mergify mergify bot removed the has conflicts label Mar 2, 2022
@jjenniferdai jjenniferdai requested a review from carmocca March 2, 2022 18:36
@kaushikb11
Contributor

@tchaton @Borda @SeanNaren Mind reviewing?

@mergify mergify bot added the has conflicts label Mar 7, 2022
@mergify mergify bot removed the has conflicts label Mar 8, 2022
@ananthsub ananthsub enabled auto-merge (squash) March 9, 2022 18:22
@ananthsub ananthsub merged commit d31126c into Lightning-AI:master Mar 9, 2022
facebook-github-bot pushed a commit to facebookresearch/recipes that referenced this pull request Apr 15, 2022
…rch-lightning] Support passing `storage_options` in `trainer.save_checkpoint()` API (#11891)

Summary:
fbcode patch in first to unblock this diff stack:

OSS PR: Lightning-AI/pytorch-lightning#11891
original sync diff: D35211283

Reviewed By: fegin

Differential Revision: D35321207

fbshipit-source-id: ff6c888c562ccddabfc5045c1f18221f9c93a3bc
Labels
checkpointing Related to checkpointing ready PRs ready to be merged
5 participants