Utility to consolidate sharded checkpoints #19213
⚡ Required checks status: All passing 🟢
Groups summary:
🟢 pytorch_lightning: Tests workflow
🟢 pytorch_lightning: Azure GPU
🟢 pytorch_lightning: Benchmarks
🟢 fabric: Docs
🟢 pytorch_lightning: Docs
🟢 lightning_fabric: CPU workflow
🟢 lightning_fabric: Azure GPU
🟢 mypy
🟢 install
Thank you for your contribution! 💜
Codecov Report
Additional details and impacted files

@@            Coverage Diff            @@
##           master   #19213     +/-   ##
==========================================
- Coverage      83%      54%     -29%
==========================================
  Files         446      443       -3
  Lines       37702    37697       -5
==========================================
- Hits        31242    20282   -10960
- Misses       6460    17415   +10955
Co-authored-by: Carlos Mocholí <[email protected]>
What does this PR do?
Adds a utility function and CLI to consolidate sharded checkpoints saved with FSDP. See the docs page added in this PR for an explanation of how to use it.
The name for the CLI option needs to be decided.
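For readers unfamiliar with the idea, here is a minimal conceptual sketch of what "consolidating" a sharded checkpoint means. It is not the implementation in this PR and does not use Lightning or FSDP APIs. The `consolidate_shards` function and the toy shard layout (plain dicts mapping a parameter name to the slice of values held by one rank) are assumptions made purely for illustration.

```python
from typing import Dict, List

# Hypothetical illustration only: real FSDP checkpoints store tensor shards plus
# metadata. Here each "shard" is a plain dict mapping a parameter name to the
# slice of flattened values that one rank holds.
Shard = Dict[str, List[float]]


def consolidate_shards(shards: List[Shard]) -> Shard:
    """Merge per-rank slices into one full state dict, concatenating in rank order."""
    if not shards:
        raise ValueError("expected at least one shard")
    expected_names = set(shards[0])
    consolidated: Shard = {name: [] for name in shards[0]}
    for rank, shard in enumerate(shards):
        # Every rank must hold a slice of the same set of parameters.
        if set(shard) != expected_names:
            raise ValueError(f"shard {rank} has mismatched parameter names")
        for name, values in shard.items():
            consolidated[name].extend(values)
    return consolidated


# Two ranks, each holding half of the weight and one bias element.
rank0 = {"layer.weight": [0.1, 0.2], "layer.bias": [1.0]}
rank1 = {"layer.weight": [0.3, 0.4], "layer.bias": [2.0]}
full = consolidate_shards([rank0, rank1])
print(full)
```

The result is a single dictionary that one process can load without a distributed setup, which is the general motivation for consolidating a sharded checkpoint.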
📚 Documentation preview 📚: https://pytorch-lightning--19213.org.readthedocs.build/en/19213/
cc @Borda @awaelchli @carmocca @justusschock