Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delay Precision.convert_module until configure_model has run #19061

Merged
merged 12 commits into from
Feb 7, 2024

Conversation

carmocca
Copy link
Contributor

@carmocca carmocca commented Nov 23, 2023

What does this PR do?

Fixes #18936

The main change is in src/lightning/pytorch/strategies/strategy.py. Everything else is a bit of flattening and making it as consistent as possible with its subclasses.

The half precision test is redundant. It wasn't failing before this PR. The relevant tests are the TransformerEngine and Bitsandbytes ones.


📚 Documentation preview 📚: https://pytorch-lightning--19061.org.readthedocs.build/en/19061/

cc @Borda @carmocca @justusschock @awaelchli

@carmocca carmocca added bug Something isn't working precision: bnb Bitsandbytes quantization labels Nov 23, 2023
@carmocca carmocca added this to the 2.1.x milestone Nov 23, 2023
@carmocca carmocca self-assigned this Nov 23, 2023
@github-actions github-actions bot added the pl Generic label for PyTorch Lightning package label Nov 23, 2023
@carmocca carmocca added precision: double Double precision precision: half Half-precision precision: te Transformer Engine labels Nov 23, 2023
@carmocca carmocca removed precision: double Double precision precision: half Half-precision labels Nov 23, 2023
@github-actions github-actions bot added the fabric lightning.fabric.Fabric label Nov 23, 2023
@carmocca carmocca marked this pull request as ready for review November 23, 2023 20:14
Copy link
Contributor

github-actions bot commented Nov 23, 2023

⚡ Required checks status: All passing 🟢

Groups summary

🟢 pytorch_lightning: Tests workflow
Check ID Status
pl-cpu (macOS-11, lightning, 3.8, 1.13, oldest) success
pl-cpu (macOS-11, lightning, 3.10, 1.13) success
pl-cpu (macOS-11, lightning, 3.10, 2.1) success
pl-cpu (ubuntu-20.04, lightning, 3.8, 1.13, oldest) success
pl-cpu (ubuntu-20.04, lightning, 3.10, 1.13) success
pl-cpu (ubuntu-20.04, lightning, 3.10, 2.1) success
pl-cpu (windows-2022, lightning, 3.8, 1.13, oldest) success
pl-cpu (windows-2022, lightning, 3.10, 1.13) success
pl-cpu (windows-2022, lightning, 3.10, 2.1) success
pl-cpu (macOS-11, pytorch, 3.8, 1.13) success
pl-cpu (ubuntu-20.04, pytorch, 3.8, 1.13) success
pl-cpu (windows-2022, pytorch, 3.8, 1.13) success
pl-cpu (macOS-12, pytorch, 3.11, 2.0) success
pl-cpu (macOS-12, pytorch, 3.11, 2.1) success
pl-cpu (ubuntu-22.04, pytorch, 3.11, 2.0) success
pl-cpu (ubuntu-22.04, pytorch, 3.11, 2.1) success
pl-cpu (windows-2022, pytorch, 3.11, 2.0) success
pl-cpu (windows-2022, pytorch, 3.11, 2.1) success

These checks are required after the changes to src/lightning/pytorch/strategies/ddp.py, src/lightning/pytorch/strategies/deepspeed.py, src/lightning/pytorch/strategies/fsdp.py, src/lightning/pytorch/strategies/single_device.py, src/lightning/pytorch/strategies/single_xla.py, src/lightning/pytorch/strategies/strategy.py, src/lightning/pytorch/strategies/xla.py, tests/tests_pytorch/plugins/precision/test_bitsandbytes.py, tests/tests_pytorch/plugins/precision/test_half.py, tests/tests_pytorch/plugins/precision/test_transformer_engine.py.

🟢 pytorch_lightning: Azure GPU
Check ID Status
pytorch-lightning (GPUs) (testing Lightning | latest) success
pytorch-lightning (GPUs) (testing PyTorch | latest) success

These checks are required after the changes to src/lightning/pytorch/strategies/ddp.py, src/lightning/pytorch/strategies/deepspeed.py, src/lightning/pytorch/strategies/fsdp.py, src/lightning/pytorch/strategies/single_device.py, src/lightning/pytorch/strategies/single_xla.py, src/lightning/pytorch/strategies/strategy.py, src/lightning/pytorch/strategies/xla.py, tests/tests_pytorch/plugins/precision/test_bitsandbytes.py, tests/tests_pytorch/plugins/precision/test_half.py, tests/tests_pytorch/plugins/precision/test_transformer_engine.py.

🟢 pytorch_lightning: Benchmarks
Check ID Status
lightning.Benchmarks success

These checks are required after the changes to src/lightning/pytorch/strategies/ddp.py, src/lightning/pytorch/strategies/deepspeed.py, src/lightning/pytorch/strategies/fsdp.py, src/lightning/pytorch/strategies/single_device.py, src/lightning/pytorch/strategies/single_xla.py, src/lightning/pytorch/strategies/strategy.py, src/lightning/pytorch/strategies/xla.py.

🟢 pytorch_lightning: Docs
Check ID Status
docs-make (pytorch, doctest) success
docs-make (pytorch, html) success

These checks are required after the changes to src/lightning/pytorch/strategies/ddp.py, src/lightning/pytorch/strategies/deepspeed.py, src/lightning/pytorch/strategies/fsdp.py, src/lightning/pytorch/strategies/single_device.py, src/lightning/pytorch/strategies/single_xla.py, src/lightning/pytorch/strategies/strategy.py, src/lightning/pytorch/strategies/xla.py.

🟢 mypy
Check ID Status
mypy success

These checks are required after the changes to src/lightning/pytorch/strategies/ddp.py, src/lightning/pytorch/strategies/deepspeed.py, src/lightning/pytorch/strategies/fsdp.py, src/lightning/pytorch/strategies/single_device.py, src/lightning/pytorch/strategies/single_xla.py, src/lightning/pytorch/strategies/strategy.py, src/lightning/pytorch/strategies/xla.py.

🟢 install
Check ID Status
install-pkg (ubuntu-22.04, app, 3.8) success
install-pkg (ubuntu-22.04, app, 3.11) success
install-pkg (ubuntu-22.04, fabric, 3.8) success
install-pkg (ubuntu-22.04, fabric, 3.11) success
install-pkg (ubuntu-22.04, pytorch, 3.8) success
install-pkg (ubuntu-22.04, pytorch, 3.11) success
install-pkg (ubuntu-22.04, lightning, 3.8) success
install-pkg (ubuntu-22.04, lightning, 3.11) success
install-pkg (ubuntu-22.04, notset, 3.8) success
install-pkg (ubuntu-22.04, notset, 3.11) success
install-pkg (macOS-12, app, 3.8) success
install-pkg (macOS-12, app, 3.11) success
install-pkg (macOS-12, fabric, 3.8) success
install-pkg (macOS-12, fabric, 3.11) success
install-pkg (macOS-12, pytorch, 3.8) success
install-pkg (macOS-12, pytorch, 3.11) success
install-pkg (macOS-12, lightning, 3.8) success
install-pkg (macOS-12, lightning, 3.11) success
install-pkg (macOS-12, notset, 3.8) success
install-pkg (macOS-12, notset, 3.11) success
install-pkg (windows-2022, app, 3.8) success
install-pkg (windows-2022, app, 3.11) success
install-pkg (windows-2022, fabric, 3.8) success
install-pkg (windows-2022, fabric, 3.11) success
install-pkg (windows-2022, pytorch, 3.8) success
install-pkg (windows-2022, pytorch, 3.11) success
install-pkg (windows-2022, lightning, 3.8) success
install-pkg (windows-2022, lightning, 3.11) success
install-pkg (windows-2022, notset, 3.8) success
install-pkg (windows-2022, notset, 3.11) success

These checks are required after the changes to src/lightning/pytorch/strategies/ddp.py, src/lightning/pytorch/strategies/deepspeed.py, src/lightning/pytorch/strategies/fsdp.py, src/lightning/pytorch/strategies/single_device.py, src/lightning/pytorch/strategies/single_xla.py, src/lightning/pytorch/strategies/strategy.py, src/lightning/pytorch/strategies/xla.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and updates for 60 minutes every 180 seconds. If you have any other questions, contact carmocca for help.

Copy link

codecov bot commented Nov 23, 2023

Codecov Report

Merging #19061 (24a1b52) into master (c7c42dc) will decrease coverage by 33%.
Report is 11 commits behind head on master.
The diff coverage is 83%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #19061      +/-   ##
==========================================
- Coverage      82%      49%     -33%     
==========================================
  Files         448      440       -8     
  Lines       37915    37783     -132     
==========================================
- Hits        30985    18467   -12518     
- Misses       6930    19316   +12386     

@mergify mergify bot removed the has conflicts label Nov 25, 2023
@carmocca carmocca requested a review from awaelchli November 27, 2023 13:07
@carmocca carmocca changed the title Delay Precision.convert_module until configure_model has run Delay Precision.convert_module until configure_model has run [TPU] Nov 27, 2023
@carmocca carmocca marked this pull request as draft November 29, 2023 02:26
Copy link

gitguardian bot commented Jan 16, 2024

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
Once a secret has been leaked into a git repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Our GitHub checks need improvements? Share your feedbacks!

@carmocca carmocca changed the title Delay Precision.convert_module until configure_model has run [TPU] Delay Precision.convert_module until configure_model has run Feb 1, 2024
@carmocca carmocca marked this pull request as ready for review February 1, 2024 20:02
@mergify mergify bot added the has conflicts label Feb 2, 2024
@mergify mergify bot removed the has conflicts label Feb 5, 2024
@carmocca
Copy link
Contributor Author

carmocca commented Feb 5, 2024

The failing TPU job is for Fabric only, but Fabric isn't impacted by this PR so it must be an issue from somewhere else

@awaelchli awaelchli modified the milestones: 2.1.x, 2.2 Feb 6, 2024
@mergify mergify bot added the ready PRs ready to be merged label Feb 7, 2024
src/lightning/pytorch/CHANGELOG.md Show resolved Hide resolved
src/lightning/pytorch/strategies/ddp.py Show resolved Hide resolved
@awaelchli awaelchli merged commit 4510351 into master Feb 7, 2024
90 of 93 checks passed
@awaelchli awaelchli deleted the carmocca/delay-convert-module branch February 7, 2024 21:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working fabric lightning.fabric.Fabric pl Generic label for PyTorch Lightning package precision: bnb Bitsandbytes quantization precision: te Transformer Engine ready PRs ready to be merged run TPU
Projects
None yet
Development

Successfully merging this pull request may close these issues.

convert_module in BitsandbytesPrecision is called before configure_model
4 participants