Bitsandbytes precision plugin #18655

Merged
merged 29 commits into from
Sep 29, 2023

carmocca (Contributor) commented Sep 27, 2023

What does this PR do?

Fixes #18295
Closes #18559

This PR also adds some minor improvements to the Transformer Engine integration. I thought of them while working on this plugin, since the two plugins work very similarly.

Before:

import torch
import torch.nn as nn
import bitsandbytes as bnb
from bitsandbytes.nn import Linear8bitLt, LinearNF4

model = Linear8bitLt(64, 64, has_fp16_weights=False)
print(model.weight.device)  # cpu
print(model.weight.dtype)  # float32
model = model.to(0)  # quantization happens here
print(model.weight.device)  # cuda
print(model.weight.dtype)  # int8

model = LinearNF4(64, 64)
print(model.weight.device)  # cpu
print(model.weight.dtype)  # float32
model = model.to(0)  # quantization happens here
print(model.weight.device)  # cuda
print(model.weight.dtype)  # uint8

After:

import torch
import torch.nn as nn
import lightning as L
from lightning.fabric.plugins import BitsandbytesPrecision

fabric = L.Fabric(devices=1, plugins=BitsandbytesPrecision("int8", dtype=torch.float16))
with fabric.init_module():
    model = nn.Linear(64, 64)
print(model.weight.device)  # cuda (init_module already inited on device)
print(model.weight.dtype)  # int8 (model is already quantized)
model = fabric.setup(model)  # won't do anything new
print(model.weight.device)  # cuda
print(model.weight.dtype)  # int8

fabric = L.Fabric(devices=1, plugins=BitsandbytesPrecision("nf4", dtype=torch.bfloat16))
with fabric.init_module():
    model = nn.Linear(64, 64)
print(model.weight.device)  # cuda (init_module already inited on device)
print(model.weight.dtype)  # uint8 (model is already quantized)
model = fabric.setup(model)  # won't do anything new
print(model.weight.device)  # cuda
print(model.weight.dtype)  # uint8

If the user requests that submodules be skipped (skip=...), quantization will only happen in fabric.setup.
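
To make the "quantization happens here" comments above more concrete, here is a minimal sketch of absmax int8 quantization in plain Python. It is an illustrative simplification only and is not the algorithm that bitsandbytes or this plugin uses; the function names `quantize_absmax_int8` and `dequantize` are hypothetical and exist only for this example.

```python
from typing import List, Tuple


def quantize_absmax_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to integer codes in [-127, 127] using one shared scale.

    Hypothetical helper for illustration; not part of bitsandbytes or Lightning.
    """
    if not values:
        return [], 1.0
    absmax = max(abs(v) for v in values)
    # Avoid dividing by zero when every value is 0.0
    scale = absmax / 127 if absmax > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_absmax_int8(weights)
restored = dequantize(codes, scale)

# The largest-magnitude weight maps to the edge of the range
print(codes[1])
# Each restored value is within about half a quantization step of the original
print(max(abs(w - r) for w, r in zip(weights, restored)) <= scale / 2 + 1e-12)
```

The stored codes need one byte each instead of four, which is the memory saving that motivates quantizing the weights as early as possible, for example directly under `fabric.init_module()`.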


📚 Documentation preview 📚: https://pytorch-lightning--18655.org.readthedocs.build/en/18655/

cc @Borda @carmocca @justusschock @awaelchli

@carmocca added the feature (Is an improvement or enhancement), fabric (lightning.fabric.Fabric), experimental, plugin, and pl (Generic label for PyTorch Lightning package) labels on Sep 27, 2023
@carmocca added this to the 2.1 milestone on Sep 27, 2023
@carmocca self-assigned this on Sep 27, 2023
@carmocca marked this pull request as ready for review on September 28, 2023 17:39
github-actions bot (Contributor) commented Sep 28, 2023

⚡ Required checks status: All passing 🟢

Groups summary

🟢 pytorch_lightning: Tests workflow
Check ID Status
pl-cpu (macOS-11, lightning, 3.8, 1.11) success
pl-cpu (macOS-11, lightning, 3.9, 1.12) success
pl-cpu (macOS-11, lightning, 3.10, 1.13) success
pl-cpu (macOS-11, lightning, 3.10, 2.0) success
pl-cpu (macOS-11, lightning, 3.8, 1.11, oldest) success
pl-cpu (ubuntu-20.04, lightning, 3.8, 1.11) success
pl-cpu (ubuntu-20.04, lightning, 3.9, 1.12) success
pl-cpu (ubuntu-20.04, lightning, 3.10, 1.13) success
pl-cpu (ubuntu-20.04, lightning, 3.10, 2.0) success
pl-cpu (ubuntu-20.04, lightning, 3.8, 1.11, oldest) success
pl-cpu (windows-2022, lightning, 3.8, 1.11) success
pl-cpu (windows-2022, lightning, 3.9, 1.12) success
pl-cpu (windows-2022, lightning, 3.10, 1.13) success
pl-cpu (windows-2022, lightning, 3.10, 2.0) success
pl-cpu (windows-2022, lightning, 3.8, 1.11, oldest) success
pl-cpu (macOS-11, pytorch, 3.8, 1.13) success
pl-cpu (ubuntu-20.04, pytorch, 3.8, 1.13) success
pl-cpu (windows-2022, pytorch, 3.8, 1.13) success
pl-cpu (macOS-12, pytorch, 3.11, 2.0) success
pl-cpu (ubuntu-22.04, pytorch, 3.11, 2.0) success
pl-cpu (windows-2022, pytorch, 3.11, 2.0) success

These checks are required after the changes to src/lightning/fabric/connector.py, src/lightning/fabric/plugins/__init__.py, src/lightning/fabric/plugins/precision/__init__.py, src/lightning/fabric/plugins/precision/bitsandbytes.py, src/lightning/fabric/plugins/precision/precision.py, src/lightning/fabric/plugins/precision/transformer_engine.py, src/lightning/pytorch/plugins/__init__.py, src/lightning/pytorch/plugins/precision/__init__.py, src/lightning/pytorch/plugins/precision/bitsandbytes.py, src/lightning/pytorch/plugins/precision/transformer_engine.py, src/lightning/pytorch/trainer/connectors/accelerator_connector.py, tests/tests_pytorch/plugins/precision/test_transformer_engine.py, tests/tests_pytorch/trainer/connectors/test_accelerator_connector.py.

🟢 pytorch_lightning: Azure GPU
Check ID Status
[pytorch-lightning (GPUs) (testing Lightning latest)](https://dev.azure.com/Lightning-AI/72ab7ed8-b00f-4b6e-b131-3388f7ffafa7/_build/results?buildId=176621&view=logs&jobId=47e66f3c-897a-5428-da11-bf5c7745762e) success
[pytorch-lightning (GPUs) (testing PyTorch latest)](https://dev.azure.com/Lightning-AI/72ab7ed8-b00f-4b6e-b131-3388f7ffafa7/_build/results?buildId=176621&view=logs&jobId=fe777007-6e77-5e50-6e71-fc5977ab193a) success

These checks are required after the changes to src/lightning/pytorch/plugins/__init__.py, src/lightning/pytorch/plugins/precision/__init__.py, src/lightning/pytorch/plugins/precision/bitsandbytes.py, src/lightning/pytorch/plugins/precision/transformer_engine.py, src/lightning/pytorch/trainer/connectors/accelerator_connector.py, tests/tests_pytorch/plugins/precision/test_transformer_engine.py, tests/tests_pytorch/trainer/connectors/test_accelerator_connector.py, src/lightning/fabric/connector.py, src/lightning/fabric/plugins/__init__.py, src/lightning/fabric/plugins/precision/__init__.py, src/lightning/fabric/plugins/precision/bitsandbytes.py, src/lightning/fabric/plugins/precision/precision.py, src/lightning/fabric/plugins/precision/transformer_engine.py.

🟢 pytorch_lightning: Benchmarks
Check ID Status
lightning.Benchmarks success

These checks are required after the changes to src/lightning/fabric/connector.py, src/lightning/fabric/plugins/__init__.py, src/lightning/fabric/plugins/precision/__init__.py, src/lightning/fabric/plugins/precision/bitsandbytes.py, src/lightning/fabric/plugins/precision/precision.py, src/lightning/fabric/plugins/precision/transformer_engine.py, src/lightning/pytorch/plugins/__init__.py, src/lightning/pytorch/plugins/precision/__init__.py, src/lightning/pytorch/plugins/precision/bitsandbytes.py, src/lightning/pytorch/plugins/precision/transformer_engine.py, src/lightning/pytorch/trainer/connectors/accelerator_connector.py.

🟢 fabric: Docs
Check ID Status
docs-make (fabric, doctest) success
docs-make (fabric, html) success

These checks are required after the changes to src/lightning/fabric/connector.py, src/lightning/fabric/plugins/__init__.py, src/lightning/fabric/plugins/precision/__init__.py, src/lightning/fabric/plugins/precision/bitsandbytes.py, src/lightning/fabric/plugins/precision/precision.py, src/lightning/fabric/plugins/precision/transformer_engine.py, docs/source-fabric/api/fabric_args.rst, docs/source-fabric/api/precision.rst, docs/source-fabric/conf.py, docs/source-fabric/fundamentals/precision.rst.

🟢 pytorch_lightning: Docs
Check ID Status
docs-make (pytorch, doctest) success
docs-make (pytorch, html) success

These checks are required after the changes to src/lightning/pytorch/plugins/__init__.py, src/lightning/pytorch/plugins/precision/__init__.py, src/lightning/pytorch/plugins/precision/bitsandbytes.py, src/lightning/pytorch/plugins/precision/transformer_engine.py, src/lightning/pytorch/trainer/connectors/accelerator_connector.py, docs/source-pytorch/api_references.rst, docs/source-pytorch/common/precision_intermediate.rst, docs/source-pytorch/common/trainer.rst, docs/source-pytorch/conf.py, docs/source-pytorch/extensions/plugins.rst.

🟢 lightning_fabric: CPU workflow
Check ID Status
fabric-cpu (macOS-11, lightning, 3.8, 1.11) success
fabric-cpu (macOS-11, lightning, 3.9, 1.12) success
fabric-cpu (macOS-11, lightning, 3.10, 1.13) success
fabric-cpu (macOS-11, lightning, 3.10, 2.0) success
fabric-cpu (macOS-11, lightning, 3.8, 1.11, oldest) success
fabric-cpu (ubuntu-20.04, lightning, 3.8, 1.11) success
fabric-cpu (ubuntu-20.04, lightning, 3.9, 1.12) success
fabric-cpu (ubuntu-20.04, lightning, 3.10, 1.13) success
fabric-cpu (ubuntu-20.04, lightning, 3.10, 2.0) success
fabric-cpu (ubuntu-20.04, lightning, 3.8, 1.11, oldest) success
fabric-cpu (windows-2022, lightning, 3.8, 1.11) success
fabric-cpu (windows-2022, lightning, 3.9, 1.12) success
fabric-cpu (windows-2022, lightning, 3.10, 1.13) success
fabric-cpu (windows-2022, lightning, 3.10, 2.0) success
fabric-cpu (windows-2022, lightning, 3.8, 1.11, oldest) success
fabric-cpu (macOS-11, fabric, 3.8, 1.13) success
fabric-cpu (ubuntu-20.04, fabric, 3.8, 1.13) success
fabric-cpu (windows-2022, fabric, 3.8, 1.13) success
fabric-cpu (macOS-12, fabric, 3.11, 2.0) success
fabric-cpu (ubuntu-22.04, fabric, 3.11, 2.0) success
fabric-cpu (windows-2022, fabric, 3.11, 2.0) success

These checks are required after the changes to src/lightning/fabric/connector.py, src/lightning/fabric/plugins/__init__.py, src/lightning/fabric/plugins/precision/__init__.py, src/lightning/fabric/plugins/precision/bitsandbytes.py, src/lightning/fabric/plugins/precision/precision.py, src/lightning/fabric/plugins/precision/transformer_engine.py, tests/tests_fabric/plugins/precision/test_bitsandbytes.py, tests/tests_fabric/plugins/precision/test_transformer_engine.py, tests/tests_fabric/test_connector.py.

🟢 lightning_fabric: Azure GPU
Check ID Status
[lightning-fabric (GPUs) (testing Fabric latest)](https://dev.azure.com/Lightning-AI/72ab7ed8-b00f-4b6e-b131-3388f7ffafa7/_build/results?buildId=176623&view=logs&jobId=fe777007-6e77-5e50-6e71-fc5977ab193a) success
[lightning-fabric (GPUs) (testing Lightning latest)](https://dev.azure.com/Lightning-AI/72ab7ed8-b00f-4b6e-b131-3388f7ffafa7/_build/results?buildId=176623&view=logs&jobId=47e66f3c-897a-5428-da11-bf5c7745762e) success

These checks are required after the changes to src/lightning/fabric/connector.py, src/lightning/fabric/plugins/__init__.py, src/lightning/fabric/plugins/precision/__init__.py, src/lightning/fabric/plugins/precision/bitsandbytes.py, src/lightning/fabric/plugins/precision/precision.py, src/lightning/fabric/plugins/precision/transformer_engine.py, tests/tests_fabric/plugins/precision/test_bitsandbytes.py, tests/tests_fabric/plugins/precision/test_transformer_engine.py, tests/tests_fabric/test_connector.py.

🟢 mypy
Check ID Status
mypy success

These checks are required after the changes to src/lightning/fabric/connector.py, src/lightning/fabric/plugins/__init__.py, src/lightning/fabric/plugins/precision/__init__.py, src/lightning/fabric/plugins/precision/bitsandbytes.py, src/lightning/fabric/plugins/precision/precision.py, src/lightning/fabric/plugins/precision/transformer_engine.py, src/lightning/pytorch/plugins/__init__.py, src/lightning/pytorch/plugins/precision/__init__.py, src/lightning/pytorch/plugins/precision/bitsandbytes.py, src/lightning/pytorch/plugins/precision/transformer_engine.py, src/lightning/pytorch/trainer/connectors/accelerator_connector.py.

🟢 install
Check ID Status
install-pkg (ubuntu-22.04, app, 3.8) success
install-pkg (ubuntu-22.04, app, 3.11) success
install-pkg (ubuntu-22.04, fabric, 3.8) success
install-pkg (ubuntu-22.04, fabric, 3.11) success
install-pkg (ubuntu-22.04, pytorch, 3.8) success
install-pkg (ubuntu-22.04, pytorch, 3.11) success
install-pkg (ubuntu-22.04, lightning, 3.8) success
install-pkg (ubuntu-22.04, lightning, 3.11) success
install-pkg (ubuntu-22.04, notset, 3.8) success
install-pkg (ubuntu-22.04, notset, 3.11) success
install-pkg (macOS-12, app, 3.8) success
install-pkg (macOS-12, app, 3.11) success
install-pkg (macOS-12, fabric, 3.8) success
install-pkg (macOS-12, fabric, 3.11) success
install-pkg (macOS-12, pytorch, 3.8) success
install-pkg (macOS-12, pytorch, 3.11) success
install-pkg (macOS-12, lightning, 3.8) success
install-pkg (macOS-12, lightning, 3.11) success
install-pkg (macOS-12, notset, 3.8) success
install-pkg (macOS-12, notset, 3.11) success
install-pkg (windows-2022, app, 3.8) success
install-pkg (windows-2022, app, 3.11) success
install-pkg (windows-2022, fabric, 3.8) success
install-pkg (windows-2022, fabric, 3.11) success
install-pkg (windows-2022, pytorch, 3.8) success
install-pkg (windows-2022, pytorch, 3.11) success
install-pkg (windows-2022, lightning, 3.8) success
install-pkg (windows-2022, lightning, 3.11) success
install-pkg (windows-2022, notset, 3.8) success
install-pkg (windows-2022, notset, 3.11) success

These checks are required after the changes to src/lightning/fabric/connector.py, src/lightning/fabric/plugins/__init__.py, src/lightning/fabric/plugins/precision/__init__.py, src/lightning/fabric/plugins/precision/bitsandbytes.py, src/lightning/fabric/plugins/precision/precision.py, src/lightning/fabric/plugins/precision/transformer_engine.py, src/lightning/pytorch/plugins/__init__.py, src/lightning/pytorch/plugins/precision/__init__.py, src/lightning/pytorch/plugins/precision/bitsandbytes.py, src/lightning/pytorch/plugins/precision/transformer_engine.py, src/lightning/pytorch/trainer/connectors/accelerator_connector.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and updates for 60 minutes every 180 seconds. If you have any other questions, contact carmocca for help.

codecov bot commented Sep 29, 2023

Codecov Report

Merging #18655 (003e307) into master (ac71365) will decrease coverage by 30%.
Report is 7 commits behind head on master.
The diff coverage is 58%.

Additional details and impacted files
@@            Coverage Diff            @@
##           master   #18655     +/-   ##
=========================================
- Coverage      83%      53%    -30%     
=========================================
  Files         426      423      -3     
  Lines       33381    33452     +71     
=========================================
- Hits        27670    17857   -9813     
- Misses       5711    15595   +9884     
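
The percentages in the report can be cross-checked against the raw line counts. The snippet below is a small sanity check using only the numbers shown in the table; the `coverage_percent` helper is hypothetical, and rounding to whole percentages is an assumption about how the report displays its values.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of executable lines that were hit."""
    return 100.0 * hits / lines


# Figures copied from the coverage diff above
master = {"lines": 33381, "hits": 27670, "misses": 5711}
pr = {"lines": 33452, "hits": 17857, "misses": 15595}

# Hits and misses should add up to the total line count in each column
for report in (master, pr):
    assert report["hits"] + report["misses"] == report["lines"]

master_pct = coverage_percent(master["hits"], master["lines"])
pr_pct = coverage_percent(pr["hits"], pr["lines"])

# Rounded to whole percentages (assumed display rounding)
print(round(master_pct), round(pr_pct), round(pr_pct) - round(master_pct))
# prints: 83 53 -30
```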

awaelchli (Contributor) left a comment

🎉

docs/source-pytorch/common/precision_intermediate.rst (outdated, resolved)
docs/source-pytorch/common/precision_intermediate.rst (outdated, resolved)
docs/source-pytorch/common/precision_intermediate.rst (outdated, resolved)
docs/source-pytorch/common/precision_intermediate.rst (outdated, resolved)
docs/source-pytorch/common/precision_intermediate.rst (outdated, resolved)
src/lightning/fabric/plugins/precision/bitsandbytes.py (outdated, resolved)
src/lightning/fabric/plugins/precision/bitsandbytes.py (outdated, resolved)
tests/tests_fabric/test_connector.py (resolved)
@carmocca requested a review from awaelchli on September 29, 2023 14:04

awaelchli (Contributor) left a comment

Great feature. We can start testing it now with lit-gpt.
@carmocca Would you like to also open an issue about the design concern we had around composing plugins in the future?

carmocca (Contributor, Author) commented

@awaelchli Opened #18679. Also working to update lit-gpt in Lightning-AI/litgpt#596

@carmocca merged commit 5120ad2 into master on Sep 29, 2023
113 checks passed
@carmocca deleted the carmocca/bnb branch on September 29, 2023 17:17
@mergify bot added the ready (PRs ready to be merged) label on Sep 29, 2023
Labels
experimental, fabric (lightning.fabric.Fabric), feature (Is an improvement or enhancement), pl (Generic label for PyTorch Lightning package), plugin, ready (PRs ready to be merged)
Development

Successfully merging this pull request may close these issues.

lightning usage with bitsandbytes
3 participants