Support 4bit BNB layers meta-device materialization #19150
Conversation
Codecov Report

Additional details and impacted files:

```diff
@@            Coverage Diff            @@
##           master   #19150      +/-  ##
==========================================
- Coverage      83%      54%      -29%
==========================================
  Files         443      439        -4
  Lines       36859    36984      +125
==========================================
- Hits        30539    19940    -10599
- Misses       6320    17044    +10724
```
```diff
@@ -37,7 +39,8 @@

 log = logging.getLogger(__name__)

-_BITSANDBYTES_AVAILABLE = RequirementCache("bitsandbytes>=0.41.0")
+# TODO: unpin after resolving the `quant_state` format breaking changes
+_BITSANDBYTES_AVAILABLE = RequirementCache("bitsandbytes==0.41.0")
```
Still new to this repo, so maybe I misunderstood something, but I'm not sure how this can work. In `requirements/pytorch/extra.txt`, BNB is pinned to 0.41.1. The same goes for the tests: the `skipif` will always return `True`, which is why these tests are skipped. These are GPU tests.
Oh, great catch. I messed up the requirements, I'll open a PR.
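For context, a minimal sketch of the mismatch described above, assuming the usual `RequirementCache`-based availability flag and a hypothetical test name: with bitsandbytes 0.41.1 installed, the `==0.41.0` check fails and the skip condition always triggers.

```python
import pytest
from lightning_utilities.core.imports import RequirementCache

# The module pins the availability check to bitsandbytes==0.41.0 ...
_BITSANDBYTES_AVAILABLE = RequirementCache("bitsandbytes==0.41.0")

# ... while requirements/pytorch/extra.txt installs bitsandbytes 0.41.1,
# so the flag evaluates to False and the skip condition below is always True.
@pytest.mark.skipif(not _BITSANDBYTES_AVAILABLE, reason="bitsandbytes is required")
def test_bitsandbytes_linear_layers():  # hypothetical test name
    ...
```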
What does this PR do?
Adds support for converting Linear layers on the meta device to Bitsandbytes Linear layers, and for materializing and quantizing Bitsandbytes Linear layers that are on the meta device.
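For illustration, a rough sketch of the kind of user flow this targets, assuming Fabric's `BitsandbytesPrecision` plugin; the model class and checkpoint path are placeholders, and the exact sequence is my assumption rather than code taken from this PR:

```python
import torch
import torch.nn as nn
import lightning as L
from lightning.fabric.plugins import BitsandbytesPrecision


class TinyModel(nn.Module):  # placeholder model for illustration
    def __init__(self) -> None:
        super().__init__()
        self.linear = nn.Linear(1024, 1024)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


precision = BitsandbytesPrecision(mode="nf4", dtype=torch.bfloat16)
fabric = L.Fabric(accelerator="cuda", devices=1, plugins=precision)
fabric.launch()

# Instantiate on the meta device: no real memory is allocated yet.
with torch.device("meta"):
    model = TinyModel()

# Setup converts the meta-device nn.Linear layers to bitsandbytes 4-bit layers
# and materializes them on the target device.
model = fabric.setup_module(model)

# Loading real weights quantizes them into the 4-bit layers.
fabric.load_raw("checkpoint.pth", model)
```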
Scenario 1

Cons:
- `to_empty` and `reset_parameters` (called from `materialize_meta_tensors`) will create empty tensors and quantize. There's no way to avoid this unless we make `materialize_meta_tensors` aware of bitsandbytes (see the sketch after the scenario list).
- `load_checkpoint` will again quantize weights that were already quantized during materialization.

Scenario 2

I believe this is the ideal scenario.

Scenario 3

Cons:
- `fabric.setup` will need to recreate the layers, so the model hooks are lost.
- `to_empty` and `reset_parameters` (called from `materialize_meta_tensors`) will both create empty tensors and quantize. There's no way to avoid this unless we make `materialize_meta_tensors` aware of bitsandbytes.

Scenario 4

Cons:
- 8-bit layer materialization is not implemented. I only made the minimal changes required for it.
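To make the double-quantization concern concrete, here is an illustrative sketch of the generic materialization pattern the cons refer to; the function body is an assumption for illustration, not the actual `materialize_meta_tensors` implementation in Lightning:

```python
import torch
import torch.nn as nn


def materialize_meta_tensors(module: nn.Module, device: torch.device) -> None:
    # Illustrative sketch only: replace meta-device storage with real,
    # uninitialized storage, then re-initialize every layer that knows how.
    module.to_empty(device=device)
    for submodule in module.modules():
        reset = getattr(submodule, "reset_parameters", None)
        if callable(reset):
            # For a bitsandbytes 4-bit Linear, this re-initialization also
            # quantizes the placeholder weights ...
            reset()
    # ... so a subsequent load_checkpoint overwrites those weights and has to
    # quantize them again, which is the redundant work described in the cons.
```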
📚 Documentation preview 📚: https://pytorch-lightning--19150.org.readthedocs.build/en/19150/
cc @Borda @carmocca @justusschock @awaelchli