[NF4] Add quantize_() API support for NF4
#1216
Merged
Background
When I was working on INT8 training stuff a while back, I realized there was no obstacle to making NF4 training work with the quantize_() API (i.e. swapping the plain tensor with a tensor subclass). This is because autograd + compile still works correctly with __torch_function__ (which might not have been the case back when NF4 was first implemented) -> this eliminates the need for a custom nn.Module for downstream users, as well as the need to explicitly call linear_nf4().

Usage
This would also compose nicely with LoRA, making QLoRA implementation more seamless. e.g.
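The composition above can be sketched with a self-contained toy. None of the names below are the real torchao API (NF4Tensor, Linear, quantize_, and the filter_fn parameter are all stand-ins for illustration): the point is the pattern this PR enables, where a quantize_()-style call swaps plain weights for a tensor subclass in place, and a filter lets LoRA adapter weights stay in full precision, which is exactly the QLoRA split.

```python
class NF4Tensor:
    """Toy stand-in for a quantized tensor subclass. In the real design the
    subclass overrides __torch_function__ so autograd + compile keep working;
    here it just wraps the data to make the swap visible."""
    def __init__(self, data, block_size=64):
        self.data = list(data)
        self.block_size = block_size  # illustrative quantization parameter

class Linear:
    """Toy module holding a plain weight and optional child modules."""
    def __init__(self, weight):
        self.weight = weight
        self.children = []

def quantize_(module, subclass, filter_fn=lambda m: True):
    """In-place API sketch: walk the module tree and swap each matching
    plain weight for the tensor subclass -- no custom nn.Module and no
    explicit linear_nf4() call needed."""
    if filter_fn(module):
        module.weight = subclass(module.weight)
    for child in module.children:
        quantize_(child, subclass, filter_fn)

# QLoRA-style composition: quantize the base layer to NF4, but leave the
# LoRA adapter's weight in full precision by filtering on a tag.
base = Linear([1.0, 2.0, 3.0])
adapter = Linear([0.0, 0.0, 0.0])
adapter.is_lora = True
base.children.append(adapter)

quantize_(base, NF4Tensor, filter_fn=lambda m: not getattr(m, "is_lora", False))

print(type(base.weight).__name__)     # the base weight is now an NF4Tensor
print(type(adapter.weight).__name__)  # the adapter weight is untouched
```

Because the swap is purely in-place on existing modules, the same model object can be handed to a LoRA wrapper before or after quantization, which is what makes the QLoRA combination seamless.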