
Add support for quantize_() with Float8Linear module #1344

Merged
jainapurva merged 6 commits into main from fp8_linear_quantize on Nov 28, 2024

Conversation

jainapurva (Contributor)

Added support for the quantize_() API to work with models trained with float8, using Float8Linear.
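
A minimal sketch of the flow this enables, assuming the standard torchao entry points (convert_to_float8_training, quantize_, int8_weight_only); the toy model and config are illustrative only:

import torch.nn as nn
from torchao.float8 import convert_to_float8_training
from torchao.quantization import quantize_, int8_weight_only

# Toy model; convert_to_float8_training swaps each nn.Linear for a Float8Linear.
model = nn.Sequential(nn.Linear(64, 64), nn.Linear(64, 64))
convert_to_float8_training(model)

# ... float8 training loop elided ...

# With this change, quantize_ first converts Float8Linear modules back to
# plain nn.Linear, then applies post-training quantization as usual.
quantize_(model, int8_weight_only())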

@jainapurva added the topic: bug fix label on Nov 26, 2024

pytorch-bot bot commented Nov 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1344

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit f66e7ed with merge base ed76e9c:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Nov 26, 2024
drisspg (Contributor) left a comment:


I think this makes sense; can you add some tests?

One alternative would be to maintain the Float8Linear structure and then swap the weight class dtype (feels weird).

I think this is a good argument for subclasses, since you can maintain the structure, assert that all low-precision subclasses have a dequant method, and call that to convert to fp32. I know Christian wants tensor.to(torch.float32) to do this, but I think it's too magical.

cc @vkuzo
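
A minimal sketch of that subclass contract, with hypothetical names (LowPrecisionTensor and maybe_dequantize are made up for illustration; real torchao subclasses differ):

import torch

class LowPrecisionTensor(torch.Tensor):
    # Hypothetical marker base class: every low-precision tensor subclass
    # must know how to materialize itself back to a plain fp32 tensor.
    def dequantize(self) -> torch.Tensor:
        raise NotImplementedError

def maybe_dequantize(t: torch.Tensor) -> torch.Tensor:
    # Explicit dequant hook, rather than overloading tensor.to(torch.float32).
    return t.dequantize() if isinstance(t, LowPrecisionTensor) else t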

@@ -222,6 +224,9 @@ def _replace_with_custom_fn_if_matches_filter(
     Returns:
         None
     """
+    # If model is Float8Linear, convert it to Linear before moving forward
+    if isinstance(model, Float8Linear):
+        model = dequantize_float8_training(model)
Contributor:

can you just move your code snippet from the other file here:

if isinstance(model, Float8Linear):
    # Construct the replacement on the meta device so no real memory is
    # allocated, then point it at the already-trained parameters.
    with torch.device("meta"):
        new_module = nn.Linear(model.in_features, model.out_features)
    new_module.weight = model.weight
    new_module.bias = model.bias
    model = new_module

and not need any changes to torchao/float8?

Contributor:

@vkuzo what do you think about having dequantizing a model as a separate API? It feels a bit weird to have this logic in _replace_with_custom_fn_if_matches_filter, which is supposed to be a simple module replacement function.
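
For illustration, a rough sketch of what such a separate API could look like (the name dequantize_ is hypothetical), reusing the module-swap logic from the suggestion above:

import torch
import torch.nn as nn
from torchao.float8.float8_linear import Float8Linear

def dequantize_(model: nn.Module) -> nn.Module:
    # Hypothetical public API: recursively swap Float8Linear modules back to
    # plain nn.Linear in place, keeping the trained parameters.
    for name, child in model.named_children():
        dequantize_(child)
        if isinstance(child, Float8Linear):
            with torch.device("meta"):
                new_child = nn.Linear(child.in_features, child.out_features)
            new_child.weight = child.weight
            new_child.bias = child.bias
            setattr(model, name, new_child)
    return model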

Contributor:

my 5c:

  1. we can make it work now without adding any public APIs, with a minimal increase in complexity
  2. if it's important to have a public API for "remove low precision training from a model", we can have that conversation in parallel

wdyt

Contributor:

the motivation for adding a new API is to make the dequantizing step more explicit for the user, instead of hiding it in a module replacement function.

but I agree this can happen in parallel. it's probably not worth spending time discussing this right now; waiting until there are more use cases might be better

vkuzo (Contributor) left a comment:


would be good to hear some motivation for why this needs a public API versus doing the same thing without a new one

@jainapurva force-pushed the fp8_linear_quantize branch 2 times, most recently from 8d1f189 to a184759, on November 27, 2024 at 00:23
@jainapurva marked this pull request as ready for review on November 27, 2024 at 23:09
jerryzh168 (Contributor) left a comment:


LGTM

@jainapurva merged commit c45d975 into main on Nov 28, 2024
17 of 18 checks passed