Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
57f7874 to 9607be8
SunMarc
left a comment
A few nits but overall fine!
```python
source_keys: list[str],
target_keys: list[str],
full_layer_name: str,
model,
missing_keys,
config,
**kwargs,
```
Most of the args should be optional kwargs, so that we can clean up the other convert functions with **kwargs and only declare the args that are actually used. That's fine for now; we should do it later on.
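A hedged sketch of what that cleanup could look like: most arguments become optional keyword-only parameters, and unused ones are swallowed by **kwargs. The names mirror the snippet above, but the function itself is hypothetical, not the real API.

```python
# Hypothetical sketch only: most args become optional keyword-only, so each
# convert function declares just what it uses and forwards the rest.
def convert(source_keys, target_keys, *, full_layer_name=None, model=None,
            missing_keys=None, config=None, **kwargs):
    # A converter that only needs `config` can ignore every other kwarg.
    used = {"config": config} if config is not None else {}
    return {"source_keys": source_keys, "target_keys": target_keys, **used}

result = convert(["weight:_data"], ["weight"], config={"bits": 8}, extra="ignored")
```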
```python
if self.pre_quantized:
    return False
```
This is typically one of the cases where we get the wrong numel calculation, since we are skipping them. We should try to fix that at some point; it should be quite simple.
```python
WeightConverter(
    source_keys=["weight:_data"],
    target_keys="weight",
    operations=[TorchAoDeserialize(self)],
```
A WeightRename should be enough in this case, no?
Yes, both are fine I guess.
FYI, we just changed `weight:_data` to `weight_qdata` so these things can be attached to the module directly, in case we need it in the future. pytorch/ao@ba3ac9f
WeightConverter is better than WeightRename here because there is an op!
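As a toy illustration of the distinction in this thread (class names echo the discussion but are stand-ins, not the actual transformers API): a pure rename leaves the tensor untouched, while a converter also runs its operations on the value.

```python
# Stand-in classes for illustration only; not the real transformers API.
class WeightRename:
    def __init__(self, src, dst):
        self.src, self.dst = src, dst

    def apply(self, value):
        return value  # only the key changes; the tensor is untouched


class WeightConverter(WeightRename):
    def __init__(self, src, dst, operations=()):
        super().__init__(src, dst)
        self.operations = list(operations)

    def apply(self, value):
        for op in self.operations:  # e.g. a deserialize step
            value = op(value)
        return value


deserialize = lambda raw: {"deserialized": raw}  # hypothetical op
conv = WeightConverter("weight:_data", "weight", operations=[deserialize])
```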
```python
full_layer_name: str | None = None,
missing_keys=None,
**kwargs,
) -> dict[str, torch.Tensor]:
```
Safe serialization doesn't work yet because of torchao, so it is fine to just clean up a bit; we can come back to that later on.
```python
# print("metadata", self.hf_quantizer.metadata)
raise ValueError("To use `safetensors` serialization, you should have `torchao>=0.14.0` installed")

new_param = unflatten_tensor_state_dict(param_data, self.hf_quantizer.metadata)[full_layer_name]
```
In a follow-up PR, we can modify this to work with all tensor subclasses and with sharded checkpoint files.
I'm thinking that in this convert function, we load the tensor subclass components (i.e. _weight_qdata) as module parameters. After all files are loaded, we can delete them and replace the actual layer weights with the reconstructed quantized tensors.
See #41998 for details. Will this approach still work with the new refactoring? cc @jerryzh168
@liangel-02 Yeah, I think our original approach should still work. I guess it's fine to land this PR first, and you can re-open #41998 on top of these new changes, since you are more familiar with this part.
Thanks both for chiming in! 🤗
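The two-phase idea sketched in this thread (first collect the flattened tensor-subclass components from the state dict, then reconstruct one quantized weight per layer once every file has been read) could look roughly like this. Plain dicts stand in for tensors, and the `layer.weight:component` key scheme is an assumption for illustration.

```python
# Illustrative only: plain dicts stand in for tensors/subclasses.
def collect_components(state_dict):
    """Group flat 'layer.weight:component' keys by layer."""
    layers = {}
    for key, value in state_dict.items():
        base, _, component = key.rpartition(":")
        layers.setdefault(base, {})[component] = value
    return layers

def reconstruct(components):
    """Stand-in for rebuilding the quantized tensor subclass from its parts."""
    return {"qdata": components["qdata"], "scale": components["scale"]}

flat = {"layer.weight:qdata": [1, 2], "layer.weight:scale": 0.5}
weights = {name: reconstruct(parts) for name, parts in collect_components(flat).items()}
```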
```python
if self.pre_quantized:
    return [
        WeightConverter(
            source_keys=["weight:qdata", "weight:scale", "weight:zero_point"],
```
Nit: maybe also add [weight_qdata, weight_scale] as well, since zero_point may be optional, like https://github.com/pytorch/ao/blob/2ff1eb2e356275cfbe46832327387d382c72945d/torchao/quantization/quantize_/workflows/float8/float8_tensor.py#L99
Let's do that in a follow-up PR, since safetensors support is broken with the latest torchao version.
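When that follow-up happens, one simple way to make zero_point optional is to register both key patterns. The WeightConverter below is a minimal stand-in for the real class, kept only to show the shape of the idea.

```python
# Minimal stand-in for the real WeightConverter, for illustration only.
class WeightConverter:
    def __init__(self, source_keys, target_keys):
        self.source_keys = source_keys
        self.target_keys = target_keys

def get_conversions():
    return [
        # full pattern, with zero_point
        WeightConverter(["weight:qdata", "weight:scale", "weight:zero_point"], "weight"),
        # fallback pattern: zero_point may be absent from the checkpoint
        WeightConverter(["weight:qdata", "weight:scale"], "weight"),
    ]
```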
ArthurZucker
left a comment
Great work! Thanks 🤗
```python
if hf_quantizer is not None:
    weight_conversions.extend(hf_quantizer.get_weight_conversions())
```
[For maintainers] Suggested jobs to run (before merge): run-slow: finegrained_fp8, torchao_integration
* inital commit
* up
* update unexpected later on
* fix
* update
* simplify our lives
* isolate a bit more
* fixup
* small nits
* style
* nit
* fix common cases
* fix post merge
* bnb needs missing keys
* small fix
* bettrer documentation
* no veradict + base class
* rake review comments
* take all comments
* fix super init
* update doc to be more real
* up
* fix some tests
* weight convertor
* up
* mostly correct
* oups
* skip non linears
* only some tests to go
* need quantization
* fix tests
* rm comment
* revert
* revert 2
* style
* up
* up
* remove unsafe loading path
* fix
* fix
* fix
* up
* rm Dtensor import
* rm replicate import
* fix imports
* up
* minor modifications
* add quantizaton_operation
* delattr
* fix
* fix get_param_buffer
* better to just set module initialized
* rm tie_weights
* guard imports
* up
* rm offloading for now
* add license
* don't guard torch
* comment
* fix
* rm torch.grad
* revert
* fix
* add guard
* add second guard

---------

Co-authored-by: Arthur <arthur.zucker@gmail.com>
Co-authored-by: Marc Sun <marc@huggingface.co>
What does this PR do?

Refactors the torchao quantization method to use conversion ops instead of the classical `create_quantized_param`.
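A minimal sketch of the shift this PR describes: instead of an imperative per-parameter hook, the quantizer contributes declarative conversion ops that the loader applies uniformly to the state dict. All names here are illustrative, not the actual implementation.

```python
# Illustrative only: a declarative op the loader can apply uniformly.
class RenameOp:
    def __init__(self, src, dst):
        self.src, self.dst = src, dst

    def apply(self, state_dict):
        return {(self.dst if k == self.src else k): v for k, v in state_dict.items()}

ops = [RenameOp("weight:_data", "weight")]
state_dict = {"weight:_data": [1, 2, 3]}
for op in ops:
    state_dict = op.apply(state_dict)
```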