
Add a compatibility method so callers expecting PreTrainedModel-like …#42342

Closed
sywangyi wants to merge 4 commits into huggingface:main from sywangyi:hf_quanter_auto_crash

Conversation

@sywangyi
Contributor

…API don't crash

pytest tests/quantization/compressed_tensors_integration/test_compressed_tensors.py::CompressedTensorsTest

FAILED tests/quantization/compressed_tensors_integration/test_compressed_tensors.py::CompressedTensorsTest::test_llama_8b_fp8 - AttributeError: 'LlamaRotaryEmbedding' object has no attribute 'get_parameter_or_buffer'
FAILED tests/quantization/compressed_tensors_integration/test_compressed_tensors.py::CompressedTensorsTest::test_tinyllama_w4a16 - AttributeError: 'LlamaRotaryEmbedding' object has no attribute 'get_parameter_or_buffer'
FAILED tests/quantization/compressed_tensors_integration/test_compressed_tensors.py::CompressedTensorsTest::test_tinyllama_w8a16 - AttributeError: 'LlamaRotaryEmbedding' object has no attribute 'get_parameter_or_buffer'
FAILED tests/quantization/compressed_tensors_integration/test_compressed_tensors.py::CompressedTensorsTest::test_tinyllama_w8a8 - AttributeError: 'LlamaRotaryEmbedding' object has no attribute 'get_parameter_or_buffer'

Call stack:

tests/quantization/compressed_tensors_integration/test_compressed_tensors.py:51:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/quantization/compressed_tensors_integration/test_compressed_tensors.py:67: in _test_quantized_model
    quantized_model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
src/transformers/models/auto/auto_factory.py:373: in from_pretrained
    return model_class.from_pretrained(
src/transformers/modeling_utils.py:278: in _wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
src/transformers/modeling_utils.py:3967: in from_pretrained
    device_map = _get_device_map(model, device_map, max_memory, hf_quantizer)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
src/transformers/integrations/accelerate.py:407: in _get_device_map
    device_map = infer_auto_device_map(
src/transformers/integrations/accelerate.py:736: in infer_auto_device_map
    current_buffer_size = compute_module_total_buffer_size(module, hf_quantizer)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
src/transformers/integrations/accelerate.py:260: in compute_module_total_buffer_size
    module_sizes, _ = compute_module_sizes(model, hf_quantizer, buffers_only=True)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
src/transformers/integrations/accelerate.py:240: in compute_module_sizes
    dtype_size = hf_quantizer.param_element_size(model, name)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
src/transformers/quantizers/base.py:182: in param_element_size
    return model.get_parameter_or_buffer(param_name).element_size()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = LlamaRotaryEmbedding(), name = 'get_parameter_or_buffer'

    def __getattr__(self, name: str) -> Union[Tensor, "Module"]:
        if "_parameters" in self.__dict__:
            _parameters = self.__dict__["_parameters"]
            if name in _parameters:
                return _parameters[name]
        if "_buffers" in self.__dict__:
            _buffers = self.__dict__["_buffers"]
            if name in _buffers:
                return _buffers[name]
        if "_modules" in self.__dict__:
            modules = self.__dict__["_modules"]
            if name in modules:
                return modules[name]
>       raise AttributeError(
            f"'{type(self).__name__}' object has no attribute '{name}'"
        )
E       AttributeError: 'LlamaRotaryEmbedding' object has no attribute 'get_parameter_or_buffer'
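The root cause: `HfQuantizer.param_element_size` assumes `model` exposes `PreTrainedModel.get_parameter_or_buffer`, but device-map inference recurses into plain `nn.Module` submodules such as `LlamaRotaryEmbedding`, which only have torch's default `__getattr__` shown above. A minimal sketch of the resolution logic this PR is after, written as a standalone helper rather than a method (the name `get_parameter_or_buffer_compat` and the `TinyRotary` module are illustrative, not from the PR):

```python
import torch
from torch import nn


def get_parameter_or_buffer_compat(module: nn.Module, name: str) -> torch.Tensor:
    """Resolve a (possibly dotted) name as a parameter first, then a buffer.

    Fallback for plain nn.Module instances that lack PreTrainedModel's
    get_parameter_or_buffer helper.
    """
    try:
        return module.get_parameter(name)
    except AttributeError:
        pass
    try:
        return module.get_buffer(name)
    except AttributeError:
        raise AttributeError(
            f"{name!r} is neither a parameter nor a buffer of {type(module).__name__}"
        )


# LlamaRotaryEmbedding registers inv_freq as a (non-persistent) buffer;
# TinyRotary mimics just that, so the module has buffers but no parameters.
class TinyRotary(nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer("inv_freq", torch.ones(4), persistent=False)


rot = TinyRotary()
print(get_parameter_or_buffer_compat(rot, "inv_freq").element_size())  # float32 -> 4
```

With such a helper, `param_element_size` would no longer care whether the module it receives is a full `PreTrainedModel` or one of its plain submodules.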

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
@sywangyi
Contributor Author

@ydshieh please help review

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: apertus, arcee, aria, bamba, bitnet, blt, chameleon, cohere, cohere2, csm, cwm, dbrx, deepseek_v2, deepseek_v3, dia, diffllama

@rogeryoungh rogeryoungh mentioned this pull request Nov 24, 2025
@Rocketknight1
Member

Hmmn, I don't think we want to add that method onto every RotaryEmbedding class! This seems like an issue with the compressed_tensors integration maybe, cc @MekkCyber if you have any idea?

@MekkCyber MekkCyber left a comment


Thanks for your PR @sywangyi! However, this will be fixed soon once we merge this PR: https://github.com/huggingface/transformers/pull/42289/files#diff-7f40070336f6d7b1ffe08e654cdf930080e3cbd4dbbcbee2996fabe4ffc1c2b3, which seems like a shorter solution.
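For reference, one shape such a shorter, call-site fix could take, sketched here as a guess rather than the actual contents of #42289 (`param_element_size_compat` is a hypothetical standalone stand-in for the quantizer method, and `Rotary` an illustrative module): guard on the helper's presence and fall back to the registries every `nn.Module` already carries.

```python
import torch
from torch import nn


def param_element_size_compat(model: nn.Module, param_name: str) -> int:
    # Hypothetical guard, not the actual #42289 diff: prefer the
    # PreTrainedModel helper when present, otherwise resolve the name
    # via named_parameters()/named_buffers(), which any nn.Module has.
    getter = getattr(model, "get_parameter_or_buffer", None)
    if getter is not None:
        return getter(param_name).element_size()
    tensors = dict(model.named_parameters())
    tensors.update(model.named_buffers())
    return tensors[param_name].element_size()


class Rotary(nn.Module):
    """Stand-in for LlamaRotaryEmbedding: one buffer, no parameters."""
    def __init__(self):
        super().__init__()
        self.register_buffer("inv_freq", torch.ones(8))


print(param_element_size_compat(Rotary(), "inv_freq"))  # float32 -> 4
```

This keeps the change confined to the quantizer integration instead of touching every RotaryEmbedding class, which is the concern raised above.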


@sywangyi sywangyi closed this Nov 25, 2025