
Tensor size mismatch when using static quantization #665

Closed · yiliu30 opened this issue Aug 13, 2024 · 2 comments
Labels: bug (Something isn't working)

yiliu30 (Contributor) commented Aug 13, 2024

In the static_quant tutorial (torchao/tutorials/calibration_flow/static_quant.py), changing `ToyLinearModel(1024, 1024, 1024)` to `ToyLinearModel()` causes the error below:

Traceback (most recent call last):
  File "projs/torchao/tutorials/calibration_flow/static_quant.py", line 132, in <module>
    m(*example_inputs)
  File "conda_env/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "conda_env/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "projs/torchao/tutorials/calibration_flow/static_quant.py", line 109, in forward
    x = self.linear2(x)
        ^^^^^^^^^^^^^^^
  File "conda_env/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "conda_env/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "projs/torchao/tutorials/calibration_flow/static_quant.py", line 27, in forward
    observed_weight = self.weight_obs(self.weight)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "conda_env/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "conda_env/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "conda_env/torch/ao/quantization/observer.py", line 739, in forward
    return self._forward(x_orig)
           ^^^^^^^^^^^^^^^^^^^^^
  File "conda_env/torch/ao/quantization/observer.py", line 762, in _forward
    min_val = torch.min(min_val_cur, min_val)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: The size of tensor a (64) must match the size of tensor b (32) at non-singleton dimension 0
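
For reference, the same RuntimeError can be reproduced in isolation whenever a single per-channel observer instance sees weights with different output dimensions. The shapes below are illustrative, not taken from the tutorial:

```python
# Illustrative only: a per-channel observer keeps one running min/max entry per
# output channel, so reusing one instance across weights whose first dimensions
# differ fails on the second call.
import torch
from torch.ao.quantization.observer import PerChannelMinMaxObserver

obs = PerChannelMinMaxObserver(ch_axis=0)
obs(torch.randn(32, 64))  # running min/max now have 32 entries
obs(torch.randn(64, 32))  # current min/max have 64 entries ->
                          # RuntimeError: The size of tensor a (64) must match
                          # the size of tensor b (32) at non-singleton dimension 0
```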

cc @jerryzh168

jerryzh168 self-assigned this Aug 13, 2024
jerryzh168 (Contributor) commented

Thanks, I think there is a bug in `replacement_fn`; fixed in #650.
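
The symptom is consistent with the replacement function handing the same observer instance to every converted linear layer. A minimal sketch of the per-module pattern that avoids this is below; `ObservedLinear` and `insert_observers_` are simplified stand-ins for the tutorial's helpers, not the actual change made in #650:

```python
import torch
import torch.nn.functional as F
from torch.ao.quantization.observer import PerChannelMinMaxObserver

class ObservedLinear(torch.nn.Linear):
    """Linear layer that records per-output-channel weight statistics on each forward."""
    def __init__(self, in_features, out_features, weight_obs, bias=True):
        super().__init__(in_features, out_features, bias)
        self.weight_obs = weight_obs

    def forward(self, x):
        self.weight_obs(self.weight)  # observe this layer's own weight
        return F.linear(x, self.weight, self.bias)

def insert_observers_(model: torch.nn.Module) -> None:
    for name, child in model.named_children():
        if isinstance(child, torch.nn.Linear):
            # Build a fresh observer for each linear layer; sharing one instance
            # across linears with different out_features reproduces the mismatch above.
            obs = PerChannelMinMaxObserver(ch_axis=0)
            replacement = ObservedLinear(
                child.in_features, child.out_features, obs,
                bias=child.bias is not None,
            )
            replacement.weight = child.weight
            replacement.bias = child.bias
            setattr(model, name, replacement)
        else:
            insert_observers_(child)
```

With per-layer observers, linear1 and linear2 each track statistics sized to their own out_features, so the default ToyLinearModel() dimensions no longer hit the mismatch.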

msaroufim added the bug (Something isn't working) label Aug 13, 2024
jerryzh168 (Contributor) commented

Should be fixed now.

yanbing-j pushed a commit to yanbing-j/ao that referenced this issue Dec 9, 2024
* executable README
* fix title of CI workflow
* markup commands in markdown
* extend the markup-markdown language
* Automatically identify cuda from nvidia-smi in install-requirements (pytorch#606)
* Update README.md
* Unbreak zero-temperature sampling (pytorch#599); fixes pytorch#581
* Improve process README
* [retake] Add sentencepiece tokenizer (pytorch#626)
* Add white space
* Handle white space
* Handle control ids
* More cleanup
* Lint
* Use unique_ptr
* Use a larger runner
* Debug
* Cleanup
* Update install_utils.sh to use python3 instead of python (pytorch#636); on some devices python and python3 point to different environments, so it is good to unify them
* Fix quantization doc to specify dytpe limitation on a8w4dq (pytorch#629)
* add desktop.json (pytorch#622)
* add fast
* remove embedding
* improvements
* update readme from doc branch
* tab/spc
* fix errors in updown language, and [skip]: begin/end
* a storied run
* stories run on readme instructions does not need HF token
* increase timeout
* check for hang un hf_login
* executable README improvements
* typo

Co-authored-by: Michael Gschwind <[email protected]>
Co-authored-by: Ian Barber <[email protected]>
Co-authored-by: Scott Wolchok <[email protected]>
Co-authored-by: Mengwei Liu <[email protected]>
Co-authored-by: Kimish Patel <[email protected]>
Co-authored-by: Scott Roy <[email protected]>