Fix dimension issues for int4 weight only quant path #330

jerryzh168 · 2024-06-06T18:20:50Z

Summary:
The input dimensions accepted by _quantized_linear are currently unclear; this PR fixes that.

The "tensor_core_tiled" layout tensor currently does not repack its data in the view operation, which is incorrect. This PR removes view support (which is not needed right now) and restricts the supported case to the transpose op. For performance, the tensor records its transpose status instead of repacking.
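As a rough illustration of the "record the transpose status instead of repacking" idea, here is a minimal pure-Python sketch. The class `PackedInt4Weight` and its fields are hypothetical stand-ins invented for this example, not the torchao implementation; the point is only that `t()` swaps the external shape and flips a flag while the packed payload is left untouched, and that `view` is rejected because it would require a repack.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PackedInt4Weight:
    """Hypothetical stand-in for a packed int4 weight (not the torchao class)."""

    packed: bytes             # opaque packed payload; never reordered here
    shape: tuple              # external (logical) shape reported to callers
    transposed: bool = False  # bookkeeping flag consulted by the consumer

    def t(self):
        # Swap the external dims and flip the flag; the payload is reused as-is.
        rows, cols = self.shape
        return replace(self, shape=(cols, rows), transposed=not self.transposed)

    def view(self, *shape):
        # A general view would need the payload unpacked and repacked,
        # so this sketch refuses rather than returning a wrong layout.
        raise NotImplementedError("view is not supported for the packed layout")


# 4 x 8 int4 values occupy 32 nibbles, i.e. 16 bytes.
w = PackedInt4Weight(packed=bytes(16), shape=(4, 8))
wt = w.t()
```

A consumer such as a linear op would then check `transposed` to decide how to interpret the payload, rather than paying for a repack on every transpose.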

Test Plan:
python test/quantization/test_quant_api.py
python test/integration/test_integration.py

TORCH_LOGS='output_code' python tutorials/quantize_vit/run_vit_b_quant.py

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot · 2024-06-06T18:20:53Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/330

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

Rebase your PRs: Unstable CUDA signal in CI caused by cudnn 9 update

✅ No Failures

As of commit 0732101 with merge base e2196fd:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

HDCharles · 2024-06-06T19:43:52Z

why did you change the name to transpose change shape? it doesn't have anything to do with the transpose handling, it just changes the external shape.

jerryzh168 · 2024-06-06T21:19:06Z

> why did you change the name to transpose change shape? it doesn't have anything to do with the transpose handling, it just changes the external shape.

it's not a general _change_shape, since I think it only makes sense when swapping dimensions for a transpose; any other shape change would require unpacking and repacking the packed weight

cpuhrsch

I'd also like to suggest some unit tests for aqt eventually. They're actually quite easy to write for shape operations, because a dtype has no influence on the change in shape.

That means you can do self.assertEqual(fn(t).shape, fn(to_aqt(t)).shape).

We also have OpInfos and such that could be used here. For example, some tests might easily translate, just with different tolerances due to decreased bit width.
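A minimal sketch of the shape-parity test suggested above. To keep it runnable without torch, `Dense` and `Quantized` are hypothetical stand-in classes and `to_quantized` is a hypothetical analogue of `to_aqt`; with real tensors, `Dense` would be a `torch.Tensor` and the op table would list actual shape operations.

```python
import unittest
from dataclasses import dataclass


@dataclass(frozen=True)
class Dense:
    """Stand-in for a plain tensor; only the shape matters for this test."""

    shape: tuple

    def t(self):
        return Dense(self.shape[::-1])


@dataclass(frozen=True)
class Quantized:
    """Stand-in for a quantized wrapper; `bits` must not influence the shape."""

    shape: tuple
    bits: int = 4

    def t(self):
        return Quantized(self.shape[::-1], self.bits)


def to_quantized(t):
    # Hypothetical analogue of `to_aqt`: wrap a dense tensor, keeping its shape.
    return Quantized(t.shape)


# Shape operations under test; extend this table as more ops are supported.
SHAPE_OPS = {
    "transpose": lambda x: x.t(),
    "double_transpose": lambda x: x.t().t(),
}


class ShapeParityTest(unittest.TestCase):
    def test_shape_ops_match_dense(self):
        t = Dense((128, 256))
        for name, fn in SHAPE_OPS.items():
            with self.subTest(op=name):
                # The dtype has no influence on the resulting shape.
                self.assertEqual(fn(t).shape, fn(to_quantized(t)).shape)
```

Because the assertion compares shapes only, the same test body should carry over to any op whose output shape does not depend on dtype.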

jerryzh168 · 2024-06-06T23:05:02Z

> I'd also like to suggest some unit tests for aqt eventually. They're actually quite easy to write for shape operations, because a dtype has no influence on the change in shape.
>
> That means you can do self.assertEqual(fn(t).shape, fn(to_aqt(t)).shape).
>
> We also have OpInfos and such that could be used here. For example, some tests might easily translate, just with different tolerances due to decreased bit width.

yeah sure, I was thinking of adding some tests for aqt afterwards, but I can start with transpose

jerryzh168 · 2024-06-07T23:48:27Z

> why did you change the name to transpose change shape? it doesn't have anything to do with the transpose handling, it just changes the external shape.

OK, removed the function since it's a bit confusing; we can add it back later if needed. I understand now that it just changes the external shape without touching the internal data representation.

msaroufim · 2024-06-10T16:34:22Z

@HDCharles mind reviewing this one more time?

jerryzh168 requested a review from HDCharles June 6, 2024 18:20

facebook-github-bot added the CLA Signed label Jun 6, 2024

jerryzh168 requested review from cpuhrsch and msaroufim June 6, 2024 18:21

cpuhrsch reviewed Jun 6, 2024

jerryzh168 force-pushed the fix-int4-dim branch 4 times, most recently from 6e07180 to c0600c2 June 7, 2024 03:03

jerryzh168 force-pushed the fix-int4-dim branch from c0600c2 to 0732101 June 7, 2024 17:52

jerryzh168 requested a review from cpuhrsch June 7, 2024 21:05

msaroufim removed their request for review June 9, 2024 17:21

jerryzh168 requested review from msaroufim and removed request for msaroufim June 10, 2024 17:05

HDCharles approved these changes Jun 10, 2024

jerryzh168 merged commit 79f2c7f into pytorch:main Jun 10, 2024
13 checks passed

yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024

Add CodeLlama usage to the README and make sure it works (pytorch#330)

81d09b7
