Fix the impl for `to` for int4 weight only use case #522

jerryzh168 · 2024-07-17T19:50:18Z

Summary:
Note that we cannot do the following right now:

* initialize and quantize the model with int4_weight_only quant on cpu
* move the model to cuda

We'll enable this in a separate PR.

Test Plan:
python test/quantization/test_quant_api.py -k test_int4wo_quantized_model_to_device

pytorch-bot · 2024-07-17T19:50:21Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/522

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 3ed1a46 with merge base 6dd82d8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

larryliu0820 · 2024-07-17T19:51:35Z

test/quantization/test_quant_api.py

@@ -637,6 +637,22 @@ def test_quantized_model_to_device(self):
        cuda_res = m(*example_inputs_cuda)
        self.assertEqual(cuda_res.cpu(), ref)

+    # TODO: enable this test


why disable test?

jerryzh168 · 2024-07-17

cpu -> cuda does not work yet, so I changed the test to cuda -> cuda for now.
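
A minimal sketch of what a cuda -> cuda check of this kind could look like, not the test added in this PR. It assumes PyTorch is installed and skips itself when CUDA is unavailable. The class name `ToDeviceSketch`, the helper `run_sketch`, and the layer sizes are hypothetical, and the int4_weight_only quantization step is left as a comment because the exact torchao call is not shown in this thread.

```python
import unittest

try:
    import torch
except ImportError:  # keep the sketch importable without PyTorch
    torch = None

HAS_CUDA = torch is not None and torch.cuda.is_available()


class ToDeviceSketch(unittest.TestCase):
    """Hypothetical stand-in for a cuda -> cuda `to` test."""

    @unittest.skipUnless(HAS_CUDA, "requires PyTorch with CUDA")
    def test_model_cuda_to_cuda(self):
        # Build the model directly on cuda instead of on cpu.
        m = torch.nn.Sequential(torch.nn.Linear(64, 32)).to(
            dtype=torch.bfloat16, device="cuda"
        )
        # Assumption: the real test would apply int4_weight_only
        # quantization to `m` here, while it is already on cuda.
        x = torch.randn(1, 64, dtype=torch.bfloat16, device="cuda")
        ref = m(x)

        # Moving to the device the model already lives on should not
        # change its outputs.
        m.to(device="cuda")
        res = m(x)
        torch.testing.assert_close(res, ref)


def run_sketch():
    """Run the sketch and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ToDeviceSketch)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_sketch()` reports the test as skipped on machines without CUDA, so the sketch stays safe to run anywhere.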

jerryzh168 requested a review from msaroufim July 17, 2024 19:50

facebook-github-bot added the CLA Signed label Jul 17, 2024

jerryzh168 requested a review from HDCharles July 17, 2024 19:50

larryliu0820 reviewed Jul 17, 2024

larryliu0820 approved these changes Jul 17, 2024

jerryzh168 force-pushed the test-to branch 2 times, most recently from 2019b80 to 948b7ed, July 17, 2024 20:40

Fix the impl for `to` for int4 weight only use case

3ed1a46

Summary: Note that we cannot do the following right now:

* initialize and quantize the model with int4_weight_only quant on cpu
* move the model to cuda

We'll enable this in a separate PR.

Test Plan: CI

jerryzh168 force-pushed the test-to branch from 948b7ed to 3ed1a46, July 17, 2024 21:55

jerryzh168 merged commit d36de1b into pytorch:main Jul 17, 2024

jerryzh168 deleted the test-to branch July 17, 2024 23:13
