[reland] Refactor quant_llm to work with affine quantized tensor (#696) #772

jerryzh168 · 2024-08-28T14:19:26Z

Summary:
We want to add quant_llm to affine quantized tensor as a general fp2-fp7 dtype. Before that, we need to refactor the current implementation to work with AffineQuantizedTensor.
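For readers unfamiliar with the "fp2-fp7" wording, here is a minimal sketch of how a small floating-point code with a given number of exponent and mantissa bits could be decoded to a Python float. It is an illustration only, not the torchao implementation: the helper name `decode_fpx` is hypothetical, and the layout (one sign bit, IEEE-style bias, subnormals, no inf/NaN encodings) is an assumption.

```python
def decode_fpx(code: int, ebits: int, mbits: int) -> float:
    """Decode a (1 + ebits + mbits)-bit floating-point code.

    Hypothetical helper for illustration. Assumes 1 sign bit, an
    IEEE-style exponent bias of 2**(ebits - 1) - 1, subnormal numbers,
    and no encodings reserved for inf/NaN.
    """
    if ebits < 1 or mbits < 0:
        raise ValueError("ebits must be >= 1 and mbits must be >= 0")
    nbits = 1 + ebits + mbits
    if not 0 <= code < (1 << nbits):
        raise ValueError(f"code must fit in {nbits} bits")

    # Split the code into sign, exponent, and mantissa fields.
    sign = -1.0 if (code >> (ebits + mbits)) & 1 else 1.0
    exp = (code >> mbits) & ((1 << ebits) - 1)
    man = code & ((1 << mbits) - 1)
    bias = (1 << (ebits - 1)) - 1
    frac = man / (1 << mbits)

    if exp == 0:
        # Subnormal: no implicit leading 1, fixed exponent of (1 - bias).
        return sign * frac * 2.0 ** (1 - bias)
    # Normal: implicit leading 1.
    return sign * (1.0 + frac) * 2.0 ** (exp - bias)


# Example with a 6-bit format using 3 exponent bits and 2 mantissa bits.
one = decode_fpx(0b001100, ebits=3, mbits=2)
largest = decode_fpx(0b011111, ebits=3, mbits=2)
```

Under these assumptions, a 6-bit format with 3 exponent bits and 2 mantissa bits covers magnitudes from 0.0625 up to 28.0.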

Test Plan:
python test/prototype/test_quant_llm.py

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot · 2024-08-28T14:19:30Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/772

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit a74fa26 with merge base 09a5e54:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 · 2024-08-29T20:58:55Z

The problem is that the previous CI error does not reproduce here, so maybe we should do a forward fix. cc @msaroufim @gau-nernst

jerryzh168 · 2024-08-29T22:58:07Z

Update: moved the import for the quant_llm op and verified locally that "import torchao" works.
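One common reason to move an import is to keep a top-level package import from failing when an optional op is unavailable. The sketch below shows that general pattern only. It is not the actual change in this PR, and the names `load_optional_op`, `hypothetical_ext_ops`, and `some_op` are made up for illustration.

```python
import importlib


def load_optional_op(module_name: str, attr: str):
    """Import `module_name` only when the op is requested.

    Returns the attribute `attr` from the module, or None when the
    module (or the attribute) is not available. Hypothetical helper.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The optional module is missing; callers can fall back or skip.
        return None
    return getattr(module, attr, None)


# A module that exists in the standard library resolves normally.
sqrt = load_optional_op("math", "sqrt")

# A module that does not exist yields None instead of raising at import time.
missing = load_optional_op("hypothetical_ext_ops", "some_op")
```

With the import deferred like this, a missing optional module only matters at the point where the op is actually used.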

facebook-github-bot added the CLA Signed label Aug 28, 2024

jerryzh168 added 4 commits August 29, 2024 13:04

import issue

ceaf5f5

fix tests

1e7cf49

rebase

37e6534

jerryzh168 force-pushed the reland-fpx branch from 890c7eb to 37e6534 Compare August 29, 2024 20:06

jerryzh168 requested review from msaroufim and gau-nernst August 29, 2024 20:59

jerryzh168 added 2 commits August 29, 2024 15:50

fix import error

072b7e9

minor fix

a74fa26

msaroufim approved these changes Aug 29, 2024

jerryzh168 merged commit 05224a9 into pytorch:main Aug 29, 2024
16 checks passed

jerryzh168 deleted the reland-fpx branch August 29, 2024 23:40

jerryzh168 mentioned this pull request Aug 29, 2024

[Tracker] WIP features for torchao 0.5 #667

Closed

17 tasks

yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024

Advanced.md (pytorch#772)

fc43771

* move gguf tests to script * execute advanced instructions
