Deprecate top level quantization APIs (#344) #357

jerryzh168 · 2024-06-13T02:01:21Z

Summary:
This PR deprecates several top-level quantization APIs. BC-breaking notes:

1. int8 weight only quantization

int8 weight only quant module swap API

apply_weight_only_int8_quant(model)

and

int8 weight only tensor subclass API

change_linear_weights_to_int8_woqtensors(model)

-->

unified tensor subclass API

quantize(model, get_apply_int8wo_quant())

2. int8 dynamic quantization

apply_dynamic_quant(model)

or

change_linear_weights_to_int8_dqtensors(model)

-->

unified tensor subclass API

quantize(model, get_apply_int8dyn_quant())

3. int4 weight only quantization

change_linear_weights_to_int4_wotensors(model)

-->

unified tensor subclass API

quantize(model, get_apply_int4wo_quant())
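
For anyone updating call sites, the old-to-new mapping in the notes above can be written down as a small lookup. This is only a sketch: the dictionary `OLD_TO_NEW` and the helper `migration_hint` are hypothetical names made up for illustration and are not part of torchao. The API names themselves are taken from the notes above; the snippet does not import or call torchao.

```python
# Old top-level API name -> factory to pass to quantize(), per the notes above.
# The dictionary and helper are illustrative only, not part of torchao.
OLD_TO_NEW = {
    "apply_weight_only_int8_quant": "get_apply_int8wo_quant",
    "change_linear_weights_to_int8_woqtensors": "get_apply_int8wo_quant",
    "apply_dynamic_quant": "get_apply_int8dyn_quant",
    "change_linear_weights_to_int8_dqtensors": "get_apply_int8dyn_quant",
    "change_linear_weights_to_int4_wotensors": "get_apply_int4wo_quant",
}


def migration_hint(old_name: str) -> str:
    """Return the replacement call (as text) for a deprecated API name."""
    try:
        factory = OLD_TO_NEW[old_name]
    except KeyError:
        raise ValueError(f"{old_name!r} is not one of the deprecated APIs") from None
    return f"quantize(model, {factory}())"


if __name__ == "__main__":
    # Print one "old -> new" line per deprecated API.
    for name in OLD_TO_NEW:
        print(f"{name}(model)  ->  {migration_hint(name)}")
```

Both int8 weight-only entry points map to the same replacement, as do both int8 dynamic quantization entry points, so five old names collapse into three `quantize(...)` calls.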

Test Plan:
python test/quantization/test_quant_api.py
python test/integration/test_integration.py



pytorch-bot · 2024-06-13T02:01:23Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/357

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e9fde84 with merge base 950a893:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 · 2024-06-13T02:06:14Z

Looks like there are not many tests for the release branch cherry-pick merge.

facebook-github-bot added the CLA Signed label Jun 13, 2024

jerryzh168 requested a review from msaroufim June 13, 2024 02:01

msaroufim approved these changes Jun 13, 2024


jerryzh168 merged commit 7e0027a into pytorch:release/0.3 Jun 13, 2024
5 checks passed
