Support model.to int8 weight only quantized model #122

Merged
jerryzh168 merged 1 commit into main from int8
Apr 4, 2024


jerryzh168
Contributor

Summary:
registering fields as buffers so they get picked up in model.to

Test Plan:
python test/quantization/test_quant_api.py -k test_int8_wo_quant_save_load

Reviewers:

Subscribers:

Tasks:

Tags:
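
To illustrate the summary: `nn.Module.to` only walks a module's parameters and registered buffers, so a tensor stored as a plain Python attribute is left untouched. The sketch below is a minimal illustration, not the code changed in this PR. `PlainAttrLinear` and `BufferLinear` are hypothetical stand-ins, and a dtype cast is used instead of a device move so the example runs on CPU.

```python
import torch
import torch.nn as nn


class PlainAttrLinear(nn.Module):
    """Hypothetical module: quantized fields kept as plain tensor attributes."""

    def __init__(self, out_features: int, in_features: int):
        super().__init__()
        self.weight = torch.zeros(out_features, in_features, dtype=torch.int8)
        self.scales = torch.ones(out_features, dtype=torch.float32)


class BufferLinear(nn.Module):
    """Hypothetical module: the same fields registered as buffers."""

    def __init__(self, out_features: int, in_features: int):
        super().__init__()
        self.register_buffer(
            "weight", torch.zeros(out_features, in_features, dtype=torch.int8)
        )
        self.register_buffer(
            "scales", torch.ones(out_features, dtype=torch.float32)
        )


plain = PlainAttrLinear(4, 8).to(torch.float64)
buffered = BufferLinear(4, 8).to(torch.float64)

# The plain attribute is invisible to .to(); the buffer follows the cast.
print(plain.scales.dtype)
print(buffered.scales.dtype)
# A dtype cast only applies to floating-point tensors, so int8 weights stay int8.
print(buffered.weight.dtype)
# Only registered buffers (and parameters) appear in the state dict.
print(sorted(plain.state_dict().keys()))
print(sorted(buffered.state_dict().keys()))
```

Registered buffers are also included in `state_dict()`, which is presumably what the save/load test in the test plan exercises.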

jerryzh168 requested a review from cpuhrsch April 4, 2024 06:49
facebook-github-bot added the CLA Signed label Apr 4, 2024
jerryzh168 requested a review from HDCharles April 4, 2024 06:49
jerryzh168 merged commit 76e2ef5 into main Apr 4, 2024
7 checks passed
jerryzh168 deleted the int8 branch April 4, 2024 07:10
jerryzh168 added a commit that referenced this pull request Apr 4, 2024
* add int4 non-gptq and bugfixes (#119)

Summary: int4weightlinear had a bug that made it not pad when it should
have

Test Plan: python test/quantization/test_quant_api.py -k "int4wo"

Reviewers:

Subscribers:

Tasks:

Tags:

* fixing bug in GPTQ (#120)

* fixing bug in GPTQ

Summary: shape was always padded even when not needed.

Test Plan: python test/quantization/test_quant_api.py -k
"test_gptq_quantizer_int4wo"

Reviewers:

Subscribers:

Tasks:

Tags:

* removing extra spaces

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

* Support `model.to` int8 weight only quantized model (#122)

Summary:
registering fields as buffers so they get picked up in `model.to`

Test Plan:
python test/quantization/test_quant_api.py -k test_int8_wo_quant_save_load
Reviewers:

Subscribers:

Tasks:

Tags:

---------

Co-authored-by: HDCharles <[email protected]>
apply_weight_only_int8_quant(m)
example_inputs = m.example_inputs()
ref = m(*example_inputs)
_TMP_FN = "_test.pt"

Using named temporary files might be better
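
One way to act on this suggestion is sketched below: save the state dict into a `tempfile.NamedTemporaryFile` instead of a fixed `_test.pt` path, so nothing is left behind in the working directory and parallel test runs cannot collide. This is only an illustration under assumptions. A plain `nn.Linear` stands in for the quantized model `m`, and the shapes are arbitrary.

```python
import tempfile

import torch
import torch.nn as nn

# Stand-in for the quantized model in the test (an assumption for this sketch).
model = nn.Linear(8, 4)
example_input = torch.randn(2, 8)
ref = model(example_input)

# The file is created with a unique name and deleted when the block exits.
with tempfile.NamedTemporaryFile() as f:
    torch.save(model.state_dict(), f)
    f.seek(0)  # rewind before reading back from the same handle
    state_dict = torch.load(f)

# Load the saved state into a fresh model and compare outputs.
restored = nn.Linear(8, 4)
restored.load_state_dict(state_dict)
res = restored(example_input)

print(torch.equal(ref, res))
```

Writing and reading through the same open handle avoids reopening the file by name, which is not portable to every platform while the handle is still open.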

dbyoung18 pushed a commit to dbyoung18/ao that referenced this pull request Jul 31, 2024
yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024