Support torch_dtype modification and update FAQs for AWQ quantization #2898

Merged

lvhan028 merged 13 commits into InternLM:main from AllentDan:mixtral-w8a8

Dec 25, 2024

Collaborator

AllentDan commented Dec 16, 2024

No description provided.

AllentDan added 3 commits

December 16, 2024 11:19


          Support torch_dtype modification and update FAQs for AWQ quantization

b018f23


          fix lint

dae6d57


          Merge branch 'main' into mixtral-w8a8

5006aec

Conflicts:
	lmdeploy/lite/apis/calibrate.py
	lmdeploy/vl/model/builder.py

lvhan028 reviewed

lmdeploy/lite/quantization/awq.py

lvhan028 reviewed

docs/en/quantization/w4a16.md (outdated)

AllentDan added 2 commits

December 23, 2024 11:06


          Merge branch 'main' into mixtral-w8a8

185643a

Conflicts:
	lmdeploy/lite/apis/calibrate.py


          add clamp-zeros option

00948ee

Weiyun1025 mentioned this pull request

[Bug] facing nan problem assert torch.isnan(p).sum() == 0 OpenGVLab/InternVL#757

Open

3 tasks


          add guidance

6b54470

lvhan028 reviewed

lmdeploy/lite/apis/auto_awq.py (outdated)


          datasets proxy

8c318d0

lvhan028 reviewed

lmdeploy/lite/quantization/awq.py (outdated)

lvhan028 reviewed

lmdeploy/lite/quantization/awq.py (outdated)

lvhan028 reviewed

lmdeploy/lite/quantization/awq.py (outdated)

lvhan028 reviewed

lmdeploy/lite/utils/load.py


          remove clamp_zeros

cbdee59

lvhan028 reviewed

docs/en/quantization/w4a16.md

lvhan028 reviewed

lmdeploy/lite/apis/auto_awq.py

lvhan028 reviewed

lmdeploy/lite/apis/calibrate.py (outdated)

lvhan028 reviewed

lmdeploy/lite/apis/gptq.py

lvhan028 reviewed

lmdeploy/lite/apis/smooth_quant.py (outdated)


          fix comments

e5541bf

lvhan028 reviewed

lmdeploy/lite/quantization/calibration.py (outdated)

lvhan028 reviewed

lmdeploy/lite/quantization/calibration.py

lvhan028 reviewed

lmdeploy/lite/quantization/awq.py (outdated)


          print

7c71c37

lvhan028 approved these changes

lvhan028 added the Bug:P1 label

lvhan028 requested a review from RunningLeon

December 24, 2024 08:14

lvhan028 mentioned this pull request

[Bug] AWQ 4-bit quantize Qwen2-VL-72B-Instruct error #2935

Closed

3 tasks

Collaborator

lvhan028 commented Dec 24, 2024

Test passed with model InternVL2_5-78B.

RunningLeon reviewed

Collaborator

RunningLeon left a comment

You may want to merge the latest main to fix the UT.

AllentDan and others added 3 commits

December 25, 2024 10:31


          Merge branch 'main' into mixtral-w8a8

cba8177


          fix ut hf-token

5cfed5d


          Merge pull request #1 from RunningLeon/hf-token

d9668b3

fix ut hf-token

RunningLeon reviewed

Collaborator

RunningLeon left a comment

Tested OK on Qwen-VL-Chat.

RunningLeon approved these changes

Collaborator

RunningLeon left a comment

LGTM

lvhan028 merged commit 9565505 into InternLM:main

4 of 5 checks passed
