Mixed precision export support for gptq quantized model #1853

Merged
kunal-vaishnavi merged 1 commit into microsoft:main from CodeLinaro:gptqmodel_mixed_precision on Nov 11, 2025
Conversation

@rM-planet (Contributor) commented Nov 3, 2025

1. Changes in OGA to support mixed-precision export of models quantized with GPTQModel.
2. Changes to decide whether or not to use a packed MatMul, based on the Q, K, and V precisions.
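The second change can be sketched as a small predicate: pack Q, K, and V into one MatMul only when all three projections share the same precision. This is a minimal illustration of the stated decision rule, not the actual OGA code; the helper name `use_packed_matmul` and the precision strings are assumptions.

```python
# Hypothetical sketch of the packed-MatMul decision described in this PR.
# The function name and precision labels are illustrative assumptions,
# not the real onnxruntime-genai (OGA) API.

def use_packed_matmul(q_precision: str, k_precision: str, v_precision: str) -> bool:
    """Return True only when Q, K, and V share one precision.

    With mixed precisions (e.g. one projection left in fp16 while the
    others are int4-quantized by GPTQModel), the projections cannot be
    fused into a single packed MatMul and must be exported separately.
    """
    return q_precision == k_precision == v_precision

# Uniform int4 Q/K/V can be packed; a mixed int4/fp16 layout cannot.
print(use_packed_matmul("int4", "int4", "int4"))  # True
print(use_packed_matmul("int4", "fp16", "int4"))  # False
```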

@rM-planet (Contributor, Author): @baijumeswani Please review. Thanks!

@rM-planet rM-planet force-pushed the gptqmodel_mixed_precision branch from f74cae5 to e6ff697 Compare November 6, 2025 01:39
@gtonpe commented Nov 7, 2025: In Review

@rM-planet rM-planet force-pushed the gptqmodel_mixed_precision branch 2 times, most recently from f6386c2 to 10059d2 Compare November 7, 2025 23:41
@rM-planet rM-planet force-pushed the gptqmodel_mixed_precision branch from 10059d2 to 107e483 Compare November 10, 2025 17:50
@rM-planet rM-planet force-pushed the gptqmodel_mixed_precision branch from 107e483 to 2fd49fd Compare November 10, 2025 22:37
@kunal-vaishnavi kunal-vaishnavi enabled auto-merge (squash) November 11, 2025 18:12
@kunal-vaishnavi kunal-vaishnavi merged commit 14c4999 into microsoft:main Nov 11, 2025
15 checks passed

3 participants