Feature request
Currently, when loading a model in quantized form, the HFQuantizer is created based on other kwargs passed into the from_pretrained function. See the current implementation below.

This should be a straightforward addition, achieved by adding the following lines:
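To illustrate the shape of the proposed change, here is a minimal, self-contained sketch of the dispatch logic. All names here (resolve_quantizer, the mapping, the stand-in classes) are hypothetical stand-ins for the actual transformers internals, not the real implementation:

```python
class HfQuantizer:
    """Stand-in for transformers' HfQuantizer base class."""
    def __init__(self, quantization_config):
        self.quantization_config = quantization_config

class BitsAndBytes4BitQuantizer(HfQuantizer):
    """Stand-in for a built-in quantizer."""

# Stand-in for the library's mapping from quant_method to quantizer class.
AUTO_QUANTIZER_MAPPING = {"bitsandbytes_4bit": BitsAndBytes4BitQuantizer}

def resolve_quantizer(quantization_config=None, quantizer=None):
    # Proposed addition: if the caller already passed an HfQuantizer
    # instance, use it as-is. This is what enables custom quantization
    # methods that are not registered in the mapping.
    if quantizer is not None:
        if not isinstance(quantizer, HfQuantizer):
            raise TypeError("quantizer must be an HfQuantizer instance")
        return quantizer
    # Current behaviour: build the quantizer from kwargs by dispatching
    # on the config's quant_method.
    method = quantization_config["quant_method"]
    return AUTO_QUANTIZER_MAPPING[method](quantization_config)

class MyCustomQuantizer(HfQuantizer):
    """A user-defined quantizer for a method not yet in the library."""

q = resolve_quantizer(quantizer=MyCustomQuantizer({"quant_method": "custom"}))
print(type(q).__name__)  # → MyCustomQuantizer
```

With the current code path, a config whose quant_method is unknown would raise a KeyError; accepting an instance sidesteps the mapping entirely.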
Motivation
This would give users more flexibility and allow one to easily create and integrate custom implementations of the HFQuantizer class. I am personally working on a project where this change is necessary to work with quantization methods that have not yet been added to the library.

Your contribution
I can make a PR and contribute this change.