Supports Loading Quantized Models with from_preset()
#2367
fd28a15 to 1b07517
mattdangerw
left a comment
Thanks!
88e2cec to 430d7b9
430d7b9 to 58dfab9
JyotinderSingh
left a comment
Resolved comments
/gemini review
Code Review
This pull request effectively addresses an issue with loading quantized models from presets by introducing a _resolve_dtype utility function and ensuring dtype policies are correctly serialized. The changes are logical and well-tested. I have a couple of minor suggestions to fix a test assertion message and improve docstring formatting to align with the style guide.
mattdangerw
left a comment
lgtm! just a couple nits
0161fb9 to 7eb8f1e
mattdangerw
left a comment
Thanks!
Description of the change
This change resolves an issue with loading quantized models from presets. Previously, the model's serialized `DTypePolicyMap` was not correctly passed to the backbone during loading, which caused failures during initialization of quantized layers.

The fix introduces a new `_resolve_dtype` utility function that determines the correct `dtype` for the model based on the following rules:

- User-specified `dtype`: If a user explicitly provides a `dtype` in the `from_preset` call (e.g., `from_preset("bert_tiny_en_uncased", num_classes=2, dtype="float32")`), that value is used.
- Float type casting: If no user `dtype` is provided and the saved `dtype` is a floating-point type (e.g., "float32"), the model will be loaded using the current Keras default `dtype` policy. This allows for safe casting between different floating-point precisions.
- `DTypePolicyMap`: If no user `dtype` is provided and the saved `dtype` is a complex object (like a `DTypePolicyMap` for quantization), the saved type is used as is. This ensures that quantization configurations are preserved during loading.

Colab Notebook
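The three resolution rules in the description can be sketched in plain Python. This is only an illustration of the decision order, not the PR's actual `_resolve_dtype`: the function name `resolve_dtype`, its signature, the `FLOAT_DTYPES` set, and the dict used as a stand-in for a serialized `DTypePolicyMap` are all assumptions made for this example, and the Keras default policy is passed in as a plain string rather than read from Keras.

```python
# Hypothetical sketch of the resolution order described above.
# None of these names come from the PR; they are illustrative only.

# Assumed set of floating-point dtype names for this example.
FLOAT_DTYPES = frozenset({"float16", "bfloat16", "float32", "float64"})


def is_float_dtype(dtype):
    """Return True if `dtype` is a plain floating-point dtype name."""
    return isinstance(dtype, str) and dtype in FLOAT_DTYPES


def resolve_dtype(user_dtype, saved_dtype, default_policy="float32"):
    """Pick the dtype to load a model with.

    user_dtype: dtype passed explicitly by the caller, or None.
    saved_dtype: dtype stored in the preset config (a string or a
        serialized policy object).
    default_policy: stand-in for the current global default policy.
    """
    # Rule 1: an explicit user choice always wins.
    if user_dtype is not None:
        return user_dtype
    # Rule 2: a saved float dtype is replaced by the current default,
    # which allows casting between floating-point precisions.
    if is_float_dtype(saved_dtype):
        return default_policy
    # Rule 3: anything more complex (e.g. a quantization policy map)
    # is kept exactly as saved.
    return saved_dtype


# Stand-in for a serialized policy map; the real structure may differ.
saved_map = {"class_name": "DTypePolicyMap", "config": {}}

print(resolve_dtype("float32", saved_map))          # float32
print(resolve_dtype(None, "float16", "bfloat16"))   # bfloat16
print(resolve_dtype(None, saved_map) is saved_map)  # True
```

Checking the user value first keeps an explicit override predictable, while returning the saved object untouched in the last branch is what preserves the quantization configuration.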
Checklist