Quantize QuestionAnswering models #1581
base: master
Conversation
Looks good to me.
The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1581/bc9ce0f899079a3cf1a8d80bbb1fb746dd3c69b5/index.html
scripts/question_answering/models.py
Outdated
@@ -287,10 +289,14 @@ def forward(self, tokens, token_types, valid_length, p_mask, start_position):
            Shape (batch_size, sequence_length)
        answerable_logits
        """
        backbone_net = self.backbone
        if self.quantized:
Suggested change:
-        if self.quantized:
+        if self.quantized_backbone is not None:

and remove `quantized`?
I thought about it, but I kept this `quantized` flag as a switch to turn the quantized model on and off - I'm not sure whether it is really useful. What do you think?
It doesn't seem useful to me.
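For illustration, a minimal sketch of the pattern under discussion: dispatching on whether a quantized backbone has been attached, rather than on a separate boolean flag. The class shown here is hypothetical; only `backbone` and the suggested condition come from the diff above.

```python
# Hypothetical, condensed model class illustrating the suggested dispatch.
class ModelForQA:
    def __init__(self, backbone):
        self.backbone = backbone
        # Populated only after the network has been calibrated and quantized,
        # so "is not None" doubles as the on/off switch for inference.
        self.quantized_backbone = None

    def forward(self, tokens, token_types, valid_length):
        # Reviewer's suggestion: test the attribute rather than keep a
        # separate `quantized` boolean that can drift out of sync with it.
        if self.quantized_backbone is not None:
            backbone_net = self.quantized_backbone
        else:
            backbone_net = self.backbone
        return backbone_net(tokens, token_types, valid_length)
```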
The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1581/bba1525cc4ac848aca3ed452dbc15f43d8e53afb/index.html
The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1581/f2b5043608cbc68c0b67fbdcf2a3a3ef85363921/index.html
Description
This PR enables quantization for the question answering scripts.
A custom calibration collector was added to avoid a significant accuracy drop.
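As a rough, standalone sketch of what such a calibration collector can do (the class name, the reduced `collect` interface, and the percentile strategy are all illustrative assumptions, not the PR's actual code): plain min/max calibration lets a single activation outlier stretch the int8 range and waste quantization resolution, which is one common source of the accuracy drop mentioned above, so clipping to a high percentile keeps the range tight around the bulk of the activations.

```python
import numpy as np

# Hypothetical calibration collector, reduced to a single collect() hook for
# illustration. In the PR this logic would plug into MXNet's quantization flow.
class PercentileCalibCollector:
    """Records per-layer activation ranges during calibration, clipping outliers."""

    def __init__(self, percentile=99.99):
        self.percentile = percentile
        self.min_max = {}  # layer name -> (min, max) seen so far

    def collect(self, name, arr):
        # Clip to the chosen percentile instead of the raw min/max so that a
        # handful of outliers cannot blow up the quantization range.
        arr = np.asarray(arr).ravel()
        lo = np.percentile(arr, 100.0 - self.percentile)
        hi = np.percentile(arr, self.percentile)
        prev_lo, prev_hi = self.min_max.get(name, (lo, hi))
        self.min_max[name] = (min(prev_lo, lo), max(prev_hi, hi))

# Usage: feed every layer output observed while running calibration batches.
collector = PercentileCalibCollector()
collector.collect("bertencoder0_layernorm0", np.random.randn(2, 128, 768))
print(collector.min_max)
```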