Quantize QuestionAnswering models #1581
base: master
Conversation
Looks good to me.
The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1581/bc9ce0f899079a3cf1a8d80bbb1fb746dd3c69b5/index.html
scripts/question_answering/models.py
Outdated
@@ -287,10 +289,14 @@ def forward(self, tokens, token_types, valid_length, p_mask, start_position):
            Shape (batch_size, sequence_length)
        answerable_logits
        """
        backbone_net = self.backbone
        if self.quantized:
Suggested change:
-        if self.quantized:
+        if self.quantized_backbone is not None:

and remove `quantized`?
I thought about it, but I kept this `quantized` flag as a switch to turn the quantized model on and off - I'm not sure whether it is really useful. What do you think?
It doesn't seem useful to me.
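For illustration, a minimal sketch of the pattern under discussion: dispatching on whether a quantized backbone has been attached, rather than on a separate boolean flag. The class shown here is hypothetical; only `backbone` and the suggested condition come from the diff above.

```python
# Hypothetical, condensed model class illustrating the suggested dispatch.
class ModelForQA:
    def __init__(self, backbone):
        self.backbone = backbone
        # Populated only after the network has been calibrated and quantized,
        # so "is not None" doubles as the on/off switch for inference.
        self.quantized_backbone = None

    def forward(self, tokens, token_types, valid_length):
        # Reviewer's suggestion: test the attribute rather than keep a
        # separate `quantized` boolean that can drift out of sync with it.
        if self.quantized_backbone is not None:
            backbone_net = self.quantized_backbone
        else:
            backbone_net = self.backbone
        return backbone_net(tokens, token_types, valid_length)
```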
The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1581/bba1525cc4ac848aca3ed452dbc15f43d8e53afb/index.html
The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1581/f2b5043608cbc68c0b67fbdcf2a3a3ef85363921/index.html
Description
This PR enables quantization for the question answering scripts.
A custom calibration collector was added to avoid a significant accuracy drop.
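As a rough, standalone sketch of what such a calibration collector can do (the class name, the reduced `collect` interface, and the percentile strategy are all illustrative assumptions, not the PR's actual code): plain min/max calibration lets a single activation outlier stretch the int8 range and waste quantization resolution, which is one common source of the accuracy drop mentioned above, so clipping to a high percentile keeps the range tight around the bulk of the activations.

```python
import numpy as np

# Hypothetical calibration collector, reduced to a single collect() hook for
# illustration. In the PR this logic would plug into MXNet's quantization flow.
class PercentileCalibCollector:
    """Records per-layer activation ranges during calibration, clipping outliers."""

    def __init__(self, percentile=99.99):
        self.percentile = percentile
        self.min_max = {}  # layer name -> (min, max) seen so far

    def collect(self, name, arr):
        # Clip to the chosen percentile instead of the raw min/max so that a
        # handful of outliers cannot blow up the quantization range.
        arr = np.asarray(arr).ravel()
        lo = np.percentile(arr, 100.0 - self.percentile)
        hi = np.percentile(arr, self.percentile)
        prev_lo, prev_hi = self.min_max.get(name, (lo, hi))
        self.min_max[name] = (min(prev_lo, lo), max(prev_hi, hi))

# Usage: feed every layer output observed while running calibration batches.
collector = PercentileCalibCollector()
collector.collect("bertencoder0_layernorm0", np.random.randn(2, 128, 768))
print(collector.min_max)
```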