[Bug fix] Fixed immutable op quantization when non-quantizable op exists in multiple followed ops #39342
Conversation
Thanks for your contribution!
The condition in line 39 of paddle/fluid/framework/ir/mkldnn/cpu_quantize_pass_tester.cc seems wrong: if type is "quantize", the first test type != "dropout" is already true and, because || short-circuits, the other comparisons are never evaluated, so the whole condition is always true.
if (type != "dropout" || type != "quantize" || type != "dequantize") {
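If the intent is that the block should run only for op types other than dropout, quantize, and dequantize, the fix is presumably to use && instead of || (a minimal sketch; IsRegularTestOp is a hypothetical helper name used only for illustration):

#include <string>

// With &&, the condition is false exactly for "dropout", "quantize" and
// "dequantize"; with ||, it is true for every type, because at most one of
// the three inequalities can be false at a time.
bool IsRegularTestOp(const std::string& type) {
  return type != "dropout" && type != "quantize" && type != "dequantize";
}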
@@ -417,36 +417,88 @@ void TestImmutableOpBetweenNonQuantizedOp(const std::string tested_op) {
                     SCALE * S8_MAX);
 }

 // a->Dropout1->b
 // b->TestedOp1(not quantized)->c
The comment says "not quantized", but in line 434 it is set to int8.
Yes, because in cpu_quantize_placement_pass we set mkldnn_data_type to int8 for all supported operators, and then in cpu_quantize_pass we decide whether it will be quantized or not. Maybe I should change this comment to (will be quantized) and (won't be quantized).
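For context, "setting mkldnn_data_type to int8" in the placement pass roughly amounts to marking an attribute on the operator, which platform::HasOpINT8DataType can later read in cpu_quantize_pass (a minimal sketch, assuming an OpDesc* op is at hand):

// Placement pass: mark the operator as an int8 candidate.
op->SetAttr("mkldnn_data_type", std::string("int8"));
// cpu_quantize_pass later checks this marker, e.g. via platform::HasOpINT8DataType(op).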
Sounds good. Or "should/shouldn't be quantized" also sounds good.
for (auto output : node->outputs) {
  if (!output->IsOp() ||
      !(output->Op()->Type() == "quantize" ||
        platform::HasOpINT8DataType(output->Op())))
    return false;
}
return true;
I can suggest another approach that might be more readable because it has fewer ! and || operators and a single return:

// return true only if all outputs are ops and each is either a quantize op or has int8 data type
return std::all_of(node->outputs.begin(), node->outputs.end(), [](Node* output) {
  return output->IsOp() && (output->Op()->Type() == "quantize" ||
                            platform::HasOpINT8DataType(output->Op()));
});
Oh thank you, this is a more elegant approach.
LGTM
LGTM
LGTM
// c->TestedOp2(will be quantized)->e
// e->Pool2d1(will be quantized)->f
// e->Pool2d2(will be quantized)->g
void TestImmutableOpWithManyOutputs(const std::string tested_op) {
/*
                 a
                 |
             Dropout1
                 |
                 b
                 |
   TestedOp1(won't be quantized)
                 |
                 c
                / \
        Dropout2   TestedOp2(will be quantized)
                        |
                        e
                       / \
  Pool2d1(will be quantized)   Pool2d2(will be quantized)
             |                            |
             f                            g
*/
- If one immutable op has many next ops and more than one of them is not quantizable, then this immutable op is not quantized? Is that correct?
- But why should TestedOp2, Pool2d1 and Pool2d2 not be quantized? TestedOp2 has two followed ops, both quantizable. What about changing the condition
if (!(IsOpDequantized(prev_op)) && !(IsOpQuantized(nearest_interp_out))) {
  return;
}
to
if (!(IsOpDequantized(prev_op) && IsOpQuantized(nearest_interp_out))) {
  return;
}
Is this quantization slow? Or is it difficult to get the input scale for TestedOp2?
Just had some doubts, thanks!
If one of the next ops is not quantizable, we don't want to quantize TestedOp2, because we would have to put a quantize before TestedOp2 and a dequantize after it.
Immutable ops usually have about the same performance in FP32 and INT8, which is why we should be careful about adding quantize and dequantize ops.
So in this situation, we skip quantization of this operator only when prev_op won't be quantized and not all of the next ops will be quantized:
if (!(IsOpDequantized(prev_op)) && !(IsOpQuantized(nearest_interp_out))) {
  return;
}
In any other situation, we will quantize this op.
PR types
Bug fixes
PR changes
Others
Describe
This PR fixes cpu_quantize_pass.
There was a problem in quantizing the picodet_m_416_coco model. For operators such as reshape2, transpose2, slice, and nearest_interp/v2, it is necessary to check whether there are quantized operators before or after the operator, because quantizing these operators alone does not give any speed up. So in cpu_quantize_pass we look for the pattern prev_op->op->next_op to check this. It turned out that this solution did not support the situation where there is not one but many operators after the op. This PR fixes that.
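A rough sketch of the idea behind the fix, using hypothetical variable names (op_outputs standing for the output nodes of the matched op) together with the IsOpDequantized/IsOpQuantized helpers mentioned above; this illustrates the generalized check rather than the exact code in the pass:

// Requires <algorithm> for std::all_of.
// Skip quantization of the immutable op only when its input is not dequantized
// and not every one of its (possibly many) outputs will be quantized.
bool all_outputs_quantized =
    std::all_of(op_outputs.begin(), op_outputs.end(),
                [&](Node* out) { return IsOpQuantized(out); });
if (!IsOpDequantized(prev_op) && !all_outputs_quantized) {
  return;  // leave the immutable op in FP32
}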