
[Bug fix] Fixed immutable op quantization when non-quantizable op exists in multiple followed ops #39342

Merged — 2 commits merged into PaddlePaddle:develop on Feb 8, 2022

Conversation

@wozna (Contributor) commented Feb 1, 2022

PR types

Bug fixes

PR changes

Others

Describe

This PR:

  • fixes a problem with operators that have many outputs,
  • adds a unit test for the mentioned problem,
  • adds missing descriptions for some logs in cpu_quantize_pass

There was a problem with quantizing the picodet_m_416_coco model. For operators such as reshape2, transpose2, slice, and nearest_interp/nearest_interp_v2, it is necessary to check whether there are quantized operators before or after the operator, because quantizing these operators alone gives no speedup. So cpu_quantize_pass looks for the pattern prev_op->op->next_op to check this. It turned out that this solution did not handle the situation where the op is followed not by one but by many operators. This PR fixes that.

@paddle-bot-old (bot) commented Feb 1, 2022

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@sfraczek (Contributor) left a comment

The condition in line 39 of paddle/fluid/framework/ir/mkldnn/cpu_quantize_pass_tester.cc seems wrong: if type is "quantize", the first test type != "dropout" is already true, so with || the remaining conditions are never evaluated and the whole expression is always true:
if (type != "dropout" || type != "quantize" || type != "dequantize") {

@@ -417,36 +417,88 @@ void TestImmutableOpBetweenNonQuantizedOp(const std::string tested_op) {
SCALE * S8_MAX);
}

// a->Dropout1->b
// b->TestedOp1(not quantized)->c
Contributor:

The comment says "not quantized", but in line 434 it is set to int8.

@wozna (Contributor, Author):

Yes, because cpu_quantize_placement_pass sets mkldnn_data_type to int8 for all supported operators; then cpu_quantize_pass decides whether each one will actually be quantized. Maybe I should change this comment to "(will be quantized)" and "(won't be quantized)".

Contributor:

Sounds good. "Should/shouldn't be quantized" also sounds good.

Comment on lines 269 to 275
for (auto output : node->outputs) {
  if (!output->IsOp() ||
      !(output->Op()->Type() == "quantize" ||
        platform::HasOpINT8DataType(output->Op())))
    return false;
}
return true;
Contributor:

I can suggest another approach that might be more readable because it has fewer negations and ORs and a single return:

// return true only if all of the outputs are ops and each is either quantize or has int8 data type
return std::all_of(node->outputs.begin(), node->outputs.end(), [](Node* output) {
  return output->IsOp() && (output->Op()->Type() == "quantize" ||
                            platform::HasOpINT8DataType(output->Op()));
});

@wozna (Contributor, Author):

Oh, thank you, this is a more elegant approach.

@sfraczek (Contributor) left a comment:

LGTM

@lidanqing-intel lidanqing-intel changed the title Fixed handling of one of the cases in the quantization process [Bug fix] Fixed handling of one of the cases in the quantization process Feb 7, 2022
@baoachun (Contributor) left a comment:

LGTM

@lidanqing-intel (Contributor) left a comment:

LGTM

@lidanqing-intel lidanqing-intel merged commit e4d475e into PaddlePaddle:develop Feb 8, 2022
// c->TestedOp2(will be quantized)->e
// e->Pool2d1(will be quantized)->f
// e->Pool2d2(will be quantized)->g
void TestImmutableOpWithManyOutputs(const std::string tested_op) {
@lidanqing-intel (Contributor) commented Apr 9, 2022:
/*
                a
                |
            Dropout1
                |
                b
                |
    TestedOp1 (won't be quantized)
            /       \
     Dropout2        TestedOp2 (will be quantized)
                         |
                         e
                       /   \
    Pool2d1 (will be quantized)   Pool2d2 (will be quantized)
         |                             |
         f                             g
*/
  • If one immutable op has many next ops and more than one of them is not quantizable, then we do not quantize this immutable op? Is that correct?
  • But why should TestedOp2, Pool2d1, and Pool2d2 not be quantized? TestedOp2 has two following ops, both quantizable. What about changing the condition

  if (!(IsOpDequantized(prev_op)) && !(IsOpQuantized(nearest_interp_out))) {
    return;
  }

to

  if (!(IsOpDequantized(prev_op) && IsOpQuantized(nearest_interp_out))) {
    return;
  }

Is this quantization slow? Or is it difficult to get the input scale for TestedOp2?
Just had some doubts, thanks!

@wozna (Contributor, Author):

If one of the next ops is not quantizable, we don't want to quantize TestedOp2, because we would have to put a quantize op before TestedOp2 and a dequantize op after it. Immutable ops usually have the same performance in FP32 and INT8, which is why we should be careful about adding quantize/dequantize pairs.

So in this situation, we skip quantization of this operator only when prev_op won't be dequantized and not all of the next ops will be quantized:

  if (!(IsOpDequantized(prev_op)) && !(IsOpQuantized(nearest_interp_out))) {
    return;
  }

In any other situation, we will quantize this op.

@lidanqing-intel lidanqing-intel changed the title [Bug fix] Fixed handling of one of the cases in the quantization process [Bug fix] Fixed immutable op quantization when non-quantizable op exists in multiple followed ops Apr 9, 2022
@wozna wozna deleted the fix_next_op_pattern branch February 24, 2023 16:09