[QNN][Hexagon] Disable QNN canonicalization pass #12398
Conversation
This commit enables TVM to work without the QNN canonicalization pass. It adds new TOPI ops for QNN plus simple compute/schedules.
src/relay/backend/utils.cc
Outdated
pass_seqs.push_back(relay::qnn::transform::Legalize());
// Skip these passes for Hexagon target.
if ((is_homogeneous && homogeneous_target->GetTargetDeviceType() != kDLHexagon) ||
    !is_homogeneous)
Always disabling qnn legalize for Hexagon could be problematic, since we want it to be applied for vrmpy tensorization (like #12911).
So instead, I suggest applying qnn legalize by default and letting users disable it via disabled_pass=["qnn.Legalize"]. To do that, we can refactor
tvm/src/relay/qnn/pass/legalize.cc
Lines 33 to 39 in 8131364
Pass Legalize() {
  Array<Pass> pass_seqs;
  pass_seqs.push_back(relay::transform::Legalize("FTVMQnnLegalize"));
  pass_seqs.push_back(relay::transform::Legalize("FTVMQnnCanonicalize"));
  relay::transform::Pass seq = relay::transform::Sequential(pass_seqs);
  return seq;
}
and formally define qnn.Legalize pass, by CreateFunctionPass(..., "qnn.Legalize", ...).
This should also remove the need for the GetPassPrefix API and other changes.
Done
python/tvm/topi/hexagon/qnn.py
Outdated
def qnn_quantize(data, output_scale, output_zero_point, axis, out_dtype):
    """Compute for qnn.quantize
    Note! This is POC code. There was no goal to implement high performance compute function.
Do we want to leave this comment?
Removed. It was a reminder for QC people.
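For reference, the per-tensor semantics of qnn.quantize discussed in this thread can be sketched in plain NumPy. This is an illustrative reference only, not the TOPI compute from the PR, and the function name and signature here are hypothetical:

```python
import numpy as np

def qnn_quantize_ref(data, output_scale, output_zero_point, out_dtype=np.int8):
    """Reference semantics of qnn.quantize (per-tensor case):
    Q = clip(round(data / scale) + zero_point, qmin, qmax)."""
    info = np.iinfo(out_dtype)
    q = np.round(data / output_scale) + output_zero_point
    return np.clip(q, info.min, info.max).astype(out_dtype)
```

Per-channel quantization (the `axis` parameter in the actual compute) generalizes this by broadcasting a scale/zero-point vector along the chosen axis.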
    """Compute for qnn.conv2d with NCHW layout
    Note! This is POC code. There was no goal to implement high performance compute function.
Document that the output can be int32 or odtype depending on the presence of rq parameters.
The same comment applies to depthwise conv2d, dense, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
python/tvm/topi/hexagon/qnn.py
Outdated
    """Compute for qnn.dense
    Note! This is POC code. There was no goal to implement high performance compute function.
Can requantize support be added for bmm as well? If not, I think it is ok to drop support for qnn.batch_matmul entirely for now, since this PR already is very big. We also need to update the pattern matcher.
Yes, it is possible. There is no limitation. I can add it.
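The requantize step that would follow qnn.batch_matmul is the same fixed-point rescaling used after conv2d and dense. A NumPy sketch of its per-tensor semantics (illustrative only, not the TOPI implementation; names are hypothetical):

```python
import numpy as np

def requantize_ref(x_int32, input_scale, input_zp, output_scale, output_zp,
                   out_dtype=np.int8):
    """Reference semantics of qnn.requantize: rescale an int32 value from
    the input quantization domain into a narrower output domain:
    Q = clip(round((x - in_zp) * in_scale / out_scale) + out_zp, qmin, qmax)."""
    info = np.iinfo(out_dtype)
    real = (x_int32.astype(np.float64) - input_zp) * input_scale
    q = np.round(real / output_scale) + output_zp
    return np.clip(q, info.min, info.max).astype(out_dtype)
```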
static auto flower_call = tvm::runtime::Registry::Get("relay.backend.lower_call");
ICHECK(flower_call) << "relay.backend.lower_call is not registered.";

if (target_->GetTargetDeviceType() == kDLHexagon) pattern_matcher_.Register(call_node);
We probably don't need this check.
Yes, we can remove this check. Done.
}

// Helper class that is used during lowering to TE.
// It matches sequence of Ops and lower them into single TOPI operation. Has sense for Hexagon only.
Remove "Has sense for Hexagon only."
Done
// Helper class that is used during lowering to TE.
// It matches sequence of Ops and lower them into single TOPI operation. Has sense for Hexagon only.
// All supported patterns are enumerated in "supported_patterns_"
class PatternMatcher {
Although this class can be used in general settings, currently it is only used for QNN ops and the class implementation is hardcoded for them. So how about we change the class name to QNNPatternMatcher?
Ok, no objections. Done.
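The matching idea behind the class can be illustrated with a toy sketch. This is a simplification with hypothetical names: the real matcher works on Relay call nodes during TE lowering, not on strings, but the shape of the logic is the same, checking whether a visited op sequence ends with one of the supported fusable patterns:

```python
# Supported suffix patterns, longest first so conv2d+bias+requantize wins
# over plain conv2d+requantize. Names mirror Relay op names for clarity.
SUPPORTED_PATTERNS = [
    ["qnn.conv2d", "add", "qnn.requantize"],   # conv2d + bias + requantize
    ["qnn.conv2d", "qnn.requantize"],
    ["qnn.dense", "add", "qnn.requantize"],
    ["qnn.dense", "qnn.requantize"],
]

def match(op_seq):
    """Return the first supported pattern that op_seq ends with, or None."""
    for pat in SUPPORTED_PATTERNS:
        if op_seq[-len(pat):] == pat:
            return pat
    return None
```

On a hit, the whole matched sequence would be lowered into a single TOPI call instead of lowering each op separately.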
QNN passes are enabled by default. To disable them, use disabled_pass=["qnn.Legalize"] in the pass config.
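In TVM the opt-out goes through `tvm.transform.PassContext(disabled_pass=[...])`. The dispatch itself can be sketched as a toy filter over a default pass list (the pass names other than `qnn.Legalize` are placeholders, and this is not the actual PassContext machinery):

```python
def build_pass_list(disabled_pass=()):
    """Sketch: qnn.Legalize runs by default and is dropped only when the
    user lists it in disabled_pass. Other names are illustrative."""
    default_passes = ["qnn.Legalize", "FoldConstant", "FuseOps"]
    disabled = set(disabled_pass)
    return [p for p in default_passes if p not in disabled]
```

With real TVM, the equivalent user-facing form is `with tvm.transform.PassContext(opt_level=3, disabled_pass=["qnn.Legalize"]): ...`.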
* [QNN] Disable QNN canonicalization pass. This commit enables TVM to work without the QNN canonicalization pass. It adds new TOPI ops for QNN + simple compute/schedules.
* Added dependence of the qnn::transform::Legalize pass launch on target.
* Added new dense TOPI operator for the pattern qnn.dense+bias+requantize.
* Added support of axis attribute for QNN TOPI ops.
* Fixed TOPI compute implementation for qnn.add.
* Fixed issue with non-zero padding value for qnn.conv2d.
* Fixed bias add for qnn.conv2d.
* Added support of depthwise qnn.conv2d TOPI operator.
* Added support of 1D quantization params in qnn.dequantize.
* Added support of qnn.concatenate.
* Fixed out-of-range array access.
* Added meta_schedule_original_shape attribute in QDenseAttrs and QConv2DAttrs.
* Added support of qnn.batch_matmul as a standalone op.
* Added per-channel zero point in qnn.dense and qnn.conv2d.
* Fixed corner cases like dense+bias+bias+rq.
* Added unit test.
* Removed rq_out_dtype and axis attribute declarations in QConv2DAttrs and QDenseAttrs.
* Changed target x86 -> Hexagon to disable QNN passes.
* Fixed issue with QDenseAttrs and QConv2dAttrs.
* Fixed build for Cortex-M.
* Removed QDenseAttrs and QConv2dAttrs.
* Fixed tests after rebase.
* Addressed code review comments.
* [QNN] Add option to disable QNN passes. QNN passes are enabled by default. To disable them, use disabled_pass=["qnn.Legalize"] in the pass config.
* Reverted changes of the GetPassPrefix interface.
_input_scale,
_kernel_scale,
@ibsidorenko, @supersat and I were working through the flow you introduced in this PR and found that `_kernel_scale` and `_input_scale` are not used. Are these being folded into the requantize scaling params?
@csullivan yes, you are right, input/kernel scales should be taken into account in subsequent requantize op.
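The folding described here follows from the quantization algebra: the int32 accumulator of a conv/dense represents real values `acc * input_scale * kernel_scale`, so the subsequent requantize can absorb both scales into one effective multiplier. An illustrative NumPy sketch for the per-tensor case (hypothetical names, not the PR's code):

```python
import numpy as np

def fold_scales_into_requantize(acc_int32, input_scale, kernel_scale,
                                output_scale, output_zp, out_dtype=np.int8):
    """The accumulator represents real values acc * input_scale * kernel_scale,
    so requantize uses the single effective multiplier
    (input_scale * kernel_scale) / output_scale."""
    info = np.iinfo(out_dtype)
    multiplier = (input_scale * kernel_scale) / output_scale
    q = np.round(acc_int32.astype(np.float64) * multiplier) + output_zp
    return np.clip(q, info.min, info.max).astype(out_dtype)
```

This is why the standalone `_input_scale` and `_kernel_scale` arguments can go unused in the fused compute: their effect appears only through the requantize multiplier.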
Main goals for this PR are the following:
What was done:
There are two ways how to implement this:
A) Implement a new IRModule -> IRModule pass that pattern-matches combinations such as conv2d+requantize, conv2d+bias+requantize, etc.
B) Change TECompiler to lower a sequence of QNN operations into one TOPI op.
Possibly, A) is more natural, but it implies adding dozens of new Relay ops that make sense only for the Hexagon target, are not generic, and are overloaded with a huge number of arguments (inputs, quantization parameters, bias, requantization parameters, etc.).
That is why approach B) was implemented in this PR.
cc @mehrdadh