Member

@masahi masahi commented Mar 23, 2022

I want to extract tuning tasks for ARM int8 tensorization. The current alter_layout code in topi/arm_cpu doesn't fire in fallback mode; this PR fixes that.

@tkonolige @comaniac

@masahi masahi marked this pull request as ready for review March 23, 2022 09:19
cfg = dispatch_ctx.query(target, workload)
if cfg.is_fallback:  # if is fallback, clear query cache and return None
    autotvm.task.clear_fallback_cache(target, workload)
    return None
Member Author

I hope it's safe to remove this. The x86 counterpart doesn't have anything like this.

How are you testing it?

Member Author

If I can pass the CI with this change, I assume it is safe.

@Mousius - could you please look at this, given that you've recently been turning on topi tests on aarch64?

Member Author

I need to remove this code because it always makes alter_layout a no-op in fallback mode. In contrast, in the x86 schedule, alter_layout always fires.

Member Author

The CI has passed. I found that Giuseppe's im2col-based conv2d implementation can fail to tensorize in fallback mode, so I partially restored the fallback return path above.
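
Editor's sketch of the behaviour discussed in this thread. It does not use TVM. `FakeConfig`, both functions, and the returned layout labels are hypothetical stand-ins, and the exact condition under which the merged code keeps the early return is an assumption based on the commit message "add workaround fallback path for NHWC im2col based GEMM schedule". It only illustrates why an unconditional `return None` on a fallback config turns alter_layout into a no-op, and what a layout-gated early return might look like.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeConfig:
    """Stand-in for a tuning config; only the fallback flag matters here."""

    is_fallback: bool


def alter_layout_always_bail(cfg: FakeConfig, data_layout: str) -> Optional[str]:
    """Behaviour before the change: any fallback config disables the rewrite."""
    if cfg.is_fallback:
        return None  # None means "leave the op unchanged"
    return "NCHWc" if data_layout == "NCHW" else "NHWC-pretransformed"


def alter_layout_gated(cfg: FakeConfig, data_layout: str) -> Optional[str]:
    """Hypothetical gated variant: only the NHWC path keeps the early return."""
    if cfg.is_fallback and data_layout == "NHWC":
        return None
    return "NCHWc" if data_layout == "NCHW" else "NHWC-pretransformed"


untuned = FakeConfig(is_fallback=True)
print(alter_layout_always_bail(untuned, "NCHW"))  # None: rewrite never fires
print(alter_layout_gated(untuned, "NCHW"))        # NCHWc: rewrite fires untuned
print(alter_layout_gated(untuned, "NHWC"))        # None: workaround path
```

With the unconditional version, an untuned model never sees the NCHWc rewrite, which is why no NCHWc tasks could be extracted before tuning.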

@masahi masahi force-pushed the arm-conv2d-alter-op branch from 8acb02c to a05d81f Compare March 23, 2022 09:24
Contributor

@tkonolige tkonolige left a comment

Overall looks good except for a few small changes.

@masahi masahi force-pushed the arm-conv2d-alter-op branch from a05d81f to 0beacdf Compare March 24, 2022 02:35
Member Author

masahi commented Mar 24, 2022

cc @Mousius @leandron (I don't know who is interested in ARM CPU work). This is for meta schedule task extraction. I'm porting the TE ARM tensorized schedules to TIR, starting with the NCHWc conv2d schedules.

@masahi masahi merged commit e9091d6 into apache:main Mar 25, 2022
assert "vnni" in annotations["schedule_rule"]


def extract_task_arm_conv2d_nchwc():
Member

Are we sure this runs @masahi? There's no test_ prefix

Member Author

Oops, thanks for pointing this out. I added a fix in #10773.
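
Editor's sketch of why the missing prefix matters. By default pytest collects module-level functions whose names start with `test`, so a function named `extract_task_arm_conv2d_nchwc` would be skipped silently. The code below is a simplified imitation of that naming rule, not pytest's actual collector. `collected_names` and `test_extract_task_example` are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, List


def collected_names(namespace: Dict[str, Callable]) -> List[str]:
    """Mimic the default naming rule: keep callables whose name starts with 'test'."""
    return sorted(
        name
        for name, obj in namespace.items()
        if callable(obj) and name.startswith("test")
    )


def test_extract_task_example():  # hypothetical, correctly prefixed test
    pass


def extract_task_arm_conv2d_nchwc():  # name from the diff above, no prefix
    pass


namespace = {
    fn.__name__: fn
    for fn in (test_extract_task_example, extract_task_arm_conv2d_nchwc)
}
print(collected_names(namespace))  # ['test_extract_task_example']
```

Because the unprefixed function is never collected, a passing CI run says nothing about it.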

pfk-beta pushed a commit to pfk-beta/tvm that referenced this pull request Apr 11, 2022
* [ARM] Support NCHWc alter layout in the fallback mode

* remove fallback path

* add test

* fixed int32_lanes and add channel check

* fixed schedule dispatch bug

* add workaround fallback path for NHWC im2col based GEMM schedule

* int32_lanes=4 by default

* typo

* update test
Anndrey24 added a commit to Anndrey24/tvm that referenced this pull request Aug 17, 2023
…ts matrix for arm_cpu NHWC quantized conv2d

Fixed arm_cpu strategy bug which was causing tensorization errors when using the `AlterOpLayout` pass for the quantized NHWC conv2d schedules, as discovered in apache#10724. Therefore, we can now also enable the usage of `AlterOpLayout` for these schedules in order to transform the weight matrix at compile time, instead of runtime as before.
I also modified the padding in `Conv2DGemmWeightTransformRel` and `interleave_transpose_weights` to reflect the changes made in apache#13669 and updated the AlterOpLayout tests accordingly.
lhutton1 pushed a commit that referenced this pull request Aug 23, 2023
…ts matrix for arm_cpu NHWC quantized conv2d (#15584)

Fixed arm_cpu strategy bug which was causing tensorization errors when using the `AlterOpLayout` pass for the quantized NHWC conv2d schedules, as discovered in #10724. Therefore, we can now also enable the usage of `AlterOpLayout` for these schedules in order to transform the weight matrix at compile time, instead of runtime as before.
I also modified the padding in `Conv2DGemmWeightTransformRel` and `interleave_transpose_weights` to reflect the changes made in #13669 and updated the AlterOpLayout tests accordingly.