Asymmetric padding and dilation in conv2d workload #7142

Wheest · 2020-12-21T21:05:28Z

The goal of this pull request is to make asymmetric padding a first-class citizen in 2D convolution in TOPI.

The current workload description has "hpad" and "wpad", however this is not representative of all of the possible configurations. Most TOPI conv2d implementations in TVM already support asymmetric padding, so I think this should be reflected in the workload description.

EDIT

The process of developing this PR uncovered an additional bug with Conv2D workload definitions, where the output dimensions were not being properly calculated for fallback_schedules. Both asymmetric padding, and dilation were not being considered properly, which was leading to some untested incorrect behaviour. For some cases, this could perhaps result in a schedule with a performance regression, but this has not been tested.

comaniac

The change LGTM. Could you add a test case?

Wheest · 2020-12-22T20:36:27Z

Happy to add a test case if necessary, though I'm still to get familiar with the testing infrastructure for TVM.

Existing specific TOPI conv2d implementations are tested with asymmetric padding under tests/python/topi/python/ (e.g. test_topi_conv2d_nchw.py#L227).

This change is just ensuring that data is held in the workload too. If all of the existing tests pass, is that sufficient?

comaniac · 2020-12-22T21:04:52Z

As you pointed out, the workload doesn't handle asymmetric padding as the compute implementation, which looks like a bug to me. However, it never triggers CI errors before, meaning that there aren't existing test cases for it. As a result, I'm expecting to have a test case that requires this PR to pass. For example, conv2d_avx_1x1 has out_height = (wkl.height + 2 * HPAD - wkl.hkernel) // HSTR + 1 before this PR. Then you will get a wrong out_height if you provide a workload with asymmetric padding without this PR.

Wheest · 2020-12-22T22:57:56Z

Thanks, I understand better what a good test for this PR would be: one that fails on the current main branch but not this PR.

I've been working on devising a test like this, but haven't got one that fails yet. Will keep working on it, but here's my reasoning so far:

afaik the workload data is only used the creation of fallback schedules. e.g. for creating the "tile_ow" parameter for spatial pack convolution.

So I imagine I would want to create a test that has padding such that something like "tile_ow" is generated with an incorrect value, creating a schedule that causes an invalid transformation.

e.g. verify_conv2d_nchw(1, 64, 8, 128, 3, 1, (6, 6, 0, 0))

If we are to focus on NCHWc convolution, which uses "tile_ow" in its schedule (via SplitEntity with reg in the _fallback_schedule) python/tvm/topi/x86/conv2d_avx_common.py#L119. There is no value we can set reg or "tile_ow" to that will cause an invalid transformation, even through silly hardcoding like out_width = 10000 or ow_chunk, ow_block = s[C].split(ow, factor=10000)

It always works. Though perhaps it suffers a performance regression?

It could be happenstance that none of the conv2d schedules make transformations that are rendered invalid by having an incorrect value for the output height/width. That being the case, I'm unsure how I would devise a test for this.

comaniac · 2020-12-22T23:17:15Z

I see what you meant. How about we just simply add a test in test_topi_conv2d_int8.py that directly calls fallback_schedule_cpu_common_int8 that takes a workload generated by _get_workload? Something like:

def test_worload_with_asmmetric_padding():
  cfg = ...
  wkl = _get_workload(...) # with asmmetric padding
  int32_lanes = ...
  num_int8_elements = ...
  fallback_schedule_cpu_common_int8(cfg, wkl, int32_lanes, num_int8_elements)
  assert cfg["tile_ow"] ... # check if tile_ow candidates are the factors of the right output weight.

So does the other ops changed by this PR.

Wheest · 2020-12-23T13:07:55Z

Thanks for the tip! I've added a test for conv2d_nchw_int8, with a structure much like you suggested. It fails in main, but is fixed in this PR. Through the test, I identified a silly mistake I had made in the workload definition in the original commit, now fixed.

Does the style of this test meet the standards you expect? E.g. is the test being called as a nested function of verify_conv2d_nchw_int8 the right place for this?

If so, I will add a similar test for all of the ops touched by this PR. Otherwise, happy to take some suggestions on how to improve this test before I reproduce it elsewhere.

EDIT
CI still failed, I should have run all tests locally, rather than just the one I added. I have discovered that dilation is also not used in the workload, so this PR will also add dilation to avoid this issue. Will push the fix when available, earlier question about the test standard remains.

comaniac

Left some comments. Please take a look and apply to other tests as you mentioned. Also please rebase and git submodule update. This PR should not update the submodule (i.e., vta-hw`).

tests/python/topi/python/test_topi_conv2d_int8.py

… of running integration tests locally.

FrozenGene

Generally LGTM. Suggest correcting them to correct code style then we could merge it in so that we don't block group convolution pr.

python/tvm/topi/nn/depthwise_conv2d.py

Wheest · 2020-12-29T21:50:09Z

Have added all specific requested changes @comaniac, and have added tests to:

test_topi_conv2d_int8.py

verify_conv2d_nchw_int8

test_topi_conv2d_nchw.py

verify_conv2d_nchw

test_topi_depthwise_conv2d.py

depthwise_conv2d_with_workload_nchw

These three tests cover all the workloads that are touched by this PR. Other data layouts use the same workload definitions. A similar test could be added to every single conv2d test, but am unsure if that is worth it right now.

comaniac · 2020-12-29T23:10:01Z

Thanks @Wheest @FrozenGene

* added asymmetric padding to conv2d workload * fixed depthwise conv2d padding * Added fix to include dilation in workload output width calculation * Added missing dilation to arm_cpu/conv2d_int8.py workload * Fixed dilation for x86 conv2d * Improved dilation workload integration in x86 * Fixed x86 conv2d_alter_op to add dilation * Local linting not always producing same output as CI, probably my fault * Fixed bug, tested locally * Abusing CI until I can figure out how to reproduce the same behaviour of running integration tests locally. * Ammeded conv2d_int8 test * Updated workload, improved unit tests * Added depthwise conv2d workload test

Wheest and others added 3 commits December 21, 2020 21:01

added asymmetric padding to conv2d workload

23e5505

Merging with CI passing main branch

39bf9de

fixed depthwise conv2d padding

f574452

comaniac requested changes Dec 22, 2020

View reviewed changes

comaniac requested changes Dec 23, 2020

View reviewed changes

tests/python/topi/python/test_topi_conv2d_int8.py Outdated Show resolved Hide resolved

tests/python/topi/python/test_topi_conv2d_int8.py Outdated Show resolved Hide resolved

tests/python/topi/python/test_topi_conv2d_int8.py Outdated Show resolved Hide resolved

comaniac added status: need test case need test cases to cover the change status: need update need update based on feedbacks labels Dec 23, 2020

Wheest force-pushed the conv2d_asym_pad branch from 2ec8257 to f574452 Compare December 23, 2020 20:13

Added fix to include dilation in workload output width calculation

81ec6c6

Wheest force-pushed the conv2d_asym_pad branch from 15afec0 to d72c06d Compare December 23, 2020 20:52

Added missing dilation to arm_cpu/conv2d_int8.py workload

d72c06d

Wheest changed the title ~~Asymmetric padding in conv2d workload~~ Asymmetric padding and diliation in conv2d workload Dec 23, 2020

Wheest changed the title ~~Asymmetric padding and diliation in conv2d workload~~ Asymmetric padding and dilation in conv2d workload Dec 23, 2020

Wheest added 6 commits December 23, 2020 21:35

Fixed dilation for x86 conv2d

eea4f58

Improved dilation workload integration in x86

e7296eb

Fixed x86 conv2d_alter_op to add dilation

3e5ca07

Local linting not always producing same output as CI, probably my fault

9f7341a

Fixed bug, tested locally

4fe405d

Abusing CI until I can figure out how to reproduce the same behaviour…

297bb35

… of running integration tests locally.

Wheest mentioned this pull request Dec 26, 2020

Better grouped convolution for CPU targets #6137

Merged

Ammeded conv2d_int8 test

c221db2

FrozenGene approved these changes Dec 29, 2020

View reviewed changes

python/tvm/topi/nn/depthwise_conv2d.py Outdated Show resolved Hide resolved

Wheest added 2 commits December 29, 2020 15:25

Updated workload, improved unit tests

ea0acb2

Added depthwise conv2d workload test

8e24562

comaniac approved these changes Dec 29, 2020

View reviewed changes

comaniac merged commit cfdbf0e into apache:main Dec 29, 2020

comaniac added status: accepted and removed status: need test case need test cases to cover the change status: need update need update based on feedbacks labels Dec 29, 2020

junrushao mentioned this pull request Nov 1, 2021

Apache TVM v0.8 Release Note Candidate #9416

Closed

Wheest mentioned this pull request Jan 5, 2024

[CLEANUP] Fix typos across the project triton-lang/triton#2876

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Asymmetric padding and dilation in conv2d workload #7142

Asymmetric padding and dilation in conv2d workload #7142

Wheest commented Dec 21, 2020 •

edited

Loading

comaniac left a comment

Wheest commented Dec 22, 2020

comaniac commented Dec 22, 2020

Wheest commented Dec 22, 2020

comaniac commented Dec 22, 2020

Wheest commented Dec 23, 2020 •

edited

Loading

comaniac left a comment

FrozenGene left a comment

Wheest commented Dec 29, 2020

comaniac commented Dec 29, 2020

Asymmetric padding and dilation in conv2d workload #7142

Asymmetric padding and dilation in conv2d workload #7142

Conversation

Wheest commented Dec 21, 2020 • edited Loading

comaniac left a comment

Choose a reason for hiding this comment

Wheest commented Dec 22, 2020

comaniac commented Dec 22, 2020

Wheest commented Dec 22, 2020

comaniac commented Dec 22, 2020

Wheest commented Dec 23, 2020 • edited Loading

comaniac left a comment

Choose a reason for hiding this comment

FrozenGene left a comment

Choose a reason for hiding this comment

Wheest commented Dec 29, 2020

comaniac commented Dec 29, 2020

Wheest commented Dec 21, 2020 •

edited

Loading

Wheest commented Dec 23, 2020 •

edited

Loading