Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Asymmetric padding and dilation in conv2d workload #7142

Merged
merged 14 commits into from
Dec 29, 2020

Conversation

Wheest
Copy link
Contributor

@Wheest Wheest commented Dec 21, 2020

The goal of this pull request is to make asymmetric padding a first-class citizen in 2D convolution in TOPI.

The current workload description has "hpad" and "wpad", however this is not representative of all of the possible configurations. Most TOPI conv2d implementations in TVM already support asymmetric padding, so I think this should be reflected in the workload description.

EDIT

The process of developing this PR uncovered an additional bug with Conv2D workload definitions, where the output dimensions were not being properly calculated for fallback_schedules. Both asymmetric padding, and dilation were not being considered properly, which was leading to some untested incorrect behaviour. For some cases, this could perhaps result in a schedule with a performance regression, but this has not been tested.

Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change LGTM. Could you add a test case?

@Wheest
Copy link
Contributor Author

Wheest commented Dec 22, 2020

Happy to add a test case if necessary, though I'm still to get familiar with the testing infrastructure for TVM.

Existing specific TOPI conv2d implementations are tested with asymmetric padding under tests/python/topi/python/ (e.g. test_topi_conv2d_nchw.py#L227).

This change is just ensuring that data is held in the workload too. If all of the existing tests pass, is that sufficient?

@comaniac
Copy link
Contributor

As you pointed out, the workload doesn't handle asymmetric padding as the compute implementation, which looks like a bug to me. However, it never triggers CI errors before, meaning that there aren't existing test cases for it. As a result, I'm expecting to have a test case that requires this PR to pass. For example, conv2d_avx_1x1 has out_height = (wkl.height + 2 * HPAD - wkl.hkernel) // HSTR + 1 before this PR. Then you will get a wrong out_height if you provide a workload with asymmetric padding without this PR.

@Wheest
Copy link
Contributor Author

Wheest commented Dec 22, 2020

Thanks, I understand better what a good test for this PR would be: one that fails on the current main branch but not this PR.

I've been working on devising a test like this, but haven't got one that fails yet. Will keep working on it, but here's my reasoning so far:

afaik the workload data is only used the creation of fallback schedules. e.g. for creating the "tile_ow" parameter for spatial pack convolution.

So I imagine I would want to create a test that has padding such that something like "tile_ow" is generated with an incorrect value, creating a schedule that causes an invalid transformation.

e.g. verify_conv2d_nchw(1, 64, 8, 128, 3, 1, (6, 6, 0, 0))

If we are to focus on NCHWc convolution, which uses "tile_ow" in its schedule (via SplitEntity with reg in the _fallback_schedule) python/tvm/topi/x86/conv2d_avx_common.py#L119. There is no value we can set reg or "tile_ow" to that will cause an invalid transformation, even through silly hardcoding like out_width = 10000 or ow_chunk, ow_block = s[C].split(ow, factor=10000)

It always works. Though perhaps it suffers a performance regression?

It could be happenstance that none of the conv2d schedules make transformations that are rendered invalid by having an incorrect value for the output height/width. That being the case, I'm unsure how I would devise a test for this.

@comaniac
Copy link
Contributor

I see what you meant. How about we just simply add a test in test_topi_conv2d_int8.py that directly calls fallback_schedule_cpu_common_int8 that takes a workload generated by _get_workload? Something like:

def test_worload_with_asmmetric_padding():
  cfg = ...
  wkl = _get_workload(...) # with asmmetric padding
  int32_lanes = ...
  num_int8_elements = ...
  fallback_schedule_cpu_common_int8(cfg, wkl, int32_lanes, num_int8_elements)
  assert cfg["tile_ow"] ... # check if tile_ow candidates are the factors of the right output weight.

So does the other ops changed by this PR.

@Wheest
Copy link
Contributor Author

Wheest commented Dec 23, 2020

Thanks for the tip! I've added a test for conv2d_nchw_int8, with a structure much like you suggested. It fails in main, but is fixed in this PR. Through the test, I identified a silly mistake I had made in the workload definition in the original commit, now fixed.

Does the style of this test meet the standards you expect? E.g. is the test being called as a nested function of verify_conv2d_nchw_int8 the right place for this?

If so, I will add a similar test for all of the ops touched by this PR. Otherwise, happy to take some suggestions on how to improve this test before I reproduce it elsewhere.

EDIT
CI still failed, I should have run all tests locally, rather than just the one I added. I have discovered that dilation is also not used in the workload, so this PR will also add dilation to avoid this issue. Will push the fix when available, earlier question about the test standard remains.

Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments. Please take a look and apply to other tests as you mentioned. Also please rebase and git submodule update. This PR should not update the submodule (i.e., vta-hw`).

tests/python/topi/python/test_topi_conv2d_int8.py Outdated Show resolved Hide resolved
tests/python/topi/python/test_topi_conv2d_int8.py Outdated Show resolved Hide resolved
tests/python/topi/python/test_topi_conv2d_int8.py Outdated Show resolved Hide resolved
@comaniac comaniac added status: need test case need test cases to cover the change status: need update need update based on feedbacks labels Dec 23, 2020
@Wheest Wheest changed the title Asymmetric padding in conv2d workload Asymmetric padding and diliation in conv2d workload Dec 23, 2020
@Wheest Wheest changed the title Asymmetric padding and diliation in conv2d workload Asymmetric padding and dilation in conv2d workload Dec 23, 2020
Copy link
Member

@FrozenGene FrozenGene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM. Suggest correcting them to correct code style then we could merge it in so that we don't block group convolution pr.

python/tvm/topi/nn/depthwise_conv2d.py Outdated Show resolved Hide resolved
@Wheest
Copy link
Contributor Author

Wheest commented Dec 29, 2020

Have added all specific requested changes @comaniac, and have added tests to:

test_topi_conv2d_int8.py

  • verify_conv2d_nchw_int8

test_topi_conv2d_nchw.py

  • verify_conv2d_nchw

test_topi_depthwise_conv2d.py

  • depthwise_conv2d_with_workload_nchw

These three tests cover all the workloads that are touched by this PR. Other data layouts use the same workload definitions. A similar test could be added to every single conv2d test, but am unsure if that is worth it right now.

@comaniac comaniac merged commit cfdbf0e into apache:main Dec 29, 2020
@comaniac
Copy link
Contributor

Thanks @Wheest @FrozenGene

@comaniac comaniac added status: accepted and removed status: need test case need test cases to cover the change status: need update need update based on feedbacks labels Dec 29, 2020
tkonolige pushed a commit to tkonolige/incubator-tvm that referenced this pull request Jan 11, 2021
* added asymmetric padding to conv2d workload

* fixed depthwise conv2d padding

* Added fix to include dilation in workload output width calculation

* Added missing dilation to arm_cpu/conv2d_int8.py workload

* Fixed dilation for x86 conv2d

* Improved dilation workload integration in x86

* Fixed x86 conv2d_alter_op to add dilation

* Local linting not always producing same output as CI, probably my fault

* Fixed bug, tested locally

* Abusing CI until I can figure out how to reproduce the same behaviour of running integration tests locally.

* Ammeded conv2d_int8 test

* Updated workload, improved unit tests

* Added depthwise conv2d workload test
TusharKanekiDey pushed a commit to TusharKanekiDey/tvm that referenced this pull request Jan 20, 2021
* added asymmetric padding to conv2d workload

* fixed depthwise conv2d padding

* Added fix to include dilation in workload output width calculation

* Added missing dilation to arm_cpu/conv2d_int8.py workload

* Fixed dilation for x86 conv2d

* Improved dilation workload integration in x86

* Fixed x86 conv2d_alter_op to add dilation

* Local linting not always producing same output as CI, probably my fault

* Fixed bug, tested locally

* Abusing CI until I can figure out how to reproduce the same behaviour of running integration tests locally.

* Ammeded conv2d_int8 test

* Updated workload, improved unit tests

* Added depthwise conv2d workload test
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Jan 21, 2021
* added asymmetric padding to conv2d workload

* fixed depthwise conv2d padding

* Added fix to include dilation in workload output width calculation

* Added missing dilation to arm_cpu/conv2d_int8.py workload

* Fixed dilation for x86 conv2d

* Improved dilation workload integration in x86

* Fixed x86 conv2d_alter_op to add dilation

* Local linting not always producing same output as CI, probably my fault

* Fixed bug, tested locally

* Abusing CI until I can figure out how to reproduce the same behaviour of running integration tests locally.

* Ammeded conv2d_int8 test

* Updated workload, improved unit tests

* Added depthwise conv2d workload test
electriclilies pushed a commit to electriclilies/tvm that referenced this pull request Feb 18, 2021
* added asymmetric padding to conv2d workload

* fixed depthwise conv2d padding

* Added fix to include dilation in workload output width calculation

* Added missing dilation to arm_cpu/conv2d_int8.py workload

* Fixed dilation for x86 conv2d

* Improved dilation workload integration in x86

* Fixed x86 conv2d_alter_op to add dilation

* Local linting not always producing same output as CI, probably my fault

* Fixed bug, tested locally

* Abusing CI until I can figure out how to reproduce the same behaviour of running integration tests locally.

* Ammeded conv2d_int8 test

* Updated workload, improved unit tests

* Added depthwise conv2d workload test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants