Asymmetric padding and dilation in conv2d workload #7142
The change LGTM. Could you add a test case?
Happy to add a test case if necessary, though I have yet to get familiar with the testing infrastructure for TVM. Existing specific TOPI conv2d implementations are tested with asymmetric padding under This change just ensures that this data is held in the workload too. If all of the existing tests pass, is that sufficient?
As you pointed out, the workload doesn't handle asymmetric padding the way the compute implementation does, which looks like a bug to me. However, it has never triggered CI errors, meaning that there are no existing test cases for it. As a result, I'm expecting a test case that requires this PR in order to pass. For example,
Thanks, I understand better what a good test for this PR would be: one that fails on the current I've been working on devising a test like this, but haven't got one that fails yet. Will keep working on it, but here's my reasoning so far: afaik the workload data is only used in the creation of fallback schedules. e.g. for creating the So I imagine I would want to create a test that has padding such that something like e.g. If we are to focus on NCHWc convolution, which uses It always works. Though perhaps it suffers a performance regression? It could be happenstance that none of the conv2d schedules make transformations that are rendered invalid by having an incorrect value for the output height/width. That being the case, I'm unsure how I would devise a test for this.
I see what you mean. How about we simply add a test in
def test_workload_with_asymmetric_padding():
    cfg = ...
    wkl = _get_workload(...)  # with asymmetric padding
    int32_lanes = ...
    num_int8_elements = ...
    fallback_schedule_cpu_common_int8(cfg, wkl, int32_lanes, num_int8_elements)
    assert cfg["tile_ow"] ...  # check that the tile_ow candidates are factors of the correct output width
The same applies to the other ops changed by this PR.
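The check suggested above can be illustrated without TVM. The following is a minimal sketch, not the TVM implementation: `out_width`, `pick_tile_ow`, and `check_tile_ow` are hypothetical helpers, and the "stale" width assumes a workload that only records one padding value per axis and ignores dilation. It shows why a tile size derived from the wrong output width may fail to divide the real one.

```python
def out_width(in_w, kernel_w, stride_w, pad_left, pad_right, dilation_w=1):
    """Output width of a 2D convolution along one axis."""
    dilated_kw = (kernel_w - 1) * dilation_w + 1
    return (in_w + pad_left + pad_right - dilated_kw) // stride_w + 1


def pick_tile_ow(ow, max_tile=31):
    """Hypothetical stand-in for a fallback heuristic:
    the largest factor of ow that does not exceed max_tile."""
    return next(n for n in range(max_tile, 0, -1) if ow % n == 0)


def check_tile_ow(in_w, kernel_w, stride_w, pad_left, pad_right, dilation_w=1):
    """Return (tile from correct width divides it, tile from stale width divides it)."""
    true_ow = out_width(in_w, kernel_w, stride_w, pad_left, pad_right, dilation_w)
    # Assumed buggy behaviour: symmetric padding (2 * pad_left), no dilation.
    stale_ow = out_width(in_w, kernel_w, stride_w, pad_left, pad_left)
    return (
        true_ow % pick_tile_ow(true_ow) == 0,
        true_ow % pick_tile_ow(stale_ow) == 0,
    )


# Asymmetric padding (1 on the left, 2 on the right) on a 28-wide input.
print(check_tile_ow(in_w=28, kernel_w=3, stride_w=1, pad_left=1, pad_right=2))
```

A test in this spirit only needs to assert that the chosen tile size is a factor of the output width computed from the full padding and dilation.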
Thanks for the tip! I've added a test for Does the style of this test meet the standards you expect? E.g. is the test being called as a nested function of If so, I will add a similar test for all of the ops touched by this PR. Otherwise, happy to take some suggestions on how to improve this test before I reproduce it elsewhere. EDIT
Left some comments. Please take a look and apply them to the other tests as you mentioned. Also, please rebase and run `git submodule update`. This PR should not update the submodule (i.e., `vta-hw`).
Generally LGTM. I suggest fixing the remaining code style issues so that we can merge this in and avoid blocking the group convolution PR.
I have made all of the specifically requested changes, @comaniac, and have added tests to:
test_topi_conv2d_int8.py
test_topi_conv2d_nchw.py
test_topi_depthwise_conv2d.py
These three tests cover all the workloads that are touched by this PR. Other data layouts use the same workload definitions. A similar test could be added to every single conv2d test, but I am unsure whether that is worth it right now.
Thanks @Wheest @FrozenGene
* added asymmetric padding to conv2d workload
* fixed depthwise conv2d padding
* Added fix to include dilation in workload output width calculation
* Added missing dilation to arm_cpu/conv2d_int8.py workload
* Fixed dilation for x86 conv2d
* Improved dilation workload integration in x86
* Fixed x86 conv2d_alter_op to add dilation
* Local linting not always producing same output as CI, probably my fault
* Fixed bug, tested locally
* Abusing CI until I can figure out how to reproduce the same behaviour of running integration tests locally.
* Amended conv2d_int8 test
* Updated workload, improved unit tests
* Added depthwise conv2d workload test
The goal of this pull request is to make asymmetric padding a first-class citizen in 2D convolution in TOPI.
The current workload description has "hpad" and "wpad"; however, these do not represent all of the possible padding configurations. Most TOPI conv2d implementations in TVM already support asymmetric padding, so I think this should be reflected in the workload description.
EDIT
The process of developing this PR uncovered an additional bug in the Conv2D workload definitions: the output dimensions were not being calculated correctly for fallback_schedules. Neither asymmetric padding nor dilation was taken into account, which led to some untested incorrect behaviour. In some cases this could perhaps result in a schedule with a performance regression, but this has not been tested.
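To make the output-dimension calculation concrete, here is a minimal sketch of a workload record that carries four separate padding values plus dilation, together with the standard conv2d output-size formula. `ConvWorkload` and `output_size` are illustrative names and a trimmed-down set of fields chosen for this example; they are not claimed to match the exact definitions in TVM.

```python
from collections import namedtuple

# Hypothetical, trimmed-down workload record (field names are illustrative).
ConvWorkload = namedtuple(
    "ConvWorkload",
    [
        "height", "width",
        "kernel_h", "kernel_w",
        "padt", "padl", "padb", "padr",   # top, left, bottom, right padding
        "dilation_h", "dilation_w",
        "stride_h", "stride_w",
    ],
)


def output_size(wkl):
    """Output (height, width) using per-side padding and dilation."""
    # A dilated kernel covers (k - 1) * dilation + 1 input positions.
    dilated_kh = (wkl.kernel_h - 1) * wkl.dilation_h + 1
    dilated_kw = (wkl.kernel_w - 1) * wkl.dilation_w + 1
    out_h = (wkl.height + wkl.padt + wkl.padb - dilated_kh) // wkl.stride_h + 1
    out_w = (wkl.width + wkl.padl + wkl.padr - dilated_kw) // wkl.stride_w + 1
    return out_h, out_w


wkl = ConvWorkload(
    height=32, width=32, kernel_h=3, kernel_w=3,
    padt=0, padl=0, padb=1, padr=1,
    dilation_h=2, dilation_w=2, stride_h=1, stride_w=1,
)
print(output_size(wkl))  # (29, 29)
```

Using a single "hpad"/"wpad" value per axis would force the calculation to assume equal padding on both sides, and omitting the dilation term would overestimate the output size whenever dilation is greater than 1.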