[TOPI] Fix bug in Winograd on CUDA #4260
Conversation
Looks good to me (looked at winograd part).
LGTM
@vinx13 could you help merge it? Thanks.
@tqchen seems I don't have write permission
@vinx13 you should have permission now. If not, please check whether you have linked your GitHub account; I will send you instructions.
@tqchen I still don't have permission. I'm already in the Apache GitHub org; is there anything I'm missing?
* fix winograd
* move get padding after kernel transform
Several topics [1, 2, 3] on the discuss forum mention that conv2d fails the runtime shape check after it has been tuned by AutoTVM. This PR investigates the reason and resolves the issue (thanks to @kevinthesun, @Laurawly, and @vinx13 for their help with the investigation).
Investigation
All errors happen at the same dimension: the output image height (arg2.shape[2]).
For example in [1], the workload has input (1, 80, 73, 73) with stride=1 and padding=0, so the output shape should be (1, 192, 71, 71). However, it encountered the following error:
TVMError: Check failed: ret == 0 (-1 vs. 0) : Assert fail: (73 == int32(arg2.shape[2])), Argument arg2.shape[2] has an unsatisfied constraint
It means that somehow the output shape is set to 73 instead of 71 during scheduling. After digging into the code, we found that the Winograd schedule overrides the strides and padding to 1 regardless of the input workload.
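To make the mismatch concrete, the standard conv2d output-size formula reproduces both numbers. This is only a sanity check; the 3x3 kernel size is inferred from the 73 → 71 output height rather than stated in the reports:

```python
def conv_out_size(in_size, kernel, stride, pad):
    # Standard convolution output-size formula.
    return (in_size + 2 * pad - kernel) // stride + 1

# With the workload's real attributes the output height is 71:
print(conv_out_size(73, kernel=3, stride=1, pad=0))  # 71
# With the hard-coded stride=1, padding=1 from the buggy schedule it stays 73,
# which is exactly the value in the failing assert (73 == int32(arg2.shape[2])).
print(conv_out_size(73, kernel=3, stride=1, pad=1))  # 73
```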
Modification
Accordingly, this PR modifies two parts. First, we get the strides and padding directly from the input workload and check their validity as before. This change passes the isolated example that yyding provided on the TVM discuss forum.
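The sketch below illustrates the idea behind this change (the helper name and signature are assumptions, not the actual TOPI code): take the convolution attributes from the workload itself and validate them, instead of hard-coding stride=1 and padding=1.

```python
def check_winograd_attrs(kernel_size, strides, padding, dilation):
    """Hypothetical validity check: the CUDA Winograd template only handles
    unit-stride, non-dilated 3x3 convolutions, so any other workload must keep
    its real strides/padding and go through the direct schedule instead."""
    kh, kw = kernel_size
    sh, sw = (strides, strides) if isinstance(strides, int) else strides
    dh, dw = (dilation, dilation) if isinstance(dilation, int) else dilation
    assert (dh, dw) == (1, 1), "winograd does not support dilation"
    assert (kh, kw) == (3, 3), "winograd only supports 3x3 kernels"
    assert (sh, sw) == (1, 1), "winograd only supports stride 1"
    return (sh, sw), padding
```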
Second, the Winograd unit test uses a fallback config, but the Winograd schedule falls back to the direct template whenever the config is a fallback, so the altered layout for the Winograd schedule was never actually tested. This PR also forces the fallback flag to False in the test to enable it, as suggested by @vinx13.
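A rough sketch of one way to do this, modeled on autotvm's dispatch-context pattern (the class name and details here are illustrative, not the exact test code):

```python
from tvm import autotvm
from tvm.autotvm.task.space import FallbackConfigEntity

class WinogradFallback(autotvm.FallbackContext):
    """Dispatch context that hands out fallback configs with the fallback
    flag cleared, so the schedule does not silently switch back to the
    direct template and the Winograd path actually gets exercised."""
    def _query_inside(self, target, workload):
        key = (target, workload)
        if key not in self.memory:
            cfg = FallbackConfigEntity()
            cfg.is_fallback = False  # pretend this is a tuned config
            self.memory[key] = cfg
        return self.memory[key]

# Usage in the unit test (sketch):
# with WinogradFallback():
#     run the conv2d test bodies that should hit the Winograd schedule
```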
[1] https://discuss.tvm.ai/t/auto-tune-error-occurs-during-inference-when-using-auto-tuned-schedule
[2] https://discuss.tvm.ai/t/error-float16-for-cuda-with-autotvm
[3] https://discuss.tvm.ai/t/graphruntime-module-run-failed-when-created-with-logfile-from-autotvm-tuning