@Yongqi-Zhuo
Contributor

…ffer not padded by PadEinsum

TileWithTensorIntrin in src/tir/schedule/transform.cc calls PadEinsum and then inlines each padded input (output) into its original producer (consumer). However, an input or output may not need padding at all, in which case PadEinsum does not pad its producer (consumer). TileWithTensorIntrin performs the inlining anyway, so it may inline blocks that have nothing to do with padding and must not be inlined.
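To make the failure mode concrete, here is a toy Python model of the inlining decision. It is not TVM code: `Block`, `is_padding`, the block names, and both helper functions are hypothetical and exist only for illustration. It is also only one way to picture the problem, not necessarily how this PR implements the fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for a schedule block next to the einsum."""
    name: str
    is_padding: bool  # True only if the (modelled) padding pass created this block


def inline_naive(producers, consumers):
    """Buggy policy: assume every neighbouring block was created by padding."""
    return [b.name for b in producers + consumers]


def inline_checked(producers, consumers):
    """Safer policy: only inline blocks that padding actually created."""
    return [b.name for b in producers + consumers if b.is_padding]


# Input A and output C needed padding; input B did not, so its producer
# is an ordinary block that must stay untouched.
producers = [Block("A_pad", True), Block("B_reindex", False)]
consumers = [Block("C_pad", True)]

naive = inline_naive(producers, consumers)
checked = inline_checked(producers, consumers)
```

In this model the naive policy also selects `B_reindex`, which corresponds to the incorrect inlining described above, while the checked policy leaves it alone.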

This has led to multiple bug reports and (temporary) fixes, as in #17171, #16614, #16239 and #15505. Unfortunately, #16239 and #17171 only prevent TVM from crashing when the padded buffer is an input/output buffer; some incorrect inlining may still be performed. The workaround in #15505 tries to handle the bug with extra checks in the MultiLevelTilingTensorCore rule, which is logically incorrect. This PR aims to provide a once-and-for-all fix.

Credit to @XFPlus for the bug reproduction example in #16239.

@Yongqi-Zhuo
Contributor Author

cc @tqchen @junrushao
