[TOPI] add basic scheduling for conv2d_transpose on x86 #3491
Conversation
def traverse(op):
    """Traverse operators from computation graph"""
    # inline all one-to-one-mapping operators except the last stage (output)
    if tag.is_broadcast(op.tag) or tag.is_injective(op.tag):
`is_injective` subsumes `is_broadcast`, so checking both is redundant.
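For illustration, a minimal sketch of the collapsed check inside the usual TOPI traversal idiom; the wrapper function and its scaffolding are illustrative, not the PR's exact code:

```python
import tvm
from topi import tag

def _schedule_conv2d_transpose_sketch(s, outs):
    """Sketch: inline injective stages while walking back from the output."""
    def traverse(op):
        # In topi.tag, is_injective already returns True for elemwise and
        # broadcast tags, so a single check replaces the disjunction
        # `tag.is_broadcast(op.tag) or tag.is_injective(op.tag)`.
        if tag.is_injective(op.tag):
            if op not in s.outputs:
                s[op].compute_inline()
            for tensor in op.input_tensors:
                if isinstance(tensor.op, tvm.tensor.ComputeOp):
                    traverse(tensor.op)
    traverse(outs[0].op)
```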
def _declaration_conv2d_transpose(cfg, data, kernel, strides, padding, out_dtype):
    return _declaration_conv2d_transpose_impl(cfg, data, kernel, strides, padding, out_dtype)

def _declaration_conv2d_transpose_impl(cfg, data, kernel, strides, padding, out_dtype):
Can we put this function in nn/conv2d_transpose.py and have nn.conv2d_transpose_nchw and _declaration_conv2d_transpose share the implementation?
Makes sense. But I am assuming that cfg will be used within this function in the future. If we move it to nn/conv2d_transpose.py, I am not sure how to deal with it. What do you suggest? Thanks!
Anyway, I modified as you suggested and put a TODO for now.
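For context, a minimal sketch of how the x86 declaration can forward to the shared compute, assuming the pre-0.7 TOPI/AutoTVM API; the decorator arguments and the TODO placement are illustrative, not the PR's exact code:

```python
from tvm import autotvm
from topi import nn

# Sketch: the shared compute lives in topi/nn/conv2d_transpose.py as
# nn.conv2d_transpose_nchw; the x86 entry point just forwards to it.
@autotvm.register_topi_compute(nn.conv2d_transpose_nchw, 'cpu', ['direct'])
def _declaration_conv2d_transpose(cfg, data, kernel, strides, padding, out_dtype):
    # TODO: wire `cfg` into the declaration once an AutoTVM template exists
    return nn.conv2d_transpose_nchw(data, kernel, strides, padding, out_dtype)
```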
@yidawang do you folks happen to have an E2E example of Mask R-CNN with TVM? We'd be interested in this at FB for our object detector work.
Thanks @yidawang
* initialize conv2d transpose scheduling on x86
* refine the scheduler a bit
* fix for lint
* address review comments; remove duplicate code
* fix lint
As the title says. For the Mask R-CNN workload, throughput improves from 0.9 GFLOPS to 90 GFLOPS on an Amazon EC2 c5.18xlarge instance.
Finer tuning and AutoTVM support remain future work.
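For reference, a minimal sketch of exercising the new schedule through TOPI's generic dispatch under the pre-0.7 API; the shapes and target string are illustrative, not the benchmark setup used for the numbers above:

```python
import tvm
import topi

# Illustrative NCHW workload; real Mask R-CNN shapes will differ.
data = tvm.placeholder((1, 256, 32, 32), name='data')
kernel = tvm.placeholder((256, 256, 2, 2), name='kernel')

target = 'llvm -mcpu=skylake-avx512'  # e.g. an EC2 c5.18xlarge
with tvm.target.create(target):
    # Dispatches to the x86 compute/schedule registered by this PR.
    out = topi.nn.conv2d_transpose_nchw(data, kernel, strides=(2, 2),
                                        padding=(0, 0), out_dtype='float32')
    s = topi.generic.schedule_conv2d_transpose_nchw([out])
    func = tvm.build(s, [data, kernel, out], target=target)
```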
@yzhliu @kevinthesun