[Bug] The optimal implementation of reduce_sum searched by Ansor is more than 30x slower than torch.sum #15342

@MrJungle1

Expected behavior

The best reduce_sum implementation found by Ansor should perform comparably to torch.sum.

Actual behavior

The best reduce_sum implementation found by Ansor is more than 30x slower than torch.sum.

Environment

TVM version: 0.12.0 release
NVCC: 11.0

Steps to reproduce

[Three images attached; their contents are not reproduced here]

Triage

  • tune:auto_scheduler

Labels: needs-triage, type: bug