[Bug] The optimal implementation of reduce_sum searched by Ansor is more than 30x slower than torch.sum #15342

### Expected behavior

The best reduce_sum implementation found by Ansor should perform comparably to torch.sum.

### Actual behavior

The best reduce_sum implementation found by Ansor is more than 30x slower than torch.sum.

### Environment

* TVM version: 0.12.0 release
* NVCC: 11.0

### Steps to reproduce
![image](https://github.com/apache/tvm/assets/43968255/ef8eb2fa-ef5e-4cad-854b-9841793c8d47)
![image](https://github.com/apache/tvm/assets/43968255/7ad11ac4-45ce-415d-803e-f57ac19053ae)
![image](https://github.com/apache/tvm/assets/43968255/e8fe4c4b-1510-48f4-a720-c6b717da55c8)

### Triage

Please refer to the list of label tags [here](https://github.com/apache/tvm/wiki/Issue-Triage-Labels) to find the relevant tags and add them below in a bullet format (example below).

* tune:auto_scheduler

