ROCm backend for AMDGPU?

Hi, do you have any plans to support a ROCm backend for AMDGPU? ([ROCm](https://rocm.github.io/) is AMD's equivalent of CUDA.)
Or would you be interested in a PR?

A ROCm backend would enable assembly-level optimizations that are not possible with the OpenCL backend.

I've been involved with ROCm backend development for [TVM](https://github.com/dmlc/tvm), which is basically Halide tailored for deep learning inference. Their ROCm backend compiles a high-level IR written in Python to optimized GPU code using LLVM's AMDGPU backend.

Although their backend is still very much preliminary (the runtime was [fixed](https://github.com/dmlc/tvm/pull/544) within the last week, and special math function support is still [in progress](https://github.com/dmlc/tvm/pull/553)), the performance of the generated code is quite decent. Without any AMD-specific optimization, their sgemm kernel already achieves 5200 GFLOPS on an 8 TFLOPS card (see [here](https://github.com/dmlc/tvm/pull/554)) and 7740 GFLOPS on a 12.5 TFLOPS card. Their HWCW layout convolution kernel already achieves performance on par with their OpenCL backend.
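To put those sgemm numbers in perspective, here is a quick sketch of the fraction of peak throughput they correspond to. It only does the arithmetic on the figures quoted above; the helper name `fraction_of_peak` is mine, and I'm assuming the quoted TFLOPS values are the cards' peak single-precision rates.

```python
def fraction_of_peak(achieved_gflops, peak_tflops):
    """Return achieved throughput as a fraction of the card's peak.

    Assumes peak_tflops is the card's theoretical peak, in TFLOPS.
    """
    return achieved_gflops / (peak_tflops * 1000.0)


# (achieved GFLOPS, peak TFLOPS) pairs taken from the figures quoted above.
results = [(5200, 8.0), (7740, 12.5)]

for achieved, peak in results:
    ratio = fraction_of_peak(achieved, peak)
    print(f"{achieved} GFLOPS on a {peak} TFLOPS card: {ratio:.1%} of peak")
```

So both cards land at roughly 60 to 65% of peak before any AMD-specific tuning.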

My real passion is in imaging, or visual computing in general (not deep learning per se), so I'd love to see ROCm support in Halide as well. Please note that I don't work for AMD and have no affiliation with it. I just prefer ROCm's open ecosystem over NVIDIA's platform.

Thanks


ROCm backend for AMDGPU? #2443
