ROCm backend for AMDGPU? #2443

@masahi

Hi, do you have any plans to support a ROCm backend for AMD GPUs? (ROCm is AMD's equivalent of CUDA.)
Or would you be interested in a PR?

It would enable assembly-level optimizations that are not possible with the OpenCL backend.

I've been involved in ROCm backend development for TVM, which is basically Halide tailored for deep learning inference. Their ROCm backend compiles high-level IR written in Python to optimized GPU code using LLVM's AMDGPU backend.

Although their backend is still very preliminary (the runtime was fixed just a week ago, and support for special math functions is still in progress), the performance of the generated code is quite decent. Without any AMD-specific optimization, their sgemm kernel already achieves 5200 GFLOPS on an 8 TFLOPS card (see here) and 7740 GFLOPS on a 12.5 TFLOPS card. Their HWCW layout convolution kernel already achieves performance on par with their OpenCL backend.
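
To put those sgemm numbers in perspective, here is a quick sketch of the fraction of theoretical peak they correspond to. The helper name `fraction_of_peak` is just something I made up for illustration; the inputs are the figures quoted above.

```python
def fraction_of_peak(achieved_gflops: float, peak_tflops: float) -> float:
    """Return achieved throughput as a fraction of the card's theoretical peak."""
    peak_gflops = peak_tflops * 1000.0  # 1 TFLOPS = 1000 GFLOPS
    return achieved_gflops / peak_gflops


# (achieved GFLOPS, peak TFLOPS) pairs taken from the sgemm results above
results = [(5200.0, 8.0), (7740.0, 12.5)]

for achieved, peak in results:
    ratio = fraction_of_peak(achieved, peak)
    print(f"{achieved:.0f} GFLOPS on a {peak} TFLOPS card: {ratio:.1%} of peak")
```

So both cards land at roughly 62 to 65% of peak before any AMD-specific tuning.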

My real passion is imaging, or visual computing in general (not deep learning per se), so I'd love to see ROCm support in Halide as well. Please note that I don't work for AMD and have no affiliation with it. I just prefer ROCm's open ecosystem to NVIDIA's platform.

Thanks
