An (almost) tutorial on how to write CUDA kernels in Rust, right alongside the CPU code.
Personally, I would be glad to see Rust dominate the world, so that we could use it in many different areas ;)
I was very excited to see Jorge Aparicio's PTX generation example (@japaric/nvptx) some time ago. Since then, I have been thinking about how we can go further towards a more natural mix of CUDA and Rust code.
- Chapter 0 - Reference CPU code
- Chapter 1 - Separated host and device code
- Chapter 2 - Naive approach: Merging host and device code with #[cfg(...)]
- Chapter 3 - Better approach: Merging host and device code with Compiler Plugins / Procedural Macros
- Chapter 4 - Beyond CUDA: Is the approach good for OpenCL as well?
Unfortunately, I don't know how far we will get. The goal is to create a convenient way to write CUDA / OpenCL kernels in Rust, as easily as host code, but I'm not sure we can achieve it easily (if it's possible at all). So, let's start our journey through the thorns to a better future with GPGPU in Rust!