
How to Implement New Operators Using CUDA Host Functions Along with Thrust and CUB Libraries #2258

Open
chenwanqq opened this issue Jun 7, 2024 · 1 comment

Comments

@chenwanqq
Contributor

As the title says, the CUDA code in candle-kernels seems to contain only kernel functions. When I want to implement a new operator (such as nonzero), it seems I can only write the higher-level host logic in Rust, which means I cannot use Thrust's device_vector or CUB's flagged APIs. This is a significant obstacle for implementing my algorithms. For example, to implement nonzero under the current approach, would I have to reimplement primitives like exclusive_scan and scatter myself?

I am hoping for a better way to utilize the CUDA ecosystem!

Specifically, I'm interested in how to:

  1. Incorporate host functions in CUDA code to facilitate the use of libraries like Thrust and CUB.
  2. Effectively leverage these libraries to implement algorithms and operators that are not natively supported in the current codebase.
Any guidance or best practices for achieving this would be greatly appreciated.
(Translated from Chinese using an LLM, so it might read a little bit... formal ^_^)
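
To make this concrete, here is a rough sketch of the kind of host-side helper I have in mind. It is only an illustration, not anything that exists in candle-kernels today: the function name `nonzero_f32` and its signature are made up. It uses `thrust::copy_if` over a counting iterator, so the scan + scatter step is handled by Thrust instead of being hand-rolled in a raw kernel.

```cpp
// nonzero.cu -- hypothetical host-side helper (NOT part of candle-kernels).
#include <thrust/copy.h>
#include <thrust/execution_policy.h>
#include <thrust/iterator/counting_iterator.h>
#include <cstdint>

// Predicate: index i is selected when the corresponding input value is non-zero.
struct IsNonZero {
    const float* data;
    __device__ bool operator()(std::int64_t i) const { return data[i] != 0.0f; }
};

// d_in:  device pointer to n floats
// d_out: device buffer with room for n indices; receives the indices of the
//        non-zero elements, in order
// returns the number of indices written
extern "C" std::int64_t nonzero_f32(const float* d_in, std::int64_t n,
                                    std::int64_t* d_out) {
    thrust::counting_iterator<std::int64_t> first(0), last(n);
    // copy_if performs the stream compaction (internally a scan + scatter),
    // so those primitives do not have to be reimplemented by hand.
    std::int64_t* end = thrust::copy_if(thrust::device, first, last,
                                        d_out, IsNonZero{d_in});
    return end - d_out;
}
```

The same shape would work with CUB's `cub::DeviceSelect::Flagged` if finer control over temporary storage is needed; the point is that this function has to be compiled as host code by nvcc, which the current kernel-only build does not seem to allow.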
@chenwanqq
Contributor Author

I have finished a GPU version of nonzero: candle-nonzero. It uses FFI to invoke CUDA host functions.
I'm still wondering what the best way is to integrate it into this project 🧐
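
For reference, a minimal sketch of what the Rust side of such an FFI binding could look like; the names below are illustrative, not the actual candle-nonzero API. The `.cu` file would be compiled and linked from `build.rs`, for example by driving nvcc through the `cc` crate.

```rust
// Hypothetical binding to the `extern "C"` host function from the .cu sketch above.
extern "C" {
    fn nonzero_f32(d_in: *const f32, n: i64, d_out: *mut i64) -> i64;
}

/// Returns the number of non-zero indices written into `d_out`.
///
/// # Safety
/// `d_in` and `d_out` must be valid device pointers allocated in the same CUDA
/// context that candle's `CudaDevice` uses, and `d_out` must have room for `n`
/// indices.
pub unsafe fn nonzero_indices(d_in: *const f32, n: i64, d_out: *mut i64) -> i64 {
    nonzero_f32(d_in, n, d_out)
}
```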
