[SYCL] Add support for SYCL Nvidia target #5738

Merged: 3 commits into ggerganov:master on Mar 11, 2024

Conversation

AidanBeltonS (Collaborator)

This PR adds CMake support and instructions for a SYCL Nvidia target. This is done with the LLAMA_SYCL_BACKEND CMake option, which defaults to INTEL but can be set to NVIDIA (see the build sketch below). This approach allows us to expand to AMD targets in the future.

This PR depends on #5591, as that resolves the failing Nvidia test-backend-ops tests.
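For concreteness, here is a minimal sketch of how such a build might be configured. Only the LLAMA_SYCL_BACKEND option comes from this PR description; the surrounding flags (LLAMA_SYCL, the icx/icpx compiler choices, the oneAPI environment script path) are assumptions based on the existing SYCL build instructions and may differ from the final merged CMake.

```sh
# Assumed oneAPI environment setup; the path may differ on your system.
source /opt/intel/oneapi/setvars.sh

# Configure for the Nvidia target via the option described in this PR.
# LLAMA_SYCL and the compiler choices are assumptions, not taken from this PR.
cmake -B build \
      -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx \
      -DLLAMA_SYCL=ON \
      -DLLAMA_SYCL_BACKEND=NVIDIA

cmake --build build --config Release
```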

@AidanBeltonS (Collaborator, Author)

@NeoZhangJianyu, @abhilash1910, @Alcpz, feel free to review

@0cc4m (Collaborator) commented Feb 26, 2024

Just curious, does this kind of build time selection mean that it's not possible to run an Nvidia and an Intel GPU at the same time using SYCL?

@AidanBeltonS (Collaborator, Author) commented Feb 27, 2024

> Just curious, does this kind of build time selection mean that it's not possible to run an Nvidia and an Intel GPU at the same time using SYCL?

It is possible to target both Nvidia and Intel GPUs with the same SYCL code: if you pass -fsycl-targets=nvptx64-nvidia-cuda,spir64, device code is generated for both targets and you can execute on either GPU at runtime. The issue in the specific case of llama is that oneMKL must be built for a specific backend (here the cuBLAS backend), so in this case it makes sense to build for either device, but not both at the same time.
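As an illustration of the multi-target compile mentioned above, a hedged sketch; only the -fsycl-targets value comes from the comment, while the icpx driver invocation, the presence of the Codeplay Nvidia plugin, and the file names are assumptions.

```sh
# Emit device code for both Nvidia PTX and SPIR-V in one binary; the same
# executable can then run on either an Nvidia or an Intel GPU at runtime.
# main.cpp is a placeholder source file; assumes the oneAPI DPC++ compiler
# with Nvidia support installed.
icpx -fsycl -fsycl-targets=nvptx64-nvidia-cuda,spir64 main.cpp -o main
```

At runtime the device can then be chosen through the usual SYCL device selection mechanisms (for example, the ONEAPI_DEVICE_SELECTOR environment variable in DPC++).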

@NeoZhangJianyu (Collaborator)

@AidanBeltonS
What's the key benefit of this PR to users: performance, or ease of use?
The only benefit I see is the one you described, "one binary file supports multiple GPUs (Intel, NV, ...)".

But this PR doesn't actually provide that key feature to users.

In other words, how do we persuade users to use the SYCL backend for NV GPUs instead of the cuBLAS backend?

@AidanBeltonS (Collaborator, Author)

> @AidanBeltonS What's the key benefit of this PR to users: performance, or ease of use? The only benefit I see is the one you described, "one binary file supports multiple GPUs (Intel, NV, ...)".
>
> But this PR doesn't actually provide that key feature to users.
>
> In other words, how do we persuade users to use the SYCL backend for NV GPUs instead of the cuBLAS backend?

oneAPI does support a single binary file for multiple GPUs by default; oneMKL is the reason we cannot do that in this case. In the long term I think we should raise this issue and fix it, rather than not supporting Nvidia GPUs.
There are a few other reasons why we should want to support Nvidia GPUs:

  1. SYCL is a portable, performant standard, and we should use it as such.
  2. Supporting different vendors also makes the implementation more generic, increasing the likelihood that other SYCL implementations such as AdaptiveCpp will work with the code easily. This aids the development of an ecosystem rather than an exclusive backend.
  3. Testing and accessibility: the more devices we support, the easier it is to test and run CI, and the more people can use it, i.e. it gains adoption.
  4. Debugging: by using the same hardware across backends, we can isolate differences to the implementation and take the hardware out of the equation.

@abhilash1910 (Collaborator) left a comment


LGTM! Let's wait for CI.
cc @ggerganov

@NeoZhangJianyu NeoZhangJianyu merged commit 3814a07 into ggerganov:master Mar 11, 2024
60 checks passed
NeoZhangJianyu pushed a commit to NeoZhangJianyu/llama.cpp that referenced this pull request Mar 12, 2024
* Add support for nvidia target in CMake

* Update sycl read-me for Nvidia target

* Fix errors
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
* Add support for nvidia target in CMake

* Update sycl read-me for Nvidia target

* Fix errors
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
* Add support for nvidia target in CMake

* Update sycl read-me for Nvidia target

* Fix errors