[Bugfix] Explicitly set LoRA triton kernel device #13043
jeejeelee wants to merge 3 commits into vllm-project:main from
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a limited subset of checks runs. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀
tlrmchlsmth left a comment:
It seems we will need to do this for all triton kernels. @fabianlim ran into this problem on Bamba as well (it uses triton kernels for the mamba mixer).
If we have to do this everywhere, it will be easy to miss a spot. Is it possible to do this in the GPUModelRunner instead?
Makes sense, I will handle it ASAP.
#13027 solves this issue better, so I'm closing this one.
FIX #12967 (link existing issues this PR will resolve)