Add NotImplementedError to v1 cpu runner #19527
fred2167 wants to merge 2 commits into vllm-project:main from
add v1 cpu runner not implemented error
Did you hit any issue when using the CPU V1 runner? The CPU V1 model runner inherits from the GPU V1 model runner, and most member functions can be reused directly, so there is no need to raise NotImplementedError. Just checked
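To illustrate the inheritance argument in the comment above, here is a minimal sketch. The class and method names (`GPURunnerSketch`, `CPURunnerSketch`, `prepare_batch`, `warm_up`) are hypothetical stand-ins and not vLLM's actual API; the sketch only assumes that the CPU runner subclasses the GPU runner and overrides the device-specific parts.

```python
class GPURunnerSketch:
    """Hypothetical stand-in for a GPU model runner."""

    def __init__(self, device: str = "cuda") -> None:
        self.device = device

    def prepare_batch(self, token_ids: list[int]) -> dict:
        # Device-agnostic logic: reusable by any subclass as-is.
        return {"device": self.device, "num_tokens": len(token_ids)}

    def warm_up(self) -> str:
        # Device-specific logic: a subclass is expected to override this.
        return "gpu warm-up"


class CPURunnerSketch(GPURunnerSketch):
    """Hypothetical CPU runner that reuses most of the parent's methods."""

    def __init__(self) -> None:
        super().__init__(device="cpu")

    def warm_up(self) -> str:
        # Only the device-specific step is replaced.
        return "cpu warm-up"


runner = CPURunnerSketch()
batch = runner.prepare_batch([1, 2, 3])  # inherited, not re-implemented
```

Under this assumption, every method the subclass does not override still runs the parent's implementation, which is why blanket NotImplementedError overrides would be unnecessary.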
Signed-off-by: Fred Chan <fred2167@gmail.com>
|
Yeah, I am also a bit confused, since the CPU backend also works on macOS CPUs. :-)
I'm running on an M1 Mac, so I guess I don't have this package from Intel used by the SDPA backend. Any recommendation for Mac?
Does the Intel package also support GPU? If not, would it be better to move the CPU-specific logic into the CPU model runner?
This is the output I'm getting by running
Interesting, are you able to run the example on Mac?
|
@fred2167 The V1 engine requires chunked-prefill support, which is not yet supported on macOS. For CPU, only x86 supports this via … The CUDA backend does not use the SDPA backend; it is only used by CPU.
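The platform constraint described above could be expressed as a small guard. This is only a sketch: the function name `cpu_supports_chunked_prefill` is hypothetical, and the rule it encodes (x86 only, never macOS) is taken from the comment rather than from vLLM's code.

```python
import platform
from typing import Optional


def cpu_supports_chunked_prefill(
    system: Optional[str] = None, machine: Optional[str] = None
) -> bool:
    """Hypothetical check: per the comment, only x86 CPUs qualify, and macOS does not."""
    # Fall back to the current host when no values are passed in.
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    if system == "darwin":
        return False
    # platform.machine() reports "x86_64" on Linux and "AMD64" on Windows.
    return machine in {"x86_64", "amd64"}


# Example: an Apple Silicon Mac would be rejected under this assumption.
on_m1_mac = cpu_supports_chunked_prefill("Darwin", "arm64")
```

Passing `system` and `machine` explicitly keeps the check testable without depending on the machine it runs on.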
|
https://gist.github.com/houseroad/9fc43ba08c192c7c91914f2f1af539fb, I tried something like this yesterday :-) |
Makes sense. I don't think it's a small lift for V1 to support Mac. It may be worth updating the docs to reflect this, given that someone had a similar issue on Slack.
Is this on the V0 or V1 engine?
|
V0 works. V1 failed. |
|
I thought my work was done on v1, but actually it was on v0. Then this PR may make sense. |
|
The underlying issue was addressed by #19121 |
Purpose
This is my first commit to the repo. The purpose is to familiarize myself with the codebase through a minimal change.
This PR only aims to make the error more explicit when running the example on a CPU machine, so that it doesn't fall through to the GPU implementation.
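As a rough illustration of the intent, the sketch below shows a subclass overriding an inherited method to raise NotImplementedError instead of silently running the parent's GPU path. The names `GPUPathSketch`, `CPUPathSketch`, and `capture_model` are hypothetical and are not taken from the PR diff.

```python
class GPUPathSketch:
    """Hypothetical parent whose method assumes a GPU is present."""

    def capture_model(self) -> str:
        return "gpu path"


class CPUPathSketch(GPUPathSketch):
    """Hypothetical CPU subclass that fails loudly on an unsupported step."""

    def capture_model(self) -> str:
        # Without this override, the call would fall through to the GPU path.
        raise NotImplementedError(
            f"{type(self).__name__}.capture_model is not implemented on CPU"
        )


try:
    CPUPathSketch().capture_model()
except NotImplementedError as exc:
    message = str(exc)  # names the class and method that are unsupported
```

The trade-off raised in the review still applies: an override like this is only useful for methods the CPU path genuinely cannot reuse.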
Test Plan
Unit test and example
Test Result