[Bug]: Multistep with n>1 Fails #7968
Comments
I will take a look later today |
Looks like @tdoublep encountered this issue a while ago in the context of speculative decoding and has a PR with a fix (that would need to be rebased):
I also found a couple of other issues for the same crash: |
cc @afeldman-nm |
I'm running into the same issue. Does anyone know of a workaround? We don't need We can reproduce using VLLM's provided This runs ok:
This crashes:
The error I'm getting is:
|
@comaniac Hi, just wondering if someone working on vLLM can provide an update on this. We want to use the multi-step scheduler because its throughput is much better for our needs, but we also need to set n > 1. Simply disabling multistep in that case won't work for us. Thanks! |
Sorry, we're busy with the company event (Ray Summit) through this week. Will try to find some time after the event to look into it. @SolitaryThinker could you also take a look if you get a chance? |
@afeldman-nm has a WIP branch for this |
Thanks — are you referring to the branch linked above that disables the multi-step scheduler? |
Yes, to avoid crashing the server. We are not planning to support both multistep and beam search at the same time. Instead, we are working on rearchitecting vLLM to have asynchronous scheduling, which will accomplish the same goal as multistep for throughput performance while making it easier to support the other features. However, if you have an idea for how to do this with multistep, feel free to open a PR. |
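For anyone who needs a stopgap in the meantime, one option that stays entirely outside vLLM is to split a request with n > 1 into n single-sample requests on the client, so the server only ever sees n = 1. The following is a minimal sketch, assuming an OpenAI-style completions payload with `n` and an optional `seed` field. `fan_out` is a hypothetical helper, not part of vLLM, and this has not been verified against the crash reported here.

```python
import copy


def fan_out(payload: dict) -> list[dict]:
    """Split one request asking for n samples into n requests asking for one.

    Hypothetical client-side helper; the field names ("n", "seed") are
    assumed to follow the OpenAI-style completions payload.
    """
    n = int(payload.get("n", 1))
    if n < 1:
        raise ValueError("n must be >= 1")

    requests = []
    for i in range(n):
        single = copy.deepcopy(payload)  # leave the caller's payload untouched
        single["n"] = 1
        # If the caller fixed a seed, offset it per copy so the copies
        # are not all asked for the same sample.
        if single.get("seed") is not None:
            single["seed"] = single["seed"] + i
        requests.append(single)
    return requests


# Placeholder values for illustration only.
payload = {"model": "my-model", "prompt": "Hello", "n": 3, "seed": 7, "max_tokens": 16}
singles = fan_out(payload)
```

Each entry in `singles` can then be sent as its own request and the responses merged on the client. The trade-off is that the prompt is submitted n times instead of once, so this may cost extra prefill work compared with a native n > 1 request.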
Your current environment
The output of `python collect_env.py`
🐛 Describe the bug
Launched server with:
vllm serve $MODEL --num-scheduler-steps 8
Sent the following request:
Got the following output: