[Feature] plan to support medusa? #859
Comments
Speculative decoding support is on our roadmap. FlashInfer has already implemented the corresponding kernel and made targeted optimizations. Please stay tuned.
Really looking forward to fast decoding methods such as Medusa, speculative decoding, lookahead decoding, and the like.
Hi, I was wondering whether Medusa will be supported with full tree attention, or only as the Top-1 version currently available in vLLM. Thanks.
This issue has been automatically closed due to inactivity. Please feel free to reopen it if needed.
Motivation
Is there a plan to support Medusa?
Related resources
No response