[Models] Cohere Eagle + fix to Cohere MoE#42078
Conversation
e276516 to
1e4284a
Compare
There was a problem hiding this comment.
Code Review
This pull request introduces support for the EAGLE speculative decoding draft model for Cohere architectures, including the implementation of EagleCohereForCausalLM and updates to the reasoning parser for handling multiple structural tags. The review feedback highlights several critical improvements: initializing flags to track model-specific weights for the speculative proposer, ensuring the first draft layer correctly disables the input layernorm to align with the original EAGLE implementation, and removing logic that skips loading embeddings to avoid breaking weight-sharing comparisons.
|
Documentation preview: https://vllm--42078.org.readthedocs.build/en/42078/ |
…er for MHL v2 - Add EagleCohereForCausalLM model (cohere_eagle.py) for Eagle speculative decoding - Register EagleCohereForCausalLM in model and test registries - Add Cohere2VisionForConditionalGeneration to multimodal spec decode list - Update CohereCommandReasoningParser for MHL v2: support multiple JSON tags per architecture (e.g. MOE uses both START_RESPONSE and START_TEXT delimiters), add Cohere2MoeForCausalLM tag style, fix structural_tag triggers list Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Terrencezzj <terrence@cohere.ai>
Signed-off-by: Terrencezzj <terrence@cohere.ai>
d5a3e2c to
344552d
Compare
|
Hi @Terrencezzj, the pre-commit checks have failed. Please run: uv pip install pre-commit>=4.5.1
pre-commit install
pre-commit run --all-filesThen, commit the changes and push to your branch. For future commits, Tip Is
|
Signed-off-by: Terrencezzj <terrence@cohere.ai>
Head branch was pushed to by a user without write access
Signed-off-by: Terrencezzj <terrence@cohere.ai> Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Terrencezzj <terrence@cohere.ai> Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Terrencezzj <terrence@cohere.ai> Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Terrencezzj <terrence@cohere.ai> Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Signed-off-by: Terrencezzj <terrence@cohere.ai> Co-authored-by: Cursor <cursoragent@cursor.com>
Purpose
Add Cohere Eagle to vLLM.
Update CohereCommandReasoningParser
Online inference
Terminal 1: start server
Terminal 2: run client
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.