[Model] Refactor JambaForCausalLM #21394
Conversation
Signed-off-by: Jee Jee Li <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Code Review
This pull request refactors the Jamba model implementation, primarily by replacing the custom JambaMLP with the more generic LlamaMLP and by simplifying the weight-loading logic in JambaForCausalLM to use the AutoWeightsLoader. These changes improve code clarity and maintainability. I found two high-severity issues in the weight-loading logic.
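For context, the AutoWeightsLoader pattern the refactor moves to typically looks like the sketch below. This is a minimal sketch of the common vLLM idiom, not the PR's exact diff; the method body and type hints may differ between vLLM versions.

    from typing import Iterable

    import torch

    from vllm.model_executor.models.utils import AutoWeightsLoader

    # Sketch of a load_weights method as it would appear inside
    # JambaForCausalLM: delegate per-module weight routing to
    # AutoWeightsLoader instead of a hand-written loading loop.
    def load_weights(self, weights: Iterable[tuple[str, torch.Tensor]]) -> set[str]:
        loader = AutoWeightsLoader(self)
        return loader.load_weights(weights)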
| ("qkv_proj", "q_proj", "q"), | ||
| ("qkv_proj", "k_proj", "k"), | ||
| ("qkv_proj", "v_proj", "v"), |
The stacked_params_mapping entries for qkv_proj are missing leading dots. This can cause ambiguous or incorrect weight-name matching, since q_proj could match unrelated parameter names as a substring. Using .q_proj ensures the mapping only matches at a module boundary, which is safer and more explicit. The inconsistency with the gate_up_proj mappings and with implementations in models like LlamaModel further emphasizes the need for correction.
Style Guide References
| ("qkv_proj", "q_proj", "q"), | |
| ("qkv_proj", "k_proj", "k"), | |
| ("qkv_proj", "v_proj", "v"), | |
| (".qkv_proj", ".q_proj", "q"), | |
| (".qkv_proj", ".k_proj", "k"), | |
| (".qkv_proj", ".v_proj", "v"), |
    return hidden_states.view(orig_shape)


    class JambaMLP(JambaMoE):
Signed-off-by: Jee Jee Li <[email protected]>
    if num_experts > 1:
        self.feed_forward = JambaMoE(
            config,
            quant_config=quant_config,
            prefix=f"{prefix}.feed_forward",
        )
    else:
        self.feed_forward = JambaMLP(
            config.hidden_size,
            config.intermediate_size,
            config.hidden_act,
            quant_config=quant_config,
            prefix=f"{prefix}.feed_forward",
        )
yeah this is cleaner
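Per the review summary, the dense branch now reuses the generic LlamaMLP rather than deriving the MLP from JambaMoE. A plausible sketch of that reuse, an assumption based on the summary rather than the PR's literal code (LlamaMLP's constructor takes hidden_size, intermediate_size, and hidden_act, matching the positional arguments above):

    from vllm.model_executor.models.llama import LlamaMLP

    # Dense Jamba layers can reuse the generic gate/up/down MLP directly,
    # since the positional arguments above line up with LlamaMLP's signature.
    JambaMLP = LlamaMLP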
| (".gate_up_proj", ".gate_proj", 0), | ||
| (".gate_up_proj", ".up_proj", 1), |
why do we have to handle these now? (the old code didn't mention these)
These exist in the old version of the code: https://github.com/vllm-project/vllm/blob/v0.5.4/vllm/model_executor/models/jamba.py#L857. This PR is just a revert.
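For context, these tuples feed the standard stacked-parameter loading loop shared across vLLM model files. A condensed sketch, simplified from the real implementation (it omits bias and expert special cases):

    from typing import Iterable

    import torch

    from vllm.model_executor.model_loader.weight_utils import default_weight_loader

    def load_with_stacked_mapping(
        params_dict: dict[str, torch.nn.Parameter],
        weights: Iterable[tuple[str, torch.Tensor]],
    ) -> None:
        stacked_params_mapping = [
            # (fused param name, checkpoint shard name, shard id)
            (".qkv_proj", ".q_proj", "q"),
            (".qkv_proj", ".k_proj", "k"),
            (".qkv_proj", ".v_proj", "v"),
            (".gate_up_proj", ".gate_proj", 0),
            (".gate_up_proj", ".up_proj", 1),
        ]
        for name, loaded_weight in weights:
            for param_name, weight_name, shard_id in stacked_params_mapping:
                if weight_name not in name:
                    continue
                # Route the checkpoint shard into its fused parameter; the
                # fused layer's weight_loader places it at the right offset.
                param = params_dict[name.replace(weight_name, param_name)]
                param.weight_loader(param, loaded_weight, shard_id)
                break
            else:
                # Unfused parameter: load it one-to-one.
                param = params_dict[name]
                getattr(param, "weight_loader", default_weight_loader)(
                    param, loaded_weight)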
tlrmchlsmth left a comment:
looks good, thanks - just a couple of nits
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: x22x22 <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Jinzhen Lin <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Noam Gat <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Paul Pak <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Diego-Castan <[email protected]>
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.

Purpose
The main motivation is to standardize the model implementation and to support BNB (BitsAndBytes) quantization.
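A hypothetical usage sketch of the BNB path this enables (the checkpoint name is an example; the quantization flag follows vLLM's BitsAndBytes support):

    from vllm import LLM

    # Example Jamba checkpoint, loaded with in-flight BitsAndBytes quantization.
    llm = LLM(model="ai21labs/AI21-Jamba-1.5-Mini",
              quantization="bitsandbytes")
    print(llm.generate("Hello")[0].outputs[0].text)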
Test Plan
Test Result
(Optional) Documentation Update