Conversation

@xuechendi (Contributor) commented Jun 27, 2025

Essential Elements of an Effective PR Description Checklist

  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Purpose

Fix an issue when loading the DeepSeek model. The failure was introduced by #18343. Current error log:

		INFO 06-27 21:12:49 [default_loader.py:272] Loading weights took 7.88 seconds
		ERROR 06-27 21:12:49 [core.py:519] EngineCore failed to start.
		ERROR 06-27 21:12:49 [core.py:519] Traceback (most recent call last):
		ERROR 06-27 21:12:49 [core.py:519]   File "/workspace/vllm/vllm/v1/engine/core.py", line 510, in run_engine_core
		ERROR 06-27 21:12:49 [core.py:519]     engine_core = EngineCoreProc(*args, **kwargs)
		ERROR 06-27 21:12:49 [core.py:519]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
		ERROR 06-27 21:12:49 [core.py:519]   File "/workspace/vllm/vllm/v1/engine/core.py", line 394, in __init__
		ERROR 06-27 21:12:49 [core.py:519]     super().__init__(vllm_config, executor_class, log_stats,
		ERROR 06-27 21:12:49 [core.py:519]   File "/workspace/vllm/vllm/v1/engine/core.py", line 75, in __init__
		ERROR 06-27 21:12:49 [core.py:519]     self.model_executor = executor_class(vllm_config)
		ERROR 06-27 21:12:49 [core.py:519]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
		ERROR 06-27 21:12:49 [core.py:519]   File "/workspace/vllm/vllm/executor/executor_base.py", line 53, in __init__
		ERROR 06-27 21:12:49 [core.py:519]     self._init_executor()
		ERROR 06-27 21:12:49 [core.py:519]   File "/workspace/vllm/vllm/executor/uniproc_executor.py", line 48, in _init_executor
		ERROR 06-27 21:12:49 [core.py:519]     self.collective_rpc("load_model")
		ERROR 06-27 21:12:49 [core.py:519]   File "/workspace/vllm/vllm/executor/uniproc_executor.py", line 57, in collective_rpc
		ERROR 06-27 21:12:49 [core.py:519]     answer = run_method(self.driver_worker, method, args, kwargs)
		ERROR 06-27 21:12:49 [core.py:519]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
		ERROR 06-27 21:12:49 [core.py:519]   File "/workspace/vllm/vllm/utils.py", line 2687, in run_method
		ERROR 06-27 21:12:49 [core.py:519]     return func(*args, **kwargs)
		ERROR 06-27 21:12:49 [core.py:519]            ^^^^^^^^^^^^^^^^^^^^^
		ERROR 06-27 21:12:49 [core.py:519]   File "/workspace/vllm-gaudi/vllm_hpu/v1/worker/hpu_worker.py", line 112, in load_model
		ERROR 06-27 21:12:49 [core.py:519]     self.model_runner.load_model()
		ERROR 06-27 21:12:49 [core.py:519]   File "/workspace/vllm-gaudi/vllm_hpu/v1/worker/hpu_model_runner.py", line 1688, in load_model
		ERROR 06-27 21:12:49 [core.py:519]     self.model = get_model(vllm_config=self.vllm_config)
		ERROR 06-27 21:12:49 [core.py:519]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
		ERROR 06-27 21:12:49 [core.py:519]   File "/workspace/vllm/vllm/model_executor/model_loader/__init__.py", line 59, in get_model
		ERROR 06-27 21:12:49 [core.py:519]     return loader.load_model(vllm_config=vllm_config,
		ERROR 06-27 21:12:49 [core.py:519]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
		ERROR 06-27 21:12:49 [core.py:519]   File "/workspace/vllm/vllm/model_executor/model_loader/base_loader.py", line 41, in load_model
		ERROR 06-27 21:12:49 [core.py:519]     self.load_weights(model, model_config)
		ERROR 06-27 21:12:49 [core.py:519]   File "/workspace/vllm/vllm/model_executor/model_loader/default_loader.py", line 281, in load_weights
		ERROR 06-27 21:12:49 [core.py:519]     raise ValueError("Following weights were not initialized from "
ERROR 06-27 21:12:49 [core.py:519] ValueError: Following weights were not initialized from checkpoint: {'model.layers.1.mlp.experts.w13_weight', 'model.layers.6.mlp.experts.w2_weight', 'model.layers.18.mlp.experts.w13_weight', 'model.layers.24.mlp.experts.w13_weight', 'model.layers.5.mlp.experts.w13_weight', 'model.layers.12.mlp.experts.w13_weight', 'model.layers.21.mlp.experts.w13_weight', 'model.layers.7.mlp.experts.w2_weight', 'model.layers.18.mlp.experts.w2_weight', 'model.layers.3.mlp.experts.w13_weight', 'model.layers.22.mlp.experts.w13_weight', 'model.layers.25.mlp.experts.w2_weight', 'model.layers.16.mlp.experts.w2_weight', 'model.layers.9.mlp.experts.w2_weight', 'model.layers.11.mlp.experts.w2_weight', 'model.layers.4.mlp.experts.w2_weight', 'model.layers.17.mlp.experts.w2_weight', 'model.layers.13.mlp.experts.w2_weight', 'model.layers.15.mlp.experts.w13_weight', 'model.layers.9.mlp.experts.w13_weight', 'model.layers.12.mlp.experts.w2_weight', 'model.layers.19.mlp.experts.w13_weight', 'model.layers.8.mlp.experts.w13_weight', 'model.layers.23.mlp.experts.w2_weight', 'model.layers.6.mlp.experts.w13_weight', 'model.layers.26.mlp.experts.w13_weight', 'model.layers.10.mlp.experts.w2_weight', 'model.layers.5.mlp.experts.w2_weight', 'model.layers.14.mlp.experts.w13_weight', 'model.layers.20.mlp.experts.w13_weight', 'model.layers.2.mlp.experts.w2_weight', 'model.layers.10.mlp.experts.w13_weight', 'model.layers.21.mlp.experts.w2_weight', 'model.layers.22.mlp.experts.w2_weight', 'model.layers.26.mlp.experts.w2_weight', 'model.layers.23.mlp.experts.w13_weight', 'model.layers.25.mlp.experts.w13_weight', 'model.layers.1.mlp.experts.w2_weight', 'model.layers.11.mlp.experts.w13_weight', 'model.layers.16.mlp.experts.w13_weight', 'model.layers.3.mlp.experts.w2_weight', 'model.layers.24.mlp.experts.w2_weight', 'model.layers.4.mlp.experts.w13_weight', 'model.layers.15.mlp.experts.w2_weight', 'model.layers.17.mlp.experts.w13_weight', 'model.layers.13.mlp.experts.w13_weight', 'model.layers.8.mlp.experts.w2_weight', 'model.layers.2.mlp.experts.w13_weight', 'model.layers.20.mlp.experts.w2_weight', 'model.layers.7.mlp.experts.w13_weight', 'model.layers.19.mlp.experts.w2_weight', 'model.layers.14.mlp.experts.w2_weight'}
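For context, this error comes from the loader's final consistency check: after the model's load_weights returns the set of parameter names it actually loaded, the loader compares that set against the model's own parameter names. A minimal sketch of that check (illustrative only, not the exact default_loader.py code; the function name here is hypothetical):

```python
# Minimal sketch of the post-load consistency check (illustrative only;
# check_all_weights_loaded is a hypothetical name, not the real function).
def check_all_weights_loaded(model, loaded_params: set[str]) -> None:
    # Every parameter the model exposes must appear in the set of names
    # returned by model.load_weights(...); otherwise loading is rejected.
    expected = {name for name, _ in model.named_parameters()}
    missing = expected - loaded_params
    if missing:
        raise ValueError(
            f"Following weights were not initialized from checkpoint: {missing}")
```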

Test Plan

Test Result

(Optional) Documentation Update

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@gemini-code-assist (bot, Contributor) left a comment

Summary of Changes

Hello @xuechendi, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a specific model loading issue for the DeepSeek v2 architecture. The core problem was that certain expert weights were not being recognized and initialized during the model loading phase, causing the entire process to fail. My change ensures that the correct weight names are used throughout the loading mechanism, enabling DeepSeek models to be loaded without errors.

Highlights

  • DeepSeek Model Loading Fix: I've addressed a critical bug that prevented the DeepSeek v2 model from loading successfully. Specifically, the issue stemmed from w13_weight and w2_weight expert weights not being properly initialized from the checkpoint, leading to a ValueError. The fix ensures that when a weight name is mapped during the loading process, the mapped name is correctly propagated for subsequent operations, allowing these weights to be found and loaded.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature              | Command             | Description
Code Review          | /gemini review      | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary     | Provides a summary of the current pull request in its current state.
Comment              | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments.
Help                 | /gemini help        | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist (bot, Contributor) left a comment

Code Review

The PR fixes an issue with uninitialized weights in DeepSeek models by updating the parameter name after checkpoint mapping. However, the current implementation incorrectly updates a loop variable, potentially skipping valid mappings. The suggestion ensures the name is updated only upon successful weight loading.

@xuechendi (Contributor, Author) commented

@abmfy @WoosukKwon, could you take a look at this quick fix?
I don't have full context on why 'name_mapped' is now used instead of name for all w13_weight and w2_weight entries, but from my local test, the load_weights function returns the param names that were loaded, and the original name rather than name_mapped was being recorded in loaded_params, which caused the error above.
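To make the mismatch concrete, here is an illustrative example (the names below are examples patterned on the error message, not an exhaustive or exact listing):

```python
# Illustrative only: names are patterned on the error message above.
checkpoint_name = "model.layers.1.mlp.experts.0.gate_proj.weight"  # name as stored on disk
name_mapped = "model.layers.1.mlp.experts.w13_weight"              # fused model parameter

loaded_params: set[str] = set()
# Before the fix the checkpoint-side name was recorded, so the post-load
# check never saw the fused w13_weight/w2_weight names and raised the error:
#     loaded_params.add(checkpoint_name)
# The fix records the mapped, model-side name instead:
loaded_params.add(name_mapped)
```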

@xuechendi force-pushed the fix_deepseek_load_weights branch from caef4c2 to ca363c5 on June 27, 2025 at 21:25
@abmfy (Member) commented Jun 27, 2025

> @abmfy @WoosukKwon, could you take a look at this quick fix? I don't have full context on why 'name_mapped' is now used instead of name for all w13_weight and w2_weight entries, but from my local test, the load_weights function returns the param names that were loaded, and the original name rather than name_mapped was being recorded in loaded_params, which caused the error above.

Hi @xuechendi, yes, that was an oversight on my part in #18343; we should add name_mapped to loaded_params.

The current fix LGTM. Details to come later.

@xuechendi (Contributor, Author) commented

@abmfy, thanks for the quick reply. Could you help approve this PR and trigger CI, so we can get this quick fix in to unblock the DeepSeek model? Thanks.

@abmfy (Member) commented Jun 27, 2025

> @abmfy, thanks for the quick reply. Could you help approve this PR and trigger CI, so we can get this quick fix in to unblock the DeepSeek model? Thanks.

Hi @xuechendi, I’m not sure if I have permission to trigger the CI.

Here’s a detailed explanation of the issue:

TL;DR It was a bug, and the proposed fix is correct.

I made this change because, under EPLB settings, there are cases where a logical expert is replicated multiple times, with each replica corresponding to an entry in expert_params_mapping.

Previously, when weight_name in name was true, it meant either we needed to load this weight or discard it if the expert was not local. Therefore, the logic inside the for loop would run only once, and it was safe to modify name directly since we’d never enter the loop again.

However, under EPLB, weight_name in name now simply means we need to load this logical expert. But inside the loop, the current entry in expert_params_mapping might still refer to a physical expert that doesn’t belong to this rank. We must continue checking, since each logical expert can map to multiple physical experts. Thus, we can’t simply break out of the loop; instead, the weight loader must determine whether the weight belongs to this rank. If we modified name directly, subsequent iterations of the loop would fail to find the correct parameter name. That’s why we introduced a temporary variable to hold the mapped name instead.

Finally, it was my oversight that I didn't update the logic for appending to loaded_params to use name_mapped. I missed this because, in the EPLB PR, I was testing with FP8 quantization, and default_loader.py#L278 only applies in the non-quantized case.
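Putting the pieces together, here is a condensed sketch of the flow described above. This is an assumed shape based on this discussion, not the exact deepseek_v2.py code; in particular, the expert_params_mapping entries and the weight_loader(..., return_success=True) signature are taken from the explanation, and non-expert weights and quantized paths are omitted:

```python
from typing import Any, Iterable

def load_expert_weights(weights: Iterable[tuple[str, Any]],
                        params_dict: dict[str, Any],
                        expert_params_mapping: list[tuple[str, str, int, str]]
                        ) -> set[str]:
    """Condensed sketch; non-expert weights and quantized paths are omitted."""
    loaded_params: set[str] = set()
    for name, loaded_weight in weights:
        for param_name, weight_name, expert_id, shard_id in expert_params_mapping:
            if weight_name not in name:
                continue
            # Map the checkpoint-side name to the fused model parameter name in
            # a temporary; `name` itself must stay untouched, because later
            # mapping entries (other physical replicas of this logical expert)
            # still need to match against it.
            name_mapped = name.replace(weight_name, param_name)
            param = params_dict[name_mapped]
            # The weight loader decides whether this physical expert lives on
            # this rank and reports success accordingly (assumed signature).
            if param.weight_loader(param, loaded_weight, name_mapped,
                                   shard_id=shard_id, expert_id=expert_id,
                                   return_success=True):
                # The fix: record the mapped (model-side) name, which is what
                # the post-load check compares against named_parameters().
                loaded_params.add(name_mapped)
    return loaded_params
```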

@houseroad (Collaborator) left a comment

Looks good. Could you add the test command and results to the PR description?

@houseroad added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Jun 28, 2025
@houseroad enabled auto-merge (squash) on June 28, 2025 at 01:42
@vllm-bot merged commit 5a52f38 into vllm-project:main on Jun 30, 2025
78 of 82 checks passed