[Bugfix] Fix T5EncoderModel load_weights corrupting key weight names#2344

Open
sfc-gh-goliaro wants to merge 1 commit into vllm-project:main from sfc-gh-goliaro:fix/t5-encoder-weight-loading
Conversation

@sfc-gh-goliaro sfc-gh-goliaro commented Mar 30, 2026

Purpose

  • Fix T5EncoderModel.load_weights silently failing to load K-projection weights into the fused qkv_proj layer due to a str.replace bug that corrupts parameter names.
  • The bare name.replace("k", "qkv_proj") replaces every occurrence of "k" in the name — including the "k" in "block" — producing a corrupted lookup name (encoder.blocqkv_proj.0.layer...) that matches no parameter. As a result, K weights are never loaded.
  • Fix: use dotted replacement (`.k.` → `.qkv_proj.`) so only the full dotted path component is matched, consistent with the guard that already checks `f".{weight_name}."`.
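A minimal sketch of the bug and the fix described above (the parameter name is an illustrative example in the HuggingFace T5 naming scheme, not taken from a real checkpoint):

```python
# An illustrative T5 encoder parameter name for the K-projection weight.
name = "encoder.block.0.layer.0.SelfAttention.k.weight"

# Buggy: bare replace hits EVERY "k", including the one in "block",
# so the lookup name is corrupted and matches no parameter.
buggy = name.replace("k", "qkv_proj")
# -> "encoder.blocqkv_proj.0.layer.0.SelfAttention.qkv_proj.weight"

# Fixed: dotted replacement only matches the full ".k." path component.
fixed = name.replace(".k.", ".qkv_proj.")
# -> "encoder.block.0.layer.0.SelfAttention.qkv_proj.weight"

print(buggy)
print(fixed)
```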

Test plan

  • Verified that all 5 stacked weight entries (q, k, v, wi_0, wi_1) now produce correct lookup names
  • Compared T5 encoder output against HuggingFace transformers.T5EncoderModel — cosine similarity improves from ~0.71 to ~0.999 after fix

Test Result

  • All 5 stacked weight entries now produce correct lookup names
  • Cosine similarity between the T5 encoder output and transformers.T5EncoderModel rises from ~0.71 to ~0.999 after the fix
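The cosine-similarity metric used in the comparison above can be computed with a small helper like the following (a sketch of the measurement, not the actual test harness; the end-to-end comparison additionally requires loading both models and running the same input through each):

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened hidden-state tensors."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Identical outputs give similarity 1.0; uncorrelated or corrupted
# outputs (e.g. K weights left at random init) score much lower.
x = np.array([0.5, -1.0, 2.0])
print(cosine_sim(x, x))  # 1.0
```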


The bare `name.replace(weight_name, param_name)` for the `("qkv_proj",
"k", "k")` mapping entry replaces every occurrence of the substring "k"
in the parameter name — including the "k" in "block" — producing a
corrupted lookup name like `encoder.blocqkv_proj.0.layer...` that
silently fails to match any parameter. As a result, the K-projection
weights in every T5SelfAttention layer are never loaded into the fused
`qkv_proj`, leaving the K shard at its random initialization.

Fix: use dotted replacement (`.k.` → `.qkv_proj.`) so only the
full dotted component is matched, consistent with the guard check
that already uses `f".{weight_name}."`.

The `q` and `v` entries were unaffected because neither letter appears
as a substring in other path components (`encoder`, `block`, `layer`,
etc.), but the fix is applied uniformly for robustness.
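The remapping pattern this fixes can be sketched as follows (a simplified illustration of the stacked-parameter loop described above, with hypothetical names; not the exact vLLM implementation):

```python
# Illustrative stacked-parameter mapping: (param_name, weight_name, shard_id).
stacked_params_mapping = [
    ("qkv_proj", "q", "q"),
    ("qkv_proj", "k", "k"),
    ("qkv_proj", "v", "v"),
]

def remap(name: str):
    """Map a checkpoint weight name to (fused param name, shard id)."""
    for param_name, weight_name, shard_id in stacked_params_mapping:
        # Guard AND replace both use the dotted form, so only a full
        # path component can match -- never a substring like the "k"
        # inside "block".
        if f".{weight_name}." not in name:
            continue
        return name.replace(f".{weight_name}.", f".{param_name}."), shard_id
    return None  # not a stacked weight

print(remap("encoder.block.0.layer.0.SelfAttention.k.weight"))
# -> ("encoder.block.0.layer.0.SelfAttention.qkv_proj.weight", "k")
```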

Made-with: Cursor
@lishunyang12 lishunyang12 (Collaborator) left a comment


LGTM, good catch. The guard was already using dotted form but the replace wasn't — silent and nasty.
