
Fix weight loading for GQA with TP #2379

Merged
zhuohan123 merged 1 commit into vllm-project:main from zhangch9:fix-gqa-tp
Jan 15, 2024

Conversation

Contributor

@zhangch9 zhangch9 commented Jan 8, 2024

Fixes #1735.

This PR modifies the weight loading logic when tp_size is larger than num_kv_heads.
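The core of the fix can be illustrated with a small sketch (hypothetical helper, not the actual vLLM code): when tp_size exceeds num_kv_heads, each KV head must be replicated across tp_size // num_kv_heads tensor-parallel ranks, so each rank loads one whole KV head instead of an invalid fractional slice.

```python
def kv_head_index(tp_rank: int, tp_size: int, num_kv_heads: int) -> int:
    """Return the first KV head this rank should load (illustrative sketch)."""
    if tp_size <= num_kv_heads:
        # Usual case: heads are partitioned, each rank owns a contiguous block.
        heads_per_rank = num_kv_heads // tp_size
        return tp_rank * heads_per_rank
    # GQA with tp_size > num_kv_heads: groups of adjacent ranks
    # share (replicate) the same KV head.
    replication = tp_size // num_kv_heads
    return tp_rank // replication

# With 2 KV heads and tp_size=8, ranks 0-3 load head 0 and ranks 4-7 load head 1.
print(kv_head_index(3, 8, 2))  # -> 0
print(kv_head_index(5, 8, 2))  # -> 1
```

Without the replication branch, a rank would compute a per-rank head count of zero and index into an empty slice of the checkpoint, which is the failure mode reported in #1735 for chatglm3-6b at tensor_parallel_size=4.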

@WoosukKwon WoosukKwon requested a review from zhuohan123 January 8, 2024 21:14
Member

@zhuohan123 zhuohan123 left a comment


Great catch! Thanks for the fix!

@zhuohan123 zhuohan123 merged commit f780504 into vllm-project:main Jan 15, 2024
@zhangch9 zhangch9 deleted the fix-gqa-tp branch January 16, 2024 05:48
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Jan 18, 2024
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024


Development

Successfully merging this pull request may close these issues.

anyone test chatglm3-6b? set tensor_parallel_size=4, get wrong response
