[Gemma 4] Refactor Gemma 4 weight loader and qkv proj #24088

Closed
kpham-sgl wants to merge 1 commit into main from kp/gemma4-qkj-proj-refactor

@kpham-sgl (Collaborator) commented Apr 29, 2026

Motivation

Refactor the QKV projection so that KV-shared layers no longer run K and V projections whose outputs are never used.

Modifications

  • KV-shared layers now load and call only a separate q_proj (ColumnParallelLinear) — no fused qkv_proj, no k_norm / v_norm.
  • Hoisted the entire KV-sharing decision (including kv_shared_layer_index lookup) into Gemma4DecoderLayer.__init__; Gemma4Attention just receives the resolved index.
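
The two changes above can be illustrated with a minimal, framework-free sketch. It is not the code from this PR: `ToyConfig`, `resolve_kv_shared_layer_index`, and `projections_for_layer` are hypothetical names, and the rule used here (the last N layers are KV-shared, and each reuses the KV of the last non-shared layer with the same attention type) is an assumption made for illustration. The point is only that the decoder layer resolves the index once, and the attention module then picks its projections from that resolved value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyConfig:
    # Hypothetical stand-in for the model config; field names are assumptions.
    layer_types: List[str]       # attention type per layer, e.g. "sliding" or "full"
    num_kv_shared_layers: int    # how many trailing layers reuse another layer's KV


def resolve_kv_shared_layer_index(config: ToyConfig, layer_id: int) -> Optional[int]:
    """Return the layer whose KV this layer reuses, or None if it owns its KV.

    In the refactor described above, this decision would live in the decoder
    layer's __init__, and the attention module would only receive the result.
    """
    first_shared = len(config.layer_types) - config.num_kv_shared_layers
    if config.num_kv_shared_layers <= 0 or layer_id < first_shared:
        return None
    # Assumed rule: reuse the last non-shared layer with the same attention type.
    own_type = config.layer_types[layer_id]
    for idx in range(first_shared - 1, -1, -1):
        if config.layer_types[idx] == own_type:
            return idx
    # No matching donor layer: fall back to owning K/V.
    return None


def projections_for_layer(kv_shared_layer_index: Optional[int]) -> List[str]:
    """List the submodules an attention layer would create and load weights for."""
    if kv_shared_layer_index is None:
        # Regular layer: fused QKV projection plus K/V norms.
        return ["qkv_proj", "k_norm", "v_norm"]
    # KV-shared layer: only a separate Q projection, no K/V work at all.
    return ["q_proj"]


if __name__ == "__main__":
    cfg = ToyConfig(
        layer_types=["sliding", "sliding", "full", "sliding", "sliding", "full"],
        num_kv_shared_layers=3,
    )
    for layer_id in range(len(cfg.layer_types)):
        shared_idx = resolve_kv_shared_layer_index(cfg, layer_id)
        print(layer_id, shared_idx, projections_for_layer(shared_idx))
```

Resolving the index in one place keeps the attention module free of config lookups: it branches only on whether the index it was given is `None`.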

Accuracy Tests

| Model | Reference (cookbook) | This run | Δ |
| --- | --- | --- | --- |
| gemma-4-E2B-it | 0.307 | 0.296 | −0.011 |
| gemma-4-E4B-it | 0.396 | 0.402 | +0.006 |
| gemma-4-31B-it | 0.589 | 0.576 | −0.013 |
| gemma-4-26B-A4B-it | 0.549 | 0.559 | +0.010 |

Speed Tests and Profiling

[TODO] Profile to confirm that K_proj and V_proj do not appear for the smaller models.

Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

@gemini-code-assist (Contributor)

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

kpham-sgl changed the title from "Refactor weight loader and qkv proj" to "[Gemma 4] Refactor Gemma 4 weight loader and qkv proj" on Apr 29, 2026
@kpham-sgl (Collaborator, Author)

Closing in favor of #25461.

kpham-sgl closed this on May 17, 2026