Skip to content

fix AssertionError: Original QKV code not found#3657

Merged
NanoCode012 merged 3 commits into
axolotl-ai-cloud:mainfrom
ved1beta:gemma-QKV
May 22, 2026
Merged

fix AssertionError: Original QKV code not found#3657
NanoCode012 merged 3 commits into
axolotl-ai-cloud:mainfrom
ved1beta:gemma-QKV

Conversation

@ved1beta
Copy link
Copy Markdown
Member

@ved1beta ved1beta commented May 15, 2026

Description

when training Gemma 4 with LoRA. The Gemma 4 fused RMSNorm+RoPE monkeypatch replaces Gemma4TextAttention.forward wholesale before the LoRA QKV/O source-rewrite patch runs, so the rewrite can no longer find its expected pattern and asserts.

Motivation and Context

#3655

How has this been tested?

config runs without AssertionError: Original QKV code not found

AI Usage Disclaimer

claude dignose

Summary by CodeRabbit

  • Bug Fixes
    • Improved LoRA compatibility with optimized/fused attention paths to avoid incorrect patching.
    • Gemma4 models now bypass incompatible forward-rewrite logic and use the LoRA output hook when available, preventing misapplied patches and ensuring safer fallback behavior.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 15, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 575eb148-5028-4dc7-8a60-d8c6ef962e0d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Removes Gemma4 from QKV source-rewrite patches, makes patch_self_attn_lora skip Gemma4 attention classes, and updates Gemma4 fused attention to use an optional apply_o hook for output projection with fallback to o_proj.

Changes

Gemma4 LoRA handling changes

Layer / File(s) Summary
Remove Gemma4 QKV patch entry
src/axolotl/monkeypatch/lora_kernels.py
Deletes the Gemma4-specific QKV replacement from QKV_PATCHES and documents that Gemma4 uses fused-attention so QKV patching is omitted.
Gemma4 guard in patch_self_attn_lora
src/axolotl/monkeypatch/lora_kernels.py
patch_self_attn_lora imports/checks Gemma4TextAttention, logs an informational message, and returns early to avoid forward source-rewrite/patching for Gemma4.
Fused attention conditional apply_o
src/axolotl/monkeypatch/models/gemma4/fused_attn.py
Gemma4 fused attention now routes output through a LoRA apply_o hook when present; otherwise it falls back to o_proj.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • winglian
  • NanoCode012
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title directly addresses the main issue being fixed: the AssertionError when the QKV pattern is not found, which occurs because Gemma4's fused attention monkeypatch runs before the LoRA QKV source-rewrite patch.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 15, 2026

Codecov Report

❌ Patch coverage is 0% with 9 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/axolotl/monkeypatch/lora_kernels.py 0.00% 6 Missing ⚠️
...rc/axolotl/monkeypatch/models/gemma4/fused_attn.py 0.00% 3 Missing ⚠️

📢 Thoughts on this report? Let us know!

@winglian
Copy link
Copy Markdown
Collaborator

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 16, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Comment thread src/axolotl/monkeypatch/lora_kernels.py Outdated
Comment on lines +79 to +83
# NOTE: Gemma4 intentionally has no QKV_PATCHES entry. It always runs
# through the fused attention monkeypatch (patch_gemma4_fused_attn),
# whose hand-written forward already calls self.apply_qkv/self.apply_o.
# patch_self_attn_lora skips Gemma4 by class, so a source-rewrite pattern
# here would be permanently dead. See that skip for the rationale.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct, it's only when we setup that patch, right?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok agreed it was misleading it's not specific to Gemma4 ,ig it holds because patch_manager applies patch_gemma4_fused_attn unconditionally for gemma4 before patch_self_attn_lora runs, corrected the comment 🫡

Copy link
Copy Markdown
Collaborator

@NanoCode012 NanoCode012 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@NanoCode012 NanoCode012 merged commit bccc1e5 into axolotl-ai-cloud:main May 22, 2026
14 of 15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants