fix AssertionError: Original QKV code not found by ved1beta · Pull Request #3657 · axolotl-ai-cloud/axolotl

ved1beta · 2026-05-15T07:25:11Z

Description

Fixes AssertionError: Original QKV code not found, raised when training Gemma 4 with LoRA. The Gemma 4 fused RMSNorm+RoPE monkeypatch replaces Gemma4TextAttention.forward wholesale before the LoRA QKV/O source-rewrite patch runs, so the rewrite can no longer find its expected pattern and asserts.
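The failure mode can be reproduced in miniature. The sketch below is a toy model, not axolotl's code: the source strings, `rewrite_qkv`, and `try_rewrite` are hypothetical stand-ins that only mimic how a pattern-based source rewrite behaves once the forward method has been swapped out.

```python
# Toy model of a source-rewrite patch. All strings and helper names here are
# illustrative assumptions, not the real Gemma4TextAttention source.
ORIGINAL_QKV = (
    "    q = self.q_proj(x)\n"
    "    k = self.k_proj(x)\n"
    "    v = self.v_proj(x)\n"
)
PATCHED_QKV = "    q, k, v = self.apply_qkv(x)\n"

# Source of a stock forward that still contains the expected pattern.
STOCK_FORWARD = "def forward(self, x):\n" + ORIGINAL_QKV + "    return q, k, v\n"

# Source of a forward that was replaced wholesale; the pattern is gone.
FUSED_FORWARD = "def forward(self, x):\n    return self.fused_kernel(x)\n"


def rewrite_qkv(source: str) -> str:
    """Swap the three projections for one apply_qkv call, or assert."""
    assert ORIGINAL_QKV in source, "Original QKV code not found"
    return source.replace(ORIGINAL_QKV, PATCHED_QKV)


def try_rewrite(source: str) -> str:
    """Run the rewrite and report the assertion instead of raising it."""
    try:
        return rewrite_qkv(source)
    except AssertionError as exc:
        return f"AssertionError: {exc}"


print(try_rewrite(STOCK_FORWARD))
print(try_rewrite(FUSED_FORWARD))
```

The first call succeeds because the pattern is present; the second reports the same assertion message as the bug, because the replaced forward no longer contains the text the rewrite looks for.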

Motivation and Context

#3655

How has this been tested?

The config now runs without AssertionError: Original QKV code not found.

AI Usage Disclaimer

Claude was used for diagnosis.

Summary by CodeRabbit

Bug Fixes
- Improved LoRA compatibility with optimized/fused attention paths to avoid incorrect patching.
- Gemma4 models now bypass incompatible forward-rewrite logic and use the LoRA output hook when available, preventing misapplied patches and ensuring safer fallback behavior.

coderabbitai · 2026-05-15T07:25:22Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


Walkthrough

Removes Gemma4 from QKV source-rewrite patches, makes patch_self_attn_lora skip Gemma4 attention classes, and updates Gemma4 fused attention to use an optional apply_o hook for output projection with fallback to o_proj.

Changes

Gemma4 LoRA handling changes

Layer / File(s)	Summary
Remove Gemma4 QKV patch entry `src/axolotl/monkeypatch/lora_kernels.py`	Deletes the Gemma4-specific QKV replacement from `QKV_PATCHES` and documents that Gemma4 uses fused-attention so QKV patching is omitted.
Gemma4 guard in patch_self_attn_lora `src/axolotl/monkeypatch/lora_kernels.py`	`patch_self_attn_lora` imports/checks `Gemma4TextAttention`, logs an informational message, and returns early to avoid forward source-rewrite/patching for Gemma4.
Fused attention conditional apply_o `src/axolotl/monkeypatch/models/gemma4/fused_attn.py`	Gemma4 fused attention now routes output through a LoRA `apply_o` hook when present; otherwise it falls back to `o_proj`.
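The two behavioural changes in the table can be sketched as follows. This is a minimal illustration under stated assumptions: the classes are plain-Python stand-ins, `should_source_rewrite` and `project_output` are hypothetical names, and the use of `issubclass` and `getattr` is a guess at one reasonable way to write the guard and the fallback, not a copy of the PR's diff.

```python
import logging

LOG = logging.getLogger(__name__)


class Gemma4TextAttention:
    """Stand-in for the real attention class; only what the sketch needs."""

    def o_proj(self, hidden):
        return ("o_proj", hidden)


class OtherAttention:
    """Stand-in for any attention class that still uses source rewriting."""

    def o_proj(self, hidden):
        return ("o_proj", hidden)


def should_source_rewrite(attn_cls) -> bool:
    """Skip the forward source-rewrite for Gemma4 attention classes."""
    if issubclass(attn_cls, Gemma4TextAttention):
        LOG.info("Skipping forward source-rewrite for %s", attn_cls.__name__)
        return False
    return True


def project_output(attn, hidden):
    """Use the LoRA apply_o hook when present, else fall back to o_proj."""
    apply_o = getattr(attn, "apply_o", None)
    if apply_o is not None:
        return apply_o(hidden)
    return attn.o_proj(hidden)


plain = Gemma4TextAttention()
hooked = Gemma4TextAttention()
hooked.apply_o = lambda hidden: ("apply_o", hidden)  # hypothetical LoRA hook

print(project_output(plain, 1))   # falls back to o_proj
print(project_output(hooked, 1))  # routed through apply_o
```

Checking for the hook at call time keeps the fused forward usable whether or not a LoRA kernel patch has attached `apply_o` to the module.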

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

axolotl-ai-cloud/axolotl#2732: Related changes around LoRA self-attention kernel patching and tests for apply_qkv/apply_o.

Suggested reviewers

winglian
NanoCode012

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title directly addresses the main issue being fixed: the AssertionError when the QKV pattern is not found, which occurs because Gemma4's fused attention monkeypatch runs before the LoRA QKV source-rewrite patch.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


codecov · 2026-05-15T07:36:14Z

Codecov Report

❌ Patch coverage is 0% with 9 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/axolotl/monkeypatch/lora_kernels.py	0.00%	6 Missing ⚠️
...rc/axolotl/monkeypatch/models/gemma4/fused_attn.py	0.00%	3 Missing ⚠️


winglian · 2026-05-16T18:32:06Z

@coderabbitai review

coderabbitai · 2026-05-16T18:32:12Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

winglian · 2026-05-16T18:43:17Z

+    # NOTE: Gemma4 intentionally has no QKV_PATCHES entry. It always runs
+    # through the fused attention monkeypatch (patch_gemma4_fused_attn),
+    # whose hand-written forward already calls self.apply_qkv/self.apply_o.
+    # patch_self_attn_lora skips Gemma4 by class, so a source-rewrite pattern
+    # here would be permanently dead. See that skip for the rationale.


I don't think this is correct; it only holds when we set up that patch, right?

OK, agreed, it was misleading: this is not specific to Gemma4. I guess it holds because patch_manager applies patch_gemma4_fused_attn unconditionally for Gemma4 before patch_self_attn_lora runs. Corrected the comment 🫡
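The ordering dependency discussed in this thread can be made concrete with a small sketch. It is not what the PR implements (the PR skips Gemma4 by class); `Attn`, `apply_fused_patch`, and `stock_forward_installed` are hypothetical names, used only to show that whether a source rewrite is meaningful depends on which patch ran first.

```python
class Attn:
    """Hypothetical attention class with a stock forward."""

    def forward(self):
        return "stock"


def fused_forward(self):
    """Hand-written replacement that would call apply_qkv/apply_o itself."""
    return "fused"


def apply_fused_patch(cls) -> None:
    """Replace forward wholesale, as a fused-attention monkeypatch would."""
    cls.forward = fused_forward


def stock_forward_installed(cls, stock_forward) -> bool:
    """A source rewrite only makes sense while the stock forward is in place."""
    return cls.forward is stock_forward


stock = Attn.forward

print(stock_forward_installed(Attn, stock))  # before the fused patch
apply_fused_patch(Attn)
print(stock_forward_installed(Attn, stock))  # after the fused patch
print(Attn().forward())
```

A class-based skip is equivalent to this identity check only as long as the fused patch is always applied first, which is the condition the corrected comment spells out.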

NanoCode012

Thanks

fix AssertionError: Original QKV code not found

057577b

skip ig gemma for lor a

d4c404a

winglian reviewed May 16, 2026


fix misleading commentsT_T'

d744fc2

NanoCode012 approved these changes May 18, 2026


NanoCode012 added the ready to merge label May 19, 2026

NanoCode012 merged commit bccc1e5 into axolotl-ai-cloud:main May 22, 2026
14 of 15 checks passed

NanoCode012 merged 3 commits into axolotl-ai-cloud:main from ved1beta:gemma-QKV

3 participants
