feat: add gemma3_text attention handling for lora kernels#3103

Merged
NanoCode012 merged 1 commit into
axolotl-ai-cloud:mainfrom
NanoCode012:feat/gemma3-attn
Aug 26, 2025
Conversation

@NanoCode012 NanoCode012 commented Aug 26, 2025

Collaborator

Description

Add handling for the gemma3_text model type, which is used by the new gemma3_270m model.

Summary by CodeRabbit

  • Bug Fixes
    • Improved compatibility and stability for Gemma 3 text models by directly selecting the appropriate attention implementation, reducing intermittent import errors.
    • Fewer edge-case failures during model loading and inference for affected configurations.
    • More reliable startup behavior with no changes required to user configuration or workflows.

@NanoCode012 NanoCode012 requested a review from djsaunde August 26, 2025 09:43
coderabbitai Bot commented Aug 26, 2025

Contributor
📝 Walkthrough

Adds an early conditional branch in get_attention_cls_from_config for model_type "gemma3_text" to directly import and return Gemma3Attention, preceding the existing dynamic import logic. No changes to function signature or other code paths.
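A minimal sketch of the branch ordering described above, assuming the resolver is keyed only on model_type. The name get_attention_cls_sketch, the injectable import_module parameter, and the naive CamelCase prefix derivation are illustrative assumptions, not the code in lora_kernels.py; per the review, the real generic path relies on get_causal_lm_model_cls_prefix.

```python
import importlib


def get_attention_cls_sketch(model_type, import_module=importlib.import_module):
    """Hypothetical stand-in for get_attention_cls_from_config.

    Takes a model_type string directly instead of an axolotl config, and
    accepts an importer so the logic can be exercised without transformers.
    """
    # Early branch: gemma3_text has no module of its own, so return the
    # attention class defined in the gemma3 module.
    if model_type == "gemma3_text":
        module = import_module("transformers.models.gemma3.modeling_gemma3")
        return module.Gemma3Attention

    # Generic path (simplified assumption): derive the module path and the
    # class name from model_type alone.
    module = import_module(f"transformers.models.{model_type}.modeling_{model_type}")
    prefix = "".join(part.capitalize() for part in model_type.split("_"))
    return getattr(module, f"{prefix}Attention")
```

Placing the special case first means the generic derivation never runs for gemma3_text, so other model types keep their existing code path.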

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Attention-class resolution logic updates: src/axolotl/monkeypatch/lora_kernels.py | Inserted an early-case check for model_type == "gemma3_text" to import/return Gemma3Attention before the generic dynamic import path. No other modifications. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • djsaunde
  • winglian

@NanoCode012

Collaborator Author

Thanks for the fast review :)

@NanoCode012 NanoCode012 merged commit 0de254a into axolotl-ai-cloud:main Aug 26, 2025
8 of 9 checks passed
@NanoCode012 NanoCode012 deleted the feat/gemma3-attn branch August 26, 2025 09:47

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/axolotl/monkeypatch/lora_kernels.py (2)

152-156: Optional: broaden the branch to also cover model_type == "gemma3" for clarity and symmetry

The dynamic import will already work for model_type == "gemma3", but handling both here makes the intent explicit and future-proof if get_causal_lm_model_cls_prefix ever changes. It also documents in code that both gemma3 and gemma3_text share the same attention class location/name. (huggingface.co, github.com)

Apply this minimal diff:

-    if model_type == "gemma3_text":
+    if model_type in ("gemma3_text", "gemma3"):
         from transformers.models.gemma3.modeling_gemma3 import Gemma3Attention

         return Gemma3Attention

152-156: Add unit tests to verify Gemma3Attention resolution

To prevent future regressions across Transformers versions and differing config layouts, add a small suite of unit tests—without downloading any models—that asserts get_attention_cls_from_config returns Gemma3Attention for both:

  • a text‐only Gemma 3 config (model_type="gemma3_text")
  • a multimodal Gemma 3 config (model_type="gemma3" with nested text_config.model_type="gemma3_text")

You can mock transformers.AutoConfig.from_pretrained to return a simple object with the right model_type fields, then load the real Gemma3Attention class with importlib and compare it against the returned class. For example, in a new test file tests/test_lora_kernels.py:

import importlib

import pytest

from axolotl.monkeypatch.lora_kernels import get_attention_cls_from_config

class DummyConfig:
    def __init__(self, model_type, nested=None):
        self.model_type = model_type
        if nested:
            setattr(self, nested[0], nested[1])

@pytest.fixture(autouse=True)
def patch_autoconfig(monkeypatch):
    def fake_from_pretrained(name, *args, **kwargs):
        # route based on the name to simulate both scenarios
        if "text-only" in name:
            return DummyConfig("gemma3_text")
        else:
            return DummyConfig("gemma3", nested=("text_config", DummyConfig("gemma3_text")))
    monkeypatch.setattr("transformers.AutoConfig.from_pretrained", fake_from_pretrained)

def test_text_only_gemma3():
    cfg = {"base_model": "dummy-text-only"}
    cls = get_attention_cls_from_config(cfg)
    module = importlib.import_module("transformers.models.gemma3.modeling_gemma3")
    expected = getattr(module, "Gemma3Attention")
    assert cls is expected

def test_multimodal_gemma3():
    cfg = {"base_model": "dummy-multimodal"}
    cls = get_attention_cls_from_config(cfg)
    module = importlib.import_module("transformers.models.gemma3.modeling_gemma3")
    expected = getattr(module, "Gemma3Attention")
    assert cls is expected

Points to verify:

  • Both tests pass without pulling any checkpoints or requiring a real HF endpoint.
  • They cover the top‐level "gemma3_text" case and the nested multimodal case.
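If the test environment should not depend on an installed transformers at all, one option is to register a stub module so that the direct import in the special-case branch succeeds. The helper below is a sketch under that assumption; stub_module and the fake class are hypothetical names, and it deliberately refuses to touch a module that is already imported.

```python
import sys
import types
from contextlib import contextmanager


@contextmanager
def stub_module(dotted_path, **attrs):
    """Temporarily register a fake module (and any missing parents).

    Hypothetical test helper: only entries it adds to sys.modules are
    removed on exit, and an already-imported leaf module is never modified.
    """
    if dotted_path in sys.modules:
        raise RuntimeError(f"{dotted_path} is already imported")

    added = []
    parts = dotted_path.split(".")
    for i in range(1, len(parts) + 1):
        name = ".".join(parts[:i])
        if name not in sys.modules:
            sys.modules[name] = types.ModuleType(name)
            added.append(name)

    leaf = sys.modules[dotted_path]
    for key, value in attrs.items():
        setattr(leaf, key, value)
    try:
        yield leaf
    finally:
        # Undo only what this helper registered.
        for name in added:
            sys.modules.pop(name, None)
```

A test could wrap the call under test in `with stub_module("transformers.models.gemma3.modeling_gemma3", Gemma3Attention=FakeAttention):` and assert that the resolver returns FakeAttention. Whether this is preferable to importing the real class depends on how closely the test should track the installed Transformers version.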
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79ddaeb and f211aec.

📒 Files selected for processing (1)
  • src/axolotl/monkeypatch/lora_kernels.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.6.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.0)
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.7.0)
🔇 Additional comments (1)
src/axolotl/monkeypatch/lora_kernels.py (1)

152-156: Good fix: gemma3_text needs a direct import of Gemma3Attention from the gemma3 module

AutoConfig for text-only Gemma 3 checkpoints can surface model_type == "gemma3_text" (e.g., Gemma3ForCausalLM 1B text-only), but Transformers defines the attention under transformers.models.gemma3.modeling_gemma3 as Gemma3Attention. Without this special-case, the dynamic path would try to import transformers.models.gemma3_text.modeling_gemma3_text.Gemma3TextAttention, which does not exist. This branch prevents that import failure and resolves the mismatch. (huggingface.co, github.com)
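The mismatch can be illustrated with a small probe. This is a sketch under stated assumptions: generic_target uses a naive CamelCase conversion as a stand-in for the real prefix helper, and module_exists is a hypothetical utility built on importlib.util.find_spec.

```python
import importlib.util


def generic_target(model_type):
    """Derive (module_path, class_name) the way a naive dynamic path would.

    Assumption: the CamelCase prefix is built by capitalizing each
    underscore-separated part; the real helper may behave differently.
    """
    prefix = "".join(part.capitalize() for part in model_type.split("_"))
    return (
        f"transformers.models.{model_type}.modeling_{model_type}",
        f"{prefix}Attention",
    )


def module_exists(module_path):
    """Return True if module_path can be located, without raising."""
    try:
        return importlib.util.find_spec(module_path) is not None
    except ImportError:
        # Raised when a parent package is missing or fails to import.
        return False
```

With these helpers, generic_target("gemma3_text") yields the transformers.models.gemma3_text.modeling_gemma3_text path and the Gemma3TextAttention name cited in the comment above, and module_exists can be used to check that path against whatever Transformers version is installed.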

@codecov

codecov Bot commented Aug 26, 2025


Codecov Report

❌ Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/axolotl/monkeypatch/lora_kernels.py 0.00% 3 Missing ⚠️
