fix: qwen3_vl attention config by NanoCode012 · Pull Request #3216 · axolotl-ai-cloud/axolotl

NanoCode012 · 2025-10-16T12:52:20Z

Description

Fixes https://discord.com/channels/1104757954588196865/1111279858136383509/1428340920318689435

AttributeError: module 'transformers.models.qwen3_vl.modeling_qwen3_vl' has no attribute 'Qwen3VlAttention'. Did you mean: 'Qwen3VLTextAttention'?
Error: module 'transformers.models.qwen3_vl.modeling_qwen3_vl' has no attribute 'Qwen3VlAttention'
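
For context, the misspelled name in the traceback is what you get if the class name is derived mechanically from `model_type`. A minimal sketch, assuming the selector capitalizes each underscore-separated part and appends `Attention` (the helper name `derive_attention_name` is hypothetical and not taken from the axolotl source):

```python
def derive_attention_name(model_type: str) -> str:
    """Hypothetical helper: build '<Prefix>Attention' from a snake_case model_type."""
    # "qwen3_vl" -> ["qwen3", "vl"] -> "Qwen3" + "Vl"
    prefix = "".join(part.capitalize() for part in model_type.split("_"))
    return f"{prefix}Attention"


print(derive_attention_name("qwen3_vl"))  # Qwen3VlAttention
print(derive_attention_name("llama"))     # LlamaAttention
```

The derived `Qwen3VlAttention` differs from the class the error message suggests, `Qwen3VLTextAttention`, in both casing (`Vl` vs `VL`) and the extra `Text` segment, so a purely name-based lookup cannot find it.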


Summary by CodeRabbit

New Features
- Added support for Qwen3VL model type in attention handling.

coderabbitai · 2025-10-16T12:52:41Z

Walkthrough

Adds a new special-case branch for model_type "qwen3_vl" in get_attention_cls_from_config, importing Qwen3VLTextAttention from transformers and returning it. The branch is placed before the existing "mllama" case. No changes to function signature or behavior for other model types.
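
The shape of the change described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the override table `ATTENTION_OVERRIDES` and the helpers `resolve_attention_target` and `load_attention_cls` are hypothetical names, the PR uses an `if` branch rather than a dict, and the generic fallback shown here is assumed from the error message.

```python
import importlib

# Hypothetical override table for illustration only; the PR itself adds an
# `if model_type == "qwen3_vl"` branch rather than a dict.
ATTENTION_OVERRIDES = {
    "qwen3_vl": (
        "transformers.models.qwen3_vl.modeling_qwen3_vl",
        "Qwen3VLTextAttention",
    ),
}


def resolve_attention_target(model_type: str) -> tuple[str, str]:
    """Return (module path, class name) for a model type."""
    # Special cases are checked first, before any generic name derivation.
    if model_type in ATTENTION_OVERRIDES:
        return ATTENTION_OVERRIDES[model_type]
    # Assumed generic fallback: derive both names from model_type.
    prefix = "".join(part.capitalize() for part in model_type.split("_"))
    return (
        f"transformers.models.{model_type}.modeling_{model_type}",
        f"{prefix}Attention",
    )


def load_attention_cls(model_type: str):
    """Import lazily so transformers is only needed when this is called."""
    module_path, cls_name = resolve_attention_target(model_type)
    return getattr(importlib.import_module(module_path), cls_name)
```

Keeping the import inside the loader mirrors the lazy, per-branch import the walkthrough describes, so model types that are never requested do not need to be importable.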

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Model-type branching in attention class selector: `src/axolotl/monkeypatch/lora_kernels.py` | Added a new conditional branch for the "qwen3_vl" model type that imports and returns `Qwen3VLTextAttention` from transformers |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

- axolotl-ai-cloud/axolotl#2943: Adds model-specific branching for "llama4" in the same get_attention_cls_from_config function, following the same pattern
- axolotl-ai-cloud/axolotl#3103: Adds an early special-case branch for "gemma3_text" in the same function

Suggested reviewers

djsaunde
SalmanMohammadi

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "fix: qwen3_vl attention config" is directly related to the primary change in the pull request. The changeset adds support for the Qwen3VL model type by importing and returning Qwen3VLTextAttention in the get_attention_cls_from_config function, which resolves an AttributeError where the wrong attention class name was being referenced. The title is concise, specific enough to indicate this is fixing Qwen3VL attention configuration, and avoids vague language or noise. A developer scanning commit history would understand that this change addresses an attention configuration issue for the Qwen3VL model. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |


📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa1240a and 2d94327.

📒 Files selected for processing (1)

src/axolotl/monkeypatch/lora_kernels.py (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: PyTest from Source Dist (3.11, 2.8.0)
GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
GitHub Check: PyTest (3.11, 2.7.1)
GitHub Check: PyTest (3.11, 2.8.0)

🔇 Additional comments (1)

src/axolotl/monkeypatch/lora_kernels.py (1)

137-140: Looks good—the fix is isolated and follows the established pattern. Manual verification recommended.

The codebase search confirms:

- No other qwen3_vl references that need updating
- No incorrect class name Qwen3VlAttention elsewhere
- The fix matches the error message correction and is consistent with similar model type handlers

The import couldn't be verified in the sandbox (transformers not available), but the learnings confirm transformers 4.57.0 adds Qwen3-VL support. Please verify this works in your local environment with transformers >= 4.57.0 installed before merging.
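
One way to run the local check requested above is a small script like the following. It is only a sketch: the function name `qwen3_vl_attention_status` is hypothetical, and it reports the installed transformers version without comparing it against 4.57.0.

```python
import importlib
from importlib import metadata


def qwen3_vl_attention_status() -> str:
    """Report whether Qwen3VLTextAttention can be imported in this environment."""
    try:
        version = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return "transformers not installed"
    try:
        module = importlib.import_module(
            "transformers.models.qwen3_vl.modeling_qwen3_vl"
        )
    except Exception as exc:  # broad on purpose: missing module, missing torch, etc.
        return f"transformers {version}: import failed ({type(exc).__name__})"
    if hasattr(module, "Qwen3VLTextAttention"):
        return f"transformers {version}: Qwen3VLTextAttention available"
    return f"transformers {version}: Qwen3VLTextAttention missing"


print(qwen3_vl_attention_status())
```

If the script reports the class as available, the new branch should be able to import it; any other status points at the environment rather than at the patch.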


codecov · 2025-10-16T13:01:28Z

Codecov Report

❌ Patch coverage is 33.33333% with 2 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/axolotl/monkeypatch/lora_kernels.py | 33.33% | 2 Missing ⚠️ |

(cherry picked from commit 93ba573)

fix: qwen3_vl attention config

2d94327

salmanmohammadi approved these changes Oct 16, 2025

NanoCode012 merged commit 93ba573 into main Oct 17, 2025
13 of 14 checks passed

NanoCode012 deleted the fix/qwen3vl-loraoptim branch October 17, 2025 03:35

flaviusburca pushed a commit to invergent-ai/axolotl that referenced this pull request Oct 18, 2025

fix: qwen3_vl attention config (axolotl-ai-cloud#3216)

2dd1801

(cherry picked from commit 93ba573)
