
fix: qwen3_vl attention config #3216

Merged
NanoCode012 merged 1 commit into main from fix/qwen3vl-loraoptim
Oct 17, 2025

NanoCode012 (Collaborator) commented Oct 16, 2025

Description

Fixes https://discord.com/channels/1104757954588196865/1111279858136383509/1428340920318689435

AttributeError: module 'transformers.models.qwen3_vl.modeling_qwen3_vl' has no attribute 'Qwen3VlAttention'. Did you mean: 'Qwen3VLTextAttention'?
Error: module 'transformers.models.qwen3_vl.modeling_qwen3_vl' has no attribute 'Qwen3VlAttention'
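The misspelled name in the traceback is consistent with a class name being built mechanically from `model_type`. The following is a minimal sketch of that failure mode, assuming the lookup capitalizes each underscore-separated part and appends `Attention`. The helper name `derive_attention_cls_name` is hypothetical and is not taken from the codebase.

```python
def derive_attention_cls_name(model_type: str) -> str:
    """Hypothetical helper: build an attention class name from model_type."""
    # str.capitalize() upper-cases the first character and lower-cases the
    # rest, so "vl" becomes "Vl" rather than "VL".
    prefix = "".join(part.capitalize() for part in model_type.split("_"))
    return f"{prefix}Attention"


print(derive_attention_cls_name("qwen3_vl"))  # Qwen3VlAttention
```

Under that assumption the derived name differs from the real class, `Qwen3VLTextAttention`, both in casing (`Vl` vs `VL`) and in the missing `Text` infix, so no generic casing rule alone would recover it.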

Summary by CodeRabbit

  • New Features
    • Added support for Qwen3VL model type in attention handling.

coderabbitai Bot (Contributor) commented Oct 16, 2025

Walkthrough

Adds a new special-case branch for model_type "qwen3_vl" in get_attention_cls_from_config, importing Qwen3VLTextAttention from transformers and returning it. The branch is placed before the existing "mllama" case. No changes to function signature or behavior for other model types.
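The special-casing described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `lora_kernels.py`: it uses a lookup table where the PR adds an `if` branch, the names `ATTENTION_OVERRIDES`, `resolve_attention_target`, and `load_attention_cls` are hypothetical, and the generic fallback rule is assumed from the error message. Only the `qwen3_vl` module path and class name come from this PR.

```python
import importlib

# Assumed override table for model types whose attention class name cannot
# be derived from model_type. Only the qwen3_vl entry is taken from this PR.
ATTENTION_OVERRIDES = {
    "qwen3_vl": (
        "transformers.models.qwen3_vl.modeling_qwen3_vl",
        "Qwen3VLTextAttention",
    ),
}


def resolve_attention_target(model_type: str) -> tuple[str, str]:
    """Return (module_path, class_name) without importing anything."""
    # Special cases are checked first, before the generic rule.
    if model_type in ATTENTION_OVERRIDES:
        return ATTENTION_OVERRIDES[model_type]
    # Assumed generic rule: capitalize each part and append "Attention".
    prefix = "".join(part.capitalize() for part in model_type.split("_"))
    return (
        f"transformers.models.{model_type}.modeling_{model_type}",
        f"{prefix}Attention",
    )


def load_attention_cls(model_type: str):
    """Import lazily so transformers is only required when this is called."""
    module_path, cls_name = resolve_attention_target(model_type)
    return getattr(importlib.import_module(module_path), cls_name)
```

Splitting name resolution from the import keeps the mapping testable on machines where transformers is not installed.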

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Model-type branching in attention class selector: src/axolotl/monkeypatch/lora_kernels.py | Added new conditional branch for "qwen3_vl" model type that imports and returns Qwen3VLTextAttention from transformers |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Suggested reviewers

  • djsaunde
  • SalmanMohammadi

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "fix: qwen3_vl attention config" is directly related to the primary change in the pull request. The changeset adds support for the Qwen3VL model type by importing and returning Qwen3VLTextAttention in the get_attention_cls_from_config function, which resolves an AttributeError where the wrong attention class name was being referenced. The title is concise, specific enough to indicate this is fixing Qwen3VL attention configuration, and avoids vague language or noise. A developer scanning commit history would understand that this change addresses an attention configuration issue for the Qwen3VL model. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa1240a and 2d94327.

📒 Files selected for processing (1)
  • src/axolotl/monkeypatch/lora_kernels.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: PyTest from Source Dist (3.11, 2.8.0)
  • GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.7.1)
  • GitHub Check: PyTest (3.11, 2.8.0)
🔇 Additional comments (1)
src/axolotl/monkeypatch/lora_kernels.py (1)

Lines 137-140: Looks good. The fix is isolated and follows the established pattern. Manual verification recommended.

The codebase search confirms:

  • No other qwen3_vl references that need updating
  • No incorrect class name Qwen3VlAttention elsewhere
  • The fix matches the error message correction and is consistent with similar model type handlers

The import couldn't be verified in the sandbox (transformers not available), but the learnings confirm transformers 4.57.0 adds Qwen3-VL support. Please verify this works in your local environment with transformers >= 4.57.0 installed before merging.
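One way to run the local check suggested above is a small guarded lookup. This is a sketch, and the helper name `module_has_attr` is hypothetical; it relies only on `importlib.import_module` and `hasattr` from the standard library.

```python
import importlib


def module_has_attr(module_path: str, attr: str) -> bool:
    """Return True if module_path imports cleanly and exposes attr."""
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # Covers both a missing package and a missing submodule, for example
        # a transformers release that predates Qwen3-VL support.
        return False
    return hasattr(module, attr)


# Example call (requires transformers to be installed to return True):
# module_has_attr(
#     "transformers.models.qwen3_vl.modeling_qwen3_vl",
#     "Qwen3VLTextAttention",
# )
```

A `False` result does not distinguish between transformers being absent and being too old, so the installed version should be checked separately if the lookup fails.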


codecov Bot commented Oct 16, 2025

Codecov Report

❌ Patch coverage is 33.33333% with 2 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/axolotl/monkeypatch/lora_kernels.py | 33.33% | 2 Missing ⚠️ |


NanoCode012 merged commit 93ba573 into main on Oct 17, 2025 (13 of 14 checks passed)
NanoCode012 deleted the fix/qwen3vl-loraoptim branch on October 17, 2025 03:35
flaviusburca pushed a commit to invergent-ai/axolotl that referenced this pull request Oct 18, 2025