
cp: Fix Qwen2.5-VL huggingface conversion issue (#2107) (2156) into r0.3.0#2205

Merged
ko3n1g merged 1 commit into r0.3.0 from cherry-pick-2156-r0.3.0 on Feb 6, 2026

Conversation

ko3n1g (Contributor) commented Feb 4, 2026

beep boop [🤖]: Hi @meatybobby 👋,

we've cherry-picked #2156 into r0.3.0 for you! 🚀

Please review and approve this cherry pick at your convenience!

Summary by CodeRabbit

  • Refactor

    • Improved handling of nested model structures for MCore Inference Engine compatibility with Qwen25 VLM inference.
  • Tests

    • Enhanced test coverage to validate proper exposure of model components across various nested model configurations.

Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
ko3n1g (Contributor, Author) commented Feb 4, 2026

/ok to test 57a3aa8

@ko3n1g ko3n1g added the Run CICD label Feb 4, 2026
copy-pr-bot commented Feb 4, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai bot (Contributor) commented Feb 4, 2026

📝 Walkthrough

This change relocates decoder exposure logic for Qwen25VL models from model initialization to the inference wrapper setup phase. A new helper function unwraps nested modules to expose the decoder on the MCore inference wrapper, while the corresponding model-level assignment is removed.
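Based on the summary above, the relocation can be sketched as follows. The helper body mirrors the snippet quoted in the review comments; `setup_inference_wrapper` is named in the PR, but its body here is a hypothetical reconstruction, not the actual repository code.

```python
# Hypothetical sketch of the relocated logic; only the helper mirrors the
# reviewed snippet -- the wrapper-setup function is a reconstruction.

def _expose_decoder_from_language_model(model):
    """Unwrap `.module` layers and expose `language_model.decoder` on the base model."""
    current = model
    while hasattr(current, "module"):
        current = current.module
    if hasattr(current, "language_model"):
        current.decoder = current.language_model.decoder


def setup_inference_wrapper(model, tokenizer):
    # Before this PR, Qwen25VLModel.__init__ set
    # `self.decoder = self.language_model.decoder`; moving the assignment here
    # keeps model initialization clean and also handles wrapped modules
    # (e.g. DDP or fp16 wrappers that nest the model under `.module`).
    _expose_decoder_from_language_model(model)
    ...  # build and return the MCore inference wrapper (elided)
```

This keeps the model class free of inference-only attributes while still satisfying the MCore Inference Engine's expectation of a `.decoder` on the wrapped model.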

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Inference Wrapper Setup | `src/megatron/bridge/inference/vlm/base.py` | Adds helper function `_expose_decoder_from_language_model()` to unwrap nested modules and expose the decoder; integrates the call in `setup_inference_wrapper()` for `Qwen25VLModelProvider` configs. |
| Model Initialization | `src/megatron/bridge/models/qwen_vl/modeling_qwen25_vl.py` | Removes the `self.decoder = self.language_model.decoder` assignment from `Qwen25VLModel` class initialization. |
| Test Scaffolding | `tests/unit_tests/inference/vlm/test_base.py` | Updates mock structures to reflect three-level nested model hierarchies for Qwen25; adds lifecycle method mocks (`cuda`, `to`, `eval`); simplifies Qwen3 path tests. |
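A three-level nested mock of the kind the test update describes might look like the following; the wrapper names (`DDP`, `Float16Module`) and layering are assumptions based on the change summary, not the actual test code.

```python
from unittest.mock import MagicMock

# Illustrative three-level nesting: DDP -> Float16Module -> Qwen25VLModel.
# Names are hypothetical; only the .module chaining and lifecycle mocks
# correspond to what the change summary describes.
mock_decoder = MagicMock(name="decoder")

base_model = MagicMock(name="Qwen25VLModel")
base_model.language_model.decoder = mock_decoder

float16_wrapper = MagicMock(name="Float16Module")
float16_wrapper.module = base_model

ddp_wrapper = MagicMock(name="DDP")
ddp_wrapper.module = float16_wrapper

# Lifecycle methods the inference wrapper setup is expected to call:
ddp_wrapper.cuda = MagicMock(return_value=ddp_wrapper)
ddp_wrapper.to = MagicMock(return_value=ddp_wrapper)
ddp_wrapper.eval = MagicMock()
```

Unwrapping `ddp_wrapper.module.module` reaches the base model, which is the level where the decoder must be exposed.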

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

r0.3.0

🚥 Pre-merge checks | ✅ 2 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | The title is a cherry-pick reference that describes the action (copying PR #2107/#2156 into the r0.3.0 branch) but doesn't convey the actual technical change (refactoring decoder exposure for Qwen2.5-VL inference compatibility). | Use a more descriptive title that reflects the technical change, such as "Refactor decoder exposure for Qwen2.5-VL MCore Inference Engine compatibility", rather than just the cherry-pick reference. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Test Results For Major Changes | ✅ Passed | PR contains a minor targeted bug fix for Qwen2.5-VL conversion with updated unit tests validating decoder exposure logic. |



coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/megatron/bridge/inference/vlm/base.py`:
- Around line 115-123: The helper _expose_decoder_from_language_model currently
unwraps wrapped modules by following .module and can loop forever if .module
cycles; update its signature to include type hints (e.g., model:
torch.nn.Module) and a return type of None, add a Google-style docstring
describing args and return, and modify the unwrap loop to detect cycles by
tracking visited objects (e.g., using a set of ids of `current`) and
break/return if a cycle is detected; after unwrapping, safely access
`language_model` and assign `current.decoder = language_model.decoder` only if
present.

In `@tests/unit_tests/inference/vlm/test_base.py`:
- Line 262: The test assigns `_wrapper = setup_inference_wrapper(mock_model,
mock_tokenizer)` (and the same at another spot) but never uses `_wrapper`,
causing lint F841; fix by removing the unused assignment and either call
`setup_inference_wrapper(mock_model, mock_tokenizer)` without assigning or, if
the return is needed later, assign to a used name (e.g., `wrapper`) or use the
returned value; update the lines where `_wrapper` is set (the
`setup_inference_wrapper` calls) accordingly.

Comment on lines +115 to +123:

```python
def _expose_decoder_from_language_model(model):
    """Recursively get language_model from model and expose decoder, handling wrapped modules."""
    current = model
    while hasattr(current, "module"):
        current = current.module

    if hasattr(current, "language_model"):
        language_model = current.language_model
        current.decoder = language_model.decoder
```

⚠️ Potential issue | 🟡 Minor

Guard against cyclic .module and add type hints/docstring.
The unwrap loop can hang if .module ever points to itself (or cycles). Also, this new helper should follow the typing + Google-docstring rules.

🔧 Proposed fix

```diff
-def _expose_decoder_from_language_model(model):
-    """Recursively get language_model from model and expose decoder, handling wrapped modules."""
-    current = model
-    while hasattr(current, "module"):
-        current = current.module
+def _expose_decoder_from_language_model(model: torch.nn.Module) -> None:
+    """Expose the decoder on the base model after unwrapping `.module` layers.
+
+    Args:
+        model: The potentially wrapped model instance.
+
+    Returns:
+        None.
+    """
+    current = model
+    while hasattr(current, "module"):
+        next_module = current.module
+        if next_module is None or next_module is current:
+            break
+        current = next_module
```

As per coding guidelines: Use Google style docstrings for classes and functions; Use type hints for function arguments and return types.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change:

```python
def _expose_decoder_from_language_model(model: torch.nn.Module) -> None:
    """Expose the decoder on the base model after unwrapping `.module` layers.

    Args:
        model: The potentially wrapped model instance.

    Returns:
        None.
    """
    current = model
    while hasattr(current, "module"):
        next_module = current.module
        if next_module is None or next_module is current:
            break
        current = next_module

    if hasattr(current, "language_model"):
        language_model = current.language_model
        current.decoder = language_model.decoder
```

Comment on line 262:

```python
mock_model.to = MagicMock(return_value=mock_model)
mock_model.eval = MagicMock()

_wrapper = setup_inference_wrapper(mock_model, mock_tokenizer)
```

⚠️ Potential issue | 🟡 Minor

Remove unused _wrapper assignments to satisfy lint.
_wrapper is assigned but never used, and F841 will fail lint. Drop the assignment or use the value.

🧹 Proposed fix

```diff
-        _wrapper = setup_inference_wrapper(mock_model, mock_tokenizer)
+        setup_inference_wrapper(mock_model, mock_tokenizer)
```

Also applies to: 288-288

🧰 Tools
🪛 Flake8 (7.3.0)

[error] 262-262: local variable '_wrapper' is assigned to but never used (F841)


@meatybobby meatybobby marked this pull request as ready for review February 5, 2026 23:46
@ko3n1g ko3n1g merged commit 37ba134 into r0.3.0 Feb 6, 2026
49 checks passed
@ko3n1g ko3n1g deleted the cherry-pick-2156-r0.3.0 branch February 6, 2026 13:58
