cp: add peft to recipe qwen3vl (2023) into r0.3.0#2220
Conversation
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
/ok to test 581f0f6
📝 Walkthrough

This PR adds Parameter-Efficient Fine-Tuning (PEFT) support to Qwen3-VL through documentation, a CLI argument for PEFT selection, model forward signature updates for loss masking, recipe configuration handling, and new functional tests for finetune recipes.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant CLI as finetune_qwen_vl.py
    participant RecipeModule as qwen3_vl Recipe
    participant PEFTConfig as PEFT Config Builder
    participant Model as Qwen3VLModel
    User->>CLI: Run with --peft lora
    CLI->>CLI: Build recipe_kwargs dict
    CLI->>RecipeModule: Call pretrain_config(**recipe_kwargs)
    RecipeModule->>PEFTConfig: _default_peft_config(peft)
    PEFTConfig-->>RecipeModule: PEFT config object
    RecipeModule->>RecipeModule: Adjust tensor_model_parallel_size<br/>(4 for full SFT, 1 for PEFT)
    RecipeModule->>RecipeModule: Compute effective_lr<br/>(finetune_lr override)
    RecipeModule-->>CLI: ConfigContainer with peft_config
    CLI->>Model: Initialize with config
    CLI->>Model: Forward pass with loss_mask
    Model-->>CLI: Loss output
```
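The CLI-to-recipe handoff at the top of the diagram can be sketched in a few lines. This is a hypothetical illustration of the pattern (parse a `--peft` flag, pack it into a recipe kwargs dict, and forward only the overrides the user actually set); `build_recipe_kwargs` is an illustrative name, not the real `finetune_qwen_vl.py` API.

```python
import argparse


def build_recipe_kwargs(args: argparse.Namespace) -> dict:
    """Collect recipe overrides, dropping unset (None) values."""
    candidates = {
        "peft": args.peft,
        "finetune_lr": args.finetune_lr,
    }
    return {k: v for k, v in candidates.items() if v is not None}


parser = argparse.ArgumentParser()
parser.add_argument("--peft", choices=["lora", "none"], default=None)
parser.add_argument("--finetune_lr", type=float, default=None)

# Simulate `python finetune_qwen_vl.py --peft lora`
args = parser.parse_args(["--peft", "lora"])
kwargs = build_recipe_kwargs(args)
print(kwargs)  # {'peft': 'lora'}
```

Keeping unset flags out of the kwargs dict lets the recipe's own defaults (e.g., the tensor-parallel adjustment shown in the diagram) apply unchanged.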
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

✅ Passed checks (3 passed)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@tests/functional_tests/recipes/test_qwen3_vl_recipes_finetune.py`:
- Around line 15-26: Update the top-file docstring in
tests/functional_tests/recipes/test_qwen3_vl_recipes_finetune.py: correct the
example run command string to reference the current file (change
test_qwen3_vl_recipes_peft.py to test_qwen3_vl_recipes_finetune.py) and add a
short hardware note indicating the GPU requirement (e.g., "Requires 2 GPUs; run
with torchrun --nproc_per_node=2"). Locate and edit the docstring text block
that contains the "Run with:" line to make these changes.
🧹 Nitpick comments (4)
src/megatron/bridge/models/qwen_vl/modelling_qwen3_vl/model.py (1)
195-221: Align `loss_mask` typing and docstring with Python 3.10+ conventions.

Use PEP 604 for nullable typing and document the new parameter in the docstring.
Proposed fix:

```diff
- loss_mask: torch.Tensor = None,
+ loss_mask: torch.Tensor | None = None,
@@
- labels (torch.Tensor): Optional target text labels [batch, combined_seq_len].
+ labels (torch.Tensor): Optional target text labels [batch, combined_seq_len].
+ loss_mask (torch.Tensor | None): Optional loss mask aligned with labels.
```

As per coding guidelines: Use 'T | None' for nullable types instead of 'Optional[T]' and Use Google style docstrings (parseable by Sphinx) for classes and functions.
src/megatron/bridge/recipes/qwen_vl/qwen3_vl.py (2)
102-105: Use PEP 604 types for the new PEFT fields.

This aligns the new fields with the Python 3.10+ typing guidance already required.
Proposed fix:

```diff
- peft: Optional[Union[str, PEFT]]
- finetune_lr: float
+ peft: PEFT | str | None
+ finetune_lr: float | None
```

As per coding guidelines: Use 'T | None' for nullable types instead of 'Optional[T]' and Use 'X | Y' for union types instead of 'Union[X, Y]'.
327-329: Use PEP 604 types for the new `_qwen3_vl_common` params.

Proposed fix:

```diff
- peft: Optional[Union[str, PEFT]] = None,
- finetune_lr: Optional[float] = None,
+ peft: PEFT | str | None = None,
+ finetune_lr: float | None = None,
```

As per coding guidelines: Use 'T | None' for nullable types instead of 'Optional[T]' and Use 'X | Y' for union types instead of 'Union[X, Y]'.
tests/functional_tests/recipes/test_qwen3_vl_recipes_finetune.py (1)
34-54: Add an integration mark and use the `G_` global prefix.

Proposed fix:

```diff
-QWEN3_VL_FINETUNE_RECIPES = [
+G_QWEN3_VL_FINETUNE_RECIPES = [
@@
-@pytest.mark.parametrize(
-    "config_func,recipe_name,parallelism_overrides,model_overrides",
-    QWEN3_VL_FINETUNE_RECIPES,
-)
+@pytest.mark.integration
+@pytest.mark.parametrize(
+    "config_func,recipe_name,parallelism_overrides,model_overrides",
+    G_QWEN3_VL_FINETUNE_RECIPES,
+)
```

As per coding guidelines: Use pytest.mark to categorize tests (unit, integration, system) and Use upper snake_case and prefix 'G' for global variables (e.g., G_MY_GLOBAL).
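A self-contained version of the suggested pattern looks like this: an integration mark stacked on a parametrized test that is driven by a `G_`-prefixed module-level table. The recipe names and overrides are placeholders, not the real test's entries.

```python
import pytest

G_QWEN3_VL_FINETUNE_RECIPES = [
    ("qwen3_vl_finetune_full", {"tensor_model_parallel_size": 4}),
    ("qwen3_vl_finetune_lora", {"tensor_model_parallel_size": 1}),
]


@pytest.mark.integration
@pytest.mark.parametrize(
    "recipe_name,parallelism_overrides",
    G_QWEN3_VL_FINETUNE_RECIPES,
)
def test_recipe_builds(recipe_name, parallelism_overrides):
    # Placeholder assertions standing in for building the recipe config.
    assert recipe_name
    assert "tensor_model_parallel_size" in parallelism_overrides
```

With the mark in place, `pytest -m integration` (or `-m "not integration"`) can select or skip these GPU-heavy cases at collection time.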
```python
"""Functional smoke tests for Qwen3-VL finetuning recipes.

This test ensures that:
1. Qwen3-VL model forward pass works with all required parameters (including loss_mask)
2. Training loop completes without errors
3. Checkpoints are saved correctly

This catches regressions like missing parameters in the forward pass signature.

Run with:
    torchrun --nproc_per_node=2 -m pytest tests/functional_tests/recipes/test_qwen3_vl_recipes_peft.py -v
"""
```
Fix the run command and document GPU requirements.
The run command references the wrong filename, and the GPU requirement should be explicit.
Proposed fix
Proposed fix:

```diff
-Run with:
-    torchrun --nproc_per_node=2 -m pytest tests/functional_tests/recipes/test_qwen3_vl_recipes_peft.py -v
+Run with (requires 2 GPUs):
+    torchrun --nproc_per_node=2 -m pytest tests/functional_tests/recipes/test_qwen3_vl_recipes_finetune.py -v
```

As per coding guidelines: Document hardware requirements for GPU tests.
ko3n1g
left a comment
@yashaswikarnati Do we have failing tests we're fixing with this PR? If not, please split the docs changes out so that we can merge those. The code changes we should merge post-release, unless we have tests that are failing.
ko3n1g
left a comment
Yash and I synced offline: there was no test in CI before, and this PR adds a test for the failure. All good then, we'll merge it.
beep boop [🤖]: Hi @yashaswikarnati 👋,
Summary by CodeRabbit
Release Notes
New Features
Documentation
Tests