[test] fix: use toy configs in qwen2.5 omni unit tests #2761

Merged
cuichenx merged 3 commits into main from yuya/fix-omni-unit-test-toy-model
Mar 12, 2026

Conversation

@yaoyu-33
Contributor

@yaoyu-33 yaoyu-33 commented Mar 11, 2026

What does this PR do?

Replaces full-size HuggingFace model configs with local toy-sized configs in TestQwen25OmniModel, fixing CI timeout failures.

Problem

The tests introduced in #2634 use AutoConfig.from_pretrained("Qwen/Qwen2.5-Omni-7B") to load the real 7B model config, then construct a full-size HF vision encoder (32-layer ViT), a full-size audio encoder (32-layer Whisper), and a 4-layer language model at 7B dimensions (hidden_size=3584). test_shared_embedding_or_output_weight constructs two such models within a 50s timeout, causing CI failures on PRs like #2004.

Solution

Build tiny configs locally — no network downloads needed:

| Component | Before | After |
| --- | --- | --- |
| Vision encoder | depth=32, hidden_size=3584 | depth=2, hidden_size=64 |
| Audio encoder | encoder_layers=32, d_model=1280 | encoder_layers=2, d_model=64 |
| Language model | hidden_size=3584, vocab_size=152064 | hidden_size=128, vocab_size=1000 |
| Total test time | >50s (timeout) | ~7s |
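The shape of these toy configs can be sketched with plain dataclasses. This is illustrative only: the actual test builds transformers config objects such as Qwen2_5OmniThinkerConfig, and the class names below are stand-ins; the field names and values mirror the table above.

```python
from dataclasses import dataclass


@dataclass
class ToyVisionConfig:
    depth: int = 2           # was 32 in Qwen/Qwen2.5-Omni-7B
    hidden_size: int = 64    # was 3584


@dataclass
class ToyAudioConfig:
    encoder_layers: int = 2  # was 32
    d_model: int = 64        # was 1280


@dataclass
class ToyLanguageConfig:
    hidden_size: int = 128   # was 3584
    vocab_size: int = 1000   # was 152064


# Built locally, so no network call to the HuggingFace Hub is needed.
vision, audio, text = ToyVisionConfig(), ToyAudioConfig(), ToyLanguageConfig()
```

Because every dimension is tiny, model construction is dominated by fixed overhead rather than parameter allocation, which is where the >50s to ~7s improvement comes from.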

Verified on

  • CW cluster GPU node: all 4 tests pass in 7.06s

Changelog

  • Replace AutoConfig.from_pretrained / AutoProcessor.from_pretrained with local toy Qwen2_5OmniThinkerConfig
  • Refactor test helpers into _make_language_config, _make_layer_spec, _build_model for DRY
  • Remove unused processor fixture and get_data_batch static method
  • Remove @pytest.mark.timeout(50) markers (tests now run in <1s each)
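The helper refactor listed above might look roughly like this. The helper names (_make_language_config, _build_model) come from the changelog, but the bodies are hypothetical stand-ins; the real helpers construct Megatron/transformers objects rather than dicts.

```python
class TestQwen25OmniModelSketch:
    """Illustrative DRY pattern: shared builders instead of per-test setup."""

    @staticmethod
    def _make_language_config(hidden_size=128, vocab_size=1000, num_layers=2):
        # Stand-in for building the toy language-model config.
        return {"hidden_size": hidden_size, "vocab_size": vocab_size,
                "num_layers": num_layers}

    @classmethod
    def _build_model(cls, pre_process=True, post_process=True):
        # Stand-in for instantiating the model from the toy config;
        # each test calls this instead of duplicating construction code.
        cfg = cls._make_language_config()
        return {"config": cfg, "pre_process": pre_process,
                "post_process": post_process}
```

Centralizing construction this way means a future config change (e.g. a new required field) touches one helper instead of every test method.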

Supersedes #2759 (temporary pleasefixme skip)

cc @yuekaizhang @ko3n1g

Made with Cursor

Summary by CodeRabbit

  • Chores

    • Updated internal dependencies.
  • Tests

    • Improved test efficiency by streamlining test configurations and removing unnecessary dependencies.

Note: This release contains no user-facing changes.

@copy-pr-bot

copy-pr-bot bot commented Mar 11, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@yaoyu-33
Contributor Author

/ok to test 8087281

@coderabbitai
Contributor

coderabbitai bot commented Mar 11, 2026

📝 Walkthrough

Walkthrough

This PR updates the Megatron-LM submodule pointer and comprehensively refactors the test fixtures, replacing HF-dependent full-size configurations with self-contained, lightweight setup helpers and builder methods that streamline test initialization.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Submodule Update (3rdparty/Megatron-LM) | Pointer updated from commit 07e512a to 23dd639. No behavioral or code changes. |
| Test Fixture Refactoring (tests/unit_tests/models/qwen_omni/modeling_qwen25_omni/test_omni_model.py) | Replaced HF-driven toy setup with self-contained configurations for vision, audio, and text. Introduced a thinker_config fixture and helper methods (_setup_parallel_state, _build_model, _make_language_config, _make_layer_spec) to reduce boilerplate. Updated test_model_freeze_api, test_shared_embedding_or_output_weight, and test_set_input_tensor to use the new toy config pathway. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The title accurately describes the main change: replacing full-size HuggingFace model configs with locally constructed toy-sized configs in Qwen2.5 omni unit tests to fix timeout issues. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Test Results For Major Changes | ✅ Passed | PR documents test results with quantified performance metrics and functional verification, satisfying requirements. |


Replace AutoConfig.from_pretrained("Qwen/Qwen2.5-Omni-7B") with local
toy-sized configs (hidden_size=128, depth=2, encoder_layers=2) so the
tests run in ~7s total instead of timing out at 50s.

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Made-with: Cursor
@yaoyu-33 force-pushed the yuya/fix-omni-unit-test-toy-model branch from 8087281 to ba10e02 on March 11, 2026 at 20:25
@yaoyu-33
Contributor Author

/ok to test ba10e02

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
tests/unit_tests/models/qwen_omni/modeling_qwen25_omni/test_omni_model.py (2)

243-244: Consider adding an assertion for the pre_process=False case.

The test verifies model.thinker.encoder_hidden_state is not None for pre_process=True but has no assertion for the pre_process=False case. Consider adding an assertion to verify the expected behavior:

💡 Proposed assertion
         model = self._build_model(thinker_config, pre_process=False)
         model.set_input_tensor([test_tensor])
+        # When pre_process=False, encoder_hidden_state should still be set
+        assert model.thinker.encoder_hidden_state is not None

Or if the expected behavior differs:

         model = self._build_model(thinker_config, pre_process=False)
         model.set_input_tensor([test_tensor])
+        # Document expected behavior for pre_process=False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit_tests/models/qwen_omni/modeling_qwen25_omni/test_omni_model.py`
around lines 243 - 244, Add an assertion for the pre_process=False case to
verify the expected encoder state behavior: after creating the model via
_build_model(thinker_config, pre_process=False), calling
model.set_input_tensor([test_tensor]) should be followed by an assertion on
model.thinker.encoder_hidden_state (either is None or not None depending on
expected behavior); update the test to explicitly assert the expected condition
for model.thinker.encoder_hidden_state to mirror the pre_process=True check.

94-94: Consider updating or removing the @pytest.mark.pleasefixme marker.

The TODO comment suggests replacing HF vision/audio encoders with dummy stubs, but with the toy configs now in place, the tests should complete quickly. If the tests pass within acceptable time limits (PR states ~7s total), this marker and TODO may be stale.

Consider either:

  1. Removing the marker if tests are now reliably fast
  2. Updating the TODO to reflect actual remaining work
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit_tests/models/qwen_omni/modeling_qwen25_omni/test_omni_model.py` at
line 94, The test currently uses the deprecated marker `@pytest.mark.pleasefixme`
in tests/unit_tests/models/qwen_omni/modeling_qwen25_omni/test_omni_model.py;
either remove the marker entirely if the toy configs make the test reliably
fast, or replace/update the TODO comment on that line to accurately describe any
remaining work (e.g., "TODO: keep in mind to stub HF vision/audio encoders only
if tests regress in CI") and remove the custom marker so CI and pytest outputs
are clean. Ensure the change is made on the line containing
"@pytest.mark.pleasefixme" so the test runs normally without the special marker.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@3rdparty/Megatron-LM`:
- Line 1: This PR updates the vendored Megatron-LM submodule but the change
appears unrelated to the stated test-only refactor; either remove the
Megatron-LM revision bump from this PR or move it to a dedicated PR, and if the
bump is truly required, add a brief note in the PR description documenting which
upstream commits or files from the Megatron-LM delta are needed (e.g., justify
why the test import from megatron.core requires the new revision and list the
specific commits or API changes). Update the branch to exclude the submodule
change or add the justification text and a summary of the upstream dependency so
reviewers can verify the necessity.

In `@tests/unit_tests/models/qwen_omni/modeling_qwen25_omni/test_omni_model.py`:
- Around line 152-179: The test config builds a Qwen25OmniTransformerConfig with
a toy vocab_size=1000 but leaves default token ID fields (e.g., image_token_id,
audio_token_id, video_token_id, etc.) that exceed that range; update the test
instantiation of Qwen25OmniTransformerConfig to explicitly set all token-id
fields referenced by the model (image_token_id, audio_token_id, video_token_id,
and any other special token id attributes present on
Qwen25OmniTransformerConfig) to safe values within [0, vocab_size-1] (for
example small constants like 0..10) so future forward passes won’t hit
index-out-of-bounds errors.
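The token-id concern above can be illustrated with a small range check: with a toy vocab_size=1000, any special-token ID inherited from the full-size config would index past the end of the toy embedding table. This sketch uses hypothetical ID values, not the actual defaults from the Qwen2.5-Omni config.

```python
VOCAB_SIZE = 1000  # toy vocabulary size from this PR


def out_of_range_token_ids(config: dict, vocab_size: int = VOCAB_SIZE) -> list:
    """Return the names of *_token_id fields that fall outside [0, vocab_size)."""
    return [name for name, value in config.items()
            if name.endswith("_token_id") and not (0 <= value < vocab_size)]


# Hypothetical full-size IDs (six-digit range) vs. safe toy replacements.
full_size_ids = {"image_token_id": 152000, "audio_token_id": 152001,
                 "video_token_id": 152002}
toy_ids = {"image_token_id": 1, "audio_token_id": 2, "video_token_id": 3}
```

Running a check like this in the test setup would catch an out-of-bounds ID at construction time instead of as a CUDA indexing error deep in a forward pass.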


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 503bdcfc-f586-4dac-adca-4072ef32b90e

📥 Commits

Reviewing files that changed from the base of the PR and between ca66cdc and ba10e02.

📒 Files selected for processing (2)
  • 3rdparty/Megatron-LM
  • tests/unit_tests/models/qwen_omni/modeling_qwen25_omni/test_omni_model.py

@yaoyu-33 added the bug (Something isn't working), area:model (Model implementations and HF bridge logic), and needs-review (PR is ready for code review and waiting on a reviewer) labels on Mar 12, 2026
cuichenx previously approved these changes on Mar 12, 2026
Drop the stale pleasefixme marker so the toy-sized Omni unit tests run in CI again, and restore the Megatron-LM submodule pointer back to the main-tracked commit for this branch.

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Made-with: Cursor
@yaoyu-33
Contributor Author

/ok to test 13cb6f8

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Made-with: Cursor
@yaoyu-33
Contributor Author

/ok to test 24ff6fd

@cuichenx added the ready-to-merge label (PR is approved, current, and only waiting for CI to pass before merge) and removed the needs-review label on Mar 12, 2026
@cuichenx cuichenx enabled auto-merge (squash) March 12, 2026 21:37
@cuichenx cuichenx merged commit 3fd61dd into main Mar 12, 2026
64 checks passed
@cuichenx cuichenx deleted the yuya/fix-omni-unit-test-toy-model branch March 12, 2026 22:21
copy-pr-bot bot pushed a commit that referenced this pull request Mar 19, 2026
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>