[model] feat: add registration and config converter for Qwen 2.5-Omni#5120

Closed
martinzhang03 wants to merge 1 commit into verl-project:main from martinzhang03:feat/qwen2_5_omni_support

Conversation

@martinzhang03

What does this PR do?

This PR introduces the initial registration and configuration conversion logic for the Qwen 2.5-Omni model within the veRL framework.

Key Changes

  • Model Registration: Added QWEN2_5_OMNI to the SupportedVLM and SupportedModel enums in verl/models/mcore/registry.py.

  • Config Converter: Implemented hf_to_mcore_config_qwen2_5_omni in verl/models/mcore/config_converter.py.

    • This converter specifically addresses the nested configuration structure of the Omni model: Qwen2_5OmniConfig → thinker_config → text_config.
    • It correctly extracts mrope_section from rope_parameters to support multimodal rotary positional embeddings (mRoPE).
  • Forward Registry Skeleton: Added commented-out entries for MODEL_FORWARD registries as placeholders for the upcoming implementation.
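
The nested-config handling described above can be sketched as follows. This is a minimal illustration, not verl's actual hf_to_mcore_config_qwen2_5_omni: the function name, the returned fields, and the sample values are all assumptions, and plain namespaces stand in for the real Hugging Face config objects.

```python
from types import SimpleNamespace

def extract_text_config_fields(hf_config):
    """Navigate Qwen2_5OmniConfig -> thinker_config -> text_config and pull
    out a few fields a Megatron-Core config would need. Returning a dict is
    illustrative; the real converter builds a full mcore config object."""
    text_config = hf_config.thinker_config.text_config
    rope_parameters = getattr(text_config, "rope_parameters", None) or {}
    return {
        "hidden_size": text_config.hidden_size,
        "num_attention_heads": text_config.num_attention_heads,
        # mrope_section partitions the rotary dimensions across the temporal,
        # height, and width axes for multimodal RoPE (mRoPE)
        "mrope_section": rope_parameters.get("mrope_section"),
    }

# Stand-in for the nested HF config; the numbers are sample values only
hf_config = SimpleNamespace(
    thinker_config=SimpleNamespace(
        text_config=SimpleNamespace(
            hidden_size=3584,
            num_attention_heads=28,
            rope_parameters={"mrope_section": [16, 24, 24]},
        )
    )
)

fields = extract_text_config_fields(hf_config)
print(fields["mrope_section"])  # [16, 24, 24]
```

The key point the sketch captures is that, unlike the plain Qwen2.5-VL config, the Omni text parameters live two levels deep, so the converter must unwrap thinker_config before reading any text-model fields.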

Implementation Details

The implementation was developed by:

  • Analyzing the Qwen2_5OmniForConditionalGeneration structure in the Hugging Face transformers library.
  • Referencing established patterns in ms-swift for Omni model training support to ensure compatibility with existing community standards.

Future Work (Next Steps)

This PR serves as the foundation for full Qwen 2.5-Omni support. Immediate follow-up PRs will include:

  • Weight Converter: Implementation to map HF checkpoints to Megatron-Core tensors.
  • Model Initializer: Implementation of the initialization logic.
  • Forward Activation: Enabling the Forward Registries (standard, no-pad, and fused).
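
To make the first of these steps concrete: a weight converter largely reduces to a parameter-name mapping plus per-tensor reshapes. The sketch below shows only the name-mapping half, and every name on both sides is an assumption for illustration, not verl's actual conversion table.

```python
# Hypothetical HF -> Megatron-Core parameter-name mapping; real converters
# also handle patterned per-layer names and tensor reshapes/splits.
HF_TO_MCORE = {
    "model.embed_tokens.weight": "embedding.word_embeddings.weight",
    "model.norm.weight": "decoder.final_layernorm.weight",
    "lm_head.weight": "output_layer.weight",
}

def map_hf_to_mcore_name(hf_name: str) -> str:
    """Translate one HF parameter name to its assumed mcore counterpart."""
    try:
        return HF_TO_MCORE[hf_name]
    except KeyError:
        # Fail loudly rather than silently dropping a checkpoint tensor
        raise KeyError(f"no mcore mapping registered for {hf_name!r}")

print(map_hf_to_mcore_name("lm_head.weight"))  # output_layer.weight
```

For the Omni model the HF side would additionally carry the thinker prefix introduced by the nested architecture, which is exactly why the converter cannot reuse the plain Qwen2.5 mapping unchanged.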

Technical Summary (translated from Chinese)

This PR implements the initial registration and configuration-conversion logic for the Qwen 2.5-Omni model in veRL:

  • Nested config parsing: correctly handles the Omni-specific thinker_config nesting and extracts the core text-model parameters.
  • mRoPE support: extracts and maps the multimodal rotary positional embedding parameters from the config.
  • Architecture alignment: follows the implementation experience of ms-swift to stay consistent with existing community standards for multimodal training.

@CLAassistant

CLAassistant commented Jan 30, 2026

CLA assistant check
All committers have signed the CLA.

@wuxibin89
Collaborator

wuxibin89 commented Jan 30, 2026

We now use mbridge and NVIDIA-NeMo/Megatron-Bridge to convert HF models to mcore GPTModel; please submit a PR to them.

@martinzhang03
Author

> We now use mbridge and NVIDIA-NeMo/Megatron-Bridge to convert HF models to mcore GPTModel; please submit a PR to them.

Thanks for the heads-up, @wuxibin89! I'll port the configuration logic over to that repository and submit a PR there instead. Should I keep this PR open for a moment in case the registration part (the Enum changes) is still needed here, or should I close this entirely?

@wuxibin89
Collaborator

> We now use mbridge and NVIDIA-NeMo/Megatron-Bridge to convert HF models to mcore GPTModel; please submit a PR to them.

> Thanks for the heads-up, @wuxibin89! I'll port the configuration logic over to that repository and submit a PR there instead. Should I keep this PR open for a moment in case the registration part (the Enum changes) is still needed here, or should I close this entirely?

Closing this PR, since we're going to clean up mcore model registration in verl (#4496, #4530).

@wuxibin89 wuxibin89 closed this Feb 2, 2026
