Enable multimodal model support for RL training #264
nancyjlau wants to merge 4 commits into rllm-org:main
Conversation
…so multimodal models can work with rllm; that way, for issue rllm-org#242, multimodal ReAct agents can process and generate images for RL training
…3VLProcessor requires transformers 4.57.0+
Thanks for your PR! This feature would be really helpful. Could you write a simple training example to verify it?
Could you please provide a test startup script as well as the training dataset used for testing? Thanks!
I think we need vllm>=0.11.0 or sglang>=0.5.3 to support Qwen3-VL rollout.
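A minimal sketch of a startup check based on the version floors mentioned above. The function names are illustrative, not part of this PR, and the comparison deliberately ignores pre-release suffixes:

```python
import importlib.metadata as md

def meets_floor(installed: str, floor: str) -> bool:
    """True if installed >= floor, comparing dotted numeric versions."""
    def as_tuple(v: str):
        # keep only the leading digits of each component ("0+cu128" -> 0)
        parts = []
        for p in v.split(".")[:3]:
            digits = ""
            for ch in p:
                if not ch.isdigit():
                    break
                digits += ch
            parts.append(int(digits or 0))
        return tuple(parts)
    return as_tuple(installed) >= as_tuple(floor)

def qwen3_vl_rollout_supported() -> bool:
    # per the discussion: vllm>=0.11.0 or sglang>=0.5.3 is needed
    for pkg, floor in (("vllm", "0.11.0"), ("sglang", "0.5.3")):
        try:
            if meets_floor(md.version(pkg), floor):
                return True
        except md.PackageNotFoundError:
            continue
    return False
```

A guard like this could run before rollout starts, so an incompatible backend fails fast instead of surfacing as a cryptic error mid-training.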
Working on getting an environment that works again for the examples, but yes, this needs an updated sglang version in order to work. Currently in dependency hell trying to recreate a working environment for Qwen3-VL.
Does merging this into the nightly version help? It is currently using
For vllm==0.11.0, this works for me: verl-project/verl#3934
You were able to get it working? I'm currently facing these issues: for SGLang, I was able to get it training until I hit an OOM error on a single GPU, and then was unable to replicate the setup with multiple H100s.
@nancyjlau I would like to know whether you can start the rLLM training with the latest version of verl. I tried to replace verl-0.5.0 with the latest one, but it stalled; the task pends after outputting the logs: Furthermore, my environment is:
@nancyjlau cu128 + torch 2.8.0 is compatible with flash-attn 2.7.4.post1 in my env. Btw, which sglang version are you using? I encountered some memory leak problems with sglang.


Added processor support that is tested to work for multimodal VL models like Qwen2.5-VL and Qwen3-VL. Changes include updating the verl submodule to latest main (which includes multimodal support from PRs #2146 and #2398), adding `hf_processor` loading in the rLLM trainers (`train_workflow_pipeline.py` and `train_agent_ppo.py`), and bumping transformers to >=4.57.0 for Qwen3-VL.

I have tested `Qwen/Qwen2.5-VL-3B-Instruct` and successfully loaded `Qwen2_5_VLProcessor`. Qwen3-VL models are supported with `transformers>=4.57.0`.

Closes #242
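As a quick illustration of the transformers floor described above, a guard like the following (the helper name is hypothetical, not from this PR) could fail fast before a trainer attempts to load a Qwen3-VL processor:

```python
def check_qwen3_vl_transformers(version: str) -> None:
    """Raise if the given transformers version predates Qwen3-VL support.

    Hypothetical guard; the 4.57.0 floor comes from this PR's requirement.
    """
    major, minor = (int(x) for x in version.split(".")[:2])
    if (major, minor) < (4, 57):
        raise RuntimeError(
            f"Qwen3-VL needs transformers>=4.57.0, found {version}"
        )
```

In the trainers, this would presumably be called with `transformers.__version__` right before the `hf_processor` loading step, turning a confusing import-time failure into an actionable error message.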