Support nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16 (and nvidia/C-RADIOv2-H)#12277
Conversation
fe75b56 to
998988a
Compare
|
Do you think it would be possible to open a separate PR for the changes made to the following files?
This would help us streamline the review and move forward more quickly. Thanks! |
Sure, I'll do so promptly. |
a0c18eb to
f973053
Compare
|
Could you rebase and resolve the conflicts? Then we can run the CI tests — I believe we’ll be able to merge it soon. |
Done |
|
Can you check the cause of the CI error? |
The commit I just pushed should entirely prevent that error from occuring. |
Motivation
Support Multimodal nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16.
Support its vision encoder: nvidia/CRadioV2-H.
Modifications
python/sglang/srt/models/nano_nemotron_vl.pypython/sglang/srt/configs/radio.pyAccuracy Tests
Reference was VLLM, with EVS turned off, temperature 0.
Both VideoMME and DocVQA run via VLMEvalKit.
VideoMMEresults on par with VLLM:DocVQA_Val: 94.329, identical to VLLM.Checklist