Support qwen3-omni with DP Encoder #14886

Open
apinge wants to merge 3 commits into sgl-project:main from apinge:qwen3-omni--mm-enable-dp-encoder-all

apinge commented Dec 11, 2025

Motivation

Add support for running the Qwen3-Omni vision encoder with data parallelism (DP), building on #13724.

Modifications

Accuracy Tests

Tested Qwen/Qwen3-Omni-30B-A3B-Instruct

With DP:

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| mmmu_val | 0 | none | 0 | mmmu_acc | 0.5867 | ± N/A |

Without DP:

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| mmmu_val | 0 | none | 0 | mmmu_acc | 0.59 | ± N/A |
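To put the two accuracy values side by side, here is a minimal sketch that computes the absolute gap and roughly how many questions it corresponds to. The question count of 900 is an assumption (the size of the standard MMMU validation split); the PR does not state how many samples were evaluated, and since Stderr is reported as N/A this says nothing about statistical significance.

```python
# Accuracy values copied from the two lm-eval tables above.
ACC_WITH_DP = 0.5867
ACC_WITHOUT_DP = 0.59

# Assumption: the standard MMMU validation split of 900 questions was used.
ASSUMED_NUM_QUESTIONS = 900


def accuracy_gap(acc_a: float, acc_b: float, num_questions: int) -> tuple[float, int]:
    """Return the absolute accuracy gap and the approximate number of questions it represents."""
    gap = abs(acc_a - acc_b)
    return gap, round(gap * num_questions)


gap, questions = accuracy_gap(ACC_WITH_DP, ACC_WITHOUT_DP, ASSUMED_NUM_QUESTIONS)
print(f"absolute gap: {gap:.4f} (about {questions} of {ASSUMED_NUM_QUESTIONS} questions)")
```

Under that assumption the two runs differ by only a handful of answers.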

Benchmarking and Profiling

On H20, we benchmarked the server with the --mm-enable-dp-encoder option enabled against a baseline running the unmodified code. In the configurations tested, enabling --mm-enable-dp-encoder reduced TTFT for multi-modal workloads: by 2.83% with 4 images and by 10.53% with 8 images at 960x1280.

SGLANG_VLM_CACHE_SIZE_MB=0 \
python3 -m sglang.launch_server \
    --model-path /root/workspace/Qwen3-Omni-30B-A3B-Instruct/ \
    --host localhost \
    --port 40000 \
    --tp-size 4 \
    --trust-remote-code \
    --mem-fraction-static 0.85 \
    --disable-radix-cache \
    --cuda-graph-max-bs 32 \
    --mm-attention-backend fa3 \
    --attention-backend flashinfer \
    --mm-enable-dp-encoder

| TP size | Image count | Image resolution | TTFT baseline (ms) | TTFT with DP (ms) | Change |
|---|---|---|---|---|---|
| 4 | 4 | 960x1280 | 6733.54 | 6543.15 | -2.83% |
| 4 | 8 | 960x1280 | 16671.97 | 14916.71 | -10.53% |
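
For reference, the percentage column can be reproduced from the two TTFT columns. The snippet below is a small sketch that is not part of the PR; the helper name `relative_change_pct` is hypothetical, and the numbers are copied from the benchmark table above.

```python
def relative_change_pct(baseline_ms: float, candidate_ms: float) -> float:
    """Percentage change of candidate relative to baseline (negative means lower TTFT)."""
    return (candidate_ms - baseline_ms) / baseline_ms * 100.0


# (image count, baseline TTFT in ms, TTFT with the DP encoder in ms)
RESULTS = [
    (4, 6733.54, 6543.15),
    (8, 16671.97, 14916.71),
]

# Map each image count to its relative TTFT change, rounded to two decimals.
changes = {
    images: round(relative_change_pct(base, with_dp), 2)
    for images, base, with_dp in RESULTS
}

for images, pct in changes.items():
    print(f"{images} images: {pct:+.2f}%")
```

A negative value means TTFT went down with the DP encoder enabled, matching the sign convention used in the table.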

Checklist

apinge force-pushed the qwen3-omni--mm-enable-dp-encoder-all branch from 5e89e16 to 4db2169 on December 16, 2025 01:03
apinge marked this pull request as ready for review on December 16, 2025 02:32
Signed-off-by: apinge <tong.qiu2@amd.com>
Signed-off-by: apinge <tong.qiu2@amd.com>
Signed-off-by: apinge <tong.qiu2@amd.com>
apinge force-pushed the qwen3-omni--mm-enable-dp-encoder-all branch from 4db2169 to 6673203 on December 16, 2025 02:33