[Refactor][Perf] Qwen3-omni: code predictor with re-prefill + SDPA and eliminate decode hot-path CPU round-trips #1758

Merged
hsliuustc0106 merged 10 commits into vllm-project:main from LJH-LBJ:qwen3-omni-decode-performance
Mar 10, 2026

Commits

Commits on Mar 9, 2026

Commits on Mar 10, 2026