[NPU] Upgrade to v0.20.0 & align with GPU model runner#3325
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
|
@hsliuustc0106 @Gaohan123 PTAL. Thanks! |
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
max_num_seqs: 10
gpu_memory_utilization: 0.3
async_scheduling: true
enforce_eager: true
@linyueqian @Sy0307 As I recall, we don't use CudaGraphWrapper's graph for the qwen3-tts talker; instead, we enable the code predictor graph by default. So we should set enforce_eager explicitly in the deploy YAML. Correct me if I'm wrong. Thanks!
I think we should set it to false. Stage 0 does run on cudagraph. @Sy0307 please correct me if I'm wrong.
I've moved it to the NPU-only path for now. We can discuss it later.
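For readers following the thread, here is a minimal sketch of what keeping `enforce_eager` on an NPU-only path could look like. The base values are the ones quoted from the deploy YAML above; the `PLATFORM_OVERRIDES` table, the `"npu"` and `"cuda"` platform keys, and the `resolve_engine_args` helper are hypothetical names used only for illustration, not the project's actual config schema or API.

```python
from copy import deepcopy

# Values quoted from the deploy YAML fragment in this review thread.
BASE_ENGINE_ARGS = {
    "max_num_seqs": 10,
    "gpu_memory_utilization": 0.3,
    "async_scheduling": True,
}

# Hypothetical per-platform overrides (illustrative only): eager mode is
# forced on NPU, while other platforms keep whatever the base config says.
PLATFORM_OVERRIDES = {
    "npu": {"enforce_eager": True},
}


def resolve_engine_args(platform: str) -> dict:
    """Return the base args with any overrides for `platform` applied on top."""
    args = deepcopy(BASE_ENGINE_ARGS)
    args.update(PLATFORM_OVERRIDES.get(platform, {}))
    return args


npu_args = resolve_engine_args("npu")
gpu_args = resolve_engine_args("cuda")
```

With this layout the GPU path is left untouched, so the open question of whether stage 0 should run with cudagraph can be settled separately from the NPU change.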
|
BLOCKING:
|
|
Please fix CI failure. |
The failure looks unrelated to this PR. |
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
|
@hsliuustc0106 @Gaohan123 Could you help approve this PR? It should only affect the NPU code path. |
…3325) Signed-off-by: gcanlin <canlinguosdu@gmail.com>
commit a3d4ed809d56977eb632e8a63aae1fc090a790e3
Author: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Date: Wed May 20 00:14:08 2026 +0800
[Quantization][tools] Add diffusion quantization output comparison tool (#3175)
Signed-off-by: david6666666 <530634352@qq.com>
Signed-off-by: David Chen <530634352@qq.com>
commit 3c58868c9a4fb7f0b1754d07738d1f87d3af5dae
Author: dengyunyang <584797741@qq.com>
Date: Tue May 19 22:22:27 2026 +0800
[BugFix] fix mult cli timeout with get kv (#3741)
Signed-off-by: dengyunyang <584797741@qq.com>
commit da5361879395d45d5017fb575a7446cb36774bf4
Author: Shin <shin@yixiaoer.sg>
Date: Tue May 19 19:56:38 2026 +0800
[Recipe] Qwen/Qwen-Image-Edit (#3684)
Signed-off-by: yixiaoer <shin@yixiaoer.sg>
commit 18186db216319684e3e0d2c268d6a0409525fc2e
Author: Schatten <3192396192@qq.com>
Date: Tue May 19 19:23:45 2026 +0800
[Cleanup] Remove unused build_base_engine_args after #1115 (#3720)
Signed-off-by: Schatten <czhengt@qq.com>
commit 14e5baceaf240e78d1a0c5dcc883563db23eb703
Author: Lu <luludachiever@gmail.com>
Date: Tue May 19 19:19:58 2026 +0800
[Qwen-Image] Drop unused vision tower from text encoder (#3608)
Signed-off-by: lulugoodcoder <luludachiever@gmail.com>
Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>
commit 2af2a50e0e2981ec2eef32e704f5a66c3d451c95
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date: Tue May 19 15:22:02 2026 +0800
[CI] improve Buildkite testcase statistics reports (#3543)
Signed-off-by: wangyu <410167048@qq.com>
commit bd83ac9b4b6f7f3a64d13a1695d5a51e73164075
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date: Tue May 19 14:28:47 2026 +0800
[CI] invalid_param reliability suite and weekly http_invalid jobs (#3652)
Signed-off-by: wangyu <410167048@qq.com>
commit e277feacaf859c1aa3f2f7354d6fc396cf06ba5d
Author: chickeyton <ngton2014@gmail.com>
Date: Tue May 19 12:08:20 2026 +0800
[large-scale-serving] Integrate OmniCoordinator into stage engine pipeline (#3569)
Signed-off-by: chickeyton <ngton2014@gmail.com>
Signed-off-by: herotai214 <herotai214@gmail.com>
Co-authored-by: herotai214 <herotai214@gmail.com>
commit ca9fd0b71ce04fa6283154c0ee7f32fcfc2eaf11
Author: JiaHong <2360655509@qq.com>
Date: Tue May 19 11:52:41 2026 +0800
Reject non-positive Flux2 Klein inference steps (#3717)
Signed-off-by: MmMaiIIi <2360655509@qq.com>
commit 3ac739817f5afce9b5a291c2eddaccf5c1927cab
Author: JiaHong <2360655509@qq.com>
Date: Tue May 19 11:30:54 2026 +0800
[Bugfix] Reject empty prompts in Flux2 Klein diffusion pipeline (#3711)
Signed-off-by: MmMaiIIi <2360655509@qq.com>
Co-authored-by: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
commit 1fa734419ec6b578537aa5267c0d42f006499201
Author: bjf-frz <frz123db@gmail.com>
Date: Tue May 19 11:30:33 2026 +0800
[Refactor]Rename diffusion benchmark backend to endpoint (#3137)
Signed-off-by: bjf-frz <frz123db@gmail.com>
Signed-off-by: bjfwhite <baijingfan1@huawei.com>
Co-authored-by: bjfwhite <baijingfan1@huawei.com>
commit 2c6b1bb0c0b814aa562770737e8d0a6dd7c848f7
Author: fan2956 <zhoufan53@huawei.com>
Date: Tue May 19 10:24:27 2026 +0800
[Bugfix] Fix hunyuanimage3 dit quant storageshape mismatch error (#3694)
Signed-off-by: fan2956 <zhoufan53@huawei.com>
commit e2ed1c457455f8460182873111882b46829dc2df
Author: Daniel Huang <daniel1.huang@intel.com>
Date: Mon May 18 19:19:16 2026 -0700
Disable sampler kernel for XPU test (#3718)
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
commit 89f8819525589141fd825ce4f0d1e1be9cf3660b
Author: Rustam Khadipash <16683750+hadipash@users.noreply.github.com>
Date: Tue May 19 09:42:17 2026 +0800
[Feature] Add support for Pipeline Parallel and integrate it into Wan 2.2 (#2322)
Signed-off-by: Rustam Khadipash <16683750+hadipash@users.noreply.github.com>
commit 475a4002b0136235b4feb22d4a1e4b221ca5e112
Author: Chendi.Xue <chendi.xue@intel.com>
Date: Mon May 18 18:58:15 2026 -0500
[XPU] set flash_attn as default diffusion attn backend and fix k_len for cross_attn (#3525)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
commit ab59673f21d804729557ac53d4c839e6d7353afb
Author: Sy03 <1370724210@qq.com>
Date: Tue May 19 02:59:23 2026 +0800
[Bugfix][Qwen3-Omni] Handle short Code2Wav chunk outputs (#3687)
Signed-off-by: Sy03 <1370724210@qq.com>
Co-authored-by: amy-why-3459 <wuhaiyan17@huawei.com>
commit 821286794f1afaac7d44d7a75371e87527b30d22
Author: lyj-jjj <liuyingjun5@huawei.com>
Date: Tue May 19 00:35:30 2026 +0800
[HY-Imgae3.0] support hunyuan image3 dit fa-fp8 on npu (#3540)
Signed-off-by: lyj-jjj <liuyingjun5@huawei.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
commit 309e5c38c665b91a9818f03dd5c515878caf0e53
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date: Mon May 18 21:25:37 2026 +0800
[BugFix][CI]Fixing occasional CI failures (#3623)
Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
commit f4115bd7716e1d29c8233bc8a69125dfdd35b3d1
Author: Ding Zuhao <e1583181@u.nus.edu>
Date: Mon May 18 21:12:46 2026 +0800
[Bugfix] Fix SenseNova U1 broken import after SupportsModuleOffload (#3691)
Signed-off-by: nussejzz <nussejzz@users.noreply.github.com>
Co-authored-by: nussejzz <nussejzz@users.noreply.github.com>
commit dbc589dbca09df88714ba433ee241c3aa6690235
Author: Lancer <maruixiang6688@gmail.com>
Date: Mon May 18 17:23:40 2026 +0800
[Bugfix] fix diffusion quantization benchmarking for Omni outputs (#3653)
Signed-off-by: Lancer <maruixiang6688@gmail.com>
commit 990566aef10c69ac1fa3073437be0a3333b3dc15
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date: Mon May 18 05:18:18 2026 -0400
[Bugfix][TTS] Drop meaningless TTFT from speech-endpoint benchmarks (#3674)
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
commit 6d37e77fb2d9b9f4625a022ccffcafdff3134ef7
Author: Chendi.Xue <chendi.xue@intel.com>
Date: Sun May 17 22:06:56 2026 -0500
[XPU] update dockerfile and CI to 0.21.0 (#3675)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
commit 4ba8e14981bb80a1835a7956357ebf32011b0c27
Author: wuhang <wuhang6@huawei.com>
Date: Mon May 18 08:54:57 2026 +0800
Fix diffusion engine cleanup lifecycle (#3494)
Signed-off-by: wuhang <wuhang6@huawei.com>
Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit c99df1ebd9f8007639507a6ba6e5dea09e0abd9c
Author: Sy03 <1370724210@qq.com>
Date: Mon May 18 04:58:59 2026 +0800
[TTS][Perf] Optimize Qwen3-TTS high-concurrency serving (#3662)
Signed-off-by: Sy03 <1370724210@qq.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>
commit 0a395f9de11469255d8347a1ce48df56fef74888
Author: bjf-frz <frz123db@gmail.com>
Date: Mon May 18 00:59:46 2026 +0800
[SKILL]Add diffusion perf skill (#3461)
Signed-off-by: bjf-frz <frz123db@gmail.com>
commit c0e132d973276e5c1213bd03d930718ff056fd57
Author: Hongsheng Liu <liuhongsheng4@huawei.com>
Date: Mon May 18 00:02:34 2026 +0800
[Doc] Reorganize available recipes into a table (#3671)
Signed-off-by: hsliu <liuhongsheng4@huawei.com>
Co-authored-by: deepseek-v4-pro <noreply@anthropic.com>
commit 471ddfe025db12bf6f117eb6dd66c40343849c21
Author: Hongsheng Liu <liuhongsheng4@huawei.com>
Date: Sun May 17 23:36:46 2026 +0800
[Doc] Simplify template example subtitle (#3669)
Signed-off-by: hsliu <liuhongsheng4@huawei.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
commit 8cfc9179e6545ee45c90be36cfdba43afcec788e
Author: Mike Qiu <qdy220091330@gmail.com>
Date: Sun May 17 23:30:24 2026 +0800
Fix reasoning_parser crash: reconstruct StructuredOutputsConfig from dict (#2845)
Signed-off-by: Mike_Qiu <qiudayu.qdy@antgroup.com>
Co-authored-by: Mike_Qiu <qiudayu.qdy@antgroup.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit 0da9ffdb0d3023482e1e90d6563a3e379ed6a160
Author: Mike Qiu <qdy220091330@gmail.com>
Date: Sun May 17 23:05:34 2026 +0800
Fix output finish reason issue for audio chunk in stream mode (#2849)
Signed-off-by: Mike_Qiu <qiudayu.qdy@antgroup.com>
Co-authored-by: Mike_Qiu <qiudayu.qdy@antgroup.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit 4e880537501d2b2935c97ddfe3dfdf2679d3e2dc
Author: TaffyOfficial <2587297563@qq.com>
Date: Sun May 17 22:42:59 2026 +0800
[BugFix][HunyuanImage3] Set MRoPE dynamic_arg_dims so graph mode can compile (#3630)
Signed-off-by: TaffyOfficial <2324465096@qq.com>
Co-authored-by: TaffyOfficial <2324465096@qq.com>
Co-authored-by: Codex <codex@openai.com>
commit 768943b8791abf30a1cc7b1cf82cbbad5d5ee247
Author: Reid <61492567+reidliu41@users.noreply.github.com>
Date: Sun May 17 22:10:26 2026 +0800
[Frontend]Handle audio generate engine errors consistently (#3316)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>
commit 220db62b3f6a7877e0eb39f3cb8f15ec219d4136
Author: Yuanheng Zhao <54058983+yuanheng-zhao@users.noreply.github.com>
Date: Sun May 17 21:58:44 2026 +0800
[Bugfix] Adapt LTX-2 connector arg with diffusers 0.38.0 (#3661)
Signed-off-by: Yuanheng Zhao <jonathan.zhaoyh@gmail.com>
commit 5549b7f44a0bfa75c294d397f8742208e253c3d1
Author: Kevin H. Luu <khluu000@gmail.com>
Date: Sun May 17 04:02:28 2026 -0700
[CI/Build] Enable twine upload to PyPI (#3667)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
commit bc26cad19a0443cc4f444d5bb843e55c1ac3e2f4
Author: Kevin H. Luu <khluu000@gmail.com>
Date: Sun May 17 03:40:14 2026 -0700
[CI/Build] Unify release pipeline with NIGHTLY=1 option, add x86_64/aarch64 image builds (#3428)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
commit 9c5e35f485a7d2037330ea74535e8373c739f350
Author: Alex Brooks <albrooks@redhat.com>
Date: Sat May 16 17:33:03 2026 -0600
[Config Refactor] Support Recursive Merging for Engine Args (#3009)
Signed-off-by: Alex Brooks <albrooks@redhat.com>
Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit a64ebf103b35fa48f42accf444e4f027c992009e
Author: Yuanheng Zhao <54058983+yuanheng-zhao@users.noreply.github.com>
Date: Sun May 17 07:32:34 2026 +0800
[Refactor] Migrate and clean up TTS configs: CosyVoice3, OmniVoice, VoxCPM (#3338)
Signed-off-by: yuanheng <jonathan.zhaoyh@gmail.com>
Signed-off-by: Yuanheng Zhao <jonathan.zhaoyh@gmail.com>
commit c08959ee040281ecd310293adeb82067fa2e5932
Author: TJian <tunjian.tan@embeddedllm.com>
Date: Sat May 16 23:15:59 2026 +0800
[ROCm] [CI] [Bugfix] Upgrade vllm version to v0.21.0 and ROCm 7.2.2 (#3659)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
commit c5ac295e3c9f0b3425843b15964824a89cd271ae
Author: rongfu.leng <lenronfu@gmail.com>
Date: Sat May 16 21:11:34 2026 +0800
[Feat] Add helios support cache dit (#3470)
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
commit ea35a0cc4a35dcdb674af76d8279c084a6aaa181
Author: Zeng Chuang <zengchuang3@huawei.com>
Date: Sat May 16 20:51:31 2026 +0800
[Bugfix]update process name for dit stage (#3602)
Signed-off-by: zengchuang <zengchuang3@huawei.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit 0f4853ff86f3fd840f9404535c89961a48eb13e2
Author: wuhang <wuhang6@huawei.com>
Date: Sat May 16 20:50:29 2026 +0800
[Bugfix] Support diffusion worker dead detect when use inline engine (#3214)
Signed-off-by: wuhang <wuhang6@huawei.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit b5e163cfcabbfdea73469c014766d104d2231e10
Author: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Date: Sat May 16 20:08:51 2026 +0800
[CI][Accuracy] Add Qwen-Image-2512 Qwen-Image-Edit-2511 pixel accuracy tests (#3502)
Signed-off-by: david6666666 <530634352@qq.com>
commit d647e7e4cfa3c50bed50cc07e465365bc9627f0b
Author: dengyunyang <584797741@qq.com>
Date: Sat May 16 19:35:05 2026 +0800
[Hunyuanimage 3.0] hunyuan accuracy test (#3655)
Signed-off-by: dengyunyang <584797741@qq.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit 33220b1e39c51d87b982dd1d5e6abd8e20aa8b5a
Author: Nick Cao <ncao@redhat.com>
Date: Sat May 16 07:34:04 2026 -0400
[BugFix] Finish async_chunk requests without pad-token injection (#3613)
Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit eb4e60ee64f2e5cd785b43fdd3af9ff7822b5a4f
Author: Zhou Taichang <tzhouam@connect.ust.hk>
Date: Sat May 16 18:18:43 2026 +0800
[Rebase] Rebase to vllm v0.21.0 (#3530)
Signed-off-by: tzhouam <tzhouam@connect.ust.hk>
Signed-off-by: Zhou Taichang <tzhouam@connect.ust.hk>
Signed-off-by: NumberWan <wantszkin2003@gmail.com>
Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
Signed-off-by: Dnoob <dxpouo@gmail.com>
Signed-off-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Signed-off-by: rein yang <ruiruyang2@gmail.com>
Signed-off-by: Nick Cao <ncao@redhat.com>
Signed-off-by: zhumingjue <zhumingjue@huawei.com>
Signed-off-by: Ricardo Noriega De Soto <rnoriega@redhat.com>
Signed-off-by: lyj-jjj <liuyingjun5@huawei.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: wangyu <410167048@qq.com>
Signed-off-by: weizhoublue <weizhoublue@github.com>
Signed-off-by: weizhou.lan@daocloud.io <weizhou.lan@daocloud.io>
Signed-off-by: dengyunyang <584797741@qq.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Signed-off-by: David Chen <530634352@qq.com>
Signed-off-by: Jie Liu <33612777+keeper-jie@users.noreply.github.com>
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
Signed-off-by: princepride <wangzhipeng628@gmail.com>
Signed-off-by: natureofnature <wzliu@connect.hku.hk>
Signed-off-by: bjf-frz <frz123db@gmail.com>
Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
Signed-off-by: KexiongYu <yukexiong1@huawei.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: NumberWan <wantszkin2003@gmail.com>
Co-authored-by: dsinghvi <divyanshsinghvi@gmail.com>
Co-authored-by: Dnoob <dxpouo@gmail.com>
Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Co-authored-by: Samit <285365963@qq.com>
Co-authored-by: rein yang <73573651+R2-Y@users.noreply.github.com>
Co-authored-by: Nick Cao <ncao@redhat.com>
Co-authored-by: zhumingjue138 <zhumingjue@huawei.com>
Co-authored-by: Ricardo Noriega <rnoriega@redhat.com>
Co-authored-by: lyj-jjj <liuyingjun5@huawei.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: gcanlin <canlinguosdu@gmail.com>
Co-authored-by: wangyu <53896905+yenuo26@users.noreply.github.com>
Co-authored-by: weizhoublue <45163302+weizhoublue@users.noreply.github.com>
Co-authored-by: weizhoublue <weizhoublue@github.com>
Co-authored-by: dengyunyang <584797741@qq.com>
Co-authored-by: 汪志鹏 <wangzhipeng628@gmail.com>
Co-authored-by: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Co-authored-by: Jie Liu <33612777+keeper-jie@users.noreply.github.com>
Co-authored-by: Yueqian Lin <linyueqian@outlook.com>
Co-authored-by: NATURE <wzliu@connect.hku.hk>
Co-authored-by: bjf-frz <frz123db@gmail.com>
Co-authored-by: amy-why-3459 <wuhaiyan17@huawei.com>
Co-authored-by: Y. Fisher <yukexiong1@huawei.com>
Co-authored-by: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
commit 5e1986206f7381757d51c507dcbd54b553889fb1
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date: Fri May 15 11:35:44 2026 -0400
[CI] Replace c=128 perf cell with c=16; loosen new-cell baselines (#3637)
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit d1c65bdffaa21799d6a4dc34086ceb68dea9fe9d
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date: Fri May 15 22:50:03 2026 +0800
[BugFix] fix ci (#3650)
Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
commit d18168ccbbb1b3735b43d25e712ad248e9a29ffa
Author: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
Date: Fri May 15 17:55:09 2026 +0800
[bugfix] Fix diffusers backend input bug after #2913 (#3644)
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
commit 779bf3118bf642fbd8bb35f9416b443829f6c604
Author: Y. Fisher <yukexiong1@huawei.com>
Date: Fri May 15 17:35:22 2026 +0800
[Bugfix] fix compatibility of _hunyuan_image3_unpack_packed_topk between vllm / vllm ascend (#3640)
Signed-off-by: KexiongYu <yukexiong1@huawei.com>
commit e7ee5de09f2fb32debadf4b42f193baf27042c69
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date: Fri May 15 17:07:42 2026 +0800
[BugFix] Fix the issue of thinker requests being preempted, causing shape mismatch. (#3147)
Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
commit 440c718d2a6beb052a18c23163112a5ed5413d6d
Author: bjf-frz <frz123db@gmail.com>
Date: Fri May 15 16:42:18 2026 +0800
[Bugfix]Fix multimodal cache routing for AR replicas (#3605)
Signed-off-by: bjf-frz <frz123db@gmail.com>
commit 82a0b3a46763d8be64c3613265297e2a2271faa4
Author: NATURE <wzliu@connect.hku.hk>
Date: Fri May 15 14:14:08 2026 +0800
[2/5] [core]refactor communication layer: PR 2 of 5 Qwen3 Omni non async (#2677)
Signed-off-by: natureofnature <wzliu@connect.hku.hk>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit c7178d89bb7a70817f239febc84c3b21a714dae7
Author: 汪志鹏 <wangzhipeng628@gmail.com>
Date: Fri May 15 13:28:40 2026 +0800
[Bugfix] UnspecifiedOmniPlatform.get_device_count returns 0 instead o… (#3636)
Signed-off-by: princepride <wangzhipeng628@gmail.com>
commit fdb0efea946c35d2ee68f57274dadd0a616e561e
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date: Fri May 15 11:50:22 2026 +0800
[CI] add cuda marker to Diffusion X2V function pytest (#3625)
Signed-off-by: wangyu <410167048@qq.com>
commit 90f5b3c3a10b8c6032bfb82d6e112ec6d70b761a
Author: Jie Liu <33612777+keeper-jie@users.noreply.github.com>
Date: Fri May 15 11:43:05 2026 +0800
Update streaming_speech_client.py to solve Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice voice problem (#3380)
Signed-off-by: Jie Liu <33612777+keeper-jie@users.noreply.github.com>
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
Co-authored-by: Yueqian Lin <linyueqian@outlook.com>
commit bbc00f9f86e5bf54633737bedb7964ea4003e37d
Author: lyj-jjj <liuyingjun5@huawei.com>
Date: Fri May 15 11:22:59 2026 +0800
[BugFix] fix(omni): isolate diffusion KV-cache dtype from vLLM --kv-cache-dtype #3585 (#3596)
Signed-off-by: lyj-jjj <liuyingjun5@huawei.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
commit adb2291c2770a66a8658718780ff3b597591dc6d
Author: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Date: Fri May 15 09:38:47 2026 +0800
Update WeChat group QR code (#3624)
Signed-off-by: David Chen <530634352@qq.com>
commit 4f13b871f949d29da952d7582a21d982330f4213
Author: Canlin Guo <canlinguosdu@gmail.com>
Date: Thu May 14 20:37:51 2026 +0800
[CI] Add Qwen3-TTS tests for ready tag (#3600)
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
commit 94254e015f3164a54ef66c042b8bce1a1abee34b
Author: dengyunyang <584797741@qq.com>
Date: Thu May 14 20:07:12 2026 +0800
[BugFix] fix shm connector (#3583)
Signed-off-by: dengyunyang <584797741@qq.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Co-authored-by: 汪志鹏 <wangzhipeng628@gmail.com>
commit f7161b07d0126fc933c89eb057113cec089fc5d3
Author: bjf-frz <frz123db@gmail.com>
Date: Thu May 14 17:27:59 2026 +0800
[Bugfix]Allow HunyuanImage3 AR sampler batching (#3590)
Signed-off-by: bjf-frz <frz123db@gmail.com>
Co-authored-by: Canlin Guo <canlinguosdu@gmail.com>
commit c0b7509f0789c79199f55e13dc7320ab22d95e97
Author: Hongsheng Liu <liuhongsheng4@huawei.com>
Date: Thu May 14 17:23:55 2026 +0800
update v0.20.0 readme (#3594)
Signed-off-by: hsliu_ustc <hsliu_ustc@noreply.gitcode.com>
Co-authored-by: hsliu_ustc <hsliu_ustc@noreply.gitcode.com>
commit 3f63aaf982bcba327b7e5150faf6ccc242f84eaa
Author: TaffyOfficial <2587297563@qq.com>
Date: Thu May 14 16:58:40 2026 +0800
[Feature] HunyuanImage-3.0 IT2I: multi-image input + prompt API cleanup (#3444)
Signed-off-by: TaffyOfficial <2324465096@qq.com>
Signed-off-by: TaffyOfficial <wu15922848573@outlook.com>
Signed-off-by: skf1999 <13234016272@163.com>
Signed-off-by: zuiho <2324465096@qq.com>
Signed-off-by: Claude Code <noreply@anthropic.com>
Signed-off-by: zuiho <wu15922848573@outlook.com>
Signed-off-by: TaffyOfficial <2587297563@qq.com>
Co-authored-by: TaffyOfficial <2324465096@qq.com>
Co-authored-by: TaffyOfficial <wu15922848573@outlook.com>
Co-authored-by: skf1999 <13234016272@163.com>
commit c4f859bf56ef294e0e70b7ea6befdfc5b3f0880b
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date: Thu May 14 02:26:59 2026 -0400
[CI] Harden Qwen3-TTS perf nightly: enable Base voice_clone, add c=64/128, 2-GPU split (#3491)
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
commit 0d9d57acd90f6b6418cf8ccc91c76991a84103e6
Author: wuhang <wuhang6@huawei.com>
Date: Thu May 14 11:25:55 2026 +0800
[Entrypoint][Refactor] Make field type hint more concrete (#3139)
Signed-off-by: wuhang <wuhang6@huawei.com>
commit 51b4b1131e2811942d16fe984eaa1890a6112e44
Author: Y. Fisher <yukexiong1@huawei.com>
Date: Thu May 14 11:17:15 2026 +0800
[Bugfix]: Fix online serving failure when using deploy config (#3537)
Signed-off-by: KexiongYu <yukexiong1@huawei.com>
Signed-off-by: Y. Fisher <yukexiong1@huawei.com>
commit e818dba016c390b7a85afb2cb941af8f2928fe3f
Author: zhumingjue138 <zhumingjue@huawei.com>
Date: Thu May 14 10:47:14 2026 +0800
[Test] Add stability tests for HunyuanImage-3-Instruct (#3504)
Signed-off-by: zhumingjue <zhumingjue@huawei.com>
commit 754d2e52fcbf3230b015457595991a1e6c9c2f6b
Author: Alex Brooks <albrooks@redhat.com>
Date: Wed May 13 14:20:18 2026 -0600
[BugFix] Refresh TeaCache when num_inference_steps=None (#2240)
Signed-off-by: Alex Brooks <albrooks@redhat.com>
commit 9de9d1f7b593e5fc8884bcdd3456e062950f076f
Author: vraiti <vraiti@redhat.com>
Date: Wed May 13 15:33:55 2026 -0400
[Model] Add TP-aware MistralEncoder for FLUX.2-dev TP (#2465)
Signed-off-by: vraiti <vraiti@redhat.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
commit efd955674b608833533626fec21dfb7bacc8f009
Author: dengyunyang <584797741@qq.com>
Date: Wed May 13 22:40:18 2026 +0800
[Bugfix][HunyuanImage3.0] Fix KV reuse compatibility in SP scenarios (#3546)
Signed-off-by: dengyunyang <584797741@qq.com>
commit 4d3eed152a697412c966d2ac97e0009b92490b5e
Author: Y. Fisher <yukexiong1@huawei.com>
Date: Wed May 13 22:22:43 2026 +0800
[Feat][Config] Support additional_config for diffusion worker (#3020)
Signed-off-by: KexiongYu <yukexiong1@huawei.com>
Signed-off-by: Y. Fisher <yukexiong1@huawei.com>
commit 16a84b29d51165a47152c540babce56392dfdc0e
Author: Zeng Chuang <zengchuang1005@gmail.com>
Date: Wed May 13 22:10:35 2026 +0800
[Bugfix] Add bot_task option of think_recaption for hunyuanimage3 it2i (#3551)
Signed-off-by: zengchuang <zengchuang3@huawei.com>
commit b9cb57b6310de8bbc85a278e165ddf0690a5667c
Author: TJian <tunjian.tan@embeddedllm.com>
Date: Wed May 13 20:50:57 2026 +0800
[ROCm] Bugfix wan22 (#3463)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
commit 2e8e3057bcefb9edcc62b3370914ed0e1352e44e
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date: Wed May 13 17:54:21 2026 +0800
[skip ci][Tests] Splitting Qwen3-omni's performance test cases (#3501)
Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
commit a715abd4474f8c31084692b2637885088193d8c1
Author: hxhhhlalala <hyh_hh@163.com>
Date: Wed May 13 17:14:42 2026 +0800
[NPU][Quant] Add W8A8 MXFP8 online/offline quantization support for Wan2.2 T2V / I2V / TI2V inference on Ascend NPU (#3140)
Signed-off-by: hyh_hh <huyinghong1@huawei.com>
Co-authored-by: hyh_hh <huyinghong1@huawei.com>
commit b6bdc5997f73c85e3544f4e21c28049119fa7b63
Author: weizhoublue <45163302+weizhoublue@users.noreply.github.com>
Date: Wed May 13 16:22:48 2026 +0800
Fix: NPU AR model runner prefix cache key flattening (#3568)
Signed-off-by: weizhoublue <weizhoublue@github.com>
Signed-off-by: weizhou.lan@daocloud.io <weizhou.lan@daocloud.io>
Co-authored-by: weizhoublue <weizhoublue@github.com>
commit 631251a1f8573fc1fcc325041bf1b3bf347226be
Author: knlnguyen1802 <knlnguyen1802@gmail.com>
Date: Wed May 13 15:31:48 2026 +0800
[Bugfix, rl] Diffusion worker SIGKILL under Ray actor (exitcode -9) (#3533)
Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Co-authored-by: Samit <285365963@qq.com>
commit b6f29ee6145bf353b084557b80792dee2d5a7149
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date: Wed May 13 14:25:56 2026 +0800
[CI][Bugfix] skip fp8 Z-Image quality gate (#3531) and add torchdiffeq dev extra (#3563)
Signed-off-by: wangyu <410167048@qq.com>
commit 0ab1d3005694473a4684959c809d3fd84a00ae69
Author: Canlin Guo <canlinguosdu@gmail.com>
Date: Wed May 13 12:58:17 2026 +0800
[CI][Test] Add NPU nightly tests (#3480)
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
commit 56ca7dd612bbd2298426ce34147845d29197e0b4
Author: lyj-jjj <liuyingjun5@huawei.com>
Date: Wed May 13 11:46:00 2026 +0800
support online FP8 quantization for FA on NPU #2236 (#2640)
Signed-off-by: lyj-jjj <liuyingjun5@huawei.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: gcanlin <canlinguosdu@gmail.com>
commit 83bbe39d39bb6c6db9278ba5e9bd3aee37ce0040
Author: Ricardo Noriega <rnoriega@redhat.com>
Date: Wed May 13 04:25:02 2026 +0200
Bump diffusers minimum version to >=0.38.0 (#3349)
Signed-off-by: Ricardo Noriega De Soto <rnoriega@redhat.com>
commit 5313cf6d4800ec9dc438686f7e32eeee48bbb022
Author: Nick Cao <ncao@redhat.com>
Date: Tue May 12 22:08:23 2026 -0400
[Bugfix] Fix omni processing test for non-multimodal talker stage (#3559)
Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
commit c167b9d69190299070159e053a0be0a6db7f2cc1
Author: zhumingjue138 <zhumingjue@huawei.com>
Date: Wed May 13 09:15:20 2026 +0800
[Bugfix] Fix the issue where the qwen3-omni model long-term stability test sometimes gets stuck without sending requests. (#3468)
Signed-off-by: zhumingjue <zhumingjue@huawei.com>
commit dca369d448cd714d36bfaab7d54ab9e3449de306
Author: Nick Cao <ncao@redhat.com>
Date: Tue May 12 11:24:13 2026 -0400
[Perf] Remove dead audio_tower and visual from Qwen2.5-Omni talker stage (#3425)
Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
commit f4b28f239848db9f12121e1d760ef204b128e0be
Author: rein yang <73573651+R2-Y@users.noreply.github.com>
Date: Tue May 12 22:10:10 2026 +0800
[CI] update daily omni min accuracy (#3536)
Signed-off-by: rein yang <ruiruyang2@gmail.com>
commit aa1184d737f2e908f1467b04e13b8df3aae12e53
Author: knlnguyen1802 <knlnguyen1802@gmail.com>
Date: Tue May 12 14:55:30 2026 +0800
[bugfix, rl] Fix race condition bug on async running for diffusion model (#3379)
Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>
Co-authored-by: Samit <285365963@qq.com>
commit d7ea5d5979c78fb697e7497dd1aed75bf886a9cf
Author: Dnoob <dxpouo@gmail.com>
Date: Tue May 12 14:39:58 2026 +0800
[New Model] Add support for tencent/Covo-Audio-Chat (#2293)
Signed-off-by: Dnoob <dxpouo@gmail.com>
Signed-off-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit e40076e872b0ac4a458ca3abb069bfbf9806935d
Author: dsinghvi <divyanshsinghvi@gmail.com>
Date: Tue May 12 12:00:13 2026 +0530
[Refactor] msgspec standardisation for data entry key names and improved type checks (#3149)
Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
commit 9a60e11e3dbac99e2414b8c0eaa747119f9c61bd
Author: NumberWan <wantszkin2003@gmail.com>
Date: Tue May 12 13:59:04 2026 +0800
[Nightly CI] Remove TP case (#3534)
Signed-off-by: NumberWan <wantszkin2003@gmail.com>
commit fe72d078caa30212244ad7d023fcf7af9531c176
Author: Saad Al-Tohamy <92796871+saadaltohamy@users.noreply.github.com>
Date: Tue May 12 08:10:04 2026 +0300
[FIX] Ensure `extra_params` are correctly merged into sampling params in `_create_diffusion_speech()` (#3320)
Signed-off-by: saadaltohamy <saad_altohamy@yahoo.com>
Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>
commit 0d91fbbbb7fe8b0e3b59e35403d2d2123969ae3f
Author: dengyunyang <584797741@qq.com>
Date: Tue May 12 12:13:30 2026 +0800
[Bugfix] Align the AR and DiT prompt formatting across both online and offline modes. (#3516)
Signed-off-by: dengyunyang <584797741@qq.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit ce621be29bdce423f4d6fd2e248aa20f443135ca
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date: Tue May 12 11:56:10 2026 +0800
[BugFix] Modify the splicing method of streaming audio output. (#3438)
Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
commit dd5626e6079ddfeea5430bd597fddc868e83d99b
Author: bjf-frz <frz123db@gmail.com>
Date: Tue May 12 10:44:20 2026 +0800
[Recipes]update Wan2.2-I2V gpu part (#3271)
Signed-off-by: bjf-frz <frz123db@gmail.com>
commit ac5fbed6c05115d2605bfc71922eddc69471e14d
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date: Tue May 12 10:23:56 2026 +0800
[CI][Bugfix] Improve e2e latency logging, update response classes to include detailed latency documentation and add startup time logging (#3246)
Signed-off-by: wangyu <410167048@qq.com>
Signed-off-by: [Your Name] <your.email@example.com>
commit 955fcff828705e685e1ad119ebd117940f480481
Author: Lancer <maruixiang6688@gmail.com>
Date: Tue May 12 10:08:28 2026 +0800
[Chore] explicit .float() conversion in Helios's optimized_scale function (#3529)
Signed-off-by: Lancer <maruixiang6688@gmail.com>
commit 4bca522f01ca49f04bb9a6cfa14c7c8839013b0c
Author: ChenWenjing <54166744+Shirley125@users.noreply.github.com>
Date: Tue May 12 01:09:10 2026 +0800
[bugfix][ci] avoid Whisper transcript deduplication in realtime audio test (#3417)
Signed-off-by: CHEN <116010019@link.cuhk.edu.cn>
commit bd4ede391b58295335061102fb534007e3e149af
Author: Nick Cao <ncao@redhat.com>
Date: Mon May 11 12:04:56 2026 -0400
[Perf] Remove dead audio_tower and visual from Qwen3-Omni talker stage (#3296)
Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
commit 6be59f7d19e11427605a727ff5142c980c9ae19c
Author: Junhong Liu <ljh_lbj@163.com>
Date: Mon May 11 22:56:54 2026 +0800
[Fix] Fix RMSNorm inductor KeyError under HSDP + torch.compile (#3460)
Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com>
commit a33e2eb5885472e4a87f9c431a7792967046fcb1
Author: Y. Fisher <yukexiong1@huawei.com>
Date: Mon May 11 22:49:19 2026 +0800
[Config] Add HunyuanImage3 deploy configs (#3172)
Signed-off-by: KexiongYu <yukexiong1@huawei.com>
Signed-off-by: Y. Fisher <yukexiong1@huawei.com>
commit c9a8556c24ade154b09b55a39acd36a1697a1f1f
Author: 汪志鹏 <wangzhipeng628@gmail.com>
Date: Mon May 11 22:19:00 2026 +0800
[New Model]: Add sensenova u1 support (#3319)
Signed-off-by: princepride <wangzhipeng628@gmail.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
commit 2cdffcea6b0117216f29ba329bebda814d090645
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date: Mon May 11 21:56:08 2026 +0800
[CI] skip failing diffusion and accuracy cases (#3432, #3256, #3257, #3488) (#3507)
Signed-off-by: wangyu <410167048@qq.com>
commit 3f27ffbd4de71df4bede265bcf4f8212e6bfa07a
Author: wuhang <wuhang6@huawei.com>
Date: Mon May 11 20:16:05 2026 +0800
[Misc] Clean logs for image gen task (#3414)
Signed-off-by: wuhang <wuhang6@huawei.com>
commit 3bf4f2850c254c45152e53224b1462a1c450581e
Author: dengyunyang <584797741@qq.com>
Date: Mon May 11 19:34:58 2026 +0800
[Bug][Hunyuanimage 3.0] fix different AR encode behavior between online and offline (#3500)
Signed-off-by: dengyunyang <584797741@qq.com>
commit 5e263b6929ef7cb19c37800db5257f700f41871c
Author: Canlin Guo <canlinguosdu@gmail.com>
Date: Mon May 11 11:27:35 2026 +0800
[BugFix] Rename attention_config to diffusion_attention_config (#3489)
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
commit e1088026faa5e4ee7b27ae3cd835fdc74f6431c0
Author: Baoyuan Qi <qibaoyuan@126.com>
Date: Mon May 11 10:05:46 2026 +0800
[Performance] Improve MiMo-Audio tokenizer decoding performance (#2183)
Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
Co-authored-by: Jialong Liu <88185941+Galleons2029@users.noreply.github.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit ef7f2f1cd2158bd55d00ee811eeb09608468841a
Author: Canlin Guo <canlinguosdu@gmail.com>
Date: Mon May 11 09:13:07 2026 +0800
[Docs] Refactor the attention backend docs/skill (#3475)
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit 67e0e10c85e87e9934162390abe62eb075a6b2bd
Author: TJian <tunjian.tan@embeddedllm.com>
Date: Mon May 11 07:53:27 2026 +0800
[ROCm] [CI] Add the same skip ci logic as CUDA CI (#3482)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
commit 2c73cf3522ae6b462eeac6c0cbec639c6e86b4a4
Author: Sy03 <1370724210@qq.com>
Date: Mon May 11 07:07:38 2026 +0800
[Perf] Fix Qwen3-TTS latency regression (#3485)
Signed-off-by: Sy03 <1370724210@qq.com>
commit 857356d5b72f4b27a1f0a5f795f21463f190163b
Author: dengyunyang <584797741@qq.com>
Date: Sun May 10 22:08:24 2026 +0800
[Feature] hunyuanimage support flash attn (#2981)
Signed-off-by: dengyunyang <584797741@qq.com>
commit 11c4c7f0ff7f25eecec1b875dc3a44ed6060e9ba
Author: Canlin Guo <canlinguosdu@gmail.com>
Date: Sun May 10 11:59:05 2026 +0800
[Diffusion][Attention] Support per-role attention backend via CLI (#2681)
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit 26d481fc847a584c3f385a9ddcce002af1bbd319
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date: Sun May 10 07:00:29 2026 +0800
[CI] Remove VLLM_TEST_CLEAN_GPU_MEMORY to avoid environment variable pollution that causes unnecessary GPU detection, thereby slowing down test case execution. (#3446)
Signed-off-by: wangyu <410167048@qq.com>
Signed-off-by: [Your Name] <your.email@example.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit 77480215f5c854b030364a3e352862228f98de1a
Author: wuhang <wuhang6@huawei.com>
Date: Sat May 9 21:13:18 2026 +0800
[CI][Nightly] Shard nightly Diffusion X2I H100 lanes and centralize shard definitions (#3455)
Signed-off-by: wuhang <wuhang6@huawei.com>
commit c4a099004411f0aa5d30ad05ed4e7fe6876e58e0
Author: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
Date: Sat May 9 04:55:05 2026 -0400
(Phase 1)Add ModelOpt FP8 auto-detect support for diffusion checkpoints #2709 (#2913)
Signed-off-by: roG0d <rodgarcas98@gmail.com>
Signed-off-by: roG0d <baonudesifeizhai@gmail.com>
Signed-off-by: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
Co-authored-by: roG0d <rodgarcas98@gmail.com>
commit 40a07e0d809e3c2dc07de52ef977ca364a1dc2cb
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date: Sat May 9 16:17:57 2026 +0800
[CI] Refine nightly pytest command in Omni · Function Test with H100 to avoid duplicate testing. (#3459)
Signed-off-by: wangyu <410167048@qq.com>
commit 0e81ef28707631fc6335bf083cf3df9966851403
Author: zhumingjue138 <zhumingjue@huawei.com>
Date: Sat May 9 16:17:43 2026 +0800
[CI] Update merge condition to skip L3 merges during weekly test and update doc (#3197)
Signed-off-by: zhumingjue <zhumingjue@huawei.com>
commit ac69cbd27ecbf67e3a994c15c55d9ee65dacbd16
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date: Sat May 9 03:22:43 2026 -0400
[Test] Restore tts mark and omni_runner_function fixture for Voxtral TTS (#3462)
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
commit d460673647dd97a2ba3976a8e8bcce3a2527a61e
Author: Wallbreazzz <110282866+Wallbreazzz@users.noreply.github.com>
Date: Sat May 9 14:58:56 2026 +0800
Fix NPU code predictor device mismatch in concurrent mode (#3453)
Co-authored-by: houzechen <h00875519@china.huawei.com>
commit f6e3dece09ad3a72d20a119a9341551cdb25065c
Author: akshatvishu <33392262+akshatvishu@users.noreply.github.com>
Date: Sat May 9 08:10:36 2026 +0530
[Feature] Add FP8 quantization for Voxtral TTS (#3036)
Signed-off-by: akshatvishu <akshatnayak197@gmail.com>
Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>
Signed-off-by: akshatvishu <33392262+akshatvishu@users.noreply.github.com>
Co-authored-by: Chen-Yo Sun <chenyo.sun@mistral.ai>
Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>
commit de3a2917107a7f2da68b35157d83735c1dc35897
Author: Lancer <maruixiang6688@gmail.com>
Date: Sat May 9 09:13:37 2026 +0800
[Bugfix] fix OmniGen2 offload and dtype mismatch (#2560)
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Signed-off-by: Lancer <402430575@qq.com>
commit c481ccee2b405e2a580b4f050cbc795cdb1e10ba
Author: Dan <416947747@qq.com>
Date: Sat May 9 06:44:19 2026 +0800
[Perf] Optimize VoxCPM2 first-request latency via startup warmup (#3424)
Signed-off-by: Dan250124 <416947747@qq.com>
commit b4ab37da22e77a112e6f6e085937a4ea66ed6da9
Author: rongfu.leng <lenronfu@gmail.com>
Date: Sat May 9 06:41:59 2026 +0800
[Bugfix] Qwen-Image use teachche serve will crash (#3450)
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
commit c2a624bec41537a6d78454beebce58cf91764e7e
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date: Fri May 8 18:40:43 2026 -0400
[Bugfix][StableAudio] Pass model_class_name to Omni() and declare audio class attrs (#3405)
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
commit aca4b7d65c0d7925d22d055ef26c630a4b8dec82
Author: chzhang2021 <chzhang2021@gmail.com>
Date: Fri May 8 13:08:39 2026 -0700
Add Qwen3 TTS Model recipe (#3130)
Signed-off-by: Chonghao Zhang <chzhang2021@gmail.com>
Signed-off-by: chzhang2021 <chzhang2021@gmail.com>
Signed-off-by: Chonghao Zhang <chonghaoz@meta.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: Chonghao Zhang <chonghaoz@meta.com>
commit 65bc9684659d28dff1010940f0a3a0d6258fd62e
Author: Nick Cao <ncao@redhat.com>
Date: Fri May 8 10:16:49 2026 -0400
[Refactor] Rename SupportsModuleOffload to SupportsComponentDiscovery (#3354)
Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
commit b968373c886618a701bb8745eb065c26e555804b
Author: Ayush Agarwal <ayushag@nvidia.com>
Date: Fri May 8 06:37:57 2026 -0700
enhancement: extend to dmd2 to image generation + add flux, qwen image pipelines (#2974)
Signed-off-by: ayushag <ayushag@nvidia.com>
Signed-off-by: Ayush Agarwal <ayushag@nvidia.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit 039a09a8e14bac3762cf1c7576e46f5c6a5e5c27
Author: skf <54565339+skf-1999@users.noreply.github.com>
Date: Fri May 8 21:33:53 2026 +0800
[Feature] online HunyuanImage-3.0 IT2I (image editing) support (#3410)
Signed-off-by: skf1999 <13234016272@163.com>
commit c83cd4506913e97c915be3484f862d328c332e0e
Author: zdoba <daixinning@gmail.com>
Date: Fri May 8 21:20:03 2026 +0800
[Feat] Add Sequence Parallelism (USP) support for HunyuanVideo 1.5 transformer (#2444)
Signed-off-by: daixinning <daixinning@163.com>
Co-authored-by: daixinning <daixinning@163.com>
commit f8624db93a3832136189e7cc7fec57d9f5c6e076
Author: boatman <109857087+sphinxkkkbc@users.noreply.github.com>
Date: Fri May 8 21:03:47 2026 +0800
[BugFix]Fix default stage config path in voxcpm2 (#3447)
Signed-off-by: sphinxkkkbc <binchengkang8@gmail.com>
Co-authored-by: sphinxkkkbc <binchengkang8@gmail.com>
commit 07fd6afb4b0cc45b7cf2dd7ef95287bd413a5c6c
Author: TaffyOfficial <2587297563@qq.com>
Date: Fri May 8 20:55:23 2026 +0800
[Test][HunyuanImage3] Add e2e offline I2T smoke test (#3332)
Signed-off-by: TaffyOfficial <2324465096@qq.com>
Co-authored-by: TaffyOfficial <2324465096@qq.com>
commit 5b61e7f1f1be0d3691a54541e3048c4bca980203
Author: dengyunyang <584797741@qq.com>
Date: Fri May 8 20:51:46 2026 +0800
[Feature][Hunyuan image 3.0] AR + DIT with kv reuse. (#3346)
Signed-off-by: dengyunyang <584797741@qq.com>
commit ce8a7dfd2da31c45084bab15b867f34a6b2b1ffa
Author: Alex Brooks <albrooks@redhat.com>
Date: Fri May 8 02:05:38 2026 -0600
[Bugfix] Fix Dtype Crashes in SD3 (#2526)
Signed-off-by: Alex Brooks <albrooks@redhat.com>
Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>
commit 50fd3a3f852918a46d721ad52e241abb80457645
Author: Phi-C <chenxjhit@163.com>
Date: Fri May 8 15:22:32 2026 +0800
[Bugfix] Fix the issue where the seed parameter does not take effect when using the OpenAI Python client (#3436)
Signed-off-by: Phi-C <chenxjhit@163.com>
commit 32663f21d5e760d0cfd769110d3e133a3582cfff
Author: lsyyyyy <siyuanlei37@gmail.com>
Date: Fri May 8 15:20:42 2026 +0800
[Feat] support hsdp for Bagel (#3150)
Signed-off-by: siyuan.lei <siyuanlei37@gmail.com>
Signed-off-by: lsyyyyy <siyuanlei37@gmail.com>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Co-authored-by: 汪志鹏 <wangzhipeng628@gmail.com>
commit ea4cf77f56b8a42bd900193d59739a61ff7eec73
Author: Yuchen Jiang <yuchen.yj.jiang@gmail.com>
Date: Thu May 7 23:29:15 2026 -0700
[Hardware] Extend diffusion engine plugin extensibility for out-of-tree hardware backends (#3239)
Signed-off-by: Yuchen Jiang <yucjiang@amazon.com>
Co-authored-by: Yuchen Jiang <yucjiang@amazon.com>
Co-authored-by: Canlin Guo <canlinguosdu@gmail.com>
commit 6f2ad7b403569ac4fa602348b5c90a8ceed15b09
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date: Fri May 8 14:12:23 2026 +0800
[Test] Unify L2/L3 test layout, Buildkite steps, and test helpers (#2556)
Signed-off-by: wangyu <410167048@qq.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: wangyu <53896905+yenuo26@users.noreply.github.com>
Co-authored-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
commit b85833eeec475497e28dd883c6436dcbfd5406de
Author: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Date: Fri May 8 12:21:29 2026 +0800
Update CODEOWNERS feature reviewers (#3378)
Signed-off-by: David Chen <530634352@qq.com>
commit eeb7e698c983d97c9dff8c877376109cdabc71ca
Author: Chenguang Zheng <645327136@qq.com>
Date: Fri May 8 09:28:52 2026 +0800
[Clean] Remove multi-replica Bagel CI and related docs/configs (#3407)
Signed-off-by: Chenguang ZHENG <645327136@qq.com>
commit e6466cf0e621f51432f1c5afe7f23df908862763
Author: Nick Cao <ncao@redhat.com>
Date: Thu May 7 11:23:59 2026 -0400
[Refactor] Replace and ban a few torch.cuda functions in favor of torch.accelerator replacements. (#3365)
Signed-off-by: Nick Cao <ncao@redhat.com>
commit 54277a8dd04088aaf591d3611973c4b547cc002b
Author: Gao Han <hgaoaf@connect.ust.hk>
Date: Thu May 7 16:16:36 2026 +0800
[chore] Update command to download dataset from huggingface-cli to hf (#3403)
Signed-off-by: Gao Han <hgaoaf@connect.ust.hk>
commit 4a24a517abc7769b1399ded594558a3fe8269872
Author: Canlin Guo <canlinguosdu@gmail.com>
Date: Thu May 7 11:54:16 2026 +0800
[BugFix] Probe __dict__ instead of hasattr when patching WanRMS_norm (#3400)
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
commit 5d4b16e6e37fdde8578c17cab1165b9ad5effb9c
Author: Peiqi Yin <60515999+yinpeiqi@users.noreply.github.com>
Date: Wed May 6 20:00:20 2026 -0700
[BugFix] Qwen2.5-Omni streaming code2wav input handling (#3396)
Signed-off-by: yinpeiqi <yinpeiqi809@gmail.com>
commit 3c85ca5536a767361c4a82b65a6d04c3a7d63258
Author: dengyunyang <584797741@qq.com>
Date: Thu May 7 10:03:21 2026 +0800
[bugfix][hunyuaniamge] Fix parameter issue introduced during PR #3107 rebase (#3395)
Signed-off-by: dengyunyang <584797741@qq.com>
commit c483a23debe6fadf9312c78c9c65c129791006d0
Author: Daniel Huang <daniel1.huang@intel.com>
Date: Wed May 6 17:08:25 2026 -0700
[CI Patch] Qwen 2.5 CI Fixes for Intel XPU (#3083)
Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit 2856ff7aeac763e88b095a47d9af503901c50035
Author: Chendi.Xue <chendi.xue@intel.com>
Date: Wed May 6 17:45:30 2026 -0500
[XPU][DOCKER] update dockerfile.xpu after main repo updating to pt2.11 (#3393)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
commit 56ca5d9a8f8779336f6dcdd6f73b0ad020eb77fd
Author: Chen-Yo Sun <chenyo.sun@mistral.ai>
Date: Wed May 6 15:43:01 2026 -0700
[BugFix] Forward CLI --tokenizer to per-stage engine configs (#3120)
Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>
commit 81ab2f98da21817638dbf14c0b0b46e2ad6354b1
Author: Haco <923390377@qq.com>
Date: Thu May 7 06:32:01 2026 +0800
[Config Refactor] Remove legacy Omni CLI arg helper and align tests with nullified parser defaults (#3144)
Signed-off-by: xiaohajiayou <923390377@qq.com>
commit b25ea13cb04c7a56b944da110bceb07a5c2bd6f7
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date: Thu May 7 06:21:00 2026 +0800
[BugFix] Fixed a precision issue with one-word answers. (#3385)
Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: Canlin Guo <961750412@qq.com>
commit 687a44e5c83cedf16882b188ce0b042197fe69c8
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date: Wed May 6 18:17:57 2026 -0400
[Bugfix][OmniVoice] Read voice-cloning fields from OmniTextPrompt in offline path (#3392)
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
commit 19f8f428223fa8acbeabeed9d89609d623374689
Author: Juan Pablo Zuluaga <46724788+JuanPZuluaga@users.noreply.github.com>
Date: Thu May 7 00:16:03 2026 +0200
[Feat][Qwen3-Omni] Add CUDA graph support for Code2Wav decoder (#2376)
Signed-off-by: JuanPZuluaga <juanz9312@gmail.com>
Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
commit 0a8204cb81bf8608b21fcc3a4199e5bde6b1136c
Author: Isotr0py <mozf@mail2.sysu.edu.cn>
Date: Thu May 7 00:37:36 2026 +0800
[Quantization] Redo Z-Image text encoder FP8 online quantization (#3279)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>
commit b8bd75837ebf716854e40997995928128b6cf0db
Author: Lidang Jiang <119769478+Lidang-Jiang@users.noreply.github.com>
Date: Wed May 6 23:43:19 2026 +0800
[Bugfix] Fix missing ANSI colors in CLI logo when output is piped (#1636)
Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
commit 576afb6f53c3e817c1895a3790bacef2470c0fa9
Author: skf <54565339+skf-1999@users.noreply.github.com>
Date: Wed May 6 22:37:39 2026 +0800
[Feature] HunyuanImage-3.0 IT2I (image editing) support (#3107)
Signed-off-by: TaffyOfficial <2324465096@qq.com>
Signed-off-by: zuiho <2324465096@qq.com>
Signed-off-by: skf1999 <13234016272@163.com>
Co-authored-by: TaffyOfficial <2324465096@qq.com>
Co-authored-by: dengyunyang <584797741@qq.com>
Co-authored-by: John Liu BUAA <liukecheng97@gmail.com>
commit 1e8dc841503146bcb2b5af01b36d1eca94dd8e24
Author: Haco <923390377@qq.com>
Date: Wed May 6 22:21:29 2026 +0800
[Bugfix] Fix default diffusion stage config generator drops runtime engine args (#2559)
Signed-off-by: xiaohajiayou <923390377@qq.com>
Co-authored-by: reidliu41 <reidliu41@users.noreply.github.com>
commit 28558cc37471da8258c95aa515363a4a05fce601
Author: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
Date: Wed May 6 21:11:35 2026 +0800
[bugfix][CI] Fix qwen image performance degradation w/ vllm 0.20 & CUDA 13.0 (#3352)
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
commit 282e0b664231275d9a17f56880c99e084028a435
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date: Wed May 6 20:42:56 2026 +0800
[BugFix][CI] Change max_tokens from 150 to 2048 (#3376)
Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>
commit 1e5f288a915494f8ffd9783a4886bbfe9929e65e
Author: Zheng Wengang <zwg0606@gmail.com>
Date: Wed May 6 19:33:18 2026 +0800
[FEAT] support multi-stage deployment (#2396)
Signed-off-by: ZhengWG <zwg0606@gmail.com>
Signed-off-by: Zheng Wengang <zwg0606@gmail.com>
Signed-off-by: Peiqi Yin <60515999+yinpeiqi@users.noreply.github.com>
Signed-off-by: yinpe <11810305@mail.sustech.edu.cn>
Signed-off-by: yinpeiqi <yinpeiqi809@gmail.com>
Co-authored-by: Peiqi Yin <60515999+yinpeiqi@users.noreply.github.com>
Co-authored-by: yinpe <11810305@mail.sustech.edu.cn>
Co-authored-by: yinpeiqi <yinpeiqi809@gmail.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>
Co-authored-by: Chenguang Zheng <645327136@qq.com>
commit e969d2e99f464044b50a59ea618ad9a8edcbfb9f
Author: fywc <hanzheli@kuaishou.com>
Date: Wed May 6 18:11:22 2026 +0800
[Docs] Add LTX-2-T2V and LTX-2-I2V recipes (#3294)
Signed-off-by: hanzheli <hanzheli@kuaishou.com>
Signed-off-by: fywc <hanzheli@kuaishou.com>
commit 6f784cbc50b2ef1489a73b7c89016f5d95c18d7c
Author: Vensen <vensenmu@gmail.com>
Date: Wed May 6 15:35:41 2026 +0700
[Bugfix]: skip faulty pipelines during registry iteration (#2999)
Signed-off-by: vensen <vensenmu@gmail.com>
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>
Co-authored-by: Yueqian Lin <linyueqian@outlook.com>
Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
commit 369a47d5a1874e2a5050c830d5a18398b52446b7
Author: dengyunyang <584797741@qq.com>
Date: Wed May 6 15:31:11 2026 +0800
[Hunyuanimage-3.0] Accuracy fix (#3373)
Signed-off-by: dengyunyang <584797741@qq.com>
commit b076006c3541f1be53329bee8f7e8f91371c5ba0
Author: Haco <923390377@qq.com>
Date: Wed May 6 14:25:20 2026 +0800
[BugFix] Fix Whitelist optimization CI failure (#3290)
Signed-off-by: xiaohajiayou <923390377@qq.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>
Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
commit 9f81a0a4b07087f72284813aebae90d2b6f076ea
Author: fywc <hanzheli@kuaishou.com>
Date: Wed May 6 13:09:39 2026 +0800
[Feat] support HSDP for DreamID-Omni (#3138)
Signed-off-by: hanzheli <hanzheli@kuaishou.com>
Signed-off-by: fywc <hanzheli@kuaishou.com>
commit f36d891ed106aa2a73710a9a706bfc1ddf1a7294
Author: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Date: Wed May 6 12:43:08 2026 +0800
Update WeChat QR code (#3368)
Signed-off-by: David Chen <530634352@qq.com>
commit 354511b805f96dc2ffe8b72755af1764d2318fe1
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date: Tue May 5 23:32:39 2026 -0400
[CI][Bugfix] Relax stable-audio layerwise offload determinism tolerance to 1e-2 (#3371)
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
commit 005621ba7fd92ad0f11369ea26a5f05b56dad9af
Author: Mike Qiu <qdy220091330@gmail.com>
Date: Wed May 6 10:56:23 2026 +0800
Support both "voice" and "speaker" params in chat completions (#3248)
Signed-off-by: Mike_Qiu <qiudayu.qdy@antgroup.com>
Co-authored-by: Mike_Qiu <qiudayu.qdy@antgroup.com>
Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
commit 4e41b7bc2324eed1b2af6a09a40f3d7005271001
Author: Chendi.Xue <chendi.xue@intel.com>
Date: Tue May 5 19:03:48 2026 -0500
Enable Wan2.2-S2V modeling to vLLM-omni (#2751)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit 353ac8a402738865eb2d38fb4b26b456561a50b1
Author: Chendi.Xue <chendi.xue@intel.com>
Date: Tue May 5 18:48:46 2026 -0500
[Enhancement] Offload transformer after switch to transformer-2 (#3224)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
Co-authored-by: Canlin Guo <canlinguosdu@gmail.com>
commit e49fbd8a2d1ec0ad2ea593dfc591091b30f42e82
Author: Ting FU <semmer@live.cn>
Date: Wed May 6 05:47:13 2026 +0800
[Feat] DiffusionEngine Support async batch infer (#2729)
Signed-off-by: Semmer <semmer@live.cn>
Signed-off-by: jader <yjader@foxmail.com>
Signed-off-by: asukaqaq-s <1311722138@qq.com>
Co-authored-by: asukaqaq-s <1311722138@qq.com>
Co-authored-by: jader <yjader@foxmail.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>
commit 2b85474c2347d360df1e245ee6e6b628041536fb
Author: Sy03 <1370724210@qq.com>
Date: Wed May 6 02:57:55 2026 +0800
[Bugfix] Propagate seed to Qwen3-TTS Fast AR sampler (#3350)
Signed-off-by: Sy03 <1370724210@qq.com>
commit 5cf3f7947b84aecb0c908719c7573dcab6b00a06
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date: Tue May 5 13:36:16 2026 -0400
[Docs] Consolidate moss_tts_nano + ming_flash_omni_tts into TTS hub (#3358)
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
commit 44cde33eb5b08b09e1d7a42d320b9ef5aa87f830
Author: TaffyOfficial <2587297563@qq.com>
Date: Wed May 6 00:31:02 2026 +0800
[Bugfix][HunyuanImage3] Fix offline AR garbage output by switching to Instruct chat template (#3243)
Signed-off-by: zuiho <wu15922848573@outlook.com>
Signed-off-by: TaffyOfficial <2324465096@qq.com>
Signed-off-by: zuiho-kai <31877877+zuiho-kai@users.noreply.github.com>
Signed-off-by: zuiho <2324465096@qq.com>
Co-authored-by: TaffyOfficial <2324465096@qq.com>
Co-authored-by: zuiho-kai <31877877+zuiho-kai@users.noreply.github.com>
commit a77c56725d481fc30643dd76c176208d8bd03262
Author: TJian <tunjian.tan@embeddedllm.com>
Date: Tue May 5 23:28:37 2026 +0800
[ROCm] [CI] [Bugfix] 2/N Fix Qwen2.5 and Qwen3 test (#3343)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
commit fde38ca54f218900507745a4da0542ff9cd60a04
Author: Nick Cao <ncao@redhat.com>
Date: Tue May 5 11:26:30 2026 -0400
[Refactor][Qwen3-TTS] Construct Code2Wav decoder natively (#2341)
Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
commit b64ab05d09f0f639c83b8e8a81e45f4b898d6a3f
Author: Juan Pablo Zuluaga <46724788+JuanPZuluaga@users.noreply.github.com>
Date: Tue May 5 16:53:45 2026 +0200
[TTS][SpeakerCacheManager] A global speaker cache manager for Voice Cloning (#2630)
Signed-off-by: JuanPZuluaga <juanz9312@gmal.com>
Co-authored-by: JuanPZuluaga <juanz9312@gmal.com>
commit a0918ce583985ce597d748fa223c0686204a1f5e
Author: boatman <109857087+sphinxkkkbc@users.noreply.github.com>
Date: Tue May 5 22:49:57 2026 +0800
[Feat]add cpu-offload/layerwise-offload for stable-audio-open & fix output inconsistency with same seed (#2909)
Signed-off-by: sphinxkkkbc <binchengkang8@gmail.com>
Co-authored-by: sphinxkkkbc <binchengkang8@gmail.com>
commit f19891e4f1bf1f4b29dec941e722a74069a82c74
Author: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
Date: Tue May 5 20:54:39 2026 +0800
[bugfix][CI] Diffusers backend update (#3096)
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
commit d0b531520d67f6a496afd6c1495aa765e0c78e65
Author: dengyunyang <584797741@qq.com>
Date: Tue May 5 20:05:46 2026 +0800
[Performance CI ]Hunyuan Image 3.0 DIT bench test (#2495)
Signed-off-by: zhou zhuoxin <zhouzhuoxin1508@outlook.com>
Signed-off-by: dengyunyang <584797741@qq.com>
Signed-off-by: TaffyOfficial <2324465096@qq.com>
Signed-off-by: ChenZhao <bounty-hunter@users.noreply.github.com>
Signed-off-by: zuiho <2324465096@qq.com>
Co-authored-by: zhou zhuoxin <zhouzhuoxin1508@outlook.com>
Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>
Co-authored-by: TaffyOfficial <2324465096@qq.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: TaffyOfficial <2587297563@qq.com>
commit 1a93818e54addb7c64ddce05cd53169ba895eb1d
Author: zhanqiuhu <49648934+ZhanqiuHu@users.noreply.github.com>
Date: Tue May 5 04:13:30 2026 -0400
[Bugfix] Add GatedRepoError Report (#1616)
Signed-off-by: Zhanqiu Hu <zhu@redhat.com>
commit bb239fa94959932731e8962fe3a3d18ebbb33fd8
Author: Alex Brooks <albrooks@redhat.com>
Date: Mon May 4 22:09:45 2026 -0600
[Core] Support Async & Sync AutoRegressive Scheduling (#3306)
Signed-off-by: Alex Brooks <albrooks@redhat.com>
commit 703e31fc470bb422bf36fbf6987707ffa6e9ffea
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date: Mon May 4 16:41:29 2026 -0400
[Docs] Consolidate per-model TTS examples into a single hub (#3234)
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
commit e1b30061b637ea37d44a0d8b4b551dbe891f2dfc
Author: Alex Brooks <albrooks@redhat.com>
Date: Mon May 4 10:55:54 2026 -0600
[CI] Use Logprobs Check for Flaky Prefix Cache Test (#3199)
Signed-off-by: Alex Brooks <albrooks@redhat.com>
commit c007d40b147a6130b2cc9f8fc6721f5aaf0179ee
Author: Canlin Guo <canlinguosdu@gmail.com>
Date: Mon May 4 22:38:43 2026 +0800
[NPU] Upgrade to v0.20.0 & align with GPU model runner (#3325)
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
commit c708aecb18491405a80ffabcb0f8aa54062baeb0
Author: Zhang Jian <jianmusings@gmail.com>
Date: Mon May 4 22:24:35 2026 +0800
[Diffusion] [Model] Support AudioX (#2077)
Signed-off-by: Zhang Jian <jianmusings@gmail.com>
Signed-off-by: Zhang <jianmusings@gmail.com>
Signed-off-by: Zhang <zhang.jian@u.nus.edu>
Signed-off-by: Zhang Jian <e0322744@u.nus.edu>
Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
Co-authored-by: Zhang Jian <e0322744@u.nus.edu>
Co-authored-by: 汪志鹏 <wangzhipeng628@gmail.com>
commit c3065115128e5369ca51be5a9c40f68640d55a6e
Author: Lancer <maruixiang6688@gmail.com>
Date: Mon May 4 22:22:06 2026 +0800
[Bugfix] Fix CUBLAS_STATUS_EXECUTION_FAILED when native Flash Attention is available (Wan2.2) (#3327)
Signed-off-by: Lancer <maruixiang6688@gmail.com>
commit 5fc0bfe01eeff89e70befd01eac046f783a71072
Author: Nick Cao <ncao@redhat.com>
Date: Mon May 4 10:17:25 2026 -0400
[Cleanup] Use tokens_input() for TTS prompt construction (#3227)
Signed-off-by: Nick Cao <ncao@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
commit 9a5370367682da57a8d7f50f28c4753c2b7bd2b7
Author: ptarasiewiczNV <104908264+ptarasiewiczNV@users.noreply.github.com>
Date: Mon May 4 15:07:53 2026 +0200
[Bugfix] GLM-Image: route t2i requests through the multimodal processor (#3034) (#3189)
Signed-off-by: Piotr Tarasiewicz <ptarasiewicz@nvidia.com>
commit bb69cbc9f1a1b0379c0a4ba6895822f1b4d2089d
Author: Lancer <maruixiang6688@gmail.com>
Date: Mon May 4 17:34:43 2026 +0800
[Perf] Optimize RMSNorm in Z-Image (#3304)
Signed-off-by: Lancer <maruixiang6688@gmail.com>
commit 33586d845ce62104f68dae8da34a38c2b557618b
Author: Alex Brooks <albrooks@redhat.com>
Date: Sun May 3 23:50:31 2026 -0600
[Bugfix] Use get_open_ports_list for stage ports in OmniMasterServer (#3333)
Signed-off-by: Alex Brooks <albrooks@redhat.com>
commit 3a2679950745df348d0e933acdc58f9758b204e1
Author: bjf-frz <frz123db@gmail.com>
Date: Mon May 4 11:54:33 2026 +0800
[Bugfix] Fix GLM-Image prior token debug logging (#3165)
Signed-off-by: bjf-frz <frz123db@gmail.com>
commit 6bb18af119c797fae31e843748fd14fd1e6b2efb
Author: Yuanheng Zhao <54058983+yuanheng-zhao@users.noreply.github.com>
Date: Sun May 3 22:32:59 2026 +0800
[Chore][Doc] Fix example arg values in Profiler doc (#3309)
Signed-off-by: yuanheng <jonathan.zhaoyh@gmail.com>
commit 21a3035c101976a17fde1f8485659945acc13f9b
Author: Dan250124 <416947747@qq.com>
Date: Sun May 3 22:29:47 2026 +0800
Fixed memory leak and Remove dead code (#3312)
Signed-off-by: Dan250124 <416947747@qq.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
commit e50a0665485178f428e8e37ad7a8cff2fc413484
Author: 汪志鹏 <wangzhipeng628@gmail.com>
Date: Sun May 3 20:02:22 2026 +0800
[BugFix]: Fix async scheduer transfer exceed KV cache (#3318)
Signed-off-by: princepride <wangzhipeng628@gmail.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit 4c06bb09ed9985da1c83c72ad704eb297c989986
Author: Lancer <maruixiang6688@gmail.com>
Date: Sun May 3 15:53:28 2026 +0800
[Feat] ERNIE image model (T2I) (#2861)
Signed-off-by: Lancer <maruixiang6688@gmail.com>
commit 5dabbb2f18f650bdfa3c5cc60195e25d5c87fc83
Author: GuoSheng Feng <146159551+sfiisf@users.noreply.github.com>
Date: Sun May 3 12:36:15 2026 +0800
[Bugfix] Map Qwen3-TTS max_new_tokens to max_tokens (#3217)
Signed-off-by: sfiisf <sfiisf@163.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit 7491b674463362a2725ef728b303c60b691bbb08
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date: Sun May 3 00:35:35 2026 -0400
[Bugfix][MOSS-TTS-Nano] Drop fictional voice presets, require ref_audio (#3192)
Signed-off-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit d49356e4ca95e85613ae68243c838c69eabf5a02
Author: Yuanheng Zhao <54058983+yuanheng-zhao@users.noreply.github.com>
Date: Sun May 3 10:43:49 2026 +0800
[Config Refactor] Migrate Ming and Ming-TTS deploy/pipline configs (#3154)
Signed-off-by: Yuanheng Zhao <jonathan.zhaoyh@gmail.com>
Signed-off-by: yuanheng <jonathan.zhaoyh@gmail.com>
Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
commit 6a837849dd73cfa27a29b476ff03780ee543e41c
Author: Alex Brooks <albrooks@redhat.com>
Date: Sat May 2 18:02:47 2026 -0600
[CI] Fix Bad TP Initialization in Dynin-Omni Tests (#3298)
Signed-off-by: Alex …
Purpose
Related to #3324.
Test Plan
See #3324.
Test Result
See #3324.