
[NPU] Upgrade to v0.20.0 & align with GPU model runner#3325

Merged: gcanlin merged 21 commits into vllm-project:main from gcanlin:align-gpu on May 4, 2026

@gcanlin gcanlin commented May 3, 2026

Purpose

Related to #3324.

Test Plan

See #3324.

Test Result

See #3324.


@gcanlin gcanlin requested a review from hsliuustc0106 as a code owner May 3, 2026 19:29
@gcanlin gcanlin added the "ready" label (label to trigger buildkite CI) May 3, 2026
@gcanlin gcanlin mentioned this pull request May 3, 2026
8 tasks
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Collaborator Author

gcanlin commented May 3, 2026

@hsliuustc0106 @Gaohan123 PTAL. Thanks!

@gcanlin gcanlin added this to the v0.20.0 milestone May 3, 2026
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Comment thread vllm_omni/deploy/qwen3_tts.yaml Outdated
max_num_seqs: 10
gpu_memory_utilization: 0.3
async_scheduling: true
enforce_eager: true
Collaborator Author

@linyueqian @Sy0307 As I recall, we don't use CudaGraphWrapper's graph for the qwen3-tts talker; instead, the code predictor graph is enabled by default. So we should set enforce_eager explicitly in the deploy YAML. Correct me if I'm wrong. Thanks!

Collaborator

I think we should set it to false: stage 0 does run with CUDA graphs. @Sy0307, please correct me if I'm wrong.

Collaborator Author

I've moved it to the NPU-only path for now. We can discuss it later.
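
A minimal sketch of where this thread ended up, assuming a hypothetical `StageArgs` container and `resolve_stage_args` helper (neither is a vllm-omni API; the field names only mirror the YAML keys quoted above). `enforce_eager` is forced to true on the NPU path only, while other platforms keep an assumed default of false so graph capture stays available:

```python
from dataclasses import dataclass


@dataclass
class StageArgs:
    """Hypothetical container mirroring the YAML keys quoted in this thread."""

    max_num_seqs: int = 10
    gpu_memory_utilization: float = 0.3
    async_scheduling: bool = True
    enforce_eager: bool = False  # assumed default: graph capture allowed


def resolve_stage_args(platform: str) -> StageArgs:
    """Return stage args, forcing eager execution only on the NPU path."""
    args = StageArgs()
    if platform.lower() == "npu":
        # Per the thread, the explicit eager setting lives in the NPU-only path.
        args.enforce_eager = True
    return args


if __name__ == "__main__":
    for name in ("cuda", "npu"):
        print(name, resolve_stage_args(name).enforce_eager)
```

Keeping the override in a platform branch leaves the shared defaults untouched, which matches the intent of deferring the GPU-side question to a later discussion.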

@hsliuustc0106
Collaborator

BLOCKING:

  • Test Coverage — No test evidence in PR description. For this substantial NPU upgrade (500+ lines), please paste L3 test results or explain why local tests are sufficient.

  • Documentation — Breaking changes need migration guide:

    • AscendSharedFusedMoE → AscendFusedMoE API change affects custom NPU models
    • qwen3_tts.yaml: enforce_eager changed from unspecified → true (what's the impact?)
    • New PCP+MM deep-copy behavior added to scheduler_output
  • CI Gate — buildkite/vllm-omni check is failing (Build #8779). Please fix CI before proceeding.

@Gaohan123
Collaborator

Please fix CI failure.

@gcanlin
Collaborator Author

gcanlin commented May 4, 2026

> Please fix CI failure.

The failure looks unrelated to this PR.

gcanlin added 3 commits May 4, 2026 15:53
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
@gcanlin
Collaborator Author

gcanlin commented May 4, 2026

@hsliuustc0106 @Gaohan123 Could you help approve this PR? It should only affect the NPU code path.

Collaborator

@linyueqian linyueqian left a comment

lgtm

@gcanlin gcanlin merged commit c007d40 into vllm-project:main May 4, 2026
8 checks passed
clodaghwalsh17 pushed a commit to clodaghwalsh17/nm-vllm-omni-ent that referenced this pull request May 12, 2026
pi314ever added a commit to pi314ever/vllm-omni that referenced this pull request May 19, 2026
commit a3d4ed809d56977eb632e8a63aae1fc090a790e3
Author: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Date:   Wed May 20 00:14:08 2026 +0800

    [Quantization][tools] Add diffusion quantization output comparison tool (#3175)

    Signed-off-by: david6666666 <530634352@qq.com>
    Signed-off-by: David Chen <530634352@qq.com>

commit 3c58868c9a4fb7f0b1754d07738d1f87d3af5dae
Author: dengyunyang <584797741@qq.com>
Date:   Tue May 19 22:22:27 2026 +0800

    [BugFix] fix mult cli timeout with get kv (#3741)

    Signed-off-by: dengyunyang <584797741@qq.com>

commit da5361879395d45d5017fb575a7446cb36774bf4
Author: Shin <shin@yixiaoer.sg>
Date:   Tue May 19 19:56:38 2026 +0800

    [Recipe] Qwen/Qwen-Image-Edit (#3684)

    Signed-off-by: yixiaoer <shin@yixiaoer.sg>

commit 18186db216319684e3e0d2c268d6a0409525fc2e
Author: Schatten <3192396192@qq.com>
Date:   Tue May 19 19:23:45 2026 +0800

    [Cleanup] Remove unused build_base_engine_args after #1115 (#3720)

    Signed-off-by: Schatten <czhengt@qq.com>

commit 14e5baceaf240e78d1a0c5dcc883563db23eb703
Author: Lu <luludachiever@gmail.com>
Date:   Tue May 19 19:19:58 2026 +0800

    [Qwen-Image] Drop unused vision tower from text encoder (#3608)

    Signed-off-by: lulugoodcoder <luludachiever@gmail.com>
    Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>

commit 2af2a50e0e2981ec2eef32e704f5a66c3d451c95
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date:   Tue May 19 15:22:02 2026 +0800

    [CI] improve Buildkite testcase statistics reports (#3543)

    Signed-off-by: wangyu <410167048@qq.com>

commit bd83ac9b4b6f7f3a64d13a1695d5a51e73164075
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date:   Tue May 19 14:28:47 2026 +0800

    [CI] invalid_param reliability suite and weekly http_invalid jobs (#3652)

    Signed-off-by: wangyu <410167048@qq.com>

commit e277feacaf859c1aa3f2f7354d6fc396cf06ba5d
Author: chickeyton <ngton2014@gmail.com>
Date:   Tue May 19 12:08:20 2026 +0800

    [large-scale-serving] Integrate OmniCoordinator into stage engine pipeline (#3569)

    Signed-off-by: chickeyton <ngton2014@gmail.com>
    Signed-off-by: herotai214 <herotai214@gmail.com>
    Co-authored-by: herotai214 <herotai214@gmail.com>

commit ca9fd0b71ce04fa6283154c0ee7f32fcfc2eaf11
Author: JiaHong <2360655509@qq.com>
Date:   Tue May 19 11:52:41 2026 +0800

    Reject non-positive Flux2 Klein inference steps (#3717)

    Signed-off-by: MmMaiIIi <2360655509@qq.com>

commit 3ac739817f5afce9b5a291c2eddaccf5c1927cab
Author: JiaHong <2360655509@qq.com>
Date:   Tue May 19 11:30:54 2026 +0800

    [Bugfix] Reject empty prompts in Flux2 Klein diffusion pipeline (#3711)

    Signed-off-by: MmMaiIIi <2360655509@qq.com>
    Co-authored-by: WeiQing Chen <40507679+david6666666@users.noreply.github.com>

commit 1fa734419ec6b578537aa5267c0d42f006499201
Author: bjf-frz <frz123db@gmail.com>
Date:   Tue May 19 11:30:33 2026 +0800

    [Refactor]Rename diffusion benchmark backend to endpoint (#3137)

    Signed-off-by: bjf-frz <frz123db@gmail.com>
    Signed-off-by: bjfwhite <baijingfan1@huawei.com>
    Co-authored-by: bjfwhite <baijingfan1@huawei.com>

commit 2c6b1bb0c0b814aa562770737e8d0a6dd7c848f7
Author: fan2956 <zhoufan53@huawei.com>
Date:   Tue May 19 10:24:27 2026 +0800

    [Bugfix] Fix hunyuanimage3 dit quant storageshape mismatch error (#3694)

    Signed-off-by: fan2956 <zhoufan53@huawei.com>

commit e2ed1c457455f8460182873111882b46829dc2df
Author: Daniel Huang <daniel1.huang@intel.com>
Date:   Mon May 18 19:19:16 2026 -0700

    Disable sampler kernel for XPU test (#3718)

    Signed-off-by: Daniel Huang <daniel1.huang@intel.com>

commit 89f8819525589141fd825ce4f0d1e1be9cf3660b
Author: Rustam Khadipash <16683750+hadipash@users.noreply.github.com>
Date:   Tue May 19 09:42:17 2026 +0800

    [Feature] Add support for Pipeline Parallel and integrate it into Wan 2.2 (#2322)

    Signed-off-by: Rustam Khadipash <16683750+hadipash@users.noreply.github.com>

commit 475a4002b0136235b4feb22d4a1e4b221ca5e112
Author: Chendi.Xue <chendi.xue@intel.com>
Date:   Mon May 18 18:58:15 2026 -0500

    [XPU] set flash_attn as default diffusion attn backend and fix k_len for cross_attn (#3525)

    Signed-off-by: Chendi Xue <chendi.xue@intel.com>

commit ab59673f21d804729557ac53d4c839e6d7353afb
Author: Sy03 <1370724210@qq.com>
Date:   Tue May 19 02:59:23 2026 +0800

    [Bugfix][Qwen3-Omni] Handle short Code2Wav chunk outputs (#3687)

    Signed-off-by: Sy03 <1370724210@qq.com>
    Co-authored-by: amy-why-3459 <wuhaiyan17@huawei.com>

commit 821286794f1afaac7d44d7a75371e87527b30d22
Author: lyj-jjj <liuyingjun5@huawei.com>
Date:   Tue May 19 00:35:30 2026 +0800

    [HY-Imgae3.0] support hunyuan image3 dit fa-fp8 on npu (#3540)

    Signed-off-by: lyj-jjj <liuyingjun5@huawei.com>
    Co-authored-by: Cursor <cursoragent@cursor.com>

commit 309e5c38c665b91a9818f03dd5c515878caf0e53
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date:   Mon May 18 21:25:37 2026 +0800

    [BugFix][CI]Fixing occasional CI failures (#3623)

    Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>

commit f4115bd7716e1d29c8233bc8a69125dfdd35b3d1
Author: Ding Zuhao <e1583181@u.nus.edu>
Date:   Mon May 18 21:12:46 2026 +0800

    [Bugfix] Fix SenseNova U1 broken import after SupportsModuleOffload  (#3691)

    Signed-off-by: nussejzz <nussejzz@users.noreply.github.com>
    Co-authored-by: nussejzz <nussejzz@users.noreply.github.com>

commit dbc589dbca09df88714ba433ee241c3aa6690235
Author: Lancer <maruixiang6688@gmail.com>
Date:   Mon May 18 17:23:40 2026 +0800

    [Bugfix] fix diffusion quantization benchmarking for Omni outputs (#3653)

    Signed-off-by: Lancer <maruixiang6688@gmail.com>

commit 990566aef10c69ac1fa3073437be0a3333b3dc15
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date:   Mon May 18 05:18:18 2026 -0400

    [Bugfix][TTS] Drop meaningless TTFT from speech-endpoint benchmarks (#3674)

    Signed-off-by: Yueqian Lin <linyueqian@outlook.com>

commit 6d37e77fb2d9b9f4625a022ccffcafdff3134ef7
Author: Chendi.Xue <chendi.xue@intel.com>
Date:   Sun May 17 22:06:56 2026 -0500

    [XPU]  update dockerfile and CI to 0.21.0 (#3675)

    Signed-off-by: Chendi Xue <chendi.xue@intel.com>

commit 4ba8e14981bb80a1835a7956357ebf32011b0c27
Author: wuhang <wuhang6@huawei.com>
Date:   Mon May 18 08:54:57 2026 +0800

    Fix diffusion engine cleanup lifecycle (#3494)

    Signed-off-by: wuhang <wuhang6@huawei.com>
    Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit c99df1ebd9f8007639507a6ba6e5dea09e0abd9c
Author: Sy03 <1370724210@qq.com>
Date:   Mon May 18 04:58:59 2026 +0800

    [TTS][Perf] Optimize Qwen3-TTS high-concurrency serving (#3662)

    Signed-off-by: Sy03 <1370724210@qq.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
    Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>

commit 0a395f9de11469255d8347a1ce48df56fef74888
Author: bjf-frz <frz123db@gmail.com>
Date:   Mon May 18 00:59:46 2026 +0800

    [SKILL]Add diffusion perf skill (#3461)

    Signed-off-by: bjf-frz <frz123db@gmail.com>

commit c0e132d973276e5c1213bd03d930718ff056fd57
Author: Hongsheng Liu <liuhongsheng4@huawei.com>
Date:   Mon May 18 00:02:34 2026 +0800

    [Doc] Reorganize available recipes into a table (#3671)

    Signed-off-by: hsliu <liuhongsheng4@huawei.com>
    Co-authored-by: deepseek-v4-pro <noreply@anthropic.com>

commit 471ddfe025db12bf6f117eb6dd66c40343849c21
Author: Hongsheng Liu <liuhongsheng4@huawei.com>
Date:   Sun May 17 23:36:46 2026 +0800

    [Doc] Simplify template example subtitle (#3669)

    Signed-off-by: hsliu <liuhongsheng4@huawei.com>
    Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

commit 8cfc9179e6545ee45c90be36cfdba43afcec788e
Author: Mike Qiu <qdy220091330@gmail.com>
Date:   Sun May 17 23:30:24 2026 +0800

    Fix reasoning_parser crash: reconstruct StructuredOutputsConfig from dict (#2845)

    Signed-off-by: Mike_Qiu <qiudayu.qdy@antgroup.com>
    Co-authored-by: Mike_Qiu <qiudayu.qdy@antgroup.com>
    Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit 0da9ffdb0d3023482e1e90d6563a3e379ed6a160
Author: Mike Qiu <qdy220091330@gmail.com>
Date:   Sun May 17 23:05:34 2026 +0800

    Fix output finish reason issue for audio chunk in stream mode (#2849)

    Signed-off-by: Mike_Qiu <qiudayu.qdy@antgroup.com>
    Co-authored-by: Mike_Qiu <qiudayu.qdy@antgroup.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit 4e880537501d2b2935c97ddfe3dfdf2679d3e2dc
Author: TaffyOfficial <2587297563@qq.com>
Date:   Sun May 17 22:42:59 2026 +0800

    [BugFix][HunyuanImage3] Set MRoPE dynamic_arg_dims so graph mode can compile (#3630)

    Signed-off-by: TaffyOfficial <2324465096@qq.com>
    Co-authored-by: TaffyOfficial <2324465096@qq.com>
    Co-authored-by: Codex <codex@openai.com>

commit 768943b8791abf30a1cc7b1cf82cbbad5d5ee247
Author: Reid <61492567+reidliu41@users.noreply.github.com>
Date:   Sun May 17 22:10:26 2026 +0800

    [Frontend]Handle audio generate engine errors consistently (#3316)

    Signed-off-by: reidliu41 <reid201711@gmail.com>
    Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>

commit 220db62b3f6a7877e0eb39f3cb8f15ec219d4136
Author: Yuanheng Zhao <54058983+yuanheng-zhao@users.noreply.github.com>
Date:   Sun May 17 21:58:44 2026 +0800

    [Bugfix] Adapt LTX-2 connector arg with diffusers 0.38.0 (#3661)

    Signed-off-by: Yuanheng Zhao <jonathan.zhaoyh@gmail.com>

commit 5549b7f44a0bfa75c294d397f8742208e253c3d1
Author: Kevin H. Luu <khluu000@gmail.com>
Date:   Sun May 17 04:02:28 2026 -0700

    [CI/Build] Enable twine upload to PyPI (#3667)

    Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

commit bc26cad19a0443cc4f444d5bb843e55c1ac3e2f4
Author: Kevin H. Luu <khluu000@gmail.com>
Date:   Sun May 17 03:40:14 2026 -0700

    [CI/Build] Unify release pipeline with NIGHTLY=1 option, add x86_64/aarch64 image builds (#3428)

    Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

commit 9c5e35f485a7d2037330ea74535e8373c739f350
Author: Alex Brooks <albrooks@redhat.com>
Date:   Sat May 16 17:33:03 2026 -0600

    [Config Refactor] Support Recursive Merging for Engine Args (#3009)

    Signed-off-by: Alex Brooks <albrooks@redhat.com>
    Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com>
    Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit a64ebf103b35fa48f42accf444e4f027c992009e
Author: Yuanheng Zhao <54058983+yuanheng-zhao@users.noreply.github.com>
Date:   Sun May 17 07:32:34 2026 +0800

    [Refactor] Migrate and clean up TTS configs: CosyVoice3, OmniVoice, VoxCPM (#3338)

    Signed-off-by: yuanheng <jonathan.zhaoyh@gmail.com>
    Signed-off-by: Yuanheng Zhao <jonathan.zhaoyh@gmail.com>

commit c08959ee040281ecd310293adeb82067fa2e5932
Author: TJian <tunjian.tan@embeddedllm.com>
Date:   Sat May 16 23:15:59 2026 +0800

    [ROCm] [CI] [Bugfix] Upgrade vllm version to v0.21.0 and ROCm 7.2.2 (#3659)

    Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

commit c5ac295e3c9f0b3425843b15964824a89cd271ae
Author: rongfu.leng <lenronfu@gmail.com>
Date:   Sat May 16 21:11:34 2026 +0800

    [Feat] Add helios support cache dit (#3470)

    Signed-off-by: rongfu.leng <lenronfu@gmail.com>

commit ea35a0cc4a35dcdb674af76d8279c084a6aaa181
Author: Zeng Chuang <zengchuang3@huawei.com>
Date:   Sat May 16 20:51:31 2026 +0800

    [Bugfix]update process name for dit stage (#3602)

    Signed-off-by: zengchuang <zengchuang3@huawei.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit 0f4853ff86f3fd840f9404535c89961a48eb13e2
Author: wuhang <wuhang6@huawei.com>
Date:   Sat May 16 20:50:29 2026 +0800

    [Bugfix] Support diffusion worker dead detect when use inline engine (#3214)

    Signed-off-by: wuhang <wuhang6@huawei.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit b5e163cfcabbfdea73469c014766d104d2231e10
Author: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Date:   Sat May 16 20:08:51 2026 +0800

    [CI][Accuracy] Add Qwen-Image-2512 Qwen-Image-Edit-2511 pixel accuracy tests (#3502)

    Signed-off-by: david6666666 <530634352@qq.com>

commit d647e7e4cfa3c50bed50cc07e465365bc9627f0b
Author: dengyunyang <584797741@qq.com>
Date:   Sat May 16 19:35:05 2026 +0800

    [Hunyuanimage 3.0] hunyuan accuracy test (#3655)

    Signed-off-by: dengyunyang <584797741@qq.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit 33220b1e39c51d87b982dd1d5e6abd8e20aa8b5a
Author: Nick Cao <ncao@redhat.com>
Date:   Sat May 16 07:34:04 2026 -0400

    [BugFix] Finish async_chunk requests without pad-token injection (#3613)

    Signed-off-by: Nick Cao <ncao@redhat.com>
    Co-authored-by: Claude <noreply@anthropic.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit eb4e60ee64f2e5cd785b43fdd3af9ff7822b5a4f
Author: Zhou Taichang <tzhouam@connect.ust.hk>
Date:   Sat May 16 18:18:43 2026 +0800

    [Rebase] Rebase to vllm v0.21.0 (#3530)

    Signed-off-by: tzhouam <tzhouam@connect.ust.hk>
    Signed-off-by: Zhou Taichang <tzhouam@connect.ust.hk>
    Signed-off-by: NumberWan <wantszkin2003@gmail.com>
    Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
    Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
    Signed-off-by: Dnoob <dxpouo@gmail.com>
    Signed-off-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
    Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>
    Signed-off-by: rein yang <ruiruyang2@gmail.com>
    Signed-off-by: Nick Cao <ncao@redhat.com>
    Signed-off-by: zhumingjue <zhumingjue@huawei.com>
    Signed-off-by: Ricardo Noriega De Soto <rnoriega@redhat.com>
    Signed-off-by: lyj-jjj <liuyingjun5@huawei.com>
    Signed-off-by: gcanlin <canlinguosdu@gmail.com>
    Signed-off-by: wangyu <410167048@qq.com>
    Signed-off-by: weizhoublue <weizhoublue@github.com>
    Signed-off-by: weizhou.lan@daocloud.io <weizhou.lan@daocloud.io>
    Signed-off-by: dengyunyang <584797741@qq.com>
    Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
    Signed-off-by: David Chen <530634352@qq.com>
    Signed-off-by: Jie Liu <33612777+keeper-jie@users.noreply.github.com>
    Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
    Signed-off-by: princepride <wangzhipeng628@gmail.com>
    Signed-off-by: natureofnature <wzliu@connect.hku.hk>
    Signed-off-by: bjf-frz <frz123db@gmail.com>
    Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
    Signed-off-by: KexiongYu <yukexiong1@huawei.com>
    Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
    Signed-off-by: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
    Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
    Co-authored-by: NumberWan <wantszkin2003@gmail.com>
    Co-authored-by: dsinghvi <divyanshsinghvi@gmail.com>
    Co-authored-by: Dnoob <dxpouo@gmail.com>
    Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
    Co-authored-by: knlnguyen1802 <knlnguyen1802@gmail.com>
    Co-authored-by: Samit <285365963@qq.com>
    Co-authored-by: rein yang <73573651+R2-Y@users.noreply.github.com>
    Co-authored-by: Nick Cao <ncao@redhat.com>
    Co-authored-by: zhumingjue138 <zhumingjue@huawei.com>
    Co-authored-by: Ricardo Noriega <rnoriega@redhat.com>
    Co-authored-by: lyj-jjj <liuyingjun5@huawei.com>
    Co-authored-by: Cursor <cursoragent@cursor.com>
    Co-authored-by: gcanlin <canlinguosdu@gmail.com>
    Co-authored-by: wangyu <53896905+yenuo26@users.noreply.github.com>
    Co-authored-by: weizhoublue <45163302+weizhoublue@users.noreply.github.com>
    Co-authored-by: weizhoublue <weizhoublue@github.com>
    Co-authored-by: dengyunyang <584797741@qq.com>
    Co-authored-by: 汪志鹏 <wangzhipeng628@gmail.com>
    Co-authored-by: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
    Co-authored-by: Jie Liu <33612777+keeper-jie@users.noreply.github.com>
    Co-authored-by: Yueqian Lin <linyueqian@outlook.com>
    Co-authored-by: NATURE <wzliu@connect.hku.hk>
    Co-authored-by: bjf-frz <frz123db@gmail.com>
    Co-authored-by: amy-why-3459 <wuhaiyan17@huawei.com>
    Co-authored-by: Y. Fisher <yukexiong1@huawei.com>
    Co-authored-by: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
    Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

commit 5e1986206f7381757d51c507dcbd54b553889fb1
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date:   Fri May 15 11:35:44 2026 -0400

    [CI] Replace c=128 perf cell with c=16; loosen new-cell baselines (#3637)

    Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit d1c65bdffaa21799d6a4dc34086ceb68dea9fe9d
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date:   Fri May 15 22:50:03 2026 +0800

    [BugFix] fix ci (#3650)

    Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>

commit d18168ccbbb1b3735b43d25e712ad248e9a29ffa
Author: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
Date:   Fri May 15 17:55:09 2026 +0800

    [bugfix] Fix diffusers backend input bug after #2913 (#3644)

    Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
    Signed-off-by: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
    Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

commit 779bf3118bf642fbd8bb35f9416b443829f6c604
Author: Y. Fisher <yukexiong1@huawei.com>
Date:   Fri May 15 17:35:22 2026 +0800

    [Bugfix] fix compatibility of _hunyuan_image3_unpack_packed_topk between vllm / vllm ascend (#3640)

    Signed-off-by: KexiongYu <yukexiong1@huawei.com>

commit e7ee5de09f2fb32debadf4b42f193baf27042c69
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date:   Fri May 15 17:07:42 2026 +0800

    [BugFix] Fix the issue of thinker requests being preempted, causing shape mismatch. (#3147)

    Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>

commit 440c718d2a6beb052a18c23163112a5ed5413d6d
Author: bjf-frz <frz123db@gmail.com>
Date:   Fri May 15 16:42:18 2026 +0800

    [Bugfix]Fix multimodal cache routing for AR replicas (#3605)

    Signed-off-by: bjf-frz <frz123db@gmail.com>

commit 82a0b3a46763d8be64c3613265297e2a2271faa4
Author: NATURE <wzliu@connect.hku.hk>
Date:   Fri May 15 14:14:08 2026 +0800

    [2/5] [core]refactor communication layer: PR 2 of 5 Qwen3 Omni non async  (#2677)

    Signed-off-by: natureofnature <wzliu@connect.hku.hk>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit c7178d89bb7a70817f239febc84c3b21a714dae7
Author: 汪志鹏 <wangzhipeng628@gmail.com>
Date:   Fri May 15 13:28:40 2026 +0800

    [Bugfix] UnspecifiedOmniPlatform.get_device_count returns 0 instead o… (#3636)

    Signed-off-by: princepride <wangzhipeng628@gmail.com>

commit fdb0efea946c35d2ee68f57274dadd0a616e561e
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date:   Fri May 15 11:50:22 2026 +0800

    [CI] add cuda marker to Diffusion X2V function pytest (#3625)

    Signed-off-by: wangyu <410167048@qq.com>

commit 90f5b3c3a10b8c6032bfb82d6e112ec6d70b761a
Author: Jie Liu <33612777+keeper-jie@users.noreply.github.com>
Date:   Fri May 15 11:43:05 2026 +0800

    Update streaming_speech_client.py to solve Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice voice problem (#3380)

    Signed-off-by: Jie Liu <33612777+keeper-jie@users.noreply.github.com>
    Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
    Co-authored-by: Yueqian Lin <linyueqian@outlook.com>

commit bbc00f9f86e5bf54633737bedb7964ea4003e37d
Author: lyj-jjj <liuyingjun5@huawei.com>
Date:   Fri May 15 11:22:59 2026 +0800

    [BugFix] fix(omni): isolate diffusion KV-cache dtype from vLLM --kv-cache-dtype #3585 (#3596)

    Signed-off-by: lyj-jjj <liuyingjun5@huawei.com>
    Co-authored-by: Cursor <cursoragent@cursor.com>

commit adb2291c2770a66a8658718780ff3b597591dc6d
Author: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Date:   Fri May 15 09:38:47 2026 +0800

    Update WeChat group QR code (#3624)

    Signed-off-by: David Chen <530634352@qq.com>

commit 4f13b871f949d29da952d7582a21d982330f4213
Author: Canlin Guo <canlinguosdu@gmail.com>
Date:   Thu May 14 20:37:51 2026 +0800

    [CI] Add Qwen3-TTS tests for ready tag (#3600)

    Signed-off-by: gcanlin <canlinguosdu@gmail.com>

commit 94254e015f3164a54ef66c042b8bce1a1abee34b
Author: dengyunyang <584797741@qq.com>
Date:   Thu May 14 20:07:12 2026 +0800

    [BugFix] fix shm connector (#3583)

    Signed-off-by: dengyunyang <584797741@qq.com>
    Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
    Co-authored-by: 汪志鹏 <wangzhipeng628@gmail.com>

commit f7161b07d0126fc933c89eb057113cec089fc5d3
Author: bjf-frz <frz123db@gmail.com>
Date:   Thu May 14 17:27:59 2026 +0800

    [Bugfix]Allow HunyuanImage3 AR sampler batching (#3590)

    Signed-off-by: bjf-frz <frz123db@gmail.com>
    Co-authored-by: Canlin Guo <canlinguosdu@gmail.com>

commit c0b7509f0789c79199f55e13dc7320ab22d95e97
Author: Hongsheng Liu <liuhongsheng4@huawei.com>
Date:   Thu May 14 17:23:55 2026 +0800

    update v0.20.0 readme (#3594)

    Signed-off-by: hsliu_ustc <hsliu_ustc@noreply.gitcode.com>
    Co-authored-by: hsliu_ustc <hsliu_ustc@noreply.gitcode.com>

commit 3f63aaf982bcba327b7e5150faf6ccc242f84eaa
Author: TaffyOfficial <2587297563@qq.com>
Date:   Thu May 14 16:58:40 2026 +0800

    [Feature] HunyuanImage-3.0 IT2I: multi-image input + prompt API cleanup (#3444)

    Signed-off-by: TaffyOfficial <2324465096@qq.com>
    Signed-off-by: TaffyOfficial <wu15922848573@outlook.com>
    Signed-off-by: skf1999 <13234016272@163.com>
    Signed-off-by: zuiho <2324465096@qq.com>
    Signed-off-by: Claude Code <noreply@anthropic.com>
    Signed-off-by: zuiho <wu15922848573@outlook.com>
    Signed-off-by: TaffyOfficial <2587297563@qq.com>
    Co-authored-by: TaffyOfficial <2324465096@qq.com>
    Co-authored-by: TaffyOfficial <wu15922848573@outlook.com>
    Co-authored-by: skf1999 <13234016272@163.com>

commit c4f859bf56ef294e0e70b7ea6befdfc5b3f0880b
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date:   Thu May 14 02:26:59 2026 -0400

    [CI] Harden Qwen3-TTS perf nightly: enable Base voice_clone, add c=64/128, 2-GPU split (#3491)

    Signed-off-by: Yueqian Lin <linyueqian@outlook.com>

commit 0d9d57acd90f6b6418cf8ccc91c76991a84103e6
Author: wuhang <wuhang6@huawei.com>
Date:   Thu May 14 11:25:55 2026 +0800

    [Entrypoint][Refactor] Make field type hint more concrete (#3139)

    Signed-off-by: wuhang <wuhang6@huawei.com>

commit 51b4b1131e2811942d16fe984eaa1890a6112e44
Author: Y. Fisher <yukexiong1@huawei.com>
Date:   Thu May 14 11:17:15 2026 +0800

    [Bugfix]: Fix online serving failure when using deploy config (#3537)

    Signed-off-by: KexiongYu <yukexiong1@huawei.com>
    Signed-off-by: Y. Fisher <yukexiong1@huawei.com>

commit e818dba016c390b7a85afb2cb941af8f2928fe3f
Author: zhumingjue138 <zhumingjue@huawei.com>
Date:   Thu May 14 10:47:14 2026 +0800

    [Test] Add stability tests for HunyuanImage-3-Instruct (#3504)

    Signed-off-by: zhumingjue <zhumingjue@huawei.com>

commit 754d2e52fcbf3230b015457595991a1e6c9c2f6b
Author: Alex Brooks <albrooks@redhat.com>
Date:   Wed May 13 14:20:18 2026 -0600

    [BugFix] Refresh TeaCache when num_inference_steps=None (#2240)

    Signed-off-by: Alex Brooks <albrooks@redhat.com>

commit 9de9d1f7b593e5fc8884bcdd3456e062950f076f
Author: vraiti <vraiti@redhat.com>
Date:   Wed May 13 15:33:55 2026 -0400

    [Model] Add TP-aware MistralEncoder for FLUX.2-dev TP (#2465)

    Signed-off-by: vraiti <vraiti@redhat.com>
    Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

commit efd955674b608833533626fec21dfb7bacc8f009
Author: dengyunyang <584797741@qq.com>
Date:   Wed May 13 22:40:18 2026 +0800

    [Bugfix][HunyuanImage3.0] Fix KV reuse compatibility in SP scenarios (#3546)

    Signed-off-by: dengyunyang <584797741@qq.com>

commit 4d3eed152a697412c966d2ac97e0009b92490b5e
Author: Y. Fisher <yukexiong1@huawei.com>
Date:   Wed May 13 22:22:43 2026 +0800

    [Feat][Config] Support additional_config for diffusion worker (#3020)

    Signed-off-by: KexiongYu <yukexiong1@huawei.com>
    Signed-off-by: Y. Fisher <yukexiong1@huawei.com>

commit 16a84b29d51165a47152c540babce56392dfdc0e
Author: Zeng Chuang <zengchuang1005@gmail.com>
Date:   Wed May 13 22:10:35 2026 +0800

    [Bugfix] Add bot_task option of think_recaption for hunyuanimage3 it2i (#3551)

    Signed-off-by: zengchuang <zengchuang3@huawei.com>

commit b9cb57b6310de8bbc85a278e165ddf0690a5667c
Author: TJian <tunjian.tan@embeddedllm.com>
Date:   Wed May 13 20:50:57 2026 +0800

    [ROCm] Bugfix wan22 (#3463)

    Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

commit 2e8e3057bcefb9edcc62b3370914ed0e1352e44e
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date:   Wed May 13 17:54:21 2026 +0800

    [skip ci][Tests] Splitting Qwen3-omni's performance test cases (#3501)

    Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>

commit a715abd4474f8c31084692b2637885088193d8c1
Author: hxhhhlalala <hyh_hh@163.com>
Date:   Wed May 13 17:14:42 2026 +0800

    [NPU][Quant] Add W8A8 MXFP8 online/offline quantization support for Wan2.2 T2V / I2V / TI2V inference on Ascend NPU (#3140)

    Signed-off-by: hyh_hh <huyinghong1@huawei.com>
    Co-authored-by: hyh_hh <huyinghong1@huawei.com>

commit b6bdc5997f73c85e3544f4e21c28049119fa7b63
Author: weizhoublue <45163302+weizhoublue@users.noreply.github.com>
Date:   Wed May 13 16:22:48 2026 +0800

    Fix: NPU AR model runner prefix cache key flattening (#3568)

    Signed-off-by: weizhoublue <weizhoublue@github.com>
    Signed-off-by: weizhou.lan@daocloud.io <weizhou.lan@daocloud.io>
    Co-authored-by: weizhoublue <weizhoublue@github.com>

commit 631251a1f8573fc1fcc325041bf1b3bf347226be
Author: knlnguyen1802 <knlnguyen1802@gmail.com>
Date:   Wed May 13 15:31:48 2026 +0800

    [Bugfix, rl] Diffusion worker SIGKILL under Ray actor (exitcode -9) (#3533)

    Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>
    Co-authored-by: Samit <285365963@qq.com>

commit b6f29ee6145bf353b084557b80792dee2d5a7149
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date:   Wed May 13 14:25:56 2026 +0800

    [CI][Bugfix] skip fp8 Z-Image quality gate (#3531) and add torchdiffeq dev extra (#3563)

    Signed-off-by: wangyu <410167048@qq.com>

commit 0ab1d3005694473a4684959c809d3fd84a00ae69
Author: Canlin Guo <canlinguosdu@gmail.com>
Date:   Wed May 13 12:58:17 2026 +0800

    [CI][Test] Add NPU nightly tests (#3480)

    Signed-off-by: gcanlin <canlinguosdu@gmail.com>

commit 56ca7dd612bbd2298426ce34147845d29197e0b4
Author: lyj-jjj <liuyingjun5@huawei.com>
Date:   Wed May 13 11:46:00 2026 +0800

    support online FP8 quantization for FA on NPU #2236 (#2640)

    Signed-off-by: lyj-jjj <liuyingjun5@huawei.com>
    Signed-off-by: gcanlin <canlinguosdu@gmail.com>
    Co-authored-by: Cursor <cursoragent@cursor.com>
    Co-authored-by: gcanlin <canlinguosdu@gmail.com>

commit 83bbe39d39bb6c6db9278ba5e9bd3aee37ce0040
Author: Ricardo Noriega <rnoriega@redhat.com>
Date:   Wed May 13 04:25:02 2026 +0200

    Bump diffusers minimum version to >=0.38.0 (#3349)

    Signed-off-by: Ricardo Noriega De Soto <rnoriega@redhat.com>

commit 5313cf6d4800ec9dc438686f7e32eeee48bbb022
Author: Nick Cao <ncao@redhat.com>
Date:   Tue May 12 22:08:23 2026 -0400

    [Bugfix] Fix omni processing test for non-multimodal talker stage (#3559)

    Signed-off-by: Nick Cao <ncao@redhat.com>
    Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

commit c167b9d69190299070159e053a0be0a6db7f2cc1
Author: zhumingjue138 <zhumingjue@huawei.com>
Date:   Wed May 13 09:15:20 2026 +0800

    [Bugfix] Fix the issue where the qwen3-omni model long-term stability test sometimes gets stuck without sending requests. (#3468)

    Signed-off-by: zhumingjue <zhumingjue@huawei.com>

commit dca369d448cd714d36bfaab7d54ab9e3449de306
Author: Nick Cao <ncao@redhat.com>
Date:   Tue May 12 11:24:13 2026 -0400

    [Perf] Remove dead audio_tower and visual from Qwen2.5-Omni talker stage (#3425)

    Signed-off-by: Nick Cao <ncao@redhat.com>
    Co-authored-by: Claude <noreply@anthropic.com>

commit f4b28f239848db9f12121e1d760ef204b128e0be
Author: rein yang <73573651+R2-Y@users.noreply.github.com>
Date:   Tue May 12 22:10:10 2026 +0800

    [CI] update daily omni min accuracy (#3536)

    Signed-off-by: rein yang <ruiruyang2@gmail.com>

commit aa1184d737f2e908f1467b04e13b8df3aae12e53
Author: knlnguyen1802 <knlnguyen1802@gmail.com>
Date:   Tue May 12 14:55:30 2026 +0800

    [bugfix, rl] Fix race condition bug on async running for diffusion model (#3379)

    Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>
    Co-authored-by: Samit <285365963@qq.com>

commit d7ea5d5979c78fb697e7497dd1aed75bf886a9cf
Author: Dnoob <dxpouo@gmail.com>
Date:   Tue May 12 14:39:58 2026 +0800

    [New Model] Add support for tencent/Covo-Audio-Chat (#2293)

    Signed-off-by: Dnoob <dxpouo@gmail.com>
    Signed-off-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
    Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit e40076e872b0ac4a458ca3abb069bfbf9806935d
Author: dsinghvi <divyanshsinghvi@gmail.com>
Date:   Tue May 12 12:00:13 2026 +0530

    [Refactor] msgspec standardisation for data entry key names and improved type checks (#3149)

    Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
    Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>

commit 9a60e11e3dbac99e2414b8c0eaa747119f9c61bd
Author: NumberWan <wantszkin2003@gmail.com>
Date:   Tue May 12 13:59:04 2026 +0800

    [Nightly CI] Remove TP case (#3534)

    Signed-off-by: NumberWan <wantszkin2003@gmail.com>

commit fe72d078caa30212244ad7d023fcf7af9531c176
Author: Saad Al-Tohamy <92796871+saadaltohamy@users.noreply.github.com>
Date:   Tue May 12 08:10:04 2026 +0300

    [FIX] Ensure `extra_params` are correctly merged into sampling params in `_create_diffusion_speech()` (#3320)

    Signed-off-by: saadaltohamy <saad_altohamy@yahoo.com>
    Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>

commit 0d91fbbbb7fe8b0e3b59e35403d2d2123969ae3f
Author: dengyunyang <584797741@qq.com>
Date:   Tue May 12 12:13:30 2026 +0800

    [Bugfix] Align the AR and DiT prompt formatting across both online and offline modes. (#3516)

    Signed-off-by: dengyunyang <584797741@qq.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit ce621be29bdce423f4d6fd2e248aa20f443135ca
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date:   Tue May 12 11:56:10 2026 +0800

    [BugFix] Modify the splicing method of streaming audio output. (#3438)

    Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>

commit dd5626e6079ddfeea5430bd597fddc868e83d99b
Author: bjf-frz <frz123db@gmail.com>
Date:   Tue May 12 10:44:20 2026 +0800

    [Recipes]update Wan2.2-I2V gpu part (#3271)

    Signed-off-by: bjf-frz <frz123db@gmail.com>

commit ac5fbed6c05115d2605bfc71922eddc69471e14d
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date:   Tue May 12 10:23:56 2026 +0800

    [CI][Bugfix] Improve e2e latency logging, update response classes to include detailed latency documentation and add startup time logging (#3246)

    Signed-off-by: wangyu <410167048@qq.com>
    Signed-off-by: [Your Name] <your.email@example.com>

commit 955fcff828705e685e1ad119ebd117940f480481
Author: Lancer <maruixiang6688@gmail.com>
Date:   Tue May 12 10:08:28 2026 +0800

    [Chore] explicit .float() conversion in Helios's optimized_scale function (#3529)

    Signed-off-by: Lancer <maruixiang6688@gmail.com>

commit 4bca522f01ca49f04bb9a6cfa14c7c8839013b0c
Author: ChenWenjing <54166744+Shirley125@users.noreply.github.com>
Date:   Tue May 12 01:09:10 2026 +0800

    [bugfix][ci] avoid Whisper transcript deduplication in realtime audio test (#3417)

    Signed-off-by: CHEN <116010019@link.cuhk.edu.cn>

commit bd4ede391b58295335061102fb534007e3e149af
Author: Nick Cao <ncao@redhat.com>
Date:   Mon May 11 12:04:56 2026 -0400

    [Perf] Remove dead audio_tower and visual from Qwen3-Omni talker stage (#3296)

    Signed-off-by: Nick Cao <ncao@redhat.com>
    Co-authored-by: Claude <noreply@anthropic.com>

commit 6be59f7d19e11427605a727ff5142c980c9ae19c
Author: Junhong Liu <ljh_lbj@163.com>
Date:   Mon May 11 22:56:54 2026 +0800

    [Fix] Fix RMSNorm inductor KeyError under HSDP + torch.compile (#3460)

    Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com>

commit a33e2eb5885472e4a87f9c431a7792967046fcb1
Author: Y. Fisher <yukexiong1@huawei.com>
Date:   Mon May 11 22:49:19 2026 +0800

    [Config] Add HunyuanImage3 deploy configs (#3172)

    Signed-off-by: KexiongYu <yukexiong1@huawei.com>
    Signed-off-by: Y. Fisher <yukexiong1@huawei.com>

commit c9a8556c24ade154b09b55a39acd36a1697a1f1f
Author: 汪志鹏 <wangzhipeng628@gmail.com>
Date:   Mon May 11 22:19:00 2026 +0800

    [New Model]: Add sensenova u1 support (#3319)

    Signed-off-by: princepride <wangzhipeng628@gmail.com>
    Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>

commit 2cdffcea6b0117216f29ba329bebda814d090645
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date:   Mon May 11 21:56:08 2026 +0800

    [CI] skip failing diffusion and accuracy cases (#3432, #3256, #3257, #3488) (#3507)

    Signed-off-by: wangyu <410167048@qq.com>

commit 3f27ffbd4de71df4bede265bcf4f8212e6bfa07a
Author: wuhang <wuhang6@huawei.com>
Date:   Mon May 11 20:16:05 2026 +0800

    [Misc] Clean logs for image gen task (#3414)

    Signed-off-by: wuhang <wuhang6@huawei.com>

commit 3bf4f2850c254c45152e53224b1462a1c450581e
Author: dengyunyang <584797741@qq.com>
Date:   Mon May 11 19:34:58 2026 +0800

    [Bug][Hunyuanimage 3.0] fix different AR encode behavior between online and offline (#3500)

    Signed-off-by: dengyunyang <584797741@qq.com>

commit 5e263b6929ef7cb19c37800db5257f700f41871c
Author: Canlin Guo <canlinguosdu@gmail.com>
Date:   Mon May 11 11:27:35 2026 +0800

    [BugFix] Rename attention_config to diffusion_attention_config (#3489)

    Signed-off-by: gcanlin <canlinguosdu@gmail.com>

commit e1088026faa5e4ee7b27ae3cd835fdc74f6431c0
Author: Baoyuan Qi <qibaoyuan@126.com>
Date:   Mon May 11 10:05:46 2026 +0800

    [Performance] Improve MiMo-Audio tokenizer decoding performance (#2183)

    Signed-off-by: 齐保元 <qibaoyuan@xiaomi.com>
    Co-authored-by: Jialong Liu <88185941+Galleons2029@users.noreply.github.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit ef7f2f1cd2158bd55d00ee811eeb09608468841a
Author: Canlin Guo <canlinguosdu@gmail.com>
Date:   Mon May 11 09:13:07 2026 +0800

    [Docs] Refactor the attention backend docs/skill (#3475)

    Signed-off-by: gcanlin <canlinguosdu@gmail.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit 67e0e10c85e87e9934162390abe62eb075a6b2bd
Author: TJian <tunjian.tan@embeddedllm.com>
Date:   Mon May 11 07:53:27 2026 +0800

    [ROCm] [CI] Add the same skip ci logic as CUDA CI (#3482)

    Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

commit 2c73cf3522ae6b462eeac6c0cbec639c6e86b4a4
Author: Sy03 <1370724210@qq.com>
Date:   Mon May 11 07:07:38 2026 +0800

    [Perf] Fix Qwen3-TTS latency regression (#3485)

    Signed-off-by: Sy03 <1370724210@qq.com>

commit 857356d5b72f4b27a1f0a5f795f21463f190163b
Author: dengyunyang <584797741@qq.com>
Date:   Sun May 10 22:08:24 2026 +0800

    [Feature] hunyuanimage support flash attn (#2981)

    Signed-off-by: dengyunyang <584797741@qq.com>

commit 11c4c7f0ff7f25eecec1b875dc3a44ed6060e9ba
Author: Canlin Guo <canlinguosdu@gmail.com>
Date:   Sun May 10 11:59:05 2026 +0800

    [Diffusion][Attention] Support per-role attention backend via CLI (#2681)

    Signed-off-by: gcanlin <canlinguosdu@gmail.com>
    Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit 26d481fc847a584c3f385a9ddcce002af1bbd319
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date:   Sun May 10 07:00:29 2026 +0800

    [CI] Remove VLLM_TEST_CLEAN_GPU_MEMORY to avoid environment variable pollution that causes unnecessary GPU detection, thereby slowing down test case execution. (#3446)

    Signed-off-by: wangyu <410167048@qq.com>
    Signed-off-by: [Your Name] <your.email@example.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit 77480215f5c854b030364a3e352862228f98de1a
Author: wuhang <wuhang6@huawei.com>
Date:   Sat May 9 21:13:18 2026 +0800

    [CI][Nightly] Shard nightly Diffusion X2I H100 lanes and centralize shard definitions (#3455)

    Signed-off-by: wuhang <wuhang6@huawei.com>

commit c4a099004411f0aa5d30ad05ed4e7fe6876e58e0
Author: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
Date:   Sat May 9 04:55:05 2026 -0400

    (Phase 1)Add ModelOpt FP8 auto-detect support for diffusion checkpoints #2709 (#2913)

    Signed-off-by: roG0d <rodgarcas98@gmail.com>
    Signed-off-by: roG0d <baonudesifeizhai@gmail.com>
    Signed-off-by: baonudesifeizhai <85092850+baonudesifeizhai@users.noreply.github.com>
    Co-authored-by: roG0d <rodgarcas98@gmail.com>

commit 40a07e0d809e3c2dc07de52ef977ca364a1dc2cb
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date:   Sat May 9 16:17:57 2026 +0800

    [CI] Refine nightly pytest command in Omni · Function Test with H100 to avoid duplicate testing. (#3459)

    Signed-off-by: wangyu <410167048@qq.com>

commit 0e81ef28707631fc6335bf083cf3df9966851403
Author: zhumingjue138 <zhumingjue@huawei.com>
Date:   Sat May 9 16:17:43 2026 +0800

    [CI] Update merge condition to skip L3 merges during weekly test and update doc (#3197)

    Signed-off-by: zhumingjue <zhumingjue@huawei.com>

commit ac69cbd27ecbf67e3a994c15c55d9ee65dacbd16
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date:   Sat May 9 03:22:43 2026 -0400

    [Test] Restore tts mark and omni_runner_function fixture for Voxtral TTS (#3462)

    Signed-off-by: Yueqian Lin <linyueqian@outlook.com>

commit d460673647dd97a2ba3976a8e8bcce3a2527a61e
Author: Wallbreazzz <110282866+Wallbreazzz@users.noreply.github.com>
Date:   Sat May 9 14:58:56 2026 +0800

    Fix NPU code predictor device mismatch in concurrent mode (#3453)

    Co-authored-by: houzechen <h00875519@china.huawei.com>

commit f6e3dece09ad3a72d20a119a9341551cdb25065c
Author: akshatvishu <33392262+akshatvishu@users.noreply.github.com>
Date:   Sat May 9 08:10:36 2026 +0530

    [Feature] Add FP8 quantization for Voxtral TTS (#3036)

    Signed-off-by: akshatvishu <akshatnayak197@gmail.com>
    Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>
    Signed-off-by: akshatvishu <33392262+akshatvishu@users.noreply.github.com>
    Co-authored-by: Chen-Yo Sun <chenyo.sun@mistral.ai>
    Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>

commit de3a2917107a7f2da68b35157d83735c1dc35897
Author: Lancer <maruixiang6688@gmail.com>
Date:   Sat May 9 09:13:37 2026 +0800

    [Bugfix] fix OmniGen2 offload and dtype mismatch (#2560)

    Signed-off-by: Lancer <maruixiang6688@gmail.com>
    Signed-off-by: Lancer <402430575@qq.com>

commit c481ccee2b405e2a580b4f050cbc795cdb1e10ba
Author: Dan <416947747@qq.com>
Date:   Sat May 9 06:44:19 2026 +0800

    [Perf] Optimize VoxCPM2 first-request latency via startup warmup (#3424)

    Signed-off-by: Dan250124 <416947747@qq.com>

commit b4ab37da22e77a112e6f6e085937a4ea66ed6da9
Author: rongfu.leng <lenronfu@gmail.com>
Date:   Sat May 9 06:41:59 2026 +0800

    [Bugfix] Qwen-Image crashes when served with TeaCache (#3450)

    Signed-off-by: rongfu.leng <lenronfu@gmail.com>

commit c2a624bec41537a6d78454beebce58cf91764e7e
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date:   Fri May 8 18:40:43 2026 -0400

    [Bugfix][StableAudio] Pass model_class_name to Omni() and declare audio class attrs (#3405)

    Signed-off-by: Yueqian Lin <linyueqian@outlook.com>

commit aca4b7d65c0d7925d22d055ef26c630a4b8dec82
Author: chzhang2021 <chzhang2021@gmail.com>
Date:   Fri May 8 13:08:39 2026 -0700

    Add Qwen3 TTS Model recipe (#3130)

    Signed-off-by: Chonghao Zhang <chzhang2021@gmail.com>
    Signed-off-by: chzhang2021 <chzhang2021@gmail.com>
    Signed-off-by: Chonghao Zhang <chonghaoz@meta.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
    Co-authored-by: Chonghao Zhang <chonghaoz@meta.com>

commit 65bc9684659d28dff1010940f0a3a0d6258fd62e
Author: Nick Cao <ncao@redhat.com>
Date:   Fri May 8 10:16:49 2026 -0400

    [Refactor] Rename SupportsModuleOffload to SupportsComponentDiscovery (#3354)

    Signed-off-by: Nick Cao <ncao@redhat.com>
    Co-authored-by: Claude <noreply@anthropic.com>

commit b968373c886618a701bb8745eb065c26e555804b
Author: Ayush Agarwal <ayushag@nvidia.com>
Date:   Fri May 8 06:37:57 2026 -0700

    enhancement: extend dmd2 to image generation + add flux, qwen image pipelines (#2974)

    Signed-off-by: ayushag <ayushag@nvidia.com>
    Signed-off-by: Ayush Agarwal <ayushag@nvidia.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit 039a09a8e14bac3762cf1c7576e46f5c6a5e5c27
Author: skf <54565339+skf-1999@users.noreply.github.com>
Date:   Fri May 8 21:33:53 2026 +0800

    [Feature] online HunyuanImage-3.0 IT2I (image editing) support (#3410)

    Signed-off-by: skf1999 <13234016272@163.com>

commit c83cd4506913e97c915be3484f862d328c332e0e
Author: zdoba <daixinning@gmail.com>
Date:   Fri May 8 21:20:03 2026 +0800

    [Feat] Add Sequence Parallelism (USP) support for HunyuanVideo 1.5 transformer (#2444)

    Signed-off-by: daixinning <daixinning@163.com>
    Co-authored-by: daixinning <daixinning@163.com>

commit f8624db93a3832136189e7cc7fec57d9f5c6e076
Author: boatman <109857087+sphinxkkkbc@users.noreply.github.com>
Date:   Fri May 8 21:03:47 2026 +0800

    [BugFix]Fix default stage config path in voxcpm2 (#3447)

    Signed-off-by: sphinxkkkbc <binchengkang8@gmail.com>
    Co-authored-by: sphinxkkkbc <binchengkang8@gmail.com>

commit 07fd6afb4b0cc45b7cf2dd7ef95287bd413a5c6c
Author: TaffyOfficial <2587297563@qq.com>
Date:   Fri May 8 20:55:23 2026 +0800

    [Test][HunyuanImage3] Add e2e offline I2T smoke test (#3332)

    Signed-off-by: TaffyOfficial <2324465096@qq.com>
    Co-authored-by: TaffyOfficial <2324465096@qq.com>

commit 5b61e7f1f1be0d3691a54541e3048c4bca980203
Author: dengyunyang <584797741@qq.com>
Date:   Fri May 8 20:51:46 2026 +0800

    [Feature][Hunyuan image 3.0] AR + DIT with kv reuse. (#3346)

    Signed-off-by: dengyunyang <584797741@qq.com>

commit ce8a7dfd2da31c45084bab15b867f34a6b2b1ffa
Author: Alex Brooks <albrooks@redhat.com>
Date:   Fri May 8 02:05:38 2026 -0600

    [Bugfix] Fix Dtype Crashes in SD3 (#2526)

    Signed-off-by: Alex Brooks <albrooks@redhat.com>
    Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>

commit 50fd3a3f852918a46d721ad52e241abb80457645
Author: Phi-C <chenxjhit@163.com>
Date:   Fri May 8 15:22:32 2026 +0800

    [Bugfix] Fix the issue where the seed parameter does not take effect when using the OpenAI Python client (#3436)

    Signed-off-by: Phi-C <chenxjhit@163.com>

commit 32663f21d5e760d0cfd769110d3e133a3582cfff
Author: lsyyyyy <siyuanlei37@gmail.com>
Date:   Fri May 8 15:20:42 2026 +0800

    [Feat] support hsdp for Bagel (#3150)

    Signed-off-by: siyuan.lei <siyuanlei37@gmail.com>
    Signed-off-by: lsyyyyy <siyuanlei37@gmail.com>
    Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
    Co-authored-by: 汪志鹏 <wangzhipeng628@gmail.com>

commit ea4cf77f56b8a42bd900193d59739a61ff7eec73
Author: Yuchen Jiang <yuchen.yj.jiang@gmail.com>
Date:   Thu May 7 23:29:15 2026 -0700

    [Hardware] Extend diffusion engine plugin extensibility for out-of-tree hardware backends (#3239)

    Signed-off-by: Yuchen Jiang <yucjiang@amazon.com>
    Co-authored-by: Yuchen Jiang <yucjiang@amazon.com>
    Co-authored-by: Canlin Guo <canlinguosdu@gmail.com>

commit 6f2ad7b403569ac4fa602348b5c90a8ceed15b09
Author: wangyu <53896905+yenuo26@users.noreply.github.com>
Date:   Fri May 8 14:12:23 2026 +0800

    [Test] Unify L2/L3 test layout, Buildkite steps, and test helpers (#2556)

    Signed-off-by: wangyu <410167048@qq.com>
    Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
    Signed-off-by: wangyu <53896905+yenuo26@users.noreply.github.com>
    Co-authored-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>

commit b85833eeec475497e28dd883c6436dcbfd5406de
Author: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Date:   Fri May 8 12:21:29 2026 +0800

    Update CODEOWNERS feature reviewers (#3378)

    Signed-off-by: David Chen <530634352@qq.com>

commit eeb7e698c983d97c9dff8c877376109cdabc71ca
Author: Chenguang Zheng <645327136@qq.com>
Date:   Fri May 8 09:28:52 2026 +0800

    [Clean] Remove multi-replica Bagel CI and related docs/configs (#3407)

    Signed-off-by: Chenguang ZHENG <645327136@qq.com>

commit e6466cf0e621f51432f1c5afe7f23df908862763
Author: Nick Cao <ncao@redhat.com>
Date:   Thu May 7 11:23:59 2026 -0400

    [Refactor] Replace and ban a few torch.cuda functions in favor of torch.accelerator replacements. (#3365)

    Signed-off-by: Nick Cao <ncao@redhat.com>

commit 54277a8dd04088aaf591d3611973c4b547cc002b
Author: Gao Han <hgaoaf@connect.ust.hk>
Date:   Thu May 7 16:16:36 2026 +0800

    [chore] Update command to download dataset from huggingface-cli to hf (#3403)

    Signed-off-by: Gao Han <hgaoaf@connect.ust.hk>

commit 4a24a517abc7769b1399ded594558a3fe8269872
Author: Canlin Guo <canlinguosdu@gmail.com>
Date:   Thu May 7 11:54:16 2026 +0800

    [BugFix] Probe __dict__ instead of hasattr when patching WanRMS_norm (#3400)

    Signed-off-by: gcanlin <canlinguosdu@gmail.com>

commit 5d4b16e6e37fdde8578c17cab1165b9ad5effb9c
Author: Peiqi Yin <60515999+yinpeiqi@users.noreply.github.com>
Date:   Wed May 6 20:00:20 2026 -0700

    [BugFix] Qwen2.5-Omni streaming code2wav input handling (#3396)

    Signed-off-by: yinpeiqi <yinpeiqi809@gmail.com>

commit 3c85ca5536a767361c4a82b65a6d04c3a7d63258
Author: dengyunyang <584797741@qq.com>
Date:   Thu May 7 10:03:21 2026 +0800

    [bugfix][hunyuanimage] Fix parameter issue introduced during PR #3107 rebase (#3395)

    Signed-off-by: dengyunyang <584797741@qq.com>

commit c483a23debe6fadf9312c78c9c65c129791006d0
Author: Daniel Huang <daniel1.huang@intel.com>
Date:   Wed May 6 17:08:25 2026 -0700

    [CI Patch] Qwen 2.5 CI Fixes for Intel XPU (#3083)

    Signed-off-by: Daniel Huang <daniel1.huang@intel.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit 2856ff7aeac763e88b095a47d9af503901c50035
Author: Chendi.Xue <chendi.xue@intel.com>
Date:   Wed May 6 17:45:30 2026 -0500

    [XPU][DOCKER] update dockerfile.xpu after main repo updating to pt2.11 (#3393)

    Signed-off-by: Chendi Xue <chendi.xue@intel.com>

commit 56ca5d9a8f8779336f6dcdd6f73b0ad020eb77fd
Author: Chen-Yo Sun <chenyo.sun@mistral.ai>
Date:   Wed May 6 15:43:01 2026 -0700

    [BugFix] Forward CLI --tokenizer to per-stage engine configs (#3120)

    Signed-off-by: Chen-Yo Sun <chenyo.sun@mistral.ai>

commit 81ab2f98da21817638dbf14c0b0b46e2ad6354b1
Author: Haco <923390377@qq.com>
Date:   Thu May 7 06:32:01 2026 +0800

    [Config Refactor] Remove legacy Omni CLI arg helper and align tests with nullified parser defaults (#3144)

    Signed-off-by: xiaohajiayou <923390377@qq.com>

commit b25ea13cb04c7a56b944da110bceb07a5c2bd6f7
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date:   Thu May 7 06:21:00 2026 +0800

    [BugFix] Fixed a precision issue with one-word answers. (#3385)

    Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
    Signed-off-by: Hongsheng Liu <liuhongsheng4@huawei.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
    Co-authored-by: Canlin Guo <961750412@qq.com>

commit 687a44e5c83cedf16882b188ce0b042197fe69c8
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date:   Wed May 6 18:17:57 2026 -0400

    [Bugfix][OmniVoice] Read voice-cloning fields from OmniTextPrompt in offline path (#3392)

    Signed-off-by: Yueqian Lin <linyueqian@outlook.com>

commit 19f8f428223fa8acbeabeed9d89609d623374689
Author: Juan Pablo Zuluaga <46724788+JuanPZuluaga@users.noreply.github.com>
Date:   Thu May 7 00:16:03 2026 +0200

    [Feat][Qwen3-Omni] Add CUDA graph support for Code2Wav decoder (#2376)

    Signed-off-by: JuanPZuluaga <juanz9312@gmail.com>
    Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>

commit 0a8204cb81bf8608b21fcc3a4199e5bde6b1136c
Author: Isotr0py <mozf@mail2.sysu.edu.cn>
Date:   Thu May 7 00:37:36 2026 +0800

    [Quantization] Redo Z-Image text encoder FP8 online quantization (#3279)

    Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
    Signed-off-by: Isotr0py <2037008807@qq.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
    Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>

commit b8bd75837ebf716854e40997995928128b6cf0db
Author: Lidang Jiang <119769478+Lidang-Jiang@users.noreply.github.com>
Date:   Wed May 6 23:43:19 2026 +0800

    [Bugfix] Fix missing ANSI colors in CLI logo when output is piped (#1636)

    Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>

commit 576afb6f53c3e817c1895a3790bacef2470c0fa9
Author: skf <54565339+skf-1999@users.noreply.github.com>
Date:   Wed May 6 22:37:39 2026 +0800

    [Feature] HunyuanImage-3.0 IT2I (image editing) support (#3107)

    Signed-off-by: TaffyOfficial <2324465096@qq.com>
    Signed-off-by: zuiho <2324465096@qq.com>
    Signed-off-by: skf1999 <13234016272@163.com>
    Co-authored-by: TaffyOfficial <2324465096@qq.com>
    Co-authored-by: dengyunyang <584797741@qq.com>
    Co-authored-by: John Liu BUAA <liukecheng97@gmail.com>

commit 1e8dc841503146bcb2b5af01b36d1eca94dd8e24
Author: Haco <923390377@qq.com>
Date:   Wed May 6 22:21:29 2026 +0800

    [Bugfix] Fix default diffusion stage config generator drops runtime engine args (#2559)

    Signed-off-by: xiaohajiayou <923390377@qq.com>
    Co-authored-by: reidliu41 <reidliu41@users.noreply.github.com>

commit 28558cc37471da8258c95aa515363a4a05fce601
Author: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
Date:   Wed May 6 21:11:35 2026 +0800

    [bugfix][CI] Fix qwen image performance degradation w/ vllm 0.20 & CUDA 13.0 (#3352)

    Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>

commit 282e0b664231275d9a17f56880c99e084028a435
Author: amy-why-3459 <wuhaiyan17@huawei.com>
Date:   Wed May 6 20:42:56 2026 +0800

    [BugFix][CI] Change max_tokens from 150 to 2048 (#3376)

    Signed-off-by: amy-why-3459 <wuhaiyan17@huawei.com>
    Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>

commit 1e5f288a915494f8ffd9783a4886bbfe9929e65e
Author: Zheng Wengang <zwg0606@gmail.com>
Date:   Wed May 6 19:33:18 2026 +0800

    [FEAT] support multi-stage deployment (#2396)

    Signed-off-by: ZhengWG <zwg0606@gmail.com>
    Signed-off-by: Zheng Wengang <zwg0606@gmail.com>
    Signed-off-by: Peiqi Yin <60515999+yinpeiqi@users.noreply.github.com>
    Signed-off-by: yinpe <11810305@mail.sustech.edu.cn>
    Signed-off-by: yinpeiqi <yinpeiqi809@gmail.com>
    Co-authored-by: Peiqi Yin <60515999+yinpeiqi@users.noreply.github.com>
    Co-authored-by: yinpe <11810305@mail.sustech.edu.cn>
    Co-authored-by: yinpeiqi <yinpeiqi809@gmail.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
    Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>
    Co-authored-by: Chenguang Zheng <645327136@qq.com>

commit e969d2e99f464044b50a59ea618ad9a8edcbfb9f
Author: fywc <hanzheli@kuaishou.com>
Date:   Wed May 6 18:11:22 2026 +0800

    [Docs] Add LTX-2-T2V and LTX-2-I2V recipes (#3294)

    Signed-off-by: hanzheli <hanzheli@kuaishou.com>
    Signed-off-by: fywc <hanzheli@kuaishou.com>

commit 6f784cbc50b2ef1489a73b7c89016f5d95c18d7c
Author: Vensen <vensenmu@gmail.com>
Date:   Wed May 6 15:35:41 2026 +0700

    [Bugfix]: skip faulty pipelines during registry iteration (#2999)

    Signed-off-by: vensen <vensenmu@gmail.com>
    Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
    Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>
    Co-authored-by: Yueqian Lin <linyueqian@outlook.com>
    Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>

commit 369a47d5a1874e2a5050c830d5a18398b52446b7
Author: dengyunyang <584797741@qq.com>
Date:   Wed May 6 15:31:11 2026 +0800

    [Hunyuanimage-3.0] Accuracy fix (#3373)

    Signed-off-by: dengyunyang <584797741@qq.com>

commit b076006c3541f1be53329bee8f7e8f91371c5ba0
Author: Haco <923390377@qq.com>
Date:   Wed May 6 14:25:20 2026 +0800

    [BugFix] Fix Whitelist optimization CI failure (#3290)

    Signed-off-by: xiaohajiayou <923390377@qq.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
    Co-authored-by: SYLAR <125541396+lishunyang12@users.noreply.github.com>
    Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>

commit 9f81a0a4b07087f72284813aebae90d2b6f076ea
Author: fywc <hanzheli@kuaishou.com>
Date:   Wed May 6 13:09:39 2026 +0800

    [Feat] support HSDP for DreamID-Omni (#3138)

    Signed-off-by: hanzheli <hanzheli@kuaishou.com>
    Signed-off-by: fywc <hanzheli@kuaishou.com>

commit f36d891ed106aa2a73710a9a706bfc1ddf1a7294
Author: WeiQing Chen <40507679+david6666666@users.noreply.github.com>
Date:   Wed May 6 12:43:08 2026 +0800

    Update WeChat QR code (#3368)

    Signed-off-by: David Chen <530634352@qq.com>

commit 354511b805f96dc2ffe8b72755af1764d2318fe1
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date:   Tue May 5 23:32:39 2026 -0400

    [CI][Bugfix] Relax stable-audio layerwise offload determinism tolerance to 1e-2 (#3371)

    Signed-off-by: Yueqian Lin <linyueqian@outlook.com>

commit 005621ba7fd92ad0f11369ea26a5f05b56dad9af
Author: Mike Qiu <qdy220091330@gmail.com>
Date:   Wed May 6 10:56:23 2026 +0800

    Support both "voice" and "speaker" params in chat completions (#3248)

    Signed-off-by: Mike_Qiu <qiudayu.qdy@antgroup.com>
    Co-authored-by: Mike_Qiu <qiudayu.qdy@antgroup.com>
    Co-authored-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>

commit 4e41b7bc2324eed1b2af6a09a40f3d7005271001
Author: Chendi.Xue <chendi.xue@intel.com>
Date:   Tue May 5 19:03:48 2026 -0500

    Enable Wan2.2-S2V modeling to vLLM-omni (#2751)

    Signed-off-by: Chendi Xue <chendi.xue@intel.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit 353ac8a402738865eb2d38fb4b26b456561a50b1
Author: Chendi.Xue <chendi.xue@intel.com>
Date:   Tue May 5 18:48:46 2026 -0500

    [Enhancement] Offload transformer after switch to transformer-2 (#3224)

    Signed-off-by: Chendi Xue <chendi.xue@intel.com>
    Co-authored-by: Canlin Guo <canlinguosdu@gmail.com>

commit e49fbd8a2d1ec0ad2ea593dfc591091b30f42e82
Author: Ting FU <semmer@live.cn>
Date:   Wed May 6 05:47:13 2026 +0800

    [Feat] DiffusionEngine Support async batch infer (#2729)

    Signed-off-by: Semmer <semmer@live.cn>
    Signed-off-by: jader <yjader@foxmail.com>
    Signed-off-by: asukaqaq-s <1311722138@qq.com>
    Co-authored-by: asukaqaq-s <1311722138@qq.com>
    Co-authored-by: jader <yjader@foxmail.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
    Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>

commit 2b85474c2347d360df1e245ee6e6b628041536fb
Author: Sy03 <1370724210@qq.com>
Date:   Wed May 6 02:57:55 2026 +0800

    [Bugfix] Propagate seed to Qwen3-TTS Fast AR sampler (#3350)

    Signed-off-by: Sy03 <1370724210@qq.com>

commit 5cf3f7947b84aecb0c908719c7573dcab6b00a06
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date:   Tue May 5 13:36:16 2026 -0400

    [Docs] Consolidate moss_tts_nano + ming_flash_omni_tts into TTS hub (#3358)

    Signed-off-by: Yueqian Lin <linyueqian@outlook.com>

commit 44cde33eb5b08b09e1d7a42d320b9ef5aa87f830
Author: TaffyOfficial <2587297563@qq.com>
Date:   Wed May 6 00:31:02 2026 +0800

    [Bugfix][HunyuanImage3] Fix offline AR garbage output by switching to Instruct chat template (#3243)

    Signed-off-by: zuiho <wu15922848573@outlook.com>
    Signed-off-by: TaffyOfficial <2324465096@qq.com>
    Signed-off-by: zuiho-kai <31877877+zuiho-kai@users.noreply.github.com>
    Signed-off-by: zuiho <2324465096@qq.com>
    Co-authored-by: TaffyOfficial <2324465096@qq.com>
    Co-authored-by: zuiho-kai <31877877+zuiho-kai@users.noreply.github.com>

commit a77c56725d481fc30643dd76c176208d8bd03262
Author: TJian <tunjian.tan@embeddedllm.com>
Date:   Tue May 5 23:28:37 2026 +0800

    [ROCm] [CI] [Bugfix] 2/N Fix Qwen2.5 and Qwen3 test (#3343)

    Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

commit fde38ca54f218900507745a4da0542ff9cd60a04
Author: Nick Cao <ncao@redhat.com>
Date:   Tue May 5 11:26:30 2026 -0400

    [Refactor][Qwen3-TTS] Construct Code2Wav decoder natively (#2341)

    Signed-off-by: Nick Cao <ncao@redhat.com>
    Co-authored-by: Claude <noreply@anthropic.com>

commit b64ab05d09f0f639c83b8e8a81e45f4b898d6a3f
Author: Juan Pablo Zuluaga <46724788+JuanPZuluaga@users.noreply.github.com>
Date:   Tue May 5 16:53:45 2026 +0200

    [TTS][SpeakerCacheManager] A global speaker cache manager for Voice Cloning (#2630)

    Signed-off-by: JuanPZuluaga <juanz9312@gmal.com>
    Co-authored-by: JuanPZuluaga <juanz9312@gmal.com>

commit a0918ce583985ce597d748fa223c0686204a1f5e
Author: boatman <109857087+sphinxkkkbc@users.noreply.github.com>
Date:   Tue May 5 22:49:57 2026 +0800

    [Feat]add cpu-offload/layerwise-offload for stable-audio-open & fix output inconsistency with same seed (#2909)

    Signed-off-by: sphinxkkkbc <binchengkang8@gmail.com>
    Co-authored-by: sphinxkkkbc <binchengkang8@gmail.com>

commit f19891e4f1bf1f4b29dec941e722a74069a82c74
Author: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
Date:   Tue May 5 20:54:39 2026 +0800

    [bugfix][CI] Diffusers backend update (#3096)

    Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
    Signed-off-by: Zeyu Huang | 黃澤宇 <11222265+fhfuih@users.noreply.github.com>
    Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

commit d0b531520d67f6a496afd6c1495aa765e0c78e65
Author: dengyunyang <584797741@qq.com>
Date:   Tue May 5 20:05:46 2026 +0800

    [Performance CI] Hunyuan Image 3.0 DIT bench test (#2495)

    Signed-off-by: zhou zhuoxin <zhouzhuoxin1508@outlook.com>
    Signed-off-by: dengyunyang <584797741@qq.com>
    Signed-off-by: TaffyOfficial <2324465096@qq.com>
    Signed-off-by: ChenZhao <bounty-hunter@users.noreply.github.com>
    Signed-off-by: zuiho <2324465096@qq.com>
    Co-authored-by: zhou zhuoxin <zhouzhuoxin1508@outlook.com>
    Co-authored-by: Gao Han <hgaoaf@connect.ust.hk>
    Co-authored-by: TaffyOfficial <2324465096@qq.com>
    Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    Co-authored-by: TaffyOfficial <2587297563@qq.com>

commit 1a93818e54addb7c64ddce05cd53169ba895eb1d
Author: zhanqiuhu <49648934+ZhanqiuHu@users.noreply.github.com>
Date:   Tue May 5 04:13:30 2026 -0400

    [Bugfix] Add GatedRepoError Report (#1616)

    Signed-off-by: Zhanqiu Hu <zhu@redhat.com>

commit bb239fa94959932731e8962fe3a3d18ebbb33fd8
Author: Alex Brooks <albrooks@redhat.com>
Date:   Mon May 4 22:09:45 2026 -0600

    [Core] Support Async & Sync AutoRegressive Scheduling (#3306)

    Signed-off-by: Alex Brooks <albrooks@redhat.com>

commit 703e31fc470bb422bf36fbf6987707ffa6e9ffea
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date:   Mon May 4 16:41:29 2026 -0400

    [Docs] Consolidate per-model TTS examples into a single hub (#3234)

    Signed-off-by: Yueqian Lin <linyueqian@outlook.com>

commit e1b30061b637ea37d44a0d8b4b551dbe891f2dfc
Author: Alex Brooks <albrooks@redhat.com>
Date:   Mon May 4 10:55:54 2026 -0600

    [CI] Use Logprobs Check for Flaky Prefix Cache Test (#3199)

    Signed-off-by: Alex Brooks <albrooks@redhat.com>

commit c007d40b147a6130b2cc9f8fc6721f5aaf0179ee
Author: Canlin Guo <canlinguosdu@gmail.com>
Date:   Mon May 4 22:38:43 2026 +0800

    [NPU] Upgrade to v0.20.0 & align with GPU model runner (#3325)

    Signed-off-by: gcanlin <canlinguosdu@gmail.com>

commit c708aecb18491405a80ffabcb0f8aa54062baeb0
Author: Zhang Jian <jianmusings@gmail.com>
Date:   Mon May 4 22:24:35 2026 +0800

    [Diffusion] [Model] Support AudioX (#2077)

    Signed-off-by: Zhang Jian <jianmusings@gmail.com>
    Signed-off-by: Zhang <jianmusings@gmail.com>
    Signed-off-by: Zhang <zhang.jian@u.nus.edu>
    Signed-off-by: Zhang Jian <e0322744@u.nus.edu>
    Signed-off-by: 汪志鹏 <wangzhipeng628@gmail.com>
    Co-authored-by: Zhang Jian <e0322744@u.nus.edu>
    Co-authored-by: 汪志鹏 <wangzhipeng628@gmail.com>

commit c3065115128e5369ca51be5a9c40f68640d55a6e
Author: Lancer <maruixiang6688@gmail.com>
Date:   Mon May 4 22:22:06 2026 +0800

    [Bugfix] Fix CUBLAS_STATUS_EXECUTION_FAILED when native Flash Attention is available (Wan2.2) (#3327)

    Signed-off-by: Lancer <maruixiang6688@gmail.com>

commit 5fc0bfe01eeff89e70befd01eac046f783a71072
Author: Nick Cao <ncao@redhat.com>
Date:   Mon May 4 10:17:25 2026 -0400

    [Cleanup] Use tokens_input() for TTS prompt construction (#3227)

    Signed-off-by: Nick Cao <ncao@redhat.com>
    Co-authored-by: Claude <noreply@anthropic.com>

commit 9a5370367682da57a8d7f50f28c4753c2b7bd2b7
Author: ptarasiewiczNV <104908264+ptarasiewiczNV@users.noreply.github.com>
Date:   Mon May 4 15:07:53 2026 +0200

    [Bugfix] GLM-Image: route t2i requests through the multimodal processor (#3034) (#3189)

    Signed-off-by: Piotr Tarasiewicz <ptarasiewicz@nvidia.com>

commit bb69cbc9f1a1b0379c0a4ba6895822f1b4d2089d
Author: Lancer <maruixiang6688@gmail.com>
Date:   Mon May 4 17:34:43 2026 +0800

    [Perf] Optimize RMSNorm in Z-Image (#3304)

    Signed-off-by: Lancer <maruixiang6688@gmail.com>

commit 33586d845ce62104f68dae8da34a38c2b557618b
Author: Alex Brooks <albrooks@redhat.com>
Date:   Sun May 3 23:50:31 2026 -0600

    [Bugfix] Use get_open_ports_list for stage ports in OmniMasterServer (#3333)

    Signed-off-by: Alex Brooks <albrooks@redhat.com>

commit 3a2679950745df348d0e933acdc58f9758b204e1
Author: bjf-frz <frz123db@gmail.com>
Date:   Mon May 4 11:54:33 2026 +0800

    [Bugfix] Fix GLM-Image prior token debug logging (#3165)

    Signed-off-by: bjf-frz <frz123db@gmail.com>

commit 6bb18af119c797fae31e843748fd14fd1e6b2efb
Author: Yuanheng Zhao <54058983+yuanheng-zhao@users.noreply.github.com>
Date:   Sun May 3 22:32:59 2026 +0800

    [Chore][Doc] Fix example arg values in Profiler doc (#3309)

    Signed-off-by: yuanheng <jonathan.zhaoyh@gmail.com>

commit 21a3035c101976a17fde1f8485659945acc13f9b
Author: Dan250124 <416947747@qq.com>
Date:   Sun May 3 22:29:47 2026 +0800

    Fixed memory leak and Remove dead code (#3312)

    Signed-off-by: Dan250124 <416947747@qq.com>
    Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

commit e50a0665485178f428e8e37ad7a8cff2fc413484
Author: 汪志鹏 <wangzhipeng628@gmail.com>
Date:   Sun May 3 20:02:22 2026 +0800

    [BugFix]: Fix async scheduer transfer exceed KV cache (#3318)

    Signed-off-by: princepride <wangzhipeng628@gmail.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit 4c06bb09ed9985da1c83c72ad704eb297c989986
Author: Lancer <maruixiang6688@gmail.com>
Date:   Sun May 3 15:53:28 2026 +0800

    [Feat] ERNIE image model (T2I) (#2861)

    Signed-off-by: Lancer <maruixiang6688@gmail.com>

commit 5dabbb2f18f650bdfa3c5cc60195e25d5c87fc83
Author: GuoSheng Feng <146159551+sfiisf@users.noreply.github.com>
Date:   Sun May 3 12:36:15 2026 +0800

    [Bugfix] Map Qwen3-TTS max_new_tokens to max_tokens (#3217)

    Signed-off-by: sfiisf <sfiisf@163.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit 7491b674463362a2725ef728b303c60b691bbb08
Author: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
Date:   Sun May 3 00:35:35 2026 -0400

    [Bugfix][MOSS-TTS-Nano] Drop fictional voice presets, require ref_audio (#3192)

    Signed-off-by: Yueqian Lin <70319226+linyueqian@users.noreply.github.com>
    Signed-off-by: Yueqian Lin <linyueqian@outlook.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit d49356e4ca95e85613ae68243c838c69eabf5a02
Author: Yuanheng Zhao <54058983+yuanheng-zhao@users.noreply.github.com>
Date:   Sun May 3 10:43:49 2026 +0800

    [Config Refactor] Migrate Ming and Ming-TTS deploy/pipline configs (#3154)

    Signed-off-by: Yuanheng Zhao <jonathan.zhaoyh@gmail.com>
    Signed-off-by: yuanheng <jonathan.zhaoyh@gmail.com>
    Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>

commit 6a837849dd73cfa27a29b476ff03780ee543e41c
Author: Alex Brooks <albrooks@redhat.com>
Date:   Sat May 2 18:02:47 2026 -0600

    [CI] Fix Bad TP Initialization in Dynin-Omni Tests (#3298)

    Signed-off-by: Alex …

Labels

ready label to trigger buildkite CI

4 participants