Merged
83 commits
95de2a1
feat: support multi-stage
ZhengWG Mar 31, 2026
0b774ca
feat: init multi engine-cores
ZhengWG Apr 1, 2026
4786f5e
fix lint
ZhengWG Apr 1, 2026
9add6f2
Merge branch 'main' into support-stage-scale-out
ZhengWG Apr 1, 2026
11976f1
refacotr: keep name consistency
ZhengWG Apr 2, 2026
4d9b160
Merge branch 'main' into support-stage-scale-out
ZhengWG Apr 2, 2026
eaa9dfd
fix lint
ZhengWG Apr 4, 2026
e122501
[Bugfix] Fix Bagel online mode for 1. Hang after several requests 2…
natureofnature Apr 13, 2026
cb4d13a
[Perf][Fish Speech] Enable CUDA Graph capture for Fast AR code predic…
Sy0307 Apr 13, 2026
8097747
[Model] Adapt Wan2.2-I2V-A14B via LightX2V offline conversion path (#…
Celeste-jq Apr 13, 2026
d9e745c
[Fix] VoxCPM2: support raw audio for voice cloning via OpenAI API (#2…
linyueqian Apr 13, 2026
2226143
[CI][Bugfix] Refactor the test case to add support for increasing ini…
yenuo26 Apr 13, 2026
d369648
refactor: add stage_pool
ZhengWG Apr 13, 2026
2b70e89
[Revert] Revert "[Log] Wire stat loggers into AsyncOmniEngine to matc…
amy-why-3459 Apr 13, 2026
533b2b5
Merge branch 'main' into support-stage-scale-out
ZhengWG Apr 13, 2026
0d4e975
[core]refactor communication layer: PR1(Added Refactor Infra Only) (#…
natureofnature Apr 13, 2026
cd2761e
[Feature]: support Flux.2-dev tea_cache (#1871)
nuclearwu Apr 13, 2026
155583f
[Bugfix] Release stage launch lock before handshake (#2717)
fake0fan Apr 13, 2026
ef3f72b
[Tests][Qwen3-Omni]Modify Qwen3-Omni performance test cases (#2600)
amy-why-3459 Apr 13, 2026
2c67c30
[Bagel]: Support `think mode` in single stage deployment of Bagel (#2…
princepride Apr 13, 2026
e0cdbe9
[Misc] Cleanup: use consistent pytest-mock in unit tests (#2698)
yuanheng-zhao Apr 13, 2026
2a1d506
[skip ci][doc]Update async_chunk design diagram (#2420)
amy-why-3459 Apr 13, 2026
6b5a52a
[Bugfix] Update Flux2-dev & Dynin_omni L4 e2e test (#2723)
wtomin Apr 13, 2026
c9e2e3e
[Voxtral TTS] Correct decode steps param in Voxtral TTS (#2524)
y123456y78 Apr 13, 2026
14f7910
[Perf]: Speedup VoxCPM2 TTS performance and Support PagedAttention (#…
Sy0307 Apr 13, 2026
dd13891
[Voxtral TTS] Fix Voxtral TTS input with text and ref_audio (#2750)
y123456y78 Apr 13, 2026
8d23549
[CI] Qwen image edit performance benckmark (#2216)
fhfuih Apr 14, 2026
ec64557
Merge branch 'main' into support-stage-scale-out
ZhengWG Apr 14, 2026
a5b38b5
[BugFix] Remove stage_configs_path validation (#2741)
amy-why-3459 Apr 14, 2026
644edac
[Perf] Optimize MP4 encoding latency in video generation (#2735)
SamitHuang Apr 14, 2026
48c30bc
[Qwen3-TTS] Remove hardcoded `distributed_executor_backend` to improv…
iancarrasco-b10 Apr 14, 2026
17acd05
[Test] Add Stable Audio offline e2e TeaCache Test (#2377)
zhangj1an Apr 14, 2026
6d01a8b
[Omni Connector] Omni Transfer Engine Connector: Enable 1-receiver-to…
natureofnature Apr 14, 2026
3229bae
[skip ci] fix docs, gdown remove --id param (#2787)
lengrongfu Apr 14, 2026
159d655
[Tests][Qwen3-Omni]Add test cases for long videos and long audios. (#…
amy-why-3459 Apr 14, 2026
f87674a
[skip ci]add skills (#2710)
hsliuustc0106 Apr 14, 2026
bcd5f16
[Misc] clean Temporary CI Configs (#2784)
n1ptune Apr 14, 2026
5ce0a43
[CI][Bugfix] Update thresholds for accuracy tests (#2725)
yenuo26 Apr 14, 2026
cf1fcd5
[CI/BugFix] Fix Flaky Test for Qwen Omni Perf (#2754)
alex-jw-brooks Apr 14, 2026
4fb078a
[Bugfix] Reject /v1/audio/speech for Qwen omni models (#2763)
scyyh11 Apr 14, 2026
53a9cf4
fix: do not apply FP8 quant config to vision/audio encoders for pre-q…
ianliuy Apr 14, 2026
f03ab38
[BugFix] Fix NoneType' object has no attribute 'detach' (#2797)
amy-why-3459 Apr 14, 2026
bc4a659
[Bugfix] Make mrope kwargs optional in HunyuanImage3 get_mrope_input_…
ianliuy Apr 14, 2026
9e46a79
[Bugfix] Handle numpy array outputs when generate image (#1680)
lengrongfu Apr 15, 2026
02e5dc7
[Perf] VoxCPM2: streaming VAE + compile optimization (45% RTF reducti…
linyueqian Apr 15, 2026
a782ae4
[Perf] Enhance benchmark script to support baseline thresholds and pr…
yenuo26 Apr 15, 2026
227bab3
[Benchmark]Omni-modality model accuracy benchmark(Daily-Omni & seed-t…
amy-why-3459 Apr 15, 2026
0d02073
[CI] qwen image edit L4 accuracy test (#2761)
fhfuih Apr 15, 2026
61a3cbd
[Perf] Eliminate Hop 3 IPC overhead for single-stage diffusion via in…
SamitHuang Apr 15, 2026
6c6551d
[Feature] feat: add video frame interpolation postprocess (#2555)
david6666666 Apr 15, 2026
1ad726f
[Fix] HunyuanImage-3.0: unify naming hunyuan_image_3 → hunyuan_image3…
TaffyOfficial Apr 15, 2026
2dff2d7
[PERF] Wan2.2 support adalayernorm fused op (#2585)
fan2956 Apr 15, 2026
133e2f9
[hotfix] API connection error in CI (#2810)
fhfuih Apr 15, 2026
38d5f2d
[Perf] VoxCPM2: Speedup by manual CUDA Graph capture for scaffold/res…
Sy0307 Apr 15, 2026
4bf4c63
Add voxcpm model support. (#2467)
IsleOfDawnlight Apr 15, 2026
82f8c93
[Feat][Qwen3-Omni] Shared code predictor module for Qwen3-TTS and Qwe…
JuanPZuluaga Apr 15, 2026
50ae1de
[Feature] HunyuanImage3 allow guidance_scale<=1 in DiT stage (#2762)
Fishermanykx Apr 15, 2026
7559686
Merge branch 'main' into support-stage-scale-out
ZhengWG Apr 15, 2026
c6d76d0
[Bugfix] Fix broken fp8 quantisation on Z-Image-Turbo, Qwen-Image, FL…
zhangj1an Apr 15, 2026
cad8956
fix: align processor/statge-replica
ZhengWG Apr 15, 2026
f1e3f03
[feature] Hidden State Prefix Caching (#2164)
alex-jw-brooks Apr 15, 2026
e958113
[Perf] Add Performance Test for Qwen-Image Step-Level Execution (#2707)
wtomin Apr 15, 2026
880a758
[CI] Skip test_thinker_prefix_caching in tests/e2e/online_serving/tes…
yenuo26 Apr 16, 2026
cde22f8
refactor: make stagepoll more clean
ZhengWG Apr 16, 2026
c83f664
[CI][Perf] Add nightly PR labels, consolidate pipeline, and switch be…
yenuo26 Apr 16, 2026
de5f8a2
[Doc][Misc] Update DreamID-Omni Example; Add DreamID-Omni post proces…
yuanheng-zhao Apr 16, 2026
b43c6c6
[Feat] add GLM-Image SP support (#1983)
RuixiangMa Apr 16, 2026
30ee64e
refactor: init replica in stage_pool
ZhengWG Apr 16, 2026
24e61f4
[CI] add qwen image and layered accuracy test (#2772)
david6666666 Apr 16, 2026
8d1ce63
refactor: init replica in stage_pool part2
ZhengWG Apr 16, 2026
4d816ff
[Feature] Bagel: Support tp+cfg parallel using mooncake transfer engi…
natureofnature Apr 16, 2026
bd6985e
refactor: init replica in stage_pool part3
ZhengWG Apr 16, 2026
658415e
add sample yaml for multi-replica
ZhengWG Apr 16, 2026
86f94c8
Merge branch 'main' into support-stage-scale-out
ZhengWG Apr 16, 2026
f1cb4eb
[PERF] Wan2.2 support rmsnorm fused op (#2583)
fan2956 Apr 16, 2026
e8658b5
[Test] Add performance tests for Qwen-Image-Layered model (#2807)
kechengliu97 Apr 16, 2026
dab0720
clean code
ZhengWG Apr 16, 2026
05e7e71
refacotr: keep name style & keep stage_pool clean
ZhengWG Apr 16, 2026
7d47ccf
refacotr: keep stage_id variable name
ZhengWG Apr 16, 2026
322620f
[Fix][Fish Speech] Remove redundant get_vocab() in control token enco…
Sy0307 Apr 16, 2026
45760d6
[Test] Skip tests for known issues in audio and speaker recognition …
yenuo26 Apr 16, 2026
1219c0f
UT: add ut for multi-replica
ZhengWG Apr 16, 2026
70501de
Merge branch 'main' into support-stage-scale-out
ZhengWG Apr 16, 2026
12 changes: 10 additions & 2 deletions .buildkite/pipeline.yml
@@ -44,11 +44,19 @@ steps:
agents:
queue: "cpu_queue_premerge"

- # L4 Test — main+NIGHTLY=1 (scheduled), or PR with label nightly-test (e.g. add label then Rebuild)
+ # L4 Test — main+NIGHTLY=1 (scheduled), or PR with specific label (e.g. add label then Rebuild)
- label: "Upload Nightly Pipeline"
depends_on: image-build
key: upload-nightly-pipeline
- if: '(build.branch == "main" && build.env("NIGHTLY") == "1") || (build.branch != "main" && build.pull_request.labels includes "nightly-test")'
+ if: >-
+ (build.branch == "main" && build.env("NIGHTLY") == "1") ||
+ (build.branch != "main" && (
+ build.pull_request.labels includes "nightly-test" ||
+ build.pull_request.labels includes "omni-test" ||
+ build.pull_request.labels includes "tts-test" ||
+ build.pull_request.labels includes "diffusion-x2iat-test" ||
+ build.pull_request.labels includes "diffusion-x2v-test"
+ ))
commands:
- buildkite-agent pipeline upload .buildkite/test-nightly.yml
agents:
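
The widened `if:` condition in the hunk above can be read as a small predicate. The following Python sketch mirrors that logic for illustration only; it is not part of the PR, and the function name `should_upload_nightly` and the constant `NIGHTLY_LABELS` are hypothetical. Buildkite evaluates the real expression itself.

```python
from typing import Iterable, Optional

# Labels taken verbatim from the new `if:` expression in .buildkite/pipeline.yml.
NIGHTLY_LABELS = frozenset({
    "nightly-test",
    "omni-test",
    "tts-test",
    "diffusion-x2iat-test",
    "diffusion-x2v-test",
})


def should_upload_nightly(branch: str,
                          nightly_env: Optional[str],
                          pr_labels: Iterable[str]) -> bool:
    """Hypothetical mirror of the step condition (illustration only).

    On `main`, the step runs only when NIGHTLY is exactly "1".
    On any other branch, it runs when the PR carries at least one
    of the labels listed in NIGHTLY_LABELS.
    """
    if branch == "main":
        return nightly_env == "1"
    # isdisjoint() is True when no label matches, so negate it.
    return not NIGHTLY_LABELS.isdisjoint(pr_labels)
```

Under this reading, a PR branch labelled only `tts-test` now uploads the nightly pipeline, whereas before the change only `nightly-test` did.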
2 changes: 1 addition & 1 deletion .buildkite/test-amd-merge.yml
@@ -54,7 +54,7 @@ steps:
# - export GPU_ARCHS=gfx942
# - export VLLM_LOGGING_LEVEL=DEBUG
# - export VLLM_WORKER_MULTIPROC_METHOD=spawn
- # - timeout 20m pytest -s -v tests/e2e/offline_inference/test_stable_audio_model.py
+ # - timeout 20m pytest -s -v tests/e2e/offline_inference/test_stable_audio_expansion.py -m "advanced_model and diffusion and L4" --run-level advanced_model

- label: "Diffusion Cache Backend Test"
agent_pool: mi325_1
2 changes: 1 addition & 1 deletion .buildkite/test-amd-ready.yaml
@@ -69,7 +69,7 @@ steps:
# - export GPU_ARCHS=gfx942
# - export VLLM_LOGGING_LEVEL=DEBUG
# - export VLLM_WORKER_MULTIPROC_METHOD=spawn
- # - timeout 20m pytest -s -v tests/e2e/offline_inference/test_stable_audio_model.py
+ # - timeout 20m pytest -s -v tests/e2e/offline_inference/test_stable_audio_expansion.py -m "advanced_model and diffusion and L4" --run-level advanced_model

- label: "Diffusion Cache Backend Test"
agent_pool: mi325_1
20 changes: 1 addition & 19 deletions .buildkite/test-merge.yml
@@ -76,24 +76,6 @@ steps:
volumes:
- "/fsx/hf_cache:/fsx/hf_cache"

- - label: "Audio Generation Model Test"
- timeout_in_minutes: 20
- depends_on: upload-merge-pipeline
- commands:
- - pytest -s -v tests/e2e/offline_inference/test_stable_audio_model.py
- agents:
- queue: "gpu_1_queue" # g6.4xlarge instance on AWS, has 1 L4 GPU
- plugins:
- - docker#v5.2.0:
- image: public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:$BUILDKITE_COMMIT
- always-pull: true
- propagate-environment: true
- environment:
- - "HF_HOME=/fsx/hf_cache"
- - "HF_TOKEN"
- volumes:
- - "/fsx/hf_cache:/fsx/hf_cache"
-
- label: "Diffusion Cache Backend Test"
timeout_in_minutes: 15
depends_on: upload-merge-pipeline
@@ -113,7 +95,7 @@ steps:
- "/fsx/hf_cache:/fsx/hf_cache"

- label: "Diffusion Sequence Parallelism Test"
- timeout_in_minutes: 20
+ timeout_in_minutes: 25
depends_on: upload-merge-pipeline
commands:
- pytest -s -v tests/e2e/offline_inference/test_sequence_parallel.py tests/diffusion/distributed/test_ulysses_uaa_perf.py