[CI] Remove small resolution test in Qwen-Image Perf test when vae patch parallel is enabled#2872
Signed-off-by: Didan Deng <33117903+wtomin@users.noreply.github.com>
The rationale is sound: small resolutions with VAE patch parallel are indeed unstable. Alternative approaches considered:
Removing the test is the pragmatic solution here.
Reasoning makes sense: with a small resolution and high parallelism, communication overhead dominates. One suggestion: add a comment in the test config explaining why 512x512 is skipped for this parallelization config. Future reviewers may otherwise wonder why this resolution is missing. Example:
…tch parallel is enabled (vllm-project#2872) Signed-off-by: Didan Deng <33117903+wtomin@users.noreply.github.com>
…tch parallel is enabled (vllm-project#2872) Signed-off-by: Didan Deng <33117903+wtomin@users.noreply.github.com> Signed-off-by: nainiu258 <cperfect02@163.com>
Purpose
Solving #2863.
VAE patch parallelism is enabled to reduce peak memory when decoding long sequences. At small resolutions it can increase latency, because the communication overhead outweighs the gains from parallel computation.
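The trade-off can be illustrated with a toy cost model. This is only a sketch: the function names and both constants (`per_megapixel_ms`, `comm_ms_per_rank`) are made-up illustrative values, not measurements from this repository, and the 1536x1536 resolution is a hypothetical larger case.

```python
def decode_latency_ms(height, width, patch_parallel,
                      per_megapixel_ms=400.0, comm_ms_per_rank=60.0):
    """Toy estimate of VAE decode latency (illustrative constants only).

    Compute cost scales with pixel count and is split across ranks;
    communication adds a fixed cost for every rank beyond the first.
    """
    megapixels = height * width / 1e6
    compute_ms = per_megapixel_ms * megapixels / patch_parallel
    comm_ms = comm_ms_per_rank * (patch_parallel - 1)
    return compute_ms + comm_ms


def patch_parallel_helps(height, width, patch_parallel):
    """True if the toy model predicts lower latency than a single rank."""
    return (decode_latency_ms(height, width, patch_parallel)
            < decode_latency_ms(height, width, 1))


if __name__ == "__main__":
    for side in (512, 1536):
        print(side, patch_parallel_helps(side, side, patch_parallel=4))
```

Under these assumed constants, the fixed communication term outweighs the small compute saving at 512x512, while the larger resolution still benefits from splitting the work.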
Therefore, this PR removes the "512x512_steps20" test from the "Ulysses SP=2 + CFG-parallel=2 + VAE Patch Parallel=4" test case. This helps stabilize the nightly CI performance results and keeps the job from failing.
Previous CI results
The table above indicates that larger resolution yields more stable performance results.
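One way to express the same rule in code, rather than deleting the entry by hand, is a small filter with an explanatory comment. This is a hypothetical sketch: `PerfCase`, `select_cases`, the `MIN_SIDE_FOR_VAE_PATCH_PARALLEL` threshold, and the "1024x1024_steps20" case are assumptions for illustration and are not taken from the actual test config.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerfCase:
    """Hypothetical description of one perf-test entry."""
    name: str
    height: int
    width: int
    steps: int


# Assumed threshold, chosen only so that 512x512 is excluded.
MIN_SIDE_FOR_VAE_PATCH_PARALLEL = 1024


def select_cases(cases, vae_patch_parallel_size):
    """Return the cases to run for a given VAE patch parallel size.

    Small resolutions are skipped when VAE patch parallel is enabled:
    communication overhead dominates there, so timings are unstable.
    """
    if vae_patch_parallel_size <= 1:
        return list(cases)
    return [c for c in cases
            if min(c.height, c.width) >= MIN_SIDE_FOR_VAE_PATCH_PARALLEL]


if __name__ == "__main__":
    all_cases = [
        PerfCase("512x512_steps20", 512, 512, 20),
        PerfCase("1024x1024_steps20", 1024, 1024, 20),
    ]
    print([c.name for c in select_cases(all_cases, vae_patch_parallel_size=4)])
```

Keeping the reason in a comment next to the filter records why the small resolution is absent for this parallel configuration.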
Test Plan
Test Result
Wait for Nightly-CI results.
cc @yenuo26 @Gaohan123 @hsliuustc0106