[Test] Add L4 diffusion feature test for LongCat-Image#1970
[Test] Add L4 diffusion feature test for LongCat-Image#1970Gaohan123 merged 12 commits intovllm-project:mainfrom
Conversation
Signed-off-by: lcukyfuture <zlf994478451@outlook.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 60e291117e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| server_args=[ | ||
| "--cache-backend", | ||
| "cache_dit", |
There was a problem hiding this comment.
Assert Cache-DiT is enabled for the cache_dit cases
These parametrizations only verify that image generation still works, but assert_diffusion_response in tests/conftest.py:1333-1367 checks response validity/shape only. Because Cache-DiT is an acceleration path, LongCat will still return a correct image if the cache backend is skipped or regresses, so this new L4 test can stay green while Cache-DiT support is broken. Please add a positive assertion (log/metric/backend state) that the Cache-DiT hooks were actually enabled.
Useful? React with 👍 / 👎.
| "--ulysses-degree", | ||
| "2", | ||
| ], |
There was a problem hiding this comment.
Verify Ulysses/Ring actually run instead of falling back
The sequence-parallel cases currently pass as long as a request returns an image, but LongCat can warn and continue without SP hooks (vllm_omni/diffusion/registry.py:283-289), and the attention factory explicitly falls back to NoParallelAttention when SP groups are unavailable (vllm_omni/diffusion/attention/parallel/factory.py:47-52). In that scenario the Ulysses/Ring feature is broken but this test still passes, so it does not reliably cover the advertised SP matrix.
Useful? React with 👍 / 👎.
Signed-off-by: lcukyfuture <zlf994478451@outlook.com>
hsliuustc0106
left a comment
There was a problem hiding this comment.
I found one blocking issue in the new LongCat L4 test.
The test claims to validate Ulysses and Ring sequence parallelism (parallel_001 / parallel_002), but LongCat does not appear to have any _sp_plan hooks, and the runtime explicitly no-ops SP when no _sp_plan is found. In vllm_omni/diffusion/registry.py, _apply_sequence_parallel_if_enabled() logs a warning and continues when applied_count == 0, so these cases can still return a valid image without ever exercising the SP path. I also could not find _sp_plan, ulysses_degree, or ring_degree in vllm_omni/diffusion/models/longcat_image/pipeline_longcat_image.py.
That means the PR currently overstates its feature coverage. Please either add actual LongCat SP support plus a test that proves the SP path was used, or remove the Ulysses/Ring cases from the claimed matrix for this model.
@hsliuustc0106 Thanks for your comment. It looks LongCat have SP support, and the _sp_plan is on For completeness, I am including the full server log below. Full server log for the Ulysses case |
|
1.Please use |
Signed-off-by: lcukyfuture <zlf994478451@outlook.com>
344e33c to
5926435
Compare
|
Yeah LongCat does support SP, but it was using the invasive approach instead of the SP plan until this PR - FYI @hsliuustc0106, maybe it's the point of confusion for hook related logs |
| - buildkite-agent artifact upload "tests/*.xlsx" | ||
| agents: | ||
| queue: "cpu_queue_premerge" | ||
| # - label: ":full_moon: Qwen3-TTS Non-Async-Chunk E2E Test" |
There was a problem hiding this comment.
why comment so many lines? @lcukyfuture @linyueqian PTAL
There was a problem hiding this comment.
To only run the relevant test group when temporarily running the test on CI
|
@yenuo26 do we have a guidance for how many tests we should add for models with different priorities? |
Following this comment, since we just made an additional decision that not all models enjoy a full test coverage of diffusion features, we may need extra discussion on how to handle these PR's (lest they crush our CI machine 😁 ) |
Signed-off-by: Lingfeng Zhang <48312954+lcukyfuture@users.noreply.github.com>
|
The conflict has been resolved. You can also assign LongCat-Image-Edit to me if needed. |
Thank you for your assistance. I will proceed with LongCat-Image-Edit. I expect the two files to be quite similar. I will request your review once the PR is ready. Thank again! |
I think we only fallback the test to modelwise offloading if this model does not support layerwise. So the number of tests don't change. But please also see my following comment |
|
@lcukyfuture apologize for any confusion caused, but after some internal discussion, we just decided that we should reduce the number of test cases for not-high-priority models. Could you help settle a recommended feature combination for this model, and edit the test script to only include that feature combination? If you need help finding a good combination of diffusion features, see if hsliuustc0106/vllm-omni-skills#19 this AI skill can help, or search relevant PRs in this repo that introduces this model or relevant features (for any example code snippets). And @NumberWan you can also coordinate on this matter |
|
ok, I will check this. 👍 |
|
@lcukyfuture After my local testing on the server and reviewing the merged PR, I think these two feature combinations are the best options for LongCat-Image:
Since only these cases cover the 1-GPU and 2-GPU configurations, and in my runs Ulysses-SP (2) showed the shortest generation latency. So the current PR can reduce the number of tests to 2. |
|
@lcukyfuture Just a heads-up: the related PR for LongCat-Image-Edit has been created. #2035 |
|
#2035 |
To add, we also plan to reduce the number of test cases for not-high priority models. So you two can also settle the feature combinations to test within 1~2 test cases |
Signed-off-by: lcukyfuture <zlf994478451@outlook.com>
33d30a7 to
f8850b8
Compare
|
@NumberWan I have modified the test cases based on your PR, and it's fine in my local tests. |
| if: build.branch != "main" | ||
| commands: | ||
| - buildkite-agent pipeline upload .buildkite/test-ready.yml | ||
| - buildkite-agent pipeline upload .buildkite/test-nightly.yml |
There was a problem hiding this comment.
Please remember to change it back before merging
|
@hsliuustc0106 @Gaohan123 Please add a ready label for the CI test |
Signed-off-by: Lingfeng Zhang <48312954+lcukyfuture@users.noreply.github.com>
|
@lcukyfuture Please comment out the Your latest temporary nightly CI did not run anything 😂 https://buildkite.com/vllm/vllm-omni/builds/4978/steps/canvas |
Signed-off-by: lcukyfuture <zlf994478451@outlook.com>
Remove duplicate `if: build.env("NIGHTLY") == "1"` and commands block
that was introduced by a bad conflict resolution on GitHub. Keep the
commented-out `if` so the Diffusion Test runs on non-nightly builds.
Signed-off-by: lcukyfuture <zlf994478451@outlook.com>
839864a to
bfe01d6
Compare
|
Sorry, it seems I made a mistake while handling the conflict. I'll fix it. 😢 |
Signed-off-by: lcukyfuture <zlf994478451@outlook.com>
392ca19 to
18e2902
Compare
Signed-off-by: Lingfeng Zhang <48312954+lcukyfuture@users.noreply.github.com>
fhfuih
left a comment
There was a problem hiding this comment.
Thanks. I can see that https://buildkite.com/vllm/vllm-omni/builds/4996/steps/canvas CI passes.
Please change the pipeline YAMLs back, and it will be ready to merge
Signed-off-by: lcukyfuture <zlf994478451@outlook.com>
…1970) Signed-off-by: lcukyfuture <zlf994478451@outlook.com> Signed-off-by: Lingfeng Zhang <48312954+lcukyfuture@users.noreply.github.com>
…1970) Signed-off-by: lcukyfuture <zlf994478451@outlook.com> Signed-off-by: Lingfeng Zhang <48312954+lcukyfuture@users.noreply.github.com>
Purpose
To follow the recent establishment of multi-level testing system, this PR adds L4 test for LongCat-Image.
Test Plan
The most recent list of diffusion features is at #1217. The test covers the supported LongCat-Image diffusion feature matrix in online serving mode, including:
Settings
(1 GPU) Module-wise CPU offloading only(2 GPUs) CacheDiT + Ulysses2Test Result
Collection check:
Additional Test Results:
pytest -s -v tests/e2e/online_serving/test_longcat_image_edit_expansion.py \ -m "diffusion and advanced_model and H100" \ --collect-onlyCI Test Result [Passed]
CI Successful in both 2 tests
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model. Please runmkdocs serveto sync the documentation editions to./docs.BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)