[CI/Build] Reorganize models tests #7820


Merged
merged 70 commits into from Sep 13, 2024

Conversation

DarkLight1337
Member

@DarkLight1337 DarkLight1337 commented Aug 23, 2024

To avoid timeout errors, split up the models tests as per #7439 (comment).

Also, I have moved the distributed basic correctness and model tests into the respective files for single-GPU case to avoid fragmentation of the test logic.

Note: The core_model flag isn't used in CI yet.
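
For context, a minimal sketch of how the flag could eventually be consumed, assuming core_model is registered as a pytest marker (the test name and CI invocation below are illustrative, not part of this PR):

# Illustrative only: assumes a registered pytest marker named "core_model".
import pytest


@pytest.mark.core_model  # assumed marker; not yet selected by any CI step
def test_model_basic_correctness():
    # Placeholder body; a real test would compare vLLM outputs against the
    # HF Transformers reference implementation.
    pass


# A future CI step could then run only the marked subset, e.g.:
#   pytest -m core_model tests/models/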

cc @khluu


👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which consists of a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of the default ones by unblocking the steps in your fast-check build on the Buildkite UI.

Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge).

To run full CI, you can do one of these:

  • Comment /ready on the PR
  • Add the ready label to the PR
  • Enable auto-merge

🚀

@DarkLight1337
Member Author

DarkLight1337 commented Aug 24, 2024

For some reason, splitting them up actually causes the multimodal tests to take longer than before...

Edit: It's probably because Qwen-VL tests are now being run as well. Let me disable them for now...

@DarkLight1337
Member Author

DarkLight1337 commented Aug 24, 2024

That didn't seem to reduce the test time. @khluu any ideas?

@DarkLight1337
Member Author

DarkLight1337 commented Aug 25, 2024

The longer test time also occurs for other PRs. I hope this PR didn't mess up the HF cache somehow...

@DarkLight1337 DarkLight1337 changed the title [DONT MERGE] [CI/Build] Reorganize models tests [CI/Build] Reorganize models tests Aug 26, 2024
@DarkLight1337 DarkLight1337 requested a review from simon-mo August 26, 2024 07:41
@DarkLight1337
Member Author

Seems to be a temporary network issue, since the test times are back to normal now. Let's merge this before working on core_model.

@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Aug 26, 2024
@@ -311,11 +332,11 @@ steps:
- tests/distributed/
commands:
      - # the following commands are for the first node, with ip 192.168.10.10 (ray environment already set up)
-     - VLLM_TEST_SAME_HOST=0 torchrun --nnodes 2 --nproc-per-node=2 --rdzv_backend=c10d --rdzv_endpoint=192.168.10.10 distributed/test_same_node.py
+     - VLLM_TEST_SAME_HOST=0 torchrun --nnodes 2 --nproc-per-node=2 --rdzv_backend=c10d --rdzv_endpoint=192.168.10.10 distributed/test_same_node.py | grep -q 'Same node test passed'
Member


why add grep?

Member Author

@DarkLight1337 DarkLight1337 Sep 13, 2024


Since I have added an if __name__ == "__main__" guard to avoid executing the code during test collection, I use grep to ensure that the code inside the guard is actually run during the test.
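
As a rough sketch of the pattern described here (simplified and hypothetical; the real test body in distributed/test_same_node.py differs):

# distributed/test_same_node.py -- simplified sketch of the guard pattern only.

def test_same_node():
    # Placeholder for the actual same-node correctness check run across ranks.
    pass


if __name__ == "__main__":
    # Executed only when the file is launched directly (e.g. via torchrun in CI),
    # not when pytest merely collects the module. Printing a sentinel string lets
    # the CI command confirm the guarded code really ran:
    #   torchrun ... distributed/test_same_node.py | grep -q 'Same node test passed'
    test_same_node()
    print("Same node test passed")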

@youkaichao
Member

youkaichao commented Sep 13, 2024

LGTM in general, thanks for the huge efforts!

My only question is about forking a new process for every test (which is resolved in the comments).

@tlrmchlsmth
Collaborator

Nice reorganization

tlrmchlsmth added a commit to neuralmagic/nm-vllm that referenced this pull request Sep 13, 2024
dtrifiro pushed a commit to opendatahub-io/vllm that referenced this pull request Sep 16, 2024
Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024
garg-amit pushed a commit to garg-amit/vllm that referenced this pull request Oct 28, 2024
LeiWang1999 pushed a commit to LeiWang1999/vllm-bitblas that referenced this pull request Mar 26, 2025
Labels
ready ONLY add when PR is ready to merge/full CI is needed
4 participants