[CI] Replace large models with tiny alternatives in tests by tahsintunan · Pull Request #24057 · vllm-project/vllm

tahsintunan · 2025-09-01T21:26:40Z

Purpose

This PR replaces large Llama models with tiny alternatives in test files to reduce CI execution time.

Partially addresses [CI]: Replace use of models with smaller models where possible #23456.
Only targeted the big, gated meta-llama models in this PR. More PRs will follow.
Made sure we're only modifying tests that do NOT test model-specific behavior.

Key changes:

Replace big (>1B) meta-llama models with smaller models (e.g., EleutherAI/pythia-14m, JackFram/llama-68m).
Refactor some test implementations (e.g., test_shutdown, test_basic_correctness, test_sampling_params_e2e).
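
The substitution rule described above can be sketched in plain Python. This is only an illustration, not code from the PR: the helper `pick_test_model`, its `model_specific` flag, and the large model name used in the example call are assumptions; the two tiny model names are the ones listed in the key changes.

```python
# Tiny models named in this PR's description.
TINY_MODELS = ("EleutherAI/pythia-14m", "JackFram/llama-68m")


# Hypothetical helper (not part of vLLM): decide which model a test loads.
def pick_test_model(requested: str, model_specific: bool,
                    fallback: str = "JackFram/llama-68m") -> str:
    """Return `requested` unchanged if the test checks model-specific
    behavior; otherwise swap gated meta-llama models for a tiny fallback."""
    if model_specific:
        # Tests that depend on a particular model keep it.
        return requested
    if requested.startswith("meta-llama/"):
        # Large, gated model in a generic test: use the tiny stand-in.
        return fallback
    return requested


if __name__ == "__main__":
    # The large model name here is illustrative only.
    print(pick_test_model("meta-llama/Llama-3.2-1B-Instruct",
                          model_specific=False))
```

Keeping the decision in one place makes it explicit which tests still need a specific model and which only need some model that loads quickly.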

Test Plan

Ran the following (modified) tests and had them all pass.

# Basic Correctness Tests
pytest -xvs tests/basic_correctness/test_basic_correctness.py
pytest -xvs tests/basic_correctness/test_cpu_offload.py
pytest -xvs tests/basic_correctness/test_cumem.py

# Distributed Tests (Requires multiple GPUs)
pytest -xvs tests/distributed/test_sequence_parallel.py
pytest -xvs tests/entrypoints/llm/test_collective_rpc.py

# API Tests
pytest -xvs tests/entrypoints/openai/test_shutdown.py
pytest -xvs tests/entrypoints/openai/test_run_batch.py
pytest -xvs tests/entrypoints/openai/test_serving_models.py

# Sampling Tests
pytest -xvs tests/samplers/test_ignore_eos.py
pytest -xvs tests/samplers/test_no_bad_words.py

# V1 Tests
VLLM_USE_V1=1 pytest -xvs tests/v1/sample/test_sampling_params_e2e.py
VLLM_USE_V1=1 pytest -xvs tests/v1/core/test_scheduler_e2e.py
VLLM_USE_V1=1 pytest -xvs tests/v1/engine/test_engine_core.py
VLLM_USE_V1=1 pytest -xvs tests/v1/shutdown/test_delete.py
VLLM_USE_V1=1 pytest -xvs tests/v1/shutdown/test_forward_error.py
VLLM_USE_V1=1 pytest -xvs tests/v1/shutdown/test_startup_error.py

Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

github-actions · 2025-09-01T21:26:49Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

tahsintunan · 2025-09-02T14:51:17Z

After further consideration, I've removed the pre-commit check from the Buildkite pipeline. Since the pre-commit checks already run on GitHub Actions (and it's way faster), I think it'd be simpler to just let Buildkite run only after GitHub's pre-commit validation passes. Perhaps we can configure Buildkite to skip builds with failing commit status, or use branch protection to enforce this order.

Not related to this PR, though. Earlier, I added the pre-commit check to Buildkite since it was a relatively small change and addressed #23452, but now I think there are better ways to handle this.

njhill · 2025-09-02T16:07:20Z

Thanks @tahsintunan! Re any changes to the pre-commit flow, it would be good to keep in separate PRs (I know that's maybe n/a now since you reverted it).

njhill

Thanks @tahsintunan, looks great! It may overlap slightly with other PRs, e.g. #23896, but I think that's fine; we can just try to get them all merged quickly.

We were hoping to standardize on hmellor/tiny-random-LlamaForCausalLM as the small model, do you think you could update the PR to use that?

tahsintunan · 2025-09-02T20:35:07Z

@njhill Hey! I've updated the PR to use hmellor/tiny-random-LlamaForCausalLM where possible. Couldn't replace it in 3 tests due to specific requirements:

test_sequence_parallel - No decoder-only support
test_no_bad_words - Tokenizer doesn't support add_prefix_space
test_basic_correctness - Tokenization mismatches when using prompt embeddings (missing whitespace)

Edit: hmellor/tiny-random-LlamaForCausalLM also doesn't work with tensor parallelism (TP) and sequence parallelism (SP)
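
One way to record the exceptions listed in this comment is a small lookup table with the stated reasons. The sketch below is hypothetical and not taken from the PR: the table `INCOMPATIBLE_TESTS` and the helper `tiny_model_for` are illustrative names, and the sketch deliberately does not say which model the excluded tests use instead, since the comment does not state it.

```python
from typing import Optional

# Standard tiny model requested in review.
DEFAULT_TINY_MODEL = "hmellor/tiny-random-LlamaForCausalLM"

# Tests reported in the comment as incompatible with the default model,
# with the reason given there. Hypothetical table, not code from the PR.
INCOMPATIBLE_TESTS = {
    "test_sequence_parallel": "no decoder-only support",
    "test_no_bad_words": "tokenizer lacks add_prefix_space support",
    "test_basic_correctness": "tokenization mismatch with prompt embeddings",
}


def tiny_model_for(test_name: str) -> Optional[str]:
    """Return the default tiny model, or None when the test is listed as
    incompatible and must keep a different model."""
    if test_name in INCOMPATIBLE_TESTS:
        return None
    return DEFAULT_TINY_MODEL
```

Returning None rather than a substitute keeps the sketch from guessing which model each excluded test actually uses.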

hmellor · 2025-10-08T11:46:22Z

@tahsintunan could you please look into the remaining failures?

mergify · 2025-10-14T04:34:14Z

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @tahsintunan.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

tahsintunan · 2025-10-14T19:01:23Z

@hmellor All failures are fixed

hmellor

All the other changes LGTM, just one question about test_shutdown.py

tests/entrypoints/openai/test_shutdown.py

…ct#24057) Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>

njhill · 2025-10-16T16:20:41Z

Thanks a lot @tahsintunan @hmellor!

tahsintunan added 6 commits September 2, 2025 02:13

add pre-commit check as first CI step to catch linting issues early

9318236

Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

replace arbitrary use of big llama models with smaller models

250b7e4

Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

split test_models into separate basic correctness and sliding window …

1c8e033

…tests Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

skip test_collective_rpc if num_gpu < tp

9ecbd1e

Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

enable prefix caching in test_sampling_params_e2e

115af2e

Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

refactor shutdown test to use explicit server termination

82aa2b3

Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

mergify bot added ci/build v1 labels Sep 1, 2025

tahsintunan changed the title ~~Ci tiny models~~ [CI] Replace large models with tiny alternatives in tests Sep 1, 2025

tahsintunan marked this pull request as ready for review September 1, 2025 21:34

tahsintunan requested review from DarkLight1337, aarnphm, robertgshaw2-redhat and simon-mo as code owners September 1, 2025 21:34

robertgshaw2-redhat added the ready label ("ONLY add when PR is ready to merge/full CI is needed") Sep 1, 2025

remove pre-commit check from buildkite

1749e88

Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

tahsintunan force-pushed the ci-tiny-models branch from 2af99d5 to 1749e88 Compare September 2, 2025 14:44

Merge branch 'main' into ci-tiny-models

62e37dc

njhill reviewed Sep 2, 2025

tahsintunan added 2 commits September 3, 2025 02:29

replace models with hmellor/tiny-random-LlamaForCausalLM

9fefe3e

Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

Merge branch 'main' into ci-tiny-models

16ac0fe

tahsintunan requested a review from njhill September 2, 2025 20:35

tahsintunan added 3 commits September 3, 2025 04:18

remove tiny-random-llama from test_basic_correctness due to tokenizat…

8bae25d

…ion mismatch Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

fix memory profiling test flakiness

3429a6b

Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

use small model to fix CI timeout

86b35f9

Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

tahsintunan force-pushed the ci-tiny-models branch from d87a998 to 8527a62 Compare September 3, 2025 03:44

mergify bot removed the needs-rebase label Oct 8, 2025

hmellor added 2 commits October 8, 2025 12:02

Don't use Pythia because its max model len is too short

4bacf51

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

Revert one test which doesn't pass to unblock the rest

ad78423

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

hmellor enabled auto-merge (squash) October 8, 2025 10:13

mergify bot added the needs-rebase label Oct 14, 2025

Merge branch 'main' into ci-tiny-models

200df3e

Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

auto-merge was automatically disabled October 14, 2025 13:39
Head branch was pushed to by a user without write access

mergify bot removed the needs-rebase label Oct 14, 2025

fix failing tests

93bc36d

Signed-off-by: Tahsin Tunan <tahsintunan@gmail.com>

hmellor reviewed Oct 15, 2025

tests/entrypoints/openai/test_shutdown.py

hmellor approved these changes Oct 16, 2025

hmellor merged commit 43721bc into vllm-project:main Oct 16, 2025
31 checks passed

tahsintunan deleted the ci-tiny-models branch October 16, 2025 20:36

faaany mentioned this pull request Oct 17, 2025

[CI] Use float16 for test_bad_words to ensure compatibility with XPUs #27089

Closed

5 tasks

hmellor merged 27 commits into vllm-project:main from tahsintunan:ci-tiny-models

4 participants
