
[Spyre-Next] Run upstream vLLM tests with pytest #800

Merged
joerunde merged 12 commits into main from upstream-tests on Mar 6, 2026

Conversation

@tjohnson31415
Collaborator

Description

Adds a conftest.py for vllm_spyre_next to experiment with an approach to pulling in and running a subset of upstream vLLM tests as part of our suite.

A shallow checkout of upstream vLLM matching our pinned commit is downloaded to a cache directory. A configurable list of test files is gathered and added to the test suite with the `upstream` marker, a list of regular expressions is used to select the test cases that are expected to pass and mark them with `upstream_passing`, and then all the tests are executed together.

See conftest.py documentation for more detail.
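The selection step can be sketched as follows; `select_passing` and the sample test ids are illustrative only, not the actual conftest.py API:

```python
# Sketch of the selection described above: given collected upstream test
# ids and a list of regular expressions, pick the ids that should carry
# the `upstream_passing` marker. Names here are illustrative only.
import re

def select_passing(test_ids: list[str], patterns: list[str]) -> set[str]:
    """Return the subset of test ids matched by any pattern."""
    compiled = [re.compile(p) for p in patterns]
    return {tid for tid in test_ids if any(c.search(tid) for c in compiled)}
```

In the real conftest.py this logic runs during pytest collection, where every gathered upstream test additionally receives the `upstream` marker.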

Related Issues

FIX: #795

Test Plan

pytest ./vllm_spyre_next/tests -m 'spyre or upstream_passing'

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
@github-actions

github-actions bot commented Mar 6, 2026

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: make sure that your code passes all the linting checks, otherwise your PR can't be merged. To do so, run ./format.sh.
Now you are good to go 🚀.

We also recommend installing prek and configuring it to check your code before every local commit.

@joerunde
Collaborator

joerunde commented Mar 6, 2026

bot:next-test

@joerunde
Collaborator

joerunde commented Mar 6, 2026

nit: we should probably open this from your fork as we switch off of using branches

Comment thread vllm_spyre_next/pyproject.toml Outdated
Collaborator


is it possible to add vllm[dev] here to pull in its dev dependencies automatically from our locked version?

Collaborator


Update: no, vllm doesn't publish a dev extra, so we can't. It looks like we would have to do something like

uv add --dev -r /path/to/vllm/requirements/dev.txt

and maintain that ourselves :(


## Supported Configuration

- **SKIP_UPSTREAM_TESTS**: Set to "1", "true", or "yes" to skip upstream tests entirely
Collaborator


If we're adding a lot of environment-based config like this, I think it'd be worth putting these into a test_environment.py file or similar, where the variables and their defaults are all defined in one place for readability

- Default: "https://github.com/vllm-project/vllm"
- **UPSTREAM_TESTS_PATHS**: Comma-separated paths relative to vLLM's tests/ directory
- Default: "models/language/generation"
- Example: "models/language/generation,models/vision"
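A consolidated module along the lines suggested in this thread might look like the sketch below; the helper names are hypothetical, and only the variable names and defaults come from this PR:

```python
# test_environment.py sketch (hypothetical): one place for all
# upstream-test configuration. Variable names and defaults mirror the
# ones documented in this PR; the helpers are illustrative only.
import os

_TRUTHY = {"1", "true", "yes"}

def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY

def env_list(name: str, default: str) -> list[str]:
    """Interpret a comma-separated environment variable as a list of paths."""
    return [p.strip() for p in os.environ.get(name, default).split(",") if p.strip()]

SKIP_UPSTREAM_TESTS = env_flag("SKIP_UPSTREAM_TESTS")
UPSTREAM_TESTS_PATHS = env_list("UPSTREAM_TESTS_PATHS", "models/language/generation")
```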
Collaborator

@joerunde joerunde Mar 6, 2026


Something to think about longer term here is that the upstream test pipelines generally split these groups of files out into their own jobs. Our eventual goal is to have buildkite workers with spyre cards running on vllm-project/vllm directly, but in the meantime I suspect we'll need/want to emulate that pipelining ourselves internally.

So being able to set UPSTREAM_TESTS_PATHS / UPSTREAM_PASSING_PATTERNS is really nice to enable this, but we'll need to store some config somewhere that defines the set of jobs to run, like:

vllm_test_jobs:
  - UPSTREAM_TESTS_PATHS: models/language/generation
    UPSTREAM_PASSING_PATTERNS: facebook
  - UPSTREAM_TESTS_PATHS: models/vision
    UPSTREAM_PASSING_PATTERNS: llava_next

(not suggesting for this PR, just thinking out loud about where we're heading)
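In the same spirit, a driver for such a job matrix could be as simple as the sketch below; the job list mirrors the YAML above, and everything else (function names, the marker expression) is illustrative, not part of this PR:

```python
# Illustrative sketch of running the job matrix above: each entry sets
# its env vars and invokes pytest once. Nothing here is actual repo code.
import os
import subprocess

VLLM_TEST_JOBS = [
    {"UPSTREAM_TESTS_PATHS": "models/language/generation",
     "UPSTREAM_PASSING_PATTERNS": "facebook"},
    {"UPSTREAM_TESTS_PATHS": "models/vision",
     "UPSTREAM_PASSING_PATTERNS": "llava_next"},
]

def run_jobs(jobs, runner=subprocess.run):
    """Run pytest once per job, with that job's env vars applied."""
    for job in jobs:
        env = {**os.environ, **job}
        runner(["pytest", "-m", "upstream_passing"], env=env, check=True)
```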


# Enable sparse checkout at the worktree
_run(["git", "sparse-checkout", "init", "--cone"], cwd=td_path)
_run(["git", "sparse-checkout", "set", *sparse_paths], cwd=td_path)
Collaborator


This is really nice: the full vllm repo is getting pretty hefty 🚀
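For reference, the sparse, shallow checkout done in conftest.py is roughly equivalent to the shell session below, demonstrated here against a throwaway local repo (the real code targets the pinned vLLM commit; all paths are illustrative):

```shell
# Build a throwaway "upstream" repo with two test directories.
set -e
src=$(mktemp -d); dst=$(mktemp -d)
git -C "$src" init -q
mkdir -p "$src/tests/models/language/generation" "$src/tests/models/vision"
echo "def test_ok(): pass" > "$src/tests/models/language/generation/test_demo.py"
echo "def test_ok(): pass" > "$src/tests/models/vision/test_demo.py"
git -C "$src" add -A
git -C "$src" -c user.email=ci@example.com -c user.name=ci commit -qm init

# Shallow clone with no files checked out, then restrict the worktree
# to the configured test paths via cone-mode sparse checkout.
git clone -q --depth 1 --no-checkout "file://$src" "$dst/worktree"
git -C "$dst/worktree" sparse-checkout init --cone
git -C "$dst/worktree" sparse-checkout set tests/models/language/generation
git -C "$dst/worktree" checkout -q
```

After the final checkout, only `tests/models/language/generation` is materialized; `tests/models/vision` never hits disk.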

Comment thread vllm_spyre_next/tests/conftest.py Outdated
@joerunde
Collaborator

joerunde commented Mar 6, 2026

bot:next-test
export SKIP_UPSTREAM_TESTS=1

@joerunde
Collaborator

joerunde commented Mar 6, 2026

bot:next-test

1 similar comment

Collaborator

@joerunde joerunde left a comment


lgtm!

@joerunde joerunde merged commit 9520e98 into main Mar 6, 2026
14 checks passed
@joerunde joerunde deleted the upstream-tests branch March 6, 2026 23:55
romitjain pushed a commit to romitjain/vllm-spyre that referenced this pull request Mar 16, 2026
Adds a conftest.py for vllm_spyre_next to experiment with an approach to
pulling in and running a subset of upstream vLLM tests as part of our
suite.

A shallow checkout of upstream vLLM matching our pinned commit is
downloaded to a cache directory. A configurable list of test files is
gathered and added to the test suite with the `upstream` marker, a
list of regular expressions is used to select the test cases that are
expected to pass and mark them with `upstream_passing`, and then all
the tests are executed together.

See conftest.py documentation for more detail.

FIX: torch-spyre#795

```
pytest ./vllm_spyre_next/tests -m 'spyre or upstream_passing'
```

---------

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Signed-off-by: romit <romit@ibm.com>
romitjain pushed a commit to romitjain/vllm-spyre that referenced this pull request Mar 16, 2026
romitjain pushed a commit to romitjain/vllm-spyre that referenced this pull request Mar 17, 2026
romitjain pushed a commit to romitjain/vllm-spyre that referenced this pull request Mar 17, 2026
joerunde added a commit that referenced this pull request Mar 19, 2026
## Description

This PR does 2 things:

1. **Adds tests for SpyreRMSNorm that run from
`vllm-spyre/vllm_spyre_next`**

There are 2 tests added - a unit test verifying the correctness of the
layer on CPU/Spyre and an integration test to ensure `forward_oot` gets
called when it is installed as a vLLM plugin.
While writing down these tests, I saw a couple of issues in the
SpyreRMSNorm implementation - which I have attempted to fix, but please
correct me if I am wrong.

2. **Adds a framework for running upstream vLLM tests and runs RMSNorm
upstream tests**

Building on vllm-project/vllm#36246, this PR
also adds a framework that can be used to filter and update upstream
tests and run them from the `vllm` repo.

1. We clone vllm separately and run tests from the vllm repo (copied
over from #800, not the contribution of this PR)
2. We manage the whitelist/filtering logic via a declarative YAML

## Related Issues


#805

## Test Plan


To test both the features of this PR:

1. **Tests for SpyreRMSNorm that run from `vllm-spyre/vllm_spyre_next`**

```bash
cd vllm-spyre/vllm_spyre_next
# Installs the pytest plugin
uv pip install -e .

VLLM_PLUGINS=spyre_next_ops pytest -rA tests/test_rms_norm.py -m spyre
```
This is expected to produce:

```bash
================================================================================= short test summary info =================================================================================
PASSED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[False-64-1]
PASSED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[False-128-1]
PASSED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[False-256-1]
PASSED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[False-512-1]
PASSED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[True-64-1]
PASSED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[True-128-1]
PASSED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[True-256-1]
PASSED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[True-512-1]
PASSED tests/test_rms_norm.py::test_rmsnorm_oot_dispatch[False]
PASSED tests/test_rms_norm.py::test_rmsnorm_oot_dispatch[True]
SKIPPED [1] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_permute_cols.py:11: permute_cols is not supported on ROCm
FAILED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[False-63-1] - <redacted>
FAILED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[False-65-1] - <redacted>
FAILED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[False-127-1] - <redacted>
FAILED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[False-129-1] - <redacted>
FAILED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[True-63-1] - <redacted>
FAILED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[True-65-1] - <redacted>
FAILED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[True-127-1] - <redacted>
FAILED tests/test_rms_norm.py::test_spyre_rmsnorm_matches_reference[True-129-1] - <redacted>
========================================================== 8 failed, 10 passed, 1 skipped, 5836 deselected, 9 warnings in 24.42s ==========================================================
```

The tests fail at the boundaries of the hidden dim, which is expected
for now, since the hidden dim is not yet padded to 64. (I can raise a
separate PR to fix that, but I did not want to overload this PR.)

2. **Upstream tests that run from vLLM**

This makes use of my PR on vLLM:
vllm-project/vllm#36246, which enables the
RMSNorm test to run for OOT devices

```bash
# For demonstration purposes, I am testing on my fork and commit
export VLLM_COMMIT=a3b591a09545403114885ac7fbd94b63fbac1696
export VLLM_REPO_URL=https://github.com/romitjain/vllm

VLLM_PLUGINS=spyre_next_ops,spyre_next_test python -m pytest -rA -m upstream
```

This is expected to produce:

```bash
================================================================================= short test summary info =================================================================================
PASSED test_layernorm.py::test_rms_norm[False-cpu-0-dtype0-False-64-1]
PASSED test_layernorm.py::test_rms_norm[False-cpu-0-dtype0-False-64-16]
PASSED test_layernorm.py::test_rms_norm[False-cpu-0-dtype1-False-64-1]
PASSED test_layernorm.py::test_rms_norm[False-cpu-0-dtype1-False-64-16]
PASSED test_layernorm.py::test_rms_norm[False-cpu-0-dtype2-False-64-1]
PASSED test_layernorm.py::test_rms_norm[False-cpu-0-dtype2-False-64-16]
SKIPPED [1] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_permute_cols.py:11: permute_cols is not supported on ROCm
SKIPPED [126] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_activation.py:34: not in upstream_tests.yaml
SKIPPED [54] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_activation.py:119: not in upstream_tests.yaml
SKIPPED [4] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_apply_rotary_emb.py:188: Skipping CUDA/ROCm only tests.
SKIPPED [24] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_fused_qk_norm_rope.py:48: fused_qk_norm_rope custom op requires cuda and rocm platform
SKIPPED [4256] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_fused_quant_layernorm.py:156: not in upstream_tests.yaml
SKIPPED [16] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_fused_rms_norm_gated.py:21: not in upstream_tests.yaml
SKIPPED [8] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_fused_rms_norm_gated.py:61: not in upstream_tests.yaml
SKIPPED [12] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_layernorm.py:24: param skipped
SKIPPED [648] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_layernorm.py:85: blocked by upstream_tests.yaml
SKIPPED [24] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_mrope.py:59: Skipping CUDA/ROCm only tests.
SKIPPED [24] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_mrope.py:129: Skipping CUDA/ROCm only tests.
SKIPPED [1] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_opcheck.py: not in upstream_tests.yaml
SKIPPED [384] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_pos_encoding.py:54: not in upstream_tests.yaml
SKIPPED [1] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_pos_encoding.py: not in upstream_tests.yaml
SKIPPED [96] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_rotary_embedding.py:30: not in upstream_tests.yaml
SKIPPED [144] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_rotary_embedding_mla_cache_fused.py:20: not in upstream_tests.yaml
SKIPPED [1] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_uva.py:14: UVA is not available.
SKIPPED [1] ../../.cache/vllm-upstream-tests/worktree-a3b591a09545/tests/kernels/core/test_uva.py:36: UVA is not available.
FAILED test_layernorm.py::test_rms_norm[False-cpu-0-dtype0-True-64-1] - AssertionError: Tensor-likes are not close!
FAILED test_layernorm.py::test_rms_norm[False-cpu-0-dtype0-True-64-16] - AssertionError: Tensor-likes are not close!
FAILED test_layernorm.py::test_rms_norm[False-cpu-0-dtype1-True-64-1] - AssertionError: Tensor-likes are not close!
FAILED test_layernorm.py::test_rms_norm[False-cpu-0-dtype1-True-64-16] - AssertionError: Tensor-likes are not close!
FAILED test_layernorm.py::test_rms_norm[False-cpu-0-dtype2-True-64-1] - AssertionError: Tensor-likes are not close!
FAILED test_layernorm.py::test_rms_norm[False-cpu-0-dtype2-True-64-16] - AssertionError: Tensor-likes are not close!
========================================================= 6 failed, 6 passed, 5825 skipped, 19 deselected, 13 warnings in 23.26s ==========================================================
```

We can see that our YAML is being respected:
1. Most of the upstream tests are skipped because they are not in our YAML
2. 12 tests are skipped because their parameters are not supported
(`param skipped`)
3. 12 tests are selected to run, out of which 6 pass and 6 fail

## Checklist

- [x] I have read the [contributing
guidelines](https://docs.vllm.ai/projects/spyre/en/latest/contributing)
- [x] My code follows the project's code style (run `bash format.sh`)
- [x] I have added tests for my changes (if applicable)
- [ ] I have updated the documentation (if applicable)
- [x] My commits include a `Signed-off-by:` line (DCO compliance)

---------

Signed-off-by: romit <romit@ibm.com>
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
Signed-off-by: Thomas Ortner <boh@zurich.ibm.com>
Signed-off-by: Joe Runde <joe@joerun.de>
Co-authored-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: Prashant Gupta <prashantgupta@us.ibm.com>
Co-authored-by: Joe Runde <joe@joerun.de>
Co-authored-by: Thomas Ortner <boh@zurich.ibm.com>

Development

Successfully merging this pull request may close these issues.

Run upstream vLLM tests on PRs
