[CI] Stabilize cpu offload compressed tensors test by AndreasKaratzas · Pull Request #41102 · vllm-project/vllm

AndreasKaratzas · 2026-04-28T06:25:55Z

Avoid flaky stochastic sampling in the compressed-tensors MoE CPU offload test.
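
To illustrate why a temperature=1 seeded check can be less stable than a greedy one, here is a minimal, self-contained sketch. It is not code from this PR: the logits, the size of the perturbation, and the hand-picked uniform draw `u` are all assumptions chosen to make the effect visible. It shows that a tiny numerical difference in the logits can leave the greedy token unchanged while changing which token a fixed random draw selects.

```python
import math


def softmax(logits):
    """Convert logits to probabilities (temperature = 1)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy(probs):
    """Greedy decoding: index of the most likely token."""
    return max(range(len(probs)), key=probs.__getitem__)


def sample(probs, u):
    """Inverse-CDF sampling: pick the first token whose cumulative
    probability exceeds the uniform draw u (what a fixed seed pins down)."""
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return i
    return len(probs) - 1


# Assumed logits for one decoding step, and the same logits with a
# small numerical difference in the first entry (illustrative values).
baseline = softmax([2.0, 1.0, 0.5])
perturbed = softmax([1.999, 1.0, 0.5])

# Hand-picked draw sitting just below the first CDF boundary of the baseline.
u = baseline[0] - 1e-6

print("greedy:", greedy(baseline), greedy(perturbed))
print("sampled:", sample(baseline, u), sample(perturbed, u))
```

In this sketch the greedy choice is the same for both sets of logits, while the same draw lands on different tokens. Once one sampled token differs, every later token in the completion may differ too, which is one plausible way a seeded comparison becomes flaky.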

Changes

Added an opt-out for seeded sampling checks in compare_two_settings.
Disabled only the temperature=1 seeded sampling portion for the w4a16 compressed-tensors MoE offload test.
Kept deterministic coverage for greedy completion, token IDs, list prompts, and streaming.
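
The opt-out described above can be sketched roughly as follows. This is a simplified illustration, not the actual code in tests/utils.py: only the `include_seeded_sampling` parameter name comes from this PR, while `build_completion_checks`, the dictionary layout, and the seed value are hypothetical.

```python
from typing import Any


def build_completion_checks(
    prompt: str,
    token_ids: list[int],
    include_seeded_sampling: bool = True,
) -> list[dict[str, Any]]:
    """Hypothetical helper: list the request variants a comparison would run.

    The deterministic variants are always included; the temperature=1
    seeded variant is added only when include_seeded_sampling is True.
    """
    checks: list[dict[str, Any]] = [
        {"name": "greedy", "prompt": prompt, "temperature": 0.0},
        {"name": "token_ids", "prompt": token_ids, "temperature": 0.0},
        {"name": "list_prompts", "prompt": [prompt, prompt], "temperature": 0.0},
        {"name": "streaming", "prompt": prompt, "temperature": 0.0, "stream": True},
    ]
    if include_seeded_sampling:
        # Seed value is an arbitrary placeholder for illustration.
        checks.append(
            {"name": "seeded_sampling", "prompt": prompt, "temperature": 1.0, "seed": 0}
        )
    return checks


# Default behaviour keeps the seeded check; a flaky test can opt out.
default_names = [c["name"] for c in build_completion_checks("Hello", [1, 2, 3])]
opt_out_names = [
    c["name"]
    for c in build_completion_checks("Hello", [1, 2, 3], include_seeded_sampling=False)
]
print(default_names)
print(opt_out_names)
```

Defaulting the flag to True means existing callers keep the seeded check, and only the test that opts out loses it.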

Testing

pytest -s -v tests/quantization/test_cpu_offload.py::test_cpu_offload_compressed_tensors

cc @kenroche

Signed-off-by: Andreas Karatzas <akaratza@amd.com>

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

gemini-code-assist

Code Review

This pull request introduces an include_seeded_sampling parameter to the test utility functions _test_completion, compare_two_settings, and compare_all_settings in tests/utils.py. This allows for conditionally skipping seeded random sampling checks during model comparisons, which has been applied to the CPU offload quantization tests to improve test stability or relevance in that context. I have no feedback to provide.

mergify · 2026-05-04T03:33:00Z

Hi @AndreasKaratzas, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?

mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

[CI] Stabilize cpu offload compressed tensors test

c305132

Signed-off-by: Andreas Karatzas <akaratza@amd.com>

AndreasKaratzas marked this pull request as ready for review April 28, 2026 06:26

AndreasKaratzas requested review from mgoin, pavanimajety, robertgshaw2-redhat and yewentao256 as code owners April 28, 2026 06:26

claude Bot reviewed Apr 28, 2026

gemini-code-assist Bot reviewed Apr 28, 2026

AndreasKaratzas mentioned this pull request Apr 29, 2026

[ROCm][Bugfix]: W4A4 MOE using emulation instead of AITER on MXFP4-supported hardware #41175

Merged

4 tasks

arpera mentioned this pull request Apr 30, 2026

Revert "[Perf] Enable FlashInfer top-k/top-p sampler by default" (#40376) #41316

Closed

ZhanqiuHu mentioned this pull request May 2, 2026

[CI Flaky 2026-05-02] Quantization: CPU offload compressed tensors output divergence (MoE non-determinism) ZhanqiuHu/vllm-ci-watch#79

Open

AndreasKaratzas mentioned this pull request May 2, 2026

[ROCm][Quantization][2/N] Refactor quark_moe w4a8 w/ oracle #39136

Merged

tjtanaa approved these changes May 4, 2026

tjtanaa added the rocm (Related to AMD ROCm) and ready (ONLY add when PR is ready to merge/full CI is needed) labels May 4, 2026

github-project-automation Bot added this to AMD May 4, 2026

github-project-automation Bot moved this to Todo in AMD May 4, 2026

tjtanaa enabled auto-merge (squash) May 4, 2026 03:32

tjtanaa merged commit 6ec9bbe into vllm-project:main May 4, 2026
19 of 22 checks passed

github-project-automation Bot moved this from Todo to Done in AMD May 4, 2026

AndreasKaratzas deleted the akaratza_fix_quantization branch May 4, 2026 17:01

tjtanaa merged 1 commit into vllm-project:main from ROCm:akaratza_fix_quantization
