
[CI] Stabilize cpu offload compressed tensors test #41102

Merged
tjtanaa merged 1 commit into vllm-project:main from ROCm:akaratza_fix_quantization
May 4, 2026

@AndreasKaratzas
Collaborator

Avoid flaky stochastic sampling in the compressed-tensors MoE CPU offload test.

Changes

  • Added an opt-out for seeded sampling checks in compare_two_settings.
  • Disabled only the temperature=1 seeded sampling portion for the w4a16 compressed-tensors MoE offload test.
  • Kept deterministic coverage for greedy completion, token IDs, list prompts, and streaming.
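A minimal sketch of what such an opt-out could look like. This is not the code in tests/utils.py: `run_checks`, the `CompletionFn` type, the two fake settings, and the seed value are hypothetical and exist only for illustration. The only name taken from this PR is the `include_seeded_sampling` flag, and its default of `True` is an assumption based on the word "opt-out".

```python
from typing import Callable, Optional

# Hypothetical completion callable: (prompt, temperature, seed) -> generated text.
CompletionFn = Callable[[str, float, Optional[int]], str]


def run_checks(
    complete: CompletionFn,
    prompt: str,
    *,
    include_seeded_sampling: bool = True,
) -> dict[str, str]:
    """Collect the outputs that a comparison would later assert on."""
    # Greedy decoding (temperature=0) stays in the run unconditionally.
    results = {"greedy": complete(prompt, 0.0, None)}
    if include_seeded_sampling:
        # The temperature=1 seeded check runs only when the caller keeps it on.
        # The seed value here is arbitrary.
        results["seeded_sampling"] = complete(prompt, 1.0, 33)
    return results


# Two fake settings that agree under greedy decoding but not under sampling.
def baseline(prompt: str, temperature: float, seed: Optional[int]) -> str:
    return "greedy-text" if temperature == 0.0 else "sample-a"


def offloaded(prompt: str, temperature: float, seed: Optional[int]) -> str:
    return "greedy-text" if temperature == 0.0 else "sample-b"


# With the opt-out, only the deterministic check is compared, so the runs match.
ref = run_checks(baseline, "Hello", include_seeded_sampling=False)
got = run_checks(offloaded, "Hello", include_seeded_sampling=False)
assert ref == got
```

Leaving the flag at its default would include the sampled output in both result sets, and the two fake settings above would then disagree.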

Testing

  • pytest -s -v tests/quantization/test_cpu_offload.py::test_cpu_offload_compressed_tensors

cc @kenroche

Signed-off-by: Andreas Karatzas <akaratza@amd.com>
@AndreasKaratzas AndreasKaratzas marked this pull request as ready for review April 28, 2026 06:26

@claude claude Bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces an include_seeded_sampling parameter to the test utility functions _test_completion, compare_two_settings, and compare_all_settings in tests/utils.py. This allows for conditionally skipping seeded random sampling checks during model comparisons, which has been applied to the CPU offload quantization tests to improve test stability or relevance in that context. I have no feedback to provide.
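To illustrate how one flag can be threaded through a chain of layered helpers, here is a rough sketch. The function names, bodies, and layering below are hypothetical stand-ins, not the real helpers in tests/utils.py. It assumes the flag is keyword-only and defaults to `True`, so callers that do not pass it keep their current behavior.

```python
calls: list[bool] = []  # records the flag value seen by the innermost helper


def _check_completion(setting: str, *, include_seeded_sampling: bool = True) -> list[str]:
    """Innermost helper: decides which checks run for one setting."""
    calls.append(include_seeded_sampling)
    checks = ["greedy"]
    if include_seeded_sampling:
        checks.append("seeded_sampling")
    return checks


def compare_all(settings: list[str], *, include_seeded_sampling: bool = True) -> list[list[str]]:
    """Middle layer: forwards the flag unchanged for every setting."""
    return [
        _check_completion(s, include_seeded_sampling=include_seeded_sampling)
        for s in settings
    ]


def compare_two(a: str, b: str, *, include_seeded_sampling: bool = True) -> list[list[str]]:
    """Outer layer: a two-setting convenience wrapper around compare_all."""
    return compare_all([a, b], include_seeded_sampling=include_seeded_sampling)


# Existing callers are unaffected; only the opted-out call drops the sampled check.
default_run = compare_two("baseline", "cpu-offload")
opted_out = compare_two("baseline", "cpu-offload", include_seeded_sampling=False)
```

Forwarding the flag by keyword at every layer keeps the default in one visible place per function and avoids positional mix-ups as more options are added.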

@tjtanaa tjtanaa added the rocm (Related to AMD ROCm) and ready (ONLY add when PR is ready to merge/full CI is needed) labels May 4, 2026
@github-project-automation github-project-automation Bot moved this to Todo in AMD May 4, 2026
@tjtanaa tjtanaa enabled auto-merge (squash) May 4, 2026 03:32
@mergify
Contributor

mergify Bot commented May 4, 2026

Hi @AndreasKaratzas, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

@tjtanaa tjtanaa merged commit 6ec9bbe into vllm-project:main May 4, 2026
19 of 22 checks passed
@github-project-automation github-project-automation Bot moved this from Todo to Done in AMD May 4, 2026
@AndreasKaratzas AndreasKaratzas deleted the akaratza_fix_quantization branch May 4, 2026 17:01

Labels

  • ready: ONLY add when PR is ready to merge/full CI is needed
  • rocm: Related to AMD ROCm

Projects

Status: Done


2 participants