
[ROCm][CI] Fix Assertion Logic For test_gpt_oss #35806

Merged
DarkLight1337 merged 1 commit into vllm-project:main from ROCm:micah/quark-gptoss-test
Mar 3, 2026

Conversation

@micah-wil
Contributor

@micah-wil micah-wil commented Mar 2, 2026

After #35658 was merged, we saw Quantized Models Test started failing in AMD CI:
https://buildkite.com/vllm/amd-ci/builds/5661/steps/canvas?sid=019cad58-e61d-4bf9-8028-1a6a6a7ac897&tab=output

As it turns out, this test was previously skipped because quark was not installed in our CI builds. Now that it is installed, it exposed that this test was flaky: the assertion did not allow the measured accuracy to be higher than the expected accuracy. Before this PR, tests sometimes failed with errors like AssertionError: Expected: 0.89 | Measured: 0.913151364764268. (You can observe this with pytest -v -s models/quantization/test_gpt_oss.py::test_gpt_oss_attention_quantization[amd/gpt-oss-20b-WFP8-AFP8-KVFP8-0.89-1].) Clearly, 0.91 is a perfectly valid accuracy score, so we should accept it.
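The failure mode described above can be sketched as follows. This is an illustrative example, not the exact code from test_gpt_oss.py; the names `check_accuracy`, `measured`, `expected`, and `rtol` are assumptions for the sketch.

```python
def check_accuracy(measured: float, expected: float, rtol: float) -> None:
    # Flaky form (commented out): a symmetric tolerance rejects scores that
    # are *better* than expected, e.g. measured=0.913 vs expected=0.89.
    # assert abs(measured - expected) <= rtol, \
    #     f"Expected: {expected} | Measured: {measured}"

    # Fixed form: only enforce a lower bound, so a higher-than-expected
    # accuracy score passes.
    assert measured >= expected - rtol, (
        f"Expected: {expected} | Measured: {measured}"
    )

# The score from the CI failure above now passes with the fixed logic.
check_accuracy(0.913151364764268, 0.89, 0.02)
```

With the one-sided check, only a genuine accuracy regression (measured below expected minus the tolerance) fails the test.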

Signed-off-by: Micah Williamson <micah.williamson@amd.com>
@mergify mergify Bot added gpt-oss Related to GPT-OSS models rocm Related to AMD ROCm labels Mar 2, 2026
@github-project-automation github-project-automation Bot moved this to Todo in AMD Mar 2, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request addresses a flaky test, test_gpt_oss, which was failing in the AMD CI. The original assertion incorrectly rejected valid accuracy scores that were higher than the expected value plus a tolerance. The fix correctly adjusts the assertion to check only if the measured accuracy meets a minimum threshold (expected_accuracy - rtol), which is the standard practice for such tests. Additionally, an unused general importlib import was replaced with the more specific importlib.util. The changes are correct and well-implemented.
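The importlib.util change mentioned in the review can be sketched as a standard availability check. This is a minimal illustration assuming the test gates on whether quark is importable; the helper name `has_module` is hypothetical.

```python
import importlib.util


def has_module(name: str) -> bool:
    # find_spec returns None when the named module cannot be located,
    # without actually importing it.
    return importlib.util.find_spec(name) is not None


# Hypothetical usage: a test suite could skip quark-dependent tests
# when has_module("quark") is False, which is why these tests were
# silently skipped before quark was added to the CI images.
```

Because the tests had been skipped rather than failing, installing quark in CI was what first surfaced the flaky assertion.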

Contributor

@BowenBao BowenBao left a comment


LGTM, thanks!
The original threshold might be a bit tight on the upper end.

@github-project-automation github-project-automation Bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Mar 3, 2026
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) March 3, 2026 04:19
@github-actions github-actions Bot added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 3, 2026
@DarkLight1337 DarkLight1337 merged commit 8b9e8b7 into vllm-project:main Mar 3, 2026
16 of 17 checks passed
@github-project-automation github-project-automation Bot moved this from Todo to Done in AMD Mar 3, 2026
Copilot AI pushed a commit to machov/vllm that referenced this pull request Mar 10, 2026
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
avinashsingh77 pushed a commit to avinashsingh77/vllm that referenced this pull request Mar 12, 2026
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
wendyliu235 pushed a commit to wendyliu235/vllm-public that referenced this pull request Mar 18, 2026
Signed-off-by: Micah Williamson <micah.williamson@amd.com>

Labels

gpt-oss Related to GPT-OSS models ready ONLY add when PR is ready to merge/full CI is needed rocm Related to AMD ROCm

Projects

Status: Done

Development


3 participants