[CI][AMD][BugFix] Use torch.testing.assert_close instead of assert torch.allclose in test_rocm_skinny_gemms.py #34181

Merged
gshtras merged 10 commits into vllm-project:main from rasmith:rasmith_add_assert_close_to_skinny_gemms_test
Feb 18, 2026

@rasmith
Contributor

@rasmith rasmith commented Feb 9, 2026

This PR uses torch.testing.assert_close instead of assert torch.allclose in test_rocm_skinny_gemms.py. It is much easier to see how far off a failure is if the former is used.
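The difference between the two patterns can be sketched as follows. This is a minimal illustration, not the code from test_rocm_skinny_gemms.py: the tensors, tolerances, and helper names (check_with_allclose, check_with_assert_close, failure_message) are hypothetical and chosen only to force a failure.

```python
import torch

# Hypothetical values for illustration; not taken from the vLLM test file.
expected = torch.tensor([1.0, 2.0, 3.0])
actual = torch.tensor([1.0, 2.0, 3.5])


def check_with_allclose(a, b):
    # Old pattern: torch.allclose returns a bool, so the assert itself
    # carries no information about how large the mismatch is.
    assert torch.allclose(a, b, atol=1e-3, rtol=1e-2)


def check_with_assert_close(a, b):
    # New pattern: raises an AssertionError whose message describes
    # the mismatch (element count, greatest differences, tolerances).
    torch.testing.assert_close(a, b, atol=1e-3, rtol=1e-2)


def failure_message(check, a, b):
    # Run a check and return the AssertionError text, or None if it passed.
    try:
        check(a, b)
    except AssertionError as exc:
        return str(exc)
    return None


allclose_msg = failure_message(check_with_allclose, actual, expected)
assert_close_msg = failure_message(check_with_assert_close, actual, expected)

print("assert torch.allclose      ->", repr(allclose_msg))
print("torch.testing.assert_close ->")
print(assert_close_msg)
```

Both checks fail on this input. The second message begins with "Tensor-likes are not close!", the same text that appears in the pytest summary later in this thread, and then reports the mismatch details.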

Signed-off-by: Randall Smith <Randall.Smith@amd.com>
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
@rasmith rasmith requested a review from tjtanaa as a code owner February 9, 2026 23:03
@mergify mergify bot added the rocm and bug labels Feb 9, 2026
@github-project-automation github-project-automation bot moved this to Todo in AMD Feb 9, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request replaces assert torch.allclose with torch.testing.assert_close in tests/kernels/quantization/test_rocm_skinny_gemms.py. This is a good change as torch.testing.assert_close provides more detailed and helpful error messages when tests fail, which aids in debugging. The changes correctly maintain the original absolute and relative tolerances for all assertions. Overall, this is a solid improvement to the test suite's robustness and developer experience.
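Why keeping the same atol and rtol leaves pass/fail outcomes unchanged can be sketched in plain Python. Both functions apply the elementwise rule |actual - expected| <= atol + rtol * |expected|. The helper names below (within_tolerance, summarize_mismatches) and the sample values are hypothetical. Note that torch.testing.assert_close also checks shape, dtype, and device by default, so the outcomes coincide only when those already match.

```python
def within_tolerance(actual, expected, rtol, atol):
    # Elementwise closeness rule shared by torch.allclose and
    # torch.testing.assert_close:
    #   |actual - expected| <= atol + rtol * |expected|
    return abs(actual - expected) <= atol + rtol * abs(expected)


def summarize_mismatches(actual, expected, rtol, atol):
    # Return (number of mismatched elements, greatest absolute difference),
    # the kind of detail a boolean allclose result cannot convey.
    pairs = list(zip(actual, expected))
    mismatched = sum(not within_tolerance(a, e, rtol, atol) for a, e in pairs)
    greatest_abs_diff = max(abs(a - e) for a, e in pairs)
    return mismatched, greatest_abs_diff


# Hypothetical values for illustration only.
expected = [1.0, 2.0, 4.0]
actual = [1.0, 2.0, 4.5]

mismatched, worst = summarize_mismatches(actual, expected, rtol=1e-2, atol=1e-3)
print(f"mismatched elements: {mismatched} / {len(expected)}")
print(f"greatest absolute difference: {worst}")
```

With these inputs only the last element exceeds its allowed difference (0.001 + 0.01 * 4.0 = 0.041), so a boolean check and a reporting check would agree that the comparison fails; only the amount of diagnostic detail differs.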

@tjtanaa
Collaborator

tjtanaa commented Feb 10, 2026

@rasmith can you share some test results?

@rasmith
Contributor Author

rasmith commented Feb 10, 2026

@rasmith can you share some test results?

Sure. There are failures, though: they were introduced by another PR and are being fixed in a separate PR.

Here are the test results. This PR doesn't change the results; it only makes the failures that do occur easier to inspect:

============================================================= short test summary info =============================================================
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-False-False-0-dtype0-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-False-False-0-dtype0-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-False-False-0-dtype0-4-65552-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-False-False-0-dtype1-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-False-False-0-dtype1-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-False-False-0-dtype1-4-65552-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-False-True-0-dtype0-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-False-True-0-dtype0-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-False-True-0-dtype0-4-65552-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-False-True-0-dtype1-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-False-True-0-dtype1-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-False-True-0-dtype1-4-65552-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-True-False-0-dtype0-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-True-False-0-dtype0-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-True-False-0-dtype0-4-65552-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-True-False-0-dtype1-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-True-False-0-dtype1-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-True-False-0-dtype1-4-65552-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-True-True-0-dtype0-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-True-True-0-dtype0-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-True-True-0-dtype0-4-65552-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-True-True-0-dtype1-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-True-True-0-dtype1-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[False-True-True-0-dtype1-4-65552-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-False-False-0-dtype0-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-False-False-0-dtype0-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-False-False-0-dtype0-4-65552-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-False-False-0-dtype1-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-False-False-0-dtype1-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-False-False-0-dtype1-4-65552-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-False-True-0-dtype0-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-False-True-0-dtype0-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-False-True-0-dtype0-4-65552-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-False-True-0-dtype1-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-False-True-0-dtype1-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-False-True-0-dtype1-4-65552-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-True-False-0-dtype0-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-True-False-0-dtype0-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-True-False-0-dtype0-4-65552-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-True-False-0-dtype1-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-True-False-0-dtype1-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-True-False-0-dtype1-4-65552-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-True-True-0-dtype0-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-True-True-0-dtype0-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-True-True-0-dtype0-4-65552-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-True-True-0-dtype1-4-65536-28672-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-True-True-0-dtype1-4-65536-28688-False] - AssertionError: Tensor-likes are not close!
FAILED kernels/quantization/test_rocm_skinny_gemms.py::test_rocm_wvsplitk_fp8_kernel[True-True-True-0-dtype1-4-65552-28672-False] - AssertionError: Tensor-likes are not close!
====================================== 48 failed, 978 passed, 4608 skipped, 3 warnings in 351.13s (0:05:51) =======================================

@mergify

mergify bot commented Feb 11, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @rasmith.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Feb 11, 2026
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
…b.com:rasmith/vllm into rasmith_add_assert_close_to_skinny_gemms_test
@mergify mergify bot removed the needs-rebase label Feb 11, 2026
@rasmith
Contributor Author

rasmith commented Feb 11, 2026

@rasmith can you share some test results?

@tjtanaa A recent PR got merged that fixed all the failures. I would still like to convert all the assert torch.allclose instances to torch.testing.assert_close.

The tests are now:

1026 passed, 4608 skipped, 3 warnings in 280.09s

@rasmith
Contributor Author

rasmith commented Feb 13, 2026

@tjtanaa Could you please take another look?

Signed-off-by: Randall Smith <Randall.Smith@amd.com>
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
@gshtras gshtras enabled auto-merge (squash) February 18, 2026 21:54
@github-actions github-actions bot added the ready label Feb 18, 2026
@gshtras gshtras merged commit 2b84ac6 into vllm-project:main Feb 18, 2026
14 of 15 checks passed
@github-project-automation github-project-automation bot moved this from Todo to Done in AMD Feb 18, 2026
jmamou pushed a commit to jmamou/vllm that referenced this pull request Feb 23, 2026
…rch.allclose in test_rocm_skinny_gemms.py (vllm-project#34181)

Signed-off-by: Randall Smith <Randall.Smith@amd.com>
ZJY0516 pushed a commit to ZJY0516/vllm that referenced this pull request Feb 23, 2026
…rch.allclose in test_rocm_skinny_gemms.py (vllm-project#34181)

Signed-off-by: Randall Smith <Randall.Smith@amd.com>
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
llsj14 pushed a commit to llsj14/vllm that referenced this pull request Mar 1, 2026
…rch.allclose in test_rocm_skinny_gemms.py (vllm-project#34181)

Signed-off-by: Randall Smith <Randall.Smith@amd.com>
tunglinwood pushed a commit to tunglinwood/vllm that referenced this pull request Mar 4, 2026
…rch.allclose in test_rocm_skinny_gemms.py (vllm-project#34181)

Signed-off-by: Randall Smith <Randall.Smith@amd.com>
askliar pushed a commit to askliar/vllm that referenced this pull request Mar 9, 2026
…rch.allclose in test_rocm_skinny_gemms.py (vllm-project#34181)

Signed-off-by: Randall Smith <Randall.Smith@amd.com>
Signed-off-by: Andrii Skliar <askliar@nvidia.com>
Copilot AI pushed a commit to machov/vllm that referenced this pull request Mar 10, 2026
…rch.allclose in test_rocm_skinny_gemms.py (vllm-project#34181)

Signed-off-by: Randall Smith <Randall.Smith@amd.com>
EricccYang pushed a commit to EricccYang/vllm that referenced this pull request Apr 1, 2026
…rch.allclose in test_rocm_skinny_gemms.py (vllm-project#34181)

Signed-off-by: Randall Smith <Randall.Smith@amd.com>
Signed-off-by: EricccYang <yangyang4991@gmail.com>
liuchenbing2026 pushed a commit to liuchenbing2026/vllm that referenced this pull request Apr 4, 2026
…rch.allclose in test_rocm_skinny_gemms.py (vllm-project#34181)

Signed-off-by: Randall Smith <Randall.Smith@amd.com>
Labels

bug: Something isn't working
ready: ONLY add when PR is ready to merge/full CI is needed
rocm: Related to AMD ROCm

Projects

Status: Done

3 participants