[CI][AMD][BugFix] Use torch.testing.assert_close instead of assert torch.allclose in test_rocm_skinny_gemms.py #34181
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
Code Review
This pull request replaces assert torch.allclose with torch.testing.assert_close in tests/kernels/quantization/test_rocm_skinny_gemms.py. This is a good change as torch.testing.assert_close provides more detailed and helpful error messages when tests fail, which aids in debugging. The changes correctly maintain the original absolute and relative tolerances for all assertions. Overall, this is a solid improvement to the test suite's robustness and developer experience.
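The review notes that the absolute and relative tolerances are carried over unchanged. As a reminder of how those two values combine, here is a minimal pure-Python sketch of the documented elementwise rule for finite values, `|actual - expected| <= atol + rtol * |expected|`. The helper names and the sample numbers are hypothetical and chosen only for illustration; this is not the PyTorch implementation and not the tolerances used in the test file.

```python
def within_tolerance(actual, expected, *, atol, rtol):
    # Documented closeness rule for finite values:
    # |actual - expected| <= atol + rtol * |expected|
    return abs(actual - expected) <= atol + rtol * abs(expected)


def mismatches(actual_values, expected_values, *, atol, rtol):
    """Return (index, actual, expected, abs_diff, allowed) for each failing pair."""
    failures = []
    for i, (a, e) in enumerate(zip(actual_values, expected_values)):
        if not within_tolerance(a, e, atol=atol, rtol=rtol):
            failures.append((i, a, e, abs(a - e), atol + rtol * abs(e)))
    return failures


# Illustrative values only.
expected_values = [100.0, 0.0, 2.0]
actual_values = [100.9, 0.01, 2.0]

failed = mismatches(actual_values, expected_values, atol=1e-3, rtol=1e-2)
for index, a, e, diff, allowed in failed:
    print(f"index {index}: actual={a} expected={e} "
          f"diff={diff:.4g} allowed={allowed:.4g}")
```

With these numbers the large-magnitude element passes because the relative term dominates, while the element near zero fails because only the absolute term applies there.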
@rasmith can you share some test results?
Sure. There are failures, though, that were caused by this PR; however, they are being fixed by this PR: Here are the test results (this PR doesn't change the results; it just makes it easier to inspect the failures that do occur):
This pull request has merge conflicts that must be resolved before it can be
…b.com:rasmith/vllm into rasmith_add_assert_close_to_skinny_gemms_test
@tjtanaa Please take another look?
…rch.allclose in test_rocm_skinny_gemms.py (vllm-project#34181) Signed-off-by: Randall Smith <Randall.Smith@amd.com>
…rch.allclose in test_rocm_skinny_gemms.py (vllm-project#34181) Signed-off-by: Randall Smith <Randall.Smith@amd.com> Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
…rch.allclose in test_rocm_skinny_gemms.py (vllm-project#34181) Signed-off-by: Randall Smith <Randall.Smith@amd.com> Signed-off-by: Andrii Skliar <askliar@nvidia.com>
…rch.allclose in test_rocm_skinny_gemms.py (vllm-project#34181) Signed-off-by: Randall Smith <Randall.Smith@amd.com> Signed-off-by: EricccYang <yangyang4991@gmail.com>
This PR uses torch.testing.assert_close instead of assert torch.allclose in test_rocm_skinny_gemms.py. It is much easier to see how far off a failure is if the former is used.
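The difference described above can be sketched with a small stand-alone comparison. The tensors, tolerances, and helper names below are hypothetical and are not taken from test_rocm_skinny_gemms.py; the sketch assumes a PyTorch version that provides `torch.testing.assert_close`.

```python
import torch

# Hypothetical tensors standing in for a kernel output and its reference.
expected = torch.tensor([1.0, 2.0, 3.0, 4.0])
actual = torch.tensor([1.0, 2.0, 3.5, 4.0])

# Illustrative tolerances, not the ones used in the test file.
ATOL, RTOL = 1e-3, 1e-2


def check_with_allclose(actual, expected):
    # Old style: torch.allclose only returns a bool, so a failing
    # assert carries no numeric detail of its own.
    assert torch.allclose(actual, expected, atol=ATOL, rtol=RTOL)


def check_with_assert_close(actual, expected):
    # New style: raises an AssertionError whose message describes
    # the mismatch.
    torch.testing.assert_close(actual, expected, atol=ATOL, rtol=RTOL)


def failure_message(check):
    """Run a check and return the AssertionError text, or None if it passed."""
    try:
        check(actual, expected)
    except AssertionError as exc:
        return str(exc)
    return None


old_msg = failure_message(check_with_allclose)
new_msg = failure_message(check_with_assert_close)
print("allclose failure message:", repr(old_msg))
print("assert_close failure message:")
print(new_msg)
```

Both checks fail on the third element here, but only the `assert_close` message reports how many elements mismatched and the greatest absolute and relative differences. One design point to keep in mind: `assert_close` also checks dtype and device by default (`check_dtype=True`, `check_device=True`), so a swap can in principle surface mismatches that were not the subject of the original assertion.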