Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 18 additions & 0 deletions tests/models/language/generation/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,31 @@
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""Pytest configuration for vLLM language generation tests."""

import os
import warnings

import torch

from vllm.platforms import current_platform


def pytest_configure(config):
"""Early ROCm configuration that must happen before test collection."""
if not current_platform.is_rocm():
return

# Disable skinny GEMM on ROCm to avoid non-deterministic results
# from atomic reductions in wvSplitKrc kernel.
# See: https://github.com/vllm-project/vllm/pull/33493#issuecomment-3906083975
os.environ["VLLM_ROCM_USE_SKINNY_GEMM"] = "0"
warnings.warn(
"ROCm: Set VLLM_ROCM_USE_SKINNY_GEMM=0 to avoid non-deterministic "
"results from skinny GEMM atomic reductions",
UserWarning,
stacklevel=1,
)
Comment on lines +13 to +27
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using pytest_configure to set the VLLM_ROCM_USE_SKINNY_GEMM environment variable will affect the entire pytest session, not just the tests within this directory. This is because pytest_configure is a global hook that is executed once at the beginning of the test run.

While this will fix the non-determinism in the language model tests, it will also disable skinny GEMMs for all other tests in the suite when running on ROCm. This could unintentionally affect tests that are specifically designed to validate the correctness or performance of the skinny GEMM feature, potentially causing them to not run as intended or to test a disabled code path.

If the intention is to disable this feature globally for all tests as a temporary measure, this implementation is correct, but it would be clearer to move this logic to the root conftest.py file to make its global scope more explicit.

If the goal is to only disable this for the language model tests, a different approach would be needed, such as running these tests in a separate pytest invocation with the environment variable set externally by the CI script.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You made the same comment in my other PR. Setting it off for this entire session will ensure that no further flakiness is observed during language tests. As stated in the comment, there is a legitimate accuracy issue which shall be resolved first. Also, there are no skinny GEMMs tests under this parent.



def pytest_sessionstart(session):
"""Configure ROCm-specific settings before test session starts."""
if not current_platform.is_rocm():
Expand Down