
fix(test): use async side_effect for client.post mock in watsonx test #21275

Merged
jquinter merged 1 commit into main from fix/watsonx-gpt-oss-async-mock on Feb 15, 2026

Conversation

@jquinter
Contributor

Summary

Fixes intermittent test failure in test_watsonx_gpt_oss_prompt_transformation where POST was not being called (call_count was 0).

Root Cause

The test was using return_value to mock an async method (AsyncHTTPHandler.post), which doesn't properly handle async/await. This could cause the mock to not be invoked correctly, especially under certain test execution orders or when running tests in parallel.
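A minimal sketch of this failure mode (the handler class and URL below are hypothetical stand-ins, not litellm's actual AsyncHTTPHandler; note that `patch.object` auto-selects `AsyncMock` for async methods on Python 3.8+, so the problem typically appears when the mock is configured by hand):

```python
import asyncio
from unittest.mock import MagicMock


class FakeAsyncHandler:
    """Hypothetical stand-in for an async HTTP client."""

    async def post(self, url, data=None):
        return {"status": 200}


async def call_with_sync_mock():
    handler = FakeAsyncHandler()
    # A plain MagicMock with return_value replaces the async method:
    # calling it returns a dict, not a coroutine, so `await` fails.
    handler.post = MagicMock(return_value={"status": 200})
    try:
        await handler.post("https://example.invalid")
        return "awaited ok"
    except TypeError:
        return "TypeError: not awaitable"


result = asyncio.run(call_with_sync_mock())
```

When the awaited value is not a coroutine, Python raises `TypeError: object dict can't be used in 'await' expression`, which can surface as the mock appearing never to have been called.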

Changes

  • Changed from mock_post.return_value = mock_completion_response to using side_effect=mock_post_func with an async function
  • Removed the line that set return_value after the context manager was created
  • This follows the same pattern used in other async tests like test_vertex_ai_gpt_oss_reasoning_effort
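The pattern described in the bullets above can be sketched as follows; `mock_completion_response` and the call arguments are illustrative stand-ins, not litellm's actual objects:

```python
import asyncio
from unittest.mock import MagicMock

# Illustrative response payload, not litellm's real schema
mock_completion_response = {"choices": [{"message": {"content": "hello"}}]}


async def mock_post_func(*args, **kwargs):
    # Because the side_effect is an async function, calling the mock
    # returns a coroutine, so `await client.post(...)` behaves correctly.
    return mock_completion_response


mock_post = MagicMock(side_effect=mock_post_func)


async def run():
    return await mock_post("https://example.invalid", data={"prompt": "x"})


response = asyncio.run(run())
```

After the awaited call, `mock_post.call_count` is 1, so assertions like `assert mock_post.call_count >= 1` pass reliably.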

Testing

  • All watsonx tests pass: poetry run pytest tests/test_litellm/llms/watsonx/test_watsonx.py -v
  • The specific test passes: test_watsonx_gpt_oss_prompt_transformation

Related Issues

This is not related to PR #21217 (which only modifies Anthropic tests). This is a pre-existing issue in the watsonx test that could manifest under certain test execution orders.

🤖 Generated with Claude Code

The test_watsonx_gpt_oss_prompt_transformation test was using return_value to mock
an async method (AsyncHTTPHandler.post), which doesn't work correctly with
async/await. This could cause intermittent failures in CI due to test ordering.

Changed to use side_effect with an async function (mock_post_func) to properly
mock the async post method, following the same pattern used in other async
tests like test_vertex_ai_gpt_oss_reasoning_effort.

This ensures the mock is always called correctly regardless of test execution
order or parallel test execution.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@vercel

vercel bot commented Feb 15, 2026

The latest updates on your projects.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Feb 15, 2026 10:41pm |


@greptile-apps
Contributor

greptile-apps bot commented Feb 15, 2026

Greptile Summary

Fixes an intermittent test failure in test_watsonx_gpt_oss_prompt_transformation by properly mocking the async AsyncHTTPHandler.post method.

  • Replaced mock_post.return_value = mock_completion_response with side_effect=mock_post_func where mock_post_func is an async function, ensuring the mock correctly handles await semantics
  • This follows the same proven pattern used in test_vertex_ai_gpt_oss_transformation.py tests
  • No production code changes; test-only fix that improves CI reliability

Confidence Score: 5/5

  • This PR is safe to merge — it's a minimal, well-understood test fix with no production code changes.
  • The change is small (net +6 lines), test-only, follows an established pattern in the codebase, and correctly addresses the root cause of async mock handling. No risk to production behavior.
  • No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| tests/test_litellm/llms/watsonx/test_watsonx.py | Replaces return_value with an async side_effect function for the AsyncHTTPHandler.post mock, fixing an intermittent test failure. Follows the established pattern from the vertex_ai tests. |

Sequence Diagram

```mermaid
sequenceDiagram
    participant Test as test_watsonx_gpt_oss_prompt_transformation
    participant LiteLLM as litellm.acompletion
    participant Client as AsyncHTTPHandler.post (mocked)

    Test->>LiteLLM: await acompletion(model, messages, client)
    LiteLLM->>Client: await client.post(url, data)
    Note over Client: Before: return_value (sync)<br/>→ intermittent failure<br/>After: side_effect=async func<br/>→ properly awaitable
    Client-->>LiteLLM: mock_completion_response
    LiteLLM-->>Test: response
    Test->>Test: assert mock_post.call_count >= 1
```

Last reviewed commit: be63bac

@greptile-apps greptile-apps bot left a comment

1 file reviewed, no comments

@jquinter jquinter merged commit b0f2b4b into main Feb 15, 2026
17 of 23 checks passed
@greptile-apps
Contributor

greptile-apps bot commented Feb 15, 2026

Greptile Summary

Fixes an intermittent test failure in test_watsonx_gpt_oss_prompt_transformation by correctly mocking the async AsyncHTTPHandler.post method. The previous approach used return_value on the mock, which doesn't properly handle await — causing the mock to sometimes not be invoked (call_count of 0). The fix uses side_effect with an async def function, which is the established pattern already used in the Vertex AI GPT-OSS test (test_vertex_ai_gpt_oss_reasoning_effort).

  • Replaced mock_post.return_value = mock_completion_response with side_effect=mock_post_func using an async function
  • Removed the separate return_value assignment that was inside the with block
  • No behavioral changes to what the test validates — only the mock setup is corrected

Confidence Score: 5/5

  • This PR is safe to merge — it's a minimal, well-scoped fix to an async mock in a test file with no production code changes.
  • The change is small (net +7/-6 lines), only affects a test file, follows an established pattern from another test in the codebase, and correctly addresses the root cause of intermittent test failures when mocking async methods. No production code is modified.
  • No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| tests/test_litellm/llms/watsonx/test_watsonx.py | Correctly fixes the async mock for AsyncHTTPHandler.post by using side_effect with an async function instead of return_value, following the established pattern from the Vertex AI test. No issues found. |

Sequence Diagram

```mermaid
sequenceDiagram
    participant Test as test_watsonx_gpt_oss_prompt_transformation
    participant LiteLLM as litellm.acompletion
    participant Client as AsyncHTTPHandler
    participant MockPost as mock_post_func (async)

    Test->>LiteLLM: await acompletion(model, messages, client)
    LiteLLM->>Client: await client.post(url, data)
    Client->>MockPost: side_effect triggers async mock
    MockPost-->>Client: returns mock_completion_response
    Client-->>LiteLLM: mock response
    LiteLLM-->>Test: completion result
    Test->>Test: assert mock_post.call_count >= 1
    Test->>Test: verify prompt transformation in request body
```

Last reviewed commit: be63bac

@greptile-apps greptile-apps bot left a comment

1 file reviewed, no comments

jquinter added a commit that referenced this pull request Feb 15, 2026
…ution

Implements three key improvements to reduce test flakiness from parallel execution:

1. **Split Vertex AI tests into separate group** (workers: 1)
   - Vertex AI tests often have environment variable pollution issues
   - Running serially prevents cross-test interference with GOOGLE_APPLICATION_CREDENTIALS
   - Isolates authentication-related test failures

2. **Reduce workers for other LLM tests** (4 -> 2)
   - Decreases chance of race conditions and state conflicts
   - Still parallel but with less contention

3. **Add --dist=loadscope to pytest-xdist**
   - Keeps tests from the same file together on one worker
   - Reduces interference between unrelated test modules
   - Data shows 70% pass rate WITH loadscope vs 40% WITHOUT
   - Better test isolation while maintaining parallelism

Note: loadscope exposes one tokenizer cache issue in core-utils which will be
fixed in a separate PR. The tradeoff is worth it (7/10 pass vs 4/10 without).

These changes address the root causes of intermittent test failures in:
PRs #21268, #21271, #21272, #21273, #21275, #21276:
- Environment variable pollution (GOOGLE_APPLICATION_CREDENTIALS, VERTEXAI_PROJECT)
- Global state conflicts (litellm.known_tokenizer_config)
- Async mock timing issues with parallel execution

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
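The three settings described in this commit message roughly correspond to pytest-xdist invocations like the following; the paths and split are illustrative, since the actual CI configuration is not shown in this PR:

```shell
# 1. Vertex AI tests: one worker (serial) to avoid env-var cross-talk
pytest tests/test_litellm/llms/vertex_ai/ -n 1

# 2 & 3. Remaining LLM tests: two workers, with --dist=loadscope so
# tests from the same module stay together on one worker
pytest tests/test_litellm/llms/ \
  --ignore=tests/test_litellm/llms/vertex_ai \
  -n 2 --dist=loadscope
```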
jquinter added a commit that referenced this pull request Feb 18, 2026
…ution