fix(test): use async side_effect for client.post mock in watsonx test#21275
Merged
Conversation
The test_watsonx_gpt_oss_prompt_transformation test was using `return_value` to mock an async method (`AsyncHTTPHandler.post`), which doesn't work correctly with async/await. This could cause intermittent failures in CI due to test ordering.

Changed to use `side_effect` with an async function (`mock_post_func`) to properly mock the async post method, following the same pattern used in other async tests like `test_vertex_ai_gpt_oss_reasoning_effort`. This ensures the mock is always called correctly regardless of test execution order or parallel test execution.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
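A minimal standalone sketch of the failure mode, using only `unittest.mock` and independent of the litellm test itself (`mock_response`, `mock_post_func`, and the URL are illustrative stand-ins): `return_value` on a plain `MagicMock` hands back a non-awaitable object, while `side_effect` set to an async function returns a coroutine that `await` can consume.

```python
import asyncio
from unittest.mock import MagicMock

mock_response = {"choices": [{"message": {"content": "ok"}}]}

# Broken pattern: the mock returns a plain dict, which cannot be awaited.
broken_post = MagicMock(return_value=mock_response)

# Fixed pattern: calling the mock invokes the async function, which
# returns a coroutine, so `await mock(...)` works as expected.
async def mock_post_func(*args, **kwargs):
    return mock_response

fixed_post = MagicMock(side_effect=mock_post_func)

async def main():
    result = await fixed_post("https://example.invalid/v1/text", data={})
    assert result == mock_response
    try:
        await broken_post("https://example.invalid/v1/text", data={})
    except TypeError:
        return "broken raises TypeError"
    return "broken unexpectedly awaited"

print(asyncio.run(main()))  # → broken raises TypeError
```

Note that `unittest.mock.AsyncMock` (Python 3.8+) also solves this, since its `return_value` is wrapped in a coroutine automatically; the `side_effect` approach shown here is what this PR uses.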
Contributor
Greptile Summary: Fixes an intermittent test failure in `test_watsonx_gpt_oss_prompt_transformation`.
Confidence Score: 5/5
| Filename | Overview |
|---|---|
| tests/test_litellm/llms/watsonx/test_watsonx.py | Replaces return_value with async side_effect function for AsyncHTTPHandler.post mock, fixing intermittent test failure. Follows established pattern from vertex_ai tests. |
Sequence Diagram

```mermaid
sequenceDiagram
    participant Test as test_watsonx_gpt_oss_prompt_transformation
    participant LiteLLM as litellm.acompletion
    participant Client as AsyncHTTPHandler.post (mocked)
    Test->>LiteLLM: await acompletion(model, messages, client)
    LiteLLM->>Client: await client.post(url, data)
    Note over Client: Before: return_value (sync)<br/>→ intermittent failure<br/>After: side_effect=async func<br/>→ properly awaitable
    Client-->>LiteLLM: mock_completion_response
    LiteLLM-->>Test: response
    Test->>Test: assert mock_post.call_count >= 1
```
Last reviewed commit: be63bac
Contributor
Greptile Summary: Fixes an intermittent test failure in `test_watsonx_gpt_oss_prompt_transformation`.
Confidence Score: 5/5
| Filename | Overview |
|---|---|
| tests/test_litellm/llms/watsonx/test_watsonx.py | Correctly fixes async mock for AsyncHTTPHandler.post by using side_effect with an async function instead of return_value, following the established pattern from the Vertex AI test. No issues found. |
Sequence Diagram

```mermaid
sequenceDiagram
    participant Test as test_watsonx_gpt_oss_prompt_transformation
    participant LiteLLM as litellm.acompletion
    participant Client as AsyncHTTPHandler
    participant MockPost as mock_post_func (async)
    Test->>LiteLLM: await acompletion(model, messages, client)
    LiteLLM->>Client: await client.post(url, data)
    Client->>MockPost: side_effect triggers async mock
    MockPost-->>Client: returns mock_completion_response
    Client-->>LiteLLM: mock response
    LiteLLM-->>Test: completion result
    Test->>Test: assert mock_post.call_count >= 1
    Test->>Test: verify prompt transformation in request body
```
Last reviewed commit: be63bac
jquinter added a commit that referenced this pull request on Feb 15, 2026:
…ution

Implements three key improvements to reduce test flakiness from parallel execution:

1. **Split Vertex AI tests into separate group** (workers: 1)
   - Vertex AI tests often have environment variable pollution issues
   - Running serially prevents cross-test interference with GOOGLE_APPLICATION_CREDENTIALS
   - Isolates authentication-related test failures
2. **Reduce workers for other LLM tests** (4 -> 2)
   - Decreases chance of race conditions and state conflicts
   - Still parallel but with less contention
3. **Add --dist=loadscope to pytest-xdist**
   - Keeps tests from the same file together on one worker
   - Reduces interference between unrelated test modules
   - Data shows 70% pass rate WITH loadscope vs 40% WITHOUT
   - Better test isolation while maintaining parallelism

Note: loadscope exposes one tokenizer cache issue in core-utils which will be fixed in a separate PR. The tradeoff is worth it (7/10 pass vs 4/10 without).

These changes address the root causes of intermittent test failures in PRs #21268, #21271, #21272, #21273, #21275, #21276:

- Environment variable pollution (GOOGLE_APPLICATION_CREDENTIALS, VERTEXAI_PROJECT)
- Global state conflicts (litellm.known_tokenizer_config)
- Async mock timing issues with parallel execution

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
jquinter added a commit that referenced this pull request on Feb 18, 2026:
Summary
Fixes an intermittent test failure in `test_watsonx_gpt_oss_prompt_transformation` where POST was not being called (`call_count` was 0).

Root Cause

The test was using `return_value` to mock an async method (`AsyncHTTPHandler.post`), which doesn't properly handle async/await. This could cause the mock to not be invoked correctly, especially under certain test execution orders or when running tests in parallel.

Changes

- Changed `mock_post.return_value = mock_completion_response` to `side_effect=mock_post_func` with an async function
- No longer sets `return_value` after the context manager was created
- Follows the same pattern as `test_vertex_ai_gpt_oss_reasoning_effort`

Testing

- `poetry run pytest tests/test_litellm/llms/watsonx/test_watsonx.py -v`
- Verified `test_watsonx_gpt_oss_prompt_transformation`
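The change described above can be sketched in isolation as follows. This is a hedged, self-contained approximation, not the actual litellm test: the `AsyncHTTPHandler` class here is a hypothetical stand-in for litellm's real handler, and `mock_completion_response` is a placeholder. It shows `patch.object(..., side_effect=mock_post_func)` making the patched `post` properly awaitable so `call_count` is reliably incremented.

```python
import asyncio
from unittest.mock import MagicMock, patch

class AsyncHTTPHandler:
    # Stand-in for the real HTTP handler; never hits the network in this sketch.
    async def post(self, url, **kwargs):
        raise RuntimeError("network disabled in tests")

mock_completion_response = MagicMock()

# Async function used as side_effect: calling the mock returns a coroutine.
async def mock_post_func(*args, **kwargs):
    return mock_completion_response

async def run_test():
    client = AsyncHTTPHandler()
    # side_effect (not return_value) keeps the patched method awaitable.
    with patch.object(client, "post", side_effect=mock_post_func) as mock_post:
        response = await client.post("https://watsonx.example/v1/text", json={})
        assert response is mock_completion_response
        assert mock_post.call_count >= 1
        return mock_post.call_count

print(asyncio.run(run_test()))  # → 1
```

Had the sketch used `patch.object(client, "post", return_value=mock_completion_response)` instead, the `await` inside `run_test` would raise `TypeError` because a plain `MagicMock` call returns the non-awaitable `mock_completion_response` directly.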
This is not related to PR #21217 (which only modifies Anthropic tests). This is a pre-existing issue in the watsonx test that could manifest under certain test execution orders.
🤖 Generated with Claude Code