[test] Fix flaky tests caused by module reloading and missing mocks#19747
Merged
jquinter merged 8 commits into BerriAI:main (Feb 15, 2026)
Conversation
Contributor (Author): Linter and MCP tests are failing; the patch/solution is waiting in #19738.
Force-pushed from 1f3df50 to cd4ec81.
Member: Still seeing module import errors, @jquinter. Is this intentional?
Contributor (Author): Do tests run against the main/stable release? I noticed you approved my #19738 PR, but it is now merged into the litellm_oss_staging_01_26_2026 branch. When will that be promoted to main? Many fixes were shipped in that PR, and they are the base for further work I have been doing on other issues (mock tests, for example).
1. test_acompletion_with_mcp_streaming_metadata_in_correct_chunks:
- Moved stream consumption inside patch context to avoid real API calls
- The previous implementation had assertions outside the `with patch(...)`
block, causing real OpenAI API calls when consuming the stream
2. TestCheckResponsesCost tests:
- Added skip condition when litellm_enterprise module is not available
- These tests import from litellm_enterprise.proxy.common_utils.check_responses_cost
which is only available in the enterprise version
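The stream fix in item 1 can be sketched with a toy lazy stream (hypothetical `Client`/`Stream` names, not litellm's API): once the `with patch(...)` block exits, the real method is restored, so anything consumed afterwards makes real calls.

```python
# Minimal sketch: a lazy stream calls the client at consumption time, so
# consuming it after `with patch(...)` exits hits the real method again.
from unittest.mock import patch

class Client:
    def fetch_chunk(self):
        return "real-chunk"  # stands in for a real API call

class Stream:
    """Each chunk triggers a client call only when the stream is consumed."""
    def __init__(self, client, n):
        self.client, self.n = client, n

    def __iter__(self):
        for _ in range(self.n):
            yield self.client.fetch_chunk()

client = Client()

# Bug: stream built inside the patch but consumed after it exits.
with patch.object(Client, "fetch_chunk", return_value="mock-chunk"):
    stream = Stream(client, 2)
leaked = list(stream)                # real calls: ["real-chunk", "real-chunk"]

# Fix: consume (and assert) while the patch is still active.
with patch.object(Client, "fetch_chunk", return_value="mock-chunk"):
    safe = list(Stream(client, 2))   # ["mock-chunk", "mock-chunk"]
print(leaked, safe)
```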
1. test_bedrock_converse_budget_tokens_preserved:
- Fixed mocking at the correct level (litellm.acompletion instead of client.post)
- The previous mock didn't work because the code runs through run_in_executor
and the passed client parameter was not being used
2. test_error_class_returns_volcengine_error:
- Changed isinstance check to class name comparison
- This avoids issues when module reloading (in conftest.py) causes class
identity mismatches during parallel test execution
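The class-identity problem behind item 2 is easy to reproduce with a throwaway module (names are illustrative, not litellm's): `importlib.reload` re-executes the module and creates new class objects, so pre-reload instances fail `isinstance` against the reloaded class while a `__name__` comparison still matches.

```python
# Demo: write a throwaway module to a temp dir, create an instance, reload
# the module, then compare isinstance vs. class-name matching.
import importlib
import pathlib
import sys
import tempfile

tmp = tempfile.mkdtemp()
pathlib.Path(tmp, "mymod.py").write_text(
    "class VolcEngineError(Exception):\n    pass\n"
)
sys.path.insert(0, tmp)

import mymod

err = mymod.VolcEngineError()   # instance of the *original* class
importlib.reload(mymod)         # conftest-style reload: a NEW class object

by_isinstance = isinstance(err, mymod.VolcEngineError)           # False
by_name = type(err).__name__ == mymod.VolcEngineError.__name__   # True
print(by_isinstance, by_name)
```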
Fix several tests that fail in CI due to parallel test execution and module reloading in conftest.py.
1. test_empty_assistant_message_handling:
- Use patch.object on factory_module.litellm instead of direct assignment
- Ensures the correct litellm reference is modified after conftest reloads
2. test_embedding_header_forwarding_with_model_group:
- Use patch.object on pre_call_utils_module.litellm instead of direct assignment
- Same fix for the module reloading issue
3. test_embedding_input_array_of_tokens:
- Move mock inside the test function (after the fixture initializes the router)
- Add a skip condition if llm_router is None
- Fixes "AttributeError: None does not have 'aembedding'" in parallel execution
Root cause: conftest.py reloads litellm at module scope, which can cause:
- Different litellm references between test code and library code
- Global state (like llm_router) being None at decorator execution time
- isinstance checks failing due to class identity mismatches
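A compact reproduction of the stale-reference problem and the `patch.object` fix, using throwaway `cfgmod`/`libmod` modules in place of litellm: assigning through the test's stale reference changes nothing, while patching the module object the library actually imports works.

```python
# Demo: invalidate a module in sys.modules (as a conftest reload can),
# then show that direct assignment on a stale reference is invisible to
# the library, while patch.object(libmod.cfgmod, ...) hits the right object.
import pathlib
import sys
import tempfile
from unittest.mock import patch

tmp = tempfile.mkdtemp()
pathlib.Path(tmp, "cfgmod.py").write_text("FLAG = 'default'\n")
pathlib.Path(tmp, "libmod.py").write_text(
    "import cfgmod\n"
    "def get_flag():\n"
    "    return cfgmod.FLAG\n"
)
sys.path.insert(0, tmp)

import cfgmod

stale_cfg = cfgmod
del sys.modules["cfgmod"]     # simulate conftest invalidating the module

import libmod                 # libmod now imports a *fresh* cfgmod object

stale_cfg.FLAG = "patched"    # direct assignment on the stale reference
before = libmod.get_flag()    # still "default"

with patch.object(libmod.cfgmod, "FLAG", "patched"):
    after = libmod.get_flag() # "patched"
print(before, after)
```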
…st execution
- Reload litellm_pre_call_utils module inside test to get fresh litellm reference
- Use string-based patch("litellm.model_group_settings") instead of patch.object
- These changes ensure the patch targets the correct module after conftest reloads litellm
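A sketch of why the string-based target works: `patch("pkg.attr")` resolves the module through `sys.modules` when the patch starts, so it always targets the module object currently registered there, even after a reload (illustrated with a hypothetical `settingsmod`):

```python
# Demo: reload a throwaway module, then patch by string target; the patch
# still lands on the live module object and is undone on context exit.
import importlib
import pathlib
import sys
import tempfile
from unittest.mock import patch

tmp = tempfile.mkdtemp()
pathlib.Path(tmp, "settingsmod.py").write_text("model_group_settings = None\n")
sys.path.insert(0, tmp)

import settingsmod

importlib.reload(settingsmod)   # conftest-style reload

with patch("settingsmod.model_group_settings", {"group": "a"}):
    patched = settingsmod.model_group_settings
restored = settingsmod.model_group_settings   # patch undone on exit
print(patched, restored)
```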
The setup_and_teardown fixture was failing with "ImportError: module litellm not in sys.modules" during parallel test execution. This occurs because another worker might have removed/modified litellm from sys.modules before this test tries to reload it. Fix: Check if litellm is in sys.modules before attempting reload.
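The guard can be sketched as a small helper (hypothetical `safe_reload` name; demonstrated on the stdlib `json` module): reload only when the module is actually present, otherwise fall back to a plain import.

```python
# Demo: importlib.reload raises ImportError if the module is not in
# sys.modules (e.g. removed by a parallel worker), so check first.
import importlib
import sys

def safe_reload(name):
    mod = sys.modules.get(name)
    if mod is not None:
        return importlib.reload(mod)   # present: safe to reload
    return importlib.import_module(name)  # absent: plain import instead

json_mod = safe_reload("json")
print(json_mod.dumps({"ok": True}))
```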
The test was making real API calls instead of using mocks because the conftest.py reloads litellm at module scope, causing stale module references. The mock was patching the old reference while the actual code used the new one. Fix: Reload litellm.containers.main inside the test to get a fresh reference to base_llm_http_handler, then re-import create_container after the reload.
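The reload-then-reimport pattern can be reproduced with two throwaway modules standing in for the handler and `litellm.containers.main` (names are illustrative): a module that captured a reference at import time keeps serving the old object after the low-level module is reloaded, until it is reloaded too and its symbols re-imported.

```python
# Demo: reloading the low-level module creates a new object; the capturing
# module stays stale until it is reloaded as well.
import importlib
import pathlib
import sys
import tempfile

tmp = tempfile.mkdtemp()
pathlib.Path(tmp, "handlermod.py").write_text("HANDLER = object()\n")
pathlib.Path(tmp, "containermod.py").write_text(
    "import handlermod\n"
    "handler_ref = handlermod.HANDLER  # captured at import time\n"
    "def create_container():\n"
    "    return handler_ref\n"
)
sys.path.insert(0, tmp)

import containermod
import handlermod

importlib.reload(handlermod)    # conftest-style reload: new HANDLER object
stale = containermod.create_container() is handlermod.HANDLER    # False

importlib.reload(containermod)  # refresh the capturing module as well
from containermod import create_container   # re-import after the reload
fresh = create_container() is handlermod.HANDLER                 # True
print(stale, fresh)
```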
The tests were making real API calls instead of using mocks because conftest.py reloads litellm at module scope, causing the HTTPHandler class reference in the HuggingFace embedding handler to become stale. The patches were applied to the new class, but the handler used the old one. Fix: Add a reload_huggingface_modules fixture that reloads the relevant modules BEFORE the mock fixtures apply their patches. This ensures all references point to the same class object.
The test_end_to_end_rerank_flow mock for _ensure_access_token was not being applied because conftest reloads litellm, causing the VertexAIRerankConfig class to be a different object than what's patched. Fix: Reload the transformation module in setup_method and re-import the class to ensure the patch targets the same class object used by tests.
Force-pushed from 4e04bd1 to bee4c94.


Relevant issues
Type
🧹 Refactoring
✅ Test
Changes
1. Stream consumption outside patch context
File: tests/test_litellm/responses/mcp/test_chat_completions_handler.py
- test_acompletion_with_mcp_streaming_metadata_in_correct_chunks: Move stream consumption inside the `with patch(...)` block to avoid real API calls
2. Enterprise module skip condition
File: tests/test_litellm/integrations/test_responses_background_cost.py
- TestCheckResponsesCost (6 tests): Add skip when the litellm_enterprise module is not available
3. Fix mock targets
File: tests/test_litellm/llms/anthropic/experimental_pass_through/messages/test_anthropic_experimental_pass_through_messages_handler.py
- test_bedrock_converse_budget_tokens_preserved: Mock litellm.acompletion instead of client.post (which doesn't work across thread boundaries)
4. Fix isinstance check after module reload
File: tests/test_litellm/llms/volcengine/responses/test_volcengine_responses_transformation.py
- test_error_class_returns_volcengine_error: Use a class name comparison instead of isinstance to handle module reloading
5. Fix module reference issues (conftest reloads litellm)
Files: test_converse_transformation.py, test_litellm_pre_call_utils.py
- test_empty_assistant_message_handling: Use patch.object(factory_module.litellm, ...) instead of direct assignment
- test_embedding_header_forwarding_with_model_group: Use patch.object(pre_call_utils_module.litellm, ...) instead of direct assignment
6. Fix decorator timing issue
File: tests/test_litellm/proxy/test_proxy_server.py
- test_embedding_input_array_of_tokens: Move the mock inside the test function (after the fixture initializes the router) and add a skip if llm_router is None
Root Cause
The conftest.py file reloads litellm at module scope, which causes:
- Different litellm references between test code and library code
- Global state (like llm_router) being None at decorator execution time
- isinstance checks failing due to class identity mismatches
Test plan
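The enterprise-only skip from change 2 reduces to an availability check; a minimal stdlib sketch (with pytest, this would back a `@pytest.mark.skipif(not HAS_ENTERPRISE, ...)` marker):

```python
# Demo: importlib.util.find_spec returns None for a missing top-level
# module, letting tests skip cleanly instead of failing at import time.
import importlib.util

def module_available(name):
    return importlib.util.find_spec(name) is not None

has_json = module_available("json")                   # stdlib: True
has_missing = module_available("no_such_module_xyz")  # False
print(has_json, has_missing)
```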