
[test] Fix flaky tests caused by module reloading and missing mocks #19747

Merged
jquinter merged 8 commits into BerriAI:main from jquinter:fix/flaky-tests-missing-api-keys
Feb 15, 2026
Conversation

@jquinter (Contributor)

Relevant issues

  • Fix multiple flaky tests that fail in CI due to parallel test execution and module reloading
  • Add proper skip conditions for tests requiring enterprise modules or API keys
  • Fix incorrect mock targets and test assertions

Type

🧹 Refactoring
✅ Test

Changes

1. Stream consumption outside patch context

File: tests/test_litellm/responses/mcp/test_chat_completions_handler.py

  • test_acompletion_with_mcp_streaming_metadata_in_correct_chunks: Move stream consumption inside the `with patch(...)` block to avoid real API calls
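The pitfall here is that streams are lazy: the generator body only runs when the stream is consumed, so assertions placed after the `with patch(...)` block consume it with the patch already reverted. A minimal self-contained sketch of the failure mode (`Api` and `stream_via` are stand-ins, not litellm code):

```python
from unittest.mock import patch

class Api:
    @staticmethod
    def call():
        raise RuntimeError("real network call!")

def stream_via(api):
    # Generators are lazy: nothing executes until the stream is consumed.
    yield api.call()

def consume_inside_patch():
    with patch.object(Api, "call", return_value="chunk"):
        return list(stream_via(Api))  # consumed while the mock is active

def consume_outside_patch():
    with patch.object(Api, "call", return_value="chunk"):
        gen = stream_via(Api)         # only created here, never consumed
    return list(gen)                  # patch already reverted: real call fires
```

`consume_inside_patch` returns `["chunk"]`; `consume_outside_patch` raises, which is exactly the shape of the flaky real-API-call failure.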

2. Enterprise module skip condition

File: tests/test_litellm/integrations/test_responses_background_cost.py

  • TestCheckResponsesCost (6 tests): Add skip when litellm_enterprise module is not available
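A minimal sketch of such a skip guard, assuming pytest's `skipif` marker; `HAS_ENTERPRISE` is an illustrative name, and the probed module mirrors the enterprise import mentioned above:

```python
import importlib.util

# True only when the enterprise package is importable in this environment.
HAS_ENTERPRISE = importlib.util.find_spec("litellm_enterprise") is not None

# In the test file this flag would gate the whole class, e.g.:
#   @pytest.mark.skipif(not HAS_ENTERPRISE, reason="requires litellm_enterprise")
#   class TestCheckResponsesCost: ...
```

Using `find_spec` (rather than a bare `import` in a try/except) avoids executing the enterprise package just to decide whether to skip.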

3. Fix mock targets

File: tests/test_litellm/llms/anthropic/experimental_pass_through/messages/test_anthropic_experimental_pass_through_messages_handler.py

  • test_bedrock_converse_budget_tokens_preserved: Mock litellm.acompletion instead of client.post (which doesn't work across thread boundaries)
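Why the high-level entry point is the right mock target can be shown with stand-ins (`fake_litellm` and `handler` below are illustrative, not the real handler): when the code under test never touches the injected client, only the module-level coroutine is on the call path, so that is what must be patched.

```python
import asyncio
import types
from unittest.mock import AsyncMock, patch

# Stand-in for the litellm module: a module-level coroutine the handler calls.
fake_litellm = types.SimpleNamespace()

async def real_acompletion(**kwargs):
    raise RuntimeError("would hit the network")

fake_litellm.acompletion = real_acompletion

async def handler(client=None):
    # The handler ignores `client`, so patching client.post is a no-op;
    # the module-level entry point is the only thing actually invoked.
    return await fake_litellm.acompletion(model="bedrock/claude")

def run_test():
    with patch.object(fake_litellm, "acompletion",
                      new=AsyncMock(return_value={"ok": True})):
        return asyncio.run(handler(client=object()))
```

Here `run_test()` returns the mocked payload even though a client object was passed in and never used.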

4. Fix isinstance check after module reload

File: tests/test_litellm/llms/volcengine/responses/test_volcengine_responses_transformation.py

  • test_error_class_returns_volcengine_error: Use class name comparison instead of isinstance to handle module reloading

5. Fix module reference issues (conftest reloads litellm)

Files: test_converse_transformation.py, test_litellm_pre_call_utils.py

  • test_empty_assistant_message_handling: Use patch.object(factory_module.litellm, ...) instead of direct assignment
  • test_embedding_header_forwarding_with_model_group: Use patch.object(pre_call_utils_module.litellm, ...) instead of direct assignment
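A toy sketch of the difference, using `SimpleNamespace` stand-ins for the two distinct module objects that exist after a reload (`modify_params` is just an illustrative attribute):

```python
import types
from unittest.mock import patch

# Two distinct "litellm" module objects, as after a conftest reload.
litellm_old = types.SimpleNamespace(modify_params=False)  # held by the library
litellm_new = types.SimpleNamespace(modify_params=False)  # what the test imported
factory_module = types.SimpleNamespace(litellm=litellm_old)

# Direct assignment on the test's import misses the object the code uses:
litellm_new.modify_params = True
miss = factory_module.litellm.modify_params       # still False

# patch.object on the module-under-test's own reference hits the right one
# (and is reverted automatically on exit):
with patch.object(factory_module.litellm, "modify_params", True):
    hit = factory_module.litellm.modify_params    # True
```

The key point is that the patch target is looked up through `factory_module`, so it always lands on whichever `litellm` object the library code actually holds.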

6. Fix decorator timing issue

File: tests/test_litellm/proxy/test_proxy_server.py

  • test_embedding_input_array_of_tokens: Move mock inside test function (after fixture initializes router) and add skip if llm_router is None
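The timing problem can be sketched as follows (`proxy_server` and the fixture are stand-ins): a `@patch.object(proxy_server.llm_router, ...)` decorator would resolve its target at module import time, when `llm_router` is still `None`, whereas a context manager inside the test body resolves it only after the fixture has run.

```python
import types
from unittest.mock import patch

# Stand-in proxy_server module: llm_router is None until a fixture builds it.
proxy_server = types.SimpleNamespace(llm_router=None)

def fixture_initializes_router():
    proxy_server.llm_router = types.SimpleNamespace(aembedding=lambda: "real")

def test_embedding_input_array_of_tokens():
    # Guard mirrors the added skip: bail out if the router never came up.
    if proxy_server.llm_router is None:
        return "skipped"
    # Patch target resolved HERE, after the fixture has populated the router.
    with patch.object(proxy_server.llm_router, "aembedding",
                      return_value="mocked"):
        return proxy_server.llm_router.aembedding()

fixture_initializes_router()
result = test_embedding_input_array_of_tokens()
```

With the decorator form, the same lookup would have raised the `None`-attribute error at collection time instead of patching the live router.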

Root Cause

The conftest.py file reloads litellm at module scope, which causes:

  1. Different litellm references between test code and library code
  2. Global state (like llm_router) being None at decorator execution time
  3. isinstance checks failing due to class identity mismatches

Test plan

  • All modified tests pass locally
  • CI tests pass with parallel execution


@jquinter (Contributor, Author)

Linter and MCP tests are failing; the patch/solution is waiting in #19738.

@krrishdholakia (Member)

(screenshot attached: Screenshot 2026-01-25 at 10 53 17 PM)

Still seeing module import errors @jquinter - is this intentional?

@jquinter (Contributor, Author)

jquinter commented Jan 26, 2026

> Still seeing module import errors @jquinter - is this intentional?

Tests run on the main/stable release? I noticed you approved my #19738 PR, but it is now merged into the litellm_oss_staging_01_26_2026 branch. When will that be promoted to main? Many fixes were shipped in that PR, and they are the base for further work on other issues (around mock tests, for example).

1. test_acompletion_with_mcp_streaming_metadata_in_correct_chunks:
   - Moved stream consumption inside patch context to avoid real API calls
   - The previous implementation had assertions outside the `with patch(...)`
     block, causing real OpenAI API calls when consuming the stream

2. TestCheckResponsesCost tests:
   - Added skip condition when litellm_enterprise module is not available
   - These tests import from litellm_enterprise.proxy.common_utils.check_responses_cost
     which is only available in the enterprise version

1. test_bedrock_converse_budget_tokens_preserved:
   - Fixed mocking at the correct level (litellm.acompletion instead of client.post)
   - The previous mock didn't work because the code runs through run_in_executor
     and the passed client parameter was not being used

2. test_error_class_returns_volcengine_error:
   - Changed isinstance check to class name comparison
   - This avoids issues when module reloading (in conftest.py) causes class
     identity mismatches during parallel test execution

Fix several tests that fail in CI due to parallel test execution and
module reloading in conftest.py.

1. test_empty_assistant_message_handling:
   - Use patch.object on factory_module.litellm instead of direct assignment
   - Ensures the correct litellm reference is modified after conftest reloads

2. test_embedding_header_forwarding_with_model_group:
   - Use patch.object on pre_call_utils_module.litellm instead of direct assignment
   - Same fix for module reloading issue

3. test_embedding_input_array_of_tokens:
   - Move mock inside test function (after fixture initializes router)
   - Add skip condition if llm_router is None
   - Fixes "AttributeError: None does not have 'aembedding'" in parallel execution

Root cause: conftest.py reloads litellm at module scope, which can cause:
- Different litellm references between test code and library code
- Global state (like llm_router) being None at decorator execution time
- isinstance checks failing due to class identity mismatches

…st execution

- Reload litellm_pre_call_utils module inside test to get fresh litellm reference
- Use string-based patch("litellm.model_group_settings") instead of patch.object
- These changes ensure the patch targets the correct module after conftest reloads litellm

The setup_and_teardown fixture was failing with "ImportError: module
litellm not in sys.modules" during parallel test execution. This occurs
because another worker might have removed/modified litellm from
sys.modules before this test tries to reload it.

Fix: Check if litellm is in sys.modules before attempting reload.
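A hedged sketch of what such a guard could look like (the actual conftest code is not shown in this PR excerpt; `safe_reload` is an illustrative helper, demonstrated here on a stdlib module):

```python
import importlib
import sys

def safe_reload(name):
    # importlib.reload raises ImportError when the module is absent from
    # sys.modules (another xdist worker may have dropped it), so check first
    # and fall back to a fresh import.
    mod = sys.modules.get(name)
    if mod is not None:
        return importlib.reload(mod)
    return importlib.import_module(name)

reloaded = safe_reload("json")   # present: reloaded in place
```

Either branch leaves `sys.modules[name]` populated with the returned module, which is the invariant later tests depend on.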

The test was making real API calls instead of using mocks because the
conftest.py reloads litellm at module scope, causing stale module
references. The mock was patching the old reference while the actual
code used the new one.

Fix: Reload litellm.containers.main inside the test to get a fresh
reference to base_llm_http_handler, then re-import create_container
after the reload.

The tests were making real API calls instead of using mocks because
conftest.py reloads litellm at module scope, causing the HTTPHandler
class reference in the HuggingFace embedding handler to become stale.
The patches were applied to the new class, but the handler used the old one.

Fix: Add a reload_huggingface_modules fixture that reloads the relevant
modules BEFORE the mock fixtures apply their patches. This ensures all
references point to the same class object.

The test_end_to_end_rerank_flow mock for _ensure_access_token was not
being applied because conftest reloads litellm, causing the
VertexAIRerankConfig class to be a different object than what's patched.

Fix: Reload the transformation module in setup_method and re-import the
class to ensure the patch targets the same class object used by tests.
jquinter force-pushed the fix/flaky-tests-missing-api-keys branch from 4e04bd1 to bee4c94 (February 15, 2026 16:09)
jquinter merged commit 2bdbb99 into BerriAI:main on Feb 15, 2026
11 of 18 checks passed