docs: comprehensive test flakiness solutions guide #21391
Fix isinstance() checks failing due to module reload in conftest.py.

The conftest.py fixture reloads the litellm module between test modules, which causes class references imported at module level to become stale. When AsyncHTTPHandler is imported at the top of the file and litellm is then reloaded by the fixture, the isinstance() check fails because the returned instance is of the NEW AsyncHTTPHandler class while the test is checking against the OLD class reference.

Solution: Import AsyncHTTPHandler locally within each test function that uses isinstance() checks. This ensures we get the fresh class reference after the module reload.

Fixed tests:
- test_session_reuse_integration
- test_get_async_httpx_client_with_shared_session
- test_get_async_httpx_client_without_shared_session

This resolves intermittent CI failures where parallel test execution triggers the module reload behavior.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
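The stale-reference behavior described in this commit can be reproduced without litellm. The following is a minimal sketch, assuming a throwaway module named `demo_handler` with a `Handler` class and a `get_handler()` factory; all three names are hypothetical stand-ins for the real module and `AsyncHTTPHandler`.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Write a throwaway module to disk so it can be imported and reloaded.
# "demo_handler", "Handler" and "get_handler" are hypothetical names.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "demo_handler.py").write_text(
    textwrap.dedent(
        """
        class Handler:
            pass

        def get_handler():
            return Handler()
        """
    )
)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import demo_handler
from demo_handler import Handler as StaleHandler  # like a module-level import

# Re-execute the module body, as a reload fixture would between test modules.
importlib.reload(demo_handler)

# The factory now returns an instance of the NEW Handler class.
obj = demo_handler.get_handler()

# Importing again after the reload picks up the fresh class object.
from demo_handler import Handler as FreshHandler

stale_check = isinstance(obj, StaleHandler)
fresh_check = isinstance(obj, FreshHandler)
print(stale_check, fresh_check)
```

The first check is `False` and the second is `True`: the reload rebinds `Handler` to a new class object, while `StaleHandler` still points at the old one.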
Add two detailed guides for addressing CI test flakiness:

1. test-flakiness-guide.md - Developer guide with:
   - How to use @pytest.mark.no_parallel for async mocks
   - Patterns for robust async mock setup
   - Retry logic strategies
   - Module reload issues and fixes
   - Quick reference and checklist

2. ci-test-improvements.md - Implementation plan with:
   - Priority phased rollout (Quick wins → CI → Enforcement)
   - pytest-rerunfailures plugin setup
   - GitHub Actions improvements for retries
   - Makefile targets for testing
   - Pre-commit hooks for test quality
   - Test utilities module with decorators
   - Success metrics and monitoring

These guides provide actionable solutions for the CI test failures observed in PRs #21107, #21388, and #21390.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
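As an illustration of the "robust async mock setup" and "retry logic" items listed above, here is a small sketch using only the standard library. It is not taken from the guides themselves: `fetch_with_retry`, the `client` object, and the `/health` path are hypothetical names chosen for the example.

```python
import asyncio
from unittest.mock import AsyncMock


async def fetch_with_retry(client, attempts=3):
    """Hypothetical helper: retry an awaitable call on ConnectionError."""
    last_error = None
    for _ in range(attempts):
        try:
            return await client.get("/health")
        except ConnectionError as exc:
            last_error = exc
    raise last_error


# AsyncMock makes client.get awaitable. With an iterable side_effect, each
# await consumes the next item; exception instances are raised, other
# values are returned.
client = AsyncMock()
client.get.side_effect = [ConnectionError("transient"), {"status": "ok"}]

result = asyncio.run(fetch_with_retry(client))
print(result, client.get.await_count)
```

Scripting the failure and the success in one `side_effect` sequence keeps the test deterministic, so it does not depend on timing or on a real network call.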
Greptile Summary
This PR adds two documentation files (
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| docs/test-flakiness-guide.md | New documentation guide for addressing test flakiness. Contains accurate references to pyproject.toml settings and practical advice. Contradicts CLAUDE.md style guide on inline imports. |
| docs/ci-test-improvements.md | Implementation plan for CI test improvements with phased rollout. Contains code examples for pytest-rerunfailures, GitHub Actions, Makefile targets, and pre-commit hooks. All proposed, not yet implemented. |
| tests/test_litellm/llms/custom_httpx/test_http_handler.py | Fixes isinstance() flakiness by using local imports aliased as AsyncHTTPHandlerReload. Module-level AsyncHTTPHandler import remains for non-isinstance usage. Whitespace cleanup included. |
Flowchart
flowchart TD
A["conftest.py: importlib.reload(litellm)"] --> B["Module-level AsyncHTTPHandler reference becomes stale"]
B --> C{"isinstance(obj, AsyncHTTPHandler)?"}
C -->|"Stale reference"| D["❌ False — different class object"]
C -->|"Fresh local import"| E["✅ True — same class object"]
F["test_http_handler.py line 18:\nfrom ... import AsyncHTTPHandler"] --> B
G["PR Fix: Local import in test body\nfrom ... import AsyncHTTPHandler as AsyncHTTPHandlerReload"] --> E
Last reviewed commit: feb01b9
For `isinstance()` checks that fail after module reload:

```python
# ❌ BAD: Module-level import becomes stale after reload
from module import MyClass
```
Contradicts CLAUDE.md style guide
This guide recommends using local imports inside test functions, but CLAUDE.md explicitly states: "Avoid imports within methods — place all imports at the top of the file (module-level). Inline imports inside functions/methods make dependencies harder to trace and hurt readability. The only exception is avoiding circular imports where absolutely necessary."
Since this is a legitimate exception (module reload invalidates module-level references), consider updating CLAUDE.md to add test files as another accepted exception, or at least reference this conflict in this guide so developers aren't confused by the contradictory advice.
Context Used: Context from dashboard - CLAUDE.md (source)
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
```yaml
name: Check async tests have proper markers
entry: python scripts/check_test_markers.py
language: python
files: ^tests/.*test.*\.py$
```

`scripts/check_test_markers.py`:
Pre-commit hook has false positives
The check_test_markers.py script uses a very broad heuristic: has_mock = 'patch' in content or 'Mock' in content or 'mock_' in node.name.lower(). This checks the entire file content for any occurrence of "patch" or "Mock", meaning if any test in the file uses mocks, every async test in that file will be flagged — even those that don't use mocks at all. The check should be scoped to each individual test function's decorators and body, not the entire file content.
Automated patch bundle from next-100 unresolved backlog expansion.
Generated due to limited direct branch-write access; please apply/cherry-pick minimal edits below.

## PR #21391: Unresolved thread summary
Minimal patch proposals
Problem
CI tests are flaky - passing locally but failing in CI:
- `isinstance()` failures (FIXED in fix(tests): resolve test isolation issue in http_handler tests #21388)
Solution
Add comprehensive documentation with actionable solutions:
📖 test-flakiness-guide.md (Developer Guide)
Complete reference for test authors covering:
Better Test Isolation
- `@pytest.mark.no_parallel` for async mocks
Robust Async Mock Setup
- `AsyncMock` and `side_effect` properly
Retry Logic
Test Fixtures & Cleanup
Quick Reference
🔧 ci-test-improvements.md (Implementation Plan)
Phased rollout plan with concrete next steps:
Phase 1: Quick Wins (1 day)
Phase 2: CI Improvements (2-3 days)
Phase 3: Enforcement (1 week)
Includes:
`test-fast`, `test-flaky`, `test-repeat`)
Impact
Next Steps
After this documentation is merged, implement Phase 1:
- `poetry add --group dev pytest-rerunfailures`
- `tests/test_utils.py` with retry decorators
- `@pytest.mark.flaky`
Related
Examples
Before (flaky):
After (reliable):