Remove enable_docket setting; Docket is now always on#2558
Remove enable_docket setting; Docket is now always on#2558chrisguidry merged 12 commits intomainfrom
Conversation
|
Warning Rate limit exceeded@chrisguidry has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 20 minutes and 16 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (1)
WalkthroughRemoved FASTMCP_ENABLE_DOCKET and the Settings.enable_docket field. Renamed check_docket_enabled() to check_distributed_backend(), which now validates a non-memory distributed backend URL. Server adds an asyncio.Event (_started) for readiness, changes the docket lifespan signature, makes the worker lifecycle cancelable, registers tools/prompts/resources/templates directly from internal managers, and manages current server/docket via ContextVars. Dependencies introduce InMemoryProgress as a fallback and raise errors when background-task features are used outside a running FastMCP server context. Documentation and examples remove the Docket enablement configuration. Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Test Failure AnalysisSummary: Three logging tests are failing on Windows because Docket initialization now takes longer than the test's 0.01 second sleep delay. Root Cause: The PR makes Docket always-on by removing the
On Windows, this initialization takes longer than 0.01 seconds, so the tests that check if Suggested Solution: Update the three failing tests in # Replace this:
server_task = asyncio.create_task(
mcp_server.run_http_async(log_level=test_log_level, port=8003)
)
await asyncio.sleep(0.01)
# With this:
server_task = asyncio.create_task(
mcp_server.run_http_async(log_level=test_log_level, port=8003)
)
await mcp_server.wait_until_ready()The PR already introduced
Detailed AnalysisFailed Tests (Windows only):
Error Message: Why It Happens Now: Why Related Files
|
Test Failure Analysis (Updated)Summary: Two OAuth tests are failing with because the server isn't fully ready when tests try to connect. Root Cause: The PR makes Docket always-on, which adds initialization time (~140ms). The
The Suggested Solution: Add a small delay after # In src/fastmcp/utilities/tests.py around line 212:
# Wait for server lifespan to be ready
await server.wait_until_ready()
# Give uvicorn a moment to bind to the socket
await asyncio.sleep(0.05)Alternatively, implement a more robust readiness check that actually tests the HTTP endpoint, but that's more complex and the small sleep is pragmatic given that Detailed AnalysisFailed Tests:
Error: Stack Trace Shows: Timeline:
Why This Worked Before: Related Files
Note: This is a separate issue from the Windows logging test failures mentioned in my previous comment. Those still need the fix described earlier. |
Test Failure Analysis
SummaryEight pytest-xdist workers crashed on Windows during test execution, hitting the maximum crashed worker limit. This prevents the test suite from completing successfully. Root CauseThe PR makes Docket always-on, which means every test that creates a FastMCP server now:
On Windows with pytest-xdist running 4 parallel workers, this creates significant resource pressure. The logs show:
This is a resource exhaustion issue specific to Windows + parallel testing, not a functional bug. Suggested SolutionUpdate - name: Run tests (excluding integration and client_process)
shell: bash
run: |
if [ "${{ runner.os }}" == "Windows" ]; then
# Windows: reduced parallelism due to Docket overhead
uv run pytest --inline-snapshot=disable tests -m "not integration and not client_process" --numprocesses auto --maxprocesses 2 --dist worksteal
else
# macOS/Linux: normal parallelism
uv run pytest --inline-snapshot=disable tests -m "not integration and not client_process" --numprocesses auto --maxprocesses 4 --dist worksteal
fiThis reduces parallel workers from 4 to 2 on Windows, which should prevent resource exhaustion while maintaining reasonable test performance. Detailed AnalysisWorker Crash Timeline (Run #19970379441)Tests run successfully for the first minute, then workers start crashing:
The pattern suggests cumulative resource buildup rather than immediate failure. Why Windows Is Affected
Why The Code Changes Are CorrectThe changes themselves are sound:
The issue is environmental (Windows + parallel testing) rather than a code defect. Evidence from LogsTests that were passing successfully started failing only after workers began crashing, indicating the crashes themselves are the root issue, not the test logic. Related Files
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/fastmcp/utilities/tests.py (1)
25-53:_wait_for_portlogic is solid; consider minor async/style refinementsThe retry and timeout behavior look correct and should eliminate the port‑readiness race without swallowing cancellation or unexpected errors. A couple of optional tweaks that might improve clarity and satisfy Ruff:
- Cache the running loop once and reuse it, instead of calling
asyncio.get_event_loop()twice per iteration:- start = asyncio.get_event_loop().time() + loop = asyncio.get_running_loop() + start = loop.time() @@ - except (OSError, asyncio.TimeoutError): - if asyncio.get_event_loop().time() - start > timeout: + except (OSError, asyncio.TimeoutError): + if loop.time() - start > timeout: raise TimeoutError( f"Port {port} on {host} not available after {timeout}s" ) from None await asyncio.sleep(interval)
- If you want to clear TRY300, you can move the
await asyncio.sleep(interval)into anelsebranch of the timeout check:- except (OSError, asyncio.TimeoutError): - if loop.time() - start > timeout: - raise TimeoutError( - f"Port {port} on {host} not available after {timeout}s" - ) from None - await asyncio.sleep(interval) + except (OSError, asyncio.TimeoutError): + if loop.time() - start > timeout: + raise TimeoutError( + f"Port {port} on {host} not available after {timeout}s" + ) from None + else: + await asyncio.sleep(interval)The TRY003 hint about message length is purely stylistic; the current explicit message is fine unless your Ruff config treats it as an error.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/fastmcp/utilities/tests.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Python source code must use Python ≥3.10 with full type annotations
Never use bareexcept- be specific with exception types
Prioritize readable, understandable code - clarity over cleverness; avoid obfuscated or confusing patterns even if shorter
Follow existing patterns and maintain consistency in code organization and style
Files:
src/fastmcp/utilities/tests.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T00:17:41.238Z
Learning: Applies to tests/**/*.py : Pass FastMCP servers directly to clients for testing using in-memory transport; only use HTTP transport when explicitly testing network features
📚 Learning: 2025-12-04T00:17:41.238Z
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T00:17:41.238Z
Learning: Applies to tests/**/*.py : Pass FastMCP servers directly to clients for testing using in-memory transport; only use HTTP transport when explicitly testing network features
Applied to files:
src/fastmcp/utilities/tests.py
🧬 Code graph analysis (1)
src/fastmcp/utilities/tests.py (1)
src/fastmcp/server/server.py (1)
wait_until_ready(523-529)
🪛 Ruff (0.14.7)
src/fastmcp/utilities/tests.py
46-46: Consider moving this statement to an else block
(TRY300)
49-51: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Run tests: Python 3.10 on ubuntu-latest
- GitHub Check: Run tests with lowest-direct dependencies
- GitHub Check: Run tests: Python 3.10 on windows-latest
🔇 Additional comments (1)
src/fastmcp/utilities/tests.py (1)
242-247: Readiness + port wait sequencing looks correct and should fix the raceWaiting on
server.wait_until_ready()first, then probing the TCP port with_wait_for_port(host, port), is a good ordering: it aligns with the new lifespan signaling and also covers the uvicorn bind race. This should make the async HTTP tests much less flaky without changing semantics for callers ofrun_server_async.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
tests/server/proxy/test_stateful_proxy_client.pyis excluded by none and included by nonetests/server/test_logging.pyis excluded by none and included by none
📒 Files selected for processing (2)
src/fastmcp/server/server.py(9 hunks)src/fastmcp/utilities/tests.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Python source code must use Python ≥3.10 with full type annotations
Never use bareexcept- be specific with exception types
Prioritize readable, understandable code - clarity over cleverness; avoid obfuscated or confusing patterns even if shorter
Follow existing patterns and maintain consistency in code organization and style
Files:
src/fastmcp/server/server.pysrc/fastmcp/utilities/tests.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T00:17:41.238Z
Learning: Applies to tests/**/*.py : Pass FastMCP servers directly to clients for testing using in-memory transport; only use HTTP transport when explicitly testing network features
📚 Learning: 2025-12-04T00:17:41.238Z
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T00:17:41.238Z
Learning: Applies to tests/**/*.py : Pass FastMCP servers directly to clients for testing using in-memory transport; only use HTTP transport when explicitly testing network features
Applied to files:
src/fastmcp/utilities/tests.py
🪛 Ruff (0.14.7)
src/fastmcp/utilities/tests.py
46-46: Consider moving this statement to an else block
(TRY300)
49-51: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Run tests: Python 3.10 on ubuntu-latest
- GitHub Check: Run tests: Python 3.10 on windows-latest
- GitHub Check: Run tests with lowest-direct dependencies
🔇 Additional comments (5)
src/fastmcp/utilities/tests.py (1)
242-247: Good fix for the startup race condition.The two-step wait (lifespan readiness + port availability) addresses the timing issue mentioned in the PR objectives where the proxy test fixture raced against Docket startup.
src/fastmcp/server/server.py (4)
217-217: Clean readiness signaling implementation.The
asyncio.Eventis an appropriate mechanism for signaling server readiness, and unconditional initialization ensures it's always available for waiters.
467-483: Worker cancellation pattern looks correct.The try/except/else block with explicit cancellation ensures the worker task group is cancelled in all exit paths (normal completion, exceptions, and cancellation). This addresses potential cleanup issues with long-running background workers.
514-518: Proper event lifecycle management.Setting
_startedin a try/finally block ensures the event is cleared even on exceptions, guaranteeing that waiters only observe the ready state during active lifespan progression.
410-453: Thehasattrguards are necessary and correct.Lines 412, 423, 434, and 445 check for the
fnattribute before registering with Docket. This filtering is necessary because the manager collections contain both base component classes (Tool, Prompt, Resource, ResourceTemplate) without anfnattribute and function-based subclasses (FunctionTool, FunctionPrompt, FunctionResource, FunctionResourceTemplate) with anfnattribute. Components from decorators are created as function subclasses, while mounted servers can contribute either type viamodel_copy()operations. Only function-based components with thefnattribute can be registered with Docket for background task execution, making these guards essential.
Docket provides background task execution and is now always available for all FastMCP servers. Only `enable_tasks` remains to control the SEP-1686 task protocol support. Changes: - Remove `enable_docket` setting and related validation - Docket/Worker lifecycle is always active in server lifespan - CurrentDocket and CurrentWorker dependencies work without config - Add server readiness signaling via `_started` event - Fix test timing issues with proper port probing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
1eb378d to
47d3044
Compare
Lost during rebase conflict resolution. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
When pytest-xdist terminates worker processes on Windows, the event loop may close before cleanup can complete. Adding a 2-second timeout to the worker cancellation prevents indefinite hanging. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Windows ProactorEventLoop has known memory corruption issues that cause pytest-xdist worker crashes with "node down: Not properly terminated". Setting WindowsSelectorEventLoopPolicy in conftest.py avoids this issue. See: python/cpython#116773 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Fakeredis doesn't implement blocking xread properly - it returns immediately instead of waiting. This causes Docket._monitor_strikes to busy-loop, overwhelming pytest-xdist workers on Windows. Mock the method to just sleep, since strike coordination isn't useful with in-memory backends anyway. See: cunla/fakeredis-py#274 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Docket stores a shared FakeServer as a class attribute (_memory_server). When many tests run in parallel, shared state can cause issues on Windows. Add autouse fixture to reset the shared server before each test. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Test Failure Analysis
SummaryEight pytest-xdist workers crashed on Windows during test execution, hitting the maximum crashed worker limit. This prevents the test suite from completing successfully. Root CauseThe PR makes Docket always-on, which means every test that creates a FastMCP server now:
On Windows with pytest-xdist running 4 parallel workers, this creates significant resource pressure. The logs show:
This is a resource exhaustion issue specific to Windows + parallel testing, not a functional bug. Suggested SolutionUpdate - name: Run tests (excluding integration and client_process)
shell: bash
run: |
if [ "$RUNNER_OS" == "Windows" ]; then
# Windows: reduced parallelism due to Docket overhead
uv run pytest --inline-snapshot=disable tests -m "not integration and not client_process" --numprocesses auto --maxprocesses 2 --dist worksteal
else
# macOS/Linux: normal parallelism
uv run pytest --inline-snapshot=disable tests -m "not integration and not client_process" --numprocesses auto --maxprocesses 4 --dist worksteal
fiThis reduces parallel workers from 4 to 2 on Windows, which should prevent resource exhaustion while maintaining reasonable test performance. Detailed AnalysisWorker Crash Timeline (Run #19970549620)Tests run successfully for the first minute, then workers start crashing:
The pattern suggests cumulative resource buildup rather than immediate failure. Why Windows Is Affected
Why The Code Changes Are CorrectThe changes themselves are sound:
The issue is environmental (Windows + parallel testing) rather than a code defect. Evidence from LogsTests that were passing successfully started failing only after workers began crashing, indicating the crashes themselves are the root issue, not the test logic. Related Files
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/fastmcp/utilities/tests.py (2)
25-57:_wait_for_portlogic is sound; only minor style/robustness nitsThe polling loop, timeout accounting via
get_running_loop().time(), and Windows-specific handling ofwait_closed()all look correct and robust for test usage.Two small, optional tweaks you might consider:
- Cache the loop object once to avoid repeated
get_running_loop()calls and make the timeout math a bit clearer:- start = asyncio.get_running_loop().time() + loop = asyncio.get_running_loop() + start = loop.time() @@ - except (OSError, asyncio.TimeoutError): - if asyncio.get_running_loop().time() - start > timeout: + except (OSError, asyncio.TimeoutError): + if loop.time() - start > timeout:
- If you want to quiet Ruff’s TRY300 suggestion, you could move the
returninto anelse:block on thetry, but that’s purely stylistic and not required for correctness.I’d keep the explicit
TimeoutErrormessage despite TRY003; it’s helpful when tests fail.
246-252: Readiness + teardown sequencing looks good; consider handling pre‑yield failures and using a public readiness APIUsing
await server._started.wait()followed by_wait_for_port(host, port)cleanly removes the previous race between test code and the HTTP listener, and the new teardown (cancel+wait_forwith timeout while suppressingCancelledError/asyncio.TimeoutError) is a solid way to avoid hangs, especially on Windows.Two non-blocking improvements:
- If
_wait_for_port(or_started.wait()) raises before theyield, thetry/finallyblock is never entered, soserver_taskwon’t be cancelled/awaited. You could wrap the readiness phase in atrythat cancels and awaitsserver_taskon error before re‑raising, to avoid leaving a stray task around in failing tests.- If
FastMCPnow exposes a publicawait server.wait_until_ready()helper, prefer calling that instead of the privateserver._startedevent to decouple tests from internals.Overall this helper is a nice match for the in‑process testing guidance and should make async tests much less flaky. Based on learnings, ...
Also applies to: 256-260
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
.github/workflows/run-tests.ymlis excluded by none and included by none
📒 Files selected for processing (1)
src/fastmcp/utilities/tests.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Python source code must use Python ≥3.10 with full type annotations
Never use bareexcept- be specific with exception types
Prioritize readable, understandable code - clarity over cleverness; avoid obfuscated or confusing patterns even if shorter
Follow existing patterns and maintain consistency in code organization and style
Files:
src/fastmcp/utilities/tests.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T00:17:41.238Z
Learning: Applies to tests/**/*.py : Pass FastMCP servers directly to clients for testing using in-memory transport; only use HTTP transport when explicitly testing network features
📚 Learning: 2025-12-04T00:17:41.238Z
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-04T00:17:41.238Z
Learning: Applies to tests/**/*.py : Pass FastMCP servers directly to clients for testing using in-memory transport; only use HTTP transport when explicitly testing network features
Applied to files:
src/fastmcp/utilities/tests.py
🪛 Ruff (0.14.7)
src/fastmcp/utilities/tests.py
50-50: Consider moving this statement to an else block
(TRY300)
53-55: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Run tests: Python 3.10 on ubuntu-latest
- GitHub Check: Run tests with lowest-direct dependencies
|
this makes FastMCP incompatible with Trio now that Docket is required. |
Simplifies configuration by removing the
enable_docketflag. Every FastMCP server now starts a Docket instance and Worker automatically. Onlyenable_tasksremains to control SEP-1686 task protocol support.Making Docket always-on uncovered some startup timing issues. The proxy test fixture was racing against Docket startup (~140ms), so we added a
wait_until_ready()method backed by anasyncio.Eventthat signals when the server lifespan is fully initialized. Also updated the mount lifespan test expectation—with Docket keeping lifespans alive, mounted sub-servers no longer re-enter their lifespan on each proxy call.🤖 Generated with Claude Code