Fix ContextVar propagation for ASGI-mounted servers with tasks #2843
chrisguidry merged 13 commits into release/2.x
When FastMCP runs with uvicorn, the lifespan is entered twice:
1. FastMCP's outer context (during http_app setup)
2. Starlette's ASGI lifespan (which request handlers inherit from)
The second call was skipping ContextVar setup because _lifespan_result_set was already True. This caused _current_docket.get() to return None in request handlers even though server._docket was correctly set.
Fix: Always set ContextVars when entering _lifespan_manager, using the already-initialized values from self._docket and self._worker.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ContextVars set during lifespan don't propagate to request handlers in Lambda (works fine locally). As a workaround, fall back to using server._docket when the ContextVar returns None. This is a Lambda-specific issue - possibly related to how Lambda Web Adapter or Lambda's asyncio runtime handles context propagation. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Instead of relying on ContextVar propagation from lifespan (which fails in Lambda), set _current_docket and _current_worker when entering a Context for each request. This ensures user dependencies like CurrentDocket() and CurrentWorker() work in all environments. The values come from server._docket and server._worker which are always available after lifespan initialization. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
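A minimal sketch of the per-request approach described in this commit, assuming a simplified server and context. `DemoServer` and `DemoRequestContext` are hypothetical stand-ins, not FastMCP's actual classes; only the names `_current_docket`, `_current_worker`, `_docket`, and `_worker` come from the commit message.

```python
import asyncio
from contextvars import ContextVar, Token
from typing import Any, Optional

# ContextVar names mirror the ones mentioned in the commit message.
_current_docket: ContextVar[Optional[Any]] = ContextVar("docket", default=None)
_current_worker: ContextVar[Optional[Any]] = ContextVar("worker", default=None)


class DemoServer:
    """Hypothetical stand-in for the server; only the two attributes matter."""

    def __init__(self, docket: Optional[Any] = None, worker: Optional[Any] = None) -> None:
        self._docket = docket
        self._worker = worker


class DemoRequestContext:
    """Hypothetical per-request context, not FastMCP's actual Context class."""

    def __init__(self, server: DemoServer) -> None:
        self.server = server
        self._docket_token: Optional[Token] = None
        self._worker_token: Optional[Token] = None

    async def __aenter__(self) -> "DemoRequestContext":
        # Copy instance attributes into ContextVars for this request only.
        if self.server._docket is not None:
            self._docket_token = _current_docket.set(self.server._docket)
        if self.server._worker is not None:
            self._worker_token = _current_worker.set(self.server._worker)
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Undo the sets in reverse order, skipping anything that was never set.
        if self._worker_token is not None:
            _current_worker.reset(self._worker_token)
            self._worker_token = None
        if self._docket_token is not None:
            _current_docket.reset(self._docket_token)
            self._docket_token = None


async def demo():
    server = DemoServer(docket="docket", worker="worker")
    async with DemoRequestContext(server):
        inside = (_current_docket.get(), _current_worker.get())
    outside = (_current_docket.get(), _current_worker.get())
    return inside, outside


inside, outside = asyncio.run(demo())
```

Because the values are read from the server instance at request entry, this pattern does not depend on which async context the lifespan ran in.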
Adds detailed logging to Context.__aenter__ and __aexit__ to track:
- When Context is entered/exited
- Values of server._docket and server._worker
- ContextVar values before and after setting
- Token values for debugging reset issues
This will help diagnose why ContextVars might not propagate in Lambda.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When Redis operations fail, log the full traceback to help diagnose ACL and permission issues in production environments. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Tracing where the Redis ACL error occurs - the initial Redis writes succeed but error happens somewhere after. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
pydocket 0.16.5 fixes an issue where worker_group_name was passed as a KEY instead of ARGV in Lua scripts, causing ACL failures when Redis users are restricted to key patterns. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
… time
Remove redundant ContextVar handling:
- _lifespan_manager no longer re-sets ContextVars in early-return branch
- Handler fallback logic removed (no more `if docket is None: docket = server._docket`)
The authoritative place for request-context ContextVars is now Context.__aenter__, which sets _current_docket and _current_worker from server instance attributes.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
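The early-return branch mentioned above can be pictured with a small sketch, assuming a lifespan manager guarded by a flag. The names `_lifespan_manager` and `_lifespan_result_set` appear in the commit messages, but `DemoServer`, the `startups` counter, and the exact control flow here are illustrative assumptions rather than FastMCP's implementation.

```python
import asyncio
from contextlib import asynccontextmanager


class DemoServer:
    """Hypothetical server that tolerates its lifespan being entered twice."""

    def __init__(self) -> None:
        self._lifespan_result_set = False
        self.startups = 0  # counts how often real startup work ran

    @asynccontextmanager
    async def _lifespan_manager(self):
        if self._lifespan_result_set:
            # Early-return branch: startup already happened, so do nothing
            # here and leave ContextVar setup to the per-request context.
            yield
            return
        self._lifespan_result_set = True
        self.startups += 1
        try:
            yield
        finally:
            self._lifespan_result_set = False


async def demo() -> int:
    server = DemoServer()
    async with server._lifespan_manager():      # outer entry (e.g. app setup)
        async with server._lifespan_manager():  # second entry (e.g. ASGI lifespan)
            pass
    return server.startups


startups = asyncio.run(demo())
```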
All three handlers (tool, prompt, resource) now have identical patterns:
- Debug logging for docket access, Redis writes, docket.add, subscriptions
- Try/except with traceback logging around Redis and docket operations
- Consistent error messages with instance_id
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Removes all the verbose debug logging added during diagnosis while preserving the essential fix: Context.__aenter__ sets _current_docket and _current_worker from server instance attributes. This ensures ContextVars work in ASGI environments where lifespan and request handlers run in sibling async contexts. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
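The sibling-context behavior can be reproduced with plain asyncio, independent of FastMCP. In this sketch the `lifespan` and `handle_request` coroutines and the string values are illustrative assumptions; the point is only that a task copies the context of whoever creates it, so a value set inside one task is invisible to a sibling task.

```python
import asyncio
from contextvars import ContextVar
from typing import Optional, Tuple

current_docket: ContextVar[Optional[str]] = ContextVar("current_docket", default=None)


async def lifespan(started: asyncio.Event, stop: asyncio.Event) -> None:
    # Set inside the lifespan task's own copy of the context.
    current_docket.set("docket-from-lifespan")
    started.set()
    await stop.wait()


async def handle_request() -> Optional[str]:
    return current_docket.get()


async def main() -> Tuple[Optional[str], Optional[str]]:
    started, stop = asyncio.Event(), asyncio.Event()
    lifespan_task = asyncio.create_task(lifespan(started, stop))
    await started.wait()

    # Sibling task: created by main(), so it copies main()'s context,
    # which never saw the value set inside the lifespan task.
    sibling = await asyncio.create_task(handle_request())

    stop.set()
    await lifespan_task

    # Child task: the value is set in main() before the task is created,
    # so the copied context includes it.
    current_docket.set("docket-set-by-parent")
    child = await asyncio.create_task(handle_request())
    return sibling, child


sibling_value, child_value = asyncio.run(main())
```

The sibling task sees the default, while the child task sees the value set by its creator, which is why setting the ContextVars at request entry sidesteps the problem.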
Walkthrough
The changes extend the per-request context manager to capture and propagate docket and worker context via ContextVars, enabling dependency injection for these resources across async context boundaries. A new internal worker reference is added to the FastMCP server instance and populated during the docket lifecycle. A guard is also introduced to the lifespan manager to prevent re-entrance. Error messages are updated to reference the broader server context requirement, and logging in subscription handlers is adjusted from warning to error level.
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/fastmcp/server/tasks/subscriptions.py (1)
72-73: Consider using `logger.exception()` to preserve stack traces.
Switching from `warning` to `error` is appropriate for subscription failures. However, removing the traceback (previously `exc_info=True`) loses valuable debugging information. Using `logger.exception()` logs at ERROR level while automatically including the traceback.
♻️ Suggested fix
 except Exception as e:
-    logger.error(f"subscribe_to_task_updates failed for {task_id}: {e}")
+    logger.exception(f"subscribe_to_task_updates failed for {task_id}: {e}")
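The difference the reviewer points to can be checked with the standard library alone. This sketch captures log output in memory; the logger name, the `fail` helper, and the messages are illustrative assumptions.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("subscriptions_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output out of the root logger
logger.addHandler(logging.StreamHandler(stream))


def fail() -> None:
    raise ValueError("boom")


# logger.error: message only, no traceback unless exc_info=True is passed.
try:
    fail()
except Exception as e:
    logger.error("error-call failed: %s", e)
error_output = stream.getvalue()

# Clear the buffer before the second run.
stream.seek(0)
stream.truncate(0)

# logger.exception: ERROR level plus the current traceback.
try:
    fail()
except Exception as e:
    logger.exception("exception-call failed: %s", e)
exception_output = stream.getvalue()
```

`logger.exception()` is meant to be called from inside an exception handler, which is the situation in the suggested fix.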
📜 Review details
⛔ Files ignored due to path filters (2)
- pyproject.toml is excluded by none and included by none
- uv.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (4)
- src/fastmcp/server/context.py
- src/fastmcp/server/server.py
- src/fastmcp/server/tasks/handlers.py
- src/fastmcp/server/tasks/subscriptions.py
🧰 Additional context used
📓 Path-based instructions (1)
src/fastmcp/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/fastmcp/**/*.py: Python ≥ 3.10 with full type annotations required
Prioritize readable, understandable code - clarity over cleverness. Avoid obfuscated or confusing patterns even if shorter
Follow existing patterns and maintain consistency in code implementation
Be intentional about re-exports - don't blindly re-export everything to parent namespaces. Core types defining a module's purpose should be exported. Specialized features can live in submodules. Only re-export to fastmcp.* for most fundamental types
Never use bare except - be specific with exception types
Files:
- src/fastmcp/server/tasks/subscriptions.py
- src/fastmcp/server/context.py
- src/fastmcp/server/tasks/handlers.py
- src/fastmcp/server/server.py
🧬 Code graph analysis (2)
src/fastmcp/server/tasks/handlers.py (2)
src/fastmcp/server/server.py (1)
docket (379-384)
src/fastmcp/server/dependencies.py (1)
message (405-406)
src/fastmcp/server/server.py (1)
src/fastmcp/cli/tasks.py (1)
worker(61-110)
🪛 Ruff (0.14.10)
src/fastmcp/server/tasks/subscriptions.py
73-73: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Run tests: Python 3.13 on ubuntu-latest
- GitHub Check: Run tests: Python 3.10 on ubuntu-latest
- GitHub Check: Run tests: Python 3.10 on windows-latest
- GitHub Check: Run tests with lowest-direct dependencies
🔇 Additional comments (8)
src/fastmcp/server/tasks/handlers.py (3)
59-67: LGTM!
The updated comment accurately documents the new behavior where Docket is retrieved from a ContextVar set at request time by `Context.__aenter__`. The error message change to "running FastMCP server context" is more descriptive and aligns with the broader context propagation fix.
171-179: Consistent with other handlers.
The same comment and error message pattern applied here maintains consistency across `handle_tool_as_task`, `handle_prompt_as_task`, and `handle_resource_as_task`.
281-289: Consistent error messaging across all task handlers.
All three task handlers now share the same pattern for retrieving Docket from ContextVar and reporting the same user-friendly error message when the server context is unavailable.
src/fastmcp/server/server.py (3)
200-202: LGTM - Proper initialization of cross-context attributes.
Initializing `_worker = None` alongside `_docket = None` maintains symmetry and enables `Context.__aenter__` to check and propagate these values at request time.
471-486: Worker lifecycle properly managed on server instance.
Storing the worker reference during the docket lifespan and clearing it on completion enables cross-context access via `server._worker`. This pairs well with the existing `server._docket` pattern.
567-571: Key fix for ASGI-mounted server context propagation.
This guard correctly short-circuits when the lifespan has already run (e.g., when `http_app()` is mounted into FastAPI/Starlette). The comment accurately explains that `Context.__aenter__` will set the ContextVars at request time, solving the sibling async context problem.
src/fastmcp/server/context.py (2)
188-205: Core fix for ContextVar propagation - well implemented.
This is the authoritative fix for the ASGI-mounted server issue. By setting `_current_docket` and `_current_worker` from `server._docket` and `server._worker` at request entry time, ContextVars are properly available regardless of async context hierarchy (child vs. sibling contexts).
The conditional checks (`if server._docket is not None`) correctly handle:
- Servers with tasks enabled (docket/worker available)
- Servers without tasks (docket/worker are None)
- Mounted servers that skip their own docket lifecycle
208-228: Proper token cleanup in `__aexit__`.
The LIFO reset order (worker → docket → server) correctly mirrors the set order. Using `hasattr()` guards before reset handles cases where tokens weren't set (e.g., docket disabled). Deleting attributes after reset prevents stale references.
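A short standard-library sketch of why reset order can matter. For distinct ContextVars the order of resets does not change the final values, so LIFO there is a convention; it becomes necessary once the same variable is set more than once, as the example below shows. The variable name and values are illustrative assumptions.

```python
from contextvars import ContextVar

var: ContextVar[str] = ContextVar("var", default="unset")

# LIFO: each reset restores the value that was current before its set().
outer = var.set("outer")
inner = var.set("inner")
var.reset(inner)   # back to "outer"
var.reset(outer)   # back to the default
lifo_result = var.get()

# Out of order: resetting the outer token first, then the inner one,
# leaves the stale "outer" value behind.
outer = var.set("outer")
inner = var.set("inner")
var.reset(outer)   # back to the default
var.reset(inner)   # restores what was current before inner's set(): "outer"
out_of_order_result = var.get()

# A token can only be used once; a second reset raises RuntimeError.
try:
    var.reset(inner)
    token_reusable = True
except RuntimeError:
    token_reusable = False
```

Dropping the stored token after a reset, as the review notes for the attribute deletion, avoids accidentally reusing it.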
Test Failure Analysis
Summary: Test (line 572 in ) is timing out after 5 seconds on Windows (Python 3.10).
Root Cause: The PR changes to remove the background notification flusher ( task) that was running in a task group. This background task was responsible for periodically flushing notifications during long-running operations.
The timeout is happening because:
Detailed Analysis: In
Relevant Code Changes
# REMOVED:
self._exit_stack = AsyncExitStack()
await self._exit_stack.__aenter__()
tg = await self._exit_stack.enter_async_context(anyio.create_task_group())
self._cancel_scope = anyio.CancelScope()
tg.start_soon(self._periodic_flush)
The Suggested Solution: The ContextVar fix (setting
Related Files:
Why Windows-Specific?
The test file has a marker at line 42:
pytestmark = pytest.mark.skipif(
    sys.platform.startswith("win32"),
    reason="Windows has process lifecycle issues with stdio subprocesses",
)
However, this marker should skip ALL tests in the file on Windows, but the test is still running. This suggests the skip marker isn't working as expected, OR this specific failure is exposing a real Windows-specific timing issue with stdio subprocesses.
Update: Additional Finding
I discovered something important - the test should not be running on Windows at all! The test file
pytestmark = pytest.mark.skipif(
    sys.platform.startswith("win32"),
    reason="Windows has process lifecycle issues with stdio subprocesses",
)
This should skip ALL tests in that file on Windows, but the tests are clearly running (they're passing up until the timeout). This means:
Let me investigate the actual platform string on Windows GitHub Actions runners... Actually, looking at Python docs: on Windows,
Hypothesis: The skipif might be evaluated at collection time with the wrong environment, or there's been a change in how pytest handles module-level
Immediate Fix:
Summary
Fixes background tasks failing with "Background tasks require a running FastMCP server context" when FastMCP is mounted to another ASGI application (FastAPI, Starlette, etc.) or deployed to serverless environments (Lambda, Cloud Run).
Root cause: ContextVars set during lifespan don't propagate to request handlers in ASGI environments because they run in sibling async contexts, not parent-child.
Fix: `Context.__aenter__` now sets `_current_docket` and `_current_worker` from server instance attributes at request time, ensuring they're available regardless of async context hierarchy.
Changes
- `context.py`: Set docket/worker ContextVars from `server._docket`/`server._worker` in `__aenter__`
- `server.py`: Store `_worker` on server instance (was already storing `_docket`)
- `pyproject.toml`: Bump pydocket to >=0.16.6 (includes Redis ACL and py-key-value fixes)
Closes #2671
🤖 Generated with Claude Code