
Dereference $ref in tool schemas for MCP client compatibility #2814

Merged
jlowin merged 4 commits into main from fix/dereference-input-schemas
Jan 9, 2026

Conversation

Member

@jlowin jlowin commented Jan 9, 2026

MCP clients like VS Code Copilot and Claude Desktop don't properly handle $ref in tool input schemas—they strip $defs before sending schemas to the LLM, leaving dangling references that break enum parameters and complex types.

FastMCP now automatically dereferences all $ref entries, inlining definitions directly into the schema. This happens in compress_schema() at tool creation time and again in to_mcp_tool() at serving time (to catch OpenAPI tools that bypass compress_schema).

# Before: Schema with $ref (breaks in VS Code Copilot)
{
  "$defs": {"Status": {"enum": ["active", "inactive"], "type": "string"}},
  "properties": {"status": {"$ref": "#/$defs/Status"}}
}

# After: Dereferenced schema (works everywhere)
{
  "properties": {"status": {"enum": ["active", "inactive"], "type": "string"}}
}
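
To make the before/after concrete, here is a minimal sketch of what inlining local references involves. It is not FastMCP's implementation: `inline_local_refs` is a hypothetical helper written for illustration, and it assumes an acyclic schema whose references all have the form `#/$defs/<name>`. It also ignores keywords that sit next to `$ref`.

```python
from copy import deepcopy
from typing import Any


def inline_local_refs(schema: dict[str, Any]) -> dict[str, Any]:
    """Inline every local "#/$defs/<name>" reference and drop "$defs".

    Illustrative only: assumes acyclic schemas and ignores remote refs
    and sibling keywords next to "$ref".
    """
    defs = schema.get("$defs", {})

    def walk(node: Any) -> Any:
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith("#/$defs/"):
                # Replace the whole node with a resolved copy of its target.
                return walk(deepcopy(defs[ref.removeprefix("#/$defs/")]))
            # Copy everything except the definitions table itself.
            return {k: walk(v) for k, v in node.items() if k != "$defs"}
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node

    return walk(schema)


before = {
    "$defs": {"Status": {"enum": ["active", "inactive"], "type": "string"}},
    "properties": {"status": {"$ref": "#/$defs/Status"}},
}
after = inline_local_refs(before)
```

With the schema from the example above, `after` contains the enum inline under `properties.status` and no `$defs` key, while `before` is left unmodified.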

For circular/self-referencing schemas that can't be fully dereferenced, FastMCP falls back to resolving only the root-level $ref while preserving $defs for nested references.
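
A rough sketch of that fallback, under stated assumptions: `resolve_root_ref_sketch` is a hypothetical stand-in, not the library's `resolve_root_ref`, and it only handles a root `$ref` of the form `#/$defs/<name>`. It copies the referenced definition's keywords to the root and keeps `$defs`, so the nested self-reference still resolves.

```python
from typing import Any


def resolve_root_ref_sketch(schema: dict[str, Any]) -> dict[str, Any]:
    """Inline only a root-level "$ref", keeping "$defs" for nested refs."""
    ref = schema.get("$ref")
    defs = schema.get("$defs", {})
    if not (isinstance(ref, str) and ref.startswith("#/$defs/")):
        return schema
    target = defs.get(ref.removeprefix("#/$defs/"))
    if not isinstance(target, dict):
        return schema
    # Copy the target's keywords to the root; keep $defs so the
    # self-reference inside "children" still resolves.
    resolved = dict(target)
    resolved["$defs"] = defs
    return resolved


tree = {
    "$ref": "#/$defs/Node",
    "$defs": {
        "Node": {
            "type": "object",
            "properties": {
                "children": {"type": "array", "items": {"$ref": "#/$defs/Node"}}
            },
        }
    },
}
resolved = resolve_root_ref_sketch(tree)
```

Here `resolved` has an explicit `"type": "object"` at the root, and the `items` entry under `children` still points at `#/$defs/Node`.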

Follow-up to #2807, addresses #2236

@jlowin jlowin added the v3 Targeted for FastMCP 3 label Jan 9, 2026
@marvin-context-protocol marvin-context-protocol Bot added bug Something isn't working. Reports of errors, unexpected behavior, or broken functionality. server Related to FastMCP server implementation or server-side functionality. labels Jan 9, 2026
Contributor

coderabbitai Bot commented Jan 9, 2026


📥 Commits

Reviewing files that changed from the base of the PR and between d8b95a0 and 3dea366.

📒 Files selected for processing (1)
  • src/fastmcp/utilities/json_schema.py

Walkthrough

The docs add inline notes stating FastMCP automatically dereferences $ref entries and inlines complex Pydantic models for MCP client compatibility. In code, dereference_refs now inlines $ref targets, preserves sibling keywords (description, default, examples) via a new _merge_ref_siblings helper, and removes $defs after merging. resolve_root_ref was extended to inline root-level $ref entries. compress_schema now calls dereference_refs up-front, removes the public prune_defs parameter, and drives subsequent optimization via prune_titles and prune_additional_properties.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
  • Description check: ❓ Inconclusive
    Explanation: The PR description provides clear context, code examples, and issue references, but is missing the contributors checklist with required checkboxes for issue closure, workflow compliance, testing, and documentation updates.
    Resolution: Complete the contributors checklist by confirming the issue number, development workflow adherence, manual testing, and documentation updates as specified in the template.
✅ Passed checks (2 passed)
  • Title check: ✅ Passed
    Explanation: The title clearly and concisely summarizes the main change: automatically dereferencing $ref entries in tool schemas to ensure MCP client compatibility.
  • Docstring Coverage: ✅ Passed
    Explanation: Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 08caf77 and 809f239.

⛔ Files ignored due to path filters (6)
  • pyproject.toml is excluded by none and included by none
  • tests/tools/test_tool.py is excluded by none and included by none
  • tests/tools/test_tool_transform.py is excluded by none and included by none
  • tests/utilities/openapi/test_schemas.py is excluded by none and included by none
  • tests/utilities/test_json_schema.py is excluded by none and included by none
  • uv.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (4)
  • docs/servers/tools.mdx
  • src/fastmcp/tools/tool.py
  • src/fastmcp/tools/tool_transform.py
  • src/fastmcp/utilities/json_schema.py
🧰 Additional context used
📓 Path-based instructions (4)
docs/**/*.mdx

📄 CodeRabbit inference engine (docs/.cursor/rules/mintlify.mdc)

docs/**/*.mdx: Use clear, direct language appropriate for technical audiences
Write in second person ('you') for instructions and procedures in MDX documentation
Use active voice over passive voice in MDX technical documentation
Employ present tense for current states and future tense for outcomes in MDX documentation
Maintain consistent terminology throughout all MDX documentation
Keep sentences concise while providing necessary context in MDX documentation
Use parallel structure in lists, headings, and procedures in MDX documentation
Lead with the most important information using inverted pyramid structure in MDX documentation
Use progressive disclosure in MDX documentation: present basic concepts before advanced ones
Break complex procedures into numbered steps in MDX documentation
Include prerequisites and context before instructions in MDX documentation
Provide expected outcomes for each major step in MDX documentation
End sections with next steps or related information in MDX documentation
Use descriptive, keyword-rich headings for navigation and SEO in MDX documentation
Focus on user goals and outcomes rather than system features in MDX documentation
Anticipate common questions and address them proactively in MDX documentation
Include troubleshooting for likely failure points in MDX documentation
Provide multiple pathways (beginner vs advanced) but offer an opinionated path to avoid overwhelming users in MDX documentation
Always include complete, runnable code examples that users can copy and execute in MDX documentation
Show proper error handling and edge case management in MDX code examples
Use realistic data instead of placeholder values in MDX code examples
Include expected outputs and results for verification in MDX code examples
Test all code examples thoroughly before publishing in MDX documentation
Specify language and include filename when relevant in MDX code examples
Add explanatory comments for complex logic in MDX code examples
Document all API...

Files:

  • docs/servers/tools.mdx
docs/**/*.{md,mdx,json}

📄 CodeRabbit inference engine (AGENTS.md)

Documentation uses Mintlify framework. Files must be in docs.json to be included. Never modify docs/python-sdk/** (auto-generated)

Files:

  • docs/servers/tools.mdx
docs/**/*.{md,mdx}

📄 CodeRabbit inference engine (AGENTS.md)

docs/**/*.{md,mdx}: Code examples in documentation must explain before showing code and make blocks fully runnable (include imports)
Documentation structure: Headers form navigation guide with logical H2/H3 hierarchy. Content should be user-focused with sections motivating features (why) before mechanics (how). Use prose over code comments for important information
Never use 'This isn't...' or 'not just...' constructions in writing - state what something IS directly. Avoid defensive writing patterns

Files:

  • docs/servers/tools.mdx
src/fastmcp/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/fastmcp/**/*.py: Python ≥ 3.10 with full type annotations required
Prioritize readable, understandable code - clarity over cleverness. Avoid obfuscated or confusing patterns even if shorter
Follow existing patterns and maintain consistency in code implementation
Be intentional about re-exports - don't blindly re-export everything to parent namespaces. Core types defining a module's purpose should be exported. Specialized features can live in submodules. Only re-export to fastmcp.* for most fundamental types
Never use bare except - be specific with exception types

Files:

  • src/fastmcp/tools/tool.py
  • src/fastmcp/tools/tool_transform.py
  • src/fastmcp/utilities/json_schema.py
🧠 Learnings (3)
📚 Learning: 2025-12-25T15:53:07.656Z
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-25T15:53:07.656Z
Learning: Applies to src/fastmcp/**/*.py : Python ≥ 3.10 with full type annotations required

Applied to files:

  • docs/servers/tools.mdx
  • src/fastmcp/tools/tool.py
📚 Learning: 2025-12-25T15:53:07.656Z
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-25T15:53:07.656Z
Learning: Applies to src/fastmcp/**/*.py : Follow existing patterns and maintain consistency in code implementation

Applied to files:

  • docs/servers/tools.mdx
📚 Learning: 2025-12-25T15:53:07.656Z
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-25T15:53:07.656Z
Learning: Applies to src/fastmcp/**/*.py : Be intentional about re-exports - don't blindly re-export everything to parent namespaces. Core types defining a module's purpose should be exported. Specialized features can live in submodules. Only re-export to fastmcp.* for most fundamental types

Applied to files:

  • src/fastmcp/tools/tool.py
🧬 Code graph analysis (2)
src/fastmcp/tools/tool.py (1)
src/fastmcp/utilities/json_schema.py (1)
  • compress_schema (286-323)
src/fastmcp/tools/tool_transform.py (1)
src/fastmcp/utilities/json_schema.py (1)
  • compress_schema (286-323)
🪛 Ruff (0.14.10)
src/fastmcp/utilities/json_schema.py

45-45: Consider moving this statement to an else block

(TRY300)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run tests: Python 3.10 on ubuntu-latest
  • GitHub Check: Run tests: Python 3.10 on windows-latest
  • GitHub Check: Run tests: Python 3.13 on ubuntu-latest
  • GitHub Check: Run tests with lowest-direct dependencies
🔇 Additional comments (5)
docs/servers/tools.mdx (1)

177-179: LGTM! Clear documentation of automatic dereferencing behavior.

The note effectively communicates the automatic $ref dereferencing feature and its purpose for MCP client compatibility. The placement in the Arguments section is appropriate since this affects how tool schemas are generated.

src/fastmcp/tools/tool.py (1)

37-37: LGTM! Import simplification aligns with centralized dereferencing.

The removal of resolve_root_ref import is correct now that compress_schema handles all $ref dereferencing internally via dereference_refs.

src/fastmcp/tools/tool_transform.py (1)

683-683: LGTM! Updated to use new compress_schema API.

Both callsites correctly updated to use compress_schema() without the removed prune_defs parameter. The new behavior automatically dereferences all $ref entries via dereference_refs(), which achieves the same goal of eliminating $defs from the output schema while being more thorough about inlining definitions.

Also applies to: 866-866

src/fastmcp/utilities/json_schema.py (2)

9-51: LGTM! Well-designed dereferencing with fallback for circular schemas.

The dereference_refs function properly handles the common case of inlining all $ref entries using jsonref's replace_refs, and gracefully falls back to resolve_root_ref for self-referencing/circular schemas that can't be fully dereferenced. The proxies=False and lazy_load=False parameters ensure immediate resolution to plain dicts, which is correct for the use case.

The automatic removal of $defs after successful dereferencing (lines 42-43) is appropriate since the definitions are now inlined.


53-89: LGTM! Fallback handles MCP spec requirements for circular schemas.

The resolve_root_ref function provides a sensible fallback for self-referencing schemas where full dereferencing isn't possible. By resolving only the root-level $ref while preserving $defs for nested references, it ensures the schema has an explicit "type": "object" at the root level (MCP spec requirement) while maintaining correctness for circular references.

The conditions (line 77) appropriately guard against resolving when not needed.

Comment thread src/fastmcp/utilities/json_schema.py

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 809f2392db

Comment on lines +36 to +40
# Use jsonref to resolve all $ref references
# proxies=False returns plain dicts (not proxy objects)
# lazy_load=False resolves immediately
dereferenced = replace_refs(schema, proxies=False, lazy_load=False)

P2 Badge Preserve local keywords when inlining $ref

replace_refs replaces any {"$ref": ...} object with the referenced schema, which drops sibling keywords on that node (e.g., description, default, examples, field-level constraints). Pydantic commonly emits those siblings when a field references a model with its own metadata, so dereferencing here can silently strip that metadata from tool schemas and change what clients see. Consider merging the referenced schema with the local node (minus $ref) or reapplying local keys after deref so field-level annotations survive inlining.
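
The suggested merge can be sketched in a few lines. This is an illustration, not the code in the PR: `inline_with_siblings` and the sample `field` and `status_def` values are made up, and letting local keywords override the referenced definition is one reasonable precedence choice among several.

```python
from typing import Any

# A referenced definition and a field node that points at it while
# carrying its own description and default (hypothetical sample data).
status_def = {
    "enum": ["active", "inactive"],
    "type": "string",
    "description": "Generic status",
}
field = {
    "$ref": "#/$defs/Status",
    "description": "Account status",
    "default": "active",
}


def inline_with_siblings(node: dict[str, Any], target: dict[str, Any]) -> dict[str, Any]:
    """Inline `target` for a `$ref` node, letting local keywords win."""
    siblings = {k: v for k, v in node.items() if k != "$ref"}
    return {**target, **siblings}


merged = inline_with_siblings(field, status_def)
```

After the merge, `merged` keeps the enum and type from the definition, takes `description` from the field, and gains the field's `default`.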

@jlowin jlowin added enhancement Improvement to existing functionality. For issues and smaller PR improvements. and removed bug Something isn't working. Reports of errors, unexpected behavior, or broken functionality. labels Jan 9, 2026
jlowin added 2 commits January 8, 2026 20:40
Pydantic places description, default, and examples as siblings to $ref.
The jsonref library drops these when inlining. This fix walks both the
original and dereferenced schemas in parallel, merging siblings back in.

Also fixes type annotations per CodeRabbit feedback.
@jlowin jlowin force-pushed the fix/dereference-input-schemas branch from e02a495 to 73ae9e3 Compare January 9, 2026 01:40
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/fastmcp/utilities/json_schema.py (2)

48-48: Replace assert with runtime check or type narrowing cast.

Using assert for type narrowing is problematic because assertions can be disabled with Python's -O flag, causing the type narrowing to be lost. If _merge_ref_siblings unexpectedly returns a non-dict at the top level, production code could silently fail.

♻️ Proposed fix using typing.cast
+from typing import Any, cast
+
 # ...
         dereferenced = replace_refs(schema, proxies=False, lazy_load=False)

         # Merge sibling keywords that were lost during dereferencing
         # Pydantic puts description, default, examples as siblings to $ref
         merged = _merge_ref_siblings(schema, dereferenced)
-        # Type assertion: top-level schema is always a dict
-        assert isinstance(merged, dict)
-        dereferenced = merged
+        # Top-level schema is always a dict after merging
+        dereferenced = cast(dict[str, Any], merged)

100-109: Simplify the complex list comprehension for better readability.

This nested list comprehension with conditional zip operations is difficult to understand and maintain, violating the coding guideline "Prioritize readable, understandable code - clarity over cleverness."

♻️ Proposed refactor for clarity
-    elif isinstance(original, list) and isinstance(dereferenced, list):
-        # Process list items in parallel
-        return [
-            _merge_ref_siblings(o, d) if i < len(original) else d
-            for i, (o, d) in enumerate(
-                zip(original, dereferenced, strict=False)
-                if len(original) <= len(dereferenced)
-                else zip(original[: len(dereferenced)], dereferenced, strict=False)
-            )
-        ] + (dereferenced[len(original) :] if len(dereferenced) > len(original) else [])
+    elif isinstance(original, list) and isinstance(dereferenced, list):
+        # Process list items in parallel
+        result = []
+        min_len = min(len(original), len(dereferenced))
+        
+        # Merge items at matching indices
+        for i in range(min_len):
+            result.append(_merge_ref_siblings(original[i], dereferenced[i]))
+        
+        # Append any remaining dereferenced items
+        if len(dereferenced) > min_len:
+            result.extend(dereferenced[min_len:])
+        
+        return result

As per coding guidelines, clarity should be prioritized over clever one-liners.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 809f239 and e02a495.

⛔ Files ignored due to path filters (1)
  • tests/utilities/test_json_schema.py is excluded by none and included by none
📒 Files selected for processing (1)
  • src/fastmcp/utilities/json_schema.py
🧰 Additional context used
📓 Path-based instructions (1)
src/fastmcp/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/fastmcp/**/*.py: Python ≥ 3.10 with full type annotations required
Prioritize readable, understandable code - clarity over cleverness. Avoid obfuscated or confusing patterns even if shorter
Follow existing patterns and maintain consistency in code implementation
Be intentional about re-exports - don't blindly re-export everything to parent namespaces. Core types defining a module's purpose should be exported. Specialized features can live in submodules. Only re-export to fastmcp.* for most fundamental types
Never use bare except - be specific with exception types

Files:

  • src/fastmcp/utilities/json_schema.py
🧠 Learnings (1)
📚 Learning: 2025-12-25T15:53:07.656Z
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-25T15:53:07.656Z
Learning: Applies to src/fastmcp/**/*.py : Python ≥ 3.10 with full type annotations required

Applied to files:

  • src/fastmcp/utilities/json_schema.py
🪛 Ruff (0.14.10)
src/fastmcp/utilities/json_schema.py

55-55: Consider moving this statement to an else block

(TRY300)

🔇 Additional comments (1)
src/fastmcp/utilities/json_schema.py (1)

347-384: LGTM - Dereferencing-first approach is well-structured.

The updated flow correctly prioritizes dereferencing before optimization:

  1. dereference_refs inlines all $ref entries and removes $defs (MCP compatibility)
  2. Parameter pruning operates on the dereferenced schema
  3. Additional optimizations run only when requested

Removing the prune_defs parameter from the public API is appropriate since dereferencing now handles definition cleanup.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/fastmcp/utilities/json_schema.py (2)

47-49: Avoid assert for runtime validation.

Using assert for type validation is problematic because assertions can be disabled with Python's -O flag and AssertionError doesn't provide descriptive context. Since this is a runtime safety check (not a development-time invariant), use a proper type guard or conditional check with a descriptive error.

♻️ Proposed fix
-        # Type assertion: top-level schema is always a dict
-        assert isinstance(merged, dict)
-        dereferenced = merged
+        # Top-level schema should always be a dict after merging
+        if not isinstance(merged, dict):
+            raise TypeError(f"Expected dict after merging, got {type(merged).__name__}")
+        dereferenced = merged
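
For comparison, a small standalone sketch of the trade-off between the options discussed in this review. `require_dict` is a hypothetical helper, not part of the PR. The point it illustrates is that `typing.cast` only informs the type checker and does nothing at runtime, whereas an explicit `isinstance` check fails loudly, including under `python -O`.

```python
from typing import Any, cast


def require_dict(value: Any) -> dict[str, Any]:
    """Return value unchanged if it is a dict, else raise a descriptive TypeError."""
    if not isinstance(value, dict):
        raise TypeError(f"Expected dict after merging, got {type(value).__name__}")
    return value


# cast() is a no-op at runtime: it returns its argument untouched,
# so a wrong value passes through silently.
not_a_dict = cast(dict[str, Any], ["oops"])

try:
    require_dict(not_a_dict)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Which option fits depends on whether a non-dict result is considered impossible by construction (where `cast` is enough for the type checker) or a condition worth guarding against at runtime.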

100-109: Simplify the list handling logic for better readability.

The current list-processing logic is overly complex with nested conditionals in the zip operation and a redundant length check on line 103. The condition if i < len(original) is always true within the enumerate because of how the zip is constructed. Additionally, using strict=False may mask bugs where the lists are expected to have the same structure.

Per the coding guidelines: "Prioritize readable, understandable code - clarity over cleverness."

♻️ Proposed simplification
-    elif isinstance(original, list) and isinstance(dereferenced, list):
-        # Process list items in parallel
-        return [
-            _merge_ref_siblings(o, d) if i < len(original) else d
-            for i, (o, d) in enumerate(
-                zip(original, dereferenced, strict=False)
-                if len(original) <= len(dereferenced)
-                else zip(original[: len(dereferenced)], dereferenced, strict=False)
-            )
-        ] + (dereferenced[len(original) :] if len(dereferenced) > len(original) else [])
+    elif isinstance(original, list) and isinstance(dereferenced, list):
+        # Process list items in parallel, handling different lengths
+        result = []
+        for i in range(len(dereferenced)):
+            if i < len(original):
+                result.append(_merge_ref_siblings(original[i], dereferenced[i]))
+            else:
+                result.append(dereferenced[i])
+        return result
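
The intended list behavior (merge items at matching indices, keep any extra dereferenced items unchanged, ignore extra original items) can also be written with slicing. The sketch below is an assumption-laden illustration: `merge_lists` and `tag` are hypothetical names, and `tag` stands in for the recursive `_merge_ref_siblings` call so the pairing is easy to see.

```python
from typing import Any, Callable


def merge_lists(
    original: list[Any],
    dereferenced: list[Any],
    merge: Callable[[Any, Any], Any],
) -> list[Any]:
    """Merge items pairwise; keep extra dereferenced items as they are."""
    shared = min(len(original), len(dereferenced))
    merged = [merge(o, d) for o, d in zip(original[:shared], dereferenced[:shared])]
    return merged + dereferenced[shared:]


def tag(original_item: Any, dereferenced_item: Any) -> tuple[Any, Any]:
    # Stand-in for the recursive merge: just record which items were paired.
    return (original_item, dereferenced_item)


longer_deref = merge_lists(["a"], ["x", "y"], tag)
longer_orig = merge_lists(["a", "b", "c"], ["x"], tag)
```

The output always has the same length as `dereferenced`, which matches what both the original comprehension and the proposed loops produce.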
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e02a495 and 73ae9e3.

⛔ Files ignored due to path filters (1)
  • tests/utilities/test_json_schema.py is excluded by none and included by none
📒 Files selected for processing (2)
  • docs/servers/tools.mdx
  • src/fastmcp/utilities/json_schema.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/servers/tools.mdx
🧰 Additional context used
📓 Path-based instructions (1)
src/fastmcp/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/fastmcp/**/*.py: Python ≥ 3.10 with full type annotations required
Prioritize readable, understandable code - clarity over cleverness. Avoid obfuscated or confusing patterns even if shorter
Follow existing patterns and maintain consistency in code implementation
Be intentional about re-exports - don't blindly re-export everything to parent namespaces. Core types defining a module's purpose should be exported. Specialized features can live in submodules. Only re-export to fastmcp.* for most fundamental types
Never use bare except - be specific with exception types

Files:

  • src/fastmcp/utilities/json_schema.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-25T15:53:07.656Z
Learning: Applies to src/fastmcp/**/*.py : Python ≥ 3.10 with full type annotations required
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-25T15:53:07.656Z
Learning: Applies to src/fastmcp/**/*.py : Follow existing patterns and maintain consistency in code implementation
📚 Learning: 2025-12-25T15:53:07.656Z
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-25T15:53:07.656Z
Learning: Applies to src/fastmcp/**/*.py : Python ≥ 3.10 with full type annotations required

Applied to files:

  • src/fastmcp/utilities/json_schema.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Run tests: Python 3.10 on windows-latest
🔇 Additional comments (2)
src/fastmcp/utilities/json_schema.py (2)

153-153: LGTM! Type annotations are correct.

The updated type annotations properly use dict[str, Any] syntax (Python 3.10+) and maintain consistency throughout the module. The removal of prune_defs from compress_schema's public API is appropriate since dereference_refs now handles removing $defs after inlining.

Also applies to: 175-179, 348-352


369-382: LGTM! Implementation correctly dereferences and optimizes.

The logic flow is correct:

  1. Dereference all $ref entries upfront (line 369) for MCP client compatibility
  2. Prune specific parameters if requested (lines 372-373)
  3. Apply optimizations only when needed (lines 376-382)

The hardcoded prune_defs=False on line 381 is appropriate since dereference_refs already removes $defs after inlining all references.

Comment on lines +63 to +112
def _merge_ref_siblings(
    original: dict[str, Any] | list | Any,
    dereferenced: dict[str, Any] | list | Any,
) -> dict[str, Any] | list | Any:
    """Merge sibling keywords from original $ref nodes into dereferenced schema.

    When jsonref resolves $ref, it replaces the entire node with the referenced
    definition, losing any sibling keywords like description, default, or examples.
    This function walks both trees in parallel and merges those siblings back.

    Args:
        original: The original schema with $ref and potential siblings
        dereferenced: The schema after jsonref processing

    Returns:
        The dereferenced schema with sibling keywords restored
    """
    if isinstance(original, dict) and isinstance(dereferenced, dict):
        # Check if original had a $ref with siblings
        if "$ref" in original:
            siblings = {k: v for k, v in original.items() if k not in ("$ref", "$defs")}
            if siblings:
                # Merge siblings into dereferenced, siblings take precedence
                # (local description overrides referenced definition's description)
                merged = dict(dereferenced)
                merged.update(siblings)
                return merged

        # Recurse into nested structures
        result = {}
        for key, value in dereferenced.items():
            if key in original:
                result[key] = _merge_ref_siblings(original[key], value)
            else:
                result[key] = value
        return result

    elif isinstance(original, list) and isinstance(dereferenced, list):
        # Process list items in parallel
        return [
            _merge_ref_siblings(o, d) if i < len(original) else d
            for i, (o, d) in enumerate(
                zip(original, dereferenced, strict=False)
                if len(original) <= len(dereferenced)
                else zip(original[: len(dereferenced)], dereferenced, strict=False)
            )
        ] + (dereferenced[len(original) :] if len(dereferenced) > len(original) else [])

    return dereferenced

Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "*test*.py" | head -20

Repository: jlowin/fastmcp

Length of output: 135


🏁 Script executed:

rg -l "json_schema|_merge_ref_siblings|dereference_refs" --type=py | grep -i test

Repository: jlowin/fastmcp

Length of output: 40


🏁 Script executed:

rg -n "_merge_ref_siblings" --type=py

Repository: jlowin/fastmcp

Length of output: 440


🏁 Script executed:

cat -n ./src/fastmcp/utilities/tests.py

Repository: jlowin/fastmcp

Length of output: 11290


🏁 Script executed:

rg -n "json_schema|dereference" ./src/fastmcp/utilities/tests.py -A 5 -B 2

Repository: jlowin/fastmcp

Length of output: 40


🏁 Script executed:

rg -n "def test" ./src/fastmcp/utilities/tests.py

Repository: jlowin/fastmcp

Length of output: 104


🏁 Script executed:

find . -type f -name "*test_json*" -o -name "*json*test*" | grep -E '\.py$'

Repository: jlowin/fastmcp

Length of output: 40


🏁 Script executed:

rg -l "dereference_refs" --type=py

Repository: jlowin/fastmcp

Length of output: 94


🏁 Script executed:

find . -path "*/tests/*" -name "*.py" -type f | grep -E "(json|schema)" | head -20

Repository: jlowin/fastmcp

Length of output: 40


🏁 Script executed:

find . -path "*/test*" -name "*.py" -type f | head -30

Repository: jlowin/fastmcp

Length of output: 169


🏁 Script executed:

rg -n "dereference_refs|_merge_ref_siblings|json_schema" --type=py -l | grep -i test

Repository: jlowin/fastmcp

Length of output: 40


🏁 Script executed:

cat -n src/fastmcp/utilities/json_schema.py | head -60

Repository: jlowin/fastmcp

Length of output: 2907


🏁 Script executed:

rg -n "import.*json_schema|from.*json_schema" --type=py

Repository: jlowin/fastmcp

Length of output: 1341


🏁 Script executed:

find . -type d -name tests | xargs -I {} find {} -name "*.py" -type f

Repository: jlowin/fastmcp

Length of output: 102


🏁 Script executed:

rg -n "json_schema|dereference_refs|_merge_ref_siblings" ./examples/testing_demo/tests/test_server.py -A 3 -B 2

Repository: jlowin/fastmcp

Length of output: 40


🏁 Script executed:

cat -n ./examples/testing_demo/tests/test_server.py | head -100

Repository: jlowin/fastmcp

Length of output: 3863


🏁 Script executed:

find . -type f -name "conftest.py" -o -type f -path "*/tests/test_*.py" | head -20

Repository: jlowin/fastmcp

Length of output: 102


🏁 Script executed:

find . -type f -name "*.py" -path "*/test*" | xargs grep -l "compress_schema\|dereference" 2>/dev/null

Repository: jlowin/fastmcp

Length of output: 42


🏁 Script executed:

rg -r "def compress_schema|def dereference_refs" --type=py -A 2

Repository: jlowin/fastmcp

Length of output: 454


🏁 Script executed:

find . -type f -name "test_*.py" -o -type f -name "*_test.py" | wc -l

Repository: jlowin/fastmcp

Length of output: 59


🏁 Script executed:

cat -n src/fastmcp/utilities/json_schema.py

Repository: jlowin/fastmcp

Length of output: 18076


Add comprehensive test coverage for the dereference/merge logic.

The dereference_refs and _merge_ref_siblings functions implement critical logic for inlining $ref definitions while preserving sibling keywords (description, default, examples). No tests currently exist for these functions. Add tests covering:

  • Simple cases with siblings alongside $ref
  • Nested structures with mixed $ref and siblings
  • Lists containing schemas with $ref
  • Edge cases like empty dicts/lists and mismatched structures
  • Root-level $ref resolution (resolve_root_ref)

The _merge_ref_siblings function now follows $ref pointers into $defs
to preserve siblings on nested references (e.g., a field with default
inside a referenced definition). Added tests for list handling (anyOf)
and nested sibling preservation.
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/fastmcp/utilities/json_schema.py (1)

6-6: Remove redundant lazy_load=False parameter.

The project uses jsonref>=1.1.0. The lazy_load parameter only applies when proxies=True, but this code uses proxies=False to get plain dict objects. Since lazy_load is ignored with proxies=False, remove it to avoid confusion:

dereferenced = replace_refs(schema, proxies=False)

Also applies to: 42-42

🧹 Nitpick comments (2)
src/fastmcp/utilities/json_schema.py (2)

64-82: Simplify overly broad type hints.

The type annotations dict[str, Any] | list | Any are redundant because Any subsumes all types. This makes the union meaningless and defeats the purpose of type hints.

♻️ Proposed type hint improvement
-def _merge_ref_siblings(
-    original: dict[str, Any] | list | Any,
-    dereferenced: dict[str, Any] | list | Any,
-    defs: dict[str, Any],
-) -> dict[str, Any] | list | Any:
+def _merge_ref_siblings(
+    original: Any,
+    dereferenced: Any,
+    defs: dict[str, Any],
+) -> Any:

Alternatively, if you want to be more specific about expected types:

+from typing import Any, Union
+
+_SchemaNode = Union[dict[str, Any], list[Any], str, int, float, bool, None]
+
 def _merge_ref_siblings(
-    original: dict[str, Any] | list | Any,
-    dereferenced: dict[str, Any] | list | Any,
+    original: _SchemaNode,
+    dereferenced: _SchemaNode,
     defs: dict[str, Any],
-) -> dict[str, Any] | list | Any:
+) -> _SchemaNode:

Based on coding guidelines requiring full type annotations for Python ≥ 3.10.


114-123: Simplify overly complex list handling logic.

The conditional expression on lines 119-121 is unnecessarily complex and hard to understand. Since zip(..., strict=False) already handles mismatched lengths by stopping at the shorter sequence, the explicit slicing in the else branch is redundant.

♻️ Proposed simplification
     elif isinstance(original, list) and isinstance(dereferenced, list):
         # Process list items in parallel
+        min_len = min(len(original), len(dereferenced))
         return [
-            _merge_ref_siblings(o, d, defs) if i < len(original) else d
-            for i, (o, d) in enumerate(
-                zip(original, dereferenced, strict=False)
-                if len(original) <= len(dereferenced)
-                else zip(original[: len(dereferenced)], dereferenced, strict=False)
-            )
-        ] + (dereferenced[len(original) :] if len(dereferenced) > len(original) else [])
+            _merge_ref_siblings(o, d, defs)
+            for o, d in zip(original, dereferenced, strict=False)
+        ] + dereferenced[min_len:]

This achieves the same result with clearer logic:

  1. zip with strict=False pairs up elements from both lists
  2. Any extra elements in dereferenced are appended without modification

Based on coding guidelines: "Prioritize readable, understandable code - clarity over cleverness."
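
The pairing behaviour described above can be checked in isolation. The sketch below is a standalone variant of the proposed simplification, not the code in json_schema.py: `merge_node` is a hypothetical placeholder for the per-item merge (here it only copies `description` across), and `merge_lists` is an illustrative name.

```python
from typing import Any


def merge_node(original: Any, dereferenced: Any) -> Any:
    """Hypothetical per-item merge: copy 'description' onto the inlined node."""
    if isinstance(original, dict) and isinstance(dereferenced, dict):
        if "description" in original:
            return {**dereferenced, "description": original["description"]}
    return dereferenced


def merge_lists(original: list[Any], dereferenced: list[Any]) -> list[Any]:
    # zip stops at the shorter list, so only paired positions are merged.
    merged = [merge_node(o, d) for o, d in zip(original, dereferenced)]
    # Append the unpaired tail of `dereferenced`; the slice is empty when
    # `dereferenced` is the shorter (or equal-length) list.
    return merged + dereferenced[len(merged):]
```

Slicing from `len(merged)` is equivalent to slicing from `min(len(original), len(dereferenced))`, because `zip` yields exactly that many pairs. Extra items in `original` are dropped in both forms.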

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 73ae9e3 and d8b95a0.

⛔ Files ignored due to path filters (1)
  • tests/utilities/test_json_schema.py is excluded by none and included by none
📒 Files selected for processing (1)
  • src/fastmcp/utilities/json_schema.py
🧰 Additional context used
📓 Path-based instructions (1)
src/fastmcp/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/fastmcp/**/*.py: Python ≥ 3.10 with full type annotations required
Prioritize readable, understandable code - clarity over cleverness. Avoid obfuscated or confusing patterns even if shorter
Follow existing patterns and maintain consistency in code implementation
Be intentional about re-exports - don't blindly re-export everything to parent namespaces. Core types defining a module's purpose should be exported. Specialized features can live in submodules. Only re-export to fastmcp.* for most fundamental types
Never use bare except - be specific with exception types

Files:

  • src/fastmcp/utilities/json_schema.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-25T15:53:07.656Z
Learning: Applies to src/fastmcp/**/*.py : Python ≥ 3.10 with full type annotations required
📚 Learning: 2025-12-25T15:53:07.656Z
Learnt from: CR
Repo: jlowin/fastmcp PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-25T15:53:07.656Z
Learning: Applies to src/fastmcp/**/*.py : Python ≥ 3.10 with full type annotations required

Applied to files:

  • src/fastmcp/utilities/json_schema.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run tests: Python 3.10 on windows-latest
  • GitHub Check: Run tests: Python 3.13 on ubuntu-latest
  • GitHub Check: Run tests: Python 3.10 on ubuntu-latest
  • GitHub Check: Run tests with lowest-direct dependencies
🔇 Additional comments (3)
src/fastmcp/utilities/json_schema.py (3)

13-17: LGTM!

The docstring clearly explains the problem (MCP clients not handling $ref properly) and the solution (preserving sibling keywords during dereferencing).


167-167: LGTM!

Type hints properly updated to dict[str, Any], consistent with Python ≥ 3.10 requirements from coding guidelines.

Also applies to: 189-189, 193-193, 362-362, 366-366


383-396: LGTM!

The flow is correct: dereference_refs inlines all definitions and removes $defs (for MCP client compatibility), then subsequent optimizations are applied. Setting prune_defs=False on line 395 is appropriate since $defs were already removed.

Added visited-node tracking to protect against potential infinite recursion
if $defs contain circular references. Also simplified type hints and list
handling logic.
@jlowin jlowin merged commit 8dfa513 into main Jan 9, 2026
11 checks passed
@jlowin jlowin deleted the fix/dereference-input-schemas branch January 9, 2026 02:16
@marvin-context-protocol
Contributor

Test Failure Analysis

Summary: The test test_base_authority_from_parameter is timing out on Windows when initializing AzureProvider due to SQLite database locking issues.

Root Cause:
The issue occurs during AzureProvider initialization when it creates a DiskStore for OAuth state storage (line 822 in oauth_proxy.py). All tests share the same SQLite database file at settings.home / "oauth-proxy", and on Windows:

  1. Tests run serially without parallelization (-v flag only)
  2. SQLite file locks from previous tests can persist
  3. The DiskStore initialization hangs when trying to connect to the locked database
  4. The test times out after 5 seconds (global pytest timeout)

The stack trace shows the hang occurs at:

File "diskcache/core.py", line 623, in _con
    con = self._local.con = sqlite3.connect(...)

Suggested Solution:

The best fix is to ensure each test uses an isolated database directory. This can be done by:

  1. Add a pytest fixture for temporary storage in tests/server/auth/providers/test_azure.py:
import pytest
from pathlib import Path

@pytest.fixture
def temp_oauth_storage(tmp_path):
    """Provide isolated storage for OAuth tests."""
    from key_value.aio.stores.disk import DiskStore
    from key_value.aio.wrappers.encryption import FernetEncryptionWrapper
    from cryptography.fernet import Fernet
    from fastmcp.server.auth.oauth_proxy import derive_jwt_key
    
    storage_encryption_key = derive_jwt_key(
        high_entropy_material="test-secret",
        salt="fastmcp-storage-encryption-key",
    )
    
    storage = FernetEncryptionWrapper(
        key_value=DiskStore(directory=tmp_path / "oauth-proxy"),
        fernet=Fernet(key=storage_encryption_key),
    )
    return storage
  2. Update tests to use the fixture:
def test_base_authority_from_parameter(self, temp_oauth_storage):
    """Test that base_authority can be set via parameter."""
    provider = AzureProvider(
        client_id="env-client-id",
        client_secret="env-secret",
        tenant_id="env-tenant-id",
        base_url="https://myserver.com",
        required_scopes=["read"],
        base_authority="login.microsoftonline.us",
        jwt_signing_key="test-secret",
        client_storage=temp_oauth_storage,  # Add this parameter
    )
    # ... rest of test
  3. Apply this pattern to all AzureProvider initialization tests that are currently using the default client_storage.

This ensures each test gets its own isolated SQLite database in a temporary directory that's automatically cleaned up after the test.

Detailed Analysis

Relevant log excerpts:

tests/server/auth/providers/test_azure.py::TestAzureProvider::test_base_authority_from_parameter 
+++++++++++++++++++++++++++++++++++ Timeout +++++++++++++++++++++++++++++++++++

~~~~~~~~~~~~~~~~~~~~~~~~~ Stack of MainThread (4184) ~~~~~~~~~~~~~~~~~~~~~~~~~~
  File "D:\a\fastmcp\fastmcp\src\fastmcp\server\auth\oauth_proxy.py", line 822, in __init__
    key_value=DiskStore(directory=settings.home / "oauth-proxy"),
  File "D:\a\fastmcp\fastmcp\.venv\lib\site-packages\diskcache\core.py", line 623, in _con
    con = self._local.con = sqlite3.connect(
+++++++++++++++++++++++++++++++++++ Timeout +++++++++++++++++++++++++++++++++++

Why Windows specifically?

  • The CI workflow runs tests serially on Windows (-v only) vs. parallel on Linux (--numprocesses auto)
  • SQLite on Windows has stricter file locking behavior
  • Leftover file handles from previous tests can block new connections

Why this test and not others?

  • Other similar tests (like test_base_authority_azure_government) likely complete before encountering the locked database
  • This is a race condition that depends on test execution order and timing

Related Files

  • tests/server/auth/providers/test_azure.py:354 - Failing test
  • src/fastmcp/server/auth/oauth_proxy.py:822 - Where DiskStore is created
  • src/fastmcp/server/auth/providers/azure.py:196 - AzureProvider initialization
  • src/fastmcp/settings.py:180 - Shared home directory definition
  • .github/workflows/run-tests.yml:55-59 - Windows test configuration without parallelization

@debnil-mercor

@jlowin: thank you for your amazing work on this repo! Any chance this could get released sooner than v3? I keep running into this issue when parsing outputs from Gemini 3 Pro, and having this fix in FastMCP would be amazing. I upgraded to 2.14.3 but it didn't fix this. Let me know if I should do something else.

@jlowin
Member Author

jlowin commented Jan 13, 2026

Hi @debnil-mercor - I've opened #2861 and will try to release it in the next couple of days

@debnil-mercor

@jlowin: thanks so much for the quick response! Let me know if I can help accelerate this.

Labels

enhancement Improvement to existing functionality. For issues and smaller PR improvements. server Related to FastMCP server implementation or server-side functionality. v3 Targeted for FastMCP 3
