
fix(tools): prevent OOM with nested $defs in tool schemas #19112

Merged
krrishdholakia merged 1 commit into BerriAI:litellm_staging_01_16_2026 from rsp2k:fix/unpack-defs-oom
Jan 16, 2026
Conversation

@rsp2k
Contributor

@rsp2k rsp2k commented Jan 14, 2026

Summary

  • Remove "flatten defs" loop that caused exponential memory growth
  • unpack_defs() already handles nested refs recursively with circular detection

Test Plan

Closes #19098


Remove the "flatten defs" loop that pre-expanded each $def using
unpack_defs(). When definitions reference each other (common pattern),
this caused exponential memory growth because each subsequent call
copied already-expanded content.

Root cause: The loop was calling unpack_defs(value, defs_copy) for each
def, but since defs often reference each other, each call would copy
increasingly large expanded structures via deepcopy.

Fix: Call unpack_defs() only on the parameters, letting it handle nested
refs recursively with proper circular reference detection. This resolves
the same refs without the exponential memory explosion.

Testing showed:
- Before fix: 65KB schema → 2.8GB (35,000x expansion, then OOM)
- After fix: 65KB schema → 17MB (reasonable 260x expansion)

Changes:
- factory.py: Remove flatten loop in _bedrock_tools_pt()
- vertex_ai/common_utils.py: Remove flatten loop in _build_vertex_schema()
- Add regression test for OOM with nested refs
@vercel

vercel bot commented Jan 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Jan 14, 2026 9:25pm |

@DLakin01

Hey! I'm the author of #19098. Your fix will definitely help, and I have another suggestion: memoize the unpack_defs function so it caches refs and doesn't keep copying them into memory. If the schema is significantly nested and self-referential, memoizing saves a ton of memory.

@ishaan-jaff
Member

@rsp2k @DLakin01 - we're trying to do a better job of reducing OOM issues. Would you be open to working with us or giving us advice on how we can get this under control?

If yes, I'd love to get your advice; sharing a link to my cal for your convenience.

@krrishdholakia
Member

krrishdholakia commented Jan 15, 2026

@rsp2k is this ready for review? Your PR status shows draft.
[Screenshot: PR status showing "Draft", taken Jan 16 at 3:25 AM]

@rsp2k
Contributor Author

rsp2k commented Jan 15, 2026

In reply to @DLakin01's memoization suggestion:

Great suggestion! I explored this in a separate branch (benchmark/unpack-defs-memoization) with benchmarks using your actual schema data. Here's what I found:

TL;DR

The OOM fix (removing the flatten loop) is the right solution. Memoization doesn't provide additional benefit due to how unpack_defs() handles circular references.

Why Memoization Is Challenging Here

I tried several memoization approaches, but they all hit the same fundamental issue:

1. Context-Dependent Expansion

The expansion result depends on the ref_chain - the path of refs we've traversed to get here. This is how circular refs are detected:

Expression → Condition → Expression (STOP - circular!)

If we cache the "expanded Expression", it contains a $ref placeholder for the circular case. But a different reference to Expression (not nested inside itself) should expand fully. The cached version would be wrong.
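To make the context-dependence concrete, here is a minimal sketch of a ref_chain-based resolver (hypothetical names and structure, not LiteLLM's exact implementation): the same def expands differently depending on the chain of defs already being expanded above it, which is exactly what a single cache entry cannot capture.

```python
import copy

def unpack_defs(schema: dict, defs: dict, ref_chain: tuple = ()) -> None:
    """Inline $ref targets in place; stop when a ref is already on the
    chain of defs currently being expanded (a circular reference)."""
    if "$ref" in schema:
        name = schema["$ref"].split("/")[-1]
        if name in ref_chain:
            return  # circular: keep the $ref placeholder
        target = copy.deepcopy(defs[name])
        del schema["$ref"]
        schema.update(target)
        ref_chain = ref_chain + (name,)
    for value in schema.values():
        if isinstance(value, dict):
            unpack_defs(value, defs, ref_chain)

# Expression -> Condition -> Expression: the inner Expression ref is
# circular and stays a $ref, while the outer one expands fully.
defs = {
    "Expression": {"properties": {"cond": {"$ref": "#/$defs/Condition"}}},
    "Condition": {"properties": {"expr": {"$ref": "#/$defs/Expression"}}},
}
schema = {"$ref": "#/$defs/Expression"}
unpack_defs(schema, defs)
# The nested Expression reference survives as a $ref placeholder:
assert schema["properties"]["cond"]["properties"]["expr"] == {"$ref": "#/$defs/Expression"}
```

A cached "expanded Expression" would bake in that placeholder, so any non-circular reference to Expression resolved from the cache would come back under-expanded.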

2. Each Reference Needs Its Own Copy

Even if we cache expanded defs, we still need deepcopy() for each reference because the caller mutates the result in place. The main cost is the copy, not the traversal.
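A small illustration of that point (the cache and def name are hypothetical): because each call site mutates the resolved schema in place, handing out the cached dict itself would corrupt the cache, so every reference still pays for a deepcopy.

```python
import copy

# Hypothetical cache of one already-expanded def.
EXPANDED = {"Expression": {"type": "object", "properties": {"op": {"type": "string"}}}}

def resolve(name: str) -> dict:
    # Callers mutate the result in place, so sharing the cached dict
    # would corrupt it for everyone else; a deepcopy per reference is
    # unavoidable, and the copy (not the traversal) dominates the cost.
    return copy.deepcopy(EXPANDED[name])

a = resolve("Expression")
b = resolve("Expression")
a["properties"]["op"]["enum"] = ["add", "sub"]  # caller-specific mutation
assert "enum" not in b["properties"]["op"]      # other copies stay pristine
```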

3. Benchmarks Confirm No Benefit

| Metric | Current | Memoized (pre-expand) |
| --- | --- | --- |
| Memory | 124 KB | 291 KB (worse) |
| Time | 5.9 ms | 16.9 ms (worse) |

The two-phase memoization approach (pre-expand all defs, then resolve from cache) actually does more work because it deepcopies twice.

The Real Fix

The OOM was caused by this loop, which pre-expanded each def back into the shared defs dict:

# PROBLEMATIC - caused O(n!) memory growth:
for _, value in defs_copy.items():
    unpack_defs(value, defs_copy)  # Each iteration expands already-expanded content!

When defs reference each other (A→B→C), each iteration deepcopied already-expanded content, causing exponential growth. Removing this loop fixes the issue completely:

  • Before fix: 65KB → 2.8GB (35,000x) → OOM
  • After fix: 65KB → 17MB (260x) → Works fine
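The compounding copies can be reproduced with a toy resolver (illustrative only, not the library's code): running the flatten loop over two mutually referential defs inflates the defs dict, while expanding only the parameters resolves each nested ref once.

```python
import copy
import json

def unpack(schema, defs, chain=()):
    """Toy $ref inliner with circular-reference detection."""
    if not isinstance(schema, dict):
        return
    if "$ref" in schema:
        name = schema["$ref"].split("/")[-1]
        if name in chain:
            return  # circular: leave the $ref in place
        target = copy.deepcopy(defs[name])
        del schema["$ref"]
        schema.update(target)
        chain = chain + (name,)
    for value in schema.values():
        unpack(value, defs, chain)

# Two mutually referential defs, a common pattern in tool schemas.
DEFS = {
    "A": {"type": "object", "properties": {"b": {"$ref": "#/$defs/B"}}},
    "B": {"type": "object", "properties": {"a": {"$ref": "#/$defs/A"}}},
}

# The removed flatten loop: each pass deep-copies content that an earlier
# pass already expanded into the shared dict, so every def keeps growing.
flattened = copy.deepcopy(DEFS)
for value in flattened.values():
    unpack(value, flattened)

# The fix: expand only the parameters; nested refs resolve exactly once.
params = {"$ref": "#/$defs/A"}
unpack(params, copy.deepcopy(DEFS))

# The flattened defs are strictly larger than the single-pass expansion,
# and the gap compounds as more defs and more cross-references are added.
assert len(json.dumps(flattened)) > len(json.dumps(params))
```

With only two defs the difference is modest; with the dozens of cross-referencing defs in the real 65KB schema, each pass multiplies the size of the next, which is the growth the removed loop exhibited.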

Future Optimization Ideas

If further optimization is ever needed, possibilities include:

  • Lazy copy-on-write patterns
  • Structural sharing for immutable parts
  • Reference counting to skip rarely-used defs

But with the current fix, performance is already good (~120KB peak, <1ms per schema), so these aren't necessary.


Thanks again for the suggestion - it led to a deeper understanding of why the current approach works well! 🙏

@rsp2k rsp2k marked this pull request as ready for review January 16, 2026 00:02
@krrishdholakia krrishdholakia changed the base branch from main to litellm_staging_01_16_2026 January 16, 2026 23:19
@krrishdholakia krrishdholakia merged commit 5a9f6e9 into BerriAI:litellm_staging_01_16_2026 Jan 16, 2026
6 of 7 checks passed


Development

Successfully merging this pull request may close these issues.

[Bug]: unpack_defs() causes OOM with nested tool schemas (Bedrock/Vertex AI)

4 participants