fix: HTTP client memory leaks in Presidio, OpenAI, and Gemini #19190
Merged
krrishdholakia merged 7 commits into BerriAI:litellm_staging_01_20_2026 on Jan 20, 2026
Conversation
Member
@rsp2k is this ready for review? i see it marked as draft
Force-pushed from 043abf4 to 58e82f1
AlexsanderHamir approved these changes on Jan 17, 2026
Fixes multiple memory leak issues reported in BerriAI#14540 and related tickets:

**Presidio Guardrail Fix (BerriAI#14540)**
- Problem: Every guardrail check created a new aiohttp.ClientSession
- Impact: High-traffic proxies accumulated thousands of unclosed sessions
- Solution: Share a single session across all guardrail checks
  - Added `self._http_session` instance variable
  - Lazy session creation via `_get_http_session()`
  - Proper cleanup via `_close_http_session()` and `__del__()`
- Files: litellm/proxy/guardrails/guardrail_hooks/presidio.py

**OpenAI HTTP Client Caching (BerriAI#14540)**
- Problem: `_get_async_http_client()` created a new httpx.AsyncClient on each call
- Impact: OpenAI/Azure completions bypassed the client caching system
- Solution: Route through `get_async_httpx_client()` for TTL-based caching
  - Caches clients by provider and SSL config
  - Falls back to direct creation if caching fails
  - Applied to both async and sync client methods
- Files: litellm/llms/openai/common_utils.py

**Test Script**
- Added validation script to demonstrate the fixes
- Counts file descriptors and unclosed session objects
- Files: test_oom_fixes.py

Related issues: BerriAI#14384, BerriAI#13251, BerriAI#12443
…nt creation

Fixes two high-impact memory leaks:

1. Presidio Guardrail Session Leak (issue BerriAI#14540)
   - Problem: Created a new aiohttp.ClientSession on every guardrail check
   - Impact: Runs on EVERY proxy request when PII masking is enabled
   - Fix: Shared session pattern with lifecycle management
   - Files: litellm/proxy/guardrails/guardrail_hooks/presidio.py

2. OpenAI HTTP Client Cache Bypass (issue BerriAI#14540)
   - Problem: _get_async_http_client() created a new httpx.AsyncClient, bypassing the TTL cache
   - Impact: Every completion created a new client with its own connection pool
   - Fix: Route through get_async_httpx_client() for proper caching
   - Critical: Include SSL config in the cache key for correctness
   - Files: litellm/llms/openai/common_utils.py

Validation:
- Presidio: 100 requests → 0 new sessions (was 100)
- OpenAI: 100 calls → 1 unique client (was 100)
- test_oom_fixes.py: Automated validation script
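The TTL caching with SSL config in the cache key can be sketched like this. All names here (`get_cached_client`, `_client_cache`, `_CACHE_TTL_SECONDS`) are illustrative assumptions; the real fix routes through litellm's `get_async_httpx_client()`, whose internals may differ.

```python
import time

_client_cache = {}          # (provider, ssl_verify) -> (client, created_at)
_CACHE_TTL_SECONDS = 3600.0


def get_cached_client(provider, ssl_verify, factory):
    """Return a cached client for (provider, ssl config), creating one on miss."""
    # SSL config must be part of the key: a client cached with verification
    # disabled must never be returned to a caller that expects verification.
    key = (provider, ssl_verify)
    entry = _client_cache.get(key)
    now = time.monotonic()
    if entry is not None and now - entry[1] < _CACHE_TTL_SECONDS:
        return entry[0]  # cache hit within TTL: reuse the connection pool
    client = factory()  # real code would build an httpx.AsyncClient here
    _client_cache[key] = (client, now)
    return client
```

With this shape, 100 completion calls with the same provider and SSL settings share one client (and one connection pool), which is the "100 calls → 1 unique client" behavior described above.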
Fixes persistent "Unclosed client session" warnings when using Gemini models.

Root Causes:
1. Broken atexit cleanup - get_event_loop() fails at exit time
2. On-demand session creation without reliable cleanup

Changes:
1. Fixed atexit Cleanup (async_client_cleanup.py)
   - OLD: Used get_event_loop(), which fails when the loop is closed
   - NEW: Always create a fresh event loop at exit time
   - Ensures cleanup runs successfully even when the main loop is closed
2. Added __del__ Cleanup (aiohttp_handler.py)
   - Defense-in-depth: cleanup on garbage collection
   - Handles abnormal termination cases
   - Similar pattern to the Presidio guardrail fix
3. Enhanced Cleanup Scope (async_client_cleanup.py)
   - Now closes the global base_llm_aiohttp_handler instance
   - Previously only checked the cache, missing the module-level handler

Validation:
- Test 1: __del__ cleanup → 0 sessions leaked ✓
- Test 2: atexit cleanup → 0 sessions leaked ✓
- test_gemini_session_leak.py: Automated validation

Related: BerriAI#14540 (broader OOM issue tracking)
MyPy was failing because the llm_provider parameter expects Union[LlmProviders, httpxSpecialProvider], not a string. Changed the string "openai" to the LlmProviders.OPENAI enum value.
- Move test_oom_fixes.py to tests/test_litellm/llms/
- Move test_gemini_session_leak.py to tests/test_litellm/llms/custom_httpx/
- Fix pytest warning: use pytest.skip() instead of `return True`

This ensures CI actually runs our OOM fix validation tests.
…sion creation

- Make _get_http_session() async with asyncio.Lock protection
- Prevents multiple concurrent requests from creating orphaned sessions
- Add concurrent load test (50 parallel requests) to validate the fix
- Test confirms only 1 session is created under concurrent load

Critical fix: The previous implementation had a race condition where concurrent guardrail checks could create multiple sessions, defeating the shared session pattern and causing memory leaks.
Force-pushed from 411e3ee to 84234a7
Move asyncio.Lock creation from lazy initialization in _get_http_session() to __init__. The previous lazy init had a race condition: concurrent coroutines could both see _session_lock as None, both create locks, and end up with different lock instances, defeating the synchronization. asyncio.Lock() can be safely created without an event loop; it only requires one when awaited.
Contributor (Author)
Yes, ready for review! Just pushed a fix for a race condition in the Presidio session lock initialization - the lock was being created lazily, which could cause concurrent coroutines to end up with different lock instances.
Merged commit 58c8c2b into BerriAI:litellm_staging_01_20_2026
5 of 7 checks passed
This was referenced Feb 13, 2026
Fix: HTTP Client Memory Leaks (Issues #14540, #12443)
Fixes three high-impact memory leaks in LiteLLM's HTTP client lifecycle management.
Issues Addressed
Changes
1. Presidio Guardrail Session Leak (CRITICAL)

Impact: Every guardrail check created a new aiohttp.ClientSession
Fix: Shared session via _get_http_session(), with __del__ cleanup for safety
Files: litellm/proxy/guardrails/guardrail_hooks/presidio.py

2. OpenAI Client Caching Bypass

Impact: Every completion created a new client, bypassing LiteLLM's TTL cache
Fix: Route through get_async_httpx_client() for proper caching
Files: litellm/llms/openai/common_utils.py

3. Gemini aiohttp Session Leak (#12443)

Impact: Persistent "Unclosed client session" warnings
Fix: atexit cleanup now uses asyncio.new_event_loop() (was failing with get_event_loop()); added __del__ cleanup to BaseLLMAIOHTTPHandler for defense-in-depth; cleanup also closes the global base_llm_aiohttp_handler instance
Files: litellm/llms/custom_httpx/async_client_cleanup.py, litellm/llms/custom_httpx/aiohttp_handler.py

Validation
All fixes validated with automated tests:
- test_oom_fixes.py - Presidio + OpenAI validation (2/2 tests passing)
- test_gemini_session_leak.py - Gemini cleanup validation (3/3 tests passing)

Run tests:
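The kind of leak measurement these validation scripts perform (counting unclosed session objects) can be sketched in the spirit of test_oom_fixes.py. This is not the script's actual code: `Session` is a stand-in for aiohttp.ClientSession, and both handler functions are hypothetical, contrasting the leaky anti-pattern with the shared-session fix.

```python
import gc


class Session:
    """Stand-in for aiohttp.ClientSession for the counting demo."""


def count_live_sessions():
    """Count Session objects still reachable, after a full GC pass."""
    gc.collect()
    return sum(1 for obj in gc.get_objects() if isinstance(obj, Session))


def handle_requests_leaky(n):
    # Anti-pattern: a new session per request, none ever closed.
    return [Session() for _ in range(n)]


def handle_requests_shared(n):
    # Fixed pattern: one session reused for every request.
    session = Session()
    for _ in range(n):
        pass  # each request would use `session` here
    return [session]
```

Comparing the live-object count before and after a burst of N requests makes the leak visible as a delta of N for the leaky version versus 1 for the shared version.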
Context
This PR responds to @ishaan-jaff's request for collaboration on broader OOM issues in this comment.
The fixes follow a pattern of ensuring HTTP clients are managed through LiteLLM's centralized lifecycle system (LLMClientCache with TTL) rather than being created ad hoc per request.

Root Cause Pattern

All three leaks shared a common anti-pattern: code instantiated httpx.AsyncClient() or aiohttp.ClientSession() directly instead of going through litellm/llms/custom_httpx/http_handler.py.

Next Steps
Happy to continue working on remaining OOM issues if helpful:
cc @ishaan-jaff