fix: handles edge case which are blocked by Lock #20451

Merged
krrishdholakia merged 1 commit into BerriAI:litellm_oss_staging_02_10_2026 from Harshit28j:litellm_add_edge_case_handle
Feb 10, 2026

Conversation

Harshit28j (Collaborator) commented Feb 5, 2026

This PR handles an edge case from #19544 in which the silent experiment call is blocked by thread locks.
An example of the error is shown below.

Error:
{"message": "Starting silent experiment for model nvidia_dynamo/meta/llama-3.3-70b-instruct-fb8", "level": "INFO", "timestamp": "2026-02-03T04:32:49.916090"}
{"message": "Silent experiment failed for model nvidia_dynamo/meta/llama-3.3-70b-instruct-fb8: cannot pickle '_thread.RLock' object", "level": "ERROR", "timestamp": "2026-02-03T04:32:49.916387"}

Changes made:

  • Switched the silent experiment call to threading.Thread so that kwargs containing unpicklable values can be passed through. Previously we used an executor, which failed when kwargs contained unpicklable arguments.


greptile-apps bot (Contributor) commented Feb 5, 2026

Greptile Overview

Greptile Summary

Fixed pickling error when spawning silent experiment threads by replacing ThreadPoolExecutor.submit() with threading.Thread. The issue occurred because executor.submit() requires pickling arguments, which fails when kwargs contains unpicklable objects like _thread.RLock from OTEL spans or loggers. Using threading.Thread directly avoids pickling since arguments are passed directly in the same process.
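The two behaviours described above can be reproduced in isolation with the standard library. This is a minimal sketch, not the code from litellm/router.py: the `kwargs` dict, the `span_lock` key, and the `target` function are hypothetical stand-ins for request kwargs that carry an OTEL span or logger holding a lock. It shows that serializing such kwargs raises a `TypeError` (`copy.deepcopy` relies on the same reduction protocol and rejects the lock as well), while `threading.Thread` hands the same objects to the target by reference.

```python
import pickle
import threading


def pickle_error(obj) -> str:
    """Return the message raised when pickling obj, or '' if pickling succeeds."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return ""


# Hypothetical kwargs: one plain value and one lock, as a span or logger might hold.
kwargs = {"temperature": 0.2, "span_lock": threading.RLock()}

received = {}


def target(model, messages, **kw):
    # The thread receives the very same lock object; nothing was serialized.
    received["same_lock"] = kw["span_lock"] is kwargs["span_lock"]


thread = threading.Thread(
    target=target,
    args=("silent-model", []),  # placeholder model name and messages
    kwargs=kwargs,
    daemon=True,
)
thread.start()
thread.join()

print(pickle_error(kwargs))      # message naming the RLock as unpicklable
print(received["same_lock"])     # whether the lock was passed by reference
```

Any code path that serializes or deep-copies the kwargs will hit the same error regardless of how the thread is started, so copies made inside the target need the same care.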

Changes:

  • Replaced executor.submit() with threading.Thread(target=..., daemon=True) for silent experiment calls in litellm/router.py:1276-1282
  • Removed import of executor from litellm.litellm_core_utils.thread_pool_executor
  • Added null checks before logging retry/allowed fails policy to prevent potential errors

Trade-offs:

  • Pro: Fixes the pickling error for deployments with OTEL spans, loggers, and other unpicklable objects
  • Con: Moves from a bounded thread pool (max 100 threads) to unbounded thread creation, which could exhaust system resources under high traffic with many silent experiments
  • The daemon flag ensures threads don't prevent program exit

Testing:
Existing tests in tests/test_litellm/test_router_silent_experiment.py cover the silent experiment functionality and should validate this change works correctly.

Confidence Score: 4/5

  • This PR is safe to merge with minor consideration for potential resource usage under high load
  • The fix correctly addresses the pickling error by avoiding serialization altogether. The solution is straightforward and well-documented in comments. However, the move from bounded to unbounded thread creation introduces a potential resource management concern under high traffic, though this is unlikely to be an issue in most deployments given that silent experiments are typically low-volume background traffic.
  • No files require special attention - the change is localized and well-understood

Important Files Changed

| Filename | Overview |
|---|---|
| litellm/router.py | Replaced ThreadPoolExecutor with threading.Thread for silent experiment calls to fix pickling errors with unpicklable objects like RLock from OTEL spans; minor logging guard additions |

Sequence Diagram

sequenceDiagram
    participant Client
    participant Router
    participant MainThread
    participant SilentThread
    participant LiteLLM

    Client->>Router: completion(model, messages, **kwargs)
    Router->>Router: get_available_deployment()
    Router->>Router: Check for silent_model in litellm_params
    
    alt silent_model exists
        Router->>SilentThread: threading.Thread(target=_silent_experiment_completion, daemon=True)
        SilentThread->>SilentThread: thread.start()
        Note over SilentThread: Runs in background with kwargs<br/>(no pickling required)
        SilentThread->>Router: _silent_experiment_completion(silent_model, messages, **kwargs)
        SilentThread->>LiteLLM: completion(silent_model, messages, **silent_kwargs)
        Note over SilentThread: Silent experiment runs independently
    end
    
    Router->>MainThread: Continue primary request
    MainThread->>LiteLLM: completion(primary_model, messages, **kwargs)
    LiteLLM-->>MainThread: ModelResponse
    MainThread-->>Client: Return response
    
    Note over SilentThread: Background thread continues<br/>Results logged but not returned

greptile-apps bot (Contributor) left a comment

1 file reviewed, 1 comment

Comment on lines +1276 to +1282
thread = threading.Thread(
    target=self._silent_experiment_completion,
    args=(silent_model, messages),
    kwargs=kwargs,
    daemon=True,
)
thread.start()

Potential resource management concern: Unlike ThreadPoolExecutor, which had a bounded pool (MAX_THREADS=100), starting a new threading.Thread per request places no upper bound on the number of threads. If many requests with silent_model arrive simultaneously, this could create many threads and exhaust system resources. Consider adding a semaphore or bounded thread pool to limit concurrent silent experiments.
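One way the semaphore suggestion could look is sketched below. It is an illustration under stated assumptions, not code from the PR: `start_bounded_thread`, `MAX_SILENT_THREADS`, and `silent_slots` are hypothetical names, and the limit of 100 simply mirrors the MAX_THREADS value mentioned in the review. When no slot is free the experiment is skipped rather than queued, on the assumption that a silent experiment should never delay the primary request.

```python
import threading

MAX_SILENT_THREADS = 100  # assumption: mirrors the pool bound cited in the review
silent_slots = threading.BoundedSemaphore(MAX_SILENT_THREADS)


def start_bounded_thread(slots, target, *args, **kwargs) -> bool:
    """Run target in a daemon thread if a slot is free.

    Returns True if the thread was started, False if the limit was reached.
    """
    # Non-blocking acquire: never make the caller wait for a slot.
    if not slots.acquire(blocking=False):
        return False

    def runner():
        try:
            target(*args, **kwargs)
        finally:
            slots.release()  # free the slot even if target raises

    try:
        threading.Thread(target=runner, daemon=True).start()
    except RuntimeError:
        slots.release()  # the thread never started, so return the slot
        raise
    return True
```

A call site would then look like `start_bounded_thread(silent_slots, self._silent_experiment_completion, silent_model, messages, **kwargs)`, with a `False` return logged and otherwise ignored. Arguments are still passed by reference, so this keeps the property the PR relies on.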


@krrishdholakia krrishdholakia changed the base branch from main to litellm_oss_staging_02_10_2026 February 10, 2026 19:51
@krrishdholakia krrishdholakia merged commit 63a6542 into BerriAI:litellm_oss_staging_02_10_2026 Feb 10, 2026
6 of 8 checks passed