fix: handle edge cases that are blocked by Lock #20451
Greptile Overview
Greptile Summary
Fixed pickling error when spawning silent experiment threads by replacing ThreadPoolExecutor with threading.Thread.
Changes:
Trade-offs:
Testing:
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/router.py | Replaced ThreadPoolExecutor with threading.Thread for silent experiment calls to fix pickling errors with unpicklable objects like RLock from OTEL spans; minor logging guard additions |
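The row above attributes the failure to unpicklable objects such as `RLock`. The following is a minimal sketch of that failure mode, not the litellm code: it shows that a `threading.RLock` cannot be pickled, while `threading.Thread` hands keyword arguments to its target by reference. The names `try_pickle`, `run_in_thread`, and `span_lock` are hypothetical and used only for illustration.

```python
import pickle
import threading


def try_pickle(obj):
    """Return True if obj can be pickled, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


def run_in_thread(**kwargs):
    """Start a daemon thread with kwargs and return the object the target saw."""
    seen = []

    def target(**kw):
        # The target receives the very same object, not a serialized copy.
        seen.append(kw["span_lock"])

    thread = threading.Thread(target=target, kwargs=kwargs, daemon=True)
    thread.start()
    thread.join()
    return seen[0]


lock = threading.RLock()  # stands in for a lock held inside an OTEL span
print("picklable:", try_pickle(lock))
print("same object in thread:", run_in_thread(span_lock=lock) is lock)
```

Because the thread target shares the caller's objects, any mutation of `kwargs` inside the background call is visible to the primary request unless the callee copies what it changes.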
Sequence Diagram
sequenceDiagram
participant Client
participant Router
participant MainThread
participant SilentThread
participant LiteLLM
Client->>Router: completion(model, messages, **kwargs)
Router->>Router: get_available_deployment()
Router->>Router: Check for silent_model in litellm_params
alt silent_model exists
Router->>SilentThread: threading.Thread(target=_silent_experiment_completion, daemon=True)
SilentThread->>SilentThread: thread.start()
Note over SilentThread: Runs in background with kwargs<br/>(no pickling required)
SilentThread->>Router: _silent_experiment_completion(silent_model, messages, **kwargs)
SilentThread->>LiteLLM: completion(silent_model, messages, **silent_kwargs)
Note over SilentThread: Silent experiment runs independently
end
Router->>MainThread: Continue primary request
MainThread->>LiteLLM: completion(primary_model, messages, **kwargs)
LiteLLM-->>MainThread: ModelResponse
MainThread-->>Client: Return response
Note over SilentThread: Background thread continues<br/>Results logged but not returned
thread = threading.Thread(
    target=self._silent_experiment_completion,
    args=(silent_model, messages),
    kwargs=kwargs,
    daemon=True,
)
thread.start()
Potential resource management concern: Unlike ThreadPoolExecutor which had a bounded pool (MAX_THREADS=100), threading.Thread creates unbounded threads. If many requests with silent_model arrive simultaneously, this could create many threads and potentially exhaust system resources. Consider adding a semaphore or bounded thread pool to limit concurrent silent experiments.
Path: litellm/router.py
Line: 1276:1282
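One way to act on the reviewer's suggestion is to cap the number of concurrent background threads with a semaphore. The sketch below is an illustration under stated assumptions, not the fix adopted in this PR: `BoundedSpawner` and its `limit` parameter are hypothetical names, and dropping an experiment when the cap is reached is one possible policy among several (queueing or blocking are others).

```python
import threading


class BoundedSpawner:
    """Start daemon threads, but never more than `limit` at once (hypothetical helper)."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)

    def spawn(self, target, *args, **kwargs):
        """Run target in a daemon thread; return None if the cap is reached."""
        # Non-blocking acquire: the primary request is never delayed.
        if not self._slots.acquire(blocking=False):
            return None

        def _runner():
            try:
                target(*args, **kwargs)
            finally:
                # Free the slot even if the target raised.
                self._slots.release()

        thread = threading.Thread(target=_runner, daemon=True)
        try:
            thread.start()
        except RuntimeError:
            # The thread could not be started, so give the slot back.
            self._slots.release()
            raise
        return thread


# Usage: allow at most two concurrent background calls.
gate = threading.Event()
spawner = BoundedSpawner(limit=2)
first = spawner.spawn(gate.wait)
second = spawner.spawn(gate.wait)
third = spawner.spawn(gate.wait)  # cap reached, so this call is skipped
gate.set()
first.join()
second.join()
```

Skipping the experiment when the cap is hit keeps the primary request path free of waits, at the cost of losing some silent-experiment samples under load.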
63a6542 into BerriAI:litellm_oss_staging_02_10_2026
This PR handles an edge case from #19544 caused by thread locks.
See the example of the error below.
Changes made: