fix: update gemini-live model endpoints and mode to realtime #22814
Chesars merged 5 commits into BerriAI:main
Conversation
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
The `gemini-live-2.5-flash-preview-native-audio-09-2025` model only works over WebSocket (Live API), not REST endpoints. `supported_endpoints` was changed from `/v1/chat/completions` to `/vertex_ai/live` to reflect the actual passthrough endpoint available in the LiteLLM proxy.
The `gemini/` prefix indicates Google AI Studio, which uses the OpenAI-compatible `/v1/realtime` endpoint rather than `/vertex_ai/live`.
The `mode` field is used by health checks to select the correct check method: a WebSocket check for `realtime` models versus a REST call for `chat` models.
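Concretely, the corrected entries would look roughly like this (an illustrative sketch based on the description above; pricing and other fields are omitted, so this is not the full diff):

```json
{
  "gemini-live-2.5-flash-preview-native-audio-09-2025": {
    "mode": "realtime",
    "supported_endpoints": ["/vertex_ai/live"]
  },
  "gemini/gemini-live-2.5-flash-preview-native-audio-09-2025": {
    "mode": "realtime",
    "supported_endpoints": ["/v1/realtime"]
  }
}
```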
Greptile Summary
Fixes incorrect configuration for the `gemini-live-2.5-flash-preview-native-audio-09-2025` model variants: `mode` is corrected from `"chat"` to `"realtime"`, and `supported_endpoints` now points at the WebSocket routes (`/vertex_ai/live` for the Vertex AI variant, `/v1/realtime` for the Google AI Studio variant).
Confidence Score: 5/5
| Filename | Overview |
|---|---|
| model_prices_and_context_window.json | Corrects mode from "chat" to "realtime" and updates supported_endpoints to the proper WebSocket endpoints for both gemini-live model variants. Changes are consistent and match the proxy server's registered websocket routes. |
| litellm/model_prices_and_context_window_backup.json | Backup file mirrors the exact same changes as the primary JSON file, keeping both files in sync. |
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["gemini-live-2.5-flash model"] --> B{mode?}
    B -->|"Before: chat"| C["acompletion() REST call"]
    C --> D["❌ Fails — model is WebSocket-only"]
    B -->|"After: realtime"| E["_realtime_health_check() WebSocket call"]
    E --> F["✅ Correct health check"]
    G["Vertex AI variant"] --> H["/vertex_ai/live endpoint"]
    I["Gemini API variant"] --> J["/v1/realtime endpoint"]
```
Last reviewed commit: 0e1a633
Relevant issues
Supersedes #18009
Pre-Submission checklist
make test-unit

Type
🐛 Bug Fix
Changes
The `gemini-live-2.5-flash-preview-native-audio-09-2025` model was incorrectly configured with `mode: "chat"` and REST API endpoints (`/v1/chat/completions`, `/v1/completions`), but this model only works over WebSockets (Realtime API).

What changed:
- `supported_endpoints`: updated to the correct realtime endpoints:
  - `gemini-live-*` (vertex_ai) → `/vertex_ai/live`
  - `gemini/gemini-live-*` (gemini) → `/v1/realtime`
- `mode`: changed from `"chat"` to `"realtime"` so health checks use `_realtime_health_check()` (WebSocket) instead of `acompletion()` (REST), which would fail for this model

Files changed:
- `model_prices_and_context_window.json`
- `litellm/model_prices_and_context_window_backup.json`
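The health-check behavior described above can be sketched as a small dispatch on the `mode` field. This is a hypothetical illustration, not LiteLLM's actual code; only the names `mode`, `_realtime_health_check`, and `acompletion` come from the PR description, and `pick_health_check` is an invented helper:

```python
# Hypothetical sketch of health-check dispatch driven by a model's "mode" field.
# pick_health_check() is illustrative and not part of LiteLLM's real API.

def pick_health_check(model_info: dict) -> str:
    """Return the name of the health-check method appropriate for a model entry."""
    if model_info.get("mode") == "realtime":
        # WebSocket-only models (e.g. gemini-live variants) need a realtime
        # check; a REST acompletion() call would fail for them.
        return "_realtime_health_check"
    # Default: REST-based completion call for chat-mode models.
    return "acompletion"

print(pick_health_check({"mode": "realtime"}))  # -> _realtime_health_check
print(pick_health_check({"mode": "chat"}))      # -> acompletion
```

With the old config (`mode: "chat"`), this dispatch would have routed the gemini-live models to the REST path, which is exactly the failure the PR fixes.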