
fix: update gemini-live model endpoints and mode to realtime#22814

Merged

Chesars merged 5 commits into BerriAI:main from Chesars:fix/gemini-live-supported-endpoints
Mar 4, 2026

Conversation

Chesars (Collaborator) commented Mar 4, 2026

Relevant issues

Supersedes #18009

Pre-Submission checklist

  • N/A - JSON config only, no code changes
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🐛 Bug Fix

Changes

The gemini-live-2.5-flash-preview-native-audio-09-2025 model was incorrectly configured with mode: "chat" and REST API endpoints (/v1/chat/completions, /v1/completions), but this model only works with WebSockets (Realtime API).

What changed:

  • supported_endpoints: Updated to correct realtime endpoints:
    • gemini-live-* (vertex_ai) → /vertex_ai/live
    • gemini/gemini-live-* (gemini) → /v1/realtime
  • mode: Changed from "chat" to "realtime" so health checks use _realtime_health_check() (WebSocket) instead of acompletion() (REST), which would fail for this model
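
To illustrate, the corrected entries would look roughly like this (a sketch based on the PR description; the real entries in model_prices_and_context_window.json also carry pricing and context-window fields that are omitted here):

```json
{
  "gemini-live-2.5-flash-preview-native-audio-09-2025": {
    "mode": "realtime",
    "supported_endpoints": ["/vertex_ai/live"]
  },
  "gemini/gemini-live-2.5-flash-preview-native-audio-09-2025": {
    "mode": "realtime",
    "supported_endpoints": ["/v1/realtime"]
  }
}
```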

Files changed:

  • model_prices_and_context_window.json
  • litellm/model_prices_and_context_window_backup.json

github-actions bot and others added 5 commits March 3, 2026 17:52
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Commit messages:

  • The gemini-live-2.5-flash-preview-native-audio-09-2025 model only works with WebSocket (Live API), not REST endpoints. Changed supported_endpoints from /v1/chat/completions to /vertex_ai/live to reflect the actual passthrough endpoint available in the LiteLLM proxy.
  • The gemini/ prefix indicates Google AI Studio, which uses the /v1/realtime endpoint (OpenAI-compatible), not /vertex_ai/live.
  • The mode field is used by health checks to determine the correct check method (WebSocket for realtime vs. REST for chat).
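
The mode-to-health-check dispatch described in that last commit message can be sketched as follows (the function bodies and the dict-based dispatch here are illustrative assumptions, not LiteLLM's actual implementation):

```python
import asyncio


async def _realtime_health_check(model: str) -> str:
    # Illustrative stand-in for a WebSocket-based probe of a realtime model.
    return f"{model}: websocket ok"


async def acompletion_health_check(model: str) -> str:
    # Illustrative stand-in for a REST acompletion() probe of a chat model.
    return f"{model}: rest ok"


# mode -> health check handler, mirroring the idea in the PR:
# "realtime" models must be probed over WebSocket; a REST call would fail.
HEALTH_CHECKS = {
    "realtime": _realtime_health_check,
    "chat": acompletion_health_check,
}


async def run_health_check(model: str, mode: str) -> str:
    # Fall back to the REST check for any unrecognized mode.
    handler = HEALTH_CHECKS.get(mode, acompletion_health_check)
    return await handler(model)


if __name__ == "__main__":
    print(asyncio.run(run_health_check(
        "gemini-live-2.5-flash-preview-native-audio-09-2025", "realtime")))
```

With the old config (mode: "chat"), the same model would have been routed to the REST check, which is exactly the failure this PR fixes.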
vercel bot commented Mar 4, 2026

The latest updates on your projects:

Project: litellm | Deployment: Error | Updated (UTC): Mar 4, 2026 10:44pm


CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ Chesars
❌ github-actions[bot]

greptile-apps bot (Contributor) commented Mar 4, 2026

Greptile Summary

Fixes incorrect configuration for gemini-live-2.5-flash-preview-native-audio-09-2025 models, which are WebSocket-only (Realtime API) but were misconfigured with REST API settings. Changes the mode from "chat" to "realtime" so health checks use _realtime_health_check() (WebSocket) instead of acompletion() (REST), and updates supported_endpoints to the correct WebSocket routes (/vertex_ai/live for Vertex AI, /v1/realtime for Gemini API).

  • Both the primary and backup JSON files are updated consistently
  • The endpoint values match the registered WebSocket routes in litellm/proxy/proxy_server.py
  • The mode: "realtime" value correctly maps to the realtime health check handler in litellm/litellm_core_utils/health_check_helpers.py
  • No code changes required — this is a JSON config-only fix, consistent with the project's convention of storing model-specific flags in model_prices_and_context_window.json

Confidence Score: 5/5

  • This PR is safe to merge — it corrects a config-only bug in JSON metadata with no code changes.
  • The changes are minimal, well-scoped, and correct. Both JSON files are updated consistently, the endpoint values match existing registered WebSocket routes, and the mode value maps to a valid health check handler. This is a straightforward bug fix with no risk of regression.
  • No files require special attention.

Important Files Changed

  • model_prices_and_context_window.json: Corrects mode from "chat" to "realtime" and updates supported_endpoints to the proper WebSocket endpoints for both gemini-live model variants. Changes are consistent and match the proxy server's registered WebSocket routes.
  • litellm/model_prices_and_context_window_backup.json: Backup file mirrors the exact same changes as the primary JSON file, keeping both files in sync.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["gemini-live-2.5-flash model"] --> B{mode?}
    B -->|"Before: chat"| C["acompletion() REST call"]
    C --> D["❌ Fails — model is WebSocket-only"]
    B -->|"After: realtime"| E["_realtime_health_check() WebSocket call"]
    E --> F["✅ Correct health check"]

    G["Vertex AI variant"] --> H["/vertex_ai/live endpoint"]
    I["Gemini API variant"] --> J["/v1/realtime endpoint"]

Last reviewed commit: 0e1a633

@Chesars Chesars merged commit 028dd3f into BerriAI:main Mar 4, 2026
29 of 37 checks passed
@Chesars Chesars deleted the fix/gemini-live-supported-endpoints branch March 4, 2026 22:47
