Day 0 gemini 3.1 flash lite preview support #22674
---
slug: gemini_3_1_flash_lite_preview
title: "DAY 0 Support: Gemini 3.1 Flash Lite Preview on LiteLLM"
date: 2026-03-03T08:00:00
authors:
  - name: Sameer Kankute
    title: SWE @ LiteLLM (LLM Translation)
    url: https://www.linkedin.com/in/sameer-kankute/
    image_url: https://pbs.twimg.com/profile_images/2001352686994907136/ONgNuSk5_400x400.jpg
  - name: Krrish Dholakia
    title: "CEO, LiteLLM"
    url: https://www.linkedin.com/in/krish-d/
    image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
  - name: Ishaan Jaff
    title: "CTO, LiteLLM"
    url: https://www.linkedin.com/in/reffajnaahsi/
    image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
description: "Guide to using Gemini 3.1 Flash Lite Preview on LiteLLM Proxy and SDK with day 0 support."
tags: [gemini, day 0 support, llms, supernova]
hide_table_of_contents: false
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Gemini 3.1 Flash Lite Preview Day 0 Support

LiteLLM now supports `gemini-3.1-flash-lite-preview` with full day 0 support!

:::note
If you only need cost tracking, your current LiteLLM version works without any change. To use the new features introduced alongside this model, such as thinking levels, upgrade to v1.80.8-stable.1 or above.
:::

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

```shell showLineNumbers title="docker run litellm"
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.80.8-stable.1
```

</TabItem>

<TabItem value="pip" label="Pip">

```shell showLineNumbers title="pip install litellm"
pip install litellm==1.80.8
```

</TabItem>
</Tabs>

## What's New

Supports all four thinking levels:

- **MINIMAL**: Ultra-fast responses with minimal reasoning
- **LOW**: Simple instruction following
- **MEDIUM**: Balanced reasoning for complex tasks
- **HIGH**: Maximum reasoning depth (dynamic)

---

## Quick Start

<Tabs>
<TabItem value="sdk" label="SDK">

**Basic Usage**

```python
from litellm import completion

response = completion(
    model="gemini/gemini-3.1-flash-lite-preview",
    messages=[{"role": "user", "content": "Extract key entities from this text: ..."}],
)

print(response.choices[0].message.content)
```

**With Thinking Levels**

```python
from litellm import completion

# Use MEDIUM thinking for complex reasoning tasks
response = completion(
    model="gemini/gemini-3.1-flash-lite-preview",
    messages=[{"role": "user", "content": "Analyze this dataset and identify patterns"}],
    reasoning_effort="medium",  # minimal, low, medium, high
)

print(response.choices[0].message.content)
```

</TabItem>

<TabItem value="proxy" label="PROXY">

**1. Setup config.yaml**

```yaml
model_list:
  - model_name: gemini-3.1-flash-lite
    litellm_params:
      model: gemini/gemini-3.1-flash-lite-preview
      api_key: os.environ/GEMINI_API_KEY

  # Or use Vertex AI
  - model_name: vertex-gemini-3.1-flash-lite
    litellm_params:
      model: vertex_ai/gemini-3.1-flash-lite-preview
      vertex_project: your-project-id
      vertex_location: us-central1
```

**2. Start proxy**

```bash
litellm --config /path/to/config.yaml
```

**3. Make requests**

```bash
curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR-LITELLM-KEY>" \
  -d '{
    "model": "gemini-3.1-flash-lite",
    "messages": [{"role": "user", "content": "Extract structured data from this text"}],
    "reasoning_effort": "low"
  }'
```

</TabItem>
</Tabs>

---

## Supported Endpoints

LiteLLM provides **full end-to-end support** for Gemini 3.1 Flash Lite Preview on:

- ✅ `/v1/chat/completions` - OpenAI-compatible chat completions endpoint
- ✅ `/v1/responses` - OpenAI Responses API endpoint (streaming and non-streaming)
- ✅ [`/v1/messages`](../../docs/anthropic_unified) - Anthropic-compatible messages endpoint
- ✅ `/v1/generateContent` - [Google Gemini API](../../docs/generateContent.md) compatible endpoint

All endpoints support:

- Streaming and non-streaming responses
- Function calling with thought signatures
- Multi-turn conversations
- All Gemini 3-specific features (thinking levels, thought signatures)
- Full multimodal support (text, image, audio, video)
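
With `stream=True`, responses arrive as OpenAI-style chunks whose `choices[0].delta.content` fragments you concatenate. A minimal sketch of that accumulation pattern, using stub chunks in place of a live `completion(..., stream=True)` call so no API key is needed:

```python
from types import SimpleNamespace

# Stub chunks standing in for what `completion(..., stream=True)` yields;
# each chunk carries a partial message in choices[0].delta.content.
def fake_stream():
    for piece in ["Gemini ", "3.1 ", "Flash ", "Lite"]:
        yield SimpleNamespace(
            choices=[SimpleNamespace(delta=SimpleNamespace(content=piece))]
        )

# Accumulate the streamed text the same way you would for a real stream.
text = ""
for chunk in fake_stream():
    delta = chunk.choices[0].delta.content
    if delta:  # final chunks may carry no content
        text += delta

print(text)  # Gemini 3.1 Flash Lite
```

Swap `fake_stream()` for the real `completion(..., stream=True)` iterator and the loop body stays the same.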

---

## `reasoning_effort` Mapping for Gemini 3.1

LiteLLM automatically maps OpenAI's `reasoning_effort` parameter to Gemini's `thinkingLevel`:

| reasoning_effort | thinking_level | Use Case |
|------------------|----------------|----------|
| `minimal` | `minimal` | Ultra-fast responses, simple queries |
| `low` | `low` | Basic instruction following |
| `medium` | `medium` | Balanced reasoning for moderate complexity |
| `high` | `high` | Maximum reasoning depth, complex problems |
| `disable` | `minimal` | Disable extended reasoning |
| `none` | `minimal` | No extended reasoning |
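
Illustratively, the table above behaves like a simple lookup. This is a sketch of the documented mapping only, not LiteLLM's actual internals (the names below are hypothetical):

```python
# Hypothetical sketch of the reasoning_effort -> thinkingLevel mapping
# described in the table above; not LiteLLM's real implementation.
EFFORT_TO_THINKING_LEVEL = {
    "minimal": "minimal",
    "low": "low",
    "medium": "medium",
    "high": "high",
    "disable": "minimal",  # disabling extended reasoning falls back to minimal
    "none": "minimal",     # same fallback
}

def to_thinking_level(reasoning_effort: str) -> str:
    """Return the Gemini thinkingLevel for an OpenAI-style reasoning_effort."""
    return EFFORT_TO_THINKING_LEVEL[reasoning_effort]
```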

---

**Review comments**

> **Blog `reasoning_effort` mapping table is inaccurate for this model** (Contributor, on lines +170 to +175): The mapping table claims […] As a result, for […] Either the code needs to be updated to recognize […]

> **Outdated version references in blog post**: The blog post references `v1.80.8-stable.1` for both Docker and pip install, but the current litellm version is `1.82.0` (per `pyproject.toml`). These version references appear outdated. Additionally, `pip install litellm==v1.80.8-stable.1` uses a `v` prefix which is non-standard for pip version specifiers; it should typically be `pip install litellm==1.80.8`.