175 changes: 175 additions & 0 deletions docs/my-website/blog/gemini_3_1_flash_lite/index.md
---
slug: gemini_3_1_flash_lite_preview
title: "DAY 0 Support: Gemini 3.1 Flash Lite Preview on LiteLLM"
date: 2026-03-03T08:00:00
authors:
- name: Sameer Kankute
title: SWE @ LiteLLM (LLM Translation)
url: https://www.linkedin.com/in/sameer-kankute/
image_url: https://pbs.twimg.com/profile_images/2001352686994907136/ONgNuSk5_400x400.jpg
- name: Krrish Dholakia
title: "CEO, LiteLLM"
url: https://www.linkedin.com/in/krish-d/
image_url: https://pbs.twimg.com/profile_images/1298587542745358340/DZv3Oj-h_400x400.jpg
- name: Ishaan Jaff
title: "CTO, LiteLLM"
url: https://www.linkedin.com/in/reffajnaahsi/
image_url: https://pbs.twimg.com/profile_images/1613813310264340481/lz54oEiB_400x400.jpg
description: "Guide to using Gemini 3.1 Flash Lite Preview on LiteLLM Proxy and SDK with day 0 support."
tags: [gemini, day 0 support, llms, supernova]
hide_table_of_contents: false
---


import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Gemini 3.1 Flash Lite Preview Day 0 Support

LiteLLM now supports `gemini-3.1-flash-lite-preview` with full day 0 support!

:::note
If you only need cost tracking, your current LiteLLM version works with no changes. To use the new features shipped alongside this model, such as thinking levels, upgrade to v1.80.8-stable.1 or later.
:::

## Deploy this version

<Tabs>
<TabItem value="docker" label="Docker">

```shell showLineNumbers title="docker run litellm"
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.80.8-stable.1
```

</TabItem>

<TabItem value="pip" label="Pip">

```shell showLineNumbers title="pip install litellm"
pip install litellm==1.80.8
```

</TabItem>
</Tabs>

## What's New

Gemini 3.1 Flash Lite Preview supports all four thinking levels:
- **MINIMAL**: Ultra-fast responses with minimal reasoning
- **LOW**: Simple instruction following
- **MEDIUM**: Balanced reasoning for complex tasks
- **HIGH**: Maximum reasoning depth (dynamic)

---

## Quick Start

<Tabs>
<TabItem value="sdk" label="SDK">

**Basic Usage**

```python
from litellm import completion

response = completion(
model="gemini/gemini-3.1-flash-lite-preview",
messages=[{"role": "user", "content": "Extract key entities from this text: ..."}],
)

print(response.choices[0].message.content)
```

**With Thinking Levels**

```python
from litellm import completion

# Use MEDIUM thinking for complex reasoning tasks
response = completion(
model="gemini/gemini-3.1-flash-lite-preview",
messages=[{"role": "user", "content": "Analyze this dataset and identify patterns"}],
    reasoning_effort="medium",  # one of: "minimal", "low", "medium", "high"
)

print(response.choices[0].message.content)
```

</TabItem>

<TabItem value="proxy" label="PROXY">

**1. Setup config.yaml**

```yaml
model_list:
- model_name: gemini-3.1-flash-lite
litellm_params:
model: gemini/gemini-3.1-flash-lite-preview
api_key: os.environ/GEMINI_API_KEY

# Or use Vertex AI
- model_name: vertex-gemini-3.1-flash-lite
litellm_params:
model: vertex_ai/gemini-3.1-flash-lite-preview
vertex_project: your-project-id
vertex_location: us-central1
```

**2. Start proxy**

```bash
litellm --config /path/to/config.yaml
```

**3. Make requests**

```bash
curl -X POST http://localhost:4000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <YOUR-LITELLM-KEY>" \
-d '{
"model": "gemini-3.1-flash-lite",
"messages": [{"role": "user", "content": "Extract structured data from this text"}],
"reasoning_effort": "low"
}'
```
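
The proxy speaks the OpenAI chat-completions wire format, so the same call works from any HTTP client. As a sketch, here is the request the curl command above sends, built with the Python standard library (the helper name and key placeholder are illustrative, and it assumes the proxy from step 2 is running on localhost:4000):

```python
import json
import urllib.request

def build_request(prompt: str, key: str = "<YOUR-LITELLM-KEY>") -> urllib.request.Request:
    """Build the same chat-completions request as the curl example above."""
    body = {
        "model": "gemini-3.1-flash-lite",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": "low",
    }
    return urllib.request.Request(
        "http://localhost:4000/v1/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {key}"},
        method="POST",
    )
```

Sending it is then a single `urllib.request.urlopen(build_request("..."))` call, or you can point the OpenAI SDK's `base_url` at the proxy instead.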

</TabItem>
</Tabs>

---

## Supported Endpoints

LiteLLM provides **full end-to-end support** for Gemini 3.1 Flash Lite Preview on:

- `/v1/chat/completions` - OpenAI-compatible chat completions endpoint
- `/v1/responses` - OpenAI Responses API endpoint (streaming and non-streaming)
- [`/v1/messages`](../../docs/anthropic_unified) - Anthropic-compatible messages endpoint
- `/v1/generateContent` - [Google Gemini API](../../docs/generateContent.md)-compatible endpoint

All endpoints support:
- Streaming and non-streaming responses
- Function calling with thought signatures
- Multi-turn conversations
- All Gemini 3-specific features (thinking levels, thought signatures)
- Full multimodal support (text, image, audio, video)
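
Multimodal input uses the standard OpenAI content-parts message format. As a minimal sketch, a user message combining text and an image (the helper name and URL are illustrative):

```python
def image_message(text: str, image_url: str) -> dict:
    """Build an OpenAI-style multimodal user message (text + image part)."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }
```

Pass the resulting dict in the `messages` list exactly as you would a plain-text message.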

---

## `reasoning_effort` Mapping for Gemini 3.1

LiteLLM automatically maps OpenAI's `reasoning_effort` parameter to Gemini's `thinkingLevel`:

| reasoning_effort | thinking_level | Use Case |
|------------------|----------------|----------|
| `minimal` | `minimal` | Ultra-fast responses, simple queries |
| `low` | `low` | Basic instruction following |
| `medium` | `medium` | Balanced reasoning for moderate complexity |
| `high` | `high` | Maximum reasoning depth, complex problems |
| `disable` | `minimal` | Disable extended reasoning |
| `none` | `minimal` | No extended reasoning |
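
The table above can be expressed as a plain lookup. This is an illustrative sketch of the documented mapping, not LiteLLM's internal implementation:

```python
# Illustrative lookup from OpenAI-style reasoning_effort values to Gemini
# thinkingLevel values, following the table above.
REASONING_EFFORT_TO_THINKING_LEVEL = {
    "minimal": "minimal",
    "low": "low",
    "medium": "medium",
    "high": "high",
    "disable": "minimal",
    "none": "minimal",
}

def map_reasoning_effort(effort: str) -> str:
    """Return the Gemini thinking level for an OpenAI reasoning_effort value."""
    return REASONING_EFFORT_TO_THINKING_LEVEL[effort]
```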
1 change: 1 addition & 0 deletions docs/my-website/docs/providers/gemini.md
| gemini-2.0-flash-lite-preview-02-05 | `completion(model='gemini/gemini-2.0-flash-lite-preview-02-05', messages)` | `os.environ['GEMINI_API_KEY']` |
| gemini-2.5-flash-preview-09-2025 | `completion(model='gemini/gemini-2.5-flash-preview-09-2025', messages)` | `os.environ['GEMINI_API_KEY']` |
| gemini-2.5-flash-lite-preview-09-2025 | `completion(model='gemini/gemini-2.5-flash-lite-preview-09-2025', messages)` | `os.environ['GEMINI_API_KEY']` |
| gemini-3.1-flash-lite-preview | `completion(model='gemini/gemini-3.1-flash-lite-preview', messages)` | `os.environ['GEMINI_API_KEY']` |
| gemini-flash-latest | `completion(model='gemini/gemini-flash-latest', messages)` | `os.environ['GEMINI_API_KEY']` |
| gemini-flash-lite-latest | `completion(model='gemini/gemini-flash-lite-latest', messages)` | `os.environ['GEMINI_API_KEY']` |

1 change: 1 addition & 0 deletions docs/my-website/docs/providers/vertex.md
| gemini-2.5-pro | `completion('gemini-2.5-pro', messages)`, `completion('vertex_ai/gemini-2.5-pro', messages)` |
| gemini-2.5-flash-preview-09-2025 | `completion('gemini-2.5-flash-preview-09-2025', messages)`, `completion('vertex_ai/gemini-2.5-flash-preview-09-2025', messages)` |
| gemini-2.5-flash-lite-preview-09-2025 | `completion('gemini-2.5-flash-lite-preview-09-2025', messages)`, `completion('vertex_ai/gemini-2.5-flash-lite-preview-09-2025', messages)` |
| gemini-3.1-flash-lite-preview | `completion('gemini-3.1-flash-lite-preview', messages)`, `completion('vertex_ai/gemini-3.1-flash-lite-preview', messages)` |

## Private Service Connect (PSC) Endpoints
