
fix(gemini): handle 'minimal' reasoning_effort param for gemini-3.1-f…#22920

Merged
Sameerlite merged 1 commit into BerriAI:main from Varad2001:litellm_gemini_3.1_reasoning_effort
Mar 6, 2026

Conversation


@Varad2001 Varad2001 commented Mar 5, 2026

fix(gemini): handle 'minimal' reasoning_effort param for gemini-3.1-flash-lite-preview

Relevant issues

Fixes #22889

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory; adding at least 1 test is a hard requirement (see details)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable, but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes


vercel bot commented Mar 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
litellm | Ready | Preview, Comment | Mar 5, 2026 7:05pm

Request Review


greptile-apps bot commented Mar 5, 2026

Greptile Summary

This PR fixes a bug where reasoning_effort="minimal" was incorrectly mapped to thinkingLevel="low" (instead of "minimal") for gemini-3.1-flash-lite-preview and other gemini-3.1-flash variants. The root cause was that the is_gemini3flash guard only checked for "gemini-3-flash" as a substring, which does not match "gemini-3.1-flash-lite-preview" (note the .1 making them distinct strings). The fix adds "gemini-3.1-flash" as an additional OR condition.

Key changes:

  • vertex_and_google_ai_studio_gemini.py: Adds "gemini-3.1-flash" in model.lower() as a second OR branch in the is_gemini3flash check inside _map_reasoning_effort_to_thinking_level, so gemini-3.1-flash-lite-preview (and similar 3.1-flash variants) correctly receive thinkingLevel: "minimal" for reasoning_effort="minimal".
  • tests/llm_translation/test_gemini.py: Adds a regression test covering both the static-method level and the full return_raw_request flow for gemini-3.1-flash-lite-preview.
  • Concern: The is_gemini3flash variable is built from hardcoded substring checks. Per the project's custom rule, model-specific capability flags should live in model_prices_and_context_window.json and be read via get_model_info, so that new models automatically "just work" without requiring a code change and LiteLLM release.
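The substring guard the fix changes can be sketched as follows. This is a simplified stand-in, not the verbatim LiteLLM source; the helper name `is_gemini3_flash` is illustrative:

```python
from typing import Optional


def is_gemini3_flash(model: Optional[str]) -> bool:
    """Sketch of the fixed guard.

    The pre-fix check only looked for "gemini-3-flash", which is not a
    substring of "gemini-3.1-flash-lite-preview" (the ".1" breaks the
    match), so 3.1 variants fell through to the non-flash mapping.
    """
    if not model:
        return False
    lowered = model.lower()
    return "gemini-3-flash" in lowered or "gemini-3.1-flash" in lowered


# Why the old single-substring check missed the model in this PR:
assert "gemini-3-flash" not in "gemini-3.1-flash-lite-preview"
# The fixed guard matches both families:
assert is_gemini3_flash("gemini-3.1-flash-lite-preview")
assert is_gemini3_flash("gemini-3-flash-preview")
```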

Confidence Score: 3/5

  • The fix is logically correct and solves the reported bug, but it extends a hardcoded model-name pattern that violates the project's custom rule on model-specific flags.
  • The core logic change is accurate — adding "gemini-3.1-flash" in model.lower() correctly covers gemini-3.1-flash-lite-preview. A regression test is included and is mock-based. However, the approach of hardcoding model name strings to determine feature support violates the project's explicit custom rule (2605a1b1), which requires such flags to be data-driven via model_prices_and_context_window.json. Each new Gemini flash model variant would require another code change rather than a JSON update.
  • Pay attention to litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py — the is_gemini3flash check hardcodes model name substrings rather than reading from model metadata.

Important Files Changed

Filename Overview
litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py Adds "gemini-3.1-flash" as a substring check in is_gemini3flash so that models like gemini-3.1-flash-lite-preview correctly receive thinkingLevel: "minimal" instead of "low". The logic fix is correct, but hardcodes model-name strings rather than reading the capability from model_prices_and_context_window.json, violating the project's custom rule on model-specific flags.
tests/llm_translation/test_gemini.py Adds a new regression test that validates reasoning_effort="minimal" maps to thinkingLevel="minimal" for gemini-3.1-flash-lite-preview both at the static-method level and through the full return_raw_request flow. Test is mock-based (uses a fake API key via return_raw_request), consistent with existing patterns in this file.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["reasoning_effort param passed in"] --> B["_map_reasoning_effort_to_thinking_level(reasoning_effort, model)"]
    B --> C{"is_gemini3flash?\n'gemini-3-flash' in model\nOR 'gemini-3.1-flash' in model"}
    C -- Yes --> D{"reasoning_effort?"}
    C -- No --> E{"reasoning_effort?"}
    D -- minimal --> F["thinkingLevel: 'minimal'\nincludeThoughts: True"]
    D -- low --> G["thinkingLevel: 'low'\nincludeThoughts: True"]
    D -- medium --> H["thinkingLevel: 'medium'\nincludeThoughts: True"]
    D -- high --> I["thinkingLevel: 'high'\nincludeThoughts: True"]
    D -- "disable/none" --> J["thinkingLevel: 'minimal'\nincludeThoughts: False"]
    E -- minimal --> K["thinkingLevel: 'low'\nincludeThoughts: True"]
    E -- low --> L["thinkingLevel: 'low'\nincludeThoughts: True"]
    E -- medium --> M{"is_gemini31pro?"}
    M -- Yes --> N["thinkingLevel: 'medium'"]
    M -- No --> O["thinkingLevel: 'high'"]
    E -- high --> P["thinkingLevel: 'high'\nincludeThoughts: True"]
    E -- "disable/none" --> Q["thinkingLevel: 'low'\nincludeThoughts: False"]

    style F fill:#90EE90
    style K fill:#FFD700
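The branching shown in the flowchart can be expressed as a small lookup function. This is a hypothetical re-implementation of the mapping for illustration only; the function name, the 3.1-pro detection, and the exact return shape are assumptions, not the actual `_map_reasoning_effort_to_thinking_level` source:

```python
def map_reasoning_effort(effort: str, model: str) -> dict:
    """Illustrative sketch of the thinkingLevel mapping in the flowchart."""
    lowered = model.lower()
    is_flash = "gemini-3-flash" in lowered or "gemini-3.1-flash" in lowered
    is_31_pro = "gemini-3.1-pro" in lowered  # assumption: pro detection works similarly

    if is_flash:
        # Flash models pass effort levels through one-to-one.
        if effort in ("disable", "none"):
            return {"thinkingLevel": "minimal", "includeThoughts": False}
        table = {"minimal": "minimal", "low": "low", "medium": "medium", "high": "high"}
        return {"thinkingLevel": table[effort], "includeThoughts": True}

    # Non-flash models have no 'minimal' level: it degrades to 'low'.
    if effort in ("disable", "none"):
        return {"thinkingLevel": "low", "includeThoughts": False}
    if effort == "medium":
        return {
            "thinkingLevel": "medium" if is_31_pro else "high",
            "includeThoughts": True,
        }
    table = {"minimal": "low", "low": "low", "high": "high"}
    return {"thinkingLevel": table[effort], "includeThoughts": True}
```

The green and gold nodes in the flowchart correspond to the first branch ('minimal' preserved for flash models) and the non-flash fallback ('minimal' degraded to 'low'), respectively.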

Last reviewed commit: 6d4a281

Comment on lines 804 to 807
is_gemini3flash = model and (
    "gemini-3-flash-preview" in model.lower()
    or "gemini-3-flash" in model.lower()
    or "gemini-3.1-flash" in model.lower()
)

Hardcoded model-name strings for feature detection

The is_gemini3flash flag is determined by checking for hardcoded substring patterns ("gemini-3-flash", "gemini-3.1-flash"). According to the project's custom rule, model-specific feature flags should not be hardcoded in the source; they should instead be driven by a field in model_prices_and_context_window.json read via get_model_info. With the current approach, every new Gemini flash model that supports the "minimal" thinking level (e.g. a hypothetical gemini-3.2-flash) requires another code change and a new LiteLLM release, rather than a simple JSON update.

The same anti-pattern already exists in _map_reasoning_effort_to_thinking_budget (lines ~745-754), but this PR extends it further. The suggested approach would be to add a flag such as supports_minimal_thinking_level to each relevant model entry in model_prices_and_context_window.json and then check that flag here, similar to how supports_reasoning is used elsewhere in the codebase.

# Example of data-driven approach (conceptual):
model_info = litellm.get_model_info(model=model, custom_llm_provider="gemini")
is_gemini3flash = bool(model_info.get("supports_minimal_thinking_level"))

Context Used: Rule from dashboard - What: Do not hardcode model-specific flags in the codebase. Instead, put them in model_prices_and_co... (source)
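Expanding on the conceptual snippet above, a data-driven lookup would likely need to tolerate unknown models, since `litellm.get_model_info` raises for models it has no entry for. The flag name `supports_minimal_thinking_level` is the reviewer's proposed (hypothetical) field, not an existing key in model_prices_and_context_window.json:

```python
def supports_minimal_thinking(model: str) -> bool:
    """Hedged sketch of the data-driven capability check suggested above.

    Falls back to False for unknown models (or when litellm itself is
    unavailable) rather than raising.
    """
    try:
        import litellm

        info = litellm.get_model_info(model=model, custom_llm_provider="gemini")
    except Exception:
        return False  # unknown model: conservative default
    # Hypothetical flag proposed in the review; not yet in the JSON.
    return bool(info.get("supports_minimal_thinking_level"))
```

With this shape, supporting a hypothetical gemini-3.2-flash would be a one-line JSON change instead of a code change plus release.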


@Sameerlite Sameerlite left a comment


LGTM

@Sameerlite Sameerlite merged commit 57596ca into BerriAI:main Mar 6, 2026
28 of 38 checks passed

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Issue: "minimal" reasoning_effort not supported for gemini-3.1-flash-lite-preview

2 participants