fix(ovhcloud): Edit models capabilities in model_prices_and_context_window.json#22905
Conversation
Greptile Summary: This PR refreshes the OVHcloud section of `model_prices_and_context_window.json`.
Confidence Score: 4/5
| Filename | Overview |
|---|---|
| model_prices_and_context_window.json | OVHcloud model entries fully replaced: 4 discontinued models removed, 12 models updated with revised pricing/context windows, and 3 new models added (Qwen3Guard-Gen-8B, Qwen3Guard-Gen-0.6B, Qwen3-Coder-30B-A3B-Instruct). Previously flagged issues (missing supports_vision, supports_reasoning, Mixtral flags) are now correctly restored. No test file was added despite the checklist claiming one was. File is also missing a trailing newline. |
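The summary flags a missing trailing newline in the registry file. A quick sanity check for that kind of issue can be sketched as below (the helper name and the inline sample are illustrative, not part of the PR):

```python
# Minimal sketch: verify a JSON registry's raw text parses and ends with a
# trailing newline, the two file-level issues flagged in the review summary.
import json

def check_registry(text: str) -> dict:
    """Return parse/newline status for a JSON registry's raw text."""
    status = {
        "valid_json": False,
        "trailing_newline": text.endswith("\n"),
    }
    try:
        json.loads(text)
        status["valid_json"] = True
    except json.JSONDecodeError:
        pass
    return status

# Illustrative sample: valid JSON, but no trailing newline.
sample = '{\n  "ovhcloud/gpt-oss-20b": {"mode": "chat"}\n}'
print(check_registry(sample))
```

In CI, a check like this would fail the build on the missing newline rather than leaving it to a reviewer to spot.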
Flowchart
    %%{init: {'theme': 'neutral'}}%%
    flowchart TD
    A[OVHcloud Model Registry Update] --> B{Model Status}
    B --> |Discontinued - Removed| C[Meta-Llama-3_1-70B-Instruct\nQwen2.5-Coder-32B-Instruct\nllava-v1.6-mistral-7b-hf\nmamba-codestral-7B-v0.1]
    B --> |Updated Entries| D[12 existing models\nNew pricing + corrected context windows]
    B --> |New Additions| E[Qwen3Guard-Gen-8B\nQwen3Guard-Gen-0.6B\nQwen3-Coder-30B-A3B-Instruct]
    D --> F{Capability changes}
    F --> |supports_vision restored| G[Qwen2.5-VL-72B-Instruct ✅\nMistral-Small-3.2-24B-Instruct ✅]
    F --> |supports_reasoning restored| H[DeepSeek-R1-Distill-Llama-70B ✅\nQwen3-32B ✅\ngpt-oss-20b ✅\ngpt-oss-120b ✅]
    F --> |function_calling upgraded| I[gpt-oss-20b: false → true\ngpt-oss-120b: false → true]
    F --> |context window reduced| J[Mistral-Nemo: 118K → 65536\nMistral-7B: 127K → 65536]
    E --> K{New model capabilities}
    K --> |explicit reasoning=false| L[Qwen3-Coder-30B-A3B-Instruct]
    K --> |no cost fields| M[Qwen3Guard-Gen-8B\nQwen3Guard-Gen-0.6B]
Last reviewed commit: b80b50c
Force-pushed 182d416 to 015845c
Force-pushed 015845c to 07baa95
        "max_input_tokens": 32768,
        "max_output_tokens": 32768
      },
      "ovhcloud/Meta-Llama-3_3-70B-Instruct": {
        "litellm_provider": "ovhcloud",
        "mode": "chat",
        "max_tokens": 131072,
        "max_input_tokens": 131072,
        "max_output_tokens": 131072,
        "input_cost_per_token": 7.4e-07,
        "output_cost_per_token": 7.4e-07,
        "supports_function_calling": true,
source field dropped from all updated models
Every previously-existing OVHcloud entry had a source URL (e.g. "source": "https://endpoints.ai.cloud.ovh.net/models/...") that linked to the endpoint documentation. All twelve updated model entries in this PR have removed that field entirely. This reduces discoverability and makes it harder for users and maintainers to verify pricing and capability information.
Please restore the `source` field on all updated entries. For example, `ovhcloud/Meta-Llama-3_3-70B-Instruct` had:

    "source": "https://endpoints.ai.cloud.ovh.net/models/meta-llama-3-3-70b-instruct"

This pattern applies to all twelve re-declared models (Llama-3.1-8B-Instruct, Qwen2.5-VL-72B-Instruct, Qwen3-Coder-30B-A3B-Instruct, Mistral-Small-3.2-24B-Instruct-2506, Mistral-Nemo-Instruct-2407, Mixtral-8x7B-Instruct-v0.1, Qwen3-32B, DeepSeek-R1-Distill-Llama-70B, gpt-oss-20b, Mistral-7B-Instruct-v0.3, gpt-oss-120b).
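An audit for missing `source` fields like the one described above could be sketched as follows (the in-memory registry dict is a toy example, not the real file):

```python
# Sketch: list entries for a given provider that lack a "source" URL,
# the omission this review comment asks to restore.

def entries_missing_source(registry: dict, provider: str = "ovhcloud") -> list:
    """Return model keys for the given provider that have no 'source' field."""
    return [
        name
        for name, info in registry.items()
        if isinstance(info, dict)
        and info.get("litellm_provider") == provider
        and "source" not in info
    ]

# Toy registry for illustration; real code would json.load the registry file.
registry = {
    "ovhcloud/gpt-oss-20b": {"litellm_provider": "ovhcloud", "mode": "chat"},
    "ovhcloud/Qwen3-32B": {
        "litellm_provider": "ovhcloud",
        "source": "https://endpoints.ai.cloud.ovh.net/models/qwen3-32b",
    },
}
print(entries_missing_source(registry))  # ['ovhcloud/gpt-oss-20b']
```

Running such a script over the PR branch would surface all twelve affected entries at once instead of relying on a manual diff read.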
No, it's not necessary.
      "ovhcloud/Mistral-Nemo-Instruct-2407": {
        "litellm_provider": "ovhcloud",
        "mode": "chat",
Context window reduced for Mistral-Nemo-Instruct-2407 and Mistral-7B-Instruct-v0.3
Two models have had their context windows significantly reduced compared to the previous entries:
- `ovhcloud/Mistral-Nemo-Instruct-2407`: 118,000 → 65,536 tokens (≈ 45% reduction)
- `ovhcloud/Mistral-7B-Instruct-v0.3` (line ~39141): 127,000 → 65,536 tokens (≈ 48% reduction)
Users who currently pass prompts between 65 K and 118 K / 127 K tokens to these models will start receiving context-limit errors after this change is deployed. Per the repo's guideline against backwards-incompatible changes without user-controlled flags, this should be documented in the PR description if it reflects a real OVHcloud endpoint limit change, or reverted if it does not.
Context Used: Rule from dashboard - What: avoid backwards-incompatible changes without user-controlled flags
Why: This breaks current ... (source)
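The failure mode described above can be made explicit client-side. A minimal sketch (the capability table and token counts are illustrative; real code would read the registry and count tokens with the model's tokenizer):

```python
# Sketch: fail fast when a prompt exceeds the model's reduced input window,
# instead of getting a context-limit error back from the endpoint.

MAX_INPUT_TOKENS = {
    "ovhcloud/Mistral-Nemo-Instruct-2407": 65536,  # was 118,000 before this PR
    "ovhcloud/Mistral-7B-Instruct-v0.3": 65536,    # was 127,000 before this PR
}

def check_prompt_fits(model: str, prompt_tokens: int) -> None:
    """Raise ValueError when the prompt exceeds the model's input window."""
    limit = MAX_INPUT_TOKENS.get(model)
    if limit is not None and prompt_tokens > limit:
        raise ValueError(
            f"{model}: prompt of {prompt_tokens} tokens exceeds "
            f"the {limit}-token input window"
        )

check_prompt_fits("ovhcloud/Mistral-Nemo-Instruct-2407", 60000)  # fits, no error
try:
    # A prompt that fit under the old 118K limit now fails.
    check_prompt_fits("ovhcloud/Mistral-Nemo-Instruct-2407", 118000)
except ValueError as e:
    print(e)
```

This is exactly the class of break the dashboard rule targets: callers in the 65K-118K range see no code change on their side, yet start erroring after deployment.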
Force-pushed 07baa95 to b80b50c
@greptileai Stop blocking my PR, the regressions are NORMAL, we do not use the
I understand your frustration. Most of these changes look reasonable if they reflect OVHcloud's current API state. However, there's one critical issue that needs clarification: `supports_function_calling` was flipped from false to true for gpt-oss-20b and gpt-oss-120b.

This is a significant capability upgrade. If OVHcloud's endpoints don't actually support function calling yet, users will get runtime errors when they try to use tools with these models. Can you confirm these two models now support function calling on OVHcloud's side? If so, the PR is good to go. If not, we need to revert those two flags to prevent breaking users. For everything else (removed models,
@greptileai Please update the confidence score, as the modifications for the gpt-oss are intended. |
Merged bc781a0 into BerriAI:litellm_oss_staging_03_06_2026
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- [ ] I have added testing in the `tests/test_litellm/` directory. Adding at least 1 test is a hard requirement - see details
- [ ] My PR passes all unit tests on `make test-unit`
- [ ] I have run a review with @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🐛 Bug Fix
Changes
Edit models capabilities for OVHcloud