Incident Report: vLLM Embeddings Broken by encoding_format Parameter #21474
Sameerlite merged 2 commits into main
Greptile Summary
This PR bundles three unrelated changes: (1) a new incident report blog post for the vLLM embeddings
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| docs/my-website/blog/vllm_embeddings_incident/index.md | New blog post for the vLLM embeddings incident report. Contains an inconsistent date range ("Feb 16, 2025 - January 30, 2026"), and its remediation table marks as done tests that this PR deletes. |
| litellm/model_prices_and_context_window_backup.json | Adds 3 new Mistral Devstral model aliases (devstral-small-latest, devstral-latest, devstral-medium-latest) with pricing metadata. Entries look consistent with existing similar models. |
| tests/test_litellm/llms/hosted_vllm/embedding/test_hosted_vllm_embedding_transformation.py | Removes 2 valid tests (test_validate_environment_without_api_key and test_encoding_format_float_sent_in_actual_request) without explanation. These tests covered meaningful behavior and their removal reduces test coverage. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["PR #21474"] --> B["Incident Report Blog Post"]
A --> C["Model Pricing Updates"]
A --> D["Test File Changes"]
B --> B1["docs/.../vllm_embeddings_incident/index.md\n(new file)"]
C --> C1["model_prices_and_context_window_backup.json"]
C1 --> C2["+ mistral/devstral-small-latest"]
C1 --> C3["+ mistral/devstral-latest"]
C1 --> C4["+ mistral/devstral-medium-latest"]
D --> D1["test_hosted_vllm_embedding_transformation.py"]
D1 --> D2["- test_validate_environment_without_api_key"]
D1 --> D3["- test_encoding_format_float_sent_in_actual_request"]
style B1 fill:#d4edda,stroke:#28a745
style C2 fill:#d4edda,stroke:#28a745
style C3 fill:#d4edda,stroke:#28a745
style C4 fill:#d4edda,stroke:#28a745
style D2 fill:#f8d7da,stroke:#dc3545
style D3 fill:#f8d7da,stroke:#dc3545
Last reviewed commit: a9b7320
hide_table_of_contents: false
---
**Date:** Feb 16, 2025 - January 30, 2026
Inconsistent date range in incident report
The date range "Feb 16, 2025 - January 30, 2026" mixes date formats and likely has incorrect years. The blog post date field is set to 2026-02-18, so this range should probably read something like "Jan 30, 2026 - Feb 16, 2026" (or similar consistent format). As written, "Feb 16, 2025" appears to be a typo (2025 vs 2026), and the duration of nearly a year contradicts the "~3 hours" duration stated on the next line.
Suggested change
- **Date:** Feb 16, 2025 - January 30, 2026
+ **Date:** January 30, 2026 - February 16, 2026
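The date arithmetic behind this comment can be checked with a short sketch. It uses only Python's standard `datetime` module; the helper name `span_days` is illustrative and does not appear in the PR. The dates are read in the order written, start first.

```python
from datetime import date


def span_days(start: date, end: date) -> int:
    """Return the number of days from start to end (negative if end precedes start)."""
    return (end - start).days


# Range as written in the blog post: "Feb 16, 2025 - January 30, 2026".
as_written_days = span_days(date(2025, 2, 16), date(2026, 1, 30))

# Range proposed in the suggestion: "January 30, 2026 - February 16, 2026".
suggested_days = span_days(date(2026, 1, 30), date(2026, 2, 16))

print(as_written_days)  # 348
print(suggested_days)   # 17
```

The range as written spans 348 days, which matches the "nearly a year" observation in the comment.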
@@ -306,61 +289,5 @@ def test_encoding_format_not_sent_in_actual_request(self):
    assert sent_data["model"] == "BAAI/bge-small-en-v1.5"
Removed test for encoding_format='float' reduces coverage
The deleted test_encoding_format_float_sent_in_actual_request test verified that when a user explicitly passes encoding_format="float", it is actually sent in the HTTP request payload. This is the complementary test to test_encoding_format_not_sent_in_actual_request — together they confirmed both sides of the fix (omit when not provided, include when provided). Removing this test means there's no longer an E2E-style test verifying that valid encoding_format values are forwarded correctly through the full embedding call path.
Context Used: Rule from dashboard - What: Ensure that any PR claiming to fix an issue includes evidence that the issue is resolved, such...
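To illustrate the complementary pair of checks described above, here is a minimal, self-contained sketch. The helper `build_embedding_payload` is a hypothetical stand-in written for this example; it is not LiteLLM's actual request-building code, and the test names below are illustrative rather than the ones in the repository. It assumes the intended behavior is "omit `encoding_format` unless the caller sets it".

```python
from typing import Any, Dict, List, Optional


def build_embedding_payload(
    model: str,
    input_texts: List[str],
    encoding_format: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical payload builder; NOT LiteLLM's real implementation."""
    payload: Dict[str, Any] = {"model": model, "input": input_texts}
    # Forward encoding_format only when the caller set it explicitly.
    if encoding_format is not None:
        payload["encoding_format"] = encoding_format
    return payload


def test_encoding_format_omitted_when_not_provided() -> None:
    # Side 1 of the fix: nothing is sent when the user passes nothing.
    sent_data = build_embedding_payload("BAAI/bge-small-en-v1.5", ["hello"])
    assert sent_data["model"] == "BAAI/bge-small-en-v1.5"
    assert "encoding_format" not in sent_data


def test_encoding_format_float_forwarded_when_provided() -> None:
    # Side 2 of the fix: an explicit value reaches the payload unchanged.
    sent_data = build_embedding_payload(
        "BAAI/bge-small-en-v1.5", ["hello"], encoding_format="float"
    )
    assert sent_data["encoding_format"] == "float"
```

Keeping both tests means a regression in either direction (always sending the parameter, or always dropping it) would fail one of them.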
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
tests/litellm/ directory, Adding at least 1 test is a hard requirement - see details
make test-unit
@greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review
CI (LiteLLM team)
Branch creation CI run
Link:
CI run for the last commit
Link:
Merge / cherry-pick CI run
Links:
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes