
Incident Report: vLLM Embeddings Broken by encoding_format Parameter#21474

Merged
Sameerlite merged 2 commits into main from litellm_incident_report_vllm
Feb 18, 2026
Conversation

@Sameerlite

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement; see details)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

@vercel

vercel bot commented Feb 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Feb 18, 2026 1:11pm |


@greptile-apps

greptile-apps bot commented Feb 18, 2026

Greptile Summary

This PR bundles three unrelated changes: (1) a new incident report blog post for the vLLM embeddings encoding_format bug, (2) addition of 3 Mistral Devstral model aliases to the model pricing JSON, and (3) deletion of 2 existing tests from the hosted_vllm embedding test suite.

  • Incident report: Well-written root cause analysis and remediation documentation. However, the date range on line 22 ("Feb 16, 2025 - January 30, 2026") appears inconsistent — it mixes date formats and likely has a typo in the year, contradicting the stated "~3 hours" duration.
  • Model pricing: Adds mistral/devstral-small-latest, mistral/devstral-latest, and mistral/devstral-medium-latest entries with pricing consistent with existing sibling models (labs-devstral-small-2512, devstral-2512).
  • Test deletions: Removes test_validate_environment_without_api_key and test_encoding_format_float_sent_in_actual_request without justification. The latter is the complementary test to the remaining test_encoding_format_not_sent_in_actual_request and its removal reduces confidence that valid encoding_format values are forwarded correctly through the full call path. Per project requirements, PRs should include evidence that changes maintain code quality.

Confidence Score: 3/5

  • Low-risk changes (docs + model config), but test deletions reduce coverage for a previously broken feature.
  • Score of 3 reflects: the model pricing additions and blog post are low-risk, but the unexplained deletion of 2 tests (especially the encoding_format='float' E2E test that validated the fix worked) is a concern — it reduces test coverage for the very feature this incident report documents. The date inconsistency in the blog post needs correction before publishing.
  • tests/test_litellm/llms/hosted_vllm/embedding/test_hosted_vllm_embedding_transformation.py needs review for the test deletions, and docs/my-website/blog/vllm_embeddings_incident/index.md needs the date corrected.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/my-website/blog/vllm_embeddings_incident/index.md | New blog post for vLLM embeddings incident report. Contains an inconsistent date range ("Feb 16, 2025 - January 30, 2026") and deleted tests are referenced as done in the remediation table. |
| litellm/model_prices_and_context_window_backup.json | Adds 3 new Mistral Devstral model aliases (devstral-small-latest, devstral-latest, devstral-medium-latest) with pricing metadata. Entries look consistent with existing similar models. |
| tests/test_litellm/llms/hosted_vllm/embedding/test_hosted_vllm_embedding_transformation.py | Removes 2 valid tests (test_validate_environment_without_api_key and test_encoding_format_float_sent_in_actual_request) without explanation. These tests covered meaningful behavior and their removal reduces test coverage. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["PR #21474"] --> B["Incident Report Blog Post"]
    A --> C["Model Pricing Updates"]
    A --> D["Test File Changes"]
    
    B --> B1["docs/.../vllm_embeddings_incident/index.md\n(new file)"]
    
    C --> C1["model_prices_and_context_window_backup.json"]
    C1 --> C2["+ mistral/devstral-small-latest"]
    C1 --> C3["+ mistral/devstral-latest"]
    C1 --> C4["+ mistral/devstral-medium-latest"]
    
    D --> D1["test_hosted_vllm_embedding_transformation.py"]
    D1 --> D2["- test_validate_environment_without_api_key"]
    D1 --> D3["- test_encoding_format_float_sent_in_actual_request"]
    
    style B1 fill:#d4edda,stroke:#28a745
    style C2 fill:#d4edda,stroke:#28a745
    style C3 fill:#d4edda,stroke:#28a745
    style C4 fill:#d4edda,stroke:#28a745
    style D2 fill:#f8d7da,stroke:#dc3545
    style D3 fill:#f8d7da,stroke:#dc3545

Last reviewed commit: a9b7320


@greptile-apps greptile-apps bot left a comment


3 files reviewed, 2 comments


hide_table_of_contents: false
---

**Date:** Feb 16, 2025 - January 30, 2026

Inconsistent date range in incident report

The date range "Feb 16, 2025 - January 30, 2026" mixes date formats and likely has incorrect years. The blog post date field is set to 2026-02-18, so this range should probably read something like "Jan 30, 2026 - Feb 16, 2026" (or similar consistent format). As written, "Feb 16, 2025" appears to be a typo (2025 vs 2026), and the duration of nearly a year contradicts the "~3 hours" duration stated on the next line.

Suggested change
- **Date:** Feb 16, 2025 - January 30, 2026
+ **Date:** January 30, 2026 - February 16, 2026

@@ -306,61 +289,5 @@ def test_encoding_format_not_sent_in_actual_request(self):
assert sent_data["model"] == "BAAI/bge-small-en-v1.5"

Removed test for encoding_format='float' reduces coverage

The deleted test_encoding_format_float_sent_in_actual_request test verified that when a user explicitly passes encoding_format="float", it is actually sent in the HTTP request payload. This is the complementary test to test_encoding_format_not_sent_in_actual_request — together they confirmed both sides of the fix (omit when not provided, include when provided). Removing this test means there's no longer an E2E-style test verifying that valid encoding_format values are forwarded correctly through the full embedding call path.
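
The two-sided behavior described above (omit `encoding_format` when the caller does not set it, forward it when they do) can be sketched as a pair of unit tests. This is a minimal illustration, not LiteLLM's actual code: `build_embedding_payload` is a hypothetical helper standing in for the real request transformation, and the tests inspect the returned payload dict instead of a mocked HTTP call.

```python
from typing import Any, Optional


def build_embedding_payload(
    model: str,
    input_texts: list[str],
    encoding_format: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper (not a LiteLLM API): build an embeddings request body."""
    payload: dict[str, Any] = {"model": model, "input": input_texts}
    # Forward encoding_format only when the caller set it explicitly;
    # otherwise leave the key out of the request entirely.
    if encoding_format is not None:
        payload["encoding_format"] = encoding_format
    return payload


def test_encoding_format_omitted_when_not_provided() -> None:
    payload = build_embedding_payload("BAAI/bge-small-en-v1.5", ["hello"])
    assert "encoding_format" not in payload


def test_encoding_format_forwarded_when_provided() -> None:
    payload = build_embedding_payload(
        "BAAI/bge-small-en-v1.5", ["hello"], encoding_format="float"
    )
    assert payload["encoding_format"] == "float"
```

Keeping both tests matters because either one alone would still pass against a regression in the opposite direction: always dropping the parameter satisfies the first test, and always sending it satisfies the second.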

Context Used: Rule from dashboard - What: Ensure that any PR claiming to fix an issue includes evidence that the issue is resolved, such... (source)

@Sameerlite Sameerlite merged commit 3e0a723 into main Feb 18, 2026
19 of 27 checks passed
@ishaan-berri ishaan-berri deleted the litellm_incident_report_vllm branch March 26, 2026 22:29