Fix OTEL span redundancy, orphaned guardrail traces, and missing response IDs#23001
Harshit28j merged 1 commit into BerriAI:main
…onse IDs

Addresses 4 critical OpenTelemetry span issues in LiteLLM:

Issue #3: Remove redundant attributes from raw_gen_ai_request spans
- Removed self.set_attributes() call that was duplicating all parent span attributes (gen_ai.*, metadata.*) onto the raw span
- Raw span now only contains provider-specific llm.{provider}.* attributes
- Reduces storage and eliminates search confusion from duplicate data

Issue #4: Prevent attribute duplication on litellm_proxy_request parent span
- When litellm_request child span exists, removed redundant set_attributes() call on the parent proxy span
- Child span already carries all attributes; parent duplication doubles storage and complicates search

Issue #5: Fix orphaned guardrail traces
- Guardrail spans were created with context=None when no parent proxy span existed, resulting in orphaned root spans (separate trace_id)
- Added _resolve_guardrail_context() helper to ensure guardrails always have a valid parent (litellm_request or proxy span)
- Applied fix to both _handle_success and _handle_failure paths

Issue #8: Add gen_ai.response.id for embeddings and image generation
- EmbeddingResponse and ImageResponse types don't have provider response IDs
- Added fallback to standard_logging_payload["id"] (litellm call ID) for correlation across LiteLLM UI, Phoenix traces, and provider logs
- Completions still use provider ID (e.g. "chatcmpl-xxx") when available

Tests added:
- TestRawSpanAttributeIsolation: Verify raw span has no gen_ai/metadata attrs
- TestNoParentSpanDuplication: Verify parent span doesn't get duplicated attrs
- TestGuardrailSpanParenting: Verify guardrails are children (not orphaned)
- TestResponseIdFallback: Verify response ID set for all call types

All existing OTEL tests pass (73 passed, 14 pre-existing protocol failures).

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
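Greptile Summary
This PR fixes four OpenTelemetry span correctness issues in LiteLLM: redundant attributes on raw_gen_ai_request spans, attribute duplication on the litellm_proxy_request parent span, orphaned guardrail spans, and a missing gen_ai.response.id for embeddings and image generation.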
The implementation is sound: attribute isolation reduces storage footprint, guardrail parenting ensures proper trace hierarchy, and response ID fallback enables cross-system correlation. Tests are comprehensive (7 new tests covering all 4 fixes) with no real network calls.

Confidence Score: 4/5
| Filename | Overview |
|---|---|
| litellm/integrations/opentelemetry.py | All four OTEL fixes are correctly implemented: raw-request sub-span attribute isolation (Issue #3), parent proxy span de-duplication when child exists (Issue #4), guardrail span parenting via _resolve_guardrail_context helper to prevent orphaned traces (Issue #5), and response ID fallback for embeddings/image-gen (Issue #8). The attribute placement logic is sound, guardrail context resolution properly chains through span → parent_span → fallback, and the response ID fallback handles call types that lack provider IDs. Code is technically correct and isolated to OTEL integration only. |
| tests/test_litellm/integrations/test_opentelemetry.py | 7 new comprehensive unit tests added covering all 4 fixes: TestRawSpanAttributeIsolation, TestNoParentSpanDuplication, TestGuardrailSpanParenting (2 variants), and TestResponseIdFallback (3 variants). Tests use in-memory span exporters and mocks with no real network calls, consistent with repository policy. All tests pass alongside 73 existing OTEL tests. |
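
The response ID fallback described in the table can be illustrated with a minimal sketch. This is not the code from litellm/integrations/opentelemetry.py: the helper name `resolve_response_id` and the sample payloads are hypothetical, and the sketch only assumes, as the PR states, that completions carry a provider ID while embedding and image responses do not.

```python
from collections.abc import Mapping
from typing import Any, Optional


def resolve_response_id(
    response_obj: Any, standard_logging_payload: Mapping
) -> Optional[str]:
    """Hypothetical helper: choose the value for gen_ai.response.id."""
    # Completions usually carry a provider ID such as "chatcmpl-xxx".
    if isinstance(response_obj, Mapping):
        provider_id = response_obj.get("id")
    else:
        provider_id = getattr(response_obj, "id", None)
    if provider_id:
        return str(provider_id)
    # Embedding and image responses have no provider ID, so fall back to
    # the LiteLLM call ID stored in the standard logging payload.
    fallback_id = standard_logging_payload.get("id")
    return str(fallback_id) if fallback_id else None


# Illustrative inputs only; real response objects are richer than this.
completion = {"id": "chatcmpl-123"}
embedding = {"data": [{"embedding": [0.1, 0.2]}]}
payload = {"id": "litellm-call-abc"}

completion_id = resolve_response_id(completion, payload)
embedding_id = resolve_response_id(embedding, payload)
```

With these inputs the completion keeps its provider ID, while the embedding response is tagged with the LiteLLM call ID, which is what allows correlation across systems.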
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[_handle_success / _handle_failure] --> B{should_create_primary_span?}
B -- Yes --> C[Create litellm_request span\nset_attributes on child span]
B -- No --> D[Set attributes on parent_span\ndirectly - no child span]
C --> E{parent_span is\nproxy request span?}
E -- Yes\nOLD behavior --> F["set_attributes on parent_span\n(REMOVED in Issue #4)"]
E -- No / NEW behavior --> G[Skip parent span attributes]
F --> H[_resolve_guardrail_context]
G --> H
D --> H
H --> I{span not None?}
I -- Yes --> J[Use span as guardrail context]
I -- No --> K{parent_span not None?}
K -- Yes --> L[Use parent_span as guardrail context]
K -- No --> M[Use fallback_ctx\nmay be None]
J --> N[_create_guardrail_span\nas child of litellm_request]
L --> N
M --> O[_create_guardrail_span\nmaybe orphaned if ctx=None]
style F fill:#ffcccc,stroke:#cc0000
style N fill:#ccffcc,stroke:#006600
style O fill:#ffeecc,stroke:#cc6600
Last reviewed commit: 0b67b64
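
The span → parent_span → fallback chain in the flowchart can be sketched as follows. This is a simplified stand-in under stated assumptions, not the PR's `_resolve_guardrail_context` implementation: `FakeSpan` and `resolve_guardrail_parent` are hypothetical names, and real OpenTelemetry code would typically wrap the chosen span with `opentelemetry.trace.set_span_in_context` before passing it as a context.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeSpan:
    """Stand-in for an OpenTelemetry span; only a name is needed here."""
    name: str


def resolve_guardrail_parent(
    span: Optional[FakeSpan],
    parent_span: Optional[FakeSpan],
    fallback_ctx: Any = None,
) -> Any:
    """Hypothetical sketch of the parent selection order for guardrail spans."""
    # 1. Prefer the litellm_request span when it was created.
    if span is not None:
        return span
    # 2. Otherwise attach to the proxy request span.
    if parent_span is not None:
        return parent_span
    # 3. Last resort: whatever context the caller had (may still be None,
    #    in which case the guardrail span would start a new trace).
    return fallback_ctx


request_span = FakeSpan("litellm_request")
proxy_span = FakeSpan("litellm_proxy_request")

chosen_with_child = resolve_guardrail_parent(request_span, proxy_span)
chosen_proxy_only = resolve_guardrail_parent(None, proxy_span)
chosen_none = resolve_guardrail_parent(None, None)
```

The ordering matters: checking the child span first keeps guardrail spans nested under the LLM call they belong to, and only the final branch can still yield an orphaned span.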
Summary
Fixes 4 critical OpenTelemetry span issues in LiteLLM that cause data duplication, orphaned traces, and missing correlation IDs:
Issue 3: Redundant Data in raw_gen_ai_request Spans
- The raw_gen_ai_request span was duplicating all parent span attributes (gen_ai.*, metadata.*)
- Removed the self.set_attributes() call; the raw span now only contains provider-specific llm.{provider}.* attributes

Issue 4: Redundant litellm_request Spans
- When both the litellm_request child and litellm_proxy_request parent spans existed, attributes were duplicated on both
- Removed the redundant set_attributes() call on the parent proxy span

Issue 5: Orphaned Guardrail Traces
- Guardrail spans were created with context=None when no parent proxy span existed
- Added the _resolve_guardrail_context() helper to ensure guardrails always have a valid parent
- Applied the fix to both the _handle_success and _handle_failure paths

Issue 8: Missing LLM Call ID for Embeddings and Image Gen
- gen_ai.response.id was missing for embeddings and image generation calls
- EmbeddingResponse and ImageResponse don't have provider response IDs (unlike completions)
- Added a fallback to standard_logging_payload["id"] (litellm call ID)

Test Plan
✅ Added 7 tests covering all 4 fixes:
- TestRawSpanAttributeIsolation: verifies raw span isolation
- TestNoParentSpanDuplication: verifies no parent span duplication
- TestGuardrailSpanParenting (2 tests): verifies guardrails are never orphaned
- TestResponseIdFallback (3 tests): verifies response ID set for all call types
✅ The 73 existing OTEL tests pass (14 pre-existing protocol failures are unrelated to these changes)
✅ Code changes are isolated to OTEL integration only
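
The kind of check a raw-span isolation test performs can be sketched in a few lines. This is an assumption-laden illustration rather than the PR's test code: `unexpected_raw_span_keys` is a hypothetical helper, and the attribute keys other than the llm.{provider}.* prefix convention are sample values chosen for the example.

```python
from typing import Dict, List


def unexpected_raw_span_keys(attributes: Dict[str, str], provider: str) -> List[str]:
    """Hypothetical check: list keys that do not belong on the raw span."""
    allowed_prefix = f"llm.{provider}."
    return sorted(key for key in attributes if not key.startswith(allowed_prefix))


# Before the fix: parent attributes leaked onto the raw span (sample keys).
old_raw_attrs = {
    "llm.openai.messages": "[...]",
    "gen_ai.request.model": "gpt-4o",
    "metadata.user_api_key_alias": "team-a",
}
# After the fix: only provider-specific attributes remain.
new_raw_attrs = {"llm.openai.messages": "[...]"}

leaked_before = unexpected_raw_span_keys(old_raw_attrs, "openai")
leaked_after = unexpected_raw_span_keys(new_raw_attrs, "openai")
```

An assertion that the returned list is empty gives a compact regression guard against parent attributes reappearing on the raw span.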
Verified: