Fix OTEL span redundancy, orphaned guardrail traces, and missing response IDs by Harshit28j · Pull Request #23001 · BerriAI/litellm

Harshit28j · 2026-03-06T23:02:39Z

Summary

Fixes 4 critical OpenTelemetry span issues in LiteLLM that cause data duplication, orphaned traces, and missing correlation IDs:

Issue 3: Redundant Data in raw_gen_ai_request Spans

The raw_gen_ai_request span was duplicating all parent span attributes (gen_ai., metadata.)
Removed self.set_attributes() call — raw span now only contains provider-specific llm.{provider}.* attributes
Impact: Reduces storage footprint and eliminates confusion from duplicate data

Issue 4: Redundant litellm_request Spans

When both litellm_request child and litellm_proxy_request parent spans existed, attributes were duplicated on both
Removed redundant set_attributes() call on parent proxy span
Impact: Child span carries all attributes; parent duplication is unnecessary

Issue 5: Orphaned Guardrail Traces

Guardrail spans were created with context=None when no parent proxy span existed
This resulted in orphaned root spans with separate trace_ids (not visible as children)
Added _resolve_guardrail_context() helper to ensure guardrails always have a valid parent
Applied fix to both _handle_success and _handle_failure paths
Impact: Guardrail traces now properly appear as children in Phoenix and other OTEL UIs

Issue 8: Missing LLM Call ID for Embeddings and Image Gen

gen_ai.response.id was missing for embeddings and image generation calls
EmbeddingResponse and ImageResponse don't have provider response IDs (unlike completions)
Added fallback to standard_logging_payload["id"] (litellm call ID)
Completions still use provider ID (e.g., "chatcmpl-xxx") when available
Impact: All call types can now be correlated across LiteLLM UI, Phoenix traces, and provider logs

Test Plan

✅ Added 7 comprehensive tests covering all 4 fixes:

TestRawSpanAttributeIsolation — verifies raw span isolation
TestNoParentSpanDuplication — verifies no parent span duplication
TestGuardrailSpanParenting (2 tests) — verifies guardrails are never orphaned
TestResponseIdFallback (3 tests) — verifies response ID set for all call types

✅ All 73 existing OTEL tests pass (14 pre-existing protocol failures unrelated to these changes)

✅ Code changes are isolated to OTEL integration only

Verified:

Redundant Data in raw_gen_ai_request Spans

Redundant litellm_request Spans

Orphaned Guardrail Traces

Missing LLM Call ID for Embeddings and Image Gen

…onse IDs Addresses 4 critical OpenTelemetry span issues in LiteLLM: Issue #3: Remove redundant attributes from raw_gen_ai_request spans - Removed self.set_attributes() call that was duplicating all parent span attributes (gen_ai.*, metadata.*) onto the raw span - Raw span now only contains provider-specific llm.{provider}.* attributes - Reduces storage and eliminates search confusion from duplicate data Issue #4: Prevent attribute duplication on litellm_proxy_request parent span - When litellm_request child span exists, removed redundant set_attributes() call on the parent proxy span - Child span already carries all attributes; parent duplication doubles storage and complicates search Issue #5: Fix orphaned guardrail traces - Guardrail spans were created with context=None when no parent proxy span existed, resulting in orphaned root spans (separate trace_id) - Added _resolve_guardrail_context() helper to ensure guardrails always have a valid parent (litellm_request or proxy span) - Applied fix to both _handle_success and _handle_failure paths Issue BerriAI#8: Add gen_ai.response.id for embeddings and image generation - EmbeddingResponse and ImageResponse types don't have provider response IDs - Added fallback to standard_logging_payload["id"] (litellm call ID) for correlation across LiteLLM UI, Phoenix traces, and provider logs - Completions still use provider ID (e.g. "chatcmpl-xxx") when available Tests added: - TestRawSpanAttributeIsolation: Verify raw span has no gen_ai/metadata attrs - TestNoParentSpanDuplication: Verify parent span doesn't get duplicated attrs - TestGuardrailSpanParenting: Verify guardrails are children (not orphaned) - TestResponseIdFallback: Verify response ID set for all call types All existing OTEL tests pass (73 passed, 14 pre-existing protocol failures). Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

vercel · 2026-03-06T23:02:43Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
litellm	Ready	Preview, Comment	Mar 6, 2026 11:09pm

greptile-apps · 2026-03-06T23:08:25Z

Greptile Summary

This PR fixes four OpenTelemetry span correctness issues in LiteLLM:

Raw span attribute isolation (Issue 3): Removes set_attributes() call from the raw-request sub-span, ensuring it only logs provider-specific llm.{provider}.* attributes rather than duplicating parent attributes.
Parent proxy span de-duplication (Issue 4): Removes redundant attribute duplication on the litellm_proxy_request parent span when a child litellm_request span exists. The child now carries all attributes; the parent remains shallow.
Guardrail span parenting (Issue 5): Adds _resolve_guardrail_context() helper to ensure guardrails always have a valid parent context (prioritizing span → parent_span → fallback_ctx), preventing orphaned root spans with separate trace IDs.
Response ID fallback for all call types (Issue 8): Adds fallback to standard_logging_payload["id"] for embeddings and image-gen calls that lack provider response IDs, ensuring all call types can be correlated across LiteLLM UI and Phoenix traces.

The implementation is sound: attribute isolation reduces storage footprint, guardrail parenting ensures proper trace hierarchy, and response ID fallback enables cross-system correlation. Tests are comprehensive (7 new tests covering all 4 fixes) with no real network calls.

Confidence Score: 4/5

All four OTEL span fixes are technically sound, correctly implemented, and comprehensively tested with no regressions in existing tests.
The code changes are functionally correct across all four fixes: raw span attribute isolation reduces storage duplication, parent proxy span de-duplication keeps hierarchy clean when child spans exist, guardrail span parenting prevents orphaned traces through a well-structured context resolution helper, and response ID fallback enables cross-system call correlation. The 7 new tests are thorough, use safe mocking patterns with no real network calls, and all existing tests pass. No logic errors or bugs identified. Score reflects high technical quality with no blocking issues.
No files require special attention.

Important Files Changed

Filename	Overview
litellm/integrations/opentelemetry.py	All four OTEL fixes are correctly implemented: raw-request sub-span attribute isolation (Issue #3), parent proxy span de-duplication when child exists (Issue #4), guardrail span parenting via _resolve_guardrail_context helper to prevent orphaned traces (Issue #5), and response ID fallback for embeddings/image-gen (Issue #8). The attribute placement logic is sound, guardrail context resolution properly chains through span → parent_span → fallback, and the response ID fallback handles call types that lack provider IDs. Code is technically correct and isolated to OTEL integration only.
tests/test_litellm/integrations/test_opentelemetry.py	7 new comprehensive unit tests added covering all 4 fixes: TestRawSpanAttributeIsolation, TestNoParentSpanDuplication, TestGuardrailSpanParenting (2 variants), and TestResponseIdFallback (3 variants). Tests use in-memory span exporters and mocks with no real network calls, consistent with repository policy. All tests pass alongside 73 existing OTEL tests.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[_handle_success / _handle_failure] --> B{should_create_primary_span?}
    B -- Yes --> C[Create litellm_request span\nset_attributes on child span]
    B -- No --> D[Set attributes on parent_span\ndirectly - no child span]
    C --> E{parent_span is\nproxy request span?}
    E -- Yes\nOLD behavior --> F["set_attributes on parent_span\n(REMOVED in Issue #4)"]
    E -- No / NEW behavior --> G[Skip parent span attributes]
    F --> H[_resolve_guardrail_context]
    G --> H
    D --> H
    H --> I{span not None?}
    I -- Yes --> J[Use span as guardrail context]
    I -- No --> K{parent_span not None?}
    K -- Yes --> L[Use parent_span as guardrail context]
    K -- No --> M[Use fallback_ctx\nmay be None]
    J --> N[_create_guardrail_span\nas child of litellm_request]
    L --> N
    M --> O[_create_guardrail_span\nmaybe orphaned if ctx=None]

    style F fill:#ffcccc,stroke:#cc0000
    style N fill:#ccffcc,stroke:#006600
    style O fill:#ffeecc,stroke:#cc6600

_{Last reviewed commit: 0b67b64}

vercel bot deployed to Preview March 6, 2026 23:09 View deployment

Harshit28j merged commit 497be5f into BerriAI:main Mar 7, 2026
27 of 38 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix OTEL span redundancy, orphaned guardrail traces, and missing response IDs#23001

Fix OTEL span redundancy, orphaned guardrail traces, and missing response IDs#23001
Harshit28j merged 1 commit intoBerriAI:mainfrom
Harshit28j:litellm_fix3458

Harshit28j commented Mar 6, 2026 •

edited

Loading

Uh oh!

vercel bot commented Mar 6, 2026 •

edited

Loading

Uh oh!

greptile-apps bot commented Mar 6, 2026 •

edited by Harshit28j

Loading

Important Files Changed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

Harshit28j commented Mar 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Issue 3: Redundant Data in raw_gen_ai_request Spans

Issue 4: Redundant litellm_request Spans

Issue 5: Orphaned Guardrail Traces

Issue 8: Missing LLM Call ID for Embeddings and Image Gen

Test Plan

Verified:

Uh oh!

vercel bot commented Mar 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

greptile-apps bot commented Mar 6, 2026 • edited by Harshit28j Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Harshit28j commented Mar 6, 2026 •

edited

Loading

vercel bot commented Mar 6, 2026 •

edited

Loading

greptile-apps bot commented Mar 6, 2026 •

edited by Harshit28j

Loading