
fix(openllmetry): support new gen_ai.input/output.messages format (v0.55.0+)#2931

Merged
nate-mar merged 14 commits into main from fix/openllmetry-genai-semconv-v055 on Apr 1, 2026

Conversation

@nate-mar
Contributor

@nate-mar nate-mar commented Mar 30, 2026

Summary

opentelemetry-instrumentation-openai v0.55.0 (traceloop/openllmetry#3844), implementing OTel GenAI Semantic Conventions v0.5.1, changed how message data is attached to spans. This broke the OpenInferenceSpanProcessor which relied on the old format, causing the Python canary cron to fail on py310-ci-openllmetry-latest and py314-ci-openllmetry-latest.

Before (v0.54.x and earlier) - flat indexed span attributes

gen_ai.prompt.0.role = "user"
gen_ai.prompt.0.content = "What is the capital of Yemen?"
gen_ai.completion.0.role = "assistant"
gen_ai.completion.0.content = "The capital of Yemen is Sana'a."
gen_ai.completion.0.finish_reason = "stop"
gen_ai.system = "openai"

After (v0.55.0+) - JSON strings with parts-based schema

gen_ai.input.messages = '[{"role": "user", "parts": [{"type": "text", "content": "What is the capital of Yemen?"}]}]'
gen_ai.output.messages = '[{"role": "assistant", "parts": [{"type": "text", "content": "The capital of Yemen is Sana'a."}], "finish_reason": "stop"}]'
gen_ai.provider.name = "openai"
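The new attributes are plain JSON strings, so decoding them is a `json.loads` call plus some defensive traversal. A minimal sketch of extracting the text parts (function and variable names here are illustrative, not the PR's actual implementation):

```python
import json


def parse_genai_messages_sketch(raw: str) -> list[dict]:
    """Decode a gen_ai.input/output.messages JSON string into simple
    {role, content} dicts, keeping only text parts (sketch only)."""
    try:
        messages = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(messages, list):
        return []
    parsed = []
    for msg in messages:
        if not isinstance(msg, dict):
            continue
        # Concatenate all text parts; tool_call parts would need their own handling.
        texts = [
            part.get("content", "")
            for part in msg.get("parts", [])
            if isinstance(part, dict) and part.get("type") == "text"
        ]
        parsed.append({"role": msg.get("role"), "content": "".join(texts)})
    return parsed


raw = '[{"role": "user", "parts": [{"type": "text", "content": "What is the capital of Yemen?"}]}]'
print(parse_genai_messages_sketch(raw))
# → [{'role': 'user', 'content': 'What is the capital of Yemen?'}]
```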

Changes

  • Adds _parse_messages_from_json() to handle the v0.55.0+ parts-based JSON format (text, tool_call, tool_call_response parts)
  • Updates on_end() to detect the JSON-based message format (default) or the legacy attribute-per-field format (fallback), routing to the appropriate parser
  • Adds gen_ai.tool.definitions to the tool key lookup chain
  • Caches enum validation sets at module level to avoid per-span set comprehension overhead
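The detection described in the second bullet can be sketched as a simple attribute probe; the function name and return values below are illustrative, not the PR's actual code:

```python
def detect_message_format(attrs: dict) -> str:
    """Prefer the v0.55.0+ JSON attributes; fall back to the legacy
    flat indexed attributes (sketch of the on_end() routing)."""
    if "gen_ai.input.messages" in attrs or "gen_ai.output.messages" in attrs:
        return "json"
    if any(k.startswith(("gen_ai.prompt.", "gen_ai.completion.")) for k in attrs):
        return "legacy"
    return "none"


new_attrs = {"gen_ai.input.messages": "[]", "gen_ai.provider.name": "openai"}
old_attrs = {"gen_ai.prompt.0.role": "user", "gen_ai.system": "openai"}
print(detect_message_format(new_attrs))  # → json
print(detect_message_format(old_attrs))  # → legacy
```

Checking the JSON keys first keeps the new format as the default path, so a span that somehow carried both sets of attributes would never be parsed twice.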

Related upstream changes

Test plan

  • All existing tests pass on pinned deps (ruff-mypy-test-openllmetry)
  • All tests pass on latest deps (py310-ci-openllmetry-latest) with opentelemetry-instrumentation-openai==0.55.0
  • Verified that without these changes, test_openllmetry_instrumentor fails on v0.55.0 (assert is_openinference_span(span))
  • Unit tests for _parse_messages_from_json() (simple messages, tool calls)
  • Integration test for OpenInferenceSpanProcessor.on_end() with updated message attributes
  • Unit tests for _extract_llm_provider_and_system()

fix(openllmetry): support new gen_ai.input/output.messages format (v0.55.0+)

opentelemetry-instrumentation-openai v0.55.0 (traceloop/openllmetry#3844)
replaced gen_ai.prompt.{idx}.* / gen_ai.completion.{idx}.* span attributes
with gen_ai.input.messages / gen_ai.output.messages JSON strings using a
parts-based schema. This broke the OpenInferenceSpanProcessor which relied
on gen_ai.prompt.* to detect and parse LLM spans.

- Add _parse_genai_messages() to handle the new parts-based JSON format
- Update on_end() guard to detect either old or new message format
- Fall back to gen_ai.provider.name when gen_ai.system is absent
- Support gen_ai.tool.definitions for tool lookup
- Cache enum validation sets at module level for hot-path efficiency
@nate-mar nate-mar requested a review from a team as a code owner March 30, 2026 21:30
@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Mar 30, 2026
@github-actions
Contributor

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

_collect_oi_messages -> _parse_messages_from_attributes
_parse_genai_messages -> _parse_messages_from_json

Both produce the same output (OI Messages). The names now reflect
the input format they parse, not the output type.
Use the same real TracerProvider + InMemorySpanExporter pattern as the
existing VCR test. Also switches test attrs from gen_ai.system to
gen_ai.provider.name to match the real v0.55.0+ attribute layout.
@caroger
Collaborator

caroger commented Mar 30, 2026

Feel free to bump the pinned/minimum version of openllmetry that we support in pyproject.toml and test-requirements.txt too.

@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label Mar 30, 2026
@github-actions
Contributor

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

Frame the JSON-based message format (v0.55.0+) as the default and the
legacy attribute-per-field format as the fallback. Note gen_ai.system
deprecation in favor of gen_ai.provider.name.
gen_ai.system is deprecated (OTel semconv v1.37.0) and v0.55.0+ only
emits gen_ai.provider.name. Keep system_val sourced solely from
gen_ai.system so it is None for newer spans rather than incorrectly
deriving it from provider.name. Update tests accordingly.
@nate-mar
Contributor Author

nate-mar commented Apr 1, 2026

@claude review

@github-actions
Contributor

github-actions Bot commented Apr 1, 2026

Claude finished @nate-mar's task in 1m 55s


Code Review


Overall this is a clean, well-structured fix for the OpenLLMetry v0.55.0 semconv change. The backwards-compatibility approach (detect format, route to appropriate parser) is correct. A few observations:

Outstanding from reviewer feedback

test-requirements.txt still pins opentelemetry-instrumentation-openai==0.50.1. @caroger explicitly requested bumping this, and it hasn't been done yet. The current state is:

# test-requirements.txt
opentelemetry-instrumentation-openai==0.50.1

Since the VCR test was renamed to test_span_processor_with_legacy_message_format (to explicitly signal it tests the old format), one option is to add a second VCR test recorded against v0.55.0 to cover the new format end-to-end. Alternatively, bump to >=0.55.0 if the cassette can be re-recorded. Either way, the comment from @caroger is unaddressed.


Minor issues

1. Missing test for tool_call_response part type (_span_processor.py:233-235)

tool_call_response parts are handled but there's no unit test exercising that path. This would complete the test coverage for the three part types.

2. Missing test for malformed JSON input (_span_processor.py:197-200)

_parse_messages_from_json silently returns [], [] on a json.JSONDecodeError. Worth a quick test to confirm that resilience:

def test_parse_messages_from_json_invalid(self) -> None:
    messages, finish_reasons = _parse_messages_from_json("not valid json")
    assert messages == []
    assert finish_reasons == []

3. Pre-existing: json.loads on tool_key in request_body (_span_processor.py:373)

"tools": json.loads(attrs[tool_key]) if tool_key else None,

If attrs[tool_key] is already a list/dict (not a JSON string), this raises TypeError. The newly added gen_ai.tool.definitions key could surface this if OpenLLMetry stores it as a native Python object. This is pre-existing for the other keys, but the PR extends it to a third key. Not blocking, but worth noting.
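A type-tolerant guard would close this off for all three keys. A sketch, assuming a hypothetical helper (not code from this PR):

```python
import json


def load_tools(value):
    """Accept either a JSON string or an already-decoded list/dict,
    returning None for anything unusable (illustrative guard only)."""
    if isinstance(value, (list, dict)):
        return value  # already a native object; json.loads would raise TypeError
    if isinstance(value, str):
        try:
            return json.loads(value)
        except json.JSONDecodeError:
            return None
    return None
```

With this in place, `"tools": load_tools(attrs[tool_key]) if tool_key else None` would be safe regardless of how the instrumentor stored the attribute.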


What's done well

  • The format-detection logic in on_end() is clean and conservative (prefers new format, falls back to legacy). No risk of double-processing.
  • Module-level frozenset caching for _VALID_LLM_PROVIDERS and _VALID_LLM_SYSTEMS is a good micro-optimization.
  • The _TOOL_KEY_CANDIDATES list makes the tool key lookup more maintainable.
  • _parse_messages_from_json defensively handles non-list top-level JSON, non-dict parts, and non-str input.
  • VCR cassette rename clearly documents intent (legacy format test).
  • test_span_processor_with_json_message_format is a clean, self-contained integration test — no network required.

The main outstanding item is addressing @caroger's request to bump the pinned OpenLLMetry version in test-requirements.txt.

@nate-mar
Contributor Author

nate-mar commented Apr 1, 2026

Discussed Roger's feedback offline; agreed to support both message formats for now since it's manageable.

@nate-mar nate-mar merged commit 8434f31 into main Apr 1, 2026
18 checks passed
@nate-mar nate-mar deleted the fix/openllmetry-genai-semconv-v055 branch April 1, 2026 05:50
@mikeldking mikeldking mentioned this pull request Mar 31, 2026
