
fix(openllmetry): support new gen_ai.input/output.messages format (v0.55.0+)#2931

Merged
nate-mar merged 14 commits into main from fix/openllmetry-genai-semconv-v055 on Apr 1, 2026

Conversation

@nate-mar
Contributor

@nate-mar nate-mar commented Mar 30, 2026

Summary

opentelemetry-instrumentation-openai v0.55.0 (traceloop/openllmetry#3844), implementing OTel GenAI Semantic Conventions v0.5.1, changed how message data is attached to spans. This broke the OpenInferenceSpanProcessor which relied on the old format, causing the Python canary cron to fail on py310-ci-openllmetry-latest and py314-ci-openllmetry-latest.

Before (v0.54.x and earlier) - flat indexed span attributes

gen_ai.prompt.0.role = "user"
gen_ai.prompt.0.content = "What is the capital of Yemen?"
gen_ai.completion.0.role = "assistant"
gen_ai.completion.0.content = "The capital of Yemen is Sana'a."
gen_ai.completion.0.finish_reason = "stop"
gen_ai.system = "openai"

After (v0.55.0+) - JSON strings with parts-based schema

gen_ai.input.messages = '[{"role": "user", "parts": [{"type": "text", "content": "What is the capital of Yemen?"}]}]'
gen_ai.output.messages = '[{"role": "assistant", "parts": [{"type": "text", "content": "The capital of Yemen is Sana'a."}], "finish_reason": "stop"}]'
gen_ai.provider.name = "openai"
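The new attributes are plain JSON strings, so decoding them is a `json.loads` call plus some defensive traversal. A minimal sketch of extracting the text parts (function and variable names here are illustrative, not the PR's actual implementation):

```python
import json


def parse_genai_messages_sketch(raw: str) -> list[dict]:
    """Decode a gen_ai.input/output.messages JSON string into simple
    {role, content} dicts, keeping only text parts (sketch only)."""
    try:
        messages = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(messages, list):
        return []
    parsed = []
    for msg in messages:
        if not isinstance(msg, dict):
            continue
        # Concatenate all text parts; tool_call parts would need their own handling.
        texts = [
            part.get("content", "")
            for part in msg.get("parts", [])
            if isinstance(part, dict) and part.get("type") == "text"
        ]
        parsed.append({"role": msg.get("role"), "content": "".join(texts)})
    return parsed


raw = '[{"role": "user", "parts": [{"type": "text", "content": "What is the capital of Yemen?"}]}]'
print(parse_genai_messages_sketch(raw))
# → [{'role': 'user', 'content': 'What is the capital of Yemen?'}]
```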

Changes

  • Adds _parse_messages_from_json() to handle the v0.55.0+ parts-based JSON format (text, tool_call, tool_call_response parts)
  • Updates on_end() to detect the JSON-based message format (default) or the legacy attribute-per-field format (fallback), routing to the appropriate parser
  • Adds gen_ai.tool.definitions to the tool key lookup chain
  • Caches enum validation sets at module level to avoid per-span set comprehension overhead
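The detection described in the second bullet can be sketched as a simple attribute probe; the function name and return values below are illustrative, not the PR's actual code:

```python
def detect_message_format(attrs: dict) -> str:
    """Prefer the v0.55.0+ JSON attributes; fall back to the legacy
    flat indexed attributes (sketch of the on_end() routing)."""
    if "gen_ai.input.messages" in attrs or "gen_ai.output.messages" in attrs:
        return "json"
    if any(k.startswith(("gen_ai.prompt.", "gen_ai.completion.")) for k in attrs):
        return "legacy"
    return "none"


new_attrs = {"gen_ai.input.messages": "[]", "gen_ai.provider.name": "openai"}
old_attrs = {"gen_ai.prompt.0.role": "user", "gen_ai.system": "openai"}
print(detect_message_format(new_attrs))  # → json
print(detect_message_format(old_attrs))  # → legacy
```

Checking the JSON keys first keeps the new format as the default path, so a span that somehow carried both sets of attributes would never be parsed twice.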

Related upstream changes

Test plan

  • All existing tests pass on pinned deps (ruff-mypy-test-openllmetry)
  • All tests pass on latest deps (py310-ci-openllmetry-latest) with opentelemetry-instrumentation-openai==0.55.0
  • Verified that without these changes, test_openllmetry_instrumentor fails on v0.55.0 (assert is_openinference_span(span))
  • Unit tests for _parse_messages_from_json() (simple messages, tool calls)
  • Integration test for OpenInferenceSpanProcessor.on_end() with updated message attributes
  • Unit tests for _extract_llm_provider_and_system()

fix(openllmetry): support new gen_ai.input/output.messages format (v0.55.0+)

opentelemetry-instrumentation-openai v0.55.0 (traceloop/openllmetry#3844)
replaced gen_ai.prompt.{idx}.* / gen_ai.completion.{idx}.* span attributes
with gen_ai.input.messages / gen_ai.output.messages JSON strings using a
parts-based schema. This broke the OpenInferenceSpanProcessor which relied
on gen_ai.prompt.* to detect and parse LLM spans.

- Add _parse_genai_messages() to handle the new parts-based JSON format
- Update on_end() guard to detect either old or new message format
- Fall back to gen_ai.provider.name when gen_ai.system is absent
- Support gen_ai.tool.definitions for tool lookup
- Cache enum validation sets at module level for hot-path efficiency
@nate-mar nate-mar requested a review from a team as a code owner March 30, 2026 21:30
@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Mar 30, 2026
@github-actions
Contributor

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

_collect_oi_messages -> _parse_messages_from_attributes
_parse_genai_messages -> _parse_messages_from_json

Both produce the same output (OI Messages). The names now reflect
the input format they parse, not the output type.
Use the same real TracerProvider + InMemorySpanExporter pattern as the
existing VCR test. Also switches test attrs from gen_ai.system to
gen_ai.provider.name to match the real v0.55.0+ attribute layout.
@caroger
Collaborator

caroger commented Mar 30, 2026

Feel free to bump the pinned/minimum version of openllmetry that we support in pyproject.toml and test-requirements.txt too.

@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label Mar 30, 2026
@github-actions
Contributor

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

Frame the JSON-based message format (v0.55.0+) as the default and the
legacy attribute-per-field format as the fallback. Note gen_ai.system
deprecation in favor of gen_ai.provider.name.
gen_ai.system is deprecated (OTel semconv v1.37.0) and v0.55.0+ only
emits gen_ai.provider.name. Keep system_val sourced solely from
gen_ai.system so it is None for newer spans rather than incorrectly
deriving it from provider.name. Update tests accordingly.
@nate-mar
Contributor Author

nate-mar commented Apr 1, 2026

@claude review

@github-actions
Contributor

github-actions Bot commented Apr 1, 2026

Claude finished @nate-mar's task in 1m 55s


Code Review


Overall this is a clean, well-structured fix for the OpenLLMetry v0.55.0 semconv change. The backwards-compatibility approach (detect format, route to appropriate parser) is correct. A few observations:

Outstanding from reviewer feedback

test-requirements.txt still pins opentelemetry-instrumentation-openai==0.50.1. @caroger explicitly requested bumping this, and it hasn't been done yet. The current state is:

# test-requirements.txt
opentelemetry-instrumentation-openai==0.50.1

Since the VCR test was renamed to test_span_processor_with_legacy_message_format (to explicitly signal it tests the old format), one option is to add a second VCR test recorded against v0.55.0 to cover the new format end-to-end. Alternatively, bump to >=0.55.0 if the cassette can be re-recorded. Either way, the comment from @caroger is unaddressed.


Minor issues

1. Missing test for tool_call_response part type (_span_processor.py:233-235)

tool_call_response parts are handled but there's no unit test exercising that path. This would complete the test coverage for the three part types.

2. Missing test for malformed JSON input (_span_processor.py:197-200)

_parse_messages_from_json silently returns [], [] on a json.JSONDecodeError. Worth a quick test to confirm that resilience:

def test_parse_messages_from_json_invalid(self) -> None:
    messages, finish_reasons = _parse_messages_from_json("not valid json")
    assert messages == []
    assert finish_reasons == []

3. Pre-existing: json.loads on tool_key in request_body (_span_processor.py:373)

"tools": json.loads(attrs[tool_key]) if tool_key else None,

If attrs[tool_key] is already a list/dict (not a JSON string), this raises TypeError. The newly added gen_ai.tool.definitions key could surface this if OpenLLMetry stores it as a native Python object. This is pre-existing for the other keys, but the PR extends it to a third key. Not blocking, but worth noting.
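A type-tolerant guard would close this off for all three keys. A sketch, assuming a hypothetical helper (not code from this PR):

```python
import json


def load_tools(value):
    """Accept either a JSON string or an already-decoded list/dict,
    returning None for anything unusable (illustrative guard only)."""
    if isinstance(value, (list, dict)):
        return value  # already a native object; json.loads would raise TypeError
    if isinstance(value, str):
        try:
            return json.loads(value)
        except json.JSONDecodeError:
            return None
    return None
```

With this in place, `"tools": load_tools(attrs[tool_key]) if tool_key else None` would be safe regardless of how the instrumentor stored the attribute.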


What's done well

  • The format-detection logic in on_end() is clean and conservative (prefers new format, falls back to legacy). No risk of double-processing.
  • Module-level frozenset caching for _VALID_LLM_PROVIDERS and _VALID_LLM_SYSTEMS is a good micro-optimization.
  • The _TOOL_KEY_CANDIDATES list makes the tool key lookup more maintainable.
  • _parse_messages_from_json defensively handles non-list top-level JSON, non-dict parts, and non-str input.
  • VCR cassette rename clearly documents intent (legacy format test).
  • test_span_processor_with_json_message_format is a clean, self-contained integration test — no network required.

The main outstanding item is addressing @caroger's request to bump the pinned OpenLLMetry version in test-requirements.txt.

@nate-mar
Contributor Author

nate-mar commented Apr 1, 2026

Discussed Roger's feedback offline; agreed to support both message formats for now since it's manageable.

@nate-mar nate-mar merged commit 8434f31 into main Apr 1, 2026
18 checks passed
@nate-mar nate-mar deleted the fix/openllmetry-genai-semconv-v055 branch April 1, 2026 05:50
@mikeldking mikeldking mentioned this pull request Mar 31, 2026
