feat(langchain): new semconv #3900
Conversation
Caution: Review failed. The pull request was closed or merged during review.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough

Refactors the LangChain OpenTelemetry instrumentation to adopt the GenAI semantic conventions: replaces GEN_AI_SYSTEM with GEN_AI_PROVIDER_NAME, adds GEN_AI_OPERATION_NAME, consolidates per-index attributes into JSON (input/output messages and tool definitions), normalizes finish reasons, and normalizes vendor identifiers to lowercase qualified names.
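To make the attribute consolidation concrete, here is a rough sketch of the before/after span-attribute shapes; the legacy key names and the exact JSON schema are assumptions based on the walkthrough text, not copied from the diff.

```python
import json

# Before (illustrative): one flat span attribute per message index
legacy_attrs = {
    "gen_ai.prompt.0.role": "user",
    "gen_ai.prompt.0.content": "What's the weather in SF?",
    "gen_ai.completion.0.content": "It's sunny.",
}

# After (illustrative): provider/operation names plus JSON-encoded message lists
new_attrs = {
    "gen_ai.provider.name": "openai",  # replaces gen_ai.system; lowercase qualified name
    "gen_ai.operation.name": "chat",
    "gen_ai.input.messages": json.dumps(
        [{"role": "user", "parts": [{"type": "text", "content": "What's the weather in SF?"}]}]
    ),
    "gen_ai.output.messages": json.dumps(
        [
            {
                "role": "assistant",
                "parts": [{"type": "text", "content": "It's sunny."}],
                "finish_reason": "stop",
            }
        ]
    ),
}
```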
Sequence diagram(s): omitted. Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes.
Suggested labels: Python SDK
🚥 Pre-merge checks: ❌ Failed checks (1 warning, 1 inconclusive) | ✅ Passed checks (1 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py (1)
**350-351: Consider using `_set_span_attribute` for consistency.** Line 351 uses `span.set_attribute` directly while the rest of the module uses the `_set_span_attribute` helper. The helper provides null/empty checking, which could be valuable here too.

♻️ Suggested change for consistency:
```diff
-    if finish_reasons:
-        span.set_attribute(GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS, finish_reasons)
+    if finish_reasons:
+        _set_span_attribute(span, GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS, finish_reasons)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py` around lines 350 - 351, Replace the direct call to span.set_attribute for GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS with the module helper _set_span_attribute to keep behavior consistent and gain its null/empty checking; locate the block that currently checks finish_reasons and change the call from span.set_attribute(...) to _set_span_attribute(span, GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS, finish_reasons) so the attribute is only set when appropriate.
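For context, here is a minimal sketch of what a `_set_span_attribute` helper with null/empty checking typically looks like; this is an assumption for illustration, not the actual implementation in `span_utils.py`.

```python
def _set_span_attribute(span, name, value):
    # Skip attributes that carry no value; this guards against None and
    # empty strings that would otherwise be recorded as noise.
    if value is not None and value != "":
        span.set_attribute(name, value)
```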
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7a52c154-25c0-4eba-84b5-ca70d8d623d6
⛔ Files ignored due to path filters (1)
`packages/opentelemetry-instrumentation-langchain/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (16)
- packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
- packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/event_emitter.py
- packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
- packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/utils.py
- packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py
- packages/opentelemetry-instrumentation-langchain/tests/metrics/test_langchain_metrics.py
- packages/opentelemetry-instrumentation-langchain/tests/test_agents.py
- packages/opentelemetry-instrumentation-langchain/tests/test_chains.py
- packages/opentelemetry-instrumentation-langchain/tests/test_documents_chains.py
- packages/opentelemetry-instrumentation-langchain/tests/test_generation_role_extraction.py
- packages/opentelemetry-instrumentation-langchain/tests/test_langgraph.py
- packages/opentelemetry-instrumentation-langchain/tests/test_lcel.py
- packages/opentelemetry-instrumentation-langchain/tests/test_llms.py
- packages/opentelemetry-instrumentation-langchain/tests/test_structured_output.py
- packages/opentelemetry-instrumentation-langchain/tests/test_tool_call_content.py
- packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py
```diff
     exact_matches={"AzureChatOpenAI", "AzureOpenAI", "AzureOpenAIEmbeddings"},
     patterns=["azure"],
-    vendor_name="Azure"
+    vendor_name="azure.ai.openai"
```
🧩 Analysis chain

🏁 Scripts executed:

```shell
# Verify the GenAISystem.AZURE value in the semconv package
rg -n "AZURE\s*=" --type=py packages/opentelemetry-semantic-conventions-ai/

# Inspect the vendor detection module
cat -n packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py | head -50
sed -n '25,80p' packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py | cat -n

# Check the enum values defined in the semconv package
rg -n "^\s*(AZURE|OPENAI|AWS|ANTHROPIC|GCP)\s*=" --type=py packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py -A 0
rg -n "GCP\s*=" --type=py packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py
sed -n '15,55p' packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py | cat -n
sed -n '1,40p' packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py | cat -n
cat -n packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py | grep -E "^\s*(AZURE|AWS|GOOGLE|VERTEX)"
```
Change Azure vendor name from "azure.ai.openai" to "az.ai.openai" to match the GenAISystem enum.
Line 29 uses `vendor_name="azure.ai.openai"`, but the corresponding `GenAISystem.AZURE` constant is defined as `"az.ai.openai"`. This inconsistency can cause issues when correlating metrics or filtering by vendor across OpenLLMetry instrumentations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py`
at line 29, Update the vendor_name value used in the vendor detection to match
the GenAISystem enum: change the string assigned to vendor_name from
"azure.ai.openai" to "az.ai.openai" (the value defined by GenAISystem.AZURE) so
vendor detection/vendor_name in vendor_detection.py aligns with
GenAISystem.AZURE.
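One way to prevent this kind of drift is to reference the shared enum instead of repeating the literal. A sketch, assuming `GenAISystem` is importable from the semconv package as the scripts above suggest:

```python
from opentelemetry.semconv_ai import GenAISystem  # assumed import path

azure_vendor = dict(
    exact_matches={"AzureChatOpenAI", "AzureOpenAI", "AzureOpenAIEmbeddings"},
    patterns=["azure"],
    vendor_name=GenAISystem.AZURE.value,  # "az.ai.openai" per the enum
)
```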
…m/traceloop/openllmetry into gz/langchain-new-otel-instrumenet
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py`:
- Around line 194-235: The new semconv assertions (e.g., reading
chat_span.attributes via GenAIAttributes and variables like tool_defs,
system_instructions, input_messages, output_messages, and result) are never
executed because the surrounding tests are still decorated with
`@pytest.mark.skip`; move these assertions into a test that actually runs (an
executed LangChain path) or remove the skip decorator so spans/events are
emitted during the test run, ensuring the assertions execute and validate the
migration (apply the same change for the other skipped blocks noted at lines
450-482, 654-696, 845-849).
- Around line 1136-1138: The helper assert_message_in_logs currently hard-codes
"langchain" for GenAI provider; change its signature to accept an
expected_provider (e.g., def assert_message_in_logs(log: ReadableLogRecord,
event_name: str, expected_content: dict, expected_provider: str)) and replace
the hard-coded assertion with assert
log.log_record.attributes.get(GenAIAttributes.GEN_AI_PROVIDER_NAME) ==
expected_provider; then update every call site in the tests (OpenAI/Anthropic
event tests) to pass the correct provider string ("openai", "anthropic", or
"langchain") so each test asserts the proper provider.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 40ee8efd-9d33-4cdd-8047-42b625f9fdc8
📒 Files selected for processing (3)
- packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
- packages/opentelemetry-instrumentation-langchain/tests/test_finish_reasons.py
- packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py
✅ Files skipped from review due to trivial changes (1)
- packages/opentelemetry-instrumentation-langchain/tests/test_finish_reasons.py
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
```diff
 def assert_message_in_logs(log: ReadableLogRecord, event_name: str, expected_content: dict):
     assert log.log_record.event_name == event_name
-    assert log.log_record.attributes.get(GenAIAttributes.GEN_AI_SYSTEM) == "langchain"
+    assert log.log_record.attributes.get(GenAIAttributes.GEN_AI_PROVIDER_NAME) == "langchain"
```
🧩 Analysis chain

🏁 Scripts executed:

```shell
# Locate the test file and examine the helper plus its call sites
find . -name "test_tool_calls.py" -type f
cat -n packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py | sed -n '1130,1160p'
rg "assert_message_in_logs" packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py -n

# Understand provider-name handling in the instrumentation
find . -name "callback_handler.py" -path "*/langchain/*" -type f
cat packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py | head -200
rg "GEN_AI_PROVIDER_NAME" packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py -n -B 3 -A 3

# Inspect the OpenAI and Anthropic test structure
grep -n "class.*OpenAI\|class.*Anthropic\|def.*openai\|def.*anthropic" packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py
rg "^def test_|^class Test" packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py -n | head -50
sed -n '27,130p' packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py
sed -n '425,500p' packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py

# Vendor detection and event emission
find . -name "*vendor_detection*" -type f
cat packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/vendor_detection.py | head -100
find . -name "*event_emitter*" -type f
grep -n "span\|provider\|GEN_AI_PROVIDER" packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/event_emitter.py | head -30
cat packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/event_emitter.py
grep -n "emit_event\|MessageEvent\|ChoiceEvent" packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
sed -n '635,675p' packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
sed -n '1075,1085p' packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py

# How the tests call the helper, and how other packages handle provider names
sed -n '80,125p' packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py
sed -n '492,550p' packages/opentelemetry-instrumentation-langchain/tests/test_tool_calls.py
cat packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py | head -50
find . -path "*/tests/*" -name "*.py" -type f -exec grep -l "assert.*GEN_AI_PROVIDER" {} \; | head -3
grep -n "GEN_AI_PROVIDER" packages/opentelemetry-instrumentation-anthropic/tests/test_semconv_span_attrs.py -B 2 -A 2 | head -40
grep -n "GEN_AI_PROVIDER\|assert.*provider" packages/opentelemetry-instrumentation-langchain/tests/metrics/test_langchain_metrics.py -B 2 -A 2 | head -40
cat packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/event_emitter.py | head -40
```
Pass the expected provider name into the test helper.
`assert_message_in_logs` is shared by the OpenAI and Anthropic event tests, but currently hard-codes `"langchain"` as the provider name. While the langchain `event_emitter` today uses `"langchain"` for all events, the GenAI semantic conventions (referenced in your spans via vendor detection) use provider identifiers like `"openai"` and `"anthropic"`. Parameterizing the helper lets each test specify the correct provider for its context, and future-proofs the tests for when event emission is aligned with span vendor detection.
♻️ Suggested change

```diff
-def assert_message_in_logs(log: ReadableLogRecord, event_name: str, expected_content: dict):
+def assert_message_in_logs(
+    log: ReadableLogRecord,
+    event_name: str,
+    expected_content: dict,
+    expected_provider: str,
+):
     assert log.log_record.event_name == event_name
-    assert log.log_record.attributes.get(GenAIAttributes.GEN_AI_PROVIDER_NAME) == "langchain"
+    assert (
+        log.log_record.attributes.get(GenAIAttributes.GEN_AI_PROVIDER_NAME)
+        == expected_provider
+    )
```
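Call sites would then pass the provider explicitly. A usage sketch; the event names and payloads below are placeholders, not the test file's actual values:

```python
assert_message_in_logs(
    logs[0], "gen_ai.user.message", {"content": "Hello"}, expected_provider="openai"
)
assert_message_in_logs(
    logs[1], "gen_ai.choice", {"message": {"content": "Hi!"}}, expected_provider="anthropic"
)
```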
Actionable comments posted: 3
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 56c9fd21-40a2-4438-8488-5f89e18fdb66
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
- packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
```python
_FINISH_REASON_MAP = {
    # OpenAI / LangChain-native
    "stop": "stop",
    "length": "length",
    "tool_calls": "tool_call",
    "function_call": "tool_call",
    "content_filter": "content_filter",
    # Anthropic (surfaced through langchain-anthropic)
    "end_turn": "stop",
    "stop_sequence": "stop",
    "tool_use": "tool_call",
    "max_tokens": "length",
}


def _map_finish_reason(reason):
    if not reason:
        return reason
    return _FINISH_REASON_MAP.get(reason, reason)
```
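The mapper's behavior is easiest to read as input/output pairs; these follow directly from the table above:

```python
assert _map_finish_reason("tool_calls") == "tool_call"  # OpenAI plural -> singular
assert _map_finish_reason("end_turn") == "stop"         # Anthropic alias
assert _map_finish_reason("some_new_reason") == "some_new_reason"  # unknown values pass through
assert _map_finish_reason(None) is None                 # falsy input returned unchanged
```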
🧩 Analysis chain

🏁 Scripts executed:

```shell
find . -name span_utils.py | grep langchain
wc -l packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py

# Read the critical sections: 31-49, 322-327, 372-384
cat -n packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py | sed -n '31,49p'
cat -n packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py | sed -n '322,327p'
cat -n packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py | sed -n '372,384p'

# Context around those lines for finish_reason usage
cat -n packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py | sed -n '310,340p'
cat -n packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py | sed -n '360,395p'
```
🌐 Web query: OpenTelemetry GenAI semantic convention `gen_ai.response.finish_reasons` specification

💡 Result: `gen_ai.response.finish_reasons` is a Recommended string-array attribute that captures the reasons the model stopped generating tokens for each generation (choice or candidate) received in the response. It appears on spans (e.g., inference and agent spans, plus provider-specific OpenAI/Anthropic conventions), on events such as `gen_ai.client.inference.operation.details`, and in the attributes registry.

- Type: string[]
- Stability: Development (GenAI conventions are in Development status as of the latest docs)
- Requirement level: Recommended
- Description: "Array of reasons the model stopped generating tokens, corresponding to each generation received."
- Examples from the official docs: `["stop"]`, `["stop", "length"]`

No exhaustive enum of well-known values is defined (unlike `gen_ai.operation.name`); values are free-form strings matching those returned by GenAI providers (e.g., OpenAI's "stop", "length", "content_filter"), and instrumentations capture provider-reported finish reasons directly. The attribute records response metadata alongside `gen_ai.response.id`, `gen_ai.response.model`, and token usage; multi-choice responses carry one reason per choice (see GitHub issue #1277 on multi-finish-reasons support).
Citations:
- 1: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-events/
- 2: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/
- 3: https://opentelemetry.io/docs/specs/semconv/gen-ai/anthropic/
- 4: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/
- 5: https://opentelemetry.io/docs/specs/semconv/registry/attributes/gen-ai
- 6: http://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/
- 7: https://hexdocs.pm/opentelemetry_semantic_conventions/gen-ai.html
- 8: https://raw.githubusercontent.com/open-telemetry/semantic-conventions/main/docs/gen-ai/gen-ai-spans.md
- 9: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-events
- 10: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans
- 11: https://opentelemetry.io/docs/specs/semconv/gen-ai/openai/
Fix gen_ai.response.finish_reasons to preserve each generation's finish reason and match provider values.
The current code incorrectly deduplicates finish reasons with `if fr not in finish_reasons`, which violates the OpenTelemetry spec requirement that the array "correspond to each generation received." Multi-generation responses will lose finish-reason data if multiple generations return the same reason (e.g., two "stop" generations should yield `["stop", "stop"]`, not `["stop"]`).

Additionally, the normalization map converts provider-returned finish reasons (e.g., OpenAI's `"tool_calls"`) to singular form (`"tool_call"`), which contradicts the spec's requirement that values match those returned by GenAI providers. The message-level finish reason and the span-level finish-reasons array require different normalization: the span should preserve provider values (plural), while the message level uses the singular form.
Suggested fix: separate span-level and message-level mappings

```diff
-_FINISH_REASON_MAP = {
+_SPAN_FINISH_REASON_MAP = {
     "stop": "stop",
     "length": "length",
-    "tool_calls": "tool_call",
-    "function_call": "tool_call",
+    "tool_calls": "tool_calls",
+    "function_call": "tool_calls",
     "content_filter": "content_filter",
     "end_turn": "stop",
     "stop_sequence": "stop",
-    "tool_use": "tool_call",
+    "tool_use": "tool_calls",
     "max_tokens": "length",
 }
+
+_MESSAGE_FINISH_REASON_MAP = {
+    **_SPAN_FINISH_REASON_MAP,
+    "tool_calls": "tool_call",
+    "function_call": "tool_call",
+    "tool_use": "tool_call",
+}
```

```diff
-        fr = None
+        span_finish_reason = None
+        message_finish_reason = None
         if generation.generation_info and generation.generation_info.get("finish_reason"):
-            fr = _map_finish_reason(generation.generation_info["finish_reason"])
-            if fr not in finish_reasons:
-                finish_reasons.append(fr)
+            raw_reason = generation.generation_info["finish_reason"]
+            span_finish_reason = _SPAN_FINISH_REASON_MAP.get(raw_reason, raw_reason)
+            message_finish_reason = _MESSAGE_FINISH_REASON_MAP.get(raw_reason, raw_reason)
+            finish_reasons.append(span_finish_reason)
         ...
-        if fr:
-            msg_obj["finish_reason"] = fr
+        if message_finish_reason:
+            msg_obj["finish_reason"] = message_finish_reason
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py`
around lines 31 - 49, The span-level finish_reasons array currently deduplicates
entries and uses _map_finish_reason which normalizes provider values (e.g.,
"tool_calls"→"tool_call") — change this by (1) removing any deduplication so the
array preserves one entry per generation in order, (2) introduce a separate
mapping function for message-level normalization (keep _map_finish_reason for
message-level singularization) and add a new _map_span_finish_reason (or
similar) that preserves provider-returned values (i.e., returns the raw reason
or identity mapping using _FINISH_REASON_MAP only for known aliases if
necessary), and (3) ensure the code that builds gen_ai.response.finish_reasons
calls the span-level mapper (no dedupe) while message-level code continues to
call _map_finish_reason for singular normalized values.
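A hedged test sketch of the behavior the spec asks for; the fixture and helper names here are invented for illustration, not taken from the repo's test suite:

```python
def test_finish_reasons_keep_one_entry_per_generation(span_exporter, run_two_choice_chat):
    run_two_choice_chat()  # hypothetical helper: both choices finish with "stop"
    span = span_exporter.get_finished_spans()[-1]
    # One entry per generation, in order, duplicates preserved.
    assert span.attributes["gen_ai.response.finish_reasons"] == ("stop", "stop")
```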
```diff
     tools = kwargs.get("invocation_params", {}).get("tools", [])
-    for i, tool in enumerate(tools):
-        tool_function = tool.get("function", tool)
-        _set_span_attribute(
-            span,
-            f"{SpanAttributes.LLM_REQUEST_FUNCTIONS}.{i}.name",
-            tool_function.get("name"),
-        )
+    if tools:
+        tool_defs = []
+        for tool in tools:
+            tool_function = tool.get("function", tool)
+            tool_def = {"name": tool_function.get("name")}
+            if tool_function.get("description"):
+                tool_def["description"] = tool_function.get("description")
+            params_val = tool_function.get("parameters", tool.get("input_schema"))
+            if params_val:
+                tool_def["parameters"] = params_val
+            tool_defs.append(tool_def)
         _set_span_attribute(
             span,
-            f"{SpanAttributes.LLM_REQUEST_FUNCTIONS}.{i}.description",
-            tool_function.get("description"),
-        )
-        _set_span_attribute(
-            span,
-            f"{SpanAttributes.LLM_REQUEST_FUNCTIONS}.{i}.parameters",
-            json.dumps(tool_function.get("parameters", tool.get("input_schema"))),
+            GenAIAttributes.GEN_AI_TOOL_DEFINITIONS,
+            json.dumps(tool_defs, cls=CallbackFilteredJSONEncoder),
         )
```
🧩 Analysis chain

🌐 Web query: OpenTelemetry GenAI semantic convention `gen_ai.tool.definitions` specification

💡 Result: `gen_ai.tool.definitions` is an attribute in OpenTelemetry's GenAI semantic conventions, used on spans and events for GenAI operations.

- Type: any (an array of objects, preferably structured; a JSON string if structured values aren't supported)
- Stability: Development
- Requirement level: Opt-In
- Description: the list of source-system tool definitions available to the GenAI agent or model
- Usage: matches the source system's tool definition format (e.g., OpenAI-style function tools); instrumentations should deserialize serialized strings to arrays where possible. Populating it by default is not recommended due to potential size, so an opt-in mechanism should be provided.
- Example:

```json
[
  {
    "type": "function",
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
      },
      "required": ["location", "unit"]
    }
  }
]
```

It appears on inference spans, agent spans, and events like `gen_ai.client.inference.operation.details`, as part of the broader GenAI conventions (status: Development, with transition guidance via `OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental`). These conventions standardize telemetry for GenAI systems, including tools used by agents and models.
Citations:
- 1: https://opentelemetry.io/docs/specs/semconv/gen-ai/
- 2: https://raw.githubusercontent.com/open-telemetry/semantic-conventions/main/docs/gen-ai/gen-ai-spans.md
- 3: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-events/
- 4: https://opentelemetry.io/docs/specs/semconv/gen-ai/openai
- 5: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-spans.md
- 6: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/
- 7: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/
- 8: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans
- 9: https://opentelemetry.io/docs/specs/semconv/registry/attributes/gen-ai/
🏁 Scripts executed:

```shell
# Locate the LangChain span_utils.py and verify the code claims
fd -t f "span_utils.py" | grep langchain
find . -path "*langchain*span_utils.py" -type f

# Read the line ranges mentioned in the review
sed -n '185,215p' ./packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
sed -n '240,265p' ./packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
sed -n '1,100p' ./packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py | head -80

# Look for an opt-in mechanism for tool definitions
rg -A 5 -B 5 "tool.*definition|GEN_AI_TOOL" ./packages/opentelemetry-instrumentation-langchain/
rg -i "should_send_prompts|opt.in|tool.*definition" ./packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/

# Trace how set_request_params is called and what should_send_prompts does
sed -n '235,275p' ./packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
sed -n '430,480p' ./packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/patch.py
rg -B 5 "set_request_params" ./packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py | head -60
sed -n '165,210p' ./packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
rg -A 10 "def should_send_prompts" ./packages/opentelemetry-instrumentation-langchain/
```
Tool definitions are emitted without opt-in and lose source format.
`set_request_params()` unconditionally emits `gen_ai.tool.definitions` from `tools` (lines 192–208), while `set_chat_request()` conditionally emits them only when `should_send_prompts()` is true (lines 245–260). Both implementations extract only `{name, description, parameters}`, dropping the `type` field and other source-specific fields. The OpenTelemetry GenAI specification explicitly states this attribute should match the source system's tool definition format and is "not recommended to populate by default" due to potential size; it requires an opt-in mechanism. The current implementation provides neither source fidelity nor opt-in control.
Code locations: lines 192–208 (unconditional emission in `set_request_params`):

```python
tools = kwargs.get("invocation_params", {}).get("tools", [])
if tools:
    tool_defs = []
    for tool in tools:
        tool_function = tool.get("function", tool)
        tool_def = {"name": tool_function.get("name")}
        if tool_function.get("description"):
            tool_def["description"] = tool_function.get("description")
        params_val = tool_function.get("parameters", tool.get("input_schema"))
        if params_val:
            tool_def["parameters"] = params_val
        tool_defs.append(tool_def)
    _set_span_attribute(
        span,
        GenAIAttributes.GEN_AI_TOOL_DEFINITIONS,
        json.dumps(tool_defs, cls=CallbackFilteredJSONEncoder),
    )
```

Also applies to lines 245–260 in `set_chat_request()`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py`
around lines 192 - 208, set_request_params and set_chat_request currently
unconditionally emit GenAIAttributes.GEN_AI_TOOL_DEFINITIONS and drop source
fields by building a reduced dict; change both functions to only set
GEN_AI_TOOL_DEFINITIONS when an opt-in flag (e.g., reuse should_send_prompts()
or add a new should_send_tool_definitions()) is true, and serialize the original
tool definition object rather than constructing a stripped
{name,description,parameters} dict (use tool.get("function", tool) as the source
object and include its full keys including "type" and any provider-specific
fields), still passing through CallbackFilteredJSONEncoder for filtering/size
control. Ensure the attribute key remains
GenAIAttributes.GEN_AI_TOOL_DEFINITIONS and that emission is guarded by the
opt-in check in both set_request_params and set_chat_request.
```python
role = _message_type_to_role(msg.type)

if role == "system":
    system_instructions.extend(_content_to_parts(msg.content))
    continue
```
🧩 Analysis chain

🏁 Scripts executed:

```shell
find . -name "span_utils.py" -path "*/langchain/*" | head -20
wc -l ./packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py
sed -n '260,315p' ./packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py | cat -n
```
🌐 Web query: OpenTelemetry GenAI semantic conventions: `gen_ai.system_instructions` vs `gen_ai.input.messages` system role

💡 Result: In the OpenTelemetry GenAI semantic conventions (Development status), `gen_ai.system_instructions` and `gen_ai.input.messages` distinguish how system-level instructions are captured from regular chat history.

- `gen_ai.system_instructions` (Opt-In, any type): captures the system message or instructions provided to the model separately from the chat history. Use it when the provider/API (e.g., OpenAI, Anthropic) supports passing system instructions distinctly. Follows a specific JSON schema, e.g., `[{"type": "text", "content": "You are a helpful assistant"}]`. Instrumentations may record it as a JSON string on spans.
- `gen_ai.input.messages` (Opt-In, any type): captures the chat history provided to the model as input, excluding separately provided system instructions, with messages in the order sent to the model. Follows the input-messages JSON schema, e.g., `[{"role": "user", "parts": [{"type": "text", "content": "Hello"}]}]`.

Key distinction: use `gen_ai.system_instructions` only for separately provided instructions; if instructions are part of the chat history (e.g., a "system"-role message), record them in `gen_ai.input.messages` instead. Both are opt-in due to potentially sensitive data; the default is not to capture. These attributes appear on spans (e.g., inference spans) and events (e.g., `gen_ai.client.inference.operation.details`). Official docs: opentelemetry.io/docs/specs/semconv/gen-ai/ and subpages.
Citations:
- 1: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-events/
- 2: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-spans.md
- 3: https://opentelemetry.io/docs/specs/semconv/gen-ai/openai
- 4: https://opentelemetry.io/docs/specs/semconv/gen-ai/anthropic/
- 5: https://raw.githubusercontent.com/open-telemetry/semantic-conventions/main/docs/gen-ai/gen-ai-spans.md
- 6: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans
System messages in chat history should remain in gen_ai.input.messages.
The OpenTelemetry GenAI semantic specification explicitly requires that system messages which are part of the chat history (as a "system"-role message) be recorded in `gen_ai.input.messages` in order, not extracted to `gen_ai.system_instructions`. The `gen_ai.system_instructions` attribute is reserved for instructions provided separately from the chat history. The current implementation removes all system messages from the input message sequence, which prevents downstream consumers from reconstructing the original chat order and violates the specification's semantic contract.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.py`
around lines 267 - 271, The code currently extracts system message content into
system_instructions and then skips adding the message to the input sequence by
using "continue"; instead, when role == "system" (computed by
_message_type_to_role(msg.type)), append system instruction parts via
_content_to_parts(msg.content) but do NOT skip the message from
gen_ai.input.messages—remove the "continue" and allow the msg to remain in the
message stream so the original chat order is preserved; update the block in
span_utils.py around role == "system" (the variables msg, role,
system_instructions, and the call to _content_to_parts) accordingly.
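A self-contained sketch of the requested fix, with stand-ins for the module's helpers (the stub implementations are assumptions for illustration, not the repo's code):

```python
def _message_type_to_role(msg_type: str) -> str:
    # Stub: LangChain message types mapped to GenAI roles.
    return {"human": "user", "ai": "assistant", "system": "system"}.get(msg_type, msg_type)


def _content_to_parts(content) -> list:
    # Stub: wrap plain text content in the parts schema.
    return [{"type": "text", "content": content}]


def build_input_messages(messages):
    system_instructions, input_messages = [], []
    for msg in messages:
        role = _message_type_to_role(msg.type)
        if role == "system":
            # Mirror the content into gen_ai.system_instructions, but do not
            # `continue`: the message must also stay in the input sequence.
            system_instructions.extend(_content_to_parts(msg.content))
        # Every message, system included, is kept in gen_ai.input.messages
        # so the original chat order can be reconstructed downstream.
        input_messages.append({"role": role, "parts": _content_to_parts(msg.content)})
    return system_instructions, input_messages
```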
feat(instrumentation): ... or fix(instrumentation): ...

Summary by CodeRabbit

- Improvements
- Tests