packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py
@@ -174,7 +174,7 @@ async def aset_input_attributes(span, kwargs):
    if schema:
        set_span_attribute(
            span,
-           SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
+           "gen_ai.request.structured_output_schema",

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

rg -n "LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA\|structured_output_schema" \
  --type py \
  packages/opentelemetry-semantic-conventions-ai

Repository: traceloop/openllmetry

Length of output: 47


🏁 Script executed:

# Also check the imports and all attribute usages in the span_utils.py file
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py | head -20

Repository: traceloop/openllmetry

Length of output: 689


🏁 Script executed:

# Check all attribute assignments in the file to understand the pattern
rg -n "SpanAttributes\.|GenAIAttributes\.|\"gen_ai\." \
  packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py \
  -A 1

Repository: traceloop/openllmetry

Length of output: 4042


🏁 Script executed:

# Check what constants are available in SpanAttributes
rg -n "LLM_REQUEST|structured" \
  --type py \
  packages/opentelemetry-semantic-conventions-ai

Repository: traceloop/openllmetry

Length of output: 642


🏁 Script executed:

# List all available files in the semantic-conventions-ai package
find packages/opentelemetry-semantic-conventions-ai -name "*.py" -type f | head -20

Repository: traceloop/openllmetry

Length of output: 515


🏁 Script executed:

# Check the main attributes file to see what constants are defined
cat -n packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py

Repository: traceloop/openllmetry

Length of output: 15476


Add LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA constant to SpanAttributes or use the appropriate semantic convention constant.

The constant LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA does not exist in opentelemetry-semantic-conventions-ai. Line 177 uses a literal string "gen_ai.request.structured_output_schema" while all other attributes in this file reference named constants from SpanAttributes or GenAIAttributes (e.g., SpanAttributes.LLM_REQUEST_FUNCTIONS at line 159, SpanAttributes.LLM_USAGE_TOTAL_TOKENS at lines 337, 361).

Either define this constant in SpanAttributes to align with semantic conventions best practices and maintain consistency, or verify that this attribute name is correct and document why a literal string is necessary here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py`
at line 177, replace the literal "gen_ai.request.structured_output_schema" with
a named semantic constant: add LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA =
"gen_ai.request.structured_output_schema" to SpanAttributes (where the other
constants are defined), then use
SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA in span_utils.py in place of
the literal; update imports if necessary so span_utils.py references the new
SpanAttributes constant instead of the hard-coded string.
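
As a sketch (not the committed code), the suggested change would look roughly like this; the class layout of SpanAttributes and the exact placement of the constant are assumptions:

```python
# packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py
# (sketch: new constant added alongside the existing LLM_REQUEST_* attributes)
class SpanAttributes:
    LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA = "gen_ai.request.structured_output_schema"


# packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py
# (sketch: the call site references the named constant instead of the literal)
import json

from opentelemetry.semconv_ai import SpanAttributes

if schema:
    set_span_attribute(
        span,
        SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
        json.dumps(schema),
    )
```

Since the constant and the literal resolve to the same string, this is purely a naming/consistency change with no runtime effect.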

            json.dumps(schema),
        )

packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/streaming.py
@@ -94,6 +94,13 @@ def _set_token_usage(
    )
    set_span_attribute(span, SpanAttributes.LLM_USAGE_TOTAL_TOKENS, total_tokens)

+   set_span_attribute(
+       span, SpanAttributes.GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS, cache_read_tokens
+   )
+   set_span_attribute(
+       span, SpanAttributes.GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS, cache_creation_tokens
+   )
Comment on lines +97 to +102

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's verify the streaming.py file content at the specified lines
echo "=== Content of streaming.py around lines 97-102 ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/streaming.py | sed -n '80,110p'

Repository: traceloop/openllmetry

Length of output: 1460


🏁 Script executed:

# Now check for cache token attributes in __init__.py
echo "=== Searching for cache token attributes in __init__.py ==="
rg -n "CACHE_READ_INPUT_TOKENS|CACHE_CREATION_INPUT_TOKENS" \
  --type py \
  packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/

Repository: traceloop/openllmetry

Length of output: 1227


🏁 Script executed:

# Check the set_span_attribute function implementation
echo "=== Checking set_span_attribute function in utils.py ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py | head -40

Repository: traceloop/openllmetry

Length of output: 1450


🏁 Script executed:

# Search for _set_token_usage and _aset_token_usage functions in __init__.py
echo "=== Searching for token usage functions in __init__.py ==="
rg -n "_set_token_usage|_aset_token_usage" \
  --type py \
  packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py \
  -A 15

Repository: traceloop/openllmetry

Length of output: 2829


🏁 Script executed:

# Check the actual cache token attribute lines in __init__.py more closely
echo "=== Checking cache token attributes in _aset_token_usage (around line 286) ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py | sed -n '280,295p'

echo ""
echo "=== Checking cache token attributes in _set_token_usage (around line 400) ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py | sed -n '395,410p'

Repository: traceloop/openllmetry

Length of output: 1455


🏁 Script executed:

# Verify the SpanAttributes constants used in both files
echo "=== SpanAttributes imports and usage in streaming.py ==="
head -50 packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/streaming.py | grep -i "import\|SpanAttributes"

echo ""
echo "=== SpanAttributes imports in __init__.py ==="
head -50 packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py | grep -i "import\|SpanAttributes"

Repository: traceloop/openllmetry

Length of output: 1867


🏁 Script executed:

# Let's verify the SpanAttributes constants to understand the difference
echo "=== Checking what SpanAttributes constants are available ==="
rg "LLM_USAGE_CACHE|GEN_AI_USAGE_CACHE" --type py

Repository: traceloop/openllmetry

Length of output: 2886


🏁 Script executed:

# Check if these are custom constants defined in the package
echo "=== Looking for SpanAttributes definition/import ==="
rg -n "class SpanAttributes|from.*SpanAttributes" --type py packages/opentelemetry-instrumentation-anthropic/ -B 2 -A 2

Repository: traceloop/openllmetry

Length of output: 2561


🏁 Script executed:

# Quick check to confirm cache_read_tokens and cache_creation_tokens extraction in __init__.py
echo "=== Checking cache token extraction in __init__.py _aset_token_usage ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py | sed -n '200,230p'

Repository: traceloop/openllmetry

Length of output: 1619


Cache token attributes correctly restored, but SpanAttributes constants are inconsistently named across files.

The new set_span_attribute calls for cache tokens in streaming.py (lines 97-102) are correct and consistent with the existing token usage pattern. Cache tokens default to 0 (lines 82, 85), so these attributes will always be set, matching the behavior of LLM_USAGE_TOTAL_TOKENS on line 95.

The non-streaming path in __init__.py also sets cache token attributes in both _aset_token_usage() (lines 286, 290) and _set_token_usage() (lines 400, 404), as expected.

However, there's an inconsistency: streaming.py uses SpanAttributes.LLM_USAGE_CACHE_* constants while __init__.py uses SpanAttributes.GEN_AI_USAGE_CACHE_* constants. Both resolve to the same attribute names (e.g., "gen_ai.usage.cache_read_input_tokens"), so they function identically, but the naming should be consistent across the codebase. Use the same constant names in both files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/streaming.py`
around lines 97-102, the cache token span attributes in streaming.py use
SpanAttributes.LLM_USAGE_CACHE_READ_INPUT_TOKENS and
SpanAttributes.LLM_USAGE_CACHE_CREATION_INPUT_TOKENS, which are inconsistent
with the rest of the codebase, which uses
SpanAttributes.GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS and
SpanAttributes.GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS (as seen in
__init__.py); update the set_span_attribute calls in streaming.py to use the
GEN_AI_USAGE_CACHE_* constants instead and confirm the SpanAttributes import
covers those names.
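
A sketch of the aligned calls in streaming.py; this matches the committed hunk shown above, and the point is only that the GEN_AI_USAGE_CACHE_* spelling is the one to standardize on (per the analysis, both spellings resolve to the same attribute strings):

```python
# streaming.py (sketch): use the GEN_AI_USAGE_CACHE_* spelling that
# __init__.py already uses. Both spellings resolve to
# "gen_ai.usage.cache_read_input_tokens" and
# "gen_ai.usage.cache_creation_input_tokens", so only the naming changes.
set_span_attribute(
    span, SpanAttributes.GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS, cache_read_tokens
)
set_span_attribute(
    span,
    SpanAttributes.GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS,
    cache_creation_tokens,
)
```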


    set_span_attribute(
        span, GenAIAttributes.GEN_AI_RESPONSE_MODEL, complete_response.get("model")
    )
@@ -13,7 +13,7 @@ requires-python = ">=3.10,<4"
dependencies = [
    "opentelemetry-api>=1.38.0,<2",
    "opentelemetry-instrumentation>=0.59b0",
-   "opentelemetry-semantic-conventions-ai>=0.4.13,<0.5.0",
+   "opentelemetry-semantic-conventions-ai>=0.4.14,<0.5.0",
    "opentelemetry-semantic-conventions>=0.59b0",
]

@@ -2,13 +2,51 @@

import pytest
from opentelemetry.sdk._logs import ReadableLogRecord
+from opentelemetry.sdk.trace import ReadableSpan
from opentelemetry.semconv._incubating.attributes import (
    gen_ai_attributes as GenAIAttributes,
)

from .utils import verify_metrics


+def _verify_caching_attributes(
+    cache_creation_span: ReadableSpan,
+    cache_read_span: ReadableSpan,
+    input_tokens: int,
+    cache_creation_span_output_tokens: int,
+    cache_read_span_output_tokens: int,
+    cached_tokens: int,
+):
+    assert (
+        cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"]
+        == cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"]
+    )
+
+    # first check that cache_creation_span only wrote to the cache and did not read from it
+    assert cache_creation_span.attributes["gen_ai.usage.cache_read_input_tokens"] == 0
+    assert (
+        cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] != 0
+    )
+
+    # then check the exact figures for the fixture/cassette
+    assert (
+        cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"]
+        == cached_tokens
+    )
+    assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == input_tokens
+    assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == cache_creation_span_output_tokens
+
+    # likewise, check that cache_read_span only read from the cache and did not write to it
+    assert cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] != 0
+    assert cache_read_span.attributes["gen_ai.usage.cache_creation_input_tokens"] == 0
+
+    # then check the exact figures for the fixture/cassette
+    assert cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] == cached_tokens
+    assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == input_tokens
+    assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == cache_read_span_output_tokens


@pytest.mark.vcr
def test_anthropic_prompt_caching_legacy(
instrument_legacy, anthropic_client, span_exporter, log_exporter, reader
@@ -70,6 +108,11 @@ def test_anthropic_prompt_caching_legacy(
    assert cache_read_span.attributes["gen_ai.prompt.1.role"] == "user"
    assert text == cache_read_span.attributes["gen_ai.prompt.1.content"]

+   assert (
+       cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"]
+       == cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"]
+   )
+
    assert (
        cache_creation_span.attributes.get("gen_ai.response.id")
        == "msg_01EF3r8zYyZntM4Sg9a5kc6k"
@@ -82,11 +125,7 @@ def test_anthropic_prompt_caching_legacy(
    assert cache_creation_span.attributes["gen_ai.completion.0.role"] == "assistant"
    assert cache_read_span.attributes["gen_ai.completion.0.role"] == "assistant"

-   # assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == 1167
-   assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == 187
-
-   assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == 1167
-   assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == 202
+   _verify_caching_attributes(cache_creation_span, cache_read_span, 1167, 187, 202, 1163)

    # verify metrics
    metrics_data = reader.get_metrics_data()
@@ -150,11 +189,7 @@ def test_anthropic_prompt_caching_with_events_with_content(
    cache_creation_span = spans[0]
    cache_read_span = spans[1]

-   assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == 1167
-   assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == 187
-
-   assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == 1167
-   assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == 202
+   _verify_caching_attributes(cache_creation_span, cache_read_span, 1167, 187, 202, 1163)

    # verify metrics
    metrics_data = reader.get_metrics_data()
@@ -316,11 +351,7 @@ def test_anthropic_prompt_caching_with_events_with_no_content(
    cache_creation_span = spans[0]
    cache_read_span = spans[1]

-   assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == 1167
-   assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == 187
-
-   assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == 1167
-   assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == 202
+   _verify_caching_attributes(cache_creation_span, cache_read_span, 1167, 187, 202, 1163)

    # verify metrics
    metrics_data = reader.get_metrics_data()
@@ -426,11 +457,7 @@ async def test_anthropic_prompt_caching_async_legacy(
    assert cache_creation_span.attributes["gen_ai.completion.0.role"] == "assistant"
    assert cache_read_span.attributes["gen_ai.completion.0.role"] == "assistant"

-   assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == 1169
-   assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == 207
-
-   assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == 1169
-   assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == 224
+   _verify_caching_attributes(cache_creation_span, cache_read_span, 1169, 207, 224, 1165)

    # verify metrics
    metrics_data = reader.get_metrics_data()
@@ -495,11 +522,7 @@ async def test_anthropic_prompt_caching_async_with_events_with_content(
    cache_creation_span = spans[0]
    cache_read_span = spans[1]

-   assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == 1169
-   assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == 207
-
-   assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == 1169
-   assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == 224
+   _verify_caching_attributes(cache_creation_span, cache_read_span, 1169, 207, 224, 1165)

    # verify metrics
    metrics_data = reader.get_metrics_data()
@@ -667,11 +690,12 @@ async def test_anthropic_prompt_caching_async_with_events_with_no_content(
    cache_creation_span = spans[0]
    cache_read_span = spans[1]

-   assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == 1169
-   assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == 207
+   assert (
+       cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"]
+       == cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"]
+   )

-   assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == 1169
-   assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == 224
+   _verify_caching_attributes(cache_creation_span, cache_read_span, 1169, 207, 224, 1165)

    # verify metrics
    metrics_data = reader.get_metrics_data()
@@ -780,11 +804,7 @@ def test_anthropic_prompt_caching_stream_legacy(
    assert cache_creation_span.attributes["gen_ai.completion.0.role"] == "assistant"
    assert cache_read_span.attributes["gen_ai.completion.0.role"] == "assistant"

-   assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == 1169
-   assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == 202
-
-   assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == 1169
-   assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == 222
+   _verify_caching_attributes(cache_creation_span, cache_read_span, 1169, 202, 222, 1165)

    # verify metrics
    metrics_data = reader.get_metrics_data()
@@ -852,11 +872,7 @@ def test_anthropic_prompt_caching_stream_with_events_with_content(
    cache_creation_span = spans[0]
    cache_read_span = spans[1]

-   assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == 1169
-   assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == 202
-
-   assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == 1169
-   assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == 222
+   _verify_caching_attributes(cache_creation_span, cache_read_span, 1169, 202, 222, 1165)

    # verify metrics
    metrics_data = reader.get_metrics_data()
@@ -1029,11 +1045,7 @@ def test_anthropic_prompt_caching_stream_with_events_with_no_content(
    cache_creation_span = spans[0]
    cache_read_span = spans[1]

-   assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == 1169
-   assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == 202
-
-   assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == 1169
-   assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == 222
+   _verify_caching_attributes(cache_creation_span, cache_read_span, 1169, 202, 222, 1165)

    # verify metrics
    metrics_data = reader.get_metrics_data()
@@ -1142,11 +1154,7 @@ async def test_anthropic_prompt_caching_async_stream_legacy(
    assert cache_read_span.attributes["gen_ai.prompt.1.role"] == "user"
    assert text == cache_read_span.attributes["gen_ai.prompt.1.content"]

-   assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == 1171
-   assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == 290
-
-   assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == 1171
-   assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == 257
+   _verify_caching_attributes(cache_creation_span, cache_read_span, 1171, 290, 257, 1167)

    # verify metrics
    metrics_data = reader.get_metrics_data()
@@ -1215,11 +1223,7 @@ async def test_anthropic_prompt_caching_async_stream_with_events_with_content(
    cache_creation_span = spans[0]
    cache_read_span = spans[1]

-   assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == 1171
-   assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == 290
-
-   assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == 1171
-   assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == 257
+   _verify_caching_attributes(cache_creation_span, cache_read_span, 1171, 290, 257, 1167)

    # verify metrics
    metrics_data = reader.get_metrics_data()
@@ -1403,11 +1407,7 @@ async def test_anthropic_prompt_caching_async_stream_with_events_with_no_content
    cache_creation_span = spans[0]
    cache_read_span = spans[1]

-   assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == 1171
-   assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == 290
-
-   assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == 1171
-   assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == 257
+   _verify_caching_attributes(cache_creation_span, cache_read_span, 1171, 290, 257, 1167)

    # verify metrics
    metrics_data = reader.get_metrics_data()
@@ -4,7 +4,6 @@
from opentelemetry.semconv._incubating.attributes import (
    gen_ai_attributes as GenAIAttributes,
)
-from opentelemetry.semconv_ai import SpanAttributes


JOKE_SCHEMA = {
@@ -65,9 +64,9 @@ def test_anthropic_structured_outputs_legacy(
        == "assistant"
    )

-   assert SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA in anthropic_span.attributes
+   assert "gen_ai.request.structured_output_schema" in anthropic_span.attributes
    schema_attr = json.loads(
-       anthropic_span.attributes[SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA]
+       anthropic_span.attributes["gen_ai.request.structured_output_schema"]
    )
    assert "properties" in schema_attr
    assert "joke" in schema_attr["properties"]