fix(anthropic): restore accidentally lost cache tokens attributes #3648

galkleinman merged 6 commits into traceloop:main
Conversation
Important

Looks good to me! 👍

Reviewed everything up to daa6ff5 in 11 seconds.

- Reviewed 459 lines of code in 3 files
- Skipped 0 files when reviewing
- Skipped posting 0 draft comments
|
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in your settings. Use the checkboxes below for quick actions.
📝 Walkthrough

Restores cache token attributes on Anthropic spans, centralizes caching assertions in tests via a new helper, switches a structured-output attribute lookup to a literal key, and bumps the opentelemetry-semantic-conventions-ai dependency version.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
```diff
 set_span_attribute(
-    span, SpanAttributes.GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS, cache_read_tokens
+    span, SpanAttributes.LLM_USAGE_CACHE_READ_INPUT_TOKENS, cache_read_tokens
```
Why did you change it? The GEN_AI prefix is the new naming. If possible, and if it already exists in the official OTel semconv, I would even import from there instead of using the local version.
(Same for all the occurrences below.)
In the current state, these (silently) fail with AttributeError. And we are very lucky that they are the last lines in functions decorated with dont_throw.
I did a little digging around, and, while it seems like the GEN_AI_ prefix is there in the local conventions definition, this has never been released. If you look at the blame, this was added on Oct 30 in #3138, but the last time that the local semantic conventions package was released was on August 22 (pypi, blame). Moreover, both variable prefixes resolve to the same attribute key starting with gen_ai.usage.
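The failure mode described above can be sketched in a few lines. The decorator and class below are simplified stand-ins (names borrowed from this discussion, implementations assumed), not the actual library code: a `dont_throw`-style decorator swallows the `AttributeError` raised by referencing the unreleased `GEN_AI_` constant, so the span attribute is silently never set.

```python
import functools
import logging


def dont_throw(func):
    # Assumed behavior of the dont_throw decorator: log and swallow any
    # exception so instrumentation never breaks the user's application.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            logging.debug("suppressed exception in %s: %r", func.__name__, exc)
    return wrapper


class SpanAttributes:
    # Only the old prefix exists in the released package; both prefixes
    # would resolve to the same key starting with "gen_ai.usage".
    LLM_USAGE_CACHE_READ_INPUT_TOKENS = "gen_ai.usage.cache_read_input_tokens"


@dont_throw
def set_cache_attributes(attributes, cache_read_tokens):
    # Raises AttributeError: GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS was never released.
    key = SpanAttributes.GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS
    attributes[key] = cache_read_tokens


attrs = {}
set_cache_attributes(attrs, 1163)
print(attrs)  # {} - the cache attribute was silently dropped
```

Because the failing lookup is the last statement in the wrapped function, nothing else is lost, which is why the bug went unnoticed.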
And yes, I would love to import from the official semconv (and I outlined the actual small difference in the attribute key in the linked issue description), but it looks like these are not in their Python package yet, as of v0.60.0b1.
@galkleinman if you could please bump the patch version in the opentelemetry-semantic-conventions-ai package and release, I'm more than happy to switch to the new prefix. Though frankly, nothing changes functionally (variables literally have the same values), and this is subject to change soon anyway, as we'll likely see these in the official otel package
Awesome. Will do it later today and then you'll be able to bump version here.
@galkleinman I pushed the change. I realized that, unfortunately, another attribute has been removed and not properly tested in #3138, namely gen_ai.request.structured_output_schema. I had to keep it as a raw string in code, but I am open to feedback.
I remember this was originally my initiative, and I think I saw it in some version of the OTel conventions, but perhaps they decided not to add it at the end of the day.
IMO let's keep it a string for now and create an issue to fix it. I don't want to halt this PR any longer...
Actionable comments posted: 2
🧹 Nitpick comments (1)

packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py (1)

90-111: Consider extracting a shared helper for the repeated cache-token assertion block.

The same assertion pattern (cross-span equality, directional checks, exact fixture values) is copy-pasted across all 12 tests with only the expected token counts varying. A small helper would cut ~180 lines of duplication and make future attribute-name migrations (the upcoming OTel semconv change) a single-point edit.
💡 Example helper

```python
def assert_cache_token_attributes(
    cache_creation_span,
    cache_read_span,
    expected_cache_tokens: int,
    expected_input_tokens: int,
    creation_output_tokens: int,
    read_output_tokens: int,
):
    # cross-span consistency
    assert (
        cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"]
        == cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"]
    )

    # creation span: wrote to cache only
    assert cache_creation_span.attributes["gen_ai.usage.cache_read_input_tokens"] == 0
    assert cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] != 0
    assert cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] == expected_cache_tokens
    assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == expected_input_tokens
    assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == creation_output_tokens

    # read span: read from cache only
    assert cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] != 0
    assert cache_read_span.attributes["gen_ai.usage.cache_creation_input_tokens"] == 0
    assert cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] == expected_cache_tokens
    assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == expected_input_tokens
    assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == read_output_tokens
```

Then each test simply calls:

```python
assert_cache_token_attributes(
    cache_creation_span,
    cache_read_span,
    expected_cache_tokens=1163,
    expected_input_tokens=1167,
    creation_output_tokens=187,
    read_output_tokens=202,
)
```

Also applies to: 185-206, 378-399, 510-531, 601-622, 795-816, 930-951, 1024-1045, 1223-1244, 1358-1379, 1453-1474, 1662-1683
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py` around lines 90 - 111, The tests duplicate the same cache-token assertion block across many files; add a single helper function (e.g., assert_cache_token_attributes) that accepts cache_creation_span, cache_read_span and the expected counts (expected_cache_tokens, expected_input_tokens, creation_output_tokens, read_output_tokens) and performs the cross-span equality checks, directional zero/non-zero checks, and exact value assertions using the same attribute names ("gen_ai.usage.cache_creation_input_tokens", "gen_ai.usage.cache_read_input_tokens", "gen_ai.usage.input_tokens", "gen_ai.usage.output_tokens"); then replace each repeated block (the uses of cache_creation_span and cache_read_span shown in the diff) with a single call to that helper passing the appropriate expected values so future attribute-name changes or semantic-convention updates are updated in one place.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py`:
- Around line 175-183: The test contains a duplicated assertion comparing
cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] to
cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"]; remove the
redundant second occurrence so the equality is asserted only once (locate the
duplicate in tests/test_prompt_caching.py around the block that references
cache_creation_span and cache_read_span and delete the repeated assertion
lines).
- Around line 368-376: Remove the duplicate assertion comparing
cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] to
cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] (the repeated
block using cache_creation_span and cache_read_span); leave a single assertion
that performs this comparison in the test (e.g., in tests/test_prompt_caching.py
within the test function that references cache_creation_span and
cache_read_span) so the redundant copy-pasted assertion is deleted.
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py (2)
27-42: Prefer `> 0` over `!= 0` for token count assertions.

Token counts are non-negative integers; `> 0` is the stronger, semantically correct form. `!= 0` would technically pass a negative value, which is invalid for a count.

♻️ Use `> 0` for positive-count assertions

```diff
-    assert (
-        cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] != 0
-    )
+    assert cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] > 0
 ...
-    assert cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] != 0
+    assert cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] > 0
```
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py` around lines 27 - 42, Replace the weak non-zero assertions that use != 0 with strict positive checks > 0: update the assertion on cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] (in the cache-creation block) to use > 0 instead of != 0, and update the assertion on cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] (in the cache-read block) to use > 0 instead of != 0 so token counts are asserted to be strictly positive.
21-47: Use constant symbols instead of hardcoded string literals for attribute keys.

While the hardcoded strings `"gen_ai.usage.cache_creation_input_tokens"` and `"gen_ai.usage.cache_read_input_tokens"` are correct (the constants `LLM_USAGE_CACHE_CREATION_INPUT_TOKENS` and `LLM_USAGE_CACHE_READ_INPUT_TOKENS` map to these same values), using the constant symbols directly avoids magic strings and improves maintainability if attribute names ever change in the future.

♻️ Suggested refactor

Import the constants at the top of the test file and use them in `_verify_caching_attributes`:

```python
from opentelemetry.semconv_ai import SpanAttributes

def _verify_caching_attributes(...):
    assert (
        cache_creation_span.attributes[SpanAttributes.LLM_USAGE_CACHE_CREATION_INPUT_TOKENS]
        == cache_read_span.attributes[SpanAttributes.LLM_USAGE_CACHE_READ_INPUT_TOKENS]
    )
    # ... apply consistently for all six attribute accesses
```
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py` around lines 21 - 47, Replace hardcoded attribute string literals in _verify_caching_attributes with the SpanAttributes constants from opentelemetry.semconv_ai: import SpanAttributes at top of the test file and use SpanAttributes.LLM_USAGE_CACHE_CREATION_INPUT_TOKENS and SpanAttributes.LLM_USAGE_CACHE_READ_INPUT_TOKENS (and the corresponding SpanAttributes for LLM_USAGE_INPUT_TOKENS and LLM_USAGE_OUTPUT_TOKENS) wherever the code currently indexes cache_creation_span.attributes and cache_read_span.attributes so all six attribute accesses reference the SpanAttributes constants instead of raw string literals.
Actionable comments posted: 1
Caution: Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)

packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (1)

7-7: ⚠️ Potential issue | 🟡 Minor

`SpanAttributes` import is now unused — will fail Ruff F401 linting.

After lines 68 and 70 switched to the literal string, `SpanAttributes` is no longer referenced anywhere in this file. The `[tool.ruff.lint]` config selects the `"F"` rule-set (which includes F401), so this will fail the linter.

Either remove the import or — preferably, consistent with the rest of the codebase — restore the use of the semconv constant (see related comment in `span_utils.py`).

🔧 Proposed fix (remove unused import)

```diff
 import json

 import pytest
 from opentelemetry.semconv._incubating.attributes import (
     gen_ai_attributes as GenAIAttributes,
 )
-from opentelemetry.semconv_ai import SpanAttributes
```
- assert "gen_ai.request.structured_output_schema" in anthropic_span.attributes + assert SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA in anthropic_span.attributes schema_attr = json.loads( - anthropic_span.attributes["gen_ai.request.structured_output_schema"] + anthropic_span.attributes[SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA] )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py` at line 7, The import SpanAttributes is now unused in tests/test_structured_outputs.py and will trigger Ruff F401; either remove the unused import line for SpanAttributes, or restore usage of the semconv constant by replacing the literal attribute string(s) you added (the two occurrences you changed to string literals) with the appropriate SpanAttributes member(s) (e.g., use SpanAttributes.<CONST_NAME> where the literal was used) so the import is referenced.
```diff
 set_span_attribute(
     span,
-    SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
+    "gen_ai.request.structured_output_schema",
```
🧩 Analysis chain

🏁 Script executed:

```shell
rg -n "LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA\|structured_output_schema" \
  --type py \
  packages/opentelemetry-semantic-conventions-ai
```

🏁 Script executed:

```shell
# Also check the imports and all attribute usages in the span_utils.py file
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py | head -20
```

🏁 Script executed:

```shell
# Check all attribute assignments in the file to understand the pattern
rg -n "SpanAttributes\.|GenAIAttributes\.|\"gen_ai\." \
  packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py \
  -A 1
```

🏁 Script executed:

```shell
# Check what constants are available in SpanAttributes
rg -n "LLM_REQUEST|structured" \
  --type py \
  packages/opentelemetry-semantic-conventions-ai
```

🏁 Script executed:

```shell
# List all available files in the semantic-conventions-ai package
find packages/opentelemetry-semantic-conventions-ai -name "*.py" -type f | head -20
```

🏁 Script executed:

```shell
# Check the main attributes file to see what constants are defined
cat -n packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py
```
Add LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA constant to SpanAttributes or use the appropriate semantic convention constant.
The constant LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA does not exist in opentelemetry-semantic-conventions-ai. Line 177 uses a literal string "gen_ai.request.structured_output_schema" while all other attributes in this file reference named constants from SpanAttributes or GenAIAttributes (e.g., SpanAttributes.LLM_REQUEST_FUNCTIONS at line 159, SpanAttributes.LLM_USAGE_TOTAL_TOKENS at lines 337, 361).
Either define this constant in SpanAttributes to align with semantic conventions best practices and maintain consistency, or verify that this attribute name is correct and document why a literal string is necessary here.
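The suggested fix can be sketched as follows. The `SpanAttributes` class and `set_span_attribute` helper below are simplified, hypothetical stand-ins for the real package code; only the attribute key string comes from this PR:

```python
class SpanAttributes:
    # Existing constant, shown for context (simplified).
    LLM_REQUEST_FUNCTIONS = "gen_ai.request.functions"
    # Hypothetical addition: name the key once instead of hard-coding it
    # at the call site in span_utils.py.
    LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA = "gen_ai.request.structured_output_schema"


def set_span_attribute(span, name, value):
    # Simplified stand-in for the helper in utils.py: skip empty values.
    if value is not None:
        span[name] = value


span = {}
set_span_attribute(
    span,
    SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
    '{"type": "object"}',
)
print(span)  # {'gen_ai.request.structured_output_schema': '{"type": "object"}'}
```

With the constant in place, both `span_utils.py` and the tests can reference the same symbol, so a future rename is a one-line change.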
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py`
at line 177, Replace the literal "gen_ai.request.structured_output_schema" with
a named semantic constant: add LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA =
"gen_ai.request.structured_output_schema" to the SpanAttributes constants (where
other SpanAttributes are defined) and then use
SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA in span_utils.py in place of
the literal; update any imports if necessary so span_utils.py references the new
SpanAttributes constant instead of the hard-coded string.
Force-pushed 5100d19 to 01a7108 (Compare)
Actionable comments posted: 1
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py (2)
111-114: Redundant assertion — already covered by `_verify_caching_attributes` on line 128.

The explicit check that `cache_creation_input_tokens == cache_read_input_tokens` (lines 111–114) is the first assertion inside `_verify_caching_attributes` (lines 21–24). The same redundancy appears at lines 693–696 in `test_anthropic_prompt_caching_async_with_events_with_no_content`.

♻️ Remove redundant assertions

In `test_anthropic_prompt_caching_legacy`:

```diff
-    assert (
-        cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"]
-        == cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"]
-    )
-
     assert (
         cache_creation_span.attributes.get("gen_ai.response.id")
```

In `test_anthropic_prompt_caching_async_with_events_with_no_content` (lines 693–696):

```diff
-    assert (
-        cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"]
-        == cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"]
-    )
-
     _verify_caching_attributes(cache_creation_span, cache_read_span, 1169, 207, 224, 1165)
```
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py` around lines 111 - 114, Remove the redundant explicit assertions that compare cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] to cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] in the tests; these checks are already covered by the helper _verify_caching_attributes. Specifically, delete the explicit equality assertion in test_anthropic_prompt_caching_legacy (the one referencing cache_creation_span and cache_read_span) and the duplicate in test_anthropic_prompt_caching_async_with_events_with_no_content, leaving the calls to _verify_caching_attributes intact.
128-128: Consider using keyword arguments for readability.

The positional call `_verify_caching_attributes(cache_creation_span, cache_read_span, 1167, 187, 202, 1163)` requires cross-referencing the helper signature to understand what each number means. This pattern repeats across all 12 call sites.

♻️ Example with keyword arguments

```python
_verify_caching_attributes(
    cache_creation_span,
    cache_read_span,
    input_tokens=1167,
    cache_creation_span_output_tokens=187,
    cache_read_span_output_tokens=202,
    cached_tokens=1163,
)
```
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py` at line 128, Replace the positional numeric arguments in the _verify_caching_attributes calls with explicit keyword arguments for clarity: locate each call like _verify_caching_attributes(cache_creation_span, cache_read_span, 1167, 187, 202, 1163) and change the numeric args to input_tokens=1167, cache_creation_span_output_tokens=187, cache_read_span_output_tokens=202, cached_tokens=1163 (keeping the same first two positional args cache_creation_span and cache_read_span); apply the same change to all similar call sites so the meaning of each numeric literal is explicit.
```python
set_span_attribute(
    span, SpanAttributes.LLM_USAGE_CACHE_READ_INPUT_TOKENS, cache_read_tokens
)
set_span_attribute(
    span, SpanAttributes.LLM_USAGE_CACHE_CREATION_INPUT_TOKENS, cache_creation_tokens
)
```
🧩 Analysis chain

🏁 Script executed:

```shell
# First, let's verify the streaming.py file content at the specified lines
echo "=== Content of streaming.py around lines 97-102 ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/streaming.py | sed -n '80,110p'
```

🏁 Script executed:

```shell
# Now check for cache token attributes in __init__.py
echo "=== Searching for cache token attributes in __init__.py ==="
rg -n "CACHE_READ_INPUT_TOKENS|CACHE_CREATION_INPUT_TOKENS" \
  --type py \
  packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/
```

🏁 Script executed:

```shell
# Check the set_span_attribute function implementation
echo "=== Checking set_span_attribute function in utils.py ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py | head -40
```

🏁 Script executed:

```shell
# Search for _set_token_usage and _aset_token_usage functions in __init__.py
echo "=== Searching for token usage functions in __init__.py ==="
rg -n "_set_token_usage|_aset_token_usage" \
  --type py \
  packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py \
  -A 15
```

🏁 Script executed:

```shell
# Check the actual cache token attribute lines in __init__.py more closely
echo "=== Checking cache token attributes in _aset_token_usage (around line 286) ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py | sed -n '280,295p'
echo ""
echo "=== Checking cache token attributes in _set_token_usage (around line 400) ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py | sed -n '395,410p'
```

🏁 Script executed:

```shell
# Verify the SpanAttributes constants used in both files
echo "=== SpanAttributes imports and usage in streaming.py ==="
head -50 packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/streaming.py | grep -i "import\|SpanAttributes"
echo ""
echo "=== SpanAttributes imports in __init__.py ==="
head -50 packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py | grep -i "import\|SpanAttributes"
```

🏁 Script executed:

```shell
# Let's verify the SpanAttributes constants to understand the difference
echo "=== Checking what SpanAttributes constants are available ==="
rg "LLM_USAGE_CACHE|GEN_AI_USAGE_CACHE" --type py
```

🏁 Script executed:

```shell
# Check if these are custom constants defined in the package
echo "=== Looking for SpanAttributes definition/import ==="
rg -n "class SpanAttributes|from.*SpanAttributes" --type py packages/opentelemetry-instrumentation-anthropic/ -B 2 -A 2
```

🏁 Script executed:

```shell
# Quick check to confirm cache_read_tokens and cache_creation_tokens extraction in __init__.py
echo "=== Checking cache token extraction in __init__.py _aset_token_usage ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py | sed -n '200,230p'
```
Cache token attributes correctly restored, but SpanAttributes constants are inconsistently named across files.
The new set_span_attribute calls for cache tokens in streaming.py (lines 97-102) are correct and consistent with the existing token usage pattern. Cache tokens default to 0 (lines 82, 85), so these attributes will always be set, matching the behavior of LLM_USAGE_TOTAL_TOKENS on line 95.
The non-streaming path in __init__.py also sets cache token attributes in both _aset_token_usage() (lines 286, 290) and _set_token_usage() (lines 400, 404), as expected.
However, there's an inconsistency: streaming.py uses SpanAttributes.LLM_USAGE_CACHE_* constants while __init__.py uses SpanAttributes.GEN_AI_USAGE_CACHE_* constants. Both resolve to the same attribute names (e.g., "gen_ai.usage.cache_read_input_tokens"), so they function identically, but the naming should be consistent across the codebase. Use the same constant names in both files.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/streaming.py`
around lines 97 - 102, The cache token span attributes in streaming.py use
SpanAttributes.LLM_USAGE_CACHE_READ_INPUT_TOKENS and
SpanAttributes.LLM_USAGE_CACHE_CREATION_INPUT_TOKENS which are inconsistent with
the rest of the codebase that uses
SpanAttributes.GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS and
SpanAttributes.GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS (as seen in
__init__.py); update the set_span_attribute calls in streaming.py to use the
GEN_AI_USAGE_CACHE_* constants instead and confirm the SpanAttributes import
covers those names.
This was accidentally removed in #3138 and some subsequent changes. Restoring. More context in #3647.
Note: this PR does NOT fully close #3647, as there will be a new attribute convention soon, see open-telemetry/semantic-conventions#3163
`feat(instrumentation): ...` or `fix(instrumentation): ...`
Important

Restores cache token attributes in Anthropic instrumentation and updates tests to verify correct attribute settings.

- Restores cache token attributes in `_aset_token_usage()` and `_set_token_usage()` in `__init__.py` and in `_set_token_usage()` in `streaming.py`.
- Renames `GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS` to `LLM_USAGE_CACHE_READ_INPUT_TOKENS` and `GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS` to `LLM_USAGE_CACHE_CREATION_INPUT_TOKENS`.
- Updates `test_prompt_caching.py` to verify cache token attributes are correctly set for both sync and async operations.

This description was created by Ellipsis for daa6ff5.