
fix(anthropic): restore accidentally lost cache tokens attributes#3648

Merged
galkleinman merged 6 commits into traceloop:main from lmnr-ai:fix/anthropic-cached-tokens
Feb 22, 2026

Conversation

Collaborator

@dinmukhamedm dinmukhamedm commented Jan 31, 2026

This was accidentally removed in #3138 and some subsequent changes. Restoring. More context in #3647.

Note: this PR does NOT fully close #3647, as there will be a new attribute convention soon, see open-telemetry/semantic-conventions#3163

  • I have added tests that cover my changes.
  • If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ...
  • (If applicable) I have updated the documentation accordingly.

Important

Restores cache token attributes in Anthropic instrumentation and updates tests to verify correct attribute settings.

  • Behavior:
    • Restores cache token attributes in _aset_token_usage() and _set_token_usage() in __init__.py and _set_token_usage() in streaming.py.
    • Updates attribute names from GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS to LLM_USAGE_CACHE_READ_INPUT_TOKENS and GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS to LLM_USAGE_CACHE_CREATION_INPUT_TOKENS.
  • Tests:
    • Adds assertions in test_prompt_caching.py to verify cache token attributes are correctly set for both sync and async operations.
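The restored writes can be pictured with a short sketch (a hypothetical reconstruction, not the actual `_set_token_usage()` from the package — the span and usage objects here are stand-ins, and only the attribute keys `gen_ai.usage.cache_read_input_tokens` / `gen_ai.usage.cache_creation_input_tokens` come from the PR itself):

```python
def set_token_usage(span, usage):
    """Sketch of the restored cache-token logic.

    `usage` mimics Anthropic's usage payload, which reports cache activity
    separately from ordinary input/output token counts.
    """
    # Default to 0 when the cache fields are absent from the payload.
    cache_read = getattr(usage, "cache_read_input_tokens", 0) or 0
    cache_creation = getattr(usage, "cache_creation_input_tokens", 0) or 0
    # Record cache reads and cache writes as distinct span attributes.
    span.set_attribute("gen_ai.usage.cache_read_input_tokens", cache_read)
    span.set_attribute("gen_ai.usage.cache_creation_input_tokens", cache_creation)
```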

This description was created by Ellipsis for daa6ff5.

Summary by CodeRabbit

  • New Features

    • Added cache-related input token metrics to OpenTelemetry instrumentation for Anthropic, exposing cache read and cache creation token counts.
  • Bug Fixes

    • Standardized the structured output schema attribute to "gen_ai.request.structured_output_schema" for consistent telemetry reporting.
  • Chores

    • Updated OpenTelemetry semantic conventions dependency.

@dinmukhamedm dinmukhamedm requested a review from nirga January 31, 2026 16:19
Contributor

@ellipsis-dev ellipsis-dev Bot left a comment


Important

Looks good to me! 👍

Reviewed everything up to daa6ff5 in 11 seconds.
  • Reviewed 459 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments. View those below.

Workflow ID: wflow_ZIhOqCTqy5KUUHF1



coderabbitai Bot commented Jan 31, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Restores cache token attributes on Anthropic spans, centralizes caching assertions in tests via a new helper, switches a structured-output attribute lookup to a literal key, and bumps the opentelemetry-semantic-conventions-ai dependency version.

Changes

  • Cache Token Metrics — packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/streaming.py
    Adds span attribute writes for gen_ai.usage.cache_read_input_tokens and gen_ai.usage.cache_creation_input_tokens when recording token usage.
  • Cache Metrics Test Infrastructure — packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py
    Introduces a _verify_caching_attributes(...) helper and replaces many per-test token assertions with calls to the helper, centralizing validation across legacy, async, and streaming test variants.
  • Structured Output Schema Attribute — packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py, packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py
    Replaces SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA with the literal attribute key "gen_ai.request.structured_output_schema" for structured JSON schema outputs.
  • Dependency Update — packages/opentelemetry-instrumentation-anthropic/pyproject.toml
    Bumps the opentelemetry-semantic-conventions-ai requirement from >=0.4.13,<0.5.0 to >=0.4.14,<0.5.0.
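The dependency change amounts to a one-line constraint edit in pyproject.toml (a sketch only — the actual file may use a different dependency table, e.g. PEP 621 `[project]` vs. Poetry):

```toml
[tool.poetry.dependencies]
# Bumped so the newly released cache-token constants are available.
opentelemetry-semantic-conventions-ai = ">=0.4.14,<0.5.0"
```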

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 sniffs the logs
Cache tokens hop back into view,
Spans now whisper what reads and creations do,
Tests sing one tune, tidy and true,
A tiny bump and a hop — hooray, review! 🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (4 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check — ✅ Passed: the title accurately and concisely describes the main change, restoring cache token attributes that were accidentally removed.
  • Linked Issues Check — ✅ Passed: the PR implements the primary objective of issue #3647, restoring cache_read_input_tokens and cache_creation_input_tokens attributes on Anthropic spans for both sync and async operations.
  • Out of Scope Changes Check — ✅ Passed: all changes are scoped to restoring cache token attributes and updating tests; the span_utils.py change to a literal string key aligns with that objective.



 set_span_attribute(
-    span, SpanAttributes.GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS, cache_read_tokens
+    span, SpanAttributes.LLM_USAGE_CACHE_READ_INPUT_TOKENS, cache_read_tokens
Contributor


Why did you change it? The GEN_AI prefix is the new naming. If possible, and if it already exists in the official OTel semconv, I would even import from there instead of using the local version.

(same for all the occurrences below)

Collaborator Author


In the current state, these (silently) fail with AttributeError. And we are very lucky that they are the last lines in functions decorated with dont_throw.
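A minimal sketch of how a dont_throw-style decorator hides this class of bug (the decorator name comes from the comment above; this implementation and the names around it are illustrative, not the actual code):

```python
import functools
import logging

def dont_throw(func):
    # Swallow every exception so instrumentation never crashes the host app;
    # the trade-off is that a bad attribute reference fails without a trace.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            logging.getLogger(__name__).debug("suppressed", exc_info=True)
    return wrapper

class SpanAttributes:
    # Only the LLM_-prefixed constant exists in the released package.
    LLM_USAGE_CACHE_READ_INPUT_TOKENS = "gen_ai.usage.cache_read_input_tokens"

@dont_throw
def record_cache_tokens(span):
    # Raises AttributeError (the GEN_AI_-prefixed name was never released);
    # the decorator silently discards it and the attribute is never set.
    span.set_attribute(SpanAttributes.GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS, 0)
```

Because the buggy writes were the last statements in the decorated functions, everything before them still ran, which is why the regression went unnoticed.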

I did a little digging around and, while the GEN_AI_ prefix does appear in the local conventions definition, it has never been released. If you look at the blame, it was added on Oct 30 in #3138, but the local semantic conventions package was last released on August 22 (pypi, blame). Moreover, both variable prefixes resolve to the same attribute key starting with gen_ai.usage.

And yes, I would love to import from the official semconv (and I outlined the actual small difference in the attribute key in the linked issue description), but it looks like these are not in their Python package yet, as of v0.60.0b1.

@galkleinman if you could please bump the patch version in the opentelemetry-semantic-conventions-ai package and release, I'm more than happy to switch to the new prefix. Though frankly, nothing changes functionally (the variables literally have the same values), and this is subject to change soon anyway, as we'll likely see these in the official OTel package.

Contributor


Awesome. Will do it later today and then you'll be able to bump version here.

Contributor


0.4.14 is released :)

Collaborator Author


@galkleinman I pushed the change. I realized that, unfortunately, another attribute has been removed and not properly tested in #3138, namely gen_ai.request.structured_output_schema. I had to keep it as a raw string in code, but I am open to feedback.

I remember this was originally my initiative, and I think I saw it in some version of the OTel conventions, but perhaps they decided not to add it in the end.

Contributor


IMO let's keep it string for now, and create an issue to fix it. I don't want to halt this PR any more...

Comment thread packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py (1)

90-111: Consider extracting a shared helper for the repeated cache-token assertion block.

The same assertion pattern (cross-span equality, directional checks, exact fixture values) is copy-pasted across all 12 tests with only the expected token counts varying. A small helper would cut ~180 lines of duplication and make future attribute-name migrations (the upcoming OTel semconv change) a single-point edit.

💡 Example helper
def assert_cache_token_attributes(
    cache_creation_span,
    cache_read_span,
    expected_cache_tokens: int,
    expected_input_tokens: int,
    creation_output_tokens: int,
    read_output_tokens: int,
):
    # cross-span consistency
    assert (
        cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"]
        == cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"]
    )
    # creation span: wrote to cache only
    assert cache_creation_span.attributes["gen_ai.usage.cache_read_input_tokens"] == 0
    assert cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] != 0
    assert cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] == expected_cache_tokens
    assert cache_creation_span.attributes["gen_ai.usage.input_tokens"] == expected_input_tokens
    assert cache_creation_span.attributes["gen_ai.usage.output_tokens"] == creation_output_tokens
    # read span: read from cache only
    assert cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] != 0
    assert cache_read_span.attributes["gen_ai.usage.cache_creation_input_tokens"] == 0
    assert cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] == expected_cache_tokens
    assert cache_read_span.attributes["gen_ai.usage.input_tokens"] == expected_input_tokens
    assert cache_read_span.attributes["gen_ai.usage.output_tokens"] == read_output_tokens

Then each test simply calls:

assert_cache_token_attributes(
    cache_creation_span, cache_read_span,
    expected_cache_tokens=1163, expected_input_tokens=1167,
    creation_output_tokens=187, read_output_tokens=202,
)

Also applies to: 185-206, 378-399, 510-531, 601-622, 795-816, 930-951, 1024-1045, 1223-1244, 1358-1379, 1453-1474, 1662-1683

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py`
around lines 90 - 111, The tests duplicate the same cache-token assertion block
across many files; add a single helper function (e.g.,
assert_cache_token_attributes) that accepts cache_creation_span, cache_read_span
and the expected counts (expected_cache_tokens, expected_input_tokens,
creation_output_tokens, read_output_tokens) and performs the cross-span equality
checks, directional zero/non-zero checks, and exact value assertions using the
same attribute names ("gen_ai.usage.cache_creation_input_tokens",
"gen_ai.usage.cache_read_input_tokens", "gen_ai.usage.input_tokens",
"gen_ai.usage.output_tokens"); then replace each repeated block (the uses of
cache_creation_span and cache_read_span shown in the diff) with a single call to
that helper passing the appropriate expected values so future attribute-name
changes or semantic-convention updates are updated in one place.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py`:
- Around line 175-183: The test contains a duplicated assertion comparing
cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] to
cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"]; remove the
redundant second occurrence so the equality is asserted only once (locate the
duplicate in tests/test_prompt_caching.py around the block that references
cache_creation_span and cache_read_span and delete the repeated assertion
lines).
- Around line 368-376: Remove the duplicate assertion comparing
cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] to
cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] (the repeated
block using cache_creation_span and cache_read_span); leave a single assertion
that performs this comparison in the test (e.g., in tests/test_prompt_caching.py
within the test function that references cache_creation_span and
cache_read_span) so the redundant copy-pasted assertion is deleted.


Comment thread packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py Outdated

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py (2)

27-42: Prefer > 0 over != 0 for token count assertions.

Token counts are non-negative integers; > 0 is the stronger, semantically correct form. != 0 would technically pass a negative value, which is invalid for a count.

♻️ Use `> 0` for positive-count assertions
-    assert (
-        cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] != 0
-    )
+    assert cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] > 0

 ...

-    assert cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] != 0
+    assert cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] > 0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py`
around lines 27 - 42, Replace the weak non-zero assertions that use != 0 with
strict positive checks > 0: update the assertion on
cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] (in
the cache-creation block) to use > 0 instead of != 0, and update the assertion
on cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] (in the
cache-read block) to use > 0 instead of != 0 so token counts are asserted to be
strictly positive.

21-47: Use constant symbols instead of hardcoded string literals for attribute keys.

While the hardcoded strings "gen_ai.usage.cache_creation_input_tokens" and "gen_ai.usage.cache_read_input_tokens" are correct (the constants LLM_USAGE_CACHE_CREATION_INPUT_TOKENS and LLM_USAGE_CACHE_READ_INPUT_TOKENS map to these same values), using the constant symbols directly avoids magic strings and improves maintainability if attribute names ever change in the future.

♻️ Suggested refactor

Import the constants at the top of the test file and use them in _verify_caching_attributes:

from opentelemetry.semconv_ai import SpanAttributes

def _verify_caching_attributes(...):
    assert (
        cache_creation_span.attributes[SpanAttributes.LLM_USAGE_CACHE_CREATION_INPUT_TOKENS]
        == cache_read_span.attributes[SpanAttributes.LLM_USAGE_CACHE_READ_INPUT_TOKENS]
    )
    # ... apply consistently for all six attribute accesses
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py`
around lines 21 - 47, Replace hardcoded attribute string literals in
_verify_caching_attributes with the SpanAttributes constants from
opentelemetry.semconv_ai: import SpanAttributes at top of the test file and use
SpanAttributes.LLM_USAGE_CACHE_CREATION_INPUT_TOKENS and
SpanAttributes.LLM_USAGE_CACHE_READ_INPUT_TOKENS (and the corresponding
SpanAttributes for LLM_USAGE_INPUT_TOKENS and LLM_USAGE_OUTPUT_TOKENS) wherever
the code currently indexes cache_creation_span.attributes and
cache_read_span.attributes so all six attribute accesses reference the
SpanAttributes constants instead of raw string literals.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py`:
- Around line 111-115: Remove the redundant equality assertion comparing
cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] and
cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] from the
test; the same cross-span check is already performed by the helper function
_verify_caching_attributes, so delete the assertion block referencing
cache_creation_span and cache_read_span to avoid duplicate checks and keep tests
DRY.
- Around line 693-698: Remove the redundant inline equality assertion that
compares
cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] to
cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] in the test;
rely on the existing helper _verify_caching_attributes(cache_creation_span,
cache_read_span, 1169, 207, 224, 1165) to perform these checks instead—delete
the two-line assert block referencing cache_creation_span and cache_read_span to
avoid duplication.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py (1)

7-7: ⚠️ Potential issue | 🟡 Minor

SpanAttributes import is now unused — will fail Ruff F401 linting.

After lines 68 and 70 switched to the literal string, SpanAttributes is no longer referenced anywhere in this file. The [tool.ruff.lint] config selects the "F" rule-set (which includes F401), so this will fail the linter.

Either remove the import or — preferably, consistent with the rest of the codebase — restore the use of the semconv constant (see related comment in span_utils.py).

🔧 Proposed fix (remove unused import)
 import json
 
 import pytest
 from opentelemetry.semconv._incubating.attributes import (
     gen_ai_attributes as GenAIAttributes,
 )
-from opentelemetry.semconv_ai import SpanAttributes

Or, once the constant is verified to exist in 0.4.14, restore the constant references:

-    assert "gen_ai.request.structured_output_schema" in anthropic_span.attributes
+    assert SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA in anthropic_span.attributes
     schema_attr = json.loads(
-        anthropic_span.attributes["gen_ai.request.structured_output_schema"]
+        anthropic_span.attributes[SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA]
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_structured_outputs.py`
at line 7, The import SpanAttributes is now unused in
tests/test_structured_outputs.py and will trigger Ruff F401; either remove the
unused import line for SpanAttributes, or restore usage of the semconv constant
by replacing the literal attribute string(s) you added (the two occurrences you
changed to string literals) with the appropriate SpanAttributes member(s) (e.g.,
use SpanAttributes.<CONST_NAME> where the literal was used) so the import is
referenced.

 set_span_attribute(
     span,
-    SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA,
+    "gen_ai.request.structured_output_schema",


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

rg -n "LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA\|structured_output_schema" \
  --type py \
  packages/opentelemetry-semantic-conventions-ai

Repository: traceloop/openllmetry

Length of output: 47


🏁 Script executed:

# Also check the imports and all attribute usages in the span_utils.py file
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py | head -20

Repository: traceloop/openllmetry

Length of output: 689


🏁 Script executed:

# Check all attribute assignments in the file to understand the pattern
rg -n "SpanAttributes\.|GenAIAttributes\.|\"gen_ai\." \
  packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py \
  -A 1

Repository: traceloop/openllmetry

Length of output: 4042


🏁 Script executed:

# Check what constants are available in SpanAttributes
rg -n "LLM_REQUEST|structured" \
  --type py \
  packages/opentelemetry-semantic-conventions-ai

Repository: traceloop/openllmetry

Length of output: 642


🏁 Script executed:

# List all available files in the semantic-conventions-ai package
find packages/opentelemetry-semantic-conventions-ai -name "*.py" -type f | head -20

Repository: traceloop/openllmetry

Length of output: 515


🏁 Script executed:

# Check the main attributes file to see what constants are defined
cat -n packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py

Repository: traceloop/openllmetry

Length of output: 15476


Add LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA constant to SpanAttributes or use the appropriate semantic convention constant.

The constant LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA does not exist in opentelemetry-semantic-conventions-ai. Line 177 uses a literal string "gen_ai.request.structured_output_schema" while all other attributes in this file reference named constants from SpanAttributes or GenAIAttributes (e.g., SpanAttributes.LLM_REQUEST_FUNCTIONS at line 159, SpanAttributes.LLM_USAGE_TOTAL_TOKENS at lines 337, 361).

Either define this constant in SpanAttributes to align with semantic conventions best practices and maintain consistency, or verify that this attribute name is correct and document why a literal string is necessary here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py`
at line 177, Replace the literal "gen_ai.request.structured_output_schema" with
a named semantic constant: add LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA =
"gen_ai.request.structured_output_schema" to the SpanAttributes constants (where
other SpanAttributes are defined) and then use
SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA in span_utils.py in place of
the literal; update any imports if necessary so span_utils.py references the new
SpanAttributes constant instead of the hard-coded string.

Contributor

@galkleinman galkleinman left a comment


LGTM 💪

Just fix the lint & rebase...

@dinmukhamedm dinmukhamedm force-pushed the fix/anthropic-cached-tokens branch from 5100d19 to 01a7108 on February 22, 2026 21:24

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py (2)

111-114: Redundant assertion — already covered by _verify_caching_attributes on line 128.

The explicit check that cache_creation_input_tokens == cache_read_input_tokens (lines 111–114) is the first assertion inside _verify_caching_attributes (lines 21–24). The same redundancy appears at lines 693–696 in test_anthropic_prompt_caching_async_with_events_with_no_content.

♻️ Remove redundant assertions

In test_anthropic_prompt_caching_legacy:

-    assert (
-        cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"]
-        == cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"]
-    )
-
     assert (
         cache_creation_span.attributes.get("gen_ai.response.id")

In test_anthropic_prompt_caching_async_with_events_with_no_content (lines 693–696):

-    assert (
-        cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"]
-        == cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"]
-    )
-
     _verify_caching_attributes(cache_creation_span, cache_read_span, 1169, 207, 224, 1165)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py`
around lines 111 - 114, Remove the redundant explicit assertions that compare
cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] to
cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] in the tests;
these checks are already covered by the helper _verify_caching_attributes.
Specifically, delete the explicit equality assertion in
test_anthropic_prompt_caching_legacy (the one referencing cache_creation_span
and cache_read_span) and the duplicate in
test_anthropic_prompt_caching_async_with_events_with_no_content, leaving the
calls to _verify_caching_attributes intact.

128-128: Consider using keyword arguments for readability.

The positional call _verify_caching_attributes(cache_creation_span, cache_read_span, 1167, 187, 202, 1163) requires cross-referencing the helper signature to understand what each number means. This pattern repeats across all 12 call sites.

♻️ Example with keyword arguments
_verify_caching_attributes(
    cache_creation_span,
    cache_read_span,
    input_tokens=1167,
    cache_creation_span_output_tokens=187,
    cache_read_span_output_tokens=202,
    cached_tokens=1163,
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py`
at line 128, Replace the positional numeric arguments in the
_verify_caching_attributes calls with explicit keyword arguments for clarity:
locate each call like _verify_caching_attributes(cache_creation_span,
cache_read_span, 1167, 187, 202, 1163) and change the numeric args to
input_tokens=1167, cache_creation_span_output_tokens=187,
cache_read_span_output_tokens=202, cached_tokens=1163 (keeping the same first
two positional args cache_creation_span and cache_read_span); apply the same
change to all similar call sites so the meaning of each numeric literal is
explicit.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/streaming.py`:
- Around line 97-102: The cache token span attributes in streaming.py use
SpanAttributes.LLM_USAGE_CACHE_READ_INPUT_TOKENS and
SpanAttributes.LLM_USAGE_CACHE_CREATION_INPUT_TOKENS which are inconsistent with
the rest of the codebase that uses
SpanAttributes.GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS and
SpanAttributes.GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS (as seen in
__init__.py); update the set_span_attribute calls in streaming.py to use the
GEN_AI_USAGE_CACHE_* constants instead and confirm the SpanAttributes import
covers those names.

---

Duplicate comments:
In
`@packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py`:
- Line 177: Replace the hard-coded string
"gen_ai.request.structured_output_schema" with a named constant: add
LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA = "gen_ai.request.structured_output_schema"
to the SpanAttributes constants class, then import/use
SpanAttributes.LLM_REQUEST_STRUCTURED_OUTPUT_SCHEMA in span_utils.py (where the
literal appears) so the file follows the same SpanAttributes/GenAIAttributes
constant pattern.

---

Nitpick comments:
In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py`:
- Around line 111-114: Remove the redundant explicit assertions that compare
cache_creation_span.attributes["gen_ai.usage.cache_creation_input_tokens"] to
cache_read_span.attributes["gen_ai.usage.cache_read_input_tokens"] in the tests;
these checks are already covered by the helper _verify_caching_attributes.
Specifically, delete the explicit equality assertion in
test_anthropic_prompt_caching_legacy (the one referencing cache_creation_span
and cache_read_span) and the duplicate in
test_anthropic_prompt_caching_async_with_events_with_no_content, leaving the
calls to _verify_caching_attributes intact.
- Line 128: Replace the positional numeric arguments in the
_verify_caching_attributes calls with explicit keyword arguments for clarity:
locate each call like _verify_caching_attributes(cache_creation_span,
cache_read_span, 1167, 187, 202, 1163) and change the numeric args to
input_tokens=1167, cache_creation_span_output_tokens=187,
cache_read_span_output_tokens=202, cached_tokens=1163 (keeping the same first
two positional args cache_creation_span and cache_read_span); apply the same
change to all similar call sites so the meaning of each numeric literal is
explicit.

Comment on lines +97 to +102
set_span_attribute(
    span, SpanAttributes.LLM_USAGE_CACHE_READ_INPUT_TOKENS, cache_read_tokens
)
set_span_attribute(
    span, SpanAttributes.LLM_USAGE_CACHE_CREATION_INPUT_TOKENS, cache_creation_tokens
)
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's verify the streaming.py file content at the specified lines
echo "=== Content of streaming.py around lines 97-102 ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/streaming.py | sed -n '80,110p'

Repository: traceloop/openllmetry

Length of output: 1460


🏁 Script executed:

# Now check for cache token attributes in __init__.py
echo "=== Searching for cache token attributes in __init__.py ==="
rg -n "CACHE_READ_INPUT_TOKENS|CACHE_CREATION_INPUT_TOKENS" \
  --type py \
  packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/

Repository: traceloop/openllmetry

Length of output: 1227


🏁 Script executed:

# Check the set_span_attribute function implementation
echo "=== Checking set_span_attribute function in utils.py ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py | head -40

Repository: traceloop/openllmetry

Length of output: 1450


🏁 Script executed:

# Search for _set_token_usage and _aset_token_usage functions in __init__.py
echo "=== Searching for token usage functions in __init__.py ==="
rg -n "_set_token_usage|_aset_token_usage" \
  --type py \
  packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py \
  -A 15

Repository: traceloop/openllmetry

Length of output: 2829


🏁 Script executed:

# Check the actual cache token attribute lines in __init__.py more closely
echo "=== Checking cache token attributes in _aset_token_usage (around line 286) ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py | sed -n '280,295p'

echo ""
echo "=== Checking cache token attributes in _set_token_usage (around line 400) ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py | sed -n '395,410p'

Repository: traceloop/openllmetry

Length of output: 1455


🏁 Script executed:

# Verify the SpanAttributes constants used in both files
echo "=== SpanAttributes imports and usage in streaming.py ==="
head -50 packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/streaming.py | grep -i "import\|SpanAttributes"

echo ""
echo "=== SpanAttributes imports in __init__.py ==="
head -50 packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py | grep -i "import\|SpanAttributes"

Repository: traceloop/openllmetry

Length of output: 1867


🏁 Script executed:

# Let's verify the SpanAttributes constants to understand the difference
echo "=== Checking what SpanAttributes constants are available ==="
rg "LLM_USAGE_CACHE|GEN_AI_USAGE_CACHE" --type py

Repository: traceloop/openllmetry

Length of output: 2886


🏁 Script executed:

# Check if these are custom constants defined in the package
echo "=== Looking for SpanAttributes definition/import ==="
rg -n "class SpanAttributes|from.*SpanAttributes" --type py packages/opentelemetry-instrumentation-anthropic/ -B 2 -A 2

Repository: traceloop/openllmetry

Length of output: 2561


🏁 Script executed:

# Quick check to confirm cache_read_tokens and cache_creation_tokens extraction in __init__.py
echo "=== Checking cache token extraction in __init__.py _aset_token_usage ==="
cat -n packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py | sed -n '200,230p'

Repository: traceloop/openllmetry

Length of output: 1619


Cache token attributes correctly restored, but SpanAttributes constants are inconsistently named across files.

The new set_span_attribute calls for cache tokens in streaming.py (lines 97-102) are correct and consistent with the existing token usage pattern. Cache tokens default to 0 (lines 82, 85), so these attributes will always be set, matching the behavior of LLM_USAGE_TOTAL_TOKENS on line 95.
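The always-set behavior noted above follows from the guard shape commonly used for such helpers. A sketch of that guard (not necessarily the exact body of utils.py's `set_span_attribute`): a value of 0 passes the check and is recorded, while `None` and empty strings are skipped.

```python
def set_span_attribute(span, name, value):
    # Skip unset values so spans don't carry empty attributes.
    # Note that 0 is a valid value and IS recorded, which is why
    # cache token counts that default to 0 always appear on the span.
    if value is not None and value != "":
        span.set_attribute(name, value)
```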

The non-streaming path in __init__.py also sets cache token attributes in both _aset_token_usage() (lines 286, 290) and _set_token_usage() (lines 400, 404), as expected.

However, there's an inconsistency: streaming.py uses SpanAttributes.LLM_USAGE_CACHE_* constants while __init__.py uses SpanAttributes.GEN_AI_USAGE_CACHE_* constants. Both resolve to the same attribute names (e.g., "gen_ai.usage.cache_read_input_tokens"), so they function identically, but the naming should be consistent across the codebase. Use the same constant names in both files.
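One way to keep both spellings working while converging on a single canonical name is to alias the constants. This is a sketch of the pattern, not the actual semconv-ai source; the attribute strings are the ones this review confirms both constants resolve to:

```python
class SpanAttributes:
    # Canonical names following the gen_ai.* attribute convention
    GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS = "gen_ai.usage.cache_read_input_tokens"
    GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS = "gen_ai.usage.cache_creation_input_tokens"
    # Legacy aliases: identical attribute strings, so instrumentation
    # using either constant emits identical span attributes
    LLM_USAGE_CACHE_READ_INPUT_TOKENS = GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS
    LLM_USAGE_CACHE_CREATION_INPUT_TOKENS = GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS
```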


@galkleinman galkleinman merged commit 03d49ae into traceloop:main Feb 22, 2026
9 of 10 checks passed

Development

Successfully merging this pull request may close these issues.

🐛 Bug Report: cache read and cache creation input tokens aren't recorded on span attributes
