fix(anthropic): support with_raw_response wrapper for span generation#3250
fix(anthropic): support with_raw_response wrapper for span generation#3250
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. Note Currently processing new changes in this PR. This may take a few minutes, please wait... 📒 Files selected for processing (4)
WalkthroughCentralizes response normalization with a new _extract_response_data used across instrumentation, replaces ad-hoc dict/dict handling for token and span attribute extraction, expands WRAPPED_METHODS to cover Bedrock/beta/with_raw_response surfaces, adds Bedrock async tests and VCR cassettes, and updates test dependency to Changes
Sequence Diagram(s)sequenceDiagram
participant App as Application
participant Client as Anthropic/Bedrock Client
participant Instr as OTel Anthropic instrumentation
participant Utils as _extract_response_data
participant Span as OTel Span
App->>Client: messages.create / messages.with_raw_response.create(...)
Client-->>Instr: response (dict / wrapper / object)
Instr->>Utils: _extract_response_data(response)
Utils-->>Instr: normalized dict-like response
Instr->>Span: set_response_attributes, _set_span_completions, _set_token_usage
Span-->>App: recorded span with attributes and usage
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Caution
Changes requested ❌
Reviewed everything up to 1a67ed3 in 1 minute and 54 seconds. Click for details.
- Reviewed
397lines of code in7files - Skipped
0files when reviewing. - Skipped posting
6draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py:133
- Draft comment:
Refactor: Replacing manual conversion with _extract_response_data enhances consistency when handling wrapped responses. Consider checking for cases when an empty dict is returned. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The code already has proper null checks after calling _extract_response_data. The function is used to extract data in a consistent way, and the code is already defensive about handling empty/missing data. The comment seems to be suggesting additional checks that aren't necessary given the existing null handling. I could be wrong about the implementation of _extract_response_data - without seeing that code, I can't be 100% certain about its behavior. Even without seeing _extract_response_data, the calling code clearly handles None/empty cases properly through .get() calls and null checks. Additional empty dict checks would be redundant. The comment should be removed as it suggests unnecessary additional checks when the code already handles empty/null cases appropriately.
2. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py:174
- Draft comment:
Nice update: Using _extract_response_data in _set_span_completions and set_response_attributes ensures uniform extraction of response data, including wrapped responses. - Reason this comment was not posted:
Confidence changes required:33%<= threshold50%None
3. packages/opentelemetry-instrumentation-anthropic/pyproject.toml:39
- Draft comment:
Good improvement: Specifying the 'bedrock' extras for the anthropic dependency ensures tests run against AsyncAnthropicBedrock. - Reason this comment was not posted:
Confidence changes required:33%<= threshold50%None
4. packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_regular_create.yaml:1
- Draft comment:
Cassette test added for regular create calls. This provides good baseline coverage for span attributes. - Reason this comment was not posted:
Confidence changes required:33%<= threshold50%None
5. packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_with_raw_response.yaml:1
- Draft comment:
Cassette test for with_raw_response calls correctly includes the x-stainless-raw-response header. This helps verify the span generation for wrapped responses. - Reason this comment was not posted:
Confidence changes required:33%<= threshold50%None
6. packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py:30
- Draft comment:
Comprehensive tests for both with_raw_response and regular create methods validate the span attributes and token usage correctly. Test coverage appears robust. - Reason this comment was not posted:
Confidence changes required:33%<= threshold50%None
Workflow ID: wflow_3DZZR7zwpoemcbIx
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
| parsed_response = parsed_response.__dict__ | ||
| return parsed_response | ||
| except Exception: | ||
| pass |
There was a problem hiding this comment.
The _extract_response_data helper centralizes response extraction. Consider logging the exception inside the try/except block (instead of silently passing) to aid debugging potential parsing issues.
| pass | |
| logging.debug("Failed to parse response in _extract_response_data", exc_info=True) |
There was a problem hiding this comment.
Skipped PR review on 9bd6080 because no changed files had a supported extension. If you think this was in error, please contact us and we'll fix it right away.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py (1)
64-85: Helper function correctly handles response normalization.The
_extract_response_datafunction provides robust extraction logic to handle both regular responses andwith_raw_responsewrapped responses. The fallback chain (dict → parse() → dict → empty dict) is well-designed and exception-safe.Previous review correctly identified that logging the exception would aid debugging. Consider adding debug logging in the exception handler at line 78.
🧹 Nitpick comments (4)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_with_raw_response.yaml (1)
22-45: Filter volatile and identifying headers in cassettes to reduce churn and avoid leaking environment details.Headers like x-amz-date, user-agent, x-stainless-* and runtime versions are not needed for matching and change frequently, causing noisy cassette diffs and potential environment leakage.
Add a test-wide VCR configuration (e.g., in tests/conftest.py) to filter and normalize headers:
# tests/conftest.py import re import pytest SENSITIVE_HEADERS = [ "authorization", "x-amz-date", "x-amzn-trace-id", "x-stainless-arch", "x-stainless-lang", "x-stainless-os", "x-stainless-package-version", "x-stainless-raw-response", "x-stainless-read-timeout", "x-stainless-retry-count", "x-stainless-runtime", "x-stainless-runtime-version", "x-stainless-timeout", "user-agent", "content-length", ] @pytest.fixture(scope="session") def vcr_config(): return { "filter_headers": [(h, "<REDACTED>") for h in SENSITIVE_HEADERS], # Ensure request matching does not depend on transient headers "match_on": ["method", "scheme", "host", "port", "path", "query", "body"], "decode_compressed_response": True, }This keeps playback deterministic and private.
packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py (3)
16-20: Prefer also honoring AWS_DEFAULT_REGION for better local/dev parity.Some environments set AWS_DEFAULT_REGION instead of AWS_REGION. Fall back to either before defaulting.
Apply this diff:
- aws_region = os.environ.get("AWS_REGION", "us-east-1") + aws_region = os.environ.get("AWS_REGION") or os.environ.get("AWS_DEFAULT_REGION") or "us-east-1"
46-48: Minor assertion simplification.After asserting len(spans) == 1, checking all(...) is redundant. Access the single span directly.
Apply this diff:
- assert len(spans) == 1 - assert all(span.name == "anthropic.chat" for span in spans) + assert len(spans) == 1 + assert spans[0].name == "anthropic.chat"
91-94: Minor assertion simplification (same as above).Mirror the simplification to avoid redundant all(...).
Apply this diff:
- assert len(spans) == 1 - assert all(span.name == "anthropic.chat" for span in spans) + assert len(spans) == 1 + assert spans[0].name == "anthropic.chat"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
packages/opentelemetry-instrumentation-anthropic/poetry.lockis excluded by!**/*.lock
📒 Files selected for processing (7)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py(2 hunks)packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py(3 hunks)packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py(1 hunks)packages/opentelemetry-instrumentation-anthropic/pyproject.toml(1 hunks)packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_regular_create.yaml(1 hunks)packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_with_raw_response.yaml(1 hunks)packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py (1)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py (1)
_extract_response_data(64-84)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py (1)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py (1)
_extract_response_data(64-84)
packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py (3)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1)
SpanAttributes(64-257)packages/opentelemetry-instrumentation-milvus/tests/conftest.py (1)
reader(37-41)packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (1)
get_finished_spans(40-43)
🪛 Gitleaks (8.27.2)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_regular_create.yaml
11-11: Identified a pattern that may indicate AWS credentials, risking unauthorized cloud resource access and data breaches on AWS platforms.
(aws-access-token)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_with_raw_response.yaml
11-11: Identified a pattern that may indicate AWS credentials, risking unauthorized cloud resource access and data breaches on AWS platforms.
(aws-access-token)
🪛 Checkov (3.2.334)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_regular_create.yaml
[HIGH] 11-12: AWS Access Key
(CKV_SECRET_2)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_with_raw_response.yaml
[HIGH] 11-12: AWS Access Key
(CKV_SECRET_2)
🔇 Additional comments (11)
packages/opentelemetry-instrumentation-anthropic/pyproject.toml (1)
39-39: LGTM! Bedrock extras added for comprehensive testing.This change appropriately includes the bedrock extras in the anthropic dependency to enable testing of AsyncAnthropicBedrock functionality, which is directly needed for the new tests in this PR.
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py (1)
87-97: Response normalization properly applied.The
shared_metrics_attributesfunction now correctly uses the centralized_extract_response_datahelper to normalize the response before extracting attributes. This ensures consistent behavior across both regular and wrapped responses.packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py (2)
133-134: Centralized response normalization applied correctly.The async token usage function now properly uses
_extract_response_datato normalize responses before accessing usage attributes, replacing the previous inline dict detection logic. This ensures consistent handling of both regular andwith_raw_responsewrapped responses.
226-227: Consistent normalization in sync token usage function.The sync version correctly mirrors the async implementation by using
_extract_response_datafor response normalization. This maintains consistency across both sync and async code paths.packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_regular_create.yaml (2)
48-51: Test response data looks appropriate.The cassette contains a realistic Anthropic Claude response with proper token usage information (17 input tokens, 50 output tokens) and a valid joke response about OpenTelemetry. This data will effectively test the response parsing logic.
11-13: Sensitive AWS credentials detected in VCR cassettes
The following cassette files each contain a hard-coded AWS4-HMAC-SHA256 Authorization header with an AKIA… key:
- packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_with_raw_response.yaml
- packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_regular_create.yaml
Please confirm that these are disposable, non-production credentials. If they’re real, scrub or mask them (e.g. using VCR’s filter_sensitive_data) before merging.
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py (3)
11-11: Import correctly added for response normalization.The import of
_extract_response_datais properly added to support the centralized response extraction functionality.
174-174: Response normalization applied in completion processing.The
_set_span_completionsfunction now correctly normalizes the response before processing completions. This ensures that both regular andwith_raw_responsewrapped responses are handled consistently when extracting completion data.
241-241: Consistent normalization in response attributes.The
set_response_attributesfunction properly applies response normalization, replacing the previous manual dict fallback logic with the centralized helper. This ensures consistent attribute extraction across all response types.packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py (2)
28-71: LGTM: Good coverage for the with_raw_response path, including token usage and content extraction.The assertions validate prompts, completions, and usage metrics. Using response.parse() when available mirrors the LegacyAPIResponse flow under test.
74-115: LGTM: Regular create flow assertions mirror the raw-response test and exercise the normalized path.Consistent validation of prompts, completions, and token usage ensures both paths are covered.
| authorization: | ||
| - AWS4-HMAC-SHA256 Credential=AKIAQEMAC2MSQDTITCKK/20250812/us-east-1/bedrock/aws4_request, | ||
| SignedHeaders=accept;accept-encoding;content-length;content-type;host;x-amz-date;x-stainless-arch;x-stainless-lang;x-stainless-os;x-stainless-package-version;x-stainless-raw-response;x-stainless-read-timeout;x-stainless-retry-count;x-stainless-runtime;x-stainless-runtime-version;x-stainless-timeout, | ||
| Signature=163e22b87236d5029439f088af26ed28cfa293f95a22e4cdba8789e58e8709c2 | ||
| connection: |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Redact the AWS Authorization header from the cassette and rotate credentials if these are real.
Static analysis flagged this as an AWS credential pattern (Gitleaks: aws-access-token; Checkov: CKV_SECRET_2). Even in recorded fixtures, committing an AWS4-HMAC-SHA256 Authorization header with an AKIA key ID and Signature is a secret leakage risk and makes replays brittle. Redact it and configure VCR to filter it going forward.
Apply this diff to sanitize the cassette now:
authorization:
- - AWS4-HMAC-SHA256 Credential=AKIAQEMAC2MSQDTITCKK/20250812/us-east-1/bedrock/aws4_request,
- SignedHeaders=accept;accept-encoding;content-length;content-type;host;x-amz-date;x-stainless-arch;x-stainless-lang;x-stainless-os;x-stainless-package-version;x-stainless-raw-response;x-stainless-read-timeout;x-stainless-retry-count;x-stainless-runtime;x-stainless-runtime-version;x-stainless-timeout,
- Signature=163e22b87236d5029439f088af26ed28cfa293f95a22e4cdba8789e58e8709c2
+ - "<REDACTED>"To prevent reintroductions, add VCR config to filter these headers during recording (example below) and re-record cassettes. If these credentials were real, rotate the key pair immediately.
🏁 Script executed:
#!/bin/bash
# Find potential AWS credentials or recorded Authorization headers across the repo
rg -n --hidden --no-ignore 'AKIA[0-9A-Z]{16}' || true
rg -n --hidden --no-ignore 'AWS4-HMAC-SHA256' || true
rg -n --hidden --no-ignore 'x-amz-secret|aws_secret_access_key|aws_access_key_id' || trueLength of output: 2676
Redact AWS Authorization header from anthropic cassettes
Static analysis flagged AWS4-HMAC-SHA256 authorization headers in recorded fixtures, leaking credentials and making tests brittle. Remove these headers now and configure VCR to filter them going forward. If these credentials are real, rotate them immediately.
Affected cassettes:
- packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_regular_create.yaml (line 11)
- packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_with_raw_response.yaml (line 11)
Apply this diff in each file:
authorization:
- - AWS4-HMAC-SHA256 Credential=AKIAQEMAC2MSQDTITCKK/20250812/us-east-1/bedrock/aws4_request,
- SignedHeaders=accept;accept-encoding;content-length;content-type;host;x-amz-date;x-stainless-arch;x-stainless-lang;x-stainless-os;x-stainless-package-version;x-stainless-raw-response;x-stainless-read-timeout;x-stainless-retry-count;x-stainless-runtime;x-stainless-runtime-version;x-stainless-timeout,
- Signature=163e22b87236d5029439f088af26ed28cfa293f95a22e4cdba8789e58e8709c2
+ - "<REDACTED>"Add a vcr_config fixture in
packages/opentelemetry-instrumentation-anthropic/tests/conftest.py to filter the authorization header:
import pytest
@pytest.fixture(autouse=True)
def vcr_config():
return {
"filter_headers": ["authorization"],
}Re-record these cassettes after making these changes to prevent reintroducing secrets.
🧰 Tools
🪛 Gitleaks (8.27.2)
11-11: Identified a pattern that may indicate AWS credentials, risking unauthorized cloud resource access and data breaches on AWS platforms.
(aws-access-token)
🪛 Checkov (3.2.334)
[HIGH] 11-12: AWS Access Key
(CKV_SECRET_2)
🤖 Prompt for AI Agents
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_with_raw_response.yaml
lines 10-14: the cassette contains a raw AWS4-HMAC-SHA256 authorization header
that must be removed; delete the authorization header block from this file (and
the same header at the indicated line in
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_regular_create.yaml),
add a vcr_config fixture to
packages/opentelemetry-instrumentation-anthropic/tests/conftest.py that filters
the "authorization" header for all tests, and then re-record the affected
cassettes; if the exposed credentials are real, rotate them immediately.
9bd6080 to
26c4738
Compare
There was a problem hiding this comment.
Important
Looks good to me! 👍
Reviewed d392ee1 in 47 seconds. Click for details.
- Reviewed
35lines of code in1files - Skipped
0files when reviewing. - Skipped posting
3draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py:10
- Draft comment:
Removed extraneous trailing whitespace between the skip and credentials section. This aligns with clean formatting best practices. - Reason this comment was not posted:
Confidence changes required:0%<= threshold50%None
2. packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py:71
- Draft comment:
Removed trailing whitespace from the pytest.mark.asyncio decorator for consistency. - Reason this comment was not posted:
Confidence changes required:0%<= threshold50%None
3. packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py:116
- Draft comment:
Added a newline at the end of file to follow POSIX newline conventions. - Reason this comment was not posted:
Confidence changes required:0%<= threshold50%None
Workflow ID: wflow_AjsHngmj8ZL2c4Lw
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_with_raw_response.yaml (1)
10-13: Critical: Redact AWS Authorization header in cassette and filter going forwardThe cassette includes an AWS4-HMAC-SHA256 Authorization header with an AKIA key ID and signature. This is a secret-leak risk and makes tests brittle. Redact it and configure VCR to filter this header for all recordings. If the credentials are real, rotate them immediately.
Apply this diff to sanitize the cassette now:
authorization: - - AWS4-HMAC-SHA256 Credential=AKIAQEMAC2MSQDTITCKK/20250812/us-east-1/bedrock/aws4_request, - SignedHeaders=accept;accept-encoding;content-length;content-type;host;x-amz-date;x-stainless-arch;x-stainless-lang;x-stainless-os;x-stainless-package-version;x-stainless-raw-response;x-stainless-read-timeout;x-stainless-retry-count;x-stainless-runtime;x-stainless-runtime-version;x-stainless-timeout, - Signature=163e22b87236d5029439f088af26ed28cfa293f95a22e4cdba8789e58e8709c2 + - "<REDACTED>"Additionally, add a VCR config to filter these headers during recording (in packages/opentelemetry-instrumentation-anthropic/tests/conftest.py):
import pytest @pytest.fixture(autouse=True) def vcr_config(): return { "filter_headers": ["authorization"], }Run this script to find any other leaked AWS credentials/headers across the repo:
#!/bin/bash set -euo pipefail echo "Scanning for potential AWS credential leaks..." rg -n --hidden --no-ignore 'AKIA[0-9A-Z]{16}' || true rg -n --hidden --no-ignore 'AWS4-HMAC-SHA256' || true rg -n --hidden --no-ignore 'x-amz-secret|aws_secret_access_key|aws_access_key_id' || truepackages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_regular_create.yaml (1)
10-13: Critical: Redact AWS Authorization header in cassette and filter going forwardSame issue here: the cassette contains an AWS4-HMAC-SHA256 Authorization header with an AKIA key ID and signature. Redact it and filter this header via VCR to avoid future leaks. Rotate keys if real.
Apply this diff:
authorization: - - AWS4-HMAC-SHA256 Credential=AKIAQEMAC2MSQDTITCKK/20250812/us-east-1/bedrock/aws4_request, - SignedHeaders=accept;accept-encoding;content-length;content-type;host;x-amz-date;x-stainless-arch;x-stainless-lang;x-stainless-os;x-stainless-package-version;x-stainless-read-timeout;x-stainless-retry-count;x-stainless-runtime;x-stainless-runtime-version;x-stainless-timeout, - Signature=7eb3fdf5c7fe741fb5066e245bf529e9b385d7667eedfe0a955bdabec62ae1aa + - "<REDACTED>"
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py (1)
12-25: Guard against accidental credential use and improve region fallback
- Avoid depending on real AWS credentials during test runs; replays shouldn’t require them. Also prefer a more universal region fallback.
Apply:
def async_anthropic_bedrock_client(): if AsyncAnthropicBedrock is None: pytest.skip("AsyncAnthropicBedrock not available") - # Try to get credentials from environment first - aws_access_key = os.environ.get("AWS_ACCESS_KEY_ID", "test-key") - aws_secret_key = os.environ.get("AWS_SECRET_ACCESS_KEY", "test-secret") - aws_region = os.environ.get("AWS_REGION", "us-east-1") + # Try to get credentials from environment first (tests rely on VCR replays) + aws_access_key = os.environ.get("AWS_ACCESS_KEY_ID", "dummy") + aws_secret_key = os.environ.get("AWS_SECRET_ACCESS_KEY", "dummy") + aws_region = os.environ.get("AWS_REGION") or os.environ.get("AWS_DEFAULT_REGION", "us-east-1")Also ensure VCR filters the Authorization header globally (see cassette review comments) to prevent recording signatures even when real credentials are present during recording.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
packages/opentelemetry-instrumentation-anthropic/poetry.lockis excluded by!**/*.lock
📒 Files selected for processing (7)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py(2 hunks)packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py(3 hunks)packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py(1 hunks)packages/opentelemetry-instrumentation-anthropic/pyproject.toml(1 hunks)packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_regular_create.yaml(1 hunks)packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_with_raw_response.yaml(1 hunks)packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- packages/opentelemetry-instrumentation-anthropic/pyproject.toml
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/init.py
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py (3)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1)
SpanAttributes(64-257)packages/opentelemetry-instrumentation-milvus/tests/conftest.py (1)
reader(37-41)packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (1)
get_finished_spans(40-43)
🪛 Gitleaks (8.27.2)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_regular_create.yaml
11-11: Identified a pattern that may indicate AWS credentials, risking unauthorized cloud resource access and data breaches on AWS platforms.
(aws-access-token)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_with_raw_response.yaml
11-11: Identified a pattern that may indicate AWS credentials, risking unauthorized cloud resource access and data breaches on AWS platforms.
(aws-access-token)
🪛 Checkov (3.2.334)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_regular_create.yaml
[HIGH] 11-12: AWS Access Key
(CKV_SECRET_2)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_with_raw_response.yaml
[HIGH] 11-12: AWS Access Key
(CKV_SECRET_2)
🔇 Additional comments (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py (1)
65-71: LGTM: usage token assertions validate normalization pathThe checks on prompt/completion/total tokens are solid and directly validate the new response-normalization logic for both raw and regular paths.
Also applies to: 109-115
| spans = span_exporter.get_finished_spans() | ||
| assert len(spans) == 1 | ||
| assert all(span.name == "anthropic.chat" for span in spans) | ||
|
|
||
| anthropic_span = spans[0] | ||
| assert ( |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Make span assertions resilient to additional spans
Asserting len(spans) == 1 is brittle if future instrumentation emits auxiliary spans. Filter by name and assert exactly one anthropic.chat span.
Apply:
- spans = span_exporter.get_finished_spans()
- assert len(spans) == 1
- assert all(span.name == "anthropic.chat" for span in spans)
-
- anthropic_span = spans[0]
+ spans = span_exporter.get_finished_spans()
+ anthropic_spans = [s for s in spans if s.name == "anthropic.chat"]
+ assert len(anthropic_spans) == 1
+ anthropic_span = anthropic_spans[0]Also applies to: 91-96
🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py
around lines 45-50 (and similarly at 91-96), the test currently asserts
len(spans) == 1 which is brittle; change the assertions to filter finished spans
by span.name == "anthropic.chat", assert that the filtered list has exactly one
element, and then set anthropic_span to that single filtered span (e.g.,
filtered_spans[0]); apply the same change to the block at lines 91-96 so all
checks use the filtered list instead of assuming only one span exists.
Fixes issue where AsyncAnthropicBedrock calls using with_raw_response.create() were not generating proper spans with response attributes and token usage. The problem was that with_raw_response wraps responses in a LegacyAPIResponse object that needs to be parsed differently than regular response objects. Changes: - Add _extract_response_data() helper to handle both regular and wrapped responses - Update all response processing functions to use the helper: - set_response_attributes() - _set_span_completions() - shared_metrics_attributes() - _set_token_usage() and _aset_token_usage() - Add bedrock extras to test dependencies for AsyncAnthropicBedrock testing - Add comprehensive tests for both with_raw_response and regular create calls Resolves spans not being generated for AsyncAnthropicBedrock with_raw_response calls. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Update lock file to include bedrock extras added to anthropic dependency for AsyncAnthropicBedrock testing support. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
- Remove trailing whitespace - Fix blank line whitespace - Add newline at end of file 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Add test for AsyncAnthropicBedrock beta.messages.with_raw_response.create() to ensure the fix works for the customer's specific use case. This covers the exact scenario mentioned by the user: self.client.beta.messages.with_raw_response.create(...) 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
d392ee1 to
8d20dd2
Compare
There was a problem hiding this comment.
Important
Looks good to me! 👍
Reviewed 8d20dd2 in 1 minute and 38 seconds. Click for details.
- Reviewed
55lines of code in1files - Skipped
0files when reviewing. - Skipped posting
2draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py:161
- Draft comment:
Add a newline at end of file to meet style best practices. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% While having a newline at the end of files is indeed a common best practice, this is something that should be enforced by linters or EditorConfig rather than manual code review comments. It's a minor style issue that doesn't affect functionality. Additionally, many modern editors automatically add this newline, and git will warn about this. The comment is technically correct about the best practice. Missing newlines can cause issues with some tools and make diffs less clean. While true, this is exactly the kind of minor, obvious issue that should be handled by automation rather than taking up reviewer and author time. This comment should be deleted as it's too minor and obvious to warrant manual review attention, and should be handled by automation instead.
2. packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py:55
- Draft comment:
Consider extracting the repeated raw response parsing logic into a helper function to reduce duplication. - Reason this comment was not posted:
Confidence changes required:33%<= threshold50%None
Workflow ID: wflow_utINxrBk8m4KTUUw
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
…upport Add instrumentation for with_raw_response wrapper methods across all Anthropic API endpoints: - Regular SDK messages with_raw_response methods - Beta API with_raw_response methods for regular SDK - Beta API with_raw_response methods for Bedrock SDK Improve response data extraction to properly handle LegacyAPIResponse wrappers from with_raw_response calls by implementing parse() method support. This addresses the issue where spans were not generated for AsyncAnthropicBedrock beta.messages.with_raw_response.create calls, which is critical for a major client. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
Important
Looks good to me! 👍
Reviewed 5230ec5 in 2 minutes and 28 seconds. Click for details.
- Reviewed
220lines of code in3files - Skipped
0files when reviewing. - Skipped posting
6draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py:81
- Draft comment:
New instrumentation entries for with_raw_response (both regular and async, beta API, and Bedrock SDK) have been added. Ensure these mappings match the current SDK methods. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%<= threshold50%This comment is asking the PR author to ensure that the mappings match the current SDK methods, which is a form of asking for confirmation or verification. This violates the rule against asking the author to confirm or ensure behavior.
2. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py:710
- Draft comment:
The exception handling for wrapping methods now catches all Exceptions (with debug logging). This is acceptable for instrumentation but consider if more context or a higher log level might be useful in production. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py:70
- Draft comment:
In _extract_response_data, the added debug logging in the exception block is useful for diagnosing parse failures. Verify that the fallback to dict is appropriate if parse() fails. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py:83
- Draft comment:
The fallback block assigns response.dict to a temporary variable before returning it. This is redundant; consider returning response.dict directly for simplicity. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% While the comment is technically correct that the code could be simpler, this is an extremely minor style issue. The temporary variable might have been added for debugging purposes or to make the code more readable. The change doesn't impact functionality or maintainability in any significant way. According to the rules, we should not make purely informative comments or comments about obvious/unimportant issues. The temporary variable might actually serve a purpose - it could make debugging easier or make the code more readable for some developers. Am I being too quick to dismiss potential benefits? Even if the temporary variable has some minor benefits, the comment still falls into the category of being too minor and not clearly requiring a code change. The current code works perfectly fine either way. The comment should be deleted as it addresses an extremely minor style issue that doesn't clearly require a change, violating the rule about not making obvious or unimportant comments.
5. packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_beta_with_raw_response.yaml:51
- Draft comment:
Typo: The phrase "here''s" appears to use a double apostrophe. Consider changing it to "here's". - Reason this comment was not posted:
Comment was on unchanged code.
6. packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_beta_with_raw_response.yaml:57
- Draft comment:
Typo: The phrase "you''d" seems to have an extra apostrophe. Consider changing it to "you'd". - Reason this comment was not posted:
Comment looked like it was already resolved.
Workflow ID: wflow_inkoWhzGb7mtiUGX
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
There was a problem hiding this comment.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py (1)
825-826: Inconsistent exception handling between WRAPPED_METHODS and WRAPPED_AMETHODS.The async methods wrapping still only catches
ModuleNotFoundErrorwhile the sync methods now catch broaderExceptiontypes. This inconsistency could lead to different failure behaviors.Apply this diff to make the exception handling consistent:
- except ModuleNotFoundError: + except Exception as e: + logger.debug(f"Failed to wrap {wrap_package}.{wrap_object}.{wrap_method}: {e}") pass # that's ok, we don't want to fail if some methods do not exist
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_beta_with_raw_response.yaml (1)
22-33: Optional: filter volatile headers to reduce cassette churn.User-Agent, runtime/version, and stainless headers tend to change across environments/versions and can cause unnecessary cassette diffs. Consider filtering them out.
Suggested addition to the vcr_config filter_headers (on top of Authorization):
- user-agent
- x-stainless-arch
- x-stainless-lang
- x-stainless-os
- x-stainless-package-version
- x-stainless-runtime
- x-stainless-runtime-version
- x-stainless-timeout
- x-stainless-read-timeout
- x-stainless-retry-count
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py(4 hunks)packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py(1 hunks)packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_beta_with_raw_response.yaml(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py (1)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py (1)
_extract_response_data(64-87)
🪛 Gitleaks (8.27.2)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_beta_with_raw_response.yaml
11-11: Identified a pattern that may indicate AWS credentials, risking unauthorized cloud resource access and data breaches on AWS platforms.
(aws-access-token)
🪛 Checkov (3.2.334)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_beta_with_raw_response.yaml
[HIGH] 11-12: AWS Access Key
(CKV_SECRET_2)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Lint
- GitHub Check: Build Packages (3.11)
🔇 Additional comments (7)
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_beta_with_raw_response.yaml (3)
34-35: Good: Explicitly exercising with_raw_response path.x-stainless-raw-response: 'true' confirms this cassette covers the wrapped-response scenario the PR fixes.
50-59: Good coverage: usage in body and token counts in headers present.Both response.usage and X-Amzn-Bedrock-*-Token-Count headers are recorded, allowing tests to validate extraction from either source as updated by _extract_response_data().
Also applies to: 69-74
1-81: Clean AWS Credentials Verification PassedNo occurrences of AWS Access Key IDs (
AKIA…), SigV4 Authorization headers (AWS4-HMAC-SHA256 Credential=), or generic Bearer/Basic tokens were found after scanning the entire repository. The codebase is free of leaked AWS identifiers.packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py (4)
84-172: LGTM! Comprehensive expansion of instrumented API surface.The addition of all the new wrapped methods properly covers the gaps for with_raw_response calls and beta API methods across both regular Anthropic SDK and Bedrock SDK. This addresses the core issue where AsyncAnthropicBedrock calls using with_raw_response.create() were not generating proper spans.
222-223: Centralized response normalization correctly implemented.The import and usage of
_extract_response_dataproperly handles both regular responses and wrapped LegacyAPIResponse objects by normalizing them to dictionary format before token usage extraction.
315-316: Consistent response normalization in sync version.The same centralized approach is correctly applied to the synchronous token usage function, ensuring consistent handling across both async and sync code paths.
802-805: Improved error handling and logging for wrapped methods.The enhanced exception handling now catches broader Exception types (not just ModuleNotFoundError) and adds debug logging for both successful and failed method wrapping attempts. This will help with troubleshooting instrumentation issues while gracefully continuing when methods don't exist.
| authorization: | ||
| - AWS4-HMAC-SHA256 Credential=AKIAQEMAC2MSQDTITCKK/20250812/us-east-1/bedrock/aws4_request, | ||
| SignedHeaders=accept;accept-encoding;content-length;content-type;host;x-amz-date;x-stainless-arch;x-stainless-lang;x-stainless-os;x-stainless-package-version;x-stainless-raw-response;x-stainless-read-timeout;x-stainless-retry-count;x-stainless-runtime;x-stainless-runtime-version;x-stainless-timeout, | ||
| Signature=c2c5f7bbfea072d8914f9500ba1d01f219242c57037156147fb496a489d6c514 |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Critical: Authorization header leaks AWS Access Key ID (blocker).
Static analysis (Gitleaks/Checkov) correctly flags this. Even in cassettes, persisting an AKIA Access Key ID and SigV4 signature is a secret-scanning violation. Redact it and configure VCR to filter this header to prevent future leaks and flaky matches.
Apply this diff to redact the header in the cassette:
- authorization:
- - AWS4-HMAC-SHA256 Credential=AKIAQEMAC2MSQDTITCKK/20250812/us-east-1/bedrock/aws4_request,
- SignedHeaders=accept;accept-encoding;content-length;content-type;host;x-amz-date;x-stainless-arch;x-stainless-lang;x-stainless-os;x-stainless-package-version;x-stainless-raw-response;x-stainless-read-timeout;x-stainless-retry-count;x-stainless-runtime;x-stainless-runtime-version;x-stainless-timeout,
- Signature=c2c5f7bbfea072d8914f9500ba1d01f219242c57037156147fb496a489d6c514
+ authorization:
+ - "<REDACTED>"Additionally, add a VCR/pytest-recording filter to strip these headers at record/playback time to avoid reintroducing the leak:
Python (e.g., tests/conftest.py):
import pytest
@pytest.fixture(scope="session")
def vcr_config():
return {
"filter_headers": [
"authorization",
"x-amz-date",
"x-amzn-requestid",
# optionally reduce cassette churn:
"user-agent",
]
}🧰 Tools
🪛 Gitleaks (8.27.2)
11-11: Identified a pattern that may indicate AWS credentials, risking unauthorized cloud resource access and data breaches on AWS platforms.
(aws-access-token)
🪛 Checkov (3.2.334)
[HIGH] 11-12: AWS Access Key
(CKV_SECRET_2)
🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-anthropic/tests/cassettes/test_bedrock_with_raw_response/test_async_anthropic_bedrock_beta_with_raw_response.yaml
around lines 10-13, the authorization header contains a real AWS Access Key ID
and SigV4 signature which must be removed; replace the header value with a
neutral placeholder like "<REDACTED_AUTHORIZATION>" (or remove the header entry
entirely) in the cassette file to eliminate the leaked secret, and then add a
VCR/pytest-recording config (e.g., in tests/conftest.py) that filters
"authorization", "x-amz-date", "x-amzn-requestid" (and optionally "user-agent")
during record/playback so future recordings will strip those headers
automatically.
…or $1B customer Key fixes implemented: 1. **Moved beta async methods to WRAPPED_AMETHODS**: Beta API async methods were incorrectly in WRAPPED_METHODS instead of WRAPPED_AMETHODS, preventing async wrapper execution 2. **Created async response processing functions**: Added _aextract_response_data, aset_response_attributes, _aset_span_completions, ashared_metrics_attributes, and _ahandle_response to properly handle async response extraction 3. **Fixed coroutine handling**: Async response processing can now properly await coroutines instead of failing with "Cannot extract data from unawaited coroutine" 4. **Comprehensive with_raw_response support**: Added instrumentation for all with_raw_response endpoints across regular SDK and Bedrock SDK This resolves the critical issue where AsyncAnthropicBedrock beta.messages.with_raw_response.create was not generating spans with proper response content and token usage metrics. Tests now pass: ✅ Regular bedrock with_raw_response ✅ Regular bedrock create ✅ Beta API bedrock with_raw_response (CRITICAL for $1B customer) 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
Important
Looks good to me! 👍
Reviewed 0c4b780 in 2 minutes and 3 seconds. Click for details.
- Reviewed
337lines of code in4files - Skipped
0files when reviewing. - Skipped posting
8draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py:55
- Draft comment:
The reordering/removal and re-addition of beta API async wrappers for regular and Bedrock SDK looks intentional. Please double-check that no duplicate or conflicting wrappers remain. - Reason this comment was not posted:
Confidence changes required:50%<= threshold50%None
2. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py:219
- Draft comment:
Switching to the async helper _aextract_response_data and awaiting its result ensures consistency. Verify all call sites are updated accordingly. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py:822
- Draft comment:
Catching a broad Exception here may mask unexpected errors. Consider narrowing the exception or logging more details for debugging. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py:695
- Draft comment:
The async _awrap now awaits ashared_metrics_attributes and _ahandle_response. This ensures correct async handling—good update. - Reason this comment was not posted:
Confidence changes required:0%<= threshold50%None
5. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py:64
- Draft comment:
In _aextract_response_data, if response.parse() might be async in future it would be good to check and await it. Currently it handles only synchronous parse(). - Reason this comment was not posted:
Confidence changes required:50%<= threshold50%None
6. packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py:56
- Draft comment:
Tests access raw response content correctly by checking for the parse() method. This test structure adequately verifies the span generation for with_raw_response. - Reason this comment was not posted:
Confidence changes required:0%<= threshold50%None
7. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/span_utils.py:189
- Draft comment:
Typo: The comment says "Antrhopic"; should be "Anthropic". - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% While the typo fix is technically correct, it's just a misspelling in a comment that doesn't affect functionality. The comment is about documentation, not code. According to the rules, we should not make purely informative comments or comments that are unimportant. A typo in a comment is very low priority. The typo could be confusing to future readers and it appears in multiple places in the code. It's a simple fix that improves code quality. While it's a valid correction, the typo doesn't impact code functionality and fixing it provides minimal value. The rules specifically say to avoid unimportant comments. Delete this comment as it's too minor and doesn't affect code functionality. Comment typos don't meet the bar for required code changes.
8. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py:64
- Draft comment:
Typo: The function name_aextract_response_dataappears to be missing the 'sync' part. Consider renaming it to_async_extract_response_datafor better clarity and consistency with the asynchronous nature of the function. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The codebase consistently uses the 'a' prefix for async versions of functions. This is seen in multiple places like ashared_metrics_attributes and acount_prompt_tokens_from_request. While _async_extract_response_data might be more verbose, changing the existing pattern would reduce consistency. The current name follows the established convention. The comment has a point that _async_extract_response_data would be more immediately clear to readers unfamiliar with the codebase's conventions. However, consistency within a codebase is more important than slightly improved clarity for new readers. The 'a' prefix is a clear pattern here that developers will quickly learn. The comment should be deleted as it suggests breaking an established naming convention in the codebase. The current name is consistent with the project's patterns.
Workflow ID: wflow_e0CyyXAYhGNzO7eC
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
…se and beta API - Add beta API method paths to WRAPPED_AMETHODS for AsyncAnthropicBedrock - Create async response processing functions to handle coroutines properly - Add _aextract_response_data() to handle both coroutines and LegacyAPIResponse objects - Create comprehensive test coverage for with_raw_response and beta API endpoints - Fix issue where spans were not generated for $1B customer use case 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
Important
Looks good to me! 👍
Reviewed cc5f94b in 1 minute and 16 seconds. Click for details.
- Reviewed
64lines of code in1files - Skipped
0files when reviewing. - Skipped posting
3draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py:81
- Draft comment:
Explicit instrumentation for with_raw_response methods has been removed from the WRAPPED_METHODS list. Ensure that the unified response processing (via _extract_response_data) now correctly handles these raw responses and that backward compatibility is maintained. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%<= threshold50%The comment is asking the author to ensure that the unified response processing now correctly handles raw responses and maintains backward compatibility. This is a request for confirmation and testing, which violates the rules.
2. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py:107
- Draft comment:
Removal of with_raw_response wrappers for Beta API (regular Anthropic SDK) appears intentional. Please confirm tests cover these flows using the new unified handling. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%<= threshold50%The comment asks the PR author to confirm that tests cover the new flows after removing certain wrappers. This violates the rule against asking the author to ensure that changes are tested.
3. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py:133
- Draft comment:
Removal of with_raw_response wrappers for the Beta API (Bedrock SDK) methods is noted. Verify that instrumentation now correctly processes these responses. - Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_9LoULUcRIYO3MsDq
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
- Remove unused exception variable - Fix whitespace in blank lines - Add newline at end of test file 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
There was a problem hiding this comment.
Important
Looks good to me! 👍
Reviewed 59a8423 in 54 seconds. Click for details.
- Reviewed
63lines of code in3files - Skipped
0files when reviewing. - Skipped posting
3draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py:797
- Draft comment:
Removed the unused exception variable in the except clause. This is fine if you don't need the error detail, but consider logging the error if debugging is needed. - Reason this comment was not posted:
Confidence changes required:0%<= threshold50%None
2. packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/utils.py:64
- Draft comment:
Minor whitespace cleanup in async _aextract_response_data and _extract_response_data functions improves readability. - Reason this comment was not posted:
Confidence changes required:0%<= threshold50%None
3. packages/opentelemetry-instrumentation-anthropic/tests/test_bedrock_with_raw_response.py:158
- Draft comment:
Added newline at end of file ensures proper POSIX compliance and avoids potential warnings. - Reason this comment was not posted:
Confidence changes required:0%<= threshold50%None
Workflow ID: wflow_V1PItxlG0V0nfL4R
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Fixes issue where AsyncAnthropicBedrock calls using with_raw_response.create() were not generating proper spans with response attributes and token usage.
The problem was that with_raw_response wraps responses in a LegacyAPIResponse object that needs to be parsed differently than regular response objects.
Changes:
Resolves spans not being generated for AsyncAnthropicBedrock with_raw_response calls.
🤖 Generated with Claude Code
feat(instrumentation): ...orfix(instrumentation): ....Important
Fixes span generation for
AsyncAnthropicBedrockwithwith_raw_responseby adding response parsing helpers and updating related functions._extract_response_data()and_aextract_response_data()inutils.pyto handle both regular and wrapped responses.set_response_attributes(),_set_span_completions(),shared_metrics_attributes(),_set_token_usage(), and_aset_token_usage()to use the new helper functions.bedrockextras to test dependencies inpyproject.toml.test_bedrock_with_raw_response.pyforAsyncAnthropicBedrockwithwith_raw_responseand regularcreatecalls.tests/cassettes/test_bedrock_with_raw_response/.This description was created by
for 59a8423. You can customize this summary. It will automatically update as commits are pushed.
Summary by CodeRabbit