
fix(openai): improve LiteLLM LegacyAPIResponse handling for streaming#3290

Open
nirga wants to merge 5 commits into main from strands-example

Conversation


@nirga nirga commented Aug 17, 2025

Summary

  • Fixes streaming response handling for LiteLLM's LegacyAPIResponse objects used by libraries like strands
  • Improves architecture by handling LegacyAPIResponse parsing at wrapper creation time instead of during iteration
  • Prevents mutation of original stream objects to maintain integrity across multiple consumers
  • Enhances error handling and type detection robustness

Key Changes

  • Better LegacyAPIResponse handling: Parse at wrapper creation time, not during iteration
  • Stream integrity: No longer mutate original response objects
  • Robust type detection: Use duck typing instead of fragile string matching
  • Enhanced error handling: Proper logging with graceful fallback behavior
  • Comprehensive test coverage: Added tests for stream integrity, multiple consumers, and edge cases

Test Plan

  • All existing OpenAI instrumentation tests pass (167/167)
  • New LegacyAPIResponse tests validate stream integrity
  • Real-world validation with strands library example
  • Error handling scenarios covered
  • Multiple consumer safety verified
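The "multiple consumer safety" item can be illustrated with a minimal mock, modeled on the `MockLegacyAPIResponse` the new tests use (this is a hypothetical sketch, not the real SDK class): as long as `parse()` returns a fresh iterator and never mutates the raw object, two consumers see identical chunks.

```python
class MockLegacyAPIResponse:
    """Hypothetical stand-in for a raw response whose parse() is side-effect free."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def parse(self):
        # A fresh iterator per call: parsing never consumes the raw object.
        return iter(self._chunks)


raw = MockLegacyAPIResponse(["delta-1", "delta-2"])
first_consumer = list(raw.parse())
second_consumer = list(raw.parse())
assert first_consumer == second_consumer == ["delta-1", "delta-2"]
```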

🤖 Generated with Claude Code


Important

Improves LiteLLM's LegacyAPIResponse handling for streaming, ensuring integrity and robust error handling, with comprehensive test coverage.

  • Behavior:
    • Improved handling of LegacyAPIResponse in chat_wrappers.py and completion_wrappers.py by parsing at wrapper creation time.
    • Prevents mutation of original stream objects to maintain integrity across multiple consumers.
    • Enhanced error handling with proper logging and graceful fallback behavior.
  • Functions:
    • Added _is_legacy_api_response() to check for LegacyAPIResponse objects.
    • Updated is_streaming_response() to handle LegacyAPIResponse with streaming.
    • Modified model_as_dict() to return empty dict for streaming LegacyAPIResponse.
  • Classes:
    • Updated ChatStream in chat_wrappers.py to handle LegacyAPIResponse parsing and cleanup.
  • Tests:
    • Added tests in test_legacy_api_response.py for stream integrity, multiple consumers, and error handling.
    • Added VCR cassettes for testing raw-response headers and streaming scenarios.

This description was created by Ellipsis for b6981c8. You can customize this summary. It will automatically update as commits are pushed.


Summary by CodeRabbit

  • New Features

    • Improved handling of legacy OpenAI streaming responses with automatic parsing, ability to swap in parsed streams, and stream lifecycle helpers (parse, close, aclose).
    • Added a multi-agent travel-planning demo and tool integrations (search, weather, booking, coordination, persistence) in the sample app.
  • Bug Fixes

    • Prevent crashes and improve finalization/metrics for legacy streaming responses, including tool-call and finish-reason handling.
  • Tests

    • Added cassettes and tests for raw-response headers, streaming chunks, tool calls, and error cases.
  • Chores

    • Updated Python runtime requirement to 3.11.

- Handle LegacyAPIResponse parsing at wrapper creation time instead of during iteration
- Prevent mutation of original stream objects to maintain integrity
- Add robust type detection using duck typing instead of string matching
- Enhance error handling with proper logging and fallback behavior
- Add comprehensive tests for stream integrity and edge cases

Fixes issues with strands library and other tools using LiteLLM streaming responses.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

coderabbitai Bot commented Aug 17, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds detection and handling for legacy OpenAI streaming responses (objects with parse()), updates streaming detection and model extraction APIs, enhances ChatStream with parse/close/aclose and destructor finalization, adds tests/cassettes for legacy raw responses, bumps the OpenTelemetry OpenAI package Python runtime to 3.11, and adds two sample-app modules for multi-agent travel planning and Traceloop OpenAI tools.

Changes

Cohort / File(s) Summary of edits
Python version bump
packages/opentelemetry-instrumentation-openai/.python-version
Update Python runtime specification from 3.9.5 to 3.11.
Shared utils: streaming detection & dict extraction
packages/.../shared/__init__.py
Add private _is_legacy_api_response(obj); change is_streaming_response(response) → is_streaming_response(response, kwargs=None) to consider legacy v1 wrappers; change model_as_dict(model) → model_as_dict(model, is_streaming=False) and return {} for legacy streaming responses to avoid extraction during streaming. Preserve dict and pydantic handling.
Chat streaming wrappers
packages/.../shared/chat_wrappers.py
Detect LegacyAPIResponse in streaming paths and, when streaming is requested, attempt response.parse() to obtain an actual stream; pass parsed actual_response into ChatStream (marking parsed state); call model_as_dict(..., is_streaming) in streaming flows; add ChatStream.parse(), close(), and async aclose(); extend __del__ to propagate accumulated completion data and finalize spans.
Completion wrappers branching
packages/.../shared/completion_wrappers.py
Update calls to is_streaming_response(response, kwargs) in sync/async completion wrappers so legacy streaming signals (via kwargs) are honored; streaming builders unchanged.
Tests: legacy response cassettes
packages/.../tests/traces/cassettes/test_legacy_api_response/*.yaml
Add three VCR cassettes exercising legacy/raw streaming and header capture: raw error header completion attributes, raw streaming completion SSE chunks, and raw streaming tool_calls completion attributes.
Tests: legacy response behavior
packages/.../tests/traces/test_legacy_api_response.py
Add tests with MockLegacyAPIResponse(parse()), verify _is_legacy_api_response/is_streaming_response behavior, ChatStream.parse() wiring and multi-consumer iteration, model_as_dict streaming vs non-streaming behavior, error surfacing, and span attribute assertions for legacy streaming scenarios.
Sample app: multi-agent example
packages/sample-app/sample_app/strands_example.py
New multi-agent travel planning example: global shared_memory, memory helpers (add_to_memory, get_memory_context, save_trip_plan), research/planning/booking tools, web search using Tavily (env-gated), agent creation (create_agents) and demo flow (demo_multi_agent_conversation).
Sample app: Traceloop OpenAI tools
packages/sample-app/sample_app/traceloop_openai_tools.py
New Traceloop-integrated tools and agent: agent_memory, tool wrappers (search_tavily, get_weather, save_travel_plan), execute_function_call, tool definitions & registry, and travel_planning_agent / plan_travel workflows driving function-calling with OpenAI.
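The model_as_dict(model, is_streaming=False) behavior summarized above can be sketched as follows. This is a simplified illustration with a hypothetical RawDemo wrapper, not the package's actual implementation, which handles more cases:

```python
def model_as_dict(model, is_streaming=False):
    """Simplified sketch of the updated extraction helper."""
    if isinstance(model, dict):
        return model
    if hasattr(model, "parse") and callable(model.parse):
        if is_streaming:
            # Don't consume a stream in flight; attributes come later from chunks.
            return {}
        return model_as_dict(model.parse(), is_streaming=is_streaming)
    if hasattr(model, "model_dump"):  # pydantic v2-style models
        return model.model_dump()
    return vars(model)  # last-resort fallback for plain objects


class RawDemo:
    """Hypothetical raw-response wrapper for this demo."""

    def __init__(self, payload):
        self._payload = payload

    def parse(self):
        return self._payload


print(model_as_dict(RawDemo({"id": "cmpl-1"})))                      # → {'id': 'cmpl-1'}
print(model_as_dict(RawDemo({"id": "cmpl-1"}), is_streaming=True))   # → {}
```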

Sequence Diagram(s)

sequenceDiagram
  participant App
  participant Wrapper
  participant OpenAI
  participant ChatStream
  participant Span

  App->>Wrapper: call chat/completion(**kwargs)
  Wrapper->>OpenAI: POST /v1/chat/completions (stream=?)
  OpenAI-->>Wrapper: response (direct stream or LegacyAPIResponse)
  Wrapper->>Wrapper: is_streaming_response(response, kwargs)
  alt streaming
    alt legacy response + kwargs.stream
      Wrapper->>Wrapper: attempt response.parse() -> actual_response
      Wrapper->>ChatStream: wrap(actual_response or response)
      App->>ChatStream: iterate stream
      ChatStream->>Span: update partial attributes
      ChatStream->>ChatStream: parse() can swap underlying stream
    else direct stream
      Wrapper->>ChatStream: wrap direct stream
      App->>ChatStream: iterate stream
    end
    ChatStream->>Span: finalize on close/aclose/__del__
  else non-streaming
    Wrapper->>Wrapper: model_as_dict(response, is_streaming=False)
    Wrapper->>Span: record attributes
    Wrapper-->>App: return response
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • doronkopit5

Poem

I twitch my whiskers at legacy streams,
I parse old packets and stitch new dreams.
I swap the threads and guard each span,
Close doors gently with a careful plan.
A rabbit who traces — code and beams. 🐇✨


@ellipsis-dev ellipsis-dev Bot left a comment


Caution

Changes requested ❌

Reviewed everything up to 5efc5ca in 2 minutes and 0 seconds. Click for details.
  • Reviewed 907 lines of code in 8 files
  • Skipped 0 files when reviewing.
  • Skipped posting 6 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai/.python-version:1
  • Draft comment:
    Bump to Python 3.11; ensure that all dependencies and CI tests support this version.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is related to a dependency change, specifically a version bump to Python 3.11. The comment asks the author to ensure that all dependencies and CI tests support this version, which is a general request for confirmation and testing. This violates the rule against asking the author to ensure compatibility or testing.
2. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py:331
  • Draft comment:
    Using hasattr(obj, 'parse') for LegacyAPIResponse detection is clever. Consider whether this might catch unintended objects and if additional type checks are needed.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The comment raises a theoretical concern but doesn't provide evidence of actual problems. The double-check of both hasattr and callable is a common Python pattern. The function is private (_is_legacy_api_response) suggesting it's used in a controlled context. The code handles any false positives gracefully by just returning empty dict in model_as_dict() if needed. The comment raises a valid point about type safety. Duck typing can sometimes lead to unexpected behavior if other objects happen to have similarly named methods. While duck typing can be risky, this is a private helper function used in a controlled way, and the code handles potential issues gracefully. The comment doesn't suggest a concrete better alternative. The comment should be deleted as it raises a theoretical concern without providing actionable feedback or evidence of actual problems.
3. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py:359
  • Draft comment:
    For streaming LegacyAPIResponse, model_as_dict returns an empty dict; ensure that this behavior is well documented as it might hide data from non-streaming cases.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
4. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py:633
  • Draft comment:
    In ChatStream.parse(), consider using a deep copy for _complete_response if it may contain nested mutable objects; also consider setting a flag indicating that parsing succeeded.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50% The suggestion about deep copy could be valid since _complete_response contains nested dictionaries and lists that would be shared between the old and new stream with a shallow copy. However, looking at _accumulate_stream_items(), the nested objects are always created fresh rather than modified in place, so a shallow copy is actually sufficient. The suggestion about setting a parse success flag seems unnecessary since the code already handles parse failures via try/except. I could be wrong about the nested object modification patterns - there might be places where nested objects are modified in place that I haven't found. And a parse success flag could potentially be useful for debugging. After reviewing _accumulate_stream_items() carefully, I'm confident that nested objects are always created fresh rather than modified in place. The parse success flag would add complexity without clear benefit since errors are already handled via exceptions. The comment should be deleted because both suggestions are unnecessary - shallow copy is sufficient given the implementation details, and parse success tracking is already handled appropriately via exceptions.
5. packages/opentelemetry-instrumentation-openai/tests/traces/test_legacy_api_response.py:15
  • Draft comment:
    Comprehensive tests for LegacyAPIResponse handling have been added. Great job verifying detection, parsing, and stream integrity.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, praising the addition of tests for LegacyAPIResponse handling. It doesn't provide any actionable feedback or suggestions for improvement.
6. packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_legacy_api_response/test_raw_response_header_completion_attributes.yaml:48
  • Draft comment:
    It appears there is an extraneous backslash before the "type" key in the JSON error message. Consider removing the extra backslash to improve clarity.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50% The backslash is actually part of the YAML's string continuation and indentation mechanism. It's not an error in the JSON - it's how YAML handles multiline strings with preserved formatting. The comment shows a misunderstanding of YAML syntax. This is a test fixture file, so the exact formatting is likely intentional to match real API responses. Could the backslash actually be causing problems when the YAML is parsed? Maybe it affects the test behavior? No - this is a standard YAML syntax feature for handling multiline strings with preserved whitespace. The tests would fail if this wasn't working correctly. The comment should be deleted because it incorrectly identifies standard YAML syntax as an error.

Workflow ID: wflow_TDKEox2O6RYPzj2G

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

actual_response = response
if _is_legacy_api_response(response) and kwargs.get('stream'):
    try:
        actual_response = response.parse()

In chat_wrapper, the block handling LegacyAPIResponse parsing logs a warning on failure. Consider setting a flag (e.g. _response_was_parsed) to indicate successful parsing so that downstream cleanup can update span attributes accordingly.

@dont_throw
def _record_partial_metrics(self):
    """Record metrics based on available partial data"""
    # Debug logging

Remove or expand the placeholder debug logging comment in _record_partial_metrics if not needed.

Suggested change
# Debug logging


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 8

🔭 Outside diff range comments (2)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (2)

229-254: Async path: set parsed flag and return ChatStream; mirror sync changes.

The async path should set _response_was_parsed on the returned ChatStream when parsing succeeds.

-        if is_streaming_response(response, kwargs):
+        if is_streaming_response(response, kwargs):
             # span will be closed after the generator is done
             if is_openai_v1():
-                # Handle LegacyAPIResponse by parsing it first, without mutating original
-                actual_response = response
-                if _is_legacy_api_response(response) and kwargs.get('stream'):
-                    try:
-                        actual_response = response.parse()
-                    except Exception as e:
-                        logger.warning(f"Failed to parse LegacyAPIResponse: {e}")
-                        # Fall back to original response
-                        actual_response = response
-                
-                return ChatStream(
-                    span,
-                    actual_response,
-                    instance,
-                    token_counter,
-                    choice_counter,
-                    duration_histogram,
-                    streaming_time_to_first_token,
-                    streaming_time_to_generate,
-                    start_time,
-                    kwargs,
-                )
+                # Handle LegacyAPIResponse by parsing it first, without mutating original
+                actual_response = response
+                parsed_successfully = False
+                if _is_legacy_api_response(response) and kwargs.get("stream"):
+                    try:
+                        actual_response = response.parse()
+                        parsed_successfully = True
+                    except Exception as e:
+                        logger.warning("Failed to parse LegacyAPIResponse: %s", e)
+                        # Fall back to original response
+                        actual_response = response
+
+                stream = ChatStream(
+                    span,
+                    actual_response,
+                    instance,
+                    token_counter,
+                    choice_counter,
+                    duration_histogram,
+                    streaming_time_to_first_token,
+                    streaming_time_to_generate,
+                    start_time,
+                    kwargs,
+                )
+                if parsed_successfully:
+                    setattr(stream, "_response_was_parsed", True)
+                return stream

713-716: Ensure cleanup in async context managers and propagate aexit result.

Symmetry with exit: perform cleanup and return the wrapped result to preserve exception suppression semantics.

-    async def __aexit__(self, exc_type, exc_val, exc_tb):
-        await self.__wrapped__.__aexit__(exc_type, exc_val, exc_tb)
+    async def __aexit__(self, exc_type, exc_val, exc_tb):
+        cleanup_exception = None
+        try:
+            self._ensure_cleanup()
+        except Exception as e:
+            cleanup_exception = e
+        result = False
+        if hasattr(self.__wrapped__, "__aexit__"):
+            result = await self.__wrapped__.__aexit__(exc_type, exc_val, exc_tb)
+        if cleanup_exception:
+            logger.debug("Error during ChatStream cleanup in __aexit__: %s", cleanup_exception)
+        return result
♻️ Duplicate comments (3)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (3)

118-143: Mark parsed legacy streams on ChatStream and avoid losing the parse signal.

When LegacyAPIResponse is parsed up front, the resulting ChatStream should know it came from a parsed legacy response so partial cleanup can finalize spans appropriately. Also keep a boolean to avoid ambiguous state when parsing fails.

-        if is_streaming_response(response, kwargs):
+        if is_streaming_response(response, kwargs):
             # span will be closed after the generator is done
             if is_openai_v1():
-                # Handle LegacyAPIResponse by parsing it first, without mutating original
-                actual_response = response
-                if _is_legacy_api_response(response) and kwargs.get('stream'):
-                    try:
-                        actual_response = response.parse()
-                    except Exception as e:
-                        logger.warning(f"Failed to parse LegacyAPIResponse: {e}")
-                        # Fall back to original response
-                        actual_response = response
-                
-                return ChatStream(
-                    span,
-                    actual_response,
-                    instance,
-                    token_counter,
-                    choice_counter,
-                    duration_histogram,
-                    streaming_time_to_first_token,
-                    streaming_time_to_generate,
-                    start_time,
-                    kwargs,
-                )
+                # Handle LegacyAPIResponse by parsing it first, without mutating original
+                actual_response = response
+                parsed_successfully = False
+                if _is_legacy_api_response(response) and kwargs.get("stream"):
+                    try:
+                        actual_response = response.parse()
+                        parsed_successfully = True
+                    except Exception as e:
+                        logger.warning("Failed to parse LegacyAPIResponse: %s", e)
+                        # Fall back to original response
+                        actual_response = response
+
+                stream = ChatStream(
+                    span,
+                    actual_response,
+                    instance,
+                    token_counter,
+                    choice_counter,
+                    duration_histogram,
+                    streaming_time_to_first_token,
+                    streaming_time_to_generate,
+                    start_time,
+                    kwargs,
+                )
+                if parsed_successfully:
+                    setattr(stream, "_response_was_parsed", True)
+                return stream

864-888: Resolve placeholder debug comment; add a useful debug log or remove.

Replace the placeholder with concrete debug output to aid troubleshooting partial metrics flow.

-        # Debug logging
-        
+        logger.debug(
+            "Recording partial metrics: choices=%d, has_usage=%s",
+            len(self._complete_response.get("choices", [])),
+            bool(self._complete_response.get("usage")),
+        )
@@
-            # Also set completion attributes for any accumulated choices
+            # Also set completion attributes for any accumulated choices

633-663: Set _response_was_parsed on the newly created ChatStream.

_record_partial_metrics checks this flag, but nothing sets it. Ensure the new stream reflects successful parsing.

         if hasattr(self.__wrapped__, 'parse'):
             # Parse the response to get the actual stream
             parsed_stream = self.__wrapped__.parse()
@@
             # Transfer any accumulated response data to the new stream
             new_chat_stream._complete_response = self._complete_response.copy()
+            # Indicate that parsing was performed successfully
+            setattr(new_chat_stream, "_response_was_parsed", True)
🧹 Nitpick comments (12)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py (3)

331-334: Use simpler duck-typing check; address Ruff B009.

hasattr + direct access avoids getattr with constant attr and satisfies the duck-typing intent.

-def _is_legacy_api_response(obj):
-    """Check if object is a LiteLLM LegacyAPIResponse"""
-    return hasattr(obj, 'parse') and callable(getattr(obj, 'parse'))
+def _is_legacy_api_response(obj):
+    """Duck-type check for LiteLLM LegacyAPIResponse-like objects."""
+    return hasattr(obj, "parse") and callable(obj.parse)

336-349: Simplify streaming detection and merge isinstance checks (SIM101/SIM102).

Equivalent logic with fewer branches and improved readability.

-def is_streaming_response(response, kwargs=None):
-    if is_openai_v1():
-        # Check if it's directly a stream
-        if isinstance(response, openai.Stream) or isinstance(response, openai.AsyncStream):
-            return True
-        
-        # For LegacyAPIResponse, check if the original request was streaming
-        # Note: LegacyAPIResponse handling is now done at wrapper level
-        if _is_legacy_api_response(response):
-            if kwargs and kwargs.get('stream'):
-                return True
-        
-        return False
-
-    return isinstance(response, types.GeneratorType) or isinstance(
-        response, types.AsyncGeneratorType
-    )
+def is_streaming_response(response, kwargs=None):
+    if is_openai_v1():
+        # Direct OpenAI v1 stream objects
+        if isinstance(response, (openai.Stream, openai.AsyncStream)):
+            return True
+        # LegacyAPIResponse: rely on the original request's stream flag
+        if _is_legacy_api_response(response) and kwargs and kwargs.get("stream"):
+            return True
+        return False
+    return isinstance(response, (types.GeneratorType, types.AsyncGeneratorType))

355-372: Propagate is_streaming in recursive parse() branch

We’ve checked all call sites—streaming contexts (e.g. chat wrappers) already pass is_streaming, and non-streaming usages correctly rely on the default. No further call‐site changes are required.

In packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py, update the parse() handler to pass through is_streaming:

-    elif hasattr(model, "parse"):  # Raw API response
-        return model_as_dict(model.parse())
+    elif hasattr(model, "parse"):  # Raw API response
+        return model_as_dict(model.parse(), is_streaming=is_streaming)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py (1)

253-256: Consider passing is_streaming=False explicitly to document intent.

Not required (default is False), but passing it explicitly can clarify that completion streams aren’t LegacyAPIResponse (chat path handles legacy). Optional.

-        item = model_as_dict(item)
+        item = model_as_dict(item, is_streaming=False)
packages/opentelemetry-instrumentation-openai/tests/traces/test_legacy_api_response.py (5)

52-57: Rename unused loop variable to underscore.

Silences B007 (unused loop variable) and improves clarity.

-        for chunk in actual_stream:
+        for _ in actual_stream:
             pass
@@
-        for chunk in response:
+        for _ in response:
             pass

109-116: Rename unused loop variable to underscore (async tools test).

-        for chunk in actual_stream:
+        for _ in actual_stream:
             pass
@@
-        for chunk in response:
+        for _ in response:
             pass

181-181: Avoid equality to True in assertion.

-    assert chat_stream._cleanup_completed == True
+    assert chat_stream._cleanup_completed

246-248: Mocking iteration: prefer MagicMock patterns for dunder methods.

Assigning __iter__ directly to a lambda on an instance is unreliable for special method lookup. If this test later exercises iteration, switch to MagicMock idioms.

-    mock_stream.__iter__ = lambda: iter(mock_data)
-    mock_stream.__next__ = MagicMock(side_effect=StopIteration)
+    mock_iter = iter(mock_data)
+    mock_stream.__iter__.return_value = mock_stream  # behave like self-iterable stream objects
+    mock_stream.__next__.side_effect = lambda: next(mock_iter)

285-289: Avoid equality to True in assertion.

-    assert _is_legacy_api_response(failing_response) == True
+    assert _is_legacy_api_response(failing_response)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (3)

722-737: Tighten exception handling in next: handle StopIteration explicitly.

Use explicit StopIteration except path; otherwise errors get correct ERROR status and cleanup.

-        except Exception as e:
-            if isinstance(e, StopIteration):
-                self._process_complete_response()
-            else:
-                # Handle cleanup for other exceptions during stream iteration
-                self._ensure_cleanup()
-                if self._span and self._span.is_recording():
-                    self._span.set_status(Status(StatusCode.ERROR, str(e)))
-            raise
+        except StopIteration:
+            self._process_complete_response()
+            raise
+        except Exception as e:
+            # Handle cleanup for other exceptions during stream iteration
+            self._ensure_cleanup()
+            if self._span and self._span.is_recording():
+                self._span.set_status(Status(StatusCode.ERROR, str(e)))
+            raise

739-753: Tighten exception handling in __anext__: handle StopAsyncIteration explicitly.

-        except Exception as e:
-            if isinstance(e, StopAsyncIteration):
-                self._process_complete_response()
-            else:
-                # Handle cleanup for other exceptions during stream iteration
-                self._ensure_cleanup()
-                if self._span and self._span.is_recording():
-                    self._span.set_status(Status(StatusCode.ERROR, str(e)))
-            raise
+        except StopAsyncIteration:
+            self._process_complete_response()
+            raise
+        except Exception as e:
+            # Handle cleanup for other exceptions during stream iteration
+            self._ensure_cleanup()
+            if self._span and self._span.is_recording():
+                self._span.set_status(Status(StatusCode.ERROR, str(e)))
+            raise

691-709: Optional: guard __exit__ for wrapped objects lacking context manager methods.

Some stream implementations may not implement __exit__. Guarding makes this wrapper more robust.

-        result = self.__wrapped__.__exit__(exc_type, exc_val, exc_tb)
+        result = False
+        if hasattr(self.__wrapped__, "__exit__"):
+            result = self.__wrapped__.__exit__(exc_type, exc_val, exc_tb)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between b0dfd92 and c5f0f53.

📒 Files selected for processing (8)
  • packages/opentelemetry-instrumentation-openai/.python-version (1 hunks)
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py (1 hunks)
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (8 hunks)
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py (2 hunks)
  • packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_legacy_api_response/test_raw_response_header_completion_attributes.yaml (1 hunks)
  • packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_legacy_api_response/test_raw_response_header_streaming_completion_attributes.yaml (1 hunks)
  • packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_legacy_api_response/test_raw_response_header_tool_calls_completion_attributes.yaml (1 hunks)
  • packages/opentelemetry-instrumentation-openai/tests/traces/test_legacy_api_response.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/cassettes/**/*.{yaml,yml,json}

📄 CodeRabbit Inference Engine (CLAUDE.md)

Never commit secrets or PII in VCR cassettes; scrub sensitive data

Files:

  • packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_legacy_api_response/test_raw_response_header_completion_attributes.yaml
  • packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_legacy_api_response/test_raw_response_header_tool_calls_completion_attributes.yaml
  • packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_legacy_api_response/test_raw_response_header_streaming_completion_attributes.yaml
**/*.py

📄 CodeRabbit Inference Engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py
  • packages/opentelemetry-instrumentation-openai/tests/traces/test_legacy_api_response.py
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py
🧬 Code Graph Analysis (4)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py (1)
  • is_streaming_response (336-352)
packages/opentelemetry-instrumentation-openai/tests/traces/test_legacy_api_response.py (2)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (2)
  • parse (633-662)
  • ChatStream (588-905)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py (3)
  • is_streaming_response (336-352)
  • model_as_dict (355-371)
  • _is_legacy_api_response (331-333)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py (6)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py (1)
  • is_streaming_response (154-155)
packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.py (1)
  • is_streaming_response (68-69)
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/__init__.py (1)
  • is_streaming_response (125-126)
packages/opentelemetry-instrumentation-replicate/opentelemetry/instrumentation/replicate/__init__.py (1)
  • is_streaming_response (55-56)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py (1)
  • is_openai_v1 (20-21)
packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/utils.py (1)
  • model_as_dict (75-83)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (4)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py (4)
  • _is_legacy_api_response (331-333)
  • is_streaming_response (336-352)
  • model_as_dict (355-371)
  • _set_response_attributes (190-238)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py (2)
  • is_openai_v1 (20-21)
  • should_send_prompts (168-171)
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/utils.py (1)
  • should_send_prompts (11-14)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py (1)
  • _set_completions (169-179)
🪛 Ruff (0.12.2)
packages/opentelemetry-instrumentation-openai/tests/traces/test_legacy_api_response.py

24-24: Avoid equality comparisons to True; use ...: for truth checks

Replace comparison

(E712)


27-27: Avoid equality comparisons to False; use not ...: for false checks

Replace comparison

(E712)


28-28: Avoid equality comparisons to False; use not is_streaming_response(legacy_response, {}): for false checks

Replace with not is_streaming_response(legacy_response, {})

(E712)


29-29: Avoid equality comparisons to False; use not is_streaming_response(legacy_response): for false checks

Replace with not is_streaming_response(legacy_response)

(E712)


32-32: Avoid equality comparisons to False; use not is_streaming_response("not_a_stream"): for false checks

Replace with not is_streaming_response("not_a_stream")

(E712)


33-33: Avoid equality comparisons to False; use not is_streaming_response({"not": "stream"}): for false checks

Replace with not is_streaming_response({"not": "stream"})

(E712)


52-52: Loop control variable chunk not used within loop body

Rename unused chunk to _chunk

(B007)


56-56: Loop control variable chunk not used within loop body

Rename unused chunk to _chunk

(B007)


111-111: Loop control variable chunk not used within loop body

Rename unused chunk to _chunk

(B007)


115-115: Loop control variable chunk not used within loop body

Rename unused chunk to _chunk

(B007)


181-181: Avoid equality comparisons to True; use chat_stream._cleanup_completed: for truth checks

Replace with chat_stream._cleanup_completed

(E712)


219-219: Avoid equality comparisons to True; use _is_legacy_api_response(legacy_response): for truth checks

Replace with _is_legacy_api_response(legacy_response)

(E712)


225-225: Use is and is not for type comparisons, or isinstance() for isinstance checks

(E721)


285-285: Avoid equality comparisons to True; use _is_legacy_api_response(failing_response): for truth checks

Replace with _is_legacy_api_response(failing_response)

(E712)

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py

333-333: Do not call getattr with a constant attribute value. It is not any safer than normal property access.

Replace getattr with attribute access

(B009)


339-339: Multiple isinstance calls for response, merge into a single call

Merge isinstance calls for response

(SIM101)


344-345: Use a single if statement instead of nested if statements

(SIM102)


350-352: Multiple isinstance calls for response, merge into a single call

Merge isinstance calls for response

(SIM101)

🪛 Flake8 (7.2.0)
packages/opentelemetry-instrumentation-openai/tests/traces/test_legacy_api_response.py

[error] 24-24: comparison to True should be 'if cond is True:' or 'if cond:'

(E712)


[error] 27-27: comparison to False should be 'if cond is False:' or 'if not cond:'

(E712)


[error] 28-28: comparison to False should be 'if cond is False:' or 'if not cond:'

(E712)


[error] 29-29: comparison to False should be 'if cond is False:' or 'if not cond:'

(E712)


[error] 32-32: comparison to False should be 'if cond is False:' or 'if not cond:'

(E712)


[error] 33-33: comparison to False should be 'if cond is False:' or 'if not cond:'

(E712)


[error] 181-181: comparison to True should be 'if cond is True:' or 'if cond:'

(E712)


[error] 219-219: comparison to True should be 'if cond is True:' or 'if cond:'

(E712)


[error] 225-225: do not compare types, for exact checks use is / is not, for instance checks use isinstance()

(E721)


[error] 285-285: comparison to True should be 'if cond is True:' or 'if cond:'

(E712)

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py

[error] 682-682: visually indented line with same indent as next logical line

(E129)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build Packages (3.11)
  • GitHub Check: Lint
🔇 Additional comments (6)
packages/opentelemetry-instrumentation-openai/.python-version (1)

1-1: Python 3.11 bump — quick scan shows partial coverage; manual verification required

I scanned the repo for Python version pins. packages/opentelemetry-instrumentation-openai/.python-version is 3.11 (this PR), and many packages already allow 3.11, but I did not find packaging metadata or CI workflow entries specifically for the bumped package in this quick run. Please ensure the following are updated and consistent:

  • Packaging (required)
    • packages/opentelemetry-instrumentation-openai/pyproject.toml or setup.cfg / setup.py — update python_requires and trove classifiers to include 3.11 (or >=3.11 as appropriate). I did not locate those files for this package in the scan — please confirm paths and update.
  • CI (required)
    • .github/workflows/*.yml — ensure setup-python or the test matrix adds 3.11 and removes 3.9 if you no longer support it. (Quick scan returned no setup-python matches; please double-check workflows.)
  • Dev tooling (required)
    • .python-version (already set to 3.11 for this package), .tool-versions, runtime.txt, pre-commit, tox.ini / noxfile.py — make sure these are consistent with 3.11.
  • Repo-wide packages to sanity-check (examples found in the scan)
    • packages/sample-app/pyproject.toml: python = ">=3.10,<3.13"
    • packages/traceloop-sdk/pyproject.toml: python = ">=3.10,<4"
    • packages/opentelemetry-semantic-conventions-ai/pyproject.toml: python = ">=3.9,<4"
    • Many poetry.lock files contain python-versions entries (>=3.9 / >=3.10) — review any strict locks or marker expressions that could block 3.11.

Action: update the packaging metadata and CI matrix, then run CI. If you want I can do a targeted re-scan of packages/opentelemetry-instrumentation-openai (look for pyproject/setup.cfg/setup.py, workflows, tox/nox) — tell me to continue and I’ll run focused checks.

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py (2)

108-111: Async path parity maintained: LGTM.

Async wrapper mirrors sync changes and returns the async builder with request kwargs. No issues spotted.


72-75: All is_streaming_response call sites in OpenAI instrumentation are updated

  • In completion_wrappers.py (lines 72 & 108), calls now pass kwargs.
  • In chat_wrappers.py (lines 118 & 229), calls now pass kwargs.
  • The detector’s signature in shared/__init__.py (is_streaming_response(response, kwargs=None)) still supports legacy (no-kwargs) use, as covered by existing tests.

No missed call sites were found; this change fully aligns the OpenAI instrumentation with the new signature.

packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_legacy_api_response/test_raw_response_header_completion_attributes.yaml (1)

46-49: All API key references are placeholders; no real keys detected
The grep only picked up Cloudflare’s __cf_bm cookie (which contains sk-… internally), not any OpenAI API keys. The "test_api_key" in the YAML is a hard-coded placeholder and isn’t sourced from an environment variable or secret store.

packages/opentelemetry-instrumentation-openai/tests/traces/test_legacy_api_response.py (1)

15-34: LGTM: solid coverage for legacy detection paths.

The mock-based detection checks and kwargs-based streaming assertions reflect the new is_streaming_response signature and behavior.

packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (1)

321-348: LGTM: model_as_dict now passed is_streaming flag.

This aligns with the updated shared API and prevents parsing legacy streaming responses post-consumption.

Comment on lines +670 to +675
    async def aclose(self):
        """Close the async stream and ensure cleanup"""
        self._ensure_cleanup()
        if hasattr(self.__wrapped__, 'close'):
            return await self.__wrapped__.close()


⚠️ Potential issue

Fix async close: don’t await sync close; prefer aclose when available.

Awaiting a sync close() will raise a TypeError. Check aclose first; call sync close() without await as a fallback.

-    async def aclose(self):
-        """Close the async stream and ensure cleanup"""
-        self._ensure_cleanup()
-        if hasattr(self.__wrapped__, 'close'):
-            return await self.__wrapped__.close()
+    async def aclose(self):
+        """Close the async stream and ensure cleanup"""
+        self._ensure_cleanup()
+        if hasattr(self.__wrapped__, "aclose"):
+            return await self.__wrapped__.aclose()
+        if hasattr(self.__wrapped__, "close"):
+            # Synchronous close as a fallback; do not await
+            return self.__wrapped__.close()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    async def aclose(self):
        """Close the async stream and ensure cleanup"""
        self._ensure_cleanup()
        if hasattr(self.__wrapped__, 'close'):
            return await self.__wrapped__.close()
    async def aclose(self):
        """Close the async stream and ensure cleanup"""
        self._ensure_cleanup()
        if hasattr(self.__wrapped__, "aclose"):
            return await self.__wrapped__.aclose()
        if hasattr(self.__wrapped__, "close"):
            # Synchronous close as a fallback; do not await
            return self.__wrapped__.close()
🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py
around lines 670 to 675, the async aclose currently awaits a potentially
synchronous close() which can raise TypeError; change logic to prefer calling
and awaiting aclose() on the wrapped object if it exists, and only call close()
without await as a synchronous fallback. Implement: check for attribute 'aclose'
and await it when present, else if 'close' exists call it without await; keep
self._ensure_cleanup() as-is.
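The failure mode is easy to reproduce outside the instrumentation. A minimal repro sketch (hypothetical class, independent of the PR code) — awaiting a synchronous `close()` actually awaits its return value, typically `None`:

```python
import asyncio

class SyncOnlyStream:
    """Hypothetical wrapped stream that exposes only a synchronous close()."""
    def close(self):
        return None  # sync close; returns None like most file-like objects

async def buggy_aclose(stream):
    # Mirrors the buggy pattern: `await stream.close()` awaits the return value (None)
    return await stream.close()

try:
    asyncio.run(buggy_aclose(SyncOnlyStream()))
    outcome = "no error"
except TypeError:
    outcome = "TypeError"
print(outcome)  # TypeError
```

This is why the fallback should call the sync `close()` without `await`, and only `await` a real `aclose()` coroutine when one exists.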

Comment on lines 676 to +687
    def __del__(self):
        """Cleanup when object is garbage collected"""
        if hasattr(self, '_cleanup_completed') and not self._cleanup_completed:
            # If we have accumulated completion data, make sure it gets set on the span
            if (hasattr(self, '_complete_response') and
                (self._complete_response.get('usage') or self._complete_response.get('choices')) and
                self._span and self._span.is_recording()):
                _set_response_attributes(self._span, self._complete_response)
                if should_send_prompts():
                    _set_completions(self._span, self._complete_response.get("choices"))
                self._span.set_status(Status(StatusCode.OK))


🛠️ Refactor suggestion

Fix multiline condition indentation in __del__ and keep it flake8-compliant.

Current indentation triggers E129. Restructure the condition for clarity and lint compliance.

-        if hasattr(self, '_cleanup_completed') and not self._cleanup_completed:
-            # If we have accumulated completion data, make sure it gets set on the span
-            if (hasattr(self, '_complete_response') and 
-                (self._complete_response.get('usage') or self._complete_response.get('choices')) and 
-                self._span and self._span.is_recording()):
+        if hasattr(self, "_cleanup_completed") and not self._cleanup_completed:
+            # If we have accumulated completion data, make sure it gets set on the span
+            if (
+                hasattr(self, "_complete_response")
+                and (
+                    self._complete_response.get("usage")
+                    or self._complete_response.get("choices")
+                )
+                and self._span
+                and self._span.is_recording()
+            ):
                 _set_response_attributes(self._span, self._complete_response)
-                if should_send_prompts():
+                if should_send_prompts():
                     _set_completions(self._span, self._complete_response.get("choices"))
                 self._span.set_status(Status(StatusCode.OK))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    def __del__(self):
        """Cleanup when object is garbage collected"""
        if hasattr(self, '_cleanup_completed') and not self._cleanup_completed:
            # If we have accumulated completion data, make sure it gets set on the span
            if (hasattr(self, '_complete_response') and
                (self._complete_response.get('usage') or self._complete_response.get('choices')) and
                self._span and self._span.is_recording()):
                _set_response_attributes(self._span, self._complete_response)
                if should_send_prompts():
                    _set_completions(self._span, self._complete_response.get("choices"))
                self._span.set_status(Status(StatusCode.OK))
    def __del__(self):
        """Cleanup when object is garbage collected"""
        if hasattr(self, "_cleanup_completed") and not self._cleanup_completed:
            # If we have accumulated completion data, make sure it gets set on the span
            if (
                hasattr(self, "_complete_response")
                and (
                    self._complete_response.get("usage")
                    or self._complete_response.get("choices")
                )
                and self._span
                and self._span.is_recording()
            ):
                _set_response_attributes(self._span, self._complete_response)
                if should_send_prompts():
                    _set_completions(self._span, self._complete_response.get("choices"))
                self._span.set_status(Status(StatusCode.OK))
🧰 Tools
🪛 Flake8 (7.2.0)

[error] 682-682: visually indented line with same indent as next logical line

(E129)

Comment on lines +64 to +68
- __cf_bm=CnO4N7IJLSVPZ1NJ1sXGrDdWIzyN41oYvTESuNDO7T0-1755447803-1.0.1.1-6VdBrSZbrcn1jl6eSOEfCU0vIIwWabzVpC2g9rFtnJ8T06pw_RH1ussZACigLi9axHA6Rn8ObEttzPb1mOFxmMyN8jcG978QILGVfIqS2_A;
path=/; expires=Sun, 17-Aug-25 16:53:23 GMT; domain=.api.openai.com; HttpOnly;
Secure; SameSite=None
- _cfuvid=OYHhC81cl5STQdGR0AnkjRK_.Pzo0gGLZWk7EwG9w88-1755447803697-0.0.1.1-604800000;
path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None

🛠️ Refactor suggestion

⚠️ Potential issue

Redact Set-Cookie headers in error cassette as well.

Same concern as the streaming cassette: Set-Cookie values should be filtered.

-      Set-Cookie:
-      - __cf_bm=CnO4N7IJLSVPZ1NJ1sXGrDdWIzyN41oYvTESuNDO7T0-1755447803-1.0.1.1-6VdBrSZbrcn1jl6eSOEfCU0vIIwWabzVpC2g9rFtnJ8T06pw_RH1ussZACigLi9axHA6Rn8ObEttzPb1mOFxmMyN8jcG978QILGVfIqS2_A;
-        path=/; expires=Sun, 17-Aug-25 16:53:23 GMT; domain=.api.openai.com; HttpOnly;
-        Secure; SameSite=None
-      - _cfuvid=OYHhC81cl5STQdGR0AnkjRK_.Pzo0gGLZWk7EwG9w88-1755447803697-0.0.1.1-604800000;
-        path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
+      Set-Cookie:
+      - "<filtered>"
+      - "<filtered>"

If not already configured, add cassette filtering as shown in the other comment to avoid future leaks.

🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_legacy_api_response/test_raw_response_header_completion_attributes.yaml
around lines 64-68, the cassette contains raw Set-Cookie header values that must
be redacted; update this cassette to remove or replace the Set-Cookie header
values (e.g., replace cookie strings with a fixed placeholder or remove the
header entirely) and apply the same cassette filtering configuration used for
the streaming cassette so future recordings are automatically sanitized; after
making the change, run the test suite to confirm no other cassettes leak cookie
values.

Comment on lines +100 to +106
Set-Cookie:
- __cf_bm=u8xrIzhWWeQ4EKdvALOqYMoRkfkygsO_SmbCs4nuz6g-1755448036-1.0.1.1-79K1BXstf3jOeIig9Wr_rNvYR_SDLD5rxQRbW1dRV3wGvg4q8Iiek0b9fzqwqmf6rWC9q3GBuNeorf85TE1tr.lmNxYh0x1lBejVO2Or3CQ;
path=/; expires=Sun, 17-Aug-25 16:57:16 GMT; domain=.api.openai.com; HttpOnly;
Secure; SameSite=None
- _cfuvid=P6Lb2tVtZSlO4afd70zEtpZqCmAJ.tqDpKsE17TqOE8-1755448036360-0.0.1.1-604800000;
path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
Strict-Transport-Security:

🛠️ Refactor suggestion

Scrub sensitive headers in cassette (Set-Cookie, x-request-id, project id).

Cassettes should not contain secrets/PII. These headers can be considered sensitive and should be masked or filtered to avoid accidental leakage and to keep tests deterministic.

Apply a minimal scrub directly in this cassette:

-      Set-Cookie:
-      - __cf_bm=u8xrIzhWWeQ4EKdvALOqYMoRkfkygsO_SmbCs4nuz6g-1755448036-1.0.1.1-79K1BXstf3jOeIig9Wr_rNvYR_SDLD5rxQRbW1dRV3wGvg4q8Iiek0b9fzqwqmf6rWC9q3GBuNeorf85TE1tr.lmNxYh0x1lBejVO2Or3CQ;
-        path=/; expires=Sun, 17-Aug-25 16:57:16 GMT; domain=.api.openai.com; HttpOnly;
-        Secure; SameSite=None
-      - _cfuvid=P6Lb2tVtZSlO4afd70zEtpZqCmAJ.tqDpKsE17TqOE8-1755448036360-0.0.1.1-604800000;
-        path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
+      Set-Cookie:
+      - __cf_bm=<redacted>; path=/; expires=<redacted>; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
+      - _cfuvid=<redacted>; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
@@
-      openai-project:
-      - proj_tzz1TbPPOXaf6j9tEkVUBIAa
+      openai-project:
+      - <redacted>
@@
-      x-request-id:
-      - req_1336ff5357994f4297346daa5b3cb4a6
+      x-request-id:
+      - req_<redacted>

Longer-term, prefer configuring the VCR to filter these headers globally (e.g., filter_headers=["set-cookie","x-request-id","openai-project"]) to prevent future regressions.

Also applies to: 119-123, 141-141

Comment on lines +72 to +77
Set-Cookie:
- __cf_bm=wJjR0Xu057Zf_UQmoZWp3TkN0KNg2iXu4MmcFV7PDMo-1755448036-1.0.1.1-hmrKS7BrpsnXYvrimqZt2dGAz9tkQ6dqHrUH.aUFgU_2_2AdXHBdOeSbLXBR7FboG3pa5nrwcsygn3IG4xk9xMoqmUIeQvm0JjHe_wQyRMM;
path=/; expires=Sun, 17-Aug-25 16:57:16 GMT; domain=.api.openai.com; HttpOnly;
Secure; SameSite=None
- _cfuvid=Y6KGeOFFKbSGF9.tRLqFA3ghl7np4QoWbM1H_M1vrn8-1755448036979-0.0.1.1-604800000;
path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None

💡 Verification agent

🧩 Analysis chain

Scrub Set-Cookie headers from VCR cassette.

Set-Cookie values are not needed for tests and can be considered sensitive. Please filter them at record time or redact them here.

Apply this minimal redaction:

-      Set-Cookie:
-      - __cf_bm=wJjR0Xu057Zf_UQmoZWp3TkN0KNg2iXu4MmcFV7PDMo-1755448036-1.0.1.1-hmrKS7BrpsnXYvrimqZt2dGAz9tkQ6dqHrUH.aUFgU_2_2AdXHBdOeSbLXBR7FboG3pa5nrwcsygn3IG4xk9xMoqmUIeQvm0JjHe_wQyRMM;
-        path=/; expires=Sun, 17-Aug-25 16:57:16 GMT; domain=.api.openai.com; HttpOnly;
-        Secure; SameSite=None
-      - _cfuvid=Y6KGeOFFKbSGF9.tRLqFA3ghl7np4QoWbM1H_M1vrn8-1755448036979-0.0.1.1-604800000;
-        path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
+      Set-Cookie:
+      - "<filtered>"
+      - "<filtered>"

Optionally configure vcrpy/pytest-recording to filter these headers globally:

# e.g., in tests/conftest.py
def pytest_recording_configure(config, vcr):
    vcr.filter_headers = vcr.filter_headers + [
        ("authorization", "<filtered>"),
        ("set-cookie", "<filtered>"),
        ("x-request-id", "<filtered>"),
        ("openai-organization", "<filtered>"),
        ("openai-project", "<filtered>"),
    ]

🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail
echo "Scanning cassettes for sensitive headers and tokens..."
rg -n -C1 -g '**/cassettes/**/*.{yml,yaml,json}' \
  -e 'Authorization|Bearer|Set-Cookie|x-request-id|openai-project|openai-organization|sk-[a-zA-Z0-9]+' || true

Length of output: 265688


Scrub Set-Cookie headers across all VCR cassettes

There are hundreds of cassettes committed with real Set-Cookie values. We need to redact this header globally at record time and scrub existing files.

• In your tests/conftest.py, configure vcrpy to filter Set-Cookie (and other sensitive headers):

def pytest_recording_configure(config, vcr):
    # redact auth, cookies, request IDs, org/project headers
    vcr.filter_headers = [
        ("authorization", "<filtered>"),
        ("set-cookie", "<filtered>"),
        ("x-request-id", "<filtered>"),
        ("openai-organization", "<filtered>"),
        ("openai-project", "<filtered>"),
    ]

• (Optional) For pre-recorded cassettes, either regenerate them after adding the filter or apply a one-off redaction, e.g.:

# rough one-off: replace Set-Cookie list values with a placeholder
# (verify the diff afterwards; regenerating cassettes with the filter enabled is safer)
grep -Rl "Set-Cookie:" packages/**/tests/**/cassettes | \
  xargs sed -i '/Set-Cookie:/,/^[[:space:]]*[^-]/ s/- .*/- "<filtered>"/'

With this in place, all future recordings will automatically scrub cookies, and existing cassettes can be cleaned in bulk.

🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_legacy_api_response/test_raw_response_header_tool_calls_completion_attributes.yaml
around lines 72-77, real Set-Cookie header values are present and must be
redacted; update tests/conftest.py to configure vcrpy to filter sensitive
headers (include "set-cookie" with a placeholder like "<filtered>" alongside
authorization, x-request-id, openai-organization, openai-project) so future
recordings omit cookies, and scrub existing cassettes either by regenerating
them after enabling the filter or running a bulk replacement to replace all
Set-Cookie blocks with a single filtered placeholder.

Comment on lines +90 to +97
openai-organization:
- traceloop
openai-processing-ms:
- '237'
openai-project:
- proj_tzz1TbPPOXaf6j9tEkVUBIAa
openai-version:
- '2020-10-01'

🛠️ Refactor suggestion

⚠️ Potential issue

Redact organization/project identifiers in cassette headers.

openai-organization and openai-project can reveal account metadata. These aren’t necessary for test assertions and should be filtered/redacted.

-      openai-organization:
-      - traceloop
-      openai-project:
-      - proj_tzz1TbPPOXaf6j9tEkVUBIAa
+      openai-organization:
+      - "<filtered>"
+      openai-project:
+      - "<filtered>"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
openai-organization:
- traceloop
openai-processing-ms:
- '237'
openai-project:
- proj_tzz1TbPPOXaf6j9tEkVUBIAa
openai-version:
- '2020-10-01'
openai-organization:
- "<filtered>"
openai-processing-ms:
- '237'
openai-project:
- "<filtered>"
openai-version:
- '2020-10-01'
🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_legacy_api_response/test_raw_response_header_tool_calls_completion_attributes.yaml
around lines 90 to 97, the cassette contains openai-organization and
openai-project header values that expose account metadata; redact them by
replacing the actual values with a neutral placeholder (e.g. "<REDACTED>") or
remove those header entries entirely if tests don’t rely on them, and ensure any
test assertions or snapshot comparisons are updated to expect the redacted
placeholder so tests remain deterministic.

Comment on lines +24 to +33
assert is_streaming_response(legacy_response, {"stream": True}) == True

# Test that LegacyAPIResponse is not streaming when stream=False or not set
assert is_streaming_response(legacy_response, {"stream": False}) == False
assert is_streaming_response(legacy_response, {}) == False
assert is_streaming_response(legacy_response) == False

# Test with regular objects
assert is_streaming_response("not_a_stream") == False
assert is_streaming_response({"not": "stream"}) == False

🛠️ Refactor suggestion

Fix E712 comparisons to True/False in assertions.

Avoid equality to True/False in tests; assert directly or use not.

-    assert is_streaming_response(legacy_response, {"stream": True}) == True
+    assert is_streaming_response(legacy_response, {"stream": True})
@@
-    assert is_streaming_response(legacy_response, {"stream": False}) == False
-    assert is_streaming_response(legacy_response, {}) == False
-    assert is_streaming_response(legacy_response) == False
+    assert not is_streaming_response(legacy_response, {"stream": False})
+    assert not is_streaming_response(legacy_response, {})
+    assert not is_streaming_response(legacy_response)
@@
-    assert is_streaming_response("not_a_stream") == False
-    assert is_streaming_response({"not": "stream"}) == False
+    assert not is_streaming_response("not_a_stream")
+    assert not is_streaming_response({"not": "stream"})
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-assert is_streaming_response(legacy_response, {"stream": True}) == True
+assert is_streaming_response(legacy_response, {"stream": True})
 # Test that LegacyAPIResponse is not streaming when stream=False or not set
-assert is_streaming_response(legacy_response, {"stream": False}) == False
-assert is_streaming_response(legacy_response, {}) == False
-assert is_streaming_response(legacy_response) == False
+assert not is_streaming_response(legacy_response, {"stream": False})
+assert not is_streaming_response(legacy_response, {})
+assert not is_streaming_response(legacy_response)
 # Test with regular objects
-assert is_streaming_response("not_a_stream") == False
-assert is_streaming_response({"not": "stream"}) == False
+assert not is_streaming_response("not_a_stream")
+assert not is_streaming_response({"not": "stream"})
🧰 Tools
🪛 Ruff (0.12.2)

24-24: Avoid equality comparisons to True; use ...: for truth checks

Replace comparison

(E712)


27-27: Avoid equality comparisons to False; use not ...: for false checks

Replace comparison

(E712)


28-28: Avoid equality comparisons to False; use not is_streaming_response(legacy_response, {}): for false checks

Replace with not is_streaming_response(legacy_response, {})

(E712)


29-29: Avoid equality comparisons to False; use not is_streaming_response(legacy_response): for false checks

Replace with not is_streaming_response(legacy_response)

(E712)


32-32: Avoid equality comparisons to False; use not is_streaming_response("not_a_stream"): for false checks

Replace with not is_streaming_response("not_a_stream")

(E712)


33-33: Avoid equality comparisons to False; use not is_streaming_response({"not": "stream"}): for false checks

Replace with not is_streaming_response({"not": "stream"})

(E712)

🪛 Flake8 (7.2.0)

[error] 24-24: comparison to True should be 'if cond is True:' or 'if cond:'

(E712)


[error] 27-27: comparison to False should be 'if cond is False:' or 'if not cond:'

(E712)


[error] 28-28: comparison to False should be 'if cond is False:' or 'if not cond:'

(E712)


[error] 29-29: comparison to False should be 'if cond is False:' or 'if not cond:'

(E712)


[error] 32-32: comparison to False should be 'if cond is False:' or 'if not cond:'

(E712)


[error] 33-33: comparison to False should be 'if cond is False:' or 'if not cond:'

(E712)

🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-openai/tests/traces/test_legacy_api_response.py
around lines 24 to 33, change assertions that compare to True/False to direct
assertions: replace "assert is_streaming_response(legacy_response, {\"stream\":
True}) == True" with "assert is_streaming_response(legacy_response, {\"stream\":
True})", and replace the comparisons to False (including the bare call and other
cases) with "assert not is_streaming_response(...)" (e.g., assert not
is_streaming_response(legacy_response, {\"stream\": False}), assert not
is_streaming_response(legacy_response), assert not
is_streaming_response(\"not_a_stream\"), assert not
is_streaming_response({\"not\": \"stream\"})). Ensure no equality comparisons to
True/False remain.

Comment on lines +219 to +226
assert _is_legacy_api_response(legacy_response) == True

# Call parse() method
parsed_stream = legacy_response.parse()

# Verify original object is unchanged
assert type(legacy_response) == original_type
assert hasattr(legacy_response, 'parse')

🛠️ Refactor suggestion

Prefer direct truthiness and is for exact type comparison.

  • Avoid == True.
  • Use is for exact type checks (or isinstance if subtype is acceptable). Here, exact type comparison is fine.
-    assert _is_legacy_api_response(legacy_response) == True
+    assert _is_legacy_api_response(legacy_response)
@@
-    assert type(legacy_response) == original_type
+    assert type(legacy_response) is original_type
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-assert _is_legacy_api_response(legacy_response) == True
+assert _is_legacy_api_response(legacy_response)
 # Call parse() method
 parsed_stream = legacy_response.parse()
 # Verify original object is unchanged
-assert type(legacy_response) == original_type
+assert type(legacy_response) is original_type
 assert hasattr(legacy_response, 'parse')
🧰 Tools
🪛 Ruff (0.12.2)

219-219: Avoid equality comparisons to True; use _is_legacy_api_response(legacy_response): for truth checks

Replace with _is_legacy_api_response(legacy_response)

(E712)


225-225: Use is and is not for type comparisons, or isinstance() for isinstance checks

(E721)

🪛 Flake8 (7.2.0)

[error] 219-219: comparison to True should be 'if cond is True:' or 'if cond:'

(E712)


[error] 225-225: do not compare types, for exact checks use is / is not, for instance checks use isinstance()

(E721)

🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-openai/tests/traces/test_legacy_api_response.py
around lines 219 to 226, replace the equality-style and loose truth checks:
change "assert _is_legacy_api_response(legacy_response) == True" to a direct
truthiness or explicit identity (prefer "assert
_is_legacy_api_response(legacy_response)" or "assert
_is_legacy_api_response(legacy_response) is True"), and change the exact type
comparison "assert type(legacy_response) == original_type" to use identity
("assert type(legacy_response) is original_type"); leave the hasattr check
as-is.

- Fix async context manager to ensure cleanup and proper exception handling
- Add _response_was_parsed flag to both sync and async paths for proper span status
- Ensure ChatStream cleanup occurs in __aexit__ to mirror __exit__ behavior

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
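As a rough sketch of the __aexit__ behavior this commit describes (illustrative only — the wrapper class and attribute names are assumed from the review thread, not copied from the repo):

```python
import logging

logger = logging.getLogger(__name__)


class ChatStreamSketch:
    """Hypothetical stand-in for the instrumented stream wrapper."""

    def __init__(self, wrapped):
        self.__wrapped__ = wrapped
        self._response_was_parsed = False
        self._cleanup_completed = False

    def _ensure_cleanup(self):
        # In the real wrapper this would finalize the span and record metrics.
        self._cleanup_completed = True

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Mirror __exit__: always attempt cleanup, and log (rather than raise)
        # cleanup failures so the caller's original exception is never masked.
        try:
            self._ensure_cleanup()
        except Exception:
            logger.exception("cleanup failed in __aexit__")
        if hasattr(self.__wrapped__, "__aexit__"):
            return await self.__wrapped__.__aexit__(exc_type, exc, tb)
        return False
```

The key property is that cleanup always runs on async-context exit, matching the synchronous __exit__ path.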
Contributor

@ellipsis-dev ellipsis-dev Bot left a comment


Important

Looks good to me! 👍

Reviewed 6176a2c in 1 minute and 20 seconds. Click for details.
  • Reviewed 87 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py:122
  • Draft comment:
    Consider renaming the variable 'parsed_successfully' to something more descriptive (e.g. 'legacy_parse_succeeded') for clarity.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
2. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py:146
  • Draft comment:
    Adding the '_response_was_parsed' flag on the ChatStream is a good indicator; consider adding a brief inline comment explaining its role in cleanup/metric recording for future maintainers.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
3. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py:723
  • Draft comment:
    The async aexit method now captures and logs cleanup exceptions without re-raising. A short inline comment explaining why this exception is logged (and not propagated) would aid clarity.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
4. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py:238
  • Draft comment:
    The legacy API response parsing logic is duplicated between the synchronous (chat_wrapper) and asynchronous (achat_wrapper) implementations. Consider extracting this logic into a helper function to improve maintainability.
  • Reason this comment was not posted:
    Comment was on unchanged code.

Workflow ID: wflow_xTYqUVz8SQavNbzv

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (3)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (3)

689-697: Fix multiline condition indentation in del (flake8 E129).

Current indentation triggers E129; reformat for clarity and lint compliance.

Apply this diff:

-        if hasattr(self, '_cleanup_completed') and not self._cleanup_completed:
-            # If we have accumulated completion data, make sure it gets set on the span
-            if (hasattr(self, '_complete_response') and 
-                (self._complete_response.get('usage') or self._complete_response.get('choices')) and 
-                self._span and self._span.is_recording()):
-                _set_response_attributes(self._span, self._complete_response)
-                if should_send_prompts():
-                    _set_completions(self._span, self._complete_response.get("choices"))
-                self._span.set_status(Status(StatusCode.OK))
+        if hasattr(self, "_cleanup_completed") and not self._cleanup_completed:
+            # If we have accumulated completion data, make sure it gets set on the span
+            if (
+                hasattr(self, "_complete_response")
+                and (
+                    self._complete_response.get("usage")
+                    or self._complete_response.get("choices")
+                )
+                and self._span
+                and self._span.is_recording()
+            ):
+                _set_response_attributes(self._span, self._complete_response)
+                if should_send_prompts():
+                    _set_completions(self._span, self._complete_response.get("choices"))
+                self._span.set_status(Status(StatusCode.OK))

892-914: Remove placeholder comment and use standard semconv key for finish_reason.

  • Remove the “Debug logging” placeholder or add real debug statements.
  • Prefer SpanAttributes.LLM_RESPONSE_FINISH_REASON via _set_span_attribute over the raw string key for consistency with semconv usage elsewhere.

Apply this diff:

-        # Debug logging
-        
@@
-            if getattr(self, '_response_was_parsed', False):
-                self._span.set_status(Status(StatusCode.OK))
-                # Set a basic finish reason since we know the response completed
-                self._span.set_attribute("gen_ai.response.finish_reason", "stop")
+            if getattr(self, "_response_was_parsed", False):
+                self._span.set_status(Status(StatusCode.OK))
+                # Set a basic finish reason since we know the response completed
+                _set_span_attribute(
+                    self._span, SpanAttributes.LLM_RESPONSE_FINISH_REASON, "stop"
+                )

680-685: Fix async close: don’t await sync close; prefer aclose when available.

Awaiting a synchronous close() will raise TypeError. Prefer aclose() when present, otherwise call close() without await.

Apply this diff:

     async def aclose(self):
         """Close the async stream and ensure cleanup"""
         self._ensure_cleanup()
-        if hasattr(self.__wrapped__, 'close'):
-            return await self.__wrapped__.close()
+        if hasattr(self.__wrapped__, "aclose"):
+            return await self.__wrapped__.aclose()
+        if hasattr(self.__wrapped__, "close"):
+            # Synchronous close as a fallback; do not await
+            return self.__wrapped__.close()
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (1)

643-673: Make parse() handoff fully isolated and preserve parsed-state.

Two small improvements:

  • Use a deep copy for _complete_response to avoid shared nested structures between the old and new streams.
  • Mark the new stream as parsed (_response_was_parsed = True) so downstream partial cleanup logic can set span status deterministically.

Apply this diff:

-            new_chat_stream._complete_response = self._complete_response.copy()
+            # Deep copy to avoid sharing nested structures across streams
+            new_chat_stream._complete_response = copy.deepcopy(self._complete_response)
+            # Mark the new stream as having been parsed successfully
+            new_chat_stream._response_was_parsed = True
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c5f0f53 and 6176a2c.

📒 Files selected for processing (1)
  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (11 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit Inference Engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py
🧬 Code Graph Analysis (1)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (4)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/__init__.py (3)
  • _is_legacy_api_response (331-333)
  • is_streaming_response (336-352)
  • _set_response_attributes (190-238)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/utils.py (2)
  • is_openai_v1 (20-21)
  • should_send_prompts (168-171)
packages/opentelemetry-instrumentation-openai/tests/traces/test_legacy_api_response.py (2)
  • parse (11-12)
  • parse (279-280)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/completion_wrappers.py (1)
  • _set_completions (169-179)
🪛 Flake8 (7.2.0)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py

[error] 692-692: visually indented line with same indent as next logical line

(E129)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Packages (3.10)
  • GitHub Check: Test Packages (3.11)
  • GitHub Check: Test Packages (3.12)
  • GitHub Check: Build Packages (3.11)
  • GitHub Check: Lint
🔇 Additional comments (7)
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/shared/chat_wrappers.py (7)

14-14: Duck-typing import for legacy detection looks good.

Importing _is_legacy_api_response to avoid brittle type checks is the right call.


118-147: LegacyAPIResponse parsing at wrapper creation: solid approach.

Parsing the legacy stream upfront without mutating the original and signaling via _response_was_parsed preserves stream integrity and enables safer multi-consumer scenarios.


234-263: Async legacy streaming handling mirrors sync path correctly.

The async wrapper follows the same safe parse-first pattern and marks _response_was_parsed on success. Good symmetry with the sync path.


332-332: Propagating streaming context into model_as_dict: LGTM.

Passing is_streaming into model_as_dict aligns extraction with streaming context.


674-679: Explicit close() hook with cleanup is appropriate.

Ensuring cleanup before delegating to the wrapped close() is correct.


724-740: Async context manager cleanup is robust.

Catching cleanup exceptions, delegating to wrapped __aexit__ when present, and not masking the original exception is the right behavior.


793-793: No action needed.

Whitespace-only change; nothing to address.

@nirga
Member Author

nirga commented Aug 17, 2025

Review Comment Responses

Thanks for the thorough review! I've addressed the key issues in commit 487ea034:

✅ Fixed Issues:

1. Async Context Manager (__aexit__) - Fixed to ensure cleanup runs and to handle exceptions the same way __exit__ does

2. Missing Parse Success Flag - Added _response_was_parsed = True to both sync and async paths when LegacyAPIResponse parsing succeeds. This was incomplete logic - the flag was checked in _record_partial_metrics() but never set anywhere.

📝 Design Decisions (No Changes):

3. Deep Copy vs Shallow Copy - Keeping shallow copy because _accumulate_stream_items() creates fresh nested objects rather than modifying existing ones, so deep copy would add unnecessary overhead.

4. Duck Typing for LegacyAPIResponse - Keeping the duck typing approach (hasattr(obj, 'parse') and callable(obj.parse)) because it's more robust across LiteLLM versions and handles edge cases gracefully. The function is private and used in controlled contexts.

The core functionality now works correctly with proper span status and cleanup behavior for LiteLLM streaming responses.

Contributor

@ellipsis-dev ellipsis-dev Bot left a comment


Caution

Changes requested ❌

Reviewed b6981c8 in 1 minute and 54 seconds. Click for details.
  • Reviewed 751 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/sample-app/sample_app/strands_example.py:292
  • Draft comment:
    Avoid using exit(1) in library code; prefer raising an exception to allow proper error handling.
  • Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%. This is a demo script meant to be run directly (note the if __name__ == "__main__" block). The exit(1) is used in the create_agents() function, which is only called from the demo script. For a demo/example script, using exit(1) for missing required configuration is reasonable and common practice. This isn't library code that others would import and use programmatically. The comment assumes this is library code, but it appears to be a demo script. The exit(1) usage is appropriate for a command-line script when required configuration is missing. While raising exceptions is generally good practice in library code, this is clearly a demo script where exit(1) is an acceptable way to handle missing required configuration. The comment should be deleted as it incorrectly assumes this is library code.
2. packages/sample-app/sample_app/strands_example.py:360
  • Draft comment:
    Consider using a logging framework instead of print(), to provide better control over logging levels and outputs.
  • Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%. This is a demo/example script meant to showcase functionality in a user-friendly way. The print statements are intentionally used for formatted output with emojis to create an interactive demo experience. A logging framework would make the output more formal and less demo-friendly. The prints here are not for debugging or system logging purposes. The comment has a point that logging frameworks provide better control; if this were production code handling errors or system events, logging would be better. However, this is clearly a demo script where the prints are part of the user interface, not system logging. Using a logging framework would make the demo output more complex and less friendly. The comment should be deleted. The print statements are appropriate here since they are used for demo output formatting, not system logging.

Workflow ID: wflow_2Tcy53opxf3r5JHD

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

# Step 1: Coordinator analyzes and delegates
print("🤖 Travel Coordinator analyzing request...")
coord_response = agents["coordinator"](user_request)
print(f"Coordinator: {coord_response.message['content'][0]['text']}\n")
Contributor

Directly indexing response messages (e.g., message['content'][0]['text']) risks an index error if the list is empty. Consider adding a check.

Suggested change
-print(f"Coordinator: {coord_response.message['content'][0]['text']}\n")
+print(f"Coordinator: {coord_response.message['content'][0]['text'] if coord_response.message['content'] else '[No content]'}\n")

]

# Process tool calls
while response.choices[0].message.tool_calls:
Contributor

The tool call processing loop in travel_planning_agent lacks an iteration limit, risking an infinite loop if the assistant continually returns tool_calls.

def execute_function_call(function_call) -> str:
"""Execute OpenAI function calls."""
function_name = function_call.function.name
arguments = json.loads(function_call.function.arguments)
Contributor

json.loads() is used on tool call arguments without error handling; consider wrapping it in try/except to catch malformed JSON.

Be thorough, use multiple searches, and provide detailed, actionable travel advice."""

try:
print(f"🔧 **Initializing OpenAI with {len(TOOL_DEFINITIONS)} tool definitions**")
Contributor

Replace print() calls with a proper logging mechanism to manage output verbosity and integrate with the application’s logging.

return entry

# Define tools for OpenAI function calling
@tool(name="tavily_search")
Contributor

Typographical inconsistency: The tool decorator uses the name "tavily_search", but later in the TOOL_DEFINITIONS and function registry the function is referred to as "search_tavily". Please ensure the tool name is consistent.

Suggested change
-@tool(name="tavily_search")
+@tool(name="search_tavily")

except Exception as e:
return f"Search error: {str(e)}"

@tool(name="weather_api")
Contributor

Typographical inconsistency: The tool decorator labels the function as "weather_api", but in the tool definitions and registry it is referred to by the function name "get_weather". Please standardize the naming.

Suggested change
-@tool(name="weather_api")
+@tool(name="get_weather")

except Exception as e:
return f"🌤️ Weather for {city}: April typically 20-25°C (error: {str(e)[:50]})"

@tool(name="save_file")
Contributor

Typographical inconsistency: The tool decorator uses the name "save_file", but in the tool definitions and registry the function is called "save_travel_plan". Please align the names.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 8

♻️ Duplicate comments (5)
packages/sample-app/sample_app/traceloop_openai_tools.py (4)

39-41: Align tool decorator names with function/registry names.

Decorator names differ from TOOL_DEFINITIONS/FUNCTION_REGISTRY, causing confusion and inconsistent telemetry naming. Prior feedback requested this change.

-@tool(name="tavily_search")
+@tool(name="search_tavily")
 def search_tavily(query: str) -> str:
@@
-@tool(name="weather_api")
+@tool(name="get_weather")
 def get_weather(city: str) -> str:
@@
-@tool(name="save_file")
+@tool(name="save_travel_plan")
 def save_travel_plan(filename: str, content: str) -> str:

Also applies to: 84-86, 110-112


243-281: Add an iteration limit to the tool-call loop to prevent infinite cycles.

If the assistant keeps returning tool_calls, this can loop indefinitely. Prior feedback flagged this.

-        # Process tool calls
-        while response.choices[0].message.tool_calls:
+        # Process tool calls
+        max_tool_iterations = int(os.getenv("TOOL_CALL_MAX_STEPS", "10"))
+        iterations = 0
+        while response.choices[0].message.tool_calls:
+            if iterations >= max_tool_iterations:
+                log_action("travel_agent", "tool_loop_stopped", {"reason": "max_iterations", "max": max_tool_iterations})
+                break
+            iterations += 1
             message = response.choices[0].message
             conversation.append({
                 "role": "assistant",
                 "content": message.content,
                 "tool_calls": [
                     {
                         "id": tc.id,
                         "type": tc.type,
                         "function": {
                             "name": tc.function.name,
                             "arguments": tc.function.arguments
                         }
                     } for tc in message.tool_calls
                 ]
             })
@@
             # Continue conversation with tool results
             response = client.chat.completions.create(
                 model="gpt-4o-mini",
                 messages=conversation,
                 tools=TOOL_DEFINITIONS,
                 tool_choice="auto",
                 max_tokens=1500
             )

224-236: Prefer logging over print for observability and integration.

Use the logging module (configurable levels/handlers) instead of print, as previously requested.

Add near the imports (outside these ranges):

import logging
logger = logging.getLogger(__name__)
if not logger.handlers:
    logging.basicConfig(level=logging.INFO)

Then update representative prints:

-        print(f"🔧 **Initializing OpenAI with {len(TOOL_DEFINITIONS)} tool definitions**")
+        logger.info("🔧 Initializing OpenAI with %d tool definitions", len(TOOL_DEFINITIONS))
@@
-                print(f"🔧 Calling tool: {tool_call.function.name}")
+                logger.info("🔧 Calling tool: %s", tool_call.function.name)
-                print(f"📊 Result: {tool_result[:100]}...\n")
+                logger.info("📊 Result: %s...", tool_result[:100])
@@
-    print("🔑 API Status:")
-    print(f"   OpenAI: {'✅ Ready' if openai_key else '❌ Missing'}")
-    print(f"   Tavily: {'✅ Ready' if tavily_key else '❌ Missing'}")
-    print(f"   Weather: {'✅ Ready' if weather_key else '❌ Missing'}")
+    logger.info("🔑 API Status:")
+    logger.info("   OpenAI: %s", "✅ Ready" if openai_key else "❌ Missing")
+    logger.info("   Tavily: %s", "✅ Ready" if tavily_key else "❌ Missing")
+    logger.info("   Weather: %s", "✅ Ready" if weather_key else "❌ Missing")
@@
-    print("\n" + "=" * 60)
-    print(f"📊 **Stats:** {len(agent_memory['conversation_log'])} actions logged")
+    logger.info("=" * 60)
+    logger.info("📊 Stats: %d actions logged", len(agent_memory["conversation_log"]))

Also applies to: 261-266, 310-319, 321-327


187-195: Harden JSON argument parsing for tool calls (avoid runtime failures).

LLM-produced tool arguments can be malformed; json.loads() may raise. Prior feedback requested error handling here.

 def execute_function_call(function_call) -> str:
     """Execute OpenAI function calls."""
     function_name = function_call.function.name
-    arguments = json.loads(function_call.function.arguments)
+    try:
+        arguments = json.loads(function_call.function.arguments)
+    except json.JSONDecodeError as e:
+        return f"❌ Invalid arguments for {function_name}: {e}"
+    except Exception as e:
+        return f"❌ Error parsing arguments for {function_name}: {e}"
     
     if function_name in FUNCTION_REGISTRY:
         return FUNCTION_REGISTRY[function_name](**arguments)
     else:
         return f"❌ Unknown function: {function_name}"
packages/sample-app/sample_app/strands_example.py (1)

386-386: Guard against empty content to avoid IndexError.

This was raised before; add a safe fallback.

-    print(f"Coordinator: {coord_response.message['content'][0]['text']}\n")
+    print(
+        f"Coordinator: "
+        f"{coord_response.message['content'][0]['text'] if coord_response.message['content'] else '[No content]'}\n"
+    )
🧹 Nitpick comments (3)
packages/sample-app/sample_app/traceloop_openai_tools.py (1)

70-74: Defensive formatting to avoid TypeError if API returns None fields.

Cast values to str before slicing to avoid potential TypeError if a key is present but null.

-                    title = res.get("title", "")[:50] + "..."
-                    content = res.get("content", "")[:150] + "..."
+                    title = str(res.get("title", ""))[:50] + "..."
+                    content = str(res.get("content", ""))[:150] + "..."
packages/sample-app/sample_app/strands_example.py (2)

47-51: Add a timeout to the DuckDuckGo API request.

Prevents hangs and improves reliability under network issues.

         response = requests.get(
             "https://api.duckduckgo.com/",
-            params={"q": query, "format": "json", "no_html": "1", "skip_disambig": "1"}
+            params={"q": query, "format": "json", "no_html": "1", "skip_disambig": "1"},
+            timeout=10
         )
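A sketch of the full defensive pattern around that call, assuming the same DuckDuckGo Instant Answer endpoint (the wrapper function name is illustrative):

```python
import requests

def safe_instant_answer(query: str) -> dict:
    """Fetch DuckDuckGo Instant Answer JSON with a bounded wait.

    Returns parsed JSON on success, or an empty dict on any network
    error, so callers never hang or crash on transient failures.
    """
    try:
        response = requests.get(
            "https://api.duckduckgo.com/",
            params={"q": query, "format": "json", "no_html": "1", "skip_disambig": "1"},
            timeout=10,  # seconds; applies to connect and read
        )
        response.raise_for_status()
        return response.json()
    except requests.exceptions.RequestException:
        return {}
```

Catching `requests.exceptions.RequestException` covers timeouts, connection errors, and non-2xx statuses (via `raise_for_status`) in one branch.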

293-296: Prefer not to exit the interpreter in library-like code.

Raising an exception is friendlier to callers/tests than exit(1).

-    if not openai_api_key:
-        print("Please set OPENAI_API_KEY environment variable")
-        exit(1)
+    if not openai_api_key:
+        raise RuntimeError("Please set OPENAI_API_KEY environment variable")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6176a2c and b6981c8.

📒 Files selected for processing (2)
  • packages/sample-app/sample_app/strands_example.py (1 hunks)
  • packages/sample-app/sample_app/traceloop_openai_tools.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit Inference Engine (CLAUDE.md)

**/*.py: Store API keys only in environment variables/secure vaults; never hardcode secrets in code
Use Flake8 for code linting and adhere to its rules

Files:

  • packages/sample-app/sample_app/strands_example.py
  • packages/sample-app/sample_app/traceloop_openai_tools.py
🧬 Code Graph Analysis (2)
packages/sample-app/sample_app/strands_example.py (2)
packages/traceloop-sdk/traceloop/sdk/decorators/__init__.py (1)
  • tool (63-73)
packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
  • Traceloop (37-274)
  • init (49-206)
packages/sample-app/sample_app/traceloop_openai_tools.py (2)
packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
  • Traceloop (37-274)
  • init (49-206)
packages/traceloop-sdk/traceloop/sdk/decorators/__init__.py (2)
  • agent (50-60)
  • tool (63-73)
🪛 Ruff (0.12.2)
packages/sample-app/sample_app/strands_example.py

3-3: subprocess imported but unused

Remove unused import: subprocess

(F401)


7-7: typing.Dict imported but unused

Remove unused import: typing.Dict

(F401)


79-79: Local variable research_data is assigned to but never used

Remove assignment to unused variable research_data

(F841)


115-115: Local variable base_cost is assigned to but never used

Remove assignment to unused variable base_cost

(F841)

packages/sample-app/sample_app/traceloop_openai_tools.py

6-6: typing.Dict imported but unused

Remove unused import

(F401)


6-6: typing.List imported but unused

Remove unused import

(F401)

🪛 Flake8 (7.2.0)
packages/sample-app/sample_app/strands_example.py

[error] 3-3: 'subprocess' imported but unused

(F401)


[error] 7-7: 'typing.Dict' imported but unused

(F401)


[error] 79-79: local variable 'research_data' is assigned to but never used

(F841)


[error] 115-115: local variable 'base_cost' is assigned to but never used

(F841)


[error] 308-308: continuation line unaligned for hanging indent

(E131)


[error] 320-320: continuation line unaligned for hanging indent

(E131)


[error] 332-332: continuation line unaligned for hanging indent

(E131)


[error] 344-344: continuation line unaligned for hanging indent

(E131)

packages/sample-app/sample_app/traceloop_openai_tools.py

[error] 6-6: 'typing.Dict' imported but unused

(F401)


[error] 6-6: 'typing.List' imported but unused

(F401)

@@ -0,0 +1,410 @@
import os
import json
import subprocess

🛠️ Refactor suggestion

Remove unused imports to pass Flake8/Ruff.

subprocess and Dict are unused.

-import subprocess
@@
-from typing import Dict, Any
+from typing import Any

Also applies to: 7-7

🧰 Tools
🪛 Ruff (0.12.2)

3-3: subprocess imported but unused

Remove unused import: subprocess

(F401)

🪛 Flake8 (7.2.0)

[error] 3-3: 'subprocess' imported but unused

(F401)

🤖 Prompt for AI Agents
In packages/sample-app/sample_app/strands_example.py around lines 3 and 7,
remove the unused imports causing Flake8/Ruff failures: delete the import
subprocess on line 3 and remove the unused Dict import on line 7 (or consolidate
imports to only those actually used in the file). Save the file and run the
linter to confirm the warnings are resolved.

def create_itinerary(destination: str, duration: str, interests: str) -> str:
"""Create detailed travel itinerary based on research and preferences."""
# Check memory for research data
research_data = shared_memory.get("research_data", {})
🛠️ Refactor suggestion

Remove unused variable.

research_data is assigned but not used.

-    research_data = shared_memory.get("research_data", {})
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
research_data = shared_memory.get("research_data", {})
🧰 Tools
🪛 Ruff (0.12.2)

79-79: Local variable research_data is assigned to but never used

Remove assignment to unused variable research_data

(F841)

🪛 Flake8 (7.2.0)

[error] 79-79: local variable 'research_data' is assigned to but never used

(F841)

🤖 Prompt for AI Agents
In packages/sample-app/sample_app/strands_example.py around line 79, the
variable research_data is assigned from shared_memory.get but never used.
Either delete the research_data = shared_memory.get("research_data", {})
statement, or, if the data is intended to be used, reference research_data in
the itinerary logic; otherwise remove the assignment to eliminate the unused
variable.

@tool
def estimate_budget(destination: str, duration: str, travelers: int = 1) -> str:
"""Estimate travel budget for the trip."""
base_cost = 150 # per day per person
🛠️ Refactor suggestion

Remove unused variable.

base_cost is assigned but not used.

-    base_cost = 150  # per day per person
🧰 Tools
🪛 Ruff (0.12.2)

115-115: Local variable base_cost is assigned to but never used

Remove assignment to unused variable base_cost

(F841)

🪛 Flake8 (7.2.0)

[error] 115-115: local variable 'base_cost' is assigned to but never used

(F841)

🤖 Prompt for AI Agents
In packages/sample-app/sample_app/strands_example.py around line 115, the
variable "base_cost" is assigned but never used. Either delete the
"base_cost = 150" line or, if the value is intended, reference "base_cost"
where the per-day-per-person cost should be applied so it feeds the budget
calculation or the returned estimate.

Comment on lines +307 to +313
system_prompt="You are a Travel Research Specialist with access to real-time web data. Your expertise includes:\n" +
"- Live destination research using current web information\n" +
"- Real weather forecasts and climate data\n" +
"- Up-to-date travel advisories and requirements\n" +
"- Current events and local information\n" +
"Always provide accurate, real-time travel information."
)
🛠️ Refactor suggestion

Fix E131 by using implicit string concatenation (no hanging indents).

Avoids Flake8 continuation alignment errors and improves readability.

-        system_prompt="You are a Travel Research Specialist with access to real-time web data. Your expertise includes:\n" +
-                     "- Live destination research using current web information\n" +
-                     "- Real weather forecasts and climate data\n" +
-                     "- Up-to-date travel advisories and requirements\n" +
-                     "- Current events and local information\n" +
-                     "Always provide accurate, real-time travel information."
+        system_prompt=(
+            "You are a Travel Research Specialist with access to real-time web data. Your expertise includes:\n"
+            "- Live destination research using current web information\n"
+            "- Real weather forecasts and climate data\n"
+            "- Up-to-date travel advisories and requirements\n"
+            "- Current events and local information\n"
+            "Always provide accurate, real-time travel information."
+        )
🧰 Tools
🪛 Flake8 (7.2.0)

[error] 308-308: continuation line unaligned for hanging indent

(E131)

🤖 Prompt for AI Agents
In packages/sample-app/sample_app/strands_example.py around lines 307 to 313,
the multiline system_prompt uses explicit string concatenation with '+' and
hanging indents which triggers Flake8 E131; replace the '+' concatenation with
implicit string literal concatenation by placing the string pieces adjacent
inside the parentheses (each piece as a separate quoted literal on its own line,
properly indented) so Python will join them without operators and eliminate the
continuation alignment error.
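To see why the suggested form is equivalent, adjacent string literals inside parentheses are joined at compile time, so the lint-clean version builds the identical string without `+` operators or hanging-indent alignment issues (the strings below are shortened stand-ins for the prompts):

```python
# '+' concatenation with hanging indents (the form Flake8 flags as E131)
concatenated = ("expertise includes:\n" +
                "- live research\n" +
                "- weather data\n")

# Implicit literal concatenation inside parentheses: same value, no E131
implicit = (
    "expertise includes:\n"
    "- live research\n"
    "- weather data\n"
)

assert concatenated == implicit
```

The same rewrite applies to each of the `system_prompt` strings flagged below.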

Comment on lines +319 to +325
system_prompt="You are a Trip Planning Specialist. Your expertise includes:\n" +
"- Creating detailed, personalized itineraries\n" +
"- Budget estimation and financial planning\n" +
"- Optimizing travel schedules and logistics\n" +
"- Balancing activities with traveler preferences\n" +
"Create well-structured, realistic travel plans."
)
🛠️ Refactor suggestion

Fix E131 for planning agent prompt.

-        system_prompt="You are a Trip Planning Specialist. Your expertise includes:\n" +
-                     "- Creating detailed, personalized itineraries\n" +
-                     "- Budget estimation and financial planning\n" +
-                     "- Optimizing travel schedules and logistics\n" +
-                     "- Balancing activities with traveler preferences\n" +
-                     "Create well-structured, realistic travel plans."
+        system_prompt=(
+            "You are a Trip Planning Specialist. Your expertise includes:\n"
+            "- Creating detailed, personalized itineraries\n"
+            "- Budget estimation and financial planning\n"
+            "- Optimizing travel schedules and logistics\n"
+            "- Balancing activities with traveler preferences\n"
+            "Create well-structured, realistic travel plans."
+        )
🧰 Tools
🪛 Flake8 (7.2.0)

[error] 320-320: continuation line unaligned for hanging indent

(E131)

🤖 Prompt for AI Agents
In packages/sample-app/sample_app/strands_example.py around lines 319 to 325,
the multiline system_prompt concatenation triggers E131 (continuation line
unaligned for hanging indent); replace the current string concatenation with a
properly aligned multiline string (either a single triple-quoted string or
implicit concatenation inside parentheses) and align subsequent lines to the
opening parenthesis or triple-quote so the continuation lines comply with PEP8
indentation rules.

Comment on lines +331 to +337
system_prompt="You are a Travel Booking Specialist. Your expertise includes:\n" +
"- Finding the best flight deals and options\n" +
"- Locating suitable accommodations\n" +
"- Comparing prices and amenities\n" +
"- Providing booking recommendations and alternatives\n" +
"Help travelers find the best deals within their budget."
)
🛠️ Refactor suggestion

Fix E131 for booking agent prompt.

-        system_prompt="You are a Travel Booking Specialist. Your expertise includes:\n" +
-                     "- Finding the best flight deals and options\n" +
-                     "- Locating suitable accommodations\n" +
-                     "- Comparing prices and amenities\n" +
-                     "- Providing booking recommendations and alternatives\n" +
-                     "Help travelers find the best deals within their budget."
+        system_prompt=(
+            "You are a Travel Booking Specialist. Your expertise includes:\n"
+            "- Finding the best flight deals and options\n"
+            "- Locating suitable accommodations\n"
+            "- Comparing prices and amenities\n"
+            "- Providing booking recommendations and alternatives\n"
+            "Help travelers find the best deals within their budget."
+        )
🧰 Tools
🪛 Flake8 (7.2.0)

[error] 332-332: continuation line unaligned for hanging indent

(E131)

🤖 Prompt for AI Agents
In packages/sample-app/sample_app/strands_example.py around lines 331 to 337,
the multi-line system_prompt string concatenation triggers E131 (continuation
line indentation); replace the current string + concatenation with a properly
aligned multi-line literal — either use a triple-quoted string or implicit
concatenation inside parentheses with each continued line indented to match the
opening parenthesis — and ensure the closing parenthesis is on its own line
aligned with the start of the call.

Comment on lines +343 to +349
system_prompt="You are the Travel Coordinator. You orchestrate a team of specialists:\n" +
"- Research Agent: destination info, weather, attractions\n" +
"- Planning Agent: itineraries, budgets, logistics\n" +
"- Booking Agent: flights, hotels, reservations\n" +
"\nAnalyze user requests and delegate to the appropriate specialist. " +
"Coordinate between agents to provide comprehensive travel assistance."
)
🛠️ Refactor suggestion

Fix E131 for coordinator agent prompt.

-        system_prompt="You are the Travel Coordinator. You orchestrate a team of specialists:\n" +
-                     "- Research Agent: destination info, weather, attractions\n" +
-                     "- Planning Agent: itineraries, budgets, logistics\n" +
-                     "- Booking Agent: flights, hotels, reservations\n" +
-                     "\nAnalyze user requests and delegate to the appropriate specialist. " +
-                     "Coordinate between agents to provide comprehensive travel assistance."
+        system_prompt=(
+            "You are the Travel Coordinator. You orchestrate a team of specialists:\n"
+            "- Research Agent: destination info, weather, attractions\n"
+            "- Planning Agent: itineraries, budgets, logistics\n"
+            "- Booking Agent: flights, hotels, reservations\n"
+            "\nAnalyze user requests and delegate to the appropriate specialist. "
+            "Coordinate between agents to provide comprehensive travel assistance."
+        )
🧰 Tools
🪛 Flake8 (7.2.0)

[error] 344-344: continuation line unaligned for hanging indent

(E131)

🤖 Prompt for AI Agents
In packages/sample-app/sample_app/strands_example.py around lines 343 to 349,
the multi-line system_prompt concatenation triggers E131 (continuation line
unaligned). Reformat the string so continuation lines are aligned consistently:
wrap the full prompt in parentheses and use implicit string-literal
concatenation (or a triple-quoted string), with each continued line indented
to the same level, so the linter's check passes.

import requests
from datetime import datetime
from pathlib import Path
from typing import Dict, Any, List
🛠️ Refactor suggestion

Remove unused typing imports to satisfy Flake8/Ruff.

Dict and List are unused.

-from typing import Dict, Any, List
+from typing import Any
🧰 Tools
🪛 Ruff (0.12.2)

6-6: typing.Dict imported but unused

Remove unused import

(F401)


6-6: typing.List imported but unused

Remove unused import

(F401)

🪛 Flake8 (7.2.0)

[error] 6-6: 'typing.Dict' imported but unused

(F401)


[error] 6-6: 'typing.List' imported but unused

(F401)

🤖 Prompt for AI Agents
In packages/sample-app/sample_app/traceloop_openai_tools.py around line 6, the
typing import includes unused names Dict and List which trigger Flake8/Ruff
errors; remove Dict and List from the import so only used types remain (e.g.,
keep Any if used), and run the linter to confirm the unused-import warning is
resolved.
