fix(openai-agents): fix broken traces with agents handoff on run_stream #3143

galkleinman merged 20 commits into `main`
Conversation
Walkthrough

The changes introduce advanced tracing and streaming support for the OpenAI Agents SDK instrumentation, including thread-local span management for trace continuity, enhanced tool and agent telemetry, and comprehensive multi-agent workflow testing. A new streaming recipe-management chat application is added as a sample, featuring multi-agent handoffs, function tool integration, and OpenAI-powered recipe editing. Additionally, JSON serialization is improved with a custom encoder handling dataclasses and special cases.
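The custom encoder mentioned in the walkthrough could look roughly like this. This is a sketch, not the PR's actual implementation; the class name `SafeJSONEncoder` and the fallback-to-`str` behavior are assumptions.

```python
import dataclasses
import json


class SafeJSONEncoder(json.JSONEncoder):
    """Hypothetical encoder: serializes dataclasses, falls back to str()."""

    def default(self, o):
        # Dataclass instances become plain dicts.
        if dataclasses.is_dataclass(o) and not isinstance(o, type):
            return dataclasses.asdict(o)
        try:
            return super().default(o)
        except TypeError:
            # Last-resort fallback for otherwise non-serializable objects.
            return str(o)


@dataclasses.dataclass
class Recipe:
    name: str
    servings: int


print(json.dumps({"recipe": Recipe("carbonara", 2)}, cls=SafeJSONEncoder))
```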
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant MainChatAgent
    participant RecipeEditorAgent
    participant FunctionTool
    participant OpenAIAPI
    User->>MainChatAgent: User query (e.g. "Make carbonara vegetarian")
    MainChatAgent->>RecipeEditorAgent: Handoff with context
    RecipeEditorAgent->>FunctionTool: plan_and_apply_recipe_modifications
    FunctionTool->>OpenAIAPI: ChatCompletion (modification plan)
    OpenAIAPI-->>FunctionTool: Plan response
    FunctionTool-->>RecipeEditorAgent: Modified recipe result
    RecipeEditorAgent-->>MainChatAgent: Edited recipe output
    MainChatAgent-->>User: Streaming assistant response
```
Caution
Changes requested ❌
Reviewed everything up to dc1247b in 3 minutes and 16 seconds.

- Reviewed 2815 lines of code in 6 files
- Skipped 2 files when reviewing
- Skipped posting 5 draft comments; view those below.
1. packages/tests/test_openai_agents.py:39
   - Draft comment: Instead of comparing span.kind with itself (e.g. `span.kind == span.kind.CLIENT`), use the constant from OpenTelemetry (e.g. `SpanKind.CLIENT`) for clarity and correctness.
   - Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. packages/tests/test_openai_agents.py:110
   - Draft comment: Similarly, when verifying tool spans, avoid `tool_span.kind == tool_span.kind.INTERNAL`; use the OpenTelemetry constant (`SpanKind.INTERNAL`) instead.
   - Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
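A sketch of the assertion style these two draft comments suggest. A stand-in enum is used here so the snippet is self-contained; a real test would import `SpanKind` from `opentelemetry.trace`.

```python
from enum import Enum


class SpanKind(Enum):
    """Stand-in for opentelemetry.trace.SpanKind (assumption for this sketch)."""

    INTERNAL = 0
    SERVER = 1
    CLIENT = 3


def assert_client_span(kind: SpanKind) -> None:
    # Compare against the module-level constant rather than `kind.CLIENT`,
    # so the intent is unambiguous and a wrong kind fails loudly.
    assert kind == SpanKind.CLIENT, f"expected CLIENT span, got {kind}"


assert_client_span(SpanKind.CLIENT)
```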
3. packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py:49
   - Draft comment: Consider whether directly clearing internal globals (`_root_span_storage` and `_instrumented_tools`) in tests is acceptable; if these are internal details, they may need to be encapsulated to avoid accidental misuse.
   - Reason this comment was not posted: Confidence changes required: 50% <= threshold 50%
4. packages/opentelemetry-instrumentation-openai-agents/tests/cassettes/test_openai_agents/test_recipe_workflow_agent_handoffs_with_function_tools.yaml:5
   - Draft comment: Typo: the phrase "Recipe Editor Agent agent" appears redundant. Consider changing it to "Recipe Editor Agent".
   - Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 10% vs. threshold = 50%). The redundant "agent" is a trivial text issue in a function description; the meaning stays clear and functionality is unaffected, so the comment was deleted as unimportant.
5. packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py:193
   - Draft comment: Typographical suggestion: change the message from "Found 1 recipes matching \"{query}\"" to "Found 1 recipe matching \"{query}\"" to maintain number agreement.
   - Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable (usefulness confidence = 10% vs. threshold = 50%). It points out a minor grammatical issue in test-only code that doesn't impact functionality or readability significantly, so the comment was deleted.
```python
if content_parts:
    print("".join(content_parts), end="", flush=True)
# ...
elif event.name == "handoff_occured":
```
Typographical error: 'handoff_occured' is misspelled. Consider changing it to 'handoff_occurred' to reflect correct spelling.
Actionable comments posted: 1
🧹 Nitpick comments (5)
packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py (1)

21-21: Consider using modern type union syntax. Since the project requires Python >=3.10 (as specified in pyproject.toml), you can use the modern union syntax with `|` instead of `Union`:

```diff
-from typing import List, Dict, Union
+from typing import List, Dict
```

Then update the type annotations:

```diff
-    recipes: Union[Dict[str, Recipe], None] = None
-    recipe_count: Union[int, None] = None
-    query: Union[str, None] = None
+    recipes: Dict[str, Recipe] | None = None
+    recipe_count: int | None = None
+    query: str | None = None
```

```diff
-    modified_recipe: Union[Recipe, None] = None
-    changes_made: Union[List[str], None] = None
-    original_recipe: Union[Recipe, None] = None
+    modified_recipe: Recipe | None = None
+    changes_made: List[str] | None = None
+    original_recipe: Recipe | None = None
```

Also applies to: 161-163, 168-170
packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py (1)

299-302: Remove or use the unused loop variable. The loop variable `event` is not used within the loop body.

```diff
-    async for event in recipe_runner.stream_events():
-        pass  # Just consume the events to let the execution complete
+    async for _ in recipe_runner.stream_events():
+        pass  # Just consume the events to let the execution complete
```

packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (2)
132-132: Simplify `.get()` calls as suggested by static analysis. The default value `None` is redundant when using `.get()`:

```diff
-    root_span = _root_span_storage.get(thread_id, None)
+    root_span = _root_span_storage.get(thread_id)
```

Also applies to: 292-292, 527-527
163-188: Consider extracting common input capture logic. The input capture logic is duplicated between `_wrap_agent_run_streamed` and `_wrap_agent_run`. Consider extracting this into a shared helper function to improve maintainability. Would you like me to provide a refactored version that extracts the common input/output capture logic into reusable helper functions?

Also applies to: 323-347
packages/sample-app/sample_app/openai_agents_example.py (1)

579-587: Simplify nested if statements. The nested if statements can be combined for better readability.

```diff
-    if event.type == "raw_response_event":
-        if isinstance(event.data, ResponseTextDeltaEvent):
-            # Stream the assistant's token delta
-            print(event.data.delta, end="", flush=True)
-        elif isinstance(event.data, ResponseOutputItemAddedEvent):
-            if isinstance(event.data.item, ResponseFunctionToolCall):
-                # Tool call metadata becomes available here
-                print(f"\n[Calling tool: {event.data.item.name}]\n")
+    if event.type == "raw_response_event":
+        if isinstance(event.data, ResponseTextDeltaEvent):
+            # Stream the assistant's token delta
+            print(event.data.delta, end="", flush=True)
+        elif isinstance(event.data, ResponseOutputItemAddedEvent) and isinstance(
+            event.data.item, ResponseFunctionToolCall
+        ):
+            # Tool call metadata becomes available here
+            print(f"\n[Calling tool: {event.data.item.name}]\n")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)

- `package-lock.json` is excluded by `!**/package-lock.json`
- `packages/sample-app/poetry.lock` is excluded by `!**/*.lock`

📒 Files selected for processing (5)

- `packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py` (7 hunks)
- `packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py` (3 hunks)
- `packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py` (2 hunks)
- `packages/sample-app/pyproject.toml` (2 hunks)
- `packages/sample-app/sample_app/openai_agents_example.py` (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py (1)

- packages/sample-app/sample_app/openai_agents_example.py (5): `Recipe` (46-53), `SearchResponse` (65-70), `EditResponse` (56-62), `search_recipes` (357-484), `plan_and_apply_recipe_modifications` (255-353)

packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py (3)

- packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py (2): `exporter` (27-37), `recipe_workflow_agents` (145-249)
- packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (1): `get_finished_spans` (40-43)
- packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (1): `TraceloopSpanKindValues` (269-274)
🪛 Ruff (0.12.2)
packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py
301-301: Loop control variable event not used within loop body
(B007)
packages/sample-app/sample_app/openai_agents_example.py
583-584: Use a single if statement instead of nested if statements
(SIM102)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py
132-132: Use `_root_span_storage.get(thread_id)` instead of `_root_span_storage.get(thread_id, None)`

(SIM910)

292-292: Use `_root_span_storage.get(thread_id)` instead of `_root_span_storage.get(thread_id, None)`

(SIM910)

527-527: Use `_root_span_storage.get(thread_id)` instead of `_root_span_storage.get(thread_id, None)`
(SIM910)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test Packages (3.12)
🔇 Additional comments (7)
packages/sample-app/pyproject.toml (1)

58-58: LGTM! Dependencies properly configured. The addition of the `openai-agents` package and the local development dependency for the instrumentation package are correctly configured to support the new OpenAI Agents SDK integration.

Also applies to: 97-99
packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py (2)

49-52: Excellent addition for test isolation! Clearing the global span storage and instrumented tools between tests ensures clean state and prevents test interference. This is a critical fix for reliable test execution.
144-249: Well-structured test fixture for multi-agent workflows. The `recipe_workflow_agents` fixture provides comprehensive mock data and properly configured agents with function tools and handoff capabilities. The mock implementations of `search_recipes` and `plan_and_apply_recipe_modifications` are realistic and will enable thorough testing of the multi-agent tracing functionality.

packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py (2)
275-398: Comprehensive test coverage for multi-agent workflows! This test thoroughly validates:

- Agent handoffs between Main Chat Agent and Recipe Editor Agent
- Function tool invocations with proper span attributes
- Parent-child span relationships
- Trace continuity across multiple agents
- Input/output attribute capture

The test structure and assertions provide excellent coverage for the new streaming and handoff instrumentation features.

101-101: Span count reduction verified as intentional. No TODO/FIXME comments or related references to span-count changes exist in the repository, and the current instrumentation emits exactly three spans (agent, tool, and underlying internal span) for the function-tool test. The updated assertion `assert len(spans) == 3` correctly reflects the actual behavior.
assert len(spans) == 3correctly reflects the actual behavior.packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (1)
108-270: Excellent implementation of streaming and trace continuity!The implementation properly handles:
- Thread-local root span storage for maintaining trace context across agent handoffs
- Comprehensive attribute capture for both streaming and non-streaming execution
- Proper error handling and status setting
- Metrics recording for duration and token usage
This effectively solves the broken traces issue with agent handoffs.
Also applies to: 272-428
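The thread-local root span storage described above can be sketched as follows. Helper names are hypothetical; the actual module keys the dict by `threading.get_ident()`, as discussed elsewhere in this review.

```python
import threading
from typing import Any, Dict, Optional

# One root span per thread, so spans created during an agent handoff
# can re-attach to the workflow's root span on the same thread.
_root_span_storage: Dict[int, Any] = {}


def store_root_span(span: Any) -> None:
    _root_span_storage[threading.get_ident()] = span


def get_root_span() -> Optional[Any]:
    return _root_span_storage.get(threading.get_ident())


def clear_root_span() -> None:
    # Pop instead of del so repeated cleanup is safe.
    _root_span_storage.pop(threading.get_ident(), None)


# Usage: the run wrapper stores the root span; later tool/handoff wrappers
# on the same thread retrieve it to parent their spans correctly.
store_root_span("root-span-sentinel")
assert get_root_span() == "root-span-sentinel"
clear_root_span()
assert get_root_span() is None
```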
packages/sample-app/sample_app/openai_agents_example.py (1)

1-695: Excellent sample application demonstrating multi-agent workflows! This comprehensive example effectively showcases:

- Multi-agent architecture with specialized roles
- Function tools for recipe search and AI-powered modifications
- Streaming responses with proper event handling
- Agent handoffs based on conversation context
- Well-structured data models using Pydantic

The application serves as a valuable reference for users implementing similar multi-agent systems with the OpenAI Agents SDK.
Important
Looks good to me! 👍
Reviewed 6ea260a in 2 minutes and 33 seconds.

- Reviewed 463 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 3 draft comments; view those below.
1. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:123
   - Draft comment: Consider using a context-based solution (e.g. `contextvars`) instead of `threading.get_ident()` for storing the root span. Relying on thread IDs may be problematic in asynchronous scenarios where the thread context can change.
   - Reason this comment was not posted: Comment was on unchanged code.
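A sketch of the `contextvars` alternative that draft comment suggests: the variable follows the logical async context rather than the OS thread, so the stored root span survives awaits and task switches within the same workflow.

```python
import asyncio
import contextvars

# The context variable travels with the logical async context, unlike a
# dict keyed by thread id ("root_span" is an illustrative name).
_root_span_var: contextvars.ContextVar = contextvars.ContextVar(
    "root_span", default=None
)


async def child():
    # Sees the value set in the parent task's context, because
    # create_task snapshots the current context at creation time.
    return _root_span_var.get()


async def main():
    _root_span_var.set("root-span")
    return await asyncio.create_task(child())


assert asyncio.run(main()) == "root-span"
```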
2. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:166
   - Draft comment: Avoid using `eval()` to parse input data. This approach can be a security risk if untrusted data is processed; consider replacing it with a safer alternative such as `json.loads()`.
   - Reason this comment was not posted: Comment was on unchanged code.
3. packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py:364
   - Draft comment: The new trace ID consistency check using the `all_trace_ids` set in the handoff test is a solid improvement. It robustly ensures that all spans belong to the same trace.
   - Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
Actionable comments posted: 1
♻️ Duplicate comments (2)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (2)

103-245: Well-structured streaming wrapper with a security concern. The new streaming wrapper function provides comprehensive tracing for streaming execution with excellent thread-local context management and detailed attribute extraction. However, there's a critical security issue: avoid using `eval()`. Calling `eval()` on string input is a security risk; consider `json.loads()` or `ast.literal_eval()` for safer parsing.

```diff
-    item_dict = eval(item) if isinstance(item, str) else item
+    item_dict = json.loads(item) if isinstance(item, str) else item
```

The function excels in:

- Thread-local root span management
- Comprehensive input/output serialization
- Proper error handling and status setting
- Metrics recording with detailed attributes
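A hedged sketch of the safer parsing the comment asks for (`safe_parse` is a hypothetical helper, not the PR's code): try JSON first, fall back to `ast.literal_eval` for Python-literal reprs, and return the raw string if neither applies.

```python
import ast
import json


def safe_parse(item):
    """Parse a string payload without eval()."""
    if not isinstance(item, str):
        return item
    try:
        # Common case: the item is serialized JSON.
        return json.loads(item)
    except (ValueError, TypeError):
        pass
    try:
        # Fallback for Python-literal reprs like "{'role': 'user'}".
        return ast.literal_eval(item)
    except (ValueError, SyntaxError):
        # Neither format: keep the raw string rather than guessing.
        return item


assert safe_parse('{"role": "user"}') == {"role": "user"}
assert safe_parse("{'role': 'user'}") == {"role": "user"}
assert safe_parse("not a dict") == "not a dict"
```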
262-359: Enhanced wrapper with thread-local storage support and minor issues. The wrapper now properly supports thread-local root span storage for trace continuity across handoffs, with improved input/output attribute handling. Issues to address: avoid using `eval()` for security reasons (consider `json.loads()` or `ast.literal_eval()`), and simplify the dictionary `get()` call:

```diff
-    root_span = _root_span_storage.get(thread_id, None)
+    root_span = _root_span_storage.get(thread_id)
```

The enhanced functionality is excellent:

- Thread-local root span context propagation
- Comprehensive input/output serialization
- Proper context management for trace continuity
🧹 Nitpick comments (1)

packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (1)

479-566: Excellent tool instrumentation with thread-local context support. The enhanced tool details extraction provides comprehensive instrumentation with proper thread-local context management and duplicate prevention. Simplify the dictionary `get()` call:

```diff
-    root_span = _root_span_storage.get(thread_id, None)
+    root_span = _root_span_storage.get(thread_id)
```

Key improvements:

- Thread-local root span context propagation
- Registry-based duplicate prevention for tool instrumentation
- Proper wrapping of FunctionTool execution with spans
- Immediate span creation for hosted tools with detailed attributes
- Maintained parent-child relationships through context management
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py(7 hunks)packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py(2 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py
297-297: Loop control variable event not used within loop body
(B007)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py
125-125: Use `_root_span_storage.get(thread_id)` instead of `_root_span_storage.get(thread_id, None)`

(SIM910)

263-263: Use `_root_span_storage.get(thread_id)` instead of `_root_span_storage.get(thread_id, None)`

(SIM910)

481-481: Use `_root_span_storage.get(thread_id)` instead of `_root_span_storage.get(thread_id, None)`
(SIM910)
🔇 Additional comments (4)
packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py (1)

101-101: LGTM: Updated span count expectation aligns with instrumentation changes. The reduction from 4 to 3 expected spans correctly reflects the optimized span management introduced by the new thread-local root span storage system.
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (3)

5-5: LGTM: Essential imports and global state for thread-local span management. The new imports and global variables are well-designed for the enhanced instrumentation:

- `threading` for thread identification
- `context` and `set_span_in_context` for proper span context propagation
- `_root_span_storage` for thread-local root span tracking
- `_instrumented_tools` for preventing duplicate tool instrumentation

Also applies to: 8-8, 10-10, 29-30

80-81: LGTM: Proper cleanup of new instrumentation components. The uninstrumentation method correctly unwraps the new streaming method and clears the instrumented tools registry, ensuring proper cleanup.

67-76: LGTM: Proper registration of streaming method wrapper. The wrapper registration for `_run_single_turn_streamed` follows the established pattern and correctly integrates the new streaming instrumentation.
Important
Looks good to me! 👍
Reviewed 7d221ab in 1 minute and 11 seconds.

- Reviewed 31 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 3 draft comments; view those below.
1. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:122
   - Draft comment: Simplify `dict.get` usage: specifying `None` is redundant since `get()` returns `None` by default.
   - Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
2. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:166
   - Draft comment: Replaced `eval` with `json.loads` to safely parse JSON strings and avoid security risks.
   - Reason this comment was not posted: Purely informative (usefulness confidence = 0% <= threshold 50%). It explains a change that was made without providing a suggestion, asking for confirmation, or pointing out a potential issue, so it was removed.
3. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:300
   - Draft comment: Using `json.loads` instead of `eval` enhances security by preventing arbitrary code execution.
   - Reason this comment was not posted: Purely informative (usefulness confidence = 0% <= threshold 50%). It explains why `json.loads` is preferred over `eval` without suggesting a change or asking about the code's intention.
Actionable comments posted: 0
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (2)

262-269: Thread-local span management implemented correctly. The root span storage logic ensures trace continuity. Minor suggestion: line 263 can be simplified, as suggested by static analysis:

```diff
-    root_span = _root_span_storage.get(thread_id, None)
+    root_span = _root_span_storage.get(thread_id)
```
478-531: Excellent refactoring for FunctionTool instrumentation. The implementation:

- Correctly prevents duplicate instrumentation using the `_instrumented_tools` set
- Properly wraps async tool invocations with tracing spans
- Maintains context propagation from the root span
- Captures input/output for observability

Minor suggestion for line 481 (from static analysis):

```diff
-    root_span = _root_span_storage.get(thread_id, None)
+    root_span = _root_span_storage.get(thread_id)
```
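The duplicate-prevention idea above can be sketched as a small registry guard. Names are hypothetical, and the real wrapper opens OpenTelemetry spans rather than passing calls through untouched.

```python
# Registry mirroring _instrumented_tools: wrap each tool's invoke
# function at most once, even if instrumentation runs again.
_instrumented_tools = set()


def instrument_tool(tool_name, invoke):
    if tool_name in _instrumented_tools:
        return invoke  # already wrapped; avoid double spans

    _instrumented_tools.add(tool_name)

    def traced_invoke(*args, **kwargs):
        # A real wrapper would open a span here, parented to the root span.
        return invoke(*args, **kwargs)

    return traced_invoke


def search(q):
    return f"results for {q}"


wrapped_once = instrument_tool("search_recipes", search)
# The second attempt is a no-op: the original callable comes back unchanged.
assert instrument_tool("search_recipes", wrapped_once) is wrapped_once
assert wrapped_once("pasta") == "results for pasta"
```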
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)

- `packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py` (7 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py
263-263: Use `_root_span_storage.get(thread_id)` instead of `_root_span_storage.get(thread_id, None)`

(SIM910)

481-481: Use `_root_span_storage.get(thread_id)` instead of `_root_span_storage.get(thread_id, None)`
(SIM910)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test Packages (3.9)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Lint
🔇 Additional comments (4)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (4)

67-76: LGTM! Streaming instrumentation added correctly. The wrapper for `_run_single_turn_streamed` follows the established pattern and properly passes the required parameters.

80-82: Proper cleanup in uninstrumentation. Good addition of clearing the `_instrumented_tools` set to prevent memory leaks and ensure clean re-instrumentation.
103-245: Well-implemented streaming wrapper with proper trace continuity. The implementation correctly:

- Uses thread-local storage for maintaining root span context across async operations
- Handles various input/output data formats with appropriate serialization
- Records duration metrics and sets proper span attributes
- Uses safe `json.loads()` for parsing (not `eval()`)
532-567: Proper handling of hosted tools with immediate span creation. The implementation correctly:

- Creates and ends spans immediately for hosted tools
- Sets tool-specific attributes for observability
- Maintains trace context from the root span
Important
Looks good to me! 👍
Reviewed 76dcbcb in 2 minutes and 33 seconds.

- Reviewed 652 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 6 draft comments; view those below.
1. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:124
   - Draft comment: Consider cleaning up the stored root span (in `_root_span_storage`) once the span's work is complete. In long-running applications, keeping stale span references may lead to memory leaks. A `finally` block to remove the entry after span completion might be beneficial.
   - Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:158
   - Draft comment: The change converting single-quoted attribute names to double quotes improves consistency. Ensure that this styling change is aligned with project conventions and does not inadvertently alter expected key names.
   - Reason this comment was not posted: Confidence changes required: 20% <= threshold 50%
3. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:136
   - Draft comment: The use of trailing commas in multi-line dictionary literals enhances readability and minimizes diff noise. This looks good; ensure consistency throughout the code base.
   - Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
4. packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py:56
   - Draft comment: The reformatting of assert statements to remove redundant parentheses improves readability. Verify that all assertions evaluate correctly.
   - Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%
5. packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py:257
   - Draft comment: In the test for recipe workflow agent handoffs, the expectation of exactly 3 'Recipe Editor Agent.agent' spans enforces a specific behavior. Consider abstracting these repeated span attribute checks into helper methods for better maintainability as the handoff logic evolves.
   - Reason this comment was not posted: Confidence changes required: 33% <= threshold 50%
6. packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py:215
   - Draft comment: It appears that the assertion starting on this line is incomplete. The diff shows an unfinished expression (e.g. a missing attribute key or closing bracket) at line 216. Please complete this assertion to ensure the test is valid.
   - Reason this comment was not posted: Comment looked like it was already resolved.
Actionable comments posted: 1
♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py (1)

226-366: Comprehensive test coverage with an unused loop variable. This test provides excellent coverage of the multi-agent workflow with streaming, handoffs, and function tools. The assertions thoroughly validate span attributes, relationships, and trace continuity. However, there's an unused loop variable that should be addressed:

```diff
-    async for event in recipe_runner.stream_events():
-        pass
+    async for _ in recipe_runner.stream_events():
+        pass
```

The test design is well-structured and validates key aspects:

- Span counts and names
- Agent and tool attributes
- Parent-child relationships
- Trace ID continuity across handoffs
- Proper status codes
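The trace-continuity check at the heart of the handoff test can be sketched like this, using stand-in span objects; real tests would read `span.context.trace_id` from the exporter's finished spans.

```python
from dataclasses import dataclass


@dataclass
class FakeContext:
    trace_id: int


@dataclass
class FakeSpan:
    name: str
    context: FakeContext


def assert_single_trace(spans):
    """Every span in a handoff workflow must share one trace id."""
    all_trace_ids = {span.context.trace_id for span in spans}
    assert len(all_trace_ids) == 1, f"broken trace: {all_trace_ids}"


spans = [
    FakeSpan("Main Chat Agent.agent", FakeContext(0xABC)),
    FakeSpan("Recipe Editor Agent.agent", FakeContext(0xABC)),
    FakeSpan("search_recipes.tool", FakeContext(0xABC)),
]
assert_single_trace(spans)  # passes: one shared trace id
```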
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py(10 hunks)packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py(7 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py
250-250: Loop control variable event not used within loop body
(B007)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Lint
- GitHub Check: Test Packages (3.11)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.9)
🔇 Additional comments (12)
packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py (6)
57-68: Assertion refactoring improves readability. The conversion from multi-line span attribute checks to single-line assertions makes the test code more concise and easier to read while maintaining the same validation logic.
97-105: Assertions correctly validate tool attributes. The single-line assertions properly validate the tool name, type, and schema attributes for the FunctionTool.
134-139: Tool type validation is correct. The assertions properly validate WebSearchTool attributes and confirm the absence of the optional user_location attribute.
159-167: Handoff attribute parsing is properly tested. The test correctly validates that handoff information is stored as JSON strings and can be parsed back to verify the agent names and instructions.
196-214: Metrics validation is comprehensive. The refactored assertions properly validate token usage and duration metrics with appropriate attribute checks.
82-82: Span count matches updated instrumentation. I've confirmed that the agent instrumentation now emits exactly three spans (one agent span, one tool span, and one OpenAI API/LLM span), so reducing the test assertion from 4 to 3 is correct and no spans are missing.
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (6)
102-255: Streaming wrapper implementation is comprehensive. The new `_wrap_agent_run_streamed` function provides excellent support for streaming execution with proper context propagation and detailed input/output attribute extraction. The implementation correctly:
- Maintains thread-local root span storage for trace continuity
- Handles various input/output data formats with fallback error handling
- Records metrics and maintains proper span lifecycle
The structured approach to handling different data types (model_dump, dict, string) with graceful fallbacks ensures robustness across various agent configurations.
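That fallback chain can be sketched roughly as follows. This is a simplified illustration, not the actual instrumentation code; `model_dump` is the pydantic-style method the wrapper probes for, and `serialize_value` is a hypothetical helper name:

```python
import json

def serialize_value(value):
    """Best-effort serialization: model_dump -> dict -> plain string."""
    if hasattr(value, "model_dump"):
        try:
            return json.dumps(value.model_dump())
        except Exception:
            pass  # fall through to the next strategy
    if isinstance(value, dict):
        try:
            return json.dumps(value)
        except Exception:
            pass
    # Last resort: a string keeps the span attribute usable even for
    # objects that cannot be serialized to JSON.
    return str(value)

print(serialize_value({"role": "user", "content": "hi"}))  # JSON string
print(serialize_value(object()))  # repr fallback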
272-396: Enhanced agent run wrapper maintains trace continuity. The modifications to `_wrap_agent_run` properly implement thread-local context propagation to ensure trace continuity across agent handoffs. The input/output attribute extraction logic mirrors the streaming wrapper for consistency. The thread-aware context management ensures that child spans maintain proper parent-child relationships even in complex multi-agent workflows.
489-602: Tool instrumentation provides comprehensive telemetry. The enhanced `extract_tool_details` function properly wraps FunctionTool execution with tracing spans while creating immediate spans for hosted tools. The implementation correctly:
- Avoids duplicate instrumentation through `_instrumented_tools` tracking
- Maintains proper context propagation for tool spans
- Captures detailed attributes for different tool types
- Handles both sync and async tool execution patterns
The closure-based wrapping pattern for FunctionTool ensures proper scope capture while maintaining the original tool interface.
6-6: Threading import supports new thread-aware features. The addition of the `threading` import is necessary for the new thread-local span management functionality.
9-11: Context imports enable proper span propagation. The additional imports for `set_span_in_context` and `context` are essential for the new thread-aware context propagation mechanisms.
67-80: Instrumentation registration includes streaming wrapper. The addition of the streaming wrapper registration and cleanup in `_uninstrument` properly manages the new instrumentation lifecycle.
```python
_root_span_storage = {}
_instrumented_tools = set()
```
🛠️ Refactor suggestion
Global state management needs thread safety consideration.
The global dictionaries _root_span_storage and _instrumented_tools are accessed from multiple threads but lack explicit synchronization. While _root_span_storage uses thread ID as key (inherently thread-safe for access), _instrumented_tools could have race conditions during concurrent tool instrumentation.
Consider adding thread-safe access patterns:
```diff
+import threading
+
+_storage_lock = threading.Lock()
 _root_span_storage = {}
 _instrumented_tools = set()
```

Then protect critical sections:

```diff
-if tool_id in _instrumented_tools:
-    continue
-_instrumented_tools.add(tool_id)
+with _storage_lock:
+    if tool_id in _instrumented_tools:
+        continue
+    _instrumented_tools.add(tool_id)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# At the top of the file, add the lock alongside the globals
import threading

_storage_lock = threading.Lock()
_root_span_storage = {}
_instrumented_tools = set()

# … rest of imports and code …

# Then in the tool-instrumentation section, replace the non-atomic check/add:
#   if tool_id in _instrumented_tools:
#       continue
#   _instrumented_tools.add(tool_id)
# with a locked critical section:
with _storage_lock:
    if tool_id in _instrumented_tools:
        continue
    _instrumented_tools.add(tool_id)

# … rest of the code …
```
🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py
around lines 30 to 31, the global variables _root_span_storage and
_instrumented_tools are accessed concurrently without synchronization, risking
race conditions especially on _instrumented_tools. To fix this, introduce a
threading.Lock to protect access to _instrumented_tools and wrap all
modifications and reads of this set within lock acquire/release blocks to ensure
thread safety.
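To see why the locked check-and-add matters, here is a minimal runnable sketch of the pattern (a hedged illustration assuming a shared set of tool ids, as in the suggestion above; `instrument_once` is a hypothetical helper name):

```python
import threading

_storage_lock = threading.Lock()
_instrumented_tools = set()

def instrument_once(tool_id):
    """Return True only for the first caller to claim this tool id."""
    with _storage_lock:
        if tool_id in _instrumented_tools:
            return False
        _instrumented_tools.add(tool_id)
        return True

results = []
threads = [
    threading.Thread(target=lambda: results.append(instrument_once(42)))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The lock guarantees exactly one thread wins the race.
print(results.count(True))  # → 1
```

Without the lock, two threads could both pass the membership check before either adds the id, instrumenting the same tool twice.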
Important
Looks good to me! 👍
Reviewed bafbc91 in 1 minute and 36 seconds. Click for details.
- Reviewed 518 lines of code in 2 files
- Skipped 0 files when reviewing.
- Skipped posting 5 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py:247
- Draft comment:
Use '_' for the unused loop variable in async iteration to clearly indicate it's not needed. - Reason this comment was not posted:
Comment looked like it was already resolved.
2. packages/sample-app/sample_app/openai_agents_example.py:97
- Draft comment:
Added trailing commas to list literals for consistency and better diffs in future edits. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. This comment is purely informative, explaining why trailing commas were added. It doesn't provide a suggestion or ask for confirmation about the code's functionality.
3. packages/sample-app/sample_app/openai_agents_example.py:256
- Draft comment:
Flatten function parameters into a single line in plan_and_apply_recipe_modifications for conciseness without losing clarity. - Reason this comment was not posted:
Comment looked like it was already resolved.
4. packages/sample-app/sample_app/openai_agents_example.py:604
- Draft comment:
Correct the event name typo from 'handoff_occured' to 'handoff_occurred' to match expected trigger names. - Reason this comment was not posted:
Comment looked like it was already resolved.
5. packages/sample-app/sample_app/openai_agents_example.py:304
- Draft comment:
Consistent dictionary formatting with trailing commas in the messages list improves maintainability. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. This comment is purely informative and does not provide a specific suggestion or ask for a change. It simply states a general benefit of a coding style without addressing a specific issue in the code.
Workflow ID: wflow_Ljko8daf2FsaXKhB
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Actionable comments posted: 0
♻️ Duplicate comments (2)
packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py (1)
250-250: Use underscore for unused loop variable. This issue has already been identified in a previous review. Replace `event` with `_` to indicate the variable is intentionally unused.
packages/sample-app/sample_app/openai_agents_example.py (1)
269-270: Consider using logging framework instead of print statements. For production-ready sample applications, replace print statements with proper logging to allow better control over log levels and formatting.
Example refactor:
```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
 @function_tool
 async def plan_and_apply_recipe_modifications(
     cw: RunContextWrapper[ChatContext], recipe: Recipe, modification_request: str
 ) -> EditResponse:
-    print(f"Planning and applying modifications for recipe: {recipe.name}")
-    print(f"Modification request: {modification_request}")
+    logger.info(f"Planning and applying modifications for recipe: {recipe.name}")
+    logger.info(f"Modification request: {modification_request}")
```

Also applies to: 474-474
🧹 Nitpick comments (1)
packages/sample-app/sample_app/openai_agents_example.py (1)
569-572: Simplify nested if statements. Combine the nested conditions for cleaner code.

```diff
-        elif isinstance(event.data, ResponseOutputItemAddedEvent):
-            if isinstance(event.data.item, ResponseFunctionToolCall):
-                print(f"\n[Calling tool: {event.data.item.name}]\n")
+        elif isinstance(event.data, ResponseOutputItemAddedEvent) and isinstance(event.data.item, ResponseFunctionToolCall):
+            print(f"\n[Calling tool: {event.data.item.name}]\n")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py (7 hunks)
- packages/sample-app/sample_app/openai_agents_example.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
packages/sample-app/sample_app/openai_agents_example.py
569-570: Use a single if statement instead of nested if statements
(SIM102)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.9)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Lint
🔇 Additional comments (6)
packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py (1)
57-57: Clean formatting improvements. The simplified single-line assertions improve readability without changing functionality.
Also applies to: 60-60, 63-68, 82-82, 97-98, 104-104, 134-134, 139-139, 159-159, 164-164, 196-199, 206-206, 209-209, 212-212
packages/sample-app/sample_app/openai_agents_example.py (5)
1-23: Well-organized imports and proper initialization. The imports are cleanly organized and Traceloop is correctly initialized with batch disabled for streaming support.
25-88: Well-structured Pydantic models. The data models provide excellent type safety and clear API contracts for the recipe management system.
349-478: Excellent implementation with semantic search and fallback strategy. The search function provides intelligent semantic search using AI with a robust keyword-based fallback, ensuring reliability even when AI parsing fails.
480-548: Well-designed agent architecture with clear separation of concerns. The two-agent system effectively separates general conversation handling from specialized recipe management, with clear handoff triggers.
612-663: Clean implementation of the streaming chat workflow. The main execution flow effectively demonstrates multi-agent handoffs and streaming responses with a practical recipe editing example.
Important
Looks good to me! 👍
Reviewed eb1377b in 1 minute and 47 seconds. Click for details.
- Reviewed 556 lines of code in 4 files
- Skipped 0 files when reviewing.
- Skipped posting 4 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:148
- Draft comment:
The new JSON serialization logic for inputs (converting each arg with json.dumps and json.loads using a custom JSONEncoder) is a clear improvement in ensuring that the span attribute for input is standardized. However, note that performing a double JSON conversion on every argument can incur performance overhead. Consider checking whether the argument is already a basic type or pre-serializable before applying the conversion. - Reason this comment was not posted:
Confidence changes required: 50% <= threshold 50%. None
2. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:183
- Draft comment:
The output handling logic has been simplified by using a try/catch block that wraps json.dumps(result, cls=JSONEncoder). This unifies processing and reduces complexity compared to the previous multi-branch logic. Ensure the custom JSONEncoder covers all cases; otherwise, the fallback to converting to string is a reasonable backup. - Reason this comment was not posted:
Confidence changes required: 30% <= threshold 50%. None
3. packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py:282
- Draft comment:
The removal of assertions that check for the SpanAttributes.LLM_SYSTEM attribute in the agent spans suggests an intentional change in metadata recording. Confirm that downstream systems do not rely on this attribute. If it is no longer needed, this simplification is acceptable. - Reason this comment was not posted:
Confidence changes required: 50% <= threshold 50%. None
4. packages/sample-app/sample_app/openai_agents_example.py:477
- Draft comment:
The removal of the @agent() decorators for the RecipeEditorAgent and MainChatAgent classes simplifies the code and avoids redundant decoration. Verify that this change does not affect the agent registration or behavior, and that the agents still initialize as expected. - Reason this comment was not posted:
Confidence changes required: 30% <= threshold 50%. None
Workflow ID: wflow_NL9Pyw7zfVVfcEbw
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Actionable comments posted: 0
♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (1)
402-515: Comprehensive tool instrumentation enhancement. This enhancement provides excellent support for different tool types:
- Dynamic wrapping of FunctionTool methods with proper span creation
- Immediate span handling for hosted tools (WebSearchTool, etc.)
- Comprehensive attribute setting for telemetry
- Proper context propagation using thread-local storage
The implementation correctly handles the different tool execution patterns while maintaining trace continuity.
Note: The thread safety concern for `_instrumented_tools` mentioned in past reviews should still be addressed in a future iteration.
🧹 Nitpick comments (1)
packages/sample-app/sample_app/openai_agents_example.py (1)
479-657: Excellent agent definitions and application flow. The implementation provides:
- Well-defined agents with clear roles and comprehensive instructions
- Good stream handling utility for real-time user interaction
- Proper demonstration of agent handoffs and streaming execution
- Practical example of the enhanced instrumentation in action
Consider simplifying the nested if statements in the stream handling (lines 566-567) as suggested by static analysis:
```diff
-        if isinstance(event.data, ResponseOutputItemAddedEvent):
-            if isinstance(event.data.item, ResponseFunctionToolCall):
+        if isinstance(event.data, ResponseOutputItemAddedEvent) and isinstance(event.data.item, ResponseFunctionToolCall):
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (10 hunks)
- packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py (5 hunks)
- packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py (7 hunks)
- packages/sample-app/sample_app/openai_agents_example.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py (1)
packages/sample-app/sample_app/openai_agents_example.py (5)
Recipe(45-52)SearchResponse(64-69)EditResponse(55-61)search_recipes(349-476)plan_and_apply_recipe_modifications(255-345)
packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py (3)
packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py (3)
- SpanAttributes (36-229)
- Meters (6-33)
- TraceloopSpanKindValues (269-274)
packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py (2)
- exporter (27-37)
- recipe_workflow_agents (148-269)
packages/traceloop-sdk/traceloop/sdk/utils/in_memory_span_exporter.py (1)
- get_finished_spans (40-43)
packages/sample-app/sample_app/openai_agents_example.py (2)
packages/traceloop-sdk/traceloop/sdk/__init__.py (2)
- Traceloop (37-227)
- init (49-202)
packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py (5)
- Recipe (151-158)
- EditResponse (167-172)
- SearchResponse (160-165)
- plan_and_apply_recipe_modifications (218-253)
- search_recipes (197-215)
🪛 Ruff (0.12.2)
packages/sample-app/sample_app/openai_agents_example.py
566-567: Use a single if statement instead of nested if statements
(SIM102)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Test Packages (3.9)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Lint
🔇 Additional comments (12)
packages/opentelemetry-instrumentation-openai-agents/tests/conftest.py (3)
47-55: Good enhancement for test isolation. The fixture now properly clears the global instrumentation state between tests, preventing test interference. The local imports and explicit clearing of `_root_span_storage` and `_instrumented_tools` ensure a clean state for each test run.
147-270: Excellent test infrastructure for multi-agent workflows. This fixture provides comprehensive support for testing multi-agent scenarios with:
- Well-defined Pydantic models for structured data
- Realistic function tools with proper async implementations
- Mock recipe data that supports test scenarios
- Proper agent configuration with tools and handoffs
The implementation effectively mirrors the sample application structure while providing a controlled test environment.
99-99: Minor formatting improvements. These cosmetic changes improve code readability and consistency without affecting functionality.
Also applies to: 118-120, 142-142
packages/opentelemetry-instrumentation-openai-agents/tests/test_openai_agents.py (2)
57-68: Test formatting improvements with span count adjustment. The formatting changes improve code readability by consolidating assertions. However, note the span count change from 4 to 3 on line 82; ensure this reflects the actual instrumentation behavior change and is not a regression.
Also applies to: 82-82, 97-98, 104-104, 134-134, 139-139, 159-159, 164-164, 196-213
226-380: Comprehensive test for multi-agent workflows. This test provides excellent coverage of the enhanced instrumentation features:
- Validates streaming execution and agent handoffs
- Checks span counts, attributes, and relationships
- Verifies trace continuity across agent boundaries
- Tests both agent and tool span creation
- Validates JSON input/output serialization
The test structure is well-organized and the assertions are thorough, providing confidence in the instrumentation's correctness.
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (3)
104-208: Well-implemented streaming wrapper. This new wrapper provides comprehensive support for streaming execution:
- Proper thread-local span storage for trace continuity
- Consistent input/output serialization using JSONEncoder
- Comprehensive error handling with appropriate span status
- Maintains compatibility with existing instrumentation patterns
The implementation correctly handles the async nature of streaming calls while preserving trace context.
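The thread-local root-span bookkeeping can be illustrated with a toy sketch (a dict keyed by thread id, mirroring the approach described; the span values here are plain dicts standing in for real OpenTelemetry spans):

```python
import threading

_root_span_storage = {}  # thread id -> root "span" for that thread

def set_root_span(span):
    _root_span_storage[threading.get_ident()] = span

def get_root_span():
    # Child spans look up the root created on the same thread, so agent
    # handoffs executed on that thread stay inside the same trace.
    return _root_span_storage.get(threading.get_ident())

set_root_span({"name": "Main Chat Agent.workflow"})
assert get_root_span()["name"] == "Main Chat Agent.workflow"

def worker(out):
    out.append(get_root_span())  # different thread -> no root stored

seen = []
t = threading.Thread(target=worker, args=(seen,))
t.start()
t.join()
print(seen)  # → [None]: each thread sees only its own root span
```

This is why the uninstrument path must clear the storage: leftover entries would leak spans across test runs.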
224-244: Enhanced agent run wrapper with improved context propagation. The enhancements provide:
- Consistent thread-local span storage for trace continuity
- Improved input/output serialization with fallback handling
- Better context propagation across agent calls
These changes align with the streaming wrapper implementation and improve overall instrumentation quality.
Also applies to: 249-275
6-6: Supporting changes for enhanced instrumentation. These changes provide necessary infrastructure:
- New imports for threading and context management
- Global state storage for trace continuity
- Enhanced uninstrumentation with proper cleanup
- Minor improvements to agent details extraction
The changes are well-integrated and support the main functionality enhancements.
Also applies to: 9-9, 11-11, 25-25, 31-32, 67-83, 313-313, 336-336, 345-345
packages/sample-app/sample_app/openai_agents_example.py (4)
1-87: Well-structured data models and initialization. The implementation provides:
- Comprehensive Pydantic models with proper type hints
- Clean import organization
- Appropriate Traceloop initialization for sample usage
- Well-designed data structures for recipe management
The foundation is solid for demonstrating the instrumentation capabilities.
88-240: Comprehensive recipe database for testing. The sample recipe data provides excellent variety:
- Multiple cuisines and cooking styles
- Complete recipe information with ingredients, instructions, and timing
- Good coverage for testing search and modification scenarios
This data structure effectively supports the demonstration of the application's capabilities.
254-346: Well-implemented AI-powered recipe modification. This function demonstrates excellent AI integration:
- Comprehensive prompt engineering covering various modification scenarios
- Proper structured output parsing with Pydantic models
- Robust error handling and validation
- Appropriate global context updates
The implementation shows practical usage of OpenAI's structured output capabilities.
348-477: Sophisticated AI-powered recipe search. This function provides advanced search capabilities:
- AI-powered semantic search with intelligent recipe matching
- Robust fallback to keyword search when AI parsing fails
- Comprehensive error handling and logging
- Well-structured response formatting
The implementation demonstrates practical AI search techniques with proper error resilience.
Caution
Changes requested ❌
Reviewed fe4fa8f in 1 minute and 53 seconds. Click for details.
- Reviewed 49 lines of code in 2 files
- Skipped 0 files when reviewing.
- Skipped posting 1 draft comment. View it below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:21
- Draft comment:
Consolidated import from local utils replaces the external JSONEncoder. Confirm that the local version meets all previous expectations. - Reason this comment was not posted:
Confidence changes required: 30% <= threshold 50%. None
Workflow ID: wflow_Hu5dYCLm3nFFdgmL
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Actionable comments posted: 1
♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (1)
30-31: Thread safety concerns with global state management. The global dictionaries are accessed from multiple threads but lack explicit synchronization. This is a known issue from previous reviews that still needs to be addressed.
Consider adding thread-safe access patterns as suggested in the previous review:
+import threading + +_storage_lock = threading.Lock() _root_span_storage = {} _instrumented_tools = set()Then protect critical sections where
_instrumented_toolsis accessed.
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/utils.py (1)
24-42: Consider consolidating nested if statements and handling dictionary mutation. The `JSONEncoder` implementation looks solid for handling various object types, but there are two improvements to consider:
Dictionary mutation: The code directly modifies the input dictionary by deleting the "callbacks" key, which could have side effects on the original object.
Nested if statements: As noted by static analysis, the nested if statements can be combined.
Apply this diff to address both issues:
```diff
-        if isinstance(o, dict):
-            if "callbacks" in o:
-                del o["callbacks"]
-            return o
+        if isinstance(o, dict) and "callbacks" in o:
+            # Create a copy to avoid mutating the original dictionary
+            o_copy = dict(o)
+            del o_copy["callbacks"]
+            return o_copy
```

This prevents unintended side effects while also simplifying the conditional logic.
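For context, a dataclass-aware encoder in the spirit of the one under review might look like this. This is a hedged sketch, not the actual `utils.JSONEncoder` (which also special-cases dicts with a `callbacks` key); the `Recipe` dataclass is an illustrative stand-in:

```python
import dataclasses
import json

class JSONEncoder(json.JSONEncoder):
    def default(self, o):
        # Dataclass instances become plain dicts.
        if dataclasses.is_dataclass(o) and not isinstance(o, type):
            return dataclasses.asdict(o)
        # Anything else falls back to its string representation
        # instead of raising TypeError.
        return str(o)

@dataclasses.dataclass
class Recipe:
    name: str
    servings: int

print(json.dumps(Recipe("carbonara", 4), cls=JSONEncoder))
# → {"name": "carbonara", "servings": 4}
```

Note that `default` is only invoked for objects the base encoder cannot serialize, so built-in types pass through untouched.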
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (10 hunks)
- packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/utils.py (2 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/utils.py
26-27: Use a single if statement instead of nested if statements
Combine if statements using and
(SIM102)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test Packages (3.9)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Lint
🔇 Additional comments (6)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/utils.py (1)
1-2: LGTM on the necessary imports. The `dataclasses` and `json` imports are properly added to support the new `JSONEncoder` functionality.
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (5)
67-75: LGTM on the streaming wrapper integration. The addition of the streaming wrapper properly extends the instrumentation to support the `_run_single_turn_streamed` method, maintaining consistency with the existing wrapper pattern.
103-206: Excellent implementation of streaming agent tracing. The `_wrap_agent_run_streamed` function provides comprehensive tracing for streaming execution with proper:
- Thread-local root span management
- Input/output serialization with fallback handling
- Duration metrics recording
- Error handling and span status management
The implementation correctly mirrors the non-streaming wrapper while adding streaming-specific considerations.
223-275: Enhanced tracing with consistent thread-local management. The modifications to `_wrap_agent_run` properly integrate thread-local root span storage and improved input/output serialization, ensuring consistency with the streaming wrapper.
79-81: Good cleanup implementation in uninstrumentation. The cleanup of global state in `_uninstrument` properly prevents memory leaks and ensures clean test isolation.
462-513: Proper hosted tool span creation with context propagation. The implementation correctly creates and immediately ends spans for hosted tools (`WebSearchTool`, `FileSearchTool`, `ComputerTool`) with proper context propagation from the root span. The tool-specific attribute extraction is comprehensive and well-structured.
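The create-and-immediately-end pattern for hosted tools can be illustrated with a toy span class (a hypothetical stand-in for the OpenTelemetry tracer API; attribute keys here echo, but do not exactly reproduce, the instrumentation's `gen_ai.completion.tool.*` conventions):

```python
import time

class ToySpan:
    """Minimal stand-in for an OTel span: record attributes, then end."""
    def __init__(self, name):
        self.name = name
        self.attributes = {}
        self.end_time = None

    def set_attribute(self, key, value):
        self.attributes[key] = value

    def end(self):
        self.end_time = time.time_ns()

def record_hosted_tool(tool_type, extra_attrs):
    # Hosted tools run server-side, so there is no local invocation to
    # wrap; the span is opened, annotated, and closed in one step.
    span = ToySpan(f"{tool_type}.tool")
    span.set_attribute("gen_ai.completion.tool.type", tool_type)
    for key, value in extra_attrs.items():
        span.set_attribute(f"gen_ai.completion.tool.{key}", value)
    span.end()
    return span

span = record_hosted_tool("WebSearchTool", {"user_location": "None"})
print(span.name, span.end_time is not None)  # → WebSearchTool.tool True
```

This contrasts with FunctionTool, whose span must stay open around the awaited local call to capture input, output, and errors.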
```python
tool_id = id(tool)
if tool_id in _instrumented_tools:
    continue

_instrumented_tools.add(tool_id)

original_on_invoke_tool = tool.on_invoke_tool

def create_wrapped_tool(original_tool, original_func):
    async def wrapped_on_invoke_tool(tool_context, args_json):
        tool_name = getattr(original_tool, "name", "tool")
        if root_span:
            ctx = set_span_in_context(root_span, context.get_current())
        else:
            ctx = context.get_current()

        with tracer.start_as_current_span(
            f"{tool_name}.tool",
            kind=SpanKind.INTERNAL,
            attributes={
                SpanAttributes.TRACELOOP_SPAN_KIND: (
                    TraceloopSpanKindValues.TOOL.value
                )
            },
            context=ctx,
        ) as span:
            try:
                span.set_attribute(
                    f"{GEN_AI_COMPLETION}.tool.name", tool_name
                )
                span.set_attribute(
                    f"{GEN_AI_COMPLETION}.tool.type", "FunctionTool"
                )
                span.set_attribute(
                    f"{GEN_AI_COMPLETION}.tool.description",
                    original_tool.description,
                )
                span.set_attribute(
                    f"{GEN_AI_COMPLETION}.tool.strict_json_schema",
                    original_tool.strict_json_schema,
                )
                span.set_attribute("traceloop.entity.input", args_json)
                result = await original_func(tool_context, args_json)
                span.set_attribute("traceloop.entity.output", str(result))
                span.set_status(Status(StatusCode.OK))
                return result
            except Exception as e:
                span.set_status(Status(StatusCode.ERROR, str(e)))
                raise

    return wrapped_on_invoke_tool

tool.on_invoke_tool = create_wrapped_tool(tool, original_on_invoke_tool)
```
🛠️ Refactor suggestion
Potential memory leak with function tool instrumentation.
The current implementation stores tool IDs in _instrumented_tools but never removes them, which could lead to memory leaks in long-running applications where tools are dynamically created and destroyed.
Consider implementing a cleanup mechanism or using weak references:
```python
import weakref

# Option 1: Use WeakSet for automatic cleanup
_instrumented_tools = weakref.WeakSet()

# Option 2: Add a cleanup method to be called when tools are no longer needed
def cleanup_tool_instrumentation(tool):
    tool_id = id(tool)
    _instrumented_tools.discard(tool_id)
```

Additionally, the function wrapping approach modifies the original tool object, which could have side effects. Consider using a more isolated approach if possible.
🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py
between lines 408 and 461, the _instrumented_tools set stores tool IDs
indefinitely, risking memory leaks in long-running processes with dynamic tool
lifecycles. Replace _instrumented_tools with a weakref.WeakSet to allow
automatic cleanup of tool references when they are garbage collected.
Additionally, consider adding a cleanup function to explicitly remove tool IDs
from _instrumented_tools when tools are no longer needed. Also, review the
function wrapping approach to avoid modifying the original tool object directly,
possibly by using a proxy or decorator pattern to isolate instrumentation side
effects.
…etry/instrumentation/openai_agents/utils.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/utils.py (2 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/utils.py
26-27: Use a single if statement instead of nested if statements
Combine if statements using and
(SIM102)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test Packages (3.9)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Lint
🔇 Additional comments (1)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/utils.py (1)
1-2: LGTM! Imports are appropriate for the JSONEncoder implementation. The `dataclasses` and `json` imports are necessary for the new JSONEncoder class functionality.
```python
class JSONEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, dict):
            if "callbacks" in o:
                o = {k: v for k, v in o.items() if k != "callbacks"}
            return o
        if dataclasses.is_dataclass(o):
            return dataclasses.asdict(o)

        if hasattr(o, "to_json"):
            return o.to_json()

        if hasattr(o, "json"):
            return o.json()

        if hasattr(o, "__class__"):
            return o.__class__.__name__

        return super().default(o)
```
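A quick way to see the encoder's behavior is to serialize a dataclass. This self-contained sketch trims the encoder to its dataclass and class-name-fallback branches:

```python
import dataclasses
import json


class JSONEncoder(json.JSONEncoder):
    def default(self, o):
        # Dataclasses become plain dicts; anything else unserializable
        # falls back to its class name.
        if dataclasses.is_dataclass(o):
            return dataclasses.asdict(o)
        return o.__class__.__name__


@dataclasses.dataclass
class Recipe:
    name: str
    servings: int


print(json.dumps({"recipe": Recipe("carbonara", 2)}, cls=JSONEncoder))
# {"recipe": {"name": "carbonara", "servings": 2}}
```

One subtlety worth knowing: `default()` is only invoked for objects `json` cannot serialize natively, so plain dicts nested in the payload never reach the `isinstance(o, dict)` branch at all.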
Fix the dict handling logic and consider simplifying nested conditions.
The JSONEncoder implementation has a logical flaw in the dict handling section. Currently, it only returns the filtered dict if "callbacks" is present, but it should process all dicts consistently.
Apply this diff to fix the dict handling logic:

```diff
-        if isinstance(o, dict):
-            if "callbacks" in o:
-                o = {k: v for k, v in o.items() if k != "callbacks"}
-            return o
+        if isinstance(o, dict):
+            return {k: v for k, v in o.items() if k != "callbacks"}
```

Additionally, consider the static analysis suggestion to combine the nested if statements in the set_span_attribute function (lines 8-10), though this is a minor improvement:

```diff
-    if value is not None:
-        if value != "":
-            span.set_attribute(name, value)
+    if value is not None and value != "":
+        span.set_attribute(name, value)
```

Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.12.2)
26-27: Use a single if statement instead of nested if statements
Combine if statements using and
(SIM102)
🤖 Prompt for AI Agents
In
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/utils.py
lines 24-42, the JSONEncoder's dict handling only returns a filtered dict when
"callbacks" is present, skipping other dicts. Modify the logic to always process
dicts by filtering out "callbacks" if present, otherwise returning the dict as
is. Also, simplify the nested if statements in the set_span_attribute function
(lines 8-10) by combining conditions into a single if statement for clarity.
Important
Looks good to me! 👍
Reviewed ca0b49e in 45 seconds. Click for details.
- Reviewed 13 lines of code in 1 file
- Skipped 0 files when reviewing.
- Skipped posting 1 draft comment. View it below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/utils.py:28
- Draft comment: Nice improvement avoiding in-place mutation. Ensure returning a new dict with callbacks removed aligns with expected JSON encoding semantics.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%. None.
Workflow ID: wflow_Bta6yLadqWDavHGo
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Actionable comments posted: 0
♻️ Duplicate comments (1)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (1)
402-405: Add synchronization around `_instrumented_tools` to prevent race conditions.

Your `extract_tool_details` logic instruments `FunctionTool` instances without any lock, which can lead to double-wrapping or missed instrumentation when multiple threads race through the membership check and add. Introduce a module-level lock and wrap the check/add in a critical section:

- File: packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py
- Function: `extract_tool_details`
- Changes needed:
  - Define a lock alongside `_instrumented_tools`
  - Enclose the `if tool in …` and `.add(tool)` calls in `with _instrumentation_lock:`

Example diff snippet:
```diff
--- a/.../openai_agents/__init__.py
+++ b/.../openai_agents/__init__.py
@@
-_instrumented_tools = weakref.WeakSet()
+_instrumented_tools = weakref.WeakSet()
+_instrumentation_lock = threading.Lock()

 def extract_tool_details(tracer: Tracer, tools):
     """Create spans for hosted tools and wrap FunctionTool execution."""
     thread_id = threading.get_ident()
     root_span = _root_span_storage.get(thread_id)
@@
     for tool in tools:
         if isinstance(tool, FunctionTool):
-            if tool in _instrumented_tools:
-                continue
-            _instrumented_tools.add(tool)
+            with _instrumentation_lock:
+                if tool in _instrumented_tools:
+                    continue
+                _instrumented_tools.add(tool)
             original_on_invoke_tool = tool.on_invoke_tool
             …
```

This ensures that membership checks and additions to the `WeakSet` are atomic and thread-safe.
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (1)
312-318: Model name extraction needs error handling.

The nested attribute access in `get_model_name` could raise `AttributeError` if the agent structure is unexpected. Consider adding proper error handling:

```diff
 def get_model_name(agent):
-    model_attr = getattr(getattr(agent, "model", None), "model", "unknown_model")
-    if model_attr == "unknown_model":
-        model_attr = getattr(agent, "model", None)
-        return model_attr
-    else:
-        return model_attr
+    try:
+        model = getattr(agent, "model", None)
+        if model is None:
+            return "unknown_model"
+        return getattr(model, "model", str(model))
+    except (AttributeError, TypeError):
+        return "unknown_model"
```
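A defensive version can be exercised against a few agent shapes; the `_Agent`/`_Inner` classes below are hypothetical stand-ins, not SDK types:

```python
def get_model_name(agent):
    """Resolve a model name without assuming the agent's structure (sketch)."""
    model = getattr(agent, "model", None)
    if model is None:
        return "unknown_model"
    # Agents may nest the name one level deeper (model.model);
    # otherwise fall back to the string form of whatever we found.
    return getattr(model, "model", None) or str(model)


class _Inner:
    model = "gpt-4o"


class _Agent:
    model = _Inner()


print(get_model_name(_Agent()))   # gpt-4o
print(get_model_name(object()))   # unknown_model
```

A plain string model (`agent.model = "gpt-4o-mini"`) also resolves correctly, since strings have no `.model` attribute and the `str(model)` fallback applies.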
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (10 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Test Packages (3.11)
- GitHub Check: Build Packages (3.11)
- GitHub Check: Test Packages (3.12)
- GitHub Check: Test Packages (3.10)
- GitHub Check: Test Packages (3.9)
- GitHub Check: Lint
🔇 Additional comments (10)
packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py (10)
6-7: Threading and weakref imports support the new concurrency model. The addition of the `threading` and `weakref` imports properly supports the thread-local span storage and memory-safe tool tracking introduced in this PR.
31-32: Thread-local span storage addresses trace continuity issues. The implementation uses the thread ID as the key for span storage, which is appropriate for maintaining trace context across async operations within the same thread. The use of `weakref.WeakSet` for `_instrumented_tools` helps prevent memory leaks by allowing garbage collection of tool references.
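The thread-keyed storage idea can be sketched in a few lines; the names here are illustrative rather than the package's actual internals:

```python
import threading

# Each thread registers and reads its own root span under its thread id,
# so concurrent agent runs never clobber each other's trace context.
_root_span_storage = {}


def set_root_span(span):
    _root_span_storage[threading.get_ident()] = span


def get_root_span():
    return _root_span_storage.get(threading.get_ident())


results = {}


def worker(name):
    set_root_span(f"span-{name}")
    results[name] = get_root_span()


threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results.items()))  # [(0, 'span-0'), (1, 'span-1'), (2, 'span-2')]
```

One caveat: thread ids can be reused after a thread exits, so entries should be removed when a run completes (the PR's uninstrument path clears the storage in bulk).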
68-76: Streaming instrumentation wrapper correctly added. The new wrapper for `_run_single_turn_streamed` properly extends tracing support to streaming execution paths, maintaining consistency with the existing non-streaming wrapper pattern.
80-82: Proper cleanup in uninstrumentation. The cleanup of both `_instrumented_tools` and `_root_span_storage` in the uninstrumentation method ensures no memory leaks when the instrumentation is disabled.
104-207: Streaming wrapper implementation looks comprehensive. The streaming wrapper correctly:
- Manages thread-local root span storage
- Handles context propagation with proper fallback
- Captures input/output with JSON serialization and fallback
- Records metrics and span attributes
- Handles errors appropriately
The implementation mirrors the non-streaming wrapper while adding streaming-specific handling.
224-244: Enhanced context propagation in non-streaming wrapper. The addition of thread-local span storage and context propagation ensures proper trace linkage across agent handoffs. The logic correctly falls back to the current context when no root span exists.
250-275: JSON serialization with proper fallback handling. The input serialization logic correctly uses the custom `JSONEncoder` with proper exception handling and fallback to string representation. This ensures robust attribute capture even for complex objects.
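The serialize-with-fallback pattern amounts to a few lines; this sketch uses a simplified encoder and a circular dict to trigger the fallback:

```python
import json


class JSONEncoder(json.JSONEncoder):
    def default(self, o):
        # Simplified last resort, in the spirit of the utils.py encoder.
        return o.__class__.__name__


def serialize_attr(value):
    """Serialize a span attribute, falling back to str() when dumps fails."""
    try:
        return json.dumps(value, cls=JSONEncoder)
    except Exception:
        return str(value)


print(serialize_attr({"input": "make the carbonara vegetarian"}))

circular = {}
circular["self"] = circular  # json.dumps raises ValueError on circular refs
print(serialize_attr(circular))  # {'self': {...}}
```

The broad `except Exception` is deliberate here: attribute capture should degrade gracefully rather than break the instrumented call.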
284-288: Consistent output serialization pattern. The output serialization follows the same pattern as input serialization, ensuring consistency in attribute capture across the instrumentation.
402-461: Function tool instrumentation with proper span wrapping. The implementation correctly:
- Checks for already instrumented tools to avoid duplicate wrapping
- Uses closure to preserve original function reference
- Maintains proper context propagation from root span
- Captures comprehensive tool attributes and I/O
- Handles errors appropriately
The use of `weakref.WeakSet` for `_instrumented_tools` addresses the memory leak concern from past reviews.
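The wrap-via-closure idea looks roughly like this; `FunctionTool` here is a minimal hypothetical stand-in for the SDK class, with the span logic elided:

```python
import asyncio


class FunctionTool:
    """Minimal stand-in for the SDK's FunctionTool (hypothetical)."""

    def __init__(self, name, on_invoke_tool):
        self.name = name
        self.on_invoke_tool = on_invoke_tool


def create_wrapped_tool(tool, original_func):
    # The closure captures the *original* callable; without the membership
    # check, instrumenting twice would stack wrappers on top of wrappers.
    async def wrapped_on_invoke_tool(ctx, args_json):
        # ... span creation and input capture would go here ...
        result = await original_func(ctx, args_json)
        # ... output capture and status would go here ...
        return result

    return wrapped_on_invoke_tool


async def echo(ctx, args_json):
    return "done:" + args_json


tool = FunctionTool("demo", echo)
tool.on_invoke_tool = create_wrapped_tool(tool, tool.on_invoke_tool)
print(asyncio.run(tool.on_invoke_tool(None, "{}")))  # done:{}
```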
462-513: Hosted tool span creation with proper context. The implementation for hosted tools (`WebSearchTool`, `FileSearchTool`, `ComputerTool`) correctly:
- Uses the same context propagation pattern as function tools
- Creates spans with appropriate attributes
- Immediately ends spans to preserve original behavior
- Captures tool-specific configuration attributes
Important
Looks good to me! 👍
Reviewed e93d8ea in 1 minute and 13 seconds. Click for details.
- Reviewed 15 lines of code in 1 file
- Skipped 0 files when reviewing.
- Skipped posting 1 draft comment. View it below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/utils.py:24
- Draft comment: Removed filtering for 'callbacks' from dicts. Ensure that including 'callbacks' in the serialized output won't cause non-serializable or sensitive data issues.
- Reason this comment was not posted:
Comment looked like it was already resolved.
Workflow ID: wflow_p4FMCg5DEgCVloWZ
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Caution
Changes requested ❌
Reviewed 634af4b in 1 minute and 59 seconds. Click for details.
- Reviewed 97 lines of code in 2 files
- Skipped 0 files when reviewing.
- Skipped posting 2 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:32
- Draft comment: Changed `_instrumented_tools` from a WeakSet to a plain set (storing tool IDs). Ensure this change is intentional and that it won't cause memory retention issues.
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%. The change actually seems well thought out. By storing IDs instead of objects, we avoid any memory retention issues that could have existed with the WeakSet. The _uninstrument() method properly clears the set. The WeakSet was likely unnecessary since we have explicit cleanup. The comment seems to raise a non-issue and could mislead developers. Could there be edge cases where tools need to be garbage collected before _uninstrument() is called? Am I missing some threading concerns? The tools are only used during instrumentation and explicitly cleaned up. Even if there are edge cases, storing IDs is actually safer than storing object references. The comment should be deleted. The code change is an improvement that avoids potential memory issues by storing IDs instead of object references, and has proper cleanup in place.
2. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:193
- Draft comment: Removed LLM_SYSTEM and LLM_RESPONSE_MODEL attributes from histogram record calls. Confirm that this removal is intended and that downstream metrics processing doesn't rely on these attributes.
- Reason this comment was not posted:
Comment looked like it was already resolved.
Workflow ID: wflow_ZaKPi4gKKFjXDgX4
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
```python
        except Exception:
            span.set_attribute("traceloop.entity.output", json.dumps(str(response)))
        if duration_histogram:
            duration_histogram.record(
```
Note the inconsistency: the streamed agent run attaches an attribute ('gen_ai.agent.name') while the non-streamed run does not pass any attributes when recording duration. Confirm if this difference is intentional.
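If the difference is unintentional, one fix is a small shared helper that both the streamed and non-streamed paths call. The sketch below uses a fake histogram and an assumed attribute key, not the PR's actual code:

```python
def record_duration(histogram, seconds, agent_name=None):
    """Record run duration with a consistent attribute set (sketch)."""
    attributes = {}
    if agent_name is not None:
        attributes["gen_ai.agent.name"] = agent_name  # assumed key
    histogram.record(seconds, attributes=attributes)


class FakeHistogram:
    """Test double standing in for an OpenTelemetry Histogram."""

    def __init__(self):
        self.records = []

    def record(self, value, attributes=None):
        self.records.append((value, attributes or {}))


h = FakeHistogram()
record_duration(h, 0.42, agent_name="Recipe Editor Agent")
record_duration(h, 0.17)
print(h.records)
# [(0.42, {'gen_ai.agent.name': 'Recipe Editor Agent'}), (0.17, {})]
```

Routing both call sites through one helper makes any future attribute divergence a deliberate choice rather than an accident.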
Important
Looks good to me! 👍
Reviewed 993b90e in 1 minute and 6 seconds. Click for details.
- Reviewed 20 lines of code in 1 file
- Skipped 0 files when reviewing.
- Skipped posting 2 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:6
- Draft comment: Removed unused import 'weakref'.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%. None.
2. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:120
- Draft comment: Removed unused variable 'model_name' in _wrap_agent_run_streamed.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%. None.
Workflow ID: wflow_SGJxCr7dxqyCbTKE
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Important
Looks good to me! 👍
Reviewed 1673c03 in 1 minute and 21 seconds. Click for details.
- Reviewed 72 lines of code in 1 file
- Skipped 0 files when reviewing.
- Skipped posting 4 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:165
- Draft comment:
Good refactor: using SpanAttributes constants (TRACELOOP_ENTITY_INPUT) instead of hardcoded strings for input attributes improves maintainability and consistency. Verify that these constants are defined as expected.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. The comment starts with a positive note about the refactor, which is informative. However, it ends with a request to verify that the constants are defined as expected, which is asking the author to double-check something. This violates the rule against asking the author to confirm or verify things.
2. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:183
- Draft comment:
Similarly, replacing the hardcoded output attribute key with SpanAttributes.TRACELOOP_ENTITY_OUTPUT enhances consistency across streaming and single-turn execution paths.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. This comment seems to be purely informative, explaining the benefit of a change without suggesting any action or asking for clarification. It doesn't align with the rules for useful comments.
3. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:264
- Draft comment:
The same improvements are applied in the _wrap_agent_run function for both input and output attribute settings using SpanAttributes constants. This reinforces uniform telemetry tagging.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply states what has been done in the code without offering any critique or asking for confirmation of intent.
4. packages/opentelemetry-instrumentation-openai-agents/opentelemetry/instrumentation/openai_agents/__init__.py:441
- Draft comment:
In the wrapped tool invocation, replacing hardcoded attribute keys with SpanAttributes constants for input and output further standardizes telemetry data tagging across tools.
- Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%. This comment is purely informative, explaining what a change does without suggesting any action or asking for clarification. It doesn't fit the criteria for a useful comment as per the rules.
Workflow ID: wflow_tb3BBxwvBzR3HCYJ
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
Important
Enhances OpenAI Agents SDK with improved tracing, multi-agent handoffs, and expanded test coverage, including a new sample app for recipe management.
- New `Recipe`, `SearchResponse`, `EditResponse`, and `RecipeModificationResult` models in `openai_agents_example.py`.
- New `plan_and_apply_recipe_modifications()` and `search_recipes()` functions in `openai_agents_example.py`.
- New `JSONEncoder` in `utils.py` for customized JSON serialization.
- New `test_recipe_workflow_agent_handoffs_with_function_tools.yaml` for testing agent handoffs.
- Updated `conftest.py` with fixtures for recipe workflow agents.
- Extended `test_openai_agents.py` for new functionalities.
- Updated `pyproject.toml` to include `openai-agents` and local OpenTelemetry packages.
This description was created by
for 1673c03. You can customize this summary. It will automatically update as commits are pushed.
Summary by CodeRabbit
New Features
Bug Fixes
Chores