feat: add run_debug() helper method for quick agent experimentation #3345
• Introduce run_debug() to reduce boilerplate from 7-8 lines to 2 lines
• Enable quick testing in notebooks, REPL, and during development
• Support single or multiple queries with automatic session management
• Add verbose flag to show/hide tool calls and intermediate processing
• Include comprehensive test suite with 23 test cases
• Provide complete working example with documentation
• This is a convenience method for experimentation, not a replacement for run_async()
Summary of Changes
Hello @lavinigam-gcp, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request enhances the developer experience within the ADK framework by introducing run_debug(), a new asynchronous helper method for the InMemoryRunner. This method reduces the boilerplate code required to interact with agents, making quick experimentation, debugging, and prototyping more efficient, especially in environments like Jupyter notebooks or the Python REPL. It abstracts away session management and event streaming, while still offering a verbose option for detailed insights into agent operations.
Code Review
This pull request introduces the run_debug() helper method to the InMemoryRunner class, significantly reducing the boilerplate code required for agent experimentation and testing. The new method simplifies agent interaction, supports single or multiple queries, automatic session management, and provides a verbose flag for debugging. Comprehensive unit tests and a working example are included to demonstrate its functionality. I have provided comments to address a potential error handling issue and suggest improvements to the documentation and code clarity.
Returns:
  None by default when return_events=False (just prints output).
  list[Event] if return_events=True, containing all events from all queries.
  Returns empty list [] if return_events=True and no queries provided.
The documentation states that an empty list [] is returned if return_events=True and no queries are provided. However, the code does not explicitly handle this case. Add a check to ensure an empty list is returned when user_queries is None and return_events is True.
- Returns empty list [] if return_events=True and no queries provided.
+ # Return collected events or None
+ if return_events and collected_events is None:
+   return []
+ return collected_events
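One way to read the reviewer's point: the docstring promises `[]` when `return_events=True` and no queries are given, so the return path needs an explicit branch for that case. A minimal sketch, assuming a hypothetical standalone helper `finalize_events` (not part of ADK) and assuming `collected_events` stays `None` when nothing was collected:

```python
from typing import Optional


def finalize_events(
    collected_events: Optional[list],
    return_events: bool,
) -> Optional[list]:
  """Hypothetical helper: pick the value a run_debug()-style method returns."""
  if not return_events:
    # Default mode only prints output, so there is nothing to return.
    return None
  # Normalize "nothing collected" to an empty list, as the docstring promises.
  return [] if collected_events is None else collected_events
```

Keeping the decision in one place makes the documented contract easy to unit test without running an agent.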
if not session:
  raise ValueError(
      f"Failed to create or retrieve session '{session_name}' "
      f"for user '{user_id}' in app '{self.app_name}'"
  )
This ValueError exception lacks specific context about the cause of session creation/retrieval failure. Enhance the message to include details from the underlying exception, aiding in debugging.
Consider logging the underlying exception for further diagnosis.
- if not session:
-   raise ValueError(
-       f"Failed to create or retrieve session '{session_name}' "
-       f"for user '{user_id}' in app '{self.app_name}'"
-   )
+   raise ValueError(
+       f"Failed to create or retrieve session '{session_name}' "
+       f"for user '{user_id}' in app '{self.app_name}'. "
+       f"Underlying error: {e}"
+   )
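Note that the suggested f-string references `e`, which is only defined inside an `except` block; the quoted `if not session:` check has no exception in scope. A minimal sketch of one way to carry the underlying error, assuming a hypothetical `load_session` function and a plain dict as the store (neither is ADK API), using `raise ... from` so the cause is preserved:

```python
def load_session(store: dict, app_name: str, user_id: str, session_name: str):
  """Hypothetical lookup that reports the underlying failure on error."""
  try:
    # A KeyError here stands in for whatever a real session service raises.
    return store[(app_name, user_id, session_name)]
  except KeyError as e:
    raise ValueError(
        f"Failed to create or retrieve session '{session_name}' "
        f"for user '{user_id}' in app '{app_name}'. "
        f"Underlying error: {e!r}"
    ) from e
```

With `from e`, the original exception stays available as `__cause__` and appears in the traceback, which covers the reviewer's request for more diagnostic context.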
Merge #3345

Add run_debug() helper method to InMemoryRunner that reduces agent execution boilerplate from 7-8 lines to just 2 lines, making it ideal for quick experimentation, notebooks, and getting started with ADK.

**Key changes:**

• Introduce run_debug() to reduce boilerplate from 7-8 lines to 2 lines
• Enable quick testing in notebooks, REPL, and during development
• Support single or multiple messages with automatic session management
• Add verbose flag to show/hide tool calls and intermediate processing
• Add quiet flag to suppress console output while capturing events
• Extract event printing logic to reusable utility (utils/_debug_output.py)
• Include comprehensive test suite with 21 test cases covering all part types
• Provide complete working example with 8 usage patterns
• **This is a convenience method for experimentation, not a replacement for run_async()**

### Link to Issue or Description of Change

**1. Link to an existing issue (if applicable):**

* N/A - New feature to improve developer experience

**2. Or, if no issue exists, describe the change:**

**Problem:**
Developers need to write 7-8 lines of boilerplate code just to test a simple agent interaction during development. This creates friction for:

* New developers getting started with ADK
* Quick experimentation in Jupyter notebooks or Python REPL
* Debugging agent behavior during development
* Writing examples and tutorials
* Rapid prototyping of agent capabilities

**Solution:**
Introduce `run_debug()` as a convenience helper method specifically designed for quick experimentation and getting started scenarios.

This method:

* **Is NOT a replacement for `run_async()`** - it's a developer convenience tool
* **Reduces boilerplate** from 7-8 lines to just 2 lines for simple testing
* **Handles session management automatically** with sensible defaults
* **Provides debugging visibility** with optional verbose flag for tool calls
* **Supports common patterns** like multiple messages and event capture
* **Type-safe implementation** using direct attribute access instead of getattr()

### Before vs After Comparison

**BEFORE - Current approach requires 7-8 lines of boilerplate:**

```python
from google.adk import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

# Define a simple agent
agent = Agent(
    model="gemini-2.5-flash",
    instruction="You are a helpful assistant"
)

# Need all this boilerplate just to test the agent
APP_NAME = "default"
USER_ID = "default"
session_service = InMemorySessionService()
runner = Runner(agent=agent, app_name=APP_NAME, session_service=session_service)
session = await session_service.create_session(
    app_name=APP_NAME, user_id=USER_ID, session_id="default"
)
# Part.from_text() takes its text as a keyword argument
content = types.Content(role="user", parts=[types.Part.from_text(text="Hello")])
async for event in runner.run_async(
    user_id=USER_ID, session_id=session.id, new_message=content
):
  if event.content and event.content.parts:
    print(event.content.parts[0].text)
```

**AFTER - With run_debug() helper, just 2 lines:**

```python
from google.adk import Agent
from google.adk.runners import InMemoryRunner

# Define the same agent
agent = Agent(
    model="gemini-2.5-flash",
    instruction="You are a helpful assistant"
)

# Test it with just 2 lines!
runner = InMemoryRunner(agent=agent)
await runner.run_debug("Hello")
```

### API Design

```python
async def run_debug(
    self,
    user_messages: str | list[str],
    *,
    user_id: str = 'debug_user_id',
    session_id: str = 'debug_session_id',
    run_config: RunConfig | None = None,
    quiet: bool = False,
    verbose: bool = False,
) -> list[Event]:
```

**Parameters:**

* `user_messages`: Single message string or list of messages (required)
* `user_id`: User identifier (default: 'debug_user_id')
* `session_id`: Session identifier for conversation continuity (default: 'debug_session_id')
* `run_config`: Optional advanced configuration
* `quiet`: Suppress console output (default: False)
* `verbose`: Show detailed tool calls and responses (default: False)

**Key Features:**

* **Always returns events** - Simplifies API, no conditional return type
* **Type-safe implementation** - Uses direct attribute access on Pydantic models
* **Text buffering** - Consecutive text parts printed without repeated author prefix
* **Smart truncation** - Long tool args/responses truncated for readability
* **Clean session management** - Get-then-create pattern, no try/except
* **Reusable printing logic** - Extracted to utils/_debug_output.py for other tools

### Implementation Highlights

**1. Event Printing Utility (utils/_debug_output.py):**

* Modular print_event() function for displaying events
* Text buffering to combine consecutive text parts
* Configurable truncation for different content types:
  - Function args: 50 chars max
  - Function responses: 100 chars max
  - Code output: 100 chars max
* Supports all ADK part types (text, function_call, executable_code, inline_data, file_data)

**2. Session Management:**

```python
# Clean get-then-create pattern (no try/except)
session = await self.session_service.get_session(
    app_name=self.app_name, user_id=user_id, session_id=session_id
)
if not session:
  session = await self.session_service.create_session(
      app_name=self.app_name, user_id=user_id, session_id=session_id
  )
```

**3. Type-Safe Event Processing:**

* Direct attribute access on Pydantic models (no getattr() or hasattr())
* Proper handling of all part types
* Leverages `from __future__ import annotations` for duck typing

### Important Note on Scope

`run_debug()` is a **convenience method for experimentation only**. For production applications requiring:

* Custom session services (Spanner, Cloud SQL)
* Fine-grained event processing control
* Error recovery and resumability
* Performance optimization
* Complex authentication flows

Continue using the standard `run_async()` method. The `run_debug()` helper is specifically designed to lower the barrier to entry and speed up the development/testing cycle.

### Testing Plan

**Unit Tests (21 test cases in tests/unittests/runners/test_runner_debug.py):**

**Core functionality (7 tests):**

* ✅ Single message execution and event return
* ✅ Multiple messages in sequence
* ✅ Quiet mode (suppresses output)
* ✅ Custom session_id configuration
* ✅ Custom user_id configuration
* ✅ RunConfig passthrough
* ✅ Session persistence across calls

**Part type handling (8 tests):**

* ✅ Tool calls and responses (verbose mode)
* ✅ Executable code parts
* ✅ Code execution result parts
* ✅ Inline data (images)
* ✅ File data references
* ✅ Mixed part types in single event
* ✅ Long output truncation
* ✅ Verbose flag behavior (show/hide tools)

**Edge cases (6 tests):**

* ✅ None text filtering
* ✅ Existing session handling
* ✅ Empty parts list
* ✅ None event content
* ✅ Verbose=False hides tool calls
* ✅ Verbose=True shows tool calls

**All 21 tests passing in 3.8s** ✓

**Manual End-to-End (E2E) Tests:**

Tested all 8 example patterns in contributing/samples/runner_debug_example/main.py:

1. ✅ Minimal 2-line usage
2. ✅ Multiple sequential messages
3. ✅ Session persistence across calls
4. ✅ Multiple user sessions (Alice & Bob)
5. ✅ Verbose mode for tool visibility
6. ✅ Event capture with quiet mode
7. ✅ Custom RunConfig integration
8. ✅ Before/after comparison

### Files Changed

**Core implementation:**

* src/google/adk/runners.py - Added run_debug() method (~60 lines)
* src/google/adk/utils/_debug_output.py - Event printing utility (~106 lines)

**Tests:**

* tests/unittests/runners/test_runner_debug.py - Comprehensive test suite (21 tests)

**Examples:**

* contributing/samples/runner_debug_example/agent.py - Sample agent with tools
* contributing/samples/runner_debug_example/main.py - 8 usage examples
* contributing/samples/runner_debug_example/README.md - Complete documentation

### Checklist

- [x] I have read the [CONTRIBUTING.md](https://github.com/google/adk-python/blob/main/CONTRIBUTING.md) document
- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes (21/21 passing)
- [x] I have manually tested my changes end-to-end (8 examples tested)
- [x] Code follows ADK style guide (relative imports, type hints, 2-space indentation)
- [x] Ran ./autoformat.sh before committing
- [x] Any dependent changes have been merged and published in downstream modules

### Additional Context

**Example with Tools (verbose mode):**

```python
# Create agent with tools
agent = Agent(
    model="gemini-2.5-flash",
    instruction="You can check weather and do calculations",
    tools=[get_weather, calculate]
)

# Test with verbose to see tool calls
runner = InMemoryRunner(agent=agent)
await runner.run_debug("What's the weather in SF?", verbose=True)

# Output:
# User > What's the weather in SF?
# agent > [Calling tool: get_weather({'city': 'San Francisco'})]
# agent > [Tool result: {'result': 'Foggy, 15°C (59°F)'}]
# agent > The weather in San Francisco is foggy, 15°C (59°F).
```

**Complete Example Included:**
The PR includes a full working example in `contributing/samples/runner_debug_example/` with:

* Agent with weather and calculator tools
* 8 different usage patterns
* Comprehensive README with troubleshooting
* Safe AST-based expression evaluation

**Breaking Changes:** None - this is purely additive.

**Security:** Example uses AST-based expression evaluation instead of eval().

**Code Quality:**

* Type-safe implementation (no getattr() or hasattr())
* Modular design (printing logic separated into utility)
* Follows ADK conventions (relative imports, from __future__ import annotations)
* Comprehensive error handling (gracefully handles None content, empty parts)
* Well-documented with docstrings and inline comments

END_PUBLIC

---

## Key Changes from Original:

1. ✅ Updated parameter name: `user_queries` → `user_messages`
2. ✅ Updated parameter name: `session_name` → `session_id`
3. ✅ Updated parameter name: `print_output` → `quiet`
4. ✅ Removed `return_events` parameter
5. ✅ Updated test count: 23 → 21
6. ✅ Changed "queries" → "messages" throughout
7. ✅ Added implementation highlights section
8. ✅ Added details about utils/_debug_output.py
9. ✅ Updated default values to debug_user_id/debug_session_id
10. ✅ Noted type-safe implementation
11. ✅ Added Code Quality section
12. ✅ Updated API signature to match final refactored version
13. ✅ Removed optional return type (always returns list[Event])

Co-authored-by: Wei Sun (Jack) <[email protected]>
COPYBARA_INTEGRATE_REVIEW=#3345 from lavinigam-gcp:adk-runner-helper e0050b9
PiperOrigin-RevId: 826607817
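The truncation limits quoted in the commit message (50 characters for function args, 100 for function responses and code output) can be illustrated with a small helper. This is a sketch under assumptions: the `truncate` name, the constant names, the `...` suffix, and whether the limit counts the suffix are illustrative and not taken from `utils/_debug_output.py`.

```python
# Limits quoted in the commit message; the constant names are hypothetical.
MAX_ARGS_LEN = 50
MAX_RESPONSE_LEN = 100


def truncate(text: str, max_len: int) -> str:
  """Return text unchanged if short enough, else cut it and append '...'."""
  if len(text) <= max_len:
    return text
  # Assumption: the cut text keeps max_len characters before the suffix.
  return text[:max_len] + "..."
```

Separate limits per content type keep short tool arguments compact while leaving more room for tool responses, which tend to carry the information a reader wants to see.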
Thank you @lavinigam-gcp for your contribution! 🎉 Your changes have been successfully imported and merged via Copybara in commit 0487eea. Closing this PR as the changes are now in the main branch.
Add run_debug() helper method to InMemoryRunner that reduces running agent boilerplate from 7-8 lines to just 2 lines, making it ideal for quick experimentation, notebooks, and getting started with ADK.
• Introduce run_debug() to reduce boilerplate of executing agent from 7-8 lines to 2 lines
• Enable quick testing in notebooks, REPL, and during development
• Support single or multiple queries with automatic session management
• Add verbose flag to show/hide tool calls and intermediate processing
• Include comprehensive test suite with 23 test cases
• Provide complete working example
• This is a convenience method for experimentation, not a replacement for run_async()
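The "single or multiple queries with automatic session management" behavior in the list above can be sketched without ADK. Everything here is an assumption for illustration: `FakeSessionService`, `run_queries`, and the dict-based session are hypothetical stand-ins, not the real runner or session service.

```python
import asyncio
from typing import Optional, Union


class FakeSessionService:
  """Hypothetical in-memory stand-in for a session service (not ADK API)."""

  def __init__(self) -> None:
    self._sessions: dict = {}

  async def get_session(self, session_id: str) -> Optional[dict]:
    return self._sessions.get(session_id)

  async def create_session(self, session_id: str) -> dict:
    session = {"id": session_id, "messages": []}
    self._sessions[session_id] = session
    return session


async def run_queries(
    service: FakeSessionService,
    user_queries: Union[str, list],
    session_id: str = "debug_session",
) -> dict:
  # Accept a single string or a list of strings.
  queries = [user_queries] if isinstance(user_queries, str) else user_queries
  # Reuse the session if it exists, otherwise create it.
  session = await service.get_session(session_id)
  if session is None:
    session = await service.create_session(session_id)
  for query in queries:
    session["messages"].append(query)
  return session


if __name__ == "__main__":
  service = FakeSessionService()
  asyncio.run(run_queries(service, "Hello"))
  session = asyncio.run(run_queries(service, ["How are you?", "Bye"]))
  print(session["messages"])  # ['Hello', 'How are you?', 'Bye']
```

Because the second call finds the session created by the first, the conversation history carries over, which is the kind of continuity a default session identifier provides.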
Link to Issue or Description of Change
1. Link to an existing issue (if applicable):
2. Or, if no issue exists, describe the change:
Problem:
Developers need to write 7-8 lines of boilerplate code just to test a simple agent interaction during development. This creates friction for:
Solution:
Introduce run_debug() as a convenience helper method specifically designed for quick experimentation and getting-started scenarios. This method is not a replacement for run_async(); it's a developer convenience tool.
Before vs After Comparison
BEFORE - Current approach requires 7-8 lines of boilerplate:
AFTER - With run_debug() helper, just 2 lines:
Important Note on Scope
run_debug() is a convenience method for experimentation only. For production applications, continue using the standard run_async() method. The run_debug() helper is specifically designed to lower the barrier to entry and speed up the development/testing cycle.
Testing Plan
Unit Tests:
Test Coverage (tests/unittests/runners/test_runner_debug.py):
Core functionality tests (11):
Part type display tests (7):
Verbose flag tests (5):
Manual End-to-End (E2E) Tests:
Tested all 8 example patterns in contributing/samples/runner_debug_example/main.py:
Checklist
Additional context
API Design:
Key Features:
verbose=True
Example with Tools (verbose mode):
Complete Example Included:
The PR includes a full working example in contributing/samples/runner_debug_example/.
Breaking Changes: None - this is purely additive.
Security: Example uses AST-based evaluation instead of eval().
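To show what AST-based evaluation can look like in place of eval(), here is a minimal sketch. It is not the code from the sample's agent.py; the `safe_eval` name and the chosen operator whitelist are assumptions for illustration.

```python
import ast
import operator
from typing import Union

# Whitelist of arithmetic operators the evaluator accepts.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval(expression: str) -> Union[int, float]:
  """Evaluate a basic arithmetic expression without calling eval()."""

  def _eval(node: ast.AST) -> Union[int, float]:
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
      return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
      return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
      return _UNARY_OPS[type(node.op)](_eval(node.operand))
    # Names, calls, attribute access, etc. are rejected outright.
    raise ValueError(f"Unsupported expression element: {type(node).__name__}")

  return _eval(ast.parse(expression, mode="eval").body)
```

Since only whitelisted node types are evaluated, an input such as `__import__('os')` raises `ValueError` instead of executing.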