
Memory Interface in AgentChat #4438

Open · wants to merge 19 commits into main
Conversation

@victordibia (Collaborator) commented Dec 1, 2024

Memory for AgentChat Agents

It would be useful to have some notion of memory, and the ability to attach memory to an agent.
Right now, the AssistantAgent can take a list of tools.

agent = Agent(model=model, tools=[])

Some use cases benefit from being able to retrieve memory just in time and add it to the prompt before the agent responds (RAG, etc.).

agent = Agent(model=model, tools=[], memory=[])

This PR implements

  • Memory protocol
  • ListMemory - a simple memory based on a list and basic similarity matching.
  • ChromaDBMemory - implemented using ChromaDB, with similar expected behaviour for other vector DB offerings such as Pinecone, ScaNN, FAISS, MongoDB, etc. (this implementation is added more as an example and might be removed and added somewhere else, e.g., in autogen_ext or a third-party repo)

Memory Behaviour

The Memory protocol that developers can implement:

from datetime import datetime
from typing import Any, Dict, List, Protocol, runtime_checkable

from pydantic import BaseModel, ConfigDict

# Indicative imports: ContentType, MemoryMimeType and BaseMemoryConfig are
# defined alongside this protocol; exact paths may differ in the final package.
from autogen_core import CancellationToken
from autogen_core.model_context import ChatCompletionContext


class MemoryContent(BaseModel):
    content: ContentType
    mime_type: MemoryMimeType | str
    metadata: Dict[str, Any] | None = None
    timestamp: datetime | None = None
    source: str | None = None
    score: float = 0.0

    model_config = ConfigDict(arbitrary_types_allowed=True)


@runtime_checkable
class Memory(Protocol):
    """Protocol defining the interface for memory implementations."""

    @property
    def name(self) -> str | None:
        """The name of this memory implementation."""
        ...

    @property
    def config(self) -> BaseMemoryConfig:
        """The configuration for this memory implementation."""
        ...

    async def transform(
        self,
        model_context: ChatCompletionContext,
    ) -> List[MemoryContent]:
        """
        Transform the provided model context using relevant memory content.

        Args:
            model_context: The context to transform

        Returns:
            List of memory entries with relevance scores
        """
        ...

    async def query(
        self,
        query: MemoryContent,
        cancellation_token: "CancellationToken | None" = None,
        **kwargs: Any,
    ) -> List[MemoryContent]:
        """
        Query the memory store and return relevant entries.

        Args:
            query: Query content item
            cancellation_token: Optional token to cancel operation
            **kwargs: Additional implementation-specific parameters

        Returns:
            List of memory entries with relevance scores
        """
        ...

    async def add(self, content: MemoryContent, cancellation_token: "CancellationToken | None" = None) -> None:
        """
        Add a new content to memory.

        Args:
            content: The memory content to add
            cancellation_token: Optional token to cancel operation
        """
        ...

    async def clear(self) -> None:
        """Clear all entries from memory."""
        ...

    async def cleanup(self) -> None:
        """Clean up any resources used by the memory implementation."""
        ...
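For illustration, a minimal implementation of this protocol might look like the following sketch. It uses naive keyword matching in place of real similarity scoring, and it assumes BaseMemoryConfig is default-constructible and that SystemMessage is importable from the Core model types (import paths are indicative):

from typing import Any, List

from autogen_core.models import SystemMessage  # path may differ by version


class KeywordListMemory:
    """Sketch: a Memory implementation using naive keyword matching."""

    def __init__(self, name: str | None = None) -> None:
        self._name = name
        self._config = BaseMemoryConfig()  # assumed default-constructible
        self._contents: List[MemoryContent] = []

    @property
    def name(self) -> str | None:
        return self._name

    @property
    def config(self) -> BaseMemoryConfig:
        return self._config

    async def add(
        self, content: MemoryContent, cancellation_token: "CancellationToken | None" = None
    ) -> None:
        self._contents.append(content)

    async def query(
        self,
        query: MemoryContent,
        cancellation_token: "CancellationToken | None" = None,
        **kwargs: Any,
    ) -> List[MemoryContent]:
        # Score each stored item by the fraction of query words it contains.
        words = str(query.content).lower().split()
        results: List[MemoryContent] = []
        for item in self._contents:
            text = str(item.content).lower()
            hits = sum(1 for w in words if w in text)
            if hits:
                results.append(item.model_copy(update={"score": hits / len(words)}))
        return sorted(results, key=lambda m: m.score, reverse=True)

    async def transform(self, model_context: ChatCompletionContext) -> List[MemoryContent]:
        # Query with the latest message and append any hits to the context.
        messages = await model_context.get_messages()
        if not messages:
            return []
        query = MemoryContent(content=str(messages[-1].content), mime_type=MemoryMimeType.TEXT)
        results = await self.query(query)
        if results:
            memory_text = "Relevant memories:\n" + "\n".join(str(r.content) for r in results)
            await model_context.add_message(SystemMessage(content=memory_text))
        return results

    async def clear(self) -> None:
        self._contents.clear()

    async def cleanup(self) -> None:
        pass  # Nothing to release for an in-process list.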

Integrating with AssistantAgent

Perhaps a big change with this PR is how AssistantAgent is extended to use memory.

  • AssistantAgent can receive a list of memory objects. It then calls the transform method of each memory to update the model_context.
  • Each transform call returns a list of memory items that are wrapped in a MemoryQueryEvent message and yielded for observability (see the sketch after this list).
  • The AssistantAgent implementation above focuses on querying memory and adding the results just in time to the agent context. It does not concern itself with how content is added to memory, the reason being that this can be heavily use-case driven. It is expected that the developer will call memory.add outside of agent logic.
  • Developers can implement their own custom memory classes by implementing the Memory protocol.
  • A basic ListMemory example class is provided.
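A simplified sketch of that agent-side flow, reusing the types defined above (MemoryQueryEvent's constructor is assumed here):

from typing import AsyncGenerator, List


async def apply_memories(
    memories: List[Memory],
    model_context: ChatCompletionContext,
    source: str,
) -> AsyncGenerator[MemoryQueryEvent, None]:
    """Let each memory transform the model context, then surface what it
    retrieved as a MemoryQueryEvent for observability."""
    for memory in memories:
        results = await memory.transform(model_context)
        if results:
            yield MemoryQueryEvent(content=results, source=source)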

Example Implementation

An example notebook highlights these features:

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.memory import ListMemory, MemoryContent, MemoryMimeType
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Initialize user memory
user_memory = ListMemory()

# Add user preferences to memory
await user_memory.add(MemoryContent(content="The weather should be in metric units", mime_type=MemoryMimeType.TEXT))

await user_memory.add(MemoryContent(content="Meal recipe must be vegan", mime_type=MemoryMimeType.TEXT))


async def get_weather(city: str, units: str = "imperial") -> str:
    if units == "imperial":
        return f"The weather in {city} is 73 degrees and Sunny."
    elif units == "metric":
        return f"The weather in {city} is 23 degrees and Sunny."
    # Fall back so the declared str return type always holds.
    return f"Sorry, I don't have weather for {city} in '{units}' units."


assistant_agent = AssistantAgent(
    name="assistant_agent",
    model_client=OpenAIChatCompletionClient(
        model="gpt-4o-2024-08-06",
    ),
    tools=[get_weather],
    memory=[user_memory],
)
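
Running the agent then produces the stream below (the run call itself is implied by the output; run_stream and Console are the standard AgentChat entry points):

# Stream the agent's response to the console.
stream = assistant_agent.run_stream(task="What is the weather in New York?")
await Console(stream)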
---------- user ----------
What is the weather in New York?
---------- assistant_agent ----------
[MemoryContent(content='The weather should be in metric units', mime_type=<MemoryMimeType.TEXT: 'text/plain'>, metadata=None, timestamp=None, source=None, score=0.463768115942029)]
---------- assistant_agent ----------
[FunctionCall(id='call_OkQ4Z7u2RZLU6dA7GTAQiG9j', arguments='{"city":"New York","units":"metric"}', name='get_weather')]
[Prompt tokens: 128, Completion tokens: 20]
---------- assistant_agent ----------
[FunctionExecutionResult(content='The weather in New York is 23 degrees and Sunny.', call_id='call_OkQ4Z7u2RZLU6dA7GTAQiG9j')]
---------- assistant_agent ----------
The weather in New York is 23 degrees and Sunny.
---------- Summary ----------
Number of messages: 5
Finish reason: None
Total prompt tokens: 128
Total completion tokens: 20
Duration: 0.80 seconds

Related issue number

Closes #4039, #4648

TBD

  • Finalize design
  • Add tests


Open Questions / TODO

  • Figure out where to collate examples of more complex memory implementations, e.g., ChromaDB, or duck typing around Semantic Kernel memory objects.


Review comment on an earlier revision of the query signature:

async def query(
    self,
    query: Union[str, Image, List[Union[str, Image]]],

Collaborator: query could also benefit from mime types.

Collaborator (Author): Yes, query is now also a MemoryContent object with a mime type.

@afourney (Member) commented Dec 5, 2024

I agree with @husseinmozannar that this is a reasonable and very clean implementation of this idea -- if perhaps a little restrictive. I like the idea of passing the entire context (or perhaps even state!) to the query engine. It's also worth thinking about whether we can somehow parameterize how the memory is added to the context at inference time. E.g., this implementation adds memory right after the system prompt, and without any explanation or preamble. Other implementations are also reasonable. For example, you could introduce memory with something like: "As you work through the user's request, the following snippets may, or may not, be helpful:" You could decide to include memory as the second-to-last message, or the last message (rather than the second). In AutoGen 0.2, we had the idea of context transformers. I wonder if something similar could work here.
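Concretely, such parameterization might look something like the following sketch (purely illustrative; none of these names exist in the PR):

from dataclasses import dataclass
from enum import Enum


class InsertionPosition(Enum):
    AFTER_SYSTEM = "after_system"        # what this PR currently does
    SECOND_TO_LAST = "second_to_last"
    LAST = "last"


@dataclass
class MemoryInjectionConfig:
    """Hypothetical knobs for how retrieved memory enters the context."""

    position: InsertionPosition = InsertionPosition.AFTER_SYSTEM
    preamble: str = (
        "As you work through the user's request, the following snippets "
        "may, or may not, be helpful:"
    )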

@ekzhu (Collaborator) commented Dec 11, 2024

I second @afourney and @husseinmozannar's suggestion. I think the query method forces the caller of the memory (e.g., AssistantAgent) to make an upfront choice about how memory is added to the context.

How about letting the memory protocol provide a transform method that takes a model context (i.e., a list of LLMMessage, tool calls, etc.) and returns a transformed model context that can be sent to the model client directly? This way the caller of the memory module doesn't need to make an opinionated decision about how to query and how to use the result; rather, we leave this decision to the memory module itself, and the caller of AssistantAgent can choose from a preset or customize it at the application level.

There is a ModelContext module in the Core API that is barely used; perhaps we can refine that one and make it work side by side with the memory protocol.
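A rough sketch of that proposal, using the ChatCompletionContext type from the protocol above (hypothetical; this is not the protocol the PR implements):

from typing import Protocol


class ContextTransformMemory(Protocol):
    """Hypothetical: the memory owns the decision of how to rewrite the context."""

    async def transform(self, model_context: ChatCompletionContext) -> ChatCompletionContext:
        """Query internally and return a context ready for the model client."""
        ...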

@gagb (Collaborator) commented Dec 18, 2024

Chatted with @victordibia. The API is nice and clean, and I agree with its usefulness.

It would be useful to have the following somewhere in the repo, but not in the base protocol:

  • an example of memory-related events being raised for observability
  • an example of an agent selectively calling .pop on the memory
  • an example of an agent selectively calling .add on the memory -- replicates the memory feature in the ChatGPT UI
  • a full-fledged RAG agent implemented using this protocol. I would like to be able to add the AutoGen repo to it and ask questions.

@lspinheiro (Collaborator) commented
Adding my 2 cents here, building on @gagb's list above. I think it would be interesting to have a lower-level abstraction for storage and information types from which MimeType and MemoryContent are derived. There may be some differences between knowledge-base retrieval and memory retrieval that are worth considering when creating these abstractions.

I think it could be useful to think about how memory uses storage: Chroma would be one storage implementation that some VectorEmbeddingMemory uses, and users could easily swap in whatever vector database they want. The storage abstraction could then also be adapted to any knowledge-base retrievers we decide to implement. Most other agentic frameworks, such as Semantic Kernel and LangChain, have abstractions at the storage layer as well, and it may be easier for us to create adapters this way.
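A sketch of what that layering might look like (all names hypothetical):

from typing import List, Protocol


class VectorStore(Protocol):
    """Hypothetical storage-layer protocol that a vector memory could target."""

    async def upsert(self, items: List[MemoryContent]) -> None: ...

    async def search(self, embedding: List[float], k: int = 5) -> List[MemoryContent]: ...


class VectorEmbeddingMemory:
    """Memory backed by any VectorStore: swap Chroma, FAISS, Pinecone, etc."""

    def __init__(self, store: VectorStore) -> None:
        self._store = store

The design point is that only the VectorStore adapter is database-specific; the memory logic above it stays unchanged.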

@victordibia (Collaborator, Author) commented

@lspinheiro, the AgentChat framework will likely only have the Memory protocol; developers should implement it for whatever vector, graph, or other type of just-in-time memory they need for their agent.

I think it would be interesting to have a lower-level abstraction for storage and information types from which MimeType and MemoryContent are derived

Good idea, can you propose some concrete examples?

I think it could be useful to think about how memory uses storage: Chroma as a storage implementation that some VectorEmbeddingMemory uses, so users can easily swap in whatever vector database they want.

I think I understand your comment here, i.e., that VectorEmbeddingMemory is a general enough case that we should explore a standardized implementation that enables easily switching between standard vector DBs. One thing to note is that the APIs for these DBs are different enough that there will still be quite a bit of code written specifically for each. That being said, perhaps we can get the base Memory protocol done in this PR and then open a new issue for designing something for VectorEmbeddingMemory.

@ekzhu (Collaborator) commented Dec 20, 2024

Agree with @victordibia here; let's focus on the memory protocol first before worrying about implementation-level details.

Furthermore, I would argue that we should be careful not to introduce too many abstractions.

open a new issue for designing something for VectorEmbeddingMemory

We should take a look at Semantic Kernel's vector memory abstraction and consider adopting it or duck typing it.

codecov bot commented Jan 4, 2025

Codecov Report

Attention: Patch coverage is 82.75862% with 25 lines in your changes missing coverage. Please review.

Project coverage is 68.41%. Comparing base (2eb46d2) to head (dfb1da6).

Files with missing lines Patch % Lines
...tchat/src/autogen_agentchat/memory/_list_memory.py 79.22% 16 Missing ⚠️
...tchat/src/autogen_agentchat/memory/_base_memory.py 83.72% 7 Missing ⚠️
...t/src/autogen_agentchat/agents/_assistant_agent.py 87.50% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4438      +/-   ##
==========================================
+ Coverage   68.21%   68.41%   +0.20%     
==========================================
  Files         158      161       +3     
  Lines        9960    10104     +144     
==========================================
+ Hits         6794     6913     +119     
- Misses       3166     3191      +25     
Flag Coverage Δ
unittests 68.41% <82.75%> (+0.20%) ⬆️



@victordibia victordibia marked this pull request as ready for review January 4, 2025 16:57
@victordibia victordibia changed the title [Draft, Feedback Needed] Memory in AgentChat Memory Interface in AgentChat Jan 4, 2025