-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Core] Roadmap for handling context overflow #156
Comments
Probably a slight tangent, but I'm finding error stack traces to a major culprit of context overflow |
Would using something like Llama Index help? |
could this work? https://memgpt.ai/ |
MemWalker processed long context intro a tree of summaries |
@Hacker0912 Are you interested in this topic? |
AutoGen is a great project! I'm very interested in how do you solve the context overflow. |
@qidanrui Here is a experimental PR for compression: #131 . It would be great if you can check it out and test it! Just found a potential good solution for compression and I will look into this: https://arxiv.org/abs/2310.06839 |
Thanks for sharing! @kevin666aa I'm so interested in the AutoGen project! |
i also am having issue with using a mistral model , using textgen webui as the api host openai.error.InvalidRequestError: This model maximum context length is 2048 tokens. However, your messages resulted in over 2165 tokens. |
Edit: import os
import autogen
import asyncio
from absl import app, flags
config_list = [
{
'model': 'gpt-4',
'api_key': os.getenv('OPENAI_API_KEY'),
},
]
MEMGPT = True
if not MEMGPT:
llm_config = {"config_list": config_list, "seed": 42}
user_proxy = autogen.UserProxyAgent(
name="User_proxy",
system_message="A human admin.",
code_execution_config={"last_n_messages": 2, "work_dir": "groupchat"},
human_input_mode="TERMINATE"
)
coder = autogen.AssistantAgent(
name="Coder",
llm_config=llm_config,
)
pm = autogen.AssistantAgent(
name="Product_manager",
system_message="Creative in software product ideas.",
llm_config=llm_config,
)
else:
import memgpt.autogen.memgpt_agent as memgpt_autogen
import memgpt.autogen.interface as autogen_interface
import memgpt.agent as agent
import memgpt.system as system
import memgpt.utils as utils
import memgpt.presets as presets
import memgpt.constants as constants
import memgpt.personas.personas as personas
import memgpt.humans.humans as humans
from memgpt.persistence_manager import InMemoryStateManager, InMemoryStateManagerWithPreloadedArchivalMemory, InMemoryStateManagerWithFaiss
llm_config = {"config_list": config_list, "seed": 42}
user_proxy = autogen.UserProxyAgent(
name="User_proxy",
system_message="A human admin.",
code_execution_config={"last_n_messages": 2, "work_dir": "groupchat"},
)
interface = autogen_interface.AutoGenInterface()
persistence_manager = InMemoryStateManager()
memgpt_agent = presets.use_preset(presets.DEFAULT, 'gpt-4', personas.get_persona_text(personas.DEFAULT), humans.get_human_text(humans.DEFAULT), interface, persistence_manager)
# MemGPT coder
coder = memgpt_autogen.MemGPTAgent(
name="MemGPT_coder",
agent=memgpt_agent,
)
# non-MemGPT PM
pm = autogen.AssistantAgent(
name="Product_manager",
system_message="Creative in software product ideas.",
llm_config=llm_config,
)
groupchat = autogen.GroupChat(agents=[user_proxy, coder, pm], messages=[], max_round=12)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)
user_proxy.initiate_chat(manager, message="First send the message 'Let's go Mario!'") |
I suggest starting with the simplest implementations without external dependencies first and later building on these. Later on, the user can switch between different implementations how to handle context window through truncation, vector embeddings DB or MemGPT. |
Hello, @MrXandbadas Thanks for the heads up! It's great to have a memGPT agent in AutoGen. It requires much effort to add memgpt, but it seems that we can work with people from memGPT to make it happen. I guess the first step would be to make memgpt a built-in agent in AutoGen. Then we can think about how users can switch with different options for context overflow. @PriNova Thanks for the advice! Your suggestion of truncating history is brought up here #195.
From the first part, it would be easy to add different ways to handle context later. Please take a look at it if you are interested! @PriNova @MrXandbadas |
Sounds good, except "the first step would be to make memgpt a built-in agent in AutoGen." Before we make it a built-in agent, it'll be helpful to demonstrate one good use case of memgpt-based agent in autogen. |
I must be totally misunderstanding the request for "one good use case". Would this approach not give AutoGen Agents near limitless long term memory storage? |
Just suggesting doing it step by step. Test-driven development. |
I completely agree. Testing it in an more robust setup would garner more fruitful insight as to if it was going to be beneficial for the system or just a hinderance in the continual prompt generation that allows these agents to function so flawlessly. |
I think it's gonna be a great start. |
I am interested in this topic, is anyone working on this? I can help can work together with others |
Hello @SDcodehub, thanks for your interest! Currently I am working on adding a compressible agent #443. It could be used as an interface for different types of compression and truncations. On the other hand, it is also possible to utilize existing framework like memGPT. memGPT is actively supporting autogen agents. As @sonichi pointed out, we need to "demonstrate one good use case of memgpt-based agent in autogen". It is not hard to add a memGPT agent, but how to modify it to serve as a group memory requires more effort and thinking. |
Hey guys! The lack of possibility to see exact prompt which agent gets to a context and lack of ability of managing agent's contexts are the main problems of autogen in my opinion. Main reason for it is not even overflow of context window - the main reasons is the LLMs working much worse when had too long context with a lot of useless noise, and also generate unnecesary costs. Maybe you remember, lack of input prompt visibility was big problem with Langchain, until they did a Langsmith. With autogen we have same problem again. In my opinion, we can't talk about any serious AI development if we can't see and edit input prompt of LLMs. What do you think about it? Is such features of editing context (as summarizing or removing old messages) will be available in near future? Maybe there are already solutions exist I don't know about? Cheers! |
I'm quite excited at #1091 by @rickyloynd-microsoft. It makes the teachability a composable capability to any conversable agent. More generally, the same mechanism may be used for solving other longstanding issues like long context handling and allowing other interesting capabilities to be defined. I like the extensibility and the composability of this approach. Reviews are welcome. |
@sonichi |
Teachability is just one capability added through this new mechanism, and teachability is not designed to compress context or memorize general things like MemGPT. But other capabilities (like MemGPT or other ways of handling long context) could be added through this general capability-addition mechanism. |
Shall we close this issue as several recent PRs related to long context handling have merged. @kevin666aa |
Yes, thanks! |
Hi, I'm working on conversable agent flow with autogen. and really wants to know the status of handling context window length and truncate chat history. I read the above conversation and have a few questions?
It would be really helpful if you can answer the question! |
@JingPush current we use this: https://microsoft.github.io/autogen/docs/topics/long_contexts |
Any help is appreciated!
Thit task demands a considerable amount of effort. If you have insights, suggestions, or can contribute in any way, your help would be immensely valued.
Problem Description
(Continue from #9) Current LLM have limited context size / token limit (gpt3.5turbo: 4096, gpt4 8192, etc). Although the current max_token limit from OpenAI is sufficient for many tasks, the token limit will be always exceeded with the conversation running. autogen.Completion will raise this InvalidRequestError that indicates the context size is exceeded since autogen doesn’t have a way to handle long context sizes.
Potential Methods
Some References
Compression & Truncation
Retrieval
The text was updated successfully, but these errors were encountered: