Add RetrieveChat #1158
Conversation
I reviewed all except the notebook. The change to the notebook can be made in the next round.
Overall, the application looks interesting and general. The implementation is creative. It's worth considering writing a blog post about it.
flaml/autogen/retrieval_utils.py
Nice. Could be useful for other applications too.
@QingYun
Could you add a test?
Could you review #1165? It provides some utilities for this PR.
if "exitcode: 0 (execution succeeded)" in message.get("content", ""): | ||
return "TERMINATE" |
Sometimes even when the execution succeeds, the task is still not finished. The result might indicate a logic error or require further steps.
Sometimes the task is not solved even when the code execution succeeds, but it's hard to tell automatically. If the human_input_mode of RetrieveUserProxyAgent is "TERMINATE" or "ALWAYS", the user can still continue the conversation.
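For reference, here's a minimal standalone sketch of the check under discussion (the function name is illustrative; the message format mirrors the snippet above):

```python
def check_termination(message: dict):
    # A successful execution is taken as a termination hint. With
    # human_input_mode "TERMINATE" or "ALWAYS", the user is still prompted
    # and can continue the conversation if the task isn't actually solved.
    content = message.get("content", "") or ""
    if "exitcode: 0 (execution succeeded)" in content:
        return "TERMINATE"
    return None
```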
# return exitcode, log, None

result = self._ipython.run_cell(code)
log = str(result.result)
The problem with this is that result.result is often None, because stdout and stderr are not part of result.result. This is true for the example you had in the notebook.
If capture doesn't work, then we need to find some other way to get the stdout + stderr.
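One possible way to recover the streams is IPython's capture_output context manager; a sketch, with run_cell_with_log as an illustrative helper rather than code from this PR:

```python
from IPython import get_ipython
from IPython.utils.capture import capture_output

def run_cell_with_log(code: str):
    # capture_output collects the stdout/stderr emitted while the cell
    # runs, which result.result alone does not include.
    ipython = get_ipython()  # returns None outside an IPython session
    with capture_output() as captured:
        result = ipython.run_cell(code)
    log = captured.stdout + captured.stderr
    if result.result is not None:
        log += str(result.result)
    return result, log
```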
I'm not sure whether we should capture all of the stdout and stderr outputs and include them in the message. In the case of FLAML, the training logs can often be too long to fit into the LLM tokens. The result.result can only capture concise error information, which may not be sufficient for debugging problems. Therefore, I believe that for complex problems, it's essential to include human input as well.
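If we do end up capturing everything, one simple mitigation is to keep only the tail of the log, where errors usually appear; a sketch with an illustrative threshold:

```python
def truncate_log(log: str, max_chars: int = 2000) -> str:
    # Keep only the tail so the log fits the LLM's context budget;
    # 2000 is an arbitrary example value, not a number from this PR.
    if len(log) <= max_chars:
        return log
    return "...(log truncated)...\n" + log[-max_chars:]
```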
We can work on this in a future PR.
Most comments are for changes required in future PRs. A few comments need to be addressed in this PR. If it's unclear which are required for this PR, please let me know.
self.register_auto_reply(Agent, RetrieveUserProxyAgent._generate_retrieve_user_reply)

@staticmethod
def get_max_tokens(model="gpt-3.5-turbo"):
This method shouldn't be in this class. Move it to oai.openai_utils? Can be done in the next PR.
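For illustration, a sketch of what the lookup could look like once moved out of the class; the limits below are the commonly published context sizes, not values taken from this PR:

```python
def get_max_tokens(model: str = "gpt-3.5-turbo") -> int:
    # Assumed mapping from model-name substrings to context sizes.
    if "32k" in model:
        return 32768
    if "16k" in model:
        return 16384
    if "gpt-4" in model:
        return 8192
    return 4096
```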
self._client = self._retrieve_config.get("client", chromadb.Client())
self._docs_path = self._retrieve_config.get("docs_path", "./docs")
self._collection_name = self._retrieve_config.get("collection_name", "flaml-docs")
self._model = self._retrieve_config.get("model", "gpt-4")
This info is supposed to be in the sender's llm_config, not here. Refactoring will be needed later.
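For context, a caller-side sketch of how these settings would be supplied via retrieve_config (the import path is an assumption, not necessarily this PR's layout):

```python
import chromadb
# Assumed module path; adjust to wherever the class lives in this PR.
from flaml.autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="TERMINATE",
    retrieve_config={
        "client": chromadb.Client(),
        "docs_path": "./docs",
        "collection_name": "flaml-docs",
        "model": "gpt-4",
    },
)
```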
self._docs_path = self._retrieve_config.get("docs_path", "./docs")
self._collection_name = self._retrieve_config.get("collection_name", "flaml-docs")
self._model = self._retrieve_config.get("model", "gpt-4")
self._max_tokens = self.get_max_tokens(self._model)
As discussed in #1160 , where to get the token limit info may change in future.
self._doc_idx = -1  # the index of the currently used doc
self._results = {}  # the results of the current query
This assumes working with a single agent. Different agents will need different contexts. Needs to change later.
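A toy sketch of keying the state by agent, so concurrent conversations don't clobber each other (all names are illustrative):

```python
from collections import defaultdict

class RetrievalState:
    # Per-agent retrieval state instead of one shared doc index / result set.
    def __init__(self):
        self.doc_idx = -1  # index of the doc currently in use
        self.results = {}  # results of the current query

# one state object per peer agent, created lazily on first access
states = defaultdict(RetrievalState)
states["assistant_1"].doc_idx = 0  # "assistant_1" is a placeholder name
```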
self._ipython = get_ipython()
self._doc_idx = -1  # the index of the currently used doc
self._results = {}  # the results of the current query
self.register_auto_reply(Agent, RetrieveUserProxyAgent._generate_retrieve_user_reply)
A more recommended way is to register the context needed by this function through context instead of class variables. Can refactor later.
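A toy illustration of that suggestion, where the reply function receives its state as an explicitly registered context instead of reading class variables (MiniAgent is a stand-in, not the real API):

```python
from typing import Any, Callable, Dict, Optional

class MiniAgent:
    # Toy stand-in for the real agent: each reply function is registered
    # together with the context dict it needs.
    def __init__(self):
        self._reply_funcs = []  # list of (func, context) pairs

    def register_auto_reply(self, func: Callable, context: Dict[str, Any]):
        self._reply_funcs.append((func, context))

    def generate_reply(self, message: str) -> Optional[str]:
        for func, context in self._reply_funcs:
            reply = func(message, context)
            if reply is not None:
                return reply
        return None

def retrieve_reply(message: str, context: Dict[str, Any]) -> str:
    context["doc_idx"] += 1  # state lives in the registered context
    return f"retrieved doc #{context['doc_idx']} for: {message}"

agent = MiniAgent()
agent.register_auto_reply(retrieve_reply, {"doc_idx": -1, "results": {}})
print(agent.generate_reply("What is FLAML?"))
```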
TEXT_FORMATS = ["txt", "json", "csv", "tsv", "md", "html", "htm", "rtf", "rst", "jsonl", "log", "xml", "yaml", "yml"]


def num_tokens_from_text(
oai.openai_utils is a better fit? Can change later.
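For illustration, a tiktoken-based sketch of such a counter (the fallback encoding is an assumption):

```python
import tiktoken

def num_tokens_from_text(text: str, model: str = "gpt-3.5-turbo") -> int:
    # Resolve the tokenizer for the model; fall back to cl100k_base for
    # model names tiktoken does not recognize.
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        encoding = tiktoken.get_encoding("cl100k_base")
    return len(encoding.encode(text))
```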
For future PRs that require an OpenAI test, please use a branch in microsoft/FLAML.
Since the notebook is very long, can you add a table of contents for the examples at the beginning? That is, show what each example demonstrates and which unique features of RetrieveChat it uses, so the user can go to the corresponding example. For example, Example 5: we demonstrate the unique feature "UPDATE CONTEXT"...
Thank you @kevin666aa for the suggestion. I've added the ToC; it works in VSCode and a local Jupyter server, but doesn't work as expected in the GitHub preview. It looks like GitHub has limited preview support for the links.
We can just put the contents without the links. I think people can easily go to the example by searching the keywords. I have no more questions.
Why are these changes needed?
RetrieveChat is a conversational framework for retrieval-augmented code generation and question answering. Essentially, RetrieveAssistantAgent and RetrieveUserProxyAgent implement a different auto-reply mechanism corresponding to the RetrieveChat prompts.

Related issue number
Closes #1107
Checks