diff --git a/website/blog/2023-11-09-EcoAssistant/img/chat.png b/website/blog/2023-11-09-EcoAssistant/img/chat.png
new file mode 100644
index 000000000000..7b08b22f2ee3
Binary files /dev/null and b/website/blog/2023-11-09-EcoAssistant/img/chat.png differ
diff --git a/website/blog/2023-11-09-EcoAssistant/img/results.png b/website/blog/2023-11-09-EcoAssistant/img/results.png
new file mode 100644
index 000000000000..650410ea7f5d
Binary files /dev/null and b/website/blog/2023-11-09-EcoAssistant/img/results.png differ
diff --git a/website/blog/2023-11-09-EcoAssistant/img/system.png b/website/blog/2023-11-09-EcoAssistant/img/system.png
new file mode 100644
index 000000000000..997422fcb688
Binary files /dev/null and b/website/blog/2023-11-09-EcoAssistant/img/system.png differ
diff --git a/website/blog/2023-11-09-EcoAssistant/img/template-demo.png b/website/blog/2023-11-09-EcoAssistant/img/template-demo.png
new file mode 100644
index 000000000000..6a8f8c691fed
Binary files /dev/null and b/website/blog/2023-11-09-EcoAssistant/img/template-demo.png differ
diff --git a/website/blog/2023-11-09-EcoAssistant/img/template.png b/website/blog/2023-11-09-EcoAssistant/img/template.png
new file mode 100644
index 000000000000..5621e15bd3a0
Binary files /dev/null and b/website/blog/2023-11-09-EcoAssistant/img/template.png differ
diff --git a/website/blog/2023-11-09-EcoAssistant/index.mdx b/website/blog/2023-11-09-EcoAssistant/index.mdx
new file mode 100644
index 000000000000..4882261d5112
--- /dev/null
+++ b/website/blog/2023-11-09-EcoAssistant/index.mdx
@@ -0,0 +1,102 @@
+---
+title: EcoAssistant - Using LLM Assistants More Accurately and Affordably
+authors: jieyuz2
+tags: [LLM, RAG, cost-effectiveness]
+---
+
+![system](img/system.png)
+
+**TL;DR:**
+* Introducing **EcoAssistant**, a system designed to solve user queries more accurately and affordably.
+* We show how to let the LLM assistant agent leverage external APIs to solve user queries.
+* We show how to reduce the cost of using GPT models via **Assistant Hierarchy**.
+* We show how to leverage the idea of Retrieval-augmented Generation (RAG) to improve the success rate via **Solution Demonstration**.
+
+
+## EcoAssistant
+
+In this blog post, we introduce **EcoAssistant**, a system built upon AutoGen with the goal of solving user queries more accurately and affordably.
+
+### Problem setup
+
+Recently, users have been using conversational LLMs such as ChatGPT for various queries.
+Reports indicate that 23% of ChatGPT user queries are for knowledge extraction purposes.
+Many of these queries require knowledge that is external to the information stored within any pre-trained large language model (LLM).
+Such queries can only be answered by generating code that fetches the necessary information from external APIs.
+In the table below, we show three types of user queries that we aim to address in this work.
+
+| Dataset | API | Example query |
+|-------------|----------|----------|
+| Places | [Google Places](https://developers.google.com/maps/documentation/places/web-service/overview) | I’m looking for a 24-hour pharmacy in Montreal, can you find one for me? |
+| Weather | [Weather API](https://www.weatherapi.com) | What is the current cloud coverage in Mumbai, India? |
+| Stock | [Alpha Vantage Stock API](https://www.alphavantage.co/documentation/) | Can you give me the opening price of Microsoft for the month of January 2023? |
+
+
+### Leveraging external APIs
+
+To address these queries, we first build a **two-agent system** based on AutoGen,
+where the first agent is an **LLM assistant agent** (`AssistantAgent` in AutoGen) that is responsible for proposing and refining the code and
+the second agent is a **code executor agent** (`UserProxyAgent` in AutoGen) that extracts the generated code and executes it, forwarding the output back to the LLM assistant agent.
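The propose/execute loop between the two agents can be illustrated with a minimal, library-free sketch. It does not use AutoGen's `AssistantAgent` or `UserProxyAgent`; every name below (`extract_code`, `execute`, `run_chat`, `scripted_assistant`) is hypothetical, and the scripted assistant merely stands in for an LLM so the example runs offline.

```python
import contextlib
import io
import re
from typing import Callable, List

FENCE = "`" * 3  # built indirectly so this sketch can itself sit inside a fenced block
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def extract_code(message: str) -> str:
    """Return the first fenced Python block in an assistant message, or ''."""
    match = CODE_BLOCK.search(message)
    return match.group(1) if match else ""


def execute(code: str) -> str:
    """Run code in a fresh namespace and return its stdout, or an error string."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})  # assumption: a real executor would sandbox this
    except Exception as exc:
        return f"ERROR: {type(exc).__name__}: {exc}"
    return buffer.getvalue().strip()


def run_chat(assistant: Callable[[List[str]], str], query: str, max_turns: int = 5) -> List[str]:
    """Alternate between assistant and executor until no code is proposed."""
    history = [query]
    for _ in range(max_turns):
        reply = assistant(history)
        history.append(reply)
        code = extract_code(reply)
        if not code:  # no code block: treat the reply as the final answer
            break
        history.append(execute(code))  # forward the output back to the assistant
    return history


def scripted_assistant(history: List[str]) -> str:
    """Hypothetical stand-in for an LLM: buggy code, then a fix, then an answer."""
    replies = [
        FENCE + "python\nprint(1 / 0)\n" + FENCE,
        FENCE + "python\nprint(6 * 7)\n" + FENCE,
        "The answer is 42.",
    ]
    return replies[len(history) // 2]


history = run_chat(scripted_assistant, "What is 6 times 7?")
```

In this sketch the executor's error message is simply appended to the conversation, which is what gives the assistant the chance to refine its code on the next turn.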
+A visualization of the two-agent system is shown below.
+
+![chat](img/chat.png)
+
+To instruct the assistant agent to leverage external APIs, we only need to add the API name/key dictionary at the beginning of the initial message.
+The template is shown below, where the red part is the API information and the black part is the user query.
+
+![template](img/template.png)
+
+Importantly, we don't want to reveal our real API key to the assistant agent for security reasons.
+Therefore, we use a **fake API key** to replace the real API key in the initial message.
+In particular, we generate a random token (e.g., `181dbb37`) for each API key and replace the real API key with the token in the initial message.
+Then, when the code executor executes the code, the fake API key is automatically replaced by the real API key.
+
+
+### Solution Demonstration
+In most practical scenarios, user queries arrive sequentially over time.
+Our **EcoAssistant** leverages past successes to help the LLM assistants address future queries via **Solution Demonstration**.
+Specifically, whenever a query is deemed successfully resolved by user feedback, we capture and store the query and the final generated code snippet.
+These query-code pairs are saved in a specialized vector database. When new queries appear, **EcoAssistant** retrieves the most similar query from the database, which is then appended with the associated code to the initial prompt for the new query, serving as a demonstration.
+The new template for the initial message is shown below, where the blue part corresponds to the solution demonstration.
+
+![template](img/template-demo.png)
+
+We found that reusing past successful query-code pairs resolves queries in fewer iterations and improves the system's performance.
+
+
+### Assistant Hierarchy
+LLMs differ in price and performance; for example, GPT-3.5-turbo is much cheaper than GPT-4 but also less accurate.
+Thus, we propose the **Assistant Hierarchy** to reduce the cost of using LLMs.
+The core idea is that we use the cheaper LLMs first and only use the more expensive LLMs when necessary.
+In this way, we reduce the reliance on expensive LLMs and thus reduce the cost.
+In particular, given multiple LLMs, we create one assistant agent for each and start the conversation with the most cost-effective LLM assistant.
+If the conversation between the current LLM assistant and the code executor concludes without successfully resolving the query, **EcoAssistant** restarts the conversation with the next more expensive LLM assistant in the hierarchy.
+We found that this strategy significantly reduces costs while still effectively addressing queries.
+
+### A Synergistic Effect
+We found that the **Assistant Hierarchy** and **Solution Demonstration** of **EcoAssistant** have a synergistic effect.
+Because the query-code database is shared by all LLM assistants, even without specialized design,
+a solution from a more powerful LLM assistant (e.g., GPT-4) can later be retrieved to guide a weaker LLM assistant (e.g., GPT-3.5-turbo).
+Such a synergistic effect further improves the performance and reduces the cost of **EcoAssistant**.
+
+### Experimental Results
+
+We evaluate **EcoAssistant** on three datasets: Places, Weather, and Stock. When comparing it with a single GPT-4 assistant, we found that **EcoAssistant** achieves a higher success rate at a lower cost, as shown in the figure below.
+For more details about the experimental results and other experiments, please refer to our [paper](https://arxiv.org/abs/2310.03046).
+
+![exp](img/results.png)
+
+## Further reading
+
+Please refer to our [paper](https://arxiv.org/abs/2310.03046) and [codebase](https://github.com/JieyuZ2/EcoAssistant) for more details about **EcoAssistant**.
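For readers who want a concrete picture before opening the codebase, the Assistant Hierarchy described earlier can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in the repository: the `Assistant` tuple, `solve_with_hierarchy`, the two toy solvers, and the cost units are all hypothetical, and a solver returning `None` stands in for a conversation that ended without resolving the query.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical shape: (name, cost per attempt, solver).
# A solver returns an answer string, or None if it failed to resolve the query.
Assistant = Tuple[str, float, Callable[[str], Optional[str]]]


def solve_with_hierarchy(
    query: str, assistants: List[Assistant]
) -> Tuple[Optional[str], float, List[str]]:
    """Try assistants from cheapest to most expensive; stop at the first success."""
    total_cost = 0.0
    tried: List[str] = []
    for name, cost, solver in sorted(assistants, key=lambda a: a[1]):
        tried.append(name)
        total_cost += cost  # every attempt is paid for, successful or not
        answer = solver(query)
        if answer is not None:
            return answer, total_cost, tried
    return None, total_cost, tried


def cheap_solver(query: str) -> Optional[str]:
    """Toy stand-in for a cheaper model: only handles queries marked 'easy'."""
    return "cheap answer" if "easy" in query else None


def expensive_solver(query: str) -> Optional[str]:
    """Toy stand-in for a stronger model: handles every query."""
    return "expensive answer"


# Cost values are made-up units chosen only to make the arithmetic visible.
ASSISTANTS: List[Assistant] = [
    ("expensive", 20.0, expensive_solver),
    ("cheap", 1.0, cheap_solver),
]

easy_result = solve_with_hierarchy("an easy query", ASSISTANTS)
hard_result = solve_with_hierarchy("a hard query", ASSISTANTS)
```

The sketch makes the trade-off explicit: a query the cheap assistant can handle costs only the cheap attempt, while a query that escalates pays for both attempts.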
+
+If you find this blog useful, please consider citing:
+
+```bibtex
+@article{zhang2023ecoassistant,
+  title={EcoAssistant: Using LLM Assistant More Affordably and Accurately},
+  author={Zhang, Jieyu and Krishna, Ranjay and Awadallah, Ahmed H and Wang, Chi},
+  journal={arXiv preprint arXiv:2310.03046},
+  year={2023}
+}
+```
diff --git a/website/blog/authors.yml b/website/blog/authors.yml
index d2fefc887cc9..cf77c02e3637 100644
--- a/website/blog/authors.yml
+++ b/website/blog/authors.yml
@@ -39,3 +39,9 @@ beibinli:
   title: Senior Research Engineer at Microsoft
   url: https://github.com/beibinli
   image_url: https://github.com/beibinli.png
+
+jieyuz2:
+  name: Jieyu Zhang
+  title: PhD student at University of Washington
+  url: https://jieyuz2.github.io/
+  image_url: https://github.com/jieyuz2.png