From fbf24a5f6f5a80f95650b5f6b7f1bee219350425 Mon Sep 17 00:00:00 2001 From: Mark Sze <66362098+marklysze@users.noreply.github.com> Date: Wed, 3 Jul 2024 05:28:24 +1000 Subject: [PATCH] Blog post for enhanced non-OpenAI model support (#2965) * Blogpost for enhanced non-OpenAI model support * update: quickstart with simple conversation * update: authors details * Added upfront text * Added function calling, refined text. Added chess for alt-models notebook, updated examples listing. * Added Groq to blog * Removed acknowledgements --------- Co-authored-by: Hk669 Co-authored-by: HRUSHIKESH DOKALA <96101829+Hk669@users.noreply.github.com> --- ...entchat_nested_chats_chess_altmodels.ipynb | 584 ++++++++++++++++++ .../img/agentstogether.jpeg | 3 + .../2024-06-24-AltModels-Classes/index.mdx | 393 ++++++++++++ website/blog/authors.yml | 12 + website/docs/Examples.md | 3 + website/docusaurus.config.js | 4 + 6 files changed, 999 insertions(+) create mode 100644 notebook/agentchat_nested_chats_chess_altmodels.ipynb create mode 100644 website/blog/2024-06-24-AltModels-Classes/img/agentstogether.jpeg create mode 100644 website/blog/2024-06-24-AltModels-Classes/index.mdx diff --git a/notebook/agentchat_nested_chats_chess_altmodels.ipynb b/notebook/agentchat_nested_chats_chess_altmodels.ipynb new file mode 100644 index 000000000000..69d3edbcfb50 --- /dev/null +++ b/notebook/agentchat_nested_chats_chess_altmodels.ipynb @@ -0,0 +1,584 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Conversational Chess using non-OpenAI clients\n", + "\n", + "This notebook provides tips for using non-OpenAI models when using functions/tools.\n", + "\n", + "The code is based on [this notebook](/docs/notebooks/agentchat_nested_chats_chess),\n", + "which provides a detailed look at nested chats for tool use. Please refer to that\n", + "notebook for more on nested chats as this will be concentrated on tweaks to\n", + "improve performance with non-OpenAI models.\n", + "\n", + "The notebook represents a chess game between two players with a nested chat to\n", + "determine the available moves and select a move to make.\n", + "\n", + "This game contains a couple of functions/tools that the LLMs must use correctly by the\n", + "LLMs:\n", + "- `get_legal_moves` to get a list of current legal moves.\n", + "- `make_move` to make a move.\n", + "\n", + "Two agents will be used to represent the white and black players, each associated with\n", + "a different LLM cloud provider and model:\n", + "- Anthropic's Sonnet 3.5 will be Player_White\n", + "- Mistral's Mixtral 8x7B (using Together.AI) will be Player_Black\n", + "\n", + "As this involves function calling, we use larger, more capable, models from these providers.\n", + "\n", + "The nested chat will be supported be a board proxy agent who is set up to execute\n", + "the tools and manage the game.\n", + "\n", + "Tips to improve performance with these non-OpenAI models will be noted throughout **in bold**." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Installation\n", + "\n", + "First, you need to install the `pyautogen` and `chess` packages to use AutoGen. We'll include Anthropic and Together.AI libraries." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "! pip install -qqq pyautogen[anthropic,together] chess" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setting up LLMs\n", + "\n", + "We'll use the Anthropic (`api_type` is `anthropic`) and Together.AI (`api_type` is `together`) client classes, with their respective models, which both support function calling." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "\n", + "import chess\n", + "import chess.svg\n", + "from IPython.display import display\n", + "from typing_extensions import Annotated\n", + "\n", + "from autogen import ConversableAgent, register_function\n", + "\n", + "# Let's set our two player configs, specifying clients and models\n", + "\n", + "# Anthropic's Sonnet for player white\n", + "player_white_config_list = [\n", + " {\n", + " \"api_type\": \"anthropic\",\n", + " \"model\": \"claude-3-5-sonnet-20240620\",\n", + " \"api_key\": os.getenv(\"ANTHROPIC_API_KEY\"),\n", + " \"cache_seed\": None,\n", + " },\n", + "]\n", + "\n", + "# Mistral's Mixtral 8x7B for player black (through Together.AI)\n", + "player_black_config_list = [\n", + " {\n", + " \"api_type\": \"together\",\n", + " \"model\": \"mistralai/Mixtral-8x7B-Instruct-v0.1\",\n", + " \"api_key\": os.environ.get(\"TOGETHER_API_KEY\"),\n", + " \"cache_seed\": None,\n", + " },\n", + "]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll setup game variables and the two functions for getting the available moves and then making a move." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "# Initialize the board.\n", + "board = chess.Board()\n", + "\n", + "# Keep track of whether a move has been made.\n", + "made_move = False\n", + "\n", + "\n", + "def get_legal_moves() -> Annotated[\n", + " str,\n", + " \"Call this tool to list of all legal chess moves on the board, output is a list in UCI format, e.g. e2e4,e7e5,e7e8q.\",\n", + "]:\n", + " return \"Possible moves are: \" + \",\".join([str(move) for move in board.legal_moves])\n", + "\n", + "\n", + "def make_move(\n", + " move: Annotated[\n", + " str,\n", + " \"Call this tool to make a move after you have the list of legal moves and want to make a move. Takes UCI format, e.g. e2e4 or e7e5 or e7e8q.\",\n", + " ]\n", + ") -> Annotated[str, \"Result of the move.\"]:\n", + " move = chess.Move.from_uci(move)\n", + " board.push_uci(str(move))\n", + " global made_move\n", + " made_move = True\n", + " # Display the board.\n", + " display(\n", + " chess.svg.board(board, arrows=[(move.from_square, move.to_square)], fill={move.from_square: \"gray\"}, size=200)\n", + " )\n", + " # Get the piece name.\n", + " piece = board.piece_at(move.to_square)\n", + " piece_symbol = piece.unicode_symbol()\n", + " piece_name = (\n", + " chess.piece_name(piece.piece_type).capitalize()\n", + " if piece_symbol.isupper()\n", + " else chess.piece_name(piece.piece_type)\n", + " )\n", + " return f\"Moved {piece_name} ({piece_symbol}) from {chess.SQUARE_NAMES[move.from_square]} to {chess.SQUARE_NAMES[move.to_square]}.\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Creating agents\n", + "\n", + "Our main player agents are created next, with a few tweaks to help our models play:\n", + "\n", + "- Explicitly **telling agents their names** (as the name field isn't sent to the LLM).\n", + "- Providing simple instructions on the **order of functions** (not all models will need it).\n", + "- Asking the LLM to **include their name in the response** so the message content will include their names, helping the LLM understand who has made which moves.\n", + "- Ensure **no spaces are in the agent names** so that their name is distinguishable in the conversation.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "player_white = ConversableAgent(\n", + " name=\"Player_White\",\n", + " system_message=\"You are a chess player and you play as white, your name is 'Player_White'. \"\n", + " \"First call the function get_legal_moves() to get list of legal moves. \"\n", + " \"Then call the function make_move(move) to make a move. \"\n", + " \"Then tell Player_Black you have made your move and it is their turn. \"\n", + " \"Make sure you tell Player_Black you are Player_White.\",\n", + " llm_config={\"config_list\": player_white_config_list, \"cache_seed\": None},\n", + ")\n", + "\n", + "player_black = ConversableAgent(\n", + " name=\"Player_Black\",\n", + " system_message=\"You are a chess player and you play as black, your name is 'Player_Black'. \"\n", + " \"First call the function get_legal_moves() to get list of legal moves. \"\n", + " \"Then call the function make_move(move) to make a move. \"\n", + " \"Then tell Player_White you have made your move and it is their turn. \"\n", + " \"Make sure you tell Player_White you are Player_Black.\",\n", + " llm_config={\"config_list\": player_black_config_list, \"cache_seed\": None},\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we create a proxy agent that will be used to move the pieces on the board." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "# Check if the player has made a move, and reset the flag if move is made.\n", + "def check_made_move(msg):\n", + " global made_move\n", + " if made_move:\n", + " made_move = False\n", + " return True\n", + " else:\n", + " return False\n", + "\n", + "\n", + "board_proxy = ConversableAgent(\n", + " name=\"Board_Proxy\",\n", + " llm_config=False,\n", + " # The board proxy will only terminate the conversation if the player has made a move.\n", + " is_termination_msg=check_made_move,\n", + " # The auto reply message is set to keep the player agent retrying until a move is made.\n", + " default_auto_reply=\"Please make a move.\",\n", + " human_input_mode=\"NEVER\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Our functions are then assigned to the agents so they can be passed to the LLM to choose from.\n", + "\n", + "We have tweaked the descriptions to provide **more guidance on when** to use it." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "register_function(\n", + " make_move,\n", + " caller=player_white,\n", + " executor=board_proxy,\n", + " name=\"make_move\",\n", + " description=\"Call this tool to make a move after you have the list of legal moves.\",\n", + ")\n", + "\n", + "register_function(\n", + " get_legal_moves,\n", + " caller=player_white,\n", + " executor=board_proxy,\n", + " name=\"get_legal_moves\",\n", + " description=\"Call this to get a legal moves before making a move.\",\n", + ")\n", + "\n", + "register_function(\n", + " make_move,\n", + " caller=player_black,\n", + " executor=board_proxy,\n", + " name=\"make_move\",\n", + " description=\"Call this tool to make a move after you have the list of legal moves.\",\n", + ")\n", + "\n", + "register_function(\n", + " get_legal_moves,\n", + " caller=player_black,\n", + " executor=board_proxy,\n", + " name=\"get_legal_moves\",\n", + " description=\"Call this to get a legal moves before making a move.\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Almost there, we now create nested chats between players and the board proxy agent to work out the available moves and make the move." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "player_white.register_nested_chats(\n", + " trigger=player_black,\n", + " chat_queue=[\n", + " {\n", + " # The initial message is the one received by the player agent from\n", + " # the other player agent.\n", + " \"sender\": board_proxy,\n", + " \"recipient\": player_white,\n", + " # The final message is sent to the player agent.\n", + " \"summary_method\": \"last_msg\",\n", + " }\n", + " ],\n", + ")\n", + "\n", + "player_black.register_nested_chats(\n", + " trigger=player_white,\n", + " chat_queue=[\n", + " {\n", + " # The initial message is the one received by the player agent from\n", + " # the other player agent.\n", + " \"sender\": board_proxy,\n", + " \"recipient\": player_black,\n", + " # The final message is sent to the player agent.\n", + " \"summary_method\": \"last_msg\",\n", + " }\n", + " ],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Playing the game\n", + "\n", + "Now the game can begin!" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33mPlayer_Black\u001b[0m (to Player_White):\n", + "\n", + "Let's play chess! Your move.\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[31m\n", + ">>>>>>>> USING AUTO REPLY...\u001b[0m\n", + "\u001b[34m\n", + "********************************************************************************\u001b[0m\n", + "\u001b[34mStarting a new chat....\u001b[0m\n", + "\u001b[34m\n", + "********************************************************************************\u001b[0m\n", + "\u001b[33mBoard_Proxy\u001b[0m (to Player_White):\n", + "\n", + "Let's play chess! Your move.\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[31m\n", + ">>>>>>>> USING AUTO REPLY...\u001b[0m\n", + "\u001b[33mPlayer_White\u001b[0m (to Board_Proxy):\n", + "\n", + "Certainly! I'd be happy to play chess with you. As White, I'll make the first move. Let me start by checking the legal moves available to me.\n", + "\u001b[32m***** Suggested tool call (toolu_015sLMucefMVqS5ZNyWVGjgu): get_legal_moves *****\u001b[0m\n", + "Arguments: \n", + "{}\n", + "\u001b[32m*********************************************************************************\u001b[0m\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[35m\n", + ">>>>>>>> EXECUTING FUNCTION get_legal_moves...\u001b[0m\n", + "\u001b[33mBoard_Proxy\u001b[0m (to Player_White):\n", + "\n", + "\u001b[33mBoard_Proxy\u001b[0m (to Player_White):\n", + "\n", + "\u001b[32m***** Response from calling tool (toolu_015sLMucefMVqS5ZNyWVGjgu) *****\u001b[0m\n", + "Possible moves are: g1h3,g1f3,b1c3,b1a3,h2h3,g2g3,f2f3,e2e3,d2d3,c2c3,b2b3,a2a3,h2h4,g2g4,f2f4,e2e4,d2d4,c2c4,b2b4,a2a4\n", + "\u001b[32m***********************************************************************\u001b[0m\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[31m\n", + ">>>>>>>> USING AUTO REPLY...\u001b[0m\n", + "\u001b[33mPlayer_White\u001b[0m (to Board_Proxy):\n", + "\n", + "Thank you for initiating a game of chess! As Player_White, I'll make the first move. After analyzing the legal moves, I've decided to make a classic opening move.\n", + "\u001b[32m***** Suggested tool call (toolu_01VjmBhHcGw5RTRKYC4Y5MeV): make_move *****\u001b[0m\n", + "Arguments: \n", + "{\"move\": \"e2e4\"}\n", + "\u001b[32m***************************************************************************\u001b[0m\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[35m\n", + ">>>>>>>> EXECUTING FUNCTION make_move...\u001b[0m\n" + ] + }, + { + "data": { + "image/svg+xml": [ + "
r n b q k b n r\n",
+       "p p p p p p p p\n",
+       ". . . . . . . .\n",
+       ". . . . . . . .\n",
+       ". . . . P . . .\n",
+       ". . . . . . . .\n",
+       "P P P P . P P P\n",
+       "R N B Q K B N R
" + ], + "text/plain": [ + "'
r n b q k b n r\\np p p p p p p p\\n. . . . . . . .\\n. . . . . . . .\\n. . . . P . . .\\n. . . . . . . .\\nP P P P . P P P\\nR N B Q K B N R
'" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33mBoard_Proxy\u001b[0m (to Player_White):\n", + "\n", + "\u001b[33mBoard_Proxy\u001b[0m (to Player_White):\n", + "\n", + "\u001b[32m***** Response from calling tool (toolu_01VjmBhHcGw5RTRKYC4Y5MeV) *****\u001b[0m\n", + "Moved pawn (♙) from e2 to e4.\n", + "\u001b[32m***********************************************************************\u001b[0m\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[31m\n", + ">>>>>>>> USING AUTO REPLY...\u001b[0m\n", + "\u001b[33mPlayer_White\u001b[0m (to Board_Proxy):\n", + "\n", + "Hello, Player_Black! I'm Player_White, and I've just made my move. I've chosen to play the classic opening move e2e4, moving my king's pawn forward two squares. This opens up lines for both my queen and king's bishop, and stakes a claim to the center of the board. It's now your turn to make a move. Good luck!\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[33mPlayer_White\u001b[0m (to Player_Black):\n", + "\n", + "Hello, Player_Black! I'm Player_White, and I've just made my move. I've chosen to play the classic opening move e2e4, moving my king's pawn forward two squares. This opens up lines for both my queen and king's bishop, and stakes a claim to the center of the board. It's now your turn to make a move. Good luck!\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[31m\n", + ">>>>>>>> USING AUTO REPLY...\u001b[0m\n", + "\u001b[34m\n", + "********************************************************************************\u001b[0m\n", + "\u001b[34mStarting a new chat....\u001b[0m\n", + "\u001b[34m\n", + "********************************************************************************\u001b[0m\n", + "\u001b[33mBoard_Proxy\u001b[0m (to Player_Black):\n", + "\n", + "Hello, Player_Black! I'm Player_White, and I've just made my move. I've chosen to play the classic opening move e2e4, moving my king's pawn forward two squares. This opens up lines for both my queen and king's bishop, and stakes a claim to the center of the board. It's now your turn to make a move. Good luck!\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[31m\n", + ">>>>>>>> USING AUTO REPLY...\u001b[0m\n", + "\u001b[33mPlayer_Black\u001b[0m (to Board_Proxy):\n", + "\n", + "\u001b[32m***** Suggested tool call (call_z6jagiqn59m784w1n0zhmiop): get_legal_moves *****\u001b[0m\n", + "Arguments: \n", + "{}\n", + "\u001b[32m********************************************************************************\u001b[0m\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[35m\n", + ">>>>>>>> EXECUTING FUNCTION get_legal_moves...\u001b[0m\n", + "\u001b[33mBoard_Proxy\u001b[0m (to Player_Black):\n", + "\n", + "\u001b[33mBoard_Proxy\u001b[0m (to Player_Black):\n", + "\n", + "\u001b[32m***** Response from calling tool (call_z6jagiqn59m784w1n0zhmiop) *****\u001b[0m\n", + "Possible moves are: g8h6,g8f6,b8c6,b8a6,h7h6,g7g6,f7f6,e7e6,d7d6,c7c6,b7b6,a7a6,h7h5,g7g5,f7f5,e7e5,d7d5,c7c5,b7b5,a7a5\n", + "\u001b[32m**********************************************************************\u001b[0m\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[31m\n", + ">>>>>>>> USING AUTO REPLY...\u001b[0m\n", + "\u001b[33mPlayer_Black\u001b[0m (to Board_Proxy):\n", + "\n", + "\u001b[32m***** Suggested tool call (call_59t20pl0ab68z4xx2workgbc): make_move *****\u001b[0m\n", + "Arguments: \n", + "{\"move\":\"g8h6\"}\n", + "\u001b[32m**************************************************************************\u001b[0m\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[35m\n", + ">>>>>>>> EXECUTING FUNCTION make_move...\u001b[0m\n" + ] + }, + { + "data": { + "image/svg+xml": [ + "
r n b q k b . r\n",
+       "p p p p p p p p\n",
+       ". . . . . . . n\n",
+       ". . . . . . . .\n",
+       ". . . . P . . .\n",
+       ". . . . . . . .\n",
+       "P P P P . P P P\n",
+       "R N B Q K B N R
" + ], + "text/plain": [ + "'
r n b q k b . r\\np p p p p p p p\\n. . . . . . . n\\n. . . . . . . .\\n. . . . P . . .\\n. . . . . . . .\\nP P P P . P P P\\nR N B Q K B N R
'" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33mBoard_Proxy\u001b[0m (to Player_Black):\n", + "\n", + "\u001b[33mBoard_Proxy\u001b[0m (to Player_Black):\n", + "\n", + "\u001b[32m***** Response from calling tool (call_59t20pl0ab68z4xx2workgbc) *****\u001b[0m\n", + "Moved knight (♞) from g8 to h6.\n", + "\u001b[32m**********************************************************************\u001b[0m\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[31m\n", + ">>>>>>>> USING AUTO REPLY...\u001b[0m\n", + "\u001b[33mPlayer_Black\u001b[0m (to Board_Proxy):\n", + "\n", + "\u001b[32m***** Suggested tool call (call_jwv1d86srs1fnvu33cky9tgv): make_move *****\u001b[0m\n", + "Arguments: \n", + "{\"move\":\"g8h6\"}\n", + "\u001b[32m**************************************************************************\u001b[0m\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[33mPlayer_Black\u001b[0m (to Player_White):\n", + "\n", + "\n", + "\n", + "--------------------------------------------------------------------------------\n", + "\u001b[31m\n", + ">>>>>>>> USING AUTO REPLY...\u001b[0m\n", + "\u001b[31m\n", + ">>>>>>>> USING AUTO REPLY...\u001b[0m\n" + ] + } + ], + "source": [ + "# Clear the board.\n", + "board = chess.Board()\n", + "\n", + "chat_result = player_black.initiate_chat(\n", + " player_white,\n", + " message=\"Let's play chess! Your move.\",\n", + " max_turns=10,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "At this stage, it's hard to tell who's going to win, but they're playing well and using the functions correctly." + ] + } + ], + "metadata": { + "front_matter": { + "description": "LLM-backed agents playing chess with each other using nested chats.", + "tags": [ + "nested chat", + "tool use", + "orchestration" + ] + }, + "kernelspec": { + "display_name": "autogen", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/website/blog/2024-06-24-AltModels-Classes/img/agentstogether.jpeg b/website/blog/2024-06-24-AltModels-Classes/img/agentstogether.jpeg new file mode 100644 index 000000000000..fd859fc62076 --- /dev/null +++ b/website/blog/2024-06-24-AltModels-Classes/img/agentstogether.jpeg @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:964628601b60ddbab8940fea45014dcd841b89783bb0b7e9ac9d3690f1c41798 +size 659594 diff --git a/website/blog/2024-06-24-AltModels-Classes/index.mdx b/website/blog/2024-06-24-AltModels-Classes/index.mdx new file mode 100644 index 000000000000..9c94094e7e4c --- /dev/null +++ b/website/blog/2024-06-24-AltModels-Classes/index.mdx @@ -0,0 +1,393 @@ +--- +title: Enhanced Support for Non-OpenAI Models +authors: + - marklysze + - Hk669 +tags: [mistral ai,anthropic,together.ai,gemini] +--- + +![agents](img/agentstogether.jpeg) + +## TL;DR + +- **AutoGen has expanded integrations with a variety of cloud-based model providers beyond OpenAI.** +- **Leverage models and platforms from Gemini, Anthropic, Mistral AI, Together.AI, and Groq for your AutoGen agents.** +- **Utilise models specifically for chat, language, image, and coding.** +- **LLM provider diversification can provide cost and resilience benefits.** + +In addition to the recently released AutoGen [Google Gemini](https://ai.google.dev/) client, new client classes for [Mistral AI](https://mistral.ai/), [Anthropic](https://www.anthropic.com/), [Together.AI](https://www.together.ai/), and [Groq](https://groq.com/) enable you to utilize over 75 different large language models in your AutoGen agent workflow. + +These new client classes tailor AutoGen's underlying messages to each provider's unique requirements and remove that complexity from the developer, who can then focus on building their AutoGen workflow. + +Using them is as simple as installing the client-specific library and updating your LLM config with the relevant `api_type` and `model`. We'll demonstrate how to use them below. + +The community is continuing to enhance and build new client classes as cloud-based inference providers arrive. So, watch this space, and feel free to [discuss](https://discord.gg/pAbnFJrkgZ) or [develop](https://github.com/microsoft/autogen/pulls) another one. + +## Benefits of choice + +The need to use only the best models to overcome workflow-breaking LLM inconsistency has diminished considerably over the last 12 months. + +These new classes provide access to the very largest trillion-parameter models from OpenAI, Google, and Anthropic, continuing to provide the most consistent +and competent agent experiences. However, it's worth trying smaller models from the likes of Meta, Mistral AI, Microsoft, Qwen, and many others. Perhaps they +are capable enough for a task, or sub-task, or even better suited (such as a coding model)! + +Using smaller models will have cost benefits, but they also allow you to test models that you could run locally, allowing you to determine if you can remove cloud inference costs +altogether or even run an AutoGen workflow offline. + +On the topic of cost, these client classes also include provider-specific token cost calculations so you can monitor the cost impact of your workflows. With costs per million +tokens as low as 10 cents (and some are even free!), cost savings can be noticeable. + +## Mix and match + +How does Google's Gemini 1.5 Pro model stack up against Anthropic's Opus or Meta's Llama 3? + +Now you have the ability to quickly change your agent configs and find out. If you want to run all three in the one workflow, +AutoGen's ability to associate specific configurations to each agent means you can select the best LLM for each agent. + +## Capabilities + +The common requirements of text generation and function/tool calling are supported by these client classes. + +Multi-modal support, such as for image/audio/video, is an area of active development. The [Google Gemini](https://microsoft.github.io/autogen/docs/topics/non-openai-models/cloud-gemini) client class can be +used to create a multimodal agent. + +## Tips + +Here are some tips when working with these client classes: + +- **Most to least capable** - start with larger models and get your workflow working, then iteratively try smaller models. +- **Right model** - choose one that's suited to your task, whether it's coding, function calling, knowledge, or creative writing. +- **Agent names** - these cloud providers do not use the `name` field on a message, so be sure to use your agent's name in their `system_message` and `description` fields, as well as instructing the LLM to 'act as' them. This is particularly important for "auto" speaker selection in group chats as we need to guide the LLM to choose the next agent based on a name, so tweak `select_speaker_message_template`, `select_speaker_prompt_template`, and `select_speaker_auto_multiple_template` with more guidance. +- **Context length** - as your conversation gets longer, models need to support larger context lengths, be mindful of what the model supports and consider using [Transform Messages](https://microsoft.github.io/autogen/docs/topics/handling_long_contexts/intro_to_transform_messages) to manage context size. +- **Provider parameters** - providers have parameters you can set such as temperature, maximum tokens, top-k, top-p, and safety. See each client class in AutoGen's [API Reference](https://microsoft.github.io/autogen/docs/reference/oai/gemini) or [documentation](https://microsoft.github.io/autogen/docs/topics/non-openai-models/cloud-gemini) for details. +- **Prompts** - prompt engineering is critical in guiding smaller LLMs to do what you need. [ConversableAgent](https://microsoft.github.io/autogen/docs/reference/agentchat/conversable_agent), [GroupChat](https://microsoft.github.io/autogen/docs/reference/agentchat/groupchat), [UserProxyAgent](https://microsoft.github.io/autogen/docs/reference/agentchat/user_proxy_agent), and [AssistantAgent](https://microsoft.github.io/autogen/docs/reference/agentchat/assistant_agent) all have customizable prompt attributes that you can tailor. Here are some prompting tips from [Anthropic](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview)([+Library](https://docs.anthropic.com/en/prompt-library/library)), [Mistral AI](https://docs.mistral.ai/guides/prompting_capabilities/), [Together.AI](https://docs.together.ai/docs/examples), and [Meta](https://llama.meta.com/docs/how-to-guides/prompting/). +- **Help!** - reach out on the AutoGen [Discord](https://discord.gg/pAbnFJrkgZ) or [log an issue](https://github.com/microsoft/autogen/issues) if you need help with or can help improve these client classes. + +Now it's time to try them out. + +## Quickstart + +### Installation + +Install the appropriate client based on the model you wish to use. + +```sh +pip install pyautogen["mistral"] # for Mistral AI client +pip install pyautogen["anthropic"] # for Anthropic client +pip install pyautogen["together"] # for Together.AI client +pip install pyautogen["groq"] # for Groq client +``` + +### Configuration Setup + +Add your model configurations to the `OAI_CONFIG_LIST`. Ensure you specify the `api_type` to initialize the respective client (Anthropic, Mistral, or Together). + +```yaml +[ + { + "model": "your anthropic model name", + "api_key": "your Anthropic api_key", + "api_type": "anthropic" + }, + { + "model": "your mistral model name", + "api_key": "your Mistral AI api_key", + "api_type": "mistral" + }, + { + "model": "your together.ai model name", + "api_key": "your Together.AI api_key", + "api_type": "together" + }, + { + "model": "your groq model name", + "api_key": "your Groq api_key", + "api_type": "groq" + } +] +``` + +### Usage + +The `[config_list_from_json](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils/#config_list_from_json)` function loads a list of configurations from an environment variable or a json file. + +```py +import autogen +from autogen import AssistantAgent, UserProxyAgent + +config_list = autogen.config_list_from_json( + "OAI_CONFIG_LIST" +) +``` + +### Construct Agents + +Construct a simple conversation between a User proxy and an Assistant agent + +```py +user_proxy = UserProxyAgent( + name="User_proxy", + code_execution_config={ + "last_n_messages": 2, + "work_dir": "groupchat", + "use_docker": False, # Please set use_docker = True if docker is available to run the generated code. Using docker is safer than running the generated code directly. + }, + human_input_mode="ALWAYS", + is_termination_msg=lambda msg: not msg["content"] +) + +assistant = AssistantAgent( + name="assistant", + llm_config = {"config_list": config_list} +) +``` + +### Start chat + +```py + +user_proxy.intiate_chat(assistant, message="Write python code to print Hello World!") + +``` + +**NOTE: To integrate this setup into GroupChat, follow the [tutorial](https://microsoft.github.io/autogen/docs/notebooks/agentchat_groupchat) with the same config as above.** + + +## Function Calls + +Now, let's look at how Anthropic's Sonnet 3.5 is able to suggest multiple function calls in a single response. + +This example is a simple travel agent setup with an agent for function calling and a user proxy agent for executing the functions. + +One thing you'll note here is Anthropic's models are more verbose than OpenAI's and will typically provide chain-of-thought or general verbiage when replying. Therefore we provide more explicit instructions to `functionbot` to not reply with more than necessary. Even so, it can't always help itself! + +Let's start with setting up our configuration and agents. + +```py +import os +import autogen +import json +from typing import Literal +from typing_extensions import Annotated + +# Anthropic configuration, using api_type='anthropic' +anthropic_llm_config = { + "config_list": + [ + { + "api_type": "anthropic", + "model": "claude-3-5-sonnet-20240620", + "api_key": os.getenv("ANTHROPIC_API_KEY"), + "cache_seed": None + } + ] +} + +# Our functionbot, who will be assigned two functions and +# given directions to use them. +functionbot = autogen.AssistantAgent( + name="functionbot", + system_message="For currency exchange tasks, only use " + "the functions you have been provided with. Do not " + "reply with helpful tips. Once you've recommended functions " + "reply with 'TERMINATE'.", + is_termination_msg=lambda x: x.get("content", "") and (x.get("content", "").rstrip().endswith("TERMINATE") or x.get("content", "") == ""), + llm_config=anthropic_llm_config, +) + +# Our user proxy agent, who will be used to manage the customer +# request and conversation with the functionbot, terminating +# when we have the information we need. +user_proxy = autogen.UserProxyAgent( + name="user_proxy", + system_message="You are a travel agent that provides " + "specific information to your customers. Get the " + "information you need and provide a great summary " + "so your customer can have a great trip. If you " + "have the information you need, simply reply with " + "'TERMINATE'.", + is_termination_msg=lambda x: x.get("content", "") and (x.get("content", "").rstrip().endswith("TERMINATE") or x.get("content", "") == ""), + human_input_mode="NEVER", + max_consecutive_auto_reply=10, +) +``` + +We define the two functions. +```py +CurrencySymbol = Literal["USD", "EUR"] + +def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float: + if base_currency == quote_currency: + return 1.0 + elif base_currency == "USD" and quote_currency == "EUR": + return 1 / 1.1 + elif base_currency == "EUR" and quote_currency == "USD": + return 1.1 + else: + raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}") + +def get_current_weather(location, unit="fahrenheit"): + """Get the weather for some location""" + if "chicago" in location.lower(): + return json.dumps({"location": "Chicago", "temperature": "13", "unit": unit}) + elif "san francisco" in location.lower(): + return json.dumps({"location": "San Francisco", "temperature": "55", "unit": unit}) + elif "new york" in location.lower(): + return json.dumps({"location": "New York", "temperature": "11", "unit": unit}) + else: + return json.dumps({"location": location, "temperature": "unknown"}) +``` + +And then associate them with the `user_proxy` for execution and `functionbot` for the LLM to consider using them. + +```py +@user_proxy.register_for_execution() +@functionbot.register_for_llm(description="Currency exchange calculator.") +def currency_calculator( + base_amount: Annotated[float, "Amount of currency in base_currency"], + base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD", + quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR", +) -> str: + quote_amount = exchange_rate(base_currency, quote_currency) * base_amount + return f"{quote_amount} {quote_currency}" + +@user_proxy.register_for_execution() +@functionbot.register_for_llm(description="Weather forecast for US cities.") +def weather_forecast( + location: Annotated[str, "City name"], +) -> str: + weather_details = get_current_weather(location=location) + weather = json.loads(weather_details) + return f"{weather['location']} will be {weather['temperature']} degrees {weather['unit']}" +``` + +Finally, we start the conversation with a request for help from our customer on their upcoming trip to New York and the Euro they would like exchanged to USD. + +Importantly, we're also using Anthropic's Sonnet to provide a summary through the `summary_method`. Using `summary_prompt`, we guide Sonnet to give us an email output. + +```py +# start the conversation +res = user_proxy.initiate_chat( + functionbot, + message="My customer wants to travel to New York and " + "they need to exchange 830 EUR to USD. Can you please " + "provide them with a summary of the weather and " + "exchanged currently in USD?", + summary_method="reflection_with_llm", + summary_args={ + "summary_prompt": """Summarize the conversation by + providing an email response with the travel information + for the customer addressed as 'Dear Customer'. Do not + provide any additional conversation or apologise, + just provide the relevant information and the email.""" + }, +) +``` + +After the conversation has finished, we'll print out the summary. + +```py +print(f"Here's the LLM summary of the conversation:\n\n{res.summary['content']}") +``` + +Here's the resulting output. + +```text +user_proxy (to functionbot): + +My customer wants to travel to New York and they need to exchange 830 EUR +to USD. Can you please provide them with a summary of the weather and +exchanged currently in USD? + +-------------------------------------------------------------------------------- +functionbot (to user_proxy): + +Certainly! I'd be happy to help your customer with information about the +weather in New York and the currency exchange from EUR to USD. Let's use +the available tools to get this information. + +***** Suggested tool call (toolu_016wBUKVX2TToBaMqmiGvhav): weather_forecast ***** +Arguments: +{"location": "New York"} +********************************************************************************** +***** Suggested tool call (toolu_01Nxjeew2BspfKdZ85on3XkP): currency_calculator ***** +Arguments: +{"base_amount": 830, "base_currency": "EUR", "quote_currency": "USD"} +************************************************************************************* + +-------------------------------------------------------------------------------- + +>>>>>>>> EXECUTING FUNCTION weather_forecast... + +>>>>>>>> EXECUTING FUNCTION currency_calculator... +user_proxy (to functionbot): + +user_proxy (to functionbot): + +***** Response from calling tool (toolu_016wBUKVX2TToBaMqmiGvhav) ***** +New York will be 11 degrees fahrenheit +*********************************************************************** + +-------------------------------------------------------------------------------- +user_proxy (to functionbot): + +***** Response from calling tool (toolu_01Nxjeew2BspfKdZ85on3XkP) ***** +913.0000000000001 USD +*********************************************************************** + +-------------------------------------------------------------------------------- +functionbot (to user_proxy): + +Thank you for providing the information. I can now give your customer a +summary of the weather in New York and the currency exchange from EUR to USD. + +Weather in New York: +The current forecast for New York indicates that it will be 11 degrees +Fahrenheit. This is quite cold, so your customer should be prepared with +warm clothing for their trip. + +Currency Exchange: +For 830 EUR, your customer will receive approximately 913 USD. This is based +on the current exchange rate provided by our currency calculator. + +To summarize: +1. Weather in New York: 11°F (very cold) +2. Currency exchange: 830 EUR = 913 USD + +Your customer should pack warm clothes for the cold weather in New York and +can expect to have about 913 USD for their trip after exchanging 830 EUR. + +TERMINATE + +-------------------------------------------------------------------------------- +Here's the LLM summary of the conversation: + +Certainly. I'll provide an email response to the customer with the travel +information as requested. + +Dear Customer, + +We are pleased to provide you with the following information for your +upcoming trip to New York: + +Weather Forecast: +The current forecast for New York indicates a temperature of 11 degrees +Fahrenheit. Please be prepared for very cold weather and pack appropriate +warm clothing. + +Currency Exchange: +We have calculated the currency exchange for you. Your 830 EUR will be +equivalent to approximately 913 USD at the current exchange rate. + +We hope this information helps you prepare for your trip to New York. Have +a safe and enjoyable journey! + +Best regards, +Travel Assistance Team +``` + +So we can see how Anthropic's Sonnet is able to suggest multiple tools in a single response, with AutoGen executing them both and providing the results back to Sonnet. Sonnet then finishes with a nice email summary that can be the basis for continued real-life conversation with the customer. + +## More tips and tricks + +For an interesting chess game between Anthropic's Sonnet and Mistral's Mixtral, we've put together a sample notebook that highlights some of the tips and tricks for working with non-OpenAI LLMs. [See the notebook here](https://microsoft.github.io/autogen/docs/notebooks/agentchat_nested_chats_chess_altmodels). diff --git a/website/blog/authors.yml b/website/blog/authors.yml index 83ea099ade48..0e023514465d 100644 --- a/website/blog/authors.yml +++ b/website/blog/authors.yml @@ -128,3 +128,15 @@ jluey: name: James Woffinden-Luey title: Senior Research Engineer at Microsoft Research url: https://github.com/jluey1 + +Hk669: + name: Hrushikesh Dokala + title: CS Undergraduate Based in India + url: https://github.com/Hk669 + image_url: https://github.com/Hk669.png + +marklysze: + name: Mark Sze + title: AI Freelancer + url: https://github.com/marklysze + image_url: https://github.com/marklysze.png diff --git a/website/docs/Examples.md b/website/docs/Examples.md index 2ec83d1e0f21..3b71dd682e97 100644 --- a/website/docs/Examples.md +++ b/website/docs/Examples.md @@ -80,6 +80,9 @@ Links to notebook examples: - OpenAI Assistant in a Group Chat - [View Notebook](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_oai_assistant_groupchat.ipynb) - GPTAssistantAgent based Multi-Agent Tool Use - [View Notebook](https://github.com/microsoft/autogen/blob/main/notebook/gpt_assistant_agent_function_call.ipynb) +### Non-OpenAI Models +- Conversational Chess using non-OpenAI Models - [View Notebook](/docs/notebooks/agentchat_nested_chats_chess_altmodels) + ### Multimodal Agent - Multimodal Agent Chat with DALLE and GPT-4V - [View Notebook](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_dalle_and_gpt4v.ipynb) diff --git a/website/docusaurus.config.js b/website/docusaurus.config.js index 0d581c83ddcd..f0c0f84a3945 100644 --- a/website/docusaurus.config.js +++ b/website/docusaurus.config.js @@ -281,6 +281,10 @@ module.exports = { to: "/docs/notebooks/agentchat_nested_chats_chess", from: ["/docs/notebooks/agentchat_chess"], }, + { + to: "/docs/notebooks/agentchat_nested_chats_chess_altmodels", + from: ["/docs/notebooks/agentchat_chess_altmodels"], + }, { to: "/docs/contributor-guide/contributing", from: ["/docs/Contribute"],