From 74a4d70a056f788e8765ac85d43f504e639a8f17 Mon Sep 17 00:00:00 2001
From: Maxim Saplin
Date: Tue, 30 Jan 2024 17:48:36 +0300
Subject: [PATCH] FAQ, working with LLM endpoints and explaining why config-list is a list (#1451)

* FAQ and notebook extended

* Update notebook/oai_openai_utils.ipynb

Co-authored-by: Chi Wang

* Update website/docs/FAQ.md

Co-authored-by: Chi Wang

* Update website/docs/FAQ.md

Co-authored-by: Chi Wang

* Update website/docs/FAQ.md

Co-authored-by: Chi Wang

* Update website/docs/FAQ.md

Co-authored-by: Chi Wang

* Update FAQ.md

---------

Co-authored-by: Chi Wang
---
 notebook/oai_openai_utils.ipynb | 4 +++-
 website/docs/FAQ.md | 40 +++++++++++++++++++++++++++++++--
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/notebook/oai_openai_utils.ipynb b/notebook/oai_openai_utils.ipynb
index 2040f570c5ed..7b40ae3e0e3e 100644
--- a/notebook/oai_openai_utils.ipynb
+++ b/notebook/oai_openai_utils.ipynb
@@ -24,7 +24,9 @@
 "- `config_list_openai_aoai`: Constructs a list of configurations using both Azure OpenAI and OpenAI endpoints, sourcing API keys from environment variables or local files.\n",
 "- `config_list_from_json`: Loads configurations from a JSON structure, either from an environment variable or a local JSON file, with the flexibility of filtering configurations based on given criteria.\n",
 "- `config_list_from_models`: Creates configurations based on a provided list of models, useful when targeting specific models without manually specifying each configuration.\n",
-"- `config_list_from_dotenv`: Constructs a configuration list from a `.env` file, offering a consolidated way to manage multiple API configurations and keys from a single file."
+"- `config_list_from_dotenv`: Constructs a configuration list from a `.env` file, offering a consolidated way to manage multiple API configurations and keys from a single file.\n",
+"\n",
+"If multiple models are provided, the AutoGen client (`OpenAIWrapper`) and agents don't choose the \"best model\" based on any criterion - inference is done through the first model in the list, and the next one is used only if the current model fails (e.g. API throttling by the provider or an unsatisfied filter condition)."
 ]
 },
 {
diff --git a/website/docs/FAQ.md b/website/docs/FAQ.md
index 96110e67c898..6a354b0b0d6e 100644
--- a/website/docs/FAQ.md
+++ b/website/docs/FAQ.md
@@ -3,6 +3,7 @@
 - [Set your API endpoints](#set-your-api-endpoints)
   - [Use the constructed configuration list in agents](#use-the-constructed-configuration-list-in-agents)
   - [Unexpected keyword argument 'base_url'](#unexpected-keyword-argument-base_url)
+  - [How does an agent decide which model to pick out of the list?](#how-does-an-agent-decide-which-model-to-pick-out-of-the-list)
   - [Can I use non-OpenAI models?](#can-i-use-non-openai-models)
 - [Handle Rate Limit Error and Timeout Error](#handle-rate-limit-error-and-timeout-error)
 - [How to continue a finished conversation](#how-to-continue-a-finished-conversation)
@@ -20,7 +21,34 @@
 
 ## Set your API endpoints
 
-There are multiple ways to construct configurations for LLM inference in the `oai` utilities:
+AutoGen relies on third-party API endpoints for LLM inference. It works great with OpenAI and Azure OpenAI models, and it can work with any model (self-hosted or local) that is accessible through an inference server compatible with the OpenAI Chat Completions API.
+
+An agent requires a list of configuration dictionaries for setting up model endpoints. Each configuration dictionary can contain the following keys:
+- `model` (str): Required. The identifier of the model to be used, such as 'gpt-4', 'gpt-3.5-turbo', or 'llama-7B'.
+- `api_key` (str): Optional. The API key required for authenticating requests to the model's API endpoint.
+- `api_type` (str): Optional. The type of API service being used. This could be 'azure' for Azure OpenAI, 'openai' for OpenAI, or other custom types.
+- `base_url` (str): Optional. The base URL of the API endpoint. This is the root address where API calls are directed.
+- `api_version` (str): Optional. The version of the Azure OpenAI API to use, e.g. '2023-12-01-preview'.
+
+For example:
+```python
+import os
+
+config_list = [
+    {
+        "model": "gpt-4",
+        "api_key": os.environ.get("AZURE_OPENAI_API_KEY"),
+        "api_type": "azure",
+        "base_url": os.environ.get("AZURE_OPENAI_API_BASE"),
+        "api_version": "2023-12-01-preview",
+    },
+    {
+        "model": "llama-7B",
+        "base_url": "http://127.0.0.1:8080",
+        "api_type": "openai",
+    }
+]
+```
+
+The `autogen` module provides [multiple helper functions](/docs/reference/oai/openai_utils) for constructing configurations from different sources:
 
 - `get_config_list`: Generates configurations for API calls, primarily from provided API keys.
 - `config_list_openai_aoai`: Constructs a list of configurations using both Azure OpenAI and OpenAI endpoints, sourcing API keys from environment variables or local files.
@@ -47,13 +75,21 @@ You can also explicitly specify that by:
 ```python
 assistant = autogen.AssistantAgent(name="assistant", llm_config={"api_key": ...})
 ```
 
+### How does an agent decide which model to pick out of the list?
+
+An agent uses the first model available in the `config_list` and makes LLM calls against this model. If the model fails (e.g. due to API throttling), the agent retries the request against the second model, and so on, until a completion is received (or raises an error if none of the models completes the request successfully). There is no implicit or hidden logic inside agents for picking "the best model for the task" - it is the developer's responsibility to pick the right models and use them with agents.
+
+Besides failing over between throttled models, the `config_list` can be useful for:
+- Having a single global list of models and [filtering it](/docs/reference/oai/openai_utils/#filter_config) based on certain keys (e.g. name, tag) in order to pass select models to a certain agent (e.g. using the cheaper GPT-3.5 for agents solving easier tasks; see the sketch below)
+- Using more advanced features for special purposes related to inference, such as `filter_func` with [`OpenAIWrapper`](/docs/reference/oai/client#create) or [inference optimization](/docs/Examples#enhanced-inferences)
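+
+As an illustration of the first point, here is a minimal sketch of filtering a global list down to a cheaper model for an agent that solves easier tasks (the model names and the agent below are placeholders):
+
+```python
+import os
+
+import autogen
+
+# A single global list containing models of different capability and cost
+config_list = [
+    {"model": "gpt-4", "api_key": os.environ.get("OPENAI_API_KEY")},
+    {"model": "gpt-3.5-turbo", "api_key": os.environ.get("OPENAI_API_KEY")},
+]
+
+# Keep only the configurations whose "model" value matches the filter
+cheap_config_list = autogen.filter_config(config_list, filter_dict={"model": ["gpt-3.5-turbo"]})
+
+# This agent will only make calls against gpt-3.5-turbo
+easy_task_agent = autogen.AssistantAgent(name="easy_task_agent", llm_config={"config_list": cheap_config_list})
+```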
+
 ### Unexpected keyword argument 'base_url'
 
 In version >=1, OpenAI renamed their `api_base` parameter to `base_url`. So for older versions, use `api_base` but for newer versions use `base_url`.
 
 ### Can I use non-OpenAI models?
 
-Yes. Please check https://microsoft.github.io/autogen/blog/2023/07/14/Local-LLMs for an example.
+Yes. AutoGen can work with any API endpoint that complies with the OpenAI-compatible RESTful API specification - e.g. a local LLM served via FastChat or LM Studio. Please check https://microsoft.github.io/autogen/blog/2023/07/14/Local-LLMs for an example.
 
 ## Handle Rate Limit Error and Timeout Error
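+
+One knob that helps with timeout errors is the `timeout` entry of `llm_config`, as used in the project's example notebooks. A minimal sketch, reusing the `config_list` from the example above (the timeout value below is arbitrary):
+
+```python
+import autogen
+
+llm_config = {
+    "config_list": config_list,
+    "timeout": 120,  # seconds to wait for a single request before it fails
+}
+assistant = autogen.AssistantAgent(name="assistant", llm_config=llm_config)
+```
+
+Rate limit errors returned by a provider count as failed requests, so they also trigger the failover to the next configuration in the `config_list`, as described above.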