Commit

FAQ, working with LLM endpoints and explaining why config-list is a list (#1451)

* FAQ and notebook extended

* Update notebook/oai_openai_utils.ipynb

Co-authored-by: Chi Wang <[email protected]>

* Update website/docs/FAQ.md

Co-authored-by: Chi Wang <[email protected]>

* Update website/docs/FAQ.md

Co-authored-by: Chi Wang <[email protected]>

* Update website/docs/FAQ.md

Co-authored-by: Chi Wang <[email protected]>

* Update website/docs/FAQ.md

Co-authored-by: Chi Wang <[email protected]>

* Update FAQ.md

---------

Co-authored-by: Chi Wang <[email protected]>
maxim-saplin and sonichi authored Jan 30, 2024
1 parent 3608079 commit 74a4d70
Showing 2 changed files with 41 additions and 3 deletions.
4 changes: 3 additions & 1 deletion notebook/oai_openai_utils.ipynb
@@ -24,7 +24,9 @@
"- `config_list_openai_aoai`: Constructs a list of configurations using both Azure OpenAI and OpenAI endpoints, sourcing API keys from environment variables or local files.\n",
"- `config_list_from_json`: Loads configurations from a JSON structure, either from an environment variable or a local JSON file, with the flexibility of filtering configurations based on given criteria.\n",
"- `config_list_from_models`: Creates configurations based on a provided list of models, useful when targeting specific models without manually specifying each configuration.\n",
"- `config_list_from_dotenv`: Constructs a configuration list from a `.env` file, offering a consolidated way to manage multiple API configurations and keys from a single file."
"- `config_list_from_dotenv`: Constructs a configuration list from a `.env` file, offering a consolidated way to manage multiple API configurations and keys from a single file.\n",
"\n",
"If mutiple models are provided, the Autogen client (`OpenAIWrapper`) and agents don't choose the \"best model\" on any criteria - inference is done through the very first model and the next one is used only if the current model fails (e.g. API throttling by the provider or a filter condition is unsatisfied)."
]
},
{
40 changes: 38 additions & 2 deletions website/docs/FAQ.md
@@ -3,6 +3,7 @@
- [Set your API endpoints](#set-your-api-endpoints)
- [Use the constructed configuration list in agents](#use-the-constructed-configuration-list-in-agents)
- [Unexpected keyword argument 'base_url'](#unexpected-keyword-argument-base_url)
- [How does an agent decide which model to pick out of the list?](#how-does-an-agent-decide-which-model-to-pick-out-of-the-list)
- [Can I use non-OpenAI models?](#can-i-use-non-openai-models)
- [Handle Rate Limit Error and Timeout Error](#handle-rate-limit-error-and-timeout-error)
- [How to continue a finished conversation](#how-to-continue-a-finished-conversation)
@@ -20,7 +21,34 @@

## Set your API endpoints

There are multiple ways to construct configurations for LLM inference in the `oai` utilities:
AutoGen relies on third-party API endpoints for LLM inference. It works out of the box with OpenAI and Azure OpenAI models, and it can work with any model (self-hosted or local) that is accessible through an inference server compatible with the OpenAI Chat Completions API.

An agent requires a list of configuration dictionaries for setting up model endpoints. Each configuration dictionary can contain the following keys:
- `model` (str): Required. The identifier of the model to be used, such as 'gpt-4', 'gpt-3.5-turbo', or 'llama-7B'.
- `api_key` (str): Optional. The API key required for authenticating requests to the model's API endpoint.
- `api_type` (str): Optional. The type of API service being used. This could be 'azure' for Azure OpenAI, 'openai' for OpenAI, or another custom type.
- `base_url` (str): Optional. The base URL of the API endpoint. This is the root address where API calls are directed.
- `api_version` (str): Optional. The version of the Azure API to use, e.g., '2023-12-01-preview'.

For example:
```python
import os

config_list = [
{
"model": "gpt-4",
"api_key": os.environ.get("AZURE_OPENAI_API_KEY"),
"api_type": "azure",
"base_url": os.environ.get("AZURE_OPENAI_API_BASE"),
"api_version": "2023-12-01-preview",
},
{
"model": "llama-7B",
"base_url": "http://127.0.0.1:8080",
"api_type": "openai",
}
]
```

The `autogen` module provides [multiple helper functions](/docs/reference/oai/openai_utils) for constructing configuration lists from different sources:

- `get_config_list`: Generates configurations for API calls, primarily from provided API keys.
- `config_list_openai_aoai`: Constructs a list of configurations using both Azure OpenAI and OpenAI endpoints, sourcing API keys from environment variables or local files.
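
For instance, a minimal sketch (the `OAI_CONFIG_LIST` name and the gpt-4 filter are assumptions for illustration) of loading a configuration list with `config_list_from_json`:

```python
import autogen

# Load configurations from the OAI_CONFIG_LIST environment variable,
# or from a JSON file of the same name, keeping only gpt-4 entries.
config_list = autogen.config_list_from_json(
    env_or_file="OAI_CONFIG_LIST",
    filter_dict={"model": ["gpt-4"]},
)
```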
@@ -47,13 +75,21 @@ You can also explicitly specify that by:
assistant = autogen.AssistantAgent(name="assistant", llm_config={"api_key": ...})
```

### How does an agent decide which model to pick out of the list?

An agent uses the first model in the `config_list` and makes LLM calls against it. If the model fails (e.g., due to API throttling), the agent retries the request against the second model, and so on, until a completion is received (or raises an error if none of the models completes the request successfully). There is no implicit or hidden logic inside agents for picking "the best model for the task" - it is the developer's responsibility to pick the right models and use them with agents.
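
For example, a minimal fallback sketch (the environment variable name is an assumption; the models are tried in list order):

```python
import os

import autogen

# gpt-4 is tried first; gpt-3.5-turbo is used only if the gpt-4 call fails.
config_list = [
    {"model": "gpt-4", "api_key": os.environ.get("OPENAI_API_KEY")},
    {"model": "gpt-3.5-turbo", "api_key": os.environ.get("OPENAI_API_KEY")},
]
assistant = autogen.AssistantAgent(
    name="assistant", llm_config={"config_list": config_list}
)
```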

Besides rotating models under throttling, the `config_list` can be useful for:
- Keeping a single global list of models and [filtering it](/docs/reference/oai/openai_utils/#filter_config) by certain keys (e.g., name, tag) in order to pass selected models to a given agent (e.g., use the cheaper GPT-3.5 for agents solving easier tasks); see the sketch after this list
- Using more advanced features for special purposes related to inference, such as `filter_func` with [`OpenAIWrapper`](/docs/reference/oai/client#create) or [inference optimization](/docs/Examples#enhanced-inferences)
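
As a sketch of the first point (assuming `config_list` is a global list that includes a gpt-3.5-turbo entry):

```python
import autogen

# Keep only the cheaper model for an agent that handles easier tasks.
cheap_config_list = autogen.filter_config(
    config_list, filter_dict={"model": ["gpt-3.5-turbo"]}
)
easy_task_agent = autogen.AssistantAgent(
    name="easy_task_agent", llm_config={"config_list": cheap_config_list}
)
```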

### Unexpected keyword argument 'base_url'

In version >=1 of the `openai` package, OpenAI renamed the `api_base` parameter to `base_url`. Use `api_base` with older versions and `base_url` with newer ones.
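
For illustration, the same (hypothetical) endpoint configured for each version:

```python
# openai < 1.0
config_old = {"model": "gpt-4", "api_key": "...", "api_base": "https://example-endpoint/v1"}

# openai >= 1.0
config_new = {"model": "gpt-4", "api_key": "...", "base_url": "https://example-endpoint/v1"}
```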

### Can I use non-OpenAI models?

Yes. Please check https://microsoft.github.io/autogen/blog/2023/07/14/Local-LLMs for an example.
Yes. AutoGen can work with any endpoint that exposes an OpenAI-compatible RESTful API - e.g., a local LLM served via FastChat or LM Studio. Please check https://microsoft.github.io/autogen/blog/2023/07/14/Local-LLMs for an example.
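
For example, a minimal sketch of a configuration for a locally served model (the address, model name, and placeholder key are assumptions - use whatever your local server reports):

```python
config_list = [
    {
        # LM Studio's OpenAI-compatible server listens on
        # http://localhost:1234/v1 by default (assumed here).
        "model": "local-model",
        "base_url": "http://localhost:1234/v1",
        "api_key": "NULL",  # many local servers accept any placeholder key
    }
]
```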

## Handle Rate Limit Error and Timeout Error

