Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Documentation] Using non-OpenAI models #2076

Merged
merged 16 commits into from
Mar 20, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
# Non-OpenAI Models

AutoGen allows you to use non-OpenAI models through proxy servers that provide
an OpenAI-compatible API or a [custom model client](https://microsoft.github.io/autogen/blog/2024/01/26/Custom-Models)
class.

Benefits of this flexibility include access to hundreds of models, assigning specialized
models to agents (e.g., fine-tuned coding models), the ability to run AutoGen entirely
within your environment, utilising both OpenAI and non-OpenAI models in one system, and cost
reductions in inference.

## OpenAI-compatible API proxy server
Any proxy server that provides an API that is compatible with [OpenAI's API](https://platform.openai.com/docs/api-reference)
will work with AutoGen.

These proxy servers can be cloud-based or running locally within your environment.

![Cloud or Local Proxy Servers](images/cloudlocalproxy.png)

### Cloud-based proxy servers
By using cloud-based proxy servers, you are able to use models without requiring the hardware
and software to run them.

These providers can host open source/weight models, like [Hugging Face](https://huggingface.co/),
or their own closed models.

When cloud-based proxy servers provide an OpenAI-compatible API, using them in AutoGen
is straightforward. With [LLM Configuration](/docs/topics/llm_configuration) done in
the same way as when using OpenAI's models, the primary difference is typically the
authentication which is usually handled through an API key.

Examples of using cloud-based proxy servers providers that have an OpenAI-compatible API
are provided below:

- [together.ai example](cloud-togetherai)


### Locally run proxy servers
An increasing number of LLM proxy servers are available for use locally. These can be
open-source (e.g., LiteLLM, Ollama, vLLM) or closed-source (e.g., LM Studio), and are
typically used for running the full-stack within your environment.

Similar to cloud-based proxy servers, as long as these proxy servers provide an
OpenAI-compatible API, running them in AutoGen is straightforward.

Examples of using locally run proxy servers that have an OpenAI-compatible API are
provided below:

- [LiteLLM with Ollama example](local-litellm-ollama)
- [LM Studio](local-lm-studio)
- [vLLM example](local-vllm)

````mdx-code-block
:::tip
If you are planning to use Function Calling, not all cloud-based and local proxy servers support
Function Calling with their OpenAI-compatible API, so check their documentation.
:::
````

### Configuration for Non-OpenAI models

Whether you choose a cloud-based or locally-run proxy server, the configuration is done in
the same way as using OpenAI's models, see [LLM Configuration](/docs/topics/llm_configuration)
for further information.

You can use [model configuration filtering](/docs/topics/llm_configuration#config-list-filtering)
to assign specific models to agents.


## Custom Model Client class
For more advanced users, you can create your own custom model client class, enabling
you to define and load your own models.

See the [AutoGen with Custom Models: Empowering Users to Use Their Own Inference Mechanism](/blog/2024/01/26/Custom-Models)
blog post and [this notebook](/docs/notebooks/agentchat_custom_model/) for a guide to creating custom model client classes.
170 changes: 170 additions & 0 deletions website/docs/topics/non-openai-models/cloud-togetherai.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,170 @@
# Together AI
This cloud-based proxy server example, using [together.ai](https://www.together.ai/), is a group chat between a Python developer
and a code reviewer, who are given a coding task.

Start by [installing AutoGen](/docs/installation/) and getting your [together.ai API key](https://api.together.xyz/settings/profile).

Put your together.ai API key in an environment variable, TOGETHER_API_KEY.

Linux / Mac OSX:

```bash
export TOGETHER_API_KEY=YourTogetherAIKeyHere
```

Windows (command prompt):

```powershell
set TOGETHER_API_KEY=YourTogetherAIKeyHere
```

Create your LLM configuration, with the [model you want](https://docs.together.ai/docs/inference-models).

```python
import autogen
import os

llm_config={
"config_list": [
{
# Available together.ai model strings:
# https://docs.together.ai/docs/inference-models
"model": "mistralai/Mistral-7B-Instruct-v0.1",
"api_key": os.environ['TOGETHER_API_KEY'],
"base_url": "https://api.together.xyz/v1"
}
],
"cache_seed": 42
}
```

## Construct Agents

```python
# User Proxy will execute code and finish the chat upon typing 'exit'
user_proxy = autogen.UserProxyAgent(
name="UserProxy",
system_message="A human admin",
code_execution_config={
"last_n_messages": 2,
"work_dir": "groupchat",
"use_docker": False,
}, # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
human_input_mode="TERMINATE",
is_termination_msg=lambda x: "TERMINATE" in x.get("content"),
)

# Python Coder agent
coder = autogen.AssistantAgent(
name="softwareCoder",
description="Software Coder, writes Python code as required and reiterates with feedback from the Code Reviewer.",
system_message="You are a senior Python developer, a specialist in writing succinct Python functions.",
llm_config=llm_config,
)

# Code Reviewer agent
reviewer = autogen.AssistantAgent(
name="codeReviewer",
description="Code Reviewer, reviews written code for correctness, efficiency, and security. Asks the Software Coder to address issues.",
system_message="You are a Code Reviewer, experienced in checking code for correctness, efficiency, and security. Review and provide feedback to the Software Coder until you are satisfied, then return the word TERMINATE",
is_termination_msg=lambda x: "TERMINATE" in x.get("content"),
llm_config=llm_config,
)
```

## Establish the group chat

```python
# Establish the Group Chat and disallow a speaker being selected consecutively
groupchat = autogen.GroupChat(agents=[user_proxy, coder, reviewer], messages=[], max_round=12, allow_repeat_speaker=False)

# Manages the group of multiple agents
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)
```

## Start Chat

```python
# Start the chat with a request to write a function
user_proxy.initiate_chat(
manager,
message="Write a Python function for the Fibonacci sequence, the function will have one parameter for the number in the sequence, which the function will return the Fibonacci number for."
)
# type exit to terminate the chat
```

Output:
```` text
UserProxy (to chat_manager):

Write a Python function for the Fibonacci sequence, the function will have one parameter for the number in the sequence, which the function will return the Fibonacci number for.

--------------------------------------------------------------------------------
softwareCoder (to chat_manager):

Sure, here is a simple Python function that uses recursion to calculate the Fibonacci number:

```python
def fibonacci(n):
if n <= 0:
return "Input should be a positive integer."
elif n == 1:
return 0
elif n == 2:
return 1
else:
return fibonacci(n-1) + fibonacci(n-2)
```

This function takes an integer `n` as input and returns the `n`th number in the Fibonacci sequence. The Fibonacci sequence is a series of numbers in which each number is the sum of the two preceding ones, usually starting with 0 and 1.

Note that this implementation uses recursion and may not be efficient for large values of `n`. In such cases, an iterative approach or memoization would be more appropriate.

--------------------------------------------------------------------------------
codeReviewer (to chat_manager):

I see a couple of issues with the current implementation of the `fibonacci` function:

1. The function does not handle negative inputs correctly. Currently, it returns a string message for any input less than or equal to 0. It would be better to raise a `ValueError` or return a more informative message.
2. The function uses recursion to calculate the Fibonacci number, which can be inefficient for large inputs. A more efficient approach would be to use an iterative approach or memoization.

Here's an updated implementation that addresses these issues:

```python
def fibonacci(n):
if n <= 0:
raise ValueError("Input should be a positive integer.")
elif n == 1:
return 0
elif n == 2:
return 1
else:
a, b = 0, 1
for _ in range(n - 2):
a, b = b, a + b
return b
```

This implementation uses a loop to calculate the Fibonacci number iteratively, which is more efficient than the recursive approach. It also raises a `ValueError` for negative inputs, which is a more appropriate way to handle invalid inputs.

--------------------------------------------------------------------------------

>>>>>>>> USING AUTO REPLY...

>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...
UserProxy (to chat_manager):

exitcode: 0 (execution succeeded)
Code output:


--------------------------------------------------------------------------------
codeReviewer (to chat_manager):

I'm glad the updated implementation addresses the issues with the original code. Let me know if you have any further questions or if there's anything else I can help you with.

To terminate the conversation, please type "TERMINATE".

--------------------------------------------------------------------------------
Please give feedback to chat_manager. Press enter or type 'exit' to stop the conversation: exit
````
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Loading