diff --git a/website/docs/topics/non-openai-models/about-using-nonopenai-models.md b/website/docs/topics/non-openai-models/about-using-nonopenai-models.md
new file mode 100644
index 000000000000..c9ddc1b3988d
--- /dev/null
+++ b/website/docs/topics/non-openai-models/about-using-nonopenai-models.md
@@ -0,0 +1,75 @@
+# Non-OpenAI Models
+
+AutoGen allows you to use non-OpenAI models through proxy servers that provide
+an OpenAI-compatible API or a [custom model client](https://microsoft.github.io/autogen/blog/2024/01/26/Custom-Models)
+class.
+
+Benefits of this flexibility include access to hundreds of models, assigning specialized
+models to agents (e.g., fine-tuned coding models), the ability to run AutoGen entirely
+within your environment, the use of both OpenAI and non-OpenAI models in one system, and
+reduced inference costs.
+
+## OpenAI-compatible API proxy server
+Any proxy server that provides an API compatible with [OpenAI's API](https://platform.openai.com/docs/api-reference)
+will work with AutoGen.
+
+These proxy servers can be cloud-based or run locally within your environment.
+
+![Cloud or Local Proxy Servers](images/cloudlocalproxy.png)
+
+### Cloud-based proxy servers
+By using cloud-based proxy servers, you can use models without requiring the hardware
+and software to run them.
+
+These providers can host open source/weight models, as [Hugging Face](https://huggingface.co/) does,
+or their own closed models.
+
+When cloud-based proxy servers provide an OpenAI-compatible API, using them in AutoGen
+is straightforward. [LLM Configuration](/docs/topics/llm_configuration) is done in
+the same way as when using OpenAI's models; the primary difference is typically
+authentication, which is usually handled through an API key.
+
+Examples of using cloud-based proxy server providers that have an OpenAI-compatible API
+are provided below:
+
+- [together.ai example](cloud-togetherai)
+
+
+### Locally run proxy servers
+An increasing number of LLM proxy servers are available for use locally. These can be
+open-source (e.g., LiteLLM, Ollama, vLLM) or closed-source (e.g., LM Studio), and are
+typically used for running the full stack within your environment.
+
+Similar to cloud-based proxy servers, as long as these proxy servers provide an
+OpenAI-compatible API, running them in AutoGen is straightforward.
+
+Examples of using locally run proxy servers that have an OpenAI-compatible API are
+provided below:
+
+- [LiteLLM with Ollama example](local-litellm-ollama)
+- [LM Studio](local-lm-studio)
+- [vLLM example](local-vllm)
+
+````mdx-code-block
+:::tip
+If you plan to use Function Calling, be aware that not all cloud-based and local proxy servers
+support Function Calling through their OpenAI-compatible API, so check their documentation.
+:::
+````
+
+### Configuration for Non-OpenAI models
+
+Whether you choose a cloud-based or locally run proxy server, the configuration is done in
+the same way as when using OpenAI's models; see [LLM Configuration](/docs/topics/llm_configuration)
+for further information.
+
+You can use [model configuration filtering](/docs/topics/llm_configuration#config-list-filtering)
+to assign specific models to agents.
+
+
+## Custom Model Client class
+For more advanced users, you can create your own custom model client class, enabling
+you to define and load your own models.
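+
+As a rough illustration of the shape such a class takes, below is a minimal sketch following the
+protocol described in the blog post linked below (a class exposing `create`, `message_retrieval`,
+`cost`, and `get_usage`). The placeholder inference call and response construction are illustrative
+only; treat the blog post and notebook as the authoritative guide.
+
+```python
+from types import SimpleNamespace
+
+
+class CustomModelClient:
+    """Illustrative sketch of a custom model client."""
+
+    def __init__(self, config, **kwargs):
+        # Load or connect to your own model here
+        self.model_name = config["model"]
+
+    def create(self, params):
+        # Run your own inference and wrap the result in an OpenAI-style response object
+        text = f"(reply from {self.model_name})"  # placeholder for your inference call
+        message = SimpleNamespace(content=text, function_call=None)
+        return SimpleNamespace(choices=[SimpleNamespace(message=message)], model=self.model_name)
+
+    def message_retrieval(self, response):
+        # Return the list of generated messages for AutoGen to use
+        return [choice.message.content for choice in response.choices]
+
+    def cost(self, response):
+        return 0  # return 0 if you are not tracking cost
+
+    @staticmethod
+    def get_usage(response):
+        return {}  # optionally report token usage and cost
+```
+
+In your `config_list`, the corresponding entry would also name the class (e.g.,
+`"model_client_cls": "CustomModelClient"`), and the agent would register it with
+`agent.register_model_client(model_client_cls=CustomModelClient)` before the chat starts.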
+
+See the [AutoGen with Custom Models: Empowering Users to Use Their Own Inference Mechanism](/blog/2024/01/26/Custom-Models)
+blog post and [this notebook](/docs/notebooks/agentchat_custom_model/) for a guide to creating custom model client classes.
diff --git a/website/docs/topics/non-openai-models/cloud-togetherai.md b/website/docs/topics/non-openai-models/cloud-togetherai.md
new file mode 100644
index 000000000000..94d24967a332
--- /dev/null
+++ b/website/docs/topics/non-openai-models/cloud-togetherai.md
@@ -0,0 +1,170 @@
+# Together AI
+This cloud-based proxy server example uses [together.ai](https://www.together.ai/) to run a group chat between a Python developer
+and a code reviewer, who are given a coding task.
+
+Start by [installing AutoGen](/docs/installation/) and getting your [together.ai API key](https://api.together.xyz/settings/profile).
+
+Put your together.ai API key in an environment variable named TOGETHER_API_KEY.
+
+Linux / macOS:
+
+```bash
+export TOGETHER_API_KEY=YourTogetherAIKeyHere
+```
+
+Windows (command prompt):
+
+```
+set TOGETHER_API_KEY=YourTogetherAIKeyHere
+```
+
+Create your LLM configuration with the [model you want](https://docs.together.ai/docs/inference-models).
+
+```python
+import autogen
+import os
+
+llm_config={
+    "config_list": [
+        {
+            # Available together.ai model strings:
+            # https://docs.together.ai/docs/inference-models
+            "model": "mistralai/Mistral-7B-Instruct-v0.1",
+            "api_key": os.environ['TOGETHER_API_KEY'],
+            "base_url": "https://api.together.xyz/v1"
+        }
+    ],
+    "cache_seed": 42
+}
+```
+
+## Construct Agents
+
+```python
+# User Proxy will execute code and finish the chat upon typing 'exit'
+user_proxy = autogen.UserProxyAgent(
+    name="UserProxy",
+    system_message="A human admin",
+    code_execution_config={
+        "last_n_messages": 2,
+        "work_dir": "groupchat",
+        "use_docker": False,
+    }, # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
+    human_input_mode="TERMINATE",
+    is_termination_msg=lambda x: "TERMINATE" in x.get("content"),
+)
+
+# Python Coder agent
+coder = autogen.AssistantAgent(
+    name="softwareCoder",
+    description="Software Coder, writes Python code as required and reiterates with feedback from the Code Reviewer.",
+    system_message="You are a senior Python developer, a specialist in writing succinct Python functions.",
+    llm_config=llm_config,
+)
+
+# Code Reviewer agent
+reviewer = autogen.AssistantAgent(
+    name="codeReviewer",
+    description="Code Reviewer, reviews written code for correctness, efficiency, and security. Asks the Software Coder to address issues.",
+    system_message="You are a Code Reviewer, experienced in checking code for correctness, efficiency, and security.
Review and provide feedback to the Software Coder until you are satisfied, then return the word TERMINATE", + is_termination_msg=lambda x: "TERMINATE" in x.get("content"), + llm_config=llm_config, +) +``` + +## Establish the group chat + +```python +# Establish the Group Chat and disallow a speaker being selected consecutively +groupchat = autogen.GroupChat(agents=[user_proxy, coder, reviewer], messages=[], max_round=12, allow_repeat_speaker=False) + +# Manages the group of multiple agents +manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config) +``` + +## Start Chat + +```python +# Start the chat with a request to write a function +user_proxy.initiate_chat( + manager, + message="Write a Python function for the Fibonacci sequence, the function will have one parameter for the number in the sequence, which the function will return the Fibonacci number for." +) +# type exit to terminate the chat +``` + +Output: +```` text +UserProxy (to chat_manager): + +Write a Python function for the Fibonacci sequence, the function will have one parameter for the number in the sequence, which the function will return the Fibonacci number for. + +-------------------------------------------------------------------------------- +softwareCoder (to chat_manager): + + Sure, here is a simple Python function that uses recursion to calculate the Fibonacci number: + +```python +def fibonacci(n): + if n <= 0: + return "Input should be a positive integer." + elif n == 1: + return 0 + elif n == 2: + return 1 + else: + return fibonacci(n-1) + fibonacci(n-2) +``` + +This function takes an integer `n` as input and returns the `n`th number in the Fibonacci sequence. The Fibonacci sequence is a series of numbers in which each number is the sum of the two preceding ones, usually starting with 0 and 1. + +Note that this implementation uses recursion and may not be efficient for large values of `n`. In such cases, an iterative approach or memoization would be more appropriate. + +-------------------------------------------------------------------------------- +codeReviewer (to chat_manager): + + I see a couple of issues with the current implementation of the `fibonacci` function: + +1. The function does not handle negative inputs correctly. Currently, it returns a string message for any input less than or equal to 0. It would be better to raise a `ValueError` or return a more informative message. +2. The function uses recursion to calculate the Fibonacci number, which can be inefficient for large inputs. A more efficient approach would be to use an iterative approach or memoization. + +Here's an updated implementation that addresses these issues: + +```python +def fibonacci(n): + if n <= 0: + raise ValueError("Input should be a positive integer.") + elif n == 1: + return 0 + elif n == 2: + return 1 + else: + a, b = 0, 1 + for _ in range(n - 2): + a, b = b, a + b + return b +``` + +This implementation uses a loop to calculate the Fibonacci number iteratively, which is more efficient than the recursive approach. It also raises a `ValueError` for negative inputs, which is a more appropriate way to handle invalid inputs. + +-------------------------------------------------------------------------------- + +>>>>>>>> USING AUTO REPLY... + +>>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)... 
+UserProxy (to chat_manager):
+
+exitcode: 0 (execution succeeded)
+Code output:
+
+
+--------------------------------------------------------------------------------
+codeReviewer (to chat_manager):
+
+ I'm glad the updated implementation addresses the issues with the original code. Let me know if you have any further questions or if there's anything else I can help you with.
+
+To terminate the conversation, please type "TERMINATE".
+
+--------------------------------------------------------------------------------
+Please give feedback to chat_manager. Press enter or type 'exit' to stop the conversation: exit
+````
diff --git a/website/docs/topics/non-openai-models/images/cloudlocalproxy.png b/website/docs/topics/non-openai-models/images/cloudlocalproxy.png
new file mode 100755
index 000000000000..db907e155dbe
Binary files /dev/null and b/website/docs/topics/non-openai-models/images/cloudlocalproxy.png differ
diff --git a/website/docs/topics/non-openai-models/local-litellm-ollama.md b/website/docs/topics/non-openai-models/local-litellm-ollama.md
new file mode 100644
index 000000000000..98b326acdf4a
--- /dev/null
+++ b/website/docs/topics/non-openai-models/local-litellm-ollama.md
@@ -0,0 +1,329 @@
+# LiteLLM with Ollama
+[LiteLLM](https://litellm.ai/) is an open-source, locally run proxy server that provides an
+OpenAI-compatible API. It interfaces with a large number of providers that do the inference.
+A popular open-source inference engine for handling that inference is [Ollama](https://ollama.com/).
+
+Not all proxy servers support OpenAI's [Function Calling](https://platform.openai.com/docs/guides/function-calling) (usable with AutoGen);
+LiteLLM, together with Ollama, enables this useful feature.
+
+Running this stack requires the installation of:
+1. AutoGen ([installation instructions](/docs/installation))
+2. LiteLLM
+3. Ollama
+
+Note: We recommend using a virtual environment for your stack; see [this article](https://microsoft.github.io/autogen/docs/installation/#create-a-virtual-environment-optional) for guidance.
+
+## Installing LiteLLM
+
+Install LiteLLM with the proxy server functionality:
+
+```bash
+pip install 'litellm[proxy]'
+```
+
+Note: If you are using Windows, run LiteLLM and Ollama within a [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install) environment.
+
+````mdx-code-block
+:::tip
+For custom LiteLLM installation instructions, see their [GitHub repository](https://github.com/BerriAI/litellm).
+:::
+````
+
+## Installing Ollama
+
+For Mac and Windows, [download Ollama](https://ollama.com/download).
+
+For Linux:
+
+```bash
+curl -fsSL https://ollama.com/install.sh | sh
+```
+
+## Downloading models
+
+Ollama has a library of models to choose from; see them [here](https://ollama.com/library).
+
+Before you can use a model, you need to download it (using the name of the model from the library):
+
+```bash
+ollama pull llama2
+```
+
+To view the models you have downloaded and can use:
+
+```bash
+ollama list
+```
+
+````mdx-code-block
+:::tip
+Ollama enables the use of GGUF model files, readily available on Hugging Face. See Ollama's [GitHub repository](https://github.com/ollama/ollama)
+for examples.
+:::
+````
+
+## Running LiteLLM proxy server
+
+To run LiteLLM with the model you have downloaded, in your terminal:
+
+```bash
+litellm --model ollama_chat/llama2
+```
+
+```` text
+INFO: Started server process [19040]
+INFO: Waiting for application startup.
+
+#------------------------------------------------------------#
+# #
+# 'This feature doesn't meet my needs because...'
# +# https://github.com/BerriAI/litellm/issues/new # +# # +#------------------------------------------------------------# + + Thank you for using LiteLLM! - Krrish & Ishaan + + + +Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new + + +INFO: Application startup complete. +INFO: Uvicorn running on http://0.0.0.0:4000 (Press CTRL+C to quit) +```` + +This will run the proxy server and it will be available at 'http://0.0.0.0:4000/'. + +## Using LiteLLM+Ollama with AutoGen + +Now that we have the URL for the LiteLLM proxy server, you can use it within AutoGen +in the same way as OpenAI or cloud-based proxy servers. + +As you are running this proxy server locally, no API key is required. Additionally, as +the model is being set when running the +LiteLLM command, no model name needs to be configured in AutoGen. However, ```model``` +and ```api_key``` are mandatory fields for configurations within AutoGen so we put dummy +values in them, as per the example below. + +```python +from autogen import UserProxyAgent, ConversableAgent + +local_llm_config={ + "config_list": [ + { + "model": "NotRequired", # Loaded with LiteLLM command + "api_key": "NotRequired", # Not needed + "base_url": "http://0.0.0.0:4000" # Your LiteLLM URL + } + ], + "cache_seed": None # Turns off caching, useful for testing different models +} + +# Create the agent that uses the LLM. +assistant = ConversableAgent("agent", llm_config=local_llm_config) + +# Create the agent that represents the user in the conversation. +user_proxy = UserProxyAgent("user", code_execution_config=False) + +# Let the assistant start the conversation. It will end when the user types exit. +assistant.initiate_chat(user_proxy, message="How can I help you today?") +``` + +Output: + +```` text +agent (to user): + +How can I help you today? + +-------------------------------------------------------------------------------- +Provide feedback to agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: Tell me, why is the sky blue? +user (to agent): + +Tell me, why is the sky blue? + +-------------------------------------------------------------------------------- + +>>>>>>>> USING AUTO REPLY... +agent (to user): + +Ah, a classic question! The sky appears blue because of a phenomenon called Rayleigh scattering. When sunlight enters Earth's atmosphere, it encounters tiny molecules of gases such as nitrogen and oxygen. These molecules scatter the light in all directions, but they scatter shorter (blue) wavelengths more than longer (red) wavelengths. This is why the sky looks blue during the daytime, especially in the direction of the sun. + +But wait, there's more! The color of the sky can also change depending on the time of day and atmospheric conditions. For example, during sunrise and sunset, the sky can take on hues of red, orange, and pink due to the way that light interacts with the Earth's atmosphere. And if you're in a location with a lot of dust or pollution, the sky may appear more hazy or gray. + +So there you have it! The sky is blue because of the way that light interacts with the Earth's atmosphere, and there are many factors that can affect its appearance. Is there anything else you'd like to know? + +-------------------------------------------------------------------------------- +Provide feedback to agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: And what would it take to make it red? +user (to agent): + +And what would it take to make it red? 
+
+--------------------------------------------------------------------------------
+
+>>>>>>>> USING AUTO REPLY...
+agent (to user):
+
+Ah, an interesting question! To make the sky appear red, we need to look at the scattering of light in a different way. When the sun is low on the horizon, the light it emits has to travel through a longer distance in the atmosphere to reach our eyes. As a result, much of the shorter wavelength blue light is scattered away, leaving mainly the longer wavelength red and orange light to reach our eyes. This is why the sky can take on hues of red, orange, and pink during sunrise and sunset.
+
+However, if we were to somehow change the composition of the atmosphere or add some additional particles into the air, we could potentially make the sky appear red even when the sun is high in the sky. For example, if we were to add a lot of dust or smoke into the atmosphere, the sky might take on a reddish hue due to the scattering of light by these particles. Or, if we were to create a situation where the air was filled with a high concentration of certain gases, such as nitrogen oxides or sulfur compounds, the sky could potentially appear red or orange as a result of the way that these gases interact with light.
+
+So there you have it! While the sky is typically blue during the daytime due to Rayleigh scattering, there are many other factors that can affect its appearance, and with the right conditions, we can even make the sky appear red! Is there anything else you'd like to know?
+
+--------------------------------------------------------------------------------
+Provide feedback to agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: exit
+````
+
+## Example with Function Calling
+Function calling (also known as tool calling) is a feature of OpenAI's API that AutoGen and LiteLLM support.
+
+Below is an example of using function calling with LiteLLM and Ollama, based on this [currency conversion](https://github.com/microsoft/autogen/blob/501f8d22726e687c55052682c20c97ce62f018ac/notebook/agentchat_function_call_currency_calculator.ipynb) notebook.
+
+LiteLLM is loaded in the same way as in the previous example; however, the DolphinCoder model is used, as it is better at constructing the
+function calling message required.
+
+In your terminal:
+
+```bash
+litellm --model ollama_chat/dolphincoder
+```
+
+
+```python
+import autogen
+from typing import Literal
+from typing_extensions import Annotated
+
+local_llm_config={
+    "config_list": [
+        {
+            "model": "NotRequired", # Loaded with LiteLLM command
+            "api_key": "NotRequired", # Not needed
+            "base_url": "http://0.0.0.0:4000" # Your LiteLLM URL
+        }
+    ],
+    "cache_seed": None # Turns off caching, useful for testing different models
+}
+
+# Create the agent and include examples of the function calling JSON in the prompt
+# to help guide the model
+chatbot = autogen.AssistantAgent(
+    name="chatbot",
+    system_message="""For currency exchange tasks,
+    only use the functions you have been provided with.
+    Output 'TERMINATE' when an answer has been provided.
+    Do not include the function name or result in the JSON.
+    Example of the return JSON is:
+    {
+        "parameter_1_name": 100.00,
+        "parameter_2_name": "ABC",
+        "parameter_3_name": "DEF",
+    }.
+    Another example of the return JSON is:
+    {
+        "parameter_1_name": "GHI",
+        "parameter_2_name": "ABC",
+        "parameter_3_name": "DEF",
+        "parameter_4_name": 123.00,
+    }.
""", + + llm_config=local_llm_config, +) + +user_proxy = autogen.UserProxyAgent( + name="user_proxy", + is_termination_msg=lambda x: x.get("content", "") and "TERMINATE" in x.get("content", ""), + human_input_mode="NEVER", + max_consecutive_auto_reply=1, +) + + +CurrencySymbol = Literal["USD", "EUR"] + +# Define our function that we expect to call +def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float: + if base_currency == quote_currency: + return 1.0 + elif base_currency == "USD" and quote_currency == "EUR": + return 1 / 1.1 + elif base_currency == "EUR" and quote_currency == "USD": + return 1.1 + else: + raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}") + +# Register the function with the agent +@user_proxy.register_for_execution() +@chatbot.register_for_llm(description="Currency exchange calculator.") +def currency_calculator( + base_amount: Annotated[float, "Amount of currency in base_currency"], + base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD", + quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR", +) -> str: + quote_amount = exchange_rate(base_currency, quote_currency) * base_amount + return f"{format(quote_amount, '.2f')} {quote_currency}" + +# start the conversation +res = user_proxy.initiate_chat( + chatbot, + message="How much is 123.45 EUR in USD?", + summary_method="reflection_with_llm", +) +``` + +Output: + +```` text +user_proxy (to chatbot): + +How much is 123.45 EUR in USD? + +-------------------------------------------------------------------------------- +chatbot (to user_proxy): + +***** Suggested tool Call (call_c93c4390-93d5-4a28-b40d-09fe74cc58da): currency_calculator ***** +Arguments: +{ + "base_amount": 123.45, + "base_currency": "EUR", + "quote_currency": "USD" +} + + +************************************************************************************************ + +-------------------------------------------------------------------------------- + +>>>>>>>> EXECUTING FUNCTION currency_calculator... +user_proxy (to chatbot): + +user_proxy (to chatbot): + +***** Response from calling tool "call_c93c4390-93d5-4a28-b40d-09fe74cc58da" ***** +135.80 USD +********************************************************************************** + +-------------------------------------------------------------------------------- +chatbot (to user_proxy): + +***** Suggested tool Call (call_d8fd94de-5286-4ef6-b1f6-72c826531ff9): currency_calculator ***** +Arguments: +{ + "base_amount": 123.45, + "base_currency": "EUR", + "quote_currency": "USD" +} + + +************************************************************************************************ +```` + +````mdx-code-block +:::warning +Not all open source/weight models are suitable for function calling and AutoGen continues to be +developed to provide wider support for open source models. + +The [#alt-models](https://discord.com/channels/1153072414184452236/1201369716057440287) channel +on AutoGen's Discord is an active community discussing the use of open source/weight models +with AutoGen. 
+:::
+````
diff --git a/website/docs/topics/non-openai-models/lm-studio.ipynb b/website/docs/topics/non-openai-models/local-lm-studio.ipynb
similarity index 100%
rename from website/docs/topics/non-openai-models/lm-studio.ipynb
rename to website/docs/topics/non-openai-models/local-lm-studio.ipynb
diff --git a/website/docs/topics/non-openai-models/local-vllm.md b/website/docs/topics/non-openai-models/local-vllm.md
new file mode 100644
index 000000000000..841b0323be90
--- /dev/null
+++ b/website/docs/topics/non-openai-models/local-vllm.md
@@ -0,0 +1,158 @@
+# vLLM
+[vLLM](https://github.com/vllm-project/vllm) is a locally run proxy and inference server,
+providing an OpenAI-compatible API. As it performs both the proxying and the inference,
+you don't need to install an additional inference server.
+
+Note: vLLM does not support OpenAI's [Function Calling](https://platform.openai.com/docs/guides/function-calling)
+(usable with AutoGen). However, support is in development and may be available by the time you
+read this.
+
+Running this stack requires the installation of:
+1. AutoGen ([installation instructions](/docs/installation))
+2. vLLM
+
+Note: We recommend using a virtual environment for your stack; see [this article](https://microsoft.github.io/autogen/docs/installation/#create-a-virtual-environment-optional)
+for guidance.
+
+## Installing vLLM
+
+In your terminal:
+
+```bash
+pip install vllm
+```
+
+## Choosing models
+
+vLLM will download new models when you run the server.
+
+The models are sourced from [Hugging Face](https://huggingface.co); a filtered list of text
+generation models is [here](https://huggingface.co/models?pipeline_tag=text-generation&sort=trending),
+and vLLM has a list of [commonly used models](https://docs.vllm.ai/en/latest/models/supported_models.html).
+Use the full model name, e.g. `mistralai/Mistral-7B-Instruct-v0.2`.
+
+## Chat Template
+
+vLLM uses a pre-defined chat template, unless the model has a chat template defined in its config file on Hugging Face.
+This can cause an issue if the chat template doesn't allow `'role': 'system'` messages, as used in AutoGen.
+
+Therefore, we will create a chat template for the Mistral AI Mistral 7B model we are using that allows roles of 'user',
+'assistant', and 'system'.
+
+Create a file named `autogenmistraltemplate.jinja` with the following content:
+```` text
+{{ bos_token }}
+{% for message in messages %}
+    {% if ((message['role'] == 'user' or message['role'] == 'system') != (loop.index0 % 2 == 0)) %}
+        {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}
+    {% endif %}
+
+    {% if (message['role'] == 'user' or message['role'] == 'system') %}
+        {{ '[INST] ' + message['content'] + ' [/INST]' }}
+    {% elif message['role'] == 'assistant' %}
+        {{ message['content'] + eos_token}}
+    {% else %}
+        {{ raise_exception('Only system, user and assistant roles are supported!') }}
+    {% endif %}
+{% endfor %}
+````
+
+````mdx-code-block
+:::warning
+Chat Templates are specific to the model/model family. The example shown here is for Mistral-based models like Mistral 7B and Mixtral 8x7B.
+
+vLLM has a number of [example templates](https://github.com/vllm-project/vllm/tree/main/examples) for models that can be a
+starting point for your chat template. Just remember that the template may need to be adjusted to support 'system' role messages.
+:::
+````
+
+## Running vLLM proxy server
+
+To run vLLM with the chosen model and our chat template, in your terminal:
+
+```bash
+python -m vllm.entrypoints.openai.api_server --model mistralai/Mistral-7B-Instruct-v0.2 --chat-template autogenmistraltemplate.jinja
+```
+
+By default, vLLM will run on 'http://0.0.0.0:8000'.
+
+## Using vLLM with AutoGen
+
+Now that we have the URL for the vLLM proxy server, you can use it within AutoGen in the same
+way as OpenAI or cloud-based proxy servers.
+
+As you are running this proxy server locally, no API key is required. As ```api_key``` is a mandatory
+field for configurations within AutoGen, we put a dummy value in it, as per the example below.
+
+Although we are specifying the model when running the vLLM command, we must still put it into the
+```model``` value in the AutoGen configuration.
+
+
+```python
+from autogen import UserProxyAgent, ConversableAgent
+
+local_llm_config={
+    "config_list": [
+        {
+            "model": "mistralai/Mistral-7B-Instruct-v0.2", # Same as in vLLM command
+            "api_key": "NotRequired", # Not needed
+            "base_url": "http://0.0.0.0:8000/v1" # Your vLLM URL, with '/v1' added
+        }
+    ],
+    "cache_seed": None # Turns off caching, useful for testing different models
+}
+
+# Create the agent that uses the LLM.
+assistant = ConversableAgent("agent", llm_config=local_llm_config, system_message="")
+
+# Create the agent that represents the user in the conversation.
+user_proxy = UserProxyAgent("user", code_execution_config=False, system_message="")
+
+# Let the assistant start the conversation. It will end when the user types exit.
+assistant.initiate_chat(user_proxy, message="How can I help you today?")
+```
+
+Output:
+
+```` text
+agent (to user):
+
+How can I help you today?
+
+--------------------------------------------------------------------------------
+Provide feedback to agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: Why is the sky blue?
+user (to agent):
+
+Why is the sky blue?
+
+--------------------------------------------------------------------------------
+
+>>>>>>>> USING AUTO REPLY...
+agent (to user):
+
+
+The sky appears blue due to a phenomenon called Rayleigh scattering. As sunlight reaches Earth's atmosphere, it interacts with molecules and particles in the air, causing the scattering of light. Blue light has a shorter wavelength and gets scattered more easily than other colors, which is why the sky appears blue during a clear day.
+
+However, during sunrise and sunset, the sky can appear red, orange, or purple due to a different type of scattering called scattering by dust, pollutants, and water droplets, which scatter longer wavelengths of light more effectively.
+
+--------------------------------------------------------------------------------
+Provide feedback to agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: and why does it turn red?
+user (to agent):
+
+and why does it turn red?
+
+--------------------------------------------------------------------------------
+
+>>>>>>>> USING AUTO REPLY...
+agent (to user):
+
+
+During sunrise and sunset, the angle of the sun's rays in the sky is lower, and they have to pass through more of the Earth's atmosphere before reaching an observer. This additional distance results in more scattering of sunlight, which preferentially scatters the longer wavelengths (red, orange, and yellow) more than the shorter wavelengths (blue and green).
+ +The scattering of sunlight by the Earth's atmosphere causes the red, orange, and yellow colors to be more prevalent in the sky during sunrise and sunset, resulting in the beautiful display of colors often referred to as a sunrise or sunset. + +As the sun continues to set, the sky can transition to various shades of purple, pink, and eventually dark blue or black, as the available sunlight continues to decrease and the longer wavelengths are progressively scattered less effectively. + +-------------------------------------------------------------------------------- +Provide feedback to agent. Press enter to skip and use auto-reply, or type 'exit' to end the conversation: exit +```` diff --git a/website/docusaurus.config.js b/website/docusaurus.config.js index d290baf0e756..e0f40e72e7d7 100644 --- a/website/docusaurus.config.js +++ b/website/docusaurus.config.js @@ -212,6 +212,10 @@ module.exports = { { to: "/docs/tutorial/what-next", from: ["/docs/tutorial/what-is-next"], + }, + { + to: "/docs/topics/non-openai-models/local-lm-studio", + from: ["/docs/topics/non-openai-models/lm-studio"], } ], },