From d62de59be087a8f72ee1cc3b5e15c49409bd588d Mon Sep 17 00:00:00 2001
From: Nat Kershaw <nakersha@microsoft.com>
Date: Fri, 2 Jan 2026 14:24:11 -0800
Subject: [PATCH 1/3] Update README

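The "Breaking API changes" section relocated by this patch describes a reordered decoding loop. The following is a minimal Python sketch of that ordering. `FakeGenerator` and its methods `generate_token`, `is_done`, and `get_last_token` are hypothetical stand-ins that mirror the README pseudocode names; they are not the library's API, and the stub's end-of-sequence behavior is an assumption made for illustration only.

```python
class FakeGenerator:
    """Hypothetical stand-in: emits the given tokens, then an end marker."""

    EOS = "<eos>"  # assumed end-of-sequence marker, for illustration only

    def __init__(self, tokens):
        self._pending = list(tokens) + [self.EOS]
        self._last = None
        self._done = False

    def generate_token(self):
        # Produce the next token and flag completion when the end marker appears.
        self._last = self._pending.pop(0)
        if self._last == self.EOS:
            self._done = True

    def is_done(self):
        return self._done

    def get_last_token(self):
        return self._last


def decode(generator):
    """Decoding loop in the order the README gives for v0.11.0."""
    output = []
    while True:
        generator.generate_token()
        # Check for completion right after generating, before reading the token.
        if generator.is_done():
            break
        output.append(generator.get_last_token())
    return output
```

With this stub, `decode(FakeGenerator(["Hello", " world"]))` collects the two tokens and stops before reading the end marker, because the completion check runs before the token is read.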
---
 README.md | 104 ++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 66 insertions(+), 38 deletions(-)
diff --git a/README.md b/README.md
index 0c2f6b1ebc..3fac6d3090 100644
--- a/README.md
+++ b/README.md
@@ -1,29 +1,5 @@
 # ONNX Runtime GenAI
 
-Note: between `v0.11.0` and `v0.10.1`, there is a breaking API usage change to improve model quality during multi-turn conversations.
-
-Previously, the decoding loop could be written as follows.
-
-```
-while not IsDone():
-    GenerateToken()
-    GetLastToken()
-    PrintLastToken()
-```
-
-In 0.11.0, the decoding loop should now be written as follows.
-
-```
-while True:
-    GenerateToken()
-    if IsDone():
-        break
-    GetLastToken()
-    PrintLastToken()
-```
-
-Please read [this PR's description](https://github.com/microsoft/onnxruntime-genai/pull/1849) for more information.
-
 ## Status
 
 [![Latest version](https://img.shields.io/nuget/vpre/Microsoft.ML.OnnxRuntimeGenAI.Managed?label=latest)](https://www.nuget.org/packages/Microsoft.ML.OnnxRuntimeGenAI.Managed/absoluteLatest)
@@ -32,20 +8,22 @@ Please read [this PR's description](https://github.com/microsoft/onnxruntime-gen
 
 ## Description
 
-Run generative AI models with ONNX Runtime. This API gives you an easy, flexible and performant way of running LLMs on device. It implements the generative AI loop for ONNX models, including pre and post processing, inference with ONNX Runtime, logits processing, search and sampling, and KV cache management.
+Run generative AI models with ONNX Runtime. This API gives you an easy, flexible, and performant way of running LLMs on device. It implements the generative AI loop for ONNX models, including pre- and post-processing, inference with ONNX Runtime, logits processing, search and sampling, KV cache management, and grammar specification for tool calling.
+
+ONNX Runtime GenAI powers Foundry Local, Windows ML, and the Visual Studio Code AI Toolkit.
 
 See documentation at the [ONNX Runtime website](https://onnxruntime.ai/docs/genai) for more details.
 
-|Support matrix|Supported now|Under development|On the roadmap|
+| Support matrix | Supported now | Under development | On the roadmap|
 | -------------- | ------------- | ----------------- | -------------- |
-| Model architectures | AMD OLMo <br/> ChatGLM <br/> DeepSeek <br/> ERNIE 4.5 <br/> Gemma <br/> gpt-oss <br/> Granite <br/> Llama <br/> Mistral <br/> Nemotron <br/> Phi (language + vision) <br/> Qwen <br/> SmolLM3 <br/> Whisper | Stable diffusion | Multi-modal models |
+| Model architectures | ChatGLM</br>DeepSeek</br>Ernie</br>Gemma</br>GPTOSS</br>Granite</br>Llama</br>Mistral</br>Nemotron</br>OLMo</br>Phi</br>Phi3V</br>Phi4MM</br>Qwen</br>Qwen25VLText</br>SmolLM3</br>Whisper</br>| Stable diffusion ||
 | API| Python <br/>C# <br/>C/C++ <br/> Java ^ | Objective-C ||
-| Platform | Linux <br/> Windows <br/>Mac ^ <br/>Android ^  || iOS |||
-| Architecture | x86 <br/> x64 <br/> Arm64 ~ ||||
+| O/S | Linux <br/> Windows <br/>Mac  <br/>Android   || iOS |||
+| Architecture | x86 <br/> x64 <br/> arm64 ||||
 | Hardware Acceleration | CPU <br/> CUDA <br/> DirectML <br/> NvTensorRtRtx (TRT-RTX) <br/> OpenVINO <br/> QNN <br/> WebGPU | | AMD GPU |
 | Features | Multi-LoRA <br/> Continuous decoding <br/> Constrained decoding | | Speculative decoding |
 
-\~ Windows builds available, requires build from source for other platforms
+^ Requires build from source
 
 ## Installation
 
@@ -60,7 +38,7 @@ See [installation instructions](https://onnxruntime.ai/docs/genai/howto/install)
    ```
 
 2. Install the API
-   
+
    ```shell
    pip install numpy
    pip install --pre onnxruntime-genai
@@ -113,30 +91,80 @@ See [installation instructions](https://onnxruntime.ai/docs/genai/howto/install)
    del generator
    ```
 
-### Choosing the Right Examples: Release vs. Main Branch
+### Choose the correct version of the examples
 
-Due to the evolving nature of this project and ongoing feature additions, examples in the `main` branch may not always align with the latest stable release. This section outlines how to ensure compatibility between the examples and the corresponding version. The majority of the steps would remain same. Just the package installation and the model example file would change.
+Because this project is still evolving, examples in the `main` branch may not always match the latest stable release. This section describes how to get the examples that match the version you have installed.
 
 ### Stable version
-Install the package according to the [installation instructions](https://onnxruntime.ai/docs/genai/howto/install). Let's say you installed the 0.10.1 version of ONNX Runtime GenAI, so the instructions would look like this:
+
+Install the package according to the [installation instructions](https://onnxruntime.ai/docs/genai/howto/install). For example, install the Python package.
+
+```bash
+pip install onnxruntime-genai
+```
+
+Get the version of the package
+
+```bash
+pip list | grep onnxruntime-genai
+```
+
+Checkout the version of the examples that correspond to that release.
 
 ```bash
 # Clone the repo
 git clone https://github.com/microsoft/onnxruntime-genai.git && cd onnxruntime-genai
 # Checkout the branch for the version you are using
-git checkout v0.10.1
+git checkout v0.11.4
 cd examples
 ```
 
-### Nightly version (Main Branch)
-Build the package from source using these [instructions](https://onnxruntime.ai/docs/genai/howto/build-from-source.html). Now just go to the folder location where all the examples are present.
+### Nightly version (main branch)
+
+Check out the main branch of the repo.
 
 ```bash
-# Clone the repo
 git clone https://github.com/microsoft/onnxruntime-genai.git && cd onnxruntime-genai
+```
+
+Build from source, using these [instructions](https://onnxruntime.ai/docs/genai/howto/build-from-source.html). For example, to build the Python wheel:
+
+```bash
+python build.py
+```
+
+Navigate to the examples folder in the main branch.
+
+```bash
 cd examples
 ```
 
+## Breaking API changes
+
+### v0.11.0
+
+Between `v0.10.1` and `v0.11.0`, the API usage changed in a breaking way to improve model quality during multi-turn conversations.
+
+Previously, the decoding loop could be written as follows.
+
+```
+while not IsDone():
+    GenerateToken()
+    GetLastToken()
+    PrintLastToken()
+```
+
+From `v0.11.0` onward, the decoding loop should be written as follows.
+
+```
+while True:
+    GenerateToken()
+    if IsDone():
+        break
+    GetLastToken()
+    PrintLastToken()
+```
+
 ## Roadmap
 
 See the [Discussions](https://github.com/microsoft/onnxruntime-genai/discussions) to request new features and up-vote existing requests.

From 9efdb31f9d0f8af9be0031b34359ad3c3464df70 Mon Sep 17 00:00:00 2001
From: Copilot <198982749+Copilot@users.noreply.github.com>
Date: Wed, 7 Jan 2026 14:25:40 -0800
Subject: [PATCH 2/3] Add Windows alternative for package version check command
 (#1936)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Addresses review feedback on #1934 by providing a Windows-friendly
alternative command for checking the installed package version.

**Changes:**
- Added platform-specific commands in README.md for `pip list` filtering
  - Linux/Mac: `pip list | grep onnxruntime-genai`
  - Windows: `pip list | findstr "onnxruntime-genai"`

The Windows command uses `findstr` because `grep` is not available by
default in Windows shells, so the documented steps work on every platform.
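
As a shell-independent alternative (not part of this change), the installed version can also be read with Python's standard `importlib.metadata`. The sketch below is illustrative only: the helper names `installed_version` and `examples_tag` are hypothetical, and the `v<version>` tag format is an assumption based on the README's `git checkout v0.11.4` example. It may not hold for pre-release builds.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="onnxruntime-genai"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def examples_tag(package_version):
    """Build the git tag name, assuming tags are the version prefixed with 'v'."""
    return f"v{package_version}"
```

For example, if `installed_version()` returns `0.11.4`, `examples_tag` yields `v0.11.4`, which can be passed to `git checkout`.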

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: natke <3302433+natke@users.noreply.github.com>
---
 README.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/README.md b/README.md
index 3fac6d3090..28c96bb531 100644
--- a/README.md
+++ b/README.md
@@ -105,10 +105,16 @@ pip install onnxruntime-genai
 
 Get the version of the package
 
+Linux/Mac:
 ```bash
 pip list | grep onnxruntime-genai
 ```
 
+Windows:
+```bash
+pip list | findstr "onnxruntime-genai"
+```
+
 Checkout the version of the examples that correspond to that release.
 
 ```bash

From 728e679f35005a1aadaae8c2c23346d292a832b1 Mon Sep 17 00:00:00 2001
From: Copilot <198982749+Copilot@users.noreply.github.com>
Date: Wed, 7 Jan 2026 14:29:30 -0800
Subject: [PATCH 3/3] Update README model architectures: remove Qwen25VLText,
 add Qwen-2.5VL and Fara (#1937)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Addresses feedback from PR #1934 to update the supported model
architectures list in README.md.

**Changes:**
- Removed `Qwen25VLText` (superseded by Qwen-2.5VL which supports its
functionality)
- Added `Qwen-2.5VL` (vision-language model with 3D position embeddings)
- Added `Fara` (vision-language model architecture)

Models are kept in alphabetical order within the support matrix table.

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: natke <3302433+natke@users.noreply.github.com>
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 28c96bb531..2fd0d344f5 100644
--- a/README.md
+++ b/README.md
@@ -16,7 +16,7 @@ See documentation at the [ONNX Runtime website](https://onnxruntime.ai/docs/gena
 
 | Support matrix | Supported now | Under development | On the roadmap|
 | -------------- | ------------- | ----------------- | -------------- |
-| Model architectures | ChatGLM</br>DeepSeek</br>Ernie</br>Gemma</br>GPTOSS</br>Granite</br>Llama</br>Mistral</br>Nemotron</br>OLMo</br>Phi</br>Phi3V</br>Phi4MM</br>Qwen</br>Qwen25VLText</br>SmolLM3</br>Whisper</br>| Stable diffusion ||
+| Model architectures | ChatGLM<br/>DeepSeek<br/>Ernie<br/>Fara<br/>Gemma<br/>GPTOSS<br/>Granite<br/>Llama<br/>Mistral<br/>Nemotron<br/>OLMo<br/>Phi<br/>Phi3V<br/>Phi4MM<br/>Qwen<br/>Qwen-2.5VL<br/>SmolLM3<br/>Whisper | Stable diffusion ||
 | API| Python <br/>C# <br/>C/C++ <br/> Java ^ | Objective-C ||
 | O/S | Linux <br/> Windows <br/>Mac  <br/>Android   || iOS |||
 | Architecture | x86 <br/> x64 <br/> arm64 ||||