From 21f89d0aee919fb92db69b520507b99f4f94801c Mon Sep 17 00:00:00 2001 From: Genevieve Warren <24882762+gewarren@users.noreply.github.com> Date: Wed, 10 Dec 2025 18:31:32 -0800 Subject: [PATCH 1/5] general freshness update for AI docs --- docs/ai/conceptual/agents.md | 11 +- docs/ai/conceptual/data-ingestion.md | 2 +- docs/ai/conceptual/rag.md | 11 +- docs/ai/conceptual/vector-databases.md | 9 +- docs/ai/dotnet-ai-ecosystem.md | 13 +- docs/ai/evaluation/evaluate-safety.md | 6 +- docs/ai/ichatclient.md | 155 ++++++++++++++ .../{vector-databases.md => sk-connectors.md} | 0 docs/ai/index.yml | 24 +-- docs/ai/microsoft-extensions-ai.md | 189 ++---------------- docs/ai/overview.md | 2 +- .../ai/quickstarts/build-vector-search-app.md | 8 +- docs/ai/quickstarts/create-assistant.md | 2 +- docs/ai/toc.yml | 10 +- 14 files changed, 222 insertions(+), 220 deletions(-) create mode 100644 docs/ai/ichatclient.md rename docs/ai/includes/{vector-databases.md => sk-connectors.md} (100%) diff --git a/docs/ai/conceptual/agents.md b/docs/ai/conceptual/agents.md index 91b66d6ca8273..b053d22428a90 100644 --- a/docs/ai/conceptual/agents.md +++ b/docs/ai/conceptual/agents.md @@ -3,7 +3,7 @@ title: Agents description: Introduction to agents author: luisquintanilla ms.author: luquinta -ms.date: 10/01/2025 +ms.date: 12/10/2025 ms.topic: concept-article --- @@ -62,6 +62,7 @@ Agentic workflows can be orchestrated in a variety of ways. The following are a - [Concurrent](#concurrent) - [Handoff](#handoff) - [Group chat](#group-chat) +- [Magentic](#magentic) #### Sequential @@ -87,10 +88,12 @@ Agents collaborate in a shared conversation, exchanging insights in real-time. ![Group chat orchestration: User and Agents A, B, C collaborate via GroupChat to produce final output](../media/agents/groupchat-workflow.png) +#### Magentic + +A lead agent directs other agents. + ## How can I get started building agents in .NET? 
The building blocks in and supply the foundations for agents by providing modular components for AI models, tools, and data. -These components serve as the foundation for Microsoft Agent Framework. - -For more information, see [Microsoft Agent Framework](/agent-framework/overview/agent-framework-overview). +These components serve as the foundation for Microsoft Agent Framework. For more information, see [Microsoft Agent Framework](/agent-framework/overview/agent-framework-overview). diff --git a/docs/ai/conceptual/data-ingestion.md b/docs/ai/conceptual/data-ingestion.md index 430efb4600ae3..59fc6376dc744 100644 --- a/docs/ai/conceptual/data-ingestion.md +++ b/docs/ai/conceptual/data-ingestion.md @@ -110,7 +110,7 @@ These processors use [Microsoft.Extensions.AI.Abstractions](https://www.nuget.or stores processed chunks into a data store for later retrieval. Using Microsoft.Extensions.AI and [Microsoft.Extensions.VectorData.Abstractions](https://www.nuget.org/packages/Microsoft.Extensions.VectorData.Abstractions), the library provides the class that supports storing chunks in any vector store supported by Microsoft.Extensions.VectorData. -Vectore stores include popular options like [Qdrant](https://www.nuget.org/packages/Microsoft.SemanticKernel.Connectors.Qdrant), [SQL Server](https://www.nuget.org/packages/Microsoft.SemanticKernel.Connectors.SqlServer), [CosmosDB](https://www.nuget.org/packages/Microsoft.SemanticKernel.Connectors.CosmosNoSQL), [MongoDB](https://www.nuget.org/packages/Microsoft.SemanticKernel.Connectors.MongoDB), [ElasticSearch](https://www.nuget.org/packages/Elastic.SemanticKernel.Connectors.Elasticsearch), and many more. The writer can also automatically generate embeddings for your chunks using Microsoft.Extensions.AI, readying them for semantic search and retrieval scenarios. 
+Vector stores include popular options like [Qdrant](https://www.nuget.org/packages/Microsoft.SemanticKernel.Connectors.Qdrant), [SQL Server](https://www.nuget.org/packages/Microsoft.SemanticKernel.Connectors.SqlServer), [CosmosDB](https://www.nuget.org/packages/Microsoft.SemanticKernel.Connectors.CosmosNoSQL), [MongoDB](https://www.nuget.org/packages/Microsoft.SemanticKernel.Connectors.MongoDB), [ElasticSearch](https://www.nuget.org/packages/Elastic.SemanticKernel.Connectors.Elasticsearch), and many more. The writer can also automatically generate embeddings for your chunks using Microsoft.Extensions.AI, readying them for semantic search and retrieval scenarios. ```csharp OpenAIClient openAIClient = new( diff --git a/docs/ai/conceptual/rag.md b/docs/ai/conceptual/rag.md index c0c90a6b9436c..db67b8a72ac45 100644 --- a/docs/ai/conceptual/rag.md +++ b/docs/ai/conceptual/rag.md @@ -1,11 +1,8 @@ --- title: "Integrate Your Data into AI Apps with Retrieval-Augmented Generation" description: "Learn how retrieval-augmented generation lets you use your data with LLMs to generate better completions in .NET." -ms.topic: concept-article #Don't change. -ms.date: 05/29/2025 - -#customer intent: As a .NET developer, I want to understand how retrieval-augmented generation works in .NET so that LLMs can use my data sources to provide more valuable completions. - +ms.topic: concept-article +ms.date: 12/10/2025 --- # Retrieval-augmented generation (RAG) provides LLM knowledge @@ -36,6 +33,6 @@ To perform RAG, you must process each data source that you want to use for retri - **Converting the text to vectors**: These are embeddings. Vectors are numerical representations of concepts converted to number sequences, which make it easy for computers to understand the relationships between those concepts. 
- **Links between source data and embeddings**: This information is stored as metadata on the chunks you created, which are then used to help the LLMs generate citations while generating responses. -## Related content +## See also -- [Prompt engineering](prompt-engineering-dotnet.md) +- [Data ingestion](data-ingestion.md) diff --git a/docs/ai/conceptual/vector-databases.md b/docs/ai/conceptual/vector-databases.md index 7a543fad8ae17..36fbfed5b4f4e 100644 --- a/docs/ai/conceptual/vector-databases.md +++ b/docs/ai/conceptual/vector-databases.md @@ -1,11 +1,8 @@ --- title: "Using Vector Databases to Extend LLM Capabilities" description: "Learn how vector databases extend LLM capabilities by storing and processing embeddings in .NET." -ms.topic: concept-article #Don't change. +ms.topic: concept-article ms.date: 05/29/2025 - -#customer intent: As a .NET developer, I want to learn how vector databases store and process embeddings in .NET so I can make more data available to LLMs in my apps. - --- # Vector databases for .NET + AI @@ -44,9 +41,9 @@ Other benefits of the RAG pattern include: - Overcome LLM tokens limits - the heavy lifting is done through the database vector search. - Reduce the costs from frequent fine-tuning on updated data. -## Available vector database solutions +## Semantic Kernel vector database solutions -[!INCLUDE [vector-databases](../includes/vector-databases.md)] +[!INCLUDE [sk-connectors](../includes/sk-connectors.md)] ## Related content diff --git a/docs/ai/dotnet-ai-ecosystem.md b/docs/ai/dotnet-ai-ecosystem.md index b5448d3fd81c2..8195c9407f78f 100644 --- a/docs/ai/dotnet-ai-ecosystem.md +++ b/docs/ai/dotnet-ai-ecosystem.md @@ -1,7 +1,7 @@ --- title: .NET + AI ecosystem tools and SDKs description: This article provides an overview of the ecosystem of SDKs and tools available to .NET developers integrating AI into their applications. 
-ms.date: 11/04/2025 +ms.date: 12/10/2025 ms.topic: overview --- @@ -18,6 +18,12 @@ The .NET ecosystem provides many powerful tools, libraries, and services to deve `Microsoft.Extensions.AI` provides abstractions that can be implemented by various services, all adhering to the same core concepts. This library is not intended to provide APIs tailored to any specific provider's services. The goal of `Microsoft.Extensions.AI` is to act as a unifying layer within the .NET ecosystem, enabling developers to choose their preferred frameworks and libraries while ensuring seamless integration and collaboration across the ecosystem. +## Other AI-related Microsoft.Extensions libraries + +The [📦 Microsoft.Extensions.VectorData.Abstractions package](https://www.nuget.org/packages/Microsoft.Extensions.VectorData.Abstractions/) provides a unified layer of abstractions for interacting with a variety of vector stores. It lets you store processed chunks in vector stores such as Qdrant, Azure SQL, CosmosDB, MongoDB, ElasticSearch, and many more. For more information, see [Build a .NET AI vector search app](quickstarts/build-vector-search-app.md). + +The [📦 Microsoft.Extensions.DataIngestion package](https://www.nuget.org/packages/Microsoft.Extensions.DataIngestion) provides foundational .NET building blocks for data ingestion. It enables developers to read, process, and prepare documents for AI and machine learning workflows, especially retrieval-augmented generation (RAG) scenarios. For more information, see [Data ingestion](conceptual/data-ingestion.md). + ## Microsoft Agent Framework If you want to use low-level services, such as and , you can reference the `Microsoft.Extensions.AI.Abstractions` package directly from your app. However, if you want to build agentic AI applications with higher-level orchestration capabilities, you should use [Microsoft Agent Framework](/agent-framework/overview/agent-framework-overview). 
Agent Framework builds on the `Microsoft.Extensions.AI.Abstractions` package and provides concrete implementations of for different services, including OpenAI, Azure OpenAI, Azure AI Foundry, and more. @@ -80,14 +86,9 @@ For example, you can use [Ollama](https://ollama.com/) to [connect to local AI m > [!NOTE] > The preceding SLMs can also be hosted on other services, such as Azure. -## Connect to vector databases and services - -[!INCLUDE [vector-databases](includes/vector-databases.md)] - ## Next steps - [What is Microsoft Agent Framework?](/agent-framework/overview/agent-framework-overview) -- [What is Semantic Kernel?](/semantic-kernel/overview/) - [Quickstart - Summarize text using Azure AI chat app with .NET](quickstarts/prompt-model.md) [phi3]: https://azure.microsoft.com/products/phi-3 diff --git a/docs/ai/evaluation/evaluate-safety.md b/docs/ai/evaluation/evaluate-safety.md index 7a0c686130802..4dc07a69482ca 100644 --- a/docs/ai/evaluation/evaluate-safety.md +++ b/docs/ai/evaluation/evaluate-safety.md @@ -42,9 +42,9 @@ Complete the following steps to create an MSTest project. 
```dotnetcli dotnet add package Azure.AI.OpenAI dotnet add package Azure.Identity - dotnet add package Microsoft.Extensions.AI.Abstractions --prerelease - dotnet add package Microsoft.Extensions.AI.Evaluation --prerelease - dotnet add package Microsoft.Extensions.AI.Evaluation.Reporting --prerelease + dotnet add package Microsoft.Extensions.AI.Abstractions + dotnet add package Microsoft.Extensions.AI.Evaluation + dotnet add package Microsoft.Extensions.AI.Evaluation.Reporting dotnet add package Microsoft.Extensions.AI.Evaluation.Safety --prerelease dotnet add package Microsoft.Extensions.AI.OpenAI --prerelease dotnet add package Microsoft.Extensions.Configuration diff --git a/docs/ai/ichatclient.md b/docs/ai/ichatclient.md new file mode 100644 index 0000000000000..8d95fce2b40f8 --- /dev/null +++ b/docs/ai/ichatclient.md @@ -0,0 +1,155 @@ +--- +title: Use the IChatClient interface +description: Learn how to use the IChatClient interface to get model responses and call tools +ms.date: 12/10/2025 +--- + +# Use the IChatClient interface + +The interface defines a client abstraction responsible for interacting with AI services that provide chat capabilities. It includes methods for sending and receiving messages with multi-modal content (such as text, images, and audio), either as a complete set or streamed incrementally. Additionally, it allows for retrieving strongly typed services provided by the client or its underlying services. + +.NET libraries that provide clients for language models and services can provide an implementation of the `IChatClient` interface. Any consumers of the interface are then able to interoperate seamlessly with these models and services via the abstractions. You can see a simple implementation at [Sample implementations of IChatClient and IEmbeddingGenerator](advanced/sample-implementations.md). + +## Request a chat response + +With an instance of , you can call the method to send a request and get a response. 
The request is composed of one or more messages, each of which is composed of one or more pieces of content. Accelerator methods exist to simplify common cases, such as constructing a request for a single piece of text content. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI/Program.cs"::: + +The core `IChatClient.GetResponseAsync` method accepts a list of messages. This list represents the history of all messages that are part of the conversation. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.GetResponseAsyncArgs/Program.cs" id="Snippet1"::: + +The that's returned from `GetResponseAsync` exposes a list of instances that represent one or more messages generated as part of the operation. In common cases, there is only one response message, but in some situations, there can be multiple messages. The message list is ordered, such that the last message in the list represents the final message to the request. To provide all of those response messages back to the service in a subsequent request, you can add the messages from the response back into the messages list. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.AddMessages/Program.cs" id="Snippet1"::: + +## Request a streaming chat response + +The inputs to are identical to those of `GetResponseAsync`. However, rather than returning the complete response as part of a object, the method returns an where `T` is , providing a stream of updates that collectively form the single response. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.GetStreamingResponseAsync/Program.cs" id="Snippet1"::: + +> [!TIP] +> Streaming APIs are nearly synonymous with AI user experiences. C# enables compelling scenarios with its `IAsyncEnumerable` support, allowing for a natural and efficient way to stream data. + +As with `GetResponseAsync`, you can add the updates from back into the messages list. 
Because the updates are individual pieces of a response, you can use helpers like to compose one or more updates back into a single instance. + +Helpers like compose a and then extract the composed messages from the response and add them to a list. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.AddMessages/Program.cs" id="Snippet2"::: + +## Tool calling + +Some models and services support _tool calling_. To gather additional information, you can configure the with information about tools (usually .NET methods) that the model can request the client to invoke. Instead of sending a final response, the model requests a function invocation with specific arguments. The client then invokes the function and sends the results back to the model with the conversation history. The `Microsoft.Extensions.AI.Abstractions` library includes abstractions for various message content types, including function call requests and results. While `IChatClient` consumers can interact with this content directly, `Microsoft.Extensions.AI` provides helpers that can enable automatically invoking the tools in response to corresponding requests. The `Microsoft.Extensions.AI.Abstractions` and `Microsoft.Extensions.AI` libraries provide the following types: + +- : Represents a function that can be described to an AI model and invoked. +- : Provides factory methods for creating `AIFunction` instances that represent .NET methods. +- : Wraps an `IChatClient` as another `IChatClient` that adds automatic function-invocation capabilities. + +The following example demonstrates a random function invocation (this example depends on the [📦 OllamaSharp](https://www.nuget.org/packages/OllamaSharp) NuGet package): + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.ToolCalling/Program.cs"::: + +The preceding code: + +- Defines a function named `GetCurrentWeather` that returns a random weather forecast. 
+- Instantiates a with an `OllamaSharp.OllamaApiClient` and configures it to use function invocation. +- Calls `GetStreamingResponseAsync` on the client, passing a prompt and a list of tools that includes a function created with . +- Iterates over the response, printing each update to the console. + +For more information about creating AI functions, see [Access data in AI functions](how-to/access-data-in-functions.md). + +You can also use Model Context Protocol (MCP) tools with your `IChatClient`. For more information, see [Build a minimal MCP client](./quickstarts/build-mcp-client.md). + +## Cache responses + +If you're familiar with [caching in .NET](../core/extensions/caching.md), it's good to know that provides delegating `IChatClient` implementations for caching. The is an `IChatClient` that layers caching around another arbitrary `IChatClient` instance. When a novel chat history is submitted to the `DistributedCachingChatClient`, it forwards it to the underlying client and then caches the response before sending it back to the consumer. The next time the same history is submitted, such that a cached response can be found in the cache, the `DistributedCachingChatClient` returns the cached response rather than forwarding the request along the pipeline. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.CacheResponses/Program.cs"::: + +This example depends on the [📦 Microsoft.Extensions.Caching.Memory](https://www.nuget.org/packages/Microsoft.Extensions.Caching.Memory) NuGet package. For more information, see [Caching in .NET](../core/extensions/caching.md). + +## Use telemetry + +Another example of a delegating chat client is the . This implementation adheres to the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/). Similar to other `IChatClient` delegators, it layers metrics and spans around other arbitrary `IChatClient` implementations. 
+ +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.UseTelemetry/Program.cs"::: + +(The preceding example depends on the [📦 OpenTelemetry.Exporter.Console](https://www.nuget.org/packages/OpenTelemetry.Exporter.Console) NuGet package.) + +Alternatively, the and corresponding method provide a simple way to write log entries to an for every request and response. + +## Provide options + +Every call to or can optionally supply a instance containing additional parameters for the operation. The most common parameters among AI models and services show up as strongly typed properties on the type, such as . Other parameters can be supplied by name in a weakly typed manner, via the dictionary, or via an options instance that the underlying provider understands, using the property. + +You can also specify options when building an `IChatClient` with the fluent API by chaining a call to the extension method. This delegating client wraps another client and invokes the supplied delegate to populate a `ChatOptions` instance for every call. For example, to ensure that the property defaults to a particular model name, you can use code like the following: + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.ProvideOptions/Program.cs"::: + +## Functionality pipelines + +`IChatClient` instances can be layered to create a pipeline of components that each add additional functionality. These components can come from `Microsoft.Extensions.AI`, other NuGet packages, or custom implementations. This approach allows you to augment the behavior of the `IChatClient` in various ways to meet your specific needs. 
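To make the layering concrete inline, the following sketch assembles such a pipeline. It's an illustrative sketch rather than the article's project snippet: it assumes a local Ollama server reachable through the [📦 OllamaSharp](https://www.nuget.org/packages/OllamaSharp) NuGet package, and the endpoint, model name, and telemetry source name are placeholders.

```csharp
using Microsoft.Extensions.AI;
using OllamaSharp;

// Placeholder endpoint and model name; substitute values for your environment.
IChatClient innerClient = new OllamaApiClient(
    new Uri("http://localhost:11434"), "llama3.1");

// Each Use* call wraps the previous client, so a request flows through
// function invocation and telemetry before reaching the underlying client.
IChatClient client = new ChatClientBuilder(innerClient)
    .UseFunctionInvocation()
    .UseOpenTelemetry(sourceName: "demo-chat") // Arbitrary source name.
    .Build();

ChatOptions options = new()
{
    Tools =
    [
        AIFunctionFactory.Create(
            () => Random.Shared.NextDouble() > 0.5 ? "It's sunny" : "It's raining",
            "GetCurrentWeather",
            "Gets the current weather"),
    ]
};

Console.WriteLine(await client.GetResponseAsync("Should I bring an umbrella?", options));
```

Because the function-invocation layer sits inside the pipeline, the model's tool-call request is handled transparently before the final text reaches the caller.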
Consider the following code snippet that layers a distributed cache, function invocation, and OpenTelemetry tracing around a sample chat client: + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.FunctionalityPipelines/Program.cs" id="Snippet1"::: + +## Custom `IChatClient` middleware + +To add additional functionality, you can implement `IChatClient` directly or use the class. This class serves as a base for creating chat clients that delegate operations to another `IChatClient` instance. It simplifies chaining multiple clients, which allows calls to pass through to an underlying client. + +The `DelegatingChatClient` class provides default implementations for methods like `GetResponseAsync`, `GetStreamingResponseAsync`, and `Dispose`, which forward calls to the inner client. A derived class can then override only the methods it needs to augment the behavior, while delegating other calls to the base implementation. This approach is useful for creating flexible and modular chat clients that are easy to extend and compose. + +The following is an example class derived from `DelegatingChatClient` that uses the [System.Threading.RateLimiting](https://www.nuget.org/packages/System.Threading.RateLimiting) library to provide rate-limiting functionality. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/AI.Shared/RateLimitingChatClient.cs"::: + +As with other `IChatClient` implementations, the `RateLimitingChatClient` can be composed: + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.CustomClientMiddle/Program.cs"::: + +To simplify the composition of such components with others, component authors should create a `Use*` extension method for registering the component into a pipeline. 
For example, consider the following `UseRateLimiting` extension method: + +:::code language="csharp" source="snippets/microsoft-extensions-ai/AI.Shared/RateLimitingChatClientExtensions.cs" id="one"::: + +Such extensions can also query for relevant services from the DI container; the used by the pipeline is passed in as an optional parameter: + +:::code language="csharp" source="snippets/microsoft-extensions-ai/AI.Shared/RateLimitingChatClientExtensions.OptionalOverload.cs" id="two"::: + +Now it's easy for the consumer to use this in their pipeline, for example: + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.ConsumeClientMiddleware/Program.cs" id="SnippetUse"::: + +The previous extension methods demonstrate using a `Use` method on . `ChatClientBuilder` also provides overloads that make it easier to write such delegating handlers. For example, in the earlier `RateLimitingChatClient` example, the overrides of `GetResponseAsync` and `GetStreamingResponseAsync` only need to do work before and after delegating to the next client in the pipeline. To achieve the same thing without writing a custom class, you can use an overload of `Use` that accepts a delegate that's used for both `GetResponseAsync` and `GetStreamingResponseAsync`, reducing the boilerplate required: + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.UseExample/Program.cs"::: + +For scenarios where you need a different implementation for `GetResponseAsync` and `GetStreamingResponseAsync` to handle their unique return types, you can use the overload that accepts a delegate for each. + +## Dependency injection + + implementations are often provided to an application via [dependency injection (DI)](../core/extensions/dependency-injection.md). In the following example, an is added into the DI container, as is an `IChatClient`. 
The registration for the `IChatClient` uses a builder that creates a pipeline containing a caching client (which then uses an `IDistributedCache` retrieved from DI) and the sample client. The injected `IChatClient` can be retrieved and used elsewhere in the app. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.DependencyInjection/Program.cs"::: + +What instance and configuration is injected can differ based on the current needs of the application, and multiple pipelines can be injected with different keys. + +## Stateless vs. stateful clients + +_Stateless_ services require all relevant conversation history to be sent back on every request. In contrast, _stateful_ services keep track of the history and require only additional messages to be sent with a request. The interface is designed to handle both stateless and stateful AI services. + +When working with a stateless service, callers maintain a list of all messages. They add in all received response messages and provide the list back on subsequent interactions. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.StatelessStateful/Program.cs" id="Snippet1"::: + +For stateful services, you might already know the identifier used for the relevant conversation. You can put that identifier into . Usage then follows the same pattern, except there's no need to maintain a history manually. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.StatelessStateful/Program.cs" id="Snippet2"::: + +Some services might support automatically creating a conversation ID for a request that doesn't have one, or creating a new conversation ID that represents the current state of the conversation after incorporating the last round of messages. In such cases, you can transfer the over to the `ChatOptions.ConversationId` for subsequent requests. 
For example: + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.StatelessStateful/Program.cs" id="Snippet3"::: + +If you don't know ahead of time whether the service is stateless or stateful, you can check the response and act based on its value. If it's set, then that value is propagated to the options and the history is cleared so as to not resend the same history again. If the response `ConversationId` isn't set, then the response message is added to the history so that it's sent back to the service on the next turn. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.StatelessStateful/Program.cs" id="Snippet4"::: diff --git a/docs/ai/includes/vector-databases.md b/docs/ai/includes/sk-connectors.md similarity index 100% rename from docs/ai/includes/vector-databases.md rename to docs/ai/includes/sk-connectors.md diff --git a/docs/ai/index.yml b/docs/ai/index.yml index 87a1ce1e73ce3..205d6b55a00a1 100644 --- a/docs/ai/index.yml +++ b/docs/ai/index.yml @@ -8,15 +8,15 @@ metadata: description: Samples, tutorials, and education for building AI apps with .NET ms.topic: landing-page ms.service: dotnet - ms.date: 8/26/2025 + ms.date: 12/10/2025 author: alexwolfmsft ms.author: alexwolf # linkListType: architecture | concept | deploy | download | get-started | how-to-guide | tutorial | overview | quickstart | reference | sample | training | tutorial | video | whats-new landingContent: -# Cards and links should be based on top customer tasks or top subjects -# Start card title with a verb + # Cards and links should be based on top customer tasks or top subjects + # Start card title with a verb # Card - title: Get started @@ -45,16 +45,16 @@ landingContent: links: - text: How generative AI and LLMs work url: conceptual/how-genai-and-llms-work.md - - text: Understand tokens - url: conceptual/understanding-tokens.md + - text: Build agents to automate workflows + url: conceptual/agents.md - text: Preserve semantic 
meaning with embeddings url: conceptual/embeddings.md - text: Semantic search with vector databases url: conceptual/vector-databases.md - - text: Prompt engineering - url: conceptual/prompt-engineering-dotnet.md - - text: Evaluation libraries - url: conceptual/evaluation-libraries.md + - text: Data ingestion + url: conceptual/data-ingestion.md + - text: Retrieval-augmented generation + url: conceptual/rag.md # Card - title: Common tasks @@ -77,13 +77,13 @@ landingContent: linkLists: - linkListType: tutorial links: - - text: Fundamentals of Azure OpenAI Service - url: /training/modules/explore-azure-openai + - text: Plan and prepare to develop AI solutions on Azure + url: /training/modules/prepare-azure-ai-development/ - text: Generate conversations Azure OpenAI completions url: /training/modules/open-ai-dotnet-text-completions - text: .NET enterprise chat sample using RAG url: get-started-app-chat-template.md - - text: Develop AI agents using Azure OpenAI + - text: Develop generative AI apps with Azure OpenAI and Semantic Kernel url: /training/paths/develop-ai-agents-azure-open-ai-semantic-kernel-sdk # Card diff --git a/docs/ai/microsoft-extensions-ai.md b/docs/ai/microsoft-extensions-ai.md index 3a8b236ad892d..31004902da710 100644 --- a/docs/ai/microsoft-extensions-ai.md +++ b/docs/ai/microsoft-extensions-ai.md @@ -1,9 +1,7 @@ --- title: Microsoft.Extensions.AI libraries description: Learn how to use the Microsoft.Extensions.AI libraries to integrate and interact with various AI services in your .NET applications. -author: IEvangelist -ms.author: dapine -ms.date: 11/20/2025 +ms.date: 12/10/2025 --- # Microsoft.Extensions.AI libraries @@ -26,177 +24,17 @@ Libraries that provide implementations of the abstractions typically reference o For information about how to install NuGet packages, see [dotnet package add](../core/tools/dotnet-package-add.md) or [Manage package dependencies in .NET applications](../core/tools/dependencies.md). 
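For example, to reference the core library and an OpenAI-based implementation from a project (the implementation package is still prerelease, matching the package commands used elsewhere in this PR):

```dotnetcli
dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.Extensions.AI.OpenAI --prerelease
```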
-## APIs and examples +## APIs and functionality -The following subsections show specific [`IChatClient`](#the-ichatclient-interface) usage examples: - -- [Request a chat response](#request-a-chat-response) -- [Request a streaming chat response](#request-a-streaming-chat-response) -- [Tool calling](#tool-calling) -- [Cache responses](#cache-responses) -- [Use telemetry](#use-telemetry) -- [Provide options](#provide-options) -- [Pipelines of functionality](#functionality-pipelines) -- [Custom `IChatClient` middleware](#custom-ichatclient-middleware) -- [Dependency injection](#dependency-injection) -- [Stateless vs. stateful clients](#stateless-vs-stateful-clients) - -The following sections show specific [`IEmbeddingGenerator`](#the-iembeddinggenerator-interface) usage examples: - -- [Create embeddings](#create-embeddings) -- [Pipelines of functionality](#pipelines-of-functionality) - -- [`IImageGenerator`](#the-iimagegenerator-interface) +- [The `IChatClient` interface](#the-ichatclient-interface) +- [The `IEmbeddingGenerator` interface](#the-iembeddinggenerator-interface) +- [Image generation (experimental)](#image-generation-experimental) ### The `IChatClient` interface -The interface defines a client abstraction responsible for interacting with AI services that provide chat capabilities. It includes methods for sending and receiving messages with multi-modal content (such as text, images, and audio), either as a complete set or streamed incrementally. Additionally, it allows for retrieving strongly typed services provided by the client or its underlying services. - -.NET libraries that provide clients for language models and services can provide an implementation of the `IChatClient` interface. Any consumers of the interface are then able to interoperate seamlessly with these models and services via the abstractions. You can see a simple implementation at [Sample implementations of IChatClient and IEmbeddingGenerator](advanced/sample-implementations.md). 
- -#### Request a chat response - -With an instance of , you can call the method to send a request and get a response. The request is composed of one or more messages, each of which is composed of one or more pieces of content. Accelerator methods exist to simplify common cases, such as constructing a request for a single piece of text content. - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI/Program.cs"::: - -The core `IChatClient.GetResponseAsync` method accepts a list of messages. This list represents the history of all messages that are part of the conversation. - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.GetResponseAsyncArgs/Program.cs" id="Snippet1"::: - -The that's returned from `GetResponseAsync` exposes a list of instances that represent one or more messages generated as part of the operation. In common cases, there is only one response message, but in some situations, there can be multiple messages. The message list is ordered, such that the last message in the list represents the final message to the request. To provide all of those response messages back to the service in a subsequent request, you can add the messages from the response back into the messages list. - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.AddMessages/Program.cs" id="Snippet1"::: - -#### Request a streaming chat response - -The inputs to are identical to those of `GetResponseAsync`. However, rather than returning the complete response as part of a object, the method returns an where `T` is , providing a stream of updates that collectively form the single response. - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.GetStreamingResponseAsync/Program.cs" id="Snippet1"::: - -> [!TIP] -> Streaming APIs are nearly synonymous with AI user experiences. 
C# enables compelling scenarios with its `IAsyncEnumerable` support, allowing for a natural and efficient way to stream data. - -As with `GetResponseAsync`, you can add the updates from back into the messages list. Because the updates are individual pieces of a response, you can use helpers like to compose one or more updates back into a single instance. - -Helpers like compose a and then extract the composed messages from the response and add them to a list. - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.AddMessages/Program.cs" id="Snippet2"::: - -#### Tool calling - -Some models and services support _tool calling_. To gather additional information, you can configure the with information about tools (usually .NET methods) that the model can request the client to invoke. Instead of sending a final response, the model requests a function invocation with specific arguments. The client then invokes the function and sends the results back to the model with the conversation history. The `Microsoft.Extensions.AI.Abstractions` library includes abstractions for various message content types, including function call requests and results. While `IChatClient` consumers can interact with this content directly, `Microsoft.Extensions.AI` provides helpers that can enable automatically invoking the tools in response to corresponding requests. The `Microsoft.Extensions.AI.Abstractions` and `Microsoft.Extensions.AI` libraries provide the following types: - -- : Represents a function that can be described to an AI model and invoked. -- : Provides factory methods for creating `AIFunction` instances that represent .NET methods. -- : Wraps an `IChatClient` as another `IChatClient` that adds automatic function-invocation capabilities. 
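As a minimal sketch of how these types fit together (the `innerClient` variable and the weather behavior are illustrative assumptions, not prescribed by the library):

```csharp
using Microsoft.Extensions.AI;

// Describe a .NET method to the model as a callable tool.
AIFunction getWeather = AIFunctionFactory.Create(
    (string city) => $"It's sunny in {city}.",
    name: "get_weather",
    description: "Gets the current weather for a city.");

// Wrap any IChatClient (innerClient is assumed to exist) so that the model's
// function-call requests are invoked automatically and the results are sent
// back to the model with the conversation history.
IChatClient client = new FunctionInvokingChatClient(innerClient);

ChatResponse response = await client.GetResponseAsync(
    "What's the weather in Oslo?",
    new ChatOptions { Tools = [getWeather] });
```

In a real app, you'd typically compose this via `ChatClientBuilder` (for example, with the `UseFunctionInvocation` extension method) rather than constructing `FunctionInvokingChatClient` directly.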
- 

The following example demonstrates automatic invocation of a function that returns a random weather forecast (this example depends on the [📦 OllamaSharp](https://www.nuget.org/packages/OllamaSharp) NuGet package):

:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.ToolCalling/Program.cs":::

The preceding code:

- Defines a function named `GetCurrentWeather` that returns a random weather forecast.
- Instantiates a with an `OllamaSharp.OllamaApiClient` and configures it to use function invocation.
- Calls `GetStreamingResponseAsync` on the client, passing a prompt and a list of tools that includes a function created with .
- Iterates over the response, printing each update to the console.

For more information about creating AI functions, see [Access data in AI functions](how-to/access-data-in-functions.md).

You can also use Model Context Protocol (MCP) tools with your `IChatClient`. For more information, see [Build a minimal MCP client](./quickstarts/build-mcp-client.md).

#### Cache responses

If you're familiar with [caching in .NET](../core/extensions/caching.md), it's good to know that provides delegating `IChatClient` implementations for caching. The is an `IChatClient` that layers caching around another arbitrary `IChatClient` instance. When a novel chat history is submitted to the `DistributedCachingChatClient`, it forwards it to the underlying client and then caches the response before sending it back to the consumer. The next time the same history is submitted and a matching response exists in the cache, the `DistributedCachingChatClient` returns the cached response rather than forwarding the request along the pipeline.

:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.CacheResponses/Program.cs":::

This example depends on the [📦 Microsoft.Extensions.Caching.Memory](https://www.nuget.org/packages/Microsoft.Extensions.Caching.Memory) NuGet package.
For more information, see [Caching in .NET](../core/extensions/caching.md). - -#### Use telemetry - -Another example of a delegating chat client is the . This implementation adheres to the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/). Similar to other `IChatClient` delegators, it layers metrics and spans around other arbitrary `IChatClient` implementations. - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.UseTelemetry/Program.cs"::: - -(The preceding example depends on the [📦 OpenTelemetry.Exporter.Console](https://www.nuget.org/packages/OpenTelemetry.Exporter.Console) NuGet package.) - -Alternatively, the and corresponding method provide a simple way to write log entries to an for every request and response. - -#### Provide options +The interface defines a client abstraction responsible for interacting with AI services that provide chat capabilities. It includes methods for sending and receiving messages with multi-modal content (such as text, images, and audio), either as a complete set or streamed incrementally. -Every call to or can optionally supply a instance containing additional parameters for the operation. The most common parameters among AI models and services show up as strongly typed properties on the type, such as . Other parameters can be supplied by name in a weakly typed manner, via the dictionary, or via an options instance that the underlying provider understands, using the property. - -You can also specify options when building an `IChatClient` with the fluent API by chaining a call to the extension method. This delegating client wraps another client and invokes the supplied delegate to populate a `ChatOptions` instance for every call. 
For example, to ensure that the property defaults to a particular model name, you can use code like the following: - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.ProvideOptions/Program.cs"::: - -#### Functionality pipelines - -`IChatClient` instances can be layered to create a pipeline of components that each add additional functionality. These components can come from `Microsoft.Extensions.AI`, other NuGet packages, or custom implementations. This approach allows you to augment the behavior of the `IChatClient` in various ways to meet your specific needs. Consider the following code snippet that layers a distributed cache, function invocation, and OpenTelemetry tracing around a sample chat client: - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.FunctionalityPipelines/Program.cs" id="Snippet1"::: - -#### Custom `IChatClient` middleware - -To add additional functionality, you can implement `IChatClient` directly or use the class. This class serves as a base for creating chat clients that delegate operations to another `IChatClient` instance. It simplifies chaining multiple clients, which allows calls to pass through to an underlying client. - -The `DelegatingChatClient` class provides default implementations for methods like `GetResponseAsync`, `GetStreamingResponseAsync`, and `Dispose`, which forward calls to the inner client. A derived class can then override only the methods it needs to augment the behavior, while delegating other calls to the base implementation. This approach is useful for creating flexible and modular chat clients that are easy to extend and compose. - -The following is an example class derived from `DelegatingChatClient` that uses the [System.Threading.RateLimiting](https://www.nuget.org/packages/System.Threading.RateLimiting) library to provide rate-limiting functionality. 
- -:::code language="csharp" source="snippets/microsoft-extensions-ai/AI.Shared/RateLimitingChatClient.cs"::: - -As with other `IChatClient` implementations, the `RateLimitingChatClient` can be composed: - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.CustomClientMiddle/Program.cs"::: - -To simplify the composition of such components with others, component authors should create a `Use*` extension method for registering the component into a pipeline. For example, consider the following `UseRateLimiting` extension method: - -:::code language="csharp" source="snippets/microsoft-extensions-ai/AI.Shared/RateLimitingChatClientExtensions.cs" id="one"::: - -Such extensions can also query for relevant services from the DI container; the used by the pipeline is passed in as an optional parameter: - -:::code language="csharp" source="snippets/microsoft-extensions-ai/AI.Shared/RateLimitingChatClientExtensions.OptionalOverload.cs" id="two"::: - -Now it's easy for the consumer to use this in their pipeline, for example: - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.ConsumeClientMiddleware/Program.cs" id="SnippetUse"::: - -The previous extension methods demonstrate using a `Use` method on . `ChatClientBuilder` also provides overloads that make it easier to write such delegating handlers. For example, in the earlier `RateLimitingChatClient` example, the overrides of `GetResponseAsync` and `GetStreamingResponseAsync` only need to do work before and after delegating to the next client in the pipeline. 
To achieve the same thing without writing a custom class, you can use an overload of `Use` that accepts a delegate that's used for both `GetResponseAsync` and `GetStreamingResponseAsync`, reducing the boilerplate required: - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.UseExample/Program.cs"::: - -For scenarios where you need a different implementation for `GetResponseAsync` and `GetStreamingResponseAsync` to handle their unique return types, you can use the overload that accepts a delegate for each. - -#### Dependency injection - - implementations are often provided to an application via [dependency injection (DI)](../core/extensions/dependency-injection.md). In the following example, an is added into the DI container, as is an `IChatClient`. The registration for the `IChatClient` uses a builder that creates a pipeline containing a caching client (which then uses an `IDistributedCache` retrieved from DI) and the sample client. The injected `IChatClient` can be retrieved and used elsewhere in the app. - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.DependencyInjection/Program.cs"::: - -What instance and configuration is injected can differ based on the current needs of the application, and multiple pipelines can be injected with different keys. - -#### Stateless vs. stateful clients - -_Stateless_ services require all relevant conversation history to be sent back on every request. In contrast, _stateful_ services keep track of the history and require only additional messages to be sent with a request. The interface is designed to handle both stateless and stateful AI services. - -When working with a stateless service, callers maintain a list of all messages. They add in all received response messages and provide the list back on subsequent interactions. 
- -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.StatelessStateful/Program.cs" id="Snippet1"::: - -For stateful services, you might already know the identifier used for the relevant conversation. You can put that identifier into . Usage then follows the same pattern, except there's no need to maintain a history manually. - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.StatelessStateful/Program.cs" id="Snippet2"::: - -Some services might support automatically creating a conversation ID for a request that doesn't have one, or creating a new conversation ID that represents the current state of the conversation after incorporating the last round of messages. In such cases, you can transfer the over to the `ChatOptions.ConversationId` for subsequent requests. For example: - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.StatelessStateful/Program.cs" id="Snippet3"::: - -If you don't know ahead of time whether the service is stateless or stateful, you can check the response and act based on its value. If it's set, then that value is propagated to the options and the history is cleared so as to not resend the same history again. If the response `ConversationId` isn't set, then the response message is added to the history so that it's sent back to the service on the next turn. - -:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.StatelessStateful/Program.cs" id="Snippet4"::: +For more information and detailed usage examples, see [Use the IChatClient interface](ichatclient.md). ### The `IEmbeddingGenerator` interface @@ -236,12 +74,23 @@ This can then be layered around an arbitrary `IEmbeddingGenerator>` instances to provide rate-limiting functionality. -### The `IImageGenerator` interface +### Image generation (experimental) The interface represents a generator for creating images from text prompts or other input. 
This interface enables applications to integrate image generation capabilities from various AI services through a consistent API. The interface supports text-to-image generation (by calling ) and [configuration options](xref:Microsoft.Extensions.AI.ImageGenerationOptions) for image size and format. Like other interfaces in the library, it can be composed with middleware for caching, telemetry, and other cross-cutting concerns. For more information, see [Generate images from text using AI](quickstarts/text-to-image.md). +## Data ingestion (preview) + +*Data ingestion* is the process of reading and preparing data from different sources to make it usable for downstream apps. .NET provides the building blocks that enable developers to read, process, and prepare documents for AI and machine learning workflows, especially retrieval-augmented generation (RAG) scenarios. + +- +- +- +- +- +- + ## Build with Microsoft.Extensions.AI You can start building with `Microsoft.Extensions.AI` in the following ways: diff --git a/docs/ai/overview.md b/docs/ai/overview.md index 97e03f98af080..1a47c46923936 100644 --- a/docs/ai/overview.md +++ b/docs/ai/overview.md @@ -1,7 +1,7 @@ --- title: Develop .NET apps with AI features description: Learn how you can build .NET applications that include AI features. -ms.date: 11/20/2025 +ms.date: 12/10/2025 ms.topic: overview --- diff --git a/docs/ai/quickstarts/build-vector-search-app.md b/docs/ai/quickstarts/build-vector-search-app.md index 199d89ffcdc5e..6e7af69aebc6a 100644 --- a/docs/ai/quickstarts/build-vector-search-app.md +++ b/docs/ai/quickstarts/build-vector-search-app.md @@ -4,7 +4,6 @@ description: Create an AI powered app to search and integrate with vector stores ms.date: 05/29/2025 ms.topic: quickstart zone_pivot_groups: openai-library -# CustomerIntent: As a .NET developer new to AI, I want deploy and use sample code to interact to learn from the sample code. 
--- # Build a .NET AI vector search app @@ -22,9 +21,6 @@ The app uses the and [!NOTE] -> The [Microsoft.Extensions.VectorData.Abstractions](https://www.nuget.org/packages/Microsoft.Extensions.VectorData.Abstractions/) library is currently in preview. - :::zone target="docs" pivot="openai" @@ -68,7 +64,7 @@ Complete the following steps to create a .NET console app that can: dotnet add package Azure.Identity dotnet add package Azure.AI.OpenAI dotnet add package Microsoft.Extensions.AI.OpenAI --prerelease - dotnet add package Microsoft.Extensions.VectorData.Abstractions --prerelease + dotnet add package Microsoft.Extensions.VectorData.Abstractions dotnet add package Microsoft.SemanticKernel.Connectors.InMemory --prerelease dotnet add package Microsoft.Extensions.Configuration dotnet add package Microsoft.Extensions.Configuration.UserSecrets @@ -90,7 +86,7 @@ Complete the following steps to create a .NET console app that can: ```bash dotnet add package Microsoft.Extensions.AI.OpenAI --prerelease - dotnet add package Microsoft.Extensions.VectorData.Abstractions --prerelease + dotnet add package Microsoft.Extensions.VectorData.Abstractions dotnet add package Microsoft.SemanticKernel.Connectors.InMemory --prerelease dotnet add package Microsoft.Extensions.Configuration dotnet add package Microsoft.Extensions.Configuration.UserSecrets diff --git a/docs/ai/quickstarts/create-assistant.md b/docs/ai/quickstarts/create-assistant.md index 4f68cbbc053a0..74763f11b1f22 100644 --- a/docs/ai/quickstarts/create-assistant.md +++ b/docs/ai/quickstarts/create-assistant.md @@ -69,7 +69,7 @@ Complete the following steps to create a .NET console app and add the package ne 1. Add the [OpenAI](https://www.nuget.org/packages/OpenAI) package to your app: ```dotnetcli - dotnet add package OpenAI --prerelease + dotnet add package OpenAI ``` 1. Open the new app in your editor of choice, such as Visual Studio Code. 
diff --git a/docs/ai/toc.yml b/docs/ai/toc.yml index 80d10744a9d41..919b015707483 100644 --- a/docs/ai/toc.yml +++ b/docs/ai/toc.yml @@ -36,7 +36,7 @@ items: items: - name: How generative AI and LLMs work href: conceptual/how-genai-and-llms-work.md - - name: Building agents in .NET + - name: Agents href: conceptual/agents.md - name: Tokens href: conceptual/understanding-tokens.md @@ -126,7 +126,11 @@ items: href: resources/mcp-servers.md - name: API reference items: - - name: Microsoft.Extensions.AI - href: /dotnet/api/microsoft.extensions.ai - name: Microsoft.Agents.AI href: /dotnet/api/microsoft.agents.ai + - name: Microsoft.Extensions.AI + href: /dotnet/api/microsoft.extensions.ai + - name: Microsoft.Extensions.DataIngestion + href: /dotnet/api/microsoft.extensions.dataingestion + - name: Microsoft.Extensions.VectorData + href: /dotnet/api/microsoft.extensions.vectordata From 0b24a79cea104a84a1125f7bd81c394d010656ef Mon Sep 17 00:00:00 2001 From: Genevieve Warren <24882762+gewarren@users.noreply.github.com> Date: Wed, 10 Dec 2025 18:34:42 -0800 Subject: [PATCH 2/5] fix heading --- docs/ai/microsoft-extensions-ai.md | 15 ++------------- 1 file changed, 2 insertions(+), 13 deletions(-) diff --git a/docs/ai/microsoft-extensions-ai.md b/docs/ai/microsoft-extensions-ai.md index 31004902da710..3ff8042f1d17b 100644 --- a/docs/ai/microsoft-extensions-ai.md +++ b/docs/ai/microsoft-extensions-ai.md @@ -28,7 +28,7 @@ For information about how to install NuGet packages, see [dotnet package add](.. - [The `IChatClient` interface](#the-ichatclient-interface) - [The `IEmbeddingGenerator` interface](#the-iembeddinggenerator-interface) -- [Image generation (experimental)](#image-generation-experimental) +- [The `IImageGenerator` interface (experimental)](#the-iimagegenerator-interface-experimental) ### The `IChatClient` interface @@ -74,23 +74,12 @@ This can then be layered around an arbitrary `IEmbeddingGenerator>` instances to provide rate-limiting functionality. 
-### Image generation (experimental)
+### The `IImageGenerator` interface (experimental)

The interface represents a generator for creating images from text prompts or other input. This interface enables applications to integrate image generation capabilities from various AI services through a consistent API. The interface supports text-to-image generation (by calling ) and [configuration options](xref:Microsoft.Extensions.AI.ImageGenerationOptions) for image size and format. Like other interfaces in the library, it can be composed with middleware for caching, telemetry, and other cross-cutting concerns. For more information, see [Generate images from text using AI](quickstarts/text-to-image.md).

-## Data ingestion (preview)
-
-*Data ingestion* is the process of reading and preparing data from different sources to make it usable for downstream apps. .NET provides the building blocks that enable developers to read, process, and prepare documents for AI and machine learning workflows, especially retrieval-augmented generation (RAG) scenarios.
-
-- 
-- 
-- 
-- 
-- 
-- 
-
## Build with Microsoft.Extensions.AI

You can start building with `Microsoft.Extensions.AI` in the following ways:
From 13e4c363e2f2b11204ac5c0aafe37465f5a55e5f Mon Sep 17 00:00:00 2001
From: Genevieve Warren <24882762+gewarren@users.noreply.github.com>
Date: Thu, 11 Dec 2025 08:40:50 -0800
Subject: [PATCH 3/5] fix bookmark

---
 docs/ai/conceptual/embeddings.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/ai/conceptual/embeddings.md b/docs/ai/conceptual/embeddings.md
index 4389d3bf0f1e7..e6ae4047f247c 100644
--- a/docs/ai/conceptual/embeddings.md
+++ b/docs/ai/conceptual/embeddings.md
@@ -49,7 +49,7 @@ You generate embeddings for your raw data by using an AI embedding model, which

### Store and process embeddings in a vector database

-After you generate embeddings, you'll need a way to store them so you can later retrieve them with calls to an LLM.
Vector databases are designed to store and process vectors, so they're a natural home for embeddings. Different vector databases offer different processing capabilities, so you should choose one based on your raw data and your goals. For information about your options, see [available vector database solutions](vector-databases.md#available-vector-database-solutions). +After you generate embeddings, you'll need a way to store them so you can later retrieve them with calls to an LLM. Vector databases are designed to store and process vectors, so they're a natural home for embeddings. Different vector databases offer different processing capabilities, so you should choose one based on your raw data and your goals. For information about your options, see [available vector database solutions](vector-databases.md#semantic-kernel-vector-database-solutions). ### Using embeddings in your LLM solution From 25d3814f82fed8ccd14c230b05ee6f2e469aacf2 Mon Sep 17 00:00:00 2001 From: Genevieve Warren <24882762+gewarren@users.noreply.github.com> Date: Thu, 11 Dec 2025 11:21:39 -0800 Subject: [PATCH 4/5] break out interfaces --- .openpublishing.redirection.ai.json | 4 ++ docs/ai/advanced/sample-implementations.md | 38 ---------------- docs/ai/ichatclient.md | 27 ++++++++--- docs/ai/iembeddinggenerator.md | 52 ++++++++++++++++++++++ docs/ai/microsoft-extensions-ai.md | 38 ---------------- docs/ai/toc.yml | 12 ++--- 6 files changed, 83 insertions(+), 88 deletions(-) delete mode 100644 docs/ai/advanced/sample-implementations.md create mode 100644 docs/ai/iembeddinggenerator.md diff --git a/.openpublishing.redirection.ai.json b/.openpublishing.redirection.ai.json index da61574c73a57..6827c469e6d96 100644 --- a/.openpublishing.redirection.ai.json +++ b/.openpublishing.redirection.ai.json @@ -1,5 +1,9 @@ { "redirections": [ + { + "source_path_from_root": "/docs/ai/advanced/sample-implementations.md", + "redirect_url": "/dotnet/ai/ichatclient" + }, { "source_path_from_root": 
"/docs/ai/azure-ai-for-dotnet-developers.md", "redirect_url": "/dotnet/ai/resources/azure-ai" diff --git a/docs/ai/advanced/sample-implementations.md b/docs/ai/advanced/sample-implementations.md deleted file mode 100644 index 76b258120a7c2..0000000000000 --- a/docs/ai/advanced/sample-implementations.md +++ /dev/null @@ -1,38 +0,0 @@ ---- -title: "Sample implementations of IChatClient and IEmbeddingGenerator" -description: Learn more about the IChatClient and IEmbeddingGenerator interfaces, see simple implementations, and find links to concrete implementations. -ms.topic: article -ms.date: 05/28/2025 ---- - -# Sample implementations of IChatClient and IEmbeddingGenerator - -.NET libraries that provide clients for language models and services can provide implementations of the and interfaces. Any consumers of the interfaces are then able to interoperate seamlessly with these models and services via the abstractions. - -## The `IChatClient` interface - -The interface defines a client abstraction responsible for interacting with AI services that provide chat capabilities. It includes methods for sending and receiving messages with multi-modal content (such as text, images, and audio), either as a complete set or streamed incrementally. Additionally, it allows for retrieving strongly typed services provided by the client or its underlying services. - -The following sample implements `IChatClient` to show the general structure. 
- -:::code language="csharp" source="./snippets/sample-implementations/SampleChatClient.cs"::: - -For more realistic, concrete implementations of `IChatClient`, see: - -- [AzureAIInferenceChatClient.cs](https://github.com/dotnet/extensions/blob/main/src/Libraries/Microsoft.Extensions.AI.AzureAIInference/AzureAIInferenceChatClient.cs) -- [OpenAIChatClient.cs](https://github.com/dotnet/extensions/blob/main/src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIChatClient.cs) -- [Microsoft.Extensions.AI chat clients](https://github.com/dotnet/extensions/tree/main/src/Libraries/Microsoft.Extensions.AI/ChatCompletion) - -## The `IEmbeddingGenerator` interface - -The interface represents a generic generator of embeddings. Here, `TInput` is the type of input values being embedded, and `TEmbedding` is the type of generated embedding, which inherits from the class. - -The `Embedding` class serves as a base class for embeddings generated by an `IEmbeddingGenerator`. It's designed to store and manage the metadata and data associated with embeddings. Derived types, like `Embedding`, provide the concrete embedding vector data. For example, an `Embedding` exposes a `ReadOnlyMemory Vector { get; }` property for access to its embedding data. - -The `IEmbeddingGenerator` interface defines a method to asynchronously generate embeddings for a collection of input values, with optional configuration and cancellation support. It also provides metadata describing the generator and allows for the retrieval of strongly typed services that can be provided by the generator or its underlying services. - -The following code shows how the `SampleEmbeddingGenerator` class implements the `IEmbeddingGenerator` interface. It has a primary constructor that accepts an endpoint and model ID, which are used to identify the generator. It also implements the method to generate embeddings for a collection of input values. 
- -:::code language="csharp" source="./snippets/sample-implementations/SampleEmbeddingGenerator.cs"::: - -This sample implementation just generates random embedding vectors. For a more realistic, concrete implementation, see [OpenTelemetryEmbeddingGenerator.cs](https://github.com/dotnet/extensions/blob/main/src/Libraries/Microsoft.Extensions.AI/Embeddings/OpenTelemetryEmbeddingGenerator.cs). diff --git a/docs/ai/ichatclient.md b/docs/ai/ichatclient.md index 1cb6bd8c16e9a..85edd58aa25e2 100644 --- a/docs/ai/ichatclient.md +++ b/docs/ai/ichatclient.md @@ -1,7 +1,8 @@ --- title: Use the IChatClient interface -description: Learn how to use the IChatClient interface to get model responses and call tools +description: Learn how to use the IChatClient interface to get model responses and call tools. ms.date: 12/10/2025 +no-loc: ["IChatClient"] --- # Use the IChatClient interface @@ -62,6 +63,13 @@ For more information about creating AI functions, see [Access data in AI functio You can also use Model Context Protocol (MCP) tools with your `IChatClient`. For more information, see [Build a minimal MCP client](./quickstarts/build-mcp-client.md). +### Tool reduction (experimental) + +> [!IMPORTANT] +> This feature is experimental and subject to change. + +Tool reduction helps manage large tool catalogs by trimming them based on relevance to the current conversation context. The interface defines strategies for reducing the number of tools sent to the model. The library provides implementations like that ranks tools by embedding similarity to the conversation. Use the extension method to add tool reduction to your chat client pipeline. + ## Cache responses If you're familiar with [caching in .NET](../core/extensions/caching.md), it's good to know that provides delegating `IChatClient` implementations for caching. The is an `IChatClient` that layers caching around another arbitrary `IChatClient` instance. 
When a novel chat history is submitted to the `DistributedCachingChatClient`, it forwards it to the underlying client and then caches the response before sending it back to the consumer. The next time the same history is submitted, such that a cached response can be found in the cache, the `DistributedCachingChatClient` returns the cached response rather than forwarding the request along the pipeline. @@ -154,16 +162,21 @@ If you don't know ahead of time whether the service is stateless or stateful, yo :::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.StatelessStateful/Program.cs" id="Snippet4"::: -## Chat reduction (experimental) +## Implementation examples -> [!IMPORTANT] -> This feature is experimental and subject to change. +The following sample implements `IChatClient` to show the general structure. -Chat reduction helps manage conversation history by limiting the number of messages or summarizing older messages when the conversation exceeds a specified length. The `Microsoft.Extensions.AI` library provides reducers like that limits the number of non-system messages, and that automatically summarizes older messages while preserving context. +:::code language="csharp" source="./snippets/sample-implementations/SampleChatClient.cs"::: -## Tool reduction (experimental) +For more realistic, concrete implementations of `IChatClient`, see: + +- [AzureAIInferenceChatClient.cs](https://github.com/dotnet/extensions/blob/main/src/Libraries/Microsoft.Extensions.AI.AzureAIInference/AzureAIInferenceChatClient.cs) +- [OpenAIChatClient.cs](https://github.com/dotnet/extensions/blob/main/src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIChatClient.cs) +- [Microsoft.Extensions.AI chat clients](https://github.com/dotnet/extensions/tree/main/src/Libraries/Microsoft.Extensions.AI/ChatCompletion) + +## Chat reduction (experimental) > [!IMPORTANT] > This feature is experimental and subject to change. 
-Tool reduction helps manage large tool catalogs by trimming them based on relevance to the current conversation context. The interface defines strategies for reducing the number of tools sent to the model. The library provides implementations like that ranks tools by embedding similarity to the conversation. Use the extension method to add tool reduction to your chat client pipeline.
+Chat reduction helps manage conversation history by limiting the number of messages or summarizing older messages when the conversation exceeds a specified length. The `Microsoft.Extensions.AI` library provides reducers like that limits the number of non-system messages, and that automatically summarizes older messages while preserving context.
diff --git a/docs/ai/iembeddinggenerator.md b/docs/ai/iembeddinggenerator.md
new file mode 100644
index 0000000000000..82bc4efe79b13
--- /dev/null
+++ b/docs/ai/iembeddinggenerator.md
@@ -0,0 +1,52 @@
+---
+title: Use the IEmbeddingGenerator interface
+description: Learn how to use the IEmbeddingGenerator interface to generate embeddings for a collection of input values, with optional configuration and cancellation support.
+ms.date: 12/11/2025
+no-loc: ["IEmbeddingGenerator"]
+---
+
+# Use the IEmbeddingGenerator interface
+
+The interface represents a generic generator of embeddings. For the generic type parameters, `TInput` is the type of input values being embedded, and `TEmbedding` is the type of generated embedding, which inherits from the class.
+
+The `Embedding` class serves as a base class for embeddings generated by an `IEmbeddingGenerator`. It's designed to store and manage the metadata and data associated with embeddings. Derived types, like , provide the concrete embedding vector data. For example, an `Embedding<float>` exposes a `ReadOnlyMemory<float> Vector { get; }` property for access to its embedding data.
+ +The `IEmbeddingGenerator` interface defines a method to asynchronously generate embeddings for a collection of input values, with optional configuration and cancellation support. It also provides metadata describing the generator and allows for the retrieval of strongly typed services that can be provided by the generator or its underlying services. + +## Create embeddings + +The primary operation performed with an is embedding generation, which is accomplished with its method. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.CreateEmbeddings/Program.cs" id="Snippet1"::: + +Accelerator extension methods also exist to simplify common cases, such as generating an embedding vector from a single input. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.CreateEmbeddings/Program.cs" id="Snippet2"::: + +## Pipelines of functionality + +As with `IChatClient`, `IEmbeddingGenerator` implementations can be layered. `Microsoft.Extensions.AI` provides a delegating implementation for `IEmbeddingGenerator` for caching and telemetry. + +:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.CustomEmbeddingsMiddle/Program.cs"::: + +The `IEmbeddingGenerator` enables building custom middleware that extends the functionality of an `IEmbeddingGenerator`. The class is an implementation of the `IEmbeddingGenerator` interface that serves as a base class for creating embedding generators that delegate their operations to another `IEmbeddingGenerator` instance. It allows for chaining multiple generators in any order, passing calls through to an underlying generator. The class provides default implementations for methods such as and `Dispose`, which forward the calls to the inner generator instance, enabling flexible and modular embedding generation. 
+
+The following is an example implementation of such a delegating embedding generator that rate-limits embedding generation requests:
+
+:::code language="csharp" source="snippets/microsoft-extensions-ai/AI.Shared/RateLimitingEmbeddingGenerator.cs":::
+
+This can then be layered around an arbitrary `IEmbeddingGenerator<string, Embedding<float>>` to rate limit all embedding generation operations.
+
+:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.ConsumeRateLimitingEmbedding/Program.cs":::
+
+In this way, the `RateLimitingEmbeddingGenerator` can be composed with other `IEmbeddingGenerator<string, Embedding<float>>` instances to provide rate-limiting functionality.
+
+## Implementation examples
+
+Most users don't need to implement the `IEmbeddingGenerator` interface. However, if you're a library author, it might be helpful to look at these implementation examples.
+
+The following code shows how the `SampleEmbeddingGenerator` class implements the `IEmbeddingGenerator` interface. It has a primary constructor that accepts an endpoint and model ID, which are used to identify the generator. It also implements the method to generate embeddings for a collection of input values.
+
+:::code language="csharp" source="./snippets/sample-implementations/SampleEmbeddingGenerator.cs":::
+
+This sample implementation just generates random embedding vectors. For a more realistic, concrete implementation, see [OpenTelemetryEmbeddingGenerator.cs](https://github.com/dotnet/extensions/blob/main/src/Libraries/Microsoft.Extensions.AI/Embeddings/OpenTelemetryEmbeddingGenerator.cs).
diff --git a/docs/ai/microsoft-extensions-ai.md b/docs/ai/microsoft-extensions-ai.md
index 3ff8042f1d17b..3db28deacf259 100644
--- a/docs/ai/microsoft-extensions-ai.md
+++ b/docs/ai/microsoft-extensions-ai.md
@@ -36,44 +36,6 @@ The interface defines a client abstra
 
 For more information and detailed usage examples, see [Use the IChatClient interface](ichatclient.md).
-### The `IEmbeddingGenerator` interface
-
-The interface represents a generic generator of embeddings. For the generic type parameters, `TInput` is the type of input values being embedded, and `TEmbedding` is the type of generated embedding, which inherits from the class.
-
-The `Embedding` class serves as a base class for embeddings generated by an `IEmbeddingGenerator`. It's designed to store and manage the metadata and data associated with embeddings. Derived types, like , provide the concrete embedding vector data. For example, an `Embedding<float>` exposes a `ReadOnlyMemory<float> Vector { get; }` property for access to its embedding data.
-
-The `IEmbeddingGenerator` interface defines a method to asynchronously generate embeddings for a collection of input values, with optional configuration and cancellation support. It also provides metadata describing the generator and allows for the retrieval of strongly typed services that can be provided by the generator or its underlying services.
-
-Most users don't need to implement the `IEmbeddingGenerator` interface. However, if you're a library author, you can see a simple implementation at [Sample implementations of IChatClient and IEmbeddingGenerator](advanced/sample-implementations.md).
-
-#### Create embeddings
-
-The primary operation performed with an is embedding generation, which is accomplished with its method.
-
-:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.CreateEmbeddings/Program.cs" id="Snippet1":::
-
-Accelerator extension methods also exist to simplify common cases, such as generating an embedding vector from a single input.
-
-:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.CreateEmbeddings/Program.cs" id="Snippet2":::
-
-#### Pipelines of functionality
-
-As with `IChatClient`, `IEmbeddingGenerator` implementations can be layered. `Microsoft.Extensions.AI` provides a delegating implementation for `IEmbeddingGenerator` for caching and telemetry.
-
-:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.CustomEmbeddingsMiddle/Program.cs":::
-
-The `IEmbeddingGenerator` enables building custom middleware that extends the functionality of an `IEmbeddingGenerator`. The class is an implementation of the `IEmbeddingGenerator` interface that serves as a base class for creating embedding generators that delegate their operations to another `IEmbeddingGenerator` instance. It allows for chaining multiple generators in any order, passing calls through to an underlying generator. The class provides default implementations for methods such as and `Dispose`, which forward the calls to the inner generator instance, enabling flexible and modular embedding generation.
-
-The following is an example implementation of such a delegating embedding generator that rate-limits embedding generation requests:
-
-:::code language="csharp" source="snippets/microsoft-extensions-ai/AI.Shared/RateLimitingEmbeddingGenerator.cs":::
-
-This can then be layered around an arbitrary `IEmbeddingGenerator<string, Embedding<float>>` to rate limit all embedding generation operations.
-
-:::code language="csharp" source="snippets/microsoft-extensions-ai/ConsoleAI.ConsumeRateLimitingEmbedding/Program.cs":::
-
-In this way, the `RateLimitingEmbeddingGenerator` can be composed with other `IEmbeddingGenerator<string, Embedding<float>>` instances to provide rate-limiting functionality.
-
 ### The IImageGenerator interface (experimental)
 
 The interface represents a generator for creating images from text prompts or other input. This interface enables applications to integrate image generation capabilities from various AI services through a consistent API. The interface supports text-to-image generation (by calling ) and [configuration options](xref:Microsoft.Extensions.AI.ImageGenerationOptions) for image size and format. Like other interfaces in the library, it can be composed with middleware for caching, telemetry, and other cross-cutting concerns.
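The embedding-generation flow described in this hunk can be sketched as follows. This is a minimal sketch, not part of the patch: `ShowEmbeddingsAsync` is a hypothetical helper, and it assumes the `Microsoft.Extensions.AI` abstractions; the `:::code` snippet files remain the canonical examples.

```csharp
using Microsoft.Extensions.AI;

// Minimal sketch: generate embeddings for a batch of inputs and inspect
// the resulting vectors. Works with any implementation of
// IEmbeddingGenerator<string, Embedding<float>>.
static async Task ShowEmbeddingsAsync(
    IEmbeddingGenerator<string, Embedding<float>> generator)
{
    // GenerateAsync produces one Embedding<float> per input value.
    GeneratedEmbeddings<Embedding<float>> embeddings =
        await generator.GenerateAsync(["What is AI?", "What is .NET?"]);

    foreach (Embedding<float> embedding in embeddings)
    {
        // Vector exposes the raw embedding data.
        ReadOnlyMemory<float> vector = embedding.Vector;
        Console.WriteLine($"Dimensions: {vector.Length}");
    }
}
```

Because the parameter is the abstraction rather than a concrete client, the same helper works whether the generator is a provider implementation or a delegating pipeline such as the rate-limiting example above.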
diff --git a/docs/ai/toc.yml b/docs/ai/toc.yml index 919b015707483..c18445eb66441 100644 --- a/docs/ai/toc.yml +++ b/docs/ai/toc.yml @@ -8,7 +8,13 @@ items: - name: Overview href: dotnet-ai-ecosystem.md - name: Microsoft.Extensions.AI - href: microsoft-extensions-ai.md + items: + - name: Overview + href: microsoft-extensions-ai.md + - name: The IChatClient interface + href: ichatclient.md + - name: The IEmbeddingGenerator interface + href: iembeddinggenerator.md - name: Microsoft Agent Framework href: /agent-framework/overview/agent-framework-overview?toc=/dotnet/ai/toc.json&bc=/dotnet/ai/toc.json - name: C# SDK for MCP @@ -112,10 +118,6 @@ items: href: evaluation/evaluate-with-reporting.md - name: "Evaluate response safety with caching and reporting" href: evaluation/evaluate-safety.md -- name: Advanced - items: - - name: Sample interface implementations - href: advanced/sample-implementations.md - name: Resources items: - name: Get started resources From 88e977253c3cdeee351caf3b48a267abb3377c21 Mon Sep 17 00:00:00 2001 From: Genevieve Warren <24882762+gewarren@users.noreply.github.com> Date: Thu, 11 Dec 2025 11:29:47 -0800 Subject: [PATCH 5/5] fix build warnings --- docs/ai/ichatclient.md | 2 +- docs/ai/microsoft-extensions-ai.md | 8 +++++++- .../sample-implementations/Implementations.csproj | 0 .../snippets/sample-implementations/SampleChatClient.cs | 0 .../sample-implementations/SampleEmbeddingGenerator.cs | 0 5 files changed, 8 insertions(+), 2 deletions(-) rename docs/ai/{advanced => }/snippets/sample-implementations/Implementations.csproj (100%) rename docs/ai/{advanced => }/snippets/sample-implementations/SampleChatClient.cs (100%) rename docs/ai/{advanced => }/snippets/sample-implementations/SampleEmbeddingGenerator.cs (100%) diff --git a/docs/ai/ichatclient.md b/docs/ai/ichatclient.md index 85edd58aa25e2..d01962fe26621 100644 --- a/docs/ai/ichatclient.md +++ b/docs/ai/ichatclient.md @@ -9,7 +9,7 @@ no-loc: ["IChatClient"] The interface defines a client 
abstraction responsible for interacting with AI services that provide chat capabilities. It includes methods for sending and receiving messages with multi-modal content (such as text, images, and audio), either as a complete set or streamed incrementally. Additionally, it allows for retrieving strongly typed services provided by the client or its underlying services. -.NET libraries that provide clients for language models and services can provide an implementation of the `IChatClient` interface. Any consumers of the interface are then able to interoperate seamlessly with these models and services via the abstractions. You can see a simple implementation at [Sample implementations of IChatClient and IEmbeddingGenerator](advanced/sample-implementations.md). +.NET libraries that provide clients for language models and services can provide an implementation of the `IChatClient` interface. Any consumers of the interface are then able to interoperate seamlessly with these models and services via the abstractions. You can find examples in the [Implementation examples](#implementation-examples) section. ## Request a chat response diff --git a/docs/ai/microsoft-extensions-ai.md b/docs/ai/microsoft-extensions-ai.md index 3db28deacf259..3e1576eebbe6b 100644 --- a/docs/ai/microsoft-extensions-ai.md +++ b/docs/ai/microsoft-extensions-ai.md @@ -36,6 +36,12 @@ The interface defines a client abstra For more information and detailed usage examples, see [Use the IChatClient interface](ichatclient.md). +### The `IEmbeddingGenerator` interface + +The interface represents a generic generator of embeddings. For the generic type parameters, `TInput` is the type of input values being embedded, and `TEmbedding` is the type of generated embedding, which inherits from the class. + +For more information and detailed usage examples, see [Use the IEmbeddingGenerator interface](iembeddinggenerator.md). 
+ ### The IImageGenerator interface (experimental) The interface represents a generator for creating images from text prompts or other input. This interface enables applications to integrate image generation capabilities from various AI services through a consistent API. The interface supports text-to-image generation (by calling ) and [configuration options](xref:Microsoft.Extensions.AI.ImageGenerationOptions) for image size and format. Like other interfaces in the library, it can be composed with middleware for caching, telemetry, and other cross-cutting concerns. @@ -46,7 +52,7 @@ For more information, see [Generate images from text using AI](quickstarts/text- You can start building with `Microsoft.Extensions.AI` in the following ways: -- **Library developers**: If you own libraries that provide clients for AI services, consider implementing the interfaces in your libraries. This allows users to easily integrate your NuGet package via the abstractions. For example implementations, see [Sample implementations of IChatClient and IEmbeddingGenerator](advanced/sample-implementations.md). +- **Library developers**: If you own libraries that provide clients for AI services, consider implementing the interfaces in your libraries. This allows users to easily integrate your NuGet package via the abstractions. For examples, see [IChatClient implementation examples](ichatclient.md#implementation-examples) and [IEmbeddingGenerator implementation examples](iembeddinggenerator.md#implementation-examples). - **Service consumers**: If you're developing libraries that consume AI services, use the abstractions instead of hardcoding to a specific AI service. This approach gives your consumers the flexibility to choose their preferred provider. - **Application developers**: Use the abstractions to simplify integration into your apps. 
This enables portability across models and services, facilitates testing and mocking, leverages middleware provided by the ecosystem, and maintains a consistent API throughout your app, even if you use different services in different parts of your application. - **Ecosystem contributors**: If you're interested in contributing to the ecosystem, consider writing custom middleware components. diff --git a/docs/ai/advanced/snippets/sample-implementations/Implementations.csproj b/docs/ai/snippets/sample-implementations/Implementations.csproj similarity index 100% rename from docs/ai/advanced/snippets/sample-implementations/Implementations.csproj rename to docs/ai/snippets/sample-implementations/Implementations.csproj diff --git a/docs/ai/advanced/snippets/sample-implementations/SampleChatClient.cs b/docs/ai/snippets/sample-implementations/SampleChatClient.cs similarity index 100% rename from docs/ai/advanced/snippets/sample-implementations/SampleChatClient.cs rename to docs/ai/snippets/sample-implementations/SampleChatClient.cs diff --git a/docs/ai/advanced/snippets/sample-implementations/SampleEmbeddingGenerator.cs b/docs/ai/snippets/sample-implementations/SampleEmbeddingGenerator.cs similarity index 100% rename from docs/ai/advanced/snippets/sample-implementations/SampleEmbeddingGenerator.cs rename to docs/ai/snippets/sample-implementations/SampleEmbeddingGenerator.cs
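As a companion to the ichatclient.md changes in this patch, the basic `IChatClient` request flow can be sketched as follows. This is a minimal sketch: `ChatAsync` is a hypothetical helper, and any `IChatClient` implementation can be passed in.

```csharp
using Microsoft.Extensions.AI;

// Minimal sketch: send a prompt and print the reply.
// Works with any IChatClient implementation.
static async Task ChatAsync(IChatClient client)
{
    // GetResponseAsync sends the prompt as a single user message.
    ChatResponse response = await client.GetResponseAsync("What is AI?");
    Console.WriteLine(response.Text);
}
```

Coding against `IChatClient` rather than a concrete client is what enables the portability and middleware composition described in the "How to start" bullets above.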