Semantic Rerank: Adds Semantic Rerank API by NaluTripician · Pull Request #5445 · Azure/azure-cosmos-dotnet-v3

NaluTripician · 2025-10-13T17:51:41Z

Pull Request Template

Description

This pull request introduces a new semantic reranking feature to the Azure Cosmos DB .NET SDK, enabling users to rerank documents using an inference service that leverages Azure Active Directory (AAD) authentication. The main changes include the addition of the InferenceService class, new API surface for semantic reranking, and appropriate integration into the SDK's authorization and client context infrastructure. Notably, this functionality is only available when using AAD authentication.

Semantic Reranking Feature Integration:

Added the InferenceService class, which handles communication with the Cosmos DB Inference Service for semantic reranking, including HTTP client configuration, payload construction, and response handling. This service enforces AAD authentication and manages its own authorization and disposal.
Introduced a new API, SemanticRerankAsync, on the Container class (public under PREVIEW, internal otherwise), allowing users to rerank a list of documents based on a context/query string. This is implemented in ContainerInlineCore and routed through the client context. [1] [2]
To use this feature, the environment variable "AZURE_COSMOS_SEMANTIC_RERANKER_INFERENCE_ENDPOINT" must be set to the inference endpoint provided by the service.
Additionally, the environment variable "AZURE_COSMOS_SEMANTIC_RERANKER_INFERENCE_SERVICE_MAX_CONNECTION_LIMIT" can be set to change the inference client's max connection limit.
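
The two environment variables above could be read and validated along the following lines. This is a minimal sketch, not the SDK's implementation: the InferenceSettings class, its FromEnvironment method, and the default connection limit of 50 are hypothetical names and values chosen for illustration. Only the two environment variable names come from this PR.

```csharp
using System;

// Hypothetical helper, NOT part of the SDK. Only the two environment
// variable names are taken from the PR description.
public sealed class InferenceSettings
{
    public const string EndpointVariable =
        "AZURE_COSMOS_SEMANTIC_RERANKER_INFERENCE_ENDPOINT";
    public const string MaxConnectionLimitVariable =
        "AZURE_COSMOS_SEMANTIC_RERANKER_INFERENCE_SERVICE_MAX_CONNECTION_LIMIT";

    // Assumed default for illustration; the PR does not state the real default.
    public const int AssumedDefaultMaxConnectionLimit = 50;

    public Uri Endpoint { get; }
    public int MaxConnectionLimit { get; }

    private InferenceSettings(Uri endpoint, int maxConnectionLimit)
    {
        this.Endpoint = endpoint;
        this.MaxConnectionLimit = maxConnectionLimit;
    }

    public static InferenceSettings FromEnvironment()
    {
        // The endpoint is required and must be an absolute URL.
        string rawEndpoint = Environment.GetEnvironmentVariable(EndpointVariable);
        if (string.IsNullOrWhiteSpace(rawEndpoint)
            || !Uri.TryCreate(rawEndpoint, UriKind.Absolute, out Uri endpoint))
        {
            throw new InvalidOperationException(
                $"Set {EndpointVariable} to the absolute URL of the inference endpoint.");
        }

        // The connection limit is optional; fall back to the assumed default.
        string rawLimit = Environment.GetEnvironmentVariable(MaxConnectionLimitVariable);
        int limit = AssumedDefaultMaxConnectionLimit;
        if (!string.IsNullOrWhiteSpace(rawLimit)
            && (!int.TryParse(rawLimit, out limit) || limit <= 0))
        {
            throw new InvalidOperationException(
                $"{MaxConnectionLimitVariable} must be a positive integer.");
        }

        return new InferenceSettings(endpoint, limit);
    }
}
```

Failing fast on a missing or malformed endpoint is one possible design choice; it surfaces a configuration mistake before the first rerank call rather than as an HTTP error later.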

Authorization and Token Handling Updates:

Extended the AuthorizationTokenProvider abstraction and its implementations to support a new method, AddInferenceAuthorizationHeaderAsync, which is only valid for AAD-based token providers. Non-AAD providers throw a NotImplementedException for this method. [1] [2] [3] [4] [5] [6]
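
The split between AAD and non-AAD providers described above can be pictured with the following simplified sketch. All type names ending in "Sketch" are hypothetical stand-ins, the method signature is an assumption (the real AddInferenceAuthorizationHeaderAsync signature is not shown in this description), and the "Bearer" header format is likewise assumed for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for the AuthorizationTokenProvider abstraction.
public abstract class TokenProviderSketch
{
    // Assumed signature; the SDK's actual parameters may differ.
    public abstract Task AddInferenceAuthorizationHeaderAsync(
        IDictionary<string, string> headers);
}

// AAD-based provider: obtains a token and attaches it to the request headers.
public sealed class AadTokenProviderSketch : TokenProviderSketch
{
    private readonly Func<Task<string>> getAccessToken;

    public AadTokenProviderSketch(Func<Task<string>> getAccessToken)
    {
        this.getAccessToken = getAccessToken
            ?? throw new ArgumentNullException(nameof(getAccessToken));
    }

    public override async Task AddInferenceAuthorizationHeaderAsync(
        IDictionary<string, string> headers)
    {
        string token = await this.getAccessToken();
        // Assumed header format for illustration.
        headers["Authorization"] = "Bearer " + token;
    }
}

// Key-based (non-AAD) provider: the inference path is not supported.
public sealed class KeyTokenProviderSketch : TokenProviderSketch
{
    public override Task AddInferenceAuthorizationHeaderAsync(
        IDictionary<string, string> headers)
    {
        throw new NotImplementedException(
            "Semantic reranking is only available with AAD authentication.");
    }
}
```

In this sketch a caller that configured key-based authentication gets the NotImplementedException on the first inference call, which matches the behavior the description states for non-AAD providers.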

Client Context and Resource Management:

Updated ClientContextCore and CosmosClientContext to manage the lifecycle of the InferenceService, including creation, caching, and disposal. Added methods for invoking semantic reranking and for retrieving or creating the inference service instance. [1] [2] [3] [4] [5] [6]
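
One way to picture the create, cache, and dispose lifecycle described above is the sketch below. It is an illustration under stated assumptions, not the code in ClientContextCore: ClientContextSketch, InferenceServiceSketch, and GetOrCreateInferenceService are hypothetical names, and the lock-based caching is just one possible approach.

```csharp
using System;

// Hypothetical stand-in for the InferenceService; it only tracks disposal.
public sealed class InferenceServiceSketch : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        this.IsDisposed = true;
    }
}

// Hypothetical stand-in for the client context that owns the service.
public sealed class ClientContextSketch : IDisposable
{
    private readonly object gate = new object();
    private InferenceServiceSketch inferenceService;
    private bool disposed;

    // Creates the service on first use and returns the cached instance afterwards.
    public InferenceServiceSketch GetOrCreateInferenceService()
    {
        lock (this.gate)
        {
            if (this.disposed)
            {
                throw new ObjectDisposedException(nameof(ClientContextSketch));
            }

            if (this.inferenceService == null)
            {
                this.inferenceService = new InferenceServiceSketch();
            }

            return this.inferenceService;
        }
    }

    // Disposes the cached service, if one was ever created, exactly once.
    public void Dispose()
    {
        lock (this.gate)
        {
            if (this.disposed)
            {
                return;
            }

            this.disposed = true;
            this.inferenceService?.Dispose();
            this.inferenceService = null;
        }
    }
}
```

Creating the service lazily means clients that never call the rerank API pay no cost for it, and tying disposal to the context keeps the HTTP resources from outliving the client.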

Dependency Updates:

Added a dependency on the Azure.Identity package in the test project to support AAD authentication scenarios.

Example

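// Sample code to demonstrate semantic reranking.
// Assume 'container' is an instance of Cosmos.Container.
// This example queries items from a fitness store with full-text search and then reranks them semantically.

using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using Microsoft.Azure.Cosmos;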

string search_text = "integrated pull-up bar";

string queryString = $@"
    SELECT TOP 15 c.id, c.Name, c.Brand, c.Description
    FROM c
    WHERE FullTextContains(c.Description, ""{search_text}"")
    ORDER BY RANK FullTextScore(c.Description, ""{search_text}"")
    ";

string reranking_context = "most economical with multiple pulley adjustments and ideal for home gyms";

List<string> documents = new List<string>();
FeedIterator<dynamic> resultSetIterator = container.GetItemQueryIterator<dynamic>(
    new QueryDefinition(queryString),
    requestOptions: new QueryRequestOptions()
    {
        MaxItemCount = 15,
    });

while (resultSetIterator.HasMoreResults)
{
    FeedResponse<dynamic> response = await resultSetIterator.ReadNextAsync();
    foreach (JsonElement item in response)
    {
        documents.Add(item.ToString());
    }
}

Dictionary<string, dynamic> options = new Dictionary<string, dynamic>
{
    { "return_documents", true },
    { "top_k", 10 },
    { "batch_size", 32 },
    { "sort", true }
};

SemanticRerankResult results = await container.SemanticRerankAsync(
    reranking_context,
    documents,
    options);

// get the best resulting document from the query
var bestDocument = results.RerankScores.First().Document;
// or the index of the document in the original list
var bestIndex = results.RerankScores.First().Index;
// or the reranking score
var bestScore = results.RerankScores.First().Score;

// get the latency information from the reranking operation
Dictionary<string, object> latencyInfo = results.Latency;

// get the token usage information from the reranking operation
Dictionary<string, object> tokenUsageInfo = results.TokenUseage;

Type of change

[] New feature (non-breaking change which adds functionality)

Closing issues

To automatically close an issue: closes #IssueNumber

milismsft

Please try to address the potential multiple background tasks related to the Inference object (and proper disposal of those tasks as well) :-)

aayush3011

@NaluTripician LGTM, added the comments that we discussed offline.

Microsoft.Azure.Cosmos/src/Inference/InferenceService.cs

Microsoft.Azure.Cosmos/src/RequestOptions/SemanticRerankRequestOptions.cs

Microsoft.Azure.Cosmos/src/Resource/ClientContextCore.cs

Microsoft.Azure.Cosmos/src/Resource/Container/Container.cs

Microsoft.Azure.Cosmos/src/Resource/Container/ContainerInlineCore.cs

Microsoft.Azure.Cosmos/src/Resource/CosmosClientContext.cs

NaluTripician added 4 commits September 18, 2025 15:57

inital commit

7f732e5

updates

11c81aa

tests

bad23ca

test changes

f396716

NaluTripician self-assigned this Oct 13, 2025

NaluTripician requested review from FabianMeiswinkel, Pilchie, adityasa, khdang, kirankumarkolli, kirillg, neildsh and sboshra as code owners October 13, 2025 17:51

Merge branch 'master' into users/nalutripician/semanticRerank

697d9eb

NaluTripician marked this pull request as draft October 13, 2025 17:52

NaluTripician added 5 commits October 16, 2025 10:34

auth changes

02fc277

test changes

e16e988

Update Microsoft.Azure.Cosmos.sln

3ad012b

fixed auth issues

f2e0e5b

Merge branch 'master' into users/nalutripician/semanticRerank

db15b93

NaluTripician marked this pull request as ready for review October 22, 2025 22:37

milismsft previously approved these changes Oct 22, 2025

milismsft reviewed Oct 22, 2025

Merge branch 'master' into users/nalutripician/semanticRerank

e42274e

aayush3011 requested changes Oct 25, 2025

addresses PR comments

8923888

NaluTripician dismissed milismsft’s stale review via 8923888 October 28, 2025 21:18

NaluTripician requested a review from a team as a code owner October 28, 2025 21:18

NaluTripician added 2 commits October 28, 2025 14:18

Merge branch 'master' into users/nalutripician/semanticRerank

5437698

Merge branch 'master' into users/nalutripician/semanticRerank

4a5afaf

microsoft-github-policy-service[bot] merged 43 commits into master from users/nalutripician/semanticRerank

5 participants
