Semantic Rerank: Adds Semantic Rerank API #5445
Merged: microsoft-github-policy-service merged 43 commits into master from users/nalutripician/semanticRerank on Nov 24, 2025
Commits (43):

| Commit | Message | Author |
|---|---|---|
| 7f732e5 | inital commit | NaluTripician |
| 11c81aa | updates | NaluTripician |
| bad23ca | tests | NaluTripician |
| f396716 | test changes | NaluTripician |
| 697d9eb | Merge branch 'master' into users/nalutripician/semanticRerank | NaluTripician |
| 02fc277 | auth changes | NaluTripician |
| e16e988 | test changes | NaluTripician |
| 3ad012b | Update Microsoft.Azure.Cosmos.sln | NaluTripician |
| f2e0e5b | fixed auth issues | NaluTripician |
| db15b93 | Merge branch 'master' into users/nalutripician/semanticRerank | NaluTripician |
| e42274e | Merge branch 'master' into users/nalutripician/semanticRerank | NaluTripician |
| 8923888 | addresses PR comments | NaluTripician |
| 5437698 | Merge branch 'master' into users/nalutripician/semanticRerank | NaluTripician |
| 4a5afaf | Merge branch 'master' into users/nalutripician/semanticRerank | NaluTripician |
| c05501e | test Fix | NaluTripician |
| 76f9ce1 | Adds Semantic rerank result | NaluTripician |
| 8fba1a1 | fixed typo | NaluTripician |
| e8445e7 | small changes and bugfixes | NaluTripician |
| 57e65a7 | Update SemanticRerankResult.cs | NaluTripician |
| 0bd9579 | Merge branch 'master' into users/nalutripician/semanticRerank | NaluTripician |
| 257847e | Update EndToEndTraceWriterBaselineTests.StreamPointOperationsAsync.xml | NaluTripician |
| d061439 | PR comments | NaluTripician |
| ee91cd4 | Merge branch 'master' into users/nalutripician/semanticRerank | NaluTripician |
| dd1b39c | comments | NaluTripician |
| ea2df60 | Merge branch 'users/nalutripician/semanticRerank' of https://github.c… | NaluTripician |
| 1c40b8d | test fixes for preview | NaluTripician |
| 4638ed4 | Merge branch 'master' into users/nalutripician/semanticRerank | NaluTripician |
| 0b9bf26 | Update EncryptionContainer.cs | NaluTripician |
| 287073a | Update EncryptionContainer.cs | NaluTripician |
| 4ceb261 | Merge branch 'master' into users/nalutripician/semanticRerank | NaluTripician |
| 65c0237 | move encryption impl to right place | NaluTripician |
| 7f2aab4 | Merge branch 'users/nalutripician/semanticRerank' of https://github.c… | NaluTripician |
| 38b8a78 | Update EncryptionContainer.cs | NaluTripician |
| 24723d6 | Update EncryptionContainer.cs | NaluTripician |
| 57af783 | fixed preview ref | NaluTripician |
| 95527ac | Update EncryptionContainer.cs | NaluTripician |
| 7d83b0b | Update EncryptionContainer.cs | NaluTripician |
| 7bb337b | nits | NaluTripician |
| 4366eaa | Update Container.cs | NaluTripician |
| b555e30 | updated example | NaluTripician |
| 9d9f3dc | Merge branch 'master' into users/nalutripician/semanticRerank | NaluTripician |
| c958a5d | nit | NaluTripician |
| 982ea78 | Merge branch 'users/nalutripician/semanticRerank' of https://github.c… | NaluTripician |

209 changes: 209 additions & 0 deletions
Microsoft.Azure.Cosmos/src/Inference/InferenceService.cs
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,209 @@ | ||
| //------------------------------------------------------------ | ||
| // Copyright (c) Microsoft Corporation. All rights reserved. | ||
| //------------------------------------------------------------ | ||
|
|
||
| namespace Microsoft.Azure.Cosmos | ||
| { | ||
| using System; | ||
| using System.Collections.Generic; | ||
| using System.Diagnostics; | ||
| using System.Linq; | ||
| using System.Net.Http; | ||
| using System.Net.Http.Headers; | ||
| using System.Text; | ||
| using System.Threading; | ||
| using System.Threading.Tasks; | ||
| using global::Azure.Core; | ||
| using Microsoft.Azure.Documents; | ||
| using Microsoft.Azure.Documents.Collections; | ||
|
|
||
| /// <summary> | ||
| /// Provides functionality to interact with the Cosmos DB Inference Service for semantic reranking. | ||
| /// </summary> | ||
| internal class InferenceService : IDisposable | ||
| { | ||
| // Base path for the inference service endpoint. | ||
| private const string basePath = "/inference/semanticReranking"; | ||
| // User agent string for inference requests. | ||
| private const string inferenceUserAgent = "cosmos-inference-dotnet"; | ||
| // Default scope for AAD authentication. | ||
| private const string inferenceServiceDefaultScope = "https://dbinference.azure.com/.default"; | ||
| private const int inferenceServiceDefaultMaxConnectionLimit = 50; | ||
|
|
||
| private readonly int inferenceServiceMaxConnectionLimit; | ||
| private readonly string inferenceServiceBaseUrl; | ||
| private readonly Uri inferenceEndpoint; | ||
|
|
||
| private HttpClient httpClient; | ||
| private AuthorizationTokenProvider cosmosAuthorization; | ||
|
|
||
| private bool disposedValue; | ||
|
|
||
| /// <summary> | ||
| /// Initializes a new instance of the <see cref="InferenceService"/> class. | ||
| /// </summary> | ||
| /// <param name="client">The CosmosClient instance.</param> | ||
| /// <exception cref="InvalidOperationException">Thrown if AAD authentication is not used.</exception> | ||
| public InferenceService(CosmosClient client) | ||
| { | ||
| this.inferenceServiceBaseUrl = ConfigurationManager.GetEnvironmentVariable<string>("AZURE_COSMOS_SEMANTIC_RERANKER_INFERENCE_ENDPOINT", null); | ||
|
|
||
| if (string.IsNullOrEmpty(this.inferenceServiceBaseUrl)) | ||
| { | ||
| throw new ArgumentNullException(paramName: "AZURE_COSMOS_SEMANTIC_RERANKER_INFERENCE_ENDPOINT", message: "Set environment variable AZURE_COSMOS_SEMANTIC_RERANKER_INFERENCE_ENDPOINT to use inference service"); | ||
| } | ||
|
|
||
| this.inferenceServiceMaxConnectionLimit = ConfigurationManager.GetEnvironmentVariable<int?>( | ||
| "AZURE_COSMOS_SEMANTIC_RERANKER_INFERENCE_SERVICE_MAX_CONNECTION_LIMIT", | ||
| inferenceServiceDefaultMaxConnectionLimit) ?? inferenceServiceDefaultMaxConnectionLimit; | ||
|
|
||
| // Create and configure HttpClient for inference requests. | ||
| HttpMessageHandler httpMessageHandler = CosmosHttpClientCore.CreateHttpClientHandler( | ||
| gatewayModeMaxConnectionLimit: this.inferenceServiceMaxConnectionLimit, | ||
| webProxy: null, | ||
| serverCertificateCustomValidationCallback: client.DocumentClient.ConnectionPolicy.ServerCertificateCustomValidationCallback); | ||
|
|
||
| this.httpClient = new HttpClient(httpMessageHandler); | ||
|
|
||
| this.CreateClientHelper(this.httpClient); | ||
|
|
||
| // Construct the inference service endpoint URI. | ||
| this.inferenceEndpoint = new Uri($"{this.inferenceServiceBaseUrl.TrimEnd('/')}{basePath}"); | ||
|
|
||
| // Ensure AAD authentication is used. | ||
| if (client.DocumentClient.cosmosAuthorization.GetType() != typeof(AuthorizationTokenProviderTokenCredential)) | ||
| { | ||
| throw new InvalidOperationException("InferenceService only supports AAD authentication."); | ||
| } | ||
|
|
||
| // Set up token credential for authorization. | ||
| // This is done to ensure the correct scope, which is different than the scope of the client, is used for the inference service. | ||
| AuthorizationTokenProviderTokenCredential defaultOperationTokenProvider = client.DocumentClient.cosmosAuthorization as AuthorizationTokenProviderTokenCredential; | ||
| TokenCredential tokenCredential = defaultOperationTokenProvider.tokenCredential; | ||
|
|
||
| this.cosmosAuthorization = new AuthorizationTokenProviderTokenCredential( | ||
| tokenCredential: tokenCredential, | ||
| accountEndpoint: new Uri(inferenceServiceDefaultScope), | ||
| backgroundTokenCredentialRefreshInterval: client.ClientOptions?.TokenCredentialBackgroundRefreshInterval); | ||
| } | ||
|
|
||
| /// <summary> | ||
| /// Sends a semantic rerank request to the inference service. | ||
| /// </summary> | ||
| /// <param name="rerankContext">The context/query for reranking.</param> | ||
| /// <param name="documents">The documents to be reranked.</param> | ||
| /// <param name="options">Optional additional options for the request.</param> | ||
| /// <param name="cancellationToken">Cancellation token.</param> | ||
| /// <returns>A <see cref="SemanticRerankResult"/> containing the reranked results.</returns> | ||
| public async Task<SemanticRerankResult> SemanticRerankAsync( | ||
| string rerankContext, | ||
| IEnumerable<string> documents, | ||
| IDictionary<string, object> options = null, | ||
| CancellationToken cancellationToken = default) | ||
| { | ||
| // Prepare HTTP request for semantic reranking. | ||
| HttpRequestMessage message = new HttpRequestMessage(HttpMethod.Post, this.inferenceEndpoint); | ||
| INameValueCollection additionalHeaders = new RequestNameValueCollection(); | ||
| await this.cosmosAuthorization.AddInferenceAuthorizationHeaderAsync( | ||
| headersCollection: additionalHeaders, | ||
| this.inferenceEndpoint, | ||
| HttpConstants.HttpMethods.Post, | ||
| AuthorizationTokenType.AadToken); | ||
| additionalHeaders.Add(HttpConstants.HttpHeaders.UserAgent, inferenceUserAgent); | ||
|
|
||
| // Add all headers to the HTTP request. | ||
| foreach (string key in additionalHeaders.AllKeys()) | ||
| { | ||
| message.Headers.Add(key, additionalHeaders[key]); | ||
| } | ||
|
|
||
| // Build the request payload. | ||
| Dictionary<string, object> body = this.AddSemanticRerankPayload(rerankContext, documents, options); | ||
|
|
||
| message.Content = new StringContent( | ||
| Newtonsoft.Json.JsonConvert.SerializeObject(body), | ||
| Encoding.UTF8, | ||
| RuntimeConstants.MediaTypes.Json); | ||
|
|
||
| // Send the request and ensure success. | ||
| HttpResponseMessage responseMessage = await this.httpClient.SendAsync(message, cancellationToken); | ||
| responseMessage.EnsureSuccessStatusCode(); | ||
|
|
||
| // Deserialize and return the response content as a SemanticRerankResult. | ||
| return await SemanticRerankResult.DeserializeSemanticRerankResultAsync(responseMessage); | ||
| } | ||
|
|
||
| /// <summary> | ||
| /// Configures the provided HttpClient with default headers and settings for inference requests. | ||
| /// </summary> | ||
| /// <param name="httpClient">The HttpClient to configure.</param> | ||
| private void CreateClientHelper(HttpClient httpClient) | ||
| { | ||
| httpClient.Timeout = TimeSpan.FromSeconds(120); | ||
| httpClient.DefaultRequestHeaders.CacheControl = new CacheControlHeaderValue { NoCache = true }; | ||
|
|
||
| // Set requested API version header for version enforcement. | ||
| httpClient.DefaultRequestHeaders.Add(HttpConstants.HttpHeaders.Version, | ||
| HttpConstants.Versions.CurrentVersion); | ||
|
|
||
| httpClient.DefaultRequestHeaders.Add(HttpConstants.HttpHeaders.Accept, RuntimeConstants.MediaTypes.Json); | ||
| } | ||
|
|
||
| /// <summary> | ||
| /// Constructs the payload for the semantic rerank request. | ||
| /// </summary> | ||
| /// <param name="rerankContext">The context/query for reranking.</param> | ||
| /// <param name="documents">The documents to be reranked.</param> | ||
| /// <param name="options">Optional additional options.</param> | ||
| /// <returns>A dictionary representing the request payload.</returns> | ||
| private Dictionary<string, object> AddSemanticRerankPayload(string rerankContext, IEnumerable<string> documents, IDictionary<string, object> options) | ||
| { | ||
| Dictionary<string, object> payload = new Dictionary<string, object> | ||
| { | ||
| { "query", rerankContext }, | ||
| { "documents", documents.ToArray() } | ||
| }; | ||
|
|
||
| if (options == null) | ||
| { | ||
| return payload; | ||
| } | ||
|
|
||
| // Add any additional options to the payload. | ||
| foreach (string option in options.Keys) | ||
| { | ||
| payload.Add(option, options[option]); | ||
| } | ||
|
|
||
| return payload; | ||
| } | ||
|
|
||
| /// <summary> | ||
| /// Disposes managed resources used by the service. | ||
| /// </summary> | ||
| /// <param name="disposing">Indicates if called from Dispose.</param> | ||
| protected void Dispose(bool disposing) | ||
| { | ||
| if (!this.disposedValue) | ||
| { | ||
| if (disposing) | ||
| { | ||
| this.httpClient.Dispose(); | ||
| this.cosmosAuthorization.Dispose(); | ||
| this.httpClient = null; | ||
| this.cosmosAuthorization = null; | ||
| } | ||
|
|
||
| this.disposedValue = true; | ||
| } | ||
| } | ||
|
|
||
| /// <summary> | ||
| /// Disposes the service and its resources. | ||
| /// </summary> | ||
| public void Dispose() | ||
| { | ||
| this.Dispose(true); | ||
| } | ||
| } | ||
| } | ||
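
The payload construction in `AddSemanticRerankPayload` can be looked at in isolation. The following is a minimal sketch under stated assumptions, not the SDK's code: `RerankPayloadSketch` and `Build` are hypothetical names, and the sketch only reproduces the merge behaviour visible in the diff. It shows that caller-supplied options become top-level keys next to `query` and `documents`, and that `Dictionary.Add` throws `ArgumentException` when an option reuses one of those keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-alone helper that mirrors the shape of AddSemanticRerankPayload.
public static class RerankPayloadSketch
{
    public static Dictionary<string, object> Build(
        string rerankContext,
        IEnumerable<string> documents,
        IDictionary<string, object> options = null)
    {
        Dictionary<string, object> payload = new Dictionary<string, object>
        {
            { "query", rerankContext },
            { "documents", documents.ToArray() }
        };

        if (options != null)
        {
            foreach (KeyValuePair<string, object> option in options)
            {
                // Add (unlike the indexer) throws ArgumentException on a duplicate key,
                // so an option named "query" or "documents" is rejected
                // instead of silently overwriting the required fields.
                payload.Add(option.Key, option.Value);
            }
        }

        return payload;
    }
}
```

Calling `Build` with one extra option yields a three-key dictionary, while passing an option whose key is `query` fails with `ArgumentException`. Callers of the real API may want to validate option names up front if that exception type is not the one they expect to surface.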

New file defining `RerankScore` (46 lines added):

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| //------------------------------------------------------------ | ||
| // Copyright (c) Microsoft Corporation. All rights reserved. | ||
| //------------------------------------------------------------ | ||
|
|
||
| namespace Microsoft.Azure.Cosmos | ||
| { | ||
| /// <summary> | ||
| /// Represents the score assigned to a document after a reranking operation. | ||
| /// </summary> | ||
| #if PREVIEW | ||
| public | ||
| #else | ||
| internal | ||
| #endif | ||
|
|
||
| class RerankScore | ||
| { | ||
| /// <summary> | ||
| /// Gets the document content or identifier that was reranked. | ||
| /// </summary> | ||
| public object Document { get; } | ||
|
|
||
| /// <summary> | ||
| /// Gets the score assigned to the document after reranking. | ||
| /// </summary> | ||
| public double Score { get; } | ||
|
|
||
| /// <summary> | ||
| /// Gets the original index or position of the document before reranking. | ||
| /// </summary> | ||
| public int Index { get; } | ||
|
|
||
| /// <summary> | ||
| /// Initializes a new instance of the <see cref="RerankScore"/> class. | ||
| /// </summary> | ||
| /// <param name="document">The document content or identifier.</param> | ||
| /// <param name="score">The reranked score for the document.</param> | ||
| /// <param name="index">The original index of the document.</param> | ||
| public RerankScore(object document, double score, int index) | ||
| { | ||
| this.Document = document; | ||
| this.Score = score; | ||
| this.Index = index; | ||
| } | ||
| } | ||
| } | ||
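
Because `RerankScore` carries both a score and the document's original index, a caller can map the reranked order back onto its own collection. The sketch below is illustrative only: `ScoredDocument` is a hypothetical stand-in with the same three members as `RerankScore` so that the example compiles without the SDK, `RerankOrderingSketch` and `OrderByScore` are invented names, and it assumes a higher score means a more relevant document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in with the same three members as RerankScore,
// so the sketch compiles without the SDK.
public sealed class ScoredDocument
{
    public ScoredDocument(object document, double score, int index)
    {
        this.Document = document;
        this.Score = score;
        this.Index = index;
    }

    public object Document { get; }

    public double Score { get; }

    public int Index { get; }
}

public static class RerankOrderingSketch
{
    // Returns the caller's original items ordered by descending score.
    // Assumption: a higher score means a more relevant document.
    public static List<T> OrderByScore<T>(
        IReadOnlyList<T> originalItems,
        IEnumerable<ScoredDocument> scores)
    {
        return scores
            .OrderByDescending(s => s.Score)
            .Select(s => originalItems[s.Index])
            .ToList();
    }
}
```

Using `Index` rather than `Document` for the lookup avoids depending on how the service echoes document content back, which the diff does not specify.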