[Preview] Semantic Reranking: Adds response body in semantic reranking error responses#5634
Merged
kirankumarkolli merged 2 commits into master on Feb 24, 2026
Conversation
When the inference service returns a non-success status code (e.g. 400 Bad Request), the response body containing error details was being discarded by EnsureSuccessStatusCode(). Now throws a CosmosException that includes the full response body, making it easier for users to diagnose failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
kirankumarkolli approved these changes Feb 24, 2026
This was referenced Mar 17, 2026
microsoft-github-policy-service Bot pushed a commit that referenced this pull request Mar 20, 2026
## Version Changes

| Property | Old | New |
|---|---|---|
| `ClientOfficialVersion` | 3.57.0 | **3.58.0** |
| `ClientPreviewVersion` | 3.58.0 | **3.59.0** |
| `ClientPreviewSuffixVersion` | preview.0 | **preview.0** |

## Changelog

### 3.59.0-preview.0 (Preview)

#### Added

- [#5502](#5502) VectorIndex Policy: Adds Support for QuantizerType in IndexingPolicy
- [#5634](#5634) Semantic Reranking: Adds response body in semantic reranking error responses
- [#5685](#5685) Read Consistency Strategy: Adds Read Consistency Strategy option for read requests

### 3.58.0 (GA)

#### Added

- [#5447](#5447) Per Partition Automatic Failover: Adds Hub Region Processing Only While Routing Requests Failed with 404/1002 for single master accounts
- [#5551](#5551) HPK: Adds internal CosmosClientOptions flag UseLengthAwareRangeComparer for length aware range comparer rollout
- [#5582](#5582) Query: Adds ability to choose global vs local/focused statistics for FullTextScore
- [5610](#5610) Refactors N-Region Synchronous Commit feature to use IServiceConfigurationReaderVNext interface.
- [#5693](#5693) ThinClient Integration: Adds Enable Multiple Http2 connection on SocketsHttpHandler
- [#5614](#5614) ThinClient Integration: Adds support for QueryPlan in thinclient mode

#### Fixed

- [#5597](#5597) CosmosClient: Fixes ObjectDisposedException message when client is disposed during request
- [#5613](#5613) CrossRegionHedgingAvailabilityStrategy: Fixes ArgumentNullException race condition in hedging cancellation
- [#5650](#5650) Batch: Fixes null ErrorMessage when promoting status from MultiStatus response
- [#5651](#5651) Serializer: Fixes unsafe stream cast in FromStream<T>
- [#5697](#5697) ResourceThrottleRetryPolicy: Fixes cumulativeRetryDelay tracking when x-ms-retry-after-ms header is absent

### API Contract Diff (GA)

```diff
diff --git "a/Microsoft.Azure.Cosmos\\contracts\\API_3.57.0.txt" "b/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0.txt"
index a1fa19e..1b74a69 100644
--- "a/Microsoft.Azure.Cosmos\\contracts\\API_3.57.0.txt"
+++ "b/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0.txt"
@@ -639,6 +639,11 @@ namespace Microsoft.Azure.Cosmos
         public string DefaultLanguage { get; set; }
         public Collection<FullTextPath> FullTextPaths { get; set; }
     }
+    public enum FullTextScoreScope
+    {
+        Global = 0,
+        Local = 1,
+    }
     public sealed class GeospatialConfig
     {
         public GeospatialConfig();
@@ -869,6 +874,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<bool> EnableLowPrecisionOrderBy { get; set; }
         public bool EnableOptimisticDirectExecution { get; set; }
         public Nullable<bool> EnableScanInQuery { get; set; }
+        public FullTextScoreScope FullTextScoreScope { get; set; }
         public Nullable<int> MaxBufferedItemCount { get; set; }
         public Nullable<int> MaxConcurrency { get; set; }
         public Nullable<int> MaxItemCount { get; set; }
```

### API Contract Diff (Preview)

```diff
diff --git "a/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0-preview.0.txt" "b/Microsoft.Azure.Cosmos\\contracts\\API_3.59.0-preview.0.txt"
index af57dd8..1ae52c0 100644
--- "a/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0-preview.0.txt"
+++ "b/Microsoft.Azure.Cosmos\\contracts\\API_3.59.0-preview.0.txt"
@@ -128,6 +128,7 @@ namespace Microsoft.Azure.Cosmos
         public new string IfMatchEtag { get; set; }
         public new string IfNoneMatchEtag { get; set; }
         public Nullable<int> PageSizeHint { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
     }
     public abstract class ChangeFeedStartFrom
     {
@@ -414,6 +415,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<TimeSpan> OpenTcpConnectionTimeout { get; set; }
         public Nullable<PortReuseMode> PortReuseMode { get; set; }
         public Nullable<PriorityLevel> PriorityLevel { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
         public TimeSpan RequestTimeout { get; set; }
         public CosmosSerializer Serializer { get; set; }
         public CosmosSerializationOptions SerializerOptions { get; set; }
@@ -746,6 +748,11 @@ namespace Microsoft.Azure.Cosmos
         public string DefaultLanguage { get; set; }
         public Collection<FullTextPath> FullTextPaths { get; set; }
     }
+    public enum FullTextScoreScope
+    {
+        Global = 0,
+        Local = 1,
+    }
     public sealed class GeospatialConfig
     {
         public GeospatialConfig();
@@ -825,6 +832,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<IndexingDirective> IndexingDirective { get; set; }
         public IEnumerable<string> PostTriggers { get; set; }
         public IEnumerable<string> PreTriggers { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
         public string SessionToken { get; set; }
     }
     public class ItemResponse<T> : Response<T>
@@ -972,6 +980,11 @@ namespace Microsoft.Azure.Cosmos
         High = 1,
         Low = 2,
     }
+    public enum QuantizerType
+    {
+        Product = 0,
+        Spherical = 1,
+    }
     public class QueryDefinition
     {
         public QueryDefinition(string query);
@@ -988,6 +1001,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<bool> EnableLowPrecisionOrderBy { get; set; }
         public bool EnableOptimisticDirectExecution { get; set; }
         public Nullable<bool> EnableScanInQuery { get; set; }
+        public FullTextScoreScope FullTextScoreScope { get; set; }
         public Nullable<int> MaxBufferedItemCount { get; set; }
         public Nullable<int> MaxConcurrency { get; set; }
         public Nullable<int> MaxItemCount { get; set; }
@@ -995,6 +1009,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<bool> PopulateIndexMetrics { get; set; }
         public Nullable<bool> PopulateQueryAdvice { get; set; }
         public QueryTextMode QueryTextMode { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
         public Nullable<int> ResponseContinuationTokenLimitInKb { get; set; }
         public string SessionToken { get; set; }
     }
@@ -1004,10 +1019,18 @@ namespace Microsoft.Azure.Cosmos
         None = 0,
         ParameterizedOnly = 1,
     }
+    public enum ReadConsistencyStrategy
+    {
+        Eventual = 1,
+        GlobalStrong = 4,
+        LatestCommitted = 3,
+        Session = 2,
+    }
     public class ReadManyRequestOptions : RequestOptions
     {
         public ReadManyRequestOptions();
         public Nullable<ConsistencyLevel> ConsistencyLevel { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
         public string SessionToken { get; set; }
     }
     public static class Regions
@@ -1383,6 +1406,7 @@ namespace Microsoft.Azure.Cosmos
         public int IndexingSearchListSize { get; set; }
         public string Path { get; set; }
         public int QuantizationByteSize { get; set; }
+        public Nullable<QuantizerType> QuantizerType { get; set; }
         public VectorIndexType Type { get; set; }
         public string[] VectorIndexShardKey { get; set; }
     }
@@ -1482,6 +1506,7 @@ namespace Microsoft.Azure.Cosmos.Fluent
         public CosmosClientBuilder WithHttpClientFactory(Func<HttpClient> httpClientFactory);
         public CosmosClientBuilder WithLimitToEndpoint(bool limitToEndpoint);
         public CosmosClientBuilder WithPriorityLevel(PriorityLevel priorityLevel);
+        public CosmosClientBuilder WithReadConsistencyStrategy(ReadConsistencyStrategy readConsistencyStrategy);
         public CosmosClientBuilder WithRequestTimeout(TimeSpan requestTimeout);
         public CosmosClientBuilder WithSerializerOptions(CosmosSerializationOptions cosmosSerializerOptions);
         public CosmosClientBuilder WithSystemTextJsonSerializerOptions(JsonSerializerOptions serializerOptions);
@@ -1540,6 +1565,7 @@ namespace Microsoft.Azure.Cosmos.Fluent
         public VectorIndexDefinition<T> Path(string path, VectorIndexType indexType);
         public VectorIndexDefinition<T> WithIndexingSearchListSize(int indexingSearchListSize);
         public VectorIndexDefinition<T> WithQuantizationByteSize(int quantizationByteSize);
+        public VectorIndexDefinition<T> WithQuantizerType(QuantizerType quantizerType);
         public VectorIndexDefinition<T> WithVectorIndexShardKey(string[] vectorIndexShardKey);
     }
 }
```

## Checklist

- [ ] Changelog review by team
- [ ] Email `azurecosmossdkdotnet@microsoft.com` for preview API review
- [ ] API contract diff approval (Kiran & Kirill)
- [ ] Kiran sign-off (required)
- [ ] Determine if "Recommended Version" needs further updating

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
When the semantic reranking inference API returns a non-success HTTP status code (e.g., 400 Bad Request), the SDK was calling HttpResponseMessage.EnsureSuccessStatusCode(), which throws an HttpRequestException containing only the status code. The response body with its detailed error information was discarded, making it difficult for users to diagnose why a reranking request failed.
This PR replaces that call with a manual check that reads the response body and throws a CosmosException containing the full error details.
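The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from InferenceService.cs: `InferenceErrorException` is a hypothetical stand-in for `CosmosException` so the snippet compiles without the SDK package, and `EnsureSuccessWithBodyAsync` is a hypothetical helper name. The message format only loosely mirrors the "After" output shown in this PR.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical stand-in for CosmosException (assumption: used here only so the
// sketch builds without referencing Microsoft.Azure.Cosmos).
public sealed class InferenceErrorException : Exception
{
    public InferenceErrorException(HttpStatusCode statusCode, string responseBody)
        : base($"Response status code does not indicate success: {statusCode} ({(int)statusCode}); Reason: ({responseBody});")
    {
        this.StatusCode = statusCode;
        this.ResponseBody = responseBody;
    }

    public HttpStatusCode StatusCode { get; }

    public string ResponseBody { get; }
}

public static class InferenceResponseCheck
{
    // Hypothetical helper: a manual replacement for EnsureSuccessStatusCode()
    // that reads the body before throwing, so the error details are preserved.
    public static async Task EnsureSuccessWithBodyAsync(HttpResponseMessage response)
    {
        if (response.IsSuccessStatusCode)
        {
            return;
        }

        // Read the body first; EnsureSuccessStatusCode() would have thrown
        // with only the status code and reason phrase.
        string body = response.Content == null
            ? string.Empty
            : await response.Content.ReadAsStringAsync();

        throw new InferenceErrorException(response.StatusCode, body);
    }
}
```

A caller would await `EnsureSuccessWithBodyAsync(response)` where it previously called `response.EnsureSuccessStatusCode()`, then inspect `StatusCode` and `ResponseBody` on the caught exception.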
Changes
InferenceService.cs
InferenceServiceTests.cs (new)
Three unit tests:
Before vs After
Before: A 400 response produces:
System.Net.Http.HttpRequestException: Response status code does not indicate success: 400 (Bad Request).
After: The same 400 response now produces:
Microsoft.Azure.Cosmos.CosmosException: Response status code does not indicate success: BadRequest (400); Substatus: 0; ActivityId: ; Reason: ({error details from response body});
Testing
Note
Semantic reranking is a preview API gated behind the PREVIEW compile constant.