Query: Adds ability to choose global vs local/focused statistics for FullTextScore #5582
Merged
microsoft-github-policy-service[bot] merged 8 commits into master on Feb 6, 2026
adityasa (Contributor) reviewed Jan 30, 2026
Is it possible to add some emulator-based e2e tests?
In both cases, the query should honor FullTextScore Scope (local vs. global).
adityasa reviewed Feb 3, 2026
sc978345 previously approved these changes Feb 4, 2026
sboshra reviewed Feb 4, 2026
…stics for full text search
…s for FulTextScoreScore.Local
07a86a1 to 44ba1c6
microsoft-github-policy-service[bot] pushed a commit that referenced this pull request on Mar 20, 2026
## Version Changes

| Property | Old | New |
|---|---|---|
| `ClientOfficialVersion` | 3.57.0 | **3.58.0** |
| `ClientPreviewVersion` | 3.58.0 | **3.59.0** |
| `ClientPreviewSuffixVersion` | preview.0 | **preview.0** |

## Changelog

### 3.59.0-preview.0 (Preview)

#### Added
- [#5502](#5502) VectorIndex Policy: Adds Support for QuantizerType in IndexingPolicy
- [#5634](#5634) Semantic Reranking: Adds response body in semantic reranking error responses
- [#5685](#5685) Read Consistency Strategy: Adds Read Consistency Strategy option for read requests

### 3.58.0 (GA)

#### Added
- [#5447](#5447) Per Partition Automatic Failover: Adds Hub Region Processing Only While Routing Requests Failed with 404/1002 for single master accounts
- [#5551](#5551) HPK: Adds internal CosmosClientOptions flag UseLengthAwareRangeComparer for length aware range comparer rollout
- [#5582](#5582) Query: Adds ability to choose global vs local/focused statistics for FullTextScore
- [5610](#5610) Refactors N-Region Synchronous Commit feature to use IServiceConfigurationReaderVNext interface.
- [#5693](#5693) ThinClient Integration: Adds Enable Multiple Http2 connection on SocketsHttpHandler
- [#5614](#5614) ThinClient Integration: Adds support for QueryPlan in thinclient mode

#### Fixed
- [#5597](#5597) CosmosClient: Fixes ObjectDisposedException message when client is disposed during request
- [#5613](#5613) CrossRegionHedgingAvailabilityStrategy: Fixes ArgumentNullException race condition in hedging cancellation
- [#5650](#5650) Batch: Fixes null ErrorMessage when promoting status from MultiStatus response
- [#5651](#5651) Serializer: Fixes unsafe stream cast in FromStream<T>
- [#5697](#5697) ResourceThrottleRetryPolicy: Fixes cumulativeRetryDelay tracking when x-ms-retry-after-ms header is absent

### API Contract Diff (GA)

```diff
diff --git "a/Microsoft.Azure.Cosmos\\contracts\\API_3.57.0.txt" "b/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0.txt"
index a1fa19e..1b74a69 100644
--- "a/Microsoft.Azure.Cosmos\\contracts\\API_3.57.0.txt"
+++ "b/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0.txt"
@@ -639,6 +639,11 @@ namespace Microsoft.Azure.Cosmos
         public string DefaultLanguage { get; set; }
         public Collection<FullTextPath> FullTextPaths { get; set; }
     }
+    public enum FullTextScoreScope
+    {
+        Global = 0,
+        Local = 1,
+    }
     public sealed class GeospatialConfig
     {
         public GeospatialConfig();
@@ -869,6 +874,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<bool> EnableLowPrecisionOrderBy { get; set; }
         public bool EnableOptimisticDirectExecution { get; set; }
         public Nullable<bool> EnableScanInQuery { get; set; }
+        public FullTextScoreScope FullTextScoreScope { get; set; }
         public Nullable<int> MaxBufferedItemCount { get; set; }
         public Nullable<int> MaxConcurrency { get; set; }
         public Nullable<int> MaxItemCount { get; set; }
```

### API Contract Diff (Preview)

```diff
diff --git "a/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0-preview.0.txt" "b/Microsoft.Azure.Cosmos\\contracts\\API_3.59.0-preview.0.txt"
index af57dd8..1ae52c0 100644
--- "a/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0-preview.0.txt"
+++ "b/Microsoft.Azure.Cosmos\\contracts\\API_3.59.0-preview.0.txt"
@@ -128,6 +128,7 @@ namespace Microsoft.Azure.Cosmos
         public new string IfMatchEtag { get; set; }
         public new string IfNoneMatchEtag { get; set; }
         public Nullable<int> PageSizeHint { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
     }
     public abstract class ChangeFeedStartFrom
     {
@@ -414,6 +415,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<TimeSpan> OpenTcpConnectionTimeout { get; set; }
         public Nullable<PortReuseMode> PortReuseMode { get; set; }
         public Nullable<PriorityLevel> PriorityLevel { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
         public TimeSpan RequestTimeout { get; set; }
         public CosmosSerializer Serializer { get; set; }
         public CosmosSerializationOptions SerializerOptions { get; set; }
@@ -746,6 +748,11 @@ namespace Microsoft.Azure.Cosmos
         public string DefaultLanguage { get; set; }
         public Collection<FullTextPath> FullTextPaths { get; set; }
     }
+    public enum FullTextScoreScope
+    {
+        Global = 0,
+        Local = 1,
+    }
     public sealed class GeospatialConfig
     {
         public GeospatialConfig();
@@ -825,6 +832,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<IndexingDirective> IndexingDirective { get; set; }
         public IEnumerable<string> PostTriggers { get; set; }
         public IEnumerable<string> PreTriggers { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
         public string SessionToken { get; set; }
     }
     public class ItemResponse<T> : Response<T>
@@ -972,6 +980,11 @@ namespace Microsoft.Azure.Cosmos
         High = 1,
         Low = 2,
     }
+    public enum QuantizerType
+    {
+        Product = 0,
+        Spherical = 1,
+    }
     public class QueryDefinition
     {
         public QueryDefinition(string query);
@@ -988,6 +1001,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<bool> EnableLowPrecisionOrderBy { get; set; }
         public bool EnableOptimisticDirectExecution { get; set; }
         public Nullable<bool> EnableScanInQuery { get; set; }
+        public FullTextScoreScope FullTextScoreScope { get; set; }
         public Nullable<int> MaxBufferedItemCount { get; set; }
         public Nullable<int> MaxConcurrency { get; set; }
         public Nullable<int> MaxItemCount { get; set; }
@@ -995,6 +1009,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<bool> PopulateIndexMetrics { get; set; }
         public Nullable<bool> PopulateQueryAdvice { get; set; }
         public QueryTextMode QueryTextMode { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
         public Nullable<int> ResponseContinuationTokenLimitInKb { get; set; }
         public string SessionToken { get; set; }
     }
@@ -1004,10 +1019,18 @@ namespace Microsoft.Azure.Cosmos
         None = 0,
         ParameterizedOnly = 1,
     }
+    public enum ReadConsistencyStrategy
+    {
+        Eventual = 1,
+        GlobalStrong = 4,
+        LatestCommitted = 3,
+        Session = 2,
+    }
     public class ReadManyRequestOptions : RequestOptions
     {
         public ReadManyRequestOptions();
         public Nullable<ConsistencyLevel> ConsistencyLevel { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
         public string SessionToken { get; set; }
     }
     public static class Regions
@@ -1383,6 +1406,7 @@ namespace Microsoft.Azure.Cosmos
         public int IndexingSearchListSize { get; set; }
         public string Path { get; set; }
         public int QuantizationByteSize { get; set; }
+        public Nullable<QuantizerType> QuantizerType { get; set; }
         public VectorIndexType Type { get; set; }
         public string[] VectorIndexShardKey { get; set; }
     }
@@ -1482,6 +1506,7 @@ namespace Microsoft.Azure.Cosmos.Fluent
         public CosmosClientBuilder WithHttpClientFactory(Func<HttpClient> httpClientFactory);
         public CosmosClientBuilder WithLimitToEndpoint(bool limitToEndpoint);
         public CosmosClientBuilder WithPriorityLevel(PriorityLevel priorityLevel);
+        public CosmosClientBuilder WithReadConsistencyStrategy(ReadConsistencyStrategy readConsistencyStrategy);
         public CosmosClientBuilder WithRequestTimeout(TimeSpan requestTimeout);
         public CosmosClientBuilder WithSerializerOptions(CosmosSerializationOptions cosmosSerializerOptions);
         public CosmosClientBuilder WithSystemTextJsonSerializerOptions(JsonSerializerOptions serializerOptions);
@@ -1540,6 +1565,7 @@ namespace Microsoft.Azure.Cosmos.Fluent
         public VectorIndexDefinition<T> Path(string path, VectorIndexType indexType);
         public VectorIndexDefinition<T> WithIndexingSearchListSize(int indexingSearchListSize);
         public VectorIndexDefinition<T> WithQuantizationByteSize(int quantizationByteSize);
+        public VectorIndexDefinition<T> WithQuantizerType(QuantizerType quantizerType);
         public VectorIndexDefinition<T> WithVectorIndexShardKey(string[] vectorIndexShardKey);
     }
 }
```

## Checklist
- [ ] Changelog review by team
- [ ] Email `azurecosmossdkdotnet@microsoft.com` for preview API review
- [ ] API contract diff approval (Kiran & Kirill)
- [ ] Kiran sign-off (required)
- [ ] Determine if "Recommended Version" needs further updating

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This was referenced Mar 23, 2026
Bump Microsoft.Azure.Cosmos from 3.57.1 to 3.58.0
azureossd/general-database-connectivity-samples#27
Enabling users to choose global vs local/focused statistics for FullTextScore
Why?
Cosmos DB’s implementation of FullTextScore computes BM25 statistics (term frequency, inverse document frequency, and document length) across all documents in the container, including all physical and logical partitions.
While this provides a valid and comprehensive representation of statistics for the entire dataset, it introduces challenges for several common use cases.
In multi-tenant scenarios, it is often necessary to isolate queries to data belonging to a specific tenant, typically defined by the partition key or a component of a hierarchical partition key. This enables scoring to reflect statistics that are accurate for that tenant’s dataset, rather than for the entire container. For customers such as Veeam and Sitecore, which operate large multi-tenant containers, this is not just an optimization but a requirement. Their tenants often operate in very different domains, which can significantly change the distribution and importance of keywords and phrases. Using global statistics in these cases leads to distorted relevance rankings.
In other scenarios involving hundreds or thousands of physical partitions, computing statistics across the entire container can become both time-consuming and expensive. Customers may prefer to use statistics derived from only a subset of partitions to improve performance and reduce RU consumption. Indeed, there is precedent for this: Azure AI Search defaults to this “local” method.
What?
We propose extending the flexibility of BM25 scoring in Cosmos DB so that developers can choose between a global FullTextScore (the existing behavior) and a scoped FullTextScore (statistics computed only over the partition key(s) used in the query). The key aspects:
- Global BM25: FullTextScore retains its existing behavior and computes BM25 statistics, such as IDF and average document length, across all documents in the container, regardless of any partition key filters in the query.
- Scoped BM25: when a query includes a partition key filter or explicitly requests scoped scoring, the engine computes these statistics only over the subset of documents within the specified partition key values. Query results are still returned only from the filtered partitions, and the resulting scores and ranking reflect relevance within that partition-specific slice of data.
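To make the difference between the two scopes concrete, here is a minimal, self-contained sketch that scores one document with a textbook BM25 formula, once with statistics from the whole container and once with statistics from a single tenant. It is not the Cosmos DB engine's implementation: the `Doc` record, the sample corpus, and the `k1`/`b` defaults are all assumptions made for illustration, and the exact BM25 variant used by the service may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Bm25ScopeSketch
{
    // Hypothetical document shape: a tenant id (standing in for the partition key) plus tokens.
    public record Doc(string Tenant, string[] Tokens);

    // Textbook BM25 for one document. All statistics (document count, document
    // frequency, average length) come from statsCorpus, which is what "scope" controls.
    public static double Score(Doc doc, string[] queryTerms, IReadOnlyList<Doc> statsCorpus,
                               double k1 = 1.2, double b = 0.75)
    {
        int totalDocs = statsCorpus.Count;
        double avgLength = statsCorpus.Average(d => (double)d.Tokens.Length);
        double score = 0.0;
        foreach (string term in queryTerms)
        {
            int docFreq = statsCorpus.Count(d => d.Tokens.Contains(term));
            double idf = Math.Log(1.0 + (totalDocs - docFreq + 0.5) / (docFreq + 0.5));
            int termFreq = doc.Tokens.Count(t => t == term);
            double lengthNorm = k1 * (1.0 - b + b * doc.Tokens.Length / avgLength);
            score += idf * termFreq * (k1 + 1.0) / (termFreq + lengthNorm);
        }
        return score;
    }

    // Made-up data: "backup" appears in every tenant-a document and in no tenant-b document.
    public static List<Doc> SampleCorpus() => new List<Doc>
    {
        new Doc("tenant-a", new[] { "nightly", "backup", "job" }),
        new Doc("tenant-a", new[] { "backup", "restore", "guide" }),
        new Doc("tenant-a", new[] { "cloud", "backup", "policy" }),
        new Doc("tenant-b", new[] { "pasta", "recipe", "ideas" }),
        new Doc("tenant-b", new[] { "bread", "baking", "tips" }),
        new Doc("tenant-b", new[] { "summer", "salad", "recipe" }),
    };

    public static void Main()
    {
        List<Doc> container = SampleCorpus();
        List<Doc> tenantA = container.Where(d => d.Tenant == "tenant-a").ToList();
        string[] query = { "backup" };
        Doc target = tenantA[0];

        // Same document and query; only the corpus used for statistics changes.
        Console.WriteLine($"global scope: {Score(target, query, container):F3}");
        Console.WriteLine($"local scope:  {Score(target, query, tenantA):F3}");
    }
}
```

In this sample, "backup" is rare across the container but present in every tenant-a document, so its IDF (and therefore the score) is lower under tenant-local statistics than under global ones. That shift in term importance is the distortion the multi-tenant scenario above is concerned with.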
How?
The user issues a full-text query and sets a new QueryRequestOptions property called FullTextScoreScope, which can be set to one of two values: Local or Global. The request option is inspected, and the query uses scoped or full statistics accordingly.
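As a rough sketch of how the request option described under "How?" might be used from application code: the `FullTextScoreScope` enum and the `QueryRequestOptions.FullTextScoreScope` property come from the 3.58.0 API contract diff, while the query text, the `c.text` path, the `"tenant-a"` partition key value, and the `Hit` type are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class FullTextScoreScopeSketch
{
    // Hypothetical projection type for the query results.
    private sealed class Hit
    {
        public string id { get; set; }
    }

    // Assumes `container` has a full-text policy and index on the (hypothetical) /text path
    // and is partitioned by tenant.
    public static async Task RunAsync(Container container)
    {
        // Illustrative query only; the PR description does not include the original example.
        QueryDefinition query = new QueryDefinition(
            "SELECT TOP 10 c.id FROM c ORDER BY RANK FullTextScore(c.text, \"backup\")");

        QueryRequestOptions options = new QueryRequestOptions
        {
            // Restrict the query to one tenant's partition (hypothetical value).
            PartitionKey = new PartitionKey("tenant-a"),
            // Ask for BM25 statistics scoped to that partition instead of the whole container.
            FullTextScoreScope = FullTextScoreScope.Local,
        };

        using FeedIterator<Hit> iterator =
            container.GetItemQueryIterator<Hit>(query, requestOptions: options);

        while (iterator.HasMoreResults)
        {
            FeedResponse<Hit> page = await iterator.ReadNextAsync();
            foreach (Hit hit in page)
            {
                Console.WriteLine(hit.id);
            }
        }
    }
}
```

Per the API contract, `Global` has the value 0, so leaving the property unset should keep the existing container-wide behavior; that reading is an inference from the enum definition rather than something the description states outright.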