Skip to content

ResourceThrottleRetryPolicy: Fixes cumulativeRetryDelay tracking when x-ms-retry-after-ms header is absent#5697

Merged
kirankumarkolli merged 6 commits intomasterfrom
users/kirankk/429_NO_RETRY_AFTER
Mar 19, 2026
Merged

ResourceThrottleRetryPolicy: Fixes cumulativeRetryDelay tracking when x-ms-retry-after-ms header is absent#5697
kirankumarkolli merged 6 commits intomasterfrom
users/kirankk/429_NO_RETRY_AFTER

Conversation

@kirankumarkolli
Copy link
Copy Markdown
Member

Description

🤖 This PR was authored by GitHub Copilot

Fixes cumulative retry delay tracking bug in ResourceThrottleRetryPolicy when the x-ms-retry-after-ms response header is absent from HTTP 429 (TooManyRequests) responses.


Issue Summary

Property Value
Area Retry Policy
Header x-ms-retry-after-ms
Severity P2

Root Cause Analysis

Code Path

ResourceThrottleRetryPolicy.cs:94 - ShouldRetryInternalAsync(TimeSpan? retryAfter)
  -> ResourceThrottleRetryPolicy.cs:146 - CheckIfRetryNeeded(retryAfter, out retryDelay)
      -> Line 163: cumulativeRetryDelay updated with retryDelay (= 0) BEFORE fallback
      -> Line 169: retryDelay reassigned to 5s AFTER cumulative guard already passed

Root Cause

When the x-ms-retry-after-ms header is missing from a 429 response, retryDelay stays at TimeSpan.Zero. The cumulativeRetryDelay guard at line 163 updates the cumulative total with this zero value before the 5-second fallback is applied at line 169. This means cumulativeRetryDelay never reflects the actual wait time, allowing the SDK to retry far longer than the configured MaxRetryWaitTimeInSeconds.


Changes Made

Files Modified

File Change
ResourceThrottleRetryPolicy.cs Moved zero-delay fallback before cumulative guard check
ResourceThrottleRetryPolicyTests.cs Added 6 new unit tests for missing/present RetryAfter scenarios

Code Changes

The zero-delay detection and 5-second fallback assignment is moved before the cumulative delay guard, so cumulativeRetryDelay now correctly tracks the actual retry delay.


Generated Output (Before/After)

Before (incorrect):
With maxWaitTime = 12s, maxRetries = 9, missing RetryAfter:

  • Retry 1: cumulativeRetryDelay = 0s, actual wait = 5s - retries
  • Retry 2: cumulativeRetryDelay = 0s, actual wait = 10s - retries
  • Retry 3: cumulativeRetryDelay = 0s, actual wait = 15s - retries (BUG: exceeded 12s)
  • ... continues up to 9 retries = 45s actual wait

After (correct):
With maxWaitTime = 12s, maxRetries = 9, missing RetryAfter:

  • Retry 1: cumulativeRetryDelay = 5s, actual wait = 5s - retries
  • Retry 2: cumulativeRetryDelay = 10s, actual wait = 10s - retries
  • Retry 3: cumulativeRetryDelay = 15s > 12s - stops (correctly enforced)

Testing

  • New unit tests added: ResourceThrottleRetryPolicyTests.cs (6 new tests)
  • Existing tests pass (2 existing tests)
  • Local build succeeds (0 errors, 0 warnings)
  • All 8 tests pass: Passed: 8, Failed: 0

Checklist

  • Code follows repository conventions
  • Changes are minimal and focused
  • No unrelated changes included

Plan Compliance Audit

Checkpoint Status Notes
Local build succeeded Exit code 0, 0 errors
Full unit tests passed 8/8 passed
Root cause documented Code path traced above
Investigation steps included Summary in PR body

Generated by GitHub Copilot CLI Agent

…-after-ms header is absent

Fix cumulative retry delay tracking bug in ResourceThrottleRetryPolicy
when the x-ms-retry-after-ms response header is absent from HTTP 429
responses. The fallback delay (5s) is now applied before the cumulative
guard check, ensuring MaxRetryWaitTimeInSeconds is properly enforced.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
kirankumarkolli and others added 2 commits March 17, 2026 16:38
Removes PresentRetryAfterHeader_RespectsMaxWaitTime,
MissingRetryAfterHeader_RespectsMaxAttemptCount,
NonThrottledResponse_DoesNotRetry, and
MissingRetryAfterHeader_CumulativeDelayTracksActualFallbackDelay
tests as requested by reviewer.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@kirankumarkolli kirankumarkolli marked this pull request as ready for review March 17, 2026 23:41
@kirankumarkolli kirankumarkolli enabled auto-merge (squash) March 17, 2026 23:41
@kirankumarkolli kirankumarkolli changed the title [Internal] Retry: Fixes cumulativeRetryDelay tracking when x-ms-retry-after-ms header is absent ResourceThrottleRetryPolicy: Fixes cumulativeRetryDelay tracking when x-ms-retry-after-ms header is absent Mar 17, 2026
Copy link
Copy Markdown
Member

@kundadebdatta kundadebdatta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@kirankumarkolli
Copy link
Copy Markdown
Member Author

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@kirankumarkolli kirankumarkolli merged commit 73573f2 into master Mar 19, 2026
32 checks passed
@kirankumarkolli kirankumarkolli deleted the users/kirankk/429_NO_RETRY_AFTER branch March 19, 2026 21:54
microsoft-github-policy-service Bot pushed a commit that referenced this pull request Mar 20, 2026
## Version Changes

| Property | Old | New |
|---|---|---|
| `ClientOfficialVersion` | 3.57.0 | **3.58.0** |
| `ClientPreviewVersion` | 3.58.0 | **3.59.0** |
| `ClientPreviewSuffixVersion` | preview.0 | **preview.0** |

## Changelog

### 3.59.0-preview.0 (Preview)

#### Added
- [#5502](#5502)
VectorIndex Policy: Adds Support for QuantizerType in IndexingPolicy
- [#5634](#5634)
Semantic Reranking: Adds response body in semantic reranking error
responses
- [#5685](#5685)
Read Consistency Strategy: Adds Read Consistency Strategy option for
read requests

### 3.58.0 (GA)

#### Added
- [#5447](#5447) Per
Partition Automatic Failover: Adds Hub Region Processing Only While
Routing Requests Failed with 404/1002 for single master accounts
- [#5551](#5551)
HPK: Adds internal CosmosClientOptions flag UseLengthAwareRangeComparer
for length aware range comparer rollout
- [#5582](#5582)
Query: Adds ability to choose global vs local/focused statistics for
FullTextScore
- [5610](#5610)
Refactors N-Region Synchronous Commit feature to use
IServiceConfigurationReaderVNext interface.
- [#5693](#5693)
ThinClient Integration: Adds Enable Multiple Http2 connection on
SocketsHttpHandler
- [#5614](#5614)
ThinClient Integration: Adds support for QueryPlan in thinclient mode

#### Fixed
- [#5597](#5597)
CosmosClient: Fixes ObjectDisposedException message when client is
disposed during request
- [#5613](#5613)
CrossRegionHedgingAvailabilityStrategy: Fixes ArgumentNullException race
condition in hedging cancellation
- [#5650](#5650)
Batch: Fixes null ErrorMessage when promoting status from MultiStatus
response
- [#5651](#5651)
Serializer: Fixes unsafe stream cast in FromStream<T>
- [#5697](#5697)
ResourceThrottleRetryPolicy: Fixes cumulativeRetryDelay tracking when
x-ms-retry-after-ms header is absent

### API Contract Diff (GA)

```diff
diff --git "a/Microsoft.Azure.Cosmos\\contracts\\API_3.57.0.txt" "b/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0.txt"
index a1fa19e..1b74a69 100644
--- "a/Microsoft.Azure.Cosmos\\contracts\\API_3.57.0.txt"
+++ "b/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0.txt"
@@ -639,6 +639,11 @@ namespace Microsoft.Azure.Cosmos
         public string DefaultLanguage { get; set; }
         public Collection<FullTextPath> FullTextPaths { get; set; }
     }
+    public enum FullTextScoreScope
+    {
+        Global = 0,
+        Local = 1,
+    }
     public sealed class GeospatialConfig
     {
         public GeospatialConfig();
@@ -869,6 +874,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<bool> EnableLowPrecisionOrderBy { get; set; }
         public bool EnableOptimisticDirectExecution { get; set; }
         public Nullable<bool> EnableScanInQuery { get; set; }
+        public FullTextScoreScope FullTextScoreScope { get; set; }
         public Nullable<int> MaxBufferedItemCount { get; set; }
         public Nullable<int> MaxConcurrency { get; set; }
         public Nullable<int> MaxItemCount { get; set; }
```

### API Contract Diff (Preview)

```diff
diff --git "a/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0-preview.0.txt" "b/Microsoft.Azure.Cosmos\\contracts\\API_3.59.0-preview.0.txt"
index af57dd8..1ae52c0 100644
--- "a/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0-preview.0.txt"
+++ "b/Microsoft.Azure.Cosmos\\contracts\\API_3.59.0-preview.0.txt"
@@ -128,6 +128,7 @@ namespace Microsoft.Azure.Cosmos
         public new string IfMatchEtag { get; set; }
         public new string IfNoneMatchEtag { get; set; }
         public Nullable<int> PageSizeHint { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
     }
     public abstract class ChangeFeedStartFrom
     {
@@ -414,6 +415,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<TimeSpan> OpenTcpConnectionTimeout { get; set; }
         public Nullable<PortReuseMode> PortReuseMode { get; set; }
         public Nullable<PriorityLevel> PriorityLevel { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
         public TimeSpan RequestTimeout { get; set; }
         public CosmosSerializer Serializer { get; set; }
         public CosmosSerializationOptions SerializerOptions { get; set; }
@@ -746,6 +748,11 @@ namespace Microsoft.Azure.Cosmos
         public string DefaultLanguage { get; set; }
         public Collection<FullTextPath> FullTextPaths { get; set; }
     }
+    public enum FullTextScoreScope
+    {
+        Global = 0,
+        Local = 1,
+    }
     public sealed class GeospatialConfig
     {
         public GeospatialConfig();
@@ -825,6 +832,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<IndexingDirective> IndexingDirective { get; set; }
         public IEnumerable<string> PostTriggers { get; set; }
         public IEnumerable<string> PreTriggers { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
         public string SessionToken { get; set; }
     }
     public class ItemResponse<T> : Response<T>
@@ -972,6 +980,11 @@ namespace Microsoft.Azure.Cosmos
         High = 1,
         Low = 2,
     }
+    public enum QuantizerType
+    {
+        Product = 0,
+        Spherical = 1,
+    }
     public class QueryDefinition
     {
         public QueryDefinition(string query);
@@ -988,6 +1001,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<bool> EnableLowPrecisionOrderBy { get; set; }
         public bool EnableOptimisticDirectExecution { get; set; }
         public Nullable<bool> EnableScanInQuery { get; set; }
+        public FullTextScoreScope FullTextScoreScope { get; set; }
         public Nullable<int> MaxBufferedItemCount { get; set; }
         public Nullable<int> MaxConcurrency { get; set; }
         public Nullable<int> MaxItemCount { get; set; }
@@ -995,6 +1009,7 @@ namespace Microsoft.Azure.Cosmos
         public Nullable<bool> PopulateIndexMetrics { get; set; }
         public Nullable<bool> PopulateQueryAdvice { get; set; }
         public QueryTextMode QueryTextMode { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
         public Nullable<int> ResponseContinuationTokenLimitInKb { get; set; }
         public string SessionToken { get; set; }
     }
@@ -1004,10 +1019,18 @@ namespace Microsoft.Azure.Cosmos
         None = 0,
         ParameterizedOnly = 1,
     }
+    public enum ReadConsistencyStrategy
+    {
+        Eventual = 1,
+        GlobalStrong = 4,
+        LatestCommitted = 3,
+        Session = 2,
+    }
     public class ReadManyRequestOptions : RequestOptions
     {
         public ReadManyRequestOptions();
         public Nullable<ConsistencyLevel> ConsistencyLevel { get; set; }
+        public Nullable<ReadConsistencyStrategy> ReadConsistencyStrategy { get; set; }
         public string SessionToken { get; set; }
     }
     public static class Regions
@@ -1383,6 +1406,7 @@ namespace Microsoft.Azure.Cosmos
         public int IndexingSearchListSize { get; set; }
         public string Path { get; set; }
         public int QuantizationByteSize { get; set; }
+        public Nullable<QuantizerType> QuantizerType { get; set; }
         public VectorIndexType Type { get; set; }
         public string[] VectorIndexShardKey { get; set; }
     }
@@ -1482,6 +1506,7 @@ namespace Microsoft.Azure.Cosmos.Fluent
         public CosmosClientBuilder WithHttpClientFactory(Func<HttpClient> httpClientFactory);
         public CosmosClientBuilder WithLimitToEndpoint(bool limitToEndpoint);
         public CosmosClientBuilder WithPriorityLevel(PriorityLevel priorityLevel);
+        public CosmosClientBuilder WithReadConsistencyStrategy(ReadConsistencyStrategy readConsistencyStrategy);
         public CosmosClientBuilder WithRequestTimeout(TimeSpan requestTimeout);
         public CosmosClientBuilder WithSerializerOptions(CosmosSerializationOptions cosmosSerializerOptions);
         public CosmosClientBuilder WithSystemTextJsonSerializerOptions(JsonSerializerOptions serializerOptions);
@@ -1540,6 +1565,7 @@ namespace Microsoft.Azure.Cosmos.Fluent
         public VectorIndexDefinition<T> Path(string path, VectorIndexType indexType);
         public VectorIndexDefinition<T> WithIndexingSearchListSize(int indexingSearchListSize);
         public VectorIndexDefinition<T> WithQuantizationByteSize(int quantizationByteSize);
+        public VectorIndexDefinition<T> WithQuantizerType(QuantizerType quantizerType);
         public VectorIndexDefinition<T> WithVectorIndexShardKey(string[] vectorIndexShardKey);
     }
 }
```

## Checklist
- [ ] Changelog review by team
- [ ] Email `azurecosmossdkdotnet@microsoft.com` for preview API review
- [ ] API contract diff approval (Kiran & Kirill)
- [ ] Kiran sign-off (required)
- [ ] Determine if "Recommended Version" needs further updating

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants