Skip to content

ChangeFeedProcessor: Fixes first-change skip during initial startup by anchoring StartTime#5617

Merged
kirankumarkolli merged 14 commits intomasterfrom
users/nalutripician/fix-changefeedprocessor-startasync-5268
Apr 8, 2026
Merged

ChangeFeedProcessor: Fixes first-change skip during initial startup by anchoring StartTime#5617
kirankumarkolli merged 14 commits intomasterfrom
users/nalutripician/fix-changefeedprocessor-startasync-5268

Conversation

@NaluTripician
Copy link
Copy Markdown
Contributor

Summary

Fixes issue #5268 where ChangeFeedProcessor can skip the first change when started without explicit start options.

Bug Description

When ChangeFeedProcessor is started with default options (no WithStartTime, no continuation, no start-from-beginning), it uses the default start-from-now path.

This path anchors at the first service read (IfNoneMatch: *) rather than at StartAsync() call time. Because startup includes asynchronous lease acquisition and load balancing, there is a window where documents can be written after StartAsync() returns but before the first read reaches the backend. Those documents can be skipped.

Root Cause

  • PartitionLoadBalancerCore.Start() schedules background work and returns immediately.
  • The first change feed read happens later after lease operations.
  • Default Now behavior is based on first read timing, not StartAsync() timing.
  • Documents created in that gap may be missed.

Fix

In ChangeFeedProcessorCore.StartAsync():

  • On first initialization only, if no explicit start option is configured:
    • StartFromBeginning == false
    • StartTime == null
    • StartContinuation is null/empty
  • set changeFeedProcessorOptions.StartTime = DateTime.UtcNow before async initialization.

This anchors start behavior at processor startup time and closes the timing gap.

Why This Is Safe

  • Does not override explicit user start settings.
  • Applies only on first startup when no lease continuation exists.
  • Existing continuation/checkpoint behavior remains unchanged.

Tests Added

Unit tests

Microsoft.Azure.Cosmos.Tests/ChangeFeed/ChangeFeedProcessorCoreTests.cs

  • StartAsync_SetsStartTime_WhenNoStartOptionsProvided
  • StartAsync_DoesNotOverrideExplicitStartTime
  • StartAsync_DoesNotSetStartTime_WhenStartFromBeginning

Emulator regression test

Microsoft.Azure.Cosmos.EmulatorTests/ChangeFeed/DynamicTests.cs

  • TestWithRunningProcessor_ImmediateWriteAfterStart
  • Starts processor and immediately writes documents without setup delay.

Validation

A/B test on the new emulator regression

  • With fix enabled: 10/10 passed
  • With fix temporarily removed: 7/10 failed, showing the original issue reproduces with high probability
  • Fix restored and revalidated: stable pass behavior

Additional checks

  • Targeted unit tests pass.
  • No diagnostics errors in changed files.

Files Changed

  • Microsoft.Azure.Cosmos/src/ChangeFeedProcessor/ChangeFeedProcessorCore.cs
  • Microsoft.Azure.Cosmos/tests/Microsoft.Azure.Cosmos.Tests/ChangeFeed/ChangeFeedProcessorCoreTests.cs
  • Microsoft.Azure.Cosmos/tests/Microsoft.Azure.Cosmos.EmulatorTests/ChangeFeed/DynamicTests.cs

Addresses #5268 by setting an implicit StartTime during first StartAsync when no explicit start options are set.

Also adds unit coverage for StartTime option behavior and an emulator regression test for immediate writes after StartAsync.
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good!

@NaluTripician NaluTripician changed the title ChangeFeedProcessor: prevent first-change skip during startup (#5268) ChangeFeedProcessor: Fixes first-change skip during startup Feb 18, 2026
@NaluTripician NaluTripician self-assigned this Feb 18, 2026
@NaluTripician NaluTripician added the auto-merge Enables automation to merge PRs label Feb 18, 2026
NaluTripician

This comment was marked as outdated.

@kirankumarkolli
Copy link
Copy Markdown
Member

Please update Title with initialization state also

NaluTripician and others added 2 commits March 4, 2026 11:36
Address review feedback to handle case where more than 10 docs
could be processed in the callback.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

@NaluTripician NaluTripician left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: DynamicTests.cs line 116 — @kirankumarkolli

Good catch — the == 10 check is fragile. If the callback fires with a batch that crosses 10 (e.g., count goes from 8→12), the ManualResetEvent would never be set and the test would time out. Changed to >= 10 to be defensive. Pushed in a4dec3f.

@NaluTripician NaluTripician changed the title ChangeFeedProcessor: Fixes first-change skip during startup ChangeFeedProcessor: Fixes first-change skip during initial startup by anchoring StartTime Mar 4, 2026
@NaluTripician
Copy link
Copy Markdown
Contributor Author

Updated the PR title to include initialization state context as requested.

@kirankumarkolli kirankumarkolli enabled auto-merge (squash) March 19, 2026 23:36
…r-startasync-5268

Resolve conflict in AvailabilityStrategyUnitTests.cs by keeping master's version.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@kirankumarkolli kirankumarkolli merged commit 4bec240 into master Apr 8, 2026
32 checks passed
@kirankumarkolli kirankumarkolli deleted the users/nalutripician/fix-changefeedprocessor-startasync-5268 branch April 8, 2026 00:59
@NaluTripician NaluTripician mentioned this pull request Apr 24, 2026
4 tasks
microsoft-github-policy-service Bot pushed a commit that referenced this pull request Apr 25, 2026
## Release 3.59.0

### Version Changes
- ClientOfficialVersion: 3.58.0 → 3.59.0
- ClientPreviewVersion: 3.59.0 → 3.60.0
- ClientPreviewSuffixVersion: preview.0 → preview.0

### Changelog (3.59.0 GA)

#### Added
- [5579](#5579)
Change Feed Processor: Adds Lease container export support
- [5709](#5709)
Performance: Adds caching for URL-encoded AAD authorization signature
- [5731](#5731) DNS
dot-suffix: Adds TCP DNS dot-suffix for Direct mode to avoid Kubernetes
ndots latency
- [5755](#5755)
Exceptionless: Adds enabling exception less 400 status code
- [5756](#5756)
Exceptionless: Adds enabling exception less 404/1002 status code
- [5757](#5757)
Exceptionless: Adds enabling exception less 403
- [5779](#5779)
Direct: Adds Direct package version bump to 3.42.4
- [5786](#5786)
Region Availability: Adds missing regions from Direct 3.42.4
- [5788](#5788)
Socket Handler: Adds HTTP/2 PING keep-alive to detect broken connections
in pool

#### Fixed
- [5553](#5553)
NativeDLLs: Fixes Conditionally include win-x64 native DLLs based on
RuntimeIdentifier
- [5588](#5588)
LINQ: Fixes memory leak from Expression.Compile() in all call sites
- [5617](#5617)
ChangeFeedProcessor: Fixes first-change skip during initial startup by
anchoring StartTime
- [5636](#5636)
CosmosClientBuilder: Fixes self-referencing loop in
GetSerializedConfiguration with STJ TypeInfoResolver
- [5748](#5748)
Routing: Fixes GetOverlappingRanges CPU overhead from repeated JSON
deserialization
- [5807](#5807)
ChangeFeedProcessor: Fixes lease de-duplication for
/partitionKey-partitioned lease containers

### Changelog (3.60.0-preview.0)

#### Added
- [5804](#5804)
SemanticReranking: Adds Configurable Request Timeout

#### Fixed
- [5783](#5783)
Container: Fixes SemanticRerankAsync TypeLoadException in derived
classes

### API Contract Diff (GA)
```diff
diff --git "a/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0.txt" "b/Microsoft.Azure.Cosmos\\contracts\\API_3.59.0.txt"
index 1b74a69..6fa9352 100644
--- "a/Microsoft.Azure.Cosmos\\contracts\\API_3.58.0.txt"
+++ "b/Microsoft.Azure.Cosmos\\contracts\\API_3.59.0.txt"
@@ -60,6 +60,7 @@ namespace Microsoft.Azure.Cosmos
         public ChangeFeedProcessor Build();
         public ChangeFeedProcessorBuilder WithErrorNotification(Container.ChangeFeedMonitorErrorDelegate errorDelegate);
         public virtual ChangeFeedProcessorBuilder WithInMemoryLeaseContainer();
+        public virtual ChangeFeedProcessorBuilder WithInMemoryLeaseContainer(MemoryStream leaseState);
         public ChangeFeedProcessorBuilder WithInstanceName(string instanceName);
         public ChangeFeedProcessorBuilder WithLeaseAcquireNotification(Container.ChangeFeedMonitorLeaseAcquireDelegate acquireDelegate);
         public ChangeFeedProcessorBuilder WithLeaseConfiguration(Nullable<TimeSpan> acquireInterval=default(Nullable<TimeSpan>), Nullable<TimeSpan> expirationInterval=default(Nullable<TimeSpan>), Nullable<TimeSpan> renewInterval=default(Nullable<TimeSpan>));
@@ -956,6 +957,7 @@ namespace Microsoft.Azure.Cosmos
         public const string NorwayWest = "Norway West";
         public const string PolandCentral = "Poland Central";
         public const string QatarCentral = "Qatar Central";
+        public const string SaudiArabiaEast = "Saudi Arabia East";
         public const string SingaporeCentral = "Singapore Central";
         public const string SingaporeNorth = "Singapore North";
         public const string SouthAfricaNorth = "South Africa North";
@@ -963,6 +965,7 @@ namespace Microsoft.Azure.Cosmos
         public const string SouthCentralUS = "South Central US";
         public const string SouthCentralUS2 = "South Central US 2";
         public const string SoutheastAsia = "Southeast Asia";
+        public const string SoutheastAsia3 = "Southeast Asia 3";
         public const string SoutheastUS = "Southeast US";
         public const string SoutheastUS3 = "Southeast US 3";
         public const string SoutheastUS5 = "Southeast US 5";
@@ -990,6 +993,7 @@ namespace Microsoft.Azure.Cosmos
         public const string USSecWest = "USSec West";
         public const string USSecWestCentral = "USSec West Central";
         public const string WestCentralUS = "West Central US";
+        public const string WestCentralUSFRE = "West Central US FRE";
         public const string WestEurope = "West Europe";
         public const string WestIndia = "West India";
         public const string WestUS = "West US";

```

### API Contract Diff (Preview)
```diff
diff --git "a/Microsoft.Azure.Cosmos\\contracts\\API_3.59.0-preview.0.txt" "b/Microsoft.Azure.Cosmos\\contracts\\API_3.60.0-preview.0.txt"
index 1ae52c0..58df10f 100644
--- "a/Microsoft.Azure.Cosmos\\contracts\\API_3.59.0-preview.0.txt"
+++ "b/Microsoft.Azure.Cosmos\\contracts\\API_3.60.0-preview.0.txt"
@@ -91,6 +91,7 @@ namespace Microsoft.Azure.Cosmos
         public ChangeFeedProcessor Build();
         public ChangeFeedProcessorBuilder WithErrorNotification(Container.ChangeFeedMonitorErrorDelegate errorDelegate);
         public virtual ChangeFeedProcessorBuilder WithInMemoryLeaseContainer();
+        public virtual ChangeFeedProcessorBuilder WithInMemoryLeaseContainer(MemoryStream leaseState);
         public ChangeFeedProcessorBuilder WithInstanceName(string instanceName);
         public ChangeFeedProcessorBuilder WithLeaseAcquireNotification(Container.ChangeFeedMonitorLeaseAcquireDelegate acquireDelegate);
         public ChangeFeedProcessorBuilder WithLeaseConfiguration(Nullable<TimeSpan> acquireInterval=default(Nullable<TimeSpan>), Nullable<TimeSpan> expirationInterval=default(Nullable<TimeSpan>), Nullable<TimeSpan> renewInterval=default(Nullable<TimeSpan>));
@@ -302,7 +303,7 @@ namespace Microsoft.Azure.Cosmos
         public abstract Task<ResponseMessage> ReplaceItemStreamAsync(Stream streamPayload, string id, PartitionKey partitionKey, ItemRequestOptions requestOptions=null, CancellationToken cancellationToken=default(CancellationToken));
         public abstract Task<ThroughputResponse> ReplaceThroughputAsync(ThroughputProperties throughputProperties, RequestOptions requestOptions=null, CancellationToken cancellationToken=default(CancellationToken));
         public abstract Task<ThroughputResponse> ReplaceThroughputAsync(int throughput, RequestOptions requestOptions=null, CancellationToken cancellationToken=default(CancellationToken));
-        public abstract Task<SemanticRerankResult> SemanticRerankAsync(string rerankContext, IEnumerable<string> documents, IDictionary<string, object> options=null, CancellationToken cancellationToken=default(CancellationToken));
+        public virtual Task<SemanticRerankResult> SemanticRerankAsync(string rerankContext, IEnumerable<string> documents, IDictionary<string, object> options=null, CancellationToken cancellationToken=default(CancellationToken));
         public abstract Task<ItemResponse<T>> UpsertItemAsync<T>(T item, Nullable<PartitionKey> partitionKey=default(Nullable<PartitionKey>), ItemRequestOptions requestOptions=null, CancellationToken cancellationToken=default(CancellationToken));
         public abstract Task<ResponseMessage> UpsertItemStreamAsync(Stream streamPayload, PartitionKey partitionKey, ItemRequestOptions requestOptions=null, CancellationToken cancellationToken=default(CancellationToken));
         public delegate Task ChangeFeedHandlerWithManualCheckpoint<T>(ChangeFeedProcessorContext context, IReadOnlyCollection<T> changes, Func<Task> checkpointAsync, CancellationToken cancellationToken);
@@ -407,6 +408,7 @@ namespace Microsoft.Azure.Cosmos
         public int GatewayModeMaxConnectionLimit { get; set; }
         public Func<HttpClient> HttpClientFactory { get; set; }
         public Nullable<TimeSpan> IdleTcpConnectionTimeout { get; set; }
+        public TimeSpan InferenceRequestTimeout { get; set; }
         public bool LimitToEndpoint { get; set; }
         public Nullable<int> MaxRequestsPerTcpConnection { get; set; }
         public Nullable<int> MaxRetryAttemptsOnRateLimitedRequests { get; set; }
@@ -1092,6 +1094,7 @@ namespace Microsoft.Azure.Cosmos
         public const string NorwayWest = "Norway West";
         public const string PolandCentral = "Poland Central";
         public const string QatarCentral = "Qatar Central";
+        public const string SaudiArabiaEast = "Saudi Arabia East";
         public const string SingaporeCentral = "Singapore Central";
         public const string SingaporeNorth = "Singapore North";
         public const string SouthAfricaNorth = "South Africa North";
@@ -1099,6 +1102,7 @@ namespace Microsoft.Azure.Cosmos
         public const string SouthCentralUS = "South Central US";
         public const string SouthCentralUS2 = "South Central US 2";
         public const string SoutheastAsia = "Southeast Asia";
+        public const string SoutheastAsia3 = "Southeast Asia 3";
         public const string SoutheastUS = "Southeast US";
         public const string SoutheastUS3 = "Southeast US 3";
         public const string SoutheastUS5 = "Southeast US 5";
@@ -1126,6 +1130,7 @@ namespace Microsoft.Azure.Cosmos
         public const string USSecWest = "USSec West";
         public const string USSecWestCentral = "USSec West Central";
         public const string WestCentralUS = "West Central US";
+        public const string WestCentralUSFRE = "West Central US FRE";
         public const string WestEurope = "West Europe";
         public const string WestIndia = "West India";
         public const string WestUS = "West US";
@@ -1504,6 +1509,7 @@ namespace Microsoft.Azure.Cosmos.Fluent
         public CosmosClientBuilder WithEnableRemoteRegionPreferredForSessionRetry(bool enableRemoteRegionPreferredForSessionRetry);
         public CosmosClientBuilder WithFaultInjection(IFaultInjector faultInjector);
         public CosmosClientBuilder WithHttpClientFactory(Func<HttpClient> httpClientFactory);
+        public CosmosClientBuilder WithInferenceRequestTimeout(TimeSpan inferenceRequestTimeout);
         public CosmosClientBuilder WithLimitToEndpoint(bool limitToEndpoint);
         public CosmosClientBuilder WithPriorityLevel(PriorityLevel priorityLevel);
         public CosmosClientBuilder WithReadConsistencyStrategy(ReadConsistencyStrategy readConsistencyStrategy);

```

### Checklist
- [ ] Changelog entries reviewed by team
- [ ] API contract diff reviewed by Kiran and Kirill
- [ ] Preview APIs reviewed (email sent to
azurecosmossdkdotnet@microsoft.com)
- [ ] Kiran sign-off obtained

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-merge Enables automation to merge PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants