fix(azcosmos): GetChangeFeed split-resilient via overlap match, 410/Gone retry, multi-range continuation queue (#26767)#26768
Conversation
…one retry, multi-range continuation queue (Azure#26767) GetChangeFeed previously failed (silently or loudly) when a physical partition split occurred under a customer-supplied FeedRange: * toHeaders did exact-equality matching against the cached PK ranges, so a parent FeedRange that no longer matched any current child fell through to a request issued WITHOUT the partition-key-range-id header (silent miss). * 410/Gone with the PartitionKeyRangeGone substatus surfaced directly to the caller; the underlying maxPKRangeGoneRetries / forceRefresh plumbing added in Azure#26723 was never wired into the change feed. * The composite continuation token only encoded a single EPK range, so resuming after a split lost events on the boundary. This change rewires GetChangeFeed onto a queue-driven drain loop: * buildChangeFeedInitialQueue resolves the customer FeedRange (or inbound continuation token) by overlap-match against the routing map, expanding a parent FeedRange into N children inheriting the parent ETag. * getChangeFeedForQueue drains the queue FIFO. On each iteration it re-resolves the head against the cached snapshot, detects mid-drain splits and bumps the rotation budget so newly-inserted children get queried in the same call. * On 410/Gone with PK-range substatus, the cache is refreshed, the routing snapshot is re-fetched, and the head is retried; capped at maxPKRangeGoneRetries. * The composite continuation token now carries the full multi-range queue, with each entry's ETag advanced FIFO on every response. * Cross-container token reuse is rejected loudly: if a token's ResourceID does not match the current container's _rid, the call fails with a diagnostic error rather than silently misrouting against the wrong routing map. Behavior preserved for callers without the cache infrastructure wired: when no PK-range cache is present and the direct fetch returns no ranges (test/mock case), a single passthrough queue entry is used so the legacy issue-without-overlap-match flow still runs. Tests added in cosmos_change_feed_split_test.go cover: 410-retry, retry cap surfacing the final 410, cross-container ResourceID rejection, no-overlap-after-refresh -> ErrFeedRangeUnresolved, split expansion routing the first child, multi-range continuation round-trip, and the mid-drain split-during-drain budget bump regression guard. All three existing change-feed tests continue to pass. Closes Azure#26767 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
efb1e86 to
c20a499
Compare
…andling - Capture drain-loop error in GetChangeFeed so deferred span observes failures (Azure#1) - Reset 410/Gone budget on both split-expansion and head-advance (Azure#2) - Move resp.Body.Close() above 304 short-circuit in newChangeFeedResponse (Azure#7) - Drop dead empty-feedMax branch in cosmos_feed_range.go (Azure#8) - Remove redundant token.head() reload in drain loop (Azure#9) - Reduce buildChangeFeedInitialQueue to 3 returns (Azure#10) - buildRequestHeaders surfaces PartitionKey serialization errors (Azure#11) - Cache-nil + no-routing degraded fallback: log warning, drop continuation token, and read fresh with passthrough range instead of erroring; preserves back-compat for tests built without caches wired up - CHANGELOG link points to PR Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Round-2 reviewer cleanup. dropHeadContinuation mutates Continuation[0] in place; the existing head pointer reflects the change without reload. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Condenses the GetChangeFeed numbered behavior list to a short paragraph and trims the CHANGELOG entry. No behavior change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…itionKeyRangeID - Move overlappingPartitionKeyRanges into cosmos_feed_range.go next to its sibling findOverlappingPartitionKeyRangeIDs and simplify to a single pass (no map round-trip) - Delete the legacy ChangeFeedOptions.toHeaders builder and its test file; the new GetChangeFeed path uses buildRequestHeaders exclusively - Delete the now-unused findPartitionKeyRangeID exact-match helper (the original silent-mismatch source); overlap-resolution lives upstream now - Drop the small toHeaders-tail test in TestContainerGetChangeFeedWithStartFrom that only verified the legacy header builder Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…t coverage - Move GetChangeFeed, buildChangeFeedInitialQueue, resolveFeedRangeToChildren, buildChildQueueEntries, getChangeFeedForQueue, and serializeCompositeContinuationToken out of cosmos_container.go into cosmos_container_change_feed.go (mirrors the cosmos_container_read_many.go split). Pure relocation, no behavior change. - Add cosmos_change_feed_request_options_test.go covering buildRequestHeaders: defaults (only AIM header), all-fields population, empty-ETag continuation is omitted, and PartitionKey serialization errors are surfaced (the new error-returning contract the legacy toHeaders silently swallowed). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three new emulator tests targeting F1-specific paths not covered by the existing TestEmulatorContainerChangeFeed: - SubRangeOfPhysicalRange — original-bug repro: a customer FeedRange that is a strict sub-range of a physical pkrange (no exact-match) must resolve via overlap-match and emit a request. - WideRangeMultiDrain — wide FeedRange overlapping every physical range; paginates via the composite continuation token until all inserted items are drained, exercising the multi-range queue rotation. - CrossContainerTokenRejected — token from container-A handed to container-B must be rejected by the ResourceID guard. All three skip when EMULATOR is unset, matching repo convention. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Drop ineffectual headFeedRange reassignment in degraded fallback (it was never read after the assignment) - Delete unused findOverlappingPartitionKeyRangeIDs (overlappingPartitionKeyRanges superseded it; no remaining callers) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Helpers (buildChangeFeedInitialQueue, resolveFeedRangeToChildren, buildChildQueueEntries, getChangeFeedForQueue, serializeCompositeContinuationToken) remain in cosmos_container_change_feed.go; only the public method moves back so the canonical container file continues to host the public API surface. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR updates sdk/data/azcosmos change feed paging/routing to be resilient to partition splits by resolving FeedRanges via overlap (instead of exact boundary match), expanding split parents into per-child work items, and retrying 410/Gone PK-range errors via cache refresh.
Changes:
- Reworked
ContainerClient.GetChangeFeedinto a queue-driven drain loop that overlap-resolves ranges, expands splits, and performs bounded410/Goneretries with PK-range cache refresh. - Added new helpers for overlap detection (
overlappingPartitionKeyRanges,rangesOverlap, max-boundary normalization) and composite continuation queue operations (head/advance/replaceHeadWithChildren). - Added extensive unit tests for split/retry/token behavior plus new emulator coverage; updated request-header building tests and changelog.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| sdk/data/azcosmos/emulator_cosmos_change_feed_split_test.go | Adds emulator tests validating overlap routing, multi-drain continuation, and cross-container token rejection. |
| sdk/data/azcosmos/cosmos_feed_range.go | Introduces overlap-based PK range resolution helpers and ErrFeedRangeUnresolved. |
| sdk/data/azcosmos/cosmos_container.go | Routes GetChangeFeed through the new queue-based implementation. |
| sdk/data/azcosmos/cosmos_container_test.go | Removes a now-obsolete continuation/header assertion tied to the old toHeaders behavior. |
| sdk/data/azcosmos/cosmos_container_change_feed.go | New change-feed drain loop implementation (queue, split expansion, 410 retry, token guard). |
| sdk/data/azcosmos/cosmos_change_feed_split_test.go | New unit tests covering 410 retry, retry cap, split expansion, multi-range resume, and rotation budget bump. |
| sdk/data/azcosmos/cosmos_change_feed_response.go | Ensures response bodies are closed even on 304 to avoid leaks during drain. |
| sdk/data/azcosmos/cosmos_change_feed_request_options.go | Replaces legacy header builder with buildRequestHeaders for queue-head-based requests. |
| sdk/data/azcosmos/cosmos_change_feed_request_options_test.go | Updates/expands tests to validate buildRequestHeaders behavior and error surfacing. |
| sdk/data/azcosmos/cosmos_change_feed_composite_continuation_token.go | Adds composite continuation token queue helpers (head/advance/replaceHeadWithChildren). |
| sdk/data/azcosmos/CHANGELOG.md | Documents the split-resilient GetChangeFeed fix. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
kushagraThapar
left a comment
There was a problem hiding this comment.
Hi @tvaron3, thank you for putting this together — I really enjoyed reading through the queue-driven drain loop and the way the 410/Gone retry budget resets across split-expansion. The three deep-reviewer rounds clearly show in the structural quality.
A few items below for your consideration. None of them are hard blockers IMO; the heaviest one is M-2 (cross-container _rid guard behavior on first-call 304) which I think is worth addressing in this PR. The rest are smaller correctness, testing, or documentation gaps.
A couple of items I'm planning to file as follow-up issues rather than block this PR:
- The
ChangeFeedWithStartTimePostMergecapability bit (pre-existing gap, not introduced here, but the new "split-resilient" framing makes it more customer-visible after a merge). - Two design-doc extension proposals for
cosmosdb-design-docs(continuation-token container-identity invariant, and the FeedRange-becomes-unresolvable contract).
All comments are in the spirit of "could you please help me understand" rather than "must change" — happy to defer any of them if you have context I'm missing.
|
@sdkReviewAgent |
Round-2 reviewer findings from kushagraThapar plus a CHANGELOG link revert: - (Azure#2) Revert Copilot autofix that changed the CHANGELOG PR link from 26768 (this PR) to 26767 (the issue it closes). - (Azure#4) Clamp child ranges to the parent token's bounds during split- expansion (matches Java FeedRangeCompositeContinuationImpl.createChildRanges). Children that fall entirely outside the parent are dropped. New clampChildrenToParent helper + TestClampChildrenToParent. - (Azure#5) Populate token.ResourceID at queue-construction time (both Path A and Path B) so the cross-container guard remains meaningful even when the very first response is a 304 (no body parsed). - (Azure#6) Replace misleading 'legacy ETag-only continuation handled by buildRequestHeaders' comment with an accurate one stating that only composite tokens issued by GetChangeFeed are honored on resume; legacy raw-ETag continuation strings are not supported. - (Azure#7) Surface partial drain state alongside the 410-budget-exhausted error so callers can resume from the failed head instead of re-querying drained heads. New TestGetChangeFeed_410BudgetExhaustedMidDrain_SurfacesPartialState. - (Azure#8) Aggregate RequestCharge across drain iterations so the returned response reports total RU consumed (matches cosmos_container_read_many pattern). - (Azure#9) Remove the cache-nil degraded fallback (production always wires caches via acquireCaches; the fallback existed only for legacy tests built without caches). Rewire the 3 legacy TestContainerGetChangeFeed* tests to use createChangeFeedTestClient with pre-seeded caches. Removes now-unused dropHeadContinuation helper. - (Azure#10) Belt-and-suspenders: also accept len(response.Documents) > 0 in the early-return condition so a response missing _count but carrying docs isn't silently re-drained. - (Azure#11) Add explicit ctx.Done() check at the top of the drain loop so a cancelled context exits between sub-requests rather than waiting for transport-level cancellation. New TestGetChangeFeed_ContextCancelled_HonoredBetweenSubRequests. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
✅ Review complete (50:22) Posted 3 inline comment(s). Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage |
…split-handling # Conflicts: # sdk/data/azcosmos/cosmos_container.go
…put options post-merge - (X1) Add hard cap on drain-loop rotation budget (4*(initialLen+1)) matching Java's FeedRangeCompositeContinuationImpl defense. Prevents unbounded loops if overlap-resolution ever produces duplicate ranges. - (X2) Add focused table-driven unit tests for the overlap helpers: TestNormalizeMaxBoundary, TestRangesOverlap_TableDriven (10 cases), TestOverlappingPartitionKeyRanges_TableDriven (11 cases covering exact- match, multi-overlap, sub-range, straddling, empty/inverted/equal feed ranges, boundary touches). - (X3) PopulateCompositeContinuationToken (public API) is now a no-op when ContinuationToken is already populated, preventing the multi-range composite token written by the drain loop from being overwritten with a stale single-range token rebuilt from the per-head FeedRange. Post-merge with upstream/main (Azure#26750): - Thread options.PriorityLevel and options.ThroughputBucket through the F1 drain loop's pipelineRequestOptions.headerOptionsOverride so the new priority/throughput-bucket support added in Azure#26750 reaches the change-feed request. - Rewire TestChangeFeedPriorityAndThroughputBucketHeaders to use createChangeFeedTestClient (caches pre-wired) since the cache-nil fallback was removed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…hangeFeedResponse (#26792) * fix(azcosmos): remove no-op PopulateCompositeContinuationToken from ChangeFeedResponse After PR #26768, GetChangeFeed populates ChangeFeedResponse.ContinuationToken directly with the multi-range composite token. PopulateCompositeContinuationToken became a no-op (early-returning when ContinuationToken was already set) to prevent it from overwriting the multi-range token with a stale single-range one rebuilt from the per-head FeedRange. This PR removes the now-vestigial method outright. Tests that previously exercised it have been retargeted at GetCompositeContinuationToken (which still exists as a public helper for callers building single-range tokens manually). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Update sdk/data/azcosmos/CHANGELOG.md Co-authored-by: Simon Moreno <30335873+simorenoh@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Simon Moreno <30335873+simorenoh@users.noreply.github.com>
Closes #26767.
Purpose
ContainerClient.GetChangeFeeddoes not survive partition splits today. Three independent failure modes:FeedRangestraddles new children. The legacy header builder exact-matched against the cached PK ranges viafindPartitionKeyRangeID. After a split, the parent range no longer exactly equals any current child, so the request was issued withoutx-ms-documentdb-partitionkeyrangeid.GetChangeFeedpropagated*azcore.ResponseErrordirectly. ThepartitionKeyRangeCache.forceRefreshandmaxPKRangeGoneRetriesplumbing added in [Cosmos] add container cache and pk range cache #26723 was never wired into the change feed.PopulateCompositeContinuationTokenwrote onechangeFeedRange. If the original FeedRange covered a parent that has since split, resume routed against a no-longer-physical EPK boundary instead of fanning out — events between calls were lost at the boundary.There was also no cross-container guard: a token's
ResourceIDwas never checked against the current container's_rid, so a wrong-container token would silently misroute against the wrong routing map.What this change does
Wires
GetChangeFeedto a queue-driven drain loop that mirrors the Java/.NET SDK behavior:overlappingPartitionKeyRanges+rangesOverlaphelpers incosmos_feed_range.goreplace exact-match for routing decisions. EmptyMaxExclusiveis normalized to"FF".head/advance/replaceHeadWithChildren/dropHeadContinuationqueue helpers oncompositeContinuationToken. The composite token now describes the full set of unfinished sub-ranges; each entry is rotated FIFO with its own ETag.maxPKRangeGoneRetries. The PK-range snapshot is fetched once per call (through the cache when present, falling back to a direct fetch) and reused across iterations; the 410 path re-fetches.ResourceIDguard. A continuation token whoseResourceIDdiffers from the current container's_ridis rejected loudly. The token'sResourceIDis populated from the response body's_ridon first success so the guard remains meaningful across resume.ErrFeedRangeUnresolvedis returned (wrapped) when a customer-supplied FeedRange/token doesn't overlap any current physical range even after a forced refresh — a signal callers canerrors.Isagainst and recover from rather than seeing an opaque error.buildRequestHeadersbuilder. New(head changeFeedRange, resolvedPKRangeID string) (map[string]string, error)builder. Returns an error whenPartitionKeyserialization fails so the caller can surface the cause instead of silently dropping the request (the original-bug pattern). The drain loop calls this with a pre-resolved head; the loop owns overlap-resolution upstream.acquireCaches), the loop logs a warning vialog.Writef(azlog.EventResponse, ...), drops the head's continuation token, and reads fresh with a passthrough range so the request goes out without anx-ms-documentdb-partitionkeyrangeidheader. StrictfeedRangeUnresolvedErroris preserved for genuine "FeedRange doesn't apply" cases.The legacy
ChangeFeedOptions.toHeadersandfindPartitionKeyRangeID(the original-bug source) have been deleted. They were unexported and had no production callers after the rewrite. Coverage of the field-to-header mapping is preserved viacosmos_change_feed_request_options_test.goagainst the newbuildRequestHeaders.File organization
cosmos_container.go— publicGetChangeFeedentrypoint stays here so the canonical container file continues to host the public API surface.cosmos_container_change_feed.go(new) —buildChangeFeedInitialQueue,resolveFeedRangeToChildren,buildChildQueueEntries,getChangeFeedForQueue(drain loop),serializeCompositeContinuationToken. Mirrors thecosmos_container_read_many.gosplit.cosmos_change_feed_composite_continuation_token.go— queue-mutation helpers.cosmos_change_feed_request_options.go—buildRequestHeaders(legacytoHeadersdeleted).cosmos_change_feed_response.go—defer resp.Body.Close()ordering fix (above the 304 short-circuit).cosmos_feed_range.go—overlappingPartitionKeyRanges,rangesOverlap,normalizeMaxBoundary.Tests
Mock-based unit tests —
cosmos_change_feed_split_test.go:TestGetChangeFeed_410Gone_TriggersCacheRefreshAndRetry— 410 + cache-refresh + retry.TestGetChangeFeed_RetryCapAt3_ReturnsLastErrorOnRepeated410— surface final 410 after cap.TestGetChangeFeed_TokenResourceIDMismatch_Rejected— cross-container guard.TestGetChangeFeed_NoOverlapAfterRefresh_ReturnsErrFeedRangeUnresolved— loud-fail on unresolvable FeedRange.TestGetChangeFeed_SplitExpansion_RoutesToFirstChild— child routing + persisted multi-range token.TestGetChangeFeed_MultiRangeContinuation_RoundTrips— resume from a multi-element queue.TestGetChangeFeed_SplitDuringDrain_QueriesEveryNewlyInsertedChild— regression guard for the rotation-budget bump.Header-builder unit tests —
cosmos_change_feed_request_options_test.go:_BuildRequestHeaders_Defaults— empty options yields only the AIM header._BuildRequestHeaders_AllFields— every field maps to the right header._BuildRequestHeaders_EmptyContinuationOmitsIfNoneMatch— explicit empty ETag does not emit a strayIfNoneMatch._BuildRequestHeaders_PartitionKeySerializationError— verifies the new error-returning contract.Emulator integration tests —
emulator_cosmos_change_feed_split_test.go(skipped whenEMULATORis unset):_F1_SubRangeOfPhysicalRange— direct repro of the original bug (strict sub-range FeedRange)._F1_WideRangeMultiDrain— paginate via composite token until all items across all physical ranges are drained._F1_CrossContainerTokenRejected— token from container A handed to container B is rejected.The 3 pre-existing change-feed tests (
TestContainerGetChangeFeedWithStartFrom,TestContainerGetChangeFeedWithStartFromFiltering,TestContainerGetChangeFeedForEPKRange) continue to pass.go test -race -short -count=1 ./sdk/data/azcosmos/...is green. CI lint is green.Three deep-reviewer rounds completed; the latest pass found no critical or should-fix issues.
Notes for reviewers
response.FeedRangeafter split: when a customer FeedRange straddles a split,ChangeFeedResponse.FeedRangereports the head being drained for that page (not the original parent). Callers that previously relied on it round-tripping should use the persisted continuation token instead — noted in CHANGELOG.partitionKeyRangeCache,forceRefresh,maxPKRangeGoneRetries,isPKRangeGoneResponseError,ContainerClient.refreshPKRangeCache,ContainerClient.getContainerRID.Checklist