[Internal] DTS: Adds partitionKeyRangeId to operation result to assemble valid session tokens in DTS#5856
Conversation
…n tokens in DTS
In Gateway mode, distributed transaction operation-level session tokens are sent as LSN-only (without the pkRangeId: prefix) rather than the fully-assembled {pkRangeId}:{lsn} format that SessionContainer.SetSessionToken expects.
Changes:
- Add optional PartitionKeyRangeId property to DistributedTransactionOperationResult (JSON: 'partitionKeyRangeId'), with constant in DistributedTransactionSerializer
- Update MergeSessionTokens() to assemble {pkRangeId}:{lsn} when the token is LSN-only and partitionKeyRangeId is present
- If token already contains ':' (pre-assembled), use it as-is (backward compat)
- If token is LSN-only and partitionKeyRangeId is absent/whitespace, log a trace warning and silently skip merging for that operation
- Add unit tests covering all new branches including whitespace/empty partitionKeyRangeId edge cases and precedence when both fields are present
FabianMeiswinkel
left a comment
There was a problem hiding this comment.
SDK should not attempt to fix broken sesison tokens.
Deep Review (local) — findingsSharing findings from a local deep review. Not posting as a formal "Request Changes" because @FabianMeiswinkel already has SummaryThis correctly identifies a real latent crash from #5705: However: the architectural direction is contested, and the workaround is incomplete — other malformed shapes will still convert a committed DTS into a thrown exception. 🔴 BlockerB1 — Architectural pushback unresolved. Aligning with @FabianMeiswinkel: the session token shape is a server-emitted contract; the right fix is server-side, not a parsing escape hatch in the SDK. Once dual-shape parsing ships, the SDK has to tolerate both shapes indefinitely for interop with older clients, which effectively makes the "transitional" code permanent. PR #5705's earlier review from @kirankumarkolli made the same point ("Validations right place is upstream when response is composed."). If a client-side patch must land as a stop-gap, gate it behind a feature flag / version check with a tracking issue and an explicit deletion target. 🟡 MajorM2 — The fix is partial.
Each of these still escapes Suggestion: wrap the inner merge loop body (or the M3 — Silent skip creates a mystery 404/1002 for customers. When pkRangeId is missing, the SessionContainer never learns the LSN advance. The next session-consistency read against a replica that hasn't caught up returns Consider attaching a diagnostic note onto M4 — No operational signal for the Cosmos team. Suggestion: route this through M5 — Comment vs. behavior contradict each other. The block at // Cannot form a valid session token without pkRangeId; silently skip merging
// this operation's token but continue processing the rest of the response.
DefaultTrace.TraceWarning(...);The comment (and PR description) say "silently skip"; the code emits a warning. Pick one — given M4 I'd keep and upgrade the trace, then reword the comment. 🟢 Minor
💬 Suggestions
✅ Positives
|
NaluTripician
left a comment
There was a problem hiding this comment.
Following up on my earlier deep-review notes — the crash fix and overall shape look right, but I'd like these specific code asks resolved before I sign off. None are large; all are inline below.
Code asks (gating my sign-off):
- Narrow the new outer
catch (Exception)inMergeSessionTokensto excludeOperationCanceledExceptionso it matchesCommitTransactionAsync's outer-catch contract. - Null-guard
resultin the new catch'sTraceWarningformat-args path so a null array element can't NRE inside the safety net and escape the loop. - Make the SessionToken / PartitionKeyRangeId null-checks in
FromJsonsymmetric (bothIsNullOrWhiteSpace). - Add an "already canonical" pass-through in
FromJsonso the SDK is defensive against any coordinator version that still emits{pkRangeId}:{lsn}directly insessionToken— silently nulling a valid token is a worse regression than the original crash.
Process gates (not code, but worth calling out):
- @FabianMeiswinkel still has
CHANGES_REQUESTEDon file — please get an explicit re-review / dismissal rather than relying on the text reply. - CI on
6fb31fbis still in-progress for mostdotnet-v3-cijobs (Microsoft.Azure.Cosmos.Tests, Static Analysis, EmulatorTests suites, packaging). Want to see those green before I approve.
Once the four inline items are addressed and CI/Fabian unblock, this is good to go from my side.
- catch (Exception ex) when (ex is not OperationCanceledException): mirrors outer CommitTransactionAsync contract so cancellations propagate - FromJson ?? throw JsonException with element kind and raw text: null guard at the source rather than downstream in MergeSessionTokens - IsNullOrWhiteSpace on SessionToken (symmetric with PartitionKeyRangeId); whitespace-only SessionToken normalized to null - Already-canonical pass-through tightened: colonIndex > 0 && colonIndex < Length-1 ensures both pkRangeId and LSN sides are non-empty - Tests: OCE propagation, whitespace sessionToken (x2), canonical preserved (x2), edge-colon not-canonical (x2) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…sionTokens In a committed DTC (200 OK), no individual operation result can be 404/1002 that would be a server contract violation. The check was copied from GatewayStoreModel where it applies to single-operation retryable flows, but is dead code in the DTC commit path. Removed the corresponding test that asserted the now-deleted behavior. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
NaluTripician
left a comment
There was a problem hiding this comment.
Approve with nits ✅
Thanks for the iterations — ~80% of my prior asks are addressed, the crash fix is real, and the test coverage (parser-level + committer-level + emulator + DelegatingTraceListener for the trace contract) is genuinely thorough. Signing off; Fabian is updating his review asynchronously so I'm not gating on that here.
Nits below are non-blocking — happy to take any/all as follow-ups.
Nits
-
result?.Indexin the catch's format args.FromJson's?? throw new JsonException(...)guard closes the realistic path, but the catch at line 224 is the last line of defense against bookkeeping bugs. A null-conditional access (result?.Index ?? -1) costs nothing and keeps the warning robust if anything upstream regresses. -
404 /
ReadSessionNotAvailableskip removal in6708805is undefended. The rationale ("unreachable" because DTC ops are writes) is plausible, but the deletion has no inline comment and the prior testCommitTransactionAsync_SkipsMerge_When404ReadSessionNotAvailablewas removed rather than replaced. Either restore the skip (cheap defensive depth) or add a 2-line comment + a regression test assertingMergeSessionTokensno-ops gracefully on a per-op 404/1002 — relies on the new try/catch. -
Edge-colon assembly produces double-colon tokens. When
pkRangeId="X"andsessionToken=":Y", the assembled output is"X::Y". The new try/catch inMergeSessionTokensturns the downstream throw into aTraceWarning, butFromResponseMessage_OperationResult_SessionToken_AssembledWhenColonIsAtEdgeactively locks in the malformed output as intended behavior. Suggest only assembling when the raw token contains no colon at all — and update that test to assert the token is nulled / skipped instead. -
TODO deletion gates for the transitional null-pkRangeId branch are only in tests, not at the
FromJsonbranch sites. When the coordinator update ships, the person ripping this out has to discover the three FromJson branches by hand. A// TODO(#NNNN, delete after coordinator emits pkRangeId)next to each branch would make that work mechanical. Pair with a tracking issue. -
TraceWarningspam risk is unbounded. If the coordinator never updates for a given account, every op of every DTS commit logs. Not urgent, but worth a sampling guard or first-N-per-process throttle before this is in customers' hands at scale. -
Brittle trace assertion.
CommitTransactionAsync_EmitsTraceWarning_WhenPartitionKeyRangeIdIsAbsentmatches on.Contains("partitionKeyRangeId")— if anyone reformats the warning and drops that exact substring the test goes silently green. Consider asserting on the operation index or a more stable token like"DTC operation index". -
Customer/operational diagnostic gap (M3/M4 from my prior review).
DefaultTrace.TraceWarningis invisible in nearly all customer deployments and gives the Cosmos team no production telemetry to size the urgency of the coordinator fix. Even one of these as a follow-up would help materially: aSessionMergeSkippedflag onDistributedTransactionResponse, or routing the warning throughCosmosDbEventSourceso it's observable.
Positives worth saying out loud
OperationCanceledExceptioncarve-out + the new test that locks the contract is exactly right.- Already-canonical pass-through branch + the
colonIndex > 0 && colonIndex < length - 1guard is a clean way to reject":"/"0:"without breaking real tokens. IsNullOrWhiteSpacesymmetry + the whitespace-to-null normalization branch and the parameterized tests covering" "," "are nice.- Copy constructor propagating
PartitionKeyRangeId— easy to miss. - Moving the
Operations[result.Index]access inside the try/catch is a genuine hardening fix on top of the headline change. ?? throw new JsonExceptiononFromJson'sDeserializeresult quietly turns "NRE somewhere later" into a boundary-level exception.
LGTM 🚢
Description
Problem
Distributed transaction (DTC) operation-level session tokens are sent by the coordinator as raw LSN vectors (e.g.,
"-1#425344#1=12345") without the{pkRangeId}:prefix.SessionContainer.SetSessionToken()unconditionally doesSplit(':')[1]on the token — an LSN-only token has no colon, sotokenPartshas length 1, causing anIndexOutOfRangeExceptionthat escapes and surfaces a committed transaction as failed to the caller.Fix
Add a
partitionKeyRangeIdfield toDistributedTransactionOperationResultand assemble the canonical{pkRangeId}:{lsn}session token inFromJson()before it reachesSessionContainer.Design Decisions
Assembly happens in
FromJson(), notMergeSessionTokens()The token is assembled at parse time so that
result.SessionTokenis always in canonical form (or null) by the time any downstream code reads it. This keepsMergeSessionTokenssimple — it just sets the header.Warn + null (skip merge) when
partitionKeyRangeIdis absent/blank — no throwThe DTC coordinator currently never sends
partitionKeyRangeId(tracked by #5857). Once the coordinator update ships, both fields will always be present and the null-path will be removed.We considered throwing
InvalidOperationExceptionon blankpartitionKeyRangeId, but rejected it because:SetPartitionKeyRangeId()has zero validation — it can legitimately be null or empty (e.g., master resource operations, partition startup, aborted transactions whereSDKResponseBuilder.cpp:193explicitly sets empty string)."0".Known limitation: stale-read window when pkRangeId is absent
When
partitionKeyRangeIdis absent (current coordinator behavior),SessionTokenis set to null → session token merge is skipped →SessionContainerretains the previous LSN for that partition. A subsequent session-consistent read may be served by a replica that hasn't yet replicated the DTC commit. This is a strict improvement over the pre-fix crash (committed tx surfacing as failed) and is transient — replication lag is typically milliseconds, and any other write to the same partition advances the token past the gap. This will be fully resolved when the coordinator always sendspartitionKeyRangeId(#5857).Changes
DistributedTransactionOperationResult.csPartitionKeyRangeIdJSON property.FromJson()assembles{pkRangeId}:{lsn}when valid, else warns and setsSessionToken = null. Removed unusedusing System.ComponentModel.DistributedTransactionSerializer.csPartitionKeyRangeIdconstant for the JSON field name.DistributedTransactionCommitter.csOperations[result.Index]. Improve catch diagnostics. Token is already canonical fromFromJson()—MergeSessionTokensjust sets the header.DistributedTransactionResponseTests.csDistributedTransactionCommitterTests.csDistributedTransactionTests.cs(emulator)ValidateSessionTokenMergedIntoDtcClientto use new wire format. AddValidateSessionTokenSkipped_WhenPartitionKeyRangeIdAbsent. Both tests use matching PKs for partition alignment.Wire Contract (Transitional)
Type of change