CFP AVAD: Need APIs to validate partitionkeys and feedranges against a list of ranges. #4483

Closed
7 of 10 tasks
philipthomas-MSFT opened this issue May 13, 2024 · 0 comments · Fixed by #4566
philipthomas-MSFT commented May 13, 2024

Description

Stakeholders

  • PlayFab
    • Rakesh Verna (CFP-AVAD)

Problem statement

The customer needs a way to check a feed range and/or partition key against a list of given feed ranges. The purpose is to "bookmark" document changes as (FeedRange(minInclusive, maxExclusive), LSN) pairs so that the customer can verify whether a given document change has already been processed in a previous feed iteration. Each feed iteration returns a feed range, partition key, and LSN, either on the ChangeFeedProcessorContext type or on the changed document; exposing the feed range there is part of this work. Validation needs to be done by both partition key and feed range.

PR #4566 exposes two new IsSubset API methods, which check a parent feed range against either a partition key or a child feed range.
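The two subset checks can be modeled roughly as follows. This is only a sketch of the range logic, not the SDK's implementation: the `FeedRange` class and both function names are hypothetical, bounds are assumed to compare as plain strings with "" as the lowest value, and the partition key is assumed to have already been hashed to its effective partition key (EPK).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedRange:
    """Hypothetical stand-in for an EPK range: min inclusive, max exclusive."""
    min: str
    max: str


def is_subset_range(parent: FeedRange, child: FeedRange) -> bool:
    # The child is covered when it starts at or after the parent's start
    # and ends at or before the parent's (exclusive) end.
    return parent.min <= child.min and child.max <= parent.max


def is_subset_epk(parent: FeedRange, epk: str) -> bool:
    # A single effective partition key is a point: min <= epk < max.
    return parent.min <= epk < parent.max


parent = FeedRange("", "05C1DFFFFFFFFC")
print(is_subset_range(parent, FeedRange("01", "03")))  # True
print(is_subset_range(parent, FeedRange("01", "AA")))  # False
print(is_subset_epk(parent, "05C1DFFFFFFFFC"))         # False (max is exclusive)
```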

"Bookmark" sample:

[
	{
		"Range": {
			"min": "",
			"max": "05C1DFFFFFFFFC"
		},
		"LSN": "0"
	}
]
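A minimal sketch of reading that sample into typed records, assuming the field names shown above and that the string-valued LSN should be treated as an integer. The `Bookmark` class and `load_bookmarks` function are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Bookmark:
    """Hypothetical record for one (FeedRange, LSN) bookmark."""
    min_inclusive: str
    max_exclusive: str
    lsn: int


def load_bookmarks(text: str) -> list:
    # The sample stores the LSN as a string, so convert it for comparisons.
    return [
        Bookmark(item["Range"]["min"], item["Range"]["max"], int(item["LSN"]))
        for item in json.loads(text)
    ]


sample = '[{"Range": {"min": "", "max": "05C1DFFFFFFFFC"}, "LSN": "0"}]'
bookmarks = load_bookmarks(sample)
```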
  • Scenario (NOTE: this logic is written and performed by the customer and is subject to change. What we provide is the range lookup and the feed ranges exposed on the ChangeFeedProcessorContext):
    • Setup: ContainerA and ContainerB
      • ContainerA contains the documents and is the monitored container with feed change iterations.
      • ContainerB contains the documents with bookmarks and is never monitored with feed change iteration.
      • A separate active issue and PR expose the feed range on the ChangeFeedProcessorContext type.
    • Flow:
      1. Document1 is created in ContainerA.
      2. Document1 is read in the feed change iteration, GetChangeFeedProcessorBuilderWithAllVersionsAndDeletes, and the customer stores in memory, as a "bookmark", the ChangeFeedProcessorContext.FeedRange and the LSN of the changed document. See the "Bookmark" sample.
      3. Document1 is now stored in ContainerB with bookmarks (FeedRange(minInclusive, maxExclusive), LSN).
      4. If Document1 later changes in ContainerA, it is read again in the feed change iteration, GetChangeFeedProcessorBuilderWithAllVersionsAndDeletes.
        • If the new feed's range and LSN for Document1 already exist in the "bookmarks", the customer ignores that document.
        • If they do not exist (the highest LSN in the "bookmarks" is less than the changed document's LSN), the customer "processes" that document as if it were being read for the first time. "Processing" also means that the "bookmarks" are updated with the new LSN.
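The skip-or-process decision in step 4 might look like the sketch below. It assumes bookmarks are kept in memory as a dictionary keyed by the exact (min, max) feed range; `handle_change` and the `process` callback are hypothetical names, and overlapping or split ranges are deliberately not handled here.

```python
def handle_change(bookmarks: dict, feed_range: tuple, lsn: int, process) -> bool:
    """Return True if the change was processed, False if it was skipped."""
    highest = bookmarks.get(feed_range)
    if highest is not None and lsn <= highest:
        # Already covered by a bookmark: ignore the document.
        return False
    process()
    # "Processing" also updates the bookmark with the new LSN.
    bookmarks[feed_range] = lsn
    return True


bookmarks = {("", "05C1DFFFFFFFFC"): 0}
processed = []
first = handle_change(bookmarks, ("", "05C1DFFFFFFFFC"), 3,
                      lambda: processed.append("Document1@3"))
second = handle_change(bookmarks, ("", "05C1DFFFFFFFFC"), 3,
                       lambda: processed.append("Document1@3"))
```

With these inputs the first call processes the change and raises the bookmark to LSN 3, and the second call is skipped.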

[Image: playfab_flow]

  • Scenarios
    • Complete
  • Proof of concept or prototype
    • Complete
  • Testing
    • Complete
  • Documentation
    • Complete
  • Determining release schedule
    • Complete

Original rough information

{
	"id": 1,
	"billing": [],
	"bookmarks": [
	  { feedRange: "''-'AA'", lsn: 5 },
	  ....
	]
}
 
Changes[] to process
Filter for TitleId 1
 
Sort changes of TitleId1 by LSN and identify highest/last LSN --> 7
 
Add bookmark for FeedRange of current physical partition + highestLSNForTitleId1
 
{
	"id": 1,
	"billing": [],
	"bookmarks": [
	  { feedRange: "''-'AA'", lsn: 5 },
	  { feedRange: "'0A'-'AA'", lsn: 7 },
	  --> '' - '0A': 5
	]
}
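One possible reading of the "--> '' - '0A': 5" note is that adding a bookmark for a narrower range leaves the uncovered remainder of the older range at its previous LSN. The sketch below implements that reading only as an assumption; `add_bookmark` is a hypothetical helper, and bounds are compared as plain strings.

```python
def add_bookmark(bookmarks, new_min, new_max, new_lsn):
    """bookmarks: list of (min, max, lsn) tuples; min inclusive, max exclusive."""
    result = []
    for lo, hi, lsn in bookmarks:
        if hi <= new_min or new_max <= lo:
            result.append((lo, hi, lsn))       # no overlap: keep unchanged
            continue
        if lo < new_min:
            result.append((lo, new_min, lsn))  # remainder left of the new range
        if new_max < hi:
            result.append((new_max, hi, lsn))  # remainder right of the new range
    result.append((new_min, new_max, new_lsn))
    return sorted(result)


updated = add_bookmark([("", "AA", 5)], "0A", "AA", 7)
print(updated)  # [('', '0A', 5), ('0A', 'AA', 7)]
```

The alternative reading, keeping overlapping bookmarks side by side and taking the highest overlapping LSN at lookup time, would not need this splitting step.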
 
------------
 
Check for exactly once processing
Find Changes by TitleId1
Read Title doc for TitleId1
For each change {
Identify full PK
Check whether LSN of change with PK xyz has been processed yet
}
 
boolean hasChangeBeenProcessed(PartitionKey pk, long lsnOfChange, List<(FeedRange, long)> titleBookmarks) {
--> lsnOfChange 9
--> PK has EPK '0B' --> changes processed until LSN 7
 
return lsnOfChange <= highestOverlappingLsn
}
 
Task<List<FeedRange>> FindOverlappingRangesAsync(PartitionKey pk, List<FeedRange> ranges)
Task<List<FeedRange>> FindOverlappingRangesAsync(FeedRange feedRange, List<FeedRange> ranges)
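The notes above can be sketched in runnable form as follows. This is a synchronous model under stated assumptions, not the proposed API: ranges are hypothetical (min inclusive, max exclusive) string pairs, the partition key is assumed to be already resolved to its EPK, and all function names are illustrative.

```python
from typing import Dict, List, Tuple

Range = Tuple[str, str]  # hypothetical EPK bounds: (min inclusive, max exclusive)


def find_overlapping_ranges_for_epk(epk: str, ranges: List[Range]) -> List[Range]:
    # A point overlaps a range when min <= epk < max.
    return [r for r in ranges if r[0] <= epk < r[1]]


def find_overlapping_ranges(target: Range, ranges: List[Range]) -> List[Range]:
    # Two half-open ranges overlap when each starts before the other ends.
    return [r for r in ranges if r[0] < target[1] and target[0] < r[1]]


def has_change_been_processed(epk: str, lsn_of_change: int,
                              bookmarks: Dict[Range, int]) -> bool:
    overlapping = find_overlapping_ranges_for_epk(epk, list(bookmarks))
    if not overlapping:
        return False  # nothing bookmarked for this key yet
    highest_overlapping_lsn = max(bookmarks[r] for r in overlapping)
    return lsn_of_change <= highest_overlapping_lsn


bookmarks = {("", "AA"): 5, ("0A", "AA"): 7}
print(has_change_been_processed("0B", 9, bookmarks))  # False: processed only until LSN 7
print(has_change_been_processed("0B", 7, bookmarks))  # True
print(has_change_been_processed("05", 6, bookmarks))  # False: only ''-'AA' at LSN 5 applies
```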
philipthomas-MSFT self-assigned this on May 13, 2024
philipthomas-MSFT changed the title from "ChangeFeedProcessor: AVAD Response to Include More Intrinsic Information (specifically Feed Range) and API to validate processed LSNs" to "CFP AVAD: Need APIs to validate partitionkeys and feedranges against a list of ranges." on Aug 1, 2024
Projects
Status: Done