Hedging: Adds Read hedging PREVIEW contracts #4598
Microsoft.Azure.Cosmos/src/Routing/AvailabilityStrategy/AvailabilityStrategy.cs
LGTM.
Update PR description with
Microsoft.Azure.Cosmos/src/Routing/AvailabilityStrategy/AvailabilityStrategy.cs
...re.Cosmos/src/Routing/AvailabilityStrategy/CrossRegionParallelHedgingAvailabilityStrategy.cs
Microsoft.Azure.Cosmos/src/Routing/AvailabilityStrategy/DisabledAvailabilityStrategy.cs
Please check accessors on methods
The builder pattern example in the PR description needs to include ApplicationRegion or PreferredRegions.
Microsoft.Azure.Cosmos/src/Routing/AvailabilityStrategy/AvailabilityStrategy.cs
LGTM. No major qualms.
Pull Request Template
Description
Adds PREVIEW contracts for request hedging
Parallel Hedging APIs + Samples
When building a new `CosmosClient`, there will be an option to include parallel hedging in that client.

For example, a `CosmosClient` instance can be created with an `AvailabilityStrategy` enabled with a 500ms threshold. This means that if a request takes longer than 500ms, the SDK will send a new request to the backend, following the order of the preferred regions list. If neither `ApplicationRegion` nor the `ApplicationPreferredRegions` list is set, an `AvailabilityStrategy` cannot be set. If no response has come back from either the primary request or the first hedge after the threshold step time, another parallel request will be made to the next region. The SDK will then return the first response that comes back from the backend. The threshold parameter is required and can be set to any value greater than 0. The `AvailabilityStrategy` can also be set at the request level, overriding the client-level `AvailabilityStrategy`, by setting the `AvailabilityStrategy` on the `RequestOptions` object.

Hedging can be enabled for all read requests: ReadItem, queries (single and cross partition), ReadMany, and ChangeFeed. It is not enabled for write requests.
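The configuration rules above can be summarized in a small model. This is a Python sketch of the described behavior, not SDK code: `HedgingStrategy`, `validate_client_config`, `effective_strategy`, and the operation names in `READ_OPERATIONS` are hypothetical names chosen for illustration, and the sketch assumes a request-level strategy simply replaces the client-level one.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical labels for the read operations listed in the description.
READ_OPERATIONS = {"ReadItem", "Query", "ReadMany", "ChangeFeed"}


@dataclass(frozen=True)
class HedgingStrategy:
    """Illustrative stand-in for a hedging availability strategy."""
    threshold_ms: float       # wait before the first hedged request
    threshold_step_ms: float  # wait between subsequent hedged requests

    def __post_init__(self) -> None:
        # The description only states that the threshold must be > 0.
        if self.threshold_ms <= 0:
            raise ValueError("threshold must be greater than 0")


def validate_client_config(
    strategy: Optional[HedgingStrategy],
    application_region: Optional[str],
    preferred_regions: Sequence[str],
) -> None:
    """A strategy needs a region setting to know where to hedge."""
    if strategy is not None and not application_region and not preferred_regions:
        raise ValueError(
            "an availability strategy requires ApplicationRegion "
            "or ApplicationPreferredRegions to be set"
        )


def effective_strategy(
    operation: str,
    client_strategy: Optional[HedgingStrategy],
    request_strategy: Optional[HedgingStrategy],
) -> Optional[HedgingStrategy]:
    """Pick the strategy that applies to one request, if any."""
    if operation not in READ_OPERATIONS:
        return None  # writes are never hedged
    if request_strategy is not None:
        return request_strategy  # request level overrides client level
    return client_strategy
```

In this model, a client-level strategy with a 500ms threshold applies to a `ReadItem` call unless the request carries its own strategy, and no strategy applies to a write.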
Diagnostics
In the diagnostics data there are two new areas of note, `Response Region` and `Hedge Context`, that will appear when using this feature. `Response Region` shows the region that the request is ultimately served out of. `Hedge Context` shows all the regions requests were sent to.
Design
The SDK will send the first request to the primary region. If there is no response from the backend before the threshold time, the SDK will begin sending hedged requests to the regions in the order of the `ApplicationPreferredRegions` list. After the first hedged request is sent out, the remaining hedged requests are fired off one by one, each after waiting the time specified by the threshold step. Once a response is received from one of the requests, the availability strategy checks whether the result is considered final. If the result is final, it is returned. If not, the SDK skips the remaining threshold/threshold step time and sends out the next hedged request. If all hedged requests have been sent out and no final response is received, the SDK returns the last response it received.

The AvailabilityStrategy operates at the RequestInvokerHandler level, meaning that each hedged request goes through its own handler pipeline, including the ClientRetryPolicy. This means that the hedged requests are retried independently of each other. Note that hedged requests are restricted to the region they are sent to, so no cross-region retries will be made, only local retries. The primary request is retried as normal.
Status Codes SDK Will Consider Final
All other status codes are treated as possible transient errors and will be retried with hedging.
Example Flow For Cross Region Hedging With 3 Regions
```mermaid
graph TD
    A[RequestMessage] <--> B[RequestInvokerHandler]
    B <--> C[CrossRegionHedgingStrategy]
    C --> E(PrimaryRequest)
    E --> F{time spent < threshold}
    F -- No --> I
    F -- Yes --> G[[Wait for response]]
    G -- Response --> H{Is Response Final}
    H -- Yes --> C
    H -- No --> I(Hedge Request 1)
    I --> J{time spent < threshold step}
    J -- No --> K(Hedge Request 2)
    J -- Yes --> M[[Wait for response]]
    M -- Response --> N{Is Response Final}
    N -- Yes --> C
    N -- No --> K
    K --> O[[Wait for response]]
    O -- Response --> P{Is Response Final}
    P -- Yes --> C
    P -- No, But this is the final hedge request --> C
```
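The flow above can be modeled with a short `asyncio` simulation. This is a minimal sketch of the described behavior, not the SDK implementation: `hedged_read`, `HedgeResult`, `send`, and `is_final` are hypothetical names, the first entry of `regions` is assumed to be the primary region, and cancelling outstanding requests once a result is returned is an assumption of this sketch. Per-request retries inside each region are left out.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List, Optional, Sequence


@dataclass
class HedgeResult:
    status: Optional[int]           # status code of the returned response
    response_region: Optional[str]  # region that served that response
    hedge_context: List[str]        # every region a request was sent to


async def hedged_read(
    regions: Sequence[str],
    send: Callable[[str], Awaitable[int]],  # assumed to return a status code
    is_final: Callable[[int], bool],
    threshold: float,
    threshold_step: float,
) -> HedgeResult:
    tasks: Dict["asyncio.Future[int]", str] = {}  # in-flight request -> region
    contacted: List[str] = []
    last_status: Optional[int] = None
    last_region: Optional[str] = None
    next_index = 0

    def launch() -> None:
        nonlocal next_index
        region = regions[next_index]
        next_index += 1
        contacted.append(region)
        tasks[asyncio.ensure_future(send(region))] = region

    launch()          # primary request
    wait = threshold  # the first hedge waits for the full threshold
    try:
        while tasks:
            more = next_index < len(regions)
            # With regions left, wait at most `wait`; otherwise wait for a response.
            done, _ = await asyncio.wait(
                set(tasks),
                timeout=wait if more else None,
                return_when=asyncio.FIRST_COMPLETED,
            )
            for task in done:
                last_region = tasks.pop(task)
                last_status = task.result()
                if is_final(last_status):
                    return HedgeResult(last_status, last_region, list(contacted))
            if more:
                # Timed out, or got a non-final response: hedge to the next region now.
                launch()
                wait = threshold_step
        # Every request answered and none was final: return the last response.
        return HedgeResult(last_status, last_region, list(contacted))
    finally:
        for task in tasks:
            task.cancel()  # assumption: abandon requests that are still in flight


async def demo() -> HedgeResult:
    # Made-up latencies (seconds) and status codes, for illustration only.
    latency = {"East US": 1.0, "Central US": 0.02, "West US": 0.05}
    status = {"East US": 200, "Central US": 503, "West US": 200}

    async def send(region: str) -> int:
        await asyncio.sleep(latency[region])
        return status[region]

    # Placeholder finality rule; the SDK uses its own list of final status codes.
    return await hedged_read(
        ["East US", "Central US", "West US"], send, lambda s: s < 500,
        threshold=0.1, threshold_step=0.05,
    )


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the demo, the slow primary triggers a hedge to Central US after the threshold. That hedge returns a non-final 503, so the next hedge goes to West US immediately instead of waiting out the threshold step. West US answers with a 200, which is returned with West US as the response region and all three regions in the hedge context.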