@ealsur ealsur commented Aug 22, 2022

Description

There were three conditions under which the Notification APIs surfaced LeaseLostException to the user:

  1. The lease is not found (the lease document was deleted, which indicates an invalid state): https://github.com/Azure/azure-cosmos-dotnet-v3/blob/master/Microsoft.Azure.Cosmos/src/ChangeFeedProcessor/LeaseManagement/DocumentServiceLeaseUpdaterCosmos.cs#L66-L70
  2. Conflicts persist beyond the retry scope (repeated 412 responses, which indicate very high concurrency among workers): https://github.com/Azure/azure-cosmos-dotnet-v3/blob/master/Microsoft.Azure.Cosmos/src/ChangeFeedProcessor/LeaseManagement/DocumentServiceLeaseUpdaterCosmos.cs#L35
  3. A worker processed a batch and was about to checkpoint, but the lease had been stolen, which means the batch will probably be reprocessed by another instance: https://github.com/Azure/azure-cosmos-dotnet-v3/blob/master/Microsoft.Azure.Cosmos/src/ChangeFeedProcessor/LeaseManagement/DocumentServiceLeaseManagerCosmos.cs#L92

With this PR, the notification APIs are still called in these cases, but they receive the LeaseLostException's InnerException, which is the underlying CosmosException (a NotFound, Conflict, or PreconditionFailed error), instead of the LeaseLostException itself.

This way we avoid leaking the internal type while still notifying the user of the problem.
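
As a rough illustration of what this means for user code, the sketch below maps the status code of the CosmosException now passed to the notification APIs to a likely cause. `LeaseFailure` and `LeaseFailureClassifier` are hypothetical names that are not part of the SDK, and the mapping from status code to cause is an assumption based on the three conditions above, not something the SDK guarantees.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of the SDK. The mapping below is an
// assumption drawn from the conditions listed in this description.
public enum LeaseFailure
{
    LeaseDocumentMissing,   // 404: the lease document was not found
    ConcurrentLeaseUpdate,  // 409 or 412: another worker modified the lease
    Other                   // any other status is not classified by this sketch
}

public static class LeaseFailureClassifier
{
    // Classifies an HTTP status code taken from a CosmosException.
    public static LeaseFailure Classify(HttpStatusCode statusCode)
    {
        switch (statusCode)
        {
            case HttpStatusCode.NotFound:
                return LeaseFailure.LeaseDocumentMissing;
            case HttpStatusCode.Conflict:
            case HttpStatusCode.PreconditionFailed:
                return LeaseFailure.ConcurrentLeaseUpdate;
            default:
                return LeaseFailure.Other;
        }
    }
}
```

An error handler registered through `WithErrorNotification` could check whether the exception it receives is a `CosmosException` and, if so, pass its `StatusCode` to `LeaseFailureClassifier.Classify` to decide how to log or alert.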

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Closing issues

Closes #3379

@kundadebdatta kundadebdatta self-requested a review August 23, 2022 20:14
kundadebdatta previously approved these changes Aug 23, 2022
@ealsur ealsur enabled auto-merge (squash) August 23, 2022 22:01
Development

Successfully merging this pull request may close these issues.

LeaseLostException still bubbled up