@simorenoh
Member

@simorenoh simorenoh commented Jul 18, 2024

This PR fixes several bugs and adds retry functionality that was missing from the SDK. It includes:

  • New logic to retry ServiceRequestErrors up to three times before failing the request, so that requests that never reached the service get more than one attempt.
  • A fix for a client hang that occurred when switching write regions in an account (surfacing as a 403/3 write forbidden error). The SDK no longer hangs while resolving the locational endpoint and now routes requests properly. This also adds write failover logic to the SDK.
  • Retry mechanisms for 403/1008 database account not found errors.
  • Small enhancements to the session retry logic and global endpoint manager refreshes: the locking logic now matches the async implementation (slightly faster at deciding whether a lock is needed), and 404/1002 retries now use locational routing.
  • Documentation updates for the client configurations, describing more specifically what each setting does.
  • A fix for a bug introduced in PR 31096, which removed 503 retries from the SDK.
  • Logic to properly refresh the SDK's location cache information every 5 minutes.
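
For readers who want a concrete picture of the first bullet, here is a minimal sketch of a bounded retry loop. It is not the SDK's actual policy code: the helper `run_with_service_request_retries`, the `FlakySender` test double, and the local `ServiceRequestError` class (a stand-in for `azure.core.exceptions.ServiceRequestError`) are all hypothetical names for illustration. It also assumes "retry up to three times" means one initial attempt plus at most three retries.

```python
import time


class ServiceRequestError(Exception):
    """Hypothetical local stand-in for azure.core.exceptions.ServiceRequestError."""


def run_with_service_request_retries(send, max_retries=3, delay_seconds=0.0):
    """Call send() and retry on ServiceRequestError at most max_retries times.

    Assumption: the first call is not counted as a retry, so send() runs
    at most max_retries + 1 times in total.
    """
    retries_used = 0
    while True:
        try:
            return send()
        except ServiceRequestError:
            if retries_used >= max_retries:
                raise  # retry budget exhausted: surface the last error
            retries_used += 1
            time.sleep(delay_seconds)


class FlakySender:
    """Test double that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ServiceRequestError("request never reached the service")
        return "ok"


if __name__ == "__main__":
    sender = FlakySender(failures=2)
    print(run_with_service_request_retries(sender), sender.calls)
```

A sender that fails twice succeeds on its third call, while one that keeps failing raises after the fourth call. Only `ServiceRequestError` is caught here; any other exception propagates immediately, which mirrors the idea that this retry targets requests that never reached the service.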

@simorenoh
Member Author

/azp run python - cosmos - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@azure-sdk
Collaborator

API change check

APIView has identified API-level changes in this PR and created the following API reviews.

azure-cosmos

@simorenoh simorenoh marked this pull request as ready for review July 18, 2024 15:54
@simorenoh
Member Author

/azp run python - cosmos - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@simorenoh
Member Author

/azp run python - cosmos - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Member

@xinlian12 xinlian12 left a comment

LGTM, thanks

@simorenoh
Member Author

/azp run python - cosmos - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@simorenoh
Member Author

/azp run python - cosmos - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@simorenoh
Member Author

/azp run python - cosmos - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@simorenoh
Member Author

/azp run python - cosmos - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@simorenoh
Member Author

/azp run python - cosmos - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@simorenoh
Member Author

/azp run python - cosmos - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@simorenoh simorenoh merged commit f8ab118 into Azure:main Oct 7, 2024
@simorenoh simorenoh deleted the retry_validations branch October 7, 2024 16:56
l0lawrence pushed a commit to l0lawrence/azure-sdk-for-python that referenced this pull request Feb 19, 2025
* 403.3 loop fix, regional routing fix, improvement on service request errors

functional code, missing tests now

* Update ErrorCodesAndRetries.md

* Update TimeoutAndRetriesConfig.md

* Update http_constants.py

* Update CHANGELOG.md

* test improvements for 403 retry

* fix emulator tests

* Update test_globaldb.py

* Update test_globaldb.py

* add ServiceRequestError test and doc update

* addressing comments

* Update test_globaldb.py

* Update test_globaldb.py

* Update test_globaldb.py

* Update test_globaldb.py

* move policy

* revert

* fixes

* Update test_globaldb.py

* Update test_globaldb.py

* Update test_globaldb.py

* Update test_globaldb.py

* Update test_globaldb.py

* Update CHANGELOG.md

* 503 retries

* align readme with changelog

* forceful db account refresh

* remove premature locational endpoint

* make GEM refresh every 5 mins as it should have

* Delete drz3-drill.txt

* Update CHANGELOG.md

* Update test_location_cache.py

* Update _global_endpoint_manager.py

* ensure only one initial database account call

* Delete dr-zdrill-005.txt

* Update test_location_cache.py

* Update test_location_cache.py

* Update test_location_cache.py

* overhaul location_cache tests

Projects

Status: Done

6 participants