-
Notifications
You must be signed in to change notification settings - Fork 354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Add retries when polling during monitoring runs #786
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -39,6 +39,7 @@ | |
|
||
import proto | ||
|
||
from google.api_core import retry | ||
from google.api_core import operation | ||
from google.auth import credentials as auth_credentials | ||
from google.cloud.aiplatform import initializer | ||
|
@@ -48,6 +49,9 @@ | |
|
||
logging.basicConfig(level=logging.INFO, stream=sys.stdout) | ||
|
||
# This is the default retry callback to be used with get methods. | ||
_DEFAULT_RETRY = retry.Retry() | ||
|
||
|
||
class Logger: | ||
"""Logging wrapper class with high level helper methods.""" | ||
|
@@ -532,7 +536,9 @@ def _get_gca_resource(self, resource_name: str) -> proto.Message: | |
location=self.location, | ||
) | ||
|
||
return getattr(self.api_client, self._getter_method)(name=resource_name) | ||
return getattr(self.api_client, self._getter_method)( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How about There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The purpose of the PR is to address intermittent failure during polling and the goal is to scope to that issue. We can follow up on |
||
name=resource_name, retry=_DEFAULT_RETRY | ||
) | ||
|
||
def _sync_gca_resource(self): | ||
"""Sync GAPIC service representation of client class resource.""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC gapic_v1.method.DEFAULT is used when no retry is provided (https://github.com/googleapis/python-aiplatform/search?q=retry%3D). I wonder if there is any difference between _DEFAULT_RETRY and gapic_v1.method.DEFAULT?
If there is no difference, I think we need a more aggressive _DEFAULT_RETRY to fix the current problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gapic_v1.method.DEFAULT
is a sentinel value that is used to set the default retry passed into the constructor of the Gapic callable.See usage here: https://github.com/googleapis/python-api-core/blob/main/google/api_core/gapic_v1/method.py#L125
The default retry is
None
: https://github.com/googleapis/python-api-core/blob/main/google/api_core/gapic_v1/method.py#L147cc: @ivanmkc