-
Notifications
You must be signed in to change notification settings - Fork 354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: adds function/method enhancements, demo samples #122
Conversation
tests/unit/test_value_converter.py
Outdated
# `expected_type` is `test_value_converter.SomeMessage` while | ||
# `actual_from_value_output` is just `SomeMessage` | ||
# Use `isinstance()` instead. | ||
#assert(type(actual_from_value_output) is type(expected_type)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried changing this to isinstance()
, but instead get this message:
E AssertionError: assert False
E + where False = isinstance(test_str: "Omnia Gallia est divisa"\ntest_int64: 3\ntest_bool: true\n, SomeMessage)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like an artifact of how the tests are collected and run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be. I'm inclined to leave this test as-is for now (since we test property-level equivalency). I'll log a GitHub issue and assign it to myself to further investigate this problem.
Does that work for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -14,6 +14,8 @@ | |||
# See the License for the specific language governing permissions and | |||
# limitations under the License. | |||
# | |||
from google.cloud.aiplatform_helpers import add_methods_to_classes_in_package |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest using a namespace that does not imply public API, since we don't expect the users to use this, right?. perhaps something like _helpers
instead of aiplatform_helpers
?
Also I think the convention here is to import the module and not individual methods or classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I think the convention here is to import the module and not individual methods or classes.
That's what the public Google Python style guide says. https://google.github.io/styleguide/pyguide.html#22-imports I don't know if we've followed it consistently in the past, but probably best to adhere to this for new code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The value_converter.py module is intended to be public. It will be helpful for tabular developers who need to format their prediction instances, for example.
I'll change the name of the methods intended to be private so that they have a leading underscore in their names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see - perhaps add_methods_to_classes_in_package
should be in a private module, where as value_converter
a public module. in that case perhaps a nested submodule aiplatform.helpers.value_converter
would be preferred over aiplatform_helpers.value_converter
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
from google.cloud.aiplatform.v1beta1.schema.predict.instance import text_sentiment_pb2 as gcaspi_text_sentiment # type: ignore | ||
# DO NOT OVERWRITE FOLLOWING LINE: it was manually edited. | ||
from google.cloud.aiplatform.v1beta1.schema.predict.instance import TextSentimentPredictionInstance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't this be overwritten by the next re-generation? Why is it necessary to change the import here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this replace needs to be made permanent, please do it in the synth.py
. example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
cls.to_value.__doc__ = to_value.__doc__ | ||
|
||
# Add from_value() method to class with docstring | ||
cls.from_value = add_from_value_to_class(cls) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this one not calling setattr
like the other two methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
I was just trying different methods of assigning members dynamically; forgot to standardize on one technique.
return False | ||
return True | ||
|
||
props = list(filter(is_prop, dir(self._pb))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the intention here is to collect all the field names - is there a better to do that than relying on attribute name's first character?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@software-dov Do you have any suggestsions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that was my hack for trying to get around the "int64s as strings" issue.
However, playing with the Java library the other day, I think that sending int64 values as strings might not be as big a deal as I originally though. I'm going to switch this to a simple call to json_format.ParseDict()
and make sure that I can still train a model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This apparently isn't needed! I've removed this code.
for prop in props: | ||
props_dict[prop] = getattr(self._pb, prop) | ||
|
||
return json_format.ParseDict(props_dict, Value()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this work if some of the values of props_dict
are nested proto messages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this bit.
|
||
training_pipeline = { | ||
"display_name": display_name, | ||
"training_task_definition": "gs://google-cloud-aiplatform/schema/trainingjob/definition/automl_image_classification_1.0.0.yaml", | ||
"training_task_inputs": training_task_inputs, | ||
"training_task_inputs": icn_training_inputs.to_value(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rather prefer not to have additional method calls here.
(that is: define a new variable above so that the value of "trainign_task_inputs"
is just that variable)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also note that this is simply a style preference with some hidden implication on sample generation. please feel free to leave it as is for sample review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
instances = [instance_val] | ||
|
||
params_obj = params.ImageClassificationPredictionParams({ | ||
"confidence_threshold": 0.5, "max_predictions": 5}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems more common to pass these in as parameters as opposed to a dict, as is done in the other sample of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
tests/unit/test_enhanced_types.py
Outdated
ModelType = definition.AutoMlImageClassificationInputs().ModelType | ||
|
||
|
||
class EnhancedTypesTests(unittest.TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other library unit tests are written in the style of plain pytest functions. Unless that is not feasible for what you are testing here, please follow the same convention. Also perhaps add another folder under tests/unit
to house tests that specifically have to do with enhanced types, and add yourself as codeowner of that folder (if tests are allowed to have codeowners).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CODEOWNERS are by directory, so that is definitely possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both done: moved tests, switched to pytest rather than unittest.
tests/unit/test_value_converter.py
Outdated
# `expected_type` is `test_value_converter.SomeMessage` while | ||
# `actual_from_value_output` is just `SomeMessage` | ||
# Use `isinstance()` instead. | ||
#assert(type(actual_from_value_output) is type(expected_type)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like an artifact of how the tests are collected and run.
@@ -14,6 +14,8 @@ | |||
# See the License for the specific language governing permissions and | |||
# limitations under the License. | |||
# | |||
from google.cloud.aiplatform_helpers import add_methods_to_classes_in_package |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I think the convention here is to import the module and not individual methods or classes.
That's what the public Google Python style guide says. https://google.github.io/styleguide/pyguide.html#22-imports I don't know if we've followed it consistently in the past, but probably best to adhere to this for new code.
|
||
from google.cloud.aiplatform.v1beta1.schema.predict.instance import text_sentiment_pb2 as gcaspi_text_sentiment # type: ignore | ||
# DO NOT OVERWRITE FOLLOWING LINE: it was manually edited. | ||
from google.cloud.aiplatform.v1beta1.schema.predict.instance import TextSentimentPredictionInstance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this replace needs to be made permanent, please do it in the synth.py
. example
@@ -78,14 +78,14 @@ class AutoMlForecastingInputs(proto.Message): | |||
function over the validation set. | |||
|
|||
The supported optimization objectives: | |||
"minimize-rmse" (default) - Minimize root- | |||
"minimize-rmse" (default) - Minimize root- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PSA: If the proto comments are formatted correctly but the docstrings are getting generated in a weird state please file bugs on the generator repo. https://github.com/googleapis/gapic-generator-python
return False | ||
return True | ||
|
||
props = list(filter(is_prop, dir(self._pb))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@software-dov Do you have any suggestsions?
} | ||
training_task_inputs = json_format.ParseDict(training_task_inputs_dict, Value()) | ||
|
||
icn_training_inputs = definition.AutoMlImageClassificationInputs( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instances are nicer b/c IDEs can offer more assistance with field names and types. Dicts are sometimes easier to pass around though.
I don't think we currently mandate one style or the other in the style guide. There is a mix in the currently published samples.
tests/unit/test_enhanced_types.py
Outdated
ModelType = definition.AutoMlImageClassificationInputs().ModelType | ||
|
||
|
||
class EnhancedTypesTests(unittest.TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CODEOWNERS are by directory, so that is definitely possible.
} | ||
training_task_inputs = json_format.ParseDict(training_task_inputs_dict, Value()) | ||
|
||
icn_training_inputs = definition.AutoMlImageClassificationInputs( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General consensus amongst the owners was preference to have generated classes for API resources - having spell check and autocomplete as well as knowing where to look in reference docs is helpful
If you're using a user-defined object with arbitrary properties, a dict may be simpler.
endpoint = client.endpoint_path( | ||
project=project, location=location, endpoint=endpoint_id | ||
) | ||
response = client.predict( | ||
endpoint=endpoint, instances=instances, parameters=parameters | ||
endpoint=endpoint, instances=instances, parameters=params_obj | ||
) | ||
print("response") | ||
print(" deployed_model_id:", response.deployed_model_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - Is there a reason for the extra space at the beginning of this print statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a \t character in there earlier, but it got dropped accidentally. Added it back.
@@ -31,4 +31,4 @@ def test_ucaip_generated_predict_image_classification_sample(capsys): | |||
) | |||
|
|||
out, _ = capsys.readouterr() | |||
assert 'string_value: "daisy"' in out | |||
assert 'deployed_model_id:' in out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this test case change? Is there any chance this could lead to a false positive if no model ID is returned? Or will the sample straight up fail before it gets to this print statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of reasons, but biggest of them: we want to avoid testing for the output of models, since retraining can cause the predictions to change.
No, a model ID must be returned as part of the online prediction--you can't have a prediction without a model! The sample will fail if you attempt to send a prediction request to an endpoint that has no model deployed to it.
@@ -0,0 +1,2 @@ | |||
# Tests for enhancements to the AI Platform library for Python. | |||
* @telpirion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@busunkim96 should this instead be at .github/CODEOWNERS
?
and then be formatted as
(this might have whitespace errors but I'm just trying to convey the idea here :) )
/tests/unit/enhanced_library @telpirion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
This PR does the following:
to_value()
,from_value()
, andfrom_map()
utility functionsprotobuf.Value
objects