Closed
Description
This seems very likely connected to issues #99, #269 and #100.
System Information
- TensorFlow / DNNClassifier
- TensorFlow v1.10
- sagemaker==1.11.0
- conda_tensorflow_p27 on hosted notebook
Describe the problem
I'm running my own Iris classifier to compare SageMaker with Google ML Engine. I took standard TensorFlow code that is known to work (it deploys and serves predictions on ML Engine) and repeated the same steps in SageMaker. Everything goes as expected until I invoke the endpoint, at which point I receive the following errors:
Minimal repro / logs
Notebook Error
ModelError                                Traceback (most recent call last)
<ipython-input-92-07d9ea9b1239> in <module>()
2 # iris_predictor.predict({u'' : list(sample0.values)})
3 # iris_predictor.predict({ u'instances': [dict(sample0)]})
----> 4 iris_predictor.predict({ u'': [dict(sample0)]})
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/sagemaker/predictor.pyc in predict(self, data)
84 request_args['Accept'] = self.accept
85
---> 86 response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
87
88 response_body = response['Body']
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/botocore/client.pyc in _api_call(self, *args, **kwargs)
312 "%s() only accepts keyword arguments." % py_operation_name)
313 # The "self" in this scope is referring to the BaseClient.
--> 314 return self._make_api_call(operation_name, kwargs)
315
316 _api_call.__name__ = str(py_operation_name)
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/botocore/client.pyc in _make_api_call(self, operation_name, api_params)
610 error_code = parsed_response.get("Error", {}).get("Code")
611 error_class = self.exceptions.from_code(error_code)
--> 612 raise error_class(parsed_response, operation_name)
613 else:
614 return parsed_response
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "". See https://ap-southeast-2.console.aws.amazon.com/cloudwatch/home?region=ap-southeast-2#logEventViewer:group=/aws/sagemaker/Endpoints/sagemaker-tensorflow-2018-10-01-02-17-54-088 in account XXXXX for more information.
CloudWatch Errors:
...
10.32.0.2 - - [01/Oct/2018:03:23:33 +0000] "GET /ping HTTP/1.1" 200 0 "-" "AHC/2.0"
[2018-10-01 03:23:37,290] ERROR in serving: u''
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/container_support/serving.py", line 182, in _invoke
self.transformer.transform(content, input_content_type, requested_output_content_type)
File "/usr/local/lib/python2.7/dist-packages/tf_container/serve.py", line 281, in transform
return self.transform_fn(data, content_type, accepts), accepts
File "/usr/local/lib/python2.7/dist-packages/tf_container/serve.py", line 208, in f
prediction = self.predict_fn(input)
File "/usr/local/lib/python2.7/dist-packages/tf_container/serve.py", line 223, in predict_fn
return self.proxy_client.request(data)
File "/usr/local/lib/python2.7/dist-packages/tf_container/proxy_client.py", line 66, in request
request_fn = self.request_fn_map[self.prediction_type]
KeyError: u''
[2018-10-01 03:23:37,290] ERROR in serving: u''
10.32.0.2 - - [01/Oct/2018:03:23:37 +0000] "POST /invocations HTTP/1.1" 500 0 "-" "AHC/2.0"
...
Reproducing the Error
The full model definition passed to Sagemaker for training and evaluation is here
The notebook currently produces the error.
Create the estimator with
from sagemaker.tensorflow import TensorFlow

# execution_role and the two S3 locations are defined earlier in the notebook
estimator = TensorFlow(entry_point='../package/trainer/model.py',
                       role=execution_role,
                       framework_version='1.10',
                       output_path=model_artifacts_location,
                       code_location=custom_code_upload_location,
                       train_instance_count=1,
                       training_steps=100,
                       evaluation_steps=10,
                       train_instance_type='ml.c4.xlarge')
# Fit
estimator.fit(s3_data)
# Deploy
iris_predictor = estimator.deploy(initial_instance_count=1,
                                  instance_type='ml.t2.medium')
test0 = {'PetalLength': 1.7, 'PetalWidth': 0.5, 'SepalLength': 5.1, 'SepalWidth': 3.3}
# Predict (ERROR)
# iris_predictor.predict(test0)
# iris_predictor.predict([test0])
# iris_predictor.predict({ u'instances': [test0]})
iris_predictor.predict({ u'': [test0]})
All of these calls produce the same vague error.
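For anyone trying to reproduce this outside the SDK predictor, here is a minimal sketch of the same request made through boto3 directly, so that the payload and content type are explicit. The endpoint name and region are taken from the CloudWatch log group above. The helper names `build_request` and `invoke` are hypothetical, and it is an assumption that the container accepts `application/json`. I have not confirmed that this avoids the error.

```python
import json


def build_request(endpoint_name, sample):
    """Build keyword arguments for the SageMaker runtime InvokeEndpoint call."""
    return {
        'EndpointName': endpoint_name,
        # Assumption: the serving container accepts a JSON body.
        'ContentType': 'application/json',
        'Accept': 'application/json',
        'Body': json.dumps(sample, sort_keys=True).encode('utf-8'),
    }


def invoke(request_args, region_name='ap-southeast-2'):
    """Send the request with boto3 and return the decoded response body."""
    import boto3  # imported lazily so build_request works without AWS access
    client = boto3.client('sagemaker-runtime', region_name=region_name)
    response = client.invoke_endpoint(**request_args)
    return response['Body'].read().decode('utf-8')


test0 = {'PetalLength': 1.7, 'PetalWidth': 0.5,
         'SepalLength': 5.1, 'SepalWidth': 3.3}
request_args = build_request(
    'sagemaker-tensorflow-2018-10-01-02-17-54-088', test0)
print(request_args['Body'])
```

Calling `invoke(request_args)` against the deployed endpoint should then surface the same 500 (or a response body) without the predictor's serializer in between.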
Thanks for your help.
zjost