Unable to invoke SageMaker API endpoint - Vague Error: KeyError: u'' #413

@tlelson

This seems very likely connected to issues #99, #269 and #100.

System Information

  • TensorFlow / DNNClassifier
  • TensorFlow v1.10, sagemaker==1.11.0
  • conda_tensorflow_p27 on hosted notebook

Describe the problem

I'm running my own Iris classifier to compare SageMaker with Google ML Engine (MLE). I took standard TensorFlow code that is known to work (it trains, deploys and predicts on MLE) and repeated the same steps in SageMaker. Everything goes as expected until I invoke the endpoint, at which point I receive the following errors:

Minimal repro / logs

Notebook Error

ModelErrorTraceback (most recent call last)
<ipython-input-92-07d9ea9b1239> in <module>()
      2 # iris_predictor.predict({u'' : list(sample0.values)})
      3 # iris_predictor.predict({ u'instances': [dict(sample0)]})
----> 4 iris_predictor.predict({ u'': [dict(sample0)]})

/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/sagemaker/predictor.pyc in predict(self, data)
     84             request_args['Accept'] = self.accept
     85 
---> 86         response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
     87 
     88         response_body = response['Body']

/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/botocore/client.pyc in _api_call(self, *args, **kwargs)
    312                     "%s() only accepts keyword arguments." % py_operation_name)
    313             # The "self" in this scope is referring to the BaseClient.
--> 314             return self._make_api_call(operation_name, kwargs)
    315 
    316         _api_call.__name__ = str(py_operation_name)

/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/botocore/client.pyc in _make_api_call(self, operation_name, api_params)
    610             error_code = parsed_response.get("Error", {}).get("Code")
    611             error_class = self.exceptions.from_code(error_code)
--> 612             raise error_class(parsed_response, operation_name)
    613         else:
    614             return parsed_response

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "". See https://ap-southeast-2.console.aws.amazon.com/cloudwatch/home?region=ap-southeast-2#logEventViewer:group=/aws/sagemaker/Endpoints/sagemaker-tensorflow-2018-10-01-02-17-54-088 in account XXXXX for more information.
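
The log group in that message appears to follow the pattern /aws/sagemaker/Endpoints/<endpoint name>. A rough sketch of pulling the matching error lines with boto3 instead of the console; the helper names are my own, and it assumes boto3 is installed with credentials and region already configured:

```python
def log_group_for_endpoint(endpoint_name):
    """Build the CloudWatch log group name seen in the ModelError message."""
    return "/aws/sagemaker/Endpoints/" + endpoint_name


def fetch_error_lines(endpoint_name, limit=50):
    """Return up to `limit` log messages containing ERROR for the endpoint.

    Hypothetical helper: it needs boto3 plus AWS credentials, so the import
    is kept local and nothing touches AWS until the function is called.
    """
    import boto3

    logs = boto3.client("logs")
    response = logs.filter_log_events(
        logGroupName=log_group_for_endpoint(endpoint_name),
        filterPattern="ERROR",
        limit=limit,
    )
    return [event["message"] for event in response.get("events", [])]
```

Calling fetch_error_lines with the endpoint name from the message should return the same ERROR lines as the console view, provided the role is allowed to read that log group.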

CloudWatch Errors:

...
10.32.0.2 - - [01/Oct/2018:03:23:33 +0000] "GET /ping HTTP/1.1" 200 0 "-" "AHC/2.0"
[2018-10-01 03:23:37,290] ERROR in serving: u''
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/container_support/serving.py", line 182, in _invoke
self.transformer.transform(content, input_content_type, requested_output_content_type)
File "/usr/local/lib/python2.7/dist-packages/tf_container/serve.py", line 281, in transform
return self.transform_fn(data, content_type, accepts), accepts
File "/usr/local/lib/python2.7/dist-packages/tf_container/serve.py", line 208, in f
prediction = self.predict_fn(input)
File "/usr/local/lib/python2.7/dist-packages/tf_container/serve.py", line 223, in predict_fn
return self.proxy_client.request(data)
File "/usr/local/lib/python2.7/dist-packages/tf_container/proxy_client.py", line 66, in request
request_fn = self.request_fn_map[self.prediction_type]
KeyError: u''
[2018-10-01 03:23:37,290] ERROR in serving: u''
10.32.0.2 - - [01/Oct/2018:03:23:37 +0000] "POST /invocations HTTP/1.1" 500 0 "-" "AHC/2.0"
...
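
My reading of the last frame is that self.prediction_type is an empty string at the moment of the lookup, so the dictionary access fails and the only message logged is the missing key itself. A minimal sketch of that failure mode follows; the map keys and handlers are invented for illustration and this is not the container's code:

```python
# Hypothetical stand-in for the lookup shown at proxy_client.py line 66.
# The keys and handler functions below are made up for this sketch.
def _predict(data):
    return ("predict", data)


def _classify(data):
    return ("classify", data)


request_fn_map = {"predict": _predict, "classify": _classify}


def request(prediction_type, data):
    # Plain dict lookup: an unknown key raises KeyError, and the exception
    # message is nothing more than the key that was not found.
    request_fn = request_fn_map[prediction_type]
    return request_fn(data)


def describe_failure(prediction_type):
    """Return a short description of what the lookup does for this key."""
    try:
        request(prediction_type, {})
    except KeyError as err:
        return "KeyError: %r" % (err.args[0],)
    return "ok"


print(describe_failure(""))
print(describe_failure("predict"))
```

If that reading is right, the message is vague because an empty key carries no information, which would be why the log shows nothing beyond u''.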

Reproducing the Error

The full model definition passed to SageMaker for training and evaluation is here.

The notebook currently produces the error.

Create the estimator with:

from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(entry_point='../package/trainer/model.py',
                       role=execution_role,
                       framework_version='1.10',
                       output_path=model_artifacts_location,
                       code_location=custom_code_upload_location,
                       train_instance_count=1,
                       training_steps=100,
                       evaluation_steps=10,
                       train_instance_type='ml.c4.xlarge')

# Fit
estimator.fit(s3_data)

# Deploy
iris_predictor = estimator.deploy(initial_instance_count=1,
                                  instance_type='ml.t2.medium')

test0 = {'PetalLength': 1.7, 'PetalWidth': 0.5, 'SepalLength': 5.1, 'SepalWidth': 3.3}

# Predict (ERROR)
# iris_predictor.predict(test0)
# iris_predictor.predict([test0])
# iris_predictor.predict({u'instances': [test0]})
iris_predictor.predict({u'': [test0]})

All of these variants produce the same vague error.
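
For reference, these are the request bodies I would expect each attempt to put on the wire, assuming the predictor serialises its input as plain JSON (I have not confirmed which serializer the TensorFlow predictor actually uses). to_json_bodies is just a helper for this sketch:

```python
import json

test0 = {'PetalLength': 1.7, 'PetalWidth': 0.5, 'SepalLength': 5.1, 'SepalWidth': 3.3}

# The four payload shapes tried above, in the same order.
payloads = [
    test0,
    [test0],
    {u'instances': [test0]},
    {u'': [test0]},
]


def to_json_bodies(items):
    """Serialise each payload the way a plain JSON serializer would."""
    return [json.dumps(item, sort_keys=True) for item in items]


for body in to_json_bodies(payloads):
    print(body)
```

None of the four shapes changes the outcome, which makes me think the problem is not in the request body.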

Thanks for your help.
