config async problem when creating endpoint #4275

Open
RwGrid opened this issue Sep 17, 2024 · 1 comment
Labels: bug, p2, response-requested, sagemaker, service-api

Comments

@RwGrid

RwGrid commented Sep 17, 2024

Describe the bug

When I remove 'AsyncInferenceConfig' from the endpoint configuration, endpoint creation works. However, I need an asynchronous endpoint because my payload is above 5 MB: it is a NumPy array of an image, 50+ MB in size. The container is Triton.
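For context, here is a minimal sketch of how a large payload is typically sent to an asynchronous endpoint once it exists: the payload is uploaded to S3 and the request only carries its S3 location. The bucket name, object key, and local file path are hypothetical placeholders, and the endpoint name is the one used later in this report. This illustrates the use case and is not a fix for the error.

```python
def build_async_invoke_kwargs(endpoint_name, bucket, key,
                              content_type="application/octet-stream"):
    """Build keyword arguments for SageMakerRuntime.invoke_endpoint_async."""
    return {
        "EndpointName": endpoint_name,
        # Async endpoints read the payload from S3 instead of the request body.
        "InputLocation": f"s3://{bucket}/{key}",
        "ContentType": content_type,
    }


def invoke_async(local_path, endpoint_name, bucket, key):
    """Upload the payload to S3, then queue an asynchronous inference request."""
    import boto3  # imported lazily so the helper above has no dependencies

    # bucket and key are assumptions; use a location the endpoint role can read.
    boto3.client("s3").upload_file(local_path, bucket, key)
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint_async(
        **build_async_invoke_kwargs(endpoint_name, bucket, key)
    )
    # The result is written to S3 later; this is where it will appear.
    return response["OutputLocation"]
```

Usage would look like `invoke_async("image.npy", "tritonendpointsasync", "my-bucket", "inputs/image.npy")`, where every argument except the endpoint name is illustrative.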

Expected Behavior

The endpoint is created successfully.

Current Behavior

botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreateEndpoint operation: One or more endpoint features are not supported using this configuration
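To make an error like this easier to triage, the error code, message, and request ID can be pulled out of the exception. The sketch below assumes a SageMaker client is passed in; the helper names `summarize_client_error` and `create_endpoint_verbose` are made up for illustration.

```python
def summarize_client_error(error):
    """Return (code, message, request_id) from a botocore ClientError."""
    details = error.response.get("Error", {})
    metadata = error.response.get("ResponseMetadata", {})
    return (
        details.get("Code", "Unknown"),
        details.get("Message", ""),
        metadata.get("RequestId"),
    )


def create_endpoint_verbose(sagemaker_client, endpoint_name, endpoint_config_name):
    """Call CreateEndpoint and print the service error details on failure."""
    from botocore.exceptions import ClientError  # lazy import

    try:
        return sagemaker_client.create_endpoint(
            EndpointName=endpoint_name,
            EndpointConfigName=endpoint_config_name,
        )
    except ClientError as error:
        code, message, request_id = summarize_client_error(error)
        print(f"{code}: {message} (request id: {request_id})")
        raise  # re-raise so the caller still sees the original failure
```

The request ID is useful to include when asking the service team to look into a specific failed call.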

Reproduction Steps

create_model = sagemaker_client.create_model(
    ModelName='tritoninferencesasync',
    ExecutionRoleArn=role,
    PrimaryContainer=container,
)

create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName='endpointconfigsingleasnyc',
    ProductionVariants=[
        {
            "VariantName": "variant1",
            "ModelName": 'tritoninferencesasync',
            "InstanceType": "ml.m5.xlarge",
            "InitialInstanceCount": 1,
        }
    ],
    AsyncInferenceConfig={
        "OutputConfig": {
            "S3OutputPath": "s3://allinferences/yy/xx"
        },
        "ClientConfig": {"MaxConcurrentInvocationsPerInstance": 4},
    },
)
endpoint_name = 'tritonendpointsasync'
create_multi_endpoint = sagemaker_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName='endpointconfigsingleasnyc',
)
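The same steps can be sketched with a single client and an explicit region, which makes it easier to rule out a client or region mix-up. The function names and the `region_name` parameter are assumptions added for illustration; the configuration values mirror the ones above. This is not expected to resolve the ValidationException by itself.

```python
def build_async_endpoint_config(config_name, model_name, s3_output_path,
                                instance_type="ml.m5.xlarge",
                                max_concurrent=4):
    """Build keyword arguments for create_endpoint_config with async inference."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "variant1",
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
            }
        ],
        "AsyncInferenceConfig": {
            "OutputConfig": {"S3OutputPath": s3_output_path},
            "ClientConfig": {"MaxConcurrentInvocationsPerInstance": max_concurrent},
        },
    }


def deploy(endpoint_name, config_kwargs, region_name=None):
    """Create the endpoint config and endpoint with one client, then wait."""
    import boto3  # imported lazily so the builder above has no dependencies

    client = boto3.client("sagemaker", region_name=region_name)
    client.create_endpoint_config(**config_kwargs)
    client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_kwargs["EndpointConfigName"],
    )
    # Block until the endpoint reaches InService (or the waiter gives up).
    client.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
    return client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
```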

Possible Solution

Maybe try another region? I am currently using Milan (eu-south-1).

Additional Information/Context

No response

SDK version used

boto3==1.35.11 botocore==1.35.11

Environment details (OS name and version, etc.)

Windows 11

@RwGrid RwGrid added bug This issue is a confirmed bug. needs-triage This issue or PR still needs to be triaged. labels Sep 17, 2024
@tim-finnigan tim-finnigan self-assigned this Sep 19, 2024
@tim-finnigan tim-finnigan added the investigating This issue is being investigated and/or work is in progress to resolve the issue. label Sep 19, 2024
@tim-finnigan
Contributor

tim-finnigan commented Sep 19, 2024

Thanks for reaching out. The create_endpoint command makes a request to the CreateEndpoint API, so the ValidationException is coming from the SageMaker service API rather than from boto3. Have you tried testing in other regions? I found a report of this same error in which someone commented that the region could be the issue (although the SageMaker documentation says that Milan should be supported).
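One way to pick other regions to test is sketched below. It assumes the failing region is eu-south-1 (Milan) and uses hypothetical helper names. Note that a region being listed only means boto3 knows a SageMaker endpoint for it; it says nothing about whether asynchronous inference is supported there.

```python
def other_regions(current_region, available_regions):
    """Return regions to try, excluding the one that already failed."""
    return sorted(r for r in available_regions if r != current_region)


def list_candidate_regions(current_region="eu-south-1"):
    """List SageMaker regions known to boto3, minus the current one."""
    import boto3  # imported lazily so other_regions has no dependencies

    available = boto3.Session().get_available_regions("sagemaker")
    return other_regions(current_region, available)


def sagemaker_client_for(region_name):
    """Create a SageMaker client pinned to a specific region."""
    import boto3

    return boto3.client("sagemaker", region_name=region_name)
```

The model, role, and S3 output bucket would also need to exist in whichever region is tested.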

I also noticed that your snippet uses both sagemaker_client and sm_client, so I wanted to check whether you intended to use separate clients here. If you are still seeing the issue, please share debug logs (with any sensitive info redacted) by adding boto3.set_stream_logger('') to your script.
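A minimal sketch of enabling those debug logs with some automatic masking follows. The `redact` helper and `RedactingFilter` class are illustrative additions, not part of boto3, and the patterns only cover 12-digit account IDs and access key IDs. Signatures, session tokens, and other secrets are not covered, so the output should still be reviewed by hand before sharing.

```python
import logging
import re

# Assumed patterns: 12-digit AWS account IDs and AKIA/ASIA access key IDs.
ACCOUNT_ID = re.compile(r"\b\d{12}\b")
ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def redact(text):
    """Mask access key IDs and account IDs in a log line."""
    text = ACCESS_KEY_ID.sub("<ACCESS_KEY_ID>", text)
    return ACCOUNT_ID.sub("<ACCOUNT_ID>", text)


class RedactingFilter(logging.Filter):
    """Logging filter that rewrites each record's message through redact()."""

    def filter(self, record):
        record.msg = redact(record.getMessage())
        record.args = ()  # message is already fully formatted
        return True


def enable_redacted_debug_logging():
    """Turn on boto3 debug logging and attach the redacting filter."""
    import boto3  # imported lazily so redact() can be used on its own

    boto3.set_stream_logger("")  # adds a DEBUG stream handler to the root logger
    for handler in logging.getLogger("").handlers:
        handler.addFilter(RedactingFilter())
```

Calling `enable_redacted_debug_logging()` before creating any clients should capture the full CreateEndpoint request and response.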

@tim-finnigan tim-finnigan added response-requested Waiting on additional information or feedback. service-api This issue is caused by the service API, not the SDK implementation. sagemaker p2 This is a standard priority issue and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. needs-triage This issue or PR still needs to be triaged. labels Sep 19, 2024