"documentation":"<p>Specifies an option from a collection of preconfigured Amazon Machine Image (AMI) images. Each image is configured by Amazon Web Services with a set of software and driver versions. Amazon Web Services optimizes these configurations for different machine learning workloads.</p> <p>By selecting an AMI version, you can ensure that your inference environment is compatible with specific software requirements, such as CUDA driver versions, Linux kernel versions, or Amazon Web Services Neuron driver versions.</p> <p>The AMI version names, and their configurations, are the following:</p> <dl> <dt>al2-ami-sagemaker-inference-gpu-2</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 535</p> </li> <li> <p>CUDA version: 12.2</p> </li> </ul> </dd> <dt>al2-ami-sagemaker-inference-gpu-2-1</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 535</p> </li> <li> <p>CUDA version: 12.2</p> </li> <li> <p>NVIDIA Container Toolkit with disabled CUDA-compat mounting</p> </li> </ul> </dd> <dt>al2-ami-sagemaker-inference-gpu-3-1</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 550</p> </li> <li> <p>CUDA version: 12.4</p> </li> <li> <p>NVIDIA Container Toolkit with disabled CUDA-compat mounting</p> </li> </ul> </dd> <dt>al2-ami-sagemaker-inference-neuron-2</dt> <dd> <ul> <li> <p>Accelerator: Inferentia2 and Trainium</p> </li> <li> <p>Neuron driver version: 2.19</p> </li> </ul> </dd> </dl>"
"documentation":"<p>Settings for the capacity reservation for the compute instances that SageMaker AI reserves for an endpoint. </p>"
     }
   },
   "documentation":"<p> Identifies a model that you want to host and the resources chosen to deploy for hosting it. If you are deploying multiple models, tell SageMaker how to distribute traffic among the models by specifying variant weights. For more information on production variants, check <a href=\"https://docs.aws.amazon.com/sagemaker/latest/dg/model-ab-testing.html\"> Production variants</a>. </p>"
@@ -34320,6 +34361,50 @@
     "ml.eia2.xlarge"
   ]
 },
+"ProductionVariantCapacityReservationConfig":{
+  "type":"structure",
+  "members":{
+    "CapacityReservationPreference":{
+      "shape":"CapacityReservationPreference",
+      "documentation":"<p>Options that you can choose for the capacity reservation. SageMaker AI supports the following options:</p> <dl> <dt>capacity-reservations-only</dt> <dd> <p>SageMaker AI launches instances only into an ML capacity reservation. If no capacity is available, the instances fail to launch.</p> </dd> </dl>"
+    },
+    "MlReservationArn":{
+      "shape":"MlReservationArn",
+      "documentation":"<p>The Amazon Resource Name (ARN) that uniquely identifies the ML capacity reservation that SageMaker AI applies when it deploys the endpoint.</p>"
+    }
+  },
+  "documentation":"<p>Settings for the capacity reservation for the compute instances that SageMaker AI reserves for an endpoint. </p>"
+},
+"ProductionVariantCapacityReservationSummary":{
+  "type":"structure",
+  "members":{
+    "MlReservationArn":{
+      "shape":"MlReservationArn",
+      "documentation":"<p>The Amazon Resource Name (ARN) that uniquely identifies the ML capacity reservation that SageMaker AI applies when it deploys the endpoint.</p>"
+    },
+    "CapacityReservationPreference":{
+      "shape":"CapacityReservationPreference",
+      "documentation":"<p>The option that you chose for the capacity reservation. SageMaker AI supports the following options:</p> <dl> <dt>capacity-reservations-only</dt> <dd> <p>SageMaker AI launches instances only into an ML capacity reservation. If no capacity is available, the instances fail to launch.</p> </dd> </dl>"
+    },
+    "TotalInstanceCount":{
+      "shape":"TaskCount",
+      "documentation":"<p>The number of instances that you allocated to the ML capacity reservation.</p>"
+    },
+    "AvailableInstanceCount":{
+      "shape":"TaskCount",
+      "documentation":"<p>The number of instances that are currently available in the ML capacity reservation.</p>"
+    },
+    "UsedByCurrentEndpoint":{
+      "shape":"TaskCount",
+      "documentation":"<p>The number of instances from the ML capacity reservation that are being used by the endpoint.</p>"
+    },
+    "Ec2CapacityReservations":{
+      "shape":"Ec2CapacityReservationsList",
+      "documentation":"<p>The EC2 capacity reservations that are shared to this ML capacity reservation, if any.</p>"
+    }
+  },
+  "documentation":"<p>Details about an ML capacity reservation.</p>"
"documentation":"<p>Settings for the capacity reservation for the compute instances that SageMaker AI reserves for an endpoint. </p>"
     }
   },
   "documentation":"<p>Describes weight and capacities for a production variant associated with an endpoint. If you sent a request to the <code>UpdateEndpointWeightsAndCapacities</code> API and the endpoint status is <code>Updating</code>, you get different desired and current values. </p>"
@@ -41161,6 +41250,10 @@
     "ProjectS3Path":{
       "shape":"S3Uri",
       "documentation":"<p>The location where Amazon S3 stores temporary execution data and other artifacts for the project that corresponds to the domain.</p>"
+    },
+    "SingleSignOnApplicationArn":{
+      "shape":"SingleSignOnApplicationArn",
+      "documentation":"<p>The ARN of the application managed by SageMaker AI and SageMaker Unified Studio in the Amazon Web Services IAM Identity Center.</p>"
     }
   },
   "documentation":"<p>The settings that apply to an Amazon SageMaker AI domain when you use it in Amazon SageMaker Unified Studio.</p>"
src/sagemaker_core/main/shapes.py
65 additions & 0 deletions
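The member names in the JSON service model are PascalCase (`MlReservationArn`, `Ec2CapacityReservations`), while the attributes documented in `shapes.py` are snake_case (`ml_reservation_arn`, `ec2_capacity_reservations`). A minimal sketch of that correspondence, assuming a simple regex-based conversion; the helper `to_snake_case` is hypothetical and is not taken from the project's code generator:

```python
import re


def to_snake_case(member_name: str) -> str:
    """Insert "_" before every uppercase letter except the first, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", member_name).lower()


# Member names that appear in the service model changes of this diff.
members = [
    "CapacityReservationPreference",
    "MlReservationArn",
    "TotalInstanceCount",
    "AvailableInstanceCount",
    "UsedByCurrentEndpoint",
    "Ec2CapacityReservations",
    "SingleSignOnApplicationArn",
]

for name in members:
    print(f"{name} -> {to_snake_case(name)}")
```

This naive rule happens to reproduce the attribute names visible in this diff; names with consecutive capitals would need extra handling.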
@@ -4779,6 +4779,7 @@ class UnifiedStudioSettings(Base):
     project_id: The ID of the Amazon SageMaker Unified Studio project that corresponds to the domain.
     environment_id: The ID of the environment that Amazon SageMaker Unified Studio associates with the domain.
     project_s3_path: The location where Amazon S3 stores temporary execution data and other artifacts for the project that corresponds to the domain.
+    single_sign_on_application_arn: The ARN of the application managed by SageMaker AI and SageMaker Unified Studio in the Amazon Web Services IAM Identity Center.
@@ -4966,6 +4968,21 @@ class ProductionVariantRoutingConfig(Base):
     routing_strategy: str
 
 
+class ProductionVariantCapacityReservationConfig(Base):
+    """
+    ProductionVariantCapacityReservationConfig
+    Settings for the capacity reservation for the compute instances that SageMaker AI reserves for an endpoint.
+
+    Attributes
+    ----------------------
+    capacity_reservation_preference: Options that you can choose for the capacity reservation. SageMaker AI supports the following options: capacity-reservations-only SageMaker AI launches instances only into an ML capacity reservation. If no capacity is available, the instances fail to launch.
+    ml_reservation_arn: The Amazon Resource Name (ARN) that uniquely identifies the ML capacity reservation that SageMaker AI applies when it deploys the endpoint.
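To illustrate how the two attributes documented above might fit together, here is a minimal sketch using a plain dataclass. It does not use the real `Base` class from `shapes.py`; the class `CapacityReservationConfigSketch`, its `to_request` method, the optional typing of both fields, and the placeholder ARN string are all assumptions made for illustration. Only the attribute names, the PascalCase member names, and the `capacity-reservations-only` value come from the diff.

```python
from dataclasses import dataclass
from typing import Optional

# The only preference value listed in the documentation string in this diff.
SUPPORTED_PREFERENCES = {"capacity-reservations-only"}


@dataclass
class CapacityReservationConfigSketch:
    """Hypothetical stand-in for ProductionVariantCapacityReservationConfig."""

    capacity_reservation_preference: Optional[str] = None
    ml_reservation_arn: Optional[str] = None

    def to_request(self) -> dict:
        """Return the PascalCase members that are set, rejecting unknown preferences."""
        pref = self.capacity_reservation_preference
        if pref is not None and pref not in SUPPORTED_PREFERENCES:
            raise ValueError(f"unsupported preference: {pref!r}")
        request = {
            "CapacityReservationPreference": pref,
            "MlReservationArn": self.ml_reservation_arn,
        }
        # Drop members that were left unset.
        return {key: value for key, value in request.items() if value is not None}


config = CapacityReservationConfigSketch(
    capacity_reservation_preference="capacity-reservations-only",
    ml_reservation_arn="arn:placeholder:ml-reservation",  # placeholder, not a real ARN
)
print(config.to_request())
```

Validating the preference on the client side is a design choice of this sketch only; the diff does not show whether the generated class performs any such check.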
@@ -4988,6 +5005,7 @@ class ProductionVariant(Base):
     managed_instance_scaling: Settings that control the range in the number of instances that the endpoint provisions as it scales up or down to accommodate traffic.
     routing_config: Settings that control how the endpoint routes incoming traffic to the instances that the endpoint hosts.
     inference_ami_version: Specifies an option from a collection of preconfigured Amazon Machine Image (AMI) images. Each image is configured by Amazon Web Services with a set of software and driver versions. Amazon Web Services optimizes these configurations for different machine learning workloads. By selecting an AMI version, you can ensure that your inference environment is compatible with specific software requirements, such as CUDA driver versions, Linux kernel versions, or Amazon Web Services Neuron driver versions. The AMI version names, and their configurations, are the following: al2-ami-sagemaker-inference-gpu-2 Accelerator: GPU NVIDIA driver version: 535 CUDA version: 12.2 al2-ami-sagemaker-inference-gpu-2-1 Accelerator: GPU NVIDIA driver version: 535 CUDA version: 12.2 NVIDIA Container Toolkit with disabled CUDA-compat mounting al2-ami-sagemaker-inference-gpu-3-1 Accelerator: GPU NVIDIA driver version: 550 CUDA version: 12.4 NVIDIA Container Toolkit with disabled CUDA-compat mounting al2-ami-sagemaker-inference-neuron-2 Accelerator: Inferentia2 and Trainium Neuron driver version: 2.19
+    capacity_reservation_config: Settings for the capacity reservation for the compute instances that SageMaker AI reserves for an endpoint.
     """
 
     variant_name: str
@@ -5005,6 +5023,7 @@ class ProductionVariant(Base):
+class ProductionVariantCapacityReservationSummary(Base):
+    """
+    ProductionVariantCapacityReservationSummary
+    Details about an ML capacity reservation.
+
+    Attributes
+    ----------------------
+    ml_reservation_arn: The Amazon Resource Name (ARN) that uniquely identifies the ML capacity reservation that SageMaker AI applies when it deploys the endpoint.
+    capacity_reservation_preference: The option that you chose for the capacity reservation. SageMaker AI supports the following options: capacity-reservations-only SageMaker AI launches instances only into an ML capacity reservation. If no capacity is available, the instances fail to launch.
+    total_instance_count: The number of instances that you allocated to the ML capacity reservation.
+    available_instance_count: The number of instances that are currently available in the ML capacity reservation.
+    used_by_current_endpoint: The number of instances from the ML capacity reservation that are being used by the endpoint.
+    ec2_capacity_reservations: The EC2 capacity reservations that are shared to this ML capacity reservation, if any.
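The three count attributes above lend themselves to a quick capacity check. The following is a minimal sketch, assuming the counts are plain integers; the class `ReservationSummarySketch` and its methods `can_launch` and `endpoint_share` are hypothetical helpers and not part of the generated `ProductionVariantCapacityReservationSummary` class. The `can_launch` rule follows the documented `capacity-reservations-only` behavior, where instances fail to launch if the reservation has no available capacity.

```python
from dataclasses import dataclass


@dataclass
class ReservationSummarySketch:
    """Hypothetical stand-in holding the three count attributes from the summary."""

    total_instance_count: int
    available_instance_count: int
    used_by_current_endpoint: int

    def can_launch(self, additional_instances: int) -> bool:
        """True if the reservation currently has room for the requested instances."""
        return additional_instances <= self.available_instance_count

    def endpoint_share(self) -> float:
        """Fraction of the reservation's instances used by this endpoint."""
        if self.total_instance_count == 0:
            return 0.0
        return self.used_by_current_endpoint / self.total_instance_count


# Illustrative numbers only, not taken from any real reservation.
summary = ReservationSummarySketch(
    total_instance_count=8,
    available_instance_count=3,
    used_by_current_endpoint=4,
)
print(summary.can_launch(3), summary.can_launch(4), summary.endpoint_share())
```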
@@ -8182,6 +8243,7 @@ class ProductionVariantSummary(Base):
     desired_serverless_config: The serverless configuration requested for the endpoint update.
     managed_instance_scaling: Settings that control the range in the number of instances that the endpoint provisions as it scales up or down to accommodate traffic.
     routing_config: Settings that control how the endpoint routes incoming traffic to the instances that the endpoint hosts.
+    capacity_reservation_config: Settings for the capacity reservation for the compute instances that SageMaker AI reserves for an endpoint.
     """
 
     variant_name: str
@@ -8195,6 +8257,9 @@ class ProductionVariantSummary(Base):