promptVersion: '1', // optional, defaults to 'DRAFT'
});
```
## Inference Profiles
Amazon Bedrock Inference Profiles provide a way to manage and optimize inference configurations for your foundation models. They allow you to define reusable configurations that can be applied across different prompts and agents.
### Using Inference Profiles
Inference profiles can be used with prompts and agents to maintain consistent inference configurations across your application.
Amazon Bedrock offers two types of inference profiles:
#### Application Inference Profiles
Application inference profiles are user-defined profiles that help you track costs and model usage. They can be created for a single region or for multiple regions using a cross-region inference profile.
##### Single Region Application Profile
```ts fixture=default
// Create an application inference profile for one Region.
// Construct and property names below are assumed from the module's
// public API and may differ in your installed version.
const appProfile = new bedrock.ApplicationInferenceProfile(this, 'MyApplicationProfile', {
  inferenceProfileName: 'claude-3-5-sonnet-v1',
  modelSource: bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V1_0,
  description: 'Single-region application profile for cost tracking',
});
```
#### System Defined Inference Profiles
Cross-region inference enables you to seamlessly manage unplanned traffic bursts by utilizing compute across different AWS Regions. With cross-region inference, you can distribute traffic across multiple AWS Regions, enabling higher throughput and enhanced resilience during periods of peak demand.
Before using a `CrossRegionInferenceProfile`, ensure that you have access to the models and regions defined in the inference profile. For instance, if you use the system-defined inference profile `us.anthropic.claude-3-5-sonnet-20241022-v2:0`, inference requests will be routed to US East (N. Virginia) `us-east-1`, US East (Ohio) `us-east-2`, and US West (Oregon) `us-west-2`. Thus, you need to have model access enabled in those regions for the model `anthropic.claude-3-5-sonnet-20241022-v2:0`.
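As a sketch, a cross-region profile for that model could be configured as follows; the `CrossRegionInferenceProfile.fromConfig` factory, the `CrossRegionInferenceProfileRegion` enum, and the `BedrockFoundationModel` static shown here are assumed to exist in your version of the module:

```ts fixture=default
// System-defined cross-region (US geo) inference profile for
// Claude 3.5 Sonnet v2; requests may be routed to us-east-1,
// us-east-2, or us-west-2.
const crossRegionProfile = bedrock.CrossRegionInferenceProfile.fromConfig({
  geoRegion: bedrock.CrossRegionInferenceProfileRegion.US,
  model: bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V2_0,
});
```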
#### Intelligent Prompt Routing

Amazon Bedrock intelligent prompt routing provides a single serverless endpoint for efficiently routing requests between different foundation models within the same model family, helping you optimize for response quality and cost. Intelligent prompt routing predicts the performance of each model for each request and dynamically routes the request to the model it predicts is most likely to give the desired response at the lowest cost.
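A minimal sketch of referencing a default prompt router; the `PromptRouter.fromDefaultId` factory and `DefaultPromptRouterIdentifier` constant used here are assumptions about this module's API, not confirmed names:

```ts fixture=default
// Hypothetical: resolve a default Anthropic prompt router for a Region
// and use it wherever a model reference is expected.
const router = bedrock.PromptRouter.fromDefaultId(
  bedrock.DefaultPromptRouterIdentifier.ANTHROPIC_CLAUDE_V1,
  'us-east-1',
);
```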
The `grantProfileUsage` method adds the necessary IAM permissions to the resource, allowing it to use the inference profile. This includes permission to call the `bedrock:GetInferenceProfile` and `bedrock:ListInferenceProfiles` actions on the inference profile resource.
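A sketch of granting usage to a grantable resource, assuming a Lambda function as the grantee and `appProfile` as a previously created application inference profile:

```ts fixture=default
// Hypothetical grantee: any IGrantable, e.g. a Lambda function.
declare const myFunction: lambda.Function;
declare const appProfile: bedrock.ApplicationInferenceProfile;

// Allows the function's execution role to call
// bedrock:GetInferenceProfile and bedrock:ListInferenceProfiles
// on this profile.
appProfile.grantProfileUsage(myFunction);
```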
### Inference Profiles Import Methods
You can import existing application inference profiles using the following methods:
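For example, importing by attributes might look like the following; the `fromApplicationInferenceProfileAttributes` method and its property names are assumptions about this module's API and may differ in your version:

```ts fixture=default
// Hypothetical import of an existing application inference profile
// by its ARN and identifier.
const imported = bedrock.ApplicationInferenceProfile.fromApplicationInferenceProfileAttributes(this, 'ImportedProfile', {
  inferenceProfileArn: 'arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/my-profile-id',
  inferenceProfileIdentifier: 'my-profile-id',
});
```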