feat(bedrock): add prompt caching support for custom ARNs and inferen…#16504
marcelloceschia wants to merge 5 commits into anomalyco:dev
The following comment was made by an LLM, it may be inaccurate: Found 2 potentially related PRs (excluding the current PR #16504):
Check these PRs to see if they've already addressed Bedrock prompt caching or if there's overlap in the implementation approach.
Thanks for updating your PR! It now meets our contributing guidelines. 👍
…ce profiles

- Enable prompt caching for Bedrock models that support it (Claude, Nova)
- Add 'caching' option for custom ARNs/inference profiles without claude in name
- Disable caching for Llama, Mistral, Cohere models (not supported)
- Add comprehensive tests for all caching scenarios

Fixes #1: Prompt cache not supported for custom ARN models
Fixes anomalyco#2: 1M context window not configurable

Users can now configure custom ARNs like:

```json
{
  "provider": {
    "amazon-bedrock": {
      "models": {
        "arn:aws:bedrock:...:application-inference-profile/xxx": {
          "options": { "caching": true },
          "limit": { "context": 1000000, "output": 32000 }
        }
      }
    }
  }
}
```
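The caching rules listed in the commit message could be sketched roughly as follows. This is not the code from the PR: the function name `supportsPromptCaching`, the `ModelConfig` shape, and the precedence (an explicit `caching` option wins, then the unsupported families, then the supported ones) are assumptions made for illustration. Only the `options.caching` key and the model family names come from the description.

```typescript
// Hypothetical config shape; only `options.caching` is taken from the PR text.
interface ModelConfig {
  options?: { caching?: boolean }
}

// Model families named in the PR description.
const CACHING_FAMILIES = ["claude", "nova"]
const NON_CACHING_FAMILIES = ["llama", "mistral", "cohere"]

// Decide whether prompt caching should be enabled for a Bedrock model ID or ARN.
// Assumed precedence: explicit option > unsupported family > supported family.
function supportsPromptCaching(modelID: string, config: ModelConfig = {}): boolean {
  const explicit = config.options?.caching
  if (explicit !== undefined) return explicit

  const id = modelID.toLowerCase()
  if (NON_CACHING_FAMILIES.some((family) => id.includes(family))) return false
  return CACHING_FAMILIES.some((family) => id.includes(family))
}

// A custom ARN carries no model family in its name, so it needs the explicit option.
const arn = "arn:aws:bedrock:...:application-inference-profile/xxx"
console.log(supportsPromptCaching(arn))
console.log(supportsPromptCaching(arn, { options: { caching: true } }))
```

Under these assumptions, an application inference profile ARN is treated as non-caching unless the user opts in, which matches the motivation stated for the new `caching` option.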
2ed846f to e349074
…ce profiles
Fixes: Prompt cache not supported for custom ARN models
Fixes: 1M context window not configurable
Users can now configure custom ARNs like:
{
  "provider": {
    "amazon-bedrock": {
      "models": {
        "arn:aws:bedrock:...:application-inference-profile/xxx": {
          "options": { "caching": true },
          "limit": { "context": 1000000, "output": 32000 }
        }
      }
    }
  }
}

Issue for this PR
Closes #
Type of change
What does this PR do?
Please provide a description of the issue, the changes you made to fix it, and why they work. It is expected that you understand why your changes work and if you do not understand why at least say as much so a maintainer knows how much to value the PR.
If you paste a large clearly AI generated description here your PR may be IGNORED or CLOSED!
How did you verify your code works?
I ran the code locally and verified on our Grafana dashboard that caching is now used.
Screenshots / recordings
If this is a UI change, please include a screenshot or recording.
Checklist
If you do not follow this template your PR will be automatically rejected.