feat(bedrock): add prompt caching support for custom ARNs and inferen…#16504

Open
marcelloceschia wants to merge 5 commits into anomalyco:dev from marcelloceschia:fix/bedrock-prompt-caching-custom-arn


@marcelloceschia marcelloceschia commented Mar 7, 2026

…ce profiles

  • Enable prompt caching for Bedrock models that support it (Claude, Nova)
  • Add a 'caching' option for custom ARNs/inference profiles whose names do not contain "claude"
  • Disable caching for Llama, Mistral, Cohere models (not supported)
  • Add comprehensive tests for all caching scenarios
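
The rules in the list above could be sketched roughly as follows. This is a hypothetical illustration, not the code in this PR: the function name `shouldEnableCaching`, the `ModelOptions` type, the substring matching, and the precedence between the explicit option and the unsupported families are all assumptions.

```typescript
// Hypothetical sketch; names and matching rules are assumptions, not the PR's code.
interface ModelOptions {
  caching?: boolean
}

// Model families the description lists as supporting / not supporting prompt caching.
const SUPPORTED_FAMILIES = ["claude", "nova"]
const UNSUPPORTED_FAMILIES = ["llama", "mistral", "cohere"]

function shouldEnableCaching(modelID: string, options: ModelOptions = {}): boolean {
  const id = modelID.toLowerCase()
  // Assumption: families listed as unsupported never get caching, even if requested.
  if (UNSUPPORTED_FAMILIES.some((name) => id.includes(name))) return false
  // An explicit 'caching' option decides for custom ARNs and inference profiles.
  if (options.caching !== undefined) return options.caching
  // Otherwise fall back to detecting a supported family in the model ID.
  return SUPPORTED_FAMILIES.some((name) => id.includes(name))
}
```

With this sketch, an ARN that carries no family name is cached only when the user sets `caching: true`, which is the case the new option is meant to cover.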

Fixes: Prompt cache not supported for custom ARN models
Fixes: 1M context window not configurable

Users can now configure custom ARNs like:

{
  "provider": {
    "amazon-bedrock": {
      "models": {
        "arn:aws:bedrock:...:application-inference-profile/xxx": {
          "options": { "caching": true },
          "limit": { "context": 1000000, "output": 32000 }
        }
      }
    }
  }
}
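
As a rough sketch of how an entry like the one above might be read, the snippet below resolves the `options.caching` flag and the `limit` values for a model ID. The `ModelConfig` type, the `resolveModel` helper, and the fallback numbers in `DEFAULT_LIMIT` are hypothetical placeholders chosen for illustration; they are not taken from this PR or from the project's actual defaults.

```typescript
// Hypothetical sketch; type names, helper name and default values are assumptions.
interface ModelConfig {
  options?: { caching?: boolean }
  limit?: { context?: number; output?: number }
}

// Placeholder fallbacks used only when the config omits a limit.
const DEFAULT_LIMIT = { context: 200_000, output: 8_192 }

function resolveModel(models: Record<string, ModelConfig>, id: string) {
  // Unknown IDs fall back to an empty entry, so every field gets a default.
  const entry: ModelConfig = models[id] ?? {}
  return {
    caching: entry.options?.caching ?? false,
    limit: {
      context: entry.limit?.context ?? DEFAULT_LIMIT.context,
      output: entry.limit?.output ?? DEFAULT_LIMIT.output,
    },
  }
}
```

Under these assumptions, a configured entry with `"limit": { "context": 1000000, "output": 32000 }` resolves to those values, while an unconfigured model gets the placeholder defaults and no caching.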

Issue for this PR

Closes #

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Please provide a description of the issue, the changes you made to fix it, and why they work. It is expected that you understand why your changes work and if you do not understand why at least say as much so a maintainer knows how much to value the PR.

If you paste a large clearly AI generated description here your PR may be IGNORED or CLOSED!

How did you verify your code works?

I ran the code locally and verified on our Grafana dashboard that caching is now used.

Screenshots / recordings

image

If this is a UI change, please include a screenshot or recording.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

If you do not follow this template your PR will be automatically rejected.

@github-actions github-actions bot added the needs:compliance label (this means the issue will auto-close after 2 hours) Mar 7, 2026
Contributor

github-actions bot commented Mar 7, 2026

The following comment was made by an LLM, it may be inaccurate:

Found 2 potentially related PRs (excluding the current PR #16504):

  1. PR #5422: feat(provider): add provider-specific cache configuration system (significant token usage reduction)

  2. PR #11276: fix(metadata): Some providers use requestbody(body.metadata.user_id) to enable cache features Fixes #8195

Check these PRs to see if they've already addressed Bedrock prompt caching or if there's overlap in the implementation approach.

@github-actions github-actions bot removed the needs:compliance label Mar 7, 2026
Contributor

github-actions bot commented Mar 7, 2026

Thanks for updating your PR! It now meets our contributing guidelines. 👍

@marcelloceschia marcelloceschia force-pushed the fix/bedrock-prompt-caching-custom-arn branch from 2ed846f to e349074 Compare March 7, 2026 17:53