Enable bedrock prompt cache #6463
Hi @DOsinga, just a gentle reminder about this PR. The initial review comments were addressed a few weeks ago, so whenever you have time, I’d appreciate another look.
Pull request overview
This PR implements prompt caching for Anthropic Claude models running on AWS Bedrock to reduce costs and improve latency. The feature is controlled by a new BEDROCK_ENABLE_CACHING configuration parameter that defaults to false (opt-in).
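The opt-in behavior described above can be illustrated with a minimal sketch. It assumes the flag is read as a string and that only `true` or `1` turn caching on. The helper names `caching_enabled` and `caching_enabled_from_env` are hypothetical, and goose's actual config layer may read `BEDROCK_ENABLE_CACHING` differently.

```rust
/// Hypothetical helper (not from the PR): interprets an optional raw
/// setting as an opt-in boolean. Missing or unrecognized values are
/// treated as `false`, matching the "defaults to false" behavior.
fn caching_enabled(raw: Option<&str>) -> bool {
    match raw {
        // Assumption: only "true" or "1" (case-insensitive, trimmed) enable caching.
        Some(value) => matches!(value.trim().to_ascii_lowercase().as_str(), "true" | "1"),
        None => false,
    }
}

/// Reads the flag from the process environment. This is an assumption for
/// illustration; the real provider may go through its own config system.
fn caching_enabled_from_env() -> bool {
    caching_enabled(std::env::var("BEDROCK_ENABLE_CACHING").ok().as_deref())
}
```

Treating every unrecognized value as disabled keeps the feature strictly opt-in, so a typo in the setting cannot enable caching by accident.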
Changes:
- Adds BEDROCK_ENABLE_CACHING configuration parameter (defaults to false) to enable prompt caching for Claude models
- Implements intelligent cache point placement for system prompts and messages with a 4 cache point limit
- Adds comprehensive test coverage for cache point allocation logic and message conversion
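One way to respect the four cache point limit mentioned in the list above is sketched here. The PR's actual allocation strategy is not shown in this conversation, so this is an assumption: the system prompt gets one cache point when present, and the remaining budget goes to the most recent messages. `CachePlan` and `plan_cache_points` are hypothetical names.

```rust
/// Upper bound on cache points per request, as stated in the PR overview.
const MAX_CACHE_POINTS: usize = 4;

/// Hypothetical result type: where cache points would be placed.
#[derive(Debug, PartialEq)]
struct CachePlan {
    /// Whether a cache point follows the system prompt.
    cache_system_prompt: bool,
    /// Indices of messages that receive a cache point, in ascending order.
    message_indices: Vec<usize>,
}

/// Assumed strategy (not confirmed by the PR): reserve one cache point for
/// the system prompt if there is one, then spend the rest on the latest messages.
fn plan_cache_points(has_system_prompt: bool, message_count: usize) -> CachePlan {
    let cache_system_prompt = has_system_prompt;
    // `usize::from(bool)` yields 1 for true and 0 for false.
    let budget = MAX_CACHE_POINTS - usize::from(cache_system_prompt);
    // Never mark more messages than exist.
    let take = budget.min(message_count);
    let message_indices = (message_count - take..message_count).collect();
    CachePlan {
        cache_system_prompt,
        message_indices,
    }
}
```

Under this sketch the total number of cache points never exceeds `MAX_CACHE_POINTS`, whatever the conversation length.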
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| documentation/docs/getting-started/providers.md | Documents the new BEDROCK_ENABLE_CACHING parameter in the providers table |
| crates/goose/src/providers/formats/bedrock.rs | Adds to_bedrock_message_with_caching function to insert cache points in messages and handles CachePoint blocks in responses |
| crates/goose/src/providers/bedrock.rs | Implements cache point allocation strategy, adds configuration check, and includes extensive test coverage |
/goose review this PR please
Summary: This PR adds prompt caching support for Anthropic Claude models on AWS Bedrock via a new
🔴 Blocking Issues
🟡 Warnings
🟢 Suggestions
✅ Highlights
Review generated by goose
Signed-off-by: fbalicchia <fbalicchia@cuebiq.com>
928ecba to 26b8b07
/goose review this PR please
Hey @fbalicchia, thanks for the PR! There seem to be a lot of unrelated changes in here. Did a merge or rebase maybe go wrong?
@jh-block, @blackgirlbytes You’re right: the PR ended up including some unrelated changes. The branch was quite old, and during the rebase/merge things got mixed in.
I think a fresh PR with only the relevant changes would be fine -- thank you!
Closed in favor of 6710
Summary
- Implemented prompt caching for Anthropic Claude models on AWS Bedrock to reduce costs.
- Introduced an intelligent cache point placement strategy that complies with AWS Bedrock’s four cache point limitation.
- Added the BEDROCK_ENABLE_CACHING configuration parameter.
Testing
After making a bit of a mess with the branches, I reopened this PR starting from this one: 6151