@uddhav
Contributor

@uddhav uddhav commented Jun 22, 2025

Refresh of #2280

GCP Vertex AI: Add retry handling for Anthropic API 529 overloaded status code

Adds support for handling Anthropic's HTTP 529 "API Overloaded" status code in the GCP Vertex AI provider. This status code indicates temporary backend capacity issues rather than quota exhaustion.

Changes

  • Added detection and retry logic for 529 "API overloaded" responses
  • Applied the same backoff strategy used for 429 rate limit errors
  • Enhanced error messages to distinguish between rate limits and overloaded states
  • Added unit test to verify correct 529 status code handling
  • chore: Added Claude Opus 4 version
  • chore: cleaned up the deprecated GCP Vertex AI models for Gemini 2.5 Flash and 2.5 Pro versions

This improves reliability when interacting with Anthropic models through the Vertex AI provider during high-traffic periods.
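
For readers who want a concrete picture of the behavior listed above, here is a minimal sketch, not the code from this PR. It assumes a simple exponential backoff with a cap; the names `RetryReason`, `retry_reason`, `backoff_delay`, and `send_with_retry`, as well as the 500 ms initial delay and 8 s cap, are hypothetical and chosen only for illustration. The request and the sleep are passed in as closures so the logic can be exercised without a network or real delays.

```rust
use std::time::Duration;

/// Why a request is being retried. Hypothetical type, not taken from the PR.
#[derive(Debug, PartialEq, Clone, Copy)]
enum RetryReason {
    /// HTTP 429: rate limit or quota exceeded.
    RateLimited,
    /// HTTP 529: Anthropic "API overloaded", a temporary capacity issue.
    Overloaded,
}

/// Map a status code to a retry reason; `None` means "do not retry".
fn retry_reason(status: u16) -> Option<RetryReason> {
    match status {
        429 => Some(RetryReason::RateLimited),
        529 => Some(RetryReason::Overloaded),
        _ => None,
    }
}

/// Exponential backoff: double the delay on each attempt, capped at `max`.
fn backoff_delay(attempt: u32, initial: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    initial.saturating_mul(factor).min(max)
}

/// Call `send` until it returns a non-retryable status or retries run out.
/// `sleep` is injected so callers (and tests) control how waiting happens.
fn send_with_retry<F, S>(mut send: F, mut sleep: S, max_retries: u32) -> Result<u16, String>
where
    F: FnMut() -> u16,
    S: FnMut(Duration),
{
    // Assumed values for illustration only.
    let initial = Duration::from_millis(500);
    let max = Duration::from_secs(8);

    let mut attempt = 0;
    loop {
        let status = send();
        let Some(reason) = retry_reason(status) else {
            return Ok(status);
        };
        if attempt >= max_retries {
            // Distinct messages so callers can tell the two conditions apart.
            return Err(match reason {
                RetryReason::RateLimited => {
                    format!("rate limit exceeded (HTTP 429) after {attempt} retries")
                }
                RetryReason::Overloaded => {
                    format!("API overloaded (HTTP 529) after {attempt} retries")
                }
            });
        }
        // 429 and 529 share the same backoff schedule.
        sleep(backoff_delay(attempt, initial, max));
        attempt += 1;
    }
}
```

In this sketch, a caller would pass a closure that performs the HTTP request and returns its status code, plus a closure that sleeps (for example `std::thread::sleep`). Both retryable codes go through one code path, and only the final error message differs.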

Collaborator

@michaelneale michaelneale left a comment

looks good - but not 100% sure about removal of models

@uddhav
Contributor Author

uddhav commented Jun 23, 2025

looks good - but not 100% sure about removal of models

I cleaned them up because they're no longer available in the GCP Vertex AI Model Garden. Anyone who still has access to them can still type the model name in.

https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/gemini-2.5-pro-preview-05-06

But let me add them back; the cleanup is not a blocker for me.

Collaborator

@michaelneale michaelneale left a comment

Looks good - not able to verify myself with vertex but looks good

@michaelneale
Collaborator

will need to update to main branch

@uddhav uddhav force-pushed the gcp-vertex-ai-overloaded-retry branch 3 times, most recently from 6da2346 to 5f0bbd7 Compare June 27, 2025 22:44
@uddhav
Contributor Author

uddhav commented Jun 27, 2025

will need to update to main branch

Updated this morning and now again - looks like it waits for approval on a couple of jobs. Hopefully no conflict when you see this next time 🤞

@uddhav uddhav force-pushed the gcp-vertex-ai-overloaded-retry branch from 5f0bbd7 to 5d5cbb6 Compare June 30, 2025 03:43
@michaelneale
Collaborator

@uddhav just updated this - looking ok?

@uddhav
Contributor Author

uddhav commented Jul 16, 2025

@uddhav just updated this - looking ok?

@michaelneale thanks - LGTM

@michaelneale
Collaborator

nice one! thanks

@michaelneale michaelneale merged commit 21b79ad into block:main Jul 17, 2025
8 checks passed
lifeizhou-ap added a commit that referenced this pull request Jul 17, 2025
* main:
  feat(gcpvertexai): do HTTP 429 like retries for Anthropic API HTTP 529 overloaded status code (#3026)
  Fix a few ui edge cases - refresh occasionally crashing, chat loader over text and chat input height returning to auto (#3469)
  Don't default to main for build-cli (#3467)
  docs: add MongoDB MCP server tutorial (#2660)
  feat: run sub recipe multiple times in parallel (Experimental feature) (#3274)
  chore(release): release version 1.1.0 (#3465)
  chore: implement streaming for anthropic.rs firstparty provider (#3419)
  Fix regression: add back detail to tool-call banners (#3231)
  Document release process and update some just recipes (#3460)
  feat: add download_cli.ps1 file for windows (#3354)
  fix: session_file is optional (#3462)
  Bump more space for goose is working on it so it doesnt overlap incoming agent chat messages (#3453)
  Align chat input action buttons to bottom when large amount of text (#3455)
  docs: add Cloudflare MCP Server tutorial (#3278)
  feat(cli): Clear persisted session file with /clear command (#3145)
s-soroosh pushed a commit to s-soroosh/goose that referenced this pull request Jul 18, 2025
…9 overloaded status code (block#3026)

Co-authored-by: Michael Neale <[email protected]>
Signed-off-by: Soroosh <[email protected]>
kwsantiago pushed a commit to kwsantiago/goose that referenced this pull request Jul 19, 2025
…9 overloaded status code (block#3026)

Co-authored-by: Michael Neale <[email protected]>
Signed-off-by: Kyle Santiago <[email protected]>
cbruyndoncx pushed a commit to cbruyndoncx/goose that referenced this pull request Jul 20, 2025
atarantino pushed a commit to atarantino/goose that referenced this pull request Aug 5, 2025
…9 overloaded status code (block#3026)

Co-authored-by: Michael Neale <[email protected]>
Signed-off-by: Adam Tarantino <[email protected]>