Mirror: fix: retry Amazon Bedrock network connection lost errors (#5569) #35

Closed
jeremylongshore wants to merge 1 commit into main from mirror/PR-5569

Conversation

@jeremylongshore
Owner

@jeremylongshore jeremylongshore commented Feb 15, 2026

Mirror of Kilo-Org#5569

| Field | Value |
|---|---|
| Upstream PR | #5569 |
| Author | @romeoscript |
| Category | fix |
| Tier | 2 |
| Size | 22 lines, 1 file |

This PR mirrors the upstream change for multi-AI review analysis.

Bot Review Tracker

  • CodeRabbit
  • Gemini Code Assist
  • Greptile
  • CodeQL
  • Qodo PR-Agent

Summary by CodeRabbit

  • Bug Fixes
    • Improved reliability by automatically retrying transient network errors, with an increasing delay between attempts, to reduce failed requests.

@gemini-code-assist

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@qodo-code-review

Review Summary by Qodo

Add retry logic for Amazon Bedrock network errors

🐞 Bug fix

Walkthroughs

Description
• Add retry logic for Amazon Bedrock network connection errors
• Implements a linearly increasing backoff (2s × attempt number) with a maximum of 3 attempts
• Detects Bedrock-specific network errors and retries automatically
Diagram
flowchart LR
  A["API Request"] --> B["Attempt Chat Completion"]
  B --> C{Bedrock Network Error?}
  C -->|Yes & Attempts < 3| D["Wait 2s * Attempt"]
  D --> B
  C -->|No or Max Attempts| E["Throw Error"]
  B -->|Success| F["Return Stream"]

File Changes

1. src/api/providers/openrouter.ts 🐞 Bug fix +21/-1

Add Bedrock network error retry mechanism

• Wraps chat completion request in retry loop with max 3 attempts
• Detects Amazon Bedrock network connection errors by message pattern
• Implements a linearly increasing backoff delay between retries
• Logs retry attempts for debugging purposes

src/api/providers/openrouter.ts



@coderabbitai

coderabbitai Bot commented Feb 15, 2026

Walkthrough

The OpenRouter API provider now retries Bedrock network errors with an increasing delay between attempts. When such an error occurs during a chat completion request, the call is attempted up to 3 times in total (2 retries, after 2s and 4s delays) before the error is propagated.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Bedrock Retry Logic (`src/api/providers/openrouter.ts`) | Added retry loop with linearly increasing backoff (2s, 4s delays) for Bedrock network errors in createMessage flow. Catches specific network error messages and rethrows after max retries. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A hop, a skip, retry we go, 🐰
When Bedrock networks stumble slow,
With patience and a bounded wait,
We bounce back up to seal the fate!

🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | The description includes context about the mirrored upstream PR and metadata, but lacks detailed implementation details and testing instructions specified in the template. | Add Context section explaining WHY retries are needed, Implementation section describing HOW the retry logic works, and How to Test section with specific testing steps. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly describes the main change: adding retry logic for Amazon Bedrock network connection errors in the OpenRouter provider. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into main |

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
src/api/providers/openrouter.ts (2)

437-457: Missing test coverage for the retry logic.

Per coding guidelines, all code changes should have test coverage before attempting completion. The retry logic for Bedrock network errors—including the attempt count boundary, delay calculation, and the error-message matching—should be covered by tests to prevent regressions.

Additionally, a couple of minor observations on the implementation:

  1. Fragile string matching (Lines 446-447): Matching on two substrings ("Amazon Bedrock error" and "Network connection lost") is brittle. If OpenRouter or Bedrock changes the error wording, the retry silently stops working. Consider extracting these strings into named constants for easier maintenance and discoverability.

  2. Linear backoff, not exponential: The delay `attempts * 2000` yields 2s then 4s (linear). True exponential would be `2 ** attempts * 1000` or similar. Linear is fine here given only 2 retries, but worth being precise if documenting it as "exponential."
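
To make the linear versus exponential distinction concrete, here is a small sketch. `linearDelayMs` mirrors the `attempts * 2000` expression from the PR; `exponentialDelayMs` and its 1000 ms base are hypothetical and only illustrate the alternative mentioned above.

```typescript
// Delay before the next try, given how many attempts have already been made.
// Linear: mirrors the `attempts * 2000` expression in the PR.
function linearDelayMs(attempts: number, stepMs = 2000): number {
	return attempts * stepMs
}

// Exponential: doubles on every attempt (hypothetical alternative, not in the PR).
function exponentialDelayMs(attempts: number, baseMs = 1000): number {
	return 2 ** attempts * baseMs
}

const attemptsSoFar = [1, 2, 3, 4]
console.log(attemptsSoFar.map((n) => linearDelayMs(n))) // [ 2000, 4000, 6000, 8000 ]
console.log(attemptsSoFar.map((n) => exponentialDelayMs(n))) // [ 2000, 4000, 8000, 16000 ]
```

With these particular constants the two schedules agree for the first two delays, which is why the difference is invisible with a 3-attempt cap and only shows up if the cap is raised.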

As per coding guidelines, **/*.{ts,tsx,js,jsx}: "Ensure all code changes have test coverage before attempting completion".


444-445: error: any loses type safety in the inner catch.

Consider narrowing via instanceof Error or at least an unknown type to avoid masking unexpected throw values.

Suggested narrowing
-			} catch (error: any) {
-				const msg = error?.message || ""
+			} catch (error: unknown) {
+				const msg = error instanceof Error ? error.message : String(error)

@github-actions

Failed to generate code suggestions for PR

@qodo-code-review

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (2) 📎 Requirement gaps (0)


Action required

1. console.log leaks error details 📘 Rule violation ⛨ Security
Description
The new retry logic logs the raw error.message, which may contain sensitive/internal details and
is unstructured for auditing. This violates secure logging requirements and can expose information
in logs.
Code

src/api/providers/openrouter.ts[450]

+						console.log(`[OpenRouter] Retrying Bedrock network error (attempt ${attempts}/3): ${msg}`)
Evidence
PR Compliance ID 5 requires logs to be structured and to avoid sensitive data; the added code uses
console.log and interpolates the raw error message into the log line.

Rule 5: Generic: Secure Logging Practices
src/api/providers/openrouter.ts[445-451]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The retry code logs raw `error.message` via `console.log`, which is unstructured and may leak sensitive/internal error details.

## Issue Context
This code runs in a provider integration path; logs should be structured and must not include sensitive data.

## Fix Focus Areas
- src/api/providers/openrouter.ts[445-451]
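
One possible shape for such a log entry is sketched below. The `RetryLogEntry` type, its field names, and `buildRetryLogEntry` are hypothetical; the repository's actual logging facility is not shown in this thread, so the sketch only builds the entry and leaves the sink open.

```typescript
// Hypothetical structured log entry; field names are illustrative only.
interface RetryLogEntry {
	event: "bedrock_network_retry"
	provider: "openrouter"
	attempt: number
	maxAttempts: number
	delayMs: number
}

function buildRetryLogEntry(attempt: number, maxAttempts: number, delayMs: number): RetryLogEntry {
	// Deliberately omits the raw error message: only a fixed event name and
	// numeric retry state are recorded.
	return { event: "bedrock_network_retry", provider: "openrouter", attempt, maxAttempts, delayMs }
}

// The entry can be handed to whatever logger the project uses; console is a placeholder.
const entry = buildRetryLogEntry(1, 3, 2000)
console.log(JSON.stringify(entry))
```

Recording a fixed event name instead of the interpolated message keeps upstream error text out of logs while preserving what is needed to audit retry behavior.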


2. Retry misses wrapped errors 🐞 Bug ✓ Correctness
Description
The new retry condition only checks error.message, but OpenRouter errors are often wrapped with
the meaningful message inside error.error.message and/or error.error.metadata.raw. This can
cause the Bedrock network retry to never trigger even when the underlying error matches.
Code

src/api/providers/openrouter.ts[R444-448]

+				} catch (error: any) {
+					const msg = error?.message || ""
+					const isBedrockNetworkError =
+						msg.includes("Amazon Bedrock error") && msg.includes("Network connection lost")
+
Evidence
The retry loop derives msg solely from error?.message and then checks substrings, so any
OpenRouter error where the actual provider message is in the nested/wrapped shape will bypass the
retry. The same file defines schemas and helpers explicitly to extract the real upstream message
from error.error.metadata.raw, indicating this nesting is expected in practice.

src/api/providers/openrouter.ts[435-455]
src/api/providers/openrouter.ts[76-99]
src/api/providers/openrouter.ts[108-138]
src/api/providers/openrouter.ts[865-881]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The Bedrock retry logic only checks `error.message`, but OpenRouter errors are frequently wrapped (e.g., `error.error.message` and/or `error.error.metadata.raw`). As a result, the intended Bedrock retry may not trigger.

### Issue Context
This file already contains utilities and schema definitions for extracting the “real” upstream error message from `metadata.raw`. The retry logic should use the same extraction approach.

### Fix Focus Areas
- src/api/providers/openrouter.ts[435-455]
- src/api/providers/openrouter.ts[76-140]
- src/api/providers/openrouter.ts[865-881]

### Implementation notes
- Create a small helper (local function) like `getOpenRouterErrorMessage(error: any): string` that tries, in order:
  - `error?.message`
  - `error?.error?.message`
  - `extractErrorFromMetadataRaw(error?.metadata?.raw)`
  - `extractErrorFromMetadataRaw(error?.error?.metadata?.raw)`
  - fallback `String(error)`
- Use the returned string for `isBedrockNetworkError` matching.
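
A minimal sketch of that lookup follows. It is an assumption-laden illustration: `rawToString` is a hypothetical stand-in for the file's own `extractErrorFromMetadataRaw` (whose real signature is not shown here), the constant names are invented, and the sample error objects are made up. It also checks every candidate rather than returning the first non-empty one, so a generic top-level message cannot hide a matching nested one.

```typescript
// Substrings from the PR's check, pulled into named constants (names are hypothetical).
const BEDROCK_ERROR_MARKER = "Amazon Bedrock error"
const NETWORK_LOST_MARKER = "Network connection lost"

// Hypothetical stand-in for the file's `extractErrorFromMetadataRaw`.
function rawToString(raw: unknown): string | undefined {
	if (typeof raw === "string") return raw
	if (raw === undefined || raw === null) return undefined
	try {
		return JSON.stringify(raw)
	} catch {
		return undefined
	}
}

// Gather every place the upstream message might live, top-level first.
function collectErrorMessages(error: unknown): string[] {
	const e = (typeof error === "object" && error !== null ? error : {}) as Record<string, any>
	const candidates: unknown[] = [
		e.message,
		e.error?.message,
		rawToString(e.metadata?.raw),
		rawToString(e.error?.metadata?.raw),
		String(error),
	]
	return candidates.filter((c): c is string => typeof c === "string" && c.length > 0)
}

function isBedrockNetworkError(error: unknown): boolean {
	return collectErrorMessages(error).some(
		(msg) => msg.includes(BEDROCK_ERROR_MARKER) && msg.includes(NETWORK_LOST_MARKER),
	)
}

// Illustrative error shapes, not captured from real OpenRouter responses.
const wrapped = {
	message: "Provider returned error",
	error: { metadata: { raw: "Amazon Bedrock error: Network connection lost" } },
}
console.log(isBedrockNetworkError(wrapped)) // true
console.log(isBedrockNetworkError(new Error("Rate limited"))) // false
```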



Remediation recommended

3. Retry logic lacks tests 📘 Rule violation ⛯ Reliability
Description
Production behavior was changed by adding Bedrock-specific retry logic, but no tests were
added/updated in this PR to cover retry and non-retry cases. This increases regression risk and may
allow failures to go undetected.
Code

src/api/providers/openrouter.ts[R438-456]

+			let attempts = 0
+			while (true) {
+				try {
+					attempts++
+					stream = await this.client.chat.completions.create(completionParams, requestOptions)
+					break
+				} catch (error: any) {
+					const msg = error?.message || ""
+					const isBedrockNetworkError =
+						msg.includes("Amazon Bedrock error") && msg.includes("Network connection lost")
+
+					if (isBedrockNetworkError && attempts < 3) {
+						console.log(`[OpenRouter] Retrying Bedrock network error (attempt ${attempts}/3): ${msg}`)
+						await new Promise((resolve) => setTimeout(resolve, attempts * 2000))
+						continue
+					}
+					throw error
+				}
+			}
Evidence
PR Compliance ID 8 requires adding/updating tests when production code changes; the PR modifies
production code to introduce retry behavior, and the PR metadata indicates only one file was changed
(no test files included).

AGENTS.md
src/api/providers/openrouter.ts[438-456]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Retry behavior was added for a specific error condition, but there are no tests included to validate the new retry/backoff logic and stopping conditions.

## Issue Context
Tests should cover at least: (1) retries happen for the matching Bedrock network error, (2) retries stop after 3 attempts, and (3) other errors are thrown immediately.

## Fix Focus Areas
- src/api/providers/openrouter.ts[438-456]
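
The three cases above could be exercised roughly as follows. This is a framework-free sketch, not the repository's test setup: `createWithBedrockRetry` is the PR's loop lifted into a hypothetical function with an injectable `sleep`, and `makeFlakyCreate` is an invented fake standing in for `this.client.chat.completions.create`.

```typescript
type Sleep = (ms: number) => Promise<void>

// The PR's loop as a function; `sleep` is injected so checks do not actually wait.
async function createWithBedrockRetry<T>(create: () => Promise<T>, sleep: Sleep, maxAttempts = 3): Promise<T> {
	let attempts = 0
	while (true) {
		try {
			attempts++
			return await create()
		} catch (error: unknown) {
			const msg = error instanceof Error ? error.message : String(error)
			const isBedrockNetworkError =
				msg.includes("Amazon Bedrock error") && msg.includes("Network connection lost")
			if (isBedrockNetworkError && attempts < maxAttempts) {
				await sleep(attempts * 2000)
				continue
			}
			throw error
		}
	}
}

// Fake `create` that rejects `failures` times with `message`, then resolves.
function makeFlakyCreate(failures: number, message: string) {
	let calls = 0
	const create = async (): Promise<string> => {
		calls++
		if (calls <= failures) throw new Error(message)
		return "stream"
	}
	return { create, callCount: () => calls }
}

const BEDROCK_MSG = "Amazon Bedrock error: Network connection lost"

async function runRetryChecks(): Promise<string[]> {
	const results: string[] = []
	const delays: number[] = []
	const sleep: Sleep = async (ms) => {
		delays.push(ms)
	}

	// (1) The matching error is retried and the call eventually succeeds.
	const flaky = makeFlakyCreate(2, BEDROCK_MSG)
	const value = await createWithBedrockRetry(flaky.create, sleep)
	results.push(`recovered=${value === "stream"} calls=${flaky.callCount()} delays=${delays.join(",")}`)

	// (2) Retries stop after 3 attempts.
	const alwaysFails = makeFlakyCreate(Infinity, BEDROCK_MSG)
	await createWithBedrockRetry(alwaysFails.create, sleep).catch(() => undefined)
	results.push(`exhausted calls=${alwaysFails.callCount()}`)

	// (3) Any other error is thrown on the first attempt.
	const other = makeFlakyCreate(1, "Invalid API key")
	await createWithBedrockRetry(other.create, sleep).catch(() => undefined)
	results.push(`other calls=${other.callCount()}`)
	return results
}

runRetryChecks().then((lines) => lines.forEach((line) => console.log(line)))
// recovered=true calls=3 delays=2000,4000
// exhausted calls=3
// other calls=1
```

Injecting `sleep` is the main design choice: it lets the delay schedule be asserted directly instead of relying on fake timers.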



Advisory comments

4. Retry applied only to streaming 🐞 Bug ⛯ Reliability
Description
The change introduces Bedrock network retries only for the streaming createMessage path; other
OpenRouter calls remain single-attempt. If the same transient Bedrock network error occurs in
non-streaming calls, this fix won’t help and behavior will differ by code path.
Code

src/api/providers/openrouter.ts[R437-457]

+			// kilocode_change start
+			let attempts = 0
+			while (true) {
+				try {
+					attempts++
+					stream = await this.client.chat.completions.create(completionParams, requestOptions)
+					break
+				} catch (error: any) {
+					const msg = error?.message || ""
+					const isBedrockNetworkError =
+						msg.includes("Amazon Bedrock error") && msg.includes("Network connection lost")
+
+					if (isBedrockNetworkError && attempts < 3) {
+						console.log(`[OpenRouter] Retrying Bedrock network error (attempt ${attempts}/3): ${msg}`)
+						await new Promise((resolve) => setTimeout(resolve, attempts * 2000))
+						continue
+					}
+					throw error
+				}
+			}
+			// kilocode_change end
Evidence
The PR adds retry logic exclusively around the streaming chat.completions.create call in
createMessage_implementationRenamedForKilocode. The same provider also implements a non-streaming
completePrompt that calls chat.completions.create once without retries, meaning transient
Bedrock errors can still surface there depending on how the handler is used (e.g., via
singleCompletionHandler).

src/api/providers/openrouter.ts[435-457]
src/api/providers/openrouter.ts[782-784]
src/utils/single-completion-handler.ts[24-33]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Bedrock retry logic is implemented only for streaming `createMessage`. Other OpenRouter calls (notably `completePrompt`) still do a single attempt.

### Issue Context
`singleCompletionHandler` will prefer `completePrompt` when available, so non-stream flows can still encounter transient Bedrock network errors.

### Fix Focus Areas
- src/api/providers/openrouter.ts[435-457]
- src/api/providers/openrouter.ts[753-784]
- src/utils/single-completion-handler.ts[24-33]

### Implementation notes
- Extract the retry loop into a private helper like `createChatCompletionWithBedrockRetry(params, requestOptions)`.
- Use it from both streaming (`createMessage`) and non-streaming (`completePrompt`) paths (ensuring types align).
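
One way to share the behavior between both paths is a generic wrapper, sketched here under assumptions. `withRetry`, `RetryOptions`, `bedrockRetryOptions`, and `ProviderSketch` are all hypothetical names; the class is a heavily simplified stand-in for the real handler, and `CreateFn` replaces the actual `chat.completions.create` signature.

```typescript
interface RetryOptions {
	maxAttempts: number
	delayMs: (attempt: number) => number
	shouldRetry: (error: unknown) => boolean
	sleep?: (ms: number) => Promise<void>
}

const defaultSleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Runs `operation`, retrying while `shouldRetry` approves and attempts remain.
async function withRetry<T>(operation: () => Promise<T>, options: RetryOptions): Promise<T> {
	const sleep = options.sleep ?? defaultSleep
	for (let attempt = 1; ; attempt++) {
		try {
			return await operation()
		} catch (error) {
			if (attempt >= options.maxAttempts || !options.shouldRetry(error)) throw error
			await sleep(options.delayMs(attempt))
		}
	}
}

// Same limits and matching rule as the PR: 3 attempts, 2s then 4s.
const bedrockRetryOptions: RetryOptions = {
	maxAttempts: 3,
	delayMs: (attempt) => attempt * 2000,
	shouldRetry: (error) => {
		const msg = error instanceof Error ? error.message : String(error)
		return msg.includes("Amazon Bedrock error") && msg.includes("Network connection lost")
	},
}

// Simplified stand-in for the client call used by both paths.
type CreateFn = (params: { stream: boolean }) => Promise<string>

class ProviderSketch {
	private readonly create: CreateFn
	private readonly retry: RetryOptions

	constructor(create: CreateFn, retry: RetryOptions = bedrockRetryOptions) {
		this.create = create
		this.retry = retry
	}

	// Streaming path.
	createMessage(): Promise<string> {
		return withRetry(() => this.create({ stream: true }), this.retry)
	}

	// Non-streaming path, now covered by the same retry policy.
	completePrompt(): Promise<string> {
		return withRetry(() => this.create({ stream: false }), this.retry)
	}
}
```

Keeping the policy in one options object means a later change to the attempt cap or delay schedule applies to both paths at once.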


@jeremylongshore
Owner Author

@greptileai review

@jeremylongshore
Owner Author

@greptileai

@jeremylongshore
Owner Author

Merged into batch-6-combined-20-mirrors branch for combined testing. Bot reviews collected. Review artifacts preserved in .reviews/.

@jeremylongshore jeremylongshore deleted the mirror/PR-5569 branch February 16, 2026 00:43