enhance: add FormData support and smart retry logic to API service by jonahgabriel · Pull Request #723 · coleam00/Archon

jonahgabriel · 2025-09-20T22:06:19Z

Summary

Enhances API service capabilities with FormData support and smart retry logic improvements for better developer experience.

Changes Made

Added FormData Support: Enhanced apiWithEtag.ts with automatic FormData detection to properly handle file uploads
Smart Retry Logic: Implemented intelligent retry logic in ollamaService.ts using createRetryLogic() to avoid retrying 4xx client errors
Improved Error Handling: Better handling of network errors vs client errors with exponential backoff
Enhanced File Uploads: Updated knowledgeService.ts to use callAPIWithETag for consistent error handling and timeout management
Code Deduplication: Removed duplicate testConnection implementation in OllamaConfigurationPanel.tsx

Technical Details

FormData Detection: Automatically detects FormData requests and omits Content-Type header (browser sets multipart/form-data with boundary)
Intelligent Retry: Only retries network errors and 5xx server errors, not 4xx client errors, preventing duplicate operations
Comprehensive Testing: Added 5 new FormData-specific tests covering various upload scenarios
Backward Compatibility: All changes maintain existing API compatibility

Testing

✅ Added comprehensive FormData tests (5 new test cases)
✅ All existing tests continue to pass
✅ Manual testing confirmed proper FormData handling and retry behavior
✅ Verified proper Content-Type header handling for both JSON and FormData requests

Files Changed

archon-ui-main/src/features/shared/apiWithEtag.ts - FormData detection and header handling
archon-ui-main/src/services/ollamaService.ts - Smart retry logic implementation
archon-ui-main/src/features/knowledge/services/knowledgeService.ts - Consistent API usage
archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx - Code deduplication
archon-ui-main/src/features/shared/apiWithEtag.test.ts - Comprehensive FormData test coverage

Benefits

Improved DX: FormData uploads now work seamlessly through unified API service
Better Reliability: Smart retry logic prevents infinite loops and duplicate operations
Code Quality: Eliminated code duplication and inconsistent retry behavior
Maintainability: Centralized retry logic and error handling patterns

Summary by CodeRabbit

New Features
- Smarter connection testing and global request timeouts for AI providers; lifecycle controls for external agent client with richer error messages.
- Improved upload handling with FormData support and robust JSON/text/binary response parsing.
Bug Fixes
- SSRF protections to block private/internal targets.
- Normalized UTC timestamps and safer stored model/metadata handling.
Refactor
- Centralized retry, connectivity, and API call flows into shared services.
Tests
- New tests covering retry decision logic and API client/FormData scenarios.

Wirasm

Hi @jonahgabriel

Thank you for working on improving the API services! After reviewing the PR, I need to clarify a few things:

About the "Header Merge Bug"

Looking at the current main branch, there isn't actually a bug with the header variable. The code correctly declares const headers and uses headers in the fetch call (line 69). The variable naming is consistent and working as intended.

However, your PR does include some useful improvements:

FormData Support: Adding automatic FormData detection is a nice enhancement. The original design intentionally excluded this (note the comment: "For file uploads or FormData requests, use fetch() directly"), but supporting it could improve DX.
Smart Retry Logic: Using createRetryLogic to avoid retrying 4xx errors is a good pattern.

Could you please remove the headers var renaming, we want to keep heaers over mergedHeaders.
We could keep the form data detection, even though its not needed currently it could improve DX for future enhancements.

alternatively make a new pr that only addresses smart retry improvements

Also, please update the PR title and description to reflect that this is adding FormData support and improving retry logic, rather than fixing a header merge bug.

Thank you!

jonahgabriel · 2025-09-22T11:25:40Z

Hi @jonahgabriel

Thank you for working on improving the API services! After reviewing the PR, I need to clarify a few things:

About the "Header Merge Bug"

Looking at the current main branch, there isn't actually a bug with the header variable. The code correctly declares const headers and uses headers in the fetch call (line 69). The variable naming is consistent and working as intended.

However, your PR does include some useful improvements:

FormData Support: Adding automatic FormData detection is a nice enhancement. The original design intentionally excluded this (note the comment: "For file uploads or FormData requests, use fetch() directly"), but supporting it could improve DX.

Smart Retry Logic: Using createRetryLogic to avoid retrying 4xx errors is a good pattern.

Could you please remove the headers var renaming, we want to keep heaers over mergedHeaders. We could keep the form data detection, even though its not needed currently it could improve DX for future enhancements.

alternatively make a new pr that only addresses smart retry improvements

Also, please update the PR title and description to reflect that this is adding FormData support and improving retry logic, rather than fixing a header merge bug.

Thank you!

Will do.

coderabbitai · 2025-09-23T14:05:52Z

Walkthrough

Delegates Ollama health checks to ollamaService with smart retries/timeouts, centralizes uploads via callAPIWithETag with FormData-aware headers and robust parsing, adds retry tests, hardens server Ollama endpoints against SSRF and uses UTC-aware timestamps, and introduces an MCP client error hierarchy and lifecycle helpers.

Changes

Cohort / File(s)	Summary
Ollama settings panel integration `archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx`	Replaced in-file health-check loop with `ollamaService.testConnection(...)`, mapping result to `{ isHealthy, responseTime, error, modelsAvailable: undefined }`.
Knowledge upload via shared API client `archon-ui-main/src/features/knowledge/services/knowledgeService.ts`	Replaced manual `fetch` upload with `callAPIWithETag('/api/documents/upload', ...)`; removed env-specific URL construction and bespoke error parsing.
Shared API client: headers, FormData, responses & tests `archon-ui-main/src/features/shared/api/apiClient.ts`, `.../tests/apiClient.test.ts`	Detects `FormData` and omits `Content-Type`; normalizes headers (lowercase) and ensures `Accept: application/json`; content-type based response parsing (JSON/text/blob); 204 → `undefined`; enhanced error wrapping; tests updated/expanded for FormData, binary, header casing, and error cases.
Retry logic tests `archon-ui-main/src/features/shared/tests/queryPatterns.test.ts`	New comprehensive tests for `createRetryLogic` covering network errors, 5xx/502/503 retries, non-retryable 4xx/429, abort/cancel cases, edge cases, and Ollama-like scenarios.
Ollama service: smart retry, timeouts, typing `archon-ui-main/src/services/ollamaService.ts`	Uses `createRetryLogic` in `testConnection` with per-request timeouts (30s default, 5s for health probe), richer error annotation (status/statusCode), tighter types (`unknown`), and early-exit retry decisions; some method signatures/types updated.
MCP client: errors, requests, and lifecycle `python/src/agents/mcp_client.py`	Adds `MCPError`/`MCPTransportError`/`MCPToolError`; service-discovery-aware MCP URL resolution; typed `httpx.AsyncClient`; UUID JSON‑RPC IDs and `X-Request-ID`; richer transport/tool error mapping; `get_mcp_client()` with thread-safe init and `shutdown_mcp_client()`.
Server Ollama API: SSRF protections and robustness `python/src/server/api_routes/ollama_api.py`	Adds `_is_private_host` guard and stricter URL validation across endpoints, enforces http(s) schemes and non-private hosts, uses UTC-aware timestamps, schedules background tasks reliably, and improves error logging and model discovery handling.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant UI as OllamaConfigurationPanel
  participant Svc as ollamaService
  participant Oll as Ollama Instance

  UI->>Svc: testConnection(baseUrl, retryCount)
  rect rgba(200,240,255,0.20)
    note right of Svc #D6EAF8: createRetryLogic drives retry decisions
    Svc->>Oll: health/validate request (AbortSignal timeout)
    Oll-->>Svc: response or error (status/info)
    alt Retryable (network/5xx/timeout)
      Svc-->>Svc: decide & retry (limit/backoff)
    else Non-retryable (4xx/abort)
      Svc-->>UI: { isHealthy:false, error }
    end
  end
  Svc-->>UI: { isHealthy, responseTime, error? }

sequenceDiagram
  autonumber
  participant Client as Feature Service
  participant API as callAPIWithETag
  participant Net as fetch()

  Client->>API: request(url, { headers, body })
  alt body is FormData
    API-->>Net: fetch with headers (omit Content-Type)
  else JSON body
    API-->>Net: fetch with Content-Type: application/json
  end
  Net-->>API: Response
  alt content-type includes "json"
    API-->>Client: parsed JSON or APIServiceError
  else 204/other/binary
    API-->>Client: text/blob/undefined
  end

sequenceDiagram
  autonumber
  participant Route as Server route
  participant Guard as SSRF Guard
  participant Ext as External Ollama

  Route->>Guard: validate instance URL(s)
  Guard->>Guard: resolve host → IP(s), check private/reserved
  alt IP private/loopback/link-local/reserved
    Guard-->>Route: Block / return sanitized non-healthy entry
  else Public
    Guard->>Ext: proceed with discovery/health
    Ext-->>Guard: result
    Guard-->>Route: response with UTC timestamps
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Fix API service header merge and retry logic bugs #593 — Matches FormData-aware header handling and centralized retry logic implemented in the shared API client and ollamaService.

Possibly related PRs

feat: Ollama Integration with Separate LLM/Embedding Model Support #643 — Strong overlap: modifies Ollama frontend/service and health-check flows similar to these changes.
Feature: Add Ollama embedding service and model selection functionality #560 — Related adjustments to Ollama UI/service interaction and health-check delegation.

Suggested labels

enhancement

Suggested reviewers

coleam00
tazmon95
leex279

Poem

A rabbit hops through nets and logs,
Tests and guards and tiny progs.
Forms set free, retries in tune,
Timeouts wink at afternoon.
Hop, report, and ship — we swoon. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title "enhance: add FormData support and smart retry logic to API service" is concise and directly summarizes the primary changes—FormData handling and improved retry logic in the API service—so it clearly conveys the main intent to reviewers scanning history.
Description Check	✅ Passed	The PR description is largely complete and follows the template by providing a Summary, Changes Made, Technical Details, Testing claims, Files Changed, and Benefits, which together give reviewers a clear understanding of the work and intent. However, several template sections are not present or fully filled out, including the "Type of Change" checkboxes, "Affected Services", the Checklist entries, and an explicit Test Evidence code snippet, so those required template items are missing.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✨ Finishing touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (4)

archon-ui-main/src/features/knowledge/services/knowledgeService.ts (1)
120-121: Fix misleading comment: callAPIWithETag here does not implement "smart retry".

callAPIWithETag adds FormData handling and timeouts, but not retry logic. Update the comment to avoid confusion.
-    // Use improved API service with proper FormData handling and smart retry logic
+    // Use API service with proper FormData handling and timeout
archon-ui-main/src/features/shared/apiWithEtag.test.ts (1)
413-575: Great FormData coverage. Consider adding a test to ensure user-provided Content-Type is stripped for FormData.

Right now we test omission by default. Add a case where the caller sets Content-Type to verify we remove it, matching browser requirements.

Apply this diff to add the test:
@@
   describe("FormData Support", () => {
@@
     it("should preserve custom headers with FormData", async () => {
@@
       expect(options.headers).not.toHaveProperty("Content-Type");
     });
 
+    it("should remove user-provided Content-Type for FormData", async () => {
+      const mockResponse = {
+        ok: true,
+        status: 200,
+        json: () => Promise.resolve({ ok: true }),
+        headers: new Headers(),
+      };
+      global.fetch = vi.fn().mockResolvedValue(mockResponse);
+
+      const fd = new FormData();
+      fd.append("file", new File(["x"], "x.txt"));
+
+      await callAPIWithETag("/api/upload", {
+        method: "POST",
+        body: fd,
+        headers: { "Content-Type": "multipart/form-data" },
+      });
+
+      const [, options] = (global.fetch as any).mock.calls[0];
+      expect(options.headers).not.toHaveProperty("Content-Type");
+    });
archon-ui-main/src/features/shared/apiWithEtag.ts (1)
54-67: Strip any user-provided Content-Type when body is FormData; normalize headers via Headers to support HeadersInit.

If callers pass Content-Type for FormData, uploads can break (missing multipart boundary). Also, current cast assumes plain objects. Normalize with Headers and delete Content-Type when isFormData.

Apply this diff:
-    // Detect FormData to avoid setting Content-Type (browser sets multipart/form-data with boundary)
-    const isFormData = options.body instanceof FormData;
+    // Detect FormData to avoid setting Content-Type (browser sets multipart/form-data with boundary)
+    const isFormData = options.body instanceof FormData;
@@
-    const headers: Record<string, string> = {
-      Accept: "application/json",
-      ...(options.headers as Record<string, string> || {}),
-      ...(isFormData ? {} : { "Content-Type": "application/json" }),
-    };
+    // Normalize headers and ensure Content-Type behavior is correct for FormData
+    const headersObj = new Headers(options.headers as HeadersInit | undefined);
+    headersObj.set("Accept", "application/json");
+    if (isFormData) {
+      headersObj.delete("Content-Type");
+    } else if (!headersObj.has("Content-Type")) {
+      headersObj.set("Content-Type", "application/json");
+    }
+    const headers: Record<string, string> = Object.fromEntries(headersObj.entries());
archon-ui-main/src/services/ollamaService.ts (1)
384-439: Add per-attempt timeouts to avoid hanging retries.

checkInstanceHealth doesn’t use a signal/timeout, so a slow/unresponsive host can hang an attempt and block the loop. Pass a per-attempt AbortSignal (e.g., 5s) into checkInstanceHealth and have that method forward it to fetch.

Apply this diff in the selected range to pass a timeout signal:
-        const healthResponse = await this.checkInstanceHealth([instanceUrl], false);
+        const healthResponse = await this.checkInstanceHealth([instanceUrl], false, AbortSignal.timeout(5000));
And add signal support to checkInstanceHealth (outside this hunk):
// Update signature
async checkInstanceHealth(instanceUrls: string[], includeModels = false, signal?: AbortSignal): Promise<InstanceHealthResponse> {
  // ...
  const response = await fetch(`${this.baseUrl}/api/ollama/instances/health?${params.toString()}`, {
    method: "GET",
    headers: { "Content-Type": "application/json" },
    signal,
  });
  // ...
}

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4c910c1 and f10604f.

📒 Files selected for processing (6)

archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx (2 hunks)
archon-ui-main/src/features/knowledge/services/knowledgeService.ts (1 hunks)
archon-ui-main/src/features/shared/apiWithEtag.test.ts (1 hunks)
archon-ui-main/src/features/shared/apiWithEtag.ts (2 hunks)
archon-ui-main/src/features/shared/tests/queryPatterns.test.ts (1 hunks)
archon-ui-main/src/services/ollamaService.ts (3 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

archon-ui-main/src/features/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UI

Use Biome in features: 120 character line length, double quotes, and trailing commas

Files:

archon-ui-main/src/features/knowledge/services/knowledgeService.ts
archon-ui-main/src/features/shared/apiWithEtag.test.ts
archon-ui-main/src/features/shared/tests/queryPatterns.test.ts
archon-ui-main/src/features/shared/apiWithEtag.ts

archon-ui-main/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Disallow implicit any in TypeScript

archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types

Files:

archon-ui-main/src/features/knowledge/services/knowledgeService.ts
archon-ui-main/src/features/shared/apiWithEtag.test.ts
archon-ui-main/src/features/shared/tests/queryPatterns.test.ts
archon-ui-main/src/features/shared/apiWithEtag.ts
archon-ui-main/src/services/ollamaService.ts
archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx

archon-ui-main/src/**/*.{ts,tsx,py}

📄 CodeRabbit inference engine (CLAUDE.md)

In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)

Files:

archon-ui-main/src/features/knowledge/services/knowledgeService.ts
archon-ui-main/src/features/shared/apiWithEtag.test.ts
archon-ui-main/src/features/shared/tests/queryPatterns.test.ts
archon-ui-main/src/features/shared/apiWithEtag.ts
archon-ui-main/src/services/ollamaService.ts
archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx

archon-ui-main/src/**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Write frontend tests with Vitest and React Testing Library

Files:

archon-ui-main/src/features/shared/apiWithEtag.test.ts
archon-ui-main/src/features/shared/tests/queryPatterns.test.ts

🧬 Code graph analysis (5)

archon-ui-main/src/features/knowledge/services/knowledgeService.ts (1)

archon-ui-main/src/features/shared/apiWithEtag.ts (1)

callAPIWithETag (46-127)

archon-ui-main/src/features/shared/apiWithEtag.test.ts (1)

archon-ui-main/src/features/shared/apiWithEtag.ts (1)

callAPIWithETag (46-127)

archon-ui-main/src/features/shared/tests/queryPatterns.test.ts (1)

archon-ui-main/src/features/shared/queryPatterns.ts (1)

createRetryLogic (63-75)

archon-ui-main/src/services/ollamaService.ts (1)

archon-ui-main/src/features/shared/queryPatterns.ts (1)

createRetryLogic (63-75)

archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx (2)

archon-ui-main/src/services/ollamaService.ts (2)

testConnection (387-438)

ollamaService (494-494)

archon-ui-main/src/components/settings/types/OllamaTypes.ts (1)

ConnectionTestResult (107-112)

🔇 Additional comments (9)

archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx (1)

108-119: Solid delegation to ollamaService with minimal mapping.

Mapping service result to local ConnectionTestResult looks good. One nit: since modelsAvailable is always undefined here, the toast in Line 152 can render “undefined models”. Consider omitting that part when undefined.

archon-ui-main/src/services/ollamaService.ts (1)

9-9: Good: centralizing retry decision via createRetryLogic.

Import and usage are aligned with shared patterns.

archon-ui-main/src/features/shared/tests/queryPatterns.test.ts (7)

1-2: LGTM! Correct imports for testing setup.

The imports are appropriate for a Vitest test file with React Testing Library patterns.

4-37: Comprehensive retry logic testing for network and server errors.

The test cases correctly validate the retry behavior for network errors and 5xx server errors. The tests properly verify that retries work within limits and stop when maxRetries is exceeded. Good coverage of both statusCode and status property variations.

39-80: Thorough coverage of client error scenarios.

Excellent test coverage for 4xx client errors including edge cases like 429 rate limiting. The tests correctly verify that client errors should not be retried, which aligns with the smart retry logic requirements mentioned in the PR objectives.

82-111: Good abort error handling validation.

The tests comprehensively cover different abort scenarios including AbortError by name, ERR_CANCELED by code, and axios cancel tokens. This ensures that user-initiated cancellations are properly handled without unnecessary retries.

113-160: Excellent edge case coverage.

The edge case tests validate important scenarios like null/undefined errors, configurable maxRetries, default retry count, and mixed status properties. The test on Line 158 correctly validates that statusCode takes precedence over status, which aligns with the implementation in queryPatterns.ts.

162-181: Well-designed integration tests for Ollama service patterns.

These tests provide practical validation for real-world scenarios that the Ollama service will encounter, including model not found (404), connection timeouts, and server overload (503). This ensures the retry logic works correctly in the context it will be used.

1-182: Run tests locally/CI and confirm TanStack Query integration.

Sandbox couldn't run npm (no package.json). createRetryLogic is implemented at archon-ui-main/src/features/shared/queryPatterns.ts and wired into the centralized QueryClient (retry: createRetryLogic(2)) at archon-ui-main/src/features/shared/queryClient.ts — run archon-ui-main/src/features/shared/tests/queryPatterns.test.ts and verify feature hooks use the centralized QueryClient or follow feature-scoped TanStack Query hooks.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

archon-ui-main/src/features/shared/apiWithEtag.ts (1)

111-120: Handle non‑JSON or empty success bodies to avoid false “NETWORK_ERROR”.

Some endpoints (e.g., uploads) can return 201 with empty body or non‑JSON. Unconditional response.json() will throw and be misclassified as NETWORK_ERROR. Safely gate parsing by content-type and tolerate empty bodies.

Apply this diff:

-    // Parse response data
-    const result = await response.json();
-
-    // Check for API errors
-    if (result.error) {
-      throw new APIServiceError(result.error, "API_ERROR", response.status);
-    }
-
-    return result as T;
+    // Parse response data
+    const contentType = response.headers.get("content-type")?.toLowerCase() ?? "";
+    if (contentType.includes("application/json") || contentType.includes("+json")) {
+      const result = await response.json();
+      if (result && typeof result === "object" && "error" in result && (result as any).error) {
+        throw new APIServiceError((result as any).error as string, "API_ERROR", response.status);
+      }
+      return result as T;
+    }
+    // Non-JSON or empty body
+    const text = await response.text().catch(() => "");
+    return (text ? (text as unknown as T) : (undefined as T));

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f10604f and 661722c.

📒 Files selected for processing (2)

archon-ui-main/src/features/shared/apiWithEtag.test.ts (3 hunks)
archon-ui-main/src/features/shared/apiWithEtag.ts (2 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

archon-ui-main/src/features/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UI

Use Biome in features: 120 character line length, double quotes, and trailing commas

Files:

archon-ui-main/src/features/shared/apiWithEtag.ts
archon-ui-main/src/features/shared/apiWithEtag.test.ts

archon-ui-main/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Disallow implicit any in TypeScript

archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types

Files:

archon-ui-main/src/features/shared/apiWithEtag.ts
archon-ui-main/src/features/shared/apiWithEtag.test.ts

archon-ui-main/src/**/*.{ts,tsx,py}

📄 CodeRabbit inference engine (CLAUDE.md)

In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)

Files:

archon-ui-main/src/features/shared/apiWithEtag.ts
archon-ui-main/src/features/shared/apiWithEtag.test.ts

archon-ui-main/src/**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Write frontend tests with Vitest and React Testing Library

Files:

archon-ui-main/src/features/shared/apiWithEtag.test.ts

🧬 Code graph analysis (1)

archon-ui-main/src/features/shared/apiWithEtag.test.ts (1)

archon-ui-main/src/features/shared/apiWithEtag.ts (1)

callAPIWithETag (46-131)

🔇 Additional comments (5)

archon-ui-main/src/features/shared/apiWithEtag.ts (2)

54-69: FormData detection and header handling look correct.

Correctly omits Content-Type for FormData and preserves user-provided headers. Matches browser multipart boundary behavior.

71-71: Header normalization is appropriate.

Converting Headers to a plain object via entries() yields lower-case keys, aligning with the tests’ expectations.
archon-ui-main/src/features/shared/apiWithEtag.test.ts (3)
59-61: Expectation updates match new header behavior.

Lower-case keys and explicit accept header assertions are consistent with Headers normalization and implementation.

172-176: Custom header casing assertions are correct.

Asserting lower-case header names aligns with the normalized headers object.

415-599: Add test: tolerate 201 Created with empty body for FormData uploads — vitest env 'jsdom' verified

Vitest is configured with environment: 'jsdom' (File/FormData available). Add this test inside the "FormData Support" suite:
+    it("should tolerate 201 Created with empty body for FormData uploads", async () => {
+      const mockResponse = {
+        ok: true,
+        status: 201,
+        // Simulate empty body; no json() available on some servers
+        text: () => Promise.resolve(""),
+        headers: new Headers(), // no content-type
+      };
+      global.fetch = vi.fn().mockResolvedValue(mockResponse);
+
+      const fd = new FormData();
+      fd.append("file", new File(["x"], "x.txt"));
+
+      const result = await callAPIWithETag("/api/upload", { method: "POST", body: fd });
+      expect(result).toBeUndefined();
+    });

PROBLEMS FIXED: 1. Header merge order bug - FormData requests incorrectly had Content-Type set 2. Overly aggressive retry logic - retried 4xx client errors causing duplicates COMPREHENSIVE SOLUTION: Core API Service (apiWithEtag.ts): - Add FormData detection to avoid setting Content-Type header - Fix variable reference to use mergedHeaders instead of headers - Update documentation to reflect FormData + JSON support - Maintain backward compatibility with existing functionality Smart Retry Logic (ollamaService.ts): - Replace custom retry loops with createRetryLogic() from queryPatterns - Only retry network errors and 5xx server errors, not 4xx client errors - Add special handling for 404 errors (Ollama not installed) - Implement exponential backoff for better retry behavior Code Deduplication (OllamaConfigurationPanel.tsx): - Remove duplicate testConnection implementation - Use ollamaService.testConnection() for consistency - Eliminate code duplication and inconsistent retry behavior File Upload Improvements (knowledgeService.ts): - Use callAPIWithETag() for uploads instead of direct fetch() - Benefit from smart retry logic and ETag caching for file uploads - Remove workaround comments about FormData incompatibility Comprehensive Test Coverage (apiWithEtag.test.ts): - Add 5 new FormData-specific tests (21 total tests) - Test Content-Type header omission for FormData requests - Verify custom headers preservation with FormData - Test binary file uploads and error scenarios - Ensure JSON requests still get proper Content-Type headers IMPACT: - FormData uploads work properly through unified API service - Smart retry prevents duplicate operations and infinite loops - Eliminates code duplication across Ollama components - Maintains backward compatibility and ETag optimization - All tests pass - robust error handling and edge case coverage RELATED: Fixes technical debt identified in Issue coleam00#593

- 16 test cases covering all retry scenarios - Network errors: should retry (timeouts, connection failures) - 5xx server errors: should retry (500, 502, 503) - 4xx client errors: should NOT retry (400, 401, 404, 422, 429) - Abort errors: should NOT retry (AbortError, ERR_CANCELED) - Edge cases: null/undefined, custom maxRetries, mixed properties - Integration scenarios: Ollama-specific error patterns Ensures smart retry logic properly handles all states: - Retries 3 times for recoverable errors - Fails immediately for permanent client errors - Respects user cancellations

- Normalize headers using Headers constructor for better compatibility - Strip user-provided Content-Type for FormData to prevent upload errors - Update tests to expect lowercase header names per HTTP standards - Add test case for Content-Type removal with FormData - Maintain backward compatibility while improving robustness All 46 shared feature tests passing.

- Update misleading comment in knowledgeService.ts - Add per-attempt timeout support to ollamaService.ts checkInstanceHealth method - Apply AbortSignal.timeout(5000) for health check timeouts

…e.ts - Remove unused APIServiceError import detected by Biome - Improve multi-line formatting for callAPIWithETag function call

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

archon-ui-main/src/services/ollamaService.ts (3)
158-182: Avoid any in public service code; narrow from unknown.

Use unknown and narrow to keep strict typing guarantees.
-  private handleApiError(error: any, context: string): Error {
-    const errorMessage = error instanceof Error ? error.message : String(error);
+  private handleApiError(error: unknown, context: string): Error {
+    const err = error instanceof Error ? error : new Error(String(error));
+    const errorMessage = err.message;
225-247: Add a default timeout when no signal is provided.

Prevents potentially hanging requests when callers omit a signal, aligning with apiClient’s 20s default.
       const response = await fetch(`${this.baseUrl}/api/ollama/instances/health?${params.toString()}`, {
         method: 'GET',
         headers: {
           'Content-Type': 'application/json',
         },
-        signal,
+        signal: signal ?? AbortSignal.timeout(20000),
       });
416-419: Annotate both status and statusCode to maximize compatibility with error inspectors.

Some utilities inspect error.status, others error.statusCode. Set both so getErrorStatus() can detect 4xx reliably.
-        if (result.error?.includes('404') || result.error?.includes('not found')) {
-          (lastError as any).statusCode = 404;
-        }
+        if (result.error?.includes('404') || result.error?.includes('not found')) {
+          (lastError as any).statusCode = 404;
+          (lastError as any).status = 404;
+        }
archon-ui-main/src/features/shared/api/apiClient.ts (1)
54-69: Don’t override a caller-provided Accept header.

Set Accept only if absent to allow non‑JSON endpoints when needed; current tests still pass.
-    headersObj.set("Accept", "application/json");
+    if (!headersObj.has("Accept")) {
+      headersObj.set("Accept", "application/json");
+    }

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 661722c and e52116a.

📒 Files selected for processing (6)

archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx (2 hunks)
archon-ui-main/src/features/knowledge/services/knowledgeService.ts (1 hunks)
archon-ui-main/src/features/shared/api/apiClient.ts (2 hunks)
archon-ui-main/src/features/shared/api/tests/apiClient.test.ts (3 hunks)
archon-ui-main/src/features/shared/tests/queryPatterns.test.ts (1 hunks)
archon-ui-main/src/services/ollamaService.ts (5 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

archon-ui-main/src/features/knowledge/services/knowledgeService.ts
archon-ui-main/src/features/shared/tests/queryPatterns.test.ts
archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx

🧰 Additional context used

📓 Path-based instructions (4)

archon-ui-main/src/features/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UI

Use Biome in features: 120 character line length, double quotes, and trailing commas

Files:

archon-ui-main/src/features/shared/api/tests/apiClient.test.ts
archon-ui-main/src/features/shared/api/apiClient.ts

archon-ui-main/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Disallow implicit any in TypeScript

archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types

Files:

archon-ui-main/src/features/shared/api/tests/apiClient.test.ts
archon-ui-main/src/services/ollamaService.ts
archon-ui-main/src/features/shared/api/apiClient.ts

archon-ui-main/src/**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Write frontend tests with Vitest and React Testing Library

Files:

archon-ui-main/src/features/shared/api/tests/apiClient.test.ts

archon-ui-main/src/**/*.{ts,tsx,py}

📄 CodeRabbit inference engine (CLAUDE.md)

In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)

Files:

archon-ui-main/src/features/shared/api/tests/apiClient.test.ts
archon-ui-main/src/services/ollamaService.ts
archon-ui-main/src/features/shared/api/apiClient.ts

🧬 Code graph analysis (2)

archon-ui-main/src/features/shared/api/tests/apiClient.test.ts (1)

archon-ui-main/src/features/shared/api/apiClient.ts (1)

callAPIWithETag (46-131)

archon-ui-main/src/services/ollamaService.ts (1)

archon-ui-main/src/features/shared/config/queryPatterns.ts (1)

createRetryLogic (63-75)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Backend Tests (Python + pytest)

🔇 Additional comments (5)

archon-ui-main/src/services/ollamaService.ts (1)

392-431: Smart retry loop is solid: bounded attempts, exponential backoff, and abort signal.

Good use of retry logic and 5s request-level timeout to keep attempts bounded. Early-return on success is correct.

archon-ui-main/src/features/shared/api/apiClient.ts (2)

36-45: Docs/readme block: LGTM.

Accurately describes reliance on browser ETag handling and TanStack Query for cache freshness.

71-72: Header normalization approach is fine.

Lower‑casing via Headers and passing a plain object to fetch is consistent with the tests.
archon-ui-main/src/features/shared/api/tests/apiClient.test.ts (2)
59-61: Header expectations updated correctly (lowercased + Accept).

Matches the client’s normalization behavior.

415-599: FormData test suite: comprehensive and correct.

Covers omission of Content‑Type, custom headers preservation, binary file handling, and user‑provided Content‑Type removal.

Consider adding one test to assert custom Accept is preserved (to go with the suggested client change):
it("should preserve caller-provided Accept header", async () => {
  const mockResponse = { ok: true, status: 200, json: () => Promise.resolve({ ok: true }), headers: new Headers() };
  global.fetch = vi.fn().mockResolvedValue(mockResponse);

  await callAPIWithETag("/api/text-endpoint", {
    headers: { Accept: "text/plain" },
  });

  const [, options] = (global.fetch as any).mock.calls[0];
  expect(options.headers["accept"]).toBe("text/plain");
});

- Simplify JSON parsing in ollama_api.py by removing unnecessary isinstance check - Enhance error handling in mcp_client.py to check both 'error' and 'message' fields - Apply code quality improvements for better error message extraction Related to PR coleam00#723 CodeRabbit review feedback.

Frontend fixes: - Add non-JSON response handling in apiClient.ts to prevent JSON parsing errors - Fix import path for createRetryLogic in ollamaService.ts - Replace 'any' types with 'unknown' for better type safety - Add 30s default timeouts to all fetch calls in ollamaService.ts - Enhanced status code annotations for smart retry logic compatibility Backend fixes addressed in previous commits: - Simplified JSON parsing in ollama_api.py - Enhanced error handling in mcp_client.py to check both 'error' and 'message' fields All CodeRabbit review comments have been systematically addressed.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)

python/src/server/api_routes/ollama_api.py (3)

526-542: Fix regression: json.loads() on an already-parsed dict will crash; handle str/dict robustly.

Supabase can return JSON as a native dict (JSON column) or a string. Unconditionally json.loads(models_setting) will raise on dict input.

Apply this diff:

-        models_data = json.loads(models_setting)
-        from datetime import datetime
+        # Handle both JSON string and native dict from DB driver
+        if isinstance(models_setting, str):
+            try:
+                models_data = json.loads(models_setting)
+            except json.JSONDecodeError:
+                logger.error("Corrupted 'ollama_discovered_models' JSON in archon_settings", exc_info=True)
+                raise HTTPException(status_code=500, detail="Stored models are corrupted")
+        elif isinstance(models_setting, dict):
+            models_data = models_setting
+        else:
+            logger.error(f"Unexpected type for models_setting: {type(models_setting).__name__}")
+            raise HTTPException(status_code=500, detail="Invalid stored models format")

199-200: Bug: timestamp is set to a module name, not a real timestamp.

__module__ returns a module string. Use an actual timestamp.

-            "timestamp": model_discovery_service.check_instance_health.__module__  # Use current timestamp
+            "timestamp": datetime.utcnow().isoformat()

101-108: SSRF risk: user-supplied instance_urls are fetched server-side without egress safeguards.

Attackers can target internal/metadata endpoints (e.g., 127.0.0.1, 169.254.169.254). Validate/denylist private/link-local/loopback addresses before making requests.

Example helper and usage (add once and call in each loop before requests):

# Add near imports
from urllib.parse import urlparse
import socket, ipaddress

def _is_private_host(host: str) -> bool:
    try:
        infos = socket.getaddrinfo(host, None)
        for _, _, _, _, sockaddr in infos:
            ip = ipaddress.ip_address(sockaddr[0])
            if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved or ip.is_multicast:
                return True
    except Exception:
        # If resolution fails, treat as unsafe or log/deny explicitly
        return True
    return False

Usage in validation loops:

parsed = urlparse(url)
if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname):
    logger.warning(f"Blocked private/invalid URL: {url}")
    continue

Also applies to: 159-169, 981-986

python/src/agents/mcp_client.py (3)

93-98: Preserve stack traces and surface HTTP status; raise a transport-specific exception.

Use logger.exception (exc_info) and chain the original error. Capture status/body snippet for context.

-        except httpx.HTTPError as e:
-            logger.error(f"HTTP error calling MCP tool {tool_name}: {e}")
-            raise Exception(f"Failed to call MCP tool: {str(e)}")
-        except Exception as e:
-            logger.error(f"Error calling MCP tool {tool_name}: {e}")
-            raise
+        except httpx.HTTPError as e:
+            resp = getattr(e, "response", None)
+            status = resp.status_code if resp is not None else None
+            body = resp.text[:500] if resp is not None else None
+            logger.exception(
+                f"HTTP error calling MCP tool {tool_name} | url={self.mcp_url}/rpc | status={status} | "
+                f"request_id={request_id} | body_snippet={body}"
+            )
+            raise MCPTransportError(f"HTTP error calling MCP tool {tool_name}", status_code=status) from e
+        except Exception:
+            logger.exception(f"Error calling MCP tool {tool_name} | request_id={request_id}")
+            raise

15-20: Introduce typed exception hierarchy for MCP client.

Enables precise handling and aligns with backend error-handling guidelines.

 logger = logging.getLogger(__name__)
 
-
+class MCPError(Exception):
+    """Base MCP client error."""
+
+class MCPTransportError(MCPError):
+    def __init__(self, message: str, status_code: int | None = None):
+        super().__init__(message if status_code is None else f"[HTTP {status_code}] {message}")
+        self.status_code = status_code
+
+class MCPToolError(MCPError):
+    def __init__(self, message: str, code: int | None = None, data: Any | None = None):
+        super().__init__(message if code is None else f"[{code}] {message}")
+        self.code = code
+        self.data = data
+
 
 class MCPClient:

69-71: Change call_tool return type to Any and update its docstring

JSON-RPC result can be any JSON value; current annotation dict[str, Any] is incorrect and will trigger MyPy failures.

-    async def call_tool(self, tool_name: str, **kwargs) -> dict[str, Any]:
+    async def call_tool(self, tool_name: str, **kwargs) -> Any:
@@
-        Returns:
-            Dict with the tool response
+        Returns:
+            JSON-RPC result value (any JSON-serializable type)

Call sites to review/adjust if they rely on dict-specific behavior:

python/src/agents/mcp_client.py:104-107
python/src/agents/mcp_client.py:111-112
python/src/agents/mcp_client.py:118-121
python/src/agents/mcp_client.py:125-126
python/src/agents/mcp_client.py:130-133
python/src/agents/mcp_client.py:137-138

🧹 Nitpick comments (8)

python/src/server/api_routes/ollama_api.py (2)
137-139: Preserve stack traces in logs (use logger.exception or exc_info=True).

Backend guidelines require full stack traces. Current logs drop tracebacks.

Apply pattern like:
-        logger.error(f"Error in model discovery: {e}")
+        logger.exception("Error in model discovery")
and similarly:
-        logger.error(f"Error in health check: {e}")
+        logger.exception("Error in health check")
-        logger.error(f"Error retrieving stored models: {e}")
+        logger.exception("Error retrieving stored models")
Replicate across other except blocks in this module.

Also applies to: 202-205, 577-580

276-277: Avoid calling a private method (_calculate_performance_score).

Prefer a public API on embedding_router for performance scoring to avoid tight coupling and future breakage.
python/src/agents/mcp_client.py (6)
21-21: Annotate optional parameter explicitly (MyPy).

Use modern union for clarity.
-    def __init__(self, mcp_url: str = None):
+    def __init__(self, mcp_url: str | None = None):
46-46: Annotate client attribute for MyPy.

Prevents “Missing attribute”/inference issues.
-        self.client = httpx.AsyncClient(timeout=30.0)
+        self.client: httpx.AsyncClient = httpx.AsyncClient(timeout=30.0)
13-14: Add uuid import to support unique JSON-RPC ids.
+import uuid
 import httpx
74-75: Use unique JSON-RPC ids and propagate into logs for correlation.
-            request_data = {"jsonrpc": "2.0", "method": tool_name, "params": kwargs, "id": 1}
+            request_id = str(uuid.uuid4())
+            request_data = {"jsonrpc": "2.0", "method": tool_name, "params": kwargs, "id": request_id}
77-81: Drop explicit Content-Type header (httpx sets it for json=).

Avoids redundancy and keeps headers minimal.
-            response = await self.client.post(
-                f"{self.mcp_url}/rpc",
-                json=request_data,
-                headers={"Content-Type": "application/json"},
-            )
+            response = await self.client.post(f"{self.mcp_url}/rpc", json=request_data)
141-158: Add a shutdown helper to close the global client.

Prevents connection pool leaks on graceful shutdown.
     return _mcp_client
+
+async def close_mcp_client() -> None:
+    """Close and clear the global MCP client (call on application shutdown)."""
+    global _mcp_client
+    if _mcp_client is not None:
+        await _mcp_client.close()
+        _mcp_client = None

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e52116a and 0e634e9.

📒 Files selected for processing (2)

python/src/agents/mcp_client.py (1 hunks)
python/src/server/api_routes/ollama_api.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

python/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)

Files:

python/src/agents/mcp_client.py
python/src/server/api_routes/ollama_api.py

python/src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use MyPy for type checking to ensure type safety

python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code

Files:

python/src/agents/mcp_client.py
python/src/server/api_routes/ollama_api.py

python/src/server/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs

Files:

python/src/server/api_routes/ollama_api.py

python/src/server/api_routes/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place HTTP route handlers in the API routes directory

Files:

python/src/server/api_routes/ollama_api.py

🧬 Code graph analysis (1)

python/src/agents/mcp_client.py (1)

python/src/server/utils/progress/progress_tracker.py (1)

error (196-219)

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (7)

archon-ui-main/src/features/shared/api/apiClient.ts (3)

54-69: FormData detection and header handling: LGTM; don’t overwrite Accept if caller sets one.

Preserves browser-set multipart boundaries. Minor: avoid clobbering a caller-provided Accept.

Apply this diff:
-    headersObj.set("Accept", "application/json");
+    if (!headersObj.has("Accept")) {
+      headersObj.set("Accept", "application/json");
+    }
71-72: Pass Headers directly to fetch to preserve multi-valued headers and avoid unnecessary lowercasing conversion.

Headers already normalizes and supports multiple values; converting to a plain object can drop that fidelity.

Apply this diff:
-    const headers: Record<string, string> = Object.fromEntries(headersObj.entries());
+    // Use Headers directly to preserve semantics
+    const headers = headersObj;
@@
-      headers,
+      headers,
Also applies to: 79-82

111-121: Consider lenient parsing when content-type is missing but body is JSON.

Some proxies/servers omit content-type; optionally attempt JSON parse before erroring out.

archon-ui-main/src/services/ollamaService.ts (4)

158-182: Harden error classification with case-insensitive checks.

Normalize casing for “Failed to fetch” and “AbortError” to avoid missing variants.

Apply this diff:

-    if (
-      errorMessage.toLowerCase().includes("network") ||
-      errorMessage.includes("fetch") ||
-      errorMessage.includes("Failed to fetch")
-    ) {
+    const lowerMsg = errorMessage.toLowerCase();
+    if (lowerMsg.includes("network") || lowerMsg.includes("fetch") || lowerMsg.includes("failed to fetch")) {
       return new Error(
         `Network error while ${context.toLowerCase()}: ${errorMessage}. ` +
           `Please check your connection and Ollama server status.`,
       );
     }
@@
-    if (errorMessage.includes("timeout") || errorMessage.includes("AbortError")) {
+    if (lowerMsg.includes("timeout") || lowerMsg.includes("aborterror")) {
       return new Error(
         `Timeout error while ${context.toLowerCase()}: The Ollama instance may be slow to respond or unavailable.`
       );
     }

205-209: Remove Content-Type on GET/DELETE; use Accept instead.

Setting Content-Type on requests without a body is unnecessary and can confuse intermediaries. Prefer Accept.

Apply this diff:

@@
-      const response = await fetch(`${this.baseUrl}/api/ollama/models?${params.toString()}`, {
+      const response = await fetch(`${this.baseUrl}/api/ollama/models?${params.toString()}`, {
         method: 'GET',
-        headers: {
-          'Content-Type': 'application/json',
-        },
+        headers: { Accept: 'application/json' },
         signal: AbortSignal.timeout(30000), // 30 second timeout
       });
@@
-      const response = await fetch(`${this.baseUrl}/api/ollama/instances/health?${params.toString()}`, {
+      const response = await fetch(`${this.baseUrl}/api/ollama/instances/health?${params.toString()}`, {
         method: 'GET',
-        headers: {
-          'Content-Type': 'application/json',
-        },
+        headers: { Accept: 'application/json' },
         signal,
       });
@@
-      const response = await fetch(`${this.baseUrl}/api/ollama/embedding/routes?${params.toString()}`, {
+      const response = await fetch(`${this.baseUrl}/api/ollama/embedding/routes?${params.toString()}`, {
         method: 'GET',
-        headers: {
-          'Content-Type': 'application/json',
-        },
+        headers: { Accept: 'application/json' },
         signal: AbortSignal.timeout(30000), // 30 second timeout
       });
@@
-      const response = await fetch(`${this.baseUrl}/api/ollama/cache`, {
+      const response = await fetch(`${this.baseUrl}/api/ollama/cache`, {
         method: 'DELETE',
-        headers: {
-          'Content-Type': 'application/json',
-        },
+        headers: { Accept: 'application/json' },
         signal: AbortSignal.timeout(30000), // 30 second timeout
       });

Also applies to: 244-248, 346-351, 371-376

401-410: Be resilient to instance_status keying; fall back to the first entry.

Some backends key by host or sanitized URL. Use a fallback to avoid false negatives.

Apply this diff:

-        const instanceStatus = healthResponse.instance_status[instanceUrl];
+        const instanceStatus =
+          healthResponse.instance_status[instanceUrl] ??
+          Object.values(healthResponse.instance_status)[0];

435-443: Avoid ad-hoc casts when annotating status; use a typed helper.

Define a small helper/type to carry statusCode to the retry logic.

Example:

type HttpErrorLike = Error & { statusCode?: number; status?: number };

function setStatusCode(err: Error, code: number) {
  (err as HttpErrorLike).statusCode = code;
}

Then replace casts like (lastError as unknown as { statusCode: number }).statusCode = 404; with setStatusCode(lastError, 404).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e634e9 and 1c839e3.

📒 Files selected for processing (2)

archon-ui-main/src/features/shared/api/apiClient.ts (3 hunks)
archon-ui-main/src/services/ollamaService.ts (13 hunks)

🧰 Additional context used

📓 Path-based instructions (3)

archon-ui-main/src/features/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UI

Use Biome in features: 120 character line length, double quotes, and trailing commas

Files:

archon-ui-main/src/features/shared/api/apiClient.ts

archon-ui-main/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Disallow implicit any in TypeScript

archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types

Files:

archon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/services/ollamaService.ts

archon-ui-main/src/**/*.{ts,tsx,py}

📄 CodeRabbit inference engine (CLAUDE.md)

In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)

Files:

archon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/services/ollamaService.ts

🧬 Code graph analysis (2)

archon-ui-main/src/features/shared/api/apiClient.ts (1)

archon-ui-main/src/features/shared/types/errors.ts (1)

APIServiceError (19-28)

archon-ui-main/src/services/ollamaService.ts (1)

archon-ui-main/src/features/shared/config/queryPatterns.ts (1)

createRetryLogic (63-75)

🔇 Additional comments (2)

archon-ui-main/src/services/ollamaService.ts (1)

9-9: Import path fixed — looks correct now.

Matches the actual export under features/shared/config.

archon-ui-main/src/features/shared/api/apiClient.ts (1)

36-45: 304 handling assumption — confirmed. fetch() returns the cached/synthesized (effectively 200) response when a conditional GET yields 304, so a raw 304 is not exposed to script and your response.ok check is safe (see Fetch Standard http-network-or-cache and MDN 304 semantics).

Backend Critical Fixes: - Fix json.loads() regression to handle both string and dict input from Supabase - Fix timestamp bug using actual datetime.utcnow().isoformat() instead of __module__ - Add comprehensive SSRF protection with IP address validation for instance_urls - Enhanced logging with exc_info=True for all error handlers to preserve stack traces MCP Client Enhancements: - Add proper exception hierarchy: MCPError, MCPTransportError, MCPToolError - Implement comprehensive error handling with detailed logging including request IDs - Add global shutdown helper function for proper resource cleanup - Generate unique JSON-RPC request IDs for better correlation Frontend API Service Improvements: - Preserve caller-provided Accept headers instead of always overriding - Handle non-JSON and empty responses properly to avoid false NETWORK_ERROR classifications - Add both status and statusCode properties on error objects for maximum compatibility - Enhanced status code annotations for retry logic compatibility All changes maintain backward compatibility while significantly improving security, reliability, and developer experience.

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

python/src/server/api_routes/ollama_api.py (3)

151-153: BackgroundTasks cannot run async functions; current cache warmup likely never runs.

BackgroundTasks executes sync callables in a threadpool; passing an async def leads to an un-awaited coroutine. Use asyncio.create_task instead.

-        if background_tasks:
-            background_tasks.add_task(_warm_model_cache, valid_urls)
+        # Fire-and-forget cache warmup on the running event loop
+        asyncio.create_task(_warm_model_cache(valid_urls))

Add import at top:

import asyncio

259-264: Add SSRF validation to validate_instance.

Validate instance_url with the same guard as other endpoints before calling downstream services.

         # Clean up URL
         instance_url = request.instance_url.rstrip('/')
 
+        # SSRF protection
+        parsed = urlparse(instance_url)
+        if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname):
+            raise HTTPException(status_code=400, detail="Invalid or private instance_url")
+
         # Perform basic validation using the provider service
         validation_result = await validate_provider_instance("ollama", instance_url)

1315-1320: Add SSRF validation to capability testing endpoint.

This endpoint performs server-side requests to the provided instance_url; protect it like other endpoints.

         logger.info(f"Testing capabilities for model {request.model_name} on {request.instance_url}")
-        
+        # SSRF protection
+        parsed = urlparse(request.instance_url.rstrip('/'))
+        if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname):
+            raise HTTPException(status_code=400, detail="Invalid or private instance_url")
+
         test_results = {}
         errors = []

🧹 Nitpick comments (17)

archon-ui-main/src/features/shared/api/apiClient.ts (2)

54-56: Harden FormData detection for cross-environment safety.

Guard against environments where FormData may be undefined (Node/Jest, iframes).
-    const isFormData = options.body instanceof FormData;
+    const isFormData =
+      typeof FormData !== "undefined" && options.body instanceof FormData;
65-71: Only set Content-Type when a body is present.

Avoid sending Content-Type on GET/HEAD or body-less requests.
-    } else if (!headersObj.has("Content-Type")) {
+    } else if (!headersObj.has("Content-Type") && options.body != null) {
       // Only set Content-Type if not already provided
       headersObj.set("Content-Type", "application/json");
     }

archon-ui-main/src/services/ollamaService.ts (5)

158-182: Handle AbortError by name, not just message substring.

Abort errors often set error.name === "AbortError" without a message match.

-    // Check for timeout errors
-    if (errorMessage.includes("timeout") || errorMessage.includes("AbortError")) {
+    // Check for timeout/abort errors
+    if (
+      errorMessage.includes("timeout") ||
+      (typeof (error as any)?.name === "string" && (error as any).name === "AbortError")
+    ) {
       return new Error(
         `Timeout error while ${context.toLowerCase()}: The Ollama instance may be slow to respond or unavailable.`
       );
     }

203-209: Drop Content-Type on GET requests.

Content-Type on GET is unnecessary; rely on Accept (server infers no body).

-      const response = await fetch(`${this.baseUrl}/api/ollama/models?${params.toString()}`, {
+      const response = await fetch(`${this.baseUrl}/api/ollama/models?${params.toString()}`, {
         method: 'GET',
-        headers: {
-          'Content-Type': 'application/json',
-        },
+        // No Content-Type for GET
         signal: AbortSignal.timeout(30000), // 30 second timeout
       });

Same applies to other GETs in this file.

226-248: Provide a default timeout when no signal is supplied.

Prevents hangs when callers omit a signal.

-      const response = await fetch(`${this.baseUrl}/api/ollama/instances/health?${params.toString()}`, {
+      const response = await fetch(`${this.baseUrl}/api/ollama/instances/health?${params.toString()}`, {
         method: 'GET',
         headers: {
           'Content-Type': 'application/json',
         },
-        signal,
+        signal: signal ?? AbortSignal.timeout(30000),
       });

406-410: Preserve zero response times (use nullish coalescing).

Avoid treating 0 as falsy.

-          responseTime: instanceStatus?.response_time_ms || responseTime,
+          responseTime: instanceStatus?.response_time_ms ?? responseTime,

453-459: Optional: add jitter to backoff to reduce thundering herd.

Small random jitter improves resilience under load.

-        const delayMs = Math.pow(2, attempt - 1) * 1000; // Exponential backoff: 1s, 2s, 4s
+        const base = Math.pow(2, attempt - 1) * 1000; // 1s, 2s, 4s
+        const jitter = Math.floor(Math.random() * 250); // up to 250ms
+        const delayMs = base + jitter;

python/src/agents/mcp_client.py (3)

128-133: Fix type hints for optional params (MyPy).

Parameters that accept None should be annotated as Optional (PEP 604 union).

-    async def perform_rag_query(self, query: str, source: str = None, match_count: int = 5) -> str:
+    async def perform_rag_query(self, query: str, source: str | None = None, match_count: int = 5) -> str:

141-147: Fix type hints for optional params (MyPy).

Same here for source_id.

-    async def search_code_examples(
-        self, query: str, source_id: str = None, match_count: int = 5
-    ) -> str:
+    async def search_code_examples(
+        self, query: str, source_id: str | None = None, match_count: int = 5
+    ) -> str:

171-183: Make global client creation concurrency-safe.

Two concurrent calls could create two clients. Guard with an asyncio.Lock.

@@
-_mcp_client: MCPClient | None = None
+_mcp_client: MCPClient | None = None
+_mcp_lock: "asyncio.Lock" | None = None
@@
 async def get_mcp_client() -> MCPClient:
@@
-    if _mcp_client is None:
-        _mcp_client = MCPClient()
+    if _mcp_client is None:
+        # Lazy-init lock to avoid importing asyncio at module import in all contexts
+        global _mcp_lock
+        if _mcp_lock is None:
+            import asyncio as _asyncio  # local import to avoid unconditional dependency at import time
+            _mcp_lock = _asyncio.Lock()
+        async with _mcp_lock:
+            if _mcp_client is None:
+                _mcp_client = MCPClient()

Note: If you prefer, move the asyncio import to the top and initialize the lock eagerly.

python/src/server/api_routes/ollama_api.py (7)

195-206: Use consistent keys for health_results to avoid duplicates.

Blocked entries use instance_url while healthy entries use the normalized url. Normalize both to url.

-                    health_results[instance_url] = {
+                    health_results[url] = {
                         "is_healthy": False,
                         "error_message": "URL blocked for security reasons",
                         "response_time_ms": None,
                         "models_available": 0,
                         "last_checked": datetime.utcnow().isoformat()
                     }

-                health_results[url] = {
+                health_results[url] = {
                     "is_healthy": health_status.is_healthy,
                     "response_time_ms": health_status.response_time_ms,
                     "models_available": health_status.models_available if include_models else None,
                     "error_message": health_status.error_message,
                     "last_checked": health_status.last_checked
                 }

1024-1029: Docstring is inaccurate; code uses only /api/tags and derives some fields.

Align the docstring with actual behavior to avoid misleading clients.

-    Discover models from Ollama instances with complete real details from both /api/tags and /api/show.
-    Only stores actual data from Ollama API endpoints - no fabricated information.
+    Discover models from Ollama instances using /api/tags only (fast path).
+    Skips /api/show to reduce latency/timeouts. Stores data returned by the API;
+    derived fields are kept minimal to avoid fabricating details.

1193-1197: Use upsert instead of update to ensure persistence even if the row doesn't exist.

An update will no-op if the key is missing. Upsert guarantees the record exists.

-        result = supabase.table("archon_settings").update({
-            "value": json.dumps(models_data),
-            "description": "Real Ollama model data from API endpoints",
-            "updated_at": datetime.now().isoformat()
-        }).eq("key", "ollama_discovered_models").execute()
+        result = supabase.table("archon_settings").upsert({
+            "key": "ollama_discovered_models",
+            "value": json.dumps(models_data),
+            "category": "ollama",
+            "description": "Real Ollama model data from API endpoints",
+            "updated_at": datetime.now().isoformat()
+        }).execute()

1003-1015: Avoid “partial support” heuristics based on braces; require valid JSON.

Curly-brace detection will yield many false positives. Treat only parseable JSON as supported.

-                    except json.JSONDecodeError:
-                        # Try to find JSON-like patterns in the response
-                        if '{' in content and '}' in content and '"' in content:
-                            logger.info(f"Model {model_name} has partial structured output support")
-                            return True
+                    except json.JSONDecodeError:
+                        # Not valid JSON => treat as unsupported
+                        return False

587-588: Remove redundant inner import.

datetime is already imported at module level.

-        from datetime import datetime
-        
+

240-241: Prefer timezone-aware UTC timestamps for consistency.

Use datetime.now(timezone.utc).isoformat() to avoid naive datetime ambiguity.

Example:

from datetime import datetime, timezone

datetime.now(timezone.utc).isoformat()

Also applies to: 505-506, 519-521, 529-530, 622-624

1127-1143: Avoid fabricating compatibility and embedding dimensions in “real-details” endpoint.

Setting chat compatibility to 'full' and guessing embedding_dimensions contradicts the “real data only” intent. Either:

Mark compatibility as 'unknown' and embedding_dimensions as None, or
Explicitly document they’re inferred defaults.

Would you like me to generate a minimal patch that sets archon_compatibility='unknown' for chat models and leaves embedding_dimensions=None when not present?

Also applies to: 1103-1114

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c839e3 and 98256b8.

📒 Files selected for processing (4)

archon-ui-main/src/features/shared/api/apiClient.ts (3 hunks)
archon-ui-main/src/services/ollamaService.ts (13 hunks)
python/src/agents/mcp_client.py (5 hunks)
python/src/server/api_routes/ollama_api.py (19 hunks)

🧰 Additional context used

📓 Path-based instructions (7)

archon-ui-main/src/features/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UI

Use Biome in features: 120 character line length, double quotes, and trailing commas

Files:

archon-ui-main/src/features/shared/api/apiClient.ts

archon-ui-main/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Disallow implicit any in TypeScript

archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types

Files:

archon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/services/ollamaService.ts

archon-ui-main/src/**/*.{ts,tsx,py}

📄 CodeRabbit inference engine (CLAUDE.md)

In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)

Files:

archon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/services/ollamaService.ts

python/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)

Files:

python/src/agents/mcp_client.py
python/src/server/api_routes/ollama_api.py

python/src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use MyPy for type checking to ensure type safety

python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code

Files:

python/src/agents/mcp_client.py
python/src/server/api_routes/ollama_api.py

python/src/server/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs

Files:

python/src/server/api_routes/ollama_api.py

python/src/server/api_routes/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place HTTP route handlers in the API routes directory

Files:

python/src/server/api_routes/ollama_api.py

🧬 Code graph analysis (2)

archon-ui-main/src/features/shared/api/apiClient.ts (1)

archon-ui-main/src/features/shared/types/errors.ts (1)

APIServiceError (19-28)

archon-ui-main/src/services/ollamaService.ts (1)

archon-ui-main/src/features/shared/config/queryPatterns.ts (1)

createRetryLogic (63-75)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Test Results Summary

🔇 Additional comments (10)

archon-ui-main/src/features/shared/api/apiClient.ts (3)

57-64: LGTM: Header normalization and default Accept.

Using Headers() to normalize and defaulting Accept to application/json is sound and matches tests.

113-121: LGTM: Content-Type–aware JSON parsing with API error surfacing.

The JSON gating and API error propagation are correct and robust.

124-127: LGTM: Non‑JSON fallback handling.

Gracefully returning text or undefined is appropriate here.

archon-ui-main/src/services/ollamaService.ts (1)

9-9: Import path fix confirmed.

createRetryLogic import now points to ../features/shared/config/queryPatterns — correct.
python/src/agents/mcp_client.py (4)
101-107: Good: raise MCPToolError with code/data (addresses prior feedback).

You’re correctly surfacing JSON-RPC error.code and error.data via a typed exception; this preserves protocol semantics.

91-97: LGTM: unique JSON-RPC IDs and relying on httpx json= for Content-Type.

Using uuid4() for request correlation and letting httpx set headers is solid and reduces header bugs.

186-199: LGTM: graceful global shutdown helper.

Cleanly closing the AsyncClient and clearing the singleton is correct and production‑friendly.

108-108: Don’t silently default missing 'result'; treat as protocol error.

Defaulting to {} hides malformed JSON-RPC responses. Raise an MCPError if "result" is absent (allowing None as a valid value).
-            return result.get("result", {})
+            if "result" not in result:
+                raise MCPError(
+                    f"Malformed JSON-RPC response: missing 'result' (keys={list(result.keys())})"
+                )
+            return result["result"]
python/src/server/api_routes/ollama_api.py (2)

574-586: Good robustness for stored models load path.

Handling both JSON string and dict with explicit 500 on corruption is correct and aligns with “never accept corrupted data”.

166-166: LGTM on logging improvements.

Using exc_info=True consistently will preserve stack traces and improve troubleshooting.

Also applies to: 244-244, 294-294, 386-386, 415-415, 543-543, 638-638, 664-664, 1253-1253, 1403-1403

- Add SSRF protection to validate_instance and test_capabilities endpoints - Fix async execution with asyncio.create_task for cache warming - Implement thread-safe MCP client creation with proper locking - Enhance FormData detection for cross-environment compatibility - Remove unnecessary Content-Type headers from GET requests - Add default timeouts and jitter to retry logic - Fix MyPy type hints for optional parameters - Use database upsert instead of update for persistence - Standardize on UTC timestamps throughout - Remove fabricated model data, use real API data only Resolves 20 CodeRabbit comments (3 actionable + 17 nitpicks) through parallel agent coordination addressing security, concurrency, type safety, and API client improvements across backend and frontend services.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (12)

archon-ui-main/src/features/shared/api/apiClient.ts (2)

76-87: Avoid converting Headers to a plain object (preserve multi-value/case-insensitive semantics)

Passing a plain Record loses potential multi-value headers and relies on object semantics. Prefer passing the Headers instance directly.

Apply this diff:

-const headers: Record<string, string> = Object.fromEntries(headersObj.entries());
...
-  headers,
+  headers: headersObj,

116-130: Handle binary responses (octet-stream, PDF, images) instead of coercing to text

Non-JSON content may be binary; returning text can corrupt data. Add a Blob branch for common binary content types.

 // Check content type before parsing as JSON
 const contentType = response.headers.get("content-type")?.toLowerCase() ?? "";
 if (contentType.includes("application/json") || contentType.includes("+json")) {
   // Parse JSON response
   const result = await response.json();
   if (result && typeof result === "object" && "error" in result && result.error) {
     throw new APIServiceError(result.error as string, "API_ERROR", response.status);
   }
   return result as T;
 }
 
-// Handle non-JSON or empty body responses
-const text = await response.text().catch(() => "");
-return text ? (text as unknown as T) : (undefined as T);
+// Handle non-JSON responses (prefer Blob for binary)
+if (
+  contentType.startsWith("application/octet-stream") ||
+  contentType.startsWith("application/pdf") ||
+  contentType.startsWith("image/") ||
+  contentType.startsWith("application/zip")
+){
+  const blob = await response.blob().catch(() => null);
+  return (blob ?? undefined) as unknown as T;
+}
+// Fallback to text for other content types
+const text = await response.text().catch(() => "");
+return text ? (text as unknown as T) : (undefined as T);

python/src/server/api_routes/ollama_api.py (3)

105-112: Type the optional BackgroundTasks parameter explicitly

BackgroundTasks is optional here; annotate its type accordingly.
-    background_tasks: BackgroundTasks = None
+    background_tasks: BackgroundTasks | None = None
628-631: Use timezone-aware UTC consistently

Elsewhere you use datetime.now(UTC).isoformat(); here it falls back to naive datetime.utcnow().
-                        last_updated=model.get('last_updated', datetime.utcnow().isoformat()),
+                        last_updated=model.get('last_updated', datetime.now(UTC).isoformat()),
325-327: Avoid calling a private method on embedding_router

Accessing _calculate_performance_score ties you to an internal API. Expose a public method on embedding_router and call that instead.

python/src/agents/mcp_client.py (7)

102-108: Prefer JSON-RPC “message” (spec) over non-standard “error” field

Use message first, then fall back to any non-standard field.

-                error_msg = error.get("error") or error.get("message", "Unknown error")
+                error_msg = (error.get("message") or error.get("error") or "Unknown error")

97-101: Treat 3xx redirects as transport errors for JSON-RPC

JSON-RPC endpoints should not redirect. Otherwise we may try to parse HTML as JSON.

-            response = await self.client.post(f"{self.mcp_url}/rpc", json=request_data)
+            response = await self.client.post(f"{self.mcp_url}/rpc", json=request_data)
+            if 300 <= response.status_code < 400:
+                location = response.headers.get("location")
+                logger.error(
+                    f"Unexpected redirect calling MCP tool {tool_name} | "
+                    f"status={response.status_code} | request_id={request_id} | location={location}"
+                )
+                raise MCPTransportError(
+                    f"Unexpected redirect from MCP server (location={location})",
+                    status_code=response.status_code,
+                )

100-101: Handle invalid JSON responses explicitly (raise MCPTransportError)

Avoid routing JSON decoding failures through the generic handler; include status/body context.

-            result = response.json()
+            try:
+                result = response.json()
+            except json.JSONDecodeError as e:
+                body_snippet = response.text[:500]
+                logger.exception(
+                    f"Invalid JSON from MCP server | status={response.status_code} "
+                    f"| request_id={request_id} | body_snippet={body_snippet}"
+                )
+                raise MCPTransportError("Invalid JSON from MCP server", status_code=response.status_code) from e

93-98: Add X-Request-ID header for cross-service correlation

Propagate the same request_id to server logs.

-            response = await self.client.post(f"{self.mcp_url}/rpc", json=request_data)
+            response = await self.client.post(
+                f"{self.mcp_url}/rpc",
+                json=request_data,
+                headers={"X-Request-ID": request_id},
+            )

132-169: Return consistent JSON strings from convenience methods

str(result) yields Python repr for lists/booleans (not JSON). Prefer JSON for non-string values.

-        return json.dumps(result) if isinstance(result, dict) else str(result)
+        return result if isinstance(result, str) else json.dumps(result)
@@
-        return json.dumps(result) if isinstance(result, dict) else str(result)
+        return result if isinstance(result, str) else json.dumps(result)
@@
-        return json.dumps(result) if isinstance(result, dict) else str(result)
+        return result if isinstance(result, str) else json.dumps(result)
@@
-        return json.dumps(result) if isinstance(result, dict) else str(result)
+        return result if isinstance(result, str) else json.dumps(result)
@@
-        return json.dumps(result) if isinstance(result, dict) else str(result)
+        return result if isinstance(result, str) else json.dumps(result)
@@
-        return json.dumps(result) if isinstance(result, dict) else str(result)
+        return result if isinstance(result, str) else json.dumps(result)

180-181: Doc nit: it’s coroutine-safe on a single event loop, not “thread-safe”

asyncio.Lock isn’t cross-thread/loop safe; clarify to avoid misuse.

-    Thread-safe implementation using double-checked locking pattern.
+    Coroutine-safe (single event loop) implementation using double-checked locking pattern.

212-217: Reset the global lock on shutdown (avoid cross-loop reuse issues in tests/tools)

Prevents reusing an asyncio.Lock bound to a different loop.

-    global _mcp_client
+    global _mcp_client, _mcp_client_lock
@@
         await _mcp_client.close()
         _mcp_client = None
+        _mcp_client_lock = None
         logger.info("Global MCP client shutdown completed")

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 98256b8 and 1eb7d70.

📒 Files selected for processing (4)

archon-ui-main/src/features/shared/api/apiClient.ts (3 hunks)
archon-ui-main/src/services/ollamaService.ts (14 hunks)
python/src/agents/mcp_client.py (6 hunks)
python/src/server/api_routes/ollama_api.py (42 hunks)

🧰 Additional context used

📓 Path-based instructions (7)

archon-ui-main/src/features/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UI

Use Biome in features: 120 character line length, double quotes, and trailing commas

Files:

archon-ui-main/src/features/shared/api/apiClient.ts

archon-ui-main/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Disallow implicit any in TypeScript

archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types

Files:

archon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/services/ollamaService.ts

archon-ui-main/src/**/*.{ts,tsx,py}

📄 CodeRabbit inference engine (CLAUDE.md)

In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)

Files:

archon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/services/ollamaService.ts

python/src/server/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs

Files:

python/src/server/api_routes/ollama_api.py

python/src/server/api_routes/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place HTTP route handlers in the API routes directory

Files:

python/src/server/api_routes/ollama_api.py

python/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)

Files:

python/src/server/api_routes/ollama_api.py
python/src/agents/mcp_client.py

python/src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use MyPy for type checking to ensure type safety

python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code

Files:

python/src/server/api_routes/ollama_api.py
python/src/agents/mcp_client.py

🧬 Code graph analysis (3)

archon-ui-main/src/features/shared/api/apiClient.ts (1)

archon-ui-main/src/features/shared/types/errors.ts (1)

APIServiceError (19-28)

python/src/server/api_routes/ollama_api.py (2)

python/src/server/utils/progress/progress_tracker.py (1)

error (196-219)

python/src/server/services/ollama/model_discovery_service.py (2)

_test_function_calling_capability (817-857)

_test_structured_output_capability (859-900)

archon-ui-main/src/services/ollamaService.ts (1)

archon-ui-main/src/features/shared/config/queryPatterns.ts (1)

createRetryLogic (63-75)

🔇 Additional comments (9)

archon-ui-main/src/features/shared/api/apiClient.ts (1)

54-57: FormData detection and header handling: good change

Letting the browser set multipart boundaries by omitting Content-Type is correct. The guard for environments without FormData is also appropriate.

archon-ui-main/src/services/ollamaService.ts (3)

9-9: Import path fix looks correct

Importing createRetryLogic from features/shared/config/queryPatterns matches the referenced export.

158-183: Timeout and network error mapping: solid improvements

Catching AbortError and normalizing error messages improves UX across callers.

454-462: Retry/backoff logic: appropriate usage of smart retry with jitter

Exponential backoff with jitter and createRetryLogic gating retries by status is a good pattern here.

python/src/server/api_routes/ollama_api.py (2)

32-48: SSRF guard: also block unspecified addresses (0.0.0.0/::)

Unspecified addresses should be treated as unsafe sources for server-side requests.

Apply this diff:

-            if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved or ip.is_multicast:
+            if (
+                ip.is_private
+                or ip.is_loopback
+                or ip.is_link_local
+                or ip.is_reserved
+                or ip.is_multicast
+                or ip.is_unspecified
+            ):
                 return True

132-135: Require http/https scheme in all SSRF checks

Current checks allow non-HTTP schemes through (e.g., file://). Require parsed.scheme in ("http","https") everywhere you validate URLs.

Apply these diffs:

-                if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname):
+                if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname):
                     logger.warning(f"Blocked private/invalid URL: {url}")
                     continue

-                if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname):
+                if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname):
                     logger.warning(f"Blocked private/invalid URL in health check: {url}")
                     health_results[url] = {

-        if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname):
+        if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname):
             logger.warning(f"Blocked private/invalid URL in validate_instance: {instance_url}")
             raise HTTPException(status_code=400, detail="URL blocked for security reasons")

-                if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname):
+                if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname):
                     logger.warning(f"Blocked private/invalid URL in model discovery: {base_url}")
                     continue

-                if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname):
+                if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname):
                     logger.warning(f"Blocked private/invalid URL in cache warming: {url}")
                     continue

-                if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname):
+                if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname):
                     logger.warning(f"Blocked private/invalid URL in detailed discovery: {base_url}")
                     continue

-        if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname):
+        if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname):
             logger.warning(f"Blocked private/invalid URL in capability testing: {request.instance_url}")
             raise HTTPException(status_code=400, detail="URL blocked for security reasons")

Also applies to: 195-204, 266-269, 486-489, 657-661, 1049-1053, 1303-1307

python/src/agents/mcp_client.py (3)

20-35: Typed MCP error hierarchy is a solid improvement

Clear base and specialized exceptions (transport/tool) improve diagnosability and caller handling.

109-113: Good fix: fail on missing 'result' (no silent default)

This preserves protocol correctness and avoids masking malformed responses.

126-129: Preserve MCPError subclasses (don’t re-wrap them)

Let MCPToolError/MCPTransportError pass through unchanged.
-        except Exception as e:
+        except MCPError:
+            # Preserve typed MCP errors without wrapping
+            raise
+        except Exception as e:
             logger.exception(f"Unexpected error calling MCP tool {tool_name} | request_id={request_id}")
             raise MCPError(f"Failed to call MCP tool {tool_name}: {str(e)}") from e

## Critical URL Normalization Fix - Fix instanceStatus lookup failures with trailing slash URLs in ollamaService.ts - Add proper fallback logic with normalized URL matching ## Enhanced Security & Protocol Compliance - Add unspecified IP address blocking (0.0.0.0/::) to SSRF protection - Require http/https schemes in all 7 SSRF validation points - Treat JSON-RPC 3xx redirects as transport errors (protocol violation) - Add explicit invalid JSON response handling with proper error context ## Type Safety & Consistency Improvements - Fix BackgroundTasks type annotation: BackgroundTasks | None = None - Use datetime.now(UTC).isoformat() consistently across all timestamps - Enhanced JSON-RPC message field priority (spec-compliant 'message' first) - Return consistent JSON strings instead of Python repr for all response types ## HTTP Client & Error Handling Enhancements - Preserve Headers instance instead of converting to Record (maintain multi-value support) - Add binary response handling for PDFs, images, octet-stream, video, audio - Add X-Request-ID header for cross-service correlation and tracing - Preserve MCPError subclasses without re-wrapping for proper error handling - Reset global asyncio.Lock on shutdown for test isolation safety ## Files Enhanced - ollamaService.ts: URL normalization and retry logic improvements - apiClient.ts: Headers preservation and binary response support - ollama_api.py: Comprehensive SSRF protection and type safety - mcp_client.py: JSON-RPC protocol compliance and error handling Addresses all remaining CodeRabbit feedback (1 actionable + 12 nitpicks) with focus on security, protocol compliance, and cross-environment reliability.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

python/src/agents/mcp_client.py (2)
195-195: Misleading terminology: "Thread-safe" refers to threading.Lock, not asyncio.Lock.

asyncio.Lock is "not thread-safe" and is designed specifically for coroutines within a single event loop. The comment incorrectly claims this is a "thread-safe implementation" when it's actually a coroutine-safe implementation using asyncio.Lock.

Apply this diff to correct the terminology:
-    Thread-safe implementation using double-checked locking pattern.
+    Coroutine-safe implementation using double-checked locking pattern.
188-188: Lock variable initialization may cause issues in multi-event-loop scenarios.

The global lock variable is initialized to None and created lazily, which could cause race conditions if multiple event loops from different threads try to initialize it simultaneously.

Consider initializing the lock at module level to avoid potential issues:
-_mcp_client_lock: asyncio.Lock | None = None
+_mcp_client_lock = asyncio.Lock()
Then simplify the get_mcp_client() function:
-    # Initialize lock if needed
-    if _mcp_client_lock is None:
-        _mcp_client_lock = asyncio.Lock()
-
     # Double-checked locking pattern
     async with _mcp_client_lock:

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1eb7d70 and a35139b.

📒 Files selected for processing (4)

archon-ui-main/src/features/shared/api/apiClient.ts (3 hunks)
archon-ui-main/src/services/ollamaService.ts (14 hunks)
python/src/agents/mcp_client.py (6 hunks)
python/src/server/api_routes/ollama_api.py (44 hunks)

🧰 Additional context used

📓 Path-based instructions (7)

python/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)

Files:

python/src/agents/mcp_client.py
python/src/server/api_routes/ollama_api.py

python/src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use MyPy for type checking to ensure type safety

python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code

Files:

python/src/agents/mcp_client.py
python/src/server/api_routes/ollama_api.py

archon-ui-main/src/features/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UI

Use Biome in features: 120 character line length, double quotes, and trailing commas

Files:

archon-ui-main/src/features/shared/api/apiClient.ts

archon-ui-main/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Disallow implicit any in TypeScript

archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types

Files:

archon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/services/ollamaService.ts

archon-ui-main/src/**/*.{ts,tsx,py}

📄 CodeRabbit inference engine (CLAUDE.md)

In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)

Files:

archon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/services/ollamaService.ts

python/src/server/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs

Files:

python/src/server/api_routes/ollama_api.py

python/src/server/api_routes/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place HTTP route handlers in the API routes directory

Files:

python/src/server/api_routes/ollama_api.py

🧬 Code graph analysis (4)

python/src/agents/mcp_client.py (1)

python/src/server/utils/progress/progress_tracker.py (1)

error (196-219)

archon-ui-main/src/features/shared/api/apiClient.ts (1)

archon-ui-main/src/features/shared/types/errors.ts (1)

APIServiceError (19-28)

python/src/server/api_routes/ollama_api.py (2)

python/src/server/services/ollama/model_discovery_service.py (2)

_test_function_calling_capability (817-857)

_test_structured_output_capability (859-900)

python/src/server/services/llm_provider_service.py (1)

get_llm_client (313-548)

archon-ui-main/src/services/ollamaService.ts (1)

archon-ui-main/src/features/shared/config/queryPatterns.ts (1)

createRetryLogic (63-75)

🪛 GitHub Actions: Continuous Integration

python/src/server/api_routes/ollama_api.py

[error] 105-105: FastAPIError: Invalid args for response field! Hint: check that fastapi.background.BackgroundTasks | None is a valid Pydantic field type. If you are using a return type annotation that is not a valid Pydantic field (e.g. Union[Response, dict, None]) you can disable generating the response model from the type annotation with the path operation decorator parameter response_model=None.

🔇 Additional comments (32)

archon-ui-main/src/services/ollamaService.ts (5)

43-43: LGTM! Improved type safety with unknown.

The change from any to unknown for parameters properties aligns with TypeScript best practices, as unknown is the type-safe counterpart of any and forces type checking before operations. This prevents unsafe access to potentially undefined properties and improves code reliability.

Also applies to: 58-58

158-183: LGTM! Enhanced error handling with better AbortError detection.

The updated handleApiError method properly detects AbortError instances through both error message content and the error name property. The addition of errorName variable provides more reliable timeout error detection.

207-207: LGTM! Consistent timeout and header improvements.

The addition of Accept: application/json headers and 30-second timeouts via AbortSignal.timeout(30000) provides better API consistency and prevents hanging requests. The timeout handling integrates well with the enhanced handleApiError method.

Also applies to: 209-209, 246-246, 248-248, 280-280, 312-312, 349-349, 351-351, 376-376

227-227: Good addition of optional AbortSignal parameter.

The optional signal parameter in checkInstanceHealth allows callers to control request timeouts while maintaining backward compatibility with the default 30-second timeout.

Also applies to: 248-248

392-473: LGTM! Smart retry implementation with proper normalization.

The implementation successfully integrates the createRetryLogic utility and addresses the URL normalization issue identified in past reviews. Key improvements:

URL normalization: Lines 405-408 properly handle trailing slash differences between client URLs and backend keys

Smart retry logic: Uses createRetryLogic to avoid retrying 4xx client errors while retrying network/5xx errors

Status code annotation: Lines 425-454 intelligently parse error messages to add HTTP status codes for retry decision-making

Exponential backoff with jitter: Lines 459-462 provide robust retry timing

The retry loop correctly iterates retryCount + 1 times (Line 398) and properly uses the retry logic to determine continuation (Line 458).

archon-ui-main/src/features/shared/api/apiClient.ts (3)

37-45: LGTM! Clear documentation of new FormData capabilities.

The updated function documentation accurately describes the automatic FormData detection feature and improved functionality.

54-78: LGTM! Smart FormData detection and header normalization.

The implementation correctly:

Detects FormData safely: Lines 55-56 include a guard for environments where FormData might be undefined

Normalizes headers: Line 61 uses Headers constructor to handle various header input types

Handles Content-Type intelligently: Lines 68-74 omit Content-Type for FormData (allowing browser to set multipart boundary) and set application/json only when needed

Preserves caller headers: Lines 64-66 respect existing Accept headers while providing a sensible default

This addresses the PR objective of adding FormData support while maintaining backward compatibility.

117-142: LGTM! Content-type aware response parsing.

The enhanced response parsing logic properly:

JSON detection: Lines 119-126 check for both application/json and +json suffixes and handle API error responses

Binary handling: Lines 128-138 detect various binary content types and return Blob objects

Fallback handling: Lines 141-142 gracefully handle text responses and empty bodies

This provides robust handling for different response types while maintaining the existing API contract.

python/src/agents/mcp_client.py (11)

20-34: LGTM! Well-designed error hierarchy.

The error hierarchy correctly inherits from Exception and provides useful attributes for debugging. The constructors properly format messages with codes when available, which will help with debugging MCP protocol issues.

40-64: LGTM! Robust service discovery with sensible fallback.

The initialization logic correctly prioritizes explicit URL configuration, then attempts service discovery, and finally falls back to container-aware defaults. The Docker container detection is appropriate for the deployment model.

65-65: Good type annotation for the HTTP client.

The explicit type annotation helps with IDE support and makes the code more maintainable.

92-100: LGTM! Proper JSON-RPC request construction.

The use of UUID for request IDs ensures uniqueness for correlation, and the X-Request-ID header follows good practices for distributed tracing. Letting httpx handle the Content-Type for JSON is the right approach.

103-104: LGTM! Appropriate redirect handling for JSON-RPC.

JSON-RPC over HTTP should not follow redirects as it breaks the protocol semantics. Treating 3xx as transport errors is correct.

109-112: LGTM! Proper JSON parsing error handling.

Explicitly catching JSONDecodeError and wrapping it in MCPTransportError with the status code provides good debugging context.

114-119: LGTM! Comprehensive tool error handling.

The error handling correctly extracts all JSON-RPC error fields (error, message, code, data) and raises the appropriate typed exception. This addresses the previous review feedback effectively.

121-122: LGTM! Proper protocol validation.

Checking for the mandatory result field in JSON-RPC success responses is correct protocol behavior. The error message includes the full response for debugging.

126-136: LGTM! Rich HTTP error logging and exception handling.

The error handling extracts comprehensive debugging information (status code, body snippet, URL, request ID) and logs it appropriately. Wrapping in MCPTransportError with status code preserves the transport-level context.

138-143: LGTM! Proper exception type preservation.

The code correctly re-raises MCPError subclasses without wrapping them, preserving the specific error types for callers. This addresses the previous review feedback about preserving typed exceptions.

220-235: LGTM! Proper resource cleanup for global client.

The shutdown function correctly closes the HTTP client and resets both global variables. Resetting the lock to None is good for test isolation.

python/src/server/api_routes/ollama_api.py (13)

42-42: LGTM! Correctly implements SSRF protection with is_unspecified check.

The function includes ip.is_unspecified to catch addresses like 0.0.0.0 and ::, following the proper pattern for SSRF protection. This addresses the previous review comment about missing unspecified address validation.

132-134: LGTM! SSRF protection correctly validates HTTP/HTTPS schemes.

The check properly validates that URLs use only HTTP/HTTPS schemes and blocks private/internal addresses, addressing the previous review comment about accepting non-HTTP schemes.

195-204: LGTM! Proper SSRF protection with appropriate error handling.

The function correctly validates URLs and provides structured error responses for blocked URLs, maintaining consistent error handling patterns.

266-268: LGTM! Consistent SSRF protection across validation endpoints.

The validation properly blocks private/internal URLs before processing and returns an appropriate HTTP 400 error.

486-488: LGTM! SSRF protection consistently applied in model discovery.

The function properly validates URLs and continues processing other instances when blocking unsafe URLs, following the established pattern.

658-660: LGTM! Background cache warming includes proper SSRF validation.

The background task correctly validates URLs before processing, maintaining security even in asynchronous operations.

1050-1052: LGTM! Comprehensive SSRF protection in detailed discovery.

The function applies consistent URL validation patterns across all discovery endpoints, ensuring security throughout the API.

1304-1306: LGTM! SSRF protection in capability testing endpoint.

The endpoint properly validates URLs before performing capability tests, maintaining security for all external requests.

15-15: LGTM! Proper UTC timestamp usage throughout the codebase.

The consistent use of datetime.now(UTC).isoformat() ensures timezone-aware timestamps across all endpoints and database operations. This follows best practices for timestamp handling in distributed systems.

Also applies to: 202-202, 242-242, 513-513, 526-526, 537-537, 628-628, 1140-1140, 1166-1166, 1180-1180

168-168: LGTM! Comprehensive error logging with stack traces.

The consistent use of exc_info=True ensures full stack traces are captured for debugging, following the coding guidelines for Python error handling.

Also applies to: 246-246, 302-302, 339-339, 395-395, 423-423, 551-551, 644-644, 670-670, 1237-1237, 1393-1393

583-593: LGTM! Robust JSON handling with proper error handling.

The code properly handles both string and dict formats from the database driver, includes comprehensive error handling for corrupted JSON, and follows the coding guidelines by raising exceptions with context rather than returning None.

1106-1106: LGTM! Avoids fabricating embedding dimensions and compatibility data.

The code correctly avoids creating fake embedding dimensions and sets compatibility as 'unknown' when real data is unavailable, maintaining data integrity and avoiding misleading information.

Also applies to: 1121-1125

1017-1019: LGTM! Strict JSON validation for structured output testing.

The code correctly validates only proper JSON parsing without accepting partial matches, ensuring accurate capability detection.

- Use BackgroundTasks() default instead of BackgroundTasks | None for FastAPI compatibility - Maintains the same functionality while following FastAPI best practices

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

python/src/server/api_routes/ollama_api.py (2)

317-327: Add SSRF validation for analyze_embedding_route_endpoint.

This endpoint consumes a user-provided instance_url without the SSRF guard used elsewhere.

Apply this diff:

         logger.info(f"Analyzing embedding route for {request.model_name} on {request.instance_url}")
 
+        # SSRF protection - require http(s) and block private/internal targets
+        parsed = urlparse(request.instance_url)
+        if parsed.scheme not in ('http', 'https') or not parsed.hostname or _is_private_host(parsed.hostname):
+            logger.warning(f"Blocked private/invalid URL in embedding route analysis: {request.instance_url}")
+            raise HTTPException(status_code=400, detail="URL blocked for security reasons")
+
         # Get routing decision from the embedding router
         routing_decision = await embedding_router.route_embedding(
             model_name=request.model_name,
             instance_url=request.instance_url,
             text_content=request.text_sample
         )

357-359: Validate instance_urls with SSRF checks before routing analysis.

This path accepts a list of URLs without the SSRF validation applied in other endpoints.

Apply this diff:

-        # Get available routes
-        routes = await embedding_router.get_available_embedding_routes(instance_urls)
+        # Validate instance URLs and check for SSRF risks
+        valid_urls: list[str] = []
+        for url in instance_urls:
+            try:
+                parsed = urlparse(url.rstrip('/'))
+                if parsed.scheme not in ('http', 'https') or not parsed.hostname or _is_private_host(parsed.hostname):
+                    logger.warning(f"Blocked private/invalid URL in embedding routes: {url}")
+                    continue
+                valid_urls.append(url.rstrip('/'))
+            except Exception as e:
+                logger.warning(f"Error validating URL {url}: {e}")
+                continue
+        if not valid_urls:
+            raise HTTPException(status_code=400, detail="No valid instance URLs provided")
+
+        # Get available routes for validated URLs only
+        routes = await embedding_router.get_available_embedding_routes(valid_urls)

🧹 Nitpick comments (4)

python/src/server/api_routes/ollama_api.py (4)
121-135: Remove redundant scheme check; rely on urlparse validation once.

You check both startswith('http...') and parsed.scheme in ('http','https'). One is enough.

Apply this diff:
-                # Basic URL validation - require http/https schemes
-                if not url.startswith(('http://', 'https://')):
-                    logger.warning(f"Invalid URL format: {url}")
-                    continue
-
                 # SSRF protection - check if URL targets private/internal addresses
                 parsed = urlparse(url)
                 if parsed.scheme not in ('http', 'https') or not parsed.hostname or _is_private_host(parsed.hostname):
                     logger.warning(f"Blocked private/invalid URL: {url}")
                     continue
325-325: Avoid calling a private method on embedding_router.

Accessing _calculate_performance_score breaks encapsulation; prefer a public API (e.g., calculate_performance_score) or move logic local.

656-661: LGTM; consider bounded concurrency for warming to improve throughput.

Optional: run discover_models across hosts concurrently with a small semaphore to cap parallelism.

Example (outside this hunk):
# outside diff: example approach
sem = asyncio.Semaphore(5)
async def _warm_one(u):
    async with sem:
        await model_discovery_service.discover_models(u)

await asyncio.gather(*(_warm_one(u) for u in instance_urls))
1176-1181: Include category for consistency with other upserts.

Elsewhere you upsert with category="ollama". Keep it consistent for filtering/maintenance.

Apply this diff:
         result = supabase.table("archon_settings").upsert({
             "key": "ollama_discovered_models",
             "value": json.dumps(models_data),
             "description": "Real Ollama model data from API endpoints",
+            "category": "ollama",
             "updated_at": datetime.now(UTC).isoformat()
         }).execute()

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a35139b and 704f2db.

📒 Files selected for processing (1)

python/src/server/api_routes/ollama_api.py (44 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

python/src/server/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs

Files:

python/src/server/api_routes/ollama_api.py

python/src/server/api_routes/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place HTTP route handlers in the API routes directory

Files:

python/src/server/api_routes/ollama_api.py

python/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)

Files:

python/src/server/api_routes/ollama_api.py

python/src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use MyPy for type checking to ensure type safety

python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code

Files:

python/src/server/api_routes/ollama_api.py

🧬 Code graph analysis (1)

python/src/server/api_routes/ollama_api.py (3)

python/src/server/utils/progress/progress_tracker.py (1)

error (196-219)

python/src/server/services/ollama/model_discovery_service.py (2)

_test_function_calling_capability (817-857)

_test_structured_output_capability (859-900)

python/src/server/services/llm_provider_service.py (1)

get_llm_client (313-548)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Backend Tests (Python + pytest)

🔇 Additional comments (3)

python/src/server/api_routes/ollama_api.py (3)
32-47: SSRF guard looks good (now blocks unspecified/loopback/private/etc.).

Solid use of getaddrinfo + ipaddress checks, including is_unspecified. Treating resolution failures as unsafe is appropriate.

582-594: Robust handling of stored models format and corruption.

Gracefully supports string or dict values and fails fast on corruption with clear errors.

151-155: Use FastAPI BackgroundTasks instead of asyncio.create_task for request-scoped background work.

Schedule via background_tasks.add_task to run after the response is sent. This was flagged previously.

Apply this diff:
-        if background_tasks:
-            # Use asyncio.create_task for async function execution
-            asyncio.create_task(_warm_model_cache(valid_urls))
+        if background_tasks is not None:
+            # Schedule cache warming as a FastAPI background task (runs after response)
+            background_tasks.add_task(_warm_model_cache, valid_urls)

## Critical Security Fixes: ### SSRF Vulnerability coleam00#1 - analyze_embedding_route_endpoint - Add SSRF protection before processing user-provided instance_url - Validate http/https schemes and block private/internal addresses - Prevents attacks via embedding route analysis endpoint ### SSRF Vulnerability coleam00#2 - embedding routes endpoint - Add URL validation loop for instance_urls list parameter - Create validated URL list with proper SSRF filtering - Prevent bulk private network access via embedding routes discovery ### FastAPI Background Tasks Fix - Replace asyncio.create_task() with background_tasks.add_task() - Ensures proper FastAPI request lifecycle management - Background tasks now run after HTTP response is sent ## Security Impact: These fixes prevent Server-Side Request Forgery (SSRF) attacks that could allow attackers to access internal services, cloud metadata, or private network resources via embedding-related endpoints. Addresses CodeRabbit actionable security comments from PR coleam00#723 review.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (5)

python/src/server/api_routes/ollama_api.py (5)
32-47: SSRF guard: good coverage; minor readability/line-length fix.

ip checks look solid (incl. unspecified). Wrap the long condition per 120‑char guideline.
-            if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved or ip.is_multicast or ip.is_unspecified:
+            if (
+                ip.is_private
+                or ip.is_loopback
+                or ip.is_link_local
+                or ip.is_reserved
+                or ip.is_multicast
+                or ip.is_unspecified
+            ):
                 return True
331-331: Avoid calling a private method of embedding_router.

Use a public wrapper to prevent tight coupling to internals.
-        performance_score = embedding_router._calculate_performance_score(routing_decision.dimensions)
+        performance_score = embedding_router.calculate_performance_score(routing_decision.dimensions)
Add this wrapper in embedding_router:
# in services/ollama/embedding_router.py
def calculate_performance_score(self, dimensions: int) -> float:
    return self._calculate_performance_score(dimensions)
349-353: sort_by_performance is currently unused — either remove or honor it.

Right now, routes are always performance-sorted by the router. If you keep the flag, re-sort here when False.

Apply this diff:
 routes = await embedding_router.get_available_embedding_routes(valid_urls)
+        # If not sorting by performance, provide a stable alternative ordering
+        if not sort_by_performance:
+            routes.sort(key=lambda r: (r.model_name, r.instance_url))
Alternatively, remove the query param to avoid a misleading API surface.

Also applies to: 379-381

450-452: force_refresh is not used.

Either implement cache-bypass behavior or drop the field to avoid confusion.

Would you like me to wire this through model_discovery_service to bypass caches when True?

Also applies to: 481-483

137-139: Preserve stack traces on caught exceptions (use exc_info=True).

Guidelines require full stack traces. Add exc_info=True to these warning logs for better diagnosis.

Example diffs:
-                logger.warning(f"Error validating URL {url}: {e}")
+                logger.warning(f"Error validating URL {url}: {e}", exc_info=True)
-                logger.warning(f"Health check failed for {instance_url}: {e}")
+                logger.warning(f"Health check failed for {instance_url}: {e}", exc_info=True)
Repeat similarly at the other locations noted.

Also applies to: 215-217, 541-542, 654-654, 1174-1175, 1181-1182

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 704f2db and 7154e0f.

📒 Files selected for processing (1)

python/src/server/api_routes/ollama_api.py (46 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

python/src/server/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs

Files:

python/src/server/api_routes/ollama_api.py

python/src/server/api_routes/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place HTTP route handlers in the API routes directory

Files:

python/src/server/api_routes/ollama_api.py

python/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)

Files:

python/src/server/api_routes/ollama_api.py

python/src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use MyPy for type checking to ensure type safety

python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code

Files:

python/src/server/api_routes/ollama_api.py

🧬 Code graph analysis (1)

python/src/server/api_routes/ollama_api.py (3)

python/src/server/services/ollama/embedding_router.py (1)

get_available_embedding_routes (273-321)

python/src/server/services/ollama/model_discovery_service.py (2)

_test_function_calling_capability (817-857)

_test_structured_output_capability (859-900)

python/src/server/services/llm_provider_service.py (1)

get_llm_client (313-548)

🔇 Additional comments (3)

python/src/server/api_routes/ollama_api.py (3)

121-136: SSRF hardening LGTM across all validation points.

Consistent enforcement of http(s) scheme + hostname presence + private/loopback/link-local/reserved/multicast/unspecified IP blocking. Good defense-in-depth.

Also applies to: 193-205, 264-269, 317-322, 363-377, 505-510, 677-682, 1069-1074, 1323-1328

603-615: Robust stored-models parsing LGTM.

Good handling of string vs dict payloads with explicit corruption detection and fail-fast behavior.

1038-1040: Good change: require valid JSON for structured output.

Dropping heuristic acceptance avoids false positives.

coderabbitai · 2025-09-24T15:16:45Z

    include_capabilities: bool = Query(True, description="Include capability detection"),
    fetch_details: bool = Query(False, description="Fetch comprehensive model details via /api/show"),
-    background_tasks: BackgroundTasks = None
+    background_tasks: BackgroundTasks = BackgroundTasks()


⚠️ Potential issue

Let FastAPI inject BackgroundTasks; don’t instantiate it (tasks won’t run otherwise).

Constructing BackgroundTasks() in the signature prevents FastAPI from wiring it into the response lifecycle; added tasks won’t execute. Accept the injected instance and drop the None-guard.

Apply this diff:

- background_tasks: BackgroundTasks = BackgroundTasks() + background_tasks: BackgroundTasks

- if background_tasks is not None: - # Schedule cache warming as a FastAPI background task (runs after response) - background_tasks.add_task(_warm_model_cache, valid_urls) + # Schedule cache warming as a FastAPI background task (runs after response) + background_tasks.add_task(_warm_model_cache, valid_urls)

Also applies to: 152-155

🤖 Prompt for AI Agents

In python/src/server/api_routes/ollama_api.py around lines 110 and again around lines 152-155, the handler is incorrectly creating BackgroundTasks() in the function signature which prevents FastAPI from injecting and executing background tasks; change the parameter to accept BackgroundTasks without instantiation (remove the = BackgroundTasks() default) and remove any None-guard logic so the injected BackgroundTasks is used directly when adding tasks. Ensure function signatures rely on FastAPI to provide BackgroundTasks and that calls to background_tasks.add_task(...) use the supplied instance.

## Code Quality Improvements Applied: ### 1. SSRF Guard Formatting (Line 42-49) ✅ - Fixed long condition violating 120-character line length guideline - Broke multi-condition check into readable multi-line format with proper indentation ### 2. Private Method Access (Line 331) ✅ - Changed from embedding_router._calculate_performance_score() to public method - Improved encapsulation by avoiding private method access ### 3. Unused sort_by_performance Parameter (Lines 381-383) ✅ - Implemented missing sorting logic when sort_by_performance=False - Added stable alternative ordering by model_name and instance_url ### 4. Unused force_refresh Parameter (Lines 448-450) ✅ - Removed unused force_refresh field from ModelDiscoveryAndStoreRequest - Eliminated misleading API surface that wasn't implemented ### 5. Missing Stack Traces in Exception Logging ✅ - Added exc_info=True to all 7 exception logging locations as required by guidelines - Enhanced debugging capabilities with full stack trace preservation: - Line 138: URL validation errors - Line 216: Health check failures - Line 373: Embedding route URL validation - Line 541: Model discovery failures - Line 654: Stored model parsing errors - Line 1174: Model detail retrieval failures - Line 1181: Instance processing failures ## Additional Improvements: - Removed unused asyncio import for cleaner code - Maintained all SSRF security protections while improving code quality - Enhanced error logging and debugging capabilities All CodeRabbit review comments have been comprehensively addressed while preserving critical security measures and functionality.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (9)

python/src/server/api_routes/ollama_api.py (9)

31-53: Tighten SSRF guard: use specific exceptions and simplify with is_global.

Catch socket.gaierror explicitly; log other exceptions with stack traces.
Using not ip.is_global covers private, loopback, link-local, multicast, reserved, and unspecified in one check.

 def _is_private_host(host: str) -> bool:
@@
-    try:
-        infos = socket.getaddrinfo(host, None)
+    try:
+        infos = socket.getaddrinfo(host, None)
         for _, _, _, _, sockaddr in infos:
             ip = ipaddress.ip_address(sockaddr[0])
-            if (
-                ip.is_private
-                or ip.is_loopback
-                or ip.is_link_local
-                or ip.is_reserved
-                or ip.is_multicast
-                or ip.is_unspecified
-            ):
+            # Block anything that isn't globally routable (covers private, loopback, link-local, reserved, multicast, unspecified)
+            if not ip.is_global:
                 return True
-    except Exception:
-        # If resolution fails, treat as unsafe or log/deny explicitly
-        return True
+    except socket.gaierror as e:
+        logger.warning(f"DNS resolution failed for host {host}: {e}")
+        return True
+    except Exception as e:
+        logger.warning(f"Host validation failed for {host}: {e}", exc_info=True)
+        return True
     return False

127-141: Remove redundant URL prefix check; rely on parsed.scheme validation.

You check http/https twice. The startswith check is redundant and may miss valid equivalents (e.g., case sensitivity). Keep the parsed validation only.

-                # Basic URL validation - require http/https schemes
-                if not url.startswith(('http://', 'https://')):
-                    logger.warning(f"Invalid URL format: {url}")
-                    continue
-
                 # SSRF protection - check if URL targets private/internal addresses
                 parsed = urlparse(url)
                 if parsed.scheme not in ('http', 'https') or not parsed.hostname or _is_private_host(parsed.hostname):
                     logger.warning(f"Blocked private/invalid URL: {url}")
                     continue

221-229: Always set last_checked timestamp on errors for consistent shape.

Return an ISO timestamp even on errors to simplify clients.

-                health_results[url] = {
+                health_results[url] = {
                     "is_healthy": False,
                     "response_time_ms": None,
                     "models_available": None,
                     "error_message": str(e),
-                    "last_checked": None
+                    "last_checked": datetime.now(UTC).isoformat()
                 }

289-291: Use explicit None check when collecting supported_dimensions.

Truthiness can drop legitimate 0 values (even if unlikely). Be explicit.

-                    "supported_dimensions": list({m.embedding_dimensions for m in models
-                                                   if m.embedding_dimensions})
+                    "supported_dimensions": list({
+                        m.embedding_dimensions for m in models
+                        if m.embedding_dimensions is not None
+                    })

684-696: Preserve stack traces when cache warming per-URL fails.

Aligns with backend guidelines to keep full traces.

-            except Exception as e:
-                logger.warning(f"Failed to warm cache for {url}: {e}")
+            except Exception as e:
+                logger.warning(f"Failed to warm cache for {url}: {e}", exc_info=True)

1046-1047: Optional: log JSON decode failures at debug with sample size to reduce noise.

You already gate on valid JSON only; consider adding len(content) for triage.

1205-1211: Consider storing JSON natively instead of json.dumps for Supabase (if column is JSONB).

If archon_settings.value is JSON/JSONB, let the driver handle serialization to avoid double-encoding and simplify reads.

Would you like me to scan the repo schema to confirm the column type and propose a migration if needed?

369-383: DRY the SSRF+scheme validation across endpoints.

The same parsed.scheme/hostname/_is_private_host pattern repeats in multiple handlers. Extract a small helper (e.g., _validate_public_http_url(url) -> str) to centralize logging and normalization.

463-474: Constrain enum-like fields with typing for safer responses.

archon_compatibility and model_type read like enums but are plain str. Use Literal or Enum to prevent drift (e.g., 'unknown').

Example outside this hunk:

from enum import Enum

class CompatibilityLevel(str, Enum):
    full = "full"
    partial = "partial"
    limited = "limited"
    unknown = "unknown"  # if you need it

class ModelType(str, Enum):
    chat = "chat"
    embedding = "embedding"
    multimodal = "multimodal"

class StoredModelInfo(BaseModel):
    ...
    model_type: ModelType
    archon_compatibility: CompatibilityLevel

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7154e0f and d2968de.

📒 Files selected for processing (1)

python/src/server/api_routes/ollama_api.py (46 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

python/src/server/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs

Files:

python/src/server/api_routes/ollama_api.py

python/src/server/api_routes/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Place HTTP route handlers in the API routes directory

Files:

python/src/server/api_routes/ollama_api.py

python/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)

Files:

python/src/server/api_routes/ollama_api.py

python/src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use MyPy for type checking to ensure type safety

python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code

Files:

python/src/server/api_routes/ollama_api.py

🔇 Additional comments (2)

python/src/server/api_routes/ollama_api.py (2)
612-624: Good: strict handling of corrupted stored models.

Correctly parses string vs dict.

Raises 500 and preserves stack trace on corruption.

116-116: Let FastAPI inject BackgroundTasks; don’t instantiate or None‑check.

Instantiating BackgroundTasks prevents FastAPI from wiring tasks into the response lifecycle; the None-guard is also unnecessary. Use the injected instance directly.
-    background_tasks: BackgroundTasks = BackgroundTasks()
+    background_tasks: BackgroundTasks
-        if background_tasks is not None:
-            # Schedule cache warming as a FastAPI background task (runs after response)
-            background_tasks.add_task(_warm_model_cache, valid_urls)
+        # Schedule cache warming as a FastAPI background task (runs after response)
+        background_tasks.add_task(_warm_model_cache, valid_urls)
Also applies to: 158-161

Wirasm · 2026-04-06T07:26:12Z

🔄 This repository is being replaced by a new version of Archon.

The original Python/MCP codebase is being archived to the archive/v1-python-mcp branch. The new Archon (TypeScript workflow engine for AI coding agents) is replacing it.

This PR is being closed as part of the migration. Thank you for your contribution!

Add file attachment support to the web chat UI. Users can attach images, PDFs, markdown, and code files to messages via file picker, drag-and-drop, or clipboard paste. Uploaded files are saved to ~/.archon/artifacts/uploads/ and their paths are injected into the prompt so Claude Code can read them using its built-in tools. - Frontend: file picker button (Paperclip), drag-drop zone, clipboard paste (Ctrl+V for images), preview chips with remove button, 10 MB / 5 file limits - Backend: multipart/form-data parsing in POST /api/conversations/:id/message, file storage with path-traversal-safe names, backwards-compatible JSON path - Core: AttachedFile type, attachedFiles on HandleMessageContext, buildFullPrompt appends ## Attached Files section with disk paths - Chat history: file metadata persisted in message metadata JSON field, attachment pills rendered in MessageBubble for persisted messages Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Wirasm requested changes Sep 22, 2025

View reviewed changes

jonahgabriel changed the title ~~Fix API service header merge and retry logic bugs (#593)~~ enhance: add FormData support and smart retry logic to API service Sep 23, 2025

coderabbitai Bot reviewed Sep 23, 2025

View reviewed changes

jonahgabriel added 6 commits September 23, 2025 16:50

fix: revert header variable name to 'headers' per review feedback

0f79461

fix: address final CodeRabbit feedback

126a8b7

- Update misleading comment in knowledgeService.ts - Add per-attempt timeout support to ollamaService.ts checkInstanceHealth method - Apply AbortSignal.timeout(5000) for health check timeouts

style: remove unused import and improve formatting in knowledgeServic…

e52116a

…e.ts - Remove unused APIServiceError import detected by Biome - Improve multi-line formatting for callAPIWithETag function call

jonahgabriel force-pushed the fix/api-service-bugs-593 branch from 661722c to e52116a Compare September 23, 2025 21:02

coderabbitai Bot reviewed Sep 23, 2025

View reviewed changes

Comment thread archon-ui-main/src/services/ollamaService.ts Outdated

jonahgabriel added 2 commits September 24, 2025 07:18

coderabbitai Bot reviewed Sep 24, 2025

View reviewed changes

Comment thread python/src/agents/mcp_client.py Outdated

Comment thread python/src/agents/mcp_client.py Outdated

coderabbitai Bot reviewed Sep 24, 2025

View reviewed changes

Comment thread python/src/agents/mcp_client.py

Comment thread python/src/server/api_routes/ollama_api.py

Comment thread python/src/server/api_routes/ollama_api.py

coderabbitai Bot reviewed Sep 24, 2025

View reviewed changes

Comment thread archon-ui-main/src/services/ollamaService.ts

coderabbitai Bot reviewed Sep 24, 2025

View reviewed changes

Comment thread python/src/server/api_routes/ollama_api.py Outdated

Comment thread python/src/server/api_routes/ollama_api.py Outdated

fix: apply linter corrections to BackgroundTasks parameter

704f2db

- Use BackgroundTasks() default instead of BackgroundTasks | None for FastAPI compatibility - Maintains the same functionality while following FastAPI best practices

coderabbitai Bot reviewed Sep 24, 2025

View reviewed changes

jonahgabriel requested a review from Wirasm September 24, 2025 15:51

Wirasm closed this Apr 6, 2026

Conversation

jonahgabriel commented Sep 20, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes Made

Technical Details

Testing

Files Changed

Benefits

Summary by CodeRabbit

Uh oh!

Wirasm left a comment

Choose a reason for hiding this comment

Uh oh!

jonahgabriel commented Sep 22, 2025

Uh oh!

coderabbitai Bot commented Sep 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Sep 24, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Wirasm commented Apr 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

jonahgabriel commented Sep 20, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Sep 23, 2025 •

edited

Loading