enhance: add FormData support and smart retry logic to API service#723
enhance: add FormData support and smart retry logic to API service#723jonahgabriel wants to merge 14 commits intocoleam00:mainfrom
Conversation
Wirasm
left a comment
There was a problem hiding this comment.
Thank you for working on improving the API services! After reviewing the PR, I need to clarify a few things:
About the "Header Merge Bug"
Looking at the current main branch, there isn't actually a bug with the header variable. The code correctly declares const headers and uses headers in the fetch call (line 69). The variable naming is consistent and working as intended.
However, your PR does include some useful improvements:
- FormData Support: Adding automatic FormData detection is a nice enhancement. The original design intentionally excluded this (note the comment: "For file uploads or FormData requests, use fetch() directly"), but supporting it could improve DX.
- Smart Retry Logic: Using createRetryLogic to avoid retrying 4xx errors is a good pattern.
Could you please remove the headers var renaming, we want to keep heaers over mergedHeaders.
We could keep the form data detection, even though its not needed currently it could improve DX for future enhancements.
alternatively make a new pr that only addresses smart retry improvements
Also, please update the PR title and description to reflect that this is adding FormData support and improving retry logic, rather than fixing a header merge bug.
Thank you!
Will do. |
WalkthroughDelegates Ollama health checks to ollamaService with smart retries/timeouts, centralizes uploads via callAPIWithETag with FormData-aware headers and robust parsing, adds retry tests, hardens server Ollama endpoints against SSRF and uses UTC-aware timestamps, and introduces an MCP client error hierarchy and lifecycle helpers. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant UI as OllamaConfigurationPanel
participant Svc as ollamaService
participant Oll as Ollama Instance
UI->>Svc: testConnection(baseUrl, retryCount)
rect rgba(200,240,255,0.20)
note right of Svc #D6EAF8: createRetryLogic drives retry decisions
Svc->>Oll: health/validate request (AbortSignal timeout)
Oll-->>Svc: response or error (status/info)
alt Retryable (network/5xx/timeout)
Svc-->>Svc: decide & retry (limit/backoff)
else Non-retryable (4xx/abort)
Svc-->>UI: { isHealthy:false, error }
end
end
Svc-->>UI: { isHealthy, responseTime, error? }
sequenceDiagram
autonumber
participant Client as Feature Service
participant API as callAPIWithETag
participant Net as fetch()
Client->>API: request(url, { headers, body })
alt body is FormData
API-->>Net: fetch with headers (omit Content-Type)
else JSON body
API-->>Net: fetch with Content-Type: application/json
end
Net-->>API: Response
alt content-type includes "json"
API-->>Client: parsed JSON or APIServiceError
else 204/other/binary
API-->>Client: text/blob/undefined
end
sequenceDiagram
autonumber
participant Route as Server route
participant Guard as SSRF Guard
participant Ext as External Ollama
Route->>Guard: validate instance URL(s)
Guard->>Guard: resolve host → IP(s), check private/reserved
alt IP private/loopback/link-local/reserved
Guard-->>Route: Block / return sanitized non-healthy entry
else Public
Guard->>Ext: proceed with discovery/health
Ext-->>Guard: result
Guard-->>Route: response with UTC timestamps
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (4)
archon-ui-main/src/features/knowledge/services/knowledgeService.ts (1)
120-121: Fix misleading comment: callAPIWithETag here does not implement "smart retry".callAPIWithETag adds FormData handling and timeouts, but not retry logic. Update the comment to avoid confusion.
- // Use improved API service with proper FormData handling and smart retry logic + // Use API service with proper FormData handling and timeoutarchon-ui-main/src/features/shared/apiWithEtag.test.ts (1)
413-575: Great FormData coverage. Consider adding a test to ensure user-provided Content-Type is stripped for FormData.Right now we test omission by default. Add a case where the caller sets Content-Type to verify we remove it, matching browser requirements.
Apply this diff to add the test:
@@ describe("FormData Support", () => { @@ it("should preserve custom headers with FormData", async () => { @@ expect(options.headers).not.toHaveProperty("Content-Type"); }); + it("should remove user-provided Content-Type for FormData", async () => { + const mockResponse = { + ok: true, + status: 200, + json: () => Promise.resolve({ ok: true }), + headers: new Headers(), + }; + global.fetch = vi.fn().mockResolvedValue(mockResponse); + + const fd = new FormData(); + fd.append("file", new File(["x"], "x.txt")); + + await callAPIWithETag("/api/upload", { + method: "POST", + body: fd, + headers: { "Content-Type": "multipart/form-data" }, + }); + + const [, options] = (global.fetch as any).mock.calls[0]; + expect(options.headers).not.toHaveProperty("Content-Type"); + });archon-ui-main/src/features/shared/apiWithEtag.ts (1)
54-67: Strip any user-provided Content-Type when body is FormData; normalize headers via Headers to support HeadersInit.If callers pass Content-Type for FormData, uploads can break (missing multipart boundary). Also, current cast assumes plain objects. Normalize with Headers and delete Content-Type when isFormData.
Apply this diff:
- // Detect FormData to avoid setting Content-Type (browser sets multipart/form-data with boundary) - const isFormData = options.body instanceof FormData; + // Detect FormData to avoid setting Content-Type (browser sets multipart/form-data with boundary) + const isFormData = options.body instanceof FormData; @@ - const headers: Record<string, string> = { - Accept: "application/json", - ...(options.headers as Record<string, string> || {}), - ...(isFormData ? {} : { "Content-Type": "application/json" }), - }; + // Normalize headers and ensure Content-Type behavior is correct for FormData + const headersObj = new Headers(options.headers as HeadersInit | undefined); + headersObj.set("Accept", "application/json"); + if (isFormData) { + headersObj.delete("Content-Type"); + } else if (!headersObj.has("Content-Type")) { + headersObj.set("Content-Type", "application/json"); + } + const headers: Record<string, string> = Object.fromEntries(headersObj.entries());archon-ui-main/src/services/ollamaService.ts (1)
384-439: Add per-attempt timeouts to avoid hanging retries.checkInstanceHealth doesn’t use a signal/timeout, so a slow/unresponsive host can hang an attempt and block the loop. Pass a per-attempt AbortSignal (e.g., 5s) into checkInstanceHealth and have that method forward it to fetch.
Apply this diff in the selected range to pass a timeout signal:
- const healthResponse = await this.checkInstanceHealth([instanceUrl], false); + const healthResponse = await this.checkInstanceHealth([instanceUrl], false, AbortSignal.timeout(5000));And add signal support to checkInstanceHealth (outside this hunk):
// Update signature async checkInstanceHealth(instanceUrls: string[], includeModels = false, signal?: AbortSignal): Promise<InstanceHealthResponse> { // ... const response = await fetch(`${this.baseUrl}/api/ollama/instances/health?${params.toString()}`, { method: "GET", headers: { "Content-Type": "application/json" }, signal, }); // ... }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx(2 hunks)archon-ui-main/src/features/knowledge/services/knowledgeService.ts(1 hunks)archon-ui-main/src/features/shared/apiWithEtag.test.ts(1 hunks)archon-ui-main/src/features/shared/apiWithEtag.ts(2 hunks)archon-ui-main/src/features/shared/tests/queryPatterns.test.ts(1 hunks)archon-ui-main/src/services/ollamaService.ts(3 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
archon-ui-main/src/features/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UIUse Biome in features: 120 character line length, double quotes, and trailing commas
Files:
archon-ui-main/src/features/knowledge/services/knowledgeService.tsarchon-ui-main/src/features/shared/apiWithEtag.test.tsarchon-ui-main/src/features/shared/tests/queryPatterns.test.tsarchon-ui-main/src/features/shared/apiWithEtag.ts
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Disallow implicit any in TypeScript
archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types
Files:
archon-ui-main/src/features/knowledge/services/knowledgeService.tsarchon-ui-main/src/features/shared/apiWithEtag.test.tsarchon-ui-main/src/features/shared/tests/queryPatterns.test.tsarchon-ui-main/src/features/shared/apiWithEtag.tsarchon-ui-main/src/services/ollamaService.tsarchon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx
archon-ui-main/src/**/*.{ts,tsx,py}
📄 CodeRabbit inference engine (CLAUDE.md)
In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)
Files:
archon-ui-main/src/features/knowledge/services/knowledgeService.tsarchon-ui-main/src/features/shared/apiWithEtag.test.tsarchon-ui-main/src/features/shared/tests/queryPatterns.test.tsarchon-ui-main/src/features/shared/apiWithEtag.tsarchon-ui-main/src/services/ollamaService.tsarchon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx
archon-ui-main/src/**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Write frontend tests with Vitest and React Testing Library
Files:
archon-ui-main/src/features/shared/apiWithEtag.test.tsarchon-ui-main/src/features/shared/tests/queryPatterns.test.ts
🧬 Code graph analysis (5)
archon-ui-main/src/features/knowledge/services/knowledgeService.ts (1)
archon-ui-main/src/features/shared/apiWithEtag.ts (1)
callAPIWithETag(46-127)
archon-ui-main/src/features/shared/apiWithEtag.test.ts (1)
archon-ui-main/src/features/shared/apiWithEtag.ts (1)
callAPIWithETag(46-127)
archon-ui-main/src/features/shared/tests/queryPatterns.test.ts (1)
archon-ui-main/src/features/shared/queryPatterns.ts (1)
createRetryLogic(63-75)
archon-ui-main/src/services/ollamaService.ts (1)
archon-ui-main/src/features/shared/queryPatterns.ts (1)
createRetryLogic(63-75)
archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx (2)
archon-ui-main/src/services/ollamaService.ts (2)
testConnection(387-438)ollamaService(494-494)archon-ui-main/src/components/settings/types/OllamaTypes.ts (1)
ConnectionTestResult(107-112)
🔇 Additional comments (9)
archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx (1)
108-119: Solid delegation to ollamaService with minimal mapping.Mapping service result to local ConnectionTestResult looks good. One nit: since modelsAvailable is always undefined here, the toast in Line 152 can render “undefined models”. Consider omitting that part when undefined.
archon-ui-main/src/services/ollamaService.ts (1)
9-9: Good: centralizing retry decision via createRetryLogic.Import and usage are aligned with shared patterns.
archon-ui-main/src/features/shared/tests/queryPatterns.test.ts (7)
1-2: LGTM! Correct imports for testing setup.The imports are appropriate for a Vitest test file with React Testing Library patterns.
4-37: Comprehensive retry logic testing for network and server errors.The test cases correctly validate the retry behavior for network errors and 5xx server errors. The tests properly verify that retries work within limits and stop when maxRetries is exceeded. Good coverage of both
statusCodeandstatusproperty variations.
39-80: Thorough coverage of client error scenarios.Excellent test coverage for 4xx client errors including edge cases like 429 rate limiting. The tests correctly verify that client errors should not be retried, which aligns with the smart retry logic requirements mentioned in the PR objectives.
82-111: Good abort error handling validation.The tests comprehensively cover different abort scenarios including AbortError by name, ERR_CANCELED by code, and axios cancel tokens. This ensures that user-initiated cancellations are properly handled without unnecessary retries.
113-160: Excellent edge case coverage.The edge case tests validate important scenarios like null/undefined errors, configurable maxRetries, default retry count, and mixed status properties. The test on Line 158 correctly validates that
statusCodetakes precedence overstatus, which aligns with the implementation inqueryPatterns.ts.
162-181: Well-designed integration tests for Ollama service patterns.These tests provide practical validation for real-world scenarios that the Ollama service will encounter, including model not found (404), connection timeouts, and server overload (503). This ensures the retry logic works correctly in the context it will be used.
1-182: Run tests locally/CI and confirm TanStack Query integration.Sandbox couldn't run npm (no package.json). createRetryLogic is implemented at archon-ui-main/src/features/shared/queryPatterns.ts and wired into the centralized QueryClient (retry: createRetryLogic(2)) at archon-ui-main/src/features/shared/queryClient.ts — run archon-ui-main/src/features/shared/tests/queryPatterns.test.ts and verify feature hooks use the centralized QueryClient or follow feature-scoped TanStack Query hooks.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
archon-ui-main/src/features/shared/apiWithEtag.ts (1)
111-120: Handle non‑JSON or empty success bodies to avoid false “NETWORK_ERROR”.Some endpoints (e.g., uploads) can return 201 with empty body or non‑JSON. Unconditional response.json() will throw and be misclassified as NETWORK_ERROR. Safely gate parsing by content-type and tolerate empty bodies.
Apply this diff:
- // Parse response data - const result = await response.json(); - - // Check for API errors - if (result.error) { - throw new APIServiceError(result.error, "API_ERROR", response.status); - } - - return result as T; + // Parse response data + const contentType = response.headers.get("content-type")?.toLowerCase() ?? ""; + if (contentType.includes("application/json") || contentType.includes("+json")) { + const result = await response.json(); + if (result && typeof result === "object" && "error" in result && (result as any).error) { + throw new APIServiceError((result as any).error as string, "API_ERROR", response.status); + } + return result as T; + } + // Non-JSON or empty body + const text = await response.text().catch(() => ""); + return (text ? (text as unknown as T) : (undefined as T));
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
archon-ui-main/src/features/shared/apiWithEtag.test.ts(3 hunks)archon-ui-main/src/features/shared/apiWithEtag.ts(2 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
archon-ui-main/src/features/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UIUse Biome in features: 120 character line length, double quotes, and trailing commas
Files:
archon-ui-main/src/features/shared/apiWithEtag.tsarchon-ui-main/src/features/shared/apiWithEtag.test.ts
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Disallow implicit any in TypeScript
archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types
Files:
archon-ui-main/src/features/shared/apiWithEtag.tsarchon-ui-main/src/features/shared/apiWithEtag.test.ts
archon-ui-main/src/**/*.{ts,tsx,py}
📄 CodeRabbit inference engine (CLAUDE.md)
In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)
Files:
archon-ui-main/src/features/shared/apiWithEtag.tsarchon-ui-main/src/features/shared/apiWithEtag.test.ts
archon-ui-main/src/**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Write frontend tests with Vitest and React Testing Library
Files:
archon-ui-main/src/features/shared/apiWithEtag.test.ts
🧬 Code graph analysis (1)
archon-ui-main/src/features/shared/apiWithEtag.test.ts (1)
archon-ui-main/src/features/shared/apiWithEtag.ts (1)
callAPIWithETag(46-131)
🔇 Additional comments (5)
archon-ui-main/src/features/shared/apiWithEtag.ts (2)
54-69: FormData detection and header handling look correct.Correctly omits Content-Type for FormData and preserves user-provided headers. Matches browser multipart boundary behavior.
71-71: Header normalization is appropriate.Converting Headers to a plain object via entries() yields lower-case keys, aligning with the tests’ expectations.
archon-ui-main/src/features/shared/apiWithEtag.test.ts (3)
59-61: Expectation updates match new header behavior.Lower-case keys and explicit accept header assertions are consistent with Headers normalization and implementation.
172-176: Custom header casing assertions are correct.Asserting lower-case header names aligns with the normalized headers object.
415-599: Add test: tolerate 201 Created with empty body for FormData uploads — vitest env 'jsdom' verifiedVitest is configured with environment: 'jsdom' (File/FormData available). Add this test inside the "FormData Support" suite:
+ it("should tolerate 201 Created with empty body for FormData uploads", async () => { + const mockResponse = { + ok: true, + status: 201, + // Simulate empty body; no json() available on some servers + text: () => Promise.resolve(""), + headers: new Headers(), // no content-type + }; + global.fetch = vi.fn().mockResolvedValue(mockResponse); + + const fd = new FormData(); + fd.append("file", new File(["x"], "x.txt")); + + const result = await callAPIWithETag("/api/upload", { method: "POST", body: fd }); + expect(result).toBeUndefined(); + });
PROBLEMS FIXED: 1. Header merge order bug - FormData requests incorrectly had Content-Type set 2. Overly aggressive retry logic - retried 4xx client errors causing duplicates COMPREHENSIVE SOLUTION: Core API Service (apiWithEtag.ts): - Add FormData detection to avoid setting Content-Type header - Fix variable reference to use mergedHeaders instead of headers - Update documentation to reflect FormData + JSON support - Maintain backward compatibility with existing functionality Smart Retry Logic (ollamaService.ts): - Replace custom retry loops with createRetryLogic() from queryPatterns - Only retry network errors and 5xx server errors, not 4xx client errors - Add special handling for 404 errors (Ollama not installed) - Implement exponential backoff for better retry behavior Code Deduplication (OllamaConfigurationPanel.tsx): - Remove duplicate testConnection implementation - Use ollamaService.testConnection() for consistency - Eliminate code duplication and inconsistent retry behavior File Upload Improvements (knowledgeService.ts): - Use callAPIWithETag() for uploads instead of direct fetch() - Benefit from smart retry logic and ETag caching for file uploads - Remove workaround comments about FormData incompatibility Comprehensive Test Coverage (apiWithEtag.test.ts): - Add 5 new FormData-specific tests (21 total tests) - Test Content-Type header omission for FormData requests - Verify custom headers preservation with FormData - Test binary file uploads and error scenarios - Ensure JSON requests still get proper Content-Type headers IMPACT: - FormData uploads work properly through unified API service - Smart retry prevents duplicate operations and infinite loops - Eliminates code duplication across Ollama components - Maintains backward compatibility and ETag optimization - All tests pass - robust error handling and edge case coverage RELATED: Fixes technical debt identified in Issue coleam00#593
- 16 test cases covering all retry scenarios - Network errors: should retry (timeouts, connection failures) - 5xx server errors: should retry (500, 502, 503) - 4xx client errors: should NOT retry (400, 401, 404, 422, 429) - Abort errors: should NOT retry (AbortError, ERR_CANCELED) - Edge cases: null/undefined, custom maxRetries, mixed properties - Integration scenarios: Ollama-specific error patterns Ensures smart retry logic properly handles all states: - Retries 3 times for recoverable errors - Fails immediately for permanent client errors - Respects user cancellations
- Normalize headers using Headers constructor for better compatibility - Strip user-provided Content-Type for FormData to prevent upload errors - Update tests to expect lowercase header names per HTTP standards - Add test case for Content-Type removal with FormData - Maintain backward compatibility while improving robustness All 46 shared feature tests passing.
- Update misleading comment in knowledgeService.ts - Add per-attempt timeout support to ollamaService.ts checkInstanceHealth method - Apply AbortSignal.timeout(5000) for health check timeouts
…e.ts - Remove unused APIServiceError import detected by Biome - Improve multi-line formatting for callAPIWithETag function call
661722c to
e52116a
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
archon-ui-main/src/services/ollamaService.ts (3)
158-182: Avoidanyin public service code; narrow fromunknown.Use
unknownand narrow to keep strict typing guarantees.- private handleApiError(error: any, context: string): Error { - const errorMessage = error instanceof Error ? error.message : String(error); + private handleApiError(error: unknown, context: string): Error { + const err = error instanceof Error ? error : new Error(String(error)); + const errorMessage = err.message;
225-247: Add a default timeout when no signal is provided.Prevents potentially hanging requests when callers omit a signal, aligning with apiClient’s 20s default.
const response = await fetch(`${this.baseUrl}/api/ollama/instances/health?${params.toString()}`, { method: 'GET', headers: { 'Content-Type': 'application/json', }, - signal, + signal: signal ?? AbortSignal.timeout(20000), });
416-419: Annotate bothstatusandstatusCodeto maximize compatibility with error inspectors.Some utilities inspect
error.status, otherserror.statusCode. Set both sogetErrorStatus()can detect 4xx reliably.- if (result.error?.includes('404') || result.error?.includes('not found')) { - (lastError as any).statusCode = 404; - } + if (result.error?.includes('404') || result.error?.includes('not found')) { + (lastError as any).statusCode = 404; + (lastError as any).status = 404; + }archon-ui-main/src/features/shared/api/apiClient.ts (1)
54-69: Don’t override a caller-provided Accept header.Set Accept only if absent to allow non‑JSON endpoints when needed; current tests still pass.
- headersObj.set("Accept", "application/json"); + if (!headersObj.has("Accept")) { + headersObj.set("Accept", "application/json"); + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx(2 hunks)archon-ui-main/src/features/knowledge/services/knowledgeService.ts(1 hunks)archon-ui-main/src/features/shared/api/apiClient.ts(2 hunks)archon-ui-main/src/features/shared/api/tests/apiClient.test.ts(3 hunks)archon-ui-main/src/features/shared/tests/queryPatterns.test.ts(1 hunks)archon-ui-main/src/services/ollamaService.ts(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- archon-ui-main/src/features/knowledge/services/knowledgeService.ts
- archon-ui-main/src/features/shared/tests/queryPatterns.test.ts
- archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx
🧰 Additional context used
📓 Path-based instructions (4)
archon-ui-main/src/features/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UIUse Biome in features: 120 character line length, double quotes, and trailing commas
Files:
archon-ui-main/src/features/shared/api/tests/apiClient.test.tsarchon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Disallow implicit any in TypeScript
archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types
Files:
archon-ui-main/src/features/shared/api/tests/apiClient.test.tsarchon-ui-main/src/services/ollamaService.tsarchon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Write frontend tests with Vitest and React Testing Library
Files:
archon-ui-main/src/features/shared/api/tests/apiClient.test.ts
archon-ui-main/src/**/*.{ts,tsx,py}
📄 CodeRabbit inference engine (CLAUDE.md)
In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)
Files:
archon-ui-main/src/features/shared/api/tests/apiClient.test.tsarchon-ui-main/src/services/ollamaService.tsarchon-ui-main/src/features/shared/api/apiClient.ts
🧬 Code graph analysis (2)
archon-ui-main/src/features/shared/api/tests/apiClient.test.ts (1)
archon-ui-main/src/features/shared/api/apiClient.ts (1)
callAPIWithETag(46-131)
archon-ui-main/src/services/ollamaService.ts (1)
archon-ui-main/src/features/shared/config/queryPatterns.ts (1)
createRetryLogic(63-75)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Backend Tests (Python + pytest)
🔇 Additional comments (5)
archon-ui-main/src/services/ollamaService.ts (1)
392-431: Smart retry loop is solid: bounded attempts, exponential backoff, and abort signal.Good use of retry logic and 5s request-level timeout to keep attempts bounded. Early-return on success is correct.
archon-ui-main/src/features/shared/api/apiClient.ts (2)
36-45: Docs/readme block: LGTM.Accurately describes reliance on browser ETag handling and TanStack Query for cache freshness.
71-72: Header normalization approach is fine.Lower‑casing via Headers and passing a plain object to fetch is consistent with the tests.
archon-ui-main/src/features/shared/api/tests/apiClient.test.ts (2)
59-61: Header expectations updated correctly (lowercased + Accept).Matches the client’s normalization behavior.
415-599: FormData test suite: comprehensive and correct.Covers omission of Content‑Type, custom headers preservation, binary file handling, and user‑provided Content‑Type removal.
Consider adding one test to assert custom Accept is preserved (to go with the suggested client change):
it("should preserve caller-provided Accept header", async () => { const mockResponse = { ok: true, status: 200, json: () => Promise.resolve({ ok: true }), headers: new Headers() }; global.fetch = vi.fn().mockResolvedValue(mockResponse); await callAPIWithETag("/api/text-endpoint", { headers: { Accept: "text/plain" }, }); const [, options] = (global.fetch as any).mock.calls[0]; expect(options.headers["accept"]).toBe("text/plain"); });
- Simplify JSON parsing in ollama_api.py by removing unnecessary isinstance check - Enhance error handling in mcp_client.py to check both 'error' and 'message' fields - Apply code quality improvements for better error message extraction Related to PR coleam00#723 CodeRabbit review feedback.
Frontend fixes: - Add non-JSON response handling in apiClient.ts to prevent JSON parsing errors - Fix import path for createRetryLogic in ollamaService.ts - Replace 'any' types with 'unknown' for better type safety - Add 30s default timeouts to all fetch calls in ollamaService.ts - Enhanced status code annotations for smart retry logic compatibility Backend fixes addressed in previous commits: - Simplified JSON parsing in ollama_api.py - Enhanced error handling in mcp_client.py to check both 'error' and 'message' fields All CodeRabbit review comments have been systematically addressed.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
python/src/server/api_routes/ollama_api.py (3)
526-542: Fix regression: json.loads() on an already-parsed dict will crash; handle str/dict robustly.Supabase can return JSON as a native dict (JSON column) or a string. Unconditionally
json.loads(models_setting)will raise on dict input.Apply this diff:
- models_data = json.loads(models_setting) - from datetime import datetime + # Handle both JSON string and native dict from DB driver + if isinstance(models_setting, str): + try: + models_data = json.loads(models_setting) + except json.JSONDecodeError: + logger.error("Corrupted 'ollama_discovered_models' JSON in archon_settings", exc_info=True) + raise HTTPException(status_code=500, detail="Stored models are corrupted") + elif isinstance(models_setting, dict): + models_data = models_setting + else: + logger.error(f"Unexpected type for models_setting: {type(models_setting).__name__}") + raise HTTPException(status_code=500, detail="Invalid stored models format")
199-200: Bug: timestamp is set to a module name, not a real timestamp.
__module__returns a module string. Use an actual timestamp.- "timestamp": model_discovery_service.check_instance_health.__module__ # Use current timestamp + "timestamp": datetime.utcnow().isoformat()
101-108: SSRF risk: user-supplied instance_urls are fetched server-side without egress safeguards.Attackers can target internal/metadata endpoints (e.g., 127.0.0.1, 169.254.169.254). Validate/denylist private/link-local/loopback addresses before making requests.
Example helper and usage (add once and call in each loop before requests):
# Add near imports from urllib.parse import urlparse import socket, ipaddress def _is_private_host(host: str) -> bool: try: infos = socket.getaddrinfo(host, None) for _, _, _, _, sockaddr in infos: ip = ipaddress.ip_address(sockaddr[0]) if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved or ip.is_multicast: return True except Exception: # If resolution fails, treat as unsafe or log/deny explicitly return True return FalseUsage in validation loops:
parsed = urlparse(url) if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname): logger.warning(f"Blocked private/invalid URL: {url}") continueAlso applies to: 159-169, 981-986
python/src/agents/mcp_client.py (3)
93-98: Preserve stack traces and surface HTTP status; raise a transport-specific exception.Use logger.exception (exc_info) and chain the original error. Capture status/body snippet for context.
- except httpx.HTTPError as e: - logger.error(f"HTTP error calling MCP tool {tool_name}: {e}") - raise Exception(f"Failed to call MCP tool: {str(e)}") - except Exception as e: - logger.error(f"Error calling MCP tool {tool_name}: {e}") - raise + except httpx.HTTPError as e: + resp = getattr(e, "response", None) + status = resp.status_code if resp is not None else None + body = resp.text[:500] if resp is not None else None + logger.exception( + f"HTTP error calling MCP tool {tool_name} | url={self.mcp_url}/rpc | status={status} | " + f"request_id={request_id} | body_snippet={body}" + ) + raise MCPTransportError(f"HTTP error calling MCP tool {tool_name}", status_code=status) from e + except Exception: + logger.exception(f"Error calling MCP tool {tool_name} | request_id={request_id}") + raise
15-20: Introduce typed exception hierarchy for MCP client.Enables precise handling and aligns with backend error-handling guidelines.
logger = logging.getLogger(__name__) - +class MCPError(Exception): + """Base MCP client error.""" + +class MCPTransportError(MCPError): + def __init__(self, message: str, status_code: int | None = None): + super().__init__(message if status_code is None else f"[HTTP {status_code}] {message}") + self.status_code = status_code + +class MCPToolError(MCPError): + def __init__(self, message: str, code: int | None = None, data: Any | None = None): + super().__init__(message if code is None else f"[{code}] {message}") + self.code = code + self.data = data + class MCPClient:
69-71: Change call_tool return type to Any and update its docstringJSON-RPC result can be any JSON value; current annotation dict[str, Any] is incorrect and will trigger MyPy failures.
- async def call_tool(self, tool_name: str, **kwargs) -> dict[str, Any]: + async def call_tool(self, tool_name: str, **kwargs) -> Any: @@ - Returns: - Dict with the tool response + Returns: + JSON-RPC result value (any JSON-serializable type)Call sites to review/adjust if they rely on dict-specific behavior:
- python/src/agents/mcp_client.py:104-107
- python/src/agents/mcp_client.py:111-112
- python/src/agents/mcp_client.py:118-121
- python/src/agents/mcp_client.py:125-126
- python/src/agents/mcp_client.py:130-133
- python/src/agents/mcp_client.py:137-138
🧹 Nitpick comments (8)
python/src/server/api_routes/ollama_api.py (2)
137-139: Preserve stack traces in logs (use logger.exception or exc_info=True).Backend guidelines require full stack traces. Current logs drop tracebacks.
Apply pattern like:
- logger.error(f"Error in model discovery: {e}") + logger.exception("Error in model discovery")and similarly:
- logger.error(f"Error in health check: {e}") + logger.exception("Error in health check")- logger.error(f"Error retrieving stored models: {e}") + logger.exception("Error retrieving stored models")Replicate across other except blocks in this module.
Also applies to: 202-205, 577-580
276-277: Avoid calling a private method (_calculate_performance_score).Prefer a public API on
embedding_routerfor performance scoring to avoid tight coupling and future breakage.python/src/agents/mcp_client.py (6)
21-21: Annotate optional parameter explicitly (MyPy).Use modern union for clarity.
- def __init__(self, mcp_url: str = None): + def __init__(self, mcp_url: str | None = None):
46-46: Annotate client attribute for MyPy.Prevents “Missing attribute”/inference issues.
- self.client = httpx.AsyncClient(timeout=30.0) + self.client: httpx.AsyncClient = httpx.AsyncClient(timeout=30.0)
13-14: Add uuid import to support unique JSON-RPC ids.+import uuid import httpx
74-75: Use unique JSON-RPC ids and propagate into logs for correlation.- request_data = {"jsonrpc": "2.0", "method": tool_name, "params": kwargs, "id": 1} + request_id = str(uuid.uuid4()) + request_data = {"jsonrpc": "2.0", "method": tool_name, "params": kwargs, "id": request_id}
77-81: Drop explicit Content-Type header (httpx sets it for json=).Avoids redundancy and keeps headers minimal.
- response = await self.client.post( - f"{self.mcp_url}/rpc", - json=request_data, - headers={"Content-Type": "application/json"}, - ) + response = await self.client.post(f"{self.mcp_url}/rpc", json=request_data)
141-158: Add a shutdown helper to close the global client.Prevents connection pool leaks on graceful shutdown.
return _mcp_client + +async def close_mcp_client() -> None: + """Close and clear the global MCP client (call on application shutdown).""" + global _mcp_client + if _mcp_client is not None: + await _mcp_client.close() + _mcp_client = None
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
python/src/agents/mcp_client.py(1 hunks)python/src/server/api_routes/ollama_api.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
python/src/agents/mcp_client.pypython/src/server/api_routes/ollama_api.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
python/src/agents/mcp_client.pypython/src/server/api_routes/ollama_api.py
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
python/src/server/api_routes/ollama_api.py
python/src/server/api_routes/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Place HTTP route handlers in the API routes directory
Files:
python/src/server/api_routes/ollama_api.py
🧬 Code graph analysis (1)
python/src/agents/mcp_client.py (1)
python/src/server/utils/progress/progress_tracker.py (1)
error(196-219)
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (7)
archon-ui-main/src/features/shared/api/apiClient.ts (3)
54-69: FormData detection and header handling: LGTM; don’t overwrite Accept if caller sets one.Preserves browser-set multipart boundaries. Minor: avoid clobbering a caller-provided Accept.
Apply this diff:
- headersObj.set("Accept", "application/json"); + if (!headersObj.has("Accept")) { + headersObj.set("Accept", "application/json"); + }
71-72: Pass Headers directly to fetch to preserve multi-valued headers and avoid unnecessary lowercasing conversion.
Headersalready normalizes and supports multiple values; converting to a plain object can drop that fidelity.Apply this diff:
- const headers: Record<string, string> = Object.fromEntries(headersObj.entries()); + // Use Headers directly to preserve semantics + const headers = headersObj; @@ - headers, + headers,Also applies to: 79-82
111-121: Consider lenient parsing when content-type is missing but body is JSON.Some proxies/servers omit content-type; optionally attempt JSON parse before erroring out.
archon-ui-main/src/services/ollamaService.ts (4)
158-182: Harden error classification with case-insensitive checks.Normalize casing for “Failed to fetch” and “AbortError” to avoid missing variants.
Apply this diff:
- if ( - errorMessage.toLowerCase().includes("network") || - errorMessage.includes("fetch") || - errorMessage.includes("Failed to fetch") - ) { + const lowerMsg = errorMessage.toLowerCase(); + if (lowerMsg.includes("network") || lowerMsg.includes("fetch") || lowerMsg.includes("failed to fetch")) { return new Error( `Network error while ${context.toLowerCase()}: ${errorMessage}. ` + `Please check your connection and Ollama server status.`, ); } @@ - if (errorMessage.includes("timeout") || errorMessage.includes("AbortError")) { + if (lowerMsg.includes("timeout") || lowerMsg.includes("aborterror")) { return new Error( `Timeout error while ${context.toLowerCase()}: The Ollama instance may be slow to respond or unavailable.` ); }
205-209: Remove Content-Type on GET/DELETE; use Accept instead.Setting Content-Type on requests without a body is unnecessary and can confuse intermediaries. Prefer Accept.
Apply this diff:
@@ - const response = await fetch(`${this.baseUrl}/api/ollama/models?${params.toString()}`, { + const response = await fetch(`${this.baseUrl}/api/ollama/models?${params.toString()}`, { method: 'GET', - headers: { - 'Content-Type': 'application/json', - }, + headers: { Accept: 'application/json' }, signal: AbortSignal.timeout(30000), // 30 second timeout }); @@ - const response = await fetch(`${this.baseUrl}/api/ollama/instances/health?${params.toString()}`, { + const response = await fetch(`${this.baseUrl}/api/ollama/instances/health?${params.toString()}`, { method: 'GET', - headers: { - 'Content-Type': 'application/json', - }, + headers: { Accept: 'application/json' }, signal, }); @@ - const response = await fetch(`${this.baseUrl}/api/ollama/embedding/routes?${params.toString()}`, { + const response = await fetch(`${this.baseUrl}/api/ollama/embedding/routes?${params.toString()}`, { method: 'GET', - headers: { - 'Content-Type': 'application/json', - }, + headers: { Accept: 'application/json' }, signal: AbortSignal.timeout(30000), // 30 second timeout }); @@ - const response = await fetch(`${this.baseUrl}/api/ollama/cache`, { + const response = await fetch(`${this.baseUrl}/api/ollama/cache`, { method: 'DELETE', - headers: { - 'Content-Type': 'application/json', - }, + headers: { Accept: 'application/json' }, signal: AbortSignal.timeout(30000), // 30 second timeout });Also applies to: 244-248, 346-351, 371-376
401-410: Be resilient to instance_status keying; fall back to the first entry.Some backends key by host or sanitized URL. Use a fallback to avoid false negatives.
Apply this diff:
- const instanceStatus = healthResponse.instance_status[instanceUrl]; + const instanceStatus = + healthResponse.instance_status[instanceUrl] ?? + Object.values(healthResponse.instance_status)[0];
435-443: Avoid ad-hoc casts when annotating status; use a typed helper.Define a small helper/type to carry statusCode to the retry logic.
Example:
type HttpErrorLike = Error & { statusCode?: number; status?: number }; function setStatusCode(err: Error, code: number) { (err as HttpErrorLike).statusCode = code; }Then replace casts like
(lastError as unknown as { statusCode: number }).statusCode = 404;withsetStatusCode(lastError, 404).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
archon-ui-main/src/features/shared/api/apiClient.ts(3 hunks)archon-ui-main/src/services/ollamaService.ts(13 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
archon-ui-main/src/features/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UIUse Biome in features: 120 character line length, double quotes, and trailing commas
Files:
archon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Disallow implicit any in TypeScript
archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types
Files:
archon-ui-main/src/features/shared/api/apiClient.tsarchon-ui-main/src/services/ollamaService.ts
archon-ui-main/src/**/*.{ts,tsx,py}
📄 CodeRabbit inference engine (CLAUDE.md)
In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)
Files:
archon-ui-main/src/features/shared/api/apiClient.tsarchon-ui-main/src/services/ollamaService.ts
🧬 Code graph analysis (2)
archon-ui-main/src/features/shared/api/apiClient.ts (1)
archon-ui-main/src/features/shared/types/errors.ts (1)
APIServiceError(19-28)
archon-ui-main/src/services/ollamaService.ts (1)
archon-ui-main/src/features/shared/config/queryPatterns.ts (1)
createRetryLogic(63-75)
🔇 Additional comments (2)
archon-ui-main/src/services/ollamaService.ts (1)
9-9: Import path fixed — looks correct now.Matches the actual export under features/shared/config.
archon-ui-main/src/features/shared/api/apiClient.ts (1)
36-45: 304 handling assumption — confirmed. fetch() returns the cached/synthesized (effectively 200) response when a conditional GET yields 304, so a raw 304 is not exposed to script and your response.ok check is safe (see Fetch Standard http-network-or-cache and MDN 304 semantics).
Backend Critical Fixes: - Fix json.loads() regression to handle both string and dict input from Supabase - Fix timestamp bug using actual datetime.utcnow().isoformat() instead of __module__ - Add comprehensive SSRF protection with IP address validation for instance_urls - Enhanced logging with exc_info=True for all error handlers to preserve stack traces MCP Client Enhancements: - Add proper exception hierarchy: MCPError, MCPTransportError, MCPToolError - Implement comprehensive error handling with detailed logging including request IDs - Add global shutdown helper function for proper resource cleanup - Generate unique JSON-RPC request IDs for better correlation Frontend API Service Improvements: - Preserve caller-provided Accept headers instead of always overriding - Handle non-JSON and empty responses properly to avoid false NETWORK_ERROR classifications - Add both status and statusCode properties on error objects for maximum compatibility - Enhanced status code annotations for retry logic compatibility All changes maintain backward compatibility while significantly improving security, reliability, and developer experience.
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
python/src/server/api_routes/ollama_api.py (3)
151-153: BackgroundTasks cannot run async functions; current cache warmup likely never runs.BackgroundTasks executes sync callables in a threadpool; passing an async def leads to an un-awaited coroutine. Use asyncio.create_task instead.
- if background_tasks: - background_tasks.add_task(_warm_model_cache, valid_urls) + # Fire-and-forget cache warmup on the running event loop + asyncio.create_task(_warm_model_cache(valid_urls))Add import at top:
import asyncio
259-264: Add SSRF validation to validate_instance.Validate instance_url with the same guard as other endpoints before calling downstream services.
# Clean up URL instance_url = request.instance_url.rstrip('/') + # SSRF protection + parsed = urlparse(instance_url) + if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname): + raise HTTPException(status_code=400, detail="Invalid or private instance_url") + # Perform basic validation using the provider service validation_result = await validate_provider_instance("ollama", instance_url)
1315-1320: Add SSRF validation to capability testing endpoint.This endpoint performs server-side requests to the provided instance_url; protect it like other endpoints.
logger.info(f"Testing capabilities for model {request.model_name} on {request.instance_url}") - + # SSRF protection + parsed = urlparse(request.instance_url.rstrip('/')) + if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname): + raise HTTPException(status_code=400, detail="Invalid or private instance_url") + test_results = {} errors = []
🧹 Nitpick comments (17)
archon-ui-main/src/features/shared/api/apiClient.ts (2)
54-56: Harden FormData detection for cross-environment safety.Guard against environments where FormData may be undefined (Node/Jest, iframes).
- const isFormData = options.body instanceof FormData; + const isFormData = + typeof FormData !== "undefined" && options.body instanceof FormData;
65-71: Only set Content-Type when a body is present.Avoid sending Content-Type on GET/HEAD or body-less requests.
- } else if (!headersObj.has("Content-Type")) { + } else if (!headersObj.has("Content-Type") && options.body != null) { // Only set Content-Type if not already provided headersObj.set("Content-Type", "application/json"); }archon-ui-main/src/services/ollamaService.ts (5)
158-182: Handle AbortError by name, not just message substring.Abort errors often set error.name === "AbortError" without a message match.
- // Check for timeout errors - if (errorMessage.includes("timeout") || errorMessage.includes("AbortError")) { + // Check for timeout/abort errors + if ( + errorMessage.includes("timeout") || + (typeof (error as any)?.name === "string" && (error as any).name === "AbortError") + ) { return new Error( `Timeout error while ${context.toLowerCase()}: The Ollama instance may be slow to respond or unavailable.` ); }
203-209: Drop Content-Type on GET requests.Content-Type on GET is unnecessary; rely on Accept (server infers no body).
- const response = await fetch(`${this.baseUrl}/api/ollama/models?${params.toString()}`, { + const response = await fetch(`${this.baseUrl}/api/ollama/models?${params.toString()}`, { method: 'GET', - headers: { - 'Content-Type': 'application/json', - }, + // No Content-Type for GET signal: AbortSignal.timeout(30000), // 30 second timeout });Same applies to other GETs in this file.
226-248: Provide a default timeout when no signal is supplied.Prevents hangs when callers omit a signal.
- const response = await fetch(`${this.baseUrl}/api/ollama/instances/health?${params.toString()}`, { + const response = await fetch(`${this.baseUrl}/api/ollama/instances/health?${params.toString()}`, { method: 'GET', headers: { 'Content-Type': 'application/json', }, - signal, + signal: signal ?? AbortSignal.timeout(30000), });
406-410: Preserve zero response times (use nullish coalescing).Avoid treating 0 as falsy.
- responseTime: instanceStatus?.response_time_ms || responseTime, + responseTime: instanceStatus?.response_time_ms ?? responseTime,
453-459: Optional: add jitter to backoff to reduce thundering herd.Small random jitter improves resilience under load.
- const delayMs = Math.pow(2, attempt - 1) * 1000; // Exponential backoff: 1s, 2s, 4s + const base = Math.pow(2, attempt - 1) * 1000; // 1s, 2s, 4s + const jitter = Math.floor(Math.random() * 250); // up to 250ms + const delayMs = base + jitter;python/src/agents/mcp_client.py (3)
128-133: Fix type hints for optional params (MyPy).Parameters that accept None should be annotated as Optional (PEP 604 union).
- async def perform_rag_query(self, query: str, source: str = None, match_count: int = 5) -> str: + async def perform_rag_query(self, query: str, source: str | None = None, match_count: int = 5) -> str:
141-147: Fix type hints for optional params (MyPy).Same here for source_id.
- async def search_code_examples( - self, query: str, source_id: str = None, match_count: int = 5 - ) -> str: + async def search_code_examples( + self, query: str, source_id: str | None = None, match_count: int = 5 + ) -> str:
171-183: Make global client creation concurrency-safe.Two concurrent calls could create two clients. Guard with an asyncio.Lock.
@@ -_mcp_client: MCPClient | None = None +_mcp_client: MCPClient | None = None +_mcp_lock: "asyncio.Lock" | None = None @@ async def get_mcp_client() -> MCPClient: @@ - if _mcp_client is None: - _mcp_client = MCPClient() + if _mcp_client is None: + # Lazy-init lock to avoid importing asyncio at module import in all contexts + global _mcp_lock + if _mcp_lock is None: + import asyncio as _asyncio # local import to avoid unconditional dependency at import time + _mcp_lock = _asyncio.Lock() + async with _mcp_lock: + if _mcp_client is None: + _mcp_client = MCPClient()Note: If you prefer, move the asyncio import to the top and initialize the lock eagerly.
python/src/server/api_routes/ollama_api.py (7)
195-206: Use consistent keys for health_results to avoid duplicates.Blocked entries use instance_url while healthy entries use the normalized url. Normalize both to url.
- health_results[instance_url] = { + health_results[url] = { "is_healthy": False, "error_message": "URL blocked for security reasons", "response_time_ms": None, "models_available": 0, "last_checked": datetime.utcnow().isoformat() }- health_results[url] = { + health_results[url] = { "is_healthy": health_status.is_healthy, "response_time_ms": health_status.response_time_ms, "models_available": health_status.models_available if include_models else None, "error_message": health_status.error_message, "last_checked": health_status.last_checked }
1024-1029: Docstring is inaccurate; code uses only /api/tags and derives some fields.Align the docstring with actual behavior to avoid misleading clients.
- Discover models from Ollama instances with complete real details from both /api/tags and /api/show. - Only stores actual data from Ollama API endpoints - no fabricated information. + Discover models from Ollama instances using /api/tags only (fast path). + Skips /api/show to reduce latency/timeouts. Stores data returned by the API; + derived fields are kept minimal to avoid fabricating details.
1193-1197: Use upsert instead of update to ensure persistence even if the row doesn't exist.An update will no-op if the key is missing. Upsert guarantees the record exists.
- result = supabase.table("archon_settings").update({ - "value": json.dumps(models_data), - "description": "Real Ollama model data from API endpoints", - "updated_at": datetime.now().isoformat() - }).eq("key", "ollama_discovered_models").execute() + result = supabase.table("archon_settings").upsert({ + "key": "ollama_discovered_models", + "value": json.dumps(models_data), + "category": "ollama", + "description": "Real Ollama model data from API endpoints", + "updated_at": datetime.now().isoformat() + }).execute()
1003-1015: Avoid “partial support” heuristics based on braces; require valid JSON.Curly-brace detection will yield many false positives. Treat only parseable JSON as supported.
- except json.JSONDecodeError: - # Try to find JSON-like patterns in the response - if '{' in content and '}' in content and '"' in content: - logger.info(f"Model {model_name} has partial structured output support") - return True + except json.JSONDecodeError: + # Not valid JSON => treat as unsupported + return False
587-588: Remove redundant inner import.datetime is already imported at module level.
- from datetime import datetime - +
240-241: Prefer timezone-aware UTC timestamps for consistency.Use datetime.now(timezone.utc).isoformat() to avoid naive datetime ambiguity.
Example:
from datetime import datetime, timezone datetime.now(timezone.utc).isoformat()Also applies to: 505-506, 519-521, 529-530, 622-624
1127-1143: Avoid fabricating compatibility and embedding dimensions in “real-details” endpoint.Setting chat compatibility to 'full' and guessing embedding_dimensions contradicts the “real data only” intent. Either:
- Mark compatibility as 'unknown' and embedding_dimensions as None, or
- Explicitly document they’re inferred defaults.
Would you like me to generate a minimal patch that sets archon_compatibility='unknown' for chat models and leaves embedding_dimensions=None when not present?
Also applies to: 1103-1114
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
archon-ui-main/src/features/shared/api/apiClient.ts(3 hunks)archon-ui-main/src/services/ollamaService.ts(13 hunks)python/src/agents/mcp_client.py(5 hunks)python/src/server/api_routes/ollama_api.py(19 hunks)
🧰 Additional context used
📓 Path-based instructions (7)
archon-ui-main/src/features/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UIUse Biome in features: 120 character line length, double quotes, and trailing commas
Files:
archon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Disallow implicit any in TypeScript
archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types
Files:
archon-ui-main/src/features/shared/api/apiClient.tsarchon-ui-main/src/services/ollamaService.ts
archon-ui-main/src/**/*.{ts,tsx,py}
📄 CodeRabbit inference engine (CLAUDE.md)
In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)
Files:
archon-ui-main/src/features/shared/api/apiClient.tsarchon-ui-main/src/services/ollamaService.ts
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
python/src/agents/mcp_client.pypython/src/server/api_routes/ollama_api.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
python/src/agents/mcp_client.pypython/src/server/api_routes/ollama_api.py
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
python/src/server/api_routes/ollama_api.py
python/src/server/api_routes/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Place HTTP route handlers in the API routes directory
Files:
python/src/server/api_routes/ollama_api.py
🧬 Code graph analysis (2)
archon-ui-main/src/features/shared/api/apiClient.ts (1)
archon-ui-main/src/features/shared/types/errors.ts (1)
APIServiceError(19-28)
archon-ui-main/src/services/ollamaService.ts (1)
archon-ui-main/src/features/shared/config/queryPatterns.ts (1)
createRetryLogic(63-75)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test Results Summary
🔇 Additional comments (10)
archon-ui-main/src/features/shared/api/apiClient.ts (3)
57-64: LGTM: Header normalization and default Accept.Using Headers() to normalize and defaulting Accept to application/json is sound and matches tests.
113-121: LGTM: Content-Type–aware JSON parsing with API error surfacing.The JSON gating and API error propagation are correct and robust.
124-127: LGTM: Non‑JSON fallback handling.Gracefully returning text or undefined is appropriate here.
archon-ui-main/src/services/ollamaService.ts (1)
9-9: Import path fix confirmed.createRetryLogic import now points to ../features/shared/config/queryPatterns — correct.
python/src/agents/mcp_client.py (4)
101-107: Good: raise MCPToolError with code/data (addresses prior feedback).You’re correctly surfacing JSON-RPC error.code and error.data via a typed exception; this preserves protocol semantics.
91-97: LGTM: unique JSON-RPC IDs and relying on httpx json= for Content-Type.Using uuid4() for request correlation and letting httpx set headers is solid and reduces header bugs.
186-199: LGTM: graceful global shutdown helper.Cleanly closing the AsyncClient and clearing the singleton is correct and production‑friendly.
108-108: Don’t silently default missing 'result'; treat as protocol error.Defaulting to {} hides malformed JSON-RPC responses. Raise an MCPError if "result" is absent (allowing None as a valid value).
- return result.get("result", {}) + if "result" not in result: + raise MCPError( + f"Malformed JSON-RPC response: missing 'result' (keys={list(result.keys())})" + ) + return result["result"]python/src/server/api_routes/ollama_api.py (2)
574-586: Good robustness for stored models load path.Handling both JSON string and dict with explicit 500 on corruption is correct and aligns with “never accept corrupted data”.
166-166: LGTM on logging improvements.Using exc_info=True consistently will preserve stack traces and improve troubleshooting.
Also applies to: 244-244, 294-294, 386-386, 415-415, 543-543, 638-638, 664-664, 1253-1253, 1403-1403
- Add SSRF protection to validate_instance and test_capabilities endpoints - Fix async execution with asyncio.create_task for cache warming - Implement thread-safe MCP client creation with proper locking - Enhance FormData detection for cross-environment compatibility - Remove unnecessary Content-Type headers from GET requests - Add default timeouts and jitter to retry logic - Fix MyPy type hints for optional parameters - Use database upsert instead of update for persistence - Standardize on UTC timestamps throughout - Remove fabricated model data, use real API data only Resolves 20 CodeRabbit comments (3 actionable + 17 nitpicks) through parallel agent coordination addressing security, concurrency, type safety, and API client improvements across backend and frontend services.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (12)
archon-ui-main/src/features/shared/api/apiClient.ts (2)
76-87: Avoid converting Headers to a plain object (preserve multi-value/case-insensitive semantics)Passing a plain Record loses potential multi-value headers and relies on object semantics. Prefer passing the Headers instance directly.
Apply this diff:
-const headers: Record<string, string> = Object.fromEntries(headersObj.entries()); ... - headers, + headers: headersObj,
116-130: Handle binary responses (octet-stream, PDF, images) instead of coercing to textNon-JSON content may be binary; returning text can corrupt data. Add a Blob branch for common binary content types.
// Check content type before parsing as JSON const contentType = response.headers.get("content-type")?.toLowerCase() ?? ""; if (contentType.includes("application/json") || contentType.includes("+json")) { // Parse JSON response const result = await response.json(); if (result && typeof result === "object" && "error" in result && result.error) { throw new APIServiceError(result.error as string, "API_ERROR", response.status); } return result as T; } -// Handle non-JSON or empty body responses -const text = await response.text().catch(() => ""); -return text ? (text as unknown as T) : (undefined as T); +// Handle non-JSON responses (prefer Blob for binary) +if ( + contentType.startsWith("application/octet-stream") || + contentType.startsWith("application/pdf") || + contentType.startsWith("image/") || + contentType.startsWith("application/zip") +){ + const blob = await response.blob().catch(() => null); + return (blob ?? undefined) as unknown as T; +} +// Fallback to text for other content types +const text = await response.text().catch(() => ""); +return text ? (text as unknown as T) : (undefined as T);python/src/server/api_routes/ollama_api.py (3)
105-112: Type the optional BackgroundTasks parameter explicitlyBackgroundTasks is optional here; annotate its type accordingly.
- background_tasks: BackgroundTasks = None + background_tasks: BackgroundTasks | None = None
628-631: Use timezone-aware UTC consistentlyElsewhere you use datetime.now(UTC).isoformat(); here it falls back to naive datetime.utcnow().
- last_updated=model.get('last_updated', datetime.utcnow().isoformat()), + last_updated=model.get('last_updated', datetime.now(UTC).isoformat()),
325-327: Avoid calling a private method on embedding_routerAccessing _calculate_performance_score ties you to an internal API. Expose a public method on embedding_router and call that instead.
python/src/agents/mcp_client.py (7)
102-108: Prefer JSON-RPC “message” (spec) over non-standard “error” fieldUse message first, then fall back to any non-standard field.
- error_msg = error.get("error") or error.get("message", "Unknown error") + error_msg = (error.get("message") or error.get("error") or "Unknown error")
97-101: Treat 3xx redirects as transport errors for JSON-RPCJSON-RPC endpoints should not redirect. Otherwise we may try to parse HTML as JSON.
- response = await self.client.post(f"{self.mcp_url}/rpc", json=request_data) + response = await self.client.post(f"{self.mcp_url}/rpc", json=request_data) + if 300 <= response.status_code < 400: + location = response.headers.get("location") + logger.error( + f"Unexpected redirect calling MCP tool {tool_name} | " + f"status={response.status_code} | request_id={request_id} | location={location}" + ) + raise MCPTransportError( + f"Unexpected redirect from MCP server (location={location})", + status_code=response.status_code, + )
100-101: Handle invalid JSON responses explicitly (raise MCPTransportError)Avoid routing JSON decoding failures through the generic handler; include status/body context.
- result = response.json() + try: + result = response.json() + except json.JSONDecodeError as e: + body_snippet = response.text[:500] + logger.exception( + f"Invalid JSON from MCP server | status={response.status_code} " + f"| request_id={request_id} | body_snippet={body_snippet}" + ) + raise MCPTransportError("Invalid JSON from MCP server", status_code=response.status_code) from e
93-98: Add X-Request-ID header for cross-service correlationPropagate the same request_id to server logs.
- response = await self.client.post(f"{self.mcp_url}/rpc", json=request_data) + response = await self.client.post( + f"{self.mcp_url}/rpc", + json=request_data, + headers={"X-Request-ID": request_id}, + )
132-169: Return consistent JSON strings from convenience methodsstr(result) yields Python repr for lists/booleans (not JSON). Prefer JSON for non-string values.
- return json.dumps(result) if isinstance(result, dict) else str(result) + return result if isinstance(result, str) else json.dumps(result) @@ - return json.dumps(result) if isinstance(result, dict) else str(result) + return result if isinstance(result, str) else json.dumps(result) @@ - return json.dumps(result) if isinstance(result, dict) else str(result) + return result if isinstance(result, str) else json.dumps(result) @@ - return json.dumps(result) if isinstance(result, dict) else str(result) + return result if isinstance(result, str) else json.dumps(result) @@ - return json.dumps(result) if isinstance(result, dict) else str(result) + return result if isinstance(result, str) else json.dumps(result) @@ - return json.dumps(result) if isinstance(result, dict) else str(result) + return result if isinstance(result, str) else json.dumps(result)
180-181: Doc nit: it’s coroutine-safe on a single event loop, not “thread-safe”asyncio.Lock isn’t cross-thread/loop safe; clarify to avoid misuse.
- Thread-safe implementation using double-checked locking pattern. + Coroutine-safe (single event loop) implementation using double-checked locking pattern.
212-217: Reset the global lock on shutdown (avoid cross-loop reuse issues in tests/tools)Prevents reusing an asyncio.Lock bound to a different loop.
- global _mcp_client + global _mcp_client, _mcp_client_lock @@ await _mcp_client.close() _mcp_client = None + _mcp_client_lock = None logger.info("Global MCP client shutdown completed")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
archon-ui-main/src/features/shared/api/apiClient.ts(3 hunks)archon-ui-main/src/services/ollamaService.ts(14 hunks)python/src/agents/mcp_client.py(6 hunks)python/src/server/api_routes/ollama_api.py(42 hunks)
🧰 Additional context used
📓 Path-based instructions (7)
archon-ui-main/src/features/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UIUse Biome in features: 120 character line length, double quotes, and trailing commas
Files:
archon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Disallow implicit any in TypeScript
archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types
Files:
archon-ui-main/src/features/shared/api/apiClient.tsarchon-ui-main/src/services/ollamaService.ts
archon-ui-main/src/**/*.{ts,tsx,py}
📄 CodeRabbit inference engine (CLAUDE.md)
In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)
Files:
archon-ui-main/src/features/shared/api/apiClient.tsarchon-ui-main/src/services/ollamaService.ts
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
python/src/server/api_routes/ollama_api.py
python/src/server/api_routes/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Place HTTP route handlers in the API routes directory
Files:
python/src/server/api_routes/ollama_api.py
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
python/src/server/api_routes/ollama_api.pypython/src/agents/mcp_client.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
python/src/server/api_routes/ollama_api.pypython/src/agents/mcp_client.py
🧬 Code graph analysis (3)
archon-ui-main/src/features/shared/api/apiClient.ts (1)
archon-ui-main/src/features/shared/types/errors.ts (1)
APIServiceError(19-28)
python/src/server/api_routes/ollama_api.py (2)
python/src/server/utils/progress/progress_tracker.py (1)
error(196-219)python/src/server/services/ollama/model_discovery_service.py (2)
_test_function_calling_capability(817-857)_test_structured_output_capability(859-900)
archon-ui-main/src/services/ollamaService.ts (1)
archon-ui-main/src/features/shared/config/queryPatterns.ts (1)
createRetryLogic(63-75)
🔇 Additional comments (9)
archon-ui-main/src/features/shared/api/apiClient.ts (1)
54-57: FormData detection and header handling: good changeLetting the browser set multipart boundaries by omitting Content-Type is correct. The guard for environments without FormData is also appropriate.
archon-ui-main/src/services/ollamaService.ts (3)
9-9: Import path fix looks correctImporting createRetryLogic from features/shared/config/queryPatterns matches the referenced export.
158-183: Timeout and network error mapping: solid improvementsCatching AbortError and normalizing error messages improves UX across callers.
454-462: Retry/backoff logic: appropriate usage of smart retry with jitterExponential backoff with jitter and createRetryLogic gating retries by status is a good pattern here.
python/src/server/api_routes/ollama_api.py (2)
32-48: SSRF guard: also block unspecified addresses (0.0.0.0/::)Unspecified addresses should be treated as unsafe sources for server-side requests.
Apply this diff:
- if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved or ip.is_multicast: + if ( + ip.is_private + or ip.is_loopback + or ip.is_link_local + or ip.is_reserved + or ip.is_multicast + or ip.is_unspecified + ): return True
132-135: Require http/https scheme in all SSRF checksCurrent checks allow non-HTTP schemes through (e.g., file://). Require parsed.scheme in ("http","https") everywhere you validate URLs.
Apply these diffs:
- if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname): + if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname): logger.warning(f"Blocked private/invalid URL: {url}") continue- if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname): + if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname): logger.warning(f"Blocked private/invalid URL in health check: {url}") health_results[url] = {- if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname): + if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname): logger.warning(f"Blocked private/invalid URL in validate_instance: {instance_url}") raise HTTPException(status_code=400, detail="URL blocked for security reasons")- if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname): + if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname): logger.warning(f"Blocked private/invalid URL in model discovery: {base_url}") continue- if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname): + if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname): logger.warning(f"Blocked private/invalid URL in cache warming: {url}") continue- if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname): + if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname): logger.warning(f"Blocked private/invalid URL in detailed discovery: {base_url}") continue- if not parsed.scheme or not parsed.hostname or _is_private_host(parsed.hostname): + if parsed.scheme not in ("http", "https") or not parsed.hostname or _is_private_host(parsed.hostname): logger.warning(f"Blocked private/invalid URL in capability testing: {request.instance_url}") raise HTTPException(status_code=400, detail="URL blocked for security reasons")Also applies to: 195-204, 266-269, 486-489, 657-661, 1049-1053, 1303-1307
python/src/agents/mcp_client.py (3)
20-35: Typed MCP error hierarchy is a solid improvementClear base and specialized exceptions (transport/tool) improve diagnosability and caller handling.
109-113: Good fix: fail on missing 'result' (no silent default)This preserves protocol correctness and avoids masking malformed responses.
126-129: Preserve MCPError subclasses (don’t re-wrap them)Let MCPToolError/MCPTransportError pass through unchanged.
- except Exception as e: + except MCPError: + # Preserve typed MCP errors without wrapping + raise + except Exception as e: logger.exception(f"Unexpected error calling MCP tool {tool_name} | request_id={request_id}") raise MCPError(f"Failed to call MCP tool {tool_name}: {str(e)}") from e
## Critical URL Normalization Fix - Fix instanceStatus lookup failures with trailing slash URLs in ollamaService.ts - Add proper fallback logic with normalized URL matching ## Enhanced Security & Protocol Compliance - Add unspecified IP address blocking (0.0.0.0/::) to SSRF protection - Require http/https schemes in all 7 SSRF validation points - Treat JSON-RPC 3xx redirects as transport errors (protocol violation) - Add explicit invalid JSON response handling with proper error context ## Type Safety & Consistency Improvements - Fix BackgroundTasks type annotation: BackgroundTasks | None = None - Use datetime.now(UTC).isoformat() consistently across all timestamps - Enhanced JSON-RPC message field priority (spec-compliant 'message' first) - Return consistent JSON strings instead of Python repr for all response types ## HTTP Client & Error Handling Enhancements - Preserve Headers instance instead of converting to Record (maintain multi-value support) - Add binary response handling for PDFs, images, octet-stream, video, audio - Add X-Request-ID header for cross-service correlation and tracing - Preserve MCPError subclasses without re-wrapping for proper error handling - Reset global asyncio.Lock on shutdown for test isolation safety ## Files Enhanced - ollamaService.ts: URL normalization and retry logic improvements - apiClient.ts: Headers preservation and binary response support - ollama_api.py: Comprehensive SSRF protection and type safety - mcp_client.py: JSON-RPC protocol compliance and error handling Addresses all remaining CodeRabbit feedback (1 actionable + 12 nitpicks) with focus on security, protocol compliance, and cross-environment reliability.
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
python/src/agents/mcp_client.py (2)
195-195: Misleading terminology: "Thread-safe" refers to threading.Lock, not asyncio.Lock.asyncio.Lock is "not thread-safe" and is designed specifically for coroutines within a single event loop. The comment incorrectly claims this is a "thread-safe implementation" when it's actually a coroutine-safe implementation using asyncio.Lock.
Apply this diff to correct the terminology:
- Thread-safe implementation using double-checked locking pattern. + Coroutine-safe implementation using double-checked locking pattern.
188-188: Lock variable initialization may cause issues in multi-event-loop scenarios.The global lock variable is initialized to
Noneand created lazily, which could cause race conditions if multiple event loops from different threads try to initialize it simultaneously.Consider initializing the lock at module level to avoid potential issues:
-_mcp_client_lock: asyncio.Lock | None = None +_mcp_client_lock = asyncio.Lock()Then simplify the
get_mcp_client()function:- # Initialize lock if needed - if _mcp_client_lock is None: - _mcp_client_lock = asyncio.Lock() - # Double-checked locking pattern async with _mcp_client_lock:
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
archon-ui-main/src/features/shared/api/apiClient.ts(3 hunks)archon-ui-main/src/services/ollamaService.ts(14 hunks)python/src/agents/mcp_client.py(6 hunks)python/src/server/api_routes/ollama_api.py(44 hunks)
🧰 Additional context used
📓 Path-based instructions (7)
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
python/src/agents/mcp_client.pypython/src/server/api_routes/ollama_api.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
python/src/agents/mcp_client.pypython/src/server/api_routes/ollama_api.py
archon-ui-main/src/features/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
archon-ui-main/src/features/**/*.{ts,tsx}: Use TanStack Query for all data fetching (no prop drilling); use smart HTTP polling (no WebSockets)
Biome formatting in features: 120-character line length, double quotes, and trailing commas
Apply Tron-inspired glassmorphism styling with Tailwind for feature UIUse Biome in features: 120 character line length, double quotes, and trailing commas
Files:
archon-ui-main/src/features/shared/api/apiClient.ts
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Disallow implicit any in TypeScript
archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types
Files:
archon-ui-main/src/features/shared/api/apiClient.tsarchon-ui-main/src/services/ollamaService.ts
archon-ui-main/src/**/*.{ts,tsx,py}
📄 CodeRabbit inference engine (CLAUDE.md)
In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)
Files:
archon-ui-main/src/features/shared/api/apiClient.tsarchon-ui-main/src/services/ollamaService.ts
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
python/src/server/api_routes/ollama_api.py
python/src/server/api_routes/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Place HTTP route handlers in the API routes directory
Files:
python/src/server/api_routes/ollama_api.py
🧬 Code graph analysis (4)
python/src/agents/mcp_client.py (1)
python/src/server/utils/progress/progress_tracker.py (1)
error(196-219)
archon-ui-main/src/features/shared/api/apiClient.ts (1)
archon-ui-main/src/features/shared/types/errors.ts (1)
APIServiceError(19-28)
python/src/server/api_routes/ollama_api.py (2)
python/src/server/services/ollama/model_discovery_service.py (2)
_test_function_calling_capability(817-857)_test_structured_output_capability(859-900)python/src/server/services/llm_provider_service.py (1)
get_llm_client(313-548)
archon-ui-main/src/services/ollamaService.ts (1)
archon-ui-main/src/features/shared/config/queryPatterns.ts (1)
createRetryLogic(63-75)
🪛 GitHub Actions: Continuous Integration
python/src/server/api_routes/ollama_api.py
[error] 105-105: FastAPIError: Invalid args for response field! Hint: check that fastapi.background.BackgroundTasks | None is a valid Pydantic field type. If you are using a return type annotation that is not a valid Pydantic field (e.g. Union[Response, dict, None]) you can disable generating the response model from the type annotation with the path operation decorator parameter response_model=None.
[error] 105-105: FastAPIError: Invalid args for response field! Hint: check that fastapi.background.BackgroundTasks | None is a valid Pydantic field type. If you are using a return type annotation that is not a valid Pydantic field (e.g. Union[Response, dict, None]) you can disable generating the response model from the type annotation with the path operation decorator parameter response_model=None.
[error] 105-105: FastAPIError: Invalid args for response field! Hint: check that fastapi.background.BackgroundTasks | None is a valid Pydantic field type. If you are using a return type annotation that is not a valid Pydantic field (e.g. Union[Response, dict, None]) you can disable generating the response model from the type annotation with the path operation decorator parameter response_model=None.
🔇 Additional comments (32)
archon-ui-main/src/services/ollamaService.ts (5)
43-43: LGTM! Improved type safety with unknown.The change from
anytounknownfor parameters properties aligns with TypeScript best practices, asunknownis the type-safe counterpart ofanyand forces type checking before operations. This prevents unsafe access to potentially undefined properties and improves code reliability.Also applies to: 58-58
158-183: LGTM! Enhanced error handling with better AbortError detection.The updated
handleApiErrormethod properly detects AbortError instances through both error message content and the error name property. The addition oferrorNamevariable provides more reliable timeout error detection.
207-207: LGTM! Consistent timeout and header improvements.The addition of
Accept: application/jsonheaders and 30-second timeouts viaAbortSignal.timeout(30000)provides better API consistency and prevents hanging requests. The timeout handling integrates well with the enhancedhandleApiErrormethod.Also applies to: 209-209, 246-246, 248-248, 280-280, 312-312, 349-349, 351-351, 376-376
227-227: Good addition of optional AbortSignal parameter.The optional
signalparameter incheckInstanceHealthallows callers to control request timeouts while maintaining backward compatibility with the default 30-second timeout.Also applies to: 248-248
392-473: LGTM! Smart retry implementation with proper normalization.The implementation successfully integrates the
createRetryLogicutility and addresses the URL normalization issue identified in past reviews. Key improvements:
- URL normalization: Lines 405-408 properly handle trailing slash differences between client URLs and backend keys
- Smart retry logic: Uses
createRetryLogicto avoid retrying 4xx client errors while retrying network/5xx errors- Status code annotation: Lines 425-454 intelligently parse error messages to add HTTP status codes for retry decision-making
- Exponential backoff with jitter: Lines 459-462 provide robust retry timing
The retry loop correctly iterates
retryCount + 1times (Line 398) and properly uses the retry logic to determine continuation (Line 458).archon-ui-main/src/features/shared/api/apiClient.ts (3)
37-45: LGTM! Clear documentation of new FormData capabilities.The updated function documentation accurately describes the automatic FormData detection feature and improved functionality.
54-78: LGTM! Smart FormData detection and header normalization.The implementation correctly:
- Detects FormData safely: Lines 55-56 include a guard for environments where FormData might be undefined
- Normalizes headers: Line 61 uses
Headersconstructor to handle various header input types- Handles Content-Type intelligently: Lines 68-74 omit Content-Type for FormData (allowing browser to set multipart boundary) and set
application/jsononly when needed- Preserves caller headers: Lines 64-66 respect existing Accept headers while providing a sensible default
This addresses the PR objective of adding FormData support while maintaining backward compatibility.
117-142: LGTM! Content-type aware response parsing.The enhanced response parsing logic properly:
- JSON detection: Lines 119-126 check for both
application/jsonand+jsonsuffixes and handle API error responses- Binary handling: Lines 128-138 detect various binary content types and return Blob objects
- Fallback handling: Lines 141-142 gracefully handle text responses and empty bodies
This provides robust handling for different response types while maintaining the existing API contract.
python/src/agents/mcp_client.py (11)
20-34: LGTM! Well-designed error hierarchy.The error hierarchy correctly inherits from
Exceptionand provides useful attributes for debugging. The constructors properly format messages with codes when available, which will help with debugging MCP protocol issues.
40-64: LGTM! Robust service discovery with sensible fallback.The initialization logic correctly prioritizes explicit URL configuration, then attempts service discovery, and finally falls back to container-aware defaults. The Docker container detection is appropriate for the deployment model.
65-65: Good type annotation for the HTTP client.The explicit type annotation helps with IDE support and makes the code more maintainable.
92-100: LGTM! Proper JSON-RPC request construction.The use of UUID for request IDs ensures uniqueness for correlation, and the X-Request-ID header follows good practices for distributed tracing. Letting httpx handle the Content-Type for JSON is the right approach.
103-104: LGTM! Appropriate redirect handling for JSON-RPC.JSON-RPC over HTTP should not follow redirects as it breaks the protocol semantics. Treating 3xx as transport errors is correct.
109-112: LGTM! Proper JSON parsing error handling.Explicitly catching
JSONDecodeErrorand wrapping it inMCPTransportErrorwith the status code provides good debugging context.
114-119: LGTM! Comprehensive tool error handling.The error handling correctly extracts all JSON-RPC error fields (
error,message,code,data) and raises the appropriate typed exception. This addresses the previous review feedback effectively.
121-122: LGTM! Proper protocol validation.Checking for the mandatory
resultfield in JSON-RPC success responses is correct protocol behavior. The error message includes the full response for debugging.
126-136: LGTM! Rich HTTP error logging and exception handling.The error handling extracts comprehensive debugging information (status code, body snippet, URL, request ID) and logs it appropriately. Wrapping in
MCPTransportErrorwith status code preserves the transport-level context.
138-143: LGTM! Proper exception type preservation.The code correctly re-raises
MCPErrorsubclasses without wrapping them, preserving the specific error types for callers. This addresses the previous review feedback about preserving typed exceptions.
220-235: LGTM! Proper resource cleanup for global client.The shutdown function correctly closes the HTTP client and resets both global variables. Resetting the lock to
Noneis good for test isolation.python/src/server/api_routes/ollama_api.py (13)
42-42: LGTM! Correctly implements SSRF protection withis_unspecifiedcheck.The function includes
ip.is_unspecifiedto catch addresses like 0.0.0.0 and ::, following the proper pattern for SSRF protection. This addresses the previous review comment about missing unspecified address validation.
132-134: LGTM! SSRF protection correctly validates HTTP/HTTPS schemes.The check properly validates that URLs use only HTTP/HTTPS schemes and blocks private/internal addresses, addressing the previous review comment about accepting non-HTTP schemes.
195-204: LGTM! Proper SSRF protection with appropriate error handling.The function correctly validates URLs and provides structured error responses for blocked URLs, maintaining consistent error handling patterns.
266-268: LGTM! Consistent SSRF protection across validation endpoints.The validation properly blocks private/internal URLs before processing and returns an appropriate HTTP 400 error.
486-488: LGTM! SSRF protection consistently applied in model discovery.The function properly validates URLs and continues processing other instances when blocking unsafe URLs, following the established pattern.
658-660: LGTM! Background cache warming includes proper SSRF validation.The background task correctly validates URLs before processing, maintaining security even in asynchronous operations.
1050-1052: LGTM! Comprehensive SSRF protection in detailed discovery.The function applies consistent URL validation patterns across all discovery endpoints, ensuring security throughout the API.
1304-1306: LGTM! SSRF protection in capability testing endpoint.The endpoint properly validates URLs before performing capability tests, maintaining security for all external requests.
15-15: LGTM! Proper UTC timestamp usage throughout the codebase.The consistent use of
datetime.now(UTC).isoformat()ensures timezone-aware timestamps across all endpoints and database operations. This follows best practices for timestamp handling in distributed systems.Also applies to: 202-202, 242-242, 513-513, 526-526, 537-537, 628-628, 1140-1140, 1166-1166, 1180-1180
168-168: LGTM! Comprehensive error logging with stack traces.The consistent use of
exc_info=Trueensures full stack traces are captured for debugging, following the coding guidelines for Python error handling.Also applies to: 246-246, 302-302, 339-339, 395-395, 423-423, 551-551, 644-644, 670-670, 1237-1237, 1393-1393
583-593: LGTM! Robust JSON handling with proper error handling.The code properly handles both string and dict formats from the database driver, includes comprehensive error handling for corrupted JSON, and follows the coding guidelines by raising exceptions with context rather than returning None.
1106-1106: LGTM! Avoids fabricating embedding dimensions and compatibility data.The code correctly avoids creating fake embedding dimensions and sets compatibility as 'unknown' when real data is unavailable, maintaining data integrity and avoiding misleading information.
Also applies to: 1121-1125
1017-1019: LGTM! Strict JSON validation for structured output testing.The code correctly validates only proper JSON parsing without accepting partial matches, ensuring accurate capability detection.
- Use BackgroundTasks() default instead of BackgroundTasks | None for FastAPI compatibility - Maintains the same functionality while following FastAPI best practices
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
python/src/server/api_routes/ollama_api.py (2)
317-327: Add SSRF validation for analyze_embedding_route_endpoint.This endpoint consumes a user-provided instance_url without the SSRF guard used elsewhere.
Apply this diff:
logger.info(f"Analyzing embedding route for {request.model_name} on {request.instance_url}") + # SSRF protection - require http(s) and block private/internal targets + parsed = urlparse(request.instance_url) + if parsed.scheme not in ('http', 'https') or not parsed.hostname or _is_private_host(parsed.hostname): + logger.warning(f"Blocked private/invalid URL in embedding route analysis: {request.instance_url}") + raise HTTPException(status_code=400, detail="URL blocked for security reasons") + # Get routing decision from the embedding router routing_decision = await embedding_router.route_embedding( model_name=request.model_name, instance_url=request.instance_url, text_content=request.text_sample )
357-359: Validate instance_urls with SSRF checks before routing analysis.This path accepts a list of URLs without the SSRF validation applied in other endpoints.
Apply this diff:
- # Get available routes - routes = await embedding_router.get_available_embedding_routes(instance_urls) + # Validate instance URLs and check for SSRF risks + valid_urls: list[str] = [] + for url in instance_urls: + try: + parsed = urlparse(url.rstrip('/')) + if parsed.scheme not in ('http', 'https') or not parsed.hostname or _is_private_host(parsed.hostname): + logger.warning(f"Blocked private/invalid URL in embedding routes: {url}") + continue + valid_urls.append(url.rstrip('/')) + except Exception as e: + logger.warning(f"Error validating URL {url}: {e}") + continue + if not valid_urls: + raise HTTPException(status_code=400, detail="No valid instance URLs provided") + + # Get available routes for validated URLs only + routes = await embedding_router.get_available_embedding_routes(valid_urls)
🧹 Nitpick comments (4)
python/src/server/api_routes/ollama_api.py (4)
121-135: Remove redundant scheme check; rely on urlparse validation once.You check both startswith('http...') and parsed.scheme in ('http','https'). One is enough.
Apply this diff:
- # Basic URL validation - require http/https schemes - if not url.startswith(('http://', 'https://')): - logger.warning(f"Invalid URL format: {url}") - continue - # SSRF protection - check if URL targets private/internal addresses parsed = urlparse(url) if parsed.scheme not in ('http', 'https') or not parsed.hostname or _is_private_host(parsed.hostname): logger.warning(f"Blocked private/invalid URL: {url}") continue
325-325: Avoid calling a private method on embedding_router.Accessing _calculate_performance_score breaks encapsulation; prefer a public API (e.g., calculate_performance_score) or move logic local.
656-661: LGTM; consider bounded concurrency for warming to improve throughput.Optional: run discover_models across hosts concurrently with a small semaphore to cap parallelism.
Example (outside this hunk):
# outside diff: example approach sem = asyncio.Semaphore(5) async def _warm_one(u): async with sem: await model_discovery_service.discover_models(u) await asyncio.gather(*(_warm_one(u) for u in instance_urls))
1176-1181: Include category for consistency with other upserts.Elsewhere you upsert with category="ollama". Keep it consistent for filtering/maintenance.
Apply this diff:
result = supabase.table("archon_settings").upsert({ "key": "ollama_discovered_models", "value": json.dumps(models_data), "description": "Real Ollama model data from API endpoints", + "category": "ollama", "updated_at": datetime.now(UTC).isoformat() }).execute()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
python/src/server/api_routes/ollama_api.py(44 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
python/src/server/api_routes/ollama_api.py
python/src/server/api_routes/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Place HTTP route handlers in the API routes directory
Files:
python/src/server/api_routes/ollama_api.py
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
python/src/server/api_routes/ollama_api.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
python/src/server/api_routes/ollama_api.py
🧬 Code graph analysis (1)
python/src/server/api_routes/ollama_api.py (3)
python/src/server/utils/progress/progress_tracker.py (1)
error(196-219)python/src/server/services/ollama/model_discovery_service.py (2)
_test_function_calling_capability(817-857)_test_structured_output_capability(859-900)python/src/server/services/llm_provider_service.py (1)
get_llm_client(313-548)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Backend Tests (Python + pytest)
🔇 Additional comments (3)
python/src/server/api_routes/ollama_api.py (3)
32-47: SSRF guard looks good (now blocks unspecified/loopback/private/etc.).Solid use of getaddrinfo + ipaddress checks, including is_unspecified. Treating resolution failures as unsafe is appropriate.
582-594: Robust handling of stored models format and corruption.Gracefully supports string or dict values and fails fast on corruption with clear errors.
151-155: Use FastAPI BackgroundTasks instead of asyncio.create_task for request-scoped background work.Schedule via background_tasks.add_task to run after the response is sent. This was flagged previously.
Apply this diff:
- if background_tasks: - # Use asyncio.create_task for async function execution - asyncio.create_task(_warm_model_cache(valid_urls)) + if background_tasks is not None: + # Schedule cache warming as a FastAPI background task (runs after response) + background_tasks.add_task(_warm_model_cache, valid_urls)
## Critical Security Fixes: ### SSRF Vulnerability coleam00#1 - analyze_embedding_route_endpoint - Add SSRF protection before processing user-provided instance_url - Validate http/https schemes and block private/internal addresses - Prevents attacks via embedding route analysis endpoint ### SSRF Vulnerability coleam00#2 - embedding routes endpoint - Add URL validation loop for instance_urls list parameter - Create validated URL list with proper SSRF filtering - Prevent bulk private network access via embedding routes discovery ### FastAPI Background Tasks Fix - Replace asyncio.create_task() with background_tasks.add_task() - Ensures proper FastAPI request lifecycle management - Background tasks now run after HTTP response is sent ## Security Impact: These fixes prevent Server-Side Request Forgery (SSRF) attacks that could allow attackers to access internal services, cloud metadata, or private network resources via embedding-related endpoints. Addresses CodeRabbit actionable security comments from PR coleam00#723 review.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (5)
python/src/server/api_routes/ollama_api.py (5)
32-47: SSRF guard: good coverage; minor readability/line-length fix.
ipchecks look solid (incl. unspecified). Wrap the long condition per 120‑char guideline.- if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved or ip.is_multicast or ip.is_unspecified: + if ( + ip.is_private + or ip.is_loopback + or ip.is_link_local + or ip.is_reserved + or ip.is_multicast + or ip.is_unspecified + ): return True
331-331: Avoid calling a private method of embedding_router.Use a public wrapper to prevent tight coupling to internals.
- performance_score = embedding_router._calculate_performance_score(routing_decision.dimensions) + performance_score = embedding_router.calculate_performance_score(routing_decision.dimensions)Add this wrapper in embedding_router:
# in services/ollama/embedding_router.py def calculate_performance_score(self, dimensions: int) -> float: return self._calculate_performance_score(dimensions)
349-353: sort_by_performance is currently unused — either remove or honor it.Right now, routes are always performance-sorted by the router. If you keep the flag, re-sort here when False.
Apply this diff:
routes = await embedding_router.get_available_embedding_routes(valid_urls) + # If not sorting by performance, provide a stable alternative ordering + if not sort_by_performance: + routes.sort(key=lambda r: (r.model_name, r.instance_url))Alternatively, remove the query param to avoid a misleading API surface.
Also applies to: 379-381
450-452: force_refresh is not used.Either implement cache-bypass behavior or drop the field to avoid confusion.
Would you like me to wire this through model_discovery_service to bypass caches when True?
Also applies to: 481-483
137-139: Preserve stack traces on caught exceptions (use exc_info=True).Guidelines require full stack traces. Add exc_info=True to these warning logs for better diagnosis.
Example diffs:
- logger.warning(f"Error validating URL {url}: {e}") + logger.warning(f"Error validating URL {url}: {e}", exc_info=True)- logger.warning(f"Health check failed for {instance_url}: {e}") + logger.warning(f"Health check failed for {instance_url}: {e}", exc_info=True)Repeat similarly at the other locations noted.
Also applies to: 215-217, 541-542, 654-654, 1174-1175, 1181-1182
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
python/src/server/api_routes/ollama_api.py(46 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
python/src/server/api_routes/ollama_api.py
python/src/server/api_routes/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Place HTTP route handlers in the API routes directory
Files:
python/src/server/api_routes/ollama_api.py
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
python/src/server/api_routes/ollama_api.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
python/src/server/api_routes/ollama_api.py
🧬 Code graph analysis (1)
python/src/server/api_routes/ollama_api.py (3)
python/src/server/services/ollama/embedding_router.py (1)
get_available_embedding_routes(273-321)python/src/server/services/ollama/model_discovery_service.py (2)
_test_function_calling_capability(817-857)_test_structured_output_capability(859-900)python/src/server/services/llm_provider_service.py (1)
get_llm_client(313-548)
🔇 Additional comments (3)
python/src/server/api_routes/ollama_api.py (3)
121-136: SSRF hardening LGTM across all validation points.Consistent enforcement of http(s) scheme + hostname presence + private/loopback/link-local/reserved/multicast/unspecified IP blocking. Good defense-in-depth.
Also applies to: 193-205, 264-269, 317-322, 363-377, 505-510, 677-682, 1069-1074, 1323-1328
603-615: Robust stored-models parsing LGTM.Good handling of string vs dict payloads with explicit corruption detection and fail-fast behavior.
1038-1040: Good change: require valid JSON for structured output.Dropping heuristic acceptance avoids false positives.
| include_capabilities: bool = Query(True, description="Include capability detection"), | ||
| fetch_details: bool = Query(False, description="Fetch comprehensive model details via /api/show"), | ||
| background_tasks: BackgroundTasks = None | ||
| background_tasks: BackgroundTasks = BackgroundTasks() |
There was a problem hiding this comment.
Let FastAPI inject BackgroundTasks; don’t instantiate it (tasks won’t run otherwise).
Constructing BackgroundTasks() in the signature prevents FastAPI from wiring it into the response lifecycle; added tasks won’t execute. Accept the injected instance and drop the None-guard.
Apply this diff:
- background_tasks: BackgroundTasks = BackgroundTasks()
+ background_tasks: BackgroundTasks- if background_tasks is not None:
- # Schedule cache warming as a FastAPI background task (runs after response)
- background_tasks.add_task(_warm_model_cache, valid_urls)
+ # Schedule cache warming as a FastAPI background task (runs after response)
+ background_tasks.add_task(_warm_model_cache, valid_urls)Also applies to: 152-155
🤖 Prompt for AI Agents
In python/src/server/api_routes/ollama_api.py around lines 110 and again around
lines 152-155, the handler is incorrectly creating BackgroundTasks() in the
function signature which prevents FastAPI from injecting and executing
background tasks; change the parameter to accept BackgroundTasks without
instantiation (remove the = BackgroundTasks() default) and remove any None-guard
logic so the injected BackgroundTasks is used directly when adding tasks. Ensure
function signatures rely on FastAPI to provide BackgroundTasks and that calls to
background_tasks.add_task(...) use the supplied instance.
## Code Quality Improvements Applied: ### 1. SSRF Guard Formatting (Line 42-49) ✅ - Fixed long condition violating 120-character line length guideline - Broke multi-condition check into readable multi-line format with proper indentation ### 2. Private Method Access (Line 331) ✅ - Changed from embedding_router._calculate_performance_score() to public method - Improved encapsulation by avoiding private method access ### 3. Unused sort_by_performance Parameter (Lines 381-383) ✅ - Implemented missing sorting logic when sort_by_performance=False - Added stable alternative ordering by model_name and instance_url ### 4. Unused force_refresh Parameter (Lines 448-450) ✅ - Removed unused force_refresh field from ModelDiscoveryAndStoreRequest - Eliminated misleading API surface that wasn't implemented ### 5. Missing Stack Traces in Exception Logging ✅ - Added exc_info=True to all 7 exception logging locations as required by guidelines - Enhanced debugging capabilities with full stack trace preservation: - Line 138: URL validation errors - Line 216: Health check failures - Line 373: Embedding route URL validation - Line 541: Model discovery failures - Line 654: Stored model parsing errors - Line 1174: Model detail retrieval failures - Line 1181: Instance processing failures ## Additional Improvements: - Removed unused asyncio import for cleaner code - Maintained all SSRF security protections while improving code quality - Enhanced error logging and debugging capabilities All CodeRabbit review comments have been comprehensively addressed while preserving critical security measures and functionality.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (9)
python/src/server/api_routes/ollama_api.py (9)
31-53: Tighten SSRF guard: use specific exceptions and simplify with is_global.
- Catch socket.gaierror explicitly; log other exceptions with stack traces.
- Using not ip.is_global covers private, loopback, link-local, multicast, reserved, and unspecified in one check.
def _is_private_host(host: str) -> bool: @@ - try: - infos = socket.getaddrinfo(host, None) + try: + infos = socket.getaddrinfo(host, None) for _, _, _, _, sockaddr in infos: ip = ipaddress.ip_address(sockaddr[0]) - if ( - ip.is_private - or ip.is_loopback - or ip.is_link_local - or ip.is_reserved - or ip.is_multicast - or ip.is_unspecified - ): + # Block anything that isn't globally routable (covers private, loopback, link-local, reserved, multicast, unspecified) + if not ip.is_global: return True - except Exception: - # If resolution fails, treat as unsafe or log/deny explicitly - return True + except socket.gaierror as e: + logger.warning(f"DNS resolution failed for host {host}: {e}") + return True + except Exception as e: + logger.warning(f"Host validation failed for {host}: {e}", exc_info=True) + return True return False
127-141: Remove redundant URL prefix check; rely on parsed.scheme validation.You check http/https twice. The startswith check is redundant and may miss valid equivalents (e.g., case sensitivity). Keep the parsed validation only.
- # Basic URL validation - require http/https schemes - if not url.startswith(('http://', 'https://')): - logger.warning(f"Invalid URL format: {url}") - continue - # SSRF protection - check if URL targets private/internal addresses parsed = urlparse(url) if parsed.scheme not in ('http', 'https') or not parsed.hostname or _is_private_host(parsed.hostname): logger.warning(f"Blocked private/invalid URL: {url}") continue
221-229: Always set last_checked timestamp on errors for consistent shape.Return an ISO timestamp even on errors to simplify clients.
- health_results[url] = { + health_results[url] = { "is_healthy": False, "response_time_ms": None, "models_available": None, "error_message": str(e), - "last_checked": None + "last_checked": datetime.now(UTC).isoformat() }
289-291: Use explicit None check when collecting supported_dimensions.Truthiness can drop legitimate 0 values (even if unlikely). Be explicit.
- "supported_dimensions": list({m.embedding_dimensions for m in models - if m.embedding_dimensions}) + "supported_dimensions": list({ + m.embedding_dimensions for m in models + if m.embedding_dimensions is not None + })
684-696: Preserve stack traces when cache warming per-URL fails.Aligns with backend guidelines to keep full traces.
- except Exception as e: - logger.warning(f"Failed to warm cache for {url}: {e}") + except Exception as e: + logger.warning(f"Failed to warm cache for {url}: {e}", exc_info=True)
1046-1047: Optional: log JSON decode failures at debug with sample size to reduce noise.You already gate on valid JSON only; consider adding len(content) for triage.
1205-1211: Consider storing JSON natively instead of json.dumps for Supabase (if column is JSONB).If archon_settings.value is JSON/JSONB, let the driver handle serialization to avoid double-encoding and simplify reads.
Would you like me to scan the repo schema to confirm the column type and propose a migration if needed?
369-383: DRY the SSRF+scheme validation across endpoints.The same parsed.scheme/hostname/_is_private_host pattern repeats in multiple handlers. Extract a small helper (e.g., _validate_public_http_url(url) -> str) to centralize logging and normalization.
463-474: Constrain enum-like fields with typing for safer responses.archon_compatibility and model_type read like enums but are plain str. Use Literal or Enum to prevent drift (e.g., 'unknown').
Example outside this hunk:
from enum import Enum class CompatibilityLevel(str, Enum): full = "full" partial = "partial" limited = "limited" unknown = "unknown" # if you need it class ModelType(str, Enum): chat = "chat" embedding = "embedding" multimodal = "multimodal" class StoredModelInfo(BaseModel): ... model_type: ModelType archon_compatibility: CompatibilityLevel
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
python/src/server/api_routes/ollama_api.py(46 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
python/src/server/api_routes/ollama_api.py
python/src/server/api_routes/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Place HTTP route handlers in the API routes directory
Files:
python/src/server/api_routes/ollama_api.py
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
python/src/server/api_routes/ollama_api.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
python/src/server/api_routes/ollama_api.py
🔇 Additional comments (2)
python/src/server/api_routes/ollama_api.py (2)
612-624: Good: strict handling of corrupted stored models.
- Correctly parses string vs dict.
- Raises 500 and preserves stack trace on corruption.
116-116: Let FastAPI inject BackgroundTasks; don’t instantiate or None‑check.Instantiating BackgroundTasks prevents FastAPI from wiring tasks into the response lifecycle; the None-guard is also unnecessary. Use the injected instance directly.
- background_tasks: BackgroundTasks = BackgroundTasks() + background_tasks: BackgroundTasks- if background_tasks is not None: - # Schedule cache warming as a FastAPI background task (runs after response) - background_tasks.add_task(_warm_model_cache, valid_urls) + # Schedule cache warming as a FastAPI background task (runs after response) + background_tasks.add_task(_warm_model_cache, valid_urls)Also applies to: 158-161
|
🔄 This repository is being replaced by a new version of Archon. The original Python/MCP codebase is being archived to the This PR is being closed as part of the migration. Thank you for your contribution! |
Add file attachment support to the web chat UI. Users can attach images, PDFs, markdown, and code files to messages via file picker, drag-and-drop, or clipboard paste. Uploaded files are saved to ~/.archon/artifacts/uploads/ and their paths are injected into the prompt so Claude Code can read them using its built-in tools. - Frontend: file picker button (Paperclip), drag-drop zone, clipboard paste (Ctrl+V for images), preview chips with remove button, 10 MB / 5 file limits - Backend: multipart/form-data parsing in POST /api/conversations/:id/message, file storage with path-traversal-safe names, backwards-compatible JSON path - Core: AttachedFile type, attachedFiles on HandleMessageContext, buildFullPrompt appends ## Attached Files section with disk paths - Chat history: file metadata persisted in message metadata JSON field, attachment pills rendered in MessageBubble for persisted messages Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add file attachment support to the web chat UI. Users can attach images, PDFs, markdown, and code files to messages via file picker, drag-and-drop, or clipboard paste. Uploaded files are saved to ~/.archon/artifacts/uploads/ and their paths are injected into the prompt so Claude Code can read them using its built-in tools. - Frontend: file picker button (Paperclip), drag-drop zone, clipboard paste (Ctrl+V for images), preview chips with remove button, 10 MB / 5 file limits - Backend: multipart/form-data parsing in POST /api/conversations/:id/message, file storage with path-traversal-safe names, backwards-compatible JSON path - Core: AttachedFile type, attachedFiles on HandleMessageContext, buildFullPrompt appends ## Attached Files section with disk paths - Chat history: file metadata persisted in message metadata JSON field, attachment pills rendered in MessageBubble for persisted messages Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add file attachment support to the web chat UI. Users can attach images, PDFs, markdown, and code files to messages via file picker, drag-and-drop, or clipboard paste. Uploaded files are saved to ~/.archon/artifacts/uploads/ and their paths are injected into the prompt so Claude Code can read them using its built-in tools. - Frontend: file picker button (Paperclip), drag-drop zone, clipboard paste (Ctrl+V for images), preview chips with remove button, 10 MB / 5 file limits - Backend: multipart/form-data parsing in POST /api/conversations/:id/message, file storage with path-traversal-safe names, backwards-compatible JSON path - Core: AttachedFile type, attachedFiles on HandleMessageContext, buildFullPrompt appends ## Attached Files section with disk paths - Chat history: file metadata persisted in message metadata JSON field, attachment pills rendered in MessageBubble for persisted messages Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Enhances API service capabilities with FormData support and smart retry logic improvements for better developer experience.
Changes Made
Technical Details
Testing
Files Changed
Benefits
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests