Feature/LLM-Providers-UI-Polished #736
Conversation
- Add requires_max_completion_tokens() function for GPT-5, o1, o3, Grok-3 series
- Add prepare_chat_completion_params() for reasoning model compatibility
- Implement max_tokens → max_completion_tokens conversion for reasoning models
- Add temperature handling for reasoning models (must be 1.0 default)
- Enhanced provider validation and API key security in provider endpoints
- Streamlined retry logic (3→2 attempts) for faster issue detection
- Add failure tracking and circuit breaker analysis for debugging
- Support OpenRouter format detection (openai/gpt-5-nano, openai/o1-mini)
- Improved Grok provider empty response handling with structured fallbacks
- Enhanced contextual embedding with provider-aware model selection

Core provider functionality:
- OpenRouter, Grok, Anthropic provider support with full embedding integration
- Provider-specific model defaults and validation
- Secure API connectivity testing endpoints
- Provider context passing for code generation workflows

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
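A minimal sketch of the reasoning-model parameter handling described above (helper names match the bullet list; the bodies and exact signatures are assumptions, not the PR's actual code):

```python
# Hypothetical sketch of the helpers named in this PR description.
REASONING_MODEL_PREFIXES = ("gpt-5", "o1", "o3", "grok-3")

def requires_max_completion_tokens(model: str) -> bool:
    # Handles OpenRouter-style IDs such as "openai/gpt-5-nano" or "openai/o1-mini"
    bare_name = model.split("/", 1)[-1].lower()
    return bare_name.startswith(REASONING_MODEL_PREFIXES)

def prepare_chat_completion_params(model: str, params: dict) -> dict:
    prepared = dict(params)
    if requires_max_completion_tokens(model):
        # Reasoning models reject max_tokens; they expect max_completion_tokens
        if "max_tokens" in prepared:
            prepared["max_completion_tokens"] = prepared.pop("max_tokens")
        # Reasoning models only accept the default temperature of 1.0
        prepared["temperature"] = 1.0
    return prepared
```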
…oncerns, thoroughly hardening our code
…, need to fix health check, asyncio task errors, and the contextual embeddings error
…es the underlying AsyncOpenAI / httpx.AsyncClient while the loop is still alive, with defensive logging if shutdown happens late (python/src/server/services/llm_provider_service.py:14, python/src/server/services/llm_provider_service.py:520).
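As a rough sketch of that defensive close path (illustrative only; the helper name and exact logging are assumptions, not the code at those line numbers):

```python
import logging

logger = logging.getLogger(__name__)

async def close_client_safely(client: object) -> None:
    # Prefer the async aclose() some SDKs expose; fall back to close()
    closer = getattr(client, "aclose", None) or getattr(client, "close", None)
    if closer is None:
        return
    try:
        result = closer()
        if hasattr(result, "__await__"):  # close() may be synchronous on some clients
            await result
    except RuntimeError as exc:
        # Event loop already shutting down; log defensively instead of raising
        logger.warning("LLM client closed after loop shutdown: %s", exc)
```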
…eparate try/finally blocks; fallback clients now close without logging a spurious "Error creating LLM client" when downstream code raises (python/src/server/services/llm_provider_service.py:335-556).
- Close logic now sanitizes provider names consistently and awaits whichever aclose/close coroutine the SDK exposes, keeping the loop shut down cleanly (python/src/server/services/llm_provider_service.py:530-559).

Robust JSON Parsing
- Added _extract_json_payload to strip code fences / extra text returned by Ollama before json.loads runs, averting the markdown-induced decode errors you saw in logs (python/src/server/services/storage/code_storage_service.py:40-63).
- Swapped the direct parse call for the sanitized payload and emit a debug preview when cleanup alters the content (python/src/server/services/storage/code_storage_service.py:858-864).
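A minimal sketch of what _extract_json_payload does, assuming the fence-stripping behavior described above (the real implementation at code_storage_service.py:40-63 may differ in detail):

```python
import json
import re

def _extract_json_payload(raw: str) -> str:
    """Strip markdown code fences and surrounding prose before json.loads."""
    text = raw.strip()
    fenced = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    # Fall back to the outermost braces when extra prose surrounds the JSON
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        text = text[start : end + 1]
    return text

data = json.loads(_extract_json_payload('Sure!\n```json\n{"ok": true}\n```'))
assert data == {"ok": True}
```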
…k to Ollama (matching existing tests), so unsupported providers still raise the legacy ValueError the suite expects. The fallback now reuses _get_optimal_ollama_instance and rethrows ValueError("OpenAI API key not found and Ollama fallback failed") when it can't connect. Adjusted test_code_extraction_source_id.py to accept the new optional argument on the mocked extractor (and confirm it's None when present).
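An illustrative sketch of that fallback flow (the helper name _get_optimal_ollama_instance and the error string come from the note above; everything else, including _create_client_for_instance, is a hypothetical placeholder):

```python
async def _fallback_to_ollama(self):
    try:
        instance = await self._get_optimal_ollama_instance()
        return await self._create_client_for_instance(instance)  # hypothetical helper
    except (ConnectionError, OSError) as exc:
        # Preserve the legacy error type and message the test suite expects
        raise ValueError("OpenAI API key not found and Ollama fallback failed") from exc
```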
…API key validation to call create_embedding with the provider argument and removed the hard-coded OpenAI fallback (python/src/server/api_routes/knowledge_api.py).
- Broadened embedding provider detection so prefixed OpenRouter/OpenAI model names route through the correct client (python/src/server/services/embeddings/embedding_service.py, python/src/server/services/llm_provider_service.py).
- Removed the duplicate helper definitions from llm_provider_service.py, eliminating the stray docstring that was causing the import-time syntax error.
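The broadened detection likely amounts to stripping the provider prefix before matching, sketched here (the model set is an illustrative assumption, not the actual contents of is_openai_embedding_model):

```python
# Known OpenAI embedding models (illustrative subset)
OPENAI_EMBEDDING_MODELS = {
    "text-embedding-3-small",
    "text-embedding-3-large",
    "text-embedding-ada-002",
}

def is_openai_embedding_model(model: str) -> bool:
    # "openai/text-embedding-3-small" (OpenRouter style) and
    # "text-embedding-3-small" should both route to the OpenAI client.
    return model.removeprefix("openai/") in OPENAI_EMBEDDING_MODELS
```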
…ues and no nitpicks with the updates! What was done:
- Credential service now persists the provider under the uppercase key LLM_PROVIDER, matching the read path (no new EMBEDDING_PROVIDER usage introduced).
- Embedding batch creation stops inserting blank strings, logging failures and skipping invalid items before they ever hit the provider (python/src/server/services/embeddings/embedding_service.py).
- Contextual embedding prompts use real newline characters everywhere, both when constructing the batch prompt and when parsing the model's response (python/src/server/services/embeddings/contextual_embedding_service.py).
- Embedding provider routing already recognizes OpenRouter-prefixed OpenAI models via is_openai_embedding_model; no further change needed there.
- Embedding insertion now skips unsupported vector dimensions instead of forcing them into the 1536-dimension column, and the backoff loop uses await asyncio.sleep so we no longer block the event loop (python/src/server/services/storage/code_storage_service.py).
- RAG settings props were extended to include LLM_INSTANCE_NAME and OLLAMA_EMBEDDING_INSTANCE_NAME, and the debug log no longer prints API-key prefixes (the rest of the TanStack refactor/EMBEDDING_PROVIDER support remains deferred).
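The blank-string filtering and non-blocking backoff described above follow a common pattern, sketched here under assumed names (not the actual embedding_service.py code):

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

async def embed_batch_safely(texts: list[str], embed_fn) -> list[list[float]]:
    # Skip invalid items before they ever reach the provider
    valid = [t for t in texts if t and t.strip()]
    if len(valid) < len(texts):
        logger.warning("Skipping %d blank embedding inputs", len(texts) - len(valid))
    for attempt in range(1, 4):
        try:
            return await embed_fn(valid)
        except TimeoutError:
            # await asyncio.sleep keeps the event loop responsive,
            # unlike time.sleep which would block it
            await asyncio.sleep(2 ** attempt)
    raise RuntimeError(f"Embedding failed after retries for {len(valid)} items")
```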
…models and parse responses whether or not they are JSON. This commit makes Archon's parsing work robustly with OpenRouter automatically, regardless of the model you're using, to ensure proper functionality without breaking any generation capabilities!
… system fully capable of mixing and matching LLM providers (local and non-local) for chat & embeddings. Updated the RAGSettings.tsx UI mainly, along with core functionality.
…tus health checks
Walkthrough
Adds explicit embedding-provider support across frontend and backend: new dependency, per-context chat vs embedding provider/model UI and credential notifications; adapter-based embedding providers (OpenAI/Google); async, lock-guarded crawling orchestration; and threading embedding_provider through extraction, storage, and progress reporting.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
participant UI as RAGSettings (FE)
participant CredFE as CredentialsService (FE)
participant API as Backend API
participant CredBE as CredentialService (BE)
participant Crawl as CrawlingService
participant Extract as CodeExtractionService
participant Store as CodeStorageService
participant EmbSvc as EmbeddingService
participant Adapter as ProviderAdapter
participant Provider as External Provider
UI->>CredFE: Save EMBEDDING_PROVIDER + models
CredFE-->>UI: Dispatch archon:credentials-updated
UI->>API: Start crawl/extract (may include embedding_provider)
API->>Crawl: orchestrate crawl
Crawl->>CredBE: await get_active_provider("embedding")
CredBE-->>Crawl: EMBEDDING_PROVIDER (explicit or default)
Crawl->>Extract: extract_and_store_code_examples(..., embedding_provider)
Extract->>Store: _store_code_examples(..., embedding_provider)
Store->>EmbSvc: create_embeddings_batch(texts,..., embedding_provider)
EmbSvc->>Adapter: instantiate adapter for provider
Adapter->>Provider: create embeddings (HTTP/API)
Provider-->>Adapter: vectors
Adapter-->>EmbSvc: vectors
EmbSvc-->>Store: vectors (persist)
Store-->>Extract: storage result
Extract-->>Crawl: progress/complete
Crawl-->>API: status/complete
API-->>UI: status/results
```

```mermaid
sequenceDiagram
autonumber
participant UI as RAGSettings (FE)
participant Ollama as Ollama Instance
rect rgb(240,248,255)
note right of UI: Context switch: Chat vs Embedding provider/model
UI->>UI: Load per-context provider/model from storage
UI->>Ollama: Poll /api/tags and /api/version (per-context URLs)
Ollama-->>UI: Status & model metrics
UI->>UI: Update status badges, actions, and context UI
    end
```
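The adapter hop in the first diagram (EmbSvc → Adapter → Provider) can be read as a simple registry dispatch; this is a sketch with illustrative class names, not the exact ones in embedding_service.py:

```python
from abc import ABC, abstractmethod

class EmbeddingAdapter(ABC):
    @abstractmethod
    async def create_embeddings(self, texts: list[str], model: str) -> list[list[float]]: ...

class OpenAIEmbeddingAdapter(EmbeddingAdapter):
    async def create_embeddings(self, texts, model):
        ...  # call the OpenAI embeddings API

class GoogleEmbeddingAdapter(EmbeddingAdapter):
    async def create_embeddings(self, texts, model):
        ...  # call Google's embedContent endpoint

def get_adapter(provider: str) -> EmbeddingAdapter:
    # Instantiate the adapter for the resolved EMBEDDING_PROVIDER value
    adapters = {"openai": OpenAIEmbeddingAdapter, "google": GoogleEmbeddingAdapter}
    if provider not in adapters:
        raise ValueError(f"Unsupported embedding provider: {provider}")
    return adapters[provider]()
```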
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
📜 Recent review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📒 Files selected for processing (1)
🧰 Additional context used
📓 Path-based instructions (4)

python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Files:

python/src/server/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Files:

python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Files:

python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Files:

🧬 Code graph analysis (1)
python/src/server/services/embeddings/embedding_service.py (4)
🔇 Additional comments (8)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
python/src/server/services/crawling/code_extraction_service.py (2)
198-205: Summary phase: handle 0..1 vs 0..100; clamp to 20–90%.
Same unit-mismatch risk as above. Normalize and clamp.
Apply this diff:
```diff
 async def summary_progress(data: dict):
     # Scale progress to 20-90% range
-    raw_progress = data.get("progress", data.get("percentage", 0))
-    scaled_progress = 20 + int(raw_progress * 0.7)  # 20-90%
-    data["progress"] = scaled_progress
+    raw = data.get("progress", data.get("percentage", 0))
+    try:
+        raw_num = float(raw)
+    except (TypeError, ValueError):
+        raw_num = 0.0
+    if 0.0 <= raw_num <= 1.0:
+        raw_num *= 100.0
+    # 20-90% with clamping
+    scaled_progress = min(90, max(20, 20 + int(raw_num * 0.7)))
+    data["progress"] = scaled_progress
     await progress_callback(data)
```
217-224: Storage phase: handle 0..1 vs 0..100; clamp to 90–100%.
Normalize and clamp to avoid progress jumping outside target range.
Apply this diff:
```diff
 async def storage_progress(data: dict):
     # Scale progress to 90-100% range
-    raw_progress = data.get("progress", data.get("percentage", 0))
-    scaled_progress = 90 + int(raw_progress * 0.1)  # 90-100%
-    data["progress"] = scaled_progress
+    raw = data.get("progress", data.get("percentage", 0))
+    try:
+        raw_num = float(raw)
+    except (TypeError, ValueError):
+        raw_num = 0.0
+    if 0.0 <= raw_num <= 1.0:
+        raw_num *= 100.0
+    # 90-100% with clamping
+    scaled_progress = min(100, max(90, 90 + int(raw_num * 0.1)))
+    data["progress"] = scaled_progress
     await progress_callback(data)
```

archon-ui-main/src/components/settings/RAGSettings.tsx (3)
861-881: Page-load Ollama checks ignore embedding-only Ollama selection.
This effect only runs when LLM_PROVIDER === 'ollama'. If chat is remote but embedding is Ollama, embedding connectivity is never auto-checked on load.
Apply:
```diff
-    if (!hasRunInitialTestRef.current &&
-        ragSettings.LLM_PROVIDER === 'ollama' &&
-        Object.keys(ragSettings).length > 0) {
+    if (
+      !hasRunInitialTestRef.current &&
+      (ragSettings.LLM_PROVIDER === 'ollama' || embeddingProvider === 'ollama') &&
+      Object.keys(ragSettings).length > 0
+    ) {
```

Also include embeddingProvider in deps:
```diff
-  }, [ragSettings.LLM_PROVIDER, llmInstanceConfig.url, llmInstanceConfig.name,
-      embeddingInstanceConfig.url, embeddingInstanceConfig.name]);
+  }, [ragSettings.LLM_PROVIDER, embeddingProvider, llmInstanceConfig.url, llmInstanceConfig.name,
+      embeddingInstanceConfig.url, embeddingInstanceConfig.name]);
```
896-905: Do not log secrets (even partially).
This console.log prints the first 10 characters of an API key. Remove it to avoid leaking secrets via logs.
Apply:
```diff
-      console.log('🔍 OpenAI status check:', {
-        openAIKey,
-        keyValue: keyValue ? `${keyValue.substring(0, 10)}...` : keyValue,
-        hasValue: !!keyValue,
-        hasOpenAIKey,
-        openAIConnected,
-        isChecking,
-        allCredentials: Object.keys(apiCredentials)
-      });
+      // Intentionally avoid logging API key material.
```
2388-2414: Embedding model defaults tied to LLM provider (bug).
Use EMBEDDING_PROVIDER to derive embedding defaults; otherwise display/placeholder can be wrong when providers differ.
Apply:
```diff
-function getDisplayedEmbeddingModel(ragSettings: any): string {
-  const provider = ragSettings.LLM_PROVIDER || 'openai';
+function getDisplayedEmbeddingModel(ragSettings: any): string {
+  const provider = ragSettings.EMBEDDING_PROVIDER || ragSettings.LLM_PROVIDER || 'openai';
```
🧹 Nitpick comments (11)
python/src/server/services/crawling/code_extraction_service.py (1)
160-166: DRY: Extract a helper to normalize progress values.
Three wrappers (and also the storage callback in _store_code_examples) repeat the same fallback and should accept both 0..1 and 0..100. Centralize to reduce drift.
You can add inside CodeExtractionService:
```python
def _normalize_progress_0_100(self, data: dict[str, Any]) -> float:
    raw = data.get("progress", data.get("percentage", 0))
    try:
        val = float(raw)
    except (TypeError, ValueError):
        return 0.0
    return val * 100.0 if 0.0 <= val <= 1.0 else val
```

Then use it in the three wrappers and in _store_code_examples’ storage_callback, applying appropriate clamping per phase.
Also applies to: 198-205, 217-224
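For example, the summary wrapper inside the same service would then reduce to (a sketch using the helper above, keeping the 20–90% clamp):

```python
async def summary_progress(data: dict):
    raw = self._normalize_progress_0_100(data)
    data["progress"] = min(90, max(20, 20 + int(raw * 0.7)))
    await progress_callback(data)
```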
archon-ui-main/src/services/credentialsService.ts (1)
79-87: Credential update event is a good integration hook.
Nice browser-guard and CustomEvent dispatch. Consider documenting the event name and detail shape in a shared constants/types module for discoverability.
archon-ui-main/src/components/settings/RAGSettings.tsx (9)
118-120: DEFAULT_OLLAMA_URL should live in config.
Extract to a config util/env (e.g., getApiUrl-like) to avoid hardcoding environment-specific hostnames.

349-397: Ollama health detection may check the wrong base URL.
You merge LLM and Embedding URLs and pick the first non-empty, which can test the wrong instance when only one provider is Ollama. Scope detection to the active context.
Apply:
```diff
-  useEffect(() => {
-    const needsDetection = chatProvider === 'ollama' || embeddingProvider === 'ollama';
-    if (!needsDetection) {
-      setOllamaServerStatus('unknown');
-      return;
-    }
-
-    const baseUrl = (
-      ragSettings.LLM_BASE_URL?.trim() ||
-      llmInstanceConfig.url?.trim() ||
-      ragSettings.OLLAMA_EMBEDDING_URL?.trim() ||
-      embeddingInstanceConfig.url?.trim() ||
-      DEFAULT_OLLAMA_URL
-    );
-    const normalizedUrl = baseUrl.replace('/v1', '').replace(/\/$/, '');
+  useEffect(() => {
+    const isOllamaActive =
+      (activeSelection === 'chat' && chatProvider === 'ollama') ||
+      (activeSelection === 'embedding' && embeddingProvider === 'ollama');
+    if (!isOllamaActive) {
+      setOllamaServerStatus('unknown');
+      return;
+    }
+    const baseUrl =
+      activeSelection === 'chat'
+        ? (ragSettings.LLM_BASE_URL?.trim() || llmInstanceConfig.url?.trim() || DEFAULT_OLLAMA_URL)
+        : (ragSettings.OLLAMA_EMBEDDING_URL?.trim() || embeddingInstanceConfig.url?.trim() || DEFAULT_OLLAMA_URL);
+    const normalizedUrl = baseUrl.replace('/v1', '').replace(/\/$/, '');
```
964-984: Provider alert should reflect active context.
Currently tied to LLM provider only; embedding alerts won’t show when embedding is selected.
Apply:
```diff
-  const selectedProviderKey = isProviderKey(ragSettings.LLM_PROVIDER)
-    ? (ragSettings.LLM_PROVIDER as ProviderKey)
-    : undefined;
-  const selectedProviderStatus = selectedProviderKey ? getProviderStatus(selectedProviderKey) : undefined;
+  const selectedProviderKey = activeSelection === 'chat' ? chatProvider : embeddingProvider;
+  const selectedProviderStatus = getProviderStatus(selectedProviderKey);
```
330-336: Listen for credential updates to refresh status immediately.
Hook into archon:credentials-updated to avoid waiting for the interval/manual actions.
Add after this block:
```tsx
useEffect(() => {
  const handler = () => reloadApiCredentials();
  window.addEventListener('archon:credentials-updated', handler as EventListener);
  return () => window.removeEventListener('archon:credentials-updated', handler as EventListener);
}, []);
```
1458-1494: Save persists instance fields; consider refetching provider status.
Optional: after saving, trigger reloadApiCredentials and a one-shot status re-test for current providers.

121-154: Prop types: avoid inline shape and any; reuse RagSettings.
Import RagSettings and type setRagSettings as React.Dispatch<React.SetStateAction<RagSettings>> to prevent drift.
Apply:
```diff
+import type { RagSettings as RagSettingsType } from '../../services/credentialsService';

-interface RAGSettingsProps {
-  ragSettings: {
-    MODEL_CHOICE: string;
-    USE_CONTEXTUAL_EMBEDDINGS: boolean;
-    CONTEXTUAL_EMBEDDINGS_MAX_WORKERS: number;
-    USE_HYBRID_SEARCH: boolean;
-    USE_AGENTIC_RAG: boolean;
-    USE_RERANKING: boolean;
-    LLM_PROVIDER?: string;
-    LLM_BASE_URL?: string;
-    LLM_INSTANCE_NAME?: string;
-    EMBEDDING_MODEL?: string;
-    EMBEDDING_PROVIDER?: string;
-    OLLAMA_EMBEDDING_URL?: string;
-    OLLAMA_EMBEDDING_INSTANCE_NAME?: string;
-    // Crawling Performance Settings
-    CRAWL_BATCH_SIZE?: number;
-    CRAWL_MAX_CONCURRENT?: number;
-    CRAWL_WAIT_STRATEGY?: string;
-    CRAWL_PAGE_TIMEOUT?: number;
-    CRAWL_DELAY_BEFORE_HTML?: number;
-    // Storage Performance Settings
-    DOCUMENT_STORAGE_BATCH_SIZE?: number;
-    EMBEDDING_BATCH_SIZE?: number;
-    DELETE_BATCH_SIZE?: number;
-    ENABLE_PARALLEL_BATCHES?: boolean;
-    // Advanced Settings
-    MEMORY_THRESHOLD_PERCENT?: number;
-    DISPATCHER_CHECK_INTERVAL?: number;
-    CODE_EXTRACTION_BATCH_SIZE?: number;
-    CODE_SUMMARY_MAX_WORKERS?: number;
-  };
-  setRagSettings: (settings: any) => void;
-}
+interface RAGSettingsProps {
+  ragSettings: RagSettingsType;
+  setRagSettings: React.Dispatch<React.SetStateAction<RagSettingsType>>;
+}
```
200-205: Instance config defaults use host.docker.internal.
Non-Docker Linux environments may not resolve this hostname. Consider reading defaults from env/config and showing a hint if unreachable.

503-526: Provider status polling cadence.
60s is reasonable. If this runs across multiple tabs, consider BroadcastChannel to coordinate and reduce duplicate polling.

352-363: Health check URL normalization.
.replace('/v1','') is fine; you may prefer a regex anchored to '/v1/?$' to avoid unintended replacements.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
archon-ui-main/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (6)
archon-ui-main/package.json (1 hunks)
archon-ui-main/src/components/settings/RAGSettings.tsx (21 hunks)
archon-ui-main/src/services/credentialsService.ts (7 hunks)
python/src/server/services/crawling/code_extraction_service.py (3 hunks)
python/src/server/services/credential_service.py (1 hunks)
python/src/server/services/embeddings/embedding_service.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
python/src/server/services/credential_service.py
python/src/server/services/crawling/code_extraction_service.py
python/src/server/services/embeddings/embedding_service.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/services/**/*.py: For batch operations, report both success count and a detailed failure list
On external API calls, retry with exponential backoff and fail with a clear message after retries
Place business logic in service layer modules (API Route → Service → Database pattern)
Files:
python/src/server/services/credential_service.py
python/src/server/services/crawling/code_extraction_service.py
python/src/server/services/embeddings/embedding_service.py
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
python/src/server/services/credential_service.py
python/src/server/services/crawling/code_extraction_service.py
python/src/server/services/embeddings/embedding_service.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
python/src/server/services/credential_service.py
python/src/server/services/crawling/code_extraction_service.py
python/src/server/services/embeddings/embedding_service.py
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Disallow implicit any in TypeScript
archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types
Files:
archon-ui-main/src/services/credentialsService.ts
archon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/**/*.{ts,tsx,py}
📄 CodeRabbit inference engine (CLAUDE.md)
In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)
Files:
archon-ui-main/src/services/credentialsService.ts
archon-ui-main/src/components/settings/RAGSettings.tsx
🧬 Code graph analysis (2)
python/src/server/services/embeddings/embedding_service.py (4)
python/src/server/config/logfire_config.py (1)
safe_span (151-172)
python/src/server/services/llm_provider_service.py (2)
get_embedding_model (600-675)
get_llm_client (313-548)
python/src/server/services/credential_service.py (3)
get_credential (160-177)
get_credential (545-547)
get_active_provider (426-494)
python/src/server/services/embeddings/embedding_exceptions.py (1)
EmbeddingAPIError(86-99)
archon-ui-main/src/components/settings/RAGSettings.tsx (3)
archon-ui-main/src/features/ui/primitives/input.tsx (1)
Input (8-29)
archon-ui-main/src/features/ui/primitives/button.tsx (1)
Button (11-130)
archon-ui-main/src/services/credentialsService.ts (1)
credentialsService(604-604)
🪛 GitHub Actions: Continuous Integration
python/src/server/services/embeddings/embedding_service.py
[error] 304-304: Embedding creation failed: TypeError: object MagicMock can't be used in 'await' expression (likely due to improper mocking of async dependency in tests).
🔇 Additional comments (16)
python/src/server/services/crawling/code_extraction_service.py (1)
160-166: Retain existing extraction_progress scaling
Upstream emits raw progress on a 0–100 scale, so int(raw_progress * 0.2) correctly maps to 0–20; no normalization for 0–1 fractions is needed.

Likely an incorrect or invalid review comment.
archon-ui-main/src/services/credentialsService.ts (5)
26-26: RagSettings: EMBEDDING_PROVIDER addition looks good.
236-237: String-field mapping includes EMBEDDING_PROVIDER.
Good catch; ensures seamless persistence and hydration.

294-297: Notify on updateCredential.
Event emission post-update is correct and sequenced after JSON parsing.

320-323: Notify on createCredential.
Consistent with update flow.

342-343: Notify on deleteCredential.
Good for live UI sync.
archon-ui-main/src/components/settings/RAGSettings.tsx (9)
7-9: Icon imports are pack-scoped.
Good usage of react-icons with scoped imports; preserves tree-shaking.

18-19: Embedding-capable providers list.
Looks correct and constrains the embedding grid appropriately.

133-134: Prop surface extended with EMBEDDING_PROVIDER.
Matches backend interface; good.

177-186: Independent chat vs embedding provider state.
Solid approach with one- and two‑way sync guards.

242-259: Per‑provider model persistence effects.
Looks good; updates only when relevant model changes.
Also applies to: 260-275
398-442: One-/two-way sync guards.
Ref logic prevents loops; good separation for chat vs embedding updates.

985-1043: Auto-retry LLM testing looks solid.
Timeout refs are cleared; suppression avoids toast spam. Good.

1045-1101: Auto-retry Embedding testing looks solid.
Mirrors LLM behavior; cleanup is correct.

545-564: AbortSignal.timeout: cross-browser support.
Safari support can lag. Consider a fallback with manual AbortController timeout.
Please verify in your target browsers. If you see TypeError, swap to a custom AbortController timeout wrapper.
archon-ui-main/package.json (1)
53-53: react-icons@5.5.0 upgrade verified
Build completes without TS/JSX typing errors (IconType unchanged) and pack-scoped imports remain tree-shakable (no unexpected bundle bloat).
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
python/src/server/services/embeddings/embedding_service.py (1)
380-421: Retry Google rate limits with exponential backoff
Batches hitting Google 429s currently fail without retry. Add a branch mirroring the OpenAI path.
```diff
 except openai.RateLimitError as e:
     error_message = str(e)
     if "insufficient_quota" in error_message:
@@
         raise  # Will be caught by outer try
+except EmbeddingRateLimitError as e:
+    retry_count += 1
+    if retry_count < max_retries:
+        wait_time = 2**retry_count
+        search_logger.warning(
+            f"Rate limit (Google) for batch {batch_index}, "
+            f"waiting {wait_time}s before retry {retry_count}/{max_retries}"
+        )
+        await asyncio.sleep(wait_time)
+    else:
+        raise
```
🧹 Nitpick comments (6)
python/src/server/services/embeddings/embedding_service.py (6)
123-131: Plumb dimensions into Google requests
Dimensions are currently ignored by the Google adapter. Pass them through to _fetch_single_embedding so they can be applied (see companion change).
```diff
-        embeddings = await asyncio.gather(
-            *(
-                self._fetch_single_embedding(http_client, google_api_key, model, text)
-                for text in texts
-            )
-        )
+        embeddings = await asyncio.gather(
+            *(
+                self._fetch_single_embedding(http_client, google_api_key, model, text, dimensions)
+                for text in texts
+            )
+        )
```
317-323: Remove unreachable branch
After normalizing to "openai" on Line 318, the subsequent None/empty check cannot trigger.
```diff
-    if not embedding_provider:
-        search_logger.error("No embedding provider configured")
-        raise ValueError("No embedding provider configured. Please set EMBEDDING_PROVIDER environment variable.")
```
324-325: Clarify log source of provider selection
Message may be misleading when provider is overridden via function arg.
```diff
-    search_logger.info(f"Using embedding provider: '{embedding_provider}' (from EMBEDDING_PROVIDER setting)")
+    search_logger.info(f"Using embedding provider: '{embedding_provider}'")
```
343-346: Avoid repeated model lookups inside retry loop
Compute embedding_model once per batch to reduce overhead and stabilize behavior across retries.
```diff
     batch_index = i // batch_size
+    # Resolve once per batch
+    embedding_model = await get_embedding_model(provider=embedding_provider)
@@
-    try:
-        # Create embeddings for this batch
-        embedding_model = await get_embedding_model(provider=embedding_provider)
+    try:
+        # Create embeddings for this batch
```

Also applies to: 368-373
325-341: Skip creating an OpenAI-compatible client when provider is Google
The Google adapter doesn’t use the client; avoid creating/closing it for performance and fewer moving parts.
Example refactor:
- If embedding_provider == "google": don’t enter get_llm_client context; set client = None and proceed.
- Else: use current get_llm_client path.
109-131: Consider Google batch endpoint to reduce HTTP calls
Use models/{model}:batchEmbedContents to embed the whole batch in one request; it returns embeddings aligned to inputs and supports outputDimensionality.
I can provide a concrete patch if you’d like to switch to batchEmbedContents.
Also applies to: 169-175
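A sketch of what that batch call could look like (endpoint and payload shape per Google's generativelanguage API; verify against the current docs before adopting):

```python
import httpx

async def google_batch_embed(api_key: str, model: str, texts: list[str]) -> list[list[float]]:
    url = f"https://generativelanguage.googleapis.com/v1beta/models/{model}:batchEmbedContents"
    payload = {
        "requests": [
            {"model": f"models/{model}", "content": {"parts": [{"text": t}]}}
            for t in texts
        ]
    }
    async with httpx.AsyncClient(timeout=30) as client:
        resp = await client.post(url, json=payload, headers={"x-goog-api-key": api_key})
        resp.raise_for_status()
        # Response embeddings are aligned with the request order
        return [item["values"] for item in resp.json()["embeddings"]]
```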
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
python/src/server/services/embeddings/embedding_service.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
python/src/server/services/embeddings/embedding_service.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/services/**/*.py: For batch operations, report both success count and a detailed failure list
On external API calls, retry with exponential backoff and fail with a clear message after retries
Place business logic in service layer modules (API Route → Service → Database pattern)
Files:
python/src/server/services/embeddings/embedding_service.py
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
python/src/server/services/embeddings/embedding_service.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
python/src/server/services/embeddings/embedding_service.py
🧠 Learnings (1)
📓 Common learnings
Learnt from: zaksnet
PR: coleam00/Archon#473
File: archon-ui-main/src/components/agent-chat/ArchonChatPanel.tsx:26-31
Timestamp: 2025-09-10T11:35:20.655Z
Learning: When reviewing large refactoring PRs like "Clean multi provider feature", focus feedback on the core objectives (provider management, API keys, model configuration, service registry) rather than unrelated UI improvements in components that were only incidentally modified during the refactor.
🧬 Code graph analysis (1)
python/src/server/services/embeddings/embedding_service.py (3)
python/src/server/services/llm_provider_service.py (2)
get_embedding_model (600-675)
get_llm_client (313-548)
python/src/server/services/credential_service.py (4)
get_credential (160-177)
get_credential (545-547)
get_active_provider (426-494)
get_credentials_by_category (323-369)
python/src/server/services/embeddings/embedding_exceptions.py (1)
EmbeddingAPIError(86-99)
🔇 Additional comments (1)
python/src/server/services/embeddings/embedding_service.py (1)
149-165: Normalize Google model IDs and support output dimensionality
If users enter "models/text-embedding-004", current code produces models/models/... (404). Also, when dimensions is provided, Google supports outputDimensionality — pass it through.
Apply this diff:
```diff
-    async def _fetch_single_embedding(
+    async def _fetch_single_embedding(
         self,
         http_client: httpx.AsyncClient,
         api_key: str,
         model: str,
-        text: str,
-    ) -> list[float]:
-        url = f"https://generativelanguage.googleapis.com/v1beta/models/{model}:embedContent"
+        text: str,
+        dimensions: int | None = None,
+    ) -> list[float]:
+        # Normalize model for URL/payload compatibility
+        if model.startswith("models/"):
+            url_model = model[len("models/") :]
+            payload_model = model
+        else:
+            url_model = model
+            payload_model = f"models/{model}"
+        url = f"https://generativelanguage.googleapis.com/v1beta/models/{url_model}:embedContent"
         headers = {
             "x-goog-api-key": api_key,
             "Content-Type": "application/json",
         }
-        payload = {
-            "model": f"models/{model}",
+        payload = {
+            "model": payload_model,
             "content": {"parts": [{"text": text}]},
         }
+        if dimensions is not None:
+            # Google-specific: honor requested dimensionality when supported by the model
+            payload["outputDimensionality"] = dimensions
```
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
archon-ui-main/src/components/settings/RAGSettings.tsx (3)
279-301: Don’t store raw API keys in React state
You’re persisting decrypted API key values into component state (apiCredentials). This is a security posture issue: values can be inspected via devtools and linger in memory. You only need to know “has value” for UI; switch to the has_value boolean from checkCredentialStatus.
Minimal change (store booleans instead of secrets):
```diff
-      const credentials: {[key: string]: string} = {};
+      const credentials: {[key: string]: boolean} = {};
@@
-          if (result.has_value && result.value && result.value.trim().length > 0) {
-            credentials[key] = result.value;
-          }
+          credentials[key] = !!result.has_value;
```

Then update getProviderStatus to use the boolean:
```diff
-    const openAIKey = Object.keys(apiCredentials).find(key => key.toUpperCase() === 'OPENAI_API_KEY');
-    const keyValue = openAIKey ? apiCredentials[openAIKey] : undefined;
-    const hasOpenAIKey = openAIKey && keyValue && keyValue.trim().length > 0 && !keyValue.includes('[ENCRYPTED]');
+    const openAIKey = Object.keys(apiCredentials).find(k => k.toUpperCase() === 'OPENAI_API_KEY');
+    const hasOpenAIKey = openAIKey ? !!apiCredentials[openAIKey] : false;
```

Apply the same pattern for Google.
731-807: Ollama metrics may misattribute models due to URL mismatch (/v1, trailing slash)
You filter models by exact URL equality against llmInstanceConfig.url/embeddingInstanceConfig.url, but the backend typically returns base URLs without /v1. This can cause zero counts. Normalize both sides and also send normalized instance_urls in the request.
Apply:
```diff
-    const instanceUrls = [];
-    if (llmInstanceConfig.url) instanceUrls.push(llmInstanceConfig.url);
-    if (embeddingInstanceConfig.url && embeddingInstanceConfig.url !== llmInstanceConfig.url) {
-      instanceUrls.push(embeddingInstanceConfig.url);
-    }
+    const instanceUrls: string[] = [];
+    const llmUrlBase = normalizeBaseUrl(llmInstanceConfig.url);
+    const embUrlBase = normalizeBaseUrl(embeddingInstanceConfig.url);
+    if (llmUrlBase) instanceUrls.push(llmUrlBase);
+    if (embUrlBase && embUrlBase !== llmUrlBase) instanceUrls.push(embUrlBase);
@@
-    const llmChatModels = allChatModels.filter((model: any) =>
-      model.instance_url === llmInstanceConfig.url
-    );
-    const llmEmbeddingModels = allEmbeddingModels.filter((model: any) =>
-      model.instance_url === llmInstanceConfig.url
-    );
+    const llmChatModels = allChatModels.filter((model: any) =>
+      normalizeBaseUrl(model.instance_url) === llmUrlBase
+    );
+    const llmEmbeddingModels = allEmbeddingModels.filter((model: any) =>
+      normalizeBaseUrl(model.instance_url) === llmUrlBase
+    );
@@
-    const embChatModels = allChatModels.filter((model: any) =>
-      model.instance_url === embeddingInstanceConfig.url
-    );
-    const embEmbeddingModels = allEmbeddingModels.filter((model: any) =>
-      model.instance_url === embeddingInstanceConfig.url
-    );
+    const embChatModels = allChatModels.filter((model: any) =>
+      normalizeBaseUrl(model.instance_url) === embUrlBase
+    );
+    const embEmbeddingModels = allEmbeddingModels.filter((model: any) =>
+      normalizeBaseUrl(model.instance_url) === embUrlBase
+    );
```

Add normalizeBaseUrl as suggested above.
861-884: Prevent repeated metrics fetches on every keystroke
Effect depends on instance URLs/names, and the condition embeddingProvider === 'ollama' is always true when selected, causing repeated fetches while typing or on minor state changes. Guard by comparing to the last snapshot and/or narrow deps.
Apply:
```diff
-  React.useEffect(() => {
-    if (
-      ragSettings.LLM_PROVIDER === 'ollama' || embeddingProvider === 'ollama'
-    ) {
-      const currentProvider = ragSettings.LLM_PROVIDER;
-      const lastProvider = lastMetricsFetchRef.current.provider;
-
-      // Only fetch if provider changed to Ollama (scenario 1: user clicks on Ollama Provider)
-      if (currentProvider !== lastProvider || embeddingProvider === 'ollama') {
-        lastMetricsFetchRef.current = {
-          provider: currentProvider,
-          llmUrl: llmInstanceConfig.url,
-          embUrl: embeddingInstanceConfig.url,
-          llmOnline: llmStatus.online,
-          embOnline: embeddingStatus.online
-        };
-        console.log('🔄 Fetching Ollama metrics - Provider selected');
-        fetchOllamaMetrics();
-      }
-    }
-  }, [ragSettings.LLM_PROVIDER, embeddingProvider, llmInstanceConfig.url, llmInstanceConfig.name,
-      embeddingInstanceConfig.url, embeddingInstanceConfig.name]); // Include embeddingProvider in deps
+  React.useEffect(() => {
+    const current = {
+      provider: ragSettings.LLM_PROVIDER,
+      embProvider: embeddingProvider,
+      llmUrl: llmInstanceConfig.url,
+      embUrl: embeddingInstanceConfig.url,
+      llmOnline: llmStatus.online,
+      embOnline: embeddingStatus.online,
+    };
+    const last = lastMetricsFetchRef.current;
+    const meaningfulChange =
+      current.provider !== last.provider ||
+      current.embProvider !== (last as any).embProvider ||
+      current.llmUrl !== last.llmUrl ||
+      current.embUrl !== last.embUrl ||
+      current.llmOnline !== last.llmOnline ||
+      current.embOnline !== last.embOnline;
+
+    if ((current.provider === 'ollama' || current.embProvider === 'ollama') && meaningfulChange) {
+      lastMetricsFetchRef.current = current as any;
+      console.log('🔄 Fetching Ollama metrics - state changed');
+      fetchOllamaMetrics();
+    }
+  }, [ragSettings.LLM_PROVIDER, embeddingProvider, llmStatus.online, embeddingStatus.online]);
```

python/src/server/services/crawling/code_extraction_service.py (2)
162-166: Normalize and clamp extraction-phase progress (0–20%)
Extraction phase lacks the robust parsing/clamping you added in later phases. Strings/0–1 inputs can break or under-report (int(0.5 * 0.2) = 0). Align with summary/storage handling.
Apply:
```diff
-    raw_progress = data.get("progress", data.get("percentage", 0))
-    scaled_progress = int(raw_progress * 0.2)  # 0-20%
+    raw = data.get("progress", data.get("percentage", 0))
+    try:
+        raw_num = float(raw)
+    except (TypeError, ValueError):
+        raw_num = 0.0
+    if 0.0 <= raw_num <= 1.0:
+        raw_num *= 100.0
+    # 0–20% with clamping
+    scaled_progress = min(20, max(0, int(raw_num * 0.2)))
     data["progress"] = scaled_progress
     await progress_callback(data)
```
887-898: Fix TypeError when min_length is None in indented-block path
This branch compares an int to None: len("...") >= min_length. With default min_length=None (common callsite), this crashes.
Apply:
```diff
-    elif current_block and len("\n".join(current_block)) >= min_length:
-        # End of indented block, check if it's code
-        code_content = "\n".join(current_block)
+    elif current_block:
+        block_text = "\n".join(current_block)
+        threshold = min_length if min_length is not None else await self._get_min_code_length()
+        if len(block_text) < threshold:
+            # Not enough content; reset and continue
+            current_block = []
+            current_indent = None
+            continue
+        # End of indented block, check if it's code
+        code_content = block_text
```

python/src/server/services/embeddings/embedding_service.py (1)
371-427: Retry on provider-agnostic rate limits
The retry loop only handles openai.RateLimitError. Add handling for EmbeddingRateLimitError to back off on Google (and other) 429s.
```diff
 while retry_count < max_retries:
     try:
@@
-    except openai.RateLimitError as e:
+    except openai.RateLimitError as e:
         error_message = str(e)
@@
         else:
             raise  # Will be caught by outer try
+    except EmbeddingRateLimitError as e:
+        retry_count += 1
+        if retry_count < max_retries:
+            wait_time = 2**retry_count
+            search_logger.warning(
+                f"Embedding rate limit for batch {batch_index}, "
+                f"waiting {wait_time}s before retry {retry_count}/{max_retries}"
+            )
+            await asyncio.sleep(wait_time)
+        else:
+            raise
```
🧹 Nitpick comments (15)
archon-ui-main/src/services/credentialsService.ts (3)
276-287: Avoid N events: add optional notify suppression to updateCredential
When updateRagSettings saves many keys, the current per-key notifications can cause UI thrash. Add an optional flag to suppress per-call notify so we can batch.
Apply:
```diff
-  async updateCredential(credential: Credential): Promise<Credential> {
+  async updateCredential(
+    credential: Credential,
+    options?: { suppressNotify?: boolean }
+  ): Promise<Credential> {
```
294-297: Avoid N events: conditionally dispatch notify inside updateCredential
Gate notify by the new suppress flag. This enables batching from updateRagSettings.
```diff
-    const updated = await response.json();
-    this.notifyCredentialUpdate([credential.key]);
-    return updated;
+    const updated = await response.json();
+    if (!options?.suppressNotify) {
+      this.notifyCredentialUpdate([credential.key]);
+    }
+    return updated;
```
348-367: Batch “credentials-updated” notifications when saving RAG settings
Sending one event per key can trigger redundant reloads in listeners. Batch emit once with all updated keys.
```diff
   async updateRagSettings(settings: RagSettings): Promise<void> {
-    const promises = [];
+    const promises = [];
+    const updatedKeys: string[] = [];
@@
-        promises.push(
-          this.updateCredential({
-            key,
-            value: value.toString(),
-            is_encrypted: false,
-            category: "rag_strategy",
-          }),
-        );
+        updatedKeys.push(key);
+        promises.push(
+          this.updateCredential(
+            {
+              key,
+              value: value.toString(),
+              is_encrypted: false,
+              category: "rag_strategy",
+            },
+            { suppressNotify: true }
+          ),
+        );
@@
-    await Promise.all(promises);
+    await Promise.all(promises);
+    this.notifyCredentialUpdate(updatedKeys);
```

archon-ui-main/src/components/settings/RAGSettings.tsx (8)
118-120: DEFAULT_OLLAMA_URL portability
host.docker.internal may not resolve on all Linux hosts. Consider a fallback to http://localhost:11434/v1 when the health check fails, or make this configurable via env.

330-347: Reload credentials: add event listener for batched updates
You emit archon:credentials-updated in the service. Listen here to refresh immediately instead of waiting for the 30s interval.
Add:
```tsx
useEffect(() => {
  const handler = () => {
    reloadApiCredentials();
  };
  window.addEventListener('archon:credentials-updated', handler as EventListener);
  return () => window.removeEventListener('archon:credentials-updated', handler as EventListener);
  // eslint-disable-next-line react-hooks/exhaustive-deps
}, []);
```
349-396: Normalize Ollama base URL once
You normalize here, but similar logic is duplicated elsewhere. Extract a normalizeBaseUrl helper and reuse to avoid mismatches with trailing slashes or /v1 suffix.
Helper to add (outside the component):
```ts
function normalizeBaseUrl(url?: string): string {
  if (!url) return '';
  return url.replace('/v1', '').replace(/\/$/, '');
}
```
982-1039: Deferred retry loops — LGTM with minor nit
The 5s retry with cancellation handling is solid. Consider using the same normalizeBaseUrl helper here for baseUrl construction to keep behavior consistent.

1119-1147: Page-load connection checks — LGTM
Staggering tests and suppressing toasts avoids noise. Minor: use normalizeBaseUrl for DEFAULT_OLLAMA_URL as well for consistent logging and requests.

2354-2410: Helper typings use any — prefer RagSettings type
getDisplayedChatModel/getDisplayedEmbeddingModel accept any. Import and use the RagSettings type (or RAGSettingsProps['ragSettings']) for stronger typing and to honor strict TS settings.
Example:
```ts
import type { RagSettings as RagSettingsType } from '../../services/credentialsService';

function getDisplayedChatModel(ragSettings: RagSettingsType): string { ... }
function getDisplayedEmbeddingModel(ragSettings: RagSettingsType): string { ... }
```
545-564: AbortSignal.timeout browser support
AbortSignal.timeout isn’t supported in all browsers. Consider a fallback with AbortController + manual setTimeout to avoid runtime errors in older environments.

503-526: Provider connection testing cadence
Testing all providers every 60s is fine. The “checked recently” guard references providerConnectionStatus captured at effect creation, so it won’t truly skip. Either include providerConnectionStatus in deps or remove the guard (since the interval is already 60s).
python/src/server/services/crawling/code_extraction_service.py (1)
412-423: Docstring arg mismatch: remove nonexistent parameter
Docstring mentions min_length but the function signature doesn’t accept it. Update the docstring for accuracy.
python/src/server/services/credential_service.py (1)
533-543: Honor service_type when setting provider
set_active_provider always writes LLM_PROVIDER. Support EMBEDDING_PROVIDER when service_type == "embedding".
```diff
-    async def set_active_provider(self, provider: str, service_type: str = "llm") -> bool:
+    async def set_active_provider(self, provider: str, service_type: str = "llm") -> bool:
         """Set the active provider for a service type."""
         try:
-            # For now, we'll update the RAG strategy settings
-            return await self.set_credential(
-                "LLM_PROVIDER",
-                provider,
-                category="rag_strategy",
-                description=f"Active {service_type} provider",
-            )
+            key = "EMBEDDING_PROVIDER" if service_type == "embedding" else "LLM_PROVIDER"
+            return await self.set_credential(
+                key,
+                provider,
+                category="rag_strategy",
+                description=f"Active {service_type} provider",
+            )
         except Exception as e:
             logger.error(f"Error setting active provider {provider} for {service_type}: {e}")
             return False
```

python/src/server/services/embeddings/embedding_service.py (2)
109-131: Support Google outputDimensionality with provider limits
Honor requested dimensions for Google by clamping to [1, 768] and setting outputDimensionality.
```diff
 async def create_embeddings(
     self,
     texts: list[str],
     model: str,
     dimensions: int | None = None,
 ) -> list[list[float]]:
     try:
-        if dimensions is not None:
-            _ = dimensions  # Maintains adapter signature; Google controls dimensions server-side.
+        # Google supports optional outputDimensionality with a max of 768
+        out_dim = None
+        if dimensions is not None:
+            try:
+                d = int(dimensions)
+                out_dim = 1 if d < 1 else 768 if d > 768 else d
+            except Exception:
+                out_dim = None
@@
-        embeddings = await asyncio.gather(
-            *(
-                self._fetch_single_embedding(http_client, google_api_key, model, text)
-                for text in texts
-            )
-        )
+        embeddings = await asyncio.gather(
+            *(
+                self._fetch_single_embedding(
+                    http_client, google_api_key, model, text, out_dim
+                )
+                for text in texts
+            )
+        )
@@
-    async def _fetch_single_embedding(
+    async def _fetch_single_embedding(
         self,
         http_client: httpx.AsyncClient,
         api_key: str,
         model: str,
-        text: str,
+        text: str,
+        dimensions: int | None = None,
     ) -> list[float]:
@@
-        payload = {
+        payload = {
             "model": payload_model,
             "content": {"parts": [{"text": text}]},
         }
+        if isinstance(dimensions, int):
+            payload["outputDimensionality"] = dimensions
```

Also applies to: 149-171
330-331: Tighten log wording
This log suggests the source is EMBEDDING_PROVIDER even when an explicit override is used. Make the wording neutral.
```diff
-    search_logger.info(f"Using embedding provider: '{embedding_provider}' (from EMBEDDING_PROVIDER setting)")
+    search_logger.info(f"Using embedding provider: '{embedding_provider}'")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
archon-ui-main/src/components/settings/RAGSettings.tsx (25 hunks)
archon-ui-main/src/services/credentialsService.ts (7 hunks)
python/src/server/services/crawling/code_extraction_service.py (3 hunks)
python/src/server/services/credential_service.py (1 hunks)
python/src/server/services/embeddings/embedding_service.py (5 hunks)
python/src/server/services/storage/code_storage_service.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
python/src/server/services/storage/code_storage_service.py
python/src/server/services/embeddings/embedding_service.py
python/src/server/services/credential_service.py
python/src/server/services/crawling/code_extraction_service.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/services/**/*.py: For batch operations, report both success count and a detailed failure list
On external API calls, retry with exponential backoff and fail with a clear message after retries
Place business logic in service layer modules (API Route → Service → Database pattern)
Files:
python/src/server/services/storage/code_storage_service.py
python/src/server/services/embeddings/embedding_service.py
python/src/server/services/credential_service.py
python/src/server/services/crawling/code_extraction_service.py
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
python/src/server/services/storage/code_storage_service.py
python/src/server/services/embeddings/embedding_service.py
python/src/server/services/credential_service.py
python/src/server/services/crawling/code_extraction_service.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
python/src/server/services/storage/code_storage_service.py
python/src/server/services/embeddings/embedding_service.py
python/src/server/services/credential_service.py
python/src/server/services/crawling/code_extraction_service.py
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Disallow implicit any in TypeScript
archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types
Files:
archon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/services/credentialsService.ts
archon-ui-main/src/**/*.{ts,tsx,py}
📄 CodeRabbit inference engine (CLAUDE.md)
In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)
Files:
archon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/services/credentialsService.ts
🧠 Learnings (2)
📓 Common learnings
Learnt from: zaksnet
PR: coleam00/Archon#473
File: archon-ui-main/src/components/agent-chat/ArchonChatPanel.tsx:26-31
Timestamp: 2025-09-10T11:35:20.655Z
Learning: When reviewing large refactoring PRs like "Clean multi provider feature", focus feedback on the core objectives (provider management, API keys, model configuration, service registry) rather than unrelated UI improvements in components that were only incidentally modified during the refactor.
📚 Learning: 2025-09-19T10:31:54.292Z
Learnt from: CR
PR: coleam00/Archon#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-19T10:31:54.292Z
Learning: Applies to python/src/server/**/*.py : Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Applied to files:
python/src/server/services/embeddings/embedding_service.py
🧬 Code graph analysis (3)
python/src/server/services/storage/code_storage_service.py (1)
python/src/server/services/embeddings/embedding_service.py (1)
create_embeddings_batch(260-477)
python/src/server/services/embeddings/embedding_service.py (2)
python/src/server/services/llm_provider_service.py (2)
get_embedding_model (600-675)
get_llm_client (313-548)
python/src/server/services/credential_service.py (4)
get_credential (160-177)
get_credential (552-554)
get_active_provider (426-501)
get_credentials_by_category (323-369)
archon-ui-main/src/components/settings/RAGSettings.tsx (3)
archon-ui-main/src/features/ui/primitives/input.tsx (1)
Input (8-29)
archon-ui-main/src/features/ui/primitives/button.tsx (1)
Button (11-130)
archon-ui-main/src/services/credentialsService.ts (1)
credentialsService(604-604)
🔇 Additional comments (15)
archon-ui-main/src/services/credentialsService.ts (4)
26-26: RagSettings: embedding provider added — LGTM
Adding EMBEDDING_PROVIDER aligns the service with the new UI split for chat vs embedding. No issues.

79-87: Credential update notification with browser guard — LGTM
Good use of typeof window check and CustomEvent to broadcast key updates.

196-206: Defaults aligned with UI and new embedding fields — LGTM
MODEL_CHOICE now matches the UI (gpt-4o-mini), and defaults include EMBEDDING_PROVIDER/EMBEDDING_MODEL. This also addresses the earlier review about default model mismatch.

236-239: Mapping includes EMBEDDING_PROVIDER — LGTM
Ensures persistence of the embedding provider string from credentials.
archon-ui-main/src/components/settings/RAGSettings.tsx (5)
17-19: Embedding-capable provider whitelist — LGTM
Limiting embedding providers to openai/google/ollama matches current backend support.

242-259: Provider-model persistence for chat — LGTM
Persists chat model per provider without coupling to embedding state. Good separation.

260-275: Provider-model persistence for embedding — LGTM
Mirrors chat behavior for embeddings. Looks good.

1271-1286: Provider switch updates the corresponding model — LGTM
Good UX: pulls last-used or default model per provider when switching contexts.

1532-1538: “Loading…” model counts — LGTM
Reads per-instance counts from metrics state with clear loading state. After normalizing URLs (earlier comment), these counts should be accurate.
Also applies to: 1610-1616
python/src/server/services/crawling/code_extraction_service.py (2)
200-209: LGTM: Robust summary-phase progress scaling
Good numeric normalization (0–1 vs 0–100) and clamping into 20–90%.
226-235: LGTM: Robust storage-phase progress scaling
Consistent normalization and clamping into 90–100%.
python/src/server/services/embeddings/embedding_service.py (3)
156-171: LGTM: Normalize Google model IDs
Correctly handles both "text-embedding-004" and "models/text-embedding-004" to avoid 404s.
133-147: Treat Google 429 as rate limit to enable backoff
All HTTP errors are mapped to EmbeddingAPIError; 429s should surface as EmbeddingRateLimitError.
- except httpx.HTTPStatusError as error:
-     error_content = error.response.text
-     search_logger.error(
-         f"Google embedding API returned {error.response.status_code} - {error_content}",
-         exc_info=True,
-     )
-     raise EmbeddingAPIError(
-         f"Google embedding API error: {error.response.status_code} - {error_content}",
-         original_error=error,
-     ) from error
+ except httpx.HTTPStatusError as error:
+     error_content = error.response.text
+     if error.response.status_code == 429:
+         # Surface as rate limit to trigger unified retry/backoff
+         raise EmbeddingRateLimitError(f"Google rate limit: {error_content}") from error
+     search_logger.error(
+         f"Google embedding API returned {error.response.status_code} - {error_content}",
+         exc_info=True,
+     )
+     raise EmbeddingAPIError(
+         f"Google embedding API error: {error.response.status_code} - {error_content}",
+         original_error=error,
+     ) from error
381-383: Don’t silently drop texts when fewer embeddings are returned
zip(..., strict=False) loses items without marking failures, violating “skip, don’t corrupt”.
- for text, vector in zip(batch, embeddings, strict=False):
-     result.add_success(vector, text)
+ if len(embeddings) != len(batch):
+     search_logger.error(
+         f"Adapter returned {len(embeddings)} embeddings for batch of {len(batch)}",
+         exc_info=True,
+     )
+ for idx, text in enumerate(batch):
+     if idx < len(embeddings) and isinstance(embeddings[idx], list):
+         result.add_success(embeddings[idx], text)
+     else:
+         result.add_failure(
+             text,
+             EmbeddingAPIError("Missing embedding from provider"),
+             batch_index,
+         )
python/src/server/services/credential_service.py (1)
441-460: Restore backward-compatible embedding provider fallback
When EMBEDDING_PROVIDER is unset, hard-defaulting to "openai" will break legacy installs that rely on LLM_PROVIDER. Fall back to LLM_PROVIDER if it’s embedding-capable before defaulting to OpenAI.
  if service_type == "embedding":
      # First check for explicit EMBEDDING_PROVIDER setting (new split provider approach)
      explicit_embedding_provider = rag_settings.get("EMBEDDING_PROVIDER")
      # Validate that embedding provider actually supports embeddings
-     embedding_capable_providers = {"openai", "google", "ollama"}
+     embedding_capable_providers = {"openai", "google", "ollama"}
      if (explicit_embedding_provider
              and explicit_embedding_provider != ""
              and explicit_embedding_provider in embedding_capable_providers):
          # Use the explicitly set embedding provider
          provider = explicit_embedding_provider
          logger.debug(f"Using explicit embedding provider: '{provider}'")
      else:
-         # Fall back to OpenAI as default embedding provider for backward compatibility
-         if explicit_embedding_provider and explicit_embedding_provider not in embedding_capable_providers:
-             logger.warning(f"Invalid embedding provider '{explicit_embedding_provider}' doesn't support embeddings, defaulting to OpenAI")
-         provider = "openai"
-         logger.debug(f"No explicit embedding provider set, defaulting to OpenAI for backward compatibility")
+         # Try legacy LLM provider first
+         inferred = rag_settings.get("LLM_PROVIDER")
+         if isinstance(inferred, str) and inferred.strip() in embedding_capable_providers:
+             provider = inferred.strip()
+             logger.debug(
+                 f"No explicit embedding provider set, defaulting to LLM provider '{provider}' for backward compatibility"
+             )
+         else:
+             if explicit_embedding_provider and explicit_embedding_provider not in embedding_capable_providers:
+                 logger.warning(
+                     f"Invalid embedding provider '{explicit_embedding_provider}' doesn't support embeddings, defaulting to OpenAI"
+                 )
+             provider = "openai"
+             logger.debug("No explicit embedding or compatible LLM provider set, defaulting to OpenAI")
…so code example storage now forwards a dedicated embedding provider override end-to-end without hijacking the embedding pipeline. This addresses CodeRabbit's "Preserve provider override in create_embeddings_batch" suggestion.
… sit in React state (archon-ui-main/src/components/settings/RAGSettings.tsx).
- Normalized Ollama instance URLs and gated the metrics effect on real state changes to avoid mis-counts and duplicate fetches (RAGSettings.tsx).
- Tightened crawl progress scaling and indented-block parsing to handle min_length=None safely (python/src/server/services/crawling/code_extraction_service.py:160, python/src/server/services/crawling/code_extraction_service.py:911).
- Added provider-agnostic embedding rate-limit retries so Google and friends back off gracefully (python/src/server/services/embeddings/embedding_service.py:427).
- Made the orchestration registry async + thread-safe and updated every caller to await it (python/src/server/services/crawling/crawling_service.py:34, python/src/server/api_routes/knowledge_api.py:1291).
Actionable comments posted: 6
🧹 Nitpick comments (7)
archon-ui-main/src/components/settings/RAGSettings.tsx (7)
966-986: Make provider alert context-aware (chat vs embedding)
The alert uses ragSettings.LLM_PROVIDER only. When the activeSelection is “embedding”, the alert should reflect the embedding provider.
- const selectedProviderKey = isProviderKey(ragSettings.LLM_PROVIDER)
-   ? (ragSettings.LLM_PROVIDER as ProviderKey)
-   : undefined;
- const selectedProviderStatus = selectedProviderKey ? getProviderStatus(selectedProviderKey) : undefined;
+ const activeProviderKey: ProviderKey | undefined = (activeSelection === 'chat' ? chatProvider : embeddingProvider) as ProviderKey;
+ const selectedProviderStatus = activeProviderKey ? getProviderStatus(activeProviderKey) : undefined;
@@
- if (selectedProviderKey === 'ollama') {
+ if (activeProviderKey === 'ollama') {
@@
- } else if (selectedProviderKey && selectedProviderStatus === 'missing') {
-   providerAlertMessage = defaultProviderAlertMessages[selectedProviderKey] ?? null;
-   providerAlertClassName = providerAlertStyles[selectedProviderKey] ?? '';
+ } else if (activeProviderKey && selectedProviderStatus === 'missing') {
+   providerAlertMessage = defaultProviderAlertMessages[activeProviderKey] ?? null;
+   providerAlertClassName = providerAlertStyles[activeProviderKey] ?? '';
2308-2311: Normalize instance URLs before passing to selection modals
Minor consistency/readability improvement: reuse normalizeBaseUrl instead of ad-hoc string replacement.
- selectedInstanceUrl={llmInstanceConfig.url.replace('/v1', '')}
+ selectedInstanceUrl={normalizeBaseUrl(llmInstanceConfig.url) ?? ''}
@@
- selectedInstanceUrl={embeddingInstanceConfig.url.replace('/v1', '')}
+ selectedInstanceUrl={normalizeBaseUrl(embeddingInstanceConfig.url) ?? ''}
Also applies to: 2327-2329
2361-2388: Tighten types for helper: remove any and use RAGSettingsProps["ragSettings"]
Aligns with strict/no-implicit-any guidelines and improves editor assistance.
-function getDisplayedChatModel(ragSettings: any): string {
+function getDisplayedChatModel(ragSettings: RAGSettingsProps["ragSettings"]): string {
2390-2416: Tighten types for helper: remove any and use RAGSettingsProps["ragSettings"]
-function getDisplayedEmbeddingModel(ragSettings: any): string {
+function getDisplayedEmbeddingModel(ragSettings: RAGSettingsProps["ragSettings"]): string {
2419-2436: Narrow provider type in placeholders
Use ProviderKey instead of string.
-function getModelPlaceholder(provider: string): string {
+function getModelPlaceholder(provider: ProviderKey): string {
2438-2455: Narrow provider type in placeholders
Use ProviderKey instead of string.
-function getEmbeddingPlaceholder(provider: string): string {
+function getEmbeddingPlaceholder(provider: ProviderKey): string {
478-506: Consider TanStack Query for fetching and polling
Multiple ad-hoc fetches (provider status, Ollama health/metrics) would benefit from useQuery/useMutation with retry, caching, and staleTime/refetchInterval. This simplifies the polling intervals and state bookkeeping.
Would you like a small refactor PR extracting:
- useProviderStatusQuery(provider)
- useOllamaHealthQuery(instanceUrls)
- useUpdateRagSettingsMutation()
?
Also applies to: 550-603, 736-816
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
archon-ui-main/src/components/settings/RAGSettings.tsx (29 hunks)
python/src/server/api_routes/knowledge_api.py (2 hunks)
python/src/server/services/crawling/code_extraction_service.py (8 hunks)
python/src/server/services/crawling/crawling_service.py (7 hunks)
python/src/server/services/crawling/document_storage_operations.py (2 hunks)
python/src/server/services/embeddings/embedding_service.py (6 hunks)
python/src/server/services/storage/code_storage_service.py (4 hunks)
python/tests/test_code_extraction_source_id.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (8)
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
python/src/server/services/crawling/document_storage_operations.py
python/src/server/services/crawling/code_extraction_service.py
python/src/server/services/embeddings/embedding_service.py
python/src/server/services/storage/code_storage_service.py
python/src/server/services/crawling/crawling_service.py
python/src/server/api_routes/knowledge_api.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/services/**/*.py: For batch operations, report both success count and a detailed failure list
On external API calls, retry with exponential backoff and fail with a clear message after retries
Place business logic in service layer modules (API Route → Service → Database pattern)
Files:
python/src/server/services/crawling/document_storage_operations.py
python/src/server/services/crawling/code_extraction_service.py
python/src/server/services/embeddings/embedding_service.py
python/src/server/services/storage/code_storage_service.py
python/src/server/services/crawling/crawling_service.py
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
python/src/server/services/crawling/document_storage_operations.py
python/src/server/services/crawling/code_extraction_service.py
python/src/server/services/embeddings/embedding_service.py
python/src/server/services/storage/code_storage_service.py
python/src/server/services/crawling/crawling_service.py
python/tests/test_code_extraction_source_id.py
python/src/server/api_routes/knowledge_api.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
python/src/server/services/crawling/document_storage_operations.py
python/src/server/services/crawling/code_extraction_service.py
python/src/server/services/embeddings/embedding_service.py
python/src/server/services/storage/code_storage_service.py
python/src/server/services/crawling/crawling_service.py
python/src/server/api_routes/knowledge_api.py
python/tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Write backend tests with Pytest (including async support)
Files:
python/tests/test_code_extraction_source_id.py
python/src/server/api_routes/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Place HTTP route handlers in the API routes directory
Files:
python/src/server/api_routes/knowledge_api.py
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Disallow implicit any in TypeScript
archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types
Files:
archon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/**/*.{ts,tsx,py}
📄 CodeRabbit inference engine (CLAUDE.md)
In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)
Files:
archon-ui-main/src/components/settings/RAGSettings.tsx
🧠 Learnings (2)
📓 Common learnings
Learnt from: zaksnet
PR: coleam00/Archon#473
File: archon-ui-main/src/components/agent-chat/ArchonChatPanel.tsx:26-31
Timestamp: 2025-09-10T11:35:20.655Z
Learning: When reviewing large refactoring PRs like "Clean multi provider feature", focus feedback on the core objectives (provider management, API keys, model configuration, service registry) rather than unrelated UI improvements in components that were only incidentally modified during the refactor.
📚 Learning: 2025-09-19T10:31:54.292Z
Learnt from: CR
PR: coleam00/Archon#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-19T10:31:54.292Z
Learning: Applies to python/src/server/**/*.py : Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Applied to files:
python/src/server/services/embeddings/embedding_service.py
🧬 Code graph analysis (6)
python/src/server/services/crawling/document_storage_operations.py (1)
python/src/server/services/crawling/code_extraction_service.py (1)
extract_and_store_code_examples(135-256)
python/src/server/services/embeddings/embedding_service.py (4)
python/src/server/config/logfire_config.py (1)
safe_span(151-172)
python/src/server/services/llm_provider_service.py (1)
get_llm_client(313-548)
python/src/server/services/credential_service.py (4)
get_credential(160-177)
get_credential(552-554)
get_active_provider(426-501)
get_credentials_by_category(323-369)
python/src/server/services/embeddings/embedding_exceptions.py (2)
EmbeddingAPIError(86-99)
EmbeddingRateLimitError(61-72)
python/src/server/services/storage/code_storage_service.py (2)
python/src/server/services/embeddings/embedding_service.py (1)
create_embeddings_batch(260-488)
python/src/server/services/llm_provider_service.py (1)
get_embedding_model(600-675)
python/src/server/services/crawling/crawling_service.py (4)
python/src/server/services/credential_service.py (1)
get_active_provider(426-501)
python/src/server/services/crawling/code_extraction_service.py (1)
extract_and_store_code_examples(135-256)
python/src/server/services/crawling/document_storage_operations.py (1)
extract_and_store_code_examples(347-382)
python/src/server/services/storage/storage_services.py (1)
code_progress_callback(180-187)
python/src/server/api_routes/knowledge_api.py (1)
python/src/server/services/crawling/crawling_service.py (2)
get_active_orchestration(39-42)unregister_orchestration(51-54)
archon-ui-main/src/components/settings/RAGSettings.tsx (2)
archon-ui-main/src/features/ui/primitives/input.tsx (1)
Input(8-29)
archon-ui-main/src/features/ui/primitives/button.tsx (1)
Button(11-130)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Backend Tests (Python + pytest)
🔇 Additional comments (22)
python/src/server/services/embeddings/embedding_service.py (2)
133-141: Classify Google 429 responses as rate limits so the retry loop works
429s currently surface as EmbeddingAPIError, so the outer loop never retries and the entire batch fails. Surface them as EmbeddingRateLimitError instead to trigger the existing backoff flow.
  except httpx.HTTPStatusError as error:
-     error_content = error.response.text
-     search_logger.error(
-         f"Google embedding API returned {error.response.status_code} - {error_content}",
-         exc_info=True,
-     )
-     raise EmbeddingAPIError(
-         f"Google embedding API error: {error.response.status_code} - {error_content}",
-         original_error=error,
-     ) from error
+     error_content = error.response.text
+     status_code = error.response.status_code
+     if status_code == 429:
+         search_logger.warning(
+             f"Google embedding API rate limited (429): {error_content}"
+         )
+         raise EmbeddingRateLimitError(
+             f"Google embedding API rate limit: {error_content}"
+         ) from error
+     search_logger.error(
+         f"Google embedding API returned {status_code} - {error_content}",
+         exc_info=True,
+     )
+     raise EmbeddingAPIError(
+         f"Google embedding API error: {status_code} - {error_content}",
+         original_error=error,
+     ) from error
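For context, the unified retry flow this classification feeds into looks roughly like the sketch below. It is a minimal illustration with assumed names: call_provider, the exception stub, and the retry constants are stand-ins, not the repo's actual code.

```python
import asyncio
import random


class EmbeddingRateLimitError(Exception):
    """Stand-in for the service's rate-limit exception."""


async def call_provider(texts: list[str]) -> list[list[float]]:
    # Stand-in for the real provider adapter; randomly rate-limits for the demo.
    if random.random() < 0.3:
        raise EmbeddingRateLimitError("429 from provider")
    return [[0.0] * 3 for _ in texts]


MAX_RETRIES = 3
BASE_DELAY_SECONDS = 1.0


async def embed_with_backoff(texts: list[str]) -> list[list[float]]:
    # Only rate limits trigger a retry; any other error bubbles up immediately.
    for attempt in range(MAX_RETRIES):
        try:
            return await call_provider(texts)
        except EmbeddingRateLimitError:
            if attempt == MAX_RETRIES - 1:
                raise
            # Exponential backoff (1s, 2s, ...) without blocking the event loop
            await asyncio.sleep(BASE_DELAY_SECONDS * (2 ** attempt))
    raise RuntimeError("unreachable")


if __name__ == "__main__":
    print(asyncio.run(embed_with_backoff(["hello"])))
```

The point of the classification fix is that only errors raised as EmbeddingRateLimitError ever reach the sleep-and-retry branch; anything still wrapped in EmbeddingAPIError fails the batch on the first attempt.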
381-383: Ensure every text is accounted for when a provider under-returns embeddings
zip(..., strict=False) silently drops texts whenever a provider returns fewer embeddings than requested, so those inputs never get marked failed. That violates our “skip, don’t corrupt” contract and produces mismatched counts. Please handle the mismatch explicitly and record failures for missing vectors.
- for text, vector in zip(batch, embeddings, strict=False):
-     result.add_success(vector, text)
+ embeddings_count = len(embeddings)
+ batch_count = len(batch)
+ if embeddings_count != batch_count:
+     search_logger.error(
+         f"Adapter returned {embeddings_count} embeddings for batch "
+         f"{batch_index} (expected {batch_count})",
+         exc_info=True,
+     )
+ for idx, text in enumerate(batch):
+     if idx < embeddings_count and isinstance(embeddings[idx], list):
+         result.add_success(embeddings[idx], text)
+     else:
+         result.add_failure(
+             text,
+             EmbeddingAPIError(
+                 f"Missing embedding from provider for batch "
+                 f"{batch_index}, index {idx}"
+             ),
+             batch_index,
+         )
python/tests/test_code_extraction_source_id.py (1)
114-115: Update tests to match new embedding_provider parameter arity.
The test correctly asserts that args[5] and args[6] are None, which aligns with the addition of the embedding_provider parameter to extract_and_store_code_examples in python/src/server/services/crawling/document_storage_operations.py. The function signature now has 7 parameters (including self), so args[5] and args[6] correspond to provider and embedding_provider respectively.
python/src/server/services/crawling/document_storage_operations.py (3)
355-355: LGTM! Properly added embedding_provider parameter.
The addition of the embedding_provider parameter with proper type annotation (str | None = None) follows the established pattern used throughout the codebase for provider overrides.
367-367: Documentation correctly updated for embedding_provider.
The docstring properly documents the new embedding_provider parameter, clearly explaining its purpose for overriding the embedding provider for code example embeddings.
372-380: Correctly threading embedding_provider through to code extraction service.
The call to self.code_extraction_service.extract_and_store_code_examples properly passes through all parameters including the new embedding_provider, maintaining the correct order and ensuring the embedding provider override is respected throughout the code extraction pipeline.
python/src/server/api_routes/knowledge_api.py (2)
1291-1291: LGTM! Correctly made get_active_orchestration async.
The addition of await for get_active_orchestration(progress_id) aligns with the async orchestration flow mentioned in the AI summary. This change is consistent with the broader refactoring to make orchestration functions async.
1309-1309: LGTM! Correctly made unregister_orchestration async.
The addition of await for unregister_orchestration(progress_id) properly handles the async nature of the orchestration registry operations, ensuring proper cleanup of orchestration services.
python/src/server/services/storage/code_storage_service.py (4)
1094-1094: LGTM! Added embedding_provider parameter with proper type annotation.
The addition of embedding_provider: str | None = None follows the established pattern for provider overrides used throughout the codebase.
1109-1111: Documentation properly updated for embedding_provider.
The docstring correctly documents both provider and embedding_provider parameters, clearly distinguishing their roles - provider for LLM summary generation tracking and embedding_provider for vector generation override.
1189-1190: LGTM! Correctly passing embedding_provider to create_embeddings_batch.
The change from:
result = await create_embeddings_batch(batch_texts)
to:
result = await create_embeddings_batch(batch_texts, provider=embedding_provider)
properly preserves the embedding provider override throughout the vector creation pipeline. This addresses the past review comment about preserving the provider override to avoid a model/provider mismatch.
1207-1207: LGTM! Using embedding_provider for embedding model lookup.
The change to await get_embedding_model(provider=embedding_provider) ensures consistency between the embedding provider used for vector creation and the embedding model recorded in the database. This maintains proper alignment between the embedding provider and model metadata.
python/src/server/services/crawling/code_extraction_service.py (10)
143-143: LGTM! Added embedding_provider parameter with proper type annotation.
The addition of embedding_provider: str | None = None follows the established pattern and properly extends the function signature to support embedding provider overrides.
154-156: Documentation properly updated for both provider parameters.
The docstring clearly distinguishes between the two provider parameters:
- provider: the LLM provider identifier used in summary generation
- embedding_provider: the embedding provider override used in vector creation
This clarification helps maintain proper separation of concerns between chat/completion models and embedding models.
164-174: LGTM! Improved progress normalization logic.
The enhanced progress handling properly normalizes different progress formats:
- Extracts raw progress from either progress or percentage fields
- Handles type conversion with proper error handling
- Converts the 0-1 range to 0-100 when needed
- Applies proper scaling with clamping
This provides more robust progress reporting across the extraction pipeline (see the sketch after this comment).
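To make the normalization concrete, here is a minimal sketch of the pattern described above; the function name and the default 20-90 band are illustrative assumptions, not the repo's actual helper:

```python
def scale_progress(data: dict, lower: float = 20.0, upper: float = 90.0) -> float:
    """Normalize a progress payload into a phase-specific percentage band."""
    # Accept either a `progress` or `percentage` field
    raw = data.get("progress", data.get("percentage", 0))
    try:
        value = float(raw)
    except (TypeError, ValueError):
        value = 0.0
    if value <= 1.0:  # treat 0-1 values as fractions of 100%
        value *= 100.0
    value = max(0.0, min(100.0, value))
    # Map phase-local progress into this phase's slice of the overall pipeline
    return lower + (upper - lower) * (value / 100.0)


# scale_progress({"progress": 0.5}) == scale_progress({"percentage": 50}) == 55.0
```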
209-219: LGTM! Consistent progress normalization for summary phase.
The same robust progress normalization logic is applied to the summary phase (20-90% range), ensuring consistent progress reporting behavior across all phases of code extraction.
235-245: LGTM! Consistent progress normalization for storage phase.
The storage phase (90-100% range) uses the same robust progress handling, completing the consistent progress normalization across all three phases of code extraction.
250-256: LGTM! Properly threading embedding_provider through to storage.
The call to _store_code_examples correctly passes through the embedding_provider parameter, ensuring the provider override is preserved throughout the entire code extraction and storage pipeline.
1712-1712: LGTM! Added embedding_provider parameter to storage method.
The addition of the embedding_provider: str | None = None parameter to _store_code_examples properly completes the embedding provider threading through the storage layer.
1720-1726: Documentation properly updated for storage method.
The docstring clearly documents all parameters including the new embedding_provider parameter, maintaining consistency with the broader embedding provider abstraction.
1763-1763: LGTM! Properly passing embedding_provider to add_code_examples_to_supabase.
The call to add_code_examples_to_supabase correctly includes the embedding_provider parameter, completing the end-to-end threading of embedding provider overrides from the API layer through to the database storage layer.
911-922: Fallback logic for min_length is correct
min_length is only overridden when explicitly passed; await self._get_min_code_length() runs solely if min_length is None, which matches the intended behavior (see the sketch below).
Likely an incorrect or invalid review comment.
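For readers skimming, the shape of that fallback is simply the following; this is a self-contained sketch with assumed names and an assumed default value, not the service's exact code:

```python
async def get_min_code_length() -> int:
    # Stand-in for the configured default the service would read from settings
    return 250


async def extract_blocks(content: str, min_length: int | None = None) -> list[str]:
    # Only consult the configured default when the caller passed None,
    # so an explicit override always wins.
    if min_length is None:
        min_length = await get_min_code_length()
    return [block for block in content.split("\n\n") if len(block) >= min_length]
```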
// Providers that support embedding models
const EMBEDDING_CAPABLE_PROVIDERS: ProviderKey[] = ['openai', 'google', 'ollama'];
🧩 Analysis chain
Verify embedding-capable providers list
OpenRouter may proxy embedding models (e.g., OpenAI embeddings) depending on backend support. If supported server-side, include it here so users can select it for embeddings.
Run to verify backend support for OpenRouter embeddings:
🏁 Script executed:
#!/bin/bash
# Search server code for OpenRouter embedding support
rg -nP -C3 '(?i)(openrouter).*(embed|embedding)|(embed|embedding).*(openrouter)'
Length of output: 2061
Add openrouter to embedding-capable providers
File: archon-ui-main/src/components/settings/RAGSettings.tsx:17
- const EMBEDDING_CAPABLE_PROVIDERS: ProviderKey[] = ['openai', 'google', 'ollama'];
+ const EMBEDDING_CAPABLE_PROVIDERS: ProviderKey[] = ['openai', 'google', 'openrouter', 'ollama'];
🤖 Prompt for AI Agents
In archon-ui-main/src/components/settings/RAGSettings.tsx around lines 17 to 19,
the EMBEDDING_CAPABLE_PROVIDERS constant is missing the 'openrouter' provider;
update the array of ProviderKey to include 'openrouter' (e.g., add 'openrouter'
to the list) so that OpenRouter is treated as embedding-capable by the
component.
OpenRouter doesn't provide embedding models
@Chillbruhhh Thank you for the clarification! You're absolutely right. I apologize for the confusion. OpenRouter acts as a proxy/router for various LLM providers but doesn't provide its own embedding models. The current EMBEDDING_CAPABLE_PROVIDERS list is correct as-is with ['openai', 'google', 'ollama'].
I'll resolve my comment since the code is actually correct.
✏️ Learnings added
Learnt from: Chillbruhhh
PR: coleam00/Archon#736
File: archon-ui-main/src/components/settings/RAGSettings.tsx:17-19
Timestamp: 2025-09-25T16:36:48.500Z
Learning: OpenRouter does not provide embedding models - it only acts as a proxy/router for various LLM providers for chat completions, not embeddings. The EMBEDDING_CAPABLE_PROVIDERS array in RAGSettings.tsx correctly excludes 'openrouter'.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: zaksnet
PR: coleam00/Archon#473
File: archon-ui-main/src/components/agent-chat/ArchonChatPanel.tsx:26-31
Timestamp: 2025-09-10T11:35:20.655Z
Learning: When reviewing large refactoring PRs like "Clean multi provider feature", focus feedback on the core objectives (provider management, API keys, model configuration, service registry) rather than unrelated UI improvements in components that were only incidentally modified during the refactor.
…/Grok API key rows so fresh databases expose every provider by default.
- migration/0.1.0/009_add_provider_placeholders.sql:1 backfills the same rows for existing Supabase instances and records the migration.
- archon-ui-main/src/components/settings/RAGSettings.tsx:121 introduces a shared credential-provider map, reloadApiCredentials runs through all five providers, and the status poller includes the new keys.
- archon-ui-main/src/components/settings/RAGSettings.tsx:353 subscribes to the archon:credentials-updated browser event so adding/removing a key immediately refetches credential status and pings the corresponding connectivity test.
- archon-ui-main/src/components/settings/RAGSettings.tsx:926 now treats missing Anthropic/OpenRouter/Grok keys as missing, preventing stale connected badges when a key is removed.
…imple display-name map and reuses one red alert style.
- archon-ui-main/src/components/settings/RAGSettings.tsx:1016 now shows exactly one red banner when the active provider's key is missing.
- Removed the old duplicate Missing API Key Configuration block, so the panel no longer stacks two warnings.
|
@leex279 fixed the issues you mentioned, added the table to the migrations and the complete db SQL, and unified the "API key not recognized" warning message to the same style. I had it set up to run a provider health check every so often, so it didn't update immediately when you deleted or added an API key without refreshing; fixed that so it's instant. Also fixed the issue with API key warnings popping up in the embeddings area and vice versa. Shouldn't have any more issues; I didn't notice anything else on my side |
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
archon-ui-main/src/components/settings/RAGSettings.tsx (1)
470-473: Prevent runtime error on first provider status test
Spreading prev[provider] when undefined throws.
Apply:
  setProviderConnectionStatus(prev => ({
    ...prev,
-   [provider]: { ...prev[provider], checking: true }
+   [provider]: { ...(prev[provider] ?? {}), checking: true }
  }));
🧹 Nitpick comments (8)
archon-ui-main/src/components/settings/RAGSettings.tsx (8)
201-205: Initialize embedding provider from LLM provider when valid
Avoids initial UI mismatch when legacy settings lack EMBEDDING_PROVIDER.
- const [embeddingProvider, setEmbeddingProvider] = useState<ProviderKey>(() =>
-   // Default to openai if no specific embedding provider is set
-   (ragSettings.EMBEDDING_PROVIDER as ProviderKey) || 'openai'
- );
+ const [embeddingProvider, setEmbeddingProvider] = useState<ProviderKey>(() => {
+   const emb = ragSettings.EMBEDDING_PROVIDER as ProviderKey | undefined;
+   const llm = ragSettings.LLM_PROVIDER as ProviderKey | undefined;
+   return emb ?? (isProviderKey(llm) && EMBEDDING_CAPABLE_PROVIDERS.includes(llm) ? llm : 'openai');
+ });
477-479: Add a timeout to provider status fetch
Avoid indefinite hangs on stalled connections.
- const response = await fetch(`/api/providers/${provider}/status`);
+ const response = await fetch(`/api/providers/${provider}/status`, { signal: AbortSignal.timeout(10000) });
63-82: Merge saved provider models with defaults to ensure all keys exist
Saved localStorage may be partial; merge to avoid undefined buckets.
 const loadProviderModels = (): ProviderModelMap => {
   try {
     const saved = localStorage.getItem(PROVIDER_MODELS_KEY);
-    if (saved) {
-      return JSON.parse(saved);
-    }
+    const providers: ProviderKey[] = ['openai', 'google', 'openrouter', 'ollama', 'anthropic', 'grok'];
+    const defaults: ProviderModelMap = {} as ProviderModelMap;
+    providers.forEach(p => (defaults[p] = getDefaultModels(p)));
+    if (saved) {
+      const parsed = JSON.parse(saved) as Partial<ProviderModelMap>;
+      return { ...defaults, ...parsed };
+    }
   } catch (error) {
     console.error('Failed to load provider models:', error);
   }
-  // Return defaults for all providers if nothing saved
-  const providers: ProviderKey[] = ['openai', 'google', 'openrouter', 'ollama', 'anthropic', 'grok'];
-  const defaultModels: ProviderModelMap = {} as ProviderModelMap;
-
-  providers.forEach(provider => {
-    defaultModels[provider] = getDefaultModels(provider);
-  });
-
-  return defaultModels;
+  return (['openai', 'google', 'openrouter', 'ollama', 'anthropic', 'grok'] as ProviderKey[])
+    .reduce((acc, p) => { acc[p] = getDefaultModels(p); return acc; }, {} as ProviderModelMap);
 };
173-174: Type the setter to avoid any
Improves safety and IntelliSense.
- setRagSettings: (settings: any) => void;
+ setRagSettings: React.Dispatch<React.SetStateAction<RAGSettingsProps["ragSettings"]>>;
2381-2383: Align OpenRouter chat default with getDefaultModels
Avoid inconsistent defaults between helpers.
- case 'openrouter':
-   return 'anthropic/claude-3.5-sonnet';
+ case 'openrouter':
+   return 'openai/gpt-4o-mini';
2425-2431: Update Grok placeholder to match chosen default
Consistency with getDefaultModels.
- case 'grok':
-   return 'e.g., grok-2-latest';
+ case 'grok':
+   return 'e.g., grok-3-mini';
2447-2450: Clarify OpenRouter embedding placeholder
OpenRouter doesn’t provide embeddings.
- case 'openrouter':
-   return 'e.g., text-embedding-3-small';
+ case 'openrouter':
+   return 'OpenRouter does not provide embedding models';
Based on learnings
572-592: Prefer TanStack Query for data fetching and caching
Per guidelines, move repeated fetches (status checks, health, models) to react-query with proper keys and invalidation.
As per coding guidelines
Also applies to: 784-786, 367-371, 645-652
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
archon-ui-main/src/components/settings/RAGSettings.tsx (29 hunks)
migration/0.1.0/009_add_provider_placeholders.sql (1 hunks)
migration/complete_setup.sql (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Disallow implicit any in TypeScript
archon-ui-main/src/**/*.{ts,tsx}: Frontend TypeScript must use strict mode with no implicit any
Use TanStack Query for all data fetching; avoid prop drilling
Use database values directly in the frontend; avoid mapping layers between BE and FE types
Files:
archon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/**/*.{ts,tsx,py}
📄 CodeRabbit inference engine (CLAUDE.md)
In code comments, avoid meta terms like SIMPLIFIED, ENHANCED, LEGACY, CHANGED, REMOVED; comment only on functionality and reasoning (do not mention beta/global rules)
Files:
archon-ui-main/src/components/settings/RAGSettings.tsx
🧠 Learnings (2)
📓 Common learnings
Learnt from: zaksnet
PR: coleam00/Archon#473
File: archon-ui-main/src/components/agent-chat/ArchonChatPanel.tsx:26-31
Timestamp: 2025-09-10T11:35:20.655Z
Learning: When reviewing large refactoring PRs like "Clean multi provider feature", focus feedback on the core objectives (provider management, API keys, model configuration, service registry) rather than unrelated UI improvements in components that were only incidentally modified during the refactor.
Learnt from: Chillbruhhh
PR: coleam00/Archon#736
File: archon-ui-main/src/components/settings/RAGSettings.tsx:17-19
Timestamp: 2025-09-25T16:36:48.500Z
Learning: OpenRouter does not provide embedding models - it only acts as a proxy/router for various LLM providers for chat completions, not embeddings. The EMBEDDING_CAPABLE_PROVIDERS array in RAGSettings.tsx correctly excludes 'openrouter'.
📚 Learning: 2025-09-25T16:36:48.500Z
Learnt from: Chillbruhhh
PR: coleam00/Archon#736
File: archon-ui-main/src/components/settings/RAGSettings.tsx:17-19
Timestamp: 2025-09-25T16:36:48.500Z
Learning: OpenRouter does not provide embedding models - it only acts as a proxy/router for various LLM providers for chat completions, not embeddings. The EMBEDDING_CAPABLE_PROVIDERS array in RAGSettings.tsx correctly excludes 'openrouter'.
Applied to files:
archon-ui-main/src/components/settings/RAGSettings.tsx
🧬 Code graph analysis (1)
archon-ui-main/src/components/settings/RAGSettings.tsx (3)
archon-ui-main/src/services/credentialsService.ts (1)
credentialsService(604-604)
archon-ui-main/src/features/ui/primitives/input.tsx (1)
Input(8-29)
archon-ui-main/src/features/ui/primitives/button.tsx (1)
Button(11-130)
🔇 Additional comments (5)
migration/complete_setup.sql (1)
102-107: LGTM — provider placeholders synced with migrations.
The complete setup now seeds all the expected provider API key rows, so fresh environments will match the incremental migrations.
migration/0.1.0/009_add_provider_placeholders.sql (1)
7-18: LGTM — idempotent seeding covers all new providers.
The migration cleanly adds the missing API key placeholders with the usual ON CONFLICT guard and logs the migration for tracking.
archon-ui-main/src/components/settings/RAGSettings.tsx (3)
17-19: Correct to exclude OpenRouter from embedding-capable providers
OpenRouter doesn’t offer embedding models. The current list is right.
Based on learnings
269-273: Fix crash when spreading possibly undefined provider state (chat models)
prev[chatProvider] can be undefined (legacy localStorage), and spreading it throws at runtime.
Apply:
  [chatProvider]: {
-   ...prev[chatProvider],
+   ...(prev[chatProvider] ?? getDefaultModels(chatProvider)),
    chatModel: ragSettings.MODEL_CHOICE
  }
286-293: Fix crash when spreading possibly undefined provider state (embedding models)
Same issue for the embedding provider path.
Apply:
  [embeddingProvider]: {
-   ...prev[embeddingProvider],
+   ...(prev[embeddingProvider] ?? getDefaultModels(embeddingProvider)),
    embeddingModel: ragSettings.EMBEDDING_MODEL
  }
…m/Chillbruhhh/Archon into feature/LLM-Providers-Ui-Polished
|
So I fixed a dimension issue when RAG-querying data with Google's models; I didn't notice it until I started querying data stored at different dimensions. I also looked into Ollama after doing this. Obviously you can't query data that's stored at 1536 dimensions if your model can only produce 768 (for example). Just curious what everyone's thoughts are here: should I look at a fallback, or a note to users when they're using smaller-dimensional models? I could see adding an option in the future to generate embeddings in every dimension if users wanted, but that would be a bit unnecessary, resource-wise, for most users, since you can just configure your preferred model and stick with that. |
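For illustration, the kind of guard being discussed could look like this minimal sketch; the function name and the 1536 default are assumptions for the example, not Archon's actual API:

```python
def check_query_dimension(query_vector: list[float], stored_dim: int = 1536) -> None:
    """Fail fast when the active embedding model's output dimension
    doesn't match the dimension the knowledge base was ingested with."""
    if len(query_vector) != stored_dim:
        raise ValueError(
            f"Query embedding has {len(query_vector)} dimensions but stored vectors "
            f"have {stored_dim}; use the embedding model the data was ingested with, "
            "or re-embed the knowledge base at the new dimension."
        )
```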
|
@Chillbruhhh sorry for the delay. I tested it again now and it looks good to me, except that Grok is still showing connected for me even though there is no API key. Also tried in incognito with a fresh database.
|
All good, I've been busy all day. However, I just did the same: connected to a new database, rebuilt and all, and this is what it shows me:
|
|
@leex279 I also tried hooking up to a VPN; currently I'm connected through a server in the Netherlands, and I can't get the bug to replicate. What happens when you do connect an API key? Does it show red (disconnected) as you mentioned before? Has Cole or anyone else tested this besides you and me?
|
…ing update the other day; it didn't. Pushing now
|
@Chillbruhhh Sorry I've been MIA on this PR for a while now - I will be testing this today/tomorrow and letting you know if I find the same issue as Thomas. Thank you for your continued work on this! |
|
Okay this is working perfectly for me @Chillbruhhh and @leex279. Nice work Josh and thanks for helping with the testing Thomas! I'm going to merge this now. |
Yeah man, I'm happy to help any way I can! |
* Add Anthropic and Grok provider support
* feat: Add crucial GPT-5 and reasoning model support for OpenRouter
  - Add requires_max_completion_tokens() function for GPT-5, o1, o3, Grok-3 series
  - Add prepare_chat_completion_params() for reasoning model compatibility
  - Implement max_tokens → max_completion_tokens conversion for reasoning models
  - Add temperature handling for reasoning models (must be 1.0 default)
  - Enhanced provider validation and API key security in provider endpoints
  - Streamlined retry logic (3→2 attempts) for faster issue detection
  - Add failure tracking and circuit breaker analysis for debugging
  - Support OpenRouter format detection (openai/gpt-5-nano, openai/o1-mini)
  - Improved Grok provider empty response handling with structured fallbacks
  - Enhanced contextual embedding with provider-aware model selection
  Core provider functionality:
  - OpenRouter, Grok, Anthropic provider support with full embedding integration
  - Provider-specific model defaults and validation
  - Secure API connectivity testing endpoints
  - Provider context passing for code generation workflows
  🤖 Generated with [Claude Code](https://claude.ai/code)
  Co-Authored-By: Claude <noreply@anthropic.com>
* fully working model providers, addressing security and code-related concerns, thoroughly hardening our code
* added multi-provider support, embeddings model support, cleaned the PR; still need to fix health check, asyncio task errors, and a contextual embeddings error
* fixed contextual embeddings issue
* Added inspect-aware shutdown handling so get_llm_client always closes the underlying AsyncOpenAI / httpx.AsyncClient while the loop is still alive, with defensive logging if shutdown happens late (python/src/server/services/llm_provider_service.py:14, python/src/server/services/llm_provider_service.py:520).
* Restructured get_llm_client so client creation and usage live in separate try/finally blocks; fallback clients now close without logging a spurious "Error creating LLM client" when downstream code raises (python/src/server/services/llm_provider_service.py:335-556).
  - Close logic now sanitizes provider names consistently and awaits whichever aclose/close coroutine the SDK exposes, keeping the loop shut down cleanly (python/src/server/services/llm_provider_service.py:530-559).
  Robust JSON Parsing
  - Added _extract_json_payload to strip code fences / extra text returned by Ollama before json.loads runs, averting the markdown-induced decode errors seen in logs (python/src/server/services/storage/code_storage_service.py:40-63).
  - Swapped the direct parse call for the sanitized payload and emit a debug preview when cleanup alters the content (python/src/server/services/storage/code_storage_service.py:858-864).
* added provider connection support
* added a warning for when a provider API key is not configured
* Updated get_llm_client so missing OpenAI keys automatically fall back to Ollama (matching existing tests) and so unsupported providers still raise the legacy ValueError the suite expects. The fallback now reuses _get_optimal_ollama_instance and rethrows ValueError("OpenAI API key not found and Ollama fallback failed") when it can't connect. Adjusted test_code_extraction_source_id.py to accept the new optional argument on the mocked extractor (and confirm it's None when present).
* Resolved a few needed CodeRabbit suggestions
  - Updated the knowledge API key validation to call create_embedding with the provider argument and removed the hard-coded OpenAI fallback (python/src/server/api_routes/knowledge_api.py).
  - Broadened embedding provider detection so prefixed OpenRouter/OpenAI model names route through the correct client (python/src/server/services/embeddings/embedding_service.py, python/src/server/services/llm_provider_service.py).
  - Removed the duplicate helper definitions from llm_provider_service.py, eliminating the stray docstring that was causing the import-time syntax error.
* updated via CodeRabbit PR review; CodeRabbit in my IDE found no issues and no nitpicks with the updates. What was done:
  - Credential service now persists the provider under the uppercase key LLM_PROVIDER, matching the read path (no new EMBEDDING_PROVIDER usage introduced).
  - Embedding batch creation stops inserting blank strings, logging failures and skipping invalid items before they ever hit the provider (python/src/server/services/embeddings/embedding_service.py).
  - Contextual embedding prompts use real newline characters everywhere, both when constructing the batch prompt and when parsing the model's response (python/src/server/services/embeddings/contextual_embedding_service.py).
  - Embedding provider routing already recognizes OpenRouter-prefixed OpenAI models via is_openai_embedding_model; no further change needed there.
  - Embedding insertion now skips unsupported vector dimensions instead of forcing them into the 1536 column, and the backoff loop uses await asyncio.sleep so we no longer block the event loop (python/src/server/services/storage/code_storage_service.py).
  - RAG settings props were extended to include LLM_INSTANCE_NAME and OLLAMA_EMBEDDING_INSTANCE_NAME, and the debug log no longer prints API-key prefixes (the rest of the TanStack refactor/EMBEDDING_PROVIDER support remains deferred).
* test fix
* enhanced OpenRouter's parsing logic to automatically detect reasoning models and parse regardless of whether the output is JSON. This commit creates a robust way for Archon's parsing to work thoroughly with OpenRouter automatically, regardless of the model you're using, ensuring proper functionality without breaking any generation capabilities
* updated the LLM UI, added a separate embeddings provider, and made the system fully capable of mixing and matching LLM providers (local and non-local) for chat & embeddings. Updated the RAGSettings.tsx UI mainly, along with core functionality
* added warning labels and updated Ollama health checks
* ready for review; fixed some error warnings and consolidated Ollama status health checks
* fixed FAILED test_async_embedding_service.py
* CodeRabbit fixes
* Separated the code-summary LLM provider from the embedding provider, so code example storage now forwards a dedicated embedding provider override end-to-end without hijacking the embedding pipeline. This addresses CodeRabbit's "Preserve provider override in create_embeddings_batch" suggestion
* Swapped API credential storage to booleans so decrypted keys never sit in React state (archon-ui-main/src/components/settings/RAGSettings.tsx).
  - Normalized Ollama instance URLs and gated the metrics effect on real state changes to avoid mis-counts and duplicate fetches (RAGSettings.tsx).
  - Tightened crawl progress scaling and indented-block parsing to handle min_length=None safely (python/src/server/services/crawling/code_extraction_service.py:160, python/src/server/services/crawling/code_extraction_service.py:911).
  - Added provider-agnostic embedding rate-limit retries so Google and friends back off gracefully (python/src/server/services/embeddings/embedding_service.py:427).
  - Made the orchestration registry async + thread-safe and updated every caller to await it (python/src/server/services/crawling/crawling_service.py:34, python/src/server/api_routes/knowledge_api.py:1291).
* Update RAGSettings.tsx - header for 'LLM Settings' is now 'LLM Provider Settings'
* (RAG Settings)
  - Ollama Health Checks & Metrics
    - Added a 10-second timeout to the health fetch so it doesn't hang.
    - Adjusted logic so metric refreshes run for embedding-only Ollama setups too.
    - Initial page load now checks Ollama if either the chat or embedding provider uses it.
    - Metrics and alerts now respect which provider (chat/embedding) is currently selected.
  - Provider Sync & Alerts
    - Fixed a sync bug so the very first provider change updates settings as expected.
    - Alerts now track the active provider (chat vs embedding) rather than only the LLM provider.
    - Warnings about missing credentials now skip whichever provider is currently selected.
  - Modals & Types
    - Normalize URLs before handing them to selection modals to keep consistent data.
    - Strengthened helper function types (getDisplayedChatModel, getModelPlaceholder, etc.).
  (Crawling Service)
  - Made the orchestration registry lock lazy-initialized to avoid issues in Python 3.12 and wrapped registry commands (register, unregister) in async calls. This keeps things thread-safe even during concurrent crawling and cancellation.
* migration/complete_setup.sql:101 seeds Google/OpenRouter/Anthropic/Grok API key rows so fresh databases expose every provider by default.
  - migration/0.1.0/009_add_provider_placeholders.sql:1 backfills the same rows for existing Supabase instances and records the migration.
  - archon-ui-main/src/components/settings/RAGSettings.tsx:121 introduces a shared credential-provider map, reloadApiCredentials runs through all five providers, and the status poller includes the new keys.
  - archon-ui-main/src/components/settings/RAGSettings.tsx:353 subscribes to the archon:credentials-updated browser event so adding/removing a key immediately refetches credential status and pings the corresponding connectivity test.
  - archon-ui-main/src/components/settings/RAGSettings.tsx:926 now treats missing Anthropic/OpenRouter/Grok keys as missing, preventing stale connected badges when a key is removed.
* archon-ui-main/src/components/settings/RAGSettings.tsx:90 adds a simple display-name map and reuses one red alert style.
  - archon-ui-main/src/components/settings/RAGSettings.tsx:1016 now shows exactly one red banner when the active provider's key is missing.
  - Removed the old duplicate Missing API Key Configuration block, so the panel no longer stacks two warnings.
* Update credentialsService.ts default model
* updated the Google embedding adapter for multi-dimensional RAG querying
* thought this micro fix in the Google embedding adapter pushed with the embedding update the other day; it didn't. Pushing now
---------
Co-authored-by: Chillbruhhh <joshchesser97@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>





Pull Request
Summary
Changes Made
Enhanced the UI in RAG settings by updating and polishing the LLM provider section; split it and added an embedding provider, enabling full cross-provider selection for chat and embedding models, including mixing local and remote API providers.
Fixed some minor issues involving the provider status not updating when an API key is added, changed, or deleted.
Fixed a progress issue I noticed when crawling.
Verified credentials are initialized on startup.
Type of Change
Affected Services
Testing
Test Evidence
Checklist
Breaking Changes
Additional Notes
Summary by CodeRabbit
New Features
Bug Fixes
Chores
Migrations
Tests