Fix Issue #362: Comprehensive OpenAI Error Handling (#646)
Conversation
Replace silent failures with clear, actionable error messages to eliminate 90-minute debugging sessions when OpenAI API quota is exhausted.

## Backend Enhancements
- Add error sanitization preventing sensitive data exposure (API keys, URLs, tokens)
- Add upfront API key validation before expensive operations (crawl, upload, refresh)
- Implement fail-fast pattern in RAG service (no more empty results for API failures)
- Add specific error handling for quota, rate limit, auth, and API errors
- Add EmbeddingAuthenticationError exception with masked key prefix support

## Frontend Enhancements
- Create enhanced error utilities with OpenAI-specific parsing
- Build TanStack Query compatible API wrapper preserving ETag caching
- Update knowledge service to use enhanced error handling
- Enhance TanStack Query hooks with user-friendly error messages

## Security Features
- Comprehensive regex sanitization (8 patterns) with ReDoS protection
- Input validation and circular reference detection
- Generic fallback messages for sensitive keywords
- Bounded quantifiers to prevent regex DoS attacks

## User Experience
- Clear error messages: "OpenAI API quota exhausted"
- Actionable guidance: "Check your OpenAI billing dashboard and add credits"
- Immediate error visibility (no more silent failures)
- Appropriate error severity styling

## Architecture Compatibility
- Full TanStack Query integration maintained
- ETag caching and optimistic updates preserved
- No performance regression (all existing tests pass)
- Compatible with existing knowledge base architecture

Resolves #362: Users no longer experience mysterious empty RAG results that require extensive debugging to identify OpenAI quota issues.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
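The sanitization described above (bounded regex patterns, input validation, generic fallbacks) can be sketched roughly as follows. The pattern list, length cap, and function name here are illustrative assumptions for this sketch, not the PR's actual implementation:

```python
import re

# Bounded quantifiers ({8,64}, {1,256}) keep worst-case matching linear in the
# input, which is the "ReDoS protection" the commit refers to.
# Illustrative patterns only; the PR ships its own set of eight.
_SENSITIVE_PATTERNS = [
    (re.compile(r"sk-[A-Za-z0-9_-]{8,64}"), "[REDACTED_API_KEY]"),
    (re.compile(r"Bearer [A-Za-z0-9._-]{8,256}"), "Bearer [REDACTED]"),
    (re.compile(r"https?://[^\s]{1,256}"), "[REDACTED_URL]"),
]

MAX_INPUT_LEN = 2000  # refuse to scan very large inputs


def sanitize_error_message(message) -> str:
    """Replace API keys, tokens, and URLs with placeholders before display."""
    if not isinstance(message, str):
        return "An internal error occurred."
    message = message[:MAX_INPUT_LEN]  # length limit before any regex work
    for pattern, replacement in _SENSITIVE_PATTERNS:
        message = pattern.sub(replacement, message)
    return message
```

Running each pattern with a bounded repetition count is the key design choice: an attacker-controlled error string cannot trigger catastrophic backtracking.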
- Add detailed logging to API key validation function
- Add debug logs to frontend error parsing
- Add debug logs to TanStack Query error handler

This will help identify where error handling is failing in the UI.
The embedding service wraps OpenAI AuthenticationError as EmbeddingAPIError, causing API key validation to fail silently instead of blocking operations.

- Add specific handling for EmbeddingAPIError with 401 status codes
- Detect authentication errors by checking for '401' and 'incorrect/invalid' keywords
- Ensure API key validation properly blocks crawl operations
- Add detailed logging to identify wrapped error types

This fixes the issue where invalid API keys would start crawls that fail silently instead of showing immediate error messages in the UI.
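The keyword check this commit describes can be approximated as below; the function name and exact heuristics are illustrative, not the PR's code:

```python
def is_wrapped_auth_error(message: str) -> bool:
    """Heuristic from the commit: a wrapped EmbeddingAPIError counts as an
    authentication failure when it mentions a 401 status alongside
    'incorrect' or 'invalid' API-key wording."""
    msg = message.lower()
    return "401" in msg and ("incorrect" in msg or "invalid" in msg)
```

Partial string matching (rather than exact-message comparison) is what lets the check survive wording changes in the wrapped provider error.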
The callAPIWithETag function extracts error messages from backend responses but loses the structured error details. The enhanced wrapper now reconstructs the full error structure based on status codes and error messages.

- Add specific handling for ProjectServiceError class
- Reconstruct OpenAI error details from extracted messages
- Preserve error_type and structured message information
- Add comprehensive debug logging to trace error parsing

This ensures OpenAI authentication/quota errors are properly recognized and display the correct user-friendly messages in the UI.
Add comprehensive debug logs to parseKnowledgeBaseError to identify why OpenAI error details are not being recognized properly. This will show exactly what data structure is being received and why the error_type detection is failing.
Clean production-ready version with:

- Comprehensive OpenAI error handling (401, 429, 502)
- Proper error message sanitization
- ProjectServiceError structure handling
- Enhanced user-friendly error messages
- TanStack Query integration

Successfully resolves Issue #362: users now get immediate, clear error messages instead of silent failures and 90-minute debugging sessions.
Critical Issues Fixed:
- ✅ Remove hardcoded retry timing: use dynamic retry_after values
- ✅ Optimize regex patterns: use string operations + efficient patterns
- ✅ Improve error detection: use partial string matching instead of exact
- ✅ Fix exception swallowing: fail fast for config errors, allow network errors

Improvements Added:
- ✅ Structured error codes (OPENAI_AUTH_FAILED, OPENAI_QUOTA_EXHAUSTED, etc.)
- ✅ Additional error types (timeout_error, configuration_error)
- ✅ Length limits on error messages (prevent processing large inputs)
- ✅ Enhanced user guidance for new error types
- ✅ Updated test assertions for error codes

Security Enhancements:
- ✅ More efficient sanitization (string ops + simplified regex)
- ✅ Stricter bounds on regex patterns
- ✅ Input length validation

The implementation now meets all production readiness criteria from the code review, with robust error handling, enhanced security, and improved reliability for OpenAI error detection and user messaging.
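The structured error codes listed above suggest a classification step along these lines; this function is a hypothetical sketch (the PR derives codes from typed embedding exceptions rather than raw strings):

```python
# Hypothetical classifier mapping HTTP status + message content to the
# structured codes named in the commit. Partial matching keeps detection
# robust against provider wording changes.
def classify_openai_error(status: int, message: str) -> str:
    msg = message.lower()
    if status == 401 or "incorrect api key" in msg or "invalid api key" in msg:
        return "OPENAI_AUTH_FAILED"
    if status == 429 and ("quota" in msg or "billing" in msg):
        return "OPENAI_QUOTA_EXHAUSTED"
    if status == 429:
        return "OPENAI_RATE_LIMITED"
    if "timed out" in msg or "timeout" in msg:
        return "OPENAI_TIMEOUT"
    return "OPENAI_API_ERROR"
```

Distinguishing quota exhaustion from transient rate limiting (both arrive as 429s) is what makes the user guidance actionable: only the former needs a billing-dashboard hint.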
Walkthrough

Adds provider-aware error handling across frontend and backend for knowledge/RAG workflows. Introduces new error utils, enhanced API client, cache invalidations, provider adapters, API key validation prechecks, and refined exception propagation. Updates types to Python 3.10 unions in places, adds tests for OpenAI error paths, and includes documentation and a code review report.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant UI as UI (Knowledge Page)
    participant Hook as useKnowledgeQueries
    participant Svc as knowledgeService
    participant API as callKnowledgeAPI / uploadWithEnhancedErrors
    participant Err as errorHandler (parse/getDisplay)
    User->>UI: Trigger knowledge action (upload/crawl/delete/refresh/update)
    UI->>Hook: mutate()
    Hook->>Svc: call service endpoint
    Svc->>API: HTTP request
    API-->>Svc: Success -> data
    Svc-->>Hook: resolve
    Hook-->>UI: onSuccess
    alt Error
        API-->>Svc: throw EnhancedError
        Svc-->>Hook: reject EnhancedError
        Hook->>Err: getDisplayErrorMessage(error)
        Err-->>Hook: provider-aware message
        Hook-->>UI: onError(message)
        UI-->>User: Show provider-aware error
    end
```

```mermaid
sequenceDiagram
    autonumber
    actor Client
    participant Route as /api/knowledge/*
    participant Validator as _validate_provider_api_key
    participant Provider as ProviderErrorFactory
    participant RAG as RAG/Search Services
    Client->>Route: POST /rag | /upload | /crawl | /refresh
    Route->>Validator: pre-check API key
    Validator->>Provider: detect provider + test request
    alt Auth/Quota/Rate/API error
        Provider-->>Validator: Embedding*Error
        Validator-->>Route: raise HTTPException (401/429/502/503)
        Route-->>Client: Provider-aware HTTP error (sanitized)
    else OK or transient network issue
        Validator-->>Route: continue
        Route->>RAG: perform operation/query
        alt Embedding/RAG error
            RAG-->>Route: Embedding*Error or RuntimeError
            Route-->>Client: HTTPException with sanitized details
        else Success
            RAG-->>Route: result
            Route-->>Client: 200 response
        end
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
Pre-merge checks (4 passed, 1 inconclusive)

- ❌ Failed checks (1 inconclusive)
- ✅ Passed checks (4 passed)
Transform the OpenAI-specific error handling into an extensible multi-provider system that supports OpenAI, Google AI, Anthropic, Ollama, and future providers.

## Backend Enhancements
- Add ProviderErrorAdapter pattern with provider-specific implementations
- Create ProviderErrorFactory for unified error handling across providers
- Refactor API key validation to detect and handle any provider
- Update error sanitization to use provider-specific patterns
- Add provider context to all error responses

## Frontend Enhancements
- Rename interfaces from OpenAI-specific to provider-agnostic
- Update error detection to work with any provider name
- Add provider context to error messages and guidance
- Support provider-specific error codes and classifications

## Provider Support Added
- ✅ OpenAI: sk-* keys, org/proj/req IDs, quota/rate limit patterns
- ✅ Google AI: AIza* keys, googleapis.com URLs, project patterns
- ✅ Anthropic: sk-ant-* keys, anthropic.com URLs
- ✅ Ollama: localhost URLs, connection patterns (no API keys)

## Error Message Examples
- OpenAI: 'Invalid or expired OpenAI API key. Please check your API key in settings.'
- Google: 'Invalid or expired Google API key. Please check your API key in settings.'
- Anthropic: 'Invalid or expired Anthropic API key. Please check your API key in settings.'

## Security Features
- Provider-specific sanitization patterns prevent data exposure
- Auto-detection of provider from error content
- Structured error codes for reliable classification
- Enhanced input validation and ReDoS protection

This addresses the code review feedback to make error handling truly generic and extensible for all LLM providers, not just OpenAI, while maintaining the same level of user experience and security for each provider.
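Key-prefix detection for the providers listed above can be sketched as follows (prefixes taken from the commit; the function name is illustrative). Order matters because Anthropic keys also begin with `sk-`:

```python
def detect_provider_from_key(api_key: str) -> str:
    """Guess the provider from well-known key prefixes.
    Check 'sk-ant-' before 'sk-', since Anthropic keys share the OpenAI prefix."""
    if api_key.startswith("sk-ant-"):
        return "anthropic"
    if api_key.startswith("sk-"):
        return "openai"
    if api_key.startswith("AIza"):
        return "google"
    return "unknown"  # Ollama typically runs on localhost with no API key
```

The same prefix knowledge drives both detection and sanitization: the strings a provider uses in keys are exactly the strings that must be masked in error output.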
@CodeRabbit review
✅ Actions performed: Review triggered.
Actionable comments posted: 17
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (11)
python/src/server/services/search/hybrid_search_strategy.py (4)
191-195: Don't swallow provider/search errors; propagate per the PR's fail-fast goal.

Returning an empty list on exceptions hides OpenAI/Supabase failures and contradicts the PR objective to avoid silent empty results. Log and re-raise so upstream error mappers can surface actionable messages.
```diff
-        except Exception as e:
-            logger.error(f"Hybrid code example search failed: {e}")
-            span.set_attribute("error", str(e))
-            return []
+        except Exception as e:
+            logger.exception("Hybrid code example search failed")
+            span.set_attribute("error", str(e))
+            raise
```
103-107: Same here: avoid masking failures in document search.

Mirror the fail-fast behavior instead of returning [] so callers can present structured errors.
```diff
-        except Exception as e:
-            logger.error(f"Hybrid document search failed: {e}")
-            span.set_attribute("error", str(e))
-            return []
+        except Exception as e:
+            logger.exception("Hybrid document search failed")
+            span.set_attribute("error", str(e))
+            raise
```
54-56: Avoid mutating caller-provided filter_metadata.

filter_json aliases the input dict; popping "source" mutates caller state. Copy before mutation.
```diff
-        filter_json = filter_metadata or {}
-        source_filter = filter_json.pop("source", None) if "source" in filter_json else None
+        filter_json = dict(filter_metadata) if filter_metadata else {}
+        source_filter = filter_json.pop("source", None)
```
138-143: Same mutable-dict issue in code search.

Copy the dict before popping to prevent side effects.
```diff
-        filter_json = filter_metadata or {}
+        filter_json = dict(filter_metadata) if filter_metadata else {}
         # Use source_id parameter if provided, otherwise check filter_metadata
         final_source_filter = source_id
         if not final_source_filter and "source" in filter_json:
             final_source_filter = filter_json.pop("source")
```

python/src/server/services/threading_service.py (2)
481-490: Bug: run_in_executor receives kwargs for the loop, not for func (TypeError risk).

Passing `**kwargs` here binds them to `run_in_executor`, not to `func`. This will raise on any keyword usage. Wrap with `functools.partial`.

Apply:
```diff
     async def run_cpu_intensive(self, func: Callable, *args, **kwargs) -> Any:
         """Run CPU-intensive function in thread pool"""
-        loop = asyncio.get_event_loop()
-        return await loop.run_in_executor(self.cpu_executor, func, *args, **kwargs)
+        loop = asyncio.get_event_loop()
+        bound = functools.partial(func, *args, **kwargs)
+        return await loop.run_in_executor(self.cpu_executor, bound)

     async def run_io_bound(self, func: Callable, *args, **kwargs) -> Any:
         """Run I/O-bound function in thread pool"""
-        loop = asyncio.get_event_loop()
-        return await loop.run_in_executor(self.io_executor, func, *args, **kwargs)
+        loop = asyncio.get_event_loop()
+        bound = functools.partial(func, *args, **kwargs)
+        return await loop.run_in_executor(self.io_executor, bound)
```

Also add at top-level imports:

```python
import functools
```
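The `functools.partial` fix matters because `loop.run_in_executor` forwards only positional arguments to the target; keyword arguments would be interpreted as (nonexistent) parameters of `run_in_executor` itself and raise `TypeError`. A minimal standalone demonstration:

```python
import asyncio
import functools


def scale(value, *, factor=1):
    return value * factor


async def main():
    loop = asyncio.get_running_loop()
    # loop.run_in_executor(None, scale, 10, factor=3) would raise TypeError,
    # because run_in_executor does not forward keyword arguments to scale.
    bound = functools.partial(scale, 10, factor=3)
    return await loop.run_in_executor(None, bound)


result = asyncio.run(main())  # → 30
```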
108-121: Token limit wait not honored; can prematurely raise “Rate limit exceeded”.
`_calculate_wait_time()` only considers request count, not tokens. When the token budget is exceeded, `wait_time` can be 0 and `acquire()` returns False immediately, failing fast even though waiting a few seconds would succeed.
- Compute wait for both windows (requests and tokens) and take the max.
- Avoid immediate False; sleep a small floor to prevent a tight loop, bounded by `max_backoff`.

Apply:
```diff
-        # Calculate wait time if we can't make the request
-        wait_time = self._calculate_wait_time(estimated_tokens)
-        if wait_time > 0:
+        # Calculate wait time if we can't make the request
+        wait_time = self._calculate_wait_time(estimated_tokens)
+        if wait_time > 0:
             logfire_logger.info(
                 f"Rate limiting: waiting {wait_time:.1f}s",
                 extra={
                     "tokens": estimated_tokens,
                     "current_usage": self._get_current_usage(),
                 }
             )
-            wait_time_to_sleep = wait_time
+            wait_time_to_sleep = min(wait_time, self.config.max_backoff)
         else:
-            return False
+            # Token window may be at capacity even when request window is free.
+            # Yield briefly instead of failing immediately.
+            wait_time_to_sleep = 0.05
```

And replace
`_calculate_wait_time`:

```diff
-    def _calculate_wait_time(self, estimated_tokens: int) -> float:
-        """Calculate how long to wait before retrying"""
-        if not self.request_times:
-            return 0
-
-        oldest_request = self.request_times[0]
-        time_since_oldest = time.time() - oldest_request
-
-        if time_since_oldest < 60:
-            return 60 - time_since_oldest + 0.1
-
-        return 0
+    def _calculate_wait_time(self, estimated_tokens: int) -> float:
+        """Calculate how long to wait before retrying (requests and tokens)."""
+        # Request-based wait
+        req_wait = 0.0
+        if self.request_times:
+            oldest_request = self.request_times[0]
+            time_since_oldest = time.time() - oldest_request
+            if time_since_oldest < 60:
+                req_wait = 60 - time_since_oldest + 0.1
+
+        # Token-based wait
+        tok_wait = 0.0
+        if self.token_usage:
+            current_tokens = sum(tokens for _, tokens in self.token_usage)
+            over = current_tokens + estimated_tokens - self.config.tokens_per_minute
+            if over > 0:
+                now = time.time()
+                freed = 0
+                for ts, tokens in self.token_usage:
+                    age = now - ts
+                    remaining = max(0.0, 60 - age)
+                    freed += tokens
+                    if freed >= over:
+                        tok_wait = remaining + 0.1
+                        break
+
+        wait = max(req_wait, tok_wait)
+        return min(max(wait, 0.0), self.config.max_backoff)
```

Also applies to: 166-177
python/src/mcp_server/features/projects/project_tools.py (1)
185-193: Bug: projects payload shape is wrong; "projects" should be an array.

You're returning the entire response object under "projects", which likely nests incorrectly. Handle both array and object responses (like tasks.list) and return a normalized structure.
```diff
-        if response.status_code == 200:
-            response_data = response.json()
-            # Response already includes projects array and count
-            return json.dumps({
-                "success": True,
-                "projects": response_data,
-                "count": response_data.get("count", 0),
-            })
+        if response.status_code == 200:
+            response_data = response.json()
+            # Normalize: accept either list or object { projects, count }
+            if isinstance(response_data, list):
+                projects = response_data
+                count = len(projects)
+            elif isinstance(response_data, dict):
+                projects = response_data.get("projects", [])
+                count = response_data.get("count", len(projects))
+            else:
+                return MCPErrorFormatter.format_error(
+                    error_type="invalid_response",
+                    message="Unexpected response format from API",
+                    details={"response_type": type(response_data).__name__},
+                    suggestion="Expected list or object with 'projects'.",
+                )
+            return json.dumps({
+                "success": True,
+                "projects": projects,
+                "count": count,
+            })
```

python/src/server/services/knowledge/knowledge_item_service.py (1)
466-474: Don't fetch rows when only the count is needed.

Use head=True to avoid transferring page data.
```diff
-            self.supabase.table("archon_crawled_pages")
-            .select("*", count="exact")
+            self.supabase.table("archon_crawled_pages")
+            .select("id", count="exact", head=True)
             .eq("source_id", source_id)
             .execute()
```

python/src/server/services/crawling/helpers/url_handler.py (1)
316-333: Handle Markdown link titles in text.

The current regex captures the url plus optional title in the md group. Strip the title and optional angle brackets to avoid polluted URLs.
```diff
-        urls = []
-        for match in re.finditer(combined_pattern, content):
-            url = (
-                match.group('md')
-                or match.group('auto')
-                or match.group('bare')
-                or match.group('proto')
-                or match.group('www')
-            )
-            if not url:
-                continue
-            url = _clean_url(url)
+        urls = []
+        for match in combined_pattern.finditer(content):
+            grp = match.lastgroup
+            raw = match.group(grp) if grp else None
+            if not raw:
+                continue
+            if grp == 'md':
+                # Extract URL only (strip optional title and angle brackets)
+                m = re.match(r'^\s*<?([^>\s]+)>?', raw)
+                url = m.group(1) if m else raw
+            else:
+                url = raw
+            url = _clean_url(url)
```

python/src/server/services/knowledge/knowledge_summary_service.py (1)
183-193: Eliminate N+1 count queries with a single grouped query per table.

Looping performs one count per `source_id`. Use a grouped count to cut roundtrips and improve performance.

Example (pseudocode; adapt to Supabase/PostgREST capabilities or use RPC/raw SQL):
```diff
-        counts = {}
-        for source_id in source_ids:
-            result = (
-                self.supabase.from_("archon_crawled_pages")
-                .select("id", count="exact", head=True)
-                .eq("source_id", source_id)
-                .execute()
-            )
-            counts[source_id] = result.count if hasattr(result, "count") else 0
+        rows = (
+            self.supabase.from_("archon_crawled_pages")
+            .select("source_id, count:id", head=False)
+            .in_("source_id", source_ids)
+            .group("source_id")
+            .execute()
+        ).data or []
+        counts = {sid: 0 for sid in source_ids}
+        for r in rows:
+            counts[r["source_id"]] = r.get("count", 0)
```

Also ensure indexes:
- archon_crawled_pages(source_id)
- archon_code_examples(source_id)
Also applies to: 212-222
archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts (1)
116-121: `new URL(request.url)` can throw; protect the optimistic path.

Invalid URLs will throw before the mutation runs, breaking optimistic updates. Add a safe hostname fallback.
```diff
-      title: new URL(request.url).hostname || "New crawl",
+      title: (() => { try { return new URL(request.url).hostname; } catch { return "New crawl"; } })(),
```
🧹 Nitpick comments (36)
python/src/server/services/search/hybrid_search_strategy.py (3)
69-71: Set span results_count=0 on empty responses.

Tiny observability win; keeps metrics consistent with the non-empty path.
```diff
         if not response.data:
             logger.debug("No results from hybrid search")
+            span.set_attribute("results_count", 0)
             return []
```

```diff
         if not response.data:
             logger.debug("No results from hybrid code search")
+            span.set_attribute("results_count", 0)
             return []
```

Also applies to: 156-159
31-37: Validate and clamp match_count to protect the DB/RPC.

Prevent excessive loads or invalid inputs (e.g., negative or huge limits).
```diff
     async def search_documents_hybrid(
         self,
         query: str,
         query_embedding: list[float],
         match_count: int,
         filter_metadata: dict | None = None,
     ) -> list[dict[str, Any]]:
+        match_count = max(1, min(match_count, 200))
```

```diff
     async def search_code_examples_hybrid(
         self,
         query: str,
         match_count: int,
         filter_metadata: dict | None = None,
         source_id: str | None = None,
     ) -> list[dict[str, Any]]:
+        match_count = max(1, min(match_count, 200))
```

Also applies to: 109-115
73-87: De-duplicate result shaping.

Both methods transform rows identically (aside from "summary"). Consider a small helper to DRY this and centralize schema changes.
Also applies to: 160-175
python/src/server/services/threading_service.py (2)
466-479: Don't hold concurrency slots while sleeping for rate limits.

Acquire rate limits before entering the concurrency semaphore to avoid head-of-line blocking when many tasks wait.
```diff
-        async with self.rate_limiter.semaphore:
-            can_proceed = await self.rate_limiter.acquire(estimated_tokens, progress_callback)
-            if not can_proceed:
-                raise Exception("Rate limit exceeded")
-
-            start_time = time.time()
-            try:
-                yield
-            finally:
-                duration = time.time() - start_time
-                logfire_logger.debug(
-                    "Rate limited operation completed",
-                    extra={"duration": duration, "tokens": estimated_tokens},
-                )
+        can_proceed = await self.rate_limiter.acquire(estimated_tokens, progress_callback)
+        if not can_proceed:
+            raise Exception("Rate limit exceeded")
+        async with self.rate_limiter.semaphore:
+            start_time = time.time()
+            try:
+                yield
+            finally:
+                duration = time.time() - start_time
+                logfire_logger.debug(
+                    "Rate limited operation completed",
+                    extra={"duration": duration, "tokens": estimated_tokens},
+                )
```
85-91: Tighten progress_callback typing.

It's awaited; annotate it as an async callable for clarity and tooling.
```diff
-    async def acquire(self, estimated_tokens: int = 8000, progress_callback: Callable | None = None) -> bool:
+    async def acquire(
+        self,
+        estimated_tokens: int = 8000,
+        progress_callback: Callable[[dict], "asyncio.Future[Any] | asyncio.coroutines.Coroutine[Any, Any, None]"] | None = None,
+    ) -> bool:
```

Alternatively, import `Awaitable` and use `Callable[[dict], Awaitable[None]] | None`.

python/src/server/utils/progress/progress_tracker.py (2)
104-109: Reduce info-level noise for debug-only logs.

The document_storage DEBUG logs at high progress may flood logs in large runs. Consider gating them or lowering the level.
Also applies to: 133-139
154-158: Protect additional core fields from accidental override.

Consider adding `end_time`, `duration`, `duration_formatted`, `error`, `error_time` to `protected_fields`.

```diff
-        protected_fields = {"progress", "status", "log", "progress_id", "type", "start_time"}
+        protected_fields = {
+            "progress", "status", "log", "progress_id", "type", "start_time",
+            "end_time", "duration", "duration_formatted", "error", "error_time"
+        }
```

python/src/mcp_server/features/tasks/task_tools.py (1)
425-429: Unify the 404 error shape with MCPErrorFormatter.

Return a standardized error object (with http_status) for not-found to keep responses consistent across tools.
```diff
-        elif response.status_code == 404:
-            return json.dumps({
-                "success": False,
-                "error": f"Task {task_id} not found. Use list_tasks to find valid task IDs.",
-            })
+        elif response.status_code == 404:
+            return MCPErrorFormatter.format_error(
+                error_type="not_found",
+                message=f"Task {task_id} not found",
+                suggestion="Use list_tasks to find valid task IDs.",
+                http_status=404,
+            )
```

python/src/server/services/embeddings/contextual_embedding_service.py (1)
35-46: Remove the unused MODEL_CHOICE lookup (dead code + duplicate source of truth).

model_choice is fetched but unused; model selection is handled later by _get_model_choice(). Simplify to avoid confusion.
```diff
-        # Model choice is a RAG setting, get from credential service
-        try:
-            from ...services.credential_service import credential_service
-
-            model_choice = await credential_service.get_credential("MODEL_CHOICE", "gpt-4.1-nano")
-        except Exception as e:
-            # Fallback to environment variable or default
-            search_logger.warning(
-                f"Failed to get MODEL_CHOICE from credential service: {e}, using fallback"
-            )
-            model_choice = os.getenv("MODEL_CHOICE", "gpt-4.1-nano")
-
-        search_logger.debug(f"Using MODEL_CHOICE: {model_choice}")
+        # Model selection is handled by _get_model_choice()
```

python/src/server/services/knowledge/knowledge_item_service.py (2)
112-132: Avoid N+1 count queries for code examples.

The per-source head-count loop issues one request per source_id. Prefer a single grouped count (or an RPC) to reduce round trips, especially at higher page sizes.
If you’d like, I can sketch an RPC (PostgREST) returning (source_id, count) and the Python call-site changes.
115-123: Rename the shadowed variable for clarity.

`count_result` here reuses the name used earlier for the total sources count; rename to avoid confusion.
Apply this minimal rename:
```diff
-                count_result = (
+                ce_count_result = (
                     self.supabase.from_("archon_code_examples")
                     .select("id", count="exact", head=True)
                     .eq("source_id", source_id)
                     .execute()
                 )
-                code_example_counts[source_id] = (
-                    count_result.count if hasattr(count_result, "count") else 0
-                )
+                code_example_counts[source_id] = (
+                    ce_count_result.count if hasattr(ce_count_result, "count") else 0
+                )
```

python/src/server/main.py (1)
194-199: Update the modules list to reflect the removed MCP client router.

The UI payload still advertises "mcp-clients" though it's removed above.

```diff
-        "modules": ["settings", "mcp", "mcp-clients", "knowledge", "projects"],
+        # "mcp-clients" removed from architecture
+        "modules": ["settings", "mcp", "knowledge", "projects"],
```

python/src/mcp_server/features/feature_tools.py (1)
77-80: Minor: explicitly request JSON from the API.

Helps avoid content-type surprises.
```diff
-        response = await client.get(
-            urljoin(api_url, f"/api/projects/{project_id}/features")
-        )
+        response = await client.get(
+            urljoin(api_url, f"/api/projects/{project_id}/features"),
+            headers={"Accept": "application/json"}
+        )
```

python/src/server/services/embeddings/embedding_exceptions.py (1)
113-115: Tiny logging/UX nit: prefer an ASCII ellipsis.

Some logs/viewers mangle the single-character ellipsis.
```diff
-        self.api_key_prefix = api_key_prefix[:3] + "…" if api_key_prefix and len(api_key_prefix) >= 3 else None
+        self.api_key_prefix = (api_key_prefix[:3] + "...") if api_key_prefix and len(api_key_prefix) >= 3 else None
```

python/src/server/services/crawling/helpers/url_handler.py (3)
164-169: Micro: simplify the extension check.

Use endswith(tuple) instead of iterating.
```diff
-        for ext in binary_extensions:
-            if path.endswith(ext):
-                logger.debug(f"Skipping binary file: {url} (matched extension: {ext})")
-                return True
+        if path.endswith(tuple(binary_extensions)):
+            logger.debug(f"Skipping binary file: {url}")
+            return True
```
359-360: Log at debug to reduce noise.

Link extraction counts are better suited for debug-level logs.
```diff
-        logger.info(f"Extracted {len(unique_urls)} unique links from content")
+        logger.debug(f"Extracted {len(unique_urls)} unique links from content")
```
207-276: Consider a longer hash to further reduce collision risk.

64-bit (16 hex chars) is likely fine up to ~1M sources, but bumping to 128-bit (32 hex) future-proofs without much cost.
```diff
-        # Generate SHA256 hash and take first 16 characters
-        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
+        # Generate SHA256 hash and take first 32 characters (128-bit)
+        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:32]
```

python/src/server/services/knowledge/knowledge_summary_service.py (5)
127-135: Avoid parameter shadowing for clarity and future-safety.

The loop-level `knowledge_type = metadata.get("knowledge_type")` shadows the function parameter of the same name. Rename the local to avoid confusion and accidental misuse.

```diff
-            knowledge_type = metadata.get("knowledge_type")
-            if not knowledge_type:
+            item_knowledge_type = metadata.get("knowledge_type")
+            if not item_knowledge_type:
                 # Fallback: If not in metadata, default to "technical" for now
                 # This handles legacy data that might not have knowledge_type set
                 safe_logfire_info(f"Knowledge type not found in metadata for {source_id}, defaulting to technical")
-                knowledge_type = "technical"
+                item_knowledge_type = "technical"
 @@
-                "knowledge_type": knowledge_type,
+                "knowledge_type": item_knowledge_type,
```
102-112: Parallelize independent batch fetches to reduce latency.
`_get_document_counts_batch`, `_get_code_example_counts_batch`, and `_get_first_urls_batch` are independent. Run them concurrently.

```diff
-        # Get document counts in a single query
-        doc_counts = await self._get_document_counts_batch(source_ids)
-
-        # Get code example counts in a single query
-        code_counts = await self._get_code_example_counts_batch(source_ids)
-
-        # Get first URLs in a single query
-        first_urls = await self._get_first_urls_batch(source_ids)
+        import asyncio
+        doc_counts, code_counts, first_urls = await asyncio.gather(
+            self._get_document_counts_batch(source_ids),
+            self._get_code_example_counts_batch(source_ids),
+            self._get_first_urls_batch(source_ids),
+        )
```
241-246: Order by a selected column and ensure a deterministic "first" URL.

You order by `created_at` but don't include it in `select`. Add it for deterministic client-side grouping and to avoid potential backend constraints requiring selected order columns.

```diff
-            .select("source_id, url")
+            .select("source_id, url, created_at")
             .in_("source_id", source_ids)
             .order("created_at", desc=False)
```
239-260: Consider server-side "first row per group" to reduce payload.

The current approach retrieves all rows for `source_id ∈ set`, then picks the first per group in Python. If volume grows, move "first url per source" to the DB via a window function or RPC to return only one row per `source_id`.
164-167: Preserve exception context.

Re-raising without context loses stack info. Use a bare `raise` to preserve the original traceback after logging.

```diff
-            safe_logfire_error(f"Failed to get knowledge summaries | error={str(e)}")
-            raise
+            safe_logfire_error(f"Failed to get knowledge summaries | error={e}")
+            raise  # preserves original traceback
```

code-review1.md (3)
26-36: Resolve the contradiction: ReDoS "Potential Vulnerability" vs "Bounded Regex prevents ReDoS."

The report flags a potential ReDoS, but later asserts bounded regex prevents it. Clarify the final stance and, if still a risk, include a concrete action item.
Also applies to: 98-105
17-25: Avoid brittle line anchors.

Hard-coded line numbers (e.g., ts:199) will drift. Prefer permalinks with commit hashes, code fences with context, or symbol-level references.
240-247: Tighten the "High Priority" recommendations with owners and acceptance criteria.

Add explicit owners and measurable acceptance tests (e.g., "Retry-after honored from header; unit test asserts backoff honors 429 Retry-After").
archon-ui-main/src/features/knowledge/utils/errorHandler.ts (2)
126-129: DRY provider name formatting.
`provider.charAt(0).toUpperCase() + ...` is duplicated. Extract a helper for consistency.

```diff
+function fmtProvider(provider?: string) {
+  return provider ? provider[0].toUpperCase() + provider.slice(1) : 'LLM';
+}
 @@
-  const provider = error.errorDetails.provider ? error.errorDetails.provider.charAt(0).toUpperCase() + error.errorDetails.provider.slice(1) : 'LLM';
+  const provider = fmtProvider(error.errorDetails.provider);
 @@
-  const provider = error.errorDetails.provider ? error.errorDetails.provider.charAt(0).toUpperCase() + error.errorDetails.provider.slice(1) : 'LLM';
+  const provider = fmtProvider(error.errorDetails.provider);
```

Also applies to: 205-207
138-146: Guard against noisy provider-supplied messages.
`api_error` displays the raw provider message. If backend sanitization regresses, this could leak details. Consider truncation or a safe fallback beyond current backend guarantees.

TASK_PRP/PRPs/rebase-bugfix-issue-362.md (2)
69-77: Codify rebase conflict policies.

Document precedence rules (e.g., "prefer main for infra, prefer feature branch for provider errors") to speed decisions during interactive rebase.
125-132: Automate validations.

Where possible, replace manual steps with scripts (lint, unit, E2E) to reduce human error during multi-hour rebases.
I can draft a minimal `./tools/rebase-validate.sh` to run the listed checks if helpful.

archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts (3)
85-96: Type the mutation error as `unknown` (or `EnhancedError`) to avoid casts.

You currently declare `Error` then cast to `EnhancedError` in onError. Use `unknown` to reflect reality, or `EnhancedError` if the service guarantees it.

```diff
-  return useMutation<
-    CrawlStartResponse,
-    Error,
+  return useMutation<
+    CrawlStartResponse,
+    unknown,
     CrawlRequest,
```

Apply similarly to other mutations.
Also applies to: 294-304, 501-506, 555-561, 604-607
277-283: Centralize error toast logic.

The onError blocks duplicate the same pattern. Extract a small helper (e.g., `toastProviderError(showToast, error, defaultMessage)`) to reduce repetition and ensure uniform UX.

Also applies to: 457-462, 531-536, 581-586, 620-625
687-700: Reduce refetch churn after operations complete.
Consider `queryClient.invalidateQueries` with `refetchType: 'active'` and debouncing per key to avoid bursts when many operations finish nearly simultaneously.
archon-ui-main/src/features/knowledge/services/apiWithEnhancedErrors.ts (1)
28-32: Consider extracting provider detection logic.
The provider detection logic is duplicated here and could be extracted into a utility function for reusability.
Consider creating a utility function:
```ts
function detectProviderFromError(message: string): string {
  if (message.includes("OpenAI")) return "OpenAI";
  if (message.includes("Google")) return "Google";
  if (message.includes("Anthropic")) return "Anthropic";
  if (message.includes("Ollama")) return "Ollama";
  return "LLM";
}
```

python/src/server/services/embeddings/provider_error_adapters.py (3)
121-129: Avoid over-sanitizing generic words like "server".
This can unnecessarily hide helpful diagnostics (e.g., “server returned 502”). Keep “token” and conditionally add “endpoint” as you already do.

```diff
- sensitive_words = ['internal', 'server', 'token']
+ sensitive_words = ['internal', 'token']
```
338-341: Fail fast on unsupported providers (avoid silent OpenAI fallback).
Defaulting to OpenAI can cause wrong sanitization and UX. Prefer an explicit error.

```diff
- return cls._adapters.get(provider.lower(), cls._adapters["openai"])
+ try:
+     return cls._adapters[provider.lower()]
+ except KeyError:
+     raise ValueError(f"Unsupported provider: {provider}")
```
359-374: Broaden provider detection for Google (Gemini/Vertex) and keep Anthropic-first.
Optional: include common Google GenAI identifiers to improve detection accuracy.

```diff
- elif "google" in error_lower or "googleapis" in error_lower or "AIza" in error_str:
+ elif any(s in error_lower for s in ("google", "googleapis", "vertex ai", "generative ai", "gemini")) or "AIza" in error_str:
      return "google"
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (28)
- TASK_PRP/PRPs/rebase-bugfix-issue-362.md (1 hunks)
- archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts (6 hunks)
- archon-ui-main/src/features/knowledge/services/apiWithEnhancedErrors.ts (1 hunks)
- archon-ui-main/src/features/knowledge/services/knowledgeService.ts (9 hunks)
- archon-ui-main/src/features/knowledge/utils/errorHandler.ts (1 hunks)
- code-review1.md (1 hunks)
- python/src/agents/base_agent.py (1 hunks)
- python/src/mcp_server/features/documents/document_tools.py (1 hunks)
- python/src/mcp_server/features/documents/version_tools.py (1 hunks)
- python/src/mcp_server/features/feature_tools.py (1 hunks)
- python/src/mcp_server/features/projects/project_tools.py (1 hunks)
- python/src/mcp_server/features/tasks/task_tools.py (1 hunks)
- python/src/mcp_server/mcp_server.py (0 hunks)
- python/src/mcp_server/modules/rag_module.py (0 hunks)
- python/src/server/api_routes/knowledge_api.py (5 hunks)
- python/src/server/main.py (2 hunks)
- python/src/server/services/crawling/crawling_service.py (1 hunks)
- python/src/server/services/crawling/helpers/url_handler.py (9 hunks)
- python/src/server/services/embeddings/contextual_embedding_service.py (1 hunks)
- python/src/server/services/embeddings/embedding_exceptions.py (1 hunks)
- python/src/server/services/embeddings/provider_error_adapters.py (1 hunks)
- python/src/server/services/knowledge/knowledge_item_service.py (1 hunks)
- python/src/server/services/knowledge/knowledge_summary_service.py (10 hunks)
- python/src/server/services/search/hybrid_search_strategy.py (1 hunks)
- python/src/server/services/search/rag_service.py (2 hunks)
- python/src/server/services/threading_service.py (3 hunks)
- python/src/server/utils/progress/progress_tracker.py (9 hunks)
- python/tests/test_openai_error_handling.py (1 hunks)
💤 Files with no reviewable changes (2)
- python/src/mcp_server/modules/rag_module.py
- python/src/mcp_server/mcp_server.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-28T13:07:24.810Z
Learnt from: Wirasm
PR: coleam00/Archon#514
File: archon-ui-main/src/services/crawlProgressService.ts:35-39
Timestamp: 2025-08-28T13:07:24.810Z
Learning: The crawlProgressService.ts in the Archon codebase should be deprecated in favor of the existing useCrawlProgressPolling hook from usePolling.ts, which already includes ETag support, 304 handling, tab visibility detection, and proper React lifecycle integration. This consolidation reduces code duplication and improves performance.
Applied to files:
archon-ui-main/src/features/knowledge/hooks/useKnowledgeQueries.ts
🪛 GitHub Actions: Continuous Integration
python/tests/test_openai_error_handling.py
[error] 21-21: ImportError: cannot import name '_sanitize_openai_error' from 'src.server.api_routes.knowledge_api' (tests/test_openai_error_handling.py:21).
python/src/server/api_routes/knowledge_api.py
[error] 1-1: ImportError: cannot import name '_sanitize_openai_error' from 'src.server.api_routes.knowledge_api' during test collection (tests/test_openai_error_handling.py). The symbol '_sanitize_openai_error' is not defined in that module. Command: 'uv run pytest tests/ --verbose --tb=short --cov=src --cov-report=xml --cov-report=html --cov-report=term-missing --junitxml=test-results.xml'.
🔇 Additional comments (26)
python/src/server/utils/progress/progress_tracker.py (1)
110-132: LGTM: Non-decreasing progress guard is correct.
Prevents regressions while still allowing status updates; 0–100 clamp is appropriate.
python/src/server/services/crawling/crawling_service.py (1)
490-497: No functional change detected (whitespace).
Nothing to do.
python/src/mcp_server/features/tasks/task_tools.py (1)
16-16: LGTM: centralized error formatting import.
Importing MCPErrorFormatter here is correct and aligns this module with the new error-handling flow.
python/src/mcp_server/features/projects/project_tools.py (1)
16-16: LGTM: adopt MCPErrorFormatter import.
Consistent with PR-wide error handling changes.
python/src/mcp_server/features/documents/version_tools.py (1)
16-16: LGTM: standardized error formatter import.
Matches the new error-handling approach; no further changes needed here.
python/src/server/services/embeddings/contextual_embedding_service.py (1)
219-223: LGTM: formatting-only change.
Trailing newline addition; no behavior impact.
python/src/mcp_server/features/documents/document_tools.py (1)
16-16: LGTM: migrate to MCPErrorFormatter.
Consistent with the centralized error handling strategy.
python/src/server/services/knowledge/knowledge_item_service.py (1)
142-146: No-op formatting change — OK.
Whitespace-only tweak around display_url; no behavioral impact.
python/src/server/main.py (2)
116-118: Good: removed unused exception binding.
Keeps logs via exc_info without linter noise.
138-140: Good: removed unused exception binding (shutdown path).
Consistent with startup path.
python/src/mcp_server/features/feature_tools.py (1)
13-15: Adopting MCPErrorFormatter is the right move.
Centralized, consistent error shapes; aligns with the PR’s objectives.
python/src/server/services/embeddings/embedding_exceptions.py (1)
102-116: New EmbeddingAuthenticationError looks solid.
Clear semantics, safe masking, and useful metadata. Good fit for provider adapters and UI mapping.
python/src/server/services/crawling/helpers/url_handler.py (3)
9-9: Import change — OK.
`urljoin` is used below; good addition.
36-55: Markdown detector is reasonable and defensive.
Lowercases path and ignores query/fragment; good error handling.
367-431: Heuristics for link-collection files look good.
Filename-based checks plus density threshold and “full” exclusion are sensible for preserving single-page behavior.
python/src/server/services/search/rag_service.py (3)
119-121: Good implementation of fail-fast pattern for embedding failures.
The change to raise a RuntimeError instead of returning empty results is correct and aligns with the fail-fast principle. This ensures that configuration or API issues are properly surfaced to users.
145-155: Proper error propagation for embedding provider errors.
The dynamic import and re-raising of specific embedding exceptions allows precise error information to flow through to the API layer. This is a good architectural decision that maintains proper separation of concerns.
159-160: Consistent fail-fast behavior for generic failures.
The RuntimeError with chained exception preserves the error context while ensuring failures are not silently ignored.
archon-ui-main/src/features/knowledge/services/knowledgeService.ts (2)
6-7: Good migration to enhanced error handling wrapper.
The replacement of `callAPIWithETag` with `callKnowledgeAPI` and the addition of `uploadWithEnhancedErrors` ensure consistent provider-aware error handling across all knowledge service endpoints while preserving ETag caching functionality.
136-137: No change required — the /api prefix is added by the upload wrapper.
uploadWithEnhancedErrors sets API_BASE_URL = "/api", so passing "/documents/upload" yields "/api/documents/upload"; the backend router also uses prefix="/api" and defines @router.post("/documents/upload"). Files: archon-ui-main/src/features/knowledge/services/apiWithEnhancedErrors.ts and python/src/server/api_routes/knowledge_api.py.
Likely an incorrect or invalid review comment.
archon-ui-main/src/features/knowledge/services/apiWithEnhancedErrors.ts (2)
119-133: Good implementation of upload endpoint path construction.
The upload wrapper correctly handles test environment URLs and maintains consistency with the base API path.
164-176: Proper timeout error handling.
The timeout error handling with AbortError detection and appropriate error message is well implemented.
python/src/server/api_routes/knowledge_api.py (4)
56-65: Well-designed provider-aware error sanitization.
The `_sanitize_provider_error` function properly delegates to the ProviderErrorFactory for provider-specific sanitization, preventing information disclosure.
68-228: Comprehensive API key validation with excellent error handling.
The `_validate_provider_api_key` function handles all edge cases well:
- Auto-detects provider from configuration
- Handles authentication, quota, and rate limit errors distinctly
- Provides clear, actionable error messages
- Allows network errors to pass through gracefully
- Fails fast for configuration errors
The error handling is thorough and user-friendly.
657-658: Good placement of API key validation checks.
The CRITICAL comments and validation calls are properly placed before resource-intensive operations (refresh, crawl, upload).
Also applies to: 778-779, 934-935
1161-1247: Excellent provider-aware error handling in RAG endpoint.
The error handling in `perform_rag_query` is comprehensive:
- Determines the current provider dynamically
- Handles each error type with appropriate HTTP status codes
- Provides provider-specific, sanitized error messages
- Includes relevant metadata in error responses
This implementation provides excellent user experience while preventing information leakage.
```ts
const enhancedError: EnhancedError = new Error(error.message || 'Unknown error');

// Check if this is an HTTP response error with JSON details
if (error && typeof error === 'object') {
  // Handle fetch Response errors
  if (error.status || error.statusCode) {
    enhancedError.statusCode = error.status || error.statusCode;
  }
```
Type error: assigning Error to EnhancedError.
const enhancedError: EnhancedError = new Error(...) is not assignable in TS. Cast or build via Object.assign to satisfy the structural type.
```diff
- const enhancedError: EnhancedError = new Error(error.message || 'Unknown error');
+ const enhancedError = Object.assign(new Error(error.message || 'Unknown error'), {}) as EnhancedError;
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
const enhancedError = Object.assign(new Error(error.message || 'Unknown error'), {}) as EnhancedError;

// Check if this is an HTTP response error with JSON details
if (error && typeof error === 'object') {
  // Handle fetch Response errors
  if (error.status || error.statusCode) {
    enhancedError.statusCode = error.status || error.statusCode;
  }
```
🤖 Prompt for AI Agents
In archon-ui-main/src/features/knowledge/utils/errorHandler.ts around lines 94
to 101, the code declares "const enhancedError: EnhancedError = new Error(...)"
which causes a TypeScript type error because Error is not structurally
compatible with EnhancedError; replace this by creating an EnhancedError object
via Object.assign (e.g., Object.assign(new Error(...), {}) ) or by constructing
the Error and then casting it to EnhancedError (e.g., const enhancedError = new
Error(...) as unknown as EnhancedError), and then populate statusCode and other
fields on that object so subsequent assignments (enhancedError.statusCode = ...)
type-check without error.
```python
except TimeoutError:
    self.logger.error(f"Agent {self.name} timed out after 120 seconds")
    raise Exception(f"Agent {self.name} operation timed out - taking too long to respond")
```
Timeout handling will miss asyncio.wait_for timeouts
asyncio.wait_for raises asyncio.TimeoutError, not necessarily the built-in TimeoutError on Python 3.10. Catch both to avoid leaking timeouts.
```diff
- except TimeoutError:
+ except (asyncio.TimeoutError, TimeoutError):
      self.logger.error(f"Agent {self.name} timed out after 120 seconds")
      raise Exception(f"Agent {self.name} operation timed out - taking too long to respond")
```

📝 Committable suggestion
```python
except (asyncio.TimeoutError, TimeoutError):
    self.logger.error(f"Agent {self.name} timed out after 120 seconds")
    raise Exception(f"Agent {self.name} operation timed out - taking too long to respond")
```
🤖 Prompt for AI Agents
In python/src/agents/base_agent.py around lines 219-221, the except currently
only catches the built-in TimeoutError but asyncio.wait_for raises
asyncio.TimeoutError; update the handler to catch both (asyncio.TimeoutError,
TimeoutError) as e (add an import for asyncio if not present), log the exception
details (e) in the logger message, and re-raise or wrap the error preserving the
original as the cause (e.g., raise Exception(...) from e) so asyncio timeouts
are not leaked.
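The behavior described above can be checked with a minimal, self-contained sketch (illustrative names, not the Archon agent code): on Python 3.11+ `asyncio.TimeoutError` is an alias of the built-in `TimeoutError`, while on 3.10 and earlier they are distinct, so catching both covers every version.

```python
import asyncio

async def slow_operation():
    await asyncio.sleep(1)  # simulates an agent call that never finishes in time

async def run_with_timeout() -> str:
    try:
        # asyncio.wait_for raises asyncio.TimeoutError (alias of TimeoutError on 3.11+,
        # a distinct class on 3.10 and earlier)
        await asyncio.wait_for(slow_operation(), timeout=0.01)
    except (asyncio.TimeoutError, TimeoutError):
        return "timeout caught"
    return "no timeout"

print(asyncio.run(run_with_timeout()))  # → timeout caught
```

Catching only the bare `TimeoutError` would silently miss the timeout on 3.10, which is exactly the leak the suggested diff prevents.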
```python
            }
        )

    except EmbeddingAuthenticationError as e:
```
💡 Verification agent
🧩 Analysis chain
Verify missing import issue in test.
The exception classes are used but not imported at the module level.
🏁 Script executed:
```bash
#!/bin/bash
# Description: Verify that EmbeddingAuthenticationError and related exceptions are properly accessible
# Check if the exceptions are imported locally where used
rg -B5 "except EmbeddingAuthenticationError" python/src/server/api_routes/knowledge_api.py
# Check the actual imports at the top of the file
head -50 python/src/server/api_routes/knowledge_api.py | grep -E "^from|^import"
```

Length of output: 923
🏁 Script executed:
```bash
#!/bin/bash
set -euo pipefail
echo "Searching for 'EmbeddingAuthenticationError' across the repo..."
rg -n --hidden --glob '!**/node_modules/**' -C3 "EmbeddingAuthenticationError" || true
echo
echo "Searching for class definitions matching 'Embedding.*Error'..."
rg -n --hidden --glob '!**/node_modules/**' -C3 "class\s+Embedding.*Error" || true
echo
echo "Searching for 'AuthenticationError' class definitions (broader)..."
rg -n --hidden --glob '!**/node_modules/**' -C3 "class\s+\w*AuthenticationError" || true
echo
echo "Showing top 200 lines of the file for context (imports + surrounding code):"
sed -n '1,200p' python/src/server/api_routes/knowledge_api.py || true
echo
echo "Showing the function block around the 'except EmbeddingAuthenticationError' occurrence:"
rg -n --hidden --glob '!**/node_modules/**' -n -C5 "except\s+EmbeddingAuthenticationError" python/src/server/api_routes/knowledge_api.py || true
```

Length of output: 11095
🏁 Script executed:
```bash
#!/bin/bash
set -euo pipefail
echo "Searching for references to the exception names across the repo (ignore .gitignore with -uu)..."
rg -uu -n "EmbeddingAuthenticationError|EmbeddingQuotaExhaustedError|EmbeddingAPIError" || true
echo
echo "Listing files under services/embeddings (if present):"
ls -la python/src/server/services/embeddings || true
echo
echo "Showing provider_error_adapters.py (if present):"
if [ -f python/src/server/services/embeddings/provider_error_adapters.py ]; then
  sed -n '1,240p' python/src/server/services/embeddings/provider_error_adapters.py
else
  echo "provider_error_adapters.py not found"
fi
echo
echo "Showing __init__.py for services.embeddings (if present):"
if [ -f python/src/server/services/embeddings/__init__.py ]; then
  sed -n '1,240p' python/src/server/services/embeddings/__init__.py
else
  echo "__init__.py not found in services/embeddings"
fi
echo
echo "Searching for where those exception classes might be defined (class definitions):"
rg -uu -n "class\s+\w*(Embedding|Embedding).*Error" || true
echo
echo "Searching for any imports that expose these names (from ... import EmbeddingAuthenticationError ...):"
rg -uu -n "import\s+.*EmbeddingAuthenticationError|from\s+.*import\s+.*EmbeddingAuthenticationError" || true
```

Length of output: 19730
🏁 Script executed:
```bash
#!/bin/bash
set -euo pipefail
echo "Showing lines 1100-1250 of knowledge_api.py for context:"
sed -n '1100,1250p' python/src/server/api_routes/knowledge_api.py || true
echo
echo "Searching for explicit imports of embedding_exceptions in the file:"
rg -n "from\s+.*embedding_exceptions|import\s+.*EmbeddingAuthenticationError" python/src/server/api_routes/knowledge_api.py || true
echo
echo "Show occurrences of the exception names with context:"
rg -n -C3 "EmbeddingAuthenticationError|EmbeddingQuotaExhaustedError|EmbeddingAPIError|EmbeddingRateLimitError" python/src/server/api_routes/knowledge_api.py || true
```

Length of output: 10965
Import embedding exceptions used in _validate_provider_api_key
The _validate_provider_api_key function in python/src/server/api_routes/knowledge_api.py (except at line ~110) references EmbeddingAuthenticationError, EmbeddingQuotaExhaustedError, and EmbeddingAPIError but does not import them; add a module- or function-scoped import from ..services.embeddings.embedding_exceptions to prevent NameError during exception handling.
🤖 Prompt for AI Agents
In python/src/server/api_routes/knowledge_api.py around line 110, the except
block references EmbeddingAuthenticationError, EmbeddingQuotaExhaustedError, and
EmbeddingAPIError but they are not imported; add a module- or function-scoped
import from ..services.embeddings.embedding_exceptions for these three exception
classes (EmbeddingAuthenticationError, EmbeddingQuotaExhaustedError,
EmbeddingAPIError) so the exception handlers do not raise NameError.
```python
def parse_error(self, error: Exception) -> Exception:
    """Parse OpenAI-specific errors into standard embedding exceptions."""
    error_str = str(error)

    # Handle OpenAI authentication errors
    if ("401" in error_str and ("invalid" in error_str.lower() or "incorrect" in error_str.lower())):
        # Extract API key prefix if available
        api_key_prefix = None
        if "sk-" in error_str:
            import re
            key_match = re.search(r'sk-([a-zA-Z0-9]{3})', error_str)
            if key_match:
                api_key_prefix = f"sk-{key_match.group(1)}…"

        return EmbeddingAuthenticationError(
            "Invalid OpenAI API key",
            api_key_prefix=api_key_prefix
        )

    # Handle quota exhaustion
    elif ("quota" in error_str.lower() or "billing" in error_str.lower() or "credits" in error_str.lower()):
        # Try to extract token usage if available
        tokens_used = None
        token_match = re.search(r'(\d+)\s*tokens?', error_str, re.IGNORECASE)
        if token_match:
            tokens_used = int(token_match.group(1))

        return EmbeddingQuotaExhaustedError(
            "OpenAI quota exhausted",
            tokens_used=tokens_used
        )

    # Handle rate limiting
    elif ("rate" in error_str.lower() and "limit" in error_str.lower()):
        return EmbeddingRateLimitError("OpenAI rate limit exceeded")

    # Generic API error
    else:
        return EmbeddingAPIError(f"OpenAI API error: {error_str}", original_error=error)
```
🛠️ Refactor suggestion
Strengthen OpenAI error classification (use status/exception types; disambiguate 429 quota vs rate limit; 5xx handling).
Current string checks are brittle and may misclassify 429s and 5xx. Prefer introspecting status/exception names, fall back to message keywords, and include Retry-After when present. Also remove the redundant local import re.
def parse_error(self, error: Exception) -> Exception:
- """Parse OpenAI-specific errors into standard embedding exceptions."""
- error_str = str(error)
-
- # Handle OpenAI authentication errors
- if ("401" in error_str and ("invalid" in error_str.lower() or "incorrect" in error_str.lower())):
- # Extract API key prefix if available
- api_key_prefix = None
- if "sk-" in error_str:
- import re
- key_match = re.search(r'sk-([a-zA-Z0-9]{3})', error_str)
- if key_match:
- api_key_prefix = f"sk-{key_match.group(1)}…"
-
- return EmbeddingAuthenticationError(
- "Invalid OpenAI API key",
- api_key_prefix=api_key_prefix
- )
-
- # Handle quota exhaustion
- elif ("quota" in error_str.lower() or "billing" in error_str.lower() or "credits" in error_str.lower()):
- # Try to extract token usage if available
- tokens_used = None
- token_match = re.search(r'(\d+)\s*tokens?', error_str, re.IGNORECASE)
- if token_match:
- tokens_used = int(token_match.group(1))
-
- return EmbeddingQuotaExhaustedError(
- "OpenAI quota exhausted",
- tokens_used=tokens_used
- )
-
- # Handle rate limiting
- elif ("rate" in error_str.lower() and "limit" in error_str.lower()):
- return EmbeddingRateLimitError("OpenAI rate limit exceeded")
-
- # Generic API error
- else:
- return EmbeddingAPIError(f"OpenAI API error: {error_str}", original_error=error)
+ """Parse OpenAI-specific errors into standard embedding exceptions."""
+ error_str = str(error)
+ msg_l = error_str.lower()
+ error_name = type(error).__name__.lower()
+ status = getattr(error, "status", None) or getattr(error, "status_code", None)
+ response = getattr(error, "response", None)
+ if status is None and response is not None:
+ status = getattr(response, "status", None) or getattr(response, "status_code", None)
+ try:
+ status_i = int(status) if status is not None else None
+ except Exception:
+ status_i = None
+ headers = getattr(response, "headers", {}) if response is not None else {}
+ retry_after = headers.get("Retry-After")
+
+ # Authentication: status 401/403, OpenAI auth error classes, or clear invalid key text
+ if (status_i in (401, 403)) or ("auth" in error_name) or ("invalid" in msg_l and "key" in msg_l):
+ api_key_prefix = None
+ key_match = re.search(r'\bsk-([a-zA-Z0-9]{3})', error_str)
+ if key_match:
+ api_key_prefix = f"sk-{key_match.group(1)}…"
+ return EmbeddingAuthenticationError("Invalid OpenAI API key", api_key_prefix=api_key_prefix)
+
+ # 429: differentiate quota vs. plain rate limit
+ if status_i == 429:
+ if any(k in msg_l for k in ("quota", "billing", "credit", "insufficient_quota")):
+ tokens_used = None
+ token_match = re.search(r'(\d+)\s*tokens?', error_str, re.IGNORECASE)
+ if token_match:
+ tokens_used = int(token_match.group(1))
+ return EmbeddingQuotaExhaustedError("OpenAI quota exhausted", tokens_used=tokens_used)
+ hint = f"; retry after {retry_after}s" if retry_after else ""
+ return EmbeddingRateLimitError(f"OpenAI rate limit exceeded{hint}")
+
+ # Quota exhaustion by message even without explicit 429
+ if any(k in msg_l for k in ("quota", "billing", "credit", "insufficient_quota")):
+ tokens_used = None
+ token_match = re.search(r'(\d+)\s*tokens?', error_str, re.IGNORECASE)
+ if token_match:
+ tokens_used = int(token_match.group(1))
+ return EmbeddingQuotaExhaustedError("OpenAI quota exhausted", tokens_used=tokens_used)
+
+ # Server-side/transient errors
+ if (status_i is not None and status_i >= 500) or "bad gateway" in msg_l or "server error" in msg_l:
+ return EmbeddingAPIError(f"OpenAI API error: {error_str}", original_error=error)
+
+ # Fallback
+        return EmbeddingAPIError(f"OpenAI API error: {error_str}", original_error=error)
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
python/src/server/services/embeddings/provider_error_adapters.py lines 51-90:
The current implementation relies on brittle string matching and a local "import
re"; update parse_error to first inspect standard attributes (e.g., status_code,
http_status, code) and exception class names to classify errors, disambiguate
429 responses by checking if the status/code equals 429 and then decide whether
it's quota (body indicates quota/credits/billing) or rate limit (Retry-After
present or typical rate-limit headers), treat 5xx status codes as
EmbeddingAPIError/TransientServerError, extract Retry-After and token counts
from structured fields when available (fall back to regex only as last resort),
and remove the inline import of re in favor of module-level import; populate the
specific adapter exceptions with api_key_prefix, tokens_used, retry_after, and
original_error as appropriate.
```python
# Use string operations for API key detection (OpenAI format: sk-...)
if 'sk-' in sanitized:
    words = sanitized.split()
    for i, word in enumerate(words):
        if word.startswith('sk-') and len(word) == 51:
            words[i] = '[REDACTED_KEY]'
    sanitized = ' '.join(words)
```
🛠️ Refactor suggestion
Make OpenAI key redaction length-agnostic (avoid false negatives).
Current logic assumes exactly 51 chars and tokenized-by-space; this can leak newer key formats or when punctuation is attached.
- # Use string operations for API key detection (OpenAI format: sk-...)
- if 'sk-' in sanitized:
- words = sanitized.split()
- for i, word in enumerate(words):
- if word.startswith('sk-') and len(word) == 51:
- words[i] = '[REDACTED_KEY]'
- sanitized = ' '.join(words)
+ # Robust OpenAI key redaction: handle variable lengths and punctuation
+ sanitized = re.sub(r'\bsk-[A-Za-z0-9_-]{20,100}\b', '[REDACTED_KEY]', sanitized)
📝 Committable suggestion
```python
# Robust OpenAI key redaction: handle variable lengths and punctuation
sanitized = re.sub(r'\bsk-[A-Za-z0-9_-]{20,100}\b', '[REDACTED_KEY]', sanitized)
```
```python
# Should raise HTTPException with status 429 and detailed error info
with pytest.raises(HTTPException) as exc_info:
    await perform_rag_query(request)
```
💡 Verification agent
🧩 Analysis chain
Add missing import for embedding exceptions.
The test uses EmbeddingAuthenticationError, EmbeddingQuotaExhaustedError, and EmbeddingAPIError in the _validate_provider_api_key tests but they're not imported at the top level in the knowledge_api.py file for these specific validation tests.
🏁 Script executed:
```bash
#!/bin/bash
# Description: Check if EmbeddingAuthenticationError and related exceptions are properly imported in knowledge_api.py
# Check the imports in knowledge_api.py
rg -A5 -B5 "EmbeddingAuthenticationError" python/src/server/api_routes/knowledge_api.py
```

Length of output: 2009
Add top-level imports for embedding exceptions. The exceptions are only imported inside an except block in python/src/server/api_routes/knowledge_api.py; move or add the import at module top-level so EmbeddingAuthenticationError, EmbeddingQuotaExhaustedError, and EmbeddingAPIError are available to the tests that reference them.
🤖 Prompt for AI Agents
In python/tests/test_openai_error_handling.py around line 110, the tests expect
EmbeddingAuthenticationError, EmbeddingQuotaExhaustedError, and
EmbeddingAPIError to be importable but those exceptions are only imported inside
an except block in python/src/server/api_routes/knowledge_api.py; add top-level
imports for these exception classes in the module where they are defined (or
re-export them from a public module) and update
python/src/server/api_routes/knowledge_api.py to import them at module top-level
so the test file can import them directly. Ensure the exceptions are exported
from their defining module and update any relative import paths in tests to
reference that top-level export.
```python
# Should not raise any exception
await _validate_openai_api_key()
```
Update test to use correct function name.
The test calls _validate_openai_api_key() but should call _validate_provider_api_key().
Apply this diff:
```diff
- await _validate_openai_api_key()
+ await _validate_provider_api_key()
```

📝 Committable suggestion
```python
# Should not raise any exception
await _validate_provider_api_key()
```
🤖 Prompt for AI Agents
In python/tests/test_openai_error_handling.py around lines 209 to 210, the test
is calling the incorrect function name `_validate_openai_api_key()`; update the
call to `_validate_provider_api_key()` so the test invokes the intended
validation function and ensure any imports or mocks reference the same
`_validate_provider_api_key` symbol if necessary.
```python
with pytest.raises(HTTPException) as exc_info:
    await _validate_openai_api_key()
```
Update test to use correct function name.
Same issue with function name here.
Apply this diff:
```diff
- await _validate_openai_api_key()
+ await _validate_provider_api_key()
```

📝 Committable suggestion
```python
with pytest.raises(HTTPException) as exc_info:
    await _validate_provider_api_key()
```
🤖 Prompt for AI Agents
In python/tests/test_openai_error_handling.py around lines 223-224, the test
calls _validate_openai_api_key but the actual function implemented has a
different name; update the test to call _validate_provider_api_key() so it
matches the implementation and raises HTTPException as expected.
```python
with pytest.raises(HTTPException) as exc_info:
    await _validate_openai_api_key()
```
Update test to use correct function name.
Same issue with function name here.
Apply this diff:

```diff
- await _validate_openai_api_key()
+ await _validate_provider_api_key()
```

📝 Committable suggestion
```python
with pytest.raises(HTTPException) as exc_info:
    await _validate_provider_api_key()
```
🤖 Prompt for AI Agents
In python/tests/test_openai_error_handling.py around lines 240-241, the test is
calling the wrong function name _validate_openai_api_key(); update the test to
call await _validate_provider_api_key() so the awaited call matches the real
function name and signature, keeping the pytest.raises(HTTPException) context
unchanged.
```python
def test_sanitize_openai_error_removes_urls(self):
    """Test that sanitization function removes URLs from error messages."""
    error_message = "Connection failed to https://api.openai.com/v1/embeddings with status 400"
    sanitized = _sanitize_openai_error(error_message)

    assert "https://api.openai.com" not in sanitized
    assert "[REDACTED_URL]" in sanitized
    assert "Connection failed" in sanitized
```
Update all sanitization tests to use correct function name.
All sanitization tests need to call _sanitize_provider_error instead of _sanitize_openai_error.
Apply this diff pattern to all sanitization test methods (lines 246-348):

```diff
- sanitized = _sanitize_openai_error(error_message)
+ sanitized = _sanitize_provider_error(error_message)
```

This change needs to be applied to the following test methods:
- test_sanitize_openai_error_removes_urls
- test_sanitize_openai_error_removes_api_keys
- test_sanitize_openai_error_removes_auth_info
- test_sanitize_openai_error_returns_generic_for_sensitive_words
- test_sanitize_openai_error_preserves_safe_messages
- test_sanitize_openai_error_removes_organization_ids
- test_sanitize_openai_error_removes_project_ids
- test_sanitize_openai_error_removes_request_ids
- test_sanitize_openai_error_handles_multiple_patterns
- test_sanitize_openai_error_input_validation
📝 Committable suggestion
```python
def test_sanitize_openai_error_removes_urls(self):
    """Test that sanitization function removes URLs from error messages."""
    error_message = "Connection failed to https://api.openai.com/v1/embeddings with status 400"
    sanitized = _sanitize_provider_error(error_message)
    assert "https://api.openai.com" not in sanitized
    assert "[REDACTED_URL]" in sanitized
    assert "Connection failed" in sanitized
```
🤖 Prompt for AI Agents
In python/tests/test_openai_error_handling.py around lines 246 to 253 (and
extending through line 348 for the related tests), the sanitization tests call
the old function name _sanitize_openai_error; update each listed test method to
call _sanitize_provider_error instead. Replace every invocation of
_sanitize_openai_error with _sanitize_provider_error in the following tests:
test_sanitize_openai_error_removes_urls,
test_sanitize_openai_error_removes_api_keys,
test_sanitize_openai_error_removes_auth_info,
test_sanitize_openai_error_returns_generic_for_sensitive_words,
test_sanitize_openai_error_preserves_safe_messages,
test_sanitize_openai_error_removes_organization_ids,
test_sanitize_openai_error_removes_project_ids,
test_sanitize_openai_error_removes_request_ids,
test_sanitize_openai_error_handles_multiple_patterns, and
test_sanitize_openai_error_input_validation so they all call the correct
sanitization function.
Fix failing backend tests by updating test imports to use the new provider-agnostic function names: - _sanitize_openai_error → _sanitize_provider_error - _validate_openai_api_key → _validate_provider_api_key All test functionality preserved while supporting the new multi-provider error handling architecture. Tests now properly validate provider-agnostic error sanitization and API key validation.
…e-643 fix: replace hardcoded test commands with bun run test in CI
Summary
Implements comprehensive OpenAI error handling to resolve Issue #362 where users experienced silent failures and 90-minute debugging sessions when OpenAI API quota was exhausted.
Problem Solved
Key Features
Implementation Highlights
Backend Enhancements
- `_sanitize_openai_error()` with 8+ sanitization patterns and ReDoS protection
- `_validate_openai_api_key()` called before crawl/upload/refresh operations

Frontend Enhancements
Error Types Handled
Security Features
- `sk-1234...` → `[REDACTED_KEY]`
- `https://api.openai.com` → `[REDACTED_URL]`
- `org-1234...` → `[REDACTED_ORG]`

Code Review Status
✅ All critical and important issues addressed:
Test Plan
Manual Testing ✅
Automated Testing
Error Flow Validation
Architecture Impact
Closes
Fixes #362
Testing Instructions
🤖 Generated with Claude Code
Summary by CodeRabbit
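As a closing illustration of the masked key prefix mentioned under Security Features, the `EmbeddingAuthenticationError` described in this PR could be sketched like this; the constructor signature, field names, and prefix length are assumptions, not the backend's actual definition.

```python
class EmbeddingAuthenticationError(Exception):
    """Raised when the embedding provider rejects the configured API key.

    Sketch of the exception described in this PR; field names are assumed.
    """

    def __init__(self, message, api_key=""):
        # Keep only a short prefix so the full key can never reach logs or the UI
        self.key_prefix = (api_key[:7] + "...") if api_key else None
        suffix = f" (key: {self.key_prefix})" if self.key_prefix else ""
        super().__init__(message + suffix)
```

Surfacing just the prefix lets a user confirm which key was rejected without the error message itself becoming a leak vector.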