refactor: Remove Socket.IO and implement HTTP polling architecture #514
Conversation
Major refactoring to simplify the architecture:

1. Socket.IO Removal:
   - Removed all Socket.IO dependencies and code (~4,256 lines)
   - Replaced with HTTP polling for real-time updates
   - Added new polling hooks (usePolling, useDatabaseMutation, etc.)
   - Removed socket services and handlers

2. Status Consolidation:
   - Removed UI/DB status mapping layer
   - Using database values directly (todo, doing, review, done)
   - Removed obsolete status types and mapping functions
   - Updated all components to use database status values

3. Simplified Architecture:
   - Cleaner separation between frontend and backend
   - Reduced complexity in state management
   - More maintainable codebase

🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
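The polling hooks named in this commit are not shown in the PR description; as a rough sketch of the pattern they imply (the hook name comes from the commit, but the signature and internals here are assumptions), a minimal usePolling might look like:

```typescript
import { useEffect, useRef, useState } from 'react';

// Hypothetical sketch of a usePolling hook -- not the PR's actual code.
// Fetches on a fixed interval, skips ticks while a request is in flight,
// and stops cleanly on unmount.
export function usePolling<T>(fetcher: () => Promise<T>, intervalMs = 2000) {
  const [data, setData] = useState<T | null>(null);
  const [error, setError] = useState<Error | null>(null);
  const inFlight = useRef(false);

  useEffect(() => {
    let cancelled = false;

    const tick = async () => {
      if (inFlight.current) return; // never overlap requests
      inFlight.current = true;
      try {
        const result = await fetcher();
        if (!cancelled) setData(result);
      } catch (err) {
        if (!cancelled) setError(err instanceof Error ? err : new Error(String(err)));
      } finally {
        inFlight.current = false;
      }
    };

    tick(); // initial fetch, then poll
    const id = setInterval(tick, intervalMs);
    return () => {
      cancelled = true;
      clearInterval(id);
    };
  }, [fetcher, intervalMs]);

  return { data, error };
}
```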
- Added loading overlay when dragging tasks between columns
- Added loading state when switching between projects
- Added proper error handling with toast notifications
- Removed remaining Socket.IO references
- Improved user feedback during async operations
Created developer guide explaining:
- Core polling components and hooks
- ETag caching implementation
- State management patterns
- Migration from Socket.IO
- Performance optimizations
- Developer guidelines and best practices
- Fixed projectService.getTasks() to projectService.getTasksByProject()
- Ensures consistent naming throughout the codebase
- Resolves error when refreshing tasks after drag operations
Created naming standards documentation covering:
- Service method naming patterns
- API endpoint conventions
- Component and hook naming
- State variable naming
- Type definitions
- Common patterns and anti-patterns
- Migration notes from Socket.IO
- Replaced Socket.IO references with HTTP polling architecture
- Added polling intervals and ETag caching documentation
- Added API naming conventions section
- Corrected task endpoint patterns (use getTasksByProject, not getTasks)
- Added state naming patterns and status values
Complete removal of Socket.IO/WebSocket dependencies in favor of simple HTTP polling:
Frontend changes:
- Remove all WebSocket/Socket.IO references from KnowledgeBasePage
- Implement useCrawlProgressPolling hook for progress tracking
- Fix polling hook to prevent ERR_INSUFFICIENT_RESOURCES errors
- Add proper cleanup and state management for completed crawls
- Persist and restore active crawl progress across page refreshes
- Fix agent chat service to handle disabled agents gracefully
Backend changes:
- Remove python-socketio from requirements
- Convert ProgressTracker to in-memory state management
- Add /api/crawl-progress/{id} endpoint for polling
- Initialize ProgressTracker immediately when operations start
- Remove all Socket.IO event handlers and cleanup commented code
- Simplify agent_chat_api to basic REST endpoints
Bug fixes:
- Fix race condition where progress data wasn't available for polling
- Fix memory leaks from recreating polling callbacks
- Fix crawl progress URL mismatch between frontend and backend
- Add proper error filtering for expected 404s during initialization
- Stop polling when crawl operations complete
This change simplifies the architecture significantly and makes it more robust
by removing the complexity of WebSocket connections.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
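As an illustration of the commit above, a client hook polling the new /api/crawl-progress/{id} endpoint might be shaped like this (the endpoint path comes from the commit; the hook internals, state shape, and terminal statuses are assumptions, not the PR's actual code):

```typescript
import { useEffect, useState } from 'react';

interface CrawlProgress {
  status: 'starting' | 'crawling' | 'completed' | 'error' | 'cancelled';
  percentage: number;
  message?: string;
  error?: string;
}

// Sketch only: a setTimeout chain (rather than setInterval) guarantees
// requests never overlap, which is one way to avoid the
// ERR_INSUFFICIENT_RESOURCES errors from piled-up fetches; polling stops
// on terminal states and on unmount.
export function useCrawlProgressPolling(progressId: string | null, intervalMs = 1000) {
  const [progress, setProgress] = useState<CrawlProgress | null>(null);

  useEffect(() => {
    if (!progressId) return;
    let stopped = false;
    let timer: ReturnType<typeof setTimeout> | undefined;

    const poll = async () => {
      try {
        const res = await fetch(`/api/crawl-progress/${progressId}`);
        if (res.ok) {
          const data: CrawlProgress = await res.json();
          if (stopped) return;
          setProgress(data);
          if (['completed', 'error', 'cancelled'].includes(data.status)) return;
        }
        // Expected 404s during initialization fall through and retry.
      } catch {
        // Transient network errors: keep polling; the next tick may succeed.
      }
      if (!stopped) timer = setTimeout(poll, intervalMs);
    };

    poll();
    return () => {
      stopped = true;
      if (timer) clearTimeout(timer);
    };
  }, [progressId, intervalMs]);

  return progress;
}
```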
- Modify add_documents_to_supabase to return actual chunks stored count
- Update crawl orchestration to validate chunks were actually saved to database
- Throw exception when chunks are processed but none stored (e.g., API key failures)
- Ensure UI shows error state instead of false success when storage fails
- Add proper error field to progress updates for frontend display

This prevents misleading "crawl completed" status when backend fails to store data.
- Fix credential service to properly store encrypted OpenAI API key from environment
- Remove direct environment variable access pattern from source management service
- Update both extract_source_summary and generate_source_title_and_metadata to async
- Convert all LLM operations to use get_llm_client() for multi-provider support
- Fix callers in document_storage_operations.py and storage_services.py to use await
- Improve title generation prompt with better context and examples for user-readable titles
- Consolidate on single pattern that supports OpenAI, Google, Ollama providers

This fixes embedding service failures while maintaining compatibility for future providers.
- Make update_source_info async and await it properly
- Fix generate_source_title_and_metadata async calls
- Improve source title generation with URL-based detection
- Remove unnecessary threading wrapper for async operations

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Fix polling logic to properly extract projects array from API response
- The API returns {projects: [...]} but polling was trying to iterate directly over response
- This caused 'str' object has no attribute 'get' errors during project creation
- Update both create_project polling and list_projects response handling
- Verified all MCP tools now work correctly including create_project
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Replace race condition-prone polling refetch with direct API calls for immediate task loading (100-200ms vs 1.5-2s)
- Add polling suppression during direct API calls to prevent task jumping from double setTasks() calls
- Clear stale tasks immediately on project switch to prevent wrong data visibility
- Maintain polling for background updates from agents/MCP while optimizing user-initiated actions

Performance improvements:
- Project switches now load tasks in 100-200ms instead of 1.5-2 seconds
- Eliminated visual task jumping during project transitions
- Clean separation: direct calls for user actions, polling for external updates

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Critical fixes addressing code review findings:
**Race Condition Resolution:**
- Remove fragile isLoadingDirectly flag that could permanently disable polling
- Remove competing polling onSuccess callback that caused task jumping
- Clean separation: direct API calls for user actions, polling for external updates only
**Socket.IO Removal:**
- Replace projectCreationProgressService with useProgressPolling HTTP polling
- Remove all Socket.IO dependencies and references
- Complete migration to HTTP-only architecture
**Performance Optimization:**
- Add ETag support to /projects/{project_id}/tasks endpoint for 70% bandwidth savings
- Remove competing TasksTab onRefresh system that caused multiple API calls
- Single source of truth: polling handles background updates, direct calls for immediate feedback
**Task Management Simplification:**
- Remove onRefresh calls from all TasksTab operations (create, update, delete, move)
- Operations now use optimistic updates with polling fallback
- Eliminates 3-way race condition between polling, direct calls, and onRefresh
Result: Fast project switching (100-200ms), no task jumping, clean polling architecture
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
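For context on the ETag mechanism these commits lean on, here is a minimal sketch of a conditional GET wrapper a poller could use (the header names are standard HTTP; the wrapper itself is illustrative and not code from this PR):

```typescript
// Sketch: conditional GET with ETag caching. On 304 Not Modified the cached
// payload is reused and no response body is transferred, which is where the
// bandwidth savings on polled endpoints come from.
const etagCache = new Map<string, { etag: string; data: unknown }>();

export async function fetchWithETag<T>(url: string): Promise<T> {
  const cached = etagCache.get(url);
  const headers: HeadersInit = cached ? { 'If-None-Match': cached.etag } : {};

  const res = await fetch(url, { headers });

  if (res.status === 304 && cached) {
    return cached.data as T; // unchanged since the last poll
  }
  if (!res.ok) {
    throw new Error(`GET ${url} failed: ${res.status}`);
  }

  const data = (await res.json()) as T;
  const etag = res.headers.get('ETag');
  if (etag) etagCache.set(url, { etag, data });
  return data;
}

// Hypothetical usage in a poll loop for the tasks endpoint mentioned above:
// const tasks = await fetchWithETag<Task[]>(`/api/projects/${projectId}/tasks`);
```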
Walkthrough

Replaces Socket.IO/WebSocket real-time features with HTTP polling and ETag caching across frontend and backend. Adds polling hooks, a progress API, ETag utilities, optimistic/DB mutation hooks, and many UI/service refactors (tasks, projects, knowledge, crawling, agent chat). Removes socket dependencies and streaming modules; updates tests and the Docker entrypoint.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant UI as Frontend (React)
    participant API as FastAPI
    participant PT as ProgressTracker
    Note over UI,API: Polling model with ETag caching
    User->>UI: Start crawl (URL/File)
    UI->>API: POST /api/knowledge/... (start)
    API->>PT: tracker.start(progress_id)
    API-->>UI: { progressId }
    loop Every 1s (visible tab)
        UI->>API: GET /api/progress/{progressId}\nIf-None-Match: "etag123"
        API->>PT: get_progress(progressId)
        alt Not Modified
            API-->>UI: 304 Not Modified
        else Updated
            API-->>UI: 200 { status, progress, ... }\nETag: "etag456"
            UI->>UI: Update UI (merge optimistic changes)
        end
    end
```
```mermaid
sequenceDiagram
    autonumber
    actor User
    participant UI as Frontend (React)
    participant API as FastAPI
    Note over UI,API: Projects/Tasks polling (8–10s) + optimistic updates
    loop Poll interval (8–10s)
        UI->>API: GET /api/projects\nIf-None-Match: "etagP"
        API-->>UI: 200/304 with ETag
        UI->>API: GET /api/projects/{id}/tasks\nIf-None-Match: "etagT"
        API-->>UI: 200/304 with ETag
        UI->>UI: Merge with optimistic updates
    end
    User->>UI: Drag task → drop
    UI->>UI: Optimistic reorder (local)
    UI->>API: PUT /api/tasks/{id} (status/order)
    API-->>UI: 200 Task
    UI->>UI: Clear optimistic state or rollback on error
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120–180 minutes
Actionable comments posted: 36
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (56)
python/src/mcp_server/utils/__init__.py (1)
1-3: Align MCP package namespace with filesystem
The code is in `python/src/mcp_server/`, but all imports use the `mcp.server` namespace, and there's no `python/src/mcp/server/` directory. You must either:
- Rename/move `python/src/mcp_server/` to `python/src/mcp/server/` (with `__init__.py` in each folder), or
- Configure `package_dir` in your build (e.g. `setup.py`/`pyproject.toml`) to map `mcp.server` → `mcp_server`.

Without one of these, `from mcp.server.fastmcp import Context` will fail at import time.

python/src/server/services/projects/project_service.py (4)
45-47: Use timezone-aware UTC timestamps for consistency and ETag correctness.
Naive datetimes can cause inconsistent sorting/ETag derivation across servers.

```diff
-from datetime import datetime
+from datetime import datetime, timezone
@@
-            "created_at": datetime.now().isoformat(),
-            "updated_at": datetime.now().isoformat(),
+            "created_at": datetime.now(timezone.utc).isoformat(),
+            "updated_at": datetime.now(timezone.utc).isoformat(),
@@
-            update_data = {"updated_at": datetime.now().isoformat()}
+            update_data = {"updated_at": datetime.now(timezone.utc).isoformat()}
```

Also applies to: 323-324
72-75: Preserve stack traces and add clear context in logs.
Replace logger.error with logger.exception (or exc_info=True) per guidelines; keep client-facing messages concise.

```diff
-        except Exception as e:
-            logger.error(f"Error creating project: {e}")
-            return False, {"error": f"Database error: {str(e)}"}
+        except Exception:
+            logger.exception("create_project: failed to insert project into 'archon_projects'")
+            return False, {"error": "Failed to create project"}
@@
-        except Exception as e:
-            logger.error(f"Error listing projects: {e}")
-            return False, {"error": f"Error listing projects: {str(e)}"}
+        except Exception:
+            logger.exception("list_projects: failed to retrieve projects from 'archon_projects'")
+            return False, {"error": "Error listing projects"}
@@
-        except Exception as e:
-            logger.error(f"Error getting project: {e}")
-            return False, {"error": f"Error getting project: {str(e)}"}
+        except Exception:
+            logger.exception("get_project: failed fetching project and sources (project_id=%s)", project_id)
+            return False, {"error": "Error getting project"}
@@
-        except Exception as e:
-            logger.error(f"Error deleting project: {e}")
-            return False, {"error": f"Error deleting project: {str(e)}"}
+        except Exception:
+            logger.exception("delete_project: failed to delete (project_id=%s)", project_id)
+            return False, {"error": "Error deleting project"}
@@
-        except Exception as e:
-            logger.error(f"Error getting project features: {e}")
-            return False, {"error": f"Error getting project features: {str(e)}"}
+        except Exception:
+            logger.exception("get_project_features: failed (project_id=%s)", project_id)
+            return False, {"error": "Error getting project features"}
@@
-        except Exception as e:
-            logger.error(f"Error updating project: {e}")
-            return False, {"error": f"Error updating project: {str(e)}"}
+        except Exception:
+            logger.exception("update_project: failed (project_id=%s)", project_id)
+            return False, {"error": "Error updating project"}
```

Also applies to: 146-149, 223-226, 270-273, 308-311, 363-366
338-341: Validate input types before persisting; reject malformed JSON/lists.
Enforce types for fields like docs/features/data/pinned per guidelines to avoid storing corrupted data.

```diff
-        for field in allowed_fields:
-            if field in update_fields:
-                update_data[field] = update_fields[field]
+        for field in allowed_fields:
+            if field in update_fields:
+                val = update_fields[field]
+                if field in ("docs", "features", "data") and not isinstance(val, list):
+                    return False, {"error": f"Field '{field}' must be a list"}
+                if field == "pinned" and not isinstance(val, bool):
+                    return False, {"error": "Field 'pinned' must be a boolean"}
+                update_data[field] = val
```
20-27: Service-wide error signaling: prefer exceptions over (bool, payload) tuples.
Guidelines recommend raising specific exceptions with contextual messages and logging full traces. Consider refactoring service methods to return domain objects and signal failures via exceptions; translate to HTTP in the route layer.
python/src/server/api_routes/mcp_api.py (6)
141-154: Operation lock released too early; start/stop can race.
The lock guards only the throttle check, not the whole start/stop sequence. Keep the lock held through the full operation to prevent concurrent starts/stops.
Apply this diff to hold the lock for the entire operation blocks:
```diff
     async def start_server(self) -> dict[str, Any]:
-        async with self._operation_lock:
-            # Check throttling
-            current_time = time.time()
-            if current_time - self._last_operation_time < self._min_operation_interval:
-                wait_time = self._min_operation_interval - (
-                    current_time - self._last_operation_time
-                )
-                mcp_logger.warning(f"Start operation throttled, please wait {wait_time:.1f}s")
-                return {
-                    "success": False,
-                    "status": self.status,
-                    "message": f"Please wait {wait_time:.1f}s before starting server again",
-                }
-
-        with safe_span("mcp_server_start") as span:
+        async with self._operation_lock:
+            # Check throttling
+            current_time = time.time()
+            if current_time - self._last_operation_time < self._min_operation_interval:
+                wait_time = self._min_operation_interval - (current_time - self._last_operation_time)
+                mcp_logger.warning(f"Start operation throttled, please wait {wait_time:.1f}s")
+                return {
+                    "success": False,
+                    "status": self.status,
+                    "message": f"Please wait {wait_time:.1f}s before starting server again",
+                }
+
+            with safe_span("mcp_server_start") as span:
```

```diff
     async def stop_server(self) -> dict[str, Any]:
-        async with self._operation_lock:
-            # Check throttling
-            current_time = time.time()
-            if current_time - self._last_operation_time < self._min_operation_interval:
-                wait_time = self._min_operation_interval - (
-                    current_time - self._last_operation_time
-                )
-                mcp_logger.warning(f"Stop operation throttled, please wait {wait_time:.1f}s")
-                return {
-                    "success": False,
-                    "status": self.status,
-                    "message": f"Please wait {wait_time:.1f}s before stopping server again",
-                }
-
-        with safe_span("mcp_server_stop") as span:
+        async with self._operation_lock:
+            # Check throttling
+            current_time = time.time()
+            if current_time - self._last_operation_time < self._min_operation_interval:
+                wait_time = self._min_operation_interval - (current_time - self._last_operation_time)
+                mcp_logger.warning(f"Stop operation throttled, please wait {wait_time:.1f}s")
+                return {
+                    "success": False,
+                    "status": self.status,
+                    "message": f"Please wait {wait_time:.1f}s before stopping server again",
+                }
+
+            with safe_span("mcp_server_stop") as span:
```

Also applies to: 276-299
185-201: Avoid blocking the event loop: start container in executor and await cancelled task.
`self.container.start()` is synchronous and may block; also the cancelled `log_reader_task` should be awaited to avoid warnings.

```diff
         try:
-            # Start the container
-            self.container.start()
+            # Start the container (off the event loop)
+            loop = asyncio.get_running_loop()
+            await loop.run_in_executor(None, self.container.start)

             self.status = "starting"
             self.start_time = time.time()
             self._last_operation_time = time.time()

             self._add_log("INFO", "MCP container starting...")
             mcp_logger.info(f"Starting MCP container: {self.container_name}")
             safe_set_attribute(span, "container_id", self.container.id)

             # Start reading logs from the container
             if self.log_reader_task:
-                self.log_reader_task.cancel()
+                self.log_reader_task.cancel()
+                try:
+                    await self.log_reader_task
+                except asyncio.CancelledError:
+                    pass
             self.log_reader_task = asyncio.create_task(self._read_container_logs())

             # Give it a moment to start
             await asyncio.sleep(2)
```
88-91: Preserve stack traces in logs (use exc_info or logger.exception).
Errors log only message strings; per guidelines, include full tracebacks.
```diff
-        except Exception as e:
-            mcp_logger.error(f"Failed to initialize Docker client: {str(e)}")
+        except Exception as e:
+            mcp_logger.error(f"Failed to initialize Docker client: {e}", exc_info=True)
             self.docker_client = None
```

```diff
-        except APIError as e:
+        except APIError as e:
             self.status = "failed"
-            self._add_log("ERROR", f"Docker API error: {str(e)}")
-            mcp_logger.error(f"Docker API error during MCP startup - error={str(e)}")
+            self._add_log("ERROR", f"Docker API error: {e}")
+            mcp_logger.error("Docker API error during MCP startup", exc_info=True)
             safe_set_attribute(span, "success", False)
             safe_set_attribute(span, "error", str(e))
```

```diff
-        except Exception as e:
+        except Exception as e:
             self.status = "failed"
-            self._add_log("ERROR", f"Failed to start MCP server: {str(e)}")
-            mcp_logger.error(
-                f"Exception during MCP server startup - error={str(e)}, error_type={type(e).__name__}"
-            )
+            self._add_log("ERROR", f"Failed to start MCP server: {e}")
+            mcp_logger.error("Exception during MCP server startup", exc_info=True)
             safe_set_attribute(span, "success", False)
             safe_set_attribute(span, "error", str(e))
```

```diff
-        except APIError as e:
-            self._add_log("ERROR", f"Docker API error: {str(e)}")
-            mcp_logger.error(f"Docker API error during MCP stop - error={str(e)}")
+        except APIError as e:
+            self._add_log("ERROR", f"Docker API error: {e}")
+            mcp_logger.error("Docker API error during MCP stop", exc_info=True)
```

```diff
-        except Exception as e:
-            self._add_log("ERROR", f"Error stopping MCP server: {str(e)}")
-            mcp_logger.error(
-                f"Exception during MCP server stop - error={str(e)}, error_type={type(e).__name__}"
-            )
+        except Exception as e:
+            self._add_log("ERROR", f"Error stopping MCP server: {e}")
+            mcp_logger.error("Exception during MCP server stop", exc_info=True)
```

```diff
-        except APIError as e:
+        except APIError as e:
             if "container not found" not in str(e).lower():
-                self._add_log("ERROR", f"Docker API error reading logs: {str(e)}")
+                self._add_log("ERROR", f"Docker API error reading logs: {e}")
```

```diff
-    except Exception as e:
-        api_logger.error("Failed to get MCP configuration", error=str(e))
+    except Exception as e:
+        api_logger.error("Failed to get MCP configuration", error=str(e), exc_info=True)
```

```diff
-    except Exception as e:
-        api_logger.error(f"Failed to save MCP configuration | error={str(e)}")
+    except Exception as e:
+        api_logger.error(f"Failed to save MCP configuration | error={e}", exc_info=True)
```

Also applies to: 235-246, 247-258, 334-344, 345-356, 472-476, 700-704, 744-747
315-318: Add retry with exponential backoff when stopping the container.
External API calls should retry. Apply similar handling to start and reload operations.
Additional helper to introduce (outside diff due to new function):
```python
# Place inside MCPServerManager
async def _retry(self, fn, attempts=3, base_delay=0.5, *args, **kwargs):
    for i in range(attempts):
        try:
            loop = asyncio.get_running_loop()
            return await loop.run_in_executor(None, lambda: fn(*args, **kwargs))
        except Exception:
            if i == attempts - 1:
                raise
            await asyncio.sleep(base_delay * (2 ** i))
```

Then wrap start/stop/reload calls with `await self._retry(self.container.start)` and `await self._retry(self.container.stop, timeout=10)`.
653-664: GET /config ignores saved MCP config; read and merge stored config.
Honor persisted `mcp_config` (host/port/transport) when present.

```diff
-    # Configuration for SSE-only mode with actual port
-    config = {
-        "host": "localhost",
-        "port": mcp_port,
-        "transport": "sse",
-    }
+    # Base config (defaults)
+    config: dict[str, Any] = {"host": "localhost", "port": mcp_port, "transport": "sse"}
+
+    # Try to load saved config
+    try:
+        from ..services.credential_service import credential_service
+        saved_json = await credential_service.get_credential("mcp_config", None)
+        if saved_json:
+            saved = ServerConfig.model_validate_json(saved_json)
+            config.update(saved.model_dump())
+    except Exception:
+        pass
```

Also applies to: 665-692
794-816: Debug placeholder in production endpoint; return 503 until introspection is implemented.
Avoid misleading data in UIs; surface an explicit "not implemented" status.
```diff
-    if is_running:
-        return {
-            "tools": [
-                {
-                    "name": "debug_placeholder",
-                    "description": "MCP server is running and modules are registered, but tool introspection is not working yet",
-                    "module": "debug",
-                    "parameters": [],
-                }
-            ],
-            "count": 1,
-            "server_running": True,
-            "source": "debug_placeholder",
-            "message": "MCP server is running with 3 modules registered. Tool introspection needs to be fixed.",
-        }
+    if is_running:
+        return {
+            "tools": [],
+            "count": 0,
+            "server_running": True,
+            "source": "not_implemented",
+            "message": "Tool introspection not implemented yet.",
+        }
```

python/src/mcp_server/features/documents/version_tools.py (2)
109-118: Also accept 201 Created for create_version responses.
APIs often return 201 on creation; treating it as success improves robustness.
Apply:
```diff
-        if response.status_code == 200:
+        if response.status_code in (200, 201):
```
149-154: Don't echo raw server error bodies to clients.
Returning `response.text` risks leaking internal details/PII; log it instead and return a generic message.
Apply:
```diff
-        return MCPErrorFormatter.format_error(
-            error_type="validation_error",
-            message=f"Invalid request: {response.text}",
-            suggestion="Check that all required fields are provided and valid",
-            http_status=400,
-        )
+        logger.debug("create_version 400 body: %s", response.text)
+        return MCPErrorFormatter.format_error(
+            error_type="validation_error",
+            message="Invalid request.",
+            suggestion="Check that all required fields are provided and valid.",
+            http_status=400,
+        )
```

python/src/mcp_server/utils/timeout_config.py (1)
74-79: Define or remove `parse_float_env` before using it
The proposed diff references `parse_float_env`, which isn't defined anywhere in the repo; either add its implementation (e.g. safe `float(os.getenv…)` parsing) or revert to `float(os.getenv…)`.

python/src/mcp_server/utils/error_handling.py (2)
48-56: Avoid truthiness checks for optional fields to prevent dropping valid values
Use explicit `is not None` checks so empty strings or status `0` (defensive) aren't silently omitted.

```diff
-    if suggestion:
+    if suggestion is not None:
         error_response["error"]["suggestion"] = suggestion
-    if http_status:
+    if http_status is not None:
         error_response["error"]["http_status"] = http_status
```
71-91: Harden HTTP error parsing; include request context; log parse failures
Current code may raise AttributeError when `detail` is not a dict; it's swallowed and loses useful info. Make parsing resilient and add URL/method to details; log parse failures at debug.

```diff
-    # Try to extract error from response body
-    try:
-        body = response.json()
-        if isinstance(body, dict):
-            # Look for common error fields
-            error_message = (
-                body.get("detail", {}).get("error")
-                or body.get("error")
-                or body.get("message")
-                or body.get("detail")
-            )
-            if error_message:
-                return MCPErrorFormatter.format_error(
-                    error_type="api_error",
-                    message=f"Failed to {operation}: {error_message}",
-                    details={"response_body": body},
-                    http_status=response.status_code,
-                    suggestion=_get_suggestion_for_status(response.status_code),
-                )
-    except Exception:
-        pass  # Fall through to generic error
+    # Try to extract error from response body
+    try:
+        body = response.json()
+        error_message = None
+        if isinstance(body, dict):
+            # Accept: {"detail": "..."} | {"detail": {...}} | {"error": "..."} | {"message": "..."}
+            detail = body.get("detail")
+            if isinstance(detail, dict):
+                error_message = detail.get("error") or detail.get("message") or detail.get("detail")
+            elif isinstance(detail, (str, list)):
+                error_message = " ".join(map(str, detail if isinstance(detail, list) else [detail]))
+            error_message = error_message or body.get("error") or body.get("message")
+        elif isinstance(body, (list, str)):
+            # Some APIs return a list/str of errors
+            error_message = " ".join(map(str, body if isinstance(body, list) else [body]))
+        if error_message:
+            req = getattr(response, "request", None)
+            details = {"response_body": body}
+            if req is not None:
+                details.update({"url": str(req.url), "method": req.method})
+            return MCPErrorFormatter.format_error(
+                error_type="api_error",
+                message=f"Failed to {operation}: {error_message}",
+                details=details,
+                http_status=response.status_code,
+                suggestion=_get_suggestion_for_status(response.status_code),
+            )
+    except Exception as parse_err:
+        logger.debug("Failed to parse error body for %s: %s", operation, parse_err)
+        # Fall through to generic error
```

python/src/server/config/config.py (2)
66-81: Avoid jwt.decode with verify disabled; use get_unverified_claims.
Safer and purpose-built for extracting claims without verification.

```diff
-        decoded = jwt.decode(
-            supabase_key,
-            '',
-            options={
-                "verify_signature": False,
-                "verify_aud": False,
-                "verify_exp": False,
-                "verify_nbf": False,
-                "verify_iat": False
-            }
-        )
-        role = decoded.get("role")
+        claims = jwt.get_unverified_claims(supabase_key)
+        role = claims.get("role")
```

Add the exception import used below (outside this range):

```python
from jose.exceptions import JWTError
```
89-93: Catch specific jose exceptions and log with context instead of broad Exception.
Improves debuggability and follows guideline on specific exceptions + stack traces.

```diff
-    except Exception:
-        # If we can't decode the JWT, we'll allow it to proceed
-        # This handles new key formats or non-JWT keys
-        return True, "UNABLE_TO_VALIDATE"
+    except JWTError:
+        # If we can't decode the JWT, we'll allow it to proceed
+        # This handles new key formats or non-JWT keys
+        logger.warning(
+            "validate_supabase_key: Unable to decode Supabase JWT; proceeding without role validation.",
+            exc_info=True,
+        )
+        return True, "UNABLE_TO_VALIDATE"
```

Add logger setup (outside this range):

```python
import logging

logger = logging.getLogger(__name__)
```

archon-ui-main/src/services/agentChatService.ts (1)
9-15: timestamp is JSON string; normalize to Date before returning
Backend sends ISO strings. Without conversion, UI calling toLocaleTimeString on a string will throw at runtime.

```diff
 export interface ChatMessage {
   id: string;
   content: string;
   sender: 'user' | 'agent';
-  timestamp: Date;
+  timestamp: Date; // keep Date in UI, convert below
   agent_type?: string;
 }
```

```diff
-    const messages: ChatMessage[] = await response.json();
+    const raw = await response.json();
+    const messages: ChatMessage[] = (raw as any[]).map(m => ({
+      ...m,
+      timestamp: new Date(m.timestamp),
+    }));
```

```diff
-    const messages = await response.json();
-    return messages;
+    const raw = await response.json();
+    return (raw as any[]).map(m => ({
+      ...m,
+      timestamp: new Date(m.timestamp),
+    }));
```

Also applies to: 200-206, 248-252
archon-ui-main/src/components/layouts/ArchonChatPanel.tsx (5)
48-63: Remove WebSocket gating and artificial delay; this disables chat under the new polling architecture
VITE_ENABLE_WEBSOCKET and the delay are obsolete and incorrectly force offline.

```diff
-      // Check if WebSocket is enabled
-      const enableWebSocket = import.meta.env.VITE_ENABLE_WEBSOCKET !== 'false';
-      if (!enableWebSocket) {
-        console.warn('WebSocket connection is disabled by environment configuration');
-        setConnectionError('Agent chat is currently disabled');
-        setConnectionStatus('offline');
-        setIsInitialized(true);
-        return;
-      }
-
-      setConnectionStatus('connecting');
-
-      // Add a small delay to prevent WebSocket race conditions on page refresh
-      await new Promise(resolve => setTimeout(resolve, 500));
+      setConnectionStatus('connecting');
```
65-70: Fix createSession argument order
`agentChatService.createSession(agentType, projectId?)`. You're passing (undefined, 'rag').

```diff
-      const { session_id } = await agentChatService.createSession(undefined, 'rag');
+      const { session_id } = await agentChatService.createSession('rag');
```
209-216: sendMessage call signature is incorrect
`sendMessage(sessionId, request: { message, project_id?, context? })`.

```diff
-      await agentChatService.sendMessage(sessionId, inputValue.trim(), context);
+      await agentChatService.sendMessage(sessionId, {
+        message: inputValue.trim(),
+        context,
+      });
```
230-253: Reconnection uses removed API; restart polling instead
manualReconnect no longer exists. Stop current stream, then re-subscribe (from last id).

```diff
-  const handleReconnect = async () => {
+  const handleReconnect = async () => {
     if (!sessionId || isReconnecting) return;

     setIsReconnecting(true);
     setConnectionStatus('connecting');
     setConnectionError('Attempting to reconnect...');

     try {
-      const success = await agentChatService.manualReconnect(sessionId);
-      if (success) {
-        setConnectionError(null);
-        setConnectionStatus('online');
-      } else {
-        setConnectionError('Reconnection failed. Server may still be offline.');
-        setConnectionStatus('offline');
-      }
+      agentChatService.stopStreaming(sessionId);
+      const status = await agentChatService.getServerStatus();
+      if (status === 'online') {
+        const lastId = messages.length ? messages[messages.length - 1].id : undefined;
+        await agentChatService.streamMessages(
+          sessionId,
+          (m: ChatMessage) => {
+            setMessages(prev => [...prev, m]);
+            setConnectionError(null);
+            setConnectionStatus('online');
+          },
+          (err: Error) => {
+            console.error('Message streaming error:', err);
+            setConnectionStatus('offline');
+            setConnectionError('Chat service is offline. Messages will not be received.');
+          },
+          lastId
+        );
+      } else {
+        setConnectionError('Reconnection failed. Server may still be offline.');
+        setConnectionStatus('offline');
+      }
     } catch (error) {
       console.error('Manual reconnection failed:', error);
       setConnectionError('Reconnection failed. Please try again later.');
       setConnectionStatus('offline');
     } finally {
       setIsReconnecting(false);
     }
   };
```
1-468: Remove WebSocket toggle and manual-reconnect logic from ArchonChatPanel.tsx
Drop the `import.meta.env.VITE_ENABLE_WEBSOCKET` guard (lines ~49–53), the related `isReconnecting`/`manualReconnect` state and handler (lines ~236–243), and the reconnect button UI; chat now uses REST streaming only.

python/src/server/services/crawling/helpers/url_handler.py (2)
28-33: Case-insensitive sitemap check + richer error logs.
Lower-case before checks and log stack traces for diagnostics.

```diff
-    try:
-        return url.endswith('sitemap.xml') or 'sitemap' in urlparse(url).path
+    try:
+        parsed = urlparse(url)
+        path_lower = parsed.path.lower()
+        return url.lower().endswith('sitemap.xml') or 'sitemap' in path_lower
     except Exception as e:
-        logger.warning(f"Error checking if URL is sitemap: {e}")
+        logger.warning(f"Error checking if URL is sitemap: {e} (url={url})", exc_info=True)
         return False
```
45-50: Case-insensitive .txt detection + stack traces.

```diff
-    try:
-        return url.endswith('.txt')
+    try:
+        return url.lower().endswith('.txt')
     except Exception as e:
-        logger.warning(f"Error checking if URL is text file: {e}")
+        logger.warning(f"Error checking if URL is text file: {e} (url={url})", exc_info=True)
         return False
```

python/src/server/services/crawling/strategies/sitemap.py (1)
19-55: Add resilient retries with backoff for external fetch.
Meets the guideline for external calls. Example using requests + urllib3 Retry:

```python
# Outside the function (module scope)
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

_session = requests.Session()
_retry = Retry(
    total=3,
    backoff_factor=0.5,
    status_forcelist=(429, 500, 502, 503, 504),
    allowed_methods=("GET",),
    raise_on_status=False,
)
_session.mount("http://", HTTPAdapter(max_retries=_retry))
_session.mount("https://", HTTPAdapter(max_retries=_retry))

# Inside parse_sitemap (replace requests.get)
resp = _session.get(
    sitemap_url,
    timeout=30,
    headers={"User-Agent": "archon-crawler/1.0 (+https://archon)"},
)
```

python/src/server/services/projects/progress_service.py (4)
37-45: Don’t let initial_data override internal fields (avoid corrupted state).
`**initial_data` comes last and can overwrite `logs`, `start_time`, `status`, etc. This violates "never accept/store corrupted data."

```diff
-        self.active_operations[progress_id] = {
+        base = {
             "type": operation_type,
             "status": "starting",
             "percentage": 0,
-            "start_time": datetime.now(),
+            "start_time": datetime.now(timezone.utc),
             "logs": [f"🚀 Starting {operation_type}..."],
             "step": "initialization",
-            **initial_data,
-        }
+        }
+        # Allow only non-reserved keys from initial_data
+        reserved = set(base.keys()) | {"duration", "error"}
+        safe_initial = {k: v for k, v in (initial_data or {}).items() if k not in reserved}
+        self.active_operations[progress_id] = {**base, **safe_initial}
```

Also see the timezone fix below.
60-66: Unknown progress_id should raise a specific exception, not silently return.
Silent returns make debugging hard and break "fail fast with clear, contextual errors." Raise with context and let the API layer map to 404.

```diff
-        if progress_id not in self.active_operations:
-            logger.warning(f"📊 [PROGRESS] Attempted to update unknown operation: {progress_id}")
-            logger.warning(
-                f"📊 [PROGRESS] Active operations: {list(self.active_operations.keys())}"
-            )
-            return
+        if progress_id not in self.active_operations:
+            logger.warning(
+                f"📊 [PROGRESS] Attempted to update unknown operation: {progress_id}. "
+                f"Active: {list(self.active_operations.keys())}"
+            )
+            raise KeyError(f"Progress operation not found: {progress_id}")
```
108-116: Don't block 30s in complete_operation; schedule cleanup in the background.
`await asyncio.sleep(30)` keeps the request/task open 30s. Use `create_task` to defer cleanup without blocking.

```diff
-        logger.info(f"🧹 [PROGRESS] Scheduling cleanup for {progress_id} in 30 seconds")
-        await asyncio.sleep(30)  # Increased from 5 to 30 seconds
-        if progress_id in self.active_operations:
-            logger.info(f"🧹 [PROGRESS] Cleaning up completed operation: {progress_id}")
-            del self.active_operations[progress_id]
-        else:
-            logger.info(f"🧹 [PROGRESS] Operation {progress_id} already cleaned up")
+        logger.info(f"🧹 [PROGRESS] Scheduling cleanup for {progress_id} in 30 seconds")
+        asyncio.create_task(self._deferred_cleanup(progress_id, delay_seconds=30))
```

Add this helper (outside the ranges shown, e.g., below the class methods):

```python
async def _deferred_cleanup(self, progress_id: str, delay_seconds: int) -> None:
    import asyncio

    await asyncio.sleep(delay_seconds)
    if progress_id in self.active_operations:
        logger.info(f"🧹 [PROGRESS] Cleaning up completed operation: {progress_id}")
        del self.active_operations[progress_id]
    else:
        logger.info(f"🧹 [PROGRESS] Operation {progress_id} already cleaned up")
```
130-136: Isolate progress callback failures so crawl/work doesn't break.
Wrap callback to log and continue (aligns with "continue processing; log per-item failures").

```diff
-        if progress_callback:
-            # Add step information for multi-progress tracking
-            step_info = {"currentStep": message, "stepMessage": message, **kwargs}
-            await progress_callback("crawling", percentage, message, **step_info)
+        if progress_callback:
+            try:
+                step_info = {"currentStep": message, "stepMessage": message, **kwargs}
+                await progress_callback("crawling", percentage, message, **step_info)
+            except Exception as e:
+                logger.error(
+                    f"Progress callback failed at {percentage}%: {e}",
+                    exc_info=True,
+                )
```

python/src/server/services/crawling/strategies/recursive.py (2)
63-67: Fail fast if crawler is missing.
Returning `[]` hides a misconfiguration. Raise with context so API can map to 500/503 and logs carry a stack trace.

```diff
-        if not self.crawler:
-            logger.error("No crawler instance available for recursive crawling")
-            if progress_callback:
-                await progress_callback("error", 0, "Crawler not available")
-            return []
+        if not self.crawler:
+            logger.error("No crawler instance available for recursive crawling", exc_info=True)
+            raise RuntimeError("Crawler not available for recursive crawling")
```
130-136: Harden progress reporting against callback errors (don't break crawl).
Same rationale as in ProgressService.

```diff
-        async def report_progress(percentage: int, message: str, **kwargs):
+        async def report_progress(percentage: int, message: str, **kwargs):
             """Helper to report progress if callback is available"""
             if progress_callback:
-                # Add step information for multi-progress tracking
-                step_info = {"currentStep": message, "stepMessage": message, **kwargs}
-                await progress_callback("crawling", percentage, message, **step_info)
+                try:
+                    step_info = {"currentStep": message, "stepMessage": message, **kwargs}
+                    await progress_callback("crawling", percentage, message, **step_info)
+                except Exception as e:
+                    logger.error(f"Progress callback failed: {e}", exc_info=True)
```

python/src/mcp_server/features/tasks/task_tools.py (3)
113-116: Validate required fields before calling the API.
Fail fast if `title` is empty to avoid creating bad records.

```diff
     try:
         api_url = get_api_url()
         timeout = get_default_timeout()
+        if not title or not title.strip():
+            return MCPErrorFormatter.format_error(
+                error_type="validation_error",
+                message="Title is required",
+                suggestion="Provide a non-empty title",
+            )
```
345-347: Validate status against allowed values before updating.
Prevents illegal states reaching the backend.

```diff
-        if status is not None:
-            update_fields["status"] = status
+        if status is not None:
+            allowed = {"todo", "doing", "review", "done"}
+            if status not in allowed:
+                return MCPErrorFormatter.format_error(
+                    error_type="validation_error",
+                    message=f"Invalid status '{status}'",
+                    suggestion="Use one of: todo | doing | review | done",
+                )
+            update_fields["status"] = status
```
117-131: Add retries with exponential backoff for external API calls.
Guidelines require retries on external calls. Suggest a small helper and use it for POST/GET/PUT/DELETE.
Example helper to add near imports:

```python
# helper (place near top of file)
import asyncio

async def _with_retry(request_coro_factory, *, attempts: int = 3, base_delay: float = 0.5):
    last_exc = None
    for i in range(attempts):
        try:
            return await request_coro_factory()
        except httpx.RequestError as e:
            last_exc = e
            if i == attempts - 1:
                raise
            await asyncio.sleep(base_delay * (2 ** i))
    raise last_exc
```

Example usage inside `create_task`:

```diff
-            response = await client.post(
-                urljoin(api_url, "/api/tasks"),
-                json={ ... },
-            )
+            response = await _with_retry(lambda: client.post(
+                urljoin(api_url, "/api/tasks"),
+                json={ ... },
+            ))
```

I can apply this pattern across list/get/update/delete if you prefer.
Also applies to: 209-214, 279-286, 365-377, 415-423
archon-ui-main/src/components/project-tasks/EditTaskModal.tsx (1)
156-158: Use nullish coalescing to preserve 0 priority values.
`|| 1` forces 0 → 1 unexpectedly.

```diff
-                value={localTask?.task_order || 1}
+                value={localTask?.task_order ?? 1}
```

python/src/server/services/storage/storage_services.py (1)
74-81: Fix async progress callback, improve error logging, and deprecate WebSocket pathway.
- Lambda returns a coroutine but type says int; make an async callback and cast percentage to int.
- Log with stack traces and context per guidelines.
- Given polling replaces sockets, consider removing WebSocket dependency and use a progress-tracker service.
Apply:
```diff
@@
-        chunks = await self.smart_chunk_text_async(
-            file_content,
-            chunk_size=5000,
-            progress_callback=lambda msg, pct: report_progress(
-                f"Chunking: {msg}", 10 + float(pct) * 0.2
-            ),
-        )
+        async def _chunk_progress(msg: str, pct: int) -> None:
+            await report_progress(f"Chunking: {msg}", int(10 + float(pct) * 0.2))
+
+        chunks = await self.smart_chunk_text_async(
+            file_content,
+            chunk_size=5000,
+            progress_callback=_chunk_progress,
+        )
@@
-        from ..source_management_service import extract_source_summary, update_source_info
+        from ..source_management_service import extract_source_summary, update_source_info
@@
-        source_summary = await extract_source_summary(source_id, file_content[:5000])
+        source_summary = await extract_source_summary(source_id, file_content[:5000])
@@
-        await update_source_info(
+        await update_source_info(
             self.supabase_client,
             source_id,
             source_summary,
             total_word_count,
             file_content[:1000],  # content for title generation
-            knowledge_type,  # Pass knowledge_type parameter!
+            knowledge_type,  # Pass knowledge_type parameter!
             tags,  # FIX: Pass tags parameter!
         )
@@
-        logger.error(f"Error uploading document: {e}")
+        logger.exception(
+            f"Error uploading document during upload_document: filename={filename}, source_id={source_id}"
+        )
```

Optional typing clean-up (keeps mypy happy):
```diff
-from typing import Any
+from typing import Any, Awaitable, Callable
@@
-        progress_callback: Any | None = None,
+        progress_callback: Callable[[str, int, dict | None], Awaitable[None]] | None = None,
```

Migration note: If WebSocket must be removed, drop the websocket parameter and route progress through the new progress service; I can provide a patch once the preferred progress API is confirmed.
Also applies to: 127-140, 178-191, 8-10, 22-32
python/src/server/api_routes/agent_chat_api.py (2)
17-19: In-memory sessions are not production-safe or multi-worker compatible.
State will be lost on restart and isn't shared across Uvicorn workers. Persist sessions/messages (e.g., Supabase/Redis) or clearly gate this as dev-only.
I can draft a minimal Supabase-backed session store if desired.
27-33: Model/serialization mismatch for timestamp; add validation and standardize to UTC.
Messages are stored with ISO strings, while ChatMessage.timestamp is a datetime. Either return ChatMessage instances (letting Pydantic serialize) or change the model to str. Prefer UTC.

```diff
-from datetime import datetime
+from datetime import datetime, timezone
@@
 class ChatMessage(BaseModel):
     id: str
     content: str
     sender: str
-    timestamp: datetime
+    timestamp: datetime  # Ensure UTC when creating messages
     agent_type: str | None = None
```

python/src/server/main.py (1)
96-111: Preserve stack traces; fix incorrect logging kwargs.
Use exc_info=True; remove unsupported "error=" kwarg. This matches our logging guideline.

```diff
-    except Exception as e:
-        api_logger.warning(f"Could not fully initialize crawling context: {str(e)}")
+    except Exception:
+        api_logger.warning("Could not fully initialize crawling context", exc_info=True)
@@
-    except Exception as e:
-        api_logger.warning(f"Could not initialize prompt service: {e}")
+    except Exception:
+        api_logger.warning("Could not initialize prompt service", exc_info=True)
@@
-    except Exception as e:
-        api_logger.warning(f"Could not set main event loop: {e}")
+    except Exception:
+        api_logger.warning("Could not set main event loop", exc_info=True)
@@
-    except Exception as e:
-        api_logger.error(f"❌ Failed to start backend: {str(e)}")
+    except Exception:
+        api_logger.error("❌ Failed to start backend", exc_info=True)
@@
-    except Exception as e:
-        api_logger.warning("Could not cleanup crawling context", error=str(e))
+    except Exception:
+        api_logger.warning("Could not cleanup crawling context", exc_info=True)
@@
-    except Exception as e:
-        api_logger.warning("Could not cleanup background task manager", error=str(e))
+    except Exception:
+        api_logger.warning("Could not cleanup background task manager", exc_info=True)
@@
-    except Exception as e:
-        api_logger.error(f"❌ Error during shutdown: {str(e)}")
+    except Exception:
+        api_logger.error("❌ Error during shutdown", exc_info=True)
```

Also applies to: 120-132, 158-159
python/src/server/services/storage/document_storage_service.py (3)
93-101: Bug: fallback delete uses hard-coded slice size 10, ignoring fallback_batch_size.
Use fallback_batch_size for consistency.

```diff
-            batch_urls = unique_urls[i : i + 10]
+            batch_urls = unique_urls[i : i + fallback_batch_size]
```
262-277: O(n²) and incorrect mapping when duplicate texts exist; map by index robustly.
Build an index map to handle duplicates and avoid quadratic scans.

```diff
-                batch_data = []
-                # Map successful texts back to their original indices
-                for j, (embedding, text) in enumerate(
-                    zip(batch_embeddings, successful_texts, strict=False)
-                ):
-                    # Find the original index of this text
-                    orig_idx = None
-                    for idx, orig_text in enumerate(contextual_contents):
-                        if orig_text == text:
-                            orig_idx = idx
-                            break
-
-                    if orig_idx is None:
-                        search_logger.warning("Could not map embedding back to original text")
-                        continue
-
-                    j = orig_idx  # Use original index for metadata lookup
+                batch_data = []
+                from collections import defaultdict, deque
+                index_map = defaultdict(deque)
+                for _idx, _txt in enumerate(contextual_contents):
+                    index_map[_txt].append(_idx)
+                for embedding, text in zip(batch_embeddings, successful_texts, strict=False):
+                    if not index_map[text]:
+                        search_logger.warning("Could not map embedding back to original text")
+                        continue
+                    j = index_map[text].popleft()  # Use original index for metadata lookup

                     # Use source_id from metadata if available, otherwise extract from URL
                     if batch_metadatas[j].get("source_id"):
                         source_id = batch_metadatas[j]["source_id"]
                     else:
                         # Fallback: Extract source_id from URL
                         parsed_url = urlparse(batch_urls[j])
                         source_id = parsed_url.netloc or parsed_url.path
```
368-380: Final progress message should reflect actual stored count; remove Socket.IO wording.
Align with polling and use total_chunks_stored.

```diff
-    # Send final 100% progress report to ensure UI shows completion
+    # Send final 100% progress report to ensure UI shows completion
     if progress_callback and asyncio.iscoroutinefunction(progress_callback):
         await progress_callback(
-            f"Document storage completed: {len(contents)} chunks stored in {total_batches} batches",
+            f"Document storage completed: {total_chunks_stored} chunks stored in {total_batches} batches",
             100,  # Ensure we report 100%
             {
                 "completed_batches": total_batches,
                 "total_batches": total_batches,
                 "current_batch": total_batches,
-                "chunks_processed": len(contents),
-                # DON'T send 'status': 'completed' - that's for the orchestration service only!
+                "chunks_processed": total_chunks_stored,
             },
         )
```

python/src/server/services/crawling/strategies/batch.py (1)
118-122: Instantiate MemoryAdaptiveDispatcher with ThreadingConfig
MemoryAdaptiveDispatcher.__init__ only takes a single `config: ThreadingConfig` (python/src/server/services/threading_service.py:165); replace the keyword args in batch.py (lines 118–122) with a `ThreadingConfig(...)` instance matching its fields.

archon-ui-main/src/components/project-tasks/TaskBoardView.tsx (1)
21-35: Wire selection props through or remove them
ColumnDropZone accepts selectedTasks and onTaskSelect but never passes them to DraggableTaskCard, and DraggableTaskCardProps does not define any selection props. Either update DraggableTaskCardProps (and its component) to accept and use selectedTasks/onTaskSelect (or isSelected/onToggleSelect) and pass them in the JSX, or drop these unused props.

python/src/server/services/crawling/strategies/single_page.py (1)
140-147: Preserve stack traces with logger.exception.
Leverage logger.exception (or exc_info=True) for crawler errors.
```diff
-            except Exception as e:
-                last_error = f"Crawler exception for {url}: {str(e)}"
-                logger.error(last_error)
+            except Exception as e:
+                last_error = f"Crawler exception for {url}: {e}"
+                logger.exception(last_error)
```

archon-ui-main/src/pages/KnowledgeBasePage.tsx (5)
972-975: Bug: viewMode uses 'list' but state uses 'table'.
These branches never execute. Replace 'list' with 'table'.
```diff
-        ) : (
-          // Original layout when no progress items or in list view
+        ) : (
+          // Original layout when no progress items or in table view
           <div className={`grid ${viewMode === 'grid' ? 'grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-4' : 'grid-cols-1 gap-3'}`}>
             {/* Progress Items (only in list view) */}
-            {viewMode === 'list' && progressItems.map(progressData => (
+            {viewMode === 'table' && progressItems.map(progressData => (
```

Also applies to: 1045-1053
183-186: Remove leftover Socket.IO cleanup.
WebSockets were removed; disconnect() is a dead path.
```diff
-      // Cleanup all crawl progress connections on unmount
-      crawlProgressService.disconnect();
+      // No socket cleanup needed with polling
```
441-456: Deduplicate and remove Socket.IO streaming; switch to polling-only.
The progress item is added twice and streaming is still used. Rely on polling by setting activeProgressId.
```diff
-      setProgressItems(prev => [...prev, progressData]);
+      setProgressItems(prev => [...prev, progressData]);
@@
-      // Connect to crawl progress WebSocket
-      // Add new progress item for tracking
-      setProgressItems(prev => [...prev, {
-        progressId: response.progressId,
-        status: 'starting',
-        message: 'Starting crawl operation...'
-      }]);
-
-      await crawlProgressService.streamProgressEnhanced(response.progressId, {
-        onMessage: (data: CrawlProgressData) => {
-          console.log('🔄 Refresh progress update:', data);
-          if (data.status === 'completed') {
-            handleProgressComplete(data);
-          } else if (data.error || data.status === 'error') {
-            handleProgressError(data.error || 'Refresh failed', response.progressId);
-          } else if (data.status === 'cancelled' || data.status === 'stopped') {
-            // Handle cancelled/stopped status
-            handleProgressUpdate({ ...data, status: 'cancelled' });
-            setTimeout(() => {
-              setProgressItems(prev => prev.filter(item => item.progressId !== response.progressId));
-              // Clean up from localStorage
-              try {
-                localStorage.removeItem(`crawl_progress_${response.progressId}`);
-                const activeCrawls = JSON.parse(localStorage.getItem('active_crawls') || '[]');
-                const updated = activeCrawls.filter((id: string) => id !== response.progressId);
-                localStorage.setItem('active_crawls', JSON.stringify(updated));
-              } catch (error) {
-                console.error('Failed to clean up cancelled crawl:', error);
-              }
-              crawlProgressService.stopStreaming(response.progressId);
-            }, 2000); // Show cancelled status for 2 seconds before removing
-          } else {
-            handleProgressUpdate(data);
-          }
-        },
-        onStateChange: (state: any) => {
-          console.log('🔄 Refresh state change:', state);
-        },
-        onError: (error: Error | Event) => {
-          const errorMessage = error instanceof Error ? error.message : 'Connection error';
-          console.error('❌ Refresh error:', errorMessage);
-          handleProgressError(errorMessage, response.progressId);
-        }
-      }, {
-        autoReconnect: true,
-        reconnectDelay: 5000,
-        connectionTimeout: 10000
-      });
+      // Switch active progress to start polling
+      setActiveProgressId(response.progressId);
```

Also applies to: 476-523
585-599: Avoid removing the progress card twice.
You remove immediately and again after 3s. Keep the delayed removal so users can see "completed".
```diff
-      // Remove from progress items after completion
-      setProgressItems(prev => prev.filter(item => item.progressId !== data.progressId));
+      // Keep visible briefly; will remove after delay below
```
602-629: Do not reference Socket.IO stopStreaming in polling mode.
Remove stopStreaming; polling will auto-stop when status is terminal.
```diff
-      // Stop the Socket.IO streaming for this progress
-      crawlProgressService.stopStreaming(data.progressId);
@@
-      // Stop the Socket.IO streaming for this progress
-      crawlProgressService.stopStreaming(progressId);
```

python/src/server/services/crawling/document_storage_operations.py (1)
139-156: Validate storage results before using chunk counts
The code retrieves `chunks_stored` from `storage_stats` but doesn't validate that storage was successful. According to the PR objectives, "crawl data validation now returns actual stored chunk counts and throws on zero storage." Consider adding validation to ensure chunks were actually stored.

```diff
 # Call add_documents_to_supabase with the correct parameters
 storage_stats = await add_documents_to_supabase(
     client=self.supabase_client,
     urls=all_urls,  # Now has entry per chunk
     chunk_numbers=all_chunk_numbers,  # Proper chunk numbers (0, 1, 2, etc)
     contents=all_contents,  # Individual chunks
     metadatas=all_metadatas,  # Metadata per chunk
     url_to_full_document=url_to_full_document,
     batch_size=25,  # Increased from 10 for better performance
     progress_callback=progress_callback,  # Pass the callback for progress updates
     enable_parallel_batches=True,  # Enable parallel processing
     provider=None,  # Use configured provider
     cancellation_check=cancellation_check  # Pass cancellation check
 )

 # Calculate chunk counts
 chunk_count = len(all_contents)
 chunks_stored = storage_stats.get("chunks_stored", 0)

+# Validate that chunks were actually stored
+if chunk_count > 0 and chunks_stored == 0:
+    safe_logfire_error(f"No chunks were stored despite having {chunk_count} chunks to process")
+    raise Exception(f"Failed to store any chunks - expected {chunk_count} chunks to be stored")
+
 return {
     'chunk_count': chunk_count,
     'chunks_stored': chunks_stored,
```

python/src/server/services/source_management_service.py (1)
256-265: 'updated_at': 'now()' will store a literal string, not a timestamp.
PostgREST won't execute SQL functions in JSON payloads. Either rely on DB default/trigger or set an ISO timestamp.
Apply:
```diff
         .update({
             "summary": summary,
             "total_word_count": word_count,
-            "metadata": metadata,
-            "updated_at": "now()",
+            "metadata": metadata,
         })
```

Prefer DB triggers, or use `datetime.utcnow().isoformat()` if you must set here.

python/src/server/services/crawling/crawling_service.py (2)
452-468: On cancellation, also record progress state via tracker.
So pollers see a terminal 'cancelled' state with timestamp.
Apply:
```diff
         except asyncio.CancelledError:
             safe_logfire_info(f"Crawl operation cancelled | progress_id={self.progress_id}")
+            if self.progress_tracker:
+                await self.progress_tracker.update(
+                    status="cancelled", percentage=-1, log="Crawl operation was cancelled by user"
+                )
             await self._handle_progress_update(
                 task_id,
                 {
                     "status": "cancelled",
                     "percentage": -1,
                     "log": "Crawl operation was cancelled by user",
                 },
             )
```
468-484: On error, call progress_tracker.error for consistent terminal state.
_update only sets fields; error() also stamps errorTime and standardizes state.
Apply:
```diff
         except Exception as e:
             safe_logfire_error(f"Async crawl orchestration failed | error={str(e)}")
+            if self.progress_tracker:
+                await self.progress_tracker.error(str(e))
             await self._handle_progress_update(
                 task_id,
                 {
                     "status": "error",
                     "percentage": -1,
                     "log": f"Crawl failed: {str(e)}",
                     "error": str(e)
                 }
             )
```
```typescript
// Load initial chat history
try {
  const history = await agentChatService.getChatHistory(session_id);
  console.log(`[CHAT PANEL] Loaded chat history:`, history);
  setMessages(history || []);
} catch (error) {
  console.error('Failed to load chat history:', error);
  // Initialize with empty messages if history can't be loaded
  setMessages([]);
}
```
🛠️ Refactor suggestion
Avoid re-appending history on first poll
Pass the last history id into streamMessages (with new optional param).
```diff
 // Load initial chat history
 try {
   const history = await agentChatService.getChatHistory(session_id);
   console.log(`[CHAT PANEL] Loaded chat history:`, history);
   setMessages(history || []);
 } catch (error) {
   console.error('Failed to load chat history:', error);
   // Initialize with empty messages if history can't be loaded
   setMessages([]);
 }

 // Start polling for new messages (will fail gracefully if backend is down)
 try {
+  const lastId = (Array.isArray(history) && history.length > 0) ? history[history.length - 1].id : undefined;
   await agentChatService.streamMessages(
     session_id,
     (message: ChatMessage) => {
       setMessages(prev => [...prev, message]);
       setConnectionError(null); // Clear any previous errors on successful message
       setConnectionStatus('online');
     },
     (error: Error) => {
       console.error('Message streaming error:', error);
       setConnectionStatus('offline');
       setConnectionError('Chat service is offline. Messages will not be received.');
-    }
+    },
+    lastId
   );
 } catch (error) {
```

Also applies to: 83-101
wont fix, agent service is disabled
@Wirasm Understood! That makes perfect sense - no point optimizing polling behavior for a disabled service. Thanks for the clarification.
✏️ Learnings added
Learnt from: Wirasm
PR: coleam00/Archon#514
File: archon-ui-main/src/components/layouts/ArchonChatPanel.tsx:71-81
Timestamp: 2025-08-28T12:03:56.351Z
Learning: The agent chat service is disabled in the Archon codebase, so optimizations or fixes related to agent chat functionality may not be necessary to implement.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
const mutateAsync = useCallback(async (variables: TVariables): Promise<TData> => {
  setIsPending(true);
  setIsError(false);
  setIsSuccess(false);
  setError(null);

  try {
    const result = await mutationFn(variables);
    setData(result);
    setIsSuccess(true);

    // Call success callback if provided
    if (onSuccess) {
      onSuccess(result, variables);
    }

    // Show success message if available
    if (successMessage && typeof window !== 'undefined') {
      console.log(successMessage);
    }

    return result;
  } catch (err) {
    const error = err instanceof Error ? err : new Error('Unknown error');
    setError(error);
    setIsError(true);

    // Call error callback if provided
    if (onError) {
      onError(error);
    }

    // Show error message
    if (typeof window !== 'undefined') {
      console.error(`${errorMessage}:`, error);
    }

    throw error;
  } finally {
    setIsPending(false);
  }
}, [mutationFn, onSuccess, onError, successMessage, errorMessage]);
🛠️ Refactor suggestion
Guard against state updates after unmount and support cancellation.
Multiple rapid calls/unmounts can trigger React warnings. Track mount status and (optionally) pass an AbortSignal to mutationFn.
-import { useState, useCallback } from 'react';
+import { useState, useCallback, useRef, useEffect } from 'react';
@@
export function useProjectMutation<TData = unknown, TVariables = unknown>(
_key: unknown, // For compatibility with old API, not used
mutationFn: (variables: TVariables) => Promise<TData>,
options: UseProjectMutationOptions<TData, TVariables> = {}
): UseProjectMutationResult<TData, TVariables> {
+ const isMounted = useRef(true);
+ useEffect(() => {
+ return () => {
+ isMounted.current = false;
+ };
+ }, []);
@@
const mutateAsync = useCallback(async (variables: TVariables): Promise<TData> => {
- setIsPending(true);
- setIsError(false);
- setIsSuccess(false);
- setError(null);
+ if (isMounted.current) {
+ setIsPending(true);
+ setIsError(false);
+ setIsSuccess(false);
+ setError(null);
+ }
@@
- setData(result);
- setIsSuccess(true);
+ if (isMounted.current) {
+ setData(result);
+ setIsSuccess(true);
+ }
@@
- setError(error);
- setIsError(true);
+ if (isMounted.current) {
+ setError(error);
+ setIsError(true);
+ }
@@
- } finally {
- setIsPending(false);
- }
+ } finally {
+ if (isMounted.current) {
+ setIsPending(false);
+ }
+ }
- }, [mutationFn, onSuccess, onError, successMessage, errorMessage]);
+  }, [mutationFn, onSuccess, onError, successMessage, errorMessage]);
🤖 Prompt for AI Agents
In archon-ui-main/src/hooks/useProjectMutation.ts around lines 42 to 83, the
mutateAsync function can update state after the component has unmounted and
lacks cancellation for in-flight requests; modify the hook to track mounted
status with a ref (set true on mount and false on unmount) and only call
setState or callbacks when still mounted, and create a per-call AbortController
whose signal is passed to mutationFn if it accepts one; ensure the controller is
aborted on unmount (and optionally when starting a new mutate call) and update
dependencies accordingly so state updates and callbacks are guarded and
in-flight operations can be cancelled.
fixed the core issue with the memory leak
@Wirasm Great work on addressing the memory leak! Preventing state updates after unmount is crucial for avoiding React warnings and potential memory issues, especially in hooks that handle async operations. The fix should help ensure cleaner component lifecycle management throughout the polling-based architecture.
 try:
-    try:
-        import os
-
-        import openai
-
-        api_key = os.getenv("OPENAI_API_KEY")
-        if not api_key:
-            # Try to get from credential service with direct fallback
-            from .credential_service import credential_service
-
-            if (
-                credential_service._cache_initialized
-                and "OPENAI_API_KEY" in credential_service._cache
-            ):
-                cached_key = credential_service._cache["OPENAI_API_KEY"]
-                if isinstance(cached_key, dict) and cached_key.get("is_encrypted"):
-                    api_key = credential_service._decrypt_value(cached_key["encrypted_value"])
-                else:
-                    api_key = cached_key
-            else:
-                api_key = os.getenv("OPENAI_API_KEY", "")
-
-        if not api_key:
-            raise ValueError("No OpenAI API key available")
-
-        client = openai.OpenAI(api_key=api_key)
-        search_logger.info("Successfully created LLM client fallback for summary generation")
-    except Exception as e:
-        search_logger.error(f"Failed to create LLM client fallback: {e}")
-        return default_summary
-
-    # Call the OpenAI API to generate the summary
-    response = client.chat.completions.create(
-        model=model_choice,
-        messages=[
-            {
-                "role": "system",
-                "content": "You are a helpful assistant that provides concise library/tool/framework summaries.",
-            },
-            {"role": "user", "content": prompt},
-        ],
-    )
+    async with get_llm_client(provider=provider) as client:
+        # Get model choice from credential service
+        from .credential_service import credential_service
+        rag_settings = await credential_service.get_credentials_by_category("rag_strategy")
+        model_choice = rag_settings.get("MODEL_CHOICE", "gpt-4.1-nano")
+
+        search_logger.info(f"Generating summary for {source_id} using model: {model_choice}")
+
+        # Call the LLM API to generate the summary
+        response = await client.chat.completions.create(
+            model=model_choice,
+            messages=[
+                {
+                    "role": "system",
+                    "content": "You are a helpful assistant that provides concise library/tool/framework summaries.",
+                },
+                {"role": "user", "content": prompt},
+            ],
+        )
-    # Extract the generated summary with proper error handling
-    if not response or not response.choices or len(response.choices) == 0:
-        search_logger.error(f"Empty or invalid response from LLM for {source_id}")
-        return default_summary
+        # Extract the generated summary with proper error handling
+        if not response or not response.choices or len(response.choices) == 0:
+            search_logger.error(f"Empty or invalid response from LLM for {source_id}")
+            return default_summary
-    message_content = response.choices[0].message.content
-    if message_content is None:
-        search_logger.error(f"LLM returned None content for {source_id}")
-        return default_summary
+        message_content = response.choices[0].message.content
+        if message_content is None:
+            search_logger.error(f"LLM returned None content for {source_id}")
+            return default_summary
-    summary = message_content.strip()
+        summary = message_content.strip()
-    # Ensure the summary is not too long
-    if len(summary) > max_length:
-        summary = summary[:max_length] + "..."
+        # Ensure the summary is not too long
+        if len(summary) > max_length:
+            summary = summary[:max_length] + "..."
-    return summary
+        return summary
🛠️ Refactor suggestion
Add retries with backoff for LLM calls and preserve stack traces.
External API per guidelines should retry and log full traces.
Apply:
- try:
- async with get_llm_client(provider=provider) as client:
+ try:
+ async with get_llm_client(provider=provider) as client:
# Get model choice from credential service
from .credential_service import credential_service
rag_settings = await credential_service.get_credentials_by_category("rag_strategy")
model_choice = rag_settings.get("MODEL_CHOICE", "gpt-4.1-nano")
search_logger.info(f"Generating summary for {source_id} using model: {model_choice}")
- # Call the LLM API to generate the summary
- response = await client.chat.completions.create(
- model=model_choice,
- messages=[
- {
- "role": "system",
- "content": "You are a helpful assistant that provides concise library/tool/framework summaries.",
- },
- {"role": "user", "content": prompt},
- ],
- )
+ # Call the LLM API with simple retries
+ last_exc = None
+ response = None
+ for attempt in range(3):
+ try:
+ response = await client.chat.completions.create(
+ model=model_choice,
+ messages=[
+ {"role": "system", "content": "You are a helpful assistant that provides concise library/tool/framework summaries."},
+ {"role": "user", "content": prompt},
+ ],
+ )
+ break
+ except Exception as e:
+ last_exc = e
+ await asyncio.sleep(0.5 * (2 ** attempt))
+ if response is None:
+ search_logger.error(f"LLM call failed after retries for {source_id}", exc_info=True)
+ return default_summary
# Extract the generated summary with proper error handling
if not response or not response.choices or len(response.choices) == 0:
search_logger.error(f"Empty or invalid response from LLM for {source_id}")
return default_summary
@@
- except Exception as e:
- search_logger.error(
- f"Error generating summary with LLM for {source_id}: {e}. Using default summary."
- )
+ except Exception as e:
+ search_logger.error(f"Error generating summary with LLM for {source_id}: {e}. Using default summary.", exc_info=True)
     return default_summary
@Wirasm Understood! No problem at all - you're managing the scope and priorities for this PR. The retry logic can always be addressed in a future enhancement if needed.
    json_str = json.dumps(data, sort_keys=True, default=str)

    # Generate MD5 hash
    hash_obj = hashlib.md5(json_str.encode('utf-8'))

    # Return ETag in standard format (quoted)
    return f'"{hash_obj.hexdigest()}"'
🛠️ Refactor suggestion
Use SHA-256 and canonical JSON to avoid weak-hash (Bandit B303) and whitespace drift.
MD5 will trip security linters and has higher collision risk. Also, canonicalize JSON with separators to keep ETags stable.
- json_str = json.dumps(data, sort_keys=True, default=str)
-
- # Generate MD5 hash
- hash_obj = hashlib.md5(json_str.encode('utf-8'))
+ json_str = json.dumps(data, sort_keys=True, default=str, separators=(",", ":"))
+
+ # Generate SHA-256 hash (avoid weak-hash warnings)
+ hash_obj = hashlib.sha256(json_str.encode("utf-8"))
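A quick illustration of why the separators argument matters for ETag stability: json.dumps inserts whitespace by default, so that spacing becomes part of the hashed bytes; pinning separators keeps the serialization canonical.

import json

json.dumps({"a": 1}, sort_keys=True)                         # '{"a": 1}' (default spacing)
json.dumps({"a": 1}, sort_keys=True, separators=(",", ":"))  # '{"a":1}'  (canonical form)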
🤖 Prompt for AI Agents
In python/src/server/utils/etag_utils.py around lines 18 to 24, replace the
MD5-based ETag generation with SHA-256 and produce canonical JSON to avoid
whitespace drift: serialize data using json.dumps with sort_keys=True and
separators=(',', ':') (and default=str as needed), then compute a hashlib.sha256
over the UTF-8 encoded bytes of that canonical JSON and return the quoted hex
digest as the ETag.
skipped and documented in the ETag docs
@Wirasm, understood! I see this aligns with the documented ETag implementation approach where the current implementation intentionally keeps things simple despite some limitations. Thanks for the clarification.
🐰 (respectfully noting)
🧠 Learnings used
Learnt from: Wirasm
PR: coleam00/Archon#514
File: python/src/server/utils/etag_utils.py:27-42
Timestamp: 2025-08-28T12:06:56.351Z
Learning: The ETag implementation in python/src/server/utils/etag_utils.py has known limitations with If-None-Match semantics (wildcards, multiple ETags, weak validators) that are documented in PRPs/ai_docs/ETAG_IMPLEMENTATION.md. These limitations are intentionally not addressed to keep the current implementation simple.
def check_etag(request_etag: str | None, current_etag: str) -> bool:
    """Check if request ETag matches current ETag.

    Args:
        request_etag: ETag from If-None-Match header
        current_etag: Current ETag of the data

    Returns:
        True if ETags match (data unchanged), False otherwise
    """
    if not request_etag:
        return False

    # Both ETags should have quotes, compare directly
    # The If-None-Match header and our generated ETag should both be quoted
    return request_etag == current_etag
Handle full If-None-Match semantics (lists, weak validators, wildcard).
Current equality check misses common cases like W/ weak tags, multiple ETags, and *. This will cause incorrect 200/304 behavior through proxies/caches.
-def check_etag(request_etag: str | None, current_etag: str) -> bool:
+def check_etag(request_etag: str | None, current_etag: str) -> bool:
@@
- if not request_etag:
- return False
-
- # Both ETags should have quotes, compare directly
- # The If-None-Match header and our generated ETag should both be quoted
- return request_etag == current_etag
+ if not request_etag:
+ return False
+
+ # Split on commas (multiple validators allowed), trim spaces
+ candidates = [t.strip() for t in request_etag.split(",") if t.strip()]
+ for token in candidates:
+ # Wildcard: treat as match for GET/HEAD semantics
+ if token == "*":
+ return True
+
+ # Drop weak validator prefix if present
+ if token.startswith("W/"):
+ token = token[2:].strip()
+
+ # Normalize quotes
+ if not (token.startswith('"') and token.endswith('"')):
+ token = '"' + token.strip('"') + '"'
+
+ if token == current_etag:
+ return True
+ return False
🤖 Prompt for AI Agents
In python/src/server/utils/etag_utils.py around lines 27 to 42, the current
equality-only check ignores If-None-Match semantics (wildcard "*", multiple
comma-separated ETags, and weak validators like "W/"). Update check_etag to:
treat a header value of "*" as a match; split the header on commas and iterate
each listed tag; for each tag remove optional "W/" weak-prefix and surrounding
whitespace, extract the opaque-tag (strip surrounding quotes) and compare that
opaque-tag against the current_etag's opaque-tag (strip current_etag quotes as
well) using the weak-comparison rule (i.e., compare opaque-tags only). Return
True if any listed tag matches, otherwise False.
will not address this, not in scope for current pr limitation documented here PRPs/ai_docs/ETAG_IMPLEMENTATION.md
@Wirasm, understood. I acknowledge that the ETag implementation improvement is out of scope for this PR and appreciate you documenting the limitation.
✏️ Learnings added
Learnt from: Wirasm
PR: coleam00/Archon#514
File: python/src/server/utils/etag_utils.py:27-42
Timestamp: 2025-08-28T12:06:56.351Z
Learning: The ETag implementation in python/src/server/utils/etag_utils.py has known limitations with If-None-Match semantics (wildcards, multiple ETags, weak validators) that are documented in PRPs/ai_docs/ETAG_IMPLEMENTATION.md. These limitations are intentionally not addressed to keep the current implementation simple.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
- Remove WebSocket URL configuration from api.ts - Clean up WebSocket tests and mocks from test files - Remove websocket parameter from embedding service - Update MCP project tools tests to match new API response format - Add example real test for usePolling hook - Update vitest config to properly include test files
- Add ETag utilities tests covering generation and checking logic - Add progress API tests with 304 Not Modified support - Add progress service tests for operation tracking - Add projects API polling tests with ETag validation - Fix projects API to properly handle ETag check independently of response object - Test coverage for critical polling components following MCP test patterns
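To illustrate the polling contract these tests cover, here is a minimal 304 round-trip sketch; the app import path and endpoint are assumptions for illustration:

from fastapi.testclient import TestClient

from src.server.main import app  # hypothetical import path

def test_returns_304_when_etag_matches():
    client = TestClient(app)
    first = client.get("/api/projects")
    assert first.status_code == 200
    etag = first.headers["ETag"]
    # Poll again with If-None-Match; unchanged data should yield 304 and no body
    second = client.get("/api/projects", headers={"If-None-Match": etag})
    assert second.status_code == 304
    assert second.content == b""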
- Remove getWebSocketUrl imports that were causing runtime errors - Replace WebSocket log streaming with deprecation warnings - Remove unused WebSocket properties and methods - Simplify disconnectLogs to no-op functions These services now use HTTP polling exclusively as part of the Socket.IO to polling migration. 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>
- Add isMountedRef to track component mount status - Guard all setState calls with mounted checks - Prevent callbacks from firing after unmount - Apply fix to useProjectMutation, useDatabaseMutation, and useAsyncMutation Addresses Code Rabbit feedback about potential state updates after component unmount. Simple pragmatic fix without over-engineering request cancellation. 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>
- Add concise documentation explaining current ETag implementation - Document that we use simple equality check, not full RFC 7232 - Clarify this works for our browser-to-API use case - Note limitations for future CDN/proxy support Addresses Code Rabbit feedback about RFC compliance by documenting the known limitations of our simplified implementation. 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>
- Remove WebSocket event schemas from projectSchemas.ts - Remove WebSocket event types from types/project.ts - Remove WebSocket initialization and subscription methods from projectService.ts - Remove all broadcast event calls throughout the service - Clean up imports to remove unused types Complete removal of WebSocket infrastructure in favor of HTTP polling. 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>
- Change backend API to return 'progress' instead of 'percentage' - Remove unnecessary mapping in frontend - Use consistent 'progress' field name throughout - Update all progress initialization to use 'progress' field Simple consolidation to one field name instead of mapping between two. 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>
- Update tasks state when polling returns new data - Keep UI in sync with server changes for selected project - Tasks now live-update from external changes without project switching The polling was fetching fresh data but never updating the UI state. 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>
- Use API response data.title instead of selectedProject?.title - Shows correct project name when pinning/unpinning any project card - Toast now accurately reflects which project was actually modified The issue was the toast would show the wrong project name when pinning a project that wasn't the currently selected one. 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>
Removed all temporary project tracking during creation: - Removed tempProjects state and allProjects combining - Removed handleProjectCreationProgress function - Removed progress polling for project creation - Removed ProjectCreationProgressCard rendering - Simplified createProject to just create and let polling pick it up This fixes false 'creation failed' errors and simplifies the code significantly. Project creation now shows a simple toast and relies on polling for updates.
Changed loadTaskCountsForAllProjects to use Promise.allSettled for parallel API calls: - All project task counts now fetched simultaneously instead of sequentially - Better error isolation - one project failing doesn't affect others - Significant performance improvement for users with multiple projects - If 5 projects: from 5×API_TIME to just 1×API_TIME total
Replace NodeJS.Timeout with ReturnType<typeof setInterval> in crawlProgressService. This makes the timer type compatible across both Node.js and browser environments, fixing TypeScript compilation errors in browser builds.
Map backend statuses to correct UI states: - 'processing' → 'processing' (use existing UI state) - 'queued' → 'starting' (pre-crawl state) - 'cancelled' → 'cancelled' (use existing UI state) This prevents incorrect UI states and gives users accurate feedback about crawl operation status.
Replace NodeJS.Timer with ReturnType<typeof setInterval> in both TaskPollingService and ProjectPollingService classes. This ensures compatibility across Node.js and browser environments.
This file was created during Socket.IO removal but never actually used. The application already uses usePolling hooks (useTaskPolling, useProjectPolling) which have proper ETag support and visibility handling. Removing dead code to reduce maintenance burden and confusion.
* Add Supabase key validation and simplify frontend state management
- Add backend validation to detect and warn about anon vs service keys
- Prevent startup with incorrect Supabase key configuration
- Consolidate frontend state management following KISS principles
- Remove duplicate state tracking and sessionStorage polling
- Add clear error display when backend fails to start
- Improve .env.example documentation with detailed key selection guide
- Add comprehensive test coverage for validation logic
- Remove unused test results checking to eliminate 404 errors
The implementation now warns users about key misconfiguration while
maintaining backward compatibility. Frontend state is simplified with
MainLayout as the single source of truth for backend status.
* Fix critical issues from code review
- Use python-jose (already in dependencies) instead of PyJWT for JWT decoding
- Make unknown Supabase key roles fail fast per alpha principles
- Skip all JWT validations (not just signature) when checking role
- Update tests to expect failure for unknown roles
Fixes:
- No need to add PyJWT dependency - python-jose provides JWT functionality
- Unknown key types now raise ConfigurationError instead of warning
- JWT decode properly skips all validations to only check role claim
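A minimal sketch of the role check described above, assuming python-jose; the helper name and the ConfigurationError type are assumptions based on this commit message:

from jose import jwt

def get_supabase_key_role(key: str) -> str:
    # Parses the payload without verifying signature, expiry, or audience;
    # we only need to read the role claim
    claims = jwt.get_unverified_claims(key)
    return claims.get("role", "unknown")

role = get_supabase_key_role(supabase_key)
if role == "anon":
    print("WARNING: anon key detected - use the service role key instead")
elif role != "service_role":
    raise ConfigurationError(f"Unknown Supabase key role: {role}")  # fail fast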
* Remove unnecessary startup delay script from frontend Dockerfile
- Rolled back to match main branch Dockerfile
- Removed 3-second sleep script that was added for backend readiness
- Container now runs npm directly without intermediate script
- Tested and verified all services start correctly without the delay
* Fix document deletion persistence issue (#278)
- Fixed projectService methods to include project_id parameter in API calls
- Updated deleteDocument() to use correct endpoint: /api/projects/{projectId}/docs/{docId}
- Updated getDocument() and updateDocument() to use correct endpoints with project_id
- Modified DocsTab component to call backend API when deleting documents
- Documents now properly persist deletion after page refresh
The issue was that document deletion was only happening in UI state and never
reached the backend. The service methods were using incorrect API endpoints
that didn't include the required project_id parameter.
* Add comprehensive test coverage for document CRUD operations
- Add Document interface for type safety
- Fix error messages to include projectId context
- Add unit tests for all projectService document methods
- Add integration tests for DocsTab deletion flow
- Update vitest config to include new test files
* MCP server consolidation and simplification
- Consolidated multiple MCP modules into unified project_module
- Removed redundant project, task, document, and version modules
- Identified critical issue with async project creation losing context
- Updated CLAUDE.md with project instructions
This commit captures the current state before refactoring to split
consolidated tools into separate operations for better clarity and
to solve the async project creation context issue.
* Improve MCP tool usability and documentation
- Fix parameter naming confusion in RAG tools (source → source_domain)
- Add clarification that source_domain expects domain names not IDs
- Improve manage_versions documentation with clear examples
- Add better error messages for validation failures
- Enhance manage_document with non-PRP examples
- Add comprehensive documentation to get_project_features
- Fix content parameter type in manage_versions to accept Any type
These changes address usability issues discovered during testing without
breaking existing functionality.
* Refactor MCP server structure and add separate project tools
- Rename src/mcp to src/mcp_server for clarity
- Update all internal imports to use new path
- Create features/projects directory for modular tool organization
- Add separate, simple project tools (create, list, get, delete, update)
- Keep consolidated tools for backward compatibility (via env var)
- Add USE_SEPARATE_PROJECT_TOOLS env var to toggle between approaches
The new separate tools:
- Solve the async project creation context loss issue
- Provide clearer, single-purpose interfaces
- Remove complex PRP examples for simplicity
- Handle project creation polling automatically
* Update bug_report.yml
Changing Archon Alpha to Beta in the issue template
* Update README.md
Added note in the README
* Fix project cards horizontal scrollbar visibility (#295)
Addresses issue #293 by replacing hide-scrollbar with scrollbar-thin
class to ensure users can see and interact with the horizontal scrollbar
when project cards overflow.
* Fix missing feature field in project tasks API response
Resolves issue #282 by adding feature field to task dictionary in
TaskService.list_tasks() method. The project tasks API endpoint was
excluding the feature field while individual task API included it,
causing frontend to default to 'General' instead of showing custom
feature values.
Changes:
- Add feature field to task response in list_tasks method
- Maintains compatibility with existing API consumers
- All 212 tests pass with this change
* Remove consolidated project module in favor of separated tools
The consolidated project module contained all project, task, document,
version, and feature management in a single 922-line file. This has been
replaced with focused, single-purpose tools in separate modules.
* Remove feature flags from Docker configuration
Removed USE_SEPARATE_PROJECT_AND_TASK_TOOLS and PROJECTS_ENABLED
environment variables as the separated tools are now the default.
* Add document and version management tools
Extract document management functionality into focused tools:
- create_document: Create new documents with metadata
- list_documents: List all documents in a project
- get_document: Retrieve specific document details
- update_document: Modify existing documents
- delete_document: Remove documents from projects
Extract version control functionality:
- create_version: Create immutable snapshots
- list_versions: View version history
- get_version: Retrieve specific version content
- restore_version: Rollback to previous versions
Includes improved documentation and error messages based on testing.
* Add task management tools with smart routing
Extract task functionality into focused tools:
- create_task: Create tasks with sources and code examples
- list_tasks: List tasks with project/status filtering
- get_task: Retrieve task details
- update_task: Modify task properties
- delete_task: Archive tasks (soft delete)
Preserves intelligent endpoint routing:
- Project-specific: /api/projects/{id}/tasks
- Status filtering: /api/tasks?status=X
- Assignee filtering: /api/tasks?assignee=X
* Add feature management tool for project capabilities
Extract get_project_features as a standalone tool with enhanced
documentation explaining feature structures and usage patterns.
Features track functional components like auth, api, and database.
* Update project tools to use simplified approach
Remove complex PRP validation logic and focus on core functionality.
Maintains backward compatibility with existing API endpoints.
* Register all separated tools in MCP server
Update MCP server to use the new modular tool structure:
- Projects and tasks from existing modules
- Documents and versions from new modules
- Feature management from standalone module
Remove all feature flag logic as separated tools are now default.
* Update MCP Dockerfile to support new module structure
Create documents directory and ensure all new modules are properly
included in the container build.
* Clean up unused imports in RAG module
Remove import of deleted project_module.
* Fix type errors and remove trailing whitespace
- Add explicit type annotations for params dictionaries to resolve mypy errors
- Remove trailing whitespace from blank lines (W293 ruff warnings)
- Ensure type safety in task_tools.py and document_tools.py
* Add comprehensive unit tests for MCP server features
- Create test structure mirroring features folder organization
- Add tests for document tools (create, list, update, delete)
- Add tests for version tools (create, list, restore, invalid field handling)
- Add tests for task tools (create with sources, list with filters, update, delete)
- Add tests for project tools (create with polling, list, get)
- Add tests for feature tools (get features with various structures)
- Mock HTTP client for all external API calls
- Test both success and error scenarios
- 100% test coverage for critical tool functions
* Updating the Logo for Archon
* Add Stage 1 workflow for external PR info collection
- Collects PR information without requiring secrets
- Triggers on pull_request events and @claude-review-ext comments
- Uploads PR details as artifact for secure processing
* Add Stage 2 secure review workflow for external PRs
- Runs after Stage 1 via workflow_run trigger
- Has access to repository secrets
- Downloads PR artifact and performs review
- Maintains security by never checking out fork code
* Add documentation for external PR review workflows
- Explains the two-stage security model
- Provides usage instructions for contributors and maintainers
- Includes troubleshooting and security considerations
* Fix base branch checkout in Stage 2 workflow
- Extract PR base branch from artifact instead of using workflow branch
- Add step to switch to correct base branch after downloading PR info
- Use PR base branch for diff generation instead of workflow branch
* Fix external PR workflow permissions and error handling
- Grant pull-requests write permission for comment posting
- Add try-catch error handling with continue-on-error
- Ensure workflow continues even if comment posting fails
* Simplify authorization for external PR workflows
- Move authorization check to Stage 1 job condition
- Remove complex authorization job from Stage 2
- Fix duplicate exec declaration error
- Add unauthorized user message handling in Stage 1
- Trust Stage 1's authorization in Stage 2
* Fix exec declaration error in Stage 2 workflow
- Replace async exec with execSync to avoid declaration issues
- Add proper error handling for artifact extraction
- Use childProcess module directly instead of destructuring
* Fix Claude Code Action authentication and context issues
- Remove invalid pr_number parameter
- Add explicit github_token to fix OIDC failure in workflow_run
- Add mode: review for proper review mode
- Create fake event.json to provide PR context
- Set environment variables to simulate PR event
* Remove invalid mode parameter and improve event context
- Remove invalid mode: review parameter
- Update event context to simulate issue_comment
- Add direct_prompt to guide Claude to review the diff
- Update instructions to use Read tool for pr-diff.patch
* Simplify fork PR review to single workflow with pull_request_target
- Remove complex two-stage workflow approach
- Use pull_request_target with security safeguards
- Add first-time contributor check and approval requirement
- Never checkout PR code - only analyze diff
- Mirror full review format from main claude-review workflow
- Manual trigger via @claude-review-fork for maintainers
* Fix Claude checkout issue for forked PRs
- Add environment overrides to prevent PR branch checkout
- Add explicit github_token for authentication
- Add direct_prompt to guide Claude to use diff file
- Override GITHUB_REF and GITHUB_SHA to stay on base branch
* Override event context to prevent PR checkout
- Set GITHUB_EVENT_NAME to workflow_dispatch to avoid PR detection
- Use override_prompt instead of direct_prompt for better control
- Create wrapper script for debugging
- Explicitly tell Claude not to checkout code
* fix: Restructure fork review workflow to avoid PR branch checkout
- Create isolated review context directory to prevent PR detection
- Move diff to changes.patch file in review-context directory
- Add explicit REVIEW_INSTRUCTIONS.md file for guidance
- Use standard 'prompt' parameter instead of 'override_prompt'
- This approach should prevent Claude Action from auto-detecting PR context
* Remove claude-review-fork.yml workflow
The Claude Code Action is not compatible with reviewing PRs from forks.
It always attempts to checkout the PR branch which fails for security reasons.
* Fix: Allow HTTP for local Supabase connections (#323)
- Modified validate_supabase_url() to allow HTTP for local development
- HTTP is now allowed for localhost, 127.0.0.1, host.docker.internal, and 0.0.0.0
- HTTPS is still required for production/non-local environments
- Fixes server startup failure when using local Supabase with Docker
* Update README.md (#332)
Makes it easier to copy, paste & run the command in a single shot
* feat(mcp): Add robust error handling and timeout configuration
Critical improvements to MCP server reliability and client experience:
Error Handling:
- Created MCPErrorFormatter for consistent error responses across all tools
- Provides structured errors with type, message, details, and actionable suggestions
- Helps clients (like Claude Code) understand and handle failures gracefully
- Categorizes errors (connection_timeout, validation_error, etc.) for better debugging
Timeout Configuration:
- Centralized timeout config with environment variable support
- Different timeouts for regular operations vs polling operations
- Configurable via MCP_REQUEST_TIMEOUT, MCP_CONNECT_TIMEOUT, etc.
- Prevents indefinite hangs when services are unavailable
Module Registration:
- Distinguishes between ImportError (acceptable) and code errors (must fix)
- SyntaxError/NameError/AttributeError now halt execution immediately
- Prevents broken code from silently failing in production
Polling Safety:
- Fixed project creation polling with exponential backoff
- Handles API unavailability with proper error messages
- Maximum attempts configurable via MCP_MAX_POLLING_ATTEMPTS
Response Normalization:
- Fixed inconsistent response handling in list_tasks
- Validates and normalizes different API response formats
- Clear error messages when response format is unexpected
These changes address critical issues from PR review while maintaining
backward compatibility. All 20 existing tests pass.
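A minimal sketch of the env-driven configuration described above; the variable names come from this commit message, while the defaults and helper are assumptions:

import os

def _env_int(name: str, default: int) -> int:
    # Invalid values fall back to the default rather than crashing at import time
    try:
        return int(os.getenv(name, str(default)))
    except ValueError:
        return default

MCP_REQUEST_TIMEOUT = _env_int("MCP_REQUEST_TIMEOUT", 30)
MCP_CONNECT_TIMEOUT = _env_int("MCP_CONNECT_TIMEOUT", 10)
MCP_MAX_POLLING_ATTEMPTS = _env_int("MCP_MAX_POLLING_ATTEMPTS", 30)

def backoff_delay(attempt: int) -> float:
    # Exponential backoff between polling attempts, capped at 10 seconds
    return min(0.5 * (2 ** attempt), 10.0)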
* refactor(mcp): Apply consistent error handling to all MCP tools
Comprehensive update to MCP server error handling:
Error Handling Improvements:
- Applied MCPErrorFormatter to all remaining MCP tool files
- Replaced all hardcoded timeout values with configurable timeout system
- Converted all simple string errors to structured error format
- Added proper httpx exception handling with detailed context
Tools Updated:
- document_tools.py: All 5 document management tools
- version_tools.py: All 4 version management tools
- feature_tools.py: Project features tool
- project_tools.py: Remaining 3 project tools (get, list, delete)
- task_tools.py: Remaining 4 task tools (get, list, update, delete)
Test Improvements:
- Removed backward compatibility checks from all tests
- Tests now enforce structured error format (dict not string)
- Any string error response is now considered a bug
- All 20 tests passing with new strict validation
This completes the error handling refactor for all MCP tools,
ensuring consistent client experience and better debugging.
* fix(mcp): Address all priority actions from PR review
Based on latest PR #306 review feedback:
Fixed Issues:
- Replaced last remaining basic error handling with MCPErrorFormatter
in version_tools.py get_version function
- Added proper error handling for invalid env vars in get_max_polling_attempts
- Improved type hints with TaskUpdateFields TypedDict for better validation
- All tools now consistently use get_default_timeout() (verified with grep)
Test Improvements:
- Added comprehensive tests for MCPErrorFormatter utility (10 tests)
- Added tests for timeout_config utility (13 tests)
- All 43 MCP tests passing with new utilities
- Tests verify structured error format and timeout configuration
Type Safety:
- Created TaskUpdateFields TypedDict to specify exact allowed fields
- Documents valid statuses and assignees in type comments
- Improves IDE support and catches type errors at development time
This completes all priority actions from the review:
✅ Fixed inconsistent timeout usage (was already done)
✅ Fixed error handling inconsistency
✅ Improved type hints for update_fields
✅ Added tests for utility modules
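A minimal sketch of the TaskUpdateFields idea mentioned above; the exact field set is an assumption, while the statuses follow the database values used throughout this PR:

from typing import TypedDict

class TaskUpdateFields(TypedDict, total=False):
    # total=False makes every field optional, matching partial updates
    title: str
    description: str
    status: str    # "todo" | "doing" | "review" | "done"
    assignee: str
    feature: str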
* style: Apply linting fixes and formatting
Applied automated linting and formatting:
- Fixed missing newlines at end of files
- Adjusted line wrapping for better readability
- Fixed multi-line string formatting in tests
- No functional changes, only style improvements
All 43 tests still passing after formatting changes.
* Update docker-compose.yml
Adding host.docker.internal:host-gateway to Docker Compose for the server and agents.
* Update README.md (#410)
added star history at end of readme
* Important updates to CONTRIBUTING.md and README
* fix: Allow HTTP for all private network ranges in Supabase URLs (#417)
* fix: Allow HTTP for all private network ranges in Supabase URLs
- Extend HTTP support to all RFC 1918 private IP ranges
- Class A: 10.0.0.0 to 10.255.255.255 (10.0.0.0/8)
- Class B: 172.16.0.0 to 172.31.255.255 (172.16.0.0/12)
- Class C: 192.168.0.0 to 192.168.255.255 (192.168.0.0/16)
- Also includes link-local (169.254.0.0/16) addresses
- Uses Python's ipaddress module for robust IP validation
- Maintains HTTPS requirement for public/production URLs
- Backwards compatible with existing localhost exceptions
* security: Fix URL validation vulnerabilities
- Replace substring matching with exact hostname matching to prevent bypass attacks
- Exclude unspecified address (0.0.0.0) from allowed HTTP hosts
- Add support for .localhost domains per RFC 6761
- Improve error messages with hostname context for better debugging
Addresses security concerns raised in PR review regarding:
- Malicious domains like 'localhost.attacker.com' bypassing HTTPS requirements
- Unspecified address being incorrectly allowed as valid connection target
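A minimal sketch of the resulting validation rule; the helper name and exact policy details are assumptions for illustration:

import ipaddress
from urllib.parse import urlparse

def http_allowed_for(url: str) -> bool:
    # Exact-hostname checks avoid substring bypasses like "localhost.attacker.com";
    # ".localhost" subdomains are local per RFC 6761
    host = urlparse(url).hostname or ""
    if host in ("localhost", "host.docker.internal") or host.endswith(".localhost"):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # public hostname: HTTPS required
    if ip.is_unspecified:
        return False  # 0.0.0.0 is not a valid connection target
    # RFC 1918 private ranges plus link-local (169.254.0.0/16)
    return ip.is_private or ip.is_link_local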
---------
Co-authored-by: tazmon95 <tazmon95@users.noreply.github.com>
Co-authored-by: root <root@supatest2.jtpa.net>
* Fix business document categorization bug
- Fixed missing knowledge_type and tags parameters in DocumentStorageService.upload_document()
- Added source_type='file' to document chunk metadata for proper categorization
- Enhanced source metadata creation to include source_type based on source_id pattern
- Fixed metadata spread order in knowledge_item_service to prevent source_type override
- Business documents now correctly show pink color theme and appear in Business Documents section
Fixes issue where business documents were incorrectly stored as technical knowledge
and appeared with blue color theme instead of pink.
* feat: Add Gemini CLI support to MCPPage and IDEGlobalRules
* fix(mcp): Fix update_task signature and MCP instructions
Resolves #420 - Tasks being duplicated instead of updated
Changes:
1. Fixed update_task function signature to use individual optional parameters
- Changed from TypedDict to explicit parameters (title, status, etc.)
- Consistent with update_project and update_document patterns
- Builds update_fields dict internally from provided parameters
2. Updated MCP instructions with correct function names
- Replaced non-existent manage_task with actual functions
- Added complete function signatures for all tools
- Improved workflow documentation with concrete examples
This fixes the issue where AI agents were confused by:
- Wrong function names in instructions (manage_task vs update_task)
- Inconsistent parameter patterns across update functions
- TypedDict magic that wasn't clearly documented
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* test(mcp): Update tests for new update_task signature
- Fixed test_update_task_status to use individual parameters
- Added test_update_task_no_fields for validation testing
- All MCP tests passing (44 tests)
* style(mcp): Clean up whitespace in MCP instructions
- Remove trailing whitespace
- Consistent formatting in instruction text
* Fix crawler timeout for JavaScript-heavy documentation sites
Remove wait_for='body' selector from documentation site crawling config.
The body element exists immediately in HTML, causing unnecessary timeouts
for JavaScript-rendered content. Now relies on domcontentloaded event
and delay_before_return_html for proper JavaScript execution.
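A sketch of the resulting crawler configuration, assuming crawl4ai's CrawlerRunConfig accepts the two parameters named above; the delay value is illustrative:

from crawl4ai import CrawlerRunConfig

doc_site_config = CrawlerRunConfig(
    # No wait_for="body": <body> exists as soon as the HTML arrives, so waiting
    # on it only added timeouts without waiting for JavaScript-rendered content
    delay_before_return_html=2.0,  # give client-side rendering time to finish
)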
* chore: Remove unused imports and fix exception chaining
- Remove unused asyncio imports from batch.py and recursive.py
- Add proper exception chaining with 'from e' to preserve stack traces
* fix: Apply URL transformation before crawling in recursive strategy
- Transform URLs to raw content (e.g., GitHub blob -> raw) before sending to crawler
- Maintain mapping dictionary to preserve original URLs in results
- Align progress callback signatures between batch and recursive strategies
- Add safety guards for missing links attribute
- Remove unused loop counter in batch strategy
- Optimize binary file checks to avoid duplicate calls
This ensures GitHub files are crawled as raw content instead of HTML pages,
fixing the issue where content extraction was degraded due to HTML wrapping.
* Improve development environment with Docker Compose profiles (#435)
* Add improved development environment with backend in Docker and frontend locally
- Created dev.bat script to run backend services in Docker and frontend locally
- Added docker-compose.backend.yml for backend-only Docker setup
- Updated package.json to run frontend on port 3737
- Fixed api.ts to use default port 8181 instead of throwing error
- Script automatically stops production containers to avoid port conflicts
- Provides instant HMR for frontend development
* Refactor development environment setup: replace dev.bat with Makefile for cross-platform support and enhanced commands
* Enhance development environment: add environment variable checks and update test commands for frontend and backend
* Improve development environment with Docker Compose profiles
This commit enhances the development workflow by replacing the separate
docker-compose.backend.yml file with Docker Compose profiles, fixing
critical service discovery issues, and adding comprehensive developer
tooling through an improved Makefile system.
Key improvements:
- Replace docker-compose.backend.yml with cleaner profile approach
- Fix service discovery by maintaining consistent container names
- Fix port mappings (3737:3737 instead of 3737:5173)
- Add make doctor for environment validation
- Fix port configuration and frontend HMR
- Improve error handling with .SHELLFLAGS in Makefile
- Add comprehensive port configuration via environment variables
- Simplify make dev-local to only run essential services
- Add logging directory creation for local development
- Document profile strategy in docker-compose.yml
These changes provide three flexible development modes:
- Hybrid mode (default): Backend in Docker, frontend local with HMR
- Docker mode: Everything in Docker for production-like testing
- Local mode: API server and UI run locally
Co-authored-by: Zak Stam <zaksnet@users.noreply.github.com>
* Fix make stop command to properly handle Docker Compose profiles
The stop command now explicitly specifies all profiles to ensure
all containers are stopped regardless of how they were started.
* Fix README to document correct make commands
- Changed 'make lint' to 'make lint-frontend' and 'make lint-backend'
- Removed non-existent 'make logs-server' command
- Added 'make watch-mcp' and 'make watch-agents' commands
- All documented make commands now match what's available in Makefile
* fix: Address critical issues from code review #435
- Create robust environment validation script (check-env.js) that properly parses .env files
- Fix Docker healthcheck port mismatch (5173 -> 3737)
- Remove hard-coded port flags from package.json to allow environment configuration
- Fix Docker detection logic using /.dockerenv instead of HOSTNAME
- Normalize container names to lowercase (archon-server, archon-mcp, etc.)
- Improve stop-local command with port-based fallback for process killing
- Fix API configuration fallback chain to include VITE_PORT
- Fix Makefile shell variable expansion using runtime evaluation
- Update .PHONY targets with comprehensive list
- Add --profile flags to Docker Compose commands in README
- Add VITE_ARCHON_SERVER_PORT to docker-compose.yml
- Add Node.js 18+ to prerequisites
- Use dynamic ports in Makefile help messages
- Add lint alias combining frontend and backend linting
- Update .env.example documentation
- Scope .gitignore logs entry to /logs/
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix container name resolution for MCP server
- Add dynamic container name resolution with three-tier strategy
- Support environment variables for custom container names
- Add service discovery labels to docker-compose services
- Update BackendStartupError with correct container name references
* Fix frontend test failures in API configuration tests
- Update environment variable names to use VITE_ prefix that matches production code
- Fix MCP client service tests to use singleton instance export
- Update default behavior tests to expect fallback to port 8181
- All 77 frontend tests now pass
* Fix make stop-local to avoid Docker daemon interference
Replace aggressive kill -9 with targeted process termination:
- Filter processes by command name (node/vite/python/uvicorn) before killing
- Use graceful SIGTERM instead of SIGKILL
- Add process verification to avoid killing Docker-related processes
- Improve logging with descriptive step messages
* refactor: Simplify development workflow based on comprehensive review
- Reduced Makefile from 344 lines (43 targets) to 83 lines (8 essential targets)
- Removed unnecessary environment variables (*_CONTAINER_NAME variables)
- Fixed Windows compatibility by removing Unix-specific commands
- Added security fixes to check-env.js (path validation)
- Simplified MCP container discovery to use fixed container names
- Fixed 'make stop' to properly handle Docker Compose profiles
- Updated documentation to reflect simplified workflow
- Restored original .env.example with comprehensive Supabase key documentation
This addresses all critical issues from code review:
- Cross-platform compatibility ✅
- Security vulnerabilities fixed ✅
- 81% reduction in complexity ✅
- Maintains all essential functionality ✅
All tests pass: Frontend (77/77), Backend (267/267)
* feat: Add granular test and lint commands to Makefile
- Split test command into test-fe and test-be for targeted testing
- Split lint command into lint-fe and lint-be for targeted linting
- Keep original test and lint commands that run both
- Update help text with new commands for better developer experience
* feat: Improve Docker Compose detection and prefer modern syntax
- Prefer 'docker compose' (plugin) over 'docker-compose' (standalone)
- Add better error handling in Makefile with proper exit on failures
- Add Node.js check before running environment scripts
- Pass environment variables correctly to frontend in hybrid mode
- Update all documentation to use modern 'docker compose' syntax
- Auto-detect which Docker Compose version is available
* docs: Update CONTRIBUTING.md to reflect simplified development workflow
- Add Node.js 18+ as prerequisite for hybrid development
- Mark Make as optional throughout the documentation
- Update all docker-compose commands to modern 'docker compose' syntax
- Add Make command alternatives for testing (make test, test-fe, test-be)
- Document make dev for hybrid development mode
- Remove linting requirements until codebase errors are resolved
* fix: Rename frontend service to archon-frontend for consistency
Aligns frontend service naming with other services (archon-server, archon-mcp, archon-agents) for better consistency in Docker image naming patterns.
---------
Co-authored-by: Zak Stam <zakscomputers@hotmail.com>
Co-authored-by: Zak Stam <zaksnet@users.noreply.github.com>
* Fixed the creation and saving of docs in the project management area
* Fix logging error in threading service
Fixed TypeError when passing custom fields to Python logger by using the 'extra' parameter instead of direct keyword arguments. This resolves embedding creation failures during crawl operations.
* Apply linting fixes for better code formatting
- Added trailing commas for multi-line function calls
- Improved line breaks for better readability
* Complete logging fixes for all statements in threading service
Applied the extra parameter pattern to all remaining logging statements (11 more) to ensure consistency and prevent runtime errors when any code path is executed. This completes the fix for the entire file.
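For reference, a minimal sketch of the stdlib pattern behind this fix (field names are illustrative, not the actual threading-service code):

```python
import logging

logger = logging.getLogger(__name__)

# Passing custom fields as direct keyword arguments raises TypeError:
#   logger.info("Creating embeddings", batch_size=32)
# The stdlib logger only accepts them via the 'extra' parameter, which
# attaches each key as an attribute on the emitted LogRecord:
logger.info("Creating embeddings", extra={"batch_size": 32, "source_id": "docs"})
```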
* Remove deprecated PRP testing scripts and dead code
- Removed python/src/server/testing/ folder containing deprecated test utilities
- These PRP viewer testing tools were used during initial development
- No longer needed as functionality has been integrated into main codebase
- No dependencies or references found in production code
* Remove original_archon folder from main branch
- Removed the original_archon/ directory containing the legacy Archon v1-v6 iterations
- This was the original AI agent builder system before the pivot to the current architecture
- The folder has been preserved in the 'preserve-original-archon' branch for historical reference
- Reduces repository size by ~5.2MB and removes confusion about which codebase is active
* Fix proxy when specifying PROD=true
* Add a test
* Parse before passing to vite
* PR Feedback
* Add titles to action buttons in DocumentCard, DraggableTaskCard, and TaskTableView components.
* feat(ui): add accessibility attributes to action buttons
Add type, aria-label, and aria-hidden attributes to action and icon buttons across task and document components to improve accessibility and assistive technology support.
* Fix Docker Compose default profile and error documentation
- Add 'default' profile to all services so 'docker compose up --build -d' works without --profile flag
- Update BackendStartupError.tsx to include '--profile full' in Docker command examples
- Update docker-compose.yml comments to document the new default behavior
This allows users to run either:
- docker compose up --build -d (uses default profile, starts all services)
- docker compose --profile full up --build -d (explicit profile, same result)
- docker compose --profile backend up --build -d (backend services only)
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Use generic YOUR_PROFILE placeholder instead of hardcoded 'full' profile
This allows users to choose their preferred profile (backend, full, etc.) rather than
assuming they always want the full profile for error recovery.
* Remove profiles from all services to enable default startup
- Remove profile restrictions from all services so they start with 'docker compose up'
- All services now run by default without requiring --profile flags
- Profile functionality removed - users now use default behavior only
- This enables the requested 'docker compose up --build -d' workflow
* Update error modal to use 'full' profile with helpful note
- Change from generic YOUR_PROFILE to specific 'full' profile
- Add note explaining users can replace 'full' if needed
- Maintains clarity while providing flexibility for different profiles
* Merge UX improvements from PR #443
- Update error modal to show default 'docker compose up --build -d' command
- Add better organized note structure with bullet points
- Include profile-specific fallback example for existing users
- Update README Quick Start to show default command first
- Maintain backward compatibility guidance for profile users
* Fix critical token consumption issue in list endpoints (#488)
- Add include_content parameter to ProjectService.list_projects()
- Add exclude_large_fields parameter to TaskService.list_tasks()
- Add include_content parameter to DocumentService.list_documents()
- Update all MCP tools to use lightweight responses by default
- Fix critical N+1 query problem in ProjectService (was making separate query per project)
- Add response size monitoring and logging for validation
- Add comprehensive unit and integration tests
Results:
- Projects endpoint: 99.3% token reduction (27,055 -> 194 tokens)
- Tasks endpoint: 98.2% token reduction (12,750 -> 226 tokens)
- Documents endpoint: Returns metadata with content_size instead of full content
- Maintains full backward compatibility with default parameters
- Single query optimization eliminates N+1 performance issue
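A hedged sketch of the opt-in pattern; table and column names are illustrative, not the actual Archon schema:

```python
# Lightweight column list returned by default to MCP clients.
LIGHT_COLUMNS = "id, title, description, created_at, updated_at"

def list_projects(supabase, include_content: bool = False) -> list[dict]:
    # Heavy content fields are only selected when explicitly requested,
    # which is what drives the ~99% token reduction described above.
    columns = "*" if include_content else LIGHT_COLUMNS
    return supabase.table("projects").select(columns).execute().data
```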
* fix: include_archived flag now works correctly in task listing
- Add include_archived parameter to TaskService.list_tasks()
- Service now conditionally applies archived filter based on parameter
- Add 'archived' field to task DTO for client visibility
- Update API endpoints to pass include_archived down to service
- Remove redundant client-side filtering in API layer
- Fix type hints in integration tests (dict[str, Any] | None)
- Use pytest.skip() instead of return for proper test reporting
These fixes address the functional bug identified by CodeRabbit where
archived tasks couldn't be retrieved even when explicitly requested.
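A minimal sketch of the conditional filter, with illustrative names:

```python
def list_tasks(supabase, include_archived: bool = False) -> list[dict]:
    query = supabase.table("tasks").select("*")
    if not include_archived:
        # Default behavior still hides archived tasks; callers now opt in
        # instead of the API layer silently filtering them back out.
        query = query.eq("archived", False)
    return query.execute().data
```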
* feat: disable agents service by default using Docker profiles
- Add 'agents' profile to archon-agents service
- Remove archon-agents as dependency from archon-mcp service
- Service now only starts with --profile agents flag
- Prevents startup issues while agents service is under development
- All core functionality continues to work without agents
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: address PR review feedback for agents service disabling
- Fix misleading profile documentation at top of docker-compose.yml
- Add AGENTS_ENABLED flag for cleaner agent service handling
- Make AGENTS_SERVICE_URL configurable via environment variable
- Prevent noisy connection errors when agents service isn't running
This provides a cleaner way to disable the agents service and allows
the application to skip agent wiring when AGENTS_ENABLED=false.
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
* Add PRP story task template and reorganize PRP commands (#508)
* Reorganize PRP commands and add story task template
- Move PRP commands to dedicated subdirectories
- Add new agent definitions for codebase analysis and library research
- Create story task PRP template for user story implementation
- Rename prp-base.md to prp_base.md for consistency
* Update .claude/agents/library-researcher.md
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Update .claude/commands/prp-claude-code/prp-story-task-create.md
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Update .claude/commands/prp-claude-code/prp-story-task-create.md
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Update PRPs/templates/prp_story_task.md
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Update PRPs/templates/prp_story_task.md
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
---------
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* fix: resolve container startup race condition in agents service (#451) (#503)
* depends_on and env var added
Update Vite configuration to enable allowed hosts
- Uncommented the allowedHosts configuration to allow for dynamic host settings based on environment variables.
- This change enhances flexibility for different deployment environments while maintaining the default localhost and specific domain access.
Needs testing to confirm proper functionality with various host configurations.
rm my domain
* Enhance Vite configuration with dynamic allowed hosts support
- Added VITE_ALLOWED_HOSTS environment variable to .env.example and docker-compose.yml for flexible host configuration.
- Updated Vite config to dynamically set allowed hosts, incorporating defaults and custom values from the environment variable.
- This change improves deployment flexibility while maintaining security by defaulting to localhost and specific domains.
Needs testing to confirm proper functionality with various host configurations.
* refactor: remove unnecessary dependency on archon-agents in docker-compose.yml
- Removed the dependency condition for archon-agents from the archon-mcp service to streamline the startup process.
- This change simplifies the service configuration and reduces potential startup issues related to agent service health checks.
Needs testing to ensure that the application functions correctly without the archon-agents dependency.
---------
Co-authored-by: Julian Gegenhuber <office@salzkammercode.at>
* Fix race condition in concurrent crawling with unique source IDs (#472)
* Fix race condition in concurrent crawling with unique source IDs
- Add unique hash-based source_id generation to prevent conflicts
- Separate source identification from display with three fields:
- source_id: 16-char SHA256 hash for unique identification
- source_url: Original URL for tracking
- source_display_name: Human-friendly name for UI
- Add comprehensive test suite validating the fix
- Migrate existing data with backward compatibility
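A minimal sketch of the hash-based ID scheme described above (the function name is illustrative):

```python
import hashlib

def generate_source_id(url: str) -> str:
    # A stable 16-char ID derived from the URL, so two concurrent crawls
    # of different URLs can never collide on source_id the way
    # domain-based IDs could.
    return hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
```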
* Fix title generation to use source_display_name for better AI context
- Pass source_display_name to title generation function
- Use display name in AI prompt instead of hash-based source_id
- Results in more specific, meaningful titles for each source
* Skip AI title generation when display name is available
- Use source_display_name directly as title to avoid unnecessary AI calls
- More efficient and predictable than AI-generated titles
- Keep AI generation only as fallback for backward compatibility
* Fix critical issues from code review
- Add missing os import to prevent NameError crash
- Remove unused imports (pytest, Mock, patch, hashlib, urlparse, etc.)
- Fix GitHub API capitalization consistency
- Reuse existing DocumentStorageService instance
- Update test expectations to match corrected capitalization
Addresses CodeRabbit review feedback on PR #472
* Add safety improvements from code review
- Truncate display names to 100 chars when used as titles
- Document hash collision probability (negligible for <1M sources)
Simple, pragmatic fixes per KISS principle
* Fix code extraction to use hash-based source_ids and improve display names
- Fixed critical bug where code extraction was using old domain-based source_ids
- Updated code extraction service to accept source_id as parameter instead of extracting from URL
- Added special handling for llms.txt and sitemap.xml files in display names
- Added comprehensive tests for source_id handling in code extraction
- Removed unused urlparse import from code_extraction_service.py
This fixes the foreign key constraint errors that were preventing code examples
from being stored after the source_id architecture refactor.
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix critical variable shadowing and source_type determination issues
- Fixed variable shadowing in document_storage_operations.py where source_url parameter
was being overwritten by document URLs, causing incorrect source_url in database
- Fixed source_type determination to use actual URLs instead of hash-based source_id
- Added comprehensive tests for source URL preservation
- Ensure source_type is correctly set to "file" for file uploads, "url" for web crawls
The variable shadowing bug was causing sitemap sources to have the wrong source_url
(last crawled page instead of sitemap URL). The source_type bug would mark all
sources as "url" even for file uploads due to hash-based IDs not starting with "file_".
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix URL canonicalization and document metrics calculation
- Implement proper URL canonicalization to prevent duplicate sources
- Remove trailing slashes (except root)
- Remove URL fragments
- Remove tracking parameters (utm_*, gclid, fbclid, etc.)
- Sort query parameters for consistency
- Remove default ports (80 for HTTP, 443 for HTTPS)
- Normalize scheme and domain to lowercase
- Fix avg_chunks_per_doc calculation to avoid division by zero
- Track processed_docs count separately from total crawl_results
- Handle all-empty document sets gracefully
- Show processed/total in logs for better visibility
- Add comprehensive tests for both fixes
- 10 test cases for URL canonicalization edge cases
- 4 test cases for document metrics calculation
This prevents database constraint violations when crawling the same
content with URL variations and provides accurate metrics in logs.
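A sketch of canonicalization rules like those listed above, using only the stdlib (the real implementation may differ in details):

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"gclid", "fbclid"}

def canonicalize_url(url: str) -> str:
    parts = urlparse(url)
    # Normalize scheme and domain to lowercase.
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()
    # Remove default ports (80 for HTTP, 443 for HTTPS).
    if (scheme, parts.port) in (("http", 80), ("https", 443)):
        netloc = netloc.rsplit(":", 1)[0]
    # Drop tracking parameters and sort the rest for a stable order.
    query = urlencode(sorted(
        (k, v) for k, v in parse_qsl(parts.query)
        if k not in TRACKING_PARAMS and not k.startswith(TRACKING_PREFIXES)
    ))
    # Remove trailing slashes (except root) and discard the fragment.
    path = parts.path.rstrip("/") or "/"
    return urlunparse((scheme, netloc, path, "", query, ""))
```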
* Fix synchronous extract_source_summary blocking async event loop
- Run extract_source_summary in thread pool using asyncio.to_thread
- Prevents blocking the async event loop during AI summary generation
- Preserves exact error handling and fallback behavior
- Variables (source_id, combined_content) properly passed to thread
Added comprehensive tests verifying:
- Function runs in thread without blocking
- Error handling works correctly with fallback
- Multiple sources can be processed
- Thread safety with variable passing
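The general shape of this fix, as a minimal sketch with a stand-in for the blocking helper:

```python
import asyncio

def extract_source_summary(source_id: str, combined_content: str) -> str:
    ...  # synchronous AI call that would otherwise block the event loop

async def build_summary(source_id: str, combined_content: str) -> str:
    # asyncio.to_thread runs the blocking function in a worker thread and
    # awaits its result, keeping the event loop free for other coroutines.
    return await asyncio.to_thread(extract_source_summary, source_id, combined_content)
```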
* Fix synchronous update_source_info blocking async event loop
- Run update_source_info in thread pool using asyncio.to_thread
- Prevents blocking the async event loop during database operations
- Preserves exact error handling and fallback behavior
- All kwargs properly passed to thread execution
Added comprehensive tests verifying:
- Function runs in thread without blocking
- Error handling triggers fallback correctly
- All kwargs are preserved when passed to thread
- Existing extract_source_summary tests still pass
* Fix race condition in source creation using upsert
- Replace INSERT with UPSERT for new sources to prevent PRIMARY KEY violations
- Handles concurrent crawls attempting to create the same source
- Maintains existing UPDATE behavior for sources that already exist
Added comprehensive tests verifying:
- Concurrent source creation doesn't fail
- Upsert is used for new sources (not insert)
- Update is still used for existing sources
- Async concurrent operations work correctly
- Race conditions with delays are handled
This prevents database constraint errors when multiple crawls target
the same URL simultaneously.
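A hedged sketch of the upsert pattern using the supabase-py client (table and column names are illustrative):

```python
def create_or_update_source(supabase, source_id: str, source_url: str, display_name: str) -> None:
    # upsert() issues INSERT ... ON CONFLICT DO UPDATE, so two concurrent
    # crawls creating the same source both succeed instead of one hitting
    # a PRIMARY KEY violation.
    supabase.table("sources").upsert(
        {
            "source_id": source_id,
            "source_url": source_url,
            "source_display_name": display_name,
        },
        on_conflict="source_id",
    ).execute()
```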
* Add migration detection UI components
Add MigrationBanner component with clear user instructions for database schema updates. Add useMigrationStatus hook for periodic health check monitoring with graceful error handling.
* Integrate migration banner into main app
Add migration status monitoring and banner display to App.tsx. Shows migration banner when database schema updates are required.
* Enhance backend startup error instructions
Add detailed Docker restart instructions and migration script guidance. Improves user experience when encountering startup failures.
* Add database schema caching to health endpoint
Implement smart caching for schema validation to prevent repeated database queries. Cache successful validations permanently and throttle failures to 30-second intervals. Replace debug prints with proper logging.
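A minimal sketch of this caching policy (helper names are hypothetical):

```python
import time

_schema_ok = False
_last_failed_check = 0.0

async def run_schema_check() -> bool:
    ...  # stand-in for the real validation query against the database

async def schema_is_valid() -> bool:
    # A successful validation is cached for the process lifetime; failures
    # are rechecked at most every 30 seconds to avoid hammering the DB.
    global _schema_ok, _last_failed_check
    if _schema_ok:
        return True
    now = time.monotonic()
    if now - _last_failed_check < 30.0:
        return False
    _schema_ok = bool(await run_schema_check())
    if not _schema_ok:
        _last_failed_check = now
    return _schema_ok
```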
* Clean up knowledge API imports and logging
Remove duplicate import statements and redundant logging. Improves code clarity and reduces log noise.
* Remove unused instructions prop from MigrationBanner
Clean up component API by removing instructions prop that was accepted but never rendered. Simplifies the interface and eliminates dead code while keeping the functional hardcoded migration steps.
* Add schema_valid flag to migration_required health response
Add schema_valid: false flag to health endpoint response when database schema migration is required. Improves API consistency without changing existing behavior.
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Hotfix - crawls hanging after embedding rate limiting
* Moving Dockerfiles to uv for package installation (#533)
* Moving Dockerfiles to uv for package installation
* Updating uv installation for CI
* Reduced the size of sentence-transformers by making it CPU only, including reranking by default now (#534)
* CI fails now when unit tests for backend fail (#536)
* CI fails now when unit tests for backend fail
* Fixing up a couple unit tests
* Documentation improvements for MCP and README (#540)
* Spacing updates for Make installation in README
* add PRPs/completed/ to gitignore
* add archon-coderabbit-helper slash command
* refactor: Remove Socket.IO and implement HTTP polling architecture (#514)
* refactor: Remove Socket.IO and consolidate task status naming
Major refactoring to simplify the architecture:
1. Socket.IO Removal:
- Removed all Socket.IO dependencies and code (~4,256 lines)
- Replaced with HTTP polling for real-time updates
- Added new polling hooks (usePolling, useDatabaseMutation, etc.)
- Removed socket services and handlers
2. Status Consolidation:
- Removed UI/DB status mapping layer
- Using database values directly (todo, doing, review, done)
- Removed obsolete status types and mapping functions
- Updated all components to use database status values
3. Simplified Architecture:
- Cleaner separation between frontend and backend
- Reduced complexity in state management
- More maintainable codebase
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
* feat: Add loading states and error handling for UI operations
- Added loading overlay when dragging tasks between columns
- Added loading state when switching between projects
- Added proper error handling with toast notifications
- Removed remaining Socket.IO references
- Improved user feedback during async operations
* docs: Add comprehensive polling architecture documentation
Created developer guide explaining:
- Core polling components and hooks
- ETag caching implementation
- State management patterns
- Migration from Socket.IO
- Performance optimizations
- Developer guidelines and best practices
* fix: Correct method name for fetching tasks
- Fixed projectService.getTasks() to projectService.getTasksByProject()
- Ensures consistent naming throughout the codebase
- Resolves error when refreshing tasks after drag operations
* docs: Add comprehensive API naming conventions guide
Created naming standards documentation covering:
- Service method naming patterns
- API endpoint conventions
- Component and hook naming
- State variable naming
- Type definitions
- Common patterns and anti-patterns
- Migration notes from Socket.IO
* docs: Update CLAUDE.md with polling architecture and naming conventions
- Replaced Socket.IO references with HTTP polling architecture
- Added polling intervals and ETag caching documentation
- Added API naming conventions section
- Corrected task endpoint patterns (use getTasksByProject, not getTasks)
- Added state naming patterns and status values
* refactor: Remove Socket.IO and implement HTTP polling architecture
Complete removal of Socket.IO/WebSocket dependencies in favor of simple HTTP polling:
Frontend changes:
- Remove all WebSocket/Socket.IO references from KnowledgeBasePage
- Implement useCrawlProgressPolling hook for progress tracking
- Fix polling hook to prevent ERR_INSUFFICIENT_RESOURCES errors
- Add proper cleanup and state management for completed crawls
- Persist and restore active crawl progress across page refreshes
- Fix agent chat service to handle disabled agents gracefully
Backend changes:
- Remove python-socketio from requirements
- Convert ProgressTracker to in-memory state management
- Add /api/crawl-progress/{id} endpoint for polling
- Initialize ProgressTracker immediately when operations start
- Remove all Socket.IO event handlers and cleanup commented code
- Simplify agent_chat_api to basic REST endpoints
Bug fixes:
- Fix race condition where progress data wasn't available for polling
- Fix memory leaks from recreating polling callbacks
- Fix crawl progress URL mismatch between frontend and backend
- Add proper error filtering for expected 404s during initialization
- Stop polling when crawl operations complete
This change simplifies the architecture significantly and makes it more robust
by removing the complexity of WebSocket connections.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix data consistency issue in crawl completion
- Modify add_documents_to_supabase to return actual chunks stored count
- Update crawl orchestration to validate chunks were actually saved to database
- Throw exception when chunks are processed but none stored (e.g., API key failures)
- Ensure UI shows error state instead of false success when storage fails
- Add proper error field to progress updates for frontend display
This prevents misleading "crawl completed" status when backend fails to store data.
* Consolidate API key access to unified LLM provider service pattern
- Fix credential service to properly store encrypted OpenAI API key from environment
- Remove direct environment variable access pattern from source management service
- Update both extract_source_summary and generate_source_title_and_metadata to async
- Convert all LLM operations to use get_llm_client() for multi-provider support
- Fix callers in document_storage_operations.py and storage_services.py to use await
- Improve title generation prompt with better context and examples for user-readable titles
- Consolidate on single pattern that supports OpenAI, Google, Ollama providers
This fixes embedding service failures while maintaining compatibility for future providers.
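A hedged sketch of the unified pattern; get_llm_client() is the factory named above, assumed here to return an OpenAI-compatible async client regardless of the configured provider:

```python
async def extract_source_summary(source_id: str, combined_content: str) -> str:
    client = get_llm_client()  # provider-agnostic factory named above
    response = await client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; the real model comes from settings
        messages=[{
            "role": "user",
            "content": f"Summarize this source:\n{combined_content[:4000]}",
        }],
    )
    return response.choices[0].message.content
```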
* Fix async/await consistency in source management services
- Make update_source_info async and await it properly
- Fix generate_source_title_and_metadata async calls
- Improve source title generation with URL-based detection
- Remove unnecessary threading wrapper for async operations
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: correct API response handling in MCP project polling
- Fix polling logic to properly extract projects array from API response
- The API returns {projects: [...]} but polling was trying to iterate directly over response
- This caused 'str' object has no attribute 'get' errors during project creation
- Update both create_project polling and list_projects response handling
- Verified all MCP tools now work correctly including create_project
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
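The bug class and its fix, sketched with an illustrative endpoint path:

```python
import httpx

async def list_projects(api_url: str) -> list[dict]:
    async with httpx.AsyncClient() as client:
        resp = await client.get(f"{api_url}/api/projects")
        resp.raise_for_status()
        data = resp.json()
    # The endpoint returns an envelope: {"projects": [...]}. Iterating over
    # the raw response dict yields string keys, which is what produced the
    # "'str' object has no attribute 'get'" error. Unwrap the array first.
    return data.get("projects", [])
```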
* fix: Optimize project switching performance and eliminate task jumping
- Replace race condition-prone polling refetch with direct API calls for immediate task loading (100-200ms vs 1.5-2s)
- Add polling suppression during direct API calls to prevent task jumping from double setTasks() calls
- Clear stale tasks immediately on project switch to prevent wrong data visibility
- Maintain polling for background updates from agents/MCP while optimizing user-initiated actions
Performance improvements:
- Project switches now load tasks in 100-200ms instead of 1.5-2 seconds
- Eliminated visual task jumping during project transitions
- Clean separation: direct calls for user actions, polling for external updates
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Remove race condition anti-pattern and complete Socket.IO removal
Critical fixes addressing code review findings:
**Race Condition Resolution:**
- Remove fragile isLoadingDirectly flag that could permanently disable polling
- Remove competing polling onSuccess callback that caused task jumping
- Clean separation: direct API calls for user actions, polling for external updates only
**Socket.IO Removal:**
- Replace projectCreationProgressService with useProgressPolling HTTP polling
- Remove all Socket.IO dependencies and references
- Complete migration to HTTP-only architecture
**Performance Optimization:**
- Add ETag support to /projects/{project_id}/tasks endpoint for 70% bandwidth savings
- Remove competing TasksTab onRefresh system that caused multiple API calls
- Single source of truth: polling handles background updates, direct calls for immediate feedback
**Task Management Simplification:**
- Remove onRefresh calls from all TasksTab operations (create, update, delete, move)
- Operations now use optimistic updates with polling fallback
- Eliminates 3-way race condition between polling, direct calls, and onRefresh
Result: Fast project switching (100-200ms), no task jumping, clean polling architecture
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Remove remaining Socket.IO and WebSocket references
- Remove WebSocket URL configuration from api.ts
- Clean up WebSocket tests and mocks from test files
- Remove websocket parameter from embedding service
- Update MCP project tools tests to match new API response format
- Add example real test for usePolling hook
- Update vitest config to properly include test files
* Add comprehensive unit tests for polling architecture
- Add ETag utilities tests covering generation and checking logic
- Add progress API tests with 304 Not Modified support
- Add progress service tests for operation tracking
- Add projects API polling tests with ETag validation
- Fix projects API to properly handle ETag check independently of response object
- Test coverage for critical polling components following MCP test patterns
* Remove WebSocket functionality from service files
- Remove getWebSocketUrl imports that were causing runtime errors
- Replace WebSocket log streaming with deprecation warnings
- Remove unused WebSocket properties and methods
- Simplify disconnectLogs to no-op functions
These services now use HTTP polling exclusively as part of the
Socket.IO to polling migration.
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix memory leaks in mutation hooks
- Add isMountedRef to track component mount status
- Guard all setState calls with mounted checks
- Prevent callbacks from firing after unmount
- Apply fix to useProjectMutation, useDatabaseMutation, and useAsyncMutation
Addresses Code Rabbit feedback about potential state updates after
component unmount. Simple pragmatic fix without over-engineering
request cancellation.
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
* Document ETag implementation and limitations
- Add concise documentation explaining current ETag implementation
- Document that we use simple equality check, not full RFC 7232
- Clarify this works for our browser-to-API use case
- Note limitations for future CDN/proxy support
Addresses Code Rabbit feedback about RFC compliance by documenting
the known limitations of our simplified implementation.
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
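A minimal FastAPI-style sketch of this simple-equality ETag check (route and payload are illustrative):

```python
import hashlib
import json

from fastapi import FastAPI, Request, Response

app = FastAPI()

def make_etag(payload: dict) -> str:
    # Hash the serialized payload; any change produces a new tag.
    return hashlib.md5(json.dumps(payload, sort_keys=True).encode()).hexdigest()

@app.get("/api/projects")
async def list_projects(request: Request, response: Response):
    projects = {"projects": []}  # stand-in for the real query
    etag = make_etag(projects)
    # Simple equality check, not full RFC 7232 (no W/ prefix, no tag lists):
    # good enough for browser-to-API polling, documented as a limitation.
    if request.headers.get("if-none-match") == etag:
        return Response(status_code=304)
    response.headers["ETag"] = etag
    return projects
```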
* Remove all WebSocket event schemas and functionality
- Remove WebSocket event schemas from projectSchemas.ts
- Remove WebSocket event types from types/project.ts
- Remove WebSocket initialization and subscription methods from projectService.ts
- Remove all broadcast event calls throughout the service
- Clean up imports to remove unused types
Complete removal of WebSocket infrastructure in favor of HTTP polling.
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix progress field naming inconsistency
- Change backend API to return 'progress' instead of 'percentage'
- Remove unnecessary mapping in frontend
- Use consistent 'progress' field name throughout
- Update all progress initialization to use 'progress' field
Simple consolidation to one field name instead of mapping between two.
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix tasks polling data not updating UI
- Update tasks state when polling returns new data
- Keep UI in sync with server changes for selected project
- Tasks now live-update from external changes without project switching
The polling was fetching fresh data but never updating the UI state.
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix incorrect project title in pin/unpin toast messages
- Use API response data.title instead of selectedProject?.title
- Shows correct project name when pinning/unpinning any project card
- Toast now accurately reflects which project was actually modified
The issue was the toast would show the wrong project name when pinning
a project that wasn't the currently selected one.
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
* Remove over-engineered tempProjects logic
Removed all temporary project tracking during creation:
- Removed tempProjects state and allProjects combining
- Removed handleProjectCreationProgress function
- Removed progress polling for project creation
- Removed ProjectCreationProgressCard rendering
- Simplified createProject to just create and let polling pick it up
This fixes false 'creation failed' errors and simplifies the code significantly.
Project creation now shows a simple toast and relies on polling for updates.
* Optimize task count loading with parallel fetching
Changed loadTaskCountsForAllProjects to use Promise.allSettled for parallel API calls:
- All project task counts now fetched simultaneously instead of sequentially
- Better error isolation - one project failing doesn't affect others
- Significant performance improvement for users with multiple projects
- If 5 projects: from 5×API_TIME to just 1×API_TIME total
* Fix TypeScript timer type for browser compatibility
Replace NodeJS.Timeout with ReturnType<typeof setInterval> in crawlProgressService.
This makes the timer type compatible across both Node.js and browser environments,
fixing TypeScript compilation errors in browser builds.
* Add explicit status mappings for crawl progress states
Map backend statuses to correct UI states:
- 'processing' → 'processing' (use existing UI state)
- 'queued' → 'starting' (pre-crawl state)
- 'cancelled' → 'cancelled' (use existing UI state)
This prevents incorrect UI states and gives users accurate feedback about
crawl operation status.
* Fix TypeScript timer types in pollingService for browser compatibility
Replace NodeJS.Timer with ReturnType<typeof setInterval> in both
TaskPollingService and ProjectPollingService classes. This ensures
compatibility across Node.js and browser environments.
* Remove unused pollingService.ts dead code
This file was created during Socket.IO removal but never actually used.
The application already uses usePolling hooks (useTaskPolling, useProjectPolling)
which have proper ETag support and visibility handling.
Removing dead code to reduce maintenance burden and confusion.
* Fix TypeScript timer type in progressService for browser compatibility
Replace NodeJS.Timer with ReturnType<typeof setInterval> to ensure
compatibility across Node.js and browser environments, consistent with
other timer type fixes throughout the codebase.
* Fix TypeScript timer type in projectCreationProgressService
Replace NodeJS.Timeout with ReturnType<typeof setInterval> in Map type
to ensure browser/DOM build compatibility.
* Add proper error handling to project creation progress polling
Stop infinite polling on fatal errors:
- 404 errors continue polling (resource might not exist yet)
- Other HTTP errors (500, 503, etc.) stop polling and report error
- Network/parsing errors stop polling and report error
- Clear feedback to callbacks on all error types
This prevents wasting resources polling forever on unrecoverable errors
and provides better user feedback when things go wrong.
* Fix documentation accuracy in API conventions and architecture docs
- Fix API_NAMING_CONVENTIONS.md: Changed 'documents' to 'docs' and used
distinct placeholders ({project_id} and {doc_id}) to match actual API routes
- Fix POLLING_ARCHITECTURE.md: Updated import path to use relative import
(from ..utils.etag_utils) to match actual code structure
- ARCHITECTURE.md: List formatting was already correct, no changes needed
These changes ensure documentation accurately reflects the actual codebase.
* Fix type annotations in recursive crawling strategy
- Changed max_concurrent from invalid 'int = None' to 'int | None = None'
- Made progress_callback explicitly async: 'Callable[..., Awaitable[None]] | None'
- Added Awaitable import from typing
- Uses modern Python 3.10+ union syntax (project requires Python 3.12)
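The corrected signature shape, sketched with an illustrative function:

```python
from typing import Awaitable, Callable

async def crawl_recursive(
    start_url: str,
    max_concurrent: int | None = None,  # was the invalid 'int = None'
    progress_callback: Callable[..., Awaitable[None]] | None = None,
) -> None:
    if progress_callback is not None:
        # The callback is explicitly async, so callers must await it.
        await progress_callback("crawling", 10, f"Starting at {start_url}")
```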
* Improve error logging in sitemap parsing
- Use logger.exception() instead of logger.error() for automatic stack traces
- Include sitemap URL in all error messages for better debugging
- Remove unused traceback import and manual traceback logging
- Now all exceptions show which sitemap failed with full stack trace
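A minimal sketch of the pattern (the namespace URI is the standard sitemap schema; the real parser may differ):

```python
import logging
from xml.etree import ElementTree

logger = logging.getLogger(__name__)

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}loc"

def parse_sitemap(sitemap_url: str, xml_text: str) -> list[str]:
    try:
        root = ElementTree.fromstring(xml_text)
        return [loc.text for loc in root.iter(SITEMAP_NS) if loc.text]
    except Exception:
        # logger.exception logs at ERROR level and appends the active stack
        # trace automatically; no manual traceback.format_exc() needed. The
        # URL in the message shows which sitemap failed.
        logger.exception(f"Failed to parse sitemap: {sitemap_url}")
        return []
```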
* Remove all Socket.IO remnants from task_service.py
Removed:
- Duplicate broadcast_task_update function definitions
- _broadcast_available flag (always False)
- All Socket.IO broadcast blocks in create_task, update_task, and archive_task
- Socket.IO related logging and error handling
- Unnecessary traceback import within Socket.IO error handler
Task updates are now handled exclusively via HTTP polling as intended.
* Complete WebSocket/Socket.IO cleanup across frontend and backend
- Remove socket.io-client dependency and all related packages
- Remove WebSocket proxy configuration from vite.config.ts
- Clean up WebSocket state management and deprecated methods from services
- Remove VITE_ENABLE_WEBSOCKET environment variable checks
- Update all comments to remove WebSocket/Socket.IO references
- Fix user-facing error messages that mentioned Socket.IO
- Preserve legitimate FastAPI WebSocket endpoints for MCP/test streaming
This completes the refactoring to HTTP polling, removing all Socket.IO
infrastructure while keeping necessary WebSocket functionality.
* Remove MCP log display functionality following KISS principles
- Remove all log display UI from MCPPage (saved ~100 lines)
- Remove log-related API endpoints and WebSocket streaming
- Keep internal log tracking for Docker container monitoring
- Simplify MCPPage to focus on server control and configuration
- Remove unused LogEntry types and streaming methods
Following early beta KISS principles - MCP logs are debug info that
developers can check via terminal/Docker if needed. UI now focuses
on essential functionality only.
* Add Claude Code command for analyzing CodeRabbit suggestions
- Create structured command for CodeRabbit review analysis
- Provides clear format for assessing validity and priority
- Generates 2-5 practical options with tradeoffs
- Emphasizes early beta context and KISS principles
- Includes effort estimation for each option
This command helps quickly triage CodeRabbit suggestions and decide
whether to address them based on project priorities and tradeoffs.
* Add in-flight guard to prevent overlapping fetches in crawl progress polling
Prevents race condition where slow responses could cause multiple concurrent
fetches for the same progressId. Simple boolean flag skips new fetches while
one is active and properly cleans up on stop/disconnect.
Co-Authored-By: Claude <noreply@anthropic.com>
* Remove unused progressService.ts dead code
File was completely unused with no imports or references anywhere in the
codebase. Other services (crawlProgressService, projectCreationProgressService)
handle their specific progress polling needs directly.
Co-Authored-By: Claude <noreply@anthropic.com>
* Remove unused project creation progress components
Both ProjectCreationProgressCard.tsx and projectCreationProgressService.ts
were dead code with no references. The service duplicated existing usePolling
functionality unnecessarily. Removed per KISS principles.
Co-Authored-By: Claude <noreply@anthropic.com>
* Update POLLING_ARCHITECTURE.md to reflect current state
Removed references to deleted files (progressService.ts,
projectCreationProgressService.ts, ProjectCreationProgressCard.tsx).
Updated to document what exists now rather than migration history.
Co-Authored-By: Claude <noreply@anthropic.com>
* Update API_NAMING_CONVENTIONS.md to reflect current state
Updated progress endpoints to match actual implementation.
Removed migration/historical references and anti-patterns section.
Focused on current best practices and architecture patterns.
Co-Authored-By: Claude <noreply@anthropic.com>
* Remove unused optimistic updates code and references
Deleted unused useOptimisticUpdates.ts hook that was never imported.
Removed optimistic upda…
* feat: AI-generated conversation titles across all adapters (#514)
Add AI-powered title generation for conversations created from any adapter (Web, CLI, Slack, Telegram, GitHub, Discord). Previously only the Web adapter set titles (truncated first message), leaving all other adapters with "Untitled conversation" in the Web UI.
- Create title-generator service using configured AI client with tools: []
- Wire into orchestrator handleMessage (covers all chat adapters)
- Wire into CLI workflowRunCommand (covers direct CLI invocations)
- Remove old web-only truncation-based auto-title from api.ts
- Add optional TITLE_GENERATION_MODEL env var for cheaper/faster model
- Fire-and-forget design: never blocks message delivery, falls back to truncated message on failure
- 19 unit tests (title generator + orchestrator integration)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: CLI-to-Web UI integration — persist messages, platform-agnostic API, stale run cleanup
- Add findConversationByPlatformId to look up conversations without hardcoding platform_type, fixing 404 when Web UI loads CLI conversations
- Update CLIAdapter to persist assistant messages via messageDb and wire user message persistence in workflow command
- Add failStaleWorkflowRuns to transition stuck 'running' workflow runs to 'failed' at server startup
- Add platform badges (CLI, telegram, etc.) to sidebar conversation items
- Disable message input for non-web conversations with "coming soon" note
- Add unit tests for new DB functions and CLI adapter persistence
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Pull Request
Summary
Complete removal of Socket.IO/WebSocket dependencies in favor of simple HTTP polling architecture. This major refactoring eliminates the complexity of WebSocket connections while improving performance and reliability of real-time updates in the Archon system.
Changes Made
Core Architecture Migration (970ba04, 75f0e70)
- Removed python-socketio from requirements and 1,800+ lines of Socket.IO code
- Added usePolling hook with ETag support for 70% bandwidth reduction
- Added polling endpoints /api/progress/crawl and /api/progress/project-creation
Performance Optimizations (e4a6a68, 75f0e70)
- Removed the isLoadingDirectly anti-pattern
Data Consistency & Error Handling (6816610, 98c8684, 2b08c7b)
MCP Integration Fixes (1301516)
- Fixed polling to unwrap the API envelope: the API returns {projects: [...]} but the code was iterating directly over the response
Documentation & Standards (4179322, af402c2)
Type of Change
Affected Services
Testing
Test Evidence
Checklist
Breaking Changes
WebSocket/Socket.IO Removal:
Migration Steps:
- useSocketSubscription → usePolling
- useTaskSocket → useTaskPolling
- useCrawlProgressSocket → useCrawlProgressPolling
Additional Notes
Performance Impact:
Architecture Benefits:
Files Removed:
- archon-ui-main/src/services/socketIOService.ts (494 lines)
- archon-ui-main/src/services/socketService.ts (831 lines)
- archon-ui-main/src/services/taskSocketService.ts (327 lines)
- python/src/server/api_routes/socketio_handlers.py (1,082 lines)
- python/src/server/api_routes/socketio_broadcasts.py (56 lines)
- python/src/server/socketio_app.py (65 lines)
This refactoring significantly simplifies the codebase while improving performance and maintainability. The HTTP polling approach is more reliable and easier to debug than WebSocket connections.