Walkthrough

Adds persistent page metadata and section parsing, links chunks to pages, exposes a Pages API and RAG return_mode ("pages"/"chunks"), adds page storage operations and an LLMS-full parser, propagates page_id through crawling/storage, extracts HTML titles, and updates source upsert behavior and tests.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant KnowledgeAPI as /api/knowledge
    participant RAGService
    participant VectorStore
    participant PagesTable as archon_page_metadata
    Client->>KnowledgeAPI: POST /api/knowledge { query, return_mode="pages" }
    KnowledgeAPI->>RAGService: perform_rag_query(query, return_mode)
    RAGService->>VectorStore: vector/hybrid search → chunk results
    VectorStore-->>RAGService: chunk_results (may include page_id/url)
    alt return_mode == "pages" and page_ids present
        RAGService->>PagesTable: SELECT metadata WHERE id IN (page_ids)
        PagesTable-->>RAGService: page rows
        RAGService-->>KnowledgeAPI: grouped page-level results + actual_return_mode="pages"
    else
        RAGService-->>KnowledgeAPI: chunk-level results + actual_return_mode="chunks"
    end
    KnowledgeAPI-->>Client: { success: true, results, actual_return_mode }
```
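To make the pages mode concrete, here is a hypothetical client call matching the flow above; the endpoint path, port, and response fields are assumptions read off the diagram, not taken from the code:

```python
import httpx

# Hypothetical request illustrating return_mode="pages" (path and port are assumed).
response = httpx.post(
    "http://localhost:8181/api/knowledge/rag/query",
    json={"query": "how are crawled pages chunked?", "return_mode": "pages"},
)
data = response.json()
# Per the diagram, the response carries a success flag, page-level results, and the
# mode actually used ("pages", or "chunks" when page_ids are missing and it falls back).
print(data["success"], data.get("actual_return_mode"), len(data.get("results", [])))
```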
```mermaid
sequenceDiagram
    autonumber
    participant Crawler
    participant CrawlService
    participant PageStore as PageStorageOperations
    participant DocStoreOps as DocumentStorageOperations
    participant ChunkStore as DocumentStorageService
    participant Pages as archon_page_metadata
    Crawler->>CrawlService: emit crawl_results (html/text/url)
    alt llms-full content
        CrawlService->>PageStore: store_llms_full_sections(base_url, content, source_id)
        PageStore->>Pages: INSERT section pages
        Pages-->>PageStore: url→page_id map
    else normal pages
        CrawlService->>PageStore: store_pages(crawl_results, source_id)
        PageStore->>Pages: INSERT pages
        Pages-->>PageStore: url→page_id map
    end
    PageStore-->>CrawlService: url→page_id mapping
    CrawlService->>DocStoreOps: process_and_store_documents(..., url_to_page_id)
    DocStoreOps->>ChunkStore: add_documents_to_supabase(..., url_to_page_id)
    ChunkStore-->>DocStoreOps: stored chunk rows
    DocStoreOps->>PageStore: update_page_chunk_count(page_id, count)
```
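In code terms, the glue between the two storage steps is the {url: page_id} mapping returned by the page store. A rough sketch of the sequence, with signatures simplified from the diagram (the real interfaces differ):

```python
# Simplified, hypothetical orchestration sketch; names follow the diagram above.
async def store_results(crawl_results, source_id, page_store, doc_store_ops):
    # 1. Persist page rows first and receive {url: page_id} back
    url_to_page_id: dict[str, str] = await page_store.store_pages(crawl_results, source_id)

    # 2. Chunk and store documents, passing the mapping so each chunk row can carry its page FK
    chunk_counts = await doc_store_ops.process_and_store_documents(
        crawl_results, source_id, url_to_page_id=url_to_page_id
    )  # return shape (chunks per url) is assumed here

    # 3. Write per-page chunk counts back onto the page rows
    for url, page_id in url_to_page_id.items():
        await page_store.update_page_chunk_count(page_id, chunk_counts.get(url, 0))
```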
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 7
♻️ Duplicate comments (2)
python/src/server/services/crawling/strategies/batch.py (1)
238-248: Duplicate code: extract title parsing to shared helper.

This is identical to the title extraction in single_page.py (lines 182-192) and recursive.py (lines 280-290). See the refactor suggestion in single_page.py to consolidate this logic into a shared helper function.

python/src/server/services/crawling/strategies/recursive.py (1)

280-290: Duplicate code: extract title parsing to shared helper.

This is the third occurrence of identical title extraction logic (also in single_page.py lines 182-192 and batch.py lines 238-248). See the refactor suggestion in single_page.py to consolidate this into a shared helper.
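A minimal sketch of the shared helper being suggested, assuming the three strategies all pull the first <title> tag out of the raw HTML and fall back to "Untitled" (helper name and location are hypothetical):

```python
# e.g. python/src/server/services/crawling/helpers/title_extractor.py (module name assumed)
import re


def extract_html_title(html: str, fallback: str = "Untitled") -> str:
    """Return the text of the first <title> tag, or the fallback when none is present."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    if not match:
        return fallback
    title = match.group(1).strip()
    return title or fallback
```

Each strategy could then call extract_html_title(result.html) in place of the repeated inline parsing.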
🧹 Nitpick comments (3)
python/src/server/api_routes/knowledge_api.py (1)
87-119: Consider flattening the nested exception handling.

The nested try-catch structure is functional but could be simplified for better readability. The current implementation correctly handles provider-specific errors, but the nesting adds cognitive complexity.
Optional simplification:
```python
try:
    # Test API key with minimal embedding request using provider-scoped configuration
    from ..services.embeddings.embedding_service import create_embedding

    test_result = await create_embedding(text="test", provider=provider)
    if not test_result:
        logger.error(
            f"❌ {provider.title()} API key validation failed - no embedding returned"
        )
        raise HTTPException(
            status_code=401,
            detail={
                "error": f"Invalid {provider.title()} API key",
                "message": f"Please verify your {provider.title()} API key in Settings.",
                "error_type": "authentication_failed",
                "provider": provider,
            },
        )
except HTTPException:
    raise
except Exception as e:
    logger.error(
        f"❌ {provider.title()} API key validation failed: {e}",
        exc_info=True,
    )
    raise HTTPException(
        status_code=401,
        detail={
            "error": f"Invalid {provider.title()} API key",
            "message": f"Please verify your {provider.title()} API key in Settings. Error: {str(e)[:100]}",
            "error_type": "authentication_failed",
            "provider": provider,
        },
    )
```

python/src/mcp_server/features/rag/rag_tools.py (1)
238-242: Consider using elif for mutual exclusivity in input validation.

The validation correctly enforces "either page_id or url" but could be slightly clearer.
Optional refinement:
```diff
-    if not page_id and not url:
-        return json.dumps(
-            {"success": False, "error": "Must provide either page_id or url"},
-            indent=2
-        )
+    if not page_id and not url:
+        return json.dumps(
+            {"success": False, "error": "Must provide either page_id or url"},
+            indent=2
+        )
+    elif page_id and url:
+        return json.dumps(
+            {"success": False, "error": "Provide either page_id or url, not both"},
+            indent=2
+        )
```

python/src/server/services/crawling/page_storage_operations.py (1)
206-224: Consider more specific exception handling.

While using logger.warning for this non-critical update is appropriate, the bare Exception catch should be replaced with more specific exception types (e.g., database-related exceptions).

```diff
-        except Exception as e:
-            logger.warning(
-                f"Error updating chunk_count for page {page_id}: {e}", exc_info=True
-            )
+        except (SupabaseException, DatabaseError) as e:
+            logger.warning(
+                f"Error updating chunk_count for page {page_id}: {e}", exc_info=True
+            )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- migration/0.1.0/010_add_page_metadata_table.sql (1 hunks)
- python/src/mcp_server/features/rag/rag_tools.py (4 hunks)
- python/src/server/api_routes/knowledge_api.py (5 hunks)
- python/src/server/api_routes/pages_api.py (1 hunks)
- python/src/server/main.py (2 hunks)
- python/src/server/services/crawling/crawling_service.py (3 hunks)
- python/src/server/services/crawling/document_storage_operations.py (4 hunks)
- python/src/server/services/crawling/helpers/llms_full_parser.py (1 hunks)
- python/src/server/services/crawling/page_storage_operations.py (1 hunks)
- python/src/server/services/crawling/strategies/batch.py (1 hunks)
- python/src/server/services/crawling/strategies/recursive.py (1 hunks)
- python/src/server/services/crawling/strategies/single_page.py (1 hunks)
- python/src/server/services/search/rag_service.py (3 hunks)
- python/src/server/services/storage/document_storage_service.py (3 hunks)
- python/tests/server/services/test_llms_full_parser.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (9)
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
- python/src/server/services/crawling/strategies/recursive.py
- python/src/server/services/crawling/strategies/batch.py
- python/src/server/main.py
- python/src/server/services/crawling/document_storage_operations.py
- python/src/server/services/crawling/page_storage_operations.py
- python/src/server/services/search/rag_service.py
- python/src/server/api_routes/pages_api.py
- python/src/server/services/crawling/helpers/llms_full_parser.py
- python/src/server/services/storage/document_storage_service.py
- python/src/server/api_routes/knowledge_api.py
- python/src/server/services/crawling/strategies/single_page.py
- python/src/server/services/crawling/crawling_service.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/services/**/*.py: For batch operations, report both success count and a detailed failure list
On external API calls, retry with exponential backoff and fail with a clear message after retries
Place business logic in service layer modules (API Route → Service → Database pattern)
Files:
- python/src/server/services/crawling/strategies/recursive.py
- python/src/server/services/crawling/strategies/batch.py
- python/src/server/services/crawling/document_storage_operations.py
- python/src/server/services/crawling/page_storage_operations.py
- python/src/server/services/search/rag_service.py
- python/src/server/services/crawling/helpers/llms_full_parser.py
- python/src/server/services/storage/document_storage_service.py
- python/src/server/services/crawling/strategies/single_page.py
- python/src/server/services/crawling/crawling_service.py
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
- python/src/server/services/crawling/strategies/recursive.py
- python/src/server/services/crawling/strategies/batch.py
- python/src/server/main.py
- python/src/server/services/crawling/document_storage_operations.py
- python/src/server/services/crawling/page_storage_operations.py
- python/src/server/services/search/rag_service.py
- python/src/server/api_routes/pages_api.py
- python/src/server/services/crawling/helpers/llms_full_parser.py
- python/tests/server/services/test_llms_full_parser.py
- python/src/server/services/storage/document_storage_service.py
- python/src/server/api_routes/knowledge_api.py
- python/src/server/services/crawling/strategies/single_page.py
- python/src/mcp_server/features/rag/rag_tools.py
- python/src/server/services/crawling/crawling_service.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
- python/src/server/services/crawling/strategies/recursive.py
- python/src/server/services/crawling/strategies/batch.py
- python/src/server/main.py
- python/src/server/services/crawling/document_storage_operations.py
- python/src/server/services/crawling/page_storage_operations.py
- python/src/server/services/search/rag_service.py
- python/src/server/api_routes/pages_api.py
- python/src/server/services/crawling/helpers/llms_full_parser.py
- python/src/server/services/storage/document_storage_service.py
- python/src/server/api_routes/knowledge_api.py
- python/src/server/services/crawling/strategies/single_page.py
- python/src/mcp_server/features/rag/rag_tools.py
- python/src/server/services/crawling/crawling_service.py
python/src/server/main.py
📄 CodeRabbit inference engine (CLAUDE.md)
Register FastAPI exception handlers in the main application (search for @app.exception_handler)
Files:
python/src/server/main.py
{python/src/server/exceptions.py,python/src/server/main.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Define custom exceptions in exceptions.py and ensure handlers are wired in main.py
Files:
python/src/server/main.py
python/src/server/api_routes/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Place HTTP route handlers in the API routes directory
Files:
- python/src/server/api_routes/pages_api.py
- python/src/server/api_routes/knowledge_api.py
python/tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Write backend tests with Pytest (including async support)
Files:
python/tests/server/services/test_llms_full_parser.py
python/src/mcp_server/features/*/*_tools.py
📄 CodeRabbit inference engine (CLAUDE.md)
Name and locate MCP tools as [feature]_tools.py under each feature
Name MCP tools as find_[resource] for list/search/get and manage_[resource] for create/update/delete (with action parameter)
Files:
python/src/mcp_server/features/rag/rag_tools.py
🧬 Code graph analysis (7)
python/src/server/services/crawling/document_storage_operations.py (2)
python/src/server/services/crawling/page_storage_operations.py (2)
store_llms_full_sections (115-204), store_pages (33-113)
python/src/server/services/crawling/helpers/llms_full_parser.py (1)
parse_llms_full_sections(75-189)
python/src/server/services/crawling/page_storage_operations.py (2)
python/src/server/config/logfire_config.py (2)
get_logger (137-147), safe_logfire_info (224-236)
python/src/server/services/crawling/helpers/llms_full_parser.py (1)
parse_llms_full_sections(75-189)
python/src/server/services/search/rag_service.py (2)
python/src/server/api_routes/knowledge_api.py (1)
perform_rag_query (1110-1143)
python/src/server/config/logfire_config.py (1)
set_attribute(178-180)
python/src/server/api_routes/pages_api.py (2)
python/src/server/config/logfire_config.py (2)
get_logger (137-147), safe_logfire_error (239-251)
python/src/server/services/client_manager.py (1)
get_supabase_client(15-43)
python/tests/server/services/test_llms_full_parser.py (1)
python/src/server/services/crawling/helpers/llms_full_parser.py (3)
create_section_slug (23-56), create_section_url (59-72), parse_llms_full_sections (75-189)
python/src/server/api_routes/knowledge_api.py (4)
python/src/server/services/embeddings/embedding_service.py (1)
create_embedding (236-293)
python/src/server/services/search/rag_service.py (2)
RAGService (31-479), perform_rag_query (245-366)
python/src/server/services/client_manager.py (1)
get_supabase_client (15-43)
python/src/server/services/crawling/crawling_service.py (2)
get_active_orchestration (47-51), unregister_orchestration (61-65)
python/src/server/services/crawling/crawling_service.py (1)
python/src/server/services/crawling/page_storage_operations.py (1)
PageStorageOperations(16-225)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Backend Tests (Python + pytest)
🔇 Additional comments (17)
python/src/server/services/crawling/strategies/single_page.py (1)
194-202: LGTM! Title field properly integrated into crawl result.

The title extraction enhances page metadata for downstream RAG and storage operations. The fallback to "Untitled" ensures consistent behavior.
python/src/server/services/crawling/strategies/batch.py (1)
250-255: LGTM! Title field properly added to batch results.

The title inclusion aligns with the page-level metadata enhancement across crawling strategies.
python/src/server/main.py (2)
28-28: LGTM! Pages API router imported correctly.
187-187: LGTM! Pages API router registered properly.

The new pages router is correctly registered with FastAPI, enabling page retrieval endpoints introduced in this PR.
python/src/server/api_routes/knowledge_api.py (3)
180-180: LGTM! Return mode field properly typed and documented.

The new return_mode field with default "chunks" and comment clarifying valid values follows Pydantic best practices.

1120-1132: LGTM! Return mode correctly propagated through RAG pipeline.

The return_mode parameter is properly passed to RAGService.perform_rag_query and the success flag is correctly added to the result before returning.

1295-1313: LGTM! Orchestration service cancellation properly handled.

The orchestration service is correctly retrieved, cancelled, and unregistered. The sequential steps ensure proper cleanup of the crawl task.
python/src/server/services/crawling/strategies/recursive.py (1)
292-297: LGTM! Title field properly integrated into recursive crawl results.

python/src/server/services/crawling/crawling_service.py (3)
22-22: LGTM! PageStorageOperations imported correctly.
102-102: LGTM! Page storage operations properly initialized.

The PageStorageOperations instance is correctly instantiated with the Supabase client, following the same pattern as doc_storage_ops.

427-437: LGTM! Document storage called with explicit url_to_page_id parameter.

Passing url_to_page_id=None is correct at this stage in the orchestration flow. The parameter will be populated by page storage operations before document storage in the complete implementation.

python/src/server/services/storage/document_storage_service.py (2)

16-29: LGTM! Page ID linkage parameter properly added.

The url_to_page_id parameter is correctly typed as dict[str, str] | None and positioned appropriately in the function signature to enable FK linkage between pages and chunks.

403-417: LGTM! Page ID correctly retrieved and included in chunk records.

The page_id is properly fetched from the url_to_page_id mapping and included in the batch data payload, establishing the FK relationship in the database. The conditional check ensures graceful handling when page_id is unavailable.
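Reduced to a sketch, the linkage described here is a dictionary lookup when each chunk record is assembled; the surrounding field names below are illustrative, only url_to_page_id and page_id come from the review:

```python
# Illustrative only: how a chunk record can pick up its page FK from the mapping.
def build_chunk_record(chunk: dict, url_to_page_id: dict[str, str] | None) -> dict:
    record = {
        "url": chunk["url"],
        "content": chunk["content"],
        # ... embedding, source_id, metadata, etc.
    }
    page_id = (url_to_page_id or {}).get(chunk["url"])
    if page_id:  # only set the FK when a page row actually exists
        record["page_id"] = page_id
    return record
```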
python/src/mcp_server/features/rag/rag_tools.py (4)
79-85: LGTM! Return mode parameter properly integrated.

The return_mode parameter with default "pages" follows the established pattern and is correctly documented in the docstring.

117-138: LGTM! Return mode correctly propagated to backend and response.

The return_mode is properly included in the request payload (lines 117-121) and the backend's return_mode is correctly surfaced in the response (line 133).

217-279: LGTM! New rag_read_full_page tool properly implemented.

The tool provides a clean interface for retrieving complete page content after RAG search. The dual lookup by page_id or URL offers flexibility, and error handling follows the established pattern.
248-254: No changes needed for URL encoding
httpx automatically encodes query parameters when using the params dict (e.g. url=https%3A%2F%2F…), as confirmed by testing.
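For reference, the httpx behavior being relied on looks like this; the endpoint below is just an example, not the real route:

```python
import httpx

request = httpx.Request(
    "GET",
    "http://archon-server/api/pages/by-url",  # example URL, path assumed
    params={"url": "https://docs.example.com/guide"},
)
# The query value is percent-encoded by httpx, e.g. url=https%3A%2F%2Fdocs.example.com%2Fguide
print(request.url)
```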
```python
@router.get("/pages/{page_id}")
async def get_page_by_id(page_id: str):
    """
    Get a single page by its ID.

    Args:
        page_id: The UUID of the page

    Returns:
        PageResponse with complete page data
    """
    try:
        client = get_supabase_client()

        # Query by ID
        result = client.table("archon_page_metadata").select("*").eq("id", page_id).single().execute()

        if not result.data:
            raise HTTPException(status_code=404, detail=f"Page not found: {page_id}")

        return PageResponse(**result.data)

    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Error getting page {page_id}: {e}", exc_info=True)
        safe_logfire_error(f"Failed to get page | page_id={page_id} | error={str(e)}")
        raise HTTPException(status_code=500, detail=f"Failed to get page: {str(e)}") from e
```
Remove redundant check and improve exception handling.
Two issues:
- Redundant None check (lines 106-107): The .single() method raises an exception when no results are found, so checking if not result.data is redundant and will never execute. The Supabase client raises an exception before returning None.
- Bare Exception catching (line 113): Should use specific exception types as per coding guidelines.
```diff
+from supabase import SupabaseException
+
 @router.get("/pages/{page_id}")
 async def get_page_by_id(page_id: str):
     """
     Get a single page by its ID.

     Args:
         page_id: The UUID of the page

     Returns:
         PageResponse with complete page data
     """
     try:
         client = get_supabase_client()

         # Query by ID
         result = client.table("archon_page_metadata").select("*").eq("id", page_id).single().execute()

-        if not result.data:
-            raise HTTPException(status_code=404, detail=f"Page not found: {page_id}")
-
         return PageResponse(**result.data)

     except HTTPException:
         raise
-    except Exception as e:
+    except SupabaseException as e:
+        # .single() raises when no results found - convert to 404
+        if "not found" in str(e).lower() or "no rows" in str(e).lower():
+            raise HTTPException(status_code=404, detail=f"Page not found: {page_id}") from e
         logger.error(f"Error getting page {page_id}: {e}", exc_info=True)
         safe_logfire_error(f"Failed to get page | page_id={page_id} | error={str(e)}")
         raise HTTPException(status_code=500, detail=f"Failed to get page: {str(e)}") from e
+    except ValueError as e:
+        logger.error(f"Invalid page data for {page_id}: {e}", exc_info=True)
+        raise HTTPException(status_code=500, detail=f"Invalid page data: {str(e)}") from e
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In python/src/server/api_routes/pages_api.py around lines 89 to 116, remove the
redundant check for result.data after .single() (the Supabase client will raise
when no row is found) and replace the broad "except Exception" with targeted
exception handling: catch the Supabase/Postgrest client-specific exception(s)
(e.g., PostgrestError or the client-specific error type you use) to turn
not-found errors into a 404 HTTPException and other client errors into a 500,
and keep the existing except HTTPException: raise passthrough; avoid catching
bare Exception so only expected DB/client errors are handled and rewrapped into
HTTPException with a clear message.
```python
    async def store_pages(
        self,
        crawl_results: list[dict],
        source_id: str,
        request: dict[str, Any],
        crawl_type: str,
    ) -> dict[str, str]:
        """
        Store pages in archon_page_metadata table from regular crawl results.

        Args:
            crawl_results: List of crawled documents with url, markdown, title, etc.
            source_id: The source ID these pages belong to
            request: The original crawl request with knowledge_type, tags, etc.
            crawl_type: Type of crawl performed (sitemap, url, link_collection, etc.)

        Returns:
            {url: page_id} mapping for FK references in chunks
        """
        safe_logfire_info(
            f"store_pages called | source_id={source_id} | crawl_type={crawl_type} | num_results={len(crawl_results)}"
        )

        url_to_page_id: dict[str, str] = {}
        pages_to_insert: list[dict[str, Any]] = []

        for doc in crawl_results:
            url = doc.get("url", "").strip()
            markdown = doc.get("markdown", "").strip()

            # Skip documents with empty content or missing URLs
            if not url or not markdown:
                continue

            # Prepare page record
            word_count = len(markdown.split())
            char_count = len(markdown)

            page_record = {
                "source_id": source_id,
                "url": url,
                "full_content": markdown,
                "section_title": None,  # Regular page, not a section
                "section_order": 0,
                "word_count": word_count,
                "char_count": char_count,
                "chunk_count": 0,  # Will be updated after chunking
                "metadata": {
                    "knowledge_type": request.get("knowledge_type", "documentation"),
                    "crawl_type": crawl_type,
                    "page_type": "documentation",
                    "tags": request.get("tags", []),
                },
            }
            pages_to_insert.append(page_record)

        # Batch insert pages
        if pages_to_insert:
            try:
                safe_logfire_info(
                    f"Inserting {len(pages_to_insert)} pages into archon_page_metadata table"
                )
                result = (
                    self.supabase_client.table("archon_page_metadata")
                    .insert(pages_to_insert)
                    .execute()
                )

                # Build url → page_id mapping
                for page in result.data:
                    url_to_page_id[page["url"]] = page["id"]

                safe_logfire_info(
                    f"Successfully stored {len(url_to_page_id)} pages in archon_page_metadata"
                )

            except Exception as e:
                logger.error(f"Error inserting pages into archon_page_metadata: {e}", exc_info=True)
                # Don't raise - allow chunking to continue even if page storage fails

        return url_to_page_id
```
Enhance batch operation error reporting and exception specificity.
The method has several issues related to the coding guidelines:
- Bare Exception catching (line 109): The guidelines require "Use specific exception types; avoid catching bare Exception". Consider catching more specific exceptions like SupabaseException or database-related errors.
- Missing detailed failure reporting: The guidelines state "For batch operations, report both success count and a detailed failure list". Currently, if the batch insert fails, you lose visibility into which specific pages failed. The current implementation is all-or-nothing.
- Potential data loss: If the batch insert partially succeeds (some pages inserted, some failed), the url_to_page_id mapping will be incomplete, but downstream code won't know which pages are missing.
Consider this refactored approach:
```diff
-        # Batch insert pages
-        if pages_to_insert:
-            try:
-                safe_logfire_info(
-                    f"Inserting {len(pages_to_insert)} pages into archon_page_metadata table"
-                )
-                result = (
-                    self.supabase_client.table("archon_page_metadata")
-                    .insert(pages_to_insert)
-                    .execute()
-                )
-
-                # Build url → page_id mapping
-                for page in result.data:
-                    url_to_page_id[page["url"]] = page["id"]
-
-                safe_logfire_info(
-                    f"Successfully stored {len(url_to_page_id)} pages in archon_page_metadata"
-                )
-
-            except Exception as e:
-                logger.error(f"Error inserting pages into archon_page_metadata: {e}", exc_info=True)
-                # Don't raise - allow chunking to continue even if page storage fails
+        # Batch insert pages
+        if pages_to_insert:
+            failed_pages = []
+            try:
+                safe_logfire_info(
+                    f"Inserting {len(pages_to_insert)} pages into archon_page_metadata table"
+                )
+                result = (
+                    self.supabase_client.table("archon_page_metadata")
+                    .insert(pages_to_insert)
+                    .execute()
+                )
+
+                # Build url → page_id mapping
+                for page in result.data:
+                    url_to_page_id[page["url"]] = page["id"]
+
+                safe_logfire_info(
+                    f"Successfully stored {len(url_to_page_id)} pages | failed={len(failed_pages)} | total={len(pages_to_insert)}"
+                )
+
+            except (SupabaseException, DatabaseError) as e:
+                # Track which pages failed
+                failed_pages = [p["url"] for p in pages_to_insert]
+                logger.error(
+                    f"Error inserting pages into archon_page_metadata | failed_count={len(failed_pages)} | error={e}",
+                    exc_info=True,
+                    extra={"failed_urls": failed_pages[:10]}  # Log first 10 for debugging
+                )
+                # Don't raise - allow chunking to continue even if page storage fails
```

Note: You'll need to import the appropriate exception types from your Supabase/database client library.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
    async def store_pages(
        self,
        crawl_results: list[dict],
        source_id: str,
        request: dict[str, Any],
        crawl_type: str,
    ) -> dict[str, str]:
        """
        Store pages in archon_page_metadata table from regular crawl results.

        Args:
            crawl_results: List of crawled documents with url, markdown, title, etc.
            source_id: The source ID these pages belong to
            request: The original crawl request with knowledge_type, tags, etc.
            crawl_type: Type of crawl performed (sitemap, url, link_collection, etc.)

        Returns:
            {url: page_id} mapping for FK references in chunks
        """
        safe_logfire_info(
            f"store_pages called | source_id={source_id} | crawl_type={crawl_type} | num_results={len(crawl_results)}"
        )

        url_to_page_id: dict[str, str] = {}
        pages_to_insert: list[dict[str, Any]] = []

        for doc in crawl_results:
            url = doc.get("url", "").strip()
            markdown = doc.get("markdown", "").strip()

            # Skip documents with empty content or missing URLs
            if not url or not markdown:
                continue

            # Prepare page record
            word_count = len(markdown.split())
            char_count = len(markdown)

            page_record = {
                "source_id": source_id,
                "url": url,
                "full_content": markdown,
                "section_title": None,  # Regular page, not a section
                "section_order": 0,
                "word_count": word_count,
                "char_count": char_count,
                "chunk_count": 0,  # Will be updated after chunking
                "metadata": {
                    "knowledge_type": request.get("knowledge_type", "documentation"),
                    "crawl_type": crawl_type,
                    "page_type": "documentation",
                    "tags": request.get("tags", []),
                },
            }
            pages_to_insert.append(page_record)

        # Batch insert pages
        if pages_to_insert:
            failed_pages: list[str] = []
            try:
                safe_logfire_info(
                    f"Inserting {len(pages_to_insert)} pages into archon_page_metadata table"
                )
                result = (
                    self.supabase_client.table("archon_page_metadata")
                    .insert(pages_to_insert)
                    .execute()
                )

                # Build url → page_id mapping
                for page in result.data:
                    url_to_page_id[page["url"]] = page["id"]

                safe_logfire_info(
                    f"Successfully stored {len(url_to_page_id)} pages | failed={len(failed_pages)} | total={len(pages_to_insert)}"
                )

            except (SupabaseException, DatabaseError) as e:
                # Track which pages failed
                failed_pages = [p["url"] for p in pages_to_insert]
                logger.error(
                    f"Error inserting pages into archon_page_metadata | "
                    f"failed_count={len(failed_pages)} | error={e}",
                    exc_info=True,
                    extra={"failed_urls": failed_pages[:10]},  # Log first 10 for debugging
                )
                # Don't raise - allow chunking to continue even if page storage fails

        return url_to_page_id
```
🤖 Prompt for AI Agents
In python/src/server/services/crawling/page_storage_operations.py around lines
33 to 113, the batch insert catches a bare Exception and provides no granular
failure reporting or recovery, which can hide partial successes and missing page
IDs; change the try/except to catch the specific Supabase/database exception(s)
(import them from the client lib), after a failed batch insert log the exception
with response/error payload, attempt a fallback by inserting pages individually
(or in smaller batches) to collect which page records succeeded vs failed,
populate url_to_page_id only for successful inserts, and return both the success
count and a failure list (or at minimum log the failed URLs and their errors) so
downstream code can know which pages are missing; ensure all logs include the
exception details and relevant page URLs/indices.
Actionable comments posted: 0
🧹 Nitpick comments (3)
python/tests/server/services/test_llms_full_parser.py (1)
126-126: Clarify comment wording.

The comment "Should only have 2 sections (empty one skipped)" is slightly misleading. The standalone # on line 118 is not treated as a section boundary (it doesn't match the "# " pattern with a space), so there's no empty section being skipped—it's just content within Section 1.

Consider rewording for clarity:

```diff
-        # Should only have 2 sections (empty one skipped)
+        # Should have 2 sections (standalone "#" is content, not a section boundary)
```

python/src/mcp_server/features/rag/rag_tools.py (1)
296-358: Add validation for mutually exclusive parameters.

The docstring states "Provide EITHER page_id OR url, not both," but the validation only checks if neither is provided (lines 317-321). When both are provided, the code proceeds with page_id taking precedence (line 327), which may not be the intended behavior.
Add validation for the mutually exclusive case:
```diff
 async def rag_read_full_page(
     ctx: Context, page_id: str | None = None, url: str | None = None
 ) -> str:
     """..."""
     try:
-        if not page_id and not url:
+        if not page_id and not url:
             return json.dumps(
                 {"success": False, "error": "Must provide either page_id or url"},
                 indent=2
             )
+        if page_id and url:
+            return json.dumps(
+                {"success": False, "error": "Provide either page_id or url, not both"},
+                indent=2
+            )
```

python/src/server/services/search/rag_service.py (1)
245-364: LGTM! Clean return_mode integration with smart fallback.

The extension of perform_rag_query to support return_mode is well-implemented. The fallback logic (lines 334-345) gracefully handles pre-migration data without page_ids by returning chunks instead of pages, ensuring backward compatibility.

The response propagates actual_return_mode (line 357) and sets the span attribute (line 362), but the log message on line 365 uses the original return_mode parameter instead of actual_return_mode. This could be slightly misleading when fallback occurs:

```diff
-logger.info(f"RAG query completed - {len(formatted_results)} {return_mode} found")
+logger.info(f"RAG query completed - {len(formatted_results)} {actual_return_mode} found")
```

This ensures the log accurately reflects whether pages or chunks were actually returned.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- CLAUDE.md (1 hunks)
- python/src/mcp_server/features/rag/rag_tools.py (4 hunks)
- python/src/server/services/search/rag_service.py (3 hunks)
- python/src/server/services/source_management_service.py (4 hunks)
- python/tests/server/services/test_llms_full_parser.py (1 hunks)
- python/tests/test_source_race_condition.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
- python/tests/test_source_race_condition.py
- python/tests/server/services/test_llms_full_parser.py
- python/src/mcp_server/features/rag/rag_tools.py
- python/src/server/services/source_management_service.py
- python/src/server/services/search/rag_service.py
python/tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Write backend tests with Pytest (including async support)
Files:
- python/tests/test_source_race_condition.py
- python/tests/server/services/test_llms_full_parser.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
- python/src/mcp_server/features/rag/rag_tools.py
- python/src/server/services/source_management_service.py
- python/src/server/services/search/rag_service.py
python/src/mcp_server/features/*/*_tools.py
📄 CodeRabbit inference engine (CLAUDE.md)
Name and locate MCP tools as [feature]_tools.py under each feature
Name MCP tools as find_[resource] for list/search/get and manage_[resource] for create/update/delete (with action parameter)
Files:
python/src/mcp_server/features/rag/rag_tools.py
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
- python/src/server/services/source_management_service.py
- python/src/server/services/search/rag_service.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/services/**/*.py: For batch operations, report both success count and a detailed failure list
On external API calls, retry with exponential backoff and fail with a clear message after retries
Place business logic in service layer modules (API Route → Service → Database pattern)
Files:
- python/src/server/services/source_management_service.py
- python/src/server/services/search/rag_service.py
🧬 Code graph analysis (3)
python/tests/server/services/test_llms_full_parser.py (1)
python/src/server/services/crawling/helpers/llms_full_parser.py (3)
create_section_slug (23-56), create_section_url (59-72), parse_llms_full_sections (75-189)
python/src/server/services/source_management_service.py (1)
python/src/server/services/llm_provider_service.py (2)
extract_message_text (885-922), get_llm_client (313-548)
python/src/server/services/search/rag_service.py (2)
python/src/server/api_routes/knowledge_api.py (1)
perform_rag_query (1110-1143)
python/src/server/config/logfire_config.py (1)
set_attribute(178-180)
🔇 Additional comments (4)
python/tests/server/services/test_llms_full_parser.py (1)
1-179: LGTM! Comprehensive test coverage with resolved anchor format.

The test file provides excellent coverage of the LLMs-full parser functionality. The previous review concern about anchor format (expecting #section-{order}-{slug} instead of just #{slug}) has been correctly addressed throughout the test assertions.

Test coverage includes:
- Slug generation and URL creation ✓
- Single and multiple H1 sections ✓
- Edge cases: no H1s, H2 handling, empty sections, consecutive H1s ✓
- Word count calculation and empty content handling ✓
python/src/mcp_server/features/rag/rag_tools.py (2)
79-152: LGTM! Clear extension of search with return_mode support.

The addition of the return_mode parameter with a default of "pages" aligns well with the PR objectives to move toward page-level retrieval. The documentation clearly explains both modes and recommends pages for better context.

The request/response flow correctly propagates return_mode through the backend call and includes it in the response for transparency.

217-294: LGTM! Well-structured page listing tool.

The new rag_list_pages_for_source tool provides a clean interface for browsing documentation structure. The docstring includes a helpful example workflow, and error handling is comprehensive.

python/src/server/services/search/rag_service.py (1)
175-243: LGTM! Solid page-grouping implementation with thoughtful scoring.

The _group_chunks_by_pages method effectively aggregates chunk results into page-level responses. The aggregate scoring logic (average similarity with match-count boost capped at 20%) provides a reasonable signal for page relevance; a standalone sketch of this calculation follows the list of strengths below. The fallback from page_id to URL for grouping ensures compatibility with different data sources.

Key strengths:
- Handles both page_id and URL-based grouping
- Fetches complete page metadata from archon_page_metadata
- Includes preview text from best matching chunk
- Robust handling of missing page records with .maybe_single()
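The aggregate score reduces to a small calculation (the formula is taken from the context lines in the diff further down; shown standalone here for clarity):

```python
def aggregate_page_score(total_similarity: float, chunk_matches: int) -> float:
    """Average chunk similarity, boosted by 2% per matching chunk, capped at +20%."""
    avg_similarity = total_similarity / chunk_matches
    match_boost = min(0.2, chunk_matches * 0.02)
    return avg_similarity * (1 + match_boost)


# Example: 5 matching chunks averaging 0.70 similarity -> 0.70 * 1.10 = 0.77
```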
Actionable comments posted: 2
♻️ Duplicate comments (5)
python/src/server/services/crawling/page_storage_operations.py (2)
118-123: Remove the bare Exception handler in store_pages.

Catching Exception here still violates the service guidelines and swallows unexpected failures, leaving callers with an empty url_to_page_id and no signal that page storage failed. Drop this block (or re-raise) and rely on the specific APIError handling so upstream code can react appropriately.

```diff
-        except Exception as e:
-            safe_logfire_error(
-                f"Unexpected error upserting pages | source_id={source_id} | attempted={len(pages_to_insert)} | error={str(e)}"
-            )
-            logger.error(f"Unexpected error upserting pages for source {source_id}: {e}", exc_info=True)
-            # Don't raise - allow chunking to continue
+        except Exception:
+            raise
```
217-223: Avoid swallowing unexpected errors when storing llms-full sections.

Same issue as above: the bare Exception handler hides real failures and leaves downstream processes thinking the upsert succeeded. Remove it (or re-raise) so only the specific database error path is tolerated.

```diff
-        except Exception as e:
-            safe_logfire_error(
-                f"Unexpected error upserting sections | base_url={base_url} | attempted={len(pages_to_insert)} | error={str(e)}"
-            )
-            logger.error(f"Unexpected error upserting sections for {base_url}: {e}", exc_info=True)
-            # Don't raise - allow process to continue
+        except Exception:
+            raise
```

python/src/server/api_routes/pages_api.py (3)
130-133: Use specific Supabase exceptions in list_pages.

The broad except Exception was already flagged—please catch APIError (and other expected types) explicitly and let unexpected errors bubble. This keeps to the service guidelines and preserves stack traces.

```diff
-    except Exception as e:
+    except APIError as e:
         logger.error(f"Error listing pages for source {source_id}: {e}", exc_info=True)
         safe_logfire_error(f"Failed to list pages | source_id={source_id} | error={str(e)}")
         raise HTTPException(status_code=500, detail=f"Failed to list pages: {str(e)}") from e
+    except ValueError as e:
+        logger.error(f"Invalid page data for source {source_id}: {e}", exc_info=True)
+        raise HTTPException(status_code=500, detail=f"Invalid page data: {str(e)}") from e
```
155-167: Return 404 for missing pages and remove the dead if not result.data check.

Postgrest's .single() raises APIError (code PGRST116) when no row exists, so the if not result.data branch never runs. Worse, the exception is caught by the bare handler and exposed as a 500. Catch APIError, translate the not-found case to a 404, and drop the redundant check.

```diff
-        result = client.table("archon_page_metadata").select("*").eq("url", url).single().execute()
-
-        if not result.data:
-            raise HTTPException(status_code=404, detail=f"Page not found for URL: {url}")
+        result = client.table("archon_page_metadata").select("*").eq("url", url).single().execute()

         # Handle large pages
         page_data = _handle_large_page_content(result.data.copy())

         return PageResponse(**page_data)

     except HTTPException:
         raise
-    except Exception as e:
+    except APIError as e:
+        if e.code == "PGRST116" and "no rows" in (e.message or "").lower():
+            raise HTTPException(status_code=404, detail=f"Page not found for URL: {url}") from e
+        logger.error(f"Error getting page by URL {url}: {e}", exc_info=True)
+        safe_logfire_error(f"Failed to get page by URL | url={url} | error={str(e)}")
+        raise HTTPException(status_code=500, detail=f"Failed to get page: {str(e)}") from e
+    except ValueError as e:
         logger.error(f"Error getting page by URL {url}: {e}", exc_info=True)
         safe_logfire_error(f"Failed to get page by URL | url={url} | error={str(e)}")
         raise HTTPException(status_code=500, detail=f"Failed to get page: {str(e)}") from e
```
187-199: Fix not-found handling for /pages/{page_id} and drop the redundant null check.

Same issue as the URL lookup: .single() raises APIError when no page exists, so the if not result.data branch never executes. The caller currently receives a 500 instead of a 404. Catch APIError, translate the not-found case, and avoid the bare Exception.

```diff
-        result = client.table("archon_page_metadata").select("*").eq("id", page_id).single().execute()
-
-        if not result.data:
-            raise HTTPException(status_code=404, detail=f"Page not found: {page_id}")
+        result = client.table("archon_page_metadata").select("*").eq("id", page_id).single().execute()

         # Handle large pages
         page_data = _handle_large_page_content(result.data.copy())

         return PageResponse(**page_data)

     except HTTPException:
         raise
-    except Exception as e:
+    except APIError as e:
+        if e.code == "PGRST116" and "no rows" in (e.message or "").lower():
+            raise HTTPException(status_code=404, detail=f"Page not found: {page_id}") from e
+        logger.error(f"Error getting page {page_id}: {e}", exc_info=True)
+        safe_logfire_error(f"Failed to get page | page_id={page_id} | error={str(e)}")
+        raise HTTPException(status_code=500, detail=f"Failed to get page: {str(e)}") from e
+    except ValueError as e:
         logger.error(f"Error getting page {page_id}: {e}", exc_info=True)
         safe_logfire_error(f"Failed to get page | page_id={page_id} | error={str(e)}")
         raise HTTPException(status_code=500, detail=f"Failed to get page: {str(e)}") from e
```
🧹 Nitpick comments (1)
python/src/server/services/search/rag_service.py (1)
197-197: Consider using highest-similarity chunk instead of first chunk.

The current implementation uses the first chunk's content as best_chunk_content, which may not be the most representative. Consider selecting the chunk with the highest similarity score instead.

Apply this diff:

```diff
+                    "best_similarity": result.get("similarity_score", 0.0),
                     "best_chunk_content": result.get("content", ""),
```

Then update the accumulation logic:

```diff
                 page_groups[group_key]["chunk_matches"] += 1
                 page_groups[group_key]["total_similarity"] += result.get("similarity_score", 0.0)
+                # Track the best chunk by similarity
+                current_similarity = result.get("similarity_score", 0.0)
+                if current_similarity > page_groups[group_key]["best_similarity"]:
+                    page_groups[group_key]["best_similarity"] = current_similarity
+                    page_groups[group_key]["best_chunk_content"] = result.get("content", "")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- python/src/server/api_routes/pages_api.py (1 hunks)
- python/src/server/services/crawling/page_storage_operations.py (1 hunks)
- python/src/server/services/search/rag_service.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
- python/src/server/services/search/rag_service.py
- python/src/server/services/crawling/page_storage_operations.py
- python/src/server/api_routes/pages_api.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/services/**/*.py: For batch operations, report both success count and a detailed failure list
On external API calls, retry with exponential backoff and fail with a clear message after retries
Place business logic in service layer modules (API Route → Service → Database pattern)
Files:
- python/src/server/services/search/rag_service.py
- python/src/server/services/crawling/page_storage_operations.py
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
- python/src/server/services/search/rag_service.py
- python/src/server/services/crawling/page_storage_operations.py
- python/src/server/api_routes/pages_api.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
- python/src/server/services/search/rag_service.py
- python/src/server/services/crawling/page_storage_operations.py
- python/src/server/api_routes/pages_api.py
python/src/server/api_routes/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Place HTTP route handlers in the API routes directory
Files:
python/src/server/api_routes/pages_api.py
🧬 Code graph analysis (3)
python/src/server/services/search/rag_service.py (2)
python/src/server/api_routes/knowledge_api.py (1)
perform_rag_query (1110-1143)
python/src/server/config/logfire_config.py (1)
set_attribute(178-180)
python/src/server/services/crawling/page_storage_operations.py (2)
python/src/server/config/logfire_config.py (3)
get_logger (137-147), safe_logfire_error (239-251), safe_logfire_info (224-236)
python/src/server/services/crawling/helpers/llms_full_parser.py (1)
parse_llms_full_sections(75-189)
python/src/server/api_routes/pages_api.py (2)
python/src/server/config/logfire_config.py (2)
get_logger (137-147), safe_logfire_error (239-251)
python/src/server/services/client_manager.py (1)
get_supabase_client(15-43)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Backend Tests (Python + pytest)
🔇 Additional comments (2)
python/src/server/services/search/rag_service.py (2)
329-344: LGTM! Page grouping logic with graceful fallback.

The implementation correctly handles the page-level return mode with a graceful fallback to chunks when page_ids are not present. The logging provides good visibility for the fallback behavior.
245-245: LGTM! Return mode parameter added correctly.

The return_mode parameter is added with an appropriate default value of "chunks" to maintain backward compatibility.
```python
        except Exception as e:
            logger.warning(
                f"Unexpected error updating chunk_count for page {page_id}: {e}", exc_info=True
```
Let truly unexpected update failures surface.
update_page_chunk_count should not catch bare Exception; re-raise after logging so programming/config bugs don’t get silently muted.
```diff
-        except Exception as e:
-            logger.warning(
-                f"Unexpected error updating chunk_count for page {page_id}: {e}", exc_info=True
-            )
+        except Exception:
+            raise
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
        except Exception:
            raise
```
🤖 Prompt for AI Agents
In python/src/server/services/crawling/page_storage_operations.py around lines
245 to 247, the except block catches a bare Exception and only logs it which
masks programming/configuration errors; modify the handler to re-raise the
exception after logging (e.g., log with exc_info=True and then call raise) or
remove the broad except entirely so unexpected failures propagate, ensuring only
expected/handled errors are swallowed if explicitly intended.
```python
            # Query page by page_id if available, otherwise by URL
            if data["page_id"]:
                page_info = (
                    self.supabase_client.table("archon_page_metadata")
                    .select("id, url, section_title, word_count")
                    .eq("id", data["page_id"])
                    .maybe_single()
                    .execute()
                )
            else:
                # Regular pages - exact URL match
                page_info = (
                    self.supabase_client.table("archon_page_metadata")
                    .select("id, url, section_title, word_count")
                    .eq("url", data["url"])
                    .maybe_single()
                    .execute()
                )
```
Refactor to use a single bulk query and add error handling.
The current implementation queries the database once per page in a loop, creating an N+1 query problem that degrades performance when many pages are involved. Additionally, database queries lack error handling, which violates the coding guidelines requiring proper error handling and logging for database operations.
Apply this diff to fetch all page metadata in a single bulk query with error handling:
- page_results = []
- for group_key, data in page_groups.items():
+ # Collect all page_ids and URLs for bulk query
+ page_ids_to_fetch = [data["page_id"] for data in page_groups.values() if data["page_id"]]
+ urls_to_fetch = [data["url"] for data in page_groups.values() if not data["page_id"] and data["url"]]
+
+ # Fetch all page metadata in bulk
+ page_metadata_map = {}
+ try:
+ if page_ids_to_fetch:
+ page_results_by_id = (
+ self.supabase_client.table("archon_page_metadata")
+ .select("id, url, section_title, word_count")
+ .in_("id", page_ids_to_fetch)
+ .execute()
+ )
+ for page in page_results_by_id.data or []:
+ page_metadata_map[page["id"]] = page
+
+ if urls_to_fetch:
+ page_results_by_url = (
+ self.supabase_client.table("archon_page_metadata")
+ .select("id, url, section_title, word_count")
+ .in_("url", urls_to_fetch)
+ .execute()
+ )
+ for page in page_results_by_url.data or []:
+ page_metadata_map[page["url"]] = page
+ except Exception as e:
+ logger.error(f"Failed to fetch page metadata | page_ids={len(page_ids_to_fetch)} | urls={len(urls_to_fetch)} | error={str(e)}", exc_info=True)
+ return []
+
+ page_results = []
+ for group_key, data in page_groups.items():
avg_similarity = data["total_similarity"] / data["chunk_matches"]
match_boost = min(0.2, data["chunk_matches"] * 0.02)
aggregate_score = avg_similarity * (1 + match_boost)
- # Query page by page_id if available, otherwise by URL
- if data["page_id"]:
- page_info = (
- self.supabase_client.table("archon_page_metadata")
- .select("id, url, section_title, word_count")
- .eq("id", data["page_id"])
- .maybe_single()
- .execute()
- )
- else:
- # Regular pages - exact URL match
- page_info = (
- self.supabase_client.table("archon_page_metadata")
- .select("id, url, section_title, word_count")
- .eq("url", data["url"])
- .maybe_single()
- .execute()
- )
-
- if page_info and page_info.data is not None:
+ # Look up page metadata from bulk query results
+ lookup_key = data["page_id"] if data["page_id"] else data["url"]
+ page_info_data = page_metadata_map.get(lookup_key)
+
+ if page_info_data:
page_results.append({
- "page_id": page_info.data["id"],
- "url": page_info.data["url"],
- "section_title": page_info.data.get("section_title"),
- "word_count": page_info.data.get("word_count", 0),
+ "page_id": page_info_data["id"],
+ "url": page_info_data["url"],
+ "section_title": page_info_data.get("section_title"),
+ "word_count": page_info_data.get("word_count", 0),
"chunk_matches": data["chunk_matches"],
"aggregate_similarity": aggregate_score,
"average_similarity": avg_similarity,
"source_id": data["source_id"],
})
+ else:
+ logger.warning(f"Page metadata not found | group_key={group_key} | page_id={data['page_id']} | url={data['url']}")As per coding guidelines.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-        # Query page by page_id if available, otherwise by URL
-        if data["page_id"]:
-            page_info = (
-                self.supabase_client.table("archon_page_metadata")
-                .select("id, url, section_title, word_count")
-                .eq("id", data["page_id"])
-                .maybe_single()
-                .execute()
-            )
-        else:
-            # Regular pages - exact URL match
-            page_info = (
-                self.supabase_client.table("archon_page_metadata")
-                .select("id, url, section_title, word_count")
-                .eq("url", data["url"])
-                .maybe_single()
-                .execute()
-            )
+        # Collect all page_ids and URLs for bulk query
+        page_ids_to_fetch = [data["page_id"] for data in page_groups.values() if data["page_id"]]
+        urls_to_fetch = [data["url"] for data in page_groups.values() if not data["page_id"] and data["url"]]
+
+        # Fetch all page metadata in bulk
+        page_metadata_map = {}
+        try:
+            if page_ids_to_fetch:
+                page_results_by_id = (
+                    self.supabase_client.table("archon_page_metadata")
+                    .select("id, url, section_title, word_count")
+                    .in_("id", page_ids_to_fetch)
+                    .execute()
+                )
+                for page in page_results_by_id.data or []:
+                    page_metadata_map[page["id"]] = page
+
+            if urls_to_fetch:
+                page_results_by_url = (
+                    self.supabase_client.table("archon_page_metadata")
+                    .select("id, url, section_title, word_count")
+                    .in_("url", urls_to_fetch)
+                    .execute()
+                )
+                for page in page_results_by_url.data or []:
+                    page_metadata_map[page["url"]] = page
+        except Exception as e:
+            logger.error(
+                f"Failed to fetch page metadata | page_ids={len(page_ids_to_fetch)} | "
+                f"urls={len(urls_to_fetch)} | error={e}",
+                exc_info=True
+            )
+            return []
+
+        page_results = []
+        for group_key, data in page_groups.items():
+            avg_similarity = data["total_similarity"] / data["chunk_matches"]
+            match_boost = min(0.2, data["chunk_matches"] * 0.02)
+            aggregate_score = avg_similarity * (1 + match_boost)
+
+            # Look up page metadata from bulk query results
+            lookup_key = data["page_id"] if data["page_id"] else data["url"]
+            page_info_data = page_metadata_map.get(lookup_key)
+
+            if page_info_data:
+                page_results.append({
+                    "page_id": page_info_data["id"],
+                    "url": page_info_data["url"],
+                    "section_title": page_info_data.get("section_title"),
+                    "word_count": page_info_data.get("word_count", 0),
+                    "chunk_matches": data["chunk_matches"],
+                    "aggregate_similarity": aggregate_score,
+                    "average_similarity": avg_similarity,
+                    "source_id": data["source_id"],
+                })
+            else:
+                logger.warning(
+                    f"Page metadata not found | group_key={group_key} | "
+                    f"page_id={data['page_id']} | url={data['url']}"
+                )
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- python/src/server/services/crawling/helpers/llms_full_parser.py (1 hunks)
- python/src/server/services/storage/base_storage_service.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Never accept corrupted data in batch/continuation paths: skip failed items entirely (e.g., do not store zero embeddings, null FKs, or malformed JSON)
Use specific exception types; avoid catching bare Exception
Preserve full stack traces in logs (use exc_info=True with Python logging)
Never return None to indicate failure; raise an exception with details
Authentication/authorization failures must halt the operation and be clearly surfaced
Service startup, missing configuration, database connection, or critical dependency failures should crash fast with clear errors
During crawling/batch/background tasks and WebSocket events, continue processing other items but log failures with context
Include context (operation intent, relevant IDs/URLs/data) in error messages
Pydantic should raise on data corruption or validation errors; do not accept invalid inputs
Files:
- python/src/server/services/storage/base_storage_service.py
- python/src/server/services/crawling/helpers/llms_full_parser.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/services/**/*.py: For batch operations, report both success count and a detailed failure list
On external API calls, retry with exponential backoff and fail with a clear message after retries
Place business logic in service layer modules (API Route → Service → Database pattern)
Files:
- python/src/server/services/storage/base_storage_service.py
- python/src/server/services/crawling/helpers/llms_full_parser.py
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
- python/src/server/services/storage/base_storage_service.py
- python/src/server/services/crawling/helpers/llms_full_parser.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use MyPy for type checking to ensure type safety
python/src/**/*.py: On service startup, missing configuration, DB connection failures, auth/authorization failures, critical dependency outages, or invalid/corrupting data: fail fast and bubble errors
For batch processing, background tasks, WebSocket events, optional features, and external API calls: continue processing but log errors (with retries/backoff for APIs)
Never accept or persist corrupted data; skip failed items entirely (e.g., zero embeddings, null FKs, malformed JSON)
Error messages must include operation context, IDs/URLs, use specific exception types, preserve full stack traces (logging with exc_info=True), and avoid returning None/null—raise exceptions instead; for batches report success counts and detailed failures
Backend code targets Python 3.12 and adheres to a 120 character line length
Use Ruff for linting (errors, warnings, unused imports) in backend code
Use Mypy for static type checking in backend code
Files:
- python/src/server/services/storage/base_storage_service.py
- python/src/server/services/crawling/helpers/llms_full_parser.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Backend Tests (Python + pytest)
Actionable comments posted: 0
🧹 Nitpick comments (1)
python/tests/server/services/test_llms_full_parser.py (1)
1-195: Well-structured test suite with good edge case coverage. The test suite effectively covers the core functionality of the LLMs-full.txt parser:
- Slug generation with punctuation/spacing edge cases
- URL construction with section ordering
- Single and multiple section parsing
- Edge cases: no H1 headers, H2 boundaries, empty sections, consecutive H1s, whitespace handling
The tests are clear, well-named, and follow pytest conventions. The anchor format (`#section-{order}-{slug}`) correctly matches the parser implementation.

Optional enhancements:

1. **Test code block handling**: The parser ignores H1 headers inside code blocks. Consider adding:

````python
def test_h1_inside_code_block_ignored():
    """Test that H1 inside code blocks are not section boundaries"""
    content = """# Real Section

Some content here.

```python
# This is not a section
def foo():
    pass
```

More content in the real section.
"""
    sections = parse_llms_full_sections(content, "https://example.com/llms-full.txt")
    assert len(sections) == 1
    assert sections[0].section_title == "# Real Section"
````

2. **Test code block split merging**: The parser merges sections split by unclosed code blocks (parser lines 200-234). Consider testing this behavior.

3. **Test small section combining**: The parser combines consecutive sections <200 chars (parser lines 237-259). While `test_empty_sections_skipped` and `test_consecutive_h1_headers` work around this by using longer content, an explicit test would verify the combining logic.

4. **Parameterize slug tests**: Lines 14-21 could use `pytest.mark.parametrize` for conciseness:

```python
@pytest.mark.parametrize("input_heading,expected_slug", [
    ("# Core Concepts", "core-concepts"),
    ("# Getting Started!", "getting-started"),
    ("# API Reference (v2)", "api-reference-v2"),
    # ...
])
def test_create_section_slug(input_heading, expected_slug):
    assert create_section_slug(input_heading) == expected_slug
```
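A sketch of what the combining test from item 3 might look like, assuming the under-200-character merge behavior described above really does collapse the two sections into one (the section contents are made up):

```python
def test_consecutive_small_sections_combined():
    """Two consecutive tiny H1 sections should be merged into a single section."""
    content = """# Install

Run the installer.

# Verify

Check the reported version.
"""
    sections = parse_llms_full_sections(content, "https://example.com/llms-full.txt")

    # Both sections are far below the ~200-character threshold the parser reportedly
    # uses, so the expectation is one combined section rather than two.
    assert len(sections) == 1
```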
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
python/tests/server/services/test_llms_full_parser.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
python/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/**/*.py: Target Python 3.12 with a 120-character line length
Use Ruff for linting (errors, warnings, unused imports)
Files:
python/tests/server/services/test_llms_full_parser.py
python/tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Write backend tests with Pytest (including async support)
Files:
python/tests/server/services/test_llms_full_parser.py
🧬 Code graph analysis (1)
python/tests/server/services/test_llms_full_parser.py (1)
python/src/server/services/crawling/helpers/llms_full_parser.py (3)
- create_section_slug (23-56)
- create_section_url (59-72)
- parse_llms_full_sections (75-262)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Backend Tests (Python + pytest)
* Initial commit for RAG by document
* Phase 2
* Adding migrations
* Fixing page IDs for chunk metadata
* Fixing unit tests, adding tool to list pages for source
* Fixing page storage upsert issues
* Max file length for retrieval
* Fixing title issue
* Fixing tests
…ts (#767) The /api/health endpoint reported concurrency.active: 0 while workflows were running because background workflows bypass the ConversationLockManager. Now the endpoint queries running workflows from the DB and merges them with lock manager stats, deduplicating shared conversation IDs. Changes: - Added getRunningWorkflows() DB function returning running workflow details - Enriched /api/health to combine lock manager + DB workflow sources - Updated and expanded health endpoint tests for merge/dedup behavior Fixes #767 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ts (coleam00#767) The /api/health endpoint reported concurrency.active: 0 while workflows were running because background workflows bypass the ConversationLockManager. Now the endpoint queries running workflows from the DB and merges them with lock manager stats, deduplicating shared conversation IDs. Changes: - Added getRunningWorkflows() DB function returning running workflow details - Enriched /api/health to combine lock manager + DB workflow sources - Updated and expanded health endpoint tests for merge/dedup behavior Fixes coleam00#767 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull Request
Summary
Moved from chunk-level to page-level retrieval to fit better with how RAG works best with AI coding assistants and code documentation.
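As a rough illustration of what a page-level result looks like after this change, here is a sketch assembled from the field names used in the rag_service code quoted in the review comments above; all values are made up.

```python
# Illustrative values only; field names mirror the dict built in the reviewed
# rag_service code, actual contents depend on the crawl and the query.
page_result = {
    "page_id": "2b9f0c7e-0000-0000-0000-000000000000",  # row id in archon_page_metadata
    "url": "https://example.com/docs/getting-started",
    "section_title": "Getting Started",
    "word_count": 1840,
    "chunk_matches": 4,                 # matching chunks grouped onto this page
    "aggregate_similarity": 0.87,       # average similarity with a small match-count boost
    "average_similarity": 0.81,
    "source_id": "example.com",
}
```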
Changes Made
Type of Change
Affected Services
Testing
New backend tests for RAG (MCP and API).
Test Evidence
All unit tests, including the new ones, are passing.
Checklist
Breaking Changes
Old documentation will still work, but for optimal results everyone should recrawl.
Summary by CodeRabbit
New Features
Documentation
Tests