Add database performance indexes with comprehensive testing#617
vzwjustin wants to merge 1 commit into coleam00:main
Conversation
Walkthrough

Adds a single SQL migration that creates two concurrent, conditional composite indexes on archon_tasks(project_id, status) and archon_crawled_pages(source_id, chunk_number), with comments noting existing indexes to avoid duplication. No application code or public interfaces changed.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 1
🧹 Nitpick comments (2)
migration/add_performance_indexes.sql (2)
7-8: Good index; optionally include the sort column if it's common.

If queries frequently do `WHERE project_id = ? AND status = ? ORDER BY created_at DESC`, extend the index to cover the ordering. Apply if warranted:

```diff
-CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_archon_tasks_project_status
-  ON archon_tasks(project_id, status);
+CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_archon_tasks_project_status
+  ON archon_tasks(project_id, status, created_at DESC);
```
12-13: Validate that ordering by chunk_number across a source is actually desired.

`chunk_number` is typically per-URL; ordering by it across all URLs under a `source_id` can be misleading. If the common pattern is per-source, per-URL ordered chunk reads, consider indexing `(source_id, url, chunk_number)` (or `(source_id, created_at DESC)` if recency queries dominate). Example alternative:

```diff
-CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_archon_crawled_pages_source_chunk
-  ON archon_crawled_pages(source_id, chunk_number);
+CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_archon_crawled_pages_source_url_chunk
+  ON archon_crawled_pages(source_id, url, chunk_number);
```
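To make the reviewer's point concrete, a small sketch (Python's `sqlite3` as a stand-in, with made-up sample rows) of why per-URL chunk reads usually want `url` in the sort key:

```python
import sqlite3

# sqlite3 stand-in with made-up rows; two URLs under one source_id.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE archon_crawled_pages ("
    " source_id TEXT, url TEXT, chunk_number INTEGER, content TEXT)"
)
conn.executemany(
    "INSERT INTO archon_crawled_pages VALUES (?, ?, ?, ?)",
    [
        ("src1", "https://a.example/doc", 0, "a0"),
        ("src1", "https://a.example/doc", 1, "a1"),
        ("src1", "https://b.example/doc", 0, "b0"),
        ("src1", "https://b.example/doc", 1, "b1"),
    ],
)
# Ordering by chunk_number alone interleaves chunks from different URLs
# (ties on chunk_number come back in an unspecified order). Ordering by
# (url, chunk_number) keeps each document's chunks contiguous, which is
# the access pattern an index on (source_id, url, chunk_number) serves.
per_url = [
    r[2]
    for r in conn.execute(
        "SELECT url, chunk_number, content FROM archon_crawled_pages"
        " WHERE source_id = ? ORDER BY url, chunk_number",
        ("src1",),
    )
]
print(per_url)
```

Each document's chunks come back contiguous and in order, which is what a chunked-document reader typically needs.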
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
migration/add_performance_indexes.sql (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Chillbruhhh
PR: coleam00/Archon#378
File: python/src/server/services/storage/document_storage_service.py:304-306
Timestamp: 2025-08-20T19:38:04.097Z
Learning: The archon_crawled_pages table in the Archon project has a table-level unique constraint on (url, chunk_number) defined inline in the CREATE TABLE statement in migration/complete_setup.sql at line 202, which allows upsert operations with on_conflict="url,chunk_number" to work properly without requiring additional migrations.
Applied to files:
migration/add_performance_indexes.sql
🔇 Additional comments (1)
migration/add_performance_indexes.sql (1)
1-4: Ensure the migration runs outside a transaction (required for CONCURRENTLY).

Many runners wrap SQL files in a single transaction; `CREATE INDEX CONCURRENTLY` will fail in that case. Confirm your pipeline executes this migration non-transactionally, or mark the migration as "no-transaction." If that isn't possible, split the CREATEs into a separate non-transactional migration.
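One way to catch this in CI is a small guard. The sketch below is a hypothetical helper (not part of this migration) that flags SQL files which must be executed outside a transaction block:

```python
import re

def needs_no_transaction(sql_text: str) -> bool:
    """Hypothetical CI guard: Postgres refuses CREATE/DROP INDEX
    CONCURRENTLY inside a transaction block, so any file containing
    CONCURRENTLY must run with autocommit (e.g. set
    connection.autocommit = True in psycopg2 before executing it)."""
    return re.search(r"\bCONCURRENTLY\b", sql_text, re.IGNORECASE) is not None

migration = (
    "CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_archon_tasks_project_status\n"
    "  ON archon_tasks(project_id, status);\n"
)
print(needs_no_transaction(migration))  # → True
```

Note that plain `psql file.sql` runs each statement in autocommit mode, which is fine, but invoking it with `--single-transaction` (`-1`) would wrap the file in BEGIN/COMMIT and make the CONCURRENTLY statements fail.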
✅ Database Migration Implementation:
- Added 2 optimized performance indexes for frequently queried fields
- idx_archon_tasks_project_status: tasks by project + status
- idx_archon_crawled_pages_source_chunk: pages by source + chunk (covers source-only queries)
- Dropped the redundant idx_archon_crawled_pages_source_id index to reduce overhead

✅ Production Safety Features:
- Uses CONCURRENTLY for zero-downtime index creation and removal
- Uses IF NOT EXISTS / IF EXISTS for proper duplicate handling
- Only adds missing indexes and removes redundant ones

✅ Comprehensive Testing Completed:
- Set up a local Supabase test environment
- Created the full Archon schema in the test database
- Successfully executed the migration script
- Verified all indexes were created correctly
- Confirmed the Archon server starts and runs with the new indexes
- Validated database connectivity and service initialization

Performance improvements ready for production deployment.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
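The IF NOT EXISTS idempotence the commit message relies on can be sketched quickly. Python's `sqlite3` stands in for Postgres here (CONCURRENTLY itself is Postgres-only and omitted); the table and index names mirror the migration:

```python
import sqlite3

# Sketch of the idempotence property the migration relies on:
# IF NOT EXISTS lets the script run twice without error.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE archon_tasks (project_id INTEGER, status TEXT)")
for _ in range(2):  # the second run is a no-op thanks to IF NOT EXISTS
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_archon_tasks_project_status"
        " ON archon_tasks(project_id, status)"
    )
names = [
    r[0]
    for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
print(names)
```

Running the block a second time changes nothing, which is the "proper duplicate handling" the migration advertises; DROP INDEX IF EXISTS gives the symmetric guarantee for removals.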
…00#617)

* feat(ci): Migrate all workflows to self-hosted runners

Migrate all applicable CI workflows from GitHub-hosted runners to self-hosted runners per production security requirements.

Workflows Migrated:
- codeql.yml: ubuntu-latest → [self-hosted, vps]
- python-tests.yml: ubuntu-latest → [self-hosted, vps]
- deploy-gateway-agent.yml: ubuntu-latest → [self-hosted, vps]
- integrations-ghcr.yml: ubuntu-latest → [self-hosted, vps]
- sql-policy-lint.yml: ubuntu-latest → [self-hosted, vps]
- yt-dlp-bump.yml: ubuntu-latest → [self-hosted, vps]
- env-preflight.yml: added a note about the windows-latest requirement

Documentation Updated:
- pmoves/docs/PRODUCTION_MERGE_TRACKER.md: added PMOVES.YT PR #1 and a CI infrastructure audit section
- pmoves/docs/PRODUCTION_READINESS_AUDIT_2026-02-07.md: added Section 6 (CI/CD Infrastructure) and CI issues
- pmoves/docs/CI_INFRASTRUCTURE_AUDIT_2026-02-08.md: complete CI infrastructure audit and migration documentation

Rationale: production CI should run locally or on self-hosted runners for:
1. Security: code is processed within controlled infrastructure
2. Consistency: same environment as production deployments
3. Compliance: production code is not processed by external systems

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ci): Fix workflow issues found during audit

- codeql.yml: move paths-ignore from the job to the workflow level (GitHub Actions doesn't support paths-ignore at the job level)
- deploy-gateway-agent.yml: add submodules: false to checkout (the gateway agent doesn't need submodules; fixes the e2b submodule error)

These fixes address workflow failures that occurred when migrating to self-hosted runners.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(ci): Update CI infrastructure audit with workflow fixes

- Add a Workflow Fixes Applied section documenting:
  - the codeql.yml paths-ignore placement fix
  - the deploy-gateway-agent.yml submodule checkout fix
  - pmoves-e2b-mcp-server submodule initialization
- Update success criteria to reflect completion status
- Add a Production PR Summary section for PMOVES.AI-Edition-Hardened

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: Update submodule list and add Python cache to gitignore

- Add PMOVES-supabase to the submodule list
- Remove the duplicate PMOVES-crush entry
- Add **/__pycache__/ and *.pyc patterns to ignore Python bytecode
- Remove SurrealDB database files from the git index (runtime data only)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(archon): Add Archon external integration architecture documentation

Document the nested git submodule architecture in Archon's external/ directory:
- PMOVES-Agent-Zero (MCP API for orchestration)
- PMOVES-BoTZ (tools and skills marketplace)
- PMOVES-Deep-Serch (deep research knowledge)
- PMOVES-HiRAG (hybrid RAG retrieval)

Explains standalone operation, communication protocols, and setup requirements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Codex Agent <codex-agent@example.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>