Add database performance indexes with comprehensive testing#617

Closed
vzwjustin wants to merge 1 commit into coleam00:main from vzwjustin:main

@vzwjustin vzwjustin commented Sep 7, 2025

✅ Database Migration Implementation:

  • Added 2 optimized performance indexes for frequently queried fields
  • idx_archon_tasks_project_status: Tasks by project + status
  • idx_archon_crawled_pages_source_chunk: Pages by source + chunk (covers source-only queries)
  • Removed redundant indexes and used canonical naming conventions
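
The "covers source-only queries" note relies on the leftmost-prefix behavior of composite B-tree indexes. A minimal sketch of that behavior follows, under loud assumptions: it uses Python's built-in SQLite as a stand-in for the project's PostgreSQL database, and the table is a hypothetical, trimmed-down version of archon_crawled_pages. The PostgreSQL planner may choose differently on real data, so treat this as an illustration rather than a verification of the migration.

```python
import sqlite3


def plan_for(conn, sql, params=()):
    """Return the EXPLAIN QUERY PLAN detail strings for a query, joined into one line."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail text is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")

# Hypothetical, simplified stand-in for archon_crawled_pages (not the real schema).
conn.execute(
    "CREATE TABLE archon_crawled_pages ("
    " id INTEGER PRIMARY KEY, source_id TEXT, url TEXT,"
    " chunk_number INTEGER, content TEXT)"
)
conn.execute(
    "CREATE INDEX idx_archon_crawled_pages_source_chunk"
    " ON archon_crawled_pages(source_id, chunk_number)"
)

# Filter on the leading column only: the composite index is usable.
source_only = plan_for(
    conn, "SELECT content FROM archon_crawled_pages WHERE source_id = ?", ("src",)
)
# Filter on the trailing column only: the composite index does not help here.
chunk_only = plan_for(
    conn, "SELECT content FROM archon_crawled_pages WHERE chunk_number = ?", (0,)
)

print(source_only)
print(chunk_only)
```

In this sketch the source-only query is planned through the composite index, while the chunk-only query falls back to a table scan, which is why a separate single-column index on source_id would be redundant but one on chunk_number would not be.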

✅ Production Safety Features:

  • Uses CONCURRENTLY so index builds do not block writes to the tables (zero-downtime deployment)
  • Uses IF NOT EXISTS with canonical names for proper duplicate detection
  • Only adds missing indexes, avoids maintenance overhead

✅ Comprehensive Testing Completed:

  • Set up local Supabase test environment
  • Created full Archon schema in test database
  • Successfully executed migration script
  • Verified all indexes created correctly
  • Confirmed Archon server starts and runs with new indexes
  • Validated database connectivity and service initialization

Performance improvements ready for production deployment.

Summary by CodeRabbit

  • Chores
    • Introduced database indexing to improve performance of common views, including faster task filtering by project/status and quicker retrieval of crawled content by source.
    • Deploys safely with zero-downtime migration steps.
    • Users should notice reduced load times in affected lists and views; no action required.

coderabbitai Bot commented Sep 7, 2025

Warning

Rate limit exceeded

@spotty118 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 19 minutes and 4 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 1148dc4 and 906fd6f.

📒 Files selected for processing (1)
  • migration/add_performance_indexes.sql (1 hunks)

Walkthrough

Adds a single SQL migration that creates two concurrent, conditional composite indexes on archon_tasks(project_id, status) and archon_crawled_pages(source_id, chunk_number), with comments noting existing indexes to avoid duplication. No application code or public interfaces changed.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| DB Migration: Performance Indexes (migration/add_performance_indexes.sql) | Adds two indexes (CONCURRENTLY, IF NOT EXISTS): idx_archon_tasks_project_status on (project_id, status) and idx_archon_crawled_pages_source_chunk on (source_id, chunk_number). Documents existing related indexes to prevent duplication. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I thump my paws on database ground,
Two fresh indexes spin around—
Tasks align by project, status true,
Pages chunk in tidy queue.
Carrots cached, queries fleet,
Burrow paths now swift and neat. 🥕✨

@vzwjustin
Author

@coderabbitai review

coderabbitai Bot commented Sep 7, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@vzwjustin
Author

@coderabbitai review

coderabbitai Bot commented Sep 7, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
migration/add_performance_indexes.sql (2)

7-8: Good index; optionally include sort column if common.

If queries frequently do WHERE project_id = ? AND status = ? ORDER BY created_at DESC, extend the index to cover the order.

Apply if warranted:

-CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_archon_tasks_project_status
-    ON archon_tasks(project_id, status);
+CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_archon_tasks_project_status
+    ON archon_tasks(project_id, status, created_at DESC);
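
To see what the extra sort column buys, the sketch below compares query plans for the two index definitions in the diff above. It is an illustration under stated assumptions: SQLite (via Python's standard library) stands in for PostgreSQL, the archon_tasks table is a hypothetical reduced schema, and the `title` column and `plan_with_index` helper are invented for the example.

```python
import sqlite3

# The query shape named in the review comment above.
QUERY = (
    "SELECT title FROM archon_tasks"
    " WHERE project_id = ? AND status = ?"
    " ORDER BY created_at DESC"
)


def plan_with_index(index_columns):
    """Build a fresh in-memory table with one index and return the query plan text."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified stand-in for archon_tasks (not the real schema).
    conn.execute(
        "CREATE TABLE archon_tasks ("
        " id INTEGER PRIMARY KEY, project_id TEXT, status TEXT,"
        " created_at TEXT, title TEXT)"
    )
    conn.execute(
        f"CREATE INDEX idx_archon_tasks_project_status ON archon_tasks({index_columns})"
    )
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("p1", "todo")).fetchall()
    conn.close()
    # The human-readable detail text is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


two_col = plan_with_index("project_id, status")
three_col = plan_with_index("project_id, status, created_at DESC")

print("2 columns:", two_col)
print("3 columns:", three_col)
```

With the two-column index the plan includes a separate sort step for the ORDER BY; with the three-column index the rows come out of the index already ordered. Whether that is worth the larger index depends on how often the ordered query actually runs, which is the "if warranted" condition in the suggestion.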

12-13: Validate that ordering by chunk_number across a source is actually desired.

chunk_number is typically per-URL; ordering by it across all URLs under a source_id can be misleading. If the common pattern is per-source per-URL ordered chunk reads, consider indexing (source_id, url, chunk_number) (or (source_id, created_at DESC) if recency queries dominate).

Example alternative:

-CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_archon_crawled_pages_source_chunk
-    ON archon_crawled_pages(source_id, chunk_number);
+CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_archon_crawled_pages_source_url_chunk
+    ON archon_crawled_pages(source_id, url, chunk_number);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 012d2c5 and 1148dc4.

📒 Files selected for processing (1)
  • migration/add_performance_indexes.sql (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: Chillbruhhh
PR: coleam00/Archon#378
File: python/src/server/services/storage/document_storage_service.py:304-306
Timestamp: 2025-08-20T19:38:04.097Z
Learning: The archon_crawled_pages table in the Archon project has a table-level unique constraint on (url, chunk_number) defined inline in the CREATE TABLE statement in migration/complete_setup.sql at line 202, which allows upsert operations with on_conflict="url,chunk_number" to work properly without requiring additional migrations.

Applied to files:

  • migration/add_performance_indexes.sql
🔇 Additional comments (1)
migration/add_performance_indexes.sql (1)

1-4: Ensure migration runs outside a transaction (required for CONCURRENTLY).

Many runners wrap SQL files in a single transaction; CREATE INDEX CONCURRENTLY will fail in that case. Confirm your pipeline executes this migration non-transactionally or mark the migration as "no-transaction." If not possible, split into a non-transactional migration for the CREATEs.
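
One way to act on this is a pre-flight check that flags migration files which must not be wrapped in a transaction. The following is a rough sketch only: the helper names are hypothetical, it is not part of Archon or of any migration runner, and the comment stripping is deliberately naive (it does not understand string literals).

```python
import re

# CREATE/DROP INDEX CONCURRENTLY cannot run inside a transaction block in PostgreSQL.
CONCURRENT_INDEX = re.compile(
    r"\b(?:CREATE|DROP)\s+(?:UNIQUE\s+)?INDEX\s+CONCURRENTLY\b", re.IGNORECASE
)
# An explicit transaction opened inside the file itself.
EXPLICIT_TXN = re.compile(
    r"^\s*(?:BEGIN|START\s+TRANSACTION)\b", re.IGNORECASE | re.MULTILINE
)


def strip_sql_comments(sql):
    """Remove /* block */ and -- line comments (naive: ignores string literals)."""
    sql = re.sub(r"/\*.*?\*/", "", sql, flags=re.DOTALL)
    return re.sub(r"--[^\n]*", "", sql)


def needs_non_transactional_run(sql):
    """True if the migration contains a concurrent index build or drop."""
    return bool(CONCURRENT_INDEX.search(strip_sql_comments(sql)))


def has_explicit_transaction(sql):
    """True if the migration opens its own transaction with BEGIN or START TRANSACTION."""
    return bool(EXPLICIT_TXN.search(strip_sql_comments(sql)))


migration = """
-- Performance indexes
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_archon_tasks_project_status
    ON archon_tasks(project_id, status);
"""

if needs_non_transactional_run(migration):
    print("Run this migration outside a transaction.")
if has_explicit_transaction(migration):
    print("Conflict: file opens a transaction around a concurrent index operation.")
```

A pipeline could run such a check before choosing how to execute each file, failing early instead of letting the CREATE INDEX CONCURRENTLY statement error at deploy time.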

Comment thread migration/add_performance_indexes.sql
✅ Database Migration Implementation:
- Added 2 optimized performance indexes for frequently queried fields
- idx_archon_tasks_project_status: Tasks by project + status
- idx_archon_crawled_pages_source_chunk: Pages by source + chunk (covers source-only queries)
- Dropped redundant idx_archon_crawled_pages_source_id index to reduce overhead

✅ Production Safety Features:
- Uses CONCURRENTLY for zero-downtime deployment and index removal
- Uses IF NOT EXISTS/IF EXISTS for proper duplicate handling
- Only adds missing indexes, removes redundant ones

✅ Comprehensive Testing Completed:
- Set up local Supabase test environment
- Created full Archon schema in test database
- Successfully executed migration script
- Verified all indexes created correctly
- Confirmed Archon server starts and runs with new indexes
- Validated database connectivity and service initialization

Performance improvements ready for production deployment.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@vzwjustin vzwjustin closed this Sep 7, 2025
POWERFULMOVES added a commit to POWERFULMOVES/PMOVES-Archon that referenced this pull request Feb 12, 2026
…00#617)

* feat(ci): Migrate all workflows to self-hosted runners

Migrate all applicable CI workflows from GitHub-hosted runners
to self-hosted runners per production security requirements.

**Workflows Migrated:**
- codeql.yml: ubuntu-latest → [self-hosted, vps]
- python-tests.yml: ubuntu-latest → [self-hosted, vps]
- deploy-gateway-agent.yml: ubuntu-latest → [self-hosted, vps]
- integrations-ghcr.yml: ubuntu-latest → [self-hosted, vps]
- sql-policy-lint.yml: ubuntu-latest → [self-hosted, vps]
- yt-dlp-bump.yml: ubuntu-latest → [self-hosted, vps]
- env-preflight.yml: Added note about windows-latest requirement

**Documentation Updated:**
- pmoves/docs/PRODUCTION_MERGE_TRACKER.md: Added PMOVES.YT PR #1,
  CI infrastructure audit section
- pmoves/docs/PRODUCTION_READINESS_AUDIT_2026-02-07.md: Added
  Section 6 (CI/CD Infrastructure) and CI issues
- pmoves/docs/CI_INFRASTRUCTURE_AUDIT_2026-02-08.md: Complete
  CI infrastructure audit and migration documentation

**Rationale:**
Production CI should run locally or on self-hosted runners for:
1. Security: Code processed within controlled infrastructure
2. Consistency: Same environment as production deployments
3. Compliance: Production code not processed by external systems

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(ci): Fix workflow issues found during audit

- codeql.yml: Move paths-ignore from job to workflow level
  (GitHub Actions doesn't support paths-ignore at job level)

- deploy-gateway-agent.yml: Add submodules: false to checkout
  (Gateway agent doesn't need submodules; fixes e2b submodule error)

These fixes address workflow failures that occurred when migrating
to self-hosted runners.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(ci): Update CI infrastructure audit with workflow fixes

- Add Workflow Fixes Applied section documenting:
  - codeql.yml paths-ignore placement fix
  - deploy-gateway-agent.yml submodule checkout fix
  - pmoves-e2b-mcp-server submodule initialization
- Update success criteria to reflect completion status
- Add Production PR Summary section for PMOVES.AI-Edition-Hardened

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: Update submodule list and add Python cache to gitignore

- Add PMOVES-supabase to submodule list
- Remove duplicate PMOVES-crush entry
- Add **/__pycache__/ and *.pyc patterns to ignore Python bytecode
- Remove SurrealDB database files from git index (runtime data only)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(archon): Add Archon external integration architecture documentation

Document the nested git submodule architecture in Archon's external/ directory:
- PMOVES-Agent-Zero (MCP API for orchestration)
- PMOVES-BoTZ (tools and skills marketplace)
- PMOVES-Deep-Serch (deep research knowledge)
- PMOVES-HiRAG (hybrid RAG retrieval)

Explains standalone operation, communication protocols, and setup requirements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Codex Agent <codex-agent@example.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>