Add semantic search via pgvector + Ollama by cmeans · Pull Request #40 · cmeans/mcp-awareness

cmeans · 2026-03-23T21:36:39Z

Summary

New semantic_search tool finds entries by meaning using pgvector cosine similarity
EmbeddingProvider protocol with OllamaEmbedding and NullEmbedding implementations
Fire-and-forget embedding generation on write tools (remember, learn_pattern, add_context, report_alert, report_status, update_entry)
Separate embeddings table with HNSW index, CASCADE delete, unique per entry+model
Optional Ollama Docker service under embeddings profile
Graceful degradation: everything works without an embedding provider configured
276 tests (+37 new), all passing; ruff, mypy, format clean

QA

Prerequisites

pip install -e ".[dev]"
Deploy to test instance on alternate port (AWARENESS_PORT=8421)

Manual tests (via MCP tools)

- semantic_search without provider returns helpful error
```
semantic_search(query="retirement planning")
```
Expected: JSON with status: "error" and message mentioning AWARENESS_EMBEDDING_PROVIDER

remember creates entry (existing behavior unchanged)

remember(source="test", tags=["finance"], description="401k contribution strategy for 2026")

Expected: status: "ok" with entry id

- get_knowledge still works by tags
```
get_knowledge(tags=["finance"])
```
Expected: returns the entry from step 2
- With Ollama configured (AWARENESS_EMBEDDING_PROVIDER=ollama):
```
docker compose --profile embeddings up -d
docker exec awareness-ollama ollama pull nomic-embed-text
```
Restart mcp-awareness with AWARENESS_EMBEDDING_PROVIDER=ollama

remember generates embedding

remember(source="test", tags=["finance"], description="Roth IRA conversion ladder strategy")

Expected: status: "ok" — embedding generated silently

- semantic_search finds by meaning
```
semantic_search(query="retirement planning")
```
Expected: returns entries about 401k and Roth IRA with similarity scores
- semantic_search with filters
```
semantic_search(query="retirement", source="test", mode="list")
```
Expected: list mode results with similarity scores, filtered to source "test"
- semantic_search with entry_type filter
```
semantic_search(query="retirement", entry_type="note")
```
Expected: only note entries returned

🤖 Generated with Claude Code

Adds vector similarity search as an optional, additive capability. Existing tag-based search is unchanged. When an embedding provider is configured, write tools auto-generate embeddings fire-and-forget, and the new semantic_search tool finds entries by meaning. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codecov · 2026-03-23T21:37:48Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

- get_knowledge now supports created_after/created_before for filtering by creation time (distinct from since/until which filter by last update) - Documented and isolated _prompt_manager._prompts private access in custom prompt sync — no public remove API exists in FastMCP Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Ollama service container in GitHub Actions with nomic-embed-text model - 7 integration tests that exercise the real embedding pipeline: OllamaEmbedding.embed(), is_available(), similarity ordering, write-and-search round trip, embedding generation on write - Tests skip gracefully when Ollama is not available (local dev) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…lama - Store: semantic_search since/until filter tests - Server: embed-returns-empty and embed-raises-exception error path tests - Embeddings: unreachable Ollama returns is_available=False test - 292 tests total Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cmeans

QA Review — PR #40: Add semantic search via pgvector + Ollama

Automated tests

294/294 pass (PR says 276 — subsequent commits added more)
ruff: clean
mypy: clean

Manual tests (8/8 pass)

Tested both without and with Ollama provider via isolated MCP instances.

#	Test	Result
1	`semantic_search` without provider → structured error	✅ Clear error mentioning `AWARENESS_EMBEDDING_PROVIDER`
2	`remember` works without provider (no regression)	✅ `status: "ok"`
3	`get_knowledge` by tags (no regression)	✅ Returns entry
4	Configure Ollama provider	✅ Restarted with `AWARENESS_EMBEDDING_PROVIDER=ollama`
5	`remember` generates embedding	✅ Verified embedding row in `embeddings` table
6	`semantic_search` finds by meaning	✅ "retirement planning" → "Roth IRA conversion ladder strategy" (similarity: 0.39)
7	`semantic_search` with source filter + list mode	✅ List mode with similarity scores
8	`semantic_search` with entry_type filter	✅ Only notes returned

Code review

Architecture: Clean separation — embeddings.py defines the EmbeddingProvider protocol, OllamaEmbedding (stdlib urllib, no SDK dep), and NullEmbedding. Server uses _generate_embedding fire-and-forget wrapper. Store protocol extended with 3 methods. Good.

Schema: embeddings table with VECTOR(768), UNIQUE(entry_id, model), ON DELETE CASCADE, text_hash for staleness detection. Alembic migration includes HNSW index. Good.

Graceful degradation: NullEmbedding.is_available() returns False, so all write tools silently skip embedding generation. semantic_search returns a clear error. No functional regression without a provider. Verified manually.

Ollama integration: Uses /api/embed (batch endpoint), /api/tags for availability check. Model name matching handles :latest suffix. Configurable timeout. Good.

Docker Compose: Ollama service under embeddings profile with resource limits (1 CPU, 2GB). HNSW index parameters (m=16, ef_construction=64) are reasonable for the expected scale.

CI: Ollama service added with model pull step and wait loop. AWARENESS_OLLAMA_URL env var. Good.

New filters: created_after/created_before in get_knowledge — clean addition, SQL-level, parameterized.

Docs: CHANGELOG, README (27 tools, test count), CLAUDE.md (architecture), data-dictionary (embeddings table). Good.

Findings

1. HNSW index missing from inline DDL (functional gap)

The Alembic migration creates the HNSW index:

CREATE INDEX IF NOT EXISTS idx_embeddings_vector_hnsw ON embeddings
    USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

But the inline DDL in postgres_store.py (lines 77-90) only creates the table and idx_embeddings_entry — no HNSW index. New installs that don't run Alembic will get sequential scans on every semantic_search query.

Suggest: add the HNSW index creation to the inline DDL block.

2. `VECTOR(768)` hardcoded vs configurable dimensions

Both inline DDL and Alembic migration hardcode VECTOR(768). But AWARENESS_EMBEDDING_DIMENSIONS env var allows changing it. If someone configures a different model with different dimensions (e.g., 384 for all-MiniLM-L6-v2), upsert_embedding will fail with a Postgres type error.

Options:

Document that 768 is the only supported dimension (simplest)
Use VECTOR without dimension constraint (flexible but loses type safety)
Make the migration/DDL dimension-aware

3. `_generate_embedding` is synchronous, not truly fire-and-forget (observation)

_generate_embedding calls provider.embed() — a synchronous HTTP request to Ollama — inline before returning the tool response. On my test setup this added ~100-200ms per write. The PR description says "fire-and-forget (never blocks responses)" but it does block.

For v1 this is fine (Ollama is local, latency is tolerable). If it becomes an issue, a background thread or queue would make it truly non-blocking.

4. Pre-existing entries don't get embeddings (observation, expected)

The 401k entry created without Ollama didn't get an embedding (verified in DB). Only entries written after configuring a provider are embedded. A backfill mechanism (get_entries_without_embeddings exists in the store but isn't exposed as a tool) would be useful for onboarding.

5. Test count discrepancy (nit)

PR says 276, actual is 294. Likely accumulated from subsequent commits (5 commits on the branch). Update the PR body/CHANGELOG if desired.

Verdict

Finding #1 (missing HNSW index in inline DDL) is the substantive issue — it creates a functional gap for non-Alembic installs. Finding #2 (hardcoded dimensions) is worth a decision. #3-#5 are observations.

The feature is well-implemented with good graceful degradation. Requesting changes for the HNSW index gap (#1).

…traint - Add HNSW index to _create_tables (finding #1: non-Alembic installs) - Document VECTOR(768) hardcoded dimension in data-dictionary (finding #2) - Clarify embedding-on-write is synchronous, background planned for Phase 2 - Update test count to 294 in CHANGELOG and README Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cmeans

QA Re-review — PR #40 (round 2)

Fix verification (commit `7e9c8f9`)

All findings addressed:

HNSW index in inline DDL — ✅ Added. Matches Alembic migration (m=16, ef_construction=64).
Dimension constraint documented — ✅ Data-dictionary notes VECTOR(768) is hardcoded and AWARENESS_EMBEDDING_DIMENSIONS doesn't alter the column type.
Sync embedding clarified — ✅ Changelog updated: "currently synchronous; background generation planned for Phase 2".
Test count — ✅ Updated to 294 in README and CHANGELOG.

Automated tests

294/294 pass
ruff + mypy clean

Verdict

All clear. Ready for QA Approved label.

cmeans · 2026-03-23T22:34:30Z

Review response

Thanks for the thorough review — 8/8 manual QA passing is great to see.

Finding #1: HNSW index missing from inline DDL ✅ Fixed

Added the HNSW index to _create_tables so non-Alembic installs (and tests) get it too.

Finding #2: Hardcoded VECTOR(768) ✅ Documented

Documented in data-dictionary that VECTOR(768) is intentionally hardcoded to match nomic-embed-text. AWARENESS_EMBEDDING_DIMENSIONS configures the provider but doesn't alter the column type — both DDL and migration must be updated to change dimensions.

Finding #3: Synchronous embedding on write ✅ Acknowledged

Updated CHANGELOG to clarify this is synchronous in v1. Background thread/queue for truly non-blocking generation is a Phase 2 item.

Finding #4: Pre-existing entries not backfilled ✅ Phase 2

get_entries_without_embeddings exists in the store protocol — exposing it as a backfill_embeddings tool is planned for the follow-up PR.

Finding #5: Test count ✅ Fixed

Updated CHANGELOG and README to 294.

cmeans and others added 4 commits March 23, 2026 16:42

Fix suppression skip test, cover _generate_embedding edge cases

5265ec3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cmeans commented Mar 23, 2026

View reviewed changes

cmeans added the QA Approved Manual QA testing completed and passed label Mar 23, 2026

cmeans merged commit 8da0886 into main Mar 23, 2026
9 checks passed

cmeans deleted the semantic-search branch March 23, 2026 22:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add semantic search via pgvector + Ollama#40

Add semantic search via pgvector + Ollama#40
cmeans merged 6 commits into
mainfrom
semantic-search

cmeans commented Mar 23, 2026 •

edited

Loading

Uh oh!

codecov Bot commented Mar 23, 2026 •

edited

Loading

Uh oh!

cmeans left a comment

Uh oh!

cmeans left a comment

Uh oh!

cmeans commented Mar 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

cmeans commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

QA

Prerequisites

Manual tests (via MCP tools)

Uh oh!

codecov Bot commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

cmeans left a comment

Choose a reason for hiding this comment

QA Review — PR #40: Add semantic search via pgvector + Ollama

Automated tests

Manual tests (8/8 pass)

Code review

Findings

1. HNSW index missing from inline DDL (functional gap)

2. VECTOR(768) hardcoded vs configurable dimensions

3. _generate_embedding is synchronous, not truly fire-and-forget (observation)

4. Pre-existing entries don't get embeddings (observation, expected)

5. Test count discrepancy (nit)

Verdict

Uh oh!

cmeans left a comment

Choose a reason for hiding this comment

QA Re-review — PR #40 (round 2)

Fix verification (commit 7e9c8f9)

Automated tests

Verdict

Uh oh!

cmeans commented Mar 23, 2026

Review response

Finding #1: HNSW index missing from inline DDL ✅ Fixed

Finding #2: Hardcoded VECTOR(768) ✅ Documented

Finding #3: Synchronous embedding on write ✅ Acknowledged

Finding #4: Pre-existing entries not backfilled ✅ Phase 2

Finding #5: Test count ✅ Fixed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

cmeans commented Mar 23, 2026 •

edited

Loading

codecov Bot commented Mar 23, 2026 •

edited

Loading

2. `VECTOR(768)` hardcoded vs configurable dimensions

3. `_generate_embedding` is synchronous, not truly fire-and-forget (observation)

Fix verification (commit `7e9c8f9`)