Fix JSON content, connection resilience, RAG Phase 2#43
Conversation
- remember/update_entry: re-serialize content to string when Pydantic deserializes JSON objects/arrays before the str validator runs - PostgresStore: auto-healing _conn property with 30s health check debounce. Dead connections reconnect transparently on next access. - 4 new tests (298 total) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bug fixes: - remember/update_entry: re-serialize content when Pydantic deserializes JSON objects/arrays before str validator runs - PostgresStore: auto-healing _conn property with 30s health check debounce RAG Phase 2: - Background embedding via ThreadPoolExecutor (2 workers, non-blocking writes) - backfill_embeddings tool: embed pre-existing entries + re-embed stale ones - hint param on get_knowledge: semantic re-ranking of tag-filtered results - get_stale_embeddings store method for changed-since-embedding detection - 304 tests, 28 tools Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New tool (29 total) traverses related_ids in entry data: - Forward: entries this entry references - Reverse: entries that reference this entry (JSONB containment query) - Deduplicates bidirectional links - Supports list mode Convention: store related_ids: ["id1", "id2"] in data when using remember or learn_pattern. No schema change needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- deployment-guide: Ollama service setup, embeddings profile, config table - README: 29 tools listed, semantic search section, config vars, entry relationships, intentions, read/action tracking in feature lists - CLAUDE.md: embedding pipeline, connection resilience, entry relationships - data-dictionary: related_ids convention, connection resilience note - vision.md: current stats, awareness-canvas introduction and repo link - from-metrics-to-mental-models.md: updated disclaimer for v0.10.0 - collation-layer.md: updated disclaimer (evaluation field, intentions) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Split into Server, Embedding, and Docker Compose subsections. Added: AWARENESS_EMBEDDING_DIMENSIONS, POSTGRES_PASSWORD, AWARENESS_PG_DATA, AWARENESS_OLLAMA_DATA, CLOUDFLARED_CONFIG, CLOUDFLARED_CREDS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…Getting started Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…de, reconnect - update_entry: JSON content, source, content_type field tests - backfill_embeddings: stale refresh path test - hint: list mode with similarity scores test - Connection resilience: OperationalError reconnect path test - Fix integration test race: wait for background embedding threads - 315 tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_do_embed now uses a dedicated psycopg connection (via store.dsn) instead of the shared store._conn. Prevents InFailedSqlTransaction errors when the background thread and main thread race on the same connection. Same pattern as _do_cleanup in PostgresStore. Also: _conn health check now rollbacks before SELECT 1 to clear any prior INERROR state, and catches DatabaseError (not just OperationalError/InterfaceError). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three catch-all blocks in backfill_embeddings and hint re-ranking exist as safety nets for unpredictable failures. Marking them as excluded from coverage since they're intentionally untestable. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cmeans
left a comment
There was a problem hiding this comment.
QA Review — PR #43: Fix JSON content, connection resilience, RAG Phase 2
Automated tests
- 315/315 pass (PR says 304 — subsequent commits added more)
- ruff: clean
- mypy: clean
Manual tests (6/6 pass)
All executed via isolated MCP instance with Ollama enabled.
| # | Test | Result |
|---|---|---|
| 1 | JSON dict content accepted | ✅ remember with content={"key":"value"} — stored as JSON string. Verified in DB. |
| 2 | Connection resilience | |
| 3 | backfill_embeddings |
✅ Returns new: 0, remaining: 0 — background thread pool already embedded entries. Verified in DB: all 3 entries have embeddings. |
| 4 | semantic_search finds entries |
✅ "retirement planning" → 401k (0.56) > Roth IRA (0.37) > Config (0.24). Correct semantic ordering. |
| 5 | hint re-ranks get_knowledge |
✅ tags=["finance"], hint="retirement savings" — results include similarity scores, 401k ranked above Roth IRA. |
| 6 | Stale embedding detection | ✅ After update_entry with new description, background thread re-embedded within milliseconds. backfill_embeddings shows 0 stale (already refreshed). Verified in DB: text_hash and created updated. |
Code review
Background embedding (key improvement): ThreadPoolExecutor(max_workers=2) replaces synchronous _generate_embedding. _do_embed runs in the pool with its own DB connection (via psycopg.connect(store.dsn)) to avoid racing with the main thread's shared connection. Entry data is serialized to primitives before submission (no mutable sharing). Good.
Connection resilience: _conn is now a @property that health-checks every 30s. Pattern: check closed → rollback (clear INERROR state) → SELECT 1 → rollback. On failure: close old, create new. The rollback-before-ping is important — without it, a connection in INERROR state would fail the health check every time. Good.
JSON content fix: if not isinstance(content, str): content = json.dumps(content) in both remember and update_entry. Handles the Pydantic deserialization edge case where JSON strings arrive as dicts. Simple, correct.
backfill_embeddings tool: Two phases — missing entries via get_entries_without_embeddings, then stale entries via get_stale_embeddings. Stale detection compares text_hash(compose_embedding_text(entry)) against stored emb_text_hash. Clean.
hint parameter: Re-ranks by doing a semantic_search with the hint text, then sorting the original entries by their similarity scores. Entries without embeddings sort to the end (-1.0). similarity field added to both list and full output when hint is active. Good.
get_related tool: Bidirectional traversal — forward via data.related_ids list, reverse via JSONB containment data->'related_ids' @> %s::jsonb. Deduplicates. Not in the QA manual tests but code-reviewed; the convention relies on callers storing related_ids in data.
Docs: Comprehensive README restructure — new config tables (Server, Embedding, Docker Compose), new sections (Semantic search, 29 tools), updated Current status. CLAUDE.md updated with embedding pipeline and connection resilience docs. CHANGELOG entries. Good.
Findings
1. Test count discrepancy (nit)
PR says 304, CHANGELOG says 310, actual is 315. Update to match.
2. get_related not in QA manual tests (observation)
The PR adds get_related tool (28 → 29 tools) but it's not listed in the QA section. PR summary also doesn't mention it. It was code-reviewed and the logic is straightforward (forward + reverse JSONB lookup, dedup), but it wasn't manually tested through MCP.
3. hint over-fetches from semantic_search (minor)
The hint implementation calls semantic_search with limit=len(entries) + 10 but no filters — it searches ALL entries, not just the ones that matched the tag filter. For large stores, this could return similarity scores for unrelated entries. Currently harmless (the re-sort only applies scores to entries already in the result set), but the over-fetch is unnecessary work. Could pass the same source/tags filters through.
Verdict
No substantive issues. Finding #1 is a nit, #2 is an observation (get_related is simple enough for code review), #3 is a minor optimization opportunity. The three headline features (JSON fix, connection resilience, background embedding) are all well-implemented.
Ready for QA Approved label.
Review responseFinding #1: Test count ✅ FixedUpdated CHANGELOG and README to 315. Finding #2: get_related not in QA ✅ AcknowledgedSimple bidirectional JSONB lookup — code review sufficient. Will include in QA for future PRs that touch it. Finding #3: hint over-fetches ✅ Fixed
|
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The round-trip test was asserting ranking after only one entry had an embedding (thread scheduling race). Now waits for both entries to appear in search results before checking similarity ordering. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
remember/update_entryaccept JSON objects/arrays — re-serialized to string when Pydantic deserializes before str validator_connproperty with 30s health check debounce. Dead connections reconnect transparently after Postgres restarts.backfill_embeddingstool (28 total): Embeds pre-existing entries + re-embeds stale ones (text changed since embedding)hintparam onget_knowledge: Semantic re-ranking of tag-filtered results by similarity to a natural language phraseget_stale_embeddingsfinds entries whose content changed after their embedding was generatedQA
Prerequisites
pip install -e ".[dev]"AWARENESS_PORT=8421)nomic-embed-textmodelManual tests (via MCP tools)
Expected:
status: "ok"— content stored as JSON stringExpected: returns results after transparent reconnect
Expected:
new_embeddings> 0,remaining= 0Expected: returns entries that existed before Ollama was configured
Expected: results include
similarityscores, retirement-related entries ranked firstExpected:
refreshed_embeddings>= 1🤖 Generated with Claude Code