Memory v3 — storage, read loop, and write path (P2–P4), all flag-gated#31990

Merged: velissa-ai merged 21 commits into main from velissa-ai/memory-v3-build on May 25, 2026

Conversation

@velissa-ai
Collaborator

Summary

Builds memory v3 (the retrieval-loop redesign) as a new assistant/src/memory/v3/ namespace beside memory/v2/. Everything is additive and flag-gated; with the config.memory.v3 gates (enabled, shadow, write.*) at their defaults of false, nothing changes in production. v2 is byte-for-byte untouched (empty diff on memory/v2/).

  • P2 — storage + traversal: tree-node format + store, DAG tree-index, compositional index rendering, parallel-fan-out traversal w/ cycle guards, validator, and read-only assistant memory v3 validate|tree CLI/routes.
  • P3 — read loop (shadow): scouts (hot/sparse/dense) → fast filter → scout-seeded tree-walk → edge expansion → gate, composed into runRetrievalLoop; exposed as a P1-harness Retriever (compare route, gated on v3.enabled) and a live-shadow memoryRetrieval middleware (gated on v3.enabled && v3.shadow, injects v2, logs v3 as mode='v3_shadow').
  • P4 — write path: v3 job types + config; co-activation logging; weighted/decaying auto-edge learning; and v3 consolidation that drains the shared memory/buffer.md into the tree while preserving essentials/threads/recent exactly as v2 (scheduler retargets only when v3.write.enabled).

Out of scope (per plan): P5 cutover + v2 retirement, plugin extraction, and the by-hand v2→v3 page migration.

Production-safety (verified by self-review)

With v3.enabled/shadow/write.* all false: shadow middleware is a pure pass-through, the compare route excludes v3, the consolidation scheduler still enqueues memory_v2_consolidate, and nothing auto-enqueues v3 jobs. Migrations 262/263 only create empty tables.

Self-review result

  • Plan faithfulness: PASS. Repo integration: PASS (tsc/lint clean, generate:openapi no-diff, v3 tests pass per-file).
  • Known follow-ups (not addressed here — all in flag-gated, production-inert code):
    • Auto-edge learning ships inert end-to-end (intentional build-phase phasing → activates at cutover/P5): memory_v3_edge_learning is dispatched but never enqueued; co-activation used is never reconciled to true; the loop calls expandEdges without aboveThreshold's extraAdjacency; consolidation doesn't yet consume promotion candidates.
    • Cross-PR slop: forced-tool LLM scaffolding duplicated across filter.ts/gate.ts/tree-walk.ts; v3 consolidation prompt-override resolver is a verbatim fork of v2's. Behavior-preserving consolidation deferred to avoid regression risk on tested inert code.
    • Live shadow covers per-turn + context-load retrieval (both flow through the memoryRetrieval pipeline); post-compaction cached-block reinjection bypasses the pipeline but performs no new retrieval, so nothing is missed.

PRs merged into this feature branch

#31971, #31972, #31973, #31974, #31975, #31976, #31977, #31978, #31979, #31980, #31981, #31982, #31983, #31984, #31985, #31986, #31987, #31988, #31989 (19 PRs).

Part of plan: memory-v3-build.md

velissa-ai and others added 19 commits May 25, 2026 02:39
Co-authored-by: Vellum Assistant <assistant@vellum.ai>
@velissa-ai velissa-ai self-assigned this May 25, 2026
@velissa-ai
Collaborator Author

Plan papertrail: Memory v3 — Storage, Read Loop, and Write Path (P2–P4)

Overview

Builds memory v3 — the retrieval-loop redesign — in-codebase as a memory/v3/ namespace beside today's memory/v2/, covering three phases from the design doc (.private/plans/memory-retrieval-architecture-v2.md): P2 the on-disk tree/DAG storage format + traversal primitives (code only), P3 the read loop (scouts → filter → parallel tree fan-out → edge expansion → gate), shadow-measured against the v2 router via the already-shipped P1 harness, and P4 the write path (capture, consolidation, self-improving index) running additively behind a flag. Everything is additive and flag-gated; v2 stays fully live.

Out of scope (do NOT build here): P5 (the pilot-assistant cutover + deleting v2's retrieval/write path — the only hard gate) and the later/optional plugin extraction. Also out of scope: the v2→v3 page migration (reclassifying the ~1,600 pages into the tree, authoring _index.md nodes) — that is a parallel data track owned by the user, done by hand against P2's format. This plan only produces the format and code that track populates against; it never populates the corpus.

Shared standing-context substrate (preserved, NOT forked). The files memory/buffer.md (capture buffer) and memory/essentials.md / threads.md / recent.md (standing-context digests) are shared substrate, orthogonal to the concept-page retrieval redesign. v3 reuses them — it does not create a memory/v3/buffer.md or its own meta-files. They are injected two ways today and v3 leaves both untouched: (1) the always-on static block ## Essentials / ## Threads / ## Recent / ## Buffer via readMemoryV2StaticContent() (memory/v2/static-context.ts), spliced into the user message on first-message / post-compaction turns; (2) loadNowText() (memory/v2/now-text.ts) concatenates essentials/threads/recent into nowText, the standing-context input to the router/loop (already a field on the harness RetrievalInput). Concept-page retrieval (the v3 loop) layers selected pages on top of this always-on context. Maintenance of these files (rewrite recent.md, promote essentials/threads, trim buffer.md) is owned by consolidation — v3 consolidation preserves it identically (PR 19). This is why the write path collapses to "v3 consolidation drains the shared buffer," not a parallel v3 capture pipeline.

Grounding anchors (verified against current main):

  • Page store: assistant/src/memory/v2/page-store.ts (getConceptsDir → <ws>/memory/concepts/, readPage/writePage/listPages/pageExists/slugify/validateSlug, segment slug regex ^[a-z0-9](?:[a-z0-9-]*)$, renderPageContent); frontmatter FRONTMATTER_REGEX from src/skills/frontmatter.ts + yaml.parse.
  • Page/frontmatter types: assistant/src/memory/v2/types.ts:37-62 (ConceptPageFrontmatterSchema strict: edges/ref_files/ref_urls/summary; ConceptPage = {slug,frontmatter,body}).
  • Page index: assistant/src/memory/v2/page-index.ts:49-77 (PageIndex = entries/bySlug/byId/rendered), getPageIndex(ws), invalidatePageIndex.
  • Edge index: assistant/src/memory/v2/edge-index.ts (getEdgeIndex, getReachable(index,slug,hops,dir), validateEdgeTargets).
  • Router (read): assistant/src/memory/v2/router.ts:245 (runRouter → {selectedSlugs,sourceBySlug,failureReason}); injection assistant/src/memory/v2/injection.ts:181 (config.memory.v2.router.enabled gate), entry injectMemoryV2Block; called from conversation-graph-memory.ts:794; compaction conversation-graph-memory.ts:217-290 (onCompacted/reinjectCachedMemory, mode "context-load").
  • Scout substrate: hybridQueryConceptPages (qdrant.ts:707-804), rerankCandidates (reranker.ts:75-179), generateBm25QueryEmbedding (sparse-bm25.ts), dense embed via embedWithBackend.
  • Hot/EMA: computeInjectionScores / recordInjectionEvents (injection-events.ts), table memory_v2_injection_events.
  • Pipeline seam: assistant/src/plugins/types.ts:120-236 (PipelineName incl. "memoryRetrieval", MemoryArgs/MemoryResult, Middleware), plugins/pipeline.ts:99-115 (composeMiddleware), default plugins/defaults/memory-retrieval.ts.
  • LLM call sites: assistant/src/config/schemas/llm.ts:38-81 (LLMCallSiteEnum, has memoryRetrieval,memoryRouter); resolver getConfiguredProvider(callSite) (providers/provider-send-message.ts); shipped defaults assistant/src/config/call-site-defaults.ts.
  • Write path: indexMessageNow (memory/indexer.ts:56-380, called conversation-crud.ts:1070); retrospective memory/memory-retrospective-trigger-check.ts + memory-retrospective-job.ts + memory-retrospective-state.ts (job type memory_retrospective); remember() tool tools/memory/register.ts → memory/graph/tool-handlers.ts:34-70 (v2 appends memory/buffer.md + memory/archive/<date>.md); jobs memory/jobs-store.ts:17-47 (MemoryJobType), worker memory/jobs-worker.ts (processJob switch ~516-627), enqueueMemoryJob/upsertDebouncedJob; consolidation memory/v2/consolidation-job.ts (job memory_v2_consolidate, hands off via runBackgroundJob; drains memory/buffer.md → pages, rewrites recent.md, promotes essentials/threads, trims buffer); page embedding memory/jobs/embed-concept-page.ts (job embed_concept_page, Qdrant+BM25 upsert).
  • Standing context (shared, reused by v3): injected at daemon/conversation-agent-loop.ts:1634-1654 (shouldInjectNowAndPkb = isFirstMessage || compactedThisTurn) via readMemoryV2StaticContent() (memory/v2/static-context.ts, self-gates on config.memory.v2.enabled, blocks ## Essentials/## Threads/## Recent/## Buffer) + loadNowText() (memory/v2/now-text.ts, files essentials.md/threads.md/recent.md); nowText also flows to retrieval at conversation-graph-memory.ts:786.
  • P1 harness (shipped): assistant/src/memory/v2/harness/: Retriever (retriever.ts:71-74), RetrievalInput/RetrievalOutput/RetrievalCost, DescentTrace+sub-types (trace.ts), runComparisonOverHistory (compare.ts), runComparison (runner.ts); compare route runtime/routes/memory-v2-routes.ts:627-653 (retrievers array hardcoded [createRouterRetriever(db)] at ~line 644); CLI cli/commands/memory-v2.ts (assistant memory v2 compare).
  • DB migrations: add to assistant/src/memory/migrations/, register in db-init.ts; idempotent + append-only (CLAUDE.md).

Standing verify for every PR (from CLAUDE.md; run in assistant/ with export PATH="$HOME/.bun/bin:$PATH"): bunx tsc --noEmit; scoped bun test <new test files> (never unscoped); bun run lint; then bunx prettier --write on all changed files (pre-commit hook enforces prettier). Provider-agnostic language in comments/logs ("LLM", not model names). .js import extensions (NodeNext).


PR 1: v3 tree-node format + node store

Depends on

None

Branch

memory-v3/pr-1-node-format

Title

feat(memory-v3): tree-node on-disk format + node store

Files

  • assistant/src/memory/v3/types.ts (new)
  • assistant/src/memory/v3/tree-store.ts (new)
  • assistant/src/memory/v3/__tests__/tree-store.test.ts (new)

Implementation steps

  1. In types.ts, define the v3 tree node — the unit the parallel migration authors by hand. A node is a markdown file with YAML frontmatter (mirror v2/types.ts exactly). Zod TreeNodeFrontmatterSchema (.strict()): children: z.array(z.string()).default([]) (each entry is a child reference — either "page:<page-slug>" for a leaf concept page or "node:<node-id>" for a sub-node; this reference list IS the DAG edge and the portable replacement for filesystem symlinks per the Storage section's "manifest of canonical-path references"), routing_hints: z.string().optional() (thin hand-written cross-branch disambiguation), summary: z.string().optional() (the node's self-description headline; the markdown body is the full self-description). Export TreeNode = { id: string; frontmatter; body } and the inferred frontmatter type.
  2. In tree-store.ts, resolve the v3 tree dir: getTreeDir(workspaceDir) = join(workspaceDir, "memory", "v3", "tree") (do NOT touch memory/concepts/ — pages stay canonical and shared). Node files live at <treeDir>/<node-id>.md where node-id uses the same /-segmented slug rules as pages.
  3. Port the slug machinery from v2/page-store.ts for node ids: validateNodeId (reuse the segment regex ^[a-z0-9](?:[a-z0-9-]*)$, /-separated, ≤200 chars, reject ../backslash/whitespace) and reuse slugify for segment generation. Reserve the root node id "_root".
  4. Implement readNode(ws,id), writeNode(ws,node) (atomic temp+rename, mkdir -p parents, like writePage), deleteNode(ws,id), listNodes(ws) (recursive walk of <treeDir>, return ids in /-form, skip dotfiles/.tmp.*/non-.md, [] if dir missing), and renderNodeContent(node) (frontmatter + body, mirroring renderPageContent). Parse frontmatter with the same FRONTMATTER_REGEX + yaml.parse + schema .parse pipeline as pages.
  5. Tests: round-trip write/read a node; children parse for both page:/node: forms; malformed YAML throws; missing node → null; listNodes walks nested ids; validateNodeId rejects traversal/empty/whitespace; reserved _root.
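The id and child-reference rules in steps 1 and 3 can be sketched as follows. This is a minimal illustration, not the shipped tree-store.ts: isValidSlug and parseChildRef are hypothetical helper names, and treating "_root" as the one id allowed outside the segment regex is an assumption about what "reserve" means here.

```typescript
// Illustrative sketch; the real helpers live in tree-store.ts and may differ.
const SEGMENT_RE = /^[a-z0-9](?:[a-z0-9-]*)$/; // segment regex quoted in the plan
const ROOT_ID = "_root"; // reserved root id (assumed exempt from SEGMENT_RE)
const MAX_ID_LENGTH = 200;

type ChildRef = { kind: "page" | "node"; ref: string };

/** True when every /-separated segment matches the slug regex. */
function isValidSlug(slug: string): boolean {
  if (slug.length === 0 || slug.length > MAX_ID_LENGTH) return false;
  // "..", backslashes, whitespace, and empty segments all fail SEGMENT_RE.
  return slug.split("/").every((segment) => SEGMENT_RE.test(segment));
}

/** Node ids follow the page slug rules, plus the reserved root id. */
function validateNodeId(id: string): boolean {
  return id === ROOT_ID || isValidSlug(id);
}

/** Parses "page:<slug>" or "node:<id>"; returns null for anything else. */
function parseChildRef(entry: string): ChildRef | null {
  if (entry.startsWith("page:")) {
    const ref = entry.slice("page:".length);
    return isValidSlug(ref) ? { kind: "page", ref } : null;
  }
  if (entry.startsWith("node:")) {
    const ref = entry.slice("node:".length);
    return validateNodeId(ref) ? { kind: "node", ref } : null;
  }
  return null;
}
```

Rejecting malformed entries at parse time is a choice made for this sketch only; PR 2 keeps dangling refs in adjacency so that validation can report them.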

Acceptance criteria

  • readNode/writeNode/listNodes/deleteNode/renderNodeContent/validateNodeId exported and unit-tested on a fixture tree dir (tmpdir).
  • No change to memory/concepts/ or any v2 module.
  • bunx tsc --noEmit, scoped bun test, bun run lint, bunx prettier --write all clean.

PR 2: v3 tree index (DAG build + cache)

Depends on

PR 1

Branch

memory-v3/pr-2-tree-index

Title

feat(memory-v3): tree index with DAG adjacency + cache

Files

  • assistant/src/memory/v3/tree-index.ts (new)
  • assistant/src/memory/v3/__tests__/tree-index.test.ts (new)

Implementation steps

  1. Mirror v2/page-index.ts's module-cache pattern. Define TreeIndex = { nodes: Map<string,TreeNode>; childrenByNode: Map<string, ReadonlyArray<ChildRef>>; parentsByNode: Map<string, Set<string>>; pageParents: Map<string, Set<string>>; root: string } where ChildRef = { kind: "page"|"node"; ref: string }.
  2. getTreeIndex(workspaceDir): Promise<TreeIndex>. Run listNodes, then readNode on all nodes in parallel (drop unreadable with a warn, like page-index), parse each children entry into a ChildRef, build forward adjacency (childrenByNode) and reverse adjacency (parentsByNode for node: children, pageParents for page: children). Root = "_root" if present, else the single node with no parents (warn + pick deterministically if ambiguous).
  3. Build resolution is structural only — it does NOT verify that referenced pages/nodes exist (that's validation, PR 5). Dangling refs are retained in adjacency so validation can report them.
  4. Cache per-workspaceDir (module-level Map), and export invalidateTreeIndex(workspaceDir?). Wire invalidateTreeIndex into writeNode/deleteNode in tree-store.ts (import + call after the cache-affecting mutation, exactly as page-store calls invalidatePageIndex).
  5. Tests: build a fixture tree (root → 2 sub-nodes → page leaves; one node referenced by two parents = DAG); assert childrenByNode/parentsByNode/pageParents/root; assert cache hit returns same object and invalidateTreeIndex forces rebuild; ambiguous-root warns and is deterministic.
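A minimal sketch of the adjacency build in steps 1 to 3, working on in-memory nodes instead of the file store. NodeLite and TreeIndexLite are simplified stand-ins for TreeNode and TreeIndex, and choosing the lexicographically smallest parentless node as the ambiguous-root tiebreak is an assumption; the plan only requires that the pick be deterministic.

```typescript
type ChildRef = { kind: "page" | "node"; ref: string };
type NodeLite = { id: string; children: string[] }; // simplified TreeNode

interface TreeIndexLite {
  childrenByNode: Map<string, ReadonlyArray<ChildRef>>;
  parentsByNode: Map<string, Set<string>>;
  pageParents: Map<string, Set<string>>;
  root: string | undefined;
}

function addTo(map: Map<string, Set<string>>, key: string, value: string): void {
  const set = map.get(key) ?? new Set<string>();
  set.add(value);
  map.set(key, set);
}

function buildTreeIndex(nodes: NodeLite[]): TreeIndexLite {
  const childrenByNode = new Map<string, ReadonlyArray<ChildRef>>();
  const parentsByNode = new Map<string, Set<string>>();
  const pageParents = new Map<string, Set<string>>();

  for (const node of nodes) {
    const refs: ChildRef[] = [];
    for (const entry of node.children) {
      const kind = entry.startsWith("page:") ? "page" : entry.startsWith("node:") ? "node" : null;
      if (kind === null) continue; // malformed entry, ignored in this sketch
      const ref = entry.slice(kind.length + 1);
      refs.push({ kind, ref });
      // Structural only: dangling targets are kept so validation can report them.
      addTo(kind === "node" ? parentsByNode : pageParents, ref, node.id);
    }
    childrenByNode.set(node.id, refs);
  }

  // Root is "_root" if present, else the first parentless node in sorted order.
  const parentless = nodes
    .map((n) => n.id)
    .filter((id) => !parentsByNode.has(id))
    .sort();
  const root = childrenByNode.has("_root") ? "_root" : parentless[0];
  return { childrenByNode, parentsByNode, pageParents, root };
}
```

A node listed under two parents simply ends up with two entries in its parentsByNode set, which is what makes the structure a DAG instead of a tree.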

Acceptance criteria

  • getTreeIndex returns correct DAG adjacency (incl. a node with 2 parents) on a fixture tree.
  • writeNode/deleteNode invalidate the tree-index cache.
  • Standing verify clean.

PR 3: compositional index rendering

Depends on

PR 2

Branch

memory-v3/pr-3-index-composition

Title

feat(memory-v3): compose node index from children + routing hints

Files

  • assistant/src/memory/v3/index-composition.ts (new)
  • assistant/src/memory/v3/__tests__/index-composition.test.ts (new)

Implementation steps

  1. Implement the Storage section's "parent index is composed by concatenating its children's descriptions + a thin routing-hints layer" — generated at read time, never stored. Signature: composeNodeIndex(nodeId: string, tree: TreeIndex, pages: PageIndex): string.
  2. For each ChildRef of nodeId: if kind:"node", emit a block "[node:<id>] <child.summary or first line of body>"; if kind:"page", look up pages.bySlug.get(slug) and emit "[page:<slug>] <entry.summary>" (summary already truncated to 200 chars by the page index). Skip refs whose target is absent but record nothing here (validation owns reporting).
  3. Append the node's own routing_hints (if present) under a Routing hints: trailer.
  4. Keep output a plain string block suitable to drop into an LLM descent prompt (this is what the tree-walk model reads per node in PR 10). Deterministic ordering: children in authored order.
  5. Tests: compose a node with mixed node:/page: children → asserts the block contains each child's summary line + routing hints; missing page ref is silently omitted; empty children → just routing hints (or empty string).
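The composition rule in steps 1 to 4 might look like the sketch below. It takes plain maps in place of the real TreeIndex and PageIndex arguments, so the parameter shapes (NodeView, pageSummaries) are assumptions made to keep the example self-contained.

```typescript
type ChildRef = { kind: "page" | "node"; ref: string };
type NodeView = {
  summary?: string;
  body: string;
  routingHints?: string;
  children: ChildRef[];
};

function firstLine(text: string): string {
  const newline = text.indexOf("\n");
  return newline === -1 ? text : text.slice(0, newline);
}

/** Composes a prompt-ready index block at read time; nothing is stored. */
function composeNodeIndex(
  nodeId: string,
  nodes: Map<string, NodeView>,
  pageSummaries: Map<string, string>,
): string {
  const node = nodes.get(nodeId);
  if (node === undefined) return "";
  const lines: string[] = [];
  // Children are emitted in authored order so the output is deterministic.
  for (const child of node.children) {
    if (child.kind === "node") {
      const target = nodes.get(child.ref);
      if (target === undefined) continue; // dangling ref: validation reports it
      lines.push(`[node:${child.ref}] ${target.summary ?? firstLine(target.body)}`);
    } else {
      const summary = pageSummaries.get(child.ref);
      if (summary === undefined) continue; // missing page is silently omitted
      lines.push(`[page:${child.ref}] ${summary}`);
    }
  }
  if (node.routingHints) lines.push(`Routing hints: ${node.routingHints}`);
  return lines.join("\n");
}
```

Because the function only reads its arguments, it stays pure and can be unit-tested without touching the file system.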

Acceptance criteria

  • composeNodeIndex produces a deterministic, prompt-ready index block from a fixture TreeIndex + PageIndex.
  • Pure function (no I/O); fully unit-tested.
  • Standing verify clean.

PR 4: traversal primitives + cycle/visited guards

Depends on

PR 2

Branch

memory-v3/pr-4-traversal

Title

feat(memory-v3): parallel-fan-out traversal with cycle/visited guards

Files

  • assistant/src/memory/v3/traversal.ts (new)
  • assistant/src/memory/v3/__tests__/traversal.test.ts (new)

Implementation steps

  1. Implement the mechanical traversal the read loop will drive (the LLM descend-decision is injected, NOT implemented here — keep this pure/deterministic and testable without a provider).
  2. resolveChildren(tree, nodeId): ChildRef[] — thin accessor over childrenByNode.
  3. walkTree(tree, opts) where opts = { start?: string; seeds?: string[]; breadthBudget: number; maxDepth: number; descend: (nodeId, children) => Promise<ChildRef[]> }. Implements parallel fan-out (Architecture §tree-walk): from start (default root) and any seeds (node ids surfaced by scouts), call descend to choose which child nodes to recurse into (bounded by breadthBudget per level), collect all page: children encountered, and recurse into chosen node: children. Dedup by canonical id with a visited: Set<string> so the DAG (shared sub-nodes) and any accidental cycle terminate — result-dedup is not enough (Storage §true-DAG). Respect maxDepth.
  4. Return { pages: Set<string>; levels: TreeLevel[] } where each TreeLevel matches the P1 trace.ts shape (node,considered,descended,skipped,reasoning,cost?) so the loop can assemble a DescentTrace directly.
  5. Tests (provider-free, descend is a stub): linear descent collects expected leaf pages; DAG (shared sub-node) visited once; injected cycle (node A↔B) terminates; breadthBudget caps descents per level; maxDepth halts; seeds start mid-tree.
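The visited-set guard from step 3 can be illustrated with a level-by-level walk. This is a sketch under stated assumptions: the tree is a plain Map, breadthBudget is applied per visited node, depth 0 is the start node, and the TreeLevel trace output is left out for brevity.

```typescript
type ChildRef = { kind: "page" | "node"; ref: string };
type Descend = (nodeId: string, children: ChildRef[]) => Promise<ChildRef[]>;

interface WalkOptions {
  start: string;
  seeds?: string[];
  breadthBudget: number;
  maxDepth: number;
  descend: Descend;
}

async function walkTree(
  childrenByNode: Map<string, ChildRef[]>,
  opts: WalkOptions,
): Promise<{ pages: Set<string>; visited: Set<string> }> {
  const pages = new Set<string>();
  const visited = new Set<string>();
  let frontier: string[] = [opts.start, ...(opts.seeds ?? [])];

  for (let depth = 0; depth <= opts.maxDepth && frontier.length > 0; depth++) {
    // Claim each node before any async work so shared sub-nodes and cycles
    // are expanded at most once.
    const level = frontier.filter((id) => {
      if (visited.has(id)) return false;
      visited.add(id);
      return true;
    });
    // Fan out: every node on this level is decided in parallel.
    const chosenPerNode = await Promise.all(
      level.map(async (nodeId) => {
        const children = childrenByNode.get(nodeId) ?? [];
        for (const child of children) {
          if (child.kind === "page") pages.add(child.ref);
        }
        const nodeChildren = children.filter((c) => c.kind === "node");
        const chosen = await opts.descend(nodeId, nodeChildren);
        return chosen.filter((c) => c.kind === "node").slice(0, opts.breadthBudget);
      }),
    );
    frontier = [];
    for (const chosen of chosenPerNode) {
      for (const child of chosen) frontier.push(child.ref);
    }
  }
  return { pages, visited };
}
```

Deduplicating only the resulting page set would not be enough here: without the visited check, a cycle such as A↔B would keep re-entering the frontier until maxDepth ran out.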

Acceptance criteria

  • walkTree fans out over a fixture DAG, dedups by canonical id, terminates on cycles, honors breadth/depth budgets, and emits TreeLevel[] compatible with harness/trace.ts.
  • No LLM/provider dependency in this module.
  • Standing verify clean.

PR 5: tree validation

Depends on

PR 3, PR 4

Branch

memory-v3/pr-5-validate

Title

feat(memory-v3): tree validator (orphans, cycles, dangling refs, freshness)

Files

  • assistant/src/memory/v3/validate.ts (new)
  • assistant/src/memory/v3/__tests__/validate.test.ts (new)

Implementation steps

  1. validateTree(workspaceDir): Promise<TreeValidationReport> — the helper the parallel migration runs to check hand-authored structure (design P2: "any light authoring/validation helpers the migration wants").
  2. Build getTreeIndex + getPageIndex + getEdgeIndex. Report:
    • danglingChildRefs: node:/page: children whose target node/page does not exist.
    • orphanPages: pages in PageIndex not reachable from root via walkTree (descend = take all) — i.e. classified-but-unlinked, or not-yet-classified (informational, the migration is in progress).
    • cycles: node cycles detected during a full descent (reuse walkTree's visited logic, but record back-edges).
    • staleIndex: nodes whose own mtime is older than a child node's mtime (compositional index may be out of date) — use getNodeMtimeMs (add a tiny mtime helper to tree-store.ts mirroring getPageMtimeMs).
    • unknownEdgeTargets: reuse validateEdgeTargets(getEdgeIndex(...), knownSlugs) for the page edges: graph.
  3. Return counts + the offending ids; do not throw (it's a report).
  4. Tests: fixtures that trigger each category (dangling ref, orphan page, A→B→A cycle, stale parent mtime, edge to missing slug) and a clean tree → empty report.
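Two of the report categories, danglingChildRefs and cycles, can be sketched with a depth-first walk that tracks the current path. checkStructure and the "from -> to" string format are hypothetical, and the sketch only inspects nodes reachable from the root; the orphan, stale-index, and edge-target checks are not shown.

```typescript
type ChildRef = { kind: "page" | "node"; ref: string };

interface StructureReport {
  danglingChildRefs: string[]; // formatted as "parent -> kind:ref"
  cycles: string[]; // back-edges formatted as "from -> to"
}

function checkStructure(
  root: string,
  childrenByNode: Map<string, ChildRef[]>,
  knownPages: Set<string>,
): StructureReport {
  const report: StructureReport = { danglingChildRefs: [], cycles: [] };
  const done = new Set<string>(); // fully explored nodes
  const onPath = new Set<string>(); // nodes on the current DFS path

  const visit = (nodeId: string): void => {
    onPath.add(nodeId);
    for (const child of childrenByNode.get(nodeId) ?? []) {
      const exists =
        child.kind === "page" ? knownPages.has(child.ref) : childrenByNode.has(child.ref);
      if (!exists) {
        report.danglingChildRefs.push(`${nodeId} -> ${child.kind}:${child.ref}`);
      } else if (child.kind === "node") {
        // A child already on the current path is a back-edge, i.e. a cycle.
        if (onPath.has(child.ref)) report.cycles.push(`${nodeId} -> ${child.ref}`);
        else if (!done.has(child.ref)) visit(child.ref);
      }
    }
    onPath.delete(nodeId);
    done.add(nodeId);
  };

  visit(root);
  return report; // a report, never a throw
}
```

A shared sub-node reached through a second parent is in done but not in onPath, so it is skipped without being flagged as a cycle.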

Acceptance criteria

  • validateTree reports each defect category accurately on fixtures and returns a clean report for a well-formed tree.
  • Standing verify clean.

PR 6: CLI memory v3 validate + tree (read-only route)

Depends on

PR 5

Branch

memory-v3/pr-6-cli-validate

Title

feat(memory-v3): assistant memory v3 validate/tree CLI + route

Files

  • assistant/src/runtime/routes/memory-v3-routes.ts (new)
  • assistant/src/runtime/routes/index.ts (touch — register)
  • assistant/src/cli/commands/memory-v3.ts (new)
  • assistant/src/cli/commands/memory.ts (touch — register v3 subcommand; mirror how v2 is attached)
  • assistant/src/cli/commands/__tests__/memory-v3-render.test.ts (new)
  • assistant/openapi.yaml (regenerate)

Implementation steps

  1. Mirror memory-v2-routes.ts exactly (Zod params, RouteHandlerArgs, ROUTES entry). Add two read-only operations: memory_v3_validate (returns the validateTree report) and memory_v3_tree (returns a serializable view of getTreeIndex — node ids, child refs, root — for printing). Both call getWorkspaceDir(); no LLM, no writes.
  2. New CLI group assistant memory v3 mirroring the v2 command wiring. Subcommands validate and tree call cliIpcCall("memory_v3_validate"|"memory_v3_tree", { body }). Add --json for raw output.
  3. Pure render functions (CLI-side, like memory-v2-compare-render.ts — keep daemon-internal types out of the IPC client per the cli/no-daemon-internals lint rule; import report/tree types type-only): renderValidationReport(report) (counts + offending ids) and renderTree(view) (indented tree print, marking DAG re-entries).
  4. Tests: unit-test the two render functions against sample report/tree objects (no daemon).
  5. Run bun run generate:openapi and commit the regenerated openapi.yaml (two new paths).

Acceptance criteria

  • assistant memory v3 validate and assistant memory v3 tree run against the daemon and print readable output; --json returns raw payloads.
  • Render functions unit-tested; only type-only imports cross into the CLI/IPC client.
  • openapi.yaml regenerated from source; standing verify clean.

PR 7: v3 config schema + LLM call sites

Depends on

None

Branch

memory-v3/pr-7-config-callsites

Title

feat(memory-v3): config schema + cheap/capable LLM call sites

Files

  • assistant/src/config/schemas/llm.ts (touch)
  • assistant/src/config/call-site-defaults.ts (touch)
  • the memory config schema file defining memory.v2 (grep config/schemas/ for v2:/router: — add a sibling v3 object)
  • relevant config __tests__ (touch/add)

Implementation steps

  1. Locate the Zod schema that defines config.memory.v2 (sibling under config/schemas/). Add v3: z.object({...}).default({...}) with: enabled: z.boolean().default(false), shadow: z.boolean().default(false) (live-shadow toggle, used in PR 15), passCap: z.number().int().default(3), breadthBudget: z.number().int().default(6), maxDepth: z.number().int().default(6), denseQuota: z.object({ activeDomain: z.number(), offDomain: z.number() }).default(...), lanes: z.object({ hot: z.boolean().default(true), sparse: z.boolean().default(true), dense: z.boolean().default(true), tree: z.boolean().default(true), edges: z.boolean().default(true) }).default(...) (so the harness can toggle individual lanes to read each one's marginal recall — design P3 "lanes land incrementally"), and ks: z.array(z.number()).default([5,10,25,50]). Default the whole thing so existing configs are untouched (backwards compat).
  2. In llm.ts:38-81, add three call sites to LLMCallSiteEnum: "memoryV3Filter", "memoryV3Descent", "memoryV3Gate" (model tiering — cheap filter+descent, capable gate; design §Model tiering).
  3. In call-site-defaults.ts, add shipped defaults assigning memoryV3Filter/memoryV3Descent → the cost-optimized profile and memoryV3Gate → the balanced profile (match the existing pattern for memoryRouter).
  4. Tests: config parses with no v3 key (defaults applied); explicit v3 overrides parse; the three call sites resolve via resolveCallSiteConfig to expected profiles.
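The default-everything behaviour in step 1 can be pictured with a plain-TypeScript stand-in for the Zod schema. resolveV3Config and shadowActive are hypothetical helpers, and denseQuota is left out because the plan elides its default values.

```typescript
interface LaneToggles {
  hot: boolean;
  sparse: boolean;
  dense: boolean;
  tree: boolean;
  edges: boolean;
}

interface MemoryV3Config {
  enabled: boolean;
  shadow: boolean;
  passCap: number;
  breadthBudget: number;
  maxDepth: number;
  lanes: LaneToggles;
  ks: number[];
}

type MemoryV3Overrides = Partial<Omit<MemoryV3Config, "lanes">> & {
  lanes?: Partial<LaneToggles>;
};

// Defaults as listed in the plan; both gates ship off.
const V3_DEFAULTS: MemoryV3Config = {
  enabled: false,
  shadow: false,
  passCap: 3,
  breadthBudget: 6,
  maxDepth: 6,
  lanes: { hot: true, sparse: true, dense: true, tree: true, edges: true },
  ks: [5, 10, 25, 50],
};

/** Fills in defaults so a config with no v3 key behaves exactly as before. */
function resolveV3Config(raw: MemoryV3Overrides = {}): MemoryV3Config {
  return {
    ...V3_DEFAULTS,
    ...raw,
    lanes: { ...V3_DEFAULTS.lanes, ...raw.lanes },
    ks: [...(raw.ks ?? V3_DEFAULTS.ks)],
  };
}

/** Live shadow runs only when both gates are on (v3.enabled && v3.shadow). */
function shadowActive(config: MemoryV3Config): boolean {
  return config.enabled && config.shadow;
}
```

With this shape, turning off a single lane for a marginal-recall run is a one-field override, for example resolveV3Config({ lanes: { dense: false } }).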

Acceptance criteria

  • config.memory.v3 exists with safe defaults (enabled:false); pre-existing configs validate unchanged.
  • Three v3 call sites resolve through the standard resolver with tiered defaults.
  • Standing verify clean.

PR 8: scouts (hot / sparse / dense)

Depends on

PR 7

Branch

memory-v3/pr-8-scouts

Title

feat(memory-v3): always-on scouts over the v2 substrate

Files

  • assistant/src/memory/v3/scouts.ts (new)
  • assistant/src/memory/v3/__tests__/scouts.test.ts (new)

Implementation steps

  1. Implement the three always-on scout lanes (Architecture §scouts), each reusing v2 substrate and each individually callable/toggleable per config.memory.v3.lanes:
    • hot: computeInjectionScores(db, allSlugs, now) → top-scored slugs. Reuse the existing memory_v2_injection_events table — hot/EMA is corpus-global access-frequency, retriever-agnostic, and v2 keeps writing it while live (no v3 EMA producer needed pre-cutover). Mark hits sticky.
    • sparse: generateBm25QueryEmbedding(queryText) + hybridQueryConceptPages(dense?, sparse, limit, …, {skipSparse:false}) reading sparseScore; flag near-exact (high-score) hits sticky + tree-bypass (Architecture §sparse).
    • dense: dense embed (embedWithBackend + calibration as in activation.ts:134-141) + hybridQueryConceptPages, then apply the asymmetric per-subtree quota (generous active-domain, thin off-domain slice) + MMR for diversity (design §dense). Domain is derived from the page slug's top segment / tree placement.
  2. Signature: runScouts(input: RetrievalInput, deps: { db; tree?: TreeIndex }): Promise<{ scouts: ScoutResult[]; sticky: Set<string>; bypass: Set<string> }> where ScoutResult matches harness/trace.ts (lane,slugs,scoreBySlug?). queryText derived from input.recentTurnPairs last user turn + input.nowText.
  3. No LLM in this module (judging dense is the next PR). Honor input.signal.
  4. Tests: stub db/Qdrant/embed substrate (inject via deps and module mocks like harness-oracle.test.ts patterns) — assert each lane returns expected slugs, sticky/bypass sets populate, dense quota caps off-domain, lane toggles suppress a lane.
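One way to read the asymmetric dense quota is as two counters, one for hits inside the active domain and one for everything else. The sketch below assumes that reading, takes the active domains as a parameter (how they are chosen is not specified here), derives the domain from the slug's top segment, and leaves MMR out entirely.

```typescript
type Scored = { slug: string; score: number };
type DenseQuota = { activeDomain: number; offDomain: number };

/** Domain is taken to be the slug's top segment, e.g. "work/alpha" -> "work". */
function domainOf(slug: string): string {
  const slash = slug.indexOf("/");
  return slash === -1 ? slug : slug.slice(0, slash);
}

/** Keeps the best hits per bucket: generous in-domain, thin off-domain. */
function applyDomainQuota(
  hits: Scored[],
  activeDomains: Set<string>,
  quota: DenseQuota,
): Scored[] {
  const ranked = [...hits].sort((a, b) => b.score - a.score);
  const kept: Scored[] = [];
  let active = 0;
  let off = 0;
  for (const hit of ranked) {
    if (activeDomains.has(domainOf(hit.slug))) {
      if (active < quota.activeDomain) {
        kept.push(hit);
        active++;
      }
    } else if (off < quota.offDomain) {
      kept.push(hit);
      off++;
    }
  }
  return kept; // still in descending score order
}
```

The off-domain slice is what lets a cross-domain association through to the fast filter at all; setting it to zero would turn the lane into a domain gate.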

Acceptance criteria

  • runScouts returns per-lane ScoutResult[] + sticky/bypass sets, reusing v2 substrate, with no live LLM/Qdrant in tests (stubbed).
  • Dense quota + MMR applied; lanes individually toggleable.
  • Standing verify clean.

PR 9: fast LLM filter (dense judgment)

Depends on

PR 8

Branch

memory-v3/pr-9-fast-filter

Title

feat(memory-v3): fast filter judging dense hits (sticky bypass)

Files

  • assistant/src/memory/v3/filter.ts (new)
  • assistant/src/memory/v3/__tests__/filter.test.ts (new)

Implementation steps

  1. Implement the fast filter (Architecture §dense "judgment-filtered, not domain-gated"): one cheap LLM call (getConfiguredProvider("memoryV3Filter")) that, given the conversation context + the bounded dense candidate set (~50–200, never the whole corpus), keeps meaningful cross-domain associations and drops spurious ones. Hot pages and near-exact sparse hits bypass (the sticky/bypass sets from PR 8) and are never judged.
  2. Signature: filterDenseHits(args: { input: RetrievalInput; dense: ScoutResult; sticky: Set<string>; bypass: Set<string> }): Promise<{ kept: string[]; trace: { judged: string[]; dropped: string[] } }>. Force a tool/structured output for the keep/drop decision (mirror the select_pages_to_inject forced-tool pattern in router.ts). If the provider is null/errors, fail open (keep all dense) and set a failureReason the loop can surface.
  3. Provider-agnostic prompt/logs.
  4. Tests: stub provider returns a keep-subset → asserts kept = bypass ∪ judged-kept; provider-null → fail-open keeps all; empty dense → no LLM call.
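The bypass and fail-open behaviour in steps 1 and 2 can be sketched with the LLM call injected as a function. The Judge type, the single merged bypass set, and the failureReason strings are all assumptions for illustration; the real module resolves its provider through getConfiguredProvider("memoryV3Filter").

```typescript
// Stand-in for the forced-tool LLM call: returns the slugs worth keeping.
type Judge = (candidates: string[]) => Promise<string[]>;

interface FilterResult {
  kept: string[];
  judged: string[];
  dropped: string[];
  failureReason?: string;
}

async function filterDenseHits(
  dense: string[],
  bypass: Set<string>, // hot pages and near-exact sparse hits, never judged
  judge: Judge | null,
): Promise<FilterResult> {
  const passthrough = dense.filter((slug) => bypass.has(slug));
  const judged = dense.filter((slug) => !bypass.has(slug));
  // Nothing to judge: skip the LLM call entirely.
  if (judged.length === 0) return { kept: passthrough, judged, dropped: [] };
  // No provider: fail open and keep every dense hit.
  if (judge === null) {
    return { kept: dense, judged: [], dropped: [], failureReason: "provider_unavailable" };
  }
  try {
    const keep = new Set(await judge(judged)); // exactly one call
    return {
      kept: [...passthrough, ...judged.filter((slug) => keep.has(slug))],
      judged,
      dropped: judged.filter((slug) => !keep.has(slug)),
    };
  } catch {
    return { kept: dense, judged: [], dropped: [], failureReason: "provider_error" };
  }
}
```

Failing open trades precision for recall: a provider outage makes the candidate set noisier but never silently empties it.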

Acceptance criteria

  • filterDenseHits issues exactly one cheap LLM call over a bounded candidate set, respects sticky bypass, and fails open.
  • Tests use a stub provider (no real LLM).
  • Standing verify clean.

PR 10: tree-walk model driver

Depends on

PR 3, PR 4, PR 8

Branch

memory-v3/pr-10-tree-walk

Title

feat(memory-v3): scout-seeded tree-walk descent driver

Files

  • assistant/src/memory/v3/tree-walk.ts (new)
  • assistant/src/memory/v3/__tests__/tree-walk.test.ts (new)

Implementation steps

  1. Implement the LLM descend decision that drives PR 4's walkTree. For a given node, build the prompt from composeNodeIndex(nodeId, tree, pages) (PR 3) + the conversation context + the surviving scout hits (so descent is scout-seeded but retains pressure to descend branches scouts missed — design §tree-walk "don't degenerate into a follow-the-scouts ratifier"). One getConfiguredProvider("memoryV3Descent") call per visited node (cheap model), returning which child nodes to descend (bounded by breadthBudget).
  2. Signature: createDescender(args: { input; tree; pages; scouts; seeds }): (nodeId, children) => Promise<ChildRef[]>; plus runTreeWalk(args): Promise<{ pages: Set<string>; levels: TreeLevel[] }> that wires the descender into walkTree with breadthBudget/maxDepth from config.memory.v3, seeding walkTree with scout-surfaced node ids.
  3. Record per-node reasoning + considered/descended/skipped into the TreeLevel (the descender returns reasoning text; this is what makes the recall cliff observable — design §observability "watch work/ get skipped on a personal turn").
  4. Tests: stub descender provider with scripted decisions over a fixture tree → asserts the right leaf pages collected, skipped subtrees recorded, breadth budget enforced, seeds bias the start set. No real LLM.

Acceptance criteria

  • runTreeWalk performs scout-seeded parallel descent over a fixture tree using a stubbed descent model, collecting leaf pages and emitting reasoned TreeLevel[].
  • One descent call per visited node; budgets honored.
  • Standing verify clean.

PR 11: edge-expansion lane

Depends on

PR 7

Branch

memory-v3/pr-11-edge-expansion

Title

feat(memory-v3): curated edge-expansion lane

Files

  • assistant/src/memory/v3/edges.ts (new)
  • assistant/src/memory/v3/__tests__/edges.test.ts (new)

Implementation steps

  1. Implement edge expansion (Architecture §edge-expansion): given a set of confident seed slugs, pull their 1–2 hop neighborhood from the curated edges: graph. Reuse getEdgeIndex(ws) + getReachable(index, slug, hops, "out") from v2/edge-index.ts (no LLM, ~free).
  2. Signature: expandEdges(args: { workspaceDir; seeds: Iterable<string>; hops?: number; extraAdjacency?: ReadonlyMap<string, ReadonlySet<string>> }): Promise<{ pulled: Set<string>; expansions: EdgeExpansion[] }> where EdgeExpansion matches harness/trace.ts (from,pulled). The optional extraAdjacency param is the seam for PR 18 to inject above-threshold weighted auto-edges without modifying this module.
  3. Tests: fixture pages with edges: frontmatter → 1-hop and 2-hop expansion correct; extraAdjacency merges in; cycles in the edge graph don't loop (bounded by hops + visited).
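A hop-bounded expansion with the extraAdjacency seam might look like the following. Unlike the planned signature, this sketch is synchronous and takes the curated adjacency as an argument instead of loading it from workspaceDir via getEdgeIndex, so treat the parameter list as an assumption.

```typescript
type Adjacency = ReadonlyMap<string, ReadonlySet<string>>;
type EdgeExpansion = { from: string; pulled: string[] };

function expandEdges(
  seeds: Iterable<string>,
  curated: Adjacency,
  hops = 1,
  extraAdjacency?: Adjacency,
): { pulled: Set<string>; expansions: EdgeExpansion[] } {
  const pulled = new Set<string>();
  const expansions: EdgeExpansion[] = [];
  // Curated edges and injected auto-edges are merged at lookup time.
  const neighbors = (slug: string): string[] => [
    ...(curated.get(slug) ?? []),
    ...(extraAdjacency?.get(slug) ?? []),
  ];

  for (const seed of new Set(seeds)) {
    const visited = new Set<string>([seed]);
    const reached: string[] = [];
    let frontier = [seed];
    // Bounded by hops and by visited, so cycles in the edge graph terminate.
    for (let hop = 0; hop < hops; hop++) {
      const next: string[] = [];
      for (const slug of frontier) {
        for (const neighbor of neighbors(slug)) {
          if (visited.has(neighbor)) continue;
          visited.add(neighbor);
          next.push(neighbor);
          reached.push(neighbor);
        }
      }
      frontier = next;
    }
    for (const slug of reached) pulled.add(slug);
    expansions.push({ from: seed, pulled: reached });
  }
  return { pulled, expansions };
}
```

Keeping extraAdjacency as a separate argument means weighted auto-edges can be switched on later without touching the curated graph or this function.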

Acceptance criteria

  • expandEdges returns the curated neighborhood with per-seed EdgeExpansion[], accepts injected extra adjacency, and is provider-free.
  • Standing verify clean.
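
The hop-bounded expansion can be sketched as below. This is an illustration only: it takes the curated adjacency as an in-memory map instead of calling getEdgeIndex/getReachable (whose exact behavior is not reproduced here), the name expandEdgesSketch is hypothetical, and the default of 1 hop is an assumption. It shows the two properties the tests ask for: extraAdjacency merges in, and cycles terminate because traversal is bounded by hops plus a visited set.

```typescript
type Adjacency = ReadonlyMap<string, ReadonlySet<string>>;
// Assumed to mirror the (from, pulled) shape named in the plan.
type EdgeExpansion = { from: string; pulled: string[] };

function expandEdgesSketch(args: {
  curated: Adjacency; // stand-in for the curated edge index
  seeds: Iterable<string>;
  hops?: number; // assumption: defaults to 1
  extraAdjacency?: Adjacency; // seam for injected auto-edges
}): { pulled: Set<string>; expansions: EdgeExpansion[] } {
  const hops = args.hops ?? 1;
  const pulled = new Set<string>();
  const expansions: EdgeExpansion[] = [];

  // Outgoing neighbors from both the curated graph and any injected adjacency.
  const neighbors = (slug: string): string[] => [
    ...(args.curated.get(slug) ?? []),
    ...(args.extraAdjacency?.get(slug) ?? []),
  ];

  for (const seed of args.seeds) {
    const visited = new Set<string>([seed]);
    let frontier = [seed];
    // Breadth-first, one hop per iteration; visited prevents revisiting on cycles.
    for (let hop = 0; hop < hops && frontier.length > 0; hop++) {
      const next: string[] = [];
      for (const slug of frontier) {
        for (const n of neighbors(slug)) {
          if (visited.has(n)) continue;
          visited.add(n);
          next.push(n);
        }
      }
      frontier = next;
    }
    visited.delete(seed); // the seed itself is not "pulled"
    const reached = Array.from(visited);
    reached.forEach((slug) => pulled.add(slug));
    expansions.push({ from: seed, pulled: reached });
  }
  return { pulled, expansions };
}
```

Keeping extraAdjacency as a plain read-only map means PR 18 can pass its above-threshold adjacency without this module knowing where the weights come from.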

PR 12: gate (selector; brief deferred)

Depends on

PR 7

Branch

memory-v3/pr-12-gate

Title

feat(memory-v3): gate decision (ready/more) + final selection

Files

  • assistant/src/memory/v3/gate.ts (new)
  • assistant/src/memory/v3/__tests__/gate.test.ts (new)

Implementation steps

  1. Implement the gate (Architecture §gate + §brief): one capable LLM call (getConfiguredProvider("memoryV3Gate")) over the accumulated candidate set (scouts-kept ∪ tree pages ∪ edge-pulled ∪ sticky) that decides ready (finalize selection) vs more (emit follow-up questions to seed the next pass — design: the loop-back query is the gate's generated question, not the original message). The gate also returns the final ordered selectedSlugs.
  2. Scope note — brief deferred: the design's ~1000-token voice brief is only consumed when v3 is actually injected (P5 cutover, out of scope). In shadow we inject v2, so this PR produces only the selection + GateDecision (matching harness/trace.ts). Leave a clearly-named seam (// brief generation lands at cutover (P5)) — do NOT build voice synthesis we can't yet measure.
  3. Signature: runGate(args: { input; candidates: Set<string>; sticky: Set<string>; passNumber: number }): Promise<{ decision: GateDecision; selectedSlugs: string[] }>. Sticky pages are never dropped by the gate.
  4. Tests: stub provider returns ready+selection → asserts selection includes sticky; stub returns more+questions → asserts decision.questions surfaced; provider-null → fail-safe (ready, select all candidates).

Acceptance criteria

  • runGate returns a GateDecision + ordered selectedSlugs, preserves sticky, and surfaces follow-up questions on "more".
  • Brief generation explicitly deferred with a seam; no voice synthesis built.
  • Tests use a stub provider; standing verify clean.
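
The two invariants in steps 3–4 (sticky pages are never dropped; a missing provider fails safe to ready with all candidates) can be isolated in a pure post-processing step. The sketch below assumes a hypothetical GateDecision shape and a hypothetical finalizeGate helper that receives the provider's already-parsed response, or null when no provider is configured; the real runGate is async and calls the provider itself.

```typescript
// Hypothetical shape; the real GateDecision is defined by harness/trace.ts.
type GateDecision = { kind: "ready" } | { kind: "more"; questions: string[] };
type GateResponse = { decision: GateDecision; selectedSlugs: string[] };

function finalizeGate(
  response: GateResponse | null, // null models "no provider configured"
  candidates: ReadonlySet<string>,
  sticky: ReadonlySet<string>,
): GateResponse {
  const pool = new Set<string>([...candidates, ...sticky]);

  // Fail-safe: without a provider, declare ready and select everything we have.
  if (response === null) {
    return { decision: { kind: "ready" }, selectedSlugs: Array.from(pool) };
  }

  // Keep the provider's ordering, dropping unknown slugs and duplicates.
  const ordered: string[] = [];
  const seen = new Set<string>();
  for (const slug of response.selectedSlugs) {
    if (pool.has(slug) && !seen.has(slug)) {
      seen.add(slug);
      ordered.push(slug);
    }
  }
  // Sticky pages survive even if the provider omitted them.
  for (const slug of sticky) {
    if (!seen.has(slug)) {
      seen.add(slug);
      ordered.push(slug);
    }
  }
  return { decision: response.decision, selectedSlugs: ordered };
}
```

Appending omitted sticky slugs at the end is one possible ordering policy; the plan only requires that they are present.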

PR 13: retrieval-loop orchestration

Depends on

PR 9, PR 10, PR 11, PR 12

Branch

memory-v3/pr-13-loop

Title

feat(memory-v3): retrieval loop (scouts→filter→tree→edges→gate)

Files

  • assistant/src/memory/v3/loop.ts (new)
  • assistant/src/memory/v3/__tests__/loop.test.ts (new)

Implementation steps

  1. Implement the loop (Architecture diagram). Per pass: runScouts → filterDenseHits (dense only; hot+sparse-exact bypass) → runTreeWalk (seeded by surviving scouts) → expandEdges (over all accumulated confident seeds) → runGate. If gate says more and passNumber < config.memory.v3.passCap, the gate's questions become the next pass's query; otherwise force-exit with the current selection (cap at 3). input.nowText (the shared essentials/threads/recent standing context) is consumed as situational context for scouts/descent/gate: the loop selects concept pages to layer on top of it and never rewrites or re-injects the standing-context files (that remains the untouched static-block path).
  2. Maintain a cross-pass visited/candidate accumulator and dedup by canonical slug. Assemble the full DescentTrace (passes: DescentPass[], each with scouts/treeLevels/edgeExpansions/gate) and accumulate RetrievalCost across all lane calls.
  3. Honor config.memory.v3.lanes toggles so the offline harness (PR 14) can measure each lane's marginal recall (design P3: "each lane's contribution read off the shadow diff; dense kept only if it adds unique recall worth its noise").
  4. Signature: runRetrievalLoop(input: RetrievalInput, deps: { db }): Promise<RetrievalOutput> — returns { selectedSlugs, sourceBySlug, trace, cost, failureReason } (exactly the P1 RetrievalOutput contract). sourceBySlug tags each slug with its lane (hot/sparse/dense/tree/edge).
  5. Tests: stub all lane modules (or their providers) → single-pass ready path; multi-pass (gate "more" then "ready") respects passCap; lane toggles change the candidate set; trace has one DescentPass per pass; cost accumulates. No real LLM.

Acceptance criteria

  • runRetrievalLoop produces a valid RetrievalOutput with a complete multi-pass DescentTrace, capped at passCap, with per-lane source tags and accumulated cost.
  • Fully unit-tested with stubbed lanes; standing verify clean.
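
The pass-cap and loop-back behavior can be sketched with the lanes collapsed into two stubs. Everything here is a simplification under assumptions: gather stands in for scouts, filter, tree walk, and edge expansion combined, the Lanes and GateResult shapes are hypothetical, and the trace is reduced to one record per pass instead of a full DescentTrace.

```typescript
type GateResult = { ready: boolean; questions: string[]; selectedSlugs: string[] };
type Lanes = {
  // Stand-in for scouts → filter → tree → edges: returns slug -> lane tag for this pass.
  gather: (query: string) => Promise<Map<string, string>>;
  gate: (candidates: string[], passNumber: number) => Promise<GateResult>;
};
type PassRecord = { passNumber: number; query: string; ready: boolean };

async function runLoopSketch(
  message: string,
  lanes: Lanes,
  passCap: number,
): Promise<{ selectedSlugs: string[]; sourceBySlug: Map<string, string>; passes: PassRecord[] }> {
  const sourceBySlug = new Map<string, string>(); // cross-pass accumulator, deduped by slug
  const passes: PassRecord[] = [];
  let query = message;
  let selectedSlugs: string[] = [];

  for (let passNumber = 1; passNumber <= passCap; passNumber++) {
    const found = await lanes.gather(query);
    // First lane to surface a slug keeps the source tag.
    found.forEach((lane, slug) => {
      if (!sourceBySlug.has(slug)) sourceBySlug.set(slug, lane);
    });

    const gate = await lanes.gate(Array.from(sourceBySlug.keys()), passNumber);
    selectedSlugs = gate.selectedSlugs;
    passes.push({ passNumber, query, ready: gate.ready });

    if (gate.ready || gate.questions.length === 0) break;
    // Loop back on the gate's generated questions, not the original message.
    query = gate.questions.join(" ");
  }
  // Reaching passCap without "ready" force-exits with the latest selection.
  return { selectedSlugs, sourceBySlug, passes };
}
```

Because the loop body only runs while passNumber is at most passCap, a gate that always answers "more" still terminates after exactly passCap passes.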

PR 14: v3 Retriever + plug into compare harness

Depends on

PR 13

Branch

memory-v3/pr-14-v3-retriever

Title

feat(memory-v3): v3 Retriever as comparand #2 in the compare harness

Files

  • assistant/src/memory/v3/retriever.ts (new)
  • assistant/src/runtime/routes/memory-v2-routes.ts (touch — retrievers array)
  • assistant/src/memory/v3/__tests__/retriever.test.ts (new)

Implementation steps

  1. createV3Retriever(db: DrizzleDb): Retriever returns { name: "v3", retrieve: (input) => runRetrievalLoop(input, { db }) }. This is the offline, zero-production-risk shadow path: the P1 harness replays historical oracle turns and scores v3's selection against the v2 router's logged picks (recall@k).
  2. In handleCompareRetrievers (memory-v2-routes.ts:~644), change the hardcoded retrievers: [createRouterRetriever(db)] to include createV3Retriever(db) only when config.memory.v3.enabled (so the default surface is unchanged until v3 is switched on). Always keep createRouterRetriever(db) as comparand #1.
  3. No new route/CLI — assistant memory v2 compare already renders N retrievers polymorphically (per the P1 render code). No OpenAPI change (internal array only).
  4. Tests: route-assembly-style test with a fixture DB + stubbed loop deps → assert the report contains both "router" and "v3" retriever entries when v3.enabled, and only "router" when disabled. (Reuse the harness-compare.test.ts fixture-DB pattern; do NOT invoke a real provider.)

Acceptance criteria

  • With config.memory.v3.enabled, assistant memory v2 compare reports recall@k for both router and v3 over historical turns; with it disabled, only router.
  • No wire/OpenAPI change; standing verify clean.
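
For orientation, the flag-gated retrievers array and a plain recall@k score might look like the following. Both are assumptions for illustration: the Retriever type is simplified (the real retrieve takes a RetrievalInput), buildRetrievers is a hypothetical helper, and recallAtK uses the common definition (hits in the top k divided by oracle size), which may differ from how the P1 harness scores.

```typescript
// Simplified stand-in for the P1 Retriever contract.
type Retriever = { name: string; retrieve: (query: string) => Promise<string[]> };

// Router is always comparand #1; v3 is appended only when the flag is on.
function buildRetrievers(
  router: Retriever,
  makeV3: () => Retriever,
  v3Enabled: boolean,
): Retriever[] {
  return v3Enabled ? [router, makeV3()] : [router];
}

// recall@k: fraction of oracle slugs that appear in the first k selected slugs.
function recallAtK(selected: readonly string[], oracle: ReadonlySet<string>, k: number): number {
  if (oracle.size === 0) return 1; // assumption: nothing to recall counts as full recall
  let hits = 0;
  for (const slug of new Set(selected.slice(0, k))) {
    if (oracle.has(slug)) hits++;
  }
  return hits / oracle.size;
}
```

Passing a factory for v3 rather than an instance means nothing v3-related is constructed when the flag is off.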

PR 15: live-shadow memoryRetrieval middleware + logging

Depends on

PR 14

Branch

memory-v3/pr-15-live-shadow

Title

feat(memory-v3): live shadow via memoryRetrieval middleware (inject v2, log v3)

Files

  • assistant/src/memory/v3/shadow-middleware.ts (new)
  • the plugin bootstrap that registers default middlewares (grep bootstrapPlugins/plugins/defaults/memory-retrieval.ts) (touch)
  • assistant/src/memory/v3/__tests__/shadow-middleware.test.ts (new)
  • possibly assistant/src/memory/memory-v2-activation-log-store.ts (touch — accept a v3_shadow mode value)

Implementation steps

  1. Register a memoryRetrieval pipeline middleware (Middleware<MemoryArgs, MemoryResult>) that, when config.memory.v3.enabled && config.memory.v3.shadow, runs runRetrievalLoop in parallel with the real v2 path, logs v3's selection set, and returns the v2 result unchanged (inject v2 only — zero user risk; design §observability "shadow mode"). When the flag is off, it's a pass-through to next().
  2. Cover both injection points, not just fresh user turns: per-turn (conversation-graph-memory.ts:794) and context-load / post-compaction reinjection (conversation-graph-memory.ts:217-290) — the middleware fires on the memoryRetrieval pipeline which both paths flow through; verify the compaction path triggers retrieval (if reinjectCachedMemory bypasses the pipeline, add a shadow call there too, gated by the flag). The always-on standing-context static block (readMemoryV2StaticContent) is a separate injector — it is NOT shadowed or modified; v3 only shadows concept-page selection.
  3. Logging: write v3's selections to memory_v2_activation_logs with mode = "v3_shadow" (the harness oracle filters mode='router', so this never pollutes the oracle, and the existing inspector can show v3 rows). Confirm the mode column is plain TEXT with no CHECK/enum constraint; if recordMemoryV2ActivationLog narrows the type, widen it to accept "v3_shadow". No DB migration needed if the column is unconstrained TEXT.
  4. Never block or slow the real turn: run v3 shadow work detached (don't await it on the critical path) and swallow its errors with a warn. Honor the abort signal.
  5. Tests: middleware with flag off → exact pass-through (v2 result, no v3 call); flag on → v3 runs, a v3_shadow row is logged, the returned MemoryResult still equals v2's; v3 error → logged, turn unaffected.

Acceptance criteria

  • With memory.v3.shadow on, real turns and post-compaction reloads run v3 alongside v2, inject v2 only, and log v3 selections as mode='v3_shadow' without measurable critical-path latency.
  • Flag off → byte-for-byte pass-through. No migration; standing verify clean.
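
A sketch of the detached-shadow pattern from steps 1 and 4, under assumptions: the Middleware and Next types are a guess at the pipeline's shape rather than the real plugin API, and runV3, logShadow, and warn are injected hypothetical dependencies. It shows the flag check with optional chaining, the pass-through when the flag is off, and v3 work that is started but never awaited on the critical path. Abort-signal handling is omitted.

```typescript
// Assumed pipeline shape; the real Middleware<MemoryArgs, MemoryResult> may differ.
type Next<A, R> = (args: A) => Promise<R>;
type Middleware<A, R> = (args: A, next: Next<A, R>) => Promise<R>;

// Configs built outside the schema may lack the v3 block entirely, so guard every hop.
type PartialConfig = { memory?: { v3?: { enabled?: boolean; shadow?: boolean } } };
function isShadowEnabled(config: PartialConfig): boolean {
  return config.memory?.v3?.enabled === true && config.memory?.v3?.shadow === true;
}

function createShadowMiddleware<A, R>(deps: {
  isShadowOn: () => boolean;
  runV3: (args: A) => Promise<string[]>; // stand-in for runRetrievalLoop
  logShadow: (slugs: string[]) => void; // stand-in for writing a v3_shadow row
  warn: (message: string) => void;
}): Middleware<A, R> {
  return async (args, next) => {
    if (!deps.isShadowOn()) return next(args); // exact pass-through when off

    // Detached: intentionally not awaited. Errors (sync or async) end up in warn.
    void Promise.resolve()
      .then(() => deps.runV3(args))
      .then((slugs) => deps.logShadow(slugs))
      .catch((err: unknown) => deps.warn(`v3 shadow failed: ${String(err)}`));

    return next(args); // the injected result is always v2's
  };
}
```

Wrapping the call in Promise.resolve().then(...) means even a synchronous throw from the v3 path is caught by the same handler instead of escaping into the turn.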

PR 16: v3 write config + job types

Depends on

PR 7

Branch

memory-v3/pr-16-write-jobs-setup

Title

feat(memory-v3): write-path job types + config (no behavior)

Files

  • assistant/src/memory/jobs-store.ts (touch — MemoryJobType)
  • the memory.v3 config schema from PR 7 (touch — add write subtree)
  • assistant/src/memory/__tests__/ (touch/add a small enum/config test)

Implementation steps

  1. Add v3 job types to the MemoryJobType union (jobs-store.ts:17-47): "memory_v3_consolidate", "memory_v3_index_maintenance", "memory_v3_edge_learning". Additive only — do not touch existing types or the worker switch yet (handlers land in their own PRs).
  2. Extend config.memory.v3 with a write subtree: { enabled: z.boolean().default(false), consolidateIntervalMs: z.number().default(...), coactivation: z.boolean().default(false) } — all default-off so nothing changes until explicitly enabled. write.enabled means "v3 consolidation owns the shared-buffer drain + tree build" (it does NOT introduce a separate buffer — see the shared-substrate note in the Overview); when off, v2 consolidation stays the sole buffer-drainer.
  3. Tests: enum includes the three new types; config write subtree parses with defaults.

Acceptance criteria

  • Three v3 job types exist in the enum; config.memory.v3.write parses with safe (off) defaults.
  • No worker/handler/behavior change. Standing verify clean.

PR 17: co-activation logging (the gradient)

Depends on

PR 13, PR 16

Branch

memory-v3/pr-17-coactivation-log

Title

feat(memory-v3): pass-1→pass-2 co-activation logging

Files

  • assistant/src/memory/migrations/<NNN>-memory-v3-coactivation.ts (new)
  • assistant/src/memory/db-init.ts (touch — register migration)
  • assistant/src/memory/v3/coactivation-store.ts (new)
  • assistant/src/memory/v3/loop.ts (touch — emit co-activations)
  • assistant/src/memory/v3/__tests__/coactivation-store.test.ts (new)

Implementation steps

  1. Add an idempotent, append-only migration creating memory_v3_coactivation(id, conversation_id, turn, source_slug, target_slug, pass_gap, used INTEGER, created_at) with indexes on (source_slug,target_slug) and created_at. Register in db-init.ts (never reorder existing entries — CLAUDE.md).
  2. coactivation-store.ts: recordCoactivations(db, rows) (best-effort insert, like recordInjectionEvents) + readCoactivations(db, since?).
  3. In loop.ts, when config.memory.v3.write.coactivation is on, after the loop completes emit co-activation rows: for each page first surfaced on pass ≥2 (B) paired with pass-1 hits (A), pass_gap = passOf(B) - passOf(A). Set used=0 here; usefulness (cited in the would-be brief / shaped the response) is reconciled later — for now log retrieval co-occurrence only and leave used for the consolidation reconciler (PR 18) to set, defaulting to retrieved-not-yet-confirmed. Async/non-blocking; never on the retrieval critical path (design §learning-loop "the retrieval path shouldn't pay to write its own training data").
  4. Tests: fixture DB → recordCoactivations/readCoactivations round-trip; loop with a scripted 2-pass trace emits the expected A→B rows with correct pass_gap; flag-off emits nothing.

Acceptance criteria

  • Co-activation rows are written (behind write.coactivation) from multi-pass traces, off the critical path, with a registered idempotent migration.
  • Standing verify clean (include the migration test pattern from db-*.test.ts).
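
The row derivation in step 3 can be sketched as a pure function over "first pass each slug was surfaced on". The CoactivationRow field names and the coactivationRows helper are hypothetical (the table columns are snake_case in the migration); the pairing rule and pass_gap formula follow the step as written.

```typescript
// Hypothetical in-memory row; conversation_id, turn, and created_at are omitted.
type CoactivationRow = {
  sourceSlug: string; // A: a pass-1 hit
  targetSlug: string; // B: first surfaced on pass >= 2
  passGap: number; // passOf(B) - passOf(A)
  used: 0 | 1; // always 0 at emit time; reconciled later
};

function coactivationRows(firstPassBySlug: ReadonlyMap<string, number>): CoactivationRow[] {
  const entries = Array.from(firstPassBySlug.entries());
  const passOneHits = entries.filter(([, pass]) => pass === 1);
  const laterHits = entries.filter(([, pass]) => pass >= 2);

  const rows: CoactivationRow[] = [];
  for (const [a, passA] of passOneHits) {
    for (const [b, passB] of laterHits) {
      rows.push({ sourceSlug: a, targetSlug: b, passGap: passB - passA, used: 0 });
    }
  }
  return rows;
}
```

A single-pass trace has no later hits, so it yields no rows, which matches the intent that only multi-pass traces produce a learning signal.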

PR 18: weighted edge-learning job + auto-edge promotion

Depends on

PR 17

Branch

memory-v3/pr-18-edge-learning

Title

feat(memory-v3): weighted, decaying auto-edge learning job

Files

  • assistant/src/memory/migrations/<NNN>-memory-v3-auto-edges.ts (new)
  • assistant/src/memory/db-init.ts (touch — register)
  • assistant/src/memory/v3/auto-edges.ts (new)
  • assistant/src/memory/v3/edge-learning-job.ts (new)
  • assistant/src/memory/jobs-worker.ts (touch — dispatch memory_v3_edge_learning)
  • assistant/src/memory/v3/__tests__/edge-learning-job.test.ts (new)

Implementation steps

  1. Migration: memory_v3_auto_edges(source_slug, target_slug, weight REAL, last_reinforced_at, PRIMARY KEY(source_slug,target_slug)) — the distinct class from curated edges: (design §learning-loop "distinct class from curated edges"). Register in db-init.ts.
  2. auto-edges.ts: reinforce(db, a, b, now) (increment weight, only when the co-activation's used is true — reinforce usefulness, not mere retrieval), decay(db, now, halfLifeMs) (multiplicative decay of unused weights), aboveThreshold(db, threshold): Map<source, Set<target>> (the adjacency PR 11's expandEdges consumes via its extraAdjacency seam — traverse only above threshold).
  3. edge-learning-job.ts (memory_v3_edge_learning): read recent co-activations (PR 17), reinforce/decay weights, and surface high-weight auto-edges as promotion candidates for the assistant to ratify into curated edges: during consolidation (design: "promoting a high-weight auto-edge to a permanent curated link stays the assistant's call" — so this job proposes, it does not auto-write page frontmatter). Watch rich-get-richer: rely on decay + the dense/MMR diversity counterweight.
  4. Dispatch the job type in processJob (jobs-worker.ts) and assign it to the appropriate lane: it is mostly DB work, so the fast lane rather than the slow-LLM/maintenance lane.
  5. Tests: reinforce respects used; decay reduces unused weights; aboveThreshold returns the right adjacency; job run over fixture co-activations updates weights and emits promotion candidates. No real LLM.

Acceptance criteria

  • Weighted auto-edges accrue from used co-activations, decay over time, and expose an above-threshold adjacency for edge expansion; promotion stays advisory.
  • Migration registered + idempotent; job dispatched; standing verify clean.
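
One way to read reinforce/decay/aboveThreshold is as an exponentially decayed counter. The sketch below is in-memory and makes several assumptions: weights halve every halfLifeMs, reinforcement adds 1 after applying pending decay, the stored timestamp is called asOfMs (the migration's column is last_reinforced_at, and the real job may track decay differently), and the function signatures take a Map instead of a db handle.

```typescript
type AutoEdge = { weight: number; asOfMs: number };
type AutoEdges = Map<string, Map<string, AutoEdge>>; // source -> target -> edge

// Weight the edge would have at nowMs under half-life decay.
function decayedWeight(edge: AutoEdge, nowMs: number, halfLifeMs: number): number {
  const elapsed = Math.max(0, nowMs - edge.asOfMs);
  return edge.weight * Math.pow(0.5, elapsed / halfLifeMs);
}

// Reinforce only co-activations that were actually used, not merely retrieved.
function reinforce(
  table: AutoEdges, a: string, b: string, nowMs: number, used: boolean, halfLifeMs: number,
): void {
  if (!used) return;
  let row = table.get(a);
  if (!row) {
    row = new Map<string, AutoEdge>();
    table.set(a, row);
  }
  const prev = row.get(b);
  const base = prev ? decayedWeight(prev, nowMs, halfLifeMs) : 0;
  row.set(b, { weight: base + 1, asOfMs: nowMs });
}

// Materialize decay for every edge so stored weights reflect nowMs.
function decay(table: AutoEdges, nowMs: number, halfLifeMs: number): void {
  for (const row of table.values()) {
    for (const [target, edge] of row) {
      row.set(target, {
        weight: decayedWeight(edge, nowMs, halfLifeMs),
        asOfMs: Math.max(edge.asOfMs, nowMs),
      });
    }
  }
}

// Adjacency of edges strictly above the threshold, in the shape extraAdjacency expects.
function aboveThreshold(table: AutoEdges, threshold: number): Map<string, Set<string>> {
  const out = new Map<string, Set<string>>();
  for (const [source, row] of table) {
    for (const [target, edge] of row) {
      if (edge.weight > threshold) {
        const targets = out.get(source) ?? new Set<string>();
        targets.add(target);
        out.set(source, targets);
      }
    }
  }
  return out;
}
```

Applying pending decay inside reinforce keeps the result independent of how often the decay pass runs, which helps against the rich-get-richer effect the plan warns about.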

PR 19: v3 consolidation — drain the shared buffer into the tree, preserve essentials/threads/recent

Depends on

PR 1, PR 2, PR 4, PR 5, PR 16

Branch

memory-v3/pr-19-consolidation

Title

feat(memory-v3): consolidation drains the shared buffer into the tree + maintains standing-context files

Files

  • assistant/src/memory/v3/consolidation-job.ts (new)
  • assistant/src/memory/v3/maintenance.ts (new — index/DAG upkeep helpers)
  • assistant/src/memory/v3/prompts/consolidation.ts (new — ported/adapted from v2/prompts/consolidation.ts)
  • assistant/src/memory/jobs-worker.ts (touch — dispatch both jobs + scheduler retarget)
  • assistant/src/memory/v3/__tests__/consolidation-job.test.ts (new)

Implementation steps

  1. Reuse the shared buffer + standing-context files — do NOT fork them. v3 consolidation reads the same memory/buffer.md v2 uses and maintains the same memory/essentials.md / threads.md / recent.md. There is no memory/v3/buffer.md and no v3 meta-files (this is the direct answer to "does v3 reuse buffer.md" — yes). The capture surface (remember(), the sweep job, the retrospective job, indexMessageNow segmentation/embeddings) is shared substrate and needs no v3 changes — it keeps writing the shared buffer/Qdrant, which is why P4 has no separate capture PRs.
  2. memory_v3_consolidate job — mirror v2 consolidation-job.ts exactly: single-process lock at memory/.v3-state/consolidation.lock, cutoff capture via formatBufferTimestamp, bail on empty buffer, hand off to a background agent via runBackgroundJob (15-min timeout, suppressFailureNotifications: true). Gate on config.memory.v3.write.enabled.
  3. Port the consolidation prompt from v2/prompts/consolidation.ts, keeping the standing-context outputs identical — rewrite recent.md (≤2000 chars, latest-first prose), update essentials.md (≤10000), update threads.md (≤10000), trim buffer.md to post-cutoff entries — and changing only the concept-page routing: entries route into shared concept pages (writePage) AND into the v3 tree (author/refresh _index.md node self-descriptions via writeNode + maintain DAG child refs). Surface PR 18's auto-edge promotion candidates in the agent's context so it can ratify high-weight edges into curated edges:.
  4. Scheduler retarget (the shared buffer can't be drained twice). Where the hourly scheduler enqueues memory_v2_consolidate (the maybeEnqueueGraphMaintenanceJobs path), when config.memory.v3.write.enabled enqueue memory_v3_consolidate instead (skip the v2 enqueue) so exactly one consolidator owns the buffer. Reversible via the flag. Because concept pages stay the shared canonical store, the v2 router keeps working off the pages v3 writes (it just ignores the tree overlay) — so this write-path switch is independent of the retrieval-path shadow/cutover.
  5. maintenance.ts (memory_v3_index_maintenance job, fast lane): mechanical no-LLM upkeep — run validateTree (PR 5), refresh stale composed indices, and cycle-check DAG edits with PR 4's visited/guard logic so consolidation can't introduce loops. Enqueue as a follow-up after consolidate, alongside an embed_concept_page reembed (pages are shared, so reembed is still needed — reuse the existing job).
  6. Dispatch both job types in processJob (jobs-worker.ts).
  7. Tests: consolidate over a fixture shared buffer → creates/updates a concept page AND a tree node, rewrites recent/essentials/threads, trims the buffer; scheduler enqueues v3 (not v2) when write.enabled and v2 when off; maintenance refuses a cycle-introducing edit + reports a stale index; flag-off → v2 consolidation path unchanged. Stub the background-agent handoff (no real LLM).

Acceptance criteria

  • With write.enabled, the shared memory/buffer.md is consolidated into shared concept pages + the v3 tree, while essentials.md/threads.md/recent.md are rewritten exactly as v2 does today and the buffer is trimmed; the scheduler enqueues only the v3 consolidator.
  • DAG edits are cycle-checked; the standing-context injection (static block + nowText) is byte-for-byte unaffected; flag-off leaves the v2 consolidation path untouched.
  • Both job types dispatched; standing verify clean (include a job-handler test pattern).
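
Two mechanical pieces of this PR lend themselves to a small sketch: the scheduler retarget (step 4) and the cycle check on DAG edits (step 5). Both helpers below are hypothetical; the config type is a partial stand-in, and the cycle check operates on an in-memory child map rather than reusing PR 4's guard logic.

```typescript
type PartialWriteConfig = { memory?: { v3?: { write?: { enabled?: boolean } } } };
type ConsolidateJob = "memory_v2_consolidate" | "memory_v3_consolidate";

// Exactly one consolidator owns the shared buffer; the flag picks which.
function consolidatorJobType(config: PartialWriteConfig): ConsolidateJob {
  return config.memory?.v3?.write?.enabled === true
    ? "memory_v3_consolidate"
    : "memory_v2_consolidate";
}

// Adding parent -> child introduces a cycle iff parent is already reachable from child.
function wouldCreateCycle(
  children: ReadonlyMap<string, ReadonlySet<string>>,
  parent: string,
  child: string,
): boolean {
  if (parent === child) return true;
  const visited = new Set<string>();
  const stack: string[] = [child];
  while (stack.length > 0) {
    const node = stack.pop() as string;
    if (node === parent) return true;
    if (visited.has(node)) continue;
    visited.add(node);
    for (const next of children.get(node) ?? []) stack.push(next);
  }
  return false;
}
```

Note that a second parent for an existing node is allowed (the index is a DAG, not a strict tree); only edits that make a node its own ancestor are refused.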

Out-of-scope reminders (not in this plan)

  • P5 — pilot cutover + retire v2 (flip the pilot to v3 once shadow proves v3 ≥ v2; delete v2 retrieval/write path — including re-pointing the always-on essentials/threads/recent injection, today gated on config.memory.v2.enabled in static-context.ts, onto a v3 gate so standing-context injection survives v2's removal). The only hard gate; a separate plan once shadow numbers are in.
  • Plugin extraction (lift v3 policy into a plugin + general plugin API). De-gated, optional, later.
  • v2→v3 page migration (reclassify ~1,600 pages, author _index.md nodes, wire the DAG). Parallel data track owned by the user, done by hand against this plan's PR 1–6 format; PR 6's validate/tree CLI is the tool for it. Gates nothing here.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 93da857fb2

Comment on lines +47 to +49
| "memory_v3_consolidate"
| "memory_v3_index_maintenance"
| "memory_v3_edge_learning"

P1: Classify memory_v3_consolidate as a slow LLM job

memory_v3_consolidate is added as a new job type but not added to SLOW_LLM_JOB_TYPES, so it falls into the fast lane. This job can run up to 15 minutes and performs background LLM work, so putting it in the fast lane breaks lane isolation and can block short fast jobs whenever memory.v3.write.enabled is on. It should be classified with other slow LLM jobs (like memory_v2_consolidate) to preserve intended scheduling behavior.
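
The classification being asked for can be sketched as follows. This is illustrative only: the union is abridged to the types named in this thread, and the real SLOW_LLM_JOB_TYPES container and its other members are not shown in the review, so its shape here (a read-only Set) is an assumption, as is the laneFor helper.

```typescript
// Abridged union; the real MemoryJobType has many more members.
type MemoryJobType =
  | "memory_v2_consolidate"
  | "memory_v3_consolidate"
  | "memory_v3_index_maintenance"
  | "memory_v3_edge_learning";

// Assumed shape. Jobs that hand off to a background agent belong here.
const SLOW_LLM_JOB_TYPES: ReadonlySet<MemoryJobType> = new Set<MemoryJobType>([
  "memory_v2_consolidate",
  "memory_v3_consolidate", // up to 15 minutes of background LLM work
]);

// Anything not explicitly slow falls through to the fast lane.
function laneFor(type: MemoryJobType): "slow" | "fast" {
  return SLOW_LLM_JOB_TYPES.has(type) ? "slow" : "fast";
}
```

Because the fast lane is the fall-through, a new LLM-backed job type is misclassified by default unless it is added to the slow set, which is the failure mode described above.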

Comment on lines +502 to +507
consolidateIntervalMs: z
  .number({
    error: "memory.v3.write.consolidateIntervalMs must be a number",
  })
  .int("memory.v3.write.consolidateIntervalMs must be an integer")
  .default(3600000)

P2: Enforce positive v3 consolidation interval

memory.v3.write.consolidateIntervalMs is only constrained to an integer, so 0 or negative values are accepted. In maybeEnqueueGraphMaintenanceJobs, the scheduler checks nowMs - lastRun >= intervalMs; with a non-positive interval that condition is always true, causing a consolidation job to be enqueued on every worker pass and rapidly flooding the queue. Add a positive constraint to this schema field (matching the v2 interval validation).
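
To make the failure concrete, here is a dependency-free sketch of the due-check and a validator that rejects non-positive intervals. The helper names isDue and parseConsolidateIntervalMs are hypothetical; in the actual schema the equivalent fix would presumably be a positivity constraint on the Zod number (Zod's number schema offers .positive()), not a hand-written parser.

```typescript
// The scheduler's due-check as described in the review comment.
function isDue(nowMs: number, lastRunMs: number, intervalMs: number): boolean {
  return nowMs - lastRunMs >= intervalMs;
}

// Hand-rolled stand-in for the schema constraint: integer and strictly positive.
function parseConsolidateIntervalMs(value: unknown): number {
  if (typeof value !== "number" || !Number.isInteger(value)) {
    throw new Error("memory.v3.write.consolidateIntervalMs must be an integer");
  }
  if (value <= 0) {
    throw new Error("memory.v3.write.consolidateIntervalMs must be positive");
  }
  return value;
}
```

With intervalMs at 0, isDue is true even when nowMs equals lastRunMs, so a job would be enqueued on every worker pass; rejecting the value at parse time keeps the scheduler logic unchanged.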

Vellum Assistant and others added 2 commits May 25, 2026 12:15
The live-shadow middleware runs on every turn and read `config.memory.v3.enabled`
unguarded. Configs built outside the Zod schema (agent-loop test fixtures) have no
`memory.v3` block, so the gate threw `TypeError: undefined is not an object` and
aborted the turn — cascading across ~13 agent-loop test files. Guard with optional
chaining (matches the loop's existing `write?.coactivation` pattern) and add a
regression test for the absent-v3 config.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

PR #31983 registered the two read-only v3 routes but never added their
ACTOR_ENDPOINTS entries in route-policy.ts; the per-PR run skipped CI so the
route-policy coverage guard never ran. Add both as settings.read (mirroring the
v2 read routes), satisfying guard-tests.test.ts.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@velissa-ai velissa-ai merged commit fc61250 into main May 25, 2026
14 checks passed
@velissa-ai velissa-ai deleted the velissa-ai/memory-v3-build branch May 25, 2026 17:27
velissa-ai added a commit that referenced this pull request May 25, 2026
- Classify memory_v3_consolidate as a slow LLM job (it hands off to a
  background agent for up to 15 min like memory_v2_consolidate); leaving it in
  the fast lane broke lane isolation when memory.v3.write.enabled is on.
- Constrain memory.v3.write.consolidateIntervalMs to positive: 0/negative made
  the scheduler's `now - lastRun >= interval` always true, flooding the queue.
- Tests for both.

Co-authored-by: Vellum Assistant <assistant@vellum.ai>