clay-good · May 12, 2026 · May 9, 2026
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -18,7 +18,7 @@ jobs:
 
       - uses: actions/setup-node@v4
         with:
-          node-version: '20'
+          node-version: '24'
           cache: 'npm'
 
       - run: npm ci
@@ -40,7 +40,7 @@ jobs:
 
       - uses: actions/setup-node@v4
         with:
-          node-version: '20'
+          node-version: '24'
           cache: 'npm'
 
       - run: npm ci
@@ -56,7 +56,7 @@ jobs:
 
       - uses: actions/setup-node@v4
         with:
-          node-version: '20'
+          node-version: '24'
           cache: 'npm'
 
       - run: npm ci

diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
@@ -14,7 +14,7 @@ jobs:
 
       - uses: actions/setup-node@v4
         with:
-          node-version: '20'
+          node-version: '24'
           cache: 'npm'
 
       - run: npm ci
@@ -43,7 +43,7 @@ jobs:
 
       - uses: actions/setup-node@v4
         with:
-          node-version: '20'
+          node-version: '24'
           cache: 'npm'
           registry-url: 'https://registry.npmjs.org'
 

diff --git a/AGENTS.md b/AGENTS.md
@@ -340,36 +340,35 @@ record_decision({
 })
 ```
 
-Decisions are consolidated and cross-checked against the git diff before each commit.
+Decisions are consolidated in the background immediately after `record_decision` is called — the pre-commit gate reads the already-consolidated store and adds no LLM latency.
 
 **Performance note**: if you skip `record_decision`, the gate detects unrecorded source changes at commit time and triggers a slow LLM extraction on the *next* commit (~10-30s). Calling `record_decision` proactively keeps every commit instant.
 
 ## When git commit is blocked by the decisions gate
 
-If `git commit` fails and the output is JSON with `"gated": true`, do NOT approve decisions yourself. Always present them to the user first.
-
+If `git commit` fails and the output is JSON with `"gated": true`, do NOT retry silently.
 Check the `reason` field and act accordingly:
 
 **`reason: "verified"` — decisions await review:**
 Present each decision to the user:
 > "The commit is blocked — I found N architectural decision(s) to validate:
 > 1. **[id]** Title — rationale
-> Do you approve? (yes/no)"
-
+Do you approve? (yes/no)"
 For each approval call `approve_decision`, for rejections call `reject_decision`.
 Then run `spec-gen decisions --sync` and retry `git commit`.
 
+**`reason: "approved_not_synced"` — decisions approved but not written to specs:**
+Run `spec-gen decisions --sync` then retry `git commit`. Do not skip this step.
+
 **`reason: "drafts_pending_consolidation"` — drafts were recorded but not yet consolidated:**
 Present to the user:
 > "N decision draft(s) were recorded but never consolidated. Run consolidation now? (~10-30s)"
-
 If yes: run `spec-gen decisions --consolidate --gate` and handle the result.
 If no: retry with `git commit --no-verify` to skip the gate.
 
 **`reason: "no_decisions_recorded"` — source files staged but nothing recorded:**
 Present to the user:
 > "Source files are staged but no architectural decisions were recorded. Run fallback extraction to check for undocumented decisions? (~10-30s)"
-
 If yes: run `spec-gen decisions --consolidate --gate` and handle the result.
 If no: retry with `git commit --no-verify` to skip the gate.
 <!-- end-spec-gen-decisions-instructions -->
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -48,10 +48,13 @@ Check the `reason` field and act accordingly:
 Present each decision to the user:
 > "The commit is blocked — I found N architectural decision(s) to validate:
 > 1. **[id]** Title — rationale
-> Do you approve? (yes/no)"
+Do you approve? (yes/no)"
 For each approval call `approve_decision`, for rejections call `reject_decision`.
 Then run `spec-gen decisions --sync` and retry `git commit`.
 
+**`reason: "approved_not_synced"` — decisions approved but not written to specs:**
+Run `spec-gen decisions --sync` then retry `git commit`. Do not skip this step.
+
 **`reason: "drafts_pending_consolidation"` — drafts were recorded but not yet consolidated:**
 Present to the user:
 > "N decision draft(s) were recorded but never consolidated. Run consolidation now? (~10-30s)"

diff --git a/README.md b/README.md
@@ -1,8 +1,8 @@
 # spec-gen
 
-**Persistent architectural memory for AI coding agents.**
+**Persistent architectural memory and structural cognition for AI coding agents.**
 
-spec-gen turns any codebase into a navigable knowledge graph backed by [OpenSpec](https://github.com/Fission-AI/OpenSpec) living specifications. It extracts and maintains specs, detects spec/code drift, gates architectural decisions, and exposes everything through graph-native MCP tools — so agents start every session already knowing the codebase instead of re-discovering it.
+spec-gen turns any evolving codebase into a navigable knowledge graph backed by [OpenSpec](https://github.com/Fission-AI/OpenSpec) living specifications. It maintains persistent architectural context across agent sessions: graph structure, specs, decisions, drift state, and semantic retrieval — so agents start each task already oriented instead of re-discovering the system from file reads.
 
 ---
 
@@ -15,7 +15,9 @@ AI agents are powerful but amnesiac. On every new task:
 - They have no link between specs and code — drift is invisible
 - File-by-file navigation often burns **15,000–50,000 tokens** per orientation pass, before a single line of useful code is written
 
-spec-gen closes this loop. Run a full analysis once, then keep the graph incrementally updated during development. Wire two files into your agent's context — every subsequent session starts informed.
+spec-gen closes this loop. Run a full analysis once, then keep the graph incrementally updated as the codebase evolves. Even greenfield projects become cognitively "brownfield" after only a few agent sessions — architectural context fragments, decisions disappear, and agents repeatedly reconstruct the same understanding from scratch.
+
+spec-gen persists that context continuously: structure, specs, decisions, drift state, and graph relationships remain queryable across sessions.
 
 ---
 
@@ -29,7 +31,7 @@ Three layers, each usable independently:
 | **2. Spec Layer** | LLM-generated living specs, ADRs, drift detection, decision gates | For generation |
 | **3. Agent Runtime** | 45 MCP tools — `orient()`, semantic search, graph expansion | No |
 
-You can use layer 1 alone to give agents structural context. Add layer 2 for spec coverage. Layer 3 is always-on once `spec-gen mcp` is running.
+You can use layer 1 alone to give agents structural context. Add layer 2 for semantic intent and architectural governance through OpenSpec-compatible living specifications. Layer 3 keeps that context continuously accessible through graph-native MCP tools once `spec-gen mcp` is running.
 
 ---
 
@@ -43,6 +45,7 @@ You can use layer 1 alone to give agents structural context. Add layer 2 for spe
 | Offline structural analysis | ❌ | ❌ | ✓ |
 | Token-efficient orient() | ❌ | ❌ | ✓ ~1–3k vs 15–50k tokens |
 | Living spec generation | ❌ | ❌ | ✓ |
+| Persistent cross-session architectural memory | ❌ | Partial | ✓ |
 
 Traditional coding agents reconstruct architecture from repeated file reads every session. spec-gen persists it as a queryable graph.
 
@@ -62,7 +65,7 @@ spec-gen mcp              # start MCP server
 
 Then ask your agent: **`orient("add a new payment method")`**
 
-That single call returns the relevant functions, their call neighbours, matching spec sections, and insertion-point candidates — in one round-trip instead of a dozen file reads, costing ~1,000 tokens instead of ~30,000.
+That single call returns the relevant functions, their call neighbours, matching spec sections, and insertion-point candidates — preserving architectural continuity across sessions instead of forcing the agent to repeatedly reconstruct context from raw file reads. In practice, this often reduces orientation cost from ~30,000 exploratory tokens to ~1,000 targeted tokens.
 
 **Full pipeline** (specs + decisions — optional and additive):
 
@@ -142,7 +145,7 @@ One graph query replaces most exploratory file reads. The agent knows exactly wh
 
 **Analyze** (no API key)
 
-Scans your codebase with pure static analysis. Builds a full call graph persisted to SQLite, runs label-propagation community detection to cluster tightly coupled functions, computes McCabe cyclomatic complexity for every function, and extracts DB schemas, HTTP routes, UI components, middleware chains, and environment variables. Outputs `.spec-gen/analysis/CODEBASE.md` — a ~600-token structural digest that compresses the equivalent of tens of thousands of exploratory tokens into a small, queryable summary.
+Continuously maintains a structural representation of your codebase using pure static analysis. Builds a full call graph persisted to SQLite, runs label-propagation community detection to cluster tightly coupled functions, computes McCabe cyclomatic complexity for every function, and extracts DB schemas, HTTP routes, UI components, middleware chains, and environment variables. Outputs `.spec-gen/analysis/CODEBASE.md` — a ~600-token structural digest that compresses the equivalent of tens of thousands of exploratory tokens into a small, queryable summary.
 
 With `--watch-auto`, the call graph updates incrementally on every file save: changed file and its direct callers are re-parsed and the graph is atomically swapped. Orient and BFS queries remain live between full analyze runs.
 
@@ -156,18 +159,21 @@ Compares git changes against spec mappings in milliseconds. Detects: Gap (code c
 
 **MCP** (no API key)
 
-45 graph-native tools exposed over stdio. `orient()` is the main entry point — one call replaces 10+ file reads. `detect_changes` risk-scores changed functions using call graph centrality × change type multiplier. See [docs/mcp-tools.md](docs/mcp-tools.md).
+45 graph-native tools exposed over stdio. Together they act as a persistent architectural runtime for coding agents: orientation, graph traversal, semantic retrieval, drift awareness, decision context, and structural risk analysis.
+`orient()` is the main entry point — one call replaces 10+ file reads. `detect_changes` risk-scores changed functions using call graph centrality × change type multiplier. See [docs/mcp-tools.md](docs/mcp-tools.md).
 
 `orient()` runs in **~430µs p50** against a 15k-node codebase (TypeScript compiler, ~79k edges). Full benchmark results: [scripts/BENCHMARKS.md](scripts/BENCHMARKS.md).
 
 **Decisions** (API key for consolidation)
 
-Agents call `record_decision` before writing code. Consolidation runs immediately in the background. At commit time, a pre-commit hook gates the commit until all verified decisions are reviewed and written back as requirements in `spec.md` files.
+Agents call `record_decision` before writing code. Consolidation runs immediately in the background. At commit time, a pre-commit hook gates the commit until all verified decisions are reviewed and written back as requirements in `spec.md` files. Decisions are classified by scope (`local`, `component`, `cross-domain`, or `system`); only `cross-domain` and `system` decisions produce ADR files, keeping the decision log signal-dense.
 
 ---
 
 ## Architecture
 
+OpenSpec provides semantic intent and workflow structure. spec-gen maintains the evolving implementation as a continuously queryable architectural graph for agents.
+
 ```
 Codebase
    │
@@ -221,12 +227,13 @@ The graph and the OpenSpec spec layer are co-equal: the graph makes orientation
 - **LLM spec quality varies**: generated specs reflect the model's understanding. Review sections covering complex business logic before treating them as authoritative.
 - **Embedding is optional**: without an embedding endpoint, `orient` and `search_code` fall back to BM25 keyword search (still useful, less accurate for semantic queries).
 - **Large monorepos**: `spec-gen analyze` on large codebases may take several minutes. Graph storage itself has no practical limit — the pipeline (AST parsing, symbol extraction) is the bottleneck.
+- **`node:sqlite` experimental warning on Node 22**: Node.js 22 prints `ExperimentalWarning: SQLite is an experimental feature` to stderr. The warning is gone on Node 24+. Suppress it on Node 22 with `NODE_NO_WARNINGS=1 spec-gen analyze`; note that this silences all Node.js process warnings, not only this one.
 
 ---
 
 ## Requirements
 
-- Node.js 20+
+- Node.js 22.5+
 - API key for `generate`, `verify`, and `drift --use-llm`:
   ```bash
   export ANTHROPIC_API_KEY=sk-ant-...    # default provider
@@ -245,7 +252,7 @@ The graph and the OpenSpec spec layer are co-equal: the graph makes orientation
 ```bash
 npm install
 npm run build
-npm test          # 2580+ unit tests
+npm test          # 2660+ unit tests
 npm run typecheck
 ```
 

diff --git a/docs/agent-setup.md b/docs/agent-setup.md
@@ -98,6 +98,36 @@ Wire the generated digest into your agent's context:
 `search_code` · `suggest_insertion_points` · `get_spec <domain>` · `search_specs` · `analyze_impact` · `get_function_body` · `get_function_skeleton`
 ```
 
+**Claude Code — MCP config (token-efficient two-server setup)**
+
+MCP clients load all tool schemas at session start. With 45 tools, this costs ~8–77k tokens before any work begins. Claude Code supports `alwaysLoad: false` (deferred, default) — tools load only when the agent searches for them via Tool Search.
+
+The recommended setup uses two server entries, an always-visible core server and a deferred full server:
+
+```json
+{
+  "mcpServers": {
+    "spec-gen-core": {
+      "type": "stdio",
+      "command": "spec-gen",
+      "args": ["mcp", "--minimal"],
+      "alwaysLoad": true
+    },
+    "spec-gen": {
+      "type": "stdio",
+      "command": "spec-gen",
+      "args": ["mcp"],
+      "alwaysLoad": false
+    }
+  }
+}
+```
+
+- **`spec-gen-core`** exposes 5 tools always visible in context (~500 tokens): `orient`, `search_code`, `record_decision`, `detect_changes`, `check_spec_drift`. These are the tools most likely to be called at session start.
+- **`spec-gen`** exposes all 45 tools deferred — loaded on demand when the agent uses Tool Search (e.g. "find tool for BFS graph traversal").
+
+If you only need one server entry, use `alwaysLoad: false` (the default) with the standard `spec-gen mcp` command — all tools are deferred and searchable via Tool Search.
+
 **Cline / Roo Code / Kilocode** — create `.clinerules/spec-gen.md`:
 
 ```markdown

diff --git a/docs/ci-cd.md b/docs/ci-cd.md
@@ -25,6 +25,17 @@ spec-gen setup --tools claude         # Install (also installs Claude Code skill
 spec-gen decisions --uninstall-hook   # Remove decisions hook only
 ```
 
+When the gate blocks, the JSON output includes a `reason` field:
+
+| Reason | Meaning | Action |
+|--------|---------|--------|
+| `verified` | Decisions consolidated and verified — await human review | Present to user, call `approve_decision` / `reject_decision`, then `--sync` |
+| `approved_not_synced` | Decisions approved but not written to specs yet | Run `spec-gen decisions --sync`, retry commit |
+| `drafts_pending_consolidation` | Drafts recorded but consolidation never ran | Run `spec-gen decisions --consolidate --gate` |
+| `no_decisions_recorded` | Source files staged but no decisions recorded | Run `spec-gen decisions --consolidate --gate` for fallback extraction |
+
+The gate uses a sentinel file (`.git/SPEC_GEN_GATE_RAN`) written by the pre-commit hook and checked by the post-commit hook. If a commit bypasses the gate via `--no-verify`, the post-commit hook detects the missing sentinel and logs a warning.
+
 **How they relate**: they address different failure modes and do not substitute for each other.
 
 The decisions gate asks: *"has this architectural choice been reviewed by a human?"* It operates on decisions recorded during development — it has no knowledge of which spec files cover which source files.
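
For anyone scripting against the gate, the reason table added in this hunk maps onto a small lookup. The following is a minimal TypeScript sketch, not code from this PR: the `GateResult` interface and the `nextAction` helper are hypothetical, and only the `gated` flag, the four `reason` values, and the quoted commands come from the docs in this diff.

```typescript
// Hypothetical sketch: only `gated`, the four reason codes, and the
// commands in the strings below are taken from the documentation.
type GateReason =
  | "verified"
  | "approved_not_synced"
  | "drafts_pending_consolidation"
  | "no_decisions_recorded";

// Assumed shape of the JSON printed when the gate blocks a commit.
interface GateResult {
  gated: boolean;
  reason?: GateReason;
}

// Documented follow-up for each reason code.
const ACTIONS: Record<GateReason, string> = {
  verified:
    "present decisions to the user, call approve_decision / reject_decision, then run spec-gen decisions --sync",
  approved_not_synced: "spec-gen decisions --sync",
  drafts_pending_consolidation: "spec-gen decisions --consolidate --gate",
  no_decisions_recorded: "spec-gen decisions --consolidate --gate",
};

// Returns the follow-up for a blocked commit, or null when the commit was not gated.
function nextAction(result: GateResult): string | null {
  if (!result.gated || result.reason === undefined) {
    return null;
  }
  return ACTIONS[result.reason];
}
```

A wrapper could call `nextAction(JSON.parse(output))` on the JSON printed by a blocked `git commit`, assuming that output parses to the shape sketched above.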

diff --git a/docs/cli-reference.md b/docs/cli-reference.md
@@ -140,12 +140,15 @@ spec-gen setup                   # workflow skills
 spec-gen decisions [options]
   --list                 # List decisions, optionally filtered by --status
   --status <status>      # Filter by status: draft, consolidated, verified, approved, synced, phantom
-  --approve <id>         # Approve a decision by ID
+                         # Note: synced/rejected/phantom are purged from store after --sync
+  --approve <id>         # Approve a decision by ID (blocked if already synced)
   --reject <id>          # Reject a decision by ID
   --reason <text>        # Rejection reason (used with --reject)
-  --sync                 # Write approved decisions into specs and ADRs
+  --sync                 # Write approved decisions into specs and ADRs, then purge inactive entries
   --dry-run              # Preview sync without writing files
   --gate                 # Run commit gate check (reads pending.json, no LLM — used by pre-commit hook)
+                         # Gate reason codes: verified | approved_not_synced |
+                         #   drafts_pending_consolidation | no_decisions_recorded
   --consolidate          # Manually trigger LLM consolidation + diff verification of drafts
   --json                 # Machine-readable output
   --uninstall-hook       # Remove decisions pre-commit hook (install via: spec-gen setup --tools claude)
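
The purge noted on `--sync` can be pictured as a filter over the decision store. This is a minimal sketch under stated assumptions, not the implementation in this PR: the `Decision` shape and the `purgeInactive` helper are hypothetical, and the status names are taken from the `--status` help text plus `rejected` from the note above.

```typescript
// Status names from the --status help text, plus "rejected" from the purge note.
type DecisionStatus =
  | "draft"
  | "consolidated"
  | "verified"
  | "approved"
  | "synced"
  | "rejected"
  | "phantom";

// Hypothetical minimal record; the real store likely carries more fields.
interface Decision {
  id: string;
  status: DecisionStatus;
}

// Statuses the help text describes as purged after --sync.
const INACTIVE: ReadonlySet<DecisionStatus> = new Set<DecisionStatus>([
  "synced",
  "rejected",
  "phantom",
]);

// Keeps only decisions that are still active after a sync.
function purgeInactive(store: Decision[]): Decision[] {
  return store.filter((decision) => !INACTIVE.has(decision.status));
}
```

Under this reading, `--list --status synced` would return nothing once a sync has completed, which is consistent with the note that those entries are removed from the store.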