
feat(decisions): gate hardening, decision tiers, RAG indexing, and test coverage#78

Merged
clay-good merged 25 commits into
clay-good:mainfrom
laurentftech:fix/setup-ui-align
May 12, 2026

Conversation

@laurentftech

@laurentftech laurentftech commented May 9, 2026

Collaborator

Summary

  • Gate hardening: phantom bypass fix, approved_not_synced gate reason, --no-verify sentinel detection via post-commit hook
  • Consolidation reliability: use IDs as traceability anchor to prevent duplicate decisions from LLM title variation; stable replaceDecisions fixes silent consolidation drop bug; merged decisions inherit earliest superseded recordedAt
  • Decision scope tiers: DecisionScope = 'local' | 'component' | 'cross-domain' | 'system' — only cross-domain and system generate ADR files, preventing ADR spam from trivial implementation choices
  • Scope classification:
    • Consolidation LLM prompt includes scope criteria with explicit negative rules
    • Extractor capped to local/component — cross-domain upgrade requires full multi-domain visibility and therefore only occurs during consolidation
    • Auto-promotion in record_decision uses two independent deterministic triggers: structural (files span 2+ distinct top-level dirs) OR semantic (multi-domain + contract keyword, not refactor)
  • Gate violation taxonomy centralized: GATE_REASONS constant in constants.ts — all 4 reason codes shared by gate handler, docs, and tests
  • Store/ADR sync invariant: purge only after all per-decision syncs complete; partial failure leaves decision at approved (safe to retry)
  • Store lifecycle: purgeInactiveDecisions() runs on every sync; --sync always calls the syncer, so the purge runs even with zero approved decisions (fixes early-exit skip)
  • RAG indexing: ADR files indexed under domain "decisions" → search_specs and orient return relevant ADRs semantically
  • Orient context: pending decisions surfaced by task domain/file relevance
  • EMFILE fix on macOS: chokidar v5 + kqueue opened 28k+ file descriptors before evaluating glob patterns. Switched ignored from glob array to a string-segment function evaluated before FDs are opened — eliminates EMFILE on projects with large node_modules
  • Replace better-sqlite3 with node:sqlite built-in: eliminates native addon compilation and GitHub prebuilt download at install time. Required for enterprise environments behind Nexus or other isolated registries. node:sqlite is built into Node.js 22.5+ — zero extra dependencies. Requires Node.js ≥ 22.5.0 (bumped from 20)
  • CI bumped to Node 24 LTS: node:sqlite experimental warning is absent on Node 24+. Node 22 still works but prints the warning to stderr (suppress with NODE_NO_WARNINGS=1)
  • MCP tool annotations: readOnlyHint/destructiveHint/idempotentHint added to all 45 tools — improves Claude Code Tool Search ranking so deferred tools surface faster on relevant queries
  • MCP --minimal profile: spec-gen mcp --minimal exposes only 5 core tools (orient, search_code, record_decision, detect_changes, check_spec_drift). Pair it with alwaysLoad: true in Claude Code .mcp.json for always-visible core tools; run the full server with alwaysLoad: false (default) so the remaining 40 tools stay deferred and searchable via Tool Search. Documented in docs/agent-setup.md
  • orient() suggestedTools field: orient now returns a ranked list of relevant spec-gen tools derived from what it already knows (hub presence, spec domains, task keywords) — no extra I/O. Provides portable tool discovery for Cline/Cursor/OpenCode where Claude Code's Tool Search is unavailable
  • +27 tests this session on scope gate, auto-promotion triggers, consolidator scope mapping, synced re-approval guard (+98 total vs base)

Decision scope tiers

local        — single file, no cross-cutting concern (refactors, extractions, renames)
component    — single component/service/module, no cross-boundary contract impact
cross-domain — touches multiple spec domains AND changes behavioral contracts  ← ADR written
system       — global architectural constraint (auth, infra, data model, API) ← ADR written

Scope promotion uses deterministic triggers that prevent LLM-only escalation paths — structural trigger (files span 2+ distinct top-level source dirs mapped to different inferred domains) OR semantic (multi-domain + contract keyword + not refactor). Extractor always outputs local/component; upgrade to cross-domain/system only at consolidation.

Old pending.json without scope → treated as component (no ADR). CLI --list shows scope badge: gray=local, blue=component, yellow=cross-domain, red=system.
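
The tier rules above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: DecisionScope and qualifiesForADR are named in the description, while DecisionLike, effectiveScope, and spansMultipleTopLevelDirs are hypothetical names. The structural check is simplified to "2+ distinct top-level directories" and leaves out the mapping from directories to inferred domains.

```typescript
// Sketch only: DecisionLike, effectiveScope and spansMultipleTopLevelDirs are
// hypothetical names used for illustration.
type DecisionScope = 'local' | 'component' | 'cross-domain' | 'system';

interface DecisionLike {
  id: string;
  scope?: DecisionScope; // absent in pending.json files written before scope tiers
  files: string[];
}

// Legacy entries without a scope are treated as 'component' (no ADR).
function effectiveScope(d: DecisionLike): DecisionScope {
  return d.scope ?? 'component';
}

// Only the two widest tiers produce an ADR file.
function qualifiesForADR(d: DecisionLike): boolean {
  const scope = effectiveScope(d);
  return scope === 'cross-domain' || scope === 'system';
}

// Simplified structural trigger: files span 2+ distinct top-level directories.
function spansMultipleTopLevelDirs(files: string[]): boolean {
  const dirs = new Set<string>();
  for (const f of files) {
    const parts = f.replace(/\\/g, '/').split('/').filter(Boolean);
    if (parts.length > 1) dirs.add(parts[0]); // files at the repo root are ignored
  }
  return dirs.size >= 2;
}
```

Keeping the promotion check a pure function of the file list is what makes it deterministic: the same decision always lands in the same tier, regardless of how an LLM phrases it.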

node:sqlite migration notes

  • Named params: { name: v } → { '@name': v }
  • db.pragma() → db.exec('PRAGMA ...')
  • db.transaction(fn)() → runTransaction(db, fn) helper using SAVEPOINT for nested transaction support
  • .all() / .get() return Record<string, SQLOutputValue> — cast via as unknown as T
  • Experimental warning: present on Node 22, absent on Node 24+. Not suppressed programmatically — process.emit/process.emitWarning monkeypatching is fragile and can mask legitimate warnings. Use NODE_NO_WARNINGS=1 on Node 22 if needed.
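
For readers unfamiliar with the SAVEPOINT approach, a helper along the lines of runTransaction might look like the sketch below. It is an assumption about the shape, not the helper shipped in this PR: the ExecLike interface, the counter, and the savepoint naming are all illustrative. It relies only on an exec(sql) method, which node:sqlite's DatabaseSync provides.

```typescript
// Minimal surface needed from a connection. node:sqlite's DatabaseSync
// exposes exec(sql); ExecLike itself is a hypothetical name.
interface ExecLike {
  exec(sql: string): void;
}

let savepointCounter = 0; // gives each call a unique savepoint name

function runTransaction<T>(db: ExecLike, fn: () => T): T {
  const name = `sp_${++savepointCounter}`;
  db.exec(`SAVEPOINT ${name}`);
  try {
    const result = fn();
    db.exec(`RELEASE ${name}`); // releasing the outermost savepoint commits
    return result;
  } catch (err) {
    // Undo work done since the savepoint, then drop the savepoint itself.
    db.exec(`ROLLBACK TO ${name}`);
    db.exec(`RELEASE ${name}`);
    throw err;
  }
}
```

Because SQLite savepoints nest, an inner runTransaction call that throws rolls back only its own work, and the outer call decides whether to propagate the error.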

Two-server MCP config (token-efficient Claude Code setup)

{
  "mcpServers": {
    "spec-gen-core": {
      "type": "stdio",
      "command": "spec-gen",
      "args": ["mcp", "--minimal"],
      "alwaysLoad": true
    },
    "spec-gen": {
      "type": "stdio",
      "command": "spec-gen",
      "args": ["mcp"],
      "alwaysLoad": false
    }
  }
}

Core 5 always in context (~500 tokens). Full 45 deferred, searchable via Tool Search.

Test plan

  • npm test — 2663 tests pass
  • record_decision with local helper change → no ADR after sync (automated test)
  • record_decision with cross-domain auth change → ADR created after sync (automated test)
  • record_decision with 2 inferred domains + contract keyword → auto-promoted to cross-domain (automated test)
  • record_decision with files in 2 different top-level dirs → auto-promoted to cross-domain (automated test)
  • spec-gen decisions --list → scope badge visible (gray/blue/yellow/red manually verified)
  • Old pending.json without scope field → loads cleanly, defaults to component (manually verified)
  • git commit with approved-not-synced decisions → gate blocks with approved_not_synced reason (manually verified)
  • git commit --no-verify → post-commit hook logs bypass warning (manually verified)
  • spec-gen analyze on project with large node_modules → no EMFILE error on macOS (manually verified)
  • npm install in isolated network (no GitHub access) → installs cleanly, no native compilation (node:sqlite built-in)
  • Node 24 — no node:sqlite experimental warning (manually verified)
  • Node 22 — experimental warning present but suppressed with NODE_NO_WARNINGS=1 (manually verified)
  • spec-gen mcp --minimal → only 5 tools listed (manually verified via CLI help)
  • orient("add payment method") → suggestedTools includes get_subgraph, get_spec, check_spec_drift (manually verified)

🤖 Generated with Claude Code

Common tools (Claude Code → Cline/Roo → Mistral Vibe) now follow the same
order as the analyze --ai-configs prompt. Setup-only tools follow after:
OpenCode, GSD, BMAD, omoa.

Also document why recursive CTE BFS was reverted in bfsFromDB comment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@laurentftech laurentftech marked this pull request as draft May 9, 2026 19:47
laurentftech and others added 2 commits May 9, 2026 21:56
Stale phantom decisions from prior sessions were counting as activeDecisions,
silently disabling the pre-commit gate for all future commits. Phantom means
"recorded but no code evidence found in diff" — it should not shield unrelated
future commits. Added 26 tests for decisions handlers (0%→88%) and orient
branch coverage (56%→61% branches).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
upsertDecisions skips existing IDs, so consolidated decisions sharing IDs
with their rejected original drafts were silently no-oped — the gate never
saw verified decisions to present for approval. replaceDecisions always
overwrites, ensuring verified/phantom status overwrites the rejected
placeholder set in the prior step.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
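
The difference between the two write paths in the commit above can be illustrated with in-memory stand-ins. These are not the real store functions; the Map-based store and the signatures are assumptions made for the sketch.

```typescript
// In-memory stand-ins for illustration; the real store API is not shown in this PR.
interface StoredDecision {
  id: string;
  status: 'rejected' | 'verified' | 'phantom';
  title: string;
}
type Store = Map<string, StoredDecision>;

// Skip-on-conflict: an existing ID is left untouched.
function upsertDecisions(store: Store, incoming: StoredDecision[]): void {
  for (const d of incoming) {
    if (!store.has(d.id)) store.set(d.id, d);
  }
}

// Always overwrite: the incoming row wins.
function replaceDecisions(store: Store, incoming: StoredDecision[]): void {
  for (const d of incoming) store.set(d.id, d);
}
```

With a rejected placeholder already stored under the same ID, the upsert variant leaves the placeholder in place, while the replace variant stores the verified decision.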
@laurentftech laurentftech changed the title fix(setup): align tool order with analyze --ai-configs fix(decisions): gate bypass, consolidation drop, and test coverage May 9, 2026
laurentftech and others added 4 commits May 9, 2026 22:20
Pre-commit hook writes .git/SPEC_GEN_GATE_RAN after gate passes.
Post-commit hook (not skipped by --no-verify) checks for it — if absent,
prints a visible warning that the gate was bypassed and decisions were
not reviewed. Normal commits clean up the sentinel silently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
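
The sentinel handshake described in the commit above could be sketched like this. The hooks in the repository may well be shell scripts; this TypeScript version only illustrates the logic. The function names are hypothetical, and gitDir is assumed to be the directory reported by git rev-parse --git-dir.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

const SENTINEL = 'SPEC_GEN_GATE_RAN'; // file name taken from the commit message

// Pre-commit side (hypothetical helper): record that the gate ran and passed.
function markGatePassed(gitDir: string): void {
  fs.writeFileSync(path.join(gitDir, SENTINEL), new Date().toISOString());
}

// Post-commit side (hypothetical helper): returns true when the gate was bypassed.
function gateWasBypassed(gitDir: string): boolean {
  const file = path.join(gitDir, SENTINEL);
  if (fs.existsSync(file)) {
    fs.rmSync(file); // normal commit: clean up silently
    return false;
  }
  return true; // sentinel missing, so the pre-commit hook did not run
}
```

The check works because git commit --no-verify skips pre-commit but still runs post-commit, so a missing sentinel is the only signal needed.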
Approved decisions must be synced to spec files before committing.
Gate now emits `approved_not_synced` reason code and exits 1 when
any decision is in `approved` state.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove isArchitectural keyword-threshold gate — any approved decision
is significant enough to warrant an ADR. The keyword filter was too
narrow and missed real architectural decisions (SQLite, gate logic).

Backfill openspec/decisions/ with ADRs for 3 already-synced decisions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…er, detectChanges

Adds 36 tests for previously uncovered handlers:
- handleGetMiddlewareInventory/SchemaInventory/UIComponents/EnvVars/ExternalPackages
  (cached hit + live fallback for each)
- handleGetMinimalContext: no graph, function not found, callers/callees, testedBy edges
- handleGetCluster: no graph, no function, no community, members + stats
- handleDetectChanges: no graph, git-not-repo error path
- handleAuditSpecCoverage: error path

Statements: 30% → 61% on analysis.ts. Overall: 80% → 81.3%.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@laurentftech laurentftech marked this pull request as ready for review May 9, 2026 20:54
- Sentinel path: use `git rev-parse --git-dir` instead of hardcoded .git/
- Sentinel comment: clarify it marks a passed gate, not a bypass
- `consolidatedRecently`: extract magic 60-min window to CONSOLIDATION_GRACE_PERIOD_MS constant
- Gate: remove dead `missing.length === 0` guard (missing always empty at that point)
- decisions.test.ts: remove duplicate mockResolvedValue in beforeEach
- decisions.test.ts: add duplicate-title test (makeDecisionId is deterministic → upsert skips)
- syncer.ts: document why every decision gets an ADR (no keyword filter)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@laurentftech laurentftech marked this pull request as draft May 9, 2026 21:37
laurentftech and others added 7 commits May 10, 2026 09:36
…centralize status logic

- Preserve recordedAt from original draft during consolidation: replaceDecisions
  overwrites the row, losing when the decision was first recorded. Fix: carry forward
  original.recordedAt when consolidated ID matches an existing draft.
- Guard --approve on synced decisions: re-approving a synced decision would mark it
  approved again and force another --sync cycle. Block with error instead.
- Centralize inactive status set: add INACTIVE_STATUSES, isBlockingStatus,
  requiresSync to store.ts. Replace inline ['rejected','synced','phantom'] filter
  in gate with INACTIVE_STATUSES.has() to prevent divergence across handlers.
- Extract CONSOLIDATION_GRACE_PERIOD_MS constant (already in previous commit).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
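
A centralized status module in the spirit of the commit above might look like the following sketch. Only the three inactive statuses are stated in the commit message. The full status list and the exact semantics of isBlockingStatus and requiresSync are assumptions made for illustration.

```typescript
// Assumed status vocabulary, collected from statuses mentioned across this PR.
type DecisionStatus = 'draft' | 'verified' | 'approved' | 'synced' | 'rejected' | 'phantom';

// Stated in the commit message: these three no longer count as active work.
const INACTIVE_STATUSES: ReadonlySet<DecisionStatus> =
  new Set<DecisionStatus>(['rejected', 'synced', 'phantom']);

// Assumption: any status that is not inactive still blocks the gate.
function isBlockingStatus(status: DecisionStatus): boolean {
  return !INACTIVE_STATUSES.has(status);
}

// Assumption: only approved decisions are waiting for a --sync run.
function requiresSync(status: DecisionStatus): boolean {
  return status === 'approved';
}
```

Handlers that call INACTIVE_STATUSES.has() instead of repeating an inline array cannot drift apart when a status is added later.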
Pass existing non-draft decisions (with their stable IDs) to the
consolidation LLM. The LLM now reuses an existing ID when it recognises
the same architectural concept, instead of minting a new one from a
potentially varied title. An LLM-supplied ID is only accepted when it
matches a known existing decision ID — fabricated IDs are silently
ignored and a new ID is derived as before.

This prevents the duplicate-decision problem where two consolidation
runs on the same diff produced different IDs due to LLM title phrasing
non-determinism.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
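
The acceptance rule for LLM-supplied IDs in the commit above amounts to an allow-list check. A minimal sketch, with hypothetical names throughout (resolveDecisionId and slugId are not from the PR; slugId merely stands in for whatever deterministic title-based derivation the project uses):

```typescript
// Accept an LLM-supplied ID only when it names a decision we already know.
function resolveDecisionId(
  llmSuppliedId: string | undefined,
  title: string,
  knownIds: ReadonlySet<string>,
  deriveId: (title: string) => string,
): string {
  if (llmSuppliedId !== undefined && knownIds.has(llmSuppliedId)) {
    return llmSuppliedId; // reuse the stable ID of the existing decision
  }
  return deriveId(title); // fabricated or missing ID: fall back to derivation
}

// Toy stand-in for a deterministic title-based ID.
function slugId(title: string): string {
  return title.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}
```

Since the LLM can only select from IDs that already exist, a hallucinated ID can never create or overwrite a decision on its own.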
The approved_not_synced + 3 other reason-code sections were accidentally
stripped from AGENTS.md during previous edits. Restored to match CLAUDE.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
After decisions are synced their content lives in ADR files and spec.md
committed to git — the store entries are redundant. purgeInactiveDecisions()
now runs inside syncApprovedDecisions before saving, dropping synced,
rejected, and phantom entries. The store stays bounded by active work only.

The in-memory store returned to callers is left unpurged so the sync result
(synced list, modifiedSpecs) remains fully observable.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ADR files (openspec/decisions/adr-*.md) are now indexed in the spec
vector index under domain "decisions". Agents calling search_specs or
orient will find relevant past decisions via semantic search — no need
to read ADR files manually.

orient's pendingDecisions filter now uses INACTIVE_STATUSES for
consistency and restricts results to decisions matching the task's
relevant domains or files (instead of dumping all active decisions).
Approved decisions always surface regardless of domain match — the
agent must sync them before committing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
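
The pendingDecisions filtering rule from the commit above can be sketched as a single predicate. The field names (domains, files), the status list, and the function name are assumptions for illustration; only the rule itself comes from the commit message.

```typescript
type Status = 'draft' | 'verified' | 'approved' | 'synced' | 'rejected' | 'phantom';

interface PendingDecision {
  id: string;
  status: Status;
  domains: string[]; // hypothetical field
  files: string[];   // hypothetical field
}

const INACTIVE_STATUSES: ReadonlySet<Status> =
  new Set<Status>(['rejected', 'synced', 'phantom']);

function relevantPendingDecisions(
  all: PendingDecision[],
  taskDomains: ReadonlySet<string>,
  taskFiles: ReadonlySet<string>,
): PendingDecision[] {
  return all.filter((d) => {
    if (INACTIVE_STATUSES.has(d.status)) return false;
    if (d.status === 'approved') return true; // always surfaced: must sync before committing
    return d.domains.some((x) => taskDomains.has(x)) || d.files.some((f) => taskFiles.has(f));
  });
}
```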
- mcp-tools.md: search_specs mentions ADR indexing; sync_decisions mentions
  store purge; orient scenario lists ADR/decision context in results;
  gate reason codes added to decisions workflow scenario
- ci-cd.md: gate reason codes table + sentinel bypass detection documented
- cli-reference.md: --sync purge behavior, --gate reason codes, --approve
  synced guard noted

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When consolidation merges multiple drafts into one new decision (new ID,
no direct match in originalById), the provenance map previously fell back
to the consolidation timestamp — erasing when the underlying work was first
captured. Now uses the earliest recordedAt across all superseded draft IDs
so the audit trail anchors to the real start of the work.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
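
The provenance rule in the commit above (take the earliest recordedAt among superseded drafts) could be written as below. The function name and parameter shapes are hypothetical, and timestamps are assumed to be ISO 8601 strings that Date.parse can read.

```typescript
// Hypothetical helper: pick the earliest recordedAt across superseded drafts,
// falling back to the consolidation timestamp when none is known.
function earliestRecordedAt(
  supersededIds: string[],
  originalById: Map<string, { recordedAt: string }>,
  fallback: string,
): string {
  let earliest: string | undefined;
  for (const id of supersededIds) {
    const ts = originalById.get(id)?.recordedAt;
    if (ts === undefined) continue; // unknown draft ID: nothing to inherit
    if (earliest === undefined || Date.parse(ts) < Date.parse(earliest)) {
      earliest = ts;
    }
  }
  return earliest ?? fallback;
}
```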
@laurentftech laurentftech changed the title fix(decisions): gate bypass, consolidation drop, and test coverage feat(decisions): gate hardening, decision tiers, RAG indexing, and test coverage May 10, 2026
laurentftech and others added 2 commits May 10, 2026 15:43
Introduce DecisionScope = 'local' | 'component' | 'cross-domain' | 'system'.
Only cross-domain and system decisions produce ADR files, preventing ADR spam
from trivial implementation choices.

- qualifiesForADR() in syncer gates ADR creation on scope
- Consolidation LLM prompt includes scope classification with explicit negative
  rules; extractor capped to local/component (cross-domain upgrade requires
  full multi-domain context visible only at consolidation time)
- record_decision auto-promotes scope via two deterministic triggers:
  structural (files span 2+ distinct top-level dirs) or semantic
  (multi-domain + contract keyword + not refactor)
- GATE_REASONS constant centralizes all 4 gate reason codes in constants.ts
- synced re-approval guard added to MCP handleApproveDecision handler
- --sync always calls syncApprovedDecisions so purgeInactiveDecisions runs
  even when no approved decisions exist (fixes early-exit purge skip)
- scope badge in decisions --list (gray/blue/yellow/red by tier)
- Backward compat: pending.json without scope treated as component

+27 tests: scope gate, auto-promotion triggers, consolidator scope mapping,
synced re-approval guard, structural/semantic promotion triggers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…scope tiers

- Remove dangling sentence after opening paragraph
- Update test count from 2580+ to 2660+
- Add scope tier description to Decisions section

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@laurentftech laurentftech marked this pull request as ready for review May 10, 2026 14:45
laurentftech and others added 7 commits May 11, 2026 20:17
… macOS

chokidar v5 on macOS (kqueue) opens file descriptors for all directories
before evaluating glob patterns, consuming 20k+ FDs on node_modules alone.

Switch ignored to a function that checks string segments before any FD is
opened. Also add followSymlinks: false to reduce watch surface further.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
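
A string-segment predicate of the kind described in the commit above might look like this. The segment list and function name are assumptions, not the values used in the PR.

```typescript
// Hypothetical ignore list; the actual segments used by the project may differ.
const IGNORED_SEGMENTS: ReadonlySet<string> = new Set(['node_modules', '.git', 'dist']);

// Pure string check: no filesystem access, so no file descriptor is opened.
function isIgnoredPath(p: string): boolean {
  return p.split(/[\\/]/).some((segment) => IGNORED_SEGMENTS.has(segment));
}
```

A function like this can be passed as chokidar's ignored option together with followSymlinks: false. Matching whole path segments rather than substrings avoids accidentally ignoring a file such as src/node_modules_helper.ts.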
Eliminates native addon compilation requirement, enabling installation
in isolated environments (Nexus, restricted networks) where GitHub
prebuilts are unavailable.

- Migrate EdgeStore to node:sqlite DatabaseSync API
- Replace db.transaction() with SAVEPOINT-based runTransaction() helper
  to support nested transactions (node:sqlite has no native nesting)
- Update named params from { name: v } to { '@name': v } format
- Replace db.pragma() with db.exec('PRAGMA ...')
- Remove better-sqlite3 and @types/better-sqlite3 from dependencies
- Bump engines.node to >=22.5.0 (node:sqlite available since 22.5.0)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
API is stable (release candidate since PR #61262). Suppress with NODE_NO_WARNINGS=1.
Not suppressed programmatically — process.emit/emitWarning monkeypatching is fragile.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
node:sqlite is available since Node 22.5.0. CI was on Node 20 which
throws ERR_UNKNOWN_BUILTIN_MODULE.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
node:sqlite experimental warning is gone on Node 24+.
Bump CI from Node 22 to Node 24 (current LTS).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…iency

- Add TOOL_ANNOTATIONS map (readOnlyHint/destructiveHint/idempotentHint) on all
  45 tools; merged into ListTools response to improve Claude Code Tool Search ranking
- Add --minimal flag: exposes only 5 core tools (orient, search_code,
  record_decision, detect_changes, check_spec_drift) for alwaysLoad: true entry
- Document two-server Claude Code config in docs/agent-setup.md: core 5 always
  visible (~500 tokens), full 45 deferred and searchable via Tool Search

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
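
One way to merge an annotations map into a tool listing and to apply the --minimal filter is sketched below. The three hint names come from the commit above and the core tool list from the PR description. The ToolDef shape, the function names, and the hint values assigned to individual tools are assumptions for illustration only.

```typescript
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
}

interface ToolDef {
  name: string;
  description: string;
  annotations?: ToolAnnotations;
}

// Illustrative values only; the real map covers all 45 tools.
const TOOL_ANNOTATIONS: Record<string, ToolAnnotations> = {
  orient: { readOnlyHint: true, idempotentHint: true },
  record_decision: { readOnlyHint: false, destructiveHint: false },
};

// Core tools exposed by --minimal, as listed in the PR description.
const MINIMAL_TOOLS: ReadonlySet<string> = new Set([
  'orient', 'search_code', 'record_decision', 'detect_changes', 'check_spec_drift',
]);

// Attach annotations where the map has an entry; leave other tools unchanged.
function withAnnotations(tools: ToolDef[]): ToolDef[] {
  return tools.map((t) => {
    const annotations = TOOL_ANNOTATIONS[t.name];
    return annotations ? { ...t, annotations } : t;
  });
}

// Restrict the listing to the core set when minimal mode is requested.
function selectTools(tools: ToolDef[], minimal: boolean): ToolDef[] {
  return minimal ? tools.filter((t) => MINIMAL_TOOLS.has(t.name)) : tools;
}
```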
Derives a ranked list of relevant spec-gen tools from what orient already
knows (hub presence, spec domains, task keywords) — no extra I/O.

Works on any MCP client without Tool Search: agent reads orient output,
sees suggestedTools, knows which tools to request next. Complements the
Claude Code --minimal + Tool Search setup for Cline/Cursor/OpenCode users.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@clay-good

Owner

Thank you @laurentftech ❤️

@clay-good clay-good merged commit 2cd661a into clay-good:main May 12, 2026
4 checks passed