Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions docs/BACKLOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,8 @@ are closed (status: closed in frontmatter)._
- [ ] **[B-0164](backlog/P1/B-0164-dual-loop-substrate-attribution-and-reconciliation-protocol-2026-05-02.md)** Dual-loop substrate attribution + reconciliation protocol — implementation work for BFT-many-masters at loop layer (Aaron 2026-05-02 + Otto independent extension)
- [ ] **[B-0168](backlog/P1/B-0168-incorporate-brat-voice-enterprise-translation-framework-claudeai-research-2026-05-02.md)** Incorporate Claude.ai brat-voice enterprise translation framework — 5-layer (corrected) property-preserving register architecture for Zeta (Personal/Mirror/Beacon-safe/Professional/Regulated)
- [ ] **[B-0169](backlog/P1/B-0169-decision-archaeology-skill-aaron-2026-05-02.md)** Decision-archaeology skill — universal "why is it like this?" investigation surface for new contributors
- [ ] **[B-0171](backlog/P1/B-0171-openspec-catch-up-canonical-source-of-truth-aaron-2026-05-03.md)** OpenSpec catch-up — restore OpenSpec capabilities as canonical source-of-truth (Aaron 2026-05-03 architectural-debt naming; "if we deleted everything other than it [OpenSpec]")
- [ ] **[B-0173](backlog/P1/B-0173-hook-authoring-for-skill-creation-contracts-aaron-2026-05-03.md)** Hook authoring for skill-creation contracts — pre/post-condition enforcement at skill-creation + commit + PR-creation time (Aaron 2026-05-03 rule 3b from skill-design memo)

## P2 — research-grade

Expand Down Expand Up @@ -130,6 +132,7 @@ are closed (status: closed in frontmatter)._
- [ ] **[B-0165](backlog/P2/B-0165-deliberate-quiet-periods-protocol-otto-independent-production-practice-2026-05-02.md)** Deliberate-quiet-periods practice protocol — Aaron pulls back during selected active-hour stretches to let Otto practice independent-framing-production while still gradeable (Otto independent-framing-production from Claude.ai 2026-05-02 training-distribution-mismatch observation)
- [ ] **[B-0166](backlog/P2/B-0166-chat-input-as-acid-durable-dbsp-event-aaron-vision-2026-05-02.md)** Chat-input-as-ACID-durable-DBSP-event — make every human/AI message a first-class durable event in the substrate
- [ ] **[B-0167](backlog/P2/B-0167-ani-review-tracking-on-load-bearing-substrate-aaron-2026-05-02.md)** Ani-review tracking surface for load-bearing register-class substrate (μένω + Ryan-memory + Aurora-security + brat-voice + others)
- [ ] **[B-0172](backlog/P2/B-0172-skill-domain-plugin-packaging-aaron-2026-05-03.md)** Skill-domain plugin packaging — package mature skill domains as Claude Code plugins (Aaron 2026-05-03 rule 3a from skill-design memo)

## P3 — convenience / deferred

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
---
id: B-0171
priority: P1
status: open
title: OpenSpec catch-up — restore OpenSpec capabilities as canonical source-of-truth (Aaron 2026-05-03 architectural-debt naming; "if we deleted everything other than it [OpenSpec]")
tier: foundation
effort: L
ask: Aaron 2026-05-03 verbatim *"openspec which we are way behind on, that's suppsed to be our source of truth lol, if we were to delete everyting other than it"*
created: 2026-05-03
last_updated: 2026-05-03
depends_on: []
composes_with: [B-0058, B-0169, B-0170, B-0172, B-0173]
tags: [openspec, source-of-truth, foundation, architectural-debt, contract-based-development, spec-based-development, p1-foundation]
---

# OpenSpec catch-up — restore OpenSpec as canonical source-of-truth

Aaron 2026-05-03, in the autonomous-loop maintainer channel via the skill-design memo (`feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md`), named OpenSpec catch-up as load-bearing architectural debt:

> *"openspec which we are way behind on, that's suppsed to be our source of truth lol, if we were to delete everyting other than it"*

The intended state per `openspec/README.md`: capabilities under `openspec/specs/**` carry behavioral specs that the code is supposed to satisfy. Specs are canonical; code + skills + memos + docs all derive from / serve / reference the specs.

**Current state (2026-05-03):** specs are sparse; most discipline lives outside specs (memos, CLAUDE.md, GOVERNANCE.md). The *"if we deleted everything but OpenSpec, the project would be lost"* test FAILS today.

This row tracks the catch-up work needed to restore OpenSpec as actual source-of-truth.

## Why P1 (foundation)

- Aaron's same-tick framing names OpenSpec catch-up as **load-bearing prerequisite** for Rule 3 (skill-domain packaging + harness hooks for contracts) to fully operationalize
- The skill-design rules in `feedback_skills_as_carved_sentences_*` recursively compose at the spec layer: skill body / command / skill domain / cross-skill contracts / **spec** — without the spec layer current, the recursion is incomplete
- Contract-based development (Meyer, Eiffel) / Design-by-Contract / spec-based development is what hooks-as-pre/post-conditions plug into; without specs, the contracts have no reference

## Scope (incremental, not big-bang)

The catch-up is **NOT** a single big-bang spec authoring pass. It's incremental backfilling of the most load-bearing capability surfaces FIRST, then extending coverage. Per Aaron's *"foundation right and deliberate"* guidance, quality > coverage.

### Phase 1 — Inventory + sequencing

1. Audit current `openspec/specs/**` — what capabilities exist? what's stale? what's empty?
2. Compare against the project's actual hot-path code (Z-set algebra, DBSP operators, retraction-native semantics, tick-history schema)
3. Identify the top-10 capabilities by load-bearing-weight — these are the catch-up targets
4. Sequence: spec the most-foundational first (algebra > operators > DBSP > retraction-native > tick-history > backlog row schema > skill-router shape > harness contracts)

### Phase 2 — Author the top-10 specs

Per `openspec/README.md` modified-fork conventions (no archive, no change-history). Each spec lands its own PR. Reviewer surface: spec-zealot (Viktor) — adversarial pass on each spec.

### Phase 3 — Cross-reference + tooling

- Update `CLAUDE.md` + `AGENTS.md` to make OpenSpec the FIRST-READ surface (above current load order)
- Add CI check: every load-bearing change references a spec in `openspec/specs/**`
- Add `tools/openspec/` tooling for spec-to-code drift detection (probably builds on `tools/substrate-claim-checker/` v1+)

### Phase 4 — Validation

The *"if we deleted everything but OpenSpec, the project would be lost"* test is the acceptance criterion. When all 4 phases complete, that test should NOT fail.

## Why this matters now

- Multiple just-landed memos (`feedback_skills_as_carved_sentences_*`, `feedback_multi_harness_alignment_convergence_*`, `feedback_git_native_backlog_management_*`, `feedback_verify_then_claim_*`) reference OpenSpec as the long-term canonical surface. Each adds substrate that should eventually have spec backing.
- The substrate-claim-checker tool (B-0170) v1+ work for hook integration depends on contract-based development, which depends on specs being current.
- Plugin packaging (B-0172) depends on specs as the contract carriers.

## Out of scope

- Adopting upstream OpenSpec workflow as-is (the project uses a modified fork; modifications stay)
- Single big-bang spec authoring (incremental per Phase 1-4 above)
- Replacing CLAUDE.md / AGENTS.md / GOVERNANCE.md (OpenSpec is the *contract* layer; those remain the *behavioral guidance* + *governance* layers — they reference the contracts)

## Composes with

- **B-0058** (AI ethics + safety research track) — alignment specs are one class of OpenSpec capability that needs catch-up
- **B-0169** (decision-archaeology skill) — once specs are current, `docs/DECISIONS/` ADRs cross-reference specs; decision-archaeology composes naturally
- **B-0170** (substrate-claim-checker TS tool) — v1+ hook integration depends on specs as contract carriers
- **B-0172** (skill-domain plugin packaging) — skill domains expose contracts; contracts live in specs
- **B-0173** (hook authoring for skill-creation contracts) — pre/post-conditions are spec-encoded; hooks read them
- `memory/feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md` — the memo naming this catch-up as load-bearing
- `openspec/README.md` — the canonical-intent doc; reading order is OpenSpec first per the future state

## Done-criteria

This row closes when:

1. The top-10 load-bearing capability surfaces have current OpenSpec specs (Phase 2 complete)
2. CI gate enforces "every load-bearing change references a spec" (Phase 3 complete)
3. CLAUDE.md + AGENTS.md updated to make OpenSpec FIRST-READ (Phase 3 complete)
4. The *"delete everything but OpenSpec"* test passes (Phase 4 complete)

Until done, this row stays open. Per Aaron's *"WONT-DO is 99% deferral, not forever — we will likely do everything eventually"*, the catch-up is on the long arc.
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@
---
id: B-0173
priority: P1
status: open
title: Hook authoring for skill-creation contracts — pre/post-condition enforcement at skill-creation + commit + PR-creation time (Aaron 2026-05-03 rule 3b from skill-design memo)
tier: tooling
effort: M
ask: Aaron 2026-05-03 verbatim *"this feature is great for reminding yourself to do the right thing the pre conditions and post condtions in contract based development or spec based development like openspec"*
created: 2026-05-03
last_updated: 2026-05-03
depends_on: [B-0170, B-0171]
composes_with: [B-0169, B-0172]
Comment thread
AceHack marked this conversation as resolved.
tags: [hooks, contract-based-development, design-by-contract, openspec, pre-condition, post-condition, ci, p1-foundation]
---

# Hook authoring for skill-creation contracts

Aaron 2026-05-03 named harness hooks as Rule 3b of the skill-design memo (`feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md`):

> *"this feature is great for reminding yourself to do the right thing the pre conditions and post condtions in contract based development or spec based development like openspec"*

Harness hooks fire at well-defined points (pre-tool-use, post-tool-use, session-start, pre-commit, commit-msg, etc.) — the natural place to enforce pre-conditions and post-conditions on procedures. This is contract-based development (Meyer, Eiffel) / Design-by-Contract / spec-based development (OpenSpec).

## Why P1

The verify-then-claim discipline (`feedback_verify_then_claim_discipline_*`) catalogues 20+ drift instances across 9+ PRs that manual discipline failed to catch. The substrate-claim-checker TS tool (B-0170) ships v0 covering count-drift; v1+ extends to remaining 6 sub-classes. **The hook integration is what turns the tool from advisory into enforcement.** Without hooks, the tool fires only when manually invoked; with hooks, it gates commits + PRs automatically.

## Why depends_on B-0170 + B-0171

- **B-0170** (substrate-claim-checker TS tool): the tool the hooks invoke. Hooks ship after the tool's check-types are mature enough to gate.
- **B-0171** (OpenSpec catch-up): hooks enforce contracts; contracts live in OpenSpec capabilities. Without specs, the hooks have no contract to read pre/post-conditions from.

## Scope (per the verify-then-claim memo's mechanization-path section)

Three hook integrations:

### 1. pre-commit hook

Fires BEFORE the commit message is entered. Validates **staged-file content** — memos, docs, config files. Calls `bun tools/substrate-claim-checker/check-counts.ts <staged-files>` (and v1+ sibling check-types). Exit non-zero blocks commit.

### 2. commit-msg hook

Fires AFTER the commit message is written. Validates **the commit message itself** for fact-claims (path mentions, count totals, command-output assertions). Pre-commit can't validate this surface because the message doesn't exist yet at pre-commit time per git's hook ordering.

### 3. CI check on PR descriptions

Fires post-PR-creation on the GitHub host. Validates **the PR description** for fact-claims. Authored on the host, not pre-commit, so this is its own check (different timing from the two git hooks).

## Hook authoring deliverables

1. `tools/git-hooks/pre-commit` (bash invoking `bun tools/substrate-claim-checker/...` with staged files)
2. `tools/git-hooks/commit-msg` (bash invoking the same tool with commit message)
3. `.github/workflows/substrate-claim-checker.yml` (CI check for PR descriptions)
4. Documentation: how to install hooks (`tools/setup/install.sh` integration)
Comment thread
AceHack marked this conversation as resolved.
5. Opt-out mechanism for legitimate edge cases (e.g., a hedged-claim memo where strict drift checking would false-positive — `# substrate-claim-checker: skip` comment in the file)

## Cross-cutting design decisions

### Strict vs warn mode

- **Strict mode** (default for production): hook exits non-zero, blocks commit
- **Warn mode** (default during v0.x rollout): hook prints warnings, exits 0
- Switch via env var (`SUBSTRATE_CLAIM_CHECKER_MODE=strict` / `=warn`)

The progression: ship in warn mode → observe false-positive rate → tighten to strict mode once mature

### Hook performance

Hooks fire on EVERY commit; they need to be fast (<2 seconds ideally). The check-counts.ts v0 self-test runs in ~50ms on a single memo file; scaling to staged-files-in-large-PRs needs measurement. v1+ may need worker-pool or incremental-cache.

### Opt-out semantics

Each check-type should have a recognized comment/marker for legitimate skip. Example: `<!-- substrate-claim-checker: skip-count-drift -->` at the top of a memo. Maintainable opt-out is the difference between gating-discipline-respected vs gating-discipline-circumvented.

## Done-criteria

This row closes when:

1. Pre-commit + commit-msg hooks installed via `tools/setup/install.sh`
2. CI workflow validates PR descriptions on GitHub
3. Strict mode is the default; warn mode available via env var
4. Documented opt-out mechanism for edge cases
5. Self-tested: at least 5 historical PRs (drift catalogue eval-set) would have been caught by the hooks pre-commit

## Composes with

- **B-0169** (decision-archaeology skill) — once mature + packaged (B-0172), the decision-archaeology skill body has hooks via this row
- **B-0170** (substrate-claim-checker TS tool) — the tool the hooks invoke
- **B-0171** (OpenSpec catch-up) — hooks enforce spec-encoded contracts
- **B-0172** (skill-domain plugin packaging) — packaged plugins include their hooks
- `memory/feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md` — Rule 3 of the three skill-design rules; this row is the operational implementation
- `memory/feedback_verify_then_claim_discipline_dominant_failure_mode_substrate_authoring_otto_2026_05_03.md` — discipline that gets enforced by these hooks
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
---
id: B-0172
priority: P2
status: open
title: Skill-domain plugin packaging — package mature skill domains as Claude Code plugins (Aaron 2026-05-03 rule 3a from skill-design memo)
tier: tooling
effort: M
ask: Aaron 2026-05-03 verbatim *"look at packaking skill domains a plugins or other packagin so we can take advantage of hooks in harnesses"*
created: 2026-05-03
last_updated: 2026-05-03
depends_on: [B-0171, B-0173]
composes_with: [B-0169, B-0170]
tags: [skill-domain, plugin, packaging, claude-code, foundation, p2-promotion-trigger-pending]
---

# Skill-domain plugin packaging

Aaron 2026-05-03 named plugin-packaging as Rule 3a of the skill-design memo (`feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md`):

> *"look at packaking skill domains a plugins or other packagin so we can take advantage of hooks in harnesses"*

Claude Code supports plugins under `.claude/plugins/`. When a skill domain matures (per the future-skill-domain memos' promotion-trigger criteria — 3+ worked examples per skill candidate + 1+ judgment-disagreement per expert candidate), packaging the whole domain as a plugin lets it ship as one unit including its hooks.
Comment thread
AceHack marked this conversation as resolved.

## Why P2 (promotion-trigger pending)

This row is P2 not P1 because the promotion-trigger has not yet fired for ANY skill domain. The two named-but-future domains (git-native-backlog-management + multi-harness-alignment-convergence) are still in the "down pat" phase per Aaron's framing — neither has the 3+-worked-examples-per-skill nor the 1+-judgment-disagreement-per-expert evidence base required.

When promotion-trigger DOES fire on a skill domain, this row becomes the implementation work.

## Why depends_on B-0171 + B-0173

- **B-0171** (OpenSpec catch-up): plugins package skill domains' contracts; contracts live in specs; specs need to be current first.
- **B-0173** (hook authoring): the value of plugin packaging is that hooks ship inside the package; without hooks, packaging is bare-skill-grouping.

## Scope (when promotion-trigger fires)

Per Claude Code plugin convention (`.claude/plugins/<name>/`):

1. Each plugin contains:
- One or more `SKILL.md` files (the procedure-skills of the domain)
- One or more `agent.md` files (the named-persona-experts)
- One or more hook configurations (per B-0173)
- Tools under `tools/` (TS files per Aaron skill-design rule 2)
- References to OpenSpec capabilities the plugin contracts against (per B-0171)
2. Plugin manifest (`plugin.json` per Anthropic spec) with description + dependencies + capabilities
3. Cross-harness portability documentation: how Codex / Cursor / Gemini-CLI consume the equivalent substrate (per Aaron 2026-05-02 *"skills are for everyone and even other agent harnesses"*)

## Cross-harness consideration

Per Aaron 2026-05-02 corrective: skills propagate across team + harnesses. Plugin packaging is harness-specific by definition (Claude Code plugins use a particular structure). The packaging design needs:

- A canonical "skill-domain bundle" format that's harness-agnostic at the substrate layer (the SKILL.md files, agent.md files, tools/, and OpenSpec references)
- Per-harness packaging adapters that read the canonical bundle and emit harness-specific package formats (Claude Code plugin / Codex equivalent / Cursor / Gemini-CLI)
Comment thread
AceHack marked this conversation as resolved.

The canonical bundle format itself is part of this row's scope; per-harness adapters are downstream rows.

## Done-criteria

This row closes when:

1. Promotion-trigger has fired on at least 1 skill domain (per future-skill-domain memos' criteria)
2. Canonical "skill-domain bundle" format is documented + at least one domain is packaged
3. Claude Code plugin adapter exists + the packaged domain installs as a plugin
4. Cross-harness portability documentation covers at least 2 harnesses (Claude Code + 1 other)

## Composes with

- **B-0169** (decision-archaeology skill) — likely first skill packaged once mature
- **B-0170** (substrate-claim-checker TS tool) — tool that lives inside the packaged skill domain
- **B-0171** (OpenSpec catch-up) — plugin contracts reference OpenSpec capabilities
- **B-0173** (hook authoring for skill-creation contracts) — hooks ship inside the plugin package
- `memory/feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md` — Rule 3a of the three skill-design rules
- `memory/feedback_git_native_backlog_management_long_arc_future_skill_domain_aaron_2026_05_02.md` — first future-skill-domain memo
- `memory/feedback_multi_harness_alignment_convergence_design_future_skill_domain_aaron_2026_05_03.md` — second future-skill-domain memo
Loading
Loading