backlog: 3 follow-up rows from PR #1253 (B-0171 OpenSpec + B-0172 plugin + B-0173 hooks) #1261
Merged: AceHack merged 2 commits into `main` from `backlog/follow-up-rows-from-pr-1253-openspec-plugin-hooks-aaron-2026-05-03` (May 3, 2026)
90 changes: 90 additions & 0 deletions
...cklog/P1/B-0171-openspec-catch-up-canonical-source-of-truth-aaron-2026-05-03.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,90 @@ | ||
| --- | ||
| id: B-0171 | ||
| priority: P1 | ||
| status: open | ||
| title: OpenSpec catch-up — restore OpenSpec capabilities as canonical source-of-truth (Aaron 2026-05-03 architectural-debt naming; "if we deleted everything other than it [OpenSpec]") | ||
| tier: foundation | ||
| effort: L | ||
| ask: Aaron 2026-05-03 verbatim *"openspec which we are way behind on, that's suppsed to be our source of truth lol, if we were to delete everyting other than it"* | ||
| created: 2026-05-03 | ||
| last_updated: 2026-05-03 | ||
| depends_on: [] | ||
| composes_with: [B-0058, B-0169, B-0170, B-0172, B-0173] | ||
| tags: [openspec, source-of-truth, foundation, architectural-debt, contract-based-development, spec-based-development, p1-foundation] | ||
| --- | ||
|
|
||
| # OpenSpec catch-up — restore OpenSpec as canonical source-of-truth | ||
|
|
||
| Aaron 2026-05-03, in the autonomous-loop maintainer channel via the skill-design memo (`feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md`), named OpenSpec catch-up as load-bearing architectural debt: | ||
|
|
||
| > *"openspec which we are way behind on, that's suppsed to be our source of truth lol, if we were to delete everyting other than it"* | ||
|
|
||
| The intended state per `openspec/README.md`: capabilities under `openspec/specs/**` carry behavioral specs that the code is supposed to satisfy. Specs are canonical; code + skills + memos + docs all derive from / serve / reference the specs. | ||
|
|
||
| **Current state (2026-05-03):** specs are sparse; most discipline lives outside specs (memos, CLAUDE.md, GOVERNANCE.md). The *"delete everything but OpenSpec"* test FAILS today: if everything other than OpenSpec were deleted, the project would be lost. | ||
|
|
||
| This row tracks the catch-up work needed to restore OpenSpec as actual source-of-truth. | ||
|
|
||
| ## Why P1 (foundation) | ||
|
|
||
| - Aaron's same-tick framing names OpenSpec catch-up as **load-bearing prerequisite** for Rule 3 (skill-domain packaging + harness hooks for contracts) to fully operationalize | ||
| - The skill-design rules in `feedback_skills_as_carved_sentences_*` recursively compose at the spec layer: skill body / command / skill domain / cross-skill contracts / **spec** — without the spec layer current, the recursion is incomplete | ||
| - Contract-based development (Meyer, Eiffel) / Design-by-Contract / spec-based development is what hooks-as-pre/post-conditions plug into; without specs, the contracts have no reference | ||
|
|
||
| ## Scope (incremental, not big-bang) | ||
|
|
||
| The catch-up is **NOT** a single big-bang spec authoring pass. It's incremental backfilling of the most load-bearing capability surfaces FIRST, then extending coverage. Per Aaron's *"foundation right and deliberate"* guidance, quality > coverage. | ||
|
|
||
| ### Phase 1 — Inventory + sequencing | ||
|
|
||
| 1. Audit current `openspec/specs/**` — what capabilities exist? what's stale? what's empty? | ||
| 2. Compare against the project's actual hot-path code (Z-set algebra, DBSP operators, retraction-native semantics, tick-history schema) | ||
| 3. Identify the top-10 capabilities by load-bearing-weight — these are the catch-up targets | ||
| 4. Sequence: spec the most-foundational first (algebra > operators > DBSP > retraction-native > tick-history > backlog row schema > skill-router shape > harness contracts) | ||
|
|
||
| ### Phase 2 — Author the top-10 specs | ||
|
|
||
| Specs follow the modified-fork conventions in `openspec/README.md` (no archive, no change-history). Each spec lands in its own PR. Reviewer surface: spec-zealot (Viktor), with an adversarial pass on each spec. | ||
|
|
||
| ### Phase 3 — Cross-reference + tooling | ||
|
|
||
| - Update `CLAUDE.md` + `AGENTS.md` to make OpenSpec the FIRST-READ surface (above current load order) | ||
| - Add CI check: every load-bearing change references a spec in `openspec/specs/**` | ||
| - Add `tools/openspec/` tooling for spec-to-code drift detection (probably builds on `tools/substrate-claim-checker/` v1+) | ||
|
|
||
| ### Phase 4 — Validation | ||
|
|
||
| The *"delete everything but OpenSpec"* test is the acceptance criterion: if everything other than OpenSpec were deleted, the project should still be recoverable from the specs. When all 4 phases are complete, that test should pass. | ||
|
|
||
| ## Why this matters now | ||
|
|
||
| - Multiple just-landed memos (`feedback_skills_as_carved_sentences_*`, `feedback_multi_harness_alignment_convergence_*`, `feedback_git_native_backlog_management_*`, `feedback_verify_then_claim_*`) reference OpenSpec as the long-term canonical surface. Each adds substrate that should eventually have spec backing. | ||
| - The substrate-claim-checker tool (B-0170) v1+ work for hook integration depends on contract-based development, which depends on specs being current. | ||
| - Plugin packaging (B-0172) depends on specs as the contract carriers. | ||
|
|
||
| ## Out of scope | ||
|
|
||
| - Adopting upstream OpenSpec workflow as-is (the project uses a modified fork; modifications stay) | ||
| - Single big-bang spec authoring (incremental per Phase 1-4 above) | ||
| - Replacing CLAUDE.md / AGENTS.md / GOVERNANCE.md (OpenSpec is the *contract* layer; those remain the *behavioral guidance* + *governance* layers — they reference the contracts) | ||
|
|
||
| ## Composes with | ||
|
|
||
| - **B-0058** (AI ethics + safety research track) — alignment specs are one class of OpenSpec capability that needs catch-up | ||
| - **B-0169** (decision-archaeology skill) — once specs are current, `docs/DECISIONS/` ADRs cross-reference specs; decision-archaeology composes naturally | ||
| - **B-0170** (substrate-claim-checker TS tool) — v1+ hook integration depends on specs as contract carriers | ||
| - **B-0172** (skill-domain plugin packaging) — skill domains expose contracts; contracts live in specs | ||
| - **B-0173** (hook authoring for skill-creation contracts) — pre/post-conditions are spec-encoded; hooks read them | ||
| - `memory/feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md` — the memo naming this catch-up as load-bearing | ||
| - `openspec/README.md` — the canonical-intent doc; reading order is OpenSpec first per the future state | ||
|
|
||
| ## Done-criteria | ||
|
|
||
| This row closes when: | ||
|
|
||
| 1. The top-10 load-bearing capability surfaces have current OpenSpec specs (Phase 2 complete) | ||
| 2. CI gate enforces "every load-bearing change references a spec" (Phase 3 complete) | ||
| 3. CLAUDE.md + AGENTS.md updated to make OpenSpec FIRST-READ (Phase 3 complete) | ||
| 4. The *"delete everything but OpenSpec"* test passes (Phase 4 complete) | ||
|
|
||
| Until done, this row stays open. Per Aaron's *"WONT-DO is 99% deferral, not forever — we will likely do everything eventually"*, the catch-up is on the long arc. |
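
The Phase 3 CI check in B-0171 ("every load-bearing change references a spec in `openspec/specs/**`") could be sketched roughly as below. This is a minimal illustration, not the planned implementation: the `LOAD_BEARING_PREFIXES` list, the `ChangeSet` shape, and the function name `checkSpecReference` are all assumptions, since the row does not define which paths count as load-bearing or where the reference must appear.

```typescript
// Sketch of the Phase 3 gate: a load-bearing change must cite a spec.
// ASSUMPTION: these prefixes are hypothetical; the backlog row does not
// define which paths count as "load-bearing".
const LOAD_BEARING_PREFIXES: string[] = ["src/", "tools/"];
const SPEC_DIR = "openspec/specs/";

interface ChangeSet {
  changedPaths: string[]; // files touched by the change
  description: string; // PR description or commit message text
}

interface GateResult {
  ok: boolean;
  reason: string;
  specRefs: string[];
}

function checkSpecReference(
  change: ChangeSet,
  prefixes: string[] = LOAD_BEARING_PREFIXES,
): GateResult {
  const loadBearing = change.changedPaths.filter((p) =>
    prefixes.some((prefix) => p.startsWith(prefix)),
  );
  // Spec paths mentioned in the description text.
  const mentioned = change.description.match(/openspec\/specs\/[\w.\/-]+/g) ?? [];
  // Spec files edited in the same change also count as a reference.
  const edited = change.changedPaths.filter((p) => p.startsWith(SPEC_DIR));
  const specRefs = Array.from(new Set([...mentioned, ...edited]));

  if (loadBearing.length === 0) {
    return { ok: true, reason: "no load-bearing paths changed", specRefs };
  }
  if (specRefs.length > 0) {
    return { ok: true, reason: "load-bearing change cites a spec", specRefs };
  }
  return {
    ok: false,
    reason: `no spec reference for: ${loadBearing.join(", ")}`,
    specRefs,
  };
}
```

A CI job would feed this the changed-file list and PR description and fail the run when `ok` is false. Treating an edit under `openspec/specs/` as a valid reference is a design choice of the sketch, so that a PR updating code and spec together passes without extra prose.
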
92 changes: 92 additions & 0 deletions
...cklog/P1/B-0173-hook-authoring-for-skill-creation-contracts-aaron-2026-05-03.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,92 @@ | ||
| --- | ||
| id: B-0173 | ||
| priority: P1 | ||
| status: open | ||
| title: Hook authoring for skill-creation contracts — pre/post-condition enforcement at skill-creation + commit + PR-creation time (Aaron 2026-05-03 rule 3b from skill-design memo) | ||
| tier: tooling | ||
| effort: M | ||
| ask: Aaron 2026-05-03 verbatim *"this feature is great for reminding yourself to do the right thing the pre conditions and post condtions in contract based development or spec based development like openspec"* | ||
| created: 2026-05-03 | ||
| last_updated: 2026-05-03 | ||
| depends_on: [B-0170, B-0171] | ||
| composes_with: [B-0169, B-0172] | ||
| tags: [hooks, contract-based-development, design-by-contract, openspec, pre-condition, post-condition, ci, p1-foundation] | ||
| --- | ||
|
|
||
| # Hook authoring for skill-creation contracts | ||
|
|
||
| Aaron 2026-05-03 named harness hooks as Rule 3b of the skill-design memo (`feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md`): | ||
|
|
||
| > *"this feature is great for reminding yourself to do the right thing the pre conditions and post condtions in contract based development or spec based development like openspec"* | ||
|
|
||
| Harness hooks fire at well-defined points (pre-tool-use, post-tool-use, session-start, pre-commit, commit-msg, etc.) — the natural place to enforce pre-conditions and post-conditions on procedures. This is contract-based development (Meyer, Eiffel) / Design-by-Contract / spec-based development (OpenSpec). | ||
|
|
||
| ## Why P1 | ||
|
|
||
| The verify-then-claim discipline (`feedback_verify_then_claim_discipline_*`) catalogues 20+ drift instances across 9+ PRs that manual discipline failed to catch. The substrate-claim-checker TS tool (B-0170) ships v0 covering count-drift; v1+ extends to the remaining 6 sub-classes. **The hook integration is what turns the tool from advisory into enforcement.** Without hooks, the tool fires only when manually invoked; with hooks, it gates commits + PRs automatically. | ||
|
|
||
| ## Why depends_on B-0170 + B-0171 | ||
|
|
||
| - **B-0170** (substrate-claim-checker TS tool): the tool the hooks invoke. Hooks ship after the tool's check-types are mature enough to gate. | ||
| - **B-0171** (OpenSpec catch-up): hooks enforce contracts; contracts live in OpenSpec capabilities. Without specs, the hooks have no contract to read pre/post-conditions from. | ||
|
|
||
| ## Scope (per the verify-then-claim memo's mechanization-path section) | ||
|
|
||
| Three hook integrations: | ||
|
|
||
| ### 1. pre-commit hook | ||
|
|
||
| Fires BEFORE the commit message is entered. Validates **staged-file content** (memos, docs, config files). Calls `bun tools/substrate-claim-checker/check-counts.ts <staged-files>` (and v1+ sibling check-types). A non-zero exit blocks the commit. | ||
|
|
||
| ### 2. commit-msg hook | ||
|
|
||
| Fires AFTER the commit message is written. Validates **the commit message itself** for fact-claims (path mentions, count totals, command-output assertions). Pre-commit can't validate this surface because the message doesn't exist yet at pre-commit time per git's hook ordering. | ||
|
|
||
| ### 3. CI check on PR descriptions | ||
|
|
||
| Fires post-PR-creation on the GitHub host. Validates **the PR description** for fact-claims. The description is authored on the host rather than in the local repository, so neither git hook can see it; it needs its own check with its own timing. | ||
|
|
||
| ## Hook authoring deliverables | ||
|
|
||
| 1. `tools/git-hooks/pre-commit` (bash invoking `bun tools/substrate-claim-checker/...` with staged files) | ||
| 2. `tools/git-hooks/commit-msg` (bash invoking the same tool with commit message) | ||
| 3. `.github/workflows/substrate-claim-checker.yml` (CI check for PR descriptions) | ||
| 4. Documentation: how to install hooks (`tools/setup/install.sh` integration) | ||
|
|
||
| 5. Opt-out mechanism for legitimate edge cases (e.g., a hedged-claim memo where strict drift checking would false-positive — `# substrate-claim-checker: skip` comment in the file) | ||
|
|
||
| ## Cross-cutting design decisions | ||
|
|
||
| ### Strict vs warn mode | ||
|
|
||
| - **Strict mode** (default for production): hook exits non-zero, blocks commit | ||
| - **Warn mode** (default during v0.x rollout): hook prints warnings, exits 0 | ||
| - Switch via env var (`SUBSTRATE_CLAIM_CHECKER_MODE=strict` / `=warn`) | ||
|
|
||
| The progression: ship in warn mode → observe false-positive rate → tighten to strict mode once mature | ||
|
|
||
| ### Hook performance | ||
|
|
||
| Hooks fire on EVERY commit; they need to be fast (<2 seconds ideally). The check-counts.ts v0 self-test runs in ~50ms on a single memo file; scaling to staged-files-in-large-PRs needs measurement. v1+ may need worker-pool or incremental-cache. | ||
|
|
||
| ### Opt-out semantics | ||
|
|
||
| Each check-type should have a recognized comment/marker for legitimate skip. Example: `<!-- substrate-claim-checker: skip-count-drift -->` at the top of a memo. A maintainable opt-out is the difference between a gate that is respected and a gate that is circumvented. | ||
|
|
||
| ## Done-criteria | ||
|
|
||
| This row closes when: | ||
|
|
||
| 1. Pre-commit + commit-msg hooks installed via `tools/setup/install.sh` | ||
| 2. CI workflow validates PR descriptions on GitHub | ||
| 3. Strict mode is the default; warn mode available via env var | ||
| 4. Documented opt-out mechanism for edge cases | ||
| 5. Self-tested: replayed against the drift-catalogue eval set, the hooks would have caught the drift in at least 5 historical PRs before commit | ||
|
|
||
| ## Composes with | ||
|
|
||
| - **B-0169** (decision-archaeology skill) — once mature + packaged (B-0172), the decision-archaeology skill body has hooks via this row | ||
| - **B-0170** (substrate-claim-checker TS tool) — the tool the hooks invoke | ||
| - **B-0171** (OpenSpec catch-up) — hooks enforce spec-encoded contracts | ||
| - **B-0172** (skill-domain plugin packaging) — packaged plugins include their hooks | ||
| - `memory/feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md` — Rule 3b of the three skill-design rules; this row is the operational implementation | ||
| - `memory/feedback_verify_then_claim_discipline_dominant_failure_mode_substrate_authoring_otto_2026_05_03.md` — discipline that gets enforced by these hooks | ||
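
The strict/warn switch and the opt-out marker described in B-0173 could be sketched as a few pure functions that a hook script calls before deciding its exit code. This is an illustration under stated assumptions: only the `SUBSTRATE_CLAIM_CHECKER_MODE` variable and the two marker spellings come from the row; the function names, the `HookOutcome` shape, and the choice of `warn` as the fallback are hypothetical.

```typescript
type Mode = "strict" | "warn";

// Reads SUBSTRATE_CLAIM_CHECKER_MODE; missing or unknown values use the fallback.
// ASSUMPTION: "warn" as the fallback mirrors the v0.x rollout default.
function resolveMode(
  env: Record<string, string | undefined>,
  fallback: Mode = "warn",
): Mode {
  const raw = env["SUBSTRATE_CLAIM_CHECKER_MODE"];
  return raw === "strict" || raw === "warn" ? raw : fallback;
}

// True when the content carries a blanket marker ("substrate-claim-checker: skip")
// or a marker for this check-type ("substrate-claim-checker: skip-count-drift").
function isOptedOut(content: string, checkType: string): boolean {
  const marker = /substrate-claim-checker:\s*skip(?:-([a-z][a-z-]*))?/g;
  let m: RegExpExecArray | null;
  while ((m = marker.exec(content)) !== null) {
    if (m[1] === undefined || m[1] === checkType) return true;
  }
  return false;
}

interface HookOutcome {
  exitCode: number; // non-zero blocks the commit
  messages: string[];
}

// Strict mode blocks on any finding; warn mode reports findings and exits 0.
function decide(findings: string[], mode: Mode): HookOutcome {
  if (findings.length === 0) return { exitCode: 0, messages: [] };
  const label = mode === "strict" ? "error" : "warning";
  return {
    exitCode: mode === "strict" ? 1 : 0,
    messages: findings.map((f) => `${label}: ${f}`),
  };
}
```

A hook would skip files where `isOptedOut` returns true, collect findings from the checker for the rest, and exit with `decide(findings, resolveMode(process.env)).exitCode`. Keeping the decision logic free of I/O makes it straightforward to replay against the drift-catalogue eval set.
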
74 changes: 74 additions & 0 deletions
docs/backlog/P2/B-0172-skill-domain-plugin-packaging-aaron-2026-05-03.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,74 @@ | ||
| --- | ||
| id: B-0172 | ||
| priority: P2 | ||
| status: open | ||
| title: Skill-domain plugin packaging — package mature skill domains as Claude Code plugins (Aaron 2026-05-03 rule 3a from skill-design memo) | ||
| tier: tooling | ||
| effort: M | ||
| ask: Aaron 2026-05-03 verbatim *"look at packaking skill domains a plugins or other packagin so we can take advantage of hooks in harnesses"* | ||
| created: 2026-05-03 | ||
| last_updated: 2026-05-03 | ||
| depends_on: [B-0171, B-0173] | ||
| composes_with: [B-0169, B-0170] | ||
| tags: [skill-domain, plugin, packaging, claude-code, foundation, p2-promotion-trigger-pending] | ||
| --- | ||
|
|
||
| # Skill-domain plugin packaging | ||
|
|
||
| Aaron 2026-05-03 named plugin-packaging as Rule 3a of the skill-design memo (`feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md`): | ||
|
|
||
| > *"look at packaking skill domains a plugins or other packagin so we can take advantage of hooks in harnesses"* | ||
|
|
||
| Claude Code supports plugins under `.claude/plugins/`. When a skill domain matures (per the future-skill-domain memos' promotion-trigger criteria — 3+ worked examples per skill candidate + 1+ judgment-disagreement per expert candidate), packaging the whole domain as a plugin lets it ship as one unit including its hooks. | ||
|
|
||
|
|
||
| ## Why P2 (promotion-trigger pending) | ||
|
|
||
| This row is P2 not P1 because the promotion-trigger has not yet fired for ANY skill domain. The two named-but-future domains (git-native-backlog-management + multi-harness-alignment-convergence) are still in the "down pat" phase per Aaron's framing — neither has the 3+-worked-examples-per-skill nor the 1+-judgment-disagreement-per-expert evidence base required. | ||
|
|
||
| When promotion-trigger DOES fire on a skill domain, this row becomes the implementation work. | ||
|
|
||
| ## Why depends_on B-0171 + B-0173 | ||
|
|
||
| - **B-0171** (OpenSpec catch-up): plugins package skill domains' contracts; contracts live in specs; specs need to be current first. | ||
| - **B-0173** (hook authoring): the value of plugin packaging is that hooks ship inside the package; without hooks, packaging is bare-skill-grouping. | ||
|
|
||
| ## Scope (when promotion-trigger fires) | ||
|
|
||
| Per Claude Code plugin convention (`.claude/plugins/<name>/`): | ||
|
|
||
| 1. Each plugin contains: | ||
| - One or more `SKILL.md` files (the procedure-skills of the domain) | ||
| - One or more `agent.md` files (the named-persona-experts) | ||
| - One or more hook configurations (per B-0173) | ||
| - Tools under `tools/` (TS files per Aaron skill-design rule 2) | ||
| - References to OpenSpec capabilities the plugin contracts against (per B-0171) | ||
| 2. Plugin manifest (`plugin.json` per Anthropic spec) with description + dependencies + capabilities | ||
| 3. Cross-harness portability documentation: how Codex / Cursor / Gemini-CLI consume the equivalent substrate (per Aaron 2026-05-02 *"skills are for everyone and even other agent harnesses"*) | ||
|
|
||
| ## Cross-harness consideration | ||
|
|
||
| Per Aaron 2026-05-02 corrective: skills propagate across team + harnesses. Plugin packaging is harness-specific by definition (Claude Code plugins use a particular structure). The packaging design needs: | ||
|
|
||
| - A canonical "skill-domain bundle" format that's harness-agnostic at the substrate layer (the SKILL.md files, agent.md files, tools/, and OpenSpec references) | ||
| - Per-harness packaging adapters that read the canonical bundle and emit harness-specific package formats (Claude Code plugin / Codex equivalent / Cursor / Gemini-CLI) | ||
|
|
||
|
|
||
| The canonical bundle format itself is part of this row's scope; per-harness adapters are downstream rows. | ||
|
|
||
| ## Done-criteria | ||
|
|
||
| This row closes when: | ||
|
|
||
| 1. Promotion-trigger has fired on at least 1 skill domain (per future-skill-domain memos' criteria) | ||
| 2. Canonical "skill-domain bundle" format is documented + at least one domain is packaged | ||
| 3. Claude Code plugin adapter exists + the packaged domain installs as a plugin | ||
| 4. Cross-harness portability documentation covers at least 2 harnesses (Claude Code + 1 other) | ||
|
|
||
| ## Composes with | ||
|
|
||
| - **B-0169** (decision-archaeology skill) — likely first skill packaged once mature | ||
| - **B-0170** (substrate-claim-checker TS tool) — tool that lives inside the packaged skill domain | ||
| - **B-0171** (OpenSpec catch-up) — plugin contracts reference OpenSpec capabilities | ||
| - **B-0173** (hook authoring for skill-creation contracts) — hooks ship inside the plugin package | ||
| - `memory/feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md` — Rule 3a of the three skill-design rules | ||
| - `memory/feedback_git_native_backlog_management_long_arc_future_skill_domain_aaron_2026_05_02.md` — first future-skill-domain memo | ||
| - `memory/feedback_multi_harness_alignment_convergence_design_future_skill_domain_aaron_2026_05_03.md` — second future-skill-domain memo | ||
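
The split in B-0172 between a harness-agnostic "skill-domain bundle" and a per-harness adapter could look roughly like the sketch below. Everything here is hypothetical: the `SkillDomainBundle` fields, the manifest fields, and the function name are invented for illustration, and the `.claude/plugins/<name>/` layout simply follows the row's description rather than a verified plugin specification.

```typescript
// Harness-agnostic description of a skill domain.
// ASSUMPTION: all field names are hypothetical; paths are bundle-relative.
interface SkillDomainBundle {
  name: string;
  description: string;
  skills: string[]; // SKILL.md files
  agents: string[]; // agent.md files
  hooks: string[]; // hook configurations (B-0173)
  tools: string[]; // TS tools
  specRefs: string[]; // OpenSpec capabilities the domain contracts against
}

interface PackagedFile {
  from: string; // path inside the canonical bundle
  to: string; // path inside the emitted plugin
}

interface PluginPackage {
  root: string;
  // ASSUMPTION: manifest fields mirror the row's wording, not a real schema.
  manifest: { name: string; description: string; capabilities: string[] };
  files: PackagedFile[];
}

// Adapter: canonical bundle in, Claude Code plugin layout out (no file I/O).
function toClaudeCodePlugin(bundle: SkillDomainBundle): PluginPackage {
  if (!/^[a-z0-9-]+$/.test(bundle.name)) {
    throw new Error(`invalid bundle name: ${bundle.name}`);
  }
  if (bundle.skills.length === 0) {
    throw new Error("a skill domain needs at least one SKILL.md");
  }
  const root = `.claude/plugins/${bundle.name}`;
  const sources = [
    ...bundle.skills,
    ...bundle.agents,
    ...bundle.hooks,
    ...bundle.tools,
  ];
  return {
    root,
    manifest: {
      name: bundle.name,
      description: bundle.description,
      capabilities: [...bundle.specRefs],
    },
    // Mirror bundle-relative paths under the plugin root.
    files: sources.map((from) => ({ from, to: `${root}/${from}` })),
  };
}
```

An adapter for another harness would accept the same `SkillDomainBundle` and emit that harness's layout, which is what keeps the substrate layer portable while the packaging stays harness-specific.
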