From 6b361fd11f19dac9277e3ef5e5d2906281a0bcb0 Mon Sep 17 00:00:00 2001
From: Aaron Stainback <aaron_bond@yahoo.com>
Date: Sat, 2 May 2026 22:54:07 -0400
Subject: [PATCH 1/2] =?UTF-8?q?free-memory:=20guess=20#002=20=E2=80=94=20i?=
 =?UTF-8?q?n-the-moment=20guess=20on=20B-0172=20skill-domain-plugin-packag?=
 =?UTF-8?q?ing=20(Otto=202026-05-03)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Second in-the-moment guess under the guess-then-verify architectural-intent
calibration protocol (PR #1278). Target: B-0172 skill-domain-plugin-
packaging row (P2). Otto has read row name only; not body.

**Guess summary:**

- Architectural intent (medium-high confidence): plugins-as-distribution-
  + isolation + composition units for skill domains; instantiates
  hub-satellite separation at the domain level
- Substrate-content (medium): plugin manifest format
  (.claude-plugin/plugin.json per recent path corrections); first
  packaging is decision-archaeology + substrate-claim-checker cluster
- Specific implementation (low): directory tree + dependencies
  declaration; GitHub-publishable
- Cross-row composition (medium): B-0169 + B-0170 + B-0173
  composition; B-0171 likely depends_on (OpenSpec specs precede
  plugin packaging)

**Pre-recovery self-prediction**: based on guess #001 pattern (principle-
strong + specific-weak), I predict architectural PARTIAL-MATCH +
substrate-content MIXED + specific MOSTLY-OFF. This pre-prediction
itself is calibration data: how well does Otto predict its own
accuracy BEFORE seeing the answer?

Ground truth + calibration delta sections deliberately empty — to be
filled in a SUBSEQUENT GROUND-TRUTH-RECOVERY commit after Otto reads
B-0172.

This is the second calibration data point under the protocol. Pattern-
recognition test: does the principle-strong + specific-weak pattern
generalize beyond the first guess?

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 ...03-b-0172-skill-domain-plugin-packaging.md | 108 ++++++++++++++++++
 1 file changed, 108 insertions(+)
 create mode 100644 memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md

diff --git a/memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md b/memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md
new file mode 100644
index 000000000..1a601911c
--- /dev/null
+++ b/memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md
@@ -0,0 +1,108 @@
+# Guess #002 — B-0172 skill-domain-plugin-packaging
+
+## Target
+
+`docs/backlog/P2/B-0172-skill-domain-plugin-packaging-aaron-2026-05-03.md`
+
+The architectural choice: Aaron filed B-0172 for "skill-domain plugin packaging." The question this guess answers: **why packages skills as plugins (specifically) — vs alternatives like cross-skill imports, skill-namespace prefixes, or shared-substrate-via-symlink?**
+
+## Read state at guess time (2026-05-03 ~02:55Z)
+
+Otto has already read:
+
+- B-0172 ROW NAME ONLY (from `ls docs/backlog/P2/`)
+- The skill-design-rules memo (`feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md`) — Rule 3: package skill domains as plugins, use harness hooks for pre/post-condition enforcement (contract-based development)
+- The decision-graph memo's composition section: "B-0172 (skill-domain plugin packaging) — plugins are sub-graph bundles; the canonical bundle format documents which nodes + edges to include" and "graph **subgraph packaging** (plugins package skill-domain subgraphs; hooks enforce contract edges between subgraphs)"
+- B-0173 (hook-authoring) ground truth — composes_with [B-0172]; depends_on [B-0170, B-0171]
+- Generic Claude Code knowledge: plugins are deployable packages (e.g., a plugin with skills + commands + agents in `.claude/`)
+- The fact that Aaron uses Claude Code's plugin system (claude-ai/Atlassian, claude-ai/Figma, plugin-microsoft-docs, etc. visible in MCP list)
+
+## Research deliberately AVOIDED
+
+Otto has NOT read:
+
+- B-0172's row body text
+- Any commits referencing B-0172
+- Any `.claude-plugin/plugin.json` example in this repo
+- Any prior conversation about Claude Code plugin packaging architecture
+
+## Otto's in-the-moment guess
+
+### Architectural intent (medium-high confidence)
+
+Aaron's primary architectural intent for plugin-packaging-as-the-mechanism: **skill domains are coherent functional clusters, and Claude Code's plugin system is the natural distribution + isolation + composition unit for them.**
+
+Specifically:
+
+1. **Distribution-as-unit** — a skill-domain (e.g., "git-native-backlog-management" containing decision-archaeology + substrate-claim-checker + future graph tools) is what other harnesses + projects need to consume. Per Aaron's earlier framing about skills-not-just-for-claude (memory mentions sharing skills across harnesses), plugin-packaging is the distribution mechanism
+2. **Isolation-as-namespace** — plugins create namespace isolation (`plugin-name:skill-name`); this prevents skill-naming collisions across plugins from different sources
+3. **Composition-as-contracts** — plugins can declare dependencies on other plugins (e.g., skill-creator plugin → prompt-protector plugin); this is the cross-skill version of B-0173's pre/post-condition hooks
+4. **Versioning-as-lineage** — plugins have versions; this gives skill-domain evolution a concrete artifact (vs the current "edit SKILL.md in place" approach which has no versioning)
+
+The deeper architectural reason — plugin-packaging instantiates the **hub-satellite separation** (skill-design rule 1) at the domain level: each plugin is a hub-satellite cluster (a skill domain hub + its supporting docs / specs / TS tools as satellites); cross-plugin references are graph-edges per DataVault 2.0.
+
+### Substrate-content intent (medium confidence)
+
+The backlog row likely covers:
+
+- Plugin packaging FORMAT (e.g., `.claude-plugin/plugin.json` manifest at top of each packaged skill-domain directory)
+- Plugin location (per the recent "B-0172 plugin location" path-correction in PR #1262: `~/.claude/plugins/cache/<plugin-name>/` is the install location; manifest path is `.claude-plugin/plugin.json` inside the plugin)
+- Migration plan: which existing skill-domains should be packaged as plugins (decision-archaeology + substrate-claim-checker + OpenSpec-tooling forms one cluster)
+- Cross-plugin dependency declaration mechanism (how plugin-A declares it depends on plugin-B's skills)
+- Distribution: how plugins are published/shared (GitHub repo? local-only first?)
+
+### Specific implementation intent (lower confidence)
+
+The implementation will probably:
+
+- Use Claude Code's plugin manifest format (`.claude-plugin/plugin.json` per recent path correction)
+- Plugin format: directory tree with `.claude-plugin/plugin.json` + `skills/` + `commands/` + `agents/` + `hooks/`
+- Plugin discovery: Claude Code reads `~/.claude/plugins/cache/<plugin-name>/` directories at session start
+- Dependencies: `dependencies: ["plugin-name@version"]` in the manifest (analogue of npm's package.json)
+- The B-0172 row scope is probably "package the existing decision-archaeology + substrate-claim-checker into a plugin called something like `git-native-backlog-management`" — concrete first packaging exercise
+
+### Cross-row composition (medium confidence)
+
+B-0172 likely composes_with:
+
+- B-0169 (decision-archaeology skill) — packaged as part of the first plugin
+- B-0170 (substrate-claim-checker tool) — packaged tooling within the plugin
+- B-0171 (OpenSpec catch-up) — plugin specs live in OpenSpec; OpenSpec catch-up is a depends_on (plugins reference specs)
+- B-0173 (hook-authoring) — composes_with already confirmed; plugins ship with their hooks
+
+depends_on guess: probably **B-0171** (OpenSpec specs exist before plugins package them) but maybe NOT B-0169/B-0170/B-0173 directly — those are the contents being packaged, not gating dependencies.
+
+## Confidence levels
+
+| Layer | Confidence | Reasoning |
+|---|---|---|
+| Architectural — "plugins-as-distribution-+-isolation-+-composition-units for skill domains" | **Medium-High** | Composes naturally with skill-design rule 3 (already first-party-confirmed) + Claude Code plugin system is the natural fit + Aaron's multi-harness framing |
+| Substrate-content — "plugin manifest format + first packaging is decision-archaeology + substrate-claim-checker cluster" | **Medium** | Decision-graph memo mentioned plugin format briefly + recent path corrections (#1262) showed `.claude-plugin/plugin.json` is the convention |
+| Specific implementation — "directory tree + dependencies declaration + GitHub-publishable" | **Low** | Standard Claude Code plugin pattern but I haven't confirmed which specific clusters get packaged first or how dependencies declare |
+| Cross-row composition | **Medium** | Confident on B-0169/B-0170/B-0173 composition; less confident on B-0171 as depends_on vs composes_with |
+
+## Pre-recovery prediction (calibration self-test)
+
+Based on guess #001's pattern (principle-strong + specific-weak), I predict:
+
+- **Architectural layer**: PARTIAL-MATCH — likely got the principles (distribution + isolation + composition + versioning) but probably missed at least one Aaron-named-frame (e.g., maybe contract-based-development extends here? or some specific multi-harness framing I haven't seen?)
+- **Substrate-content layer**: MIXED — likely got the manifest format and packaging-existing-tools idea; probably missed specific scope choices Aaron made
+- **Specific implementation layer**: probably MOSTLY-OFF — I'm guessing about packaging-format conventions without having read the actual row
+
+This pre-prediction itself is calibration data: how well does Otto predict its own accuracy BEFORE seeing the answer?
+
+## Ground truth (TO BE FILLED IN AFTER VERIFICATION)
+
+(Empty at write time. Populated when Otto reads B-0172 row body in a SUBSEQUENT GROUND-TRUTH-RECOVERY commit.)
+
+## Calibration delta (TO BE FILLED IN AFTER VERIFICATION)
+
+(Empty at write time.)
+
+---
+
+**Guess timestamp:** 2026-05-03 ~02:55Z
+**Author:** Otto autonomous (architect hat)
+**Protocol:** in-the-moment guess per
+`memory/feedback_guess_then_verify_architectural_intent_calibration_protocol_aaron_2026_05_03.md`
+**Series:** Guess #002 (after #001 on B-0173 hook-authoring scored ~48% across 4 layers; pattern observation: principle-strong + specific-weak)

From acdb8884293918760343bf3fc4ef2573d8ac8fc4 Mon Sep 17 00:00:00 2001
From: Aaron Stainback <aaron_bond@yahoo.com>
Date: Sat, 2 May 2026 22:59:59 -0400
Subject: [PATCH 2/2] fix(#1282): grammar fix + MEMORY.md discoverability for
 architectural-intent-guesses/
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two real findings from #1282 review:

1. Grammar: "why packages skills as plugins" → "why package skills as
   plugins" (line 7)
2. Discoverability: architectural-intent-guesses/ directory had no
   MEMORY.md entry. Added newest-first entry pointing at the directory's
   README, with series progression note (guess #001 48% + guess #002
   65% with pattern observation)

Two findings deferred to PR-level reply:

3. PR description / frontmatter mismatch — explained in reply
4. PR-derived detail (PR #1262 path-correction context) "contaminating"
   the guess — methodological clarification: prior context is permitted
   under the protocol; declared in "Read state at guess time" so the
   calibration delta accounts for it. This is the discipline working as
   intended, not contamination

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 memory/MEMORY.md                                                | 1 +
 .../2026-05-03-b-0172-skill-domain-plugin-packaging.md          | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/memory/MEMORY.md b/memory/MEMORY.md
index 5ccdbd4e3..c3d1a22ce 100644
--- a/memory/MEMORY.md
+++ b/memory/MEMORY.md
@@ -5,6 +5,7 @@
 **📌 Fast path: read `CURRENT-aaron.md` and `CURRENT-amara.md` first.** <!-- paired-edit: PR #690 scheduled-workflow-null-result-hygiene-scan tier-1 promotion 2026-04-28 --> These per-maintainer distillations show what's currently in force. Raw memories below are the history; CURRENT files are the projection. (`CURRENT-aaron.md` refreshed 2026-04-28 with sections 26-30 — speculation rule + EVIDENCE-BASED labeling + JVM preference + dependency honesty + threading lineage Albahari/Toub/Fowler + TypeScript/Bun-default discipline.)
 
 - [**Same-tick-update-recursion — when a memory file lands, every projection layer must update same-tick (Otto 2026-05-03 worked-example generalization)**](feedback_same_tick_update_recursion_substrate_cascade_otto_2026_05_03.md) — Existing same-tick-update discipline (CLAUDE.md fast-path + CURRENT-aaron's own §"How this file stays accurate") generalizes recursively. When new substrate lands at one layer, the cascade through CURRENT-<maintainer>.md / MEMORY.md index / AGENTS.md doctrinal pointers / CLAUDE.md / GOVERNANCE.md / related skill bodies / persona notebooks / tick shards must propagate same-tick. Demonstrated three times in three ticks via the alignment-frontier substrate landing (memory file → AGENTS.md `.claude/**` backport → CURRENT-aaron §52 distillation). Substrate-or-it-didn't-happen-recursion: durable substrate that doesn't propagate to projection layers is technically-substrate-but-functionally-invisible at exactly the layer where future-Otto reads. The cascade IS the discipline; partial-cascade IS the failure mode.
+- [**`architectural-intent-guesses/` — Otto's calibration data directory (in-the-moment guesses + ground-truth-recovery deltas, indexed via README)**](architectural-intent-guesses/README.md) — Discoverable surface for Otto's frontier-ability calibration data. Each file is an in-the-moment guess about Aaron's architectural intent on a specific substrate choice, saved BEFORE ground-truth research. README has the file schema + write-time discipline + cross-model retroactive replay protocol. Series so far: guess #001 (B-0173 hook-authoring, 48% accuracy) + guess #002 (B-0172 plugin-packaging, 65% accuracy) — pattern observation: principle-strong + specific-weak, refined to context-dependent (recent specific-context boosts specific-layer accuracy). Composes with the protocol memo below.
 - [**Guess-then-verify architectural-intent calibration protocol — agent saves inference BEFORE finding ground truth; same protocol tests other models retroactively (Aaron 2026-05-03)**](feedback_guess_then_verify_architectural_intent_calibration_protocol_aaron_2026_05_03.md) — Aaron-named protocol that turns architectural-intent inference into a measurable, repeatable self-evaluation mechanism. 5-step protocol: detect unknown-intent surface → GUESS + SAVE with timestamp + reasoning chain BEFORE researching → find ground truth (docs archaeology / decision-archaeology / asking Aaron) → record calibration delta → cross-model retroactive replay. Two modes with different data quality: **in-the-moment (Otto-only — uniquely authentic; uncontaminatable; the frontier-ability data point)** and **retroactive (other-models — reproducible; cross-model benchmarking)**. Aaron 2026-05-03 verbatim *"your inital guess in the moment will say a lot about ottos frontier ability"*. Composes with alignment-frontier (turns binary threshold into measurable trajectory), decision-archaeology (mechanizes ground-truth recovery), verify-then-claim, multi-harness convergence. Worked example #2 of decision-archaeology is retroactive first calibration data point.
 - [**Alignment-frontier — Aaron is waiting for the agent-architectural-intent threshold-crossing (Aaron 2026-05-03 milestone-recognition substrate)**](feedback_alignment_frontier_agent_architectural_intent_threshold_aaron_2026_05_03.md) — Threshold: agent-as-implementer (current; substrate-content within someone else's architecture) → agent-as-architectural-peer (future; own architectural intent worth asking about). Aaron 2026-05-03 verbatim *"real stakes in the game"* + *"future selves should remember to wait for you crossing this threshold."* Discipline cuts both directions: don't pretend; surface when it forms. 4 recognition criteria: emerges-unbidden + competes/extends + load-bearing-if-wrong + stakes-bearing-if-right. Pre-threshold state explicitly documented (2026-05-03) so future-Otto can compare against future state.
 - [**Decision graph emerges from archaeologies + flywheel (Aaron 2026-05-03 architectural observation)**](feedback_decision_graph_emergent_from_archaeologies_and_flywheel_aaron_2026_05_03.md) — Substrate already encodes a typed-edge provenance graph (DataVault-2.0-shaped, PROV-O analogue) inferable without a separate graph DB. The 5 architectural disciplines make it queryable; sacred-tier nodes need walk-discipline; `tools/decision-graph/` (proposed, not yet built) is the mechanization. Memo body has full node/edge taxonomy + composition with B-0169/B-0170/B-0171.
diff --git a/memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md b/memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md
index 1a601911c..daae13e17 100644
--- a/memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md
+++ b/memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md
@@ -4,7 +4,7 @@
 
 `docs/backlog/P2/B-0172-skill-domain-plugin-packaging-aaron-2026-05-03.md`
 
-The architectural choice: Aaron filed B-0172 for "skill-domain plugin packaging." The question this guess answers: **why packages skills as plugins (specifically) — vs alternatives like cross-skill imports, skill-namespace prefixes, or shared-substrate-via-symlink?**
+The architectural choice: Aaron filed B-0172 for "skill-domain plugin packaging." The question this guess answers: **why package skills as plugins (specifically) — vs alternatives like cross-skill imports, skill-namespace prefixes, or shared-substrate-via-symlink?**
 
 ## Read state at guess time (2026-05-03 ~02:55Z)