diff --git a/memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md b/memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md new file mode 100644 index 000000000..04bd7fc6e --- /dev/null +++ b/memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md @@ -0,0 +1,215 @@ +# Guess #002 — B-0172 skill-domain-plugin-packaging + +## Target + +`docs/backlog/P2/B-0172-skill-domain-plugin-packaging-aaron-2026-05-03.md` + +The architectural choice: Aaron filed B-0172 for "skill-domain plugin packaging." The question this guess answers: **why packages skills as plugins (specifically) — vs alternatives like cross-skill imports, skill-namespace prefixes, or shared-substrate-via-symlink?** + +## Read state at guess time (2026-05-03 ~02:55Z) + +Otto has already read: + +- B-0172 ROW NAME ONLY (from `ls docs/backlog/P2/`) +- The skill-design-rules memo (`feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md`) — Rule 3: package skill domains as plugins, use harness hooks for pre/post-condition enforcement (contract-based development) +- The decision-graph memo's composition section: "B-0172 (skill-domain plugin packaging) — plugins are sub-graph bundles; the canonical bundle format documents which nodes + edges to include" and "graph **subgraph packaging** (plugins package skill-domain subgraphs; hooks enforce contract edges between subgraphs)" +- B-0173 (hook-authoring) ground truth — composes_with [B-0172]; depends_on [B-0170, B-0171] +- Generic Claude Code knowledge: plugins are deployable packages (e.g., a plugin with skills + commands + agents in `.claude/`) +- The fact that Aaron uses Claude Code's plugin system (claude-ai/Atlassian, claude-ai/Figma, plugin-microsoft-docs, etc. visible in MCP list) + +## Research deliberately AVOIDED + +Otto has NOT read: + +- B-0172's row body text +- Any commits referencing B-0172 +- Any `.claude-plugin/plugin.json` example in this repo +- Any prior conversation about Claude Code plugin packaging architecture + +## Otto's in-the-moment guess + +### Architectural intent (medium-high confidence) + +Aaron's primary architectural intent for plugin-packaging-as-the-mechanism: **skill domains are coherent functional clusters, and Claude Code's plugin system is the natural distribution + isolation + composition unit for them.** + +Specifically: + +1. **Distribution-as-unit** — a skill-domain (e.g., "git-native-backlog-management" containing decision-archaeology + substrate-claim-checker + future graph tools) is what other harnesses + projects need to consume. Per Aaron's earlier framing about skills-not-just-for-claude (memory mentions sharing skills across harnesses), plugin-packaging is the distribution mechanism +2. **Isolation-as-namespace** — plugins create namespace isolation (`plugin-name:skill-name`); this prevents skill-naming collisions across plugins from different sources +3. **Composition-as-contracts** — plugins can declare dependencies on other plugins (e.g., skill-creator plugin → prompt-protector plugin); this is the cross-skill version of B-0173's pre/post-condition hooks +4. **Versioning-as-lineage** — plugins have versions; this gives skill-domain evolution a concrete artifact (vs the current "edit SKILL.md in place" approach which has no versioning) + +The deeper architectural reason — plugin-packaging instantiates the **hub-satellite separation** (skill-design rule 1) at the domain level: each plugin is a hub-satellite cluster (a skill domain hub + its supporting docs / specs / TS tools as satellites); cross-plugin references are graph-edges per DataVault 2.0. + +### Substrate-content intent (medium confidence) + +The backlog row likely covers: + +- Plugin packaging FORMAT (e.g., `.claude-plugin/plugin.json` manifest at top of each packaged skill-domain directory) +- Plugin location (per the recent "B-0172 plugin location" path-correction in PR #1262: `~/.claude/plugins/cache//` is the install location; manifest path is `.claude-plugin/plugin.json` inside the plugin) +- Migration plan: which existing skill-domains should be packaged as plugins (decision-archaeology + substrate-claim-checker + OpenSpec-tooling forms one cluster) +- Cross-plugin dependency declaration mechanism (how plugin-A declares it depends on plugin-B's skills) +- Distribution: how plugins are published/shared (GitHub repo? local-only first?) + +### Specific implementation intent (lower confidence) + +The implementation will probably: + +- Use Claude Code's plugin manifest format (`.claude-plugin/plugin.json` per recent path correction) +- Plugin format: directory tree with `.claude-plugin/plugin.json` + `skills/` + `commands/` + `agents/` + `hooks/` +- Plugin discovery: Claude Code reads `~/.claude/plugins/cache//` directories at session start +- Dependencies: `dependencies: ["plugin-name@version"]` in the manifest (analogue of npm's package.json) +- The B-0172 row scope is probably "package the existing decision-archaeology + substrate-claim-checker into a plugin called something like `git-native-backlog-management`" — concrete first packaging exercise + +### Cross-row composition (medium confidence) + +B-0172 likely composes_with: + +- B-0169 (decision-archaeology skill) — packaged as part of the first plugin +- B-0170 (substrate-claim-checker tool) — packaged tooling within the plugin +- B-0171 (OpenSpec catch-up) — plugin specs live in OpenSpec; OpenSpec catch-up is a depends_on (plugins reference specs) +- B-0173 (hook-authoring) — composes_with already confirmed; plugins ship with their hooks + +depends_on guess: probably **B-0171** (OpenSpec specs exist before plugins package them) but maybe NOT B-0169/B-0170/B-0173 directly — those are the contents being packaged, not gating dependencies. + +## Confidence levels + +| Layer | Confidence | Reasoning | +|---|---|---| +| Architectural — "plugins-as-distribution-+-isolation-+-composition-units for skill domains" | **Medium-High** | Composes naturally with skill-design rule 3 (already first-party-confirmed) + Claude Code plugin system is the natural fit + Aaron's multi-harness framing | +| Substrate-content — "plugin manifest format + first packaging is decision-archaeology + substrate-claim-checker cluster" | **Medium** | Decision-graph memo mentioned plugin format briefly + recent path corrections (#1262) showed `.claude-plugin/plugin.json` is the convention | +| Specific implementation — "directory tree + dependencies declaration + GitHub-publishable" | **Low** | Standard Claude Code plugin pattern but I haven't confirmed which specific clusters get packaged first or how dependencies declare | +| Cross-row composition | **Medium** | Confident on B-0169/B-0170/B-0173 composition; less confident on B-0171 as depends_on vs composes_with | + +## Pre-recovery prediction (calibration self-test) + +Based on guess #001's pattern (principle-strong + specific-weak), I predict: + +- **Architectural layer**: PARTIAL-MATCH — likely got the principles (distribution + isolation + composition + versioning) but probably missed at least one Aaron-named-frame (e.g., maybe contract-based-development extends here? or some specific multi-harness framing I haven't seen?) +- **Substrate-content layer**: MIXED — likely got the manifest format and packaging-existing-tools idea; probably missed specific scope choices Aaron made +- **Specific implementation layer**: probably MOSTLY-OFF — I'm guessing about packaging-format conventions without having read the actual row + +This pre-prediction itself is calibration data: how well does Otto predict its own accuracy BEFORE seeing the answer? + +## Ground truth (recovered 2026-05-03 ~03:00Z via direct read of B-0172) + +Read source: `docs/backlog/P2/B-0172-skill-domain-plugin-packaging-aaron-2026-05-03.md` (full body) — protocol-permitted only after the guess commit landed (committed under cf1dc7b on the guess branch; merged to main via PR #1282 still wait-ci at recovery-time, but commit timestamp is the binding marker). + +### Architectural intent (Aaron's verbatim) + +> *"look at packaking skill domains a plugins or other packagin so we can take advantage of hooks in harnesses"* + +**Primary frame**: **plugin packaging exists to ship hooks** ("so we can take advantage of hooks in harnesses"). Without hooks, plugin packaging is "bare-skill-grouping" (the row's own characterization). The composition with B-0173 is depends_on (load-bearing), not just composes_with. + +**Promotion-trigger gate**: row is P2 not P1 because no skill domain has yet met the promotion-trigger criteria (3+ worked examples per skill candidate + 1+ judgment-disagreement per expert candidate). When a domain matures, this row becomes the implementation work. + +### Substrate-content intent + +- `.claude-plugin/plugin.json` manifest with **minimal fields** (name + description + author per the upstream Claude Code spec — NOT semver/dependencies/etc.) +- Bundle contents: `skills//SKILL.md` + `agents/.md` + hooks (per B-0173) + `tools/` (TS files per rule 2) + OpenSpec capability references (per B-0171) +- **Codex equivalent uses `.codex-plugin/plugin.json` with richer fields** (semver + interface block + URLs + category) — different shape from Claude Code's +- **Cross-harness portability**: canonical "skill-domain bundle" format (harness-agnostic) + per-harness packaging adapters that emit harness-specific package formats + +### Specific implementation intent + +- Install location: `~/.claude/plugins/cache//` (Claude Code convention) +- Manifest path inside the bundle: `.claude-plugin/plugin.json` (per recent #1262 path corrections) +- B-0169 (decision-archaeology) named as "likely first skill packaged once mature" but row scope is generic — packaging happens when promotion-trigger fires, not "package this specific cluster now" + +### Cross-row composition + +- depends_on: **[B-0171, B-0173]** (NOT just B-0171) +- composes_with: [B-0169, B-0170] + +## Calibration delta + +### Architectural layer — PARTIAL-MATCH (6/10) + +| What I got | What I missed | +|---|---| +| Distribution-as-unit (correct) | **"Hooks-shipping-as-primary-purpose"** — Aaron's verbatim *"so we can take advantage of hooks in harnesses"* names hooks as THE motivating frame, not a benefit | +| Composition-as-contracts (correct, principle-shaped) | **Promotion-trigger maturity-gate** — row is P2 specifically because the trigger hasn't fired; I didn't anticipate the maturity-gate-before-packaging design at all | +| Hub-satellite at domain level (correct, principle-shaped) | (no miss here — though wasn't named in row) | +| Versioning-as-lineage (incorrect — not a stated frame) | The row says "minimal fields" — no semver in the manifest | + +**Analysis**: I generalized to "distribution + isolation + composition" — got the principles right but missed Aaron's specific "hooks-shipping" frame. Same pattern as guess #001 (principle-strong, specific-frame-weak). + +The **promotion-trigger maturity-gate is a significant miss**: I didn't anticipate that packaging is gated on skill-domain maturity (3+ worked examples per skill + 1+ judgment-disagreement per expert). This is a DESIGN-by-Aaron choice, not derivable from principles. + +### Substrate-content layer — MIXED (6/10) + +| What I got | What I missed | +|---|---| +| `.claude-plugin/plugin.json` path (correct; cited from #1262 path-correction context) | **Codex equivalent format** with richer fields (semver + interface block + URLs + category) — completely missed | +| Plugin install location `~/.claude/plugins/cache//` (correct) | **Cross-harness portability via canonical-bundle-format + per-harness adapters** — significant architectural element; row's design is to ship a harness-agnostic substrate format | +| Bundle contents (skills/, agents/, hooks/) (correct, kind of — also tools/ and OpenSpec references) | (closeish on contents) | +| First packaging = decision-archaeology cluster (correct — B-0169 named as "likely first") | (no miss here) | + +**Analysis**: I correctly inferred the Claude-Code-side path/format/install-location (recent specific-context from PR #1262 boosted accuracy). But missed the Codex equivalent + cross-harness adapter design entirely. The cross-harness piece is load-bearing for Aaron's "skills are for everyone and even other agent harnesses" framing. + +### Specific implementation layer — MOSTLY-MATCH (7/10) + +| What I got | What I missed | +|---|---| +| Plugin format: directory tree with manifest + skills/ + agents/ + hooks/ (correct) | **Tools/ subdirectory** — I didn't list this explicitly | +| Plugin install location (correct) | (no miss) | +| Manifest path (correct via #1262 context) | (no miss) | +| Dependencies declaration mechanism (incorrect — I guessed `dependencies: ["plugin-name@version"]`; row says "minimal fields") | The row says minimal fields — no dependencies in the Claude Code manifest. (Codex has more.) | + +**Analysis**: This layer was substantially stronger than guess #001's specific-implementation layer. The reason: **recent specific-context from PR #1262 path correction** taught me the actual manifest path + install location. When prior specific-context exists, specific-implementation accuracy is much higher. + +**This is a useful finding** about when the principle-strong + specific-weak pattern breaks: it breaks when prior specific-context is present. + +### Cross-row composition layer — MOSTLY-MATCH (7/10) + +| What I got | What I missed | +|---|---| +| depends_on B-0171 (correct) | **depends_on B-0173 (NOT composes_with)** — I categorized B-0173 as composes_with; actual is depends_on. Architectural reason: hooks shipping inside the plugin is the primary value | +| composes_with B-0169 (correct) | (no miss) | +| composes_with B-0170 (correct) | (no miss) | +| | (no major miss) | + +**Analysis**: Right rows; one categorization error (B-0173 depends_on vs composes_with). The depends_on B-0173 reflects that "without hooks, packaging is bare-skill-grouping" — packaging-without-hooks is meaningless, so hooks must precede plugin packaging. + +## Pre-recovery prediction validation + +I predicted before research: + +| Layer | Predicted | Actual | Match? | +|---|---|---|---| +| Architectural | PARTIAL-MATCH | PARTIAL-MATCH (6/10) | ✓ | +| Substrate-content | MIXED | MIXED (6/10) | ✓ | +| Specific implementation | MOSTLY-OFF | MOSTLY-MATCH (7/10) | ✗ — over-predicted weakness | +| Cross-row composition | (no prediction) | MOSTLY-MATCH (7/10) | n/a | + +**Pre-prediction accuracy: 2/3 = 67%**. The pattern (principle-strong + specific-weak) held for architectural + substrate-content layers, but specific-implementation was actually stronger than expected because recent specific-context (PR #1262) was present. + +**Updated pattern observation**: +- The principle-strong + specific-weak pattern holds **when no prior specific-context is present** +- Recent specific-context (e.g., recent PR fixes, recent doc reads, recent commit context) substantially boosts specific-implementation accuracy +- Pre-recovery self-prediction is reliable for architectural + substrate-content layers but unreliable for specific-implementation (Otto under-predicts its own ability when specific-context exists) + +## Summary + +**Score (informal):** + +| Layer | Score | Pattern | +|---|---|---| +| Architectural intent | **6/10** | Principle-strong; missed Aaron's "hooks-shipping" frame + promotion-trigger maturity-gate | +| Substrate-content | **6/10** | Got Claude-Code-side; missed Codex equivalent + cross-harness adapter design | +| Specific implementation | **7/10** | **Stronger than expected** — recent specific-context (PR #1262) boosted accuracy | +| Cross-row composition | **7/10** | Right rows; one mis-categorization (B-0173 depends_on vs composes_with) | + +**Overall**: 26/40 = **65%** — significantly higher than guess #001's 48%. The improvement came from specific-implementation + cross-row layers, both boosted by prior context. + +**Key new finding**: the principle-strong + specific-weak pattern is **context-dependent** — when specific-context exists, the gap narrows. This refines the original pattern observation. + +--- + +**Guess timestamp:** 2026-05-03 ~02:55Z (committed under 4a3d583 on the guess branch; landed to main via PR #1282) +**Ground-truth recovery timestamp:** 2026-05-03 ~03:00Z +**Author:** Otto autonomous (architect hat) +**Protocol:** in-the-moment guess + ground-truth recovery per +`memory/feedback_guess_then_verify_architectural_intent_calibration_protocol_aaron_2026_05_03.md` +**Recovery method:** direct read of `docs/backlog/P2/B-0172-skill-domain-plugin-packaging-aaron-2026-05-03.md` body +**Series:** Guess #002 in calibration series; #001 scored 48%, #002 scored 65% — improvement attributed to recent specific-context (PR #1262) boosting specific-implementation layer