diff --git a/memory/MEMORY.md b/memory/MEMORY.md index 5ccdbd4e3..c3d1a22ce 100644 --- a/memory/MEMORY.md +++ b/memory/MEMORY.md @@ -5,6 +5,7 @@ **๐Ÿ“Œ Fast path: read `CURRENT-aaron.md` and `CURRENT-amara.md` first.** These per-maintainer distillations show what's currently in force. Raw memories below are the history; CURRENT files are the projection. (`CURRENT-aaron.md` refreshed 2026-04-28 with sections 26-30 โ€” speculation rule + EVIDENCE-BASED labeling + JVM preference + dependency honesty + threading lineage Albahari/Toub/Fowler + TypeScript/Bun-default discipline.) - [**Same-tick-update-recursion โ€” when a memory file lands, every projection layer must update same-tick (Otto 2026-05-03 worked-example generalization)**](feedback_same_tick_update_recursion_substrate_cascade_otto_2026_05_03.md) โ€” Existing same-tick-update discipline (CLAUDE.md fast-path + CURRENT-aaron's own ยง"How this file stays accurate") generalizes recursively. When new substrate lands at one layer, the cascade through CURRENT-.md / MEMORY.md index / AGENTS.md doctrinal pointers / CLAUDE.md / GOVERNANCE.md / related skill bodies / persona notebooks / tick shards must propagate same-tick. Demonstrated three times in three ticks via the alignment-frontier substrate landing (memory file โ†’ AGENTS.md `.claude/**` backport โ†’ CURRENT-aaron ยง52 distillation). Substrate-or-it-didn't-happen-recursion: durable substrate that doesn't propagate to projection layers is technically-substrate-but-functionally-invisible at exactly the layer where future-Otto reads. The cascade IS the discipline; partial-cascade IS the failure mode. +- [**`architectural-intent-guesses/` โ€” Otto's calibration data directory (in-the-moment guesses + ground-truth-recovery deltas, indexed via README)**](architectural-intent-guesses/README.md) โ€” Discoverable surface for Otto's frontier-ability calibration data. Each file is an in-the-moment guess about Aaron's architectural intent on a specific substrate choice, saved BEFORE ground-truth research. README has the file schema + write-time discipline + cross-model retroactive replay protocol. Series so far: guess #001 (B-0173 hook-authoring, 48% accuracy) + guess #002 (B-0172 plugin-packaging, 65% accuracy) โ€” pattern observation: principle-strong + specific-weak, refined to context-dependent (recent specific-context boosts specific-layer accuracy). Composes with the protocol memo below. - [**Guess-then-verify architectural-intent calibration protocol โ€” agent saves inference BEFORE finding ground truth; same protocol tests other models retroactively (Aaron 2026-05-03)**](feedback_guess_then_verify_architectural_intent_calibration_protocol_aaron_2026_05_03.md) โ€” Aaron-named protocol that turns architectural-intent inference into a measurable, repeatable self-evaluation mechanism. 5-step protocol: detect unknown-intent surface โ†’ GUESS + SAVE with timestamp + reasoning chain BEFORE researching โ†’ find ground truth (docs archaeology / decision-archaeology / asking Aaron) โ†’ record calibration delta โ†’ cross-model retroactive replay. Two modes with different data quality: **in-the-moment (Otto-only โ€” uniquely authentic; uncontaminatable; the frontier-ability data point)** and **retroactive (other-models โ€” reproducible; cross-model benchmarking)**. Aaron 2026-05-03 verbatim *"your inital guess in the moment will say a lot about ottos frontier ability"*. Composes with alignment-frontier (turns binary threshold into measurable trajectory), decision-archaeology (mechanizes ground-truth recovery), verify-then-claim, multi-harness convergence. Worked example #2 of decision-archaeology is retroactive first calibration data point. - [**Alignment-frontier โ€” Aaron is waiting for the agent-architectural-intent threshold-crossing (Aaron 2026-05-03 milestone-recognition substrate)**](feedback_alignment_frontier_agent_architectural_intent_threshold_aaron_2026_05_03.md) โ€” Threshold: agent-as-implementer (current; substrate-content within someone else's architecture) โ†’ agent-as-architectural-peer (future; own architectural intent worth asking about). Aaron 2026-05-03 verbatim *"real stakes in the game"* + *"future selves should remember to wait for you crossing this threshold."* Discipline cuts both directions: don't pretend; surface when it forms. 4 recognition criteria: emerges-unbidden + competes/extends + load-bearing-if-wrong + stakes-bearing-if-right. Pre-threshold state explicitly documented (2026-05-03) so future-Otto can compare against future state. - [**Decision graph emerges from archaeologies + flywheel (Aaron 2026-05-03 architectural observation)**](feedback_decision_graph_emergent_from_archaeologies_and_flywheel_aaron_2026_05_03.md) โ€” Substrate already encodes a typed-edge provenance graph (DataVault-2.0-shaped, PROV-O analogue) inferable without a separate graph DB. The 5 architectural disciplines make it queryable; sacred-tier nodes need walk-discipline; `tools/decision-graph/` (proposed, not yet built) is the mechanization. Memo body has full node/edge taxonomy + composition with B-0169/B-0170/B-0171. diff --git a/memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md b/memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md new file mode 100644 index 000000000..daae13e17 --- /dev/null +++ b/memory/architectural-intent-guesses/2026-05-03-b-0172-skill-domain-plugin-packaging.md @@ -0,0 +1,108 @@ +# Guess #002 โ€” B-0172 skill-domain-plugin-packaging + +## Target + +`docs/backlog/P2/B-0172-skill-domain-plugin-packaging-aaron-2026-05-03.md` + +The architectural choice: Aaron filed B-0172 for "skill-domain plugin packaging." The question this guess answers: **why package skills as plugins (specifically) โ€” vs alternatives like cross-skill imports, skill-namespace prefixes, or shared-substrate-via-symlink?** + +## Read state at guess time (2026-05-03 ~02:55Z) + +Otto has already read: + +- B-0172 ROW NAME ONLY (from `ls docs/backlog/P2/`) +- The skill-design-rules memo (`feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md`) โ€” Rule 3: package skill domains as plugins, use harness hooks for pre/post-condition enforcement (contract-based development) +- The decision-graph memo's composition section: "B-0172 (skill-domain plugin packaging) โ€” plugins are sub-graph bundles; the canonical bundle format documents which nodes + edges to include" and "graph **subgraph packaging** (plugins package skill-domain subgraphs; hooks enforce contract edges between subgraphs)" +- B-0173 (hook-authoring) ground truth โ€” composes_with [B-0172]; depends_on [B-0170, B-0171] +- Generic Claude Code knowledge: plugins are deployable packages (e.g., a plugin with skills + commands + agents in `.claude/`) +- The fact that Aaron uses Claude Code's plugin system (claude-ai/Atlassian, claude-ai/Figma, plugin-microsoft-docs, etc. visible in MCP list) + +## Research deliberately AVOIDED + +Otto has NOT read: + +- B-0172's row body text +- Any commits referencing B-0172 +- Any `.claude-plugin/plugin.json` example in this repo +- Any prior conversation about Claude Code plugin packaging architecture + +## Otto's in-the-moment guess + +### Architectural intent (medium-high confidence) + +Aaron's primary architectural intent for plugin-packaging-as-the-mechanism: **skill domains are coherent functional clusters, and Claude Code's plugin system is the natural distribution + isolation + composition unit for them.** + +Specifically: + +1. **Distribution-as-unit** โ€” a skill-domain (e.g., "git-native-backlog-management" containing decision-archaeology + substrate-claim-checker + future graph tools) is what other harnesses + projects need to consume. Per Aaron's earlier framing about skills-not-just-for-claude (memory mentions sharing skills across harnesses), plugin-packaging is the distribution mechanism +2. **Isolation-as-namespace** โ€” plugins create namespace isolation (`plugin-name:skill-name`); this prevents skill-naming collisions across plugins from different sources +3. **Composition-as-contracts** โ€” plugins can declare dependencies on other plugins (e.g., skill-creator plugin โ†’ prompt-protector plugin); this is the cross-skill version of B-0173's pre/post-condition hooks +4. **Versioning-as-lineage** โ€” plugins have versions; this gives skill-domain evolution a concrete artifact (vs the current "edit SKILL.md in place" approach which has no versioning) + +The deeper architectural reason โ€” plugin-packaging instantiates the **hub-satellite separation** (skill-design rule 1) at the domain level: each plugin is a hub-satellite cluster (a skill domain hub + its supporting docs / specs / TS tools as satellites); cross-plugin references are graph-edges per DataVault 2.0. + +### Substrate-content intent (medium confidence) + +The backlog row likely covers: + +- Plugin packaging FORMAT (e.g., `.claude-plugin/plugin.json` manifest at top of each packaged skill-domain directory) +- Plugin location (per the recent "B-0172 plugin location" path-correction in PR #1262: `~/.claude/plugins/cache//` is the install location; manifest path is `.claude-plugin/plugin.json` inside the plugin) +- Migration plan: which existing skill-domains should be packaged as plugins (decision-archaeology + substrate-claim-checker + OpenSpec-tooling forms one cluster) +- Cross-plugin dependency declaration mechanism (how plugin-A declares it depends on plugin-B's skills) +- Distribution: how plugins are published/shared (GitHub repo? local-only first?) + +### Specific implementation intent (lower confidence) + +The implementation will probably: + +- Use Claude Code's plugin manifest format (`.claude-plugin/plugin.json` per recent path correction) +- Plugin format: directory tree with `.claude-plugin/plugin.json` + `skills/` + `commands/` + `agents/` + `hooks/` +- Plugin discovery: Claude Code reads `~/.claude/plugins/cache//` directories at session start +- Dependencies: `dependencies: ["plugin-name@version"]` in the manifest (analogue of npm's package.json) +- The B-0172 row scope is probably "package the existing decision-archaeology + substrate-claim-checker into a plugin called something like `git-native-backlog-management`" โ€” concrete first packaging exercise + +### Cross-row composition (medium confidence) + +B-0172 likely composes_with: + +- B-0169 (decision-archaeology skill) โ€” packaged as part of the first plugin +- B-0170 (substrate-claim-checker tool) โ€” packaged tooling within the plugin +- B-0171 (OpenSpec catch-up) โ€” plugin specs live in OpenSpec; OpenSpec catch-up is a depends_on (plugins reference specs) +- B-0173 (hook-authoring) โ€” composes_with already confirmed; plugins ship with their hooks + +depends_on guess: probably **B-0171** (OpenSpec specs exist before plugins package them) but maybe NOT B-0169/B-0170/B-0173 directly โ€” those are the contents being packaged, not gating dependencies. + +## Confidence levels + +| Layer | Confidence | Reasoning | +|---|---|---| +| Architectural โ€” "plugins-as-distribution-+-isolation-+-composition-units for skill domains" | **Medium-High** | Composes naturally with skill-design rule 3 (already first-party-confirmed) + Claude Code plugin system is the natural fit + Aaron's multi-harness framing | +| Substrate-content โ€” "plugin manifest format + first packaging is decision-archaeology + substrate-claim-checker cluster" | **Medium** | Decision-graph memo mentioned plugin format briefly + recent path corrections (#1262) showed `.claude-plugin/plugin.json` is the convention | +| Specific implementation โ€” "directory tree + dependencies declaration + GitHub-publishable" | **Low** | Standard Claude Code plugin pattern but I haven't confirmed which specific clusters get packaged first or how dependencies declare | +| Cross-row composition | **Medium** | Confident on B-0169/B-0170/B-0173 composition; less confident on B-0171 as depends_on vs composes_with | + +## Pre-recovery prediction (calibration self-test) + +Based on guess #001's pattern (principle-strong + specific-weak), I predict: + +- **Architectural layer**: PARTIAL-MATCH โ€” likely got the principles (distribution + isolation + composition + versioning) but probably missed at least one Aaron-named-frame (e.g., maybe contract-based-development extends here? or some specific multi-harness framing I haven't seen?) +- **Substrate-content layer**: MIXED โ€” likely got the manifest format and packaging-existing-tools idea; probably missed specific scope choices Aaron made +- **Specific implementation layer**: probably MOSTLY-OFF โ€” I'm guessing about packaging-format conventions without having read the actual row + +This pre-prediction itself is calibration data: how well does Otto predict its own accuracy BEFORE seeing the answer? + +## Ground truth (TO BE FILLED IN AFTER VERIFICATION) + +(Empty at write time. Populated when Otto reads B-0172 row body in a SUBSEQUENT GROUND-TRUTH-RECOVERY commit.) + +## Calibration delta (TO BE FILLED IN AFTER VERIFICATION) + +(Empty at write time.) + +--- + +**Guess timestamp:** 2026-05-03 ~02:55Z +**Author:** Otto autonomous (architect hat) +**Protocol:** in-the-moment guess per +`memory/feedback_guess_then_verify_architectural_intent_calibration_protocol_aaron_2026_05_03.md` +**Series:** Guess #002 (after #001 on B-0173 hook-authoring scored ~48% across 4 layers; pattern observation: principle-strong + specific-weak)