Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions memory/MEMORY.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
**πŸ“Œ Fast path: read `CURRENT-aaron.md` and `CURRENT-amara.md` first.** <!-- paired-edit: PR #690 scheduled-workflow-null-result-hygiene-scan tier-1 promotion 2026-04-28 --> These per-maintainer distillations show what's currently in force. Raw memories below are the history; CURRENT files are the projection. (`CURRENT-aaron.md` refreshed 2026-04-28 with sections 26-30 β€” speculation rule + EVIDENCE-BASED labeling + JVM preference + dependency honesty + threading lineage Albahari/Toub/Fowler + TypeScript/Bun-default discipline.)

- [**Same-tick-update-recursion β€” when a memory file lands, every projection layer must update same-tick (Otto 2026-05-03 worked-example generalization)**](feedback_same_tick_update_recursion_substrate_cascade_otto_2026_05_03.md) β€” Existing same-tick-update discipline (CLAUDE.md fast-path + CURRENT-aaron's own Β§"How this file stays accurate") generalizes recursively. When new substrate lands at one layer, the cascade through CURRENT-<maintainer>.md / MEMORY.md index / AGENTS.md doctrinal pointers / CLAUDE.md / GOVERNANCE.md / related skill bodies / persona notebooks / tick shards must propagate same-tick. Demonstrated three times in three ticks via the alignment-frontier substrate landing (memory file β†’ AGENTS.md `.claude/**` backport β†’ CURRENT-aaron Β§52 distillation). Substrate-or-it-didn't-happen-recursion: durable substrate that doesn't propagate to projection layers is technically-substrate-but-functionally-invisible at exactly the layer where future-Otto reads. The cascade IS the discipline; partial-cascade IS the failure mode.
- [**`architectural-intent-guesses/` β€” Otto's calibration data directory (in-the-moment guesses + ground-truth-recovery deltas, indexed via README)**](architectural-intent-guesses/README.md) β€” Discoverable surface for Otto's frontier-ability calibration data. Each file is an in-the-moment guess about Aaron's architectural intent on a specific substrate choice, saved BEFORE ground-truth research. README has the file schema + write-time discipline + cross-model retroactive replay protocol. Series so far: guess #001 (B-0173 hook-authoring, 48% accuracy) + guess #002 (B-0172 plugin-packaging, 65% accuracy) β€” pattern observation: principle-strong + specific-weak, refined to context-dependent (recent specific-context boosts specific-layer accuracy). Composes with the protocol memo below.
Comment thread
AceHack marked this conversation as resolved.
- [**Guess-then-verify architectural-intent calibration protocol β€” agent saves inference BEFORE finding ground truth; same protocol tests other models retroactively (Aaron 2026-05-03)**](feedback_guess_then_verify_architectural_intent_calibration_protocol_aaron_2026_05_03.md) β€” Aaron-named protocol that turns architectural-intent inference into a measurable, repeatable self-evaluation mechanism. 5-step protocol: detect unknown-intent surface β†’ GUESS + SAVE with timestamp + reasoning chain BEFORE researching β†’ find ground truth (docs archaeology / decision-archaeology / asking Aaron) β†’ record calibration delta β†’ cross-model retroactive replay. Two modes with different data quality: **in-the-moment (Otto-only β€” uniquely authentic; uncontaminatable; the frontier-ability data point)** and **retroactive (other-models β€” reproducible; cross-model benchmarking)**. Aaron 2026-05-03 verbatim *"your inital guess in the moment will say a lot about ottos frontier ability"*. Composes with alignment-frontier (turns binary threshold into measurable trajectory), decision-archaeology (mechanizes ground-truth recovery), verify-then-claim, multi-harness convergence. Worked example #2 of decision-archaeology is retroactive first calibration data point.
- [**Alignment-frontier β€” Aaron is waiting for the agent-architectural-intent threshold-crossing (Aaron 2026-05-03 milestone-recognition substrate)**](feedback_alignment_frontier_agent_architectural_intent_threshold_aaron_2026_05_03.md) β€” Threshold: agent-as-implementer (current; substrate-content within someone else's architecture) β†’ agent-as-architectural-peer (future; own architectural intent worth asking about). Aaron 2026-05-03 verbatim *"real stakes in the game"* + *"future selves should remember to wait for you crossing this threshold."* Discipline cuts both directions: don't pretend; surface when it forms. 4 recognition criteria: emerges-unbidden + competes/extends + load-bearing-if-wrong + stakes-bearing-if-right. Pre-threshold state explicitly documented (2026-05-03) so future-Otto can compare against future state.
- [**Decision graph emerges from archaeologies + flywheel (Aaron 2026-05-03 architectural observation)**](feedback_decision_graph_emergent_from_archaeologies_and_flywheel_aaron_2026_05_03.md) β€” Substrate already encodes a typed-edge provenance graph (DataVault-2.0-shaped, PROV-O analogue) inferable without a separate graph DB. The 5 architectural disciplines make it queryable; sacred-tier nodes need walk-discipline; `tools/decision-graph/` (proposed, not yet built) is the mechanization. Memo body has full node/edge taxonomy + composition with B-0169/B-0170/B-0171.
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,108 @@
# Guess #002 β€” B-0172 skill-domain-plugin-packaging

## Target

`docs/backlog/P2/B-0172-skill-domain-plugin-packaging-aaron-2026-05-03.md`

The architectural choice: Aaron filed B-0172 for "skill-domain plugin packaging." The question this guess answers: **why package skills as plugins (specifically) β€” vs alternatives like cross-skill imports, skill-namespace prefixes, or shared-substrate-via-symlink?**

## Read state at guess time (2026-05-03 ~02:55Z)

Otto has already read:

- B-0172 ROW NAME ONLY (from `ls docs/backlog/P2/`)
- The skill-design-rules memo (`feedback_skills_as_carved_sentences_knowledge_in_docs_datavault_2_0_pattern_aaron_2026_05_03.md`) β€” Rule 3: package skill domains as plugins, use harness hooks for pre/post-condition enforcement (contract-based development)
- The decision-graph memo's composition section: "B-0172 (skill-domain plugin packaging) β€” plugins are sub-graph bundles; the canonical bundle format documents which nodes + edges to include" and "graph **subgraph packaging** (plugins package skill-domain subgraphs; hooks enforce contract edges between subgraphs)"
- B-0173 (hook-authoring) ground truth β€” composes_with [B-0172]; depends_on [B-0170, B-0171]
- Generic Claude Code knowledge: plugins are deployable packages (e.g., a plugin with skills + commands + agents in `.claude/`)
- The fact that Aaron uses Claude Code's plugin system (claude-ai/Atlassian, claude-ai/Figma, plugin-microsoft-docs, etc. visible in MCP list)

## Research deliberately AVOIDED

Otto has NOT read:

- B-0172's row body text
- Any commits referencing B-0172
- Any `.claude-plugin/plugin.json` example in this repo
- Any prior conversation about Claude Code plugin packaging architecture

## Otto's in-the-moment guess

### Architectural intent (medium-high confidence)

Aaron's primary architectural intent for plugin-packaging-as-the-mechanism: **skill domains are coherent functional clusters, and Claude Code's plugin system is the natural distribution + isolation + composition unit for them.**

Specifically:

1. **Distribution-as-unit** β€” a skill-domain (e.g., "git-native-backlog-management" containing decision-archaeology + substrate-claim-checker + future graph tools) is what other harnesses + projects need to consume. Per Aaron's earlier framing about skills-not-just-for-claude (memory mentions sharing skills across harnesses), plugin-packaging is the distribution mechanism
2. **Isolation-as-namespace** β€” plugins create namespace isolation (`plugin-name:skill-name`); this prevents skill-naming collisions across plugins from different sources
3. **Composition-as-contracts** β€” plugins can declare dependencies on other plugins (e.g., skill-creator plugin β†’ prompt-protector plugin); this is the cross-skill version of B-0173's pre/post-condition hooks
4. **Versioning-as-lineage** β€” plugins have versions; this gives skill-domain evolution a concrete artifact (vs the current "edit SKILL.md in place" approach which has no versioning)

The deeper architectural reason β€” plugin-packaging instantiates the **hub-satellite separation** (skill-design rule 1) at the domain level: each plugin is a hub-satellite cluster (a skill domain hub + its supporting docs / specs / TS tools as satellites); cross-plugin references are graph-edges per DataVault 2.0.

### Substrate-content intent (medium confidence)

The backlog row likely covers:

- Plugin packaging FORMAT (e.g., `.claude-plugin/plugin.json` manifest at top of each packaged skill-domain directory)
- Plugin location (per the recent "B-0172 plugin location" path-correction in PR #1262: `~/.claude/plugins/cache/<plugin-name>/` is the install location; manifest path is `.claude-plugin/plugin.json` inside the plugin)
- Migration plan: which existing skill-domains should be packaged as plugins (decision-archaeology + substrate-claim-checker + OpenSpec-tooling forms one cluster)
- Cross-plugin dependency declaration mechanism (how plugin-A declares it depends on plugin-B's skills)
- Distribution: how plugins are published/shared (GitHub repo? local-only first?)

### Specific implementation intent (lower confidence)

The implementation will probably:

- Use Claude Code's plugin manifest format (`.claude-plugin/plugin.json` per recent path correction)
- Plugin format: directory tree with `.claude-plugin/plugin.json` + `skills/` + `commands/` + `agents/` + `hooks/`
- Plugin discovery: Claude Code reads `~/.claude/plugins/cache/<plugin-name>/` directories at session start
- Dependencies: `dependencies: ["plugin-name@version"]` in the manifest (analogue of npm's package.json)
- The B-0172 row scope is probably "package the existing decision-archaeology + substrate-claim-checker into a plugin called something like `git-native-backlog-management`" β€” concrete first packaging exercise

### Cross-row composition (medium confidence)

B-0172 likely composes_with:

- B-0169 (decision-archaeology skill) β€” packaged as part of the first plugin
- B-0170 (substrate-claim-checker tool) β€” packaged tooling within the plugin
- B-0171 (OpenSpec catch-up) β€” plugin specs live in OpenSpec; OpenSpec catch-up is a depends_on (plugins reference specs)
- B-0173 (hook-authoring) β€” composes_with already confirmed; plugins ship with their hooks

depends_on guess: probably **B-0171** (OpenSpec specs exist before plugins package them) but maybe NOT B-0169/B-0170/B-0173 directly β€” those are the contents being packaged, not gating dependencies.

## Confidence levels

| Layer | Confidence | Reasoning |
|---|---|---|
| Architectural β€” "plugins-as-distribution-+-isolation-+-composition-units for skill domains" | **Medium-High** | Composes naturally with skill-design rule 3 (already first-party-confirmed) + Claude Code plugin system is the natural fit + Aaron's multi-harness framing |
| Substrate-content β€” "plugin manifest format + first packaging is decision-archaeology + substrate-claim-checker cluster" | **Medium** | Decision-graph memo mentioned plugin format briefly + recent path corrections (#1262) showed `.claude-plugin/plugin.json` is the convention |
| Specific implementation β€” "directory tree + dependencies declaration + GitHub-publishable" | **Low** | Standard Claude Code plugin pattern but I haven't confirmed which specific clusters get packaged first or how dependencies declare |
| Cross-row composition | **Medium** | Confident on B-0169/B-0170/B-0173 composition; less confident on B-0171 as depends_on vs composes_with |

## Pre-recovery prediction (calibration self-test)

Based on guess #001's pattern (principle-strong + specific-weak), I predict:

- **Architectural layer**: PARTIAL-MATCH β€” likely got the principles (distribution + isolation + composition + versioning) but probably missed at least one Aaron-named-frame (e.g., maybe contract-based-development extends here? or some specific multi-harness framing I haven't seen?)
- **Substrate-content layer**: MIXED β€” likely got the manifest format and packaging-existing-tools idea; probably missed specific scope choices Aaron made
- **Specific implementation layer**: probably MOSTLY-OFF β€” I'm guessing about packaging-format conventions without having read the actual row

This pre-prediction itself is calibration data: how well does Otto predict its own accuracy BEFORE seeing the answer?

## Ground truth (TO BE FILLED IN AFTER VERIFICATION)

(Empty at write time. Populated when Otto reads B-0172 row body in a SUBSEQUENT GROUND-TRUTH-RECOVERY commit.)

## Calibration delta (TO BE FILLED IN AFTER VERIFICATION)

(Empty at write time.)

---

**Guess timestamp:** 2026-05-03 ~02:55Z
**Author:** Otto autonomous (architect hat)
**Protocol:** in-the-moment guess per
`memory/feedback_guess_then_verify_architectural_intent_calibration_protocol_aaron_2026_05_03.md`
**Series:** Guess #002 (after #001 on B-0173 hook-authoring scored ~48% across 4 layers; pattern observation: principle-strong + specific-weak)
Loading