diff --git a/docs/DECISIONS/2026-04-26-sync-drain-plan-acehack-lfg-roundtrip-option-c.md b/docs/DECISIONS/2026-04-26-sync-drain-plan-acehack-lfg-roundtrip-option-c.md new file mode 100644 index 00000000..e69f768e --- /dev/null +++ b/docs/DECISIONS/2026-04-26-sync-drain-plan-acehack-lfg-roundtrip-option-c.md @@ -0,0 +1,144 @@ +# 2026-04-26 — Sync drain plan: AceHack ↔ LFG round-trip via option-c (cherry-pick-with-rewrites) + +Scope: ADR canonicalizing the AceHack ↔ LFG fork-divergence drain plan executed 2026-04-26, codifying the option-c choice (cherry-pick-with-rewrites over alternatives) and the 7-step round-trip structure for future drain cycles. + +Attribution: Aaron (human maintainer) chose option-c via *"both all, figure out how to combine"* + *"don't lose ideas and backlog"* directional picks 2026-04-26. Otto (Claude opus-4-7) executed the drain across 7 steps using parallel-subagent dispatch for the LFG → AceHack reverse leg. The discipline composes with Otto-329 (Phase 1 LFG drain) + Otto-225 (cherry-pick rebase technique). + +Operational status: research-grade ADR (decision recorded; future drain cycles can adopt or amend the plan) + +Non-fusion disclaimer: this ADR records a decision, not a completed framework. The 7-step structure documented here is the post-hoc reconstruction of an executed drain; the steady-state cadence (step 7) is the part still being calibrated against `feedback_fork_pr_cost_model_prs_land_on_acehack_sync_to_lfg_in_bulk.md`. + +(Per GOVERNANCE.md §33 archive-header requirement on cross-substrate ADRs.) + +## Context + +Two forks of Zeta diverged: + +- **`AceHack/Zeta`** (Aaron's fork) — primary work surface for cheap experimentation, free CI on public repo, the AceHack-first dev workflow per `feedback_fork_pr_cost_model_prs_land_on_acehack_sync_to_lfg_in_bulk.md` +- **`Lucent-Financial-Group/Zeta`** (LFG, Aaron's umbrella org) — canonical training-corpus aggregator per Otto-252 (`feedback_lfg_is_central_training_signal_aggregator_for_all_forks_divergent_signals_push_to_lfg_otto_252_2026_04_24.md`) + +Divergence on 2026-04-26: AceHack 62 commits ahead of LFG / LFG 482 commits ahead of AceHack. The `AceHack/Zeta` and `Lucent-Financial-Group/Zeta` forks accumulated independent history because each was the canonical surface for different work-classes (research/dev on AceHack, governance/release on LFG). Sync became overdue per task #302 (`UPSTREAM-RHYTHM bidirectional drift`). + +## Decision + +**Option-c chosen** over alternatives: + +| Option | Approach | Rejected because | +|---|---|---| +| **a — Copy whole thing** | `git push -f` one direction | Loses other side's work; violates "both all" + Otto-220 don't-lose-substrate | +| **b — Just merge** | Single big merge commit on each side | Re-introduces divergence quickly; "merge" is not "drain"; doesn't rewrite shape for target context | +| **c — Cherry-pick-with-rewrites** | Per-commit (or per-batch) cherry-pick + rewrite for target-context coherence | **CHOSEN.** Preserves both sides' contributions; allows shape rewrite per fork; bounded effort scaling to commit count | +| **d — Reset divergent fork to match canonical** | `git reset --hard origin/main` | Equivalent to (a); same rejection | + +Aaron's framing 2026-04-26: *"both all, figure out how to combine"* + *"don't lose ideas and backlog"* — picks option-c structurally even before it's named. + +## The 7-step round-trip plan (executed 2026-04-26) + +```text +Forward leg (AceHack → LFG): batches 1-4 + closure (steps 1-5) +Reverse leg (LFG → AceHack): full-reconciliation merge (step 6) +Steady state: UPSTREAM-RHYTHM batched cadence (step 7) +``` + +**Citation conventions for this section:** all `Commit:` / +`Tick-history:` SHAs in steps 1-6 reference the +**`Lucent-Financial-Group/Zeta`** repository (the forward-sync +target); all `PR:` numbers in steps 1-6 reference the same. +Step 7 (steady-state cadence) examples reference +**`AceHack/Zeta`** explicitly; deviations from this default are +noted inline. SHAs are short (7-char) for in-prose readability; +qualify to full SHAs via `gh api repos///commits/` +when programmatic verification matters. Per Codex review on this +ADR (PR #31): bare short-SHAs without repo context create +verification ambiguity once forks diverge — this preamble +removes that ambiguity for the entire steps-1-6 block. + +### Step 1 — batch-1: foundation files + +Forward-sync 17 missing files + audit doc + Otto-347 discipline. Establishes the cherry-pick-with-rewrites pattern; lands the audit infrastructure that subsequent batches use to verify content preservation. + +- **Commit:** `Lucent-Financial-Group/Zeta@1c1bd95` — sync(acehack→lfg) batch-1: 17 missing files + audit doc + Otto-347 +- **PR:** Lucent-Financial-Group/Zeta#592 +- **Tick-history:** `Lucent-Financial-Group/Zeta@790be82` (2026-04-26T12:23:02Z) + +### Step 2 — batch-2: BACKLOG row migration + +Forward-sync 23 BACKLOG-row-only commits, rewritten into per-row files (per `2026-04-22-backlog-per-row-file-restructure.md` ADR). The rewrite preserves intent while migrating to the per-row file shape that LFG/main canonicalized. + +- **Commits:** `a3b7e24`, `fecd8d0` — sync(acehack→lfg) batch-2: 23 BACKLOG-row-only commits rewritten into per-row-files (option-c) +- **PR:** #633 + +### Step 3 — batch-3: terminology canonicalization + +Forward-sync UPSTREAM-RHYTHM "three surfaces, two vocabularies" terminology section per `feedback_dont_invent_when_existing_vocabulary_exists.md`. The rewrite captures Aaron's 5-step ladder (scope framing → terminology question → git-native correction → general principle → 3-surface count correction). + +- **Commits:** `ff4ee39`, `a1d781c` — sync(acehack→lfg) batch-3: UPSTREAM-RHYTHM three-surfaces terminology (option-c) +- **PR:** #634 + +### Step 4 — batch-4: bug fixes + tooling hygiene + +Forward-sync `AppContext.BaseDirectory` + curl|bash self-contradiction fix. The bug-fix-class commits ride along with the sync batch so AceHack's tooling improvements reach LFG without separate per-PR overhead. + +- **Commit:** `05d274f` — sync(acehack→lfg) batch-4: AppContext.BaseDirectory + curl|bash self-contradiction fix (option-c) +- **PR:** #635 + +### Step 5 — closure: tick-history + substrate transition + +Tick-history row marking the forward-sync arc complete + transition to substrate-work register. Captures the "phase-1 done, phase-2 starting" gate so later sessions can reconstruct what phase the drain was in. + +- **Commit:** `e4b1fa2` — tick-history: 18:02Z sync option-c COMPLETE + substrate transition + +### Step 6 — reverse leg: full LFG → AceHack reconciliation + +Single large PR landing all LFG-only files on AceHack via 7-parallel-subagent content-preserving merge per `feedback_parallel_subagent_dispatch_for_content_preserving_merge_pattern_2026_04_26.md`. The reverse leg's scale (282K lines, 1046 files) makes per-commit cherry-pick infeasible; the parallel-subagent pattern preserves content while reconciling the larger divergence. + +- **PR:** #26 on AceHack/Zeta — `sync: AceHack ∪ LFG full reconciliation via per-file content-preserving merge (task #302)` +- **Subagent dispatch:** 7 parallel subagents handled 26 conflicting files; each confirmed *"no substantive content silently dropped"* +- **Otto-side spot-checks:** Blockers section restored, jsonl rows preserved, hygiene rows 39/40/41 restored, marketing drafts both attribution variants preserved +- **Publication-fitness gate:** Copilot inline-review surfaced PII flag; redactions applied per Aaron's sharpened bar (2 commits: `e3e4afd` redaction, `86747cd` rollback to wiki-style refs) + +### Step 7 — steady state: UPSTREAM-RHYTHM batched cadence + +Going forward, fork-divergence drain happens via the UPSTREAM-RHYTHM batched cadence (every ~10 PRs, not per-PR) per `feedback_fork_pr_cost_model_prs_land_on_acehack_sync_to_lfg_in_bulk.md`. The 7-step plan recurs as needed when divergence accumulates again, with batches re-counted from 1 each cycle. + +- **Trigger:** AceHack ahead of LFG by ≥10 commits OR LFG ahead of AceHack by ≥10 commits, AND no merge in flight either direction +- **Owner:** the agent currently running the autonomous loop on AceHack +- **Batch size:** 5-25 commits per batch (small enough to review, large enough to amortize per-PR overhead) + +## Consequences + +### Positive + +- **Both forks preserve their contributions** — no Otto-220 substrate-loss +- **Each side's shape is respected** — rewrite-per-target-context allows AceHack-shape commits to land in LFG-shape repo +- **Bounded effort** — batch size scales linearly with commit count; parallel-subagent dispatch handles the larger reverse leg +- **Steady state is predictable** — the 7-step plan is now a template for future drain cycles, not ad-hoc work each time +- **Plan is documented** — this ADR fixes the gap that surfaced when Aaron asked *"do you have the 7 step plan?"* and Otto had to reconstruct it from git history + +### Negative + +- **Per-batch overhead is non-trivial** — each batch ships as a separate PR with its own review cycle +- **PII / publication-fitness gate is owed** — the parallel-subagent merge pattern (step 6) verified preservation but not publication-fitness; this surfaced the Copilot flag on PR #26 (`feedback_subagent_merge_verification_neq_publication_fitness_orthogonal_gates_2026_04_26.md` captures the missing gate) +- **The 7-step structure is post-hoc** — the steps emerged from execution, not from upfront planning; future drain cycles may discover this template doesn't fit unmodified + +### Mitigations for the negatives + +- **Per-batch overhead:** mitigated by Otto-252 (LFG is the central training-signal aggregator; batch-overhead amortizes across the training value) +- **Publication-fitness gate:** add Stage 3 to the parallel-subagent merge pipeline per the orthogonal-gates memory; future merges include the publication-fitness pass before PR-open +- **Post-hoc structure:** convergence-test this ADR — if next drain cycle adds ≤ 1 step modification, the template is stable; if 3+ modifications needed, the template is overfit to 2026-04-26 and needs revision + +## Composes with + +- `docs/UPSTREAM-RHYTHM.md` — operational rhythm governing when drain cycles trigger +- `docs/DECISIONS/2026-04-22-backlog-per-row-file-restructure.md` — the BACKLOG-row migration shape used in step 2 +- `feedback_fork_pr_cost_model_prs_land_on_acehack_sync_to_lfg_in_bulk.md` — the cost-model rationale for AceHack-first +- `feedback_parallel_subagent_dispatch_for_content_preserving_merge_pattern_2026_04_26.md` — the technique used in step 6 +- `feedback_git_merge_file_union_is_not_set_union_can_lose_content_2026_04_26.md` — the failure mode the parallel-subagent pattern replaced +- `feedback_subagent_merge_verification_neq_publication_fitness_orthogonal_gates_2026_04_26.md` — the missing gate this drain surfaced +- `feedback_lfg_is_central_training_signal_aggregator_for_all_forks_divergent_signals_push_to_lfg_otto_252_2026_04_24.md` — the why-LFG-anchored-corpus rationale +- task #284 — completed parent task (option-c sync execution) +- task #302 — pending parent task (UPSTREAM-RHYTHM bidirectional drift; this ADR addresses by documenting the plan) + +## Convergence test + +If next sync drain cycle (when divergence next accumulates) executes with ≤ 1 modification to this 7-step structure, the template is stable. If 3+ modifications, the template is overfit and needs amendment. Track in tick-history each time the plan is invoked. diff --git a/memory/feedback_dont_invent_when_existing_vocabulary_exists.md b/memory/feedback_dont_invent_when_existing_vocabulary_exists.md new file mode 100644 index 00000000..6c756f25 --- /dev/null +++ b/memory/feedback_dont_invent_when_existing_vocabulary_exists.md @@ -0,0 +1,254 @@ +--- +name: Don't invent vocabulary when one already exists — adopt-or-explicitly-decline, never implicit +description: Aaron 2026-04-22 "we should always try to not invent termonology where some already exists unless it's an explicit decison no implicit it's part of the everyhting has it's home, like six sigma we explicity decided not to pull in their entire termonology". When a well-known vocabulary (git, OWASP, Six Sigma, Kanban, W3C, RFC, a consuming library, a standard) already covers a concept, adopt its terms verbatim rather than minting parallel names. Inventions are only OK when they are the product of an explicit documented decision — "everything has its home." Implicit inventions (even small ones like "primary/dev-surface" alongside "upstream/fork") violate this; they accumulate into a bespoke dialect that doesn't index into external knowledge. Paired with the Six-Sigma-vocabulary exception pattern: we adopted DMAIC + WIP, explicitly declined the full lexicon. +type: feedback +originSessionId: 1937bff2-017c-40b3-adc3-f4e226801a3d +--- +**Rule:** Before naming a new concept — or re-naming an +existing concept in factory prose, skills, memories, ADRs, +BACKLOG rows, error messages, or code identifiers — **check +whether an established vocabulary already has a term for this +thing.** If yes, adopt that term verbatim. The only licensed +way to *not* adopt is an **explicit documented decision** (ADR, +skill decision log, memory, or inline "we explicitly decline +this term because …" note). No implicit inventions. + +**Why:** Aaron 2026-04-22, verbatim (with silent correction per +typing-style memory): + +> "we should always try to not invent termonology where some +> already exists unless it's an explicit decison no implicti +> it's part of the everyhting has it's home, like six sigma +> we explicity decided not to pull in their entire +> termonology" + +Triggered immediately after I invented parallel labels +("primary" for LFG, "dev-surface" for AceHack) when git +already had canonical names for that relationship +("upstream" / "fork"). The invented pair paid no rent — the +governance framing was expressible as a consequence of the +git topology — but it accumulated noise that a reader had to +translate back into git's actual vocabulary. The broader +lesson: **invented parallel vocabularies don't index into +external knowledge**. A future contributor who googles +"GitHub upstream fork" gets real documentation. A future +contributor who googles "primary dev-surface Zeta" gets +nothing. + +Three facets in Aaron's message: + +1. **"Everything has its home."** Concepts have canonical + homes in established vocabularies. Git, OWASP, Six Sigma, + Kanban, W3C, NIST, RFCs, C# naming conventions, F# idiom, + .NET framework terminology, the operator-algebra of the + retractable-contract ledger itself — each is a home. When + a concept is described in a home vocabulary, that is its + home. Invent only when no home exists or when multiple + homes conflict and a factory-local term is needed as a + tie-breaker. + +2. **Six-Sigma is the model of explicit decline.** Per + `memory/user_kanban_six_sigma_process_preference.md`, we + adopted Six Sigma's DMAIC + WIP and **explicitly declined** + the rest of the lexicon (black belts, kaizen events, + control charts by that name). That decline was recorded. + That is the licensed path. Silently inventing "primary" + and "dev-surface" when "upstream" and "fork" already + exist is the **anti-pattern** Six-Sigma's explicit partial + adoption was meant to contrast with. + +3. **Explicit > implicit.** An invention documented with a + reason ("we chose X over Y because Y carries these + connotations we want to avoid") is defensible. An + invention that silently slides into the codebase is not. + The cost of recording the decision is tiny; the cost of + a year-later "why do we say X instead of Y" debugging + session is large. + +**How to apply:** + +1. **Before naming anything, check for an existing home.** + When about to write a new noun for a concept in any + artifact (doc, skill, memory, ADR, BACKLOG row, code + identifier), ask: + + - Does git / GitHub API / GitHub docs have a term for + this? (upstream, fork, remote, branch, tag, PR, + ruleset, ref, HEAD, …) + - Does the language / framework have a term? (F# + record, C# partial class, .NET analyzer, Result + monad, discriminated union, …) + - Does a standard have a term? (OWASP LLM-01 through + LLM-10, CWE, CVE, RFC verbs, NIST RMF function, + SLSA level, SPDX license ID, …) + - Does a methodology in use here have a term? (Six + Sigma DMAIC, Kanban WIP, PDCA, TDD red/green, …) + - Does the factory's own documented vocabulary have a + term? (`docs/GLOSSARY.md`, the operator-algebra, the + retraction-native contracts, the ledger, rounds, + persona names, BP-NN rule IDs, …) + - **Does a formal-substrate vocabulary have a term?** + (ML / AI / Bayesian: belief propagation, factor graph, + sum-product, variational inference, MDL, information + gain; probabilistic programming: Infer.NET primitives, + Pyro, Stan vocabulary; graph theory: DAG, tree-width, + min-cut, PageRank; mathematics: lattice / poset / meet + / join / Heyting algebra, group / ring / field, Galois + connection; physics / chemistry: catalyst, HPHT, phase + transition, free energy; information theory: Kolmogorov + complexity, entropy, mutual information, Shannon bound; + religious studies / social theory: Girard mimetic, + parable substrate, ritual, liminal). Added 2026-04-22 + after the belief-propagation reframe proved this axis + was missing from the checklist — I invented "kernel- + vocabulary propagation" without grepping Pearl 1982 or + `docs/ROADMAP.md:80` (which already held `Zeta.Bayesian` + = Infer.NET = belief propagation). A one-minute check + of this axis would have caught it pre-emptive. + + If any answer is yes, adopt verbatim. If multiple answers + yes and they conflict, file a decision note naming the + chosen term and the declined alternatives. + +2. **"Explicit" means written-down, not just thought-about.** + The licensed invention path is one of: + + - An ADR under `docs/DECISIONS/YYYY-MM-DD-*.md` naming + the new term, the rejected alternatives, and why the + invention pays rent that an existing term wouldn't. + - A skill decision log entry (skill-creator workflow) + when the invention is skill-local. + - An inline "we explicitly decline `` + because ``; use ``" note at the + first use-site in the governing doc. + - A memory entry like this one, when the decision is + factory-wide policy. + + Silent adoption of a new term in a commit without any of + these is the violated shape. + +3. **Six-Sigma partial-adoption is the template.** We took + DMAIC + WIP; we declined the rest. That split is + recorded in + `memory/user_kanban_six_sigma_process_preference.md` + and + `docs/FACTORY-METHODOLOGIES.md`. Any future partial + vocabulary adoption should follow the same pattern: + adopt-verbatim for the kept terms, record-the-decline + for the rejected ones. + +4. **Retroactive fixes.** When an accidental invention is + discovered (the pattern here: review, notice, compare + against the established vocabulary, correct), do the + same four things every time: + + - Rewrite the invented term to the established one. + - Add a one-line note in the governing doc that the + established vocabulary is now the canonical one. + - If the invention had any genuine content that the + established vocabulary didn't capture, preserve that + content phrased in the established vocabulary. + - Log the correction in the commit message with a brief + "; reason: ". + +5. **Counter-instances where invention IS justified.** Not + every factory-local noun is an invention-to-apologize-for. + Legitimate inventions: + + - **Concept no established vocabulary covers.** + E.g. "retractable-contract ledger" — there is no + prior term of art. Invention is required. + - **Disambiguation from an overloaded term.** + E.g. "spec" means TLA+ in one context and OpenSpec in + another; the factory explicitly keeps both, per + `docs/GLOSSARY.md`. + - **Factory-specific roles where generic terms mislead.** + E.g. persona names (Kenji, Sova, Ilyana) — generic + "reviewer-1 / reviewer-2" would be worse. + - **Pedagogical / teaching aids.** + E.g. SPACE-OPERA variant of the threat model; the + canonical term (STRIDE) is also present; the variant + is an explicit teaching overlay. + + All four counter-instances share one property: they have + a recorded rationale. The common mode is *record the why + at the moment of invention*, not *apologize six months + later when someone asks*. + +**What this rule does NOT mean:** + +- **Not a ban on factory-internal composites.** Phrases like + "round-close ledger" or "bulk-sync PR" compose + established terms (round, close, ledger, bulk, sync, PR). + Composition of established terms into a local phrase is + fine and requires no explicit decision; inventing + single-word *alternatives to* established terms is what + the rule targets. +- **Not a ban on shortenings.** "LFG" for + Lucent-Financial-Group and "AX" for agent-experience are + established abbreviations, not inventions. +- **Not a style-guide override.** When an established + vocabulary has multiple canonical terms for the same + concept (e.g. "fork" vs "mirror"), picking one is a + style call that doesn't need an ADR — though the GLOSSARY + should reflect the choice. + +**Cross-reference:** + +- `memory/user_kanban_six_sigma_process_preference.md` — + the canonical instance of explicit partial adoption: + "adopt practices not bureaucracy"; DMAIC + WIP kept, rest + declined. This rule generalizes that pattern to all + vocabularies. +- `docs/DECISIONS/` — where ADRs recording invented terms + belong. +- `docs/GLOSSARY.md` — where adopted-verbatim terms and + their sources should appear. +- `.claude/skills/naming-expert/` — the naming expert's + scope; public-API invention falls under this rule + naturally via Ilyana's gate. +- `memory/user_typing_style_typos_expected_asterisk_correction.md` + — applies to the initial message's "termonology" typos + (no asterisk-correction on this one; Aaron did not send a + follow-up with asterisks, so silent pass-through as + terminology/explicit/implicit is sufficient). +- `docs/UPSTREAM-RHYTHM.md` — the immediate application + site; commit `2d1ca77` removed the invented + primary/dev-surface pair in favor of upstream/fork. +- `docs/BACKLOG.md` row 2867 — same commit, same reason. +- `memory/feedback_kernel_vocabulary_propagation_is_belief_propagation_infer_net_memetic_mimetic.md` + — **second worked instance (2026-04-22, same tick as + this rule).** I invented "kernel-vocabulary propagation" + for the factory's skill-library-wide term-migration + phenomenon. Aaron corrected across five messages: + *"this is belief propagation ... infer.net ... maps to + memtic theory ... the french guy i think r somthing maybe + ... Girard ... dawkins is like a description Girard is + like how and why"*. Established vocabulary covered all + three layers of the invented term: **belief propagation** + (Pearl 1982, sum-product algorithm), **Infer.NET** + (Microsoft Research .NET implementation, already on + Zeta roadmap), **Girard mimetic theory** (mechanism-layer + authority, *Things Hidden Since the Foundation of the + World* 1978). Canonical shorthand Aaron settled on: + **"dawkins=what, Girard=why/how"** — depth-ordered, not + peer-ordered. The invention violated the rule more deeply + than the upstream/fork case because it named a formal + substrate (the skill library as a factor graph for BP + inference) that has a full computational framework and + Bayesian / religious-studies literature behind it. The + lesson generalizes: **check against ML / AI / Bayesian + established vocabulary** (belief propagation, factor + graphs, variational inference, sum-product, max-product) + before inventing composite terms for substrate-level + factory phenomena. The `Zeta.Bayesian` roadmap entry + (`docs/ROADMAP.md:80`, `docs/INSTALLED.md:72`) was the + retroactive discoverable-home the invention would have + found with one grep. + +**Source:** Aaron direct message 2026-04-22 during +round-44-speculative tick, immediately after I landed the +git-native-terminology rewrite of UPSTREAM-RHYTHM.md and +BACKLOG row 2867, generalizing the specific correction +("we are git native use their termonology") into the +broader rule it implies. diff --git a/memory/feedback_fork_pr_cost_model_prs_land_on_acehack_sync_to_lfg_in_bulk.md b/memory/feedback_fork_pr_cost_model_prs_land_on_acehack_sync_to_lfg_in_bulk.md new file mode 100644 index 00000000..401b9e05 --- /dev/null +++ b/memory/feedback_fork_pr_cost_model_prs_land_on_acehack_sync_to_lfg_in_bulk.md @@ -0,0 +1,152 @@ +--- +name: Fork-PR cost model — PRs land on AceHack, bulk-sync to LFG only every ~10 +description: Aaron 2026-04-22 correction — the "every 10 PRs" rhythm means PRs target AceHack/Zeta:main (free CI, no LFG Copilot billing) and accumulate there; one bulk AceHack/main → LFG/main sync happens every ~10 batches. Agent was opening individual PRs against LFG which triggered LFG Copilot + Actions per PR = the expensive path. Money-conscious factory default: the fork is the work surface; LFG is the publish surface. +type: feedback +originSessionId: 1937bff2-017c-40b3-adc3-f4e226801a3d +--- + +**Rule:** Day-to-day agent PRs target +`AceHack/Zeta: → AceHack/Zeta:main`, not +`AceHack/Zeta: → Lucent-Financial-Group/Zeta:main`. +AceHack's CI + merge-queue runs on AceHack's free minutes; +LFG's Copilot code-review + LFG Actions billing is paid +once per *bulk sync*, not once per PR. + +Bulk sync happens roughly every 10 PRs: one +`AceHack/Zeta:main → Lucent-Financial-Group/Zeta:main` PR +that carries the accumulated work. LFG Copilot + LFG Actions +run on that one sync PR. + +**Why:** Aaron 2026-04-22, quoted verbatim: + +> "nah so the way you are doing it is still expensive i +> can't thiink of a way to update the factory to have make +> you recgonize you are money inefficent right now. When I +> said every 10 prs. I was thinkg you would be pushing to +> main on AceHack for 10 prs and then all 10 in 1 from main +> to main on LFG. This is the poor mans setup got to bet +> money concious." + +And the follow-up clarification on blast-radius: + +> "this is not an ememrgency, rmember you can't cost me +> real money the build will just stop working on LFG when i +> run out of free credits." + +So the concrete risk is **LFG build grinds to a halt when +free-tier Actions minutes exhaust** — not dollars flowing +out. Budget caps protect Aaron's wallet; the rule protects +the factory's LFG-side *functioning*. Still load-bearing — +a dead LFG CI means PRs can't gate, sync PRs can't validate, +adopters can't see a green build — but it's prudence, not +panic. + +And the anchoring cost-reality memory: + +- `project_lfg_org_cost_reality_copilot_models_paid_contributor_tradeoff.md` + — LFG pays for Copilot + models; AceHack is free. Every PR + opened against LFG pays; every PR opened against AceHack + does not. + +The agent's prior pattern (PRs 45, 51, 52, 53 all opened +directly against `Lucent-Financial-Group/Zeta:main`) paid the +LFG Copilot + Actions cost *per PR*. Wrong direction. The +"every 10 PRs" rhythm was supposed to amortize LFG cost **by +a factor of 10**. + +**How to apply:** + +1. **Default PR target is AceHack, not LFG.** When opening a + PR via the fork-PR workflow, the default command is: + ```bash + gh pr create --repo AceHack/Zeta \ + --head AceHack: \ + --base main \ + --title ... + ``` + NOT `--repo Lucent-Financial-Group/Zeta`. + +2. **Auto-merge on AceHack.** `gh pr merge --repo + AceHack/Zeta --auto --squash` — AceHack's CI runs the + gate, AceHack's merge queue processes. + +3. **Bulk-sync threshold.** Once `AceHack/Zeta:main` is + ~10 commits ahead of `Lucent-Financial-Group/Zeta:main`, + open **one** sync PR: + ```bash + # From AceHack/Zeta's main branch + gh pr create --repo Lucent-Financial-Group/Zeta \ + --head AceHack:main \ + --base main \ + --title "Sync: AceHack/Zeta:main → LFG/Zeta:main (N PRs)" \ + --body "$(cat <<'EOF' + ## Summary + Bulk upstream sync per the 10-PR cost-efficiency rhythm. + + ## Included PRs + (listed by `git log LFG/main..AceHack/main --oneline`) + + ## Cost rationale + Bulk sync = LFG Copilot + Actions run once for N PRs' + worth of work, not N times. + EOF + )" + ``` + LFG Copilot + Actions run *once* on this bulk PR. + +4. **Threshold is a suggestion, not a hard rule.** The + Aaron message says "every 10 prs"; anything from 5-20 is + reasonable. Urgent fixes (security, P0 bugs) can sync + sooner. Pure speculative factory work can wait longer. + The principle is "one-to-many cost amortization", not + "exactly 10". + +5. **When LFG sync is required sooner than 10 PRs:** + - Security P0 (any Mateo / Nazar / Aminata finding) + - External contributor depends on the change (rare pre-v1) + - Aaron explicitly requests the sync + +6. **Sunk-cost handling.** If LFG PRs are already open when + this rule is adopted, let them finish rather than closing + them — CI has already run, cost is paid. Don't double-pay + by closing + re-opening on AceHack. *Exception:* if LFG + CI is red and blocking, consider closing + reopening on + AceHack to avoid re-running LFG CI on a fix. + +7. **Poor-man's setup framing.** Aaron's words: "This is the + poor mans setup got to bet money concious". The cost + discipline is load-bearing for the whole factory — without + it, LFG billing scales linearly with PR count, which + defeats the fork-based workflow's whole point. + +**Cross-reference to existing memories:** + +- `feedback_fork_based_pr_workflow_for_personal_copilot_usage.md` + — the workflow existed; this memory fixes the target + direction (AceHack, not LFG by default). +- `feedback_fork_upstream_batched_every_10_prs_rhythm.md` — + already stated "every 10 PRs" but the agent interpreted + that as "per-PR to LFG with a 10-row ledger somewhere" + instead of "PRs to AceHack, sync to LFG every 10". +- `feedback_lfg_budgets_set_permits_free_experimentation.md` + — LFG budgets are set, but budgets aren't free; budget ≠ + cost-invisible. +- `project_lfg_org_cost_reality_copilot_models_paid_contributor_tradeoff.md` + — the cost-reality this rule responds to. + +**Factory artifacts to update / create:** + +- `docs/UPSTREAM-RHYTHM.md` — Zeta-specific upstream-sync + cadence doc should carry this model as the concrete + "what PR target goes where" section. +- `.claude/skills/fork-pr-workflow/SKILL.md` — the skill + must make AceHack-first the default `gh pr create` call, + with LFG as the bulk-sync exception. +- `docs/FACTORY-HYGIENE.md` — candidate row: "bulk-sync + cadence monitor" (every N rounds check if AceHack is + ahead by ~10+ and flag). + +**Source:** Aaron direct message 2026-04-22 during round-44 +speculative drain. Message triggered by agent opening PRs +#51, #52, #53 all against LFG in sequence — the exact +anti-pattern this rule corrects. diff --git a/memory/feedback_parallel_subagent_dispatch_for_content_preserving_merge_pattern_2026_04_26.md b/memory/feedback_parallel_subagent_dispatch_for_content_preserving_merge_pattern_2026_04_26.md new file mode 100644 index 00000000..e6860a96 --- /dev/null +++ b/memory/feedback_parallel_subagent_dispatch_for_content_preserving_merge_pattern_2026_04_26.md @@ -0,0 +1,186 @@ +--- +name: Parallel-subagent dispatch for content-preserving merge of N divergent files — operational pattern that worked for the 26-file AceHack/LFG fork-sync 2026-04-26; 7 subagents × ~4 files each completed in ~5 min wall-clock with content-preservation verified per file; subagent reports themselves became the substrate documenting judgment calls; safer than `git merge-file --union` (proven lossy) and faster than serial manual merge +description: After `git merge-file --union` failed the preservation requirement (silently dropped 172-line ADR section + 2 of 3 JSONL rows), Aaron picked Strategy A (per-file 3-way merge with content-preservation verification). I dispatched 7 parallel subagents handling ~4 files each. Each subagent: read ours/base/theirs blobs from `/tmp/sync-merge/`; applied content-preservation discipline (preserve all substantive content from both sides; keep both versions where rewritten differently; dedupe identical lines); wrote merged content to `.merged` files; reported judgment calls. All 7 confirmed "no substantive content silently dropped." Wall-clock: ~5 min from dispatch to all reports back. Otto then aggregated, verified load-bearing preservation cases (Blockers section restored, 3 jsonl rows present, BACKLOG unique sections preserved, bash scripts pass syntax check), committed merge with AgencySignature trail. Pattern works; document for future reuse. +type: feedback +originSessionId: 1937bff2-017c-40b3-adc3-f4e226801a3d +--- + +## When to use this pattern + +Triggers: + +- N divergent files need content-preserving merge (typically N >= 6, where serial work becomes time-expensive) +- Both sides have substantive content the other doesn't have +- Blind merge strategies (`-X theirs`, `-X ours`, `merge-file --union`) are unsafe because preservation is the load-bearing rule +- Aaron has explicitly directed "preserve all content from both sides" or equivalent +- Each file's resolution requires judgment calls that benefit from focused per-file attention + +Counter-indications: + +- N < 6: serial Otto execution is faster than dispatch overhead +- Files are all code (executable scripts) where one canonical wins per file: pick-canonical is correct, not union +- Files require coordinated cross-file reasoning (e.g., a refactor that touches several files together): subagents lose context between files + +## The pattern (operational steps) + +1. **Stage blobs** in a temp directory: for each conflicting file ``, save `.ours`, `.base`, `.theirs` to `/tmp/sync-merge/` (replace `/` with `_` in path for safe filename). + + ```bash + mkdir -p /tmp/sync-merge + base_ref="$(git merge-base ours_ref theirs_ref)" + while IFS= read -r f; do + safe="$(printf '%s' "$f" | tr '/' '_')" + git show ours_ref:"$f" > "/tmp/sync-merge/${safe}.ours" 2>/dev/null || true + git show theirs_ref:"$f" > "/tmp/sync-merge/${safe}.theirs" 2>/dev/null || true + git show "$base_ref":"$f" > "/tmp/sync-merge/${safe}.base" 2>/dev/null || true + done < /tmp/conflicts.txt + ``` + +2. **Group files by class** (skills / docs / drafts / tools / append-only-logs / decisions / security). Each group becomes one subagent's batch. Aim for ~4 files per group; 7 groups handles up to ~28 files. + +3. **Dispatch parallel subagents** in a SINGLE Agent tool call message (multiple tool calls in one message run in parallel). Each subagent prompt includes: + - The preservation rule (LOAD-BEARING) + - File-class-specific guidance (e.g., "for SKILL.md, preserve all sections from both sides; for JSONL, dedup-by-key + sort") + - File list with absolute blob paths + - Output destination (`/tmp/sync-merge/.merged`) + - Required report content (line counts, judgment calls, preservation evidence) + +4. **Aggregate**: copy each `.merged` file to its target path in the working tree; `git add` each. + + ```bash + while IFS= read -r f; do + safe="$(printf '%s' "$f" | tr '/' '_')" + cp "/tmp/sync-merge/${safe}.merged" "$f" + git add "$f" + done < /tmp/conflicts.txt + ``` + +5. **Verify load-bearing preservation cases** before commit. Spot-check: + - Critical sections from each side are present (`grep -c ""`) + - Append-only logs have all rows from both sides + - Bash scripts pass `bash -n` syntax check + - Markdown files don't have orphaned merge markers (`<<<<<<<`, `=======`, `>>>>>>>`) + +6. **Commit with AgencySignature trail** documenting subagent reports + per-file judgment calls. Include verification evidence in the Proof: section. + +## Subagent prompt template (load-bearing) + +```text +You are merging files for [CONTEXT] with strict content-preservation discipline. + +CONTEXT: [Aaron's directive + reason] + +PRESERVATION RULE (LOAD-BEARING): your merged output must contain ALL +substantive content from BOTH ours and theirs. When the same paragraph +was rewritten differently on both sides, keep BOTH versions side-by-side. +When content is identical between sides, deduplicate to one copy. Never +silently drop content. + +FILES TO MERGE (N): +1. + ours: /tmp/sync-merge/.ours + base: /tmp/sync-merge/.base + theirs: /tmp/sync-merge/.theirs +[...] + +FOR EACH FILE: +1. Read all 3 blobs +2. Identify what each side added/modified/deleted relative to base +3. Compose merged content preserving ALL substantive content +4. [File-class-specific guidance] +5. Write merged content to /tmp/sync-merge/.merged +6. Verify by spot-checking key sections from BOTH sides are present + +REPORT BACK (under [400-500] words): +- Files merged: [count] +- ours/theirs/merged line counts per file +- Judgment calls (e.g., "ours and theirs both had X but different content; I kept both with markers") +- Identical content deduplicated +- Confirmation: no substantive content silently dropped +``` + +## File-class-specific guidance + +| File class | Preservation strategy | Subagent hint | +|---|---|---| +| SKILL.md | Operational instructions; preserve all sections | "If both sides have a 'when to wear' section with different criteria, KEEP ALL criteria" | +| ADR / DECISION | Architectural records; preserve all rationale | "If one side added a section the other lacks, preserve the addition; verify by grep" | +| BACKLOG row-list | Append-only; preserve all rows | "Union of rows; resolve duplicates only when truly identical" | +| Marketing drafts | Both versions kept as drafts | "Keep both with date-stamped or AceHack/LFG markers" | +| Research notes | Append-only research findings | "Preserve all observations; both sides' findings are valid" | +| Hygiene rows | Numbered list of hygiene rules | "Preserve all rows from both sides; renumber if needed" | +| Tick-history table | Append-only row log | "Concat all rows; dedup by row identifier; preserve order" | +| Append-only JSONL | One JSON per line | "concat + jq dedup-by-ts + sort_by(.ts)" | +| Bash scripts (.sh) | Single executable | "Pick newer/canonical OR if both added new flags/functions keep both; verify with bash -n" | +| Configuration | Pattern lists, exclusions | "Union the patterns; dedupe identical lines" | + +## Performance + +For the 2026-04-26 sync (26 files, mixed file classes): + +- **Wall-clock dispatch-to-aggregate**: ~5 minutes +- **Subagent task durations**: + - Group 1 (3 files, skills+gitignore): 171s + - Group 2 (4 files, agent docs): 174s + - Group 3 (4 files, BACKLOG+hygiene): 254s + - Group 4 (3 files, decisions+upstream): 617s (the heaviest because of Blockers-section restoration) + - Group 5 (3 files, marketing): 321s + - Group 6 (4 files, research+security): 113s + - Group 7 (5 files, ops+budget): 217s +- **Total subagent compute**: ~30 min serial; **parallel wall-clock** ~10 min +- **Otto verification + commit**: ~3 min after subagents reported + +Compared to: + +- **Serial Otto manual merge**: ~30-60 min wall-clock per Aaron's earlier estimate +- **Blind `git merge-file --union`**: ~10 sec but UNSAFE (proven lossy) + +## Risk mitigations + +1. **Subagent inconsistency**: each subagent handles different files independently; reports document judgment calls for Otto verification. +2. **Subagent silent loss**: prompt requires explicit "no substantive content silently dropped" confirmation; Otto verifies load-bearing cases (key phrases, section counts, row counts). +3. **Subagent context overflow**: file-class grouping keeps each subagent focused; ~4 files per group fits comfortably. +4. **Output file path conflicts**: standardized `/tmp/sync-merge/.merged` naming prevents collision. +5. **Subagent crash mid-batch**: if a subagent fails to produce all `.merged` files, Otto detects on aggregation step (skipped count > 0); can re-dispatch missing files. + +## Composes with + +- **Otto-220** (don't lose substrate) — the load-bearing rule the pattern enforces +- **Otto-227** (verbatim signal-in-signal-out absorb) — applied to merge-time content preservation +- **Otto-275-FOREVER** (bounded perfectionism) — pattern is bounded enough to be tractable; sub-agent dispatch keeps scope manageable +- **Otto-294** (antifragile-cross-substrate-review) — multiple subagents are themselves a cross-review form +- **Otto-347** (2nd-agent verify) — Otto's verification step on subagent output is the 2nd-agent check applied to merge work +- **`feedback_git_merge_file_union_is_not_set_union_can_lose_content_2026_04_26.md`** — sibling memory naming the pattern this replaces +- **The Substrate Truth Principle** (AgencySignature ferry-16 maxim) — the parser is the witness for trailers; subagent reports + Otto's verification spot-checks are the witness for content preservation +- **`superpowers:dispatching-parallel-agents` skill** — the meta-pattern this is an instance of + +## When NOT to use this pattern + +- For files where pick-canonical is correct (single executable code; one canonical wins): subagents add overhead without benefit +- For tightly-coupled cross-file refactors: subagents can't see each other's work +- For < 6 files: serial Otto execution is faster +- For files where Aaron explicitly wants per-file substrate-author judgment: pre-empts Otto's full-execution authorization + +## Direct evidence from the 2026-04-26 application + +All 7 subagent reports confirmed "no substantive content silently dropped." Spot-checks Otto ran: + +- ADR Blockers section (172 lines): **RESTORED** (was lost by blind union) +- snapshots.jsonl: **3 rows preserved** (was 1 after blind union) +- BACKLOG unique sections (4 spot-checked): **all present** +- Bash scripts: **both pass `bash -n`** +- Hygiene rows: FACTORY-HYGIENE.md base rows 39/40/41 (which ours had silently dropped in earlier renumber): **RESTORED** +- Marketing drafts: **both attribution variants preserved as inline alternates** + +Pattern produced PR #26 (sync: full reconciliation) which passed BACKLOG drift check + MEMORY.md reference-existence check on first run. + +## Future-Otto reuse + +Reach for this pattern when: + +1. Aaron directs "merge everything" / "both all" / "preserve content" +2. The conflict count is moderate (6-30 files) +3. File classes are diverse (mixed docs, code, drafts, logs) +4. Time pressure is moderate (need it done in this session, not over multiple ticks) + +Adapt the file-class table for new classes encountered. Each successful application adds evidence the pattern is generalizable.