diff --git a/docs/backlog/P2/B-0093-multi-ai-synthesis-enhancements-quarantine-lucky-guess-trajectory-owners-lattice-convergence-2026-04-28.md b/docs/backlog/P2/B-0093-multi-ai-synthesis-enhancements-quarantine-lucky-guess-trajectory-owners-lattice-convergence-2026-04-28.md new file mode 100644 index 00000000..f67f7aaa --- /dev/null +++ b/docs/backlog/P2/B-0093-multi-ai-synthesis-enhancements-quarantine-lucky-guess-trajectory-owners-lattice-convergence-2026-04-28.md @@ -0,0 +1,191 @@ +--- +id: B-0093 +priority: P2 +status: open +title: Multi-AI synthesis enhancements — mechanical quarantine + lucky-guess protocol + trajectory owners + lattice convergence + scanner self-destruct prevention (post-PR-#699 follow-ups) +tier: factory-hygiene +effort: M +ask: maintainer Aaron 2026-04-28T post-PR-#699 multi-AI synthesis (Gemini + Ani + Claude.ai + Alexa + Amara final pass) +created: 2026-04-28 +last_updated: 2026-04-28 +composes_with: + - B-0090 +tags: [aaron-2026-04-28, factory-hygiene, multi-ai-synthesis, mechanical-quarantine, lucky-guess-protocol, trajectory-owners, lattice-convergence, scanner-self-destruct] +--- + +# B-0093 — Multi-AI synthesis enhancements (post-PR-#699 follow-ups) + +## Source + +After PR #699 substrate landed, Aaron forwarded a multi-AI synthesis pass (Gemini + Ani + Claude.ai + Alexa + Amara final form) on the round work. The synthesis surfaced several substantive enhancements that should NOT land in PR #699 (per Amara: "do not reopen PR #699 unless hard defect appears") but should be encoded as follow-up work. + +This row tracks those enhancements as separate scoped tasks, each landable as a small PR after PR #699 merges. + +## Per-enhancement breakdown + +### 1. Mechanical quarantine (Gemini-flagged) + +**Issue:** The compliance rule's "quarantine possible MNPI" guidance is currently advisory. Without mechanical enforcement, a tainted file might accidentally get committed if the autonomous loop ticks before the review happens. 
+ +**Proposed fix:** + +- Create a `.quarantine/` directory listed in `.gitignore` and `.gitattributes` (export-ignore) +- Or define a `*.tainted` extension that standard parsers + commit loops hard-code to ignore +- Update `memory/feedback_public_company_contributor_compliance_no_insider_info_in_public_repos_with_trajectories_aaron_2026_04_28.md` with the mechanical-quarantine protocol + +**Effort:** S — small directory + gitignore + memory update + +### 2. Scanner self-destruct prevention (Gemini + Claude.ai both flagged) + +**Issue:** The B-0092 compliance scanner regex (`rg -n "\binsider\b|\bprivileged\b|..."`) will flag the rule-definition files themselves (CONTRIBUTOR-COMPLIANCE.md, the rule-memory files, glossary entries). Without an explicit allowlist, the scanner Goodharts itself. + +**Proposed fix:** + +- Path-based allowlist for rule-definition files (`--glob '!**/CONTRIBUTOR-COMPLIANCE.md'`, etc.) +- OR `` bypass comment for rule-definition lines (NOT for ad-hoc usage) +- Explicit "where bypass is allowed" rule in B-0092 + +**Composes with:** the Candidate-count Goodhart rule (`memory/feedback_candidate_count_goodhart_raw_hits_are_not_violations_aaron_amara_2026_04_28.md`) — the scanner's acceptance criterion is "all hits classified," not "zero hits." + +**Effort:** S — scanner config + B-0092 update + +### 3. "Lucky guess" protocol (Gemini-flagged) + +**Issue:** Otto / agents may infer something from public-domain logic that accidentally overlaps with the contributor's employer's unannounced internal roadmap. Aaron's reaction (confirm / deny / silence) then leaks information either way. 
+ +**Proposed fix:** + +- Standardized Aaron response: *"Evaluate that hypothesis purely against public market data; I cannot confirm or deny internal roadmap overlaps."* +- Agent rule: do NOT ask Aaron whether a speculative feature matches internal roadmap; do NOT treat silence / discomfort / refusal as confirmation +- Add to `memory/feedback_public_company_contributor_compliance_no_insider_info_in_public_repos_with_trajectories_aaron_2026_04_28.md` as a section + +**Effort:** S — memory update + +### 4. Unsolicited-inference firewall (Claude.ai-flagged) + +**Issue:** Agents may volunteer trading-relevant inferences about contributor employers ("given ServiceTitan's recent product direction, TTAN may benefit from..."). That's MNPI-adjacent even when the analysis is unprompted. + +**Proposed fix:** + +- Trading-firewall rule extended: agents do NOT volunteer trading-relevant inferences about contributor employers +- Block patterns: "should I buy/sell ", "does this internal thing affect ", "given , may benefit from..." +- Safe form: "I can discuss public filings and general market context, but I cannot use or ask for non-public employer information." + +**Effort:** S — memory update + +### 5. Trajectory owners + triggers + recording surfaces (Claude.ai-flagged) + +**Issue:** B-0092's 5 trajectories list cadences but don't name owners, triggers, or recording surfaces. Without those, trajectories drift into "should happen" rather than "happens." 
+ +**Proposed fix:** + +Add a table to B-0092 trajectory section: + +| Trajectory | Owner | Trigger | Recording surface | +|---|---|---|---| +| Continuous self-audit | Otto/agent author | before commit touching public-company context | commit notes / PR body | +| PR compliance audit | PR author + reviewer + CI scanner | PR mentions public company | PR checklist | +| Weekly scan | Otto cron / factory hygiene | weekly cadence | compliance audit log | +| Monthly review | Otto + Aaron review if needed | monthly cadence | docs/compliance/round-N.md | +| Onboarding briefing | Aaron / repo maintainers | new contributor with public-company employer | acknowledgement record | +| Drift retrospective | Otto + reviewer | compliance drift caught | memory + backlog row if repeated | + +**Effort:** S — B-0092 update + +### 6. Lattice convergence criterion (Claude.ai-flagged) + +**Issue:** The Reset-Readiness Evidence Lattice has an order and a content-loss surface, but no termination criterion. "When has L(final) stabilized enough that further evidence sources are unlikely to change reset-readiness?" is a real operational question without a current answer — the lattice could be a beautiful structure that never closes. + +**Proposed fix:** + +Add to `memory/feedback_reset_readiness_metric_ladder_content_loss_surface_amara_2026_04_28.md` a research-task section: + +```text +Define convergence criterion: + When has L(final) stabilized enough that further evidence + sources are unlikely to change reset-readiness? + +Define termination rule: + Reset-readiness audit closes only when all active evidence + surfaces are classified or explicitly deferred with reason. +``` + +Initial heuristic: lattice converges when 3+ independent evidence surfaces all return the same `L`, OR when one peer-reviewed surface returns `L = ∅`. + +**Effort:** M — research task; defer to later round if scoping is heavier + +### 7. 
Bead-audit completeness (Claude.ai-flagged) + +**Issue:** Round synthesis listed 8 bead candidates and provided strict treatment for 5; the other 3 (Lost-Substrate Recovery, Public-Company Compliance, Input-Is-Not-Directive) were left ambiguous. As written, the omission could read as either oversight or implicit acceptance. + +**Proposed fix:** + +For each of the 3 deferred candidates, write an explicit one-line evidence-or-defer: + +```text +Lost-Substrate Recovery: + defer bead audit until B-0090 runs again on a later cadence. + +Public-Company Compliance: + defer independent bead until scanner/checklist/quarantine + catches or prevents a later issue. + +Input-Is-Not-Directive: + bead only if the rule changes a later wording decision + without Aaron prompting (i.e., later-Otto self-applies). +``` + +**Effort:** S — memory update / round-history entry + +### 8. Beacon-promotion pattern as round-level memory (Claude.ai-flagged) + +**Issue:** This round demonstrated the Beacon-promotion pattern at scale — 5 distinct internal coinages earned external anchors (SDT + RFC 2119 for input-is-not-directive; SEC/Reg-FD/SOX for public-company compliance; Goodhart/Campbell for metric corrections; lattice theory for evidence-lattice; git internals for commit-vs-tree). Worth a memory entry beyond the BP-WINDOW ledger. + +**Proposed fix:** + +Memory file: `feedback_beacon_promotion_load_bearing_rules_earn_external_anchors_aaron_2026_04_28.md`. Encodes the rule: + +> *Load-bearing factory rules consistently earn external anchors when they're correct. 
The absence of an external anchor on a long-running internal rule is a useful drift signal.* + +**Effort:** S — new memory file + +## Acceptance + +- [ ] Each of the 8 enhancements lands as either a separate small PR or an update to an existing memory/backlog row +- [ ] Each enhancement is verified against the synthesis packet's framing +- [ ] No enhancement reopens PR #699 +- [ ] Mechanical quarantine is actually mechanical (not just advisory) +- [ ] Scanner self-destruct prevention verified by smoke-test (run scanner on rule-definition files; expect ALLOW-class hits) +- [ ] Trajectory owners table lands in B-0092 +- [ ] Lattice convergence section lands in metric-ladder memory (or deferred to research with explicit reason) + +## Why P2 + +These are valuable enhancements but not blocking. PR #699 substrate functions today as the rule-layer; these enhancements add mechanical enforcement + edge-case guards + research depth. Roll out per cadence. + +## Composes with + +- **`memory/feedback_candidate_count_goodhart_raw_hits_are_not_violations_aaron_amara_2026_04_28.md`** — the headline rule from this synthesis; encoded immediately alongside this row. +- PR #699 (memory cluster) — substrate this row's enhancements layer on top of. +- B-0090 (cadenced lost-substrate audit) — Trajectory #5 (drift retrospective) cadence. +- B-0091 (ServiceTitan audit) — completed; the candidate-count rule's worked-example origin. +- B-0092 (public-company contributor compliance) — receives enhancements 1, 2, 3, 4, 5. + +## What this row does NOT do + +- **Does NOT** authorize reopening PR #699. Each enhancement lands as new substrate after #699 merges. +- **Does NOT** require all 8 enhancements to land in one PR. Each is independently scoped. +- **Does NOT** require enhancement #6 (lattice convergence) to land soon. It's research-grade and can defer multiple rounds if scoping firms up. +- **Does NOT** replace any existing rule. Enhancements layer on top. + +## Pickup + +When picking this up: + +1. 
Read this row + the candidate-count Goodhart memory first. +2. Pick enhancement(s) by effort + immediate-value: + - Highest immediate value: #2 (scanner self-destruct), #5 (trajectory owners), #1 (mechanical quarantine) + - Quick wins: #3 (lucky guess), #4 (unsolicited inference), #7 (bead audit), #8 (Beacon-promotion memory) + - Research-grade: #6 (lattice convergence) +3. Land each as a small PR (S effort target). +4. Update this row with done-status as enhancements land. diff --git a/memory/feedback_candidate_count_goodhart_raw_hits_are_not_violations_aaron_amara_2026_04_28.md b/memory/feedback_candidate_count_goodhart_raw_hits_are_not_violations_aaron_amara_2026_04_28.md new file mode 100644 index 00000000..ab9bfbd5 --- /dev/null +++ b/memory/feedback_candidate_count_goodhart_raw_hits_are_not_violations_aaron_amara_2026_04_28.md @@ -0,0 +1,252 @@ +--- +name: Candidate-count Goodhart — raw search hits are not violation counts (Amara final-synthesis naming, Aaron 2026-04-28) +description: New Goodhart-family entry surfaced after B-0091 inspection found "8 active rewrite files" was a candidate-count proxy that resolved to "0 actual rewrites needed" once context-classified. Generalizes to any audit using grep/regex/search — raw hits are CANDIDATE evidence requiring context classification, not VIOLATION counts. Encoded as: count matches to find work; classify context to decide work. Best distilled rule. Composes with the metric ladder + Goodhart family. Critical for B-0092 compliance scanner design (must not Goodhart itself by trying to delete words like "insider" / "confidential" / "roadmap" from the rule definitions themselves). 
+type: feedback +--- + +# Candidate-count Goodhart + +## The rule (Amara final-synthesis naming, Aaron 2026-04-28) + +> **Raw search hits are not violation counts.** + +Or, in the canonical decision-procedure form: + +> **Count matches to find work.** +> **Classify context to decide work.** + +## The triggering catch (this session, 2026-04-28) + +B-0091 (audit + rename ServiceTitan references in live docs) +flagged 12 file matches via `rg -i 'service ?titan'`. The +naive interpretation became: + +```text +12 matches → "8 active rewrite files" + "4 historical/generated" +``` + +After per-row context inspection, the actual finding was: + +```text +12 matches → 0 actual rewrites needed +all 12 references are correctly-named for context: + - 2 pitch-context (KEEP-NAME) + - 4 memory-file path pointers (HISTORICAL preservation) + - 1 funding-chain disclosure (KEEP-AS-DISCLOSURE) + - 1 already-fixed in earlier commit + - 4 historical narrative + generated artifacts +``` + +Same Goodhart-trap shape as the prior catches in this session +(commit-count panic, sample-classification, tree-numstat). The +common failure mode: **using a count as a proxy for the +quantity-of-actual-work**. 
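
The arithmetic of the catch above can be replayed as a small tally. This is only a sketch: the label strings paraphrase the B-0091 breakdown, and `REWRITE` is a hypothetical label (not in the original audit) for a reference that would genuinely need editing.

```python
from collections import Counter

# Labels paraphrase the B-0091 breakdown; REWRITE is a hypothetical label
# for a reference that would actually need to be edited.
classified = (
    ["KEEP-NAME"] * 2
    + ["HISTORICAL-POINTER"] * 4
    + ["KEEP-AS-DISCLOSURE"] * 1
    + ["ALREADY-FIXED"] * 1
    + ["HISTORICAL-OR-GENERATED"] * 4
)

tally = Counter(classified)
candidate_count = sum(tally.values())  # what the raw search reported
rewrites_needed = tally["REWRITE"]     # what the audit actually has to do

print(candidate_count, rewrites_needed)  # prints: 12 0
```

The two numbers answer different questions, which is the point of the catch: the first sizes the inspection work, the second sizes the editing work.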
+ +## How this composes with the Goodhart family + +This catch extends the metric ladder one step further: + +```text +raw match count + → candidate set + → context classification + → actual violation / no-op / follow-up +``` + +| Catch | Wrong target | Correct target | +|---|---|---| +| #1 (substrate-is-amortized-precision, Aaron) | More substrate iteration | Terminal progress / amortized payout | +| #2 (commit-count vs tree-numstat, Otto) | Commit-count divergence | Tree/content work queue | +| #3 (sample-classification, Amara) | Sampled-file ALREADY-COVERED | Full-tree clearance | +| #4 (content-loss-surface, Amara) | Tree diff count | Content-loss surface | +| **#5 (this — Candidate-count, Amara)** | **Raw search hits** | **Context-classified candidate set** | + +Each catch is the same shape: a measurement substituted for the +target. Each correction names the actual target the measurement +was a proxy for. + +## Acceptance criterion rule + +For any audit using `grep` / `rg` / `find` / search-based +discovery: + +```text +Acceptance criterion is NOT "zero matches." + +Acceptance criterion IS: + - all matches classified into terminal states + - all unsafe matches fixed or quarantined + - all legitimate matches documented + - no unresolved NEEDS-HUMAN-REVIEW items +``` + +The "zero matches" target is appropriate **only when the +search term is truly forbidden in all contexts** (e.g., +secret-token leak detection, debug-print scrubbing). For +context-sensitive audits, zero is the wrong target. 
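
As a rough illustration of the acceptance criterion above, the check can be phrased over classified hits rather than over the hit count. Everything named here is an assumption for the sketch: the `Hit` type, the three terminal state names, and the `audit_passes` / `blockers` helpers are hypothetical, and each real audit defines its own state list.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical terminal states for illustration; real audits define their own.
TERMINAL_STATES = {"LEGITIMATE", "FIXED", "QUARANTINED"}
UNRESOLVED = "NEEDS-HUMAN-REVIEW"


@dataclass
class Hit:
    path: str
    line: int
    state: Optional[str] = None  # None means "not yet classified"


def blockers(hits: list[Hit]) -> list[Hit]:
    """Return the hits that are unclassified or still awaiting review."""
    return [h for h in hits if h.state not in TERMINAL_STATES]


def audit_passes(hits: list[Hit]) -> bool:
    """Pass when no hit is left outside a terminal state, whatever the count."""
    return not blockers(hits)


hits = [
    Hit("docs/pitch.md", 10, "LEGITIMATE"),
    Hit("src/sample.py", 3, UNRESOLVED),
]
print(audit_passes(hits))  # False: one hit still needs human review
```

Note that a non-empty list of hits can pass and that the hit count never appears in the decision; only the unresolved remainder does.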
+ +### Terminal classification states (per audit type) + +For the **ServiceTitan naming audit** (B-0091): + +```text +KEEP-NAME (pitch / research / disclosure context) +GENERICIZE (reusable code / sample context) +HISTORICAL-POINTER (memory-file path / archive) +GENERATED (regenerate only if source changes) +COMPLIANCE-RISK (quarantine / escalate) +NEEDS-HUMAN-REVIEW +``` + +For the **public-company contributor compliance scanner** +(B-0092): + +```text +ALLOW (defining the compliance rule itself / + citing public filings / historical note) +WARN ("insider" / "privileged" register around contributor + expertise — needs context-rephrase) +BLOCK (company-specific internal claim without public source / + customer data / private metrics / confidential + architecture) +``` + +For the **lost-substrate audit** (B-0090): + +```text +ALREADY-COVERED +NEEDS-RECOVERY +OBSOLETE +NEEDS-HUMAN-REVIEW +``` + +For the **directive-language audit** +(`feedback_input_is_not_directive_*`): + +```text +LEGITIMATE-USE (concept-naming / verbatim quote / + technical-term) +NEEDS-REFRAME (agency-collapsing language about Aaron's role) +``` + +## Critical implication for B-0092 compliance scanner + +The B-0092 contributor-compliance scanner MUST be designed with +this rule in mind. A scanner that tries to reach "zero uses of +insider / confidential / roadmap" will **immediately Goodhart +itself**, because: + +- The compliance rule itself contains those words (defines + what's forbidden) +- The compliance scanner code defines patterns matching those + words +- Documentation explaining the rule cites those words +- Historical archives quote them as substrate + +Without context-sensitive classification, the scanner would +flag its own constitution as a violation. + +**The scanner design rule:** + +```text +Scanner produces CANDIDATE hits. +Classifier (Otto / human / heuristic-rule) assigns each hit + to a terminal state. 
+Acceptance = all hits in terminal states; no BLOCK-class + hits; all WARN hits reviewed; rule-definition hits explicitly + allowlisted; no ad-hoc bypasses outside rule-definition surfaces. +``` + +This is encoded in B-0092's scanner section as a hard design +constraint. + +## External lineage (Tier 2) + +- **Goodhart's Law** (Goodhart 1975, Strathern 1997 reframing) + — when a measure becomes a target, it ceases to be a good + measure. The candidate-count failure mode is a specific + instance of this. +- **Campbell's Law** (Campbell 1976) — quantitative social + indicators used for decision-making become subject to + corruption pressures. +- **Linguistic distinction between intension and extension** + (Frege; Carnap) — a measure (extension: count of items + matching a regex) ≠ the meaning the measure was meant to + capture (intension: violation of a context-sensitive rule). + +## Pickup for future Otto + +When designing or running an audit: + +1. **State the actual target.** What's the real failure mode + you're trying to detect? (e.g., "brand-bleed in reusable + code surfaces" — NOT "occurrences of the brand name.") +2. **Choose the search expression.** The expression generates + CANDIDATES, not violations. +3. **Classify each candidate** by context into a terminal state + from the audit-type's terminal-state list. +4. **Acceptance** = all candidates classified; no unsafe + classifications outstanding; no `NEEDS-HUMAN-REVIEW` + unresolved. +5. **Never use raw count as the success metric** unless the + term is forbidden in all contexts. + +When reviewing someone else's audit: + +1. **Check the target framing**: is it raw count or context- + classified? +2. **Check the acceptance criterion**: is "zero matches" being + used inappropriately? +3. **Check for self-destruct**: would the audit flag its own + rule-definition / scanner / documentation surfaces? 
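
Steps 2 and 3 of the pickup list, plus the self-destruct check, might look roughly like the following. It is a sketch under stated assumptions, not the B-0092 scanner: the pattern uses only the two terms quoted from the scanner regex, the allowlist glob and the example file contents are made up for illustration, and a path-only classifier is far cruder than real context classification.

```python
import re
from fnmatch import fnmatch

# Hypothetical pattern and allowlist for illustration only; the real B-0092
# expression and rule-definition paths are defined elsewhere.
PATTERN = re.compile(r"\b(insider|privileged)\b", re.IGNORECASE)
RULE_DEFINITION_GLOBS = ["*CONTRIBUTOR-COMPLIANCE.md"]


def find_candidates(files: dict[str, str]) -> list[tuple[str, int, str]]:
    """Step 2: the search expression yields candidates, not violations."""
    hits = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((path, lineno, line))
    return hits


def classify(path: str) -> str:
    """Step 3: assign a state from context (here, only the file path)."""
    if any(fnmatch(path, glob) for glob in RULE_DEFINITION_GLOBS):
        return "ALLOW"  # rule-definition surface, explicitly allowlisted
    return "WARN"       # everything else is routed to review


# Made-up file contents, including a rule-definition file that must not
# be flagged as a violation of the rule it defines.
files = {
    "docs/CONTRIBUTOR-COMPLIANCE.md": "Do not commit insider information.",
    "notes/round.md": "Aaron has privileged context here.",
    "README.md": "Nothing relevant.",
}
states = {path: classify(path) for path, _, _ in find_candidates(files)}
print(states)
```

The rule-definition file still produces a candidate hit; it is the classification step, not the search, that keeps the audit from flagging its own constitution.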
+ +## Direct distillation (Amara final-synthesis form) + +> *"Count matches to find work."* +> *"Classify context to decide work."* + +This is the keeper rule. Five words on each side; complete +decision procedure. + +## What this rule does NOT do + +- **Does NOT** apply to genuinely-forbidden tokens. Secret + scans, credential leak scans, hardcoded-PII scans operate + on "any match is a violation" — those are forbidden in all + contexts, and zero-match IS the right target. +- **Does NOT** prohibit using raw counts diagnostically. + Raw count is fine as a warning signal; it's wrong as the + success metric. +- **Does NOT** require all audits to have a separate + classification step. Tiny audits where every hit is + obviously a violation can collapse classification into + inspection. The rule applies when the audit's term has + legitimate uses. +- **Does NOT** override existing audits in the repo. They + may have been designed correctly; the rule is for new + audit design and existing-audit review. + +## Composes with + +- `memory/feedback_reset_readiness_metric_ladder_content_loss_surface_amara_2026_04_28.md` + — extends the metric ladder with a 5th catch. +- `memory/feedback_class_count_validity_drift_amara_meta_class_2026_04_28.md` + — same family at the meta-level (count-as-evidence trap). +- `memory/feedback_sample_classification_is_calibration_not_clearance_amara_goodhart_catch_3_2026_04_28.md` + — Catch #3, also count-as-evidence shape. +- B-0091 (ServiceTitan audit) — worked example: 12 matches → + 0 rewrites; the catch's origin trigger. +- B-0092 (public-company contributor compliance) — critical + application: scanner must avoid self-destruct. +- B-0090 (lost-substrate cadenced recovery) — applies same + rule to lost-branch / orphan-PR audits. + +## Direct Aaron / Amara framing + +> *"Raw search hits are candidate sets, not violation sets."* + +> *"Count matches to find work. 
Classify context to decide +> work."* + +> *"For any audit using grep/regex/search: search hits are +> candidate evidence, not final findings."*