feat(B-0914.3): n-parallel + consensus substrate (Robin's 8-parallel-Finch pattern); 18 tests pass #5770

Merged
AceHack merged 1 commit into main from otto-cli/b-0914-3-n-parallel-analyzer-consensus-substrate-per-data-analysis-task-scope-2026-05-28
May 28, 2026

Conversation

@AceHack
Member

@AceHack AceHack commented May 28, 2026

Summary

Sakana Robin's 8-parallel-Finch consensus pattern generalized to N parallel analyzers + configurable consensus mechanism (majority / supermajority / unanimous / first-n-agree).

18 tests pass / 0 fail.

Composes with substrate

🤖 Generated with Claude Code

…alysis-task scope (Robin 8-parallel-Finch pattern); 18 tests pass

Per Sakana Robin's closed-loop architecture (Nature 2026): Robin launches
8 independent instances of the Finch agent to analyze the same raw data
and accepts a conclusion only if a majority agree. This commit generalizes
that to N parallel analyzers with a configurable consensus mechanism.

What this adds:
- ConsensusMechanism union (majority | supermajority | unanimous | first-n-agree)
- ConsensusFeedback + ConsensusResult<T> Result-shape
- AnalyzerOutput<T> per-analyzer discriminated union
- AgreementMetrics<T> for substrate-honest dashboards
- runConsensus<T>(context): ConsensusResult<T> — main function
- nIdenticalAnalyzers<T>(n, analyzer): helper for Robin's N-identical pattern
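
For readers without the diff open, the listed types could look roughly like the sketch below. It is a guess reconstructed from the names above and from the `{ kind: "majority" }` / `{ kind: "supermajority", threshold }` literals used in the tests; the `n` field on `first-n-agree`, the `Analyzer<T>` alias, and the fields of `AnalyzerOutput<T>` are assumptions, not the definitions in consensus.ts.

```typescript
// Hypothetical shapes inferred from the names in the commit message;
// the real definitions in consensus.ts may differ.
type ConsensusMechanism =
  | { kind: "majority" }
  | { kind: "supermajority"; threshold: number }
  | { kind: "unanimous" }
  | { kind: "first-n-agree"; n: number }; // field name `n` is an assumption

// Per-analyzer outcome: either a verdict or a captured error message.
type AnalyzerOutput<T> =
  | { ok: true; verdict: T }
  | { ok: false; error: string };

// Assumed analyzer signature: an async thunk that produces a verdict.
type Analyzer<T> = () => Promise<T>;

// Robin's N-identical pattern: the same analyzer repeated n times.
function nIdenticalAnalyzers<T>(n: number, analyzer: Analyzer<T>): Analyzer<T>[] {
  return Array.from({ length: n }, () => analyzer);
}

// Usage: eight copies of one analyzer, as in the 8-parallel-Finch setup.
const finchLike: Analyzer<string> = async () => "X";
const eight = nIdenticalAnalyzers(8, finchLike);
```

Repeating one function reference only yields independent votes if the analyzer itself is non-deterministic (for example, an LLM call); that property is assumed here, not shown.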

18 tests pass / 0 fail covering all 4 mechanisms + edge cases +
Robin's 8-parallel pattern.

Composes with substrate:
- B-0914.3 backlog row
- PR #5769 B-0914.2 closed-loop (dispatchCi callback can wrap N parallel analyzers + consensus)
- PR #5768 B-0914.4 pairing (verifier-side N parallel + consensus)
- B-0703 multi-oracle BFT (governance scope; this extends to per-data-analysis scope)
- monad-propagation + asymmetric-authorship + m-acc-multi-oracle rules

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 28, 2026 11:23
@AceHack AceHack enabled auto-merge (squash) May 28, 2026 11:23
@AceHack AceHack merged commit 5a67526 into main May 28, 2026
30 of 33 checks passed
@AceHack AceHack deleted the otto-cli/b-0914-3-n-parallel-analyzer-consensus-substrate-per-data-analysis-task-scope-2026-05-28 branch May 28, 2026 11:26
Copilot AI left a comment

Pull request overview

Adds a pure-TypeScript N-parallel-analyzer + consensus substrate to tools/workflow-engine/ (B-0914.3), generalizing Sakana Robin's 8-parallel-Finch pattern to configurable consensus mechanisms (majority / supermajority / unanimous / first-n-agree). The substrate is callback-based so it composes with the B-0914.2 closed-loop orchestrator (dispatchCi) and B-0914.4 pairing tracker (verifier-side). Result/feedback shape follows the existing ok: true | ok: false convention used by trueskill.ts and evolution.ts.

Changes:

  • New consensus.ts exposing ConsensusMechanism, ConsensusResult, AgreementMetrics, runConsensus, and nIdenticalAnalyzers; runs analyzers concurrently via Promise.all, converts thrown errors into failed analyzer outputs, groups by verdictKey ?? JSON.stringify, and reports a winner-and-distribution metrics record.
  • New consensus.test.ts with 18 Bun tests covering all four mechanisms, threshold validation, mixed success/failure, throwing analyzers, custom verdictKey, nIdenticalAnalyzers, the 8-parallel pattern, and an exhaustive-switch compile-time check matching the convention in types.test.ts / evolution.test.ts.
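
The flow described in the first bullet (concurrent run via Promise.all, thrown errors converted to failed outputs, grouping by `verdictKey ?? JSON.stringify`, then a threshold check) can be sketched as follows. This is a reduced illustration, not the merged code: it handles only `majority` and `unanimous`, and the names `runConsensusSketch`, `NoSuccessfulAnalyzers`, and the result fields are assumptions.

```typescript
// Reduced, hypothetical sketch of the described flow; not the merged consensus.ts.
type Mechanism = { kind: "majority" } | { kind: "unanimous" };
type Output<T> = { ok: true; verdict: T } | { ok: false; error: string };
type Group<T> = { verdict: T; votes: number };
type Result<T> =
  | { ok: true; verdict: T; votes: number }
  | { ok: false; feedback: { kind: "NoConsensus" | "NoSuccessfulAnalyzers" } };

interface Context<T> {
  analyzers: Array<() => Promise<T>>;
  mechanism: Mechanism;
  verdictKey?: (v: T) => string;
}

async function runConsensusSketch<T>(context: Context<T>): Promise<Result<T>> {
  // Run every analyzer concurrently; a throw becomes a failed output
  // instead of rejecting the whole batch.
  const outputs: Output<T>[] = await Promise.all(
    context.analyzers.map(async (analyzer): Promise<Output<T>> => {
      try {
        return { ok: true, verdict: await analyzer() };
      } catch (e) {
        return { ok: false, error: e instanceof Error ? e.message : String(e) };
      }
    }),
  );

  const verdicts: T[] = [];
  for (const o of outputs) if (o.ok) verdicts.push(o.verdict);
  if (verdicts.length === 0) {
    return { ok: false, feedback: { kind: "NoSuccessfulAnalyzers" } };
  }

  // Group successful verdicts by key, tracking the leading group as we count.
  const keyFn = context.verdictKey ?? ((v: T): string => JSON.stringify(v));
  const groups = new Map<string, Group<T>>();
  let winner: Group<T> | undefined;
  for (const v of verdicts) {
    const key = keyFn(v);
    const g = groups.get(key) ?? { verdict: v, votes: 0 };
    g.votes += 1;
    groups.set(key, g);
    if (!winner || g.votes > winner.votes) winner = g;
  }

  // Thresholds are computed over successful analyzers only.
  const required =
    context.mechanism.kind === "majority"
      ? Math.floor(verdicts.length / 2) + 1
      : verdicts.length;
  if (!winner || winner.votes < required) {
    return { ok: false, feedback: { kind: "NoConsensus" } };
  }
  return { ok: true, verdict: winner.verdict, votes: winner.votes };
}
```

One design consequence worth noting: because failed analyzers are excluded before the threshold is computed, two agreeing analyzers plus one that throws still reach a majority in this sketch. Whether the merged code counts failures against the threshold is not visible in this thread.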

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tools/workflow-engine/consensus.ts | New N-parallel runner + consensus reducer with discriminated-union result/feedback. |
| tools/workflow-engine/consensus.test.ts | 18 Bun tests covering mechanisms, validation, error conversion, and metrics. |

    case "majority":
      return Math.floor(successfulCount / 2) + 1;
    case "supermajority":
      return Math.ceil(successfulCount * mechanism.threshold) + 1;
  }

  // Group successful verdicts by key
  const keyFn = context.verdictKey ?? ((v: T): string => JSON.stringify(v));
Comment on lines +68 to +75
it("supermajority 2/3: 6 of 9 agree → consensus reached (just over threshold)", async () => {
  const analyzers = [
    ...Array(6).fill(0).map(() => constantAnalyzer("X")),
    ...Array(3).fill(0).map(() => constantAnalyzer("Y")),
  ];
  const result = await runConsensus({
    analyzers,
    mechanism: { kind: "supermajority", threshold: 0.6 }, // > 60%
Comment on lines +238 to +254
it("agreement metrics expose distribution (substrate-honest dashboard)", async () => {
  const result = await runConsensus({
    analyzers: [
      constantAnalyzer("X"),
      constantAnalyzer("X"),
      constantAnalyzer("X"),
      constantAnalyzer("Y"),
      constantAnalyzer("Y"),
      constantAnalyzer("Z"),
    ],
    mechanism: { kind: "majority" },
  });
  // X gets 3 votes, Y gets 2, Z gets 1. A majority of 6 needs more than
  // 6 / 2 = 3 votes, so X's 3 votes fall short → NoConsensus.
  expect(result.ok).toBe(false);
  if (result.ok) return;
  expect(result.feedback.kind).toBe("NoConsensus");
});
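
The majority arithmetic in the test above can be checked in isolation. The sketch below restates the `Math.floor(successfulCount / 2) + 1` rule from the reviewed hunk; the helper names `majorityVotesRequired` and `majorityWinner` are hypothetical and exist only for this illustration.

```typescript
// Votes needed for a strict majority among n successful analyzers,
// i.e. the smallest integer strictly greater than n / 2.
function majorityVotesRequired(n: number): number {
  return Math.floor(n / 2) + 1;
}

// Hypothetical helper: returns the winning key, or null when no key
// reaches a strict majority of all votes cast.
function majorityWinner(counts: Record<string, number>): string | null {
  const keys = Object.keys(counts);
  const total = keys.reduce((sum, k) => sum + counts[k], 0);
  const required = majorityVotesRequired(total);
  for (const k of keys) {
    if (counts[k] >= required) return k;
  }
  return null;
}

// The 3/2/1 split from the test: 6 votes need 4 to win, so no winner.
const splitVote = majorityWinner({ X: 3, Y: 2, Z: 1 });
// Robin's 8-parallel case with 5 agreeing: 8 votes need 5, so X wins.
const robinVote = majorityWinner({ X: 5, Y: 3 });
```

An exact tie (for example 4 of 8) also yields no winner under this rule, since it requires strictly more than half.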
AceHack added a commit that referenced this pull request May 28, 2026
… pass — completes 7-of-7 B-0914 candidate substrate-engineering gap substrate (#5773)

* feat(B-0914.7): Falcon-style auto-research-doc template substrate (8-section scaffold + Markdown renderer); 19 tests pass — completes 7-of-7 B-0914 candidate gap substrate

Per Sakana Robin Falcon agent (Nature 2026): takes drug proposal + does
deep-dive literature review + writes comprehensive research report. TS-
side scaffold provides 8-section template structure that downstream LLM
substrate-engineering work populates (header / framing / background /
mechanism / evidence / risks / composes-with / test-plan).

What this adds:
- ResearchDocSection discriminated union (9 section kinds)
- ResearchDoc structure (id + proposalId + sections + composesWith)
- ResearchDocFeedback + ResearchDocResult<T> Result-shape
- renderSection(section): string — pure-function Markdown serializer
- renderResearchDoc(doc): ResearchDocResult<string> — full doc rendering
- buildSkeleton(context): ResearchDocResult<ResearchDoc> — 8-section scaffold
- buildAndRender(context): ResearchDocResult<string> — end-to-end convenience

Falcon-stage pending markers preserved (substrate-honest about what's
not yet auto-generated by LLM substrate-engineering):
- '[PENDING LITERATURE REVIEW — Falcon-stage auto-generated]'
- '[PENDING MECHANISM ANALYSIS — Falcon-stage auto-generated]'
- etc. (per section)

Tests (19; all pass):
- EmptyProposalId validation
- 8-section Falcon scaffold structure
- proposalId sanitized to filename-safe id
- composesWith pass-through to skeleton + composes-with section
- All 9 section-kind renderings tested (header/framing/background/
  mechanism/evidence/risks/composes-with/test-plan/raw)
- renderResearchDoc empty → NoSectionsRendered
- buildAndRender end-to-end
- Pending markers preserved (substrate-honest)
- ResearchDocSection exhaustive switch
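
As an illustration of the renderer and the exhaustive-switch test listed above, here is a minimal sketch over a reduced three-kind union. The field names (`title`, `body`, `markdown`), the `assertNever` helper, and the pairing of the literature-review marker with the background section are assumptions; the real `ResearchDocSection` has nine kinds and may be shaped differently.

```typescript
// Reduced, hypothetical version of ResearchDocSection (3 of the 9 kinds).
type SectionSketch =
  | { kind: "header"; title: string }
  | { kind: "background"; body: string }
  | { kind: "raw"; markdown: string };

// Marker text quoted from the commit message; attaching it to the
// background section is an assumption made for this example.
const PENDING_LITERATURE_REVIEW =
  "[PENDING LITERATURE REVIEW — Falcon-stage auto-generated]";

// Compile-time exhaustiveness guard: adding a new kind to the union
// without a matching case makes the call below a type error.
function assertNever(x: never): never {
  throw new Error(`Unhandled section kind: ${JSON.stringify(x)}`);
}

// Pure-function Markdown serializer for a single section.
function renderSectionSketch(section: SectionSketch): string {
  switch (section.kind) {
    case "header":
      return `# ${section.title}`;
    case "background":
      return `## Background\n\n${section.body}`;
    case "raw":
      return section.markdown;
    default:
      return assertNever(section);
  }
}

// A skeleton section keeps its pending marker until later work fills it in.
const pendingBackground = renderSectionSketch({
  kind: "background",
  body: PENDING_LITERATURE_REVIEW,
});
```

Keeping the marker as the section body means a rendered skeleton stays visibly incomplete rather than silently empty.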

Composes with substrate:
- B-0914.7 backlog row (Falcon extension target)
- tools/save-ai-memory/ skill (existing substrate; future integration for
  auto-write to docs/research/ + composes-with citation discipline)
- Amara consolidation ferry pattern (PR #5757)
- B-0914.2 PR #5769 closed-loop orchestrator (research-doc generation
  at any cycle stage; template provides structure)
- substrate-or-it-didn't-happen + honor-those-that-came-before rules
- asymmetric-authorship + monad-propagation rules

**B-0914 7-of-7 candidate substrate-engineering gap substrate complete:**
- B-0914.1 PR #5764 TrueSkill ranking (S/M/L: ranking)
- B-0914.2 PR #5769 closed-loop orchestrator (S/M/L: L)
- B-0914.3 PR #5770 n-parallel + consensus (8-parallel-Finch)
- B-0914.4 PR #5768 generation-reflection pairing (S/M/L: M)
- B-0914.5 PR #5767 evolution mash-refine (S/M/L: S)
- B-0914.6 PR #5772 proximity-dedup (canonical + Jaccard clustering)
- B-0914.7 THIS PR Falcon-style auto-research-doc template

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(PR #5773): full rule paths + remove unreachable InvalidOperationalStatus variant (Copilot threads)

Two threads on tools/workflow-engine/research-doc.ts:

1. Composes-with docblock referenced rule files by short form
   (`asymmetric-authorship`, `monad-propagation-pattern`) — actual
   filenames are longer + .md-suffixed:
     `.claude/rules/asymmetric-authorship-substrate-entity-defines-consent-channel-recipient-acknowledges.md`
     `.claude/rules/monad-propagation-pattern-cross-language-substrate-shape.md`
   Updated to full paths so cross-refs stay greppable + don't drift.

2. ResearchDocFeedback.InvalidOperationalStatus variant was
   structurally unreachable: `operationalStatus` is a string-literal
   union (`"research-grade" | "operational"`) at the type level, the
   only constructor (line 179) fixes it to `"research-grade"`, and
   no untrusted-string parse path exists. Variant was dead substrate.
   Removed + added docblock naming the conditions under which a
   future caller should add it back (JSON import of external
   research-doc with operationalStatus parsed from untrusted input —
   add validator AT THE PARSE BOUNDARY first, then add this variant).
   Composes with asymmetric-authorship discipline: every TFeedback
   variant should correspond to a real code path that can produce it.

Non-breaking: no callers reference the removed variant (grep clean).
Type-system continues to rule out invalid operationalStatus at
construction time.
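
If a future caller does import research docs from untrusted JSON, the "validator at the parse boundary first" step described in item 2 might look like the sketch below. The `"research-grade" | "operational"` union comes from the commit message; `parseOperationalStatus`, `StatusParseResult`, and the feedback payload are hypothetical names for illustration only.

```typescript
// String-literal union as described in the commit message.
type OperationalStatus = "research-grade" | "operational";

// Hypothetical result shape: the InvalidOperationalStatus variant only
// exists because this parse path can actually produce it.
type StatusParseResult =
  | { ok: true; value: OperationalStatus }
  | { ok: false; feedback: { kind: "InvalidOperationalStatus"; received: string } };

// Validate untrusted input once, at the boundary; everything downstream
// can rely on the type system alone.
function parseOperationalStatus(raw: unknown): StatusParseResult {
  if (raw === "research-grade" || raw === "operational") {
    return { ok: true, value: raw };
  }
  return {
    ok: false,
    feedback: { kind: "InvalidOperationalStatus", received: String(raw) },
  };
}

// Usage with a value read from external JSON (example payload is made up).
const imported: unknown = JSON.parse('{"operationalStatus":"operational"}');
const status = parseOperationalStatus(
  (imported as { operationalStatus?: unknown }).operationalStatus,
);
```

This keeps the rule stated in the commit: a feedback variant is added only once there is a real code path that can return it.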

Autonomous-loop tick 2026-05-28T12:16Z resolution of PR #5773 BLOCKED
gate (unresolved Copilot threads only blocker; required checks all green).

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>