docs+feat(B-0914 + upstream): add co-scientist + Robin + Microsoft Infer.NET to upstream references + backlog B-0914 7-candidate substrate-engineering gap decomposition (Aaron 2026-05-28 explicit) #5763
…fer.NET to references/reference-sources.json + UPSTREAM-LIST.md + backlog B-0914 7-candidate substrate-engineering gap decomposition (Aaron 2026-05-28: 'we should add coscientis and add it to our upstram references and refersh update them so we can take a peak lol also lets backlog all the candidates')

Per Aaron 2026-05-28 explicit substrate-engineering directives:

1. 'we should add coscientis and add it to our upstram references' → added 6 entries to references/reference-sources.json:
   - SakanaAI/AI-Scientist (original v1)
   - SakanaAI/AI-Scientist-v2 (Robin descendant; agentic tree search)
   - jataware/open-coscientist (best open-source co-scientist via LangGraph)
   - llnl/open-ai-co-scientist (LLNL government-lab implementation)
   - The-Swarm-Corporation/AI-CoScientist (minimal Swarms framework)
   - Microsoft Research Infer.NET (TrueSkill substrate; canonical)
2. 'refresh update them so we can take a peak' → operator may run tools/setup/common/sync-upstreams.sh to mirror new repos into references/upstreams/ (operator-side; not auto-run by Otto-CLI per safety discipline)
3. 'lets backlog all the candidates' → filed B-0914 parent row with 7-candidate decomposition:
   - B-0914.1 ELO-style ranking-agent via TrueSkill/Infer.NET
   - B-0914.2 Closed-loop CI-result → next-hypothesis dispatch
   - B-0914.3 n-parallel-agent-instances + consensus per data-analysis-task
   - B-0914.4 Generation+reflection adversarial pairing structurally enforced
   - B-0914.5 Evolution agent (mash + refine surviving substrate)
   - B-0914.6 Proximity-agent for substrate-engineering substrate de-duplication
   - B-0914.7 Falcon-style auto-generate-substrate-research-doc per proposal

Also added 'Multi-agent scientific discovery' section to docs/UPSTREAM-LIST.md naming Google co-scientist + Sakana Robin + Microsoft Infer.NET TrueSkill with substrate-engineering composition notes.

Per WebSearch 2026-05-28 verification:

- https://github.com/SakanaAI/AI-Scientist
- https://github.com/SakanaAI/AI-Scientist-v2
- https://github.com/jataware/open-coscientist
- https://github.com/llnl/open-ai-co-scientist
- https://github.com/The-Swarm-Corporation/AI-CoScientist
- https://github.com/dotnet/infer
- Google co-scientist itself closed-source (Nature 2026; only Science Skills data layer on GitHub)
- Sakana Robin: Nature 2026 (s41586-026-10652-y); arXiv:2505.13400

Composes with PR #5762 (YouTube ferry preservation) + B-0867 workflow engine substrate cluster + B-0865 + B-0865.17 benchmark + B-0703 multi-oracle BFT.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pull request overview
Adds 6 multi-agent scientific-discovery upstream references (Sakana AI-Scientist v1/v2, three open co-scientist ports, Microsoft Infer.NET) and backlogs a P2 parent row B-0914 decomposing 7 substrate-engineering candidate gaps surfaced by the YouTube ferry PR #5762.
Changes:
- Add 6 entries to `references/reference-sources.json` for co-scientist / Robin / Infer.NET upstreams
- Add a "Multi-agent scientific discovery" section to `docs/UPSTREAM-LIST.md`
- Add backlog row `B-0914` (P2) with 7-candidate decomposition and update `docs/BACKLOG.md` index
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `references/reference-sources.json` | Six new upstream entries (Sakana v1/v2, jataware, LLNL, Swarms, Infer.NET) |
| `docs/UPSTREAM-LIST.md` | New "Multi-agent scientific discovery" section listing the same upstreams |
| `docs/backlog/P2/B-0914-...md` | New P2 backlog row with 7 candidate sub-row decomposition |
| `docs/BACKLOG.md` | Index entry linking to B-0914 |
This was referenced May 28, 2026
AceHack added a commit that referenced this pull request May 28, 2026
…nking-agent (Herbrich+Minka+Graepel 2007 paper algorithm; substrate for cross-vendor benchmark on common ground) (#5764)

Per Aaron 2026-05-28 substantive substrate-engineering decision:

- 'they are doing this for their idea ranking with Infra.net basically'
- 'we'd build ELO from scratch is this a good idea too or nah with infer.net?'
- 'you are too careful just ship stuff and lets inventory later'

Substrate-honest answer shipped: HYBRID is best.

- TS-side (this PR): pure-TS TrueSkill 1v1 for vendor skill runtime (cross-vendor benchmark on common ground B-0865.17 REQUIRES TS-side because Infer.NET can't run in Claude/GPT/Gemini/Grok skill stores)
- F#/.NET side (future Zeta.Bayesian work): Infer.NET TrueSkill for deep production integration + full BP/EP framework
- Both compose via shared API shape (TrueSkillRating + match update fn)

Implementation: published TrueSkill algorithm from Herbrich+Minka+Graepel 2007 NeurIPS paper. Minimal 1v1 case; team-play extension deferred. ~340 lines including documentation.

What this adds:

- TrueSkillRating interface (mu + sigma posterior gaussian)
- DEFAULT_INITIAL_RATING (Xbox Live convention: mu=25 sigma=25/3)
- DEFAULT_PARAMS (beta=mu/6 tau=mu/300 drawProb=0.10)
- MatchOutcome discriminated union (win-A / win-B / draw)
- RankingFeedback discriminated union (InvalidRating / NumericalInstability / UnsupportedOutcome)
- RankingResult Result-shape per monad-propagation rule
- rate1v1(a, b, outcome, params): RankingResult (full 1v1 TrueSkill update)
- conservativeSkill(rating): number (Xbox Live lower-bound convention: mu - 3*sigma)
- Internal helpers: normalPdf, normalCdf (A&S 7.1.26), inverseNormalCdf (Newton's method), drawMargin, vWin/wWin (non-draw truncated normal corrections), vDraw/wDraw (draw truncated normal corrections)

Tests (17; all pass):

- Default initial rating Xbox Live convention
- Default params paper convention
- conservativeSkill = mu - 3*sigma
- win-A increases A's mu, decreases B's
- win-B increases B's mu, decreases A's
- Both sigmas decrease after match (uncertainty reduction)
- After 2 matches both sigmas decrease + mus drift bounded
- Strong-beats-weak → small mu shift (expected outcome)
- Weak-beats-strong → large mu shift (upset)
- Draw between equal players → minimal mu change
- Draw between unequal players → strong loses mu, weak gains
- Returns InvalidRating for NaN mu / non-positive sigma / negative sigma
- conservativeSkill ranking with sigma-punishment semantic preserved
- 5-match tournament convergence (sigma reduction + mu separation)
- MatchOutcome exhaustive switch (TS strict mode)

Composes with substrate:

- B-0914.1 backlog row (TrueSkill ranking-agent extension target)
- B-0867 workflow engine substrate (future ActionClass 'rank-via-trueskill')
- B-0865 + B-0865.17 cross-vendor benchmark substrate
- B-0867.20 lifecycle DU (rank action gets pr-review-light via Mod 1)
- Microsoft Infer.NET upstream reference (PR #5763 in flight)
- .claude/rules/monad-propagation-pattern (Result<T, TFeedback> shape)
- .claude/rules/asymmetric-authorship (TFeedback authored by ranking fn)

Source citation: Herbrich, Minka, Graepel 'TrueSkill: A Bayesian Skill Rating System' (NeurIPS 2006/2007); algorithm implementation from published paper, not Infer.NET source.

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
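The API shape listed in the commit message above can be illustrated with a minimal sketch. This is not the code from PR #5764: the type names (TrueSkillRating, MatchOutcome, RankingFeedback, RankingResult, rate1v1, conservativeSkill) mirror the commit message, but every field name and function body is an assumption. The sketch also simplifies the algorithm by using a zero draw margin, omitting the tau dynamics term, and reporting draws as UnsupportedOutcome instead of implementing the draw corrections.

```typescript
// Illustrative sketch only. Names mirror the commit message; the field
// names ("kind", "detail") and all bodies are assumptions, not the PR's code.
interface TrueSkillRating { mu: number; sigma: number }
type MatchOutcome = "win-A" | "win-B" | "draw";
type RankingFeedback =
  | { kind: "InvalidRating"; detail: string }
  | { kind: "NumericalInstability"; detail: string }
  | { kind: "UnsupportedOutcome"; detail: string };
type RankingResult =
  | { kind: "ok"; a: TrueSkillRating; b: TrueSkillRating }
  | { kind: "error"; feedback: RankingFeedback };

const DEFAULT_INITIAL_RATING: TrueSkillRating = { mu: 25, sigma: 25 / 3 };
const BETA = 25 / 6; // performance noise, matching the stated default beta = mu/6

const normalPdf = (x: number): number =>
  Math.exp(-0.5 * x * x) / Math.sqrt(2 * Math.PI);

// Standard normal CDF via the Abramowitz & Stegun 7.1.26 erf approximation.
function normalCdf(x: number): number {
  const z = Math.abs(x) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * z);
  const poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
    - 0.284496736) * t + 0.254829592) * t;
  const erfAbs = 1 - poly * Math.exp(-z * z);
  return 0.5 * (1 + (x < 0 ? -erfAbs : erfAbs));
}

const isValid = (r: TrueSkillRating): boolean =>
  isFinite(r.mu) && isFinite(r.sigma) && r.sigma > 0;

// Win-only 1v1 update with zero draw margin (a simplification of the paper).
function rate1v1(a: TrueSkillRating, b: TrueSkillRating, outcome: MatchOutcome): RankingResult {
  if (!isValid(a) || !isValid(b)) {
    return { kind: "error", feedback: { kind: "InvalidRating", detail: "mu must be finite, sigma > 0" } };
  }
  if (outcome === "draw") {
    return { kind: "error", feedback: { kind: "UnsupportedOutcome", detail: "draws omitted in this sketch" } };
  }
  const aWon = outcome === "win-A";
  const winner = aWon ? a : b;
  const loser = aWon ? b : a;
  const c2 = 2 * BETA * BETA + winner.sigma * winner.sigma + loser.sigma * loser.sigma;
  const c = Math.sqrt(c2);
  const t = (winner.mu - loser.mu) / c;
  const denom = normalCdf(t);
  if (!(denom > 1e-12)) {
    return { kind: "error", feedback: { kind: "NumericalInstability", detail: "CDF underflow" } };
  }
  const v = normalPdf(t) / denom; // mean correction for a truncated Gaussian
  const w = v * (v + t);          // variance correction
  const update = (r: TrueSkillRating, sign: number): TrueSkillRating => {
    const s2 = r.sigma * r.sigma;
    return {
      mu: r.mu + sign * (s2 / c) * v,
      sigma: Math.sqrt(s2 * Math.max(1 - (s2 / c2) * w, 1e-9)),
    };
  };
  const newWinner = update(winner, 1);
  const newLoser = update(loser, -1);
  return aWon
    ? { kind: "ok", a: newWinner, b: newLoser }
    : { kind: "ok", a: newLoser, b: newWinner };
}

// Lower-bound skill estimate: mu - 3*sigma.
const conservativeSkill = (r: TrueSkillRating): number => r.mu - 3 * r.sigma;
```

Callers would branch on `kind` before reading the ratings, which is what lets the compiler enforce the Result shape. With two default-rated players and a win for A, this sketch moves A's mu up and B's mu down by the same amount and shrinks both sigmas.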
AceHack added a commit that referenced this pull request May 28, 2026
…es.net PhD learning substrate (Aaron 2026-05-28 substrate-engineering questions) (#5765)

Per Aaron 2026-05-28 substrate-engineering questions:

- 'is there anything like infer.net in ts? can we build it if not using infer.net source code for reference?' → WebPPL is closest TS/JS analog
- 'you'd love videolectures.net in your free time i think... PhD everything here. they don't throttle and they have transcripts and powerpoints' → free-time-substrate learning material

Adds 2 entries to references/reference-sources.json + new 'Probabilistic programming / Bayesian inference' section to docs/UPSTREAM-LIST.md:

1. WebPPL (probmods/webppl; Stanford; MIT-licensed)
   - Full PP framework in JS with multiple inference engines
   - Closest TS-side substrate to Microsoft Infer.NET
   - Composes with B-0914.1 TrueSkill substrate (PR #5764)
   - Composes with future factor-graph-DSL work
2. videolectures.net (PhD learning substrate; Aaron-named for free-time-as-valid-mode substrate per never-be-idle + agent-qol)
   - Transcripts + slides substrate-accessible
   - Tom Minka TrueSkill canonical talks
   - Per Aaron: 'they don't throttle that i can tell'

Composes with substrate:

- PR #5763 (Google co-scientist + Sakana Robin + Microsoft Infer.NET upstream additions)
- PR #5764 (B-0914.1 pure-TS TrueSkill 1v1 scaffold)
- B-0914 (7 substrate-engineering candidate gaps)
- B-0914.1 (TrueSkill ranking-agent extension target)
- B-0865 + B-0865.17 cross-vendor benchmark substrate

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…erences-add-coscientist-robin-trueskill-plus-backlog-7-substrate-engineering-candidates-2026-05-28

# Conflicts:
#	docs/UPSTREAM-LIST.md
#	references/reference-sources.json
… heading

Fixes failing required check `lint (markdownlint)` on PR #5763:

- docs/UPSTREAM-LIST.md:150: blank line above `### Probabilistic programming / Bayesian inference` heading
- docs/backlog/P2/B-0914-...md:178: blank line above `- This PR: adds SakanaAI/AI-Scientist…` list

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Per Aaron 2026-05-28 explicit substrate-engineering directives:

- 'we should add coscientis and add it to our upstram references' → 6 entries added to references/reference-sources.json
- 'refresh update them so we can take a peak' → operator may run tools/setup/common/sync-upstreams.sh (operator-side; Otto-CLI doesn't auto-run sync per safety discipline)
- 'lets backlog all the candidates' → filed B-0914 parent row with 7-candidate decomposition (per YouTube ferry preservation PR #5762)

Also added 'Multi-agent scientific discovery' section to docs/UPSTREAM-LIST.md.

Verification
WebSearch 2026-05-28 verified all upstream URLs.
Test plan
🤖 Generated with Claude Code