backlog(B-0760): USB as repair tool for any node — identity preservation + no-disruption-at-3+-nodes invariant#5038

Closed
AceHack wants to merge 2 commits into main from otto-cli/b0760-usb-as-repair-tool-2026-05-25


@AceHack AceHack commented May 25, 2026

Aaron 2026-05-25 mid-B-0754-v1 testing prep, naming the USB-as-repair-tool design intent + the 3-node-zero-disruption invariant from desired-state/declarative/git-native/AI-native principles. Composes with B-0754 / B-0755 / B-0756 / B-0757 / B-0758.

…ion + no-disruption-at-3+-nodes invariant

Aaron 2026-05-25 mid-iteration-1 prep: 'think of this usb as the
easiest repair tool for any node in the cluster if it fails we
should harden it like that since we are desired state / declarative
/ git native / ai native a full rebuild of a node should not stop
normal operations once we get to 3 nodes.'

Captures the repair-tool design: plug USB into failed node, mDNS-
detects existing cluster, MAC-address-detects prior node identity,
inherits hostname + role from cluster's known-nodes registry,
reinstalls + rejoins as SAME identity. At 3+ nodes (B-0756 HA
quorum), single-node rebuild is zero-disruption to cluster ops.
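
The identity-inheritance step described above can be sketched as a lookup from MAC address to a registry entry. This is only an illustration under stated assumptions: the `NodeIdentity` type, the `resolve_identity` function, and the shape of the known-nodes registry (a mapping from normalized MAC to hostname and role) are hypothetical and do not come from the backlog row itself.

```python
from dataclasses import dataclass
from typing import Iterable, Mapping, Optional


@dataclass(frozen=True)
class NodeIdentity:
    """Hypothetical registry entry: what a rebuilt node would inherit."""
    hostname: str
    role: str


def normalize_mac(mac: str) -> str:
    """Normalize a MAC address to lowercase, colon-separated form."""
    cleaned = mac.lower().replace(":", "").replace("-", "").replace(".", "")
    if len(cleaned) != 12 or any(c not in "0123456789abcdef" for c in cleaned):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return ":".join(cleaned[i:i + 2] for i in range(0, 12, 2))


def resolve_identity(
    local_macs: Iterable[str],
    registry: Mapping[str, NodeIdentity],
) -> Optional[NodeIdentity]:
    """Return the prior identity if any local MAC is in the registry.

    Assumes registry keys are already normalized MACs. Returning None
    means the machine is unknown and would join as a new node instead.
    """
    for mac in local_macs:
        identity = registry.get(normalize_mac(mac))
        if identity is not None:
            return identity
    return None


# Example registry with one known node (made-up values).
registry = {"aa:bb:cc:dd:ee:01": NodeIdentity("node-1", "control-plane")}
known = resolve_identity(["AA-BB-CC-DD-EE-01"], registry)
unknown = resolve_identity(["aa:bb:cc:dd:ee:ff"], registry)
```

Here `known` carries the hostname and role the node would rejoin with, while `unknown` is `None`. A real implementation would also need to handle replaced network cards, which this sketch does not attempt.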

Composes with B-0754 (zero-typing first-boot — substrate this
extends), B-0756 (HA quorum prerequisite), B-0757 (mDNS auto-
discovery — the 'is there a cluster?' detection reused for 'is
this a known node?'), B-0755 (role inheritance must support all
role variants), B-0758 (orthogonal — repair-tool semantics apply
whether OS lives on disk OR USB).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 25, 2026 23:56
@AceHack AceHack enabled auto-merge (squash) May 25, 2026 23:56
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.


Copilot AI left a comment

Pull request overview

Adds a new P2 backlog row (B-0760) describing the “USB as repair tool” desired-state semantics for cluster node rebuilds, including identity preservation and a “zero disruption at 3+ nodes” invariant, and registers it in the generated backlog index.

Changes:

  • Adds new backlog row file B-0760 with problem statement, target flow, acceptance criteria, and related cross-references.
  • Updates docs/BACKLOG.md to include the new P2 entry.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| docs/backlog/P2/B-0760-usb-as-repair-tool-for-any-node-identity-preservation-across-rebuilds-no-disruption-at-3-plus-nodes-aaron-2026-05-25.md | New backlog row defining repair-USB semantics, acceptance criteria, and operational/security constraints. |
| docs/BACKLOG.md | Adds the B-0760 entry to the P2 index. |

Two findings:

1. Line 29: continuation "+ ZETA_AUTO_CONFIRM ..." parsed as markdown
   list item, breaking markdownlint. Rewrap as comma-separated list
   inside the parenthetical so no line starts with "+".

2. Lines 142-143: "even counts split-brain on partition" inaccurate
   for quorum-based Raft/etcd. Even clusters lose quorum on equal-
   split partitions (becoming unavailable) rather than forming
   divergent leaders. Rewrite to reflect the actual fault-tolerance
   tradeoff: 3 and 4 both survive 1 failure, so even counts add a
   node without improving tolerance.

Resolves lint (markdownlint) failure + both Copilot threads.

Co-Authored-By: Claude <noreply@anthropic.com>
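
The fault-tolerance tradeoff in finding 2 follows from majority-quorum arithmetic. The sketch below works it through for small clusters; the function names are illustrative only and are not taken from Raft, etcd, or the backlog row.

```python
def quorum_size(n: int) -> int:
    """Smallest strict majority of an n-member cluster."""
    if n < 1:
        raise ValueError("cluster must have at least one member")
    return n // 2 + 1


def failures_tolerated(n: int) -> int:
    """Members that can fail while a majority remains reachable."""
    return n - quorum_size(n)


def side_has_quorum(n: int, side: int) -> bool:
    """Whether a partition side with `side` members can still commit."""
    return side >= quorum_size(n)


# 3 and 4 members both tolerate exactly one failure.
tolerance = {n: failures_tolerated(n) for n in (3, 4, 5)}

# An equal 2/2 split of a 4-member cluster leaves neither side with
# quorum, so the cluster becomes unavailable rather than electing
# two divergent leaders.
even_split = (side_has_quorum(4, 2), side_has_quorum(4, 2))
```

With these definitions `tolerance` is `{3: 1, 4: 1, 5: 2}` and `even_split` is `(False, False)`, which matches the corrected wording: a fourth node adds cost without raising the number of failures survived.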
AceHack added a commit that referenced this pull request May 26, 2026
* docs(archive): preserve discussions for merged PRs

* docs(shadow): log CI failure drift from otto-cli in PR #5032

* docs(shadow): update log with repeated drift from otto-cli in PR #5038

* fix(PR-5041): metadata block uses proper bullet list, not bold-with-leading-hyphen

Per Copilot finding: `**- Date:**` renders a literal hyphen inside the bold
text, inconsistent with the dominant `docs/research/**` shadow-lesson-log
format. Switch to `- **Date:**` (proper Markdown bullet list).

* docs(shadow): fix markdown rendering in PR #5041 (Copilot review)

Two findings from copilot-pull-request-reviewer on PR #5041:

1. `## 2. Update: Repeated Drift Pattern` was misnumbered — appeared
   after `## 3. Lesson` and duplicated `## 2. Analysis`. Renumbered
   to `## 4.` to match document flow.
2. Three metadata lines used `**- Date:**` form, rendering a literal
   hyphen inside bold text rather than a proper list-item marker.
   Reformatted to `- **Date:**` matching the file's opening lines.

No content change; only markdown structural rendering.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Otto <otto@zeta.factory>
Co-authored-by: Otto <noreply@anthropic.com>

AceHack commented May 26, 2026

Closing as substrate-stale + ID-collision per .claude/rules/pr-triage-tiers.md Tier 3 + the agent-roster-reference-card ID-allocation discipline.

Discriminator pass per .claude/rules/fighting-past-self-vs-peer-agent-distinguisher...md:

Disposition: close. The substrate-honest re-land path is cherry-pick onto a fresh branch off current main, renumber the collided IDs to next-free (B-0806+ at filing time), and open a focused PR.

This close is NOT a substrate punt; it is an explicit ownership classification per the rule I just sharpened in #5126: the work is routed to "re-land via cherry-pick with renumbered IDs" rather than left in an indeterminate state with the operator paying the triage tax.

@AceHack AceHack closed this May 26, 2026
auto-merge was automatically disabled May 26, 2026 08:05

Pull request was closed
