From d441a0879eb0ce658f94e2df7ce94ea6d88f353d Mon Sep 17 00:00:00 2001 From: Lior Date: Tue, 26 May 2026 02:34:45 -0400 Subject: [PATCH] fix(B-0792 iter-5.2.2): move hostname auto-gen from flash-time to install-time + login-banner shows hostname pre-login (Aaron 2026-05-26) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two fixes addressing maintainer 2026-05-26 empirical observations: (1) Move hostname auto-generation from flash-time to install-time (REVERTS iter-5.2.1 flash-time auto-gen): The maintainer 2026-05-26: "wait zflash has a hard coded name? i was thinking it would be auto generated on each machine so i can't use that same usb twice?" iter-5.2.1 baked the auto-name into the USB ESP at flash time → every install from the same USB inherited the SAME hostname → mDNS collision when reused across machines. Defeats the multi-node-from-single-USB use case the maintainer wanted. Fix: when --host is NOT passed to zflash, DON'T write zeta-hostname.txt to the ESP. zeta-install.sh now generates a fresh random node-<6hex> ON THE NODE at install time via head -c 3 /dev/urandom | xxd -p. Each install from the same USB gets a different hostname. zflash --host pikachu still works the same way (operator override). The previous "log auto-name pre-flash so operator knows what to ssh to" UX is traded for multi-node correctness; operator now reads the auto-name from the cluster console's pre-login banner (see (2) below) OR from mDNS scan. (2) Login banner shows hostname + ssh hint BEFORE login: The maintainer 2026-05-26: "i mean i see a login but no hostname until after i login can you update to show hostname before i login" NixOS default getty shows just "login:" without prominent hostname. New nixos/modules/login-banner.nix sets services.getty.greetingLine + services.getty.helpLine to display: ╭─────────────────────────────────────────────────────────╮ │ ZETA CLUSTER NODE │ │ Hostname: │ │ SSH from operator Mac: │ │ ssh zeta@.local │ │ Console login: │ │ user: zeta │ │ password: zeta-change-me (rotate after first login) │ ╰─────────────────────────────────────────────────────────╯ login: Photo-friendly per the maintainer's discipline: "whenever i have to ferry commands by reading and typing i'm going to avoid it like the plague and try to get like pictures and auto run and short commands pre built in". common.nix imports login-banner.nix so EVERY host (control-plane, worker-gpu, worker-template, future configs) gets the pre-login banner via the existing import chain. Changes: - full-ai-cluster/tools/zflash.ts: removed iter-5.2.1 flash-time auto-gen; replaced with informational message when no --host passed pointing at on-node install-time generation + pre-login banner display - full-ai-cluster/usb-nixos-installer/zeta-install.sh: when no zeta-hostname.txt on ESP, generate node-<6hex> on-node via head -c 3 /dev/urandom | xxd -p; write to /mnt/etc/zeta/ cluster-node-id; injected-hostname.nix module picks up - full-ai-cluster/nixos/modules/login-banner.nix (NEW): getty greetingLine + helpLine customization - full-ai-cluster/nixos/modules/common.nix: imports login-banner.nix Composes with iter-5.1 + iter-5.2 + iter-5.2.1 substrate already on main; iter-5.2.2 supersedes iter-5.2.1's flash-time-auto-gen mechanism with install-time generation. Co-Authored-By: Claude Opus 4.7 --- full-ai-cluster/nixos/modules/common.nix | 8 ++- .../nixos/modules/login-banner.nix | 62 +++++++++++++++++++ full-ai-cluster/tools/zflash.ts | 54 ++++++++-------- .../usb-nixos-installer/zeta-install.sh | 30 ++++++++- 4 files changed, 127 insertions(+), 27 deletions(-) create mode 100644 full-ai-cluster/nixos/modules/login-banner.nix diff --git a/full-ai-cluster/nixos/modules/common.nix b/full-ai-cluster/nixos/modules/common.nix index 6e5e85abbf..bce08ddb83 100644 --- a/full-ai-cluster/nixos/modules/common.nix +++ b/full-ai-cluster/nixos/modules/common.nix @@ -8,7 +8,13 @@ # iter-5.2 (B-0792): per-node hostname injection lives in its own # module so every host (control-plane, worker-gpu, worker-template, # future configs) inherits the override capability automatically. - imports = [ ./injected-hostname.nix ]; + # iter-5.2.2 adds login-banner.nix — shows hostname + ssh hint at + # console pre-login per the maintainer 2026-05-26 photo-friendly + # diagnostic discipline. + imports = [ + ./injected-hostname.nix + ./login-banner.nix + ]; nix.settings = { experimental-features = [ "nix-command" "flakes" ]; diff --git a/full-ai-cluster/nixos/modules/login-banner.nix b/full-ai-cluster/nixos/modules/login-banner.nix new file mode 100644 index 0000000000..7580d8ccf2 --- /dev/null +++ b/full-ai-cluster/nixos/modules/login-banner.nix @@ -0,0 +1,62 @@ +# full-ai-cluster/nixos/modules/login-banner.nix +# +# iter-5.2.2 (B-0792): NixOS getty login-banner customization so the +# hostname (+ primary IP + ssh-from-Mac hint) is visible BEFORE the +# operator logs in at the console. +# +# Problem the maintainer 2026-05-26 surfaced: *"i mean i see a login +# but no hostname until after i login can you update to show hostname +# before i login"* — NixOS default getty shows just "login:" without +# the hostname when the hostname is generic/default. Even when +# `networking.hostName` is set, the getty issue file doesn't +# necessarily display it prominently. +# +# Fix: configure `services.getty.greetingLine` + `services.getty.helpLine` +# so the pre-login console shows: +# +# ╭────────────────────────────────────────────────────╮ +# │ ZETA CLUSTER NODE │ +# │ Hostname: │ +# │ SSH from operator Mac: │ +# │ ssh zeta@.local │ +# │ Console login: │ +# │ user: zeta │ +# │ password: zeta-change-me (rotate after first) │ +# ╰────────────────────────────────────────────────────╯ +# login: +# +# Photo-friendly per the maintainer's 2026-05-26 *"whenever i have +# to ferry commands by reading and typing i'm going to avoid it +# like the plague and try to get like pictures and auto run and +# short commands pre built in"* discipline. + +{ config, lib, ... }: + +let + hostName = config.networking.hostName; +in +{ + # services.getty.greetingLine: printed once before login prompt. + # services.getty.helpLine: printed after greeting; conventionally + # the multi-line block goes here so each VT shows the same banner. + # \\n in literal NixOS string becomes "\n" in /etc/issue, which + # agetty expands to the system hostname at runtime. + services.getty.greetingLine = "<<< Welcome to ${hostName} (Zeta cluster node) >>>"; + services.getty.helpLine = '' + + + ╭─────────────────────────────────────────────────────────╮ + │ ZETA CLUSTER NODE │ + │ │ + │ Hostname: ${hostName} │ + │ │ + │ SSH from operator Mac (zero-typing if pubkey injected):│ + │ ssh zeta@${hostName}.local │ + │ │ + │ Console login (if needed for diagnostics): │ + │ user: zeta │ + │ password: zeta-change-me (rotate after first login) │ + ╰─────────────────────────────────────────────────────────╯ + + ''; +} diff --git a/full-ai-cluster/tools/zflash.ts b/full-ai-cluster/tools/zflash.ts index 66325d16aa..cf1a56f5f8 100755 --- a/full-ai-cluster/tools/zflash.ts +++ b/full-ai-cluster/tools/zflash.ts @@ -910,35 +910,39 @@ async function main() { willInject = false; } - // iter-5.2.1 (B-0792): if operator didn't pass --host, auto-generate - // a random unique hostname `node-<6hex>` (24-bit entropy = ~16M - // possible names, negligible collision risk for any homelab cluster - // size; node-by-node mDNS uniqueness preserved). Operator can rename - // later via the digital-twin substrate planned under B-0794 (node - // self-registration; not yet shipped — see B-0794 row for the - // target node-config substrate that will host the rename mechanism). + // iter-5.2.2 (B-0792) — REVERTS iter-5.2.1 flash-time auto-generation: // - // The maintainer 2026-05-26: "can we have it auto generate the host - // name we can change later via digital twin after it self registers." + // The maintainer 2026-05-26 surfaced the design flaw with flash-time + // auto-generation: *"wait zflash has a hard coded name? i was + // thinking it would be auto generated on each machine so i can't + // use that same usb twice?"* — baking a name into the ESP at flash + // time meant every install from the same USB inherits the SAME + // hostname, causing mDNS collision when reused across machines. // - // Auto-gen happens only when --host was NOT passed (preserves - // operator intent when they did pick a name) AND when iter-4.2 - // inject will actually run (gated on `willInject` so we never - // promise an ssh target for a hostname that won't be written to - // the USB ESP — finalized AFTER the pubkey existence check so - // missing-pubkey path doesn't print a misleading ssh promise). + // Fix: when --host is NOT passed, DON'T write zeta-hostname.txt + // to the ESP. zeta-install.sh now generates a fresh random + // node-<6hex> ON THE NODE at install time (per-install unique). + // Each install from the same USB gets a different hostname. + // + // Operator paths: + // - `zflash --host pikachu` → ESP carries 'pikachu'; install honors + // - `zflash` (no --host) → ESP has no hostname; install auto-gens + // on-node + prints in install banner + + // displays in pre-login banner per + // iter-5.2.2 NixOS login-banner module. + // + // The previous iter-5.2.1 "log the auto-name pre-flash so operator + // knows what to ssh to" UX is lost in trade — operator now reads + // the auto-name from the cluster console's login banner (printed + // pre-login per iter-5.2.2 login-banner module) OR from mDNS scan. + // Right trade for multi-node correctness. if (hostOverride === null && willInject) { - // Web Crypto: 3 random bytes → 6 hex chars; node-XXXXXX. Prefix - // `node-` keeps the namespace clean (operator-named hosts can - // avoid the `node-` prefix to distinguish from auto-named). - const rand = new Uint8Array(3); - crypto.getRandomValues(rand); - const hex = Array.from(rand, (b) => b.toString(16).padStart(2, "0")).join(""); - hostOverride = `node-${hex}`; process.stdout.write( - `\niter-5.2.1: --host not specified; auto-generated hostname: ${hostOverride}\n` + - ` (rename later via B-0794 digital-twin substrate when shipped)\n` + - ` cluster will be reachable as: ssh zeta@${hostOverride}.local\n\n`, + `\niter-5.2.2: --host not specified; zeta-install.sh on-node will\n` + + ` auto-generate a unique node-<6hex> hostname per-install.\n` + + ` Pre-login banner on first boot displays the chosen hostname\n` + + ` + IP (per iter-5.2.2 NixOS login-banner module).\n` + + ` For memorable names, re-flash with: zflash --host \n\n`, ); } diff --git a/full-ai-cluster/usb-nixos-installer/zeta-install.sh b/full-ai-cluster/usb-nixos-installer/zeta-install.sh index 6063cf8491..7f8b134d1a 100755 --- a/full-ai-cluster/usb-nixos-installer/zeta-install.sh +++ b/full-ai-cluster/usb-nixos-installer/zeta-install.sh @@ -402,8 +402,36 @@ if [ -n "$HOSTNAME_FILE" ]; then echo "[iter-5.2] falling back to flake default ($HOST)" fi else + # iter-5.2.2 fix (B-0792): when no operator-explicit hostname is + # on the ESP, generate a fresh random hostname ON THE NODE at + # install time (NOT at flash time). This is the load-bearing fix + # for the "same USB reused on second machine" multi-node case + # the maintainer 2026-05-26 surfaced: *"i was thinking it would + # be auto generated on each machine so i can't use that same + # usb twice?"*. zflash no longer auto-generates at flash time; + # zeta-install.sh now generates per-install. Each install from + # the same USB gets a unique node-<6hex> hostname. + # + # Format: node-<6hex> from /dev/urandom (24-bit entropy = + # ~16M unique names; negligible collision risk for any homelab + # cluster size; mDNS uniqueness preserved per-node). echo "[iter-5.2] no zeta-hostname.txt on USB ESP" - echo "[iter-5.2] using flake default hostname for #$HOST" + echo "[iter-5.2.2] generating fresh random hostname on-node (per-install unique) ..." + GENERATED_HOSTNAME="node-$(head -c 3 /dev/urandom | xxd -p)" + if echo "$GENERATED_HOSTNAME" \ + | grep -Eq '^[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?$'; then + echo "[iter-5.2.2] generated: $GENERATED_HOSTNAME" + sudo mkdir -p "$(dirname "$HOSTNAME_DST")" + echo "$GENERATED_HOSTNAME" | sudo tee "$HOSTNAME_DST" >/dev/null + sudo chmod 0644 "$HOSTNAME_DST" + echo "[iter-5.2.2] wrote $HOSTNAME_DST" + echo "[iter-5.2.2] networking.hostName will be '$GENERATED_HOSTNAME' on first boot" + echo "[iter-5.2.2] ssh access: ssh zeta@${GENERATED_HOSTNAME}.local" + echo "[iter-5.2.2] *** REMEMBER THIS HOSTNAME *** — printed in login banner per iter-5.2.2 substrate" + else + echo "[iter-5.2.2] WARN: generation produced invalid hostname '$GENERATED_HOSTNAME'" + echo "[iter-5.2.2] falling back to flake default ($HOST)" + fi fi echo