backlog: closure-table hardening for fast-git (pluggable hierarchical index)#396
Merged
8638985 to
9c868d0
Compare
Pull request overview
Adds a new P2 research-grade BACKLOG entry to scope and track Phase 0 research for hardening Zeta’s hierarchy/closure-table approach into a pluggable hierarchical index suitable for filesystem-scale Git workloads (as part of the broader native F# git effort).
Changes:
- Adds a P2 BACKLOG row capturing the 2026-04-24 maintainer directive verbatim.
- Documents Phase 0 research scope (survey + interface sketch + baseline benchmark) and how it composes with related initiatives (#395, #394).
Comments suppressed due to low confidence (2)
docs/BACKLOG.md:5625
`cgit` is a web UI for Git repositories, not a Git implementation/library to benchmark native Git performance against. Consider replacing this comparison with something like `libgit2` / JGit / core `git` CLI (or clarify what “compete” means here).
> on top of our distributed db. we can stick a ui in
> front of that too lol. Also you need to do a lot of
> research here cause some nodes will try to call you a
docs/BACKLOG.md:5638
- The claim that filesystem trees can be “100k+ files deep” is likely incorrect/misleading (depth is typically constrained by path length / OS limits). Suggest rewording to something like “100k+ nodes (files+dirs) total, very wide; depth varies but is usually far smaller” so the research scope stays accurate.
> them in streams and maybe further. backlog"*
**Two load-bearing motivations:**
1. **Aurora preparation** — Zeta's own blockchain-ish
… index)
Maintainer 2026-04-24 directive — closure-table substrate
needs hardening to support filesystem-class workloads
(deep + wide trees, 100k+ files) for the native F# git
implementation. Make the index pluggable so a faster
substrate can swap in if profiling shows it's the
bottleneck. Maintainer hasn't looked at space/time
tradeoffs; backlog research.
Phase 0 research scope captured in the row:
- State-of-the-art survey: nested-set, materialized-path,
closure-table, Postgres ltree, B-tree-prefix-index,
radix-trie, Verkle/Merkle Patricia.
- Substrates worth interface-compatibility: B-trees
(ZFS/btrfs scale), Patricia/HAMT/CRDT-tree, Dolt /
TerminusDB existing precedents.
- Define IHierarchicalIndex contract.
- Empirical baseline benchmark on representative repo.
Composes with native F# git impl (#395 cluster as primary
consumer), Mode 2 protocol upgrade, Ouroboros bootstrap
meta-thesis (index correctness IS part of the closure
proof), blockchain-ingest (#394 — block hierarchy may
share the same abstraction).
Otto-275 log-don't-implement: row captures research scope,
does NOT authorize implementation start.
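To make the Phase 0 scope above concrete, here is a minimal Python sketch of what a pluggable hierarchical index with a closure-table backend could look like. It is illustrative only: the `HierarchicalIndex` protocol, its method names, `ClosureTableIndex`, and the `chain` / `star` helpers are assumptions made for this example, not the `IHierarchicalIndex` contract, which the row leaves to be defined. The sketch also counts rows, because a closure table stores one row per (ancestor, descendant) pair, self-pairs included, so its size depends on depth as well as node count.

```python
from __future__ import annotations

from typing import Dict, Optional, Protocol, Set


class HierarchicalIndex(Protocol):
    """Hypothetical contract; names are illustrative, not Zeta's API."""

    def add(self, node: str, parent: Optional[str]) -> None: ...
    def ancestors(self, node: str) -> Set[str]: ...
    def descendants(self, node: str) -> Set[str]: ...


class ClosureTableIndex:
    """In-memory closure table: one row per (ancestor, descendant) pair."""

    def __init__(self) -> None:
        self._up: Dict[str, Set[str]] = {}    # node -> ancestors, incl. itself
        self._down: Dict[str, Set[str]] = {}  # node -> descendants, incl. itself

    def add(self, node: str, parent: Optional[str]) -> None:
        if node in self._up:
            raise ValueError(f"duplicate node: {node}")
        if parent is not None and parent not in self._up:
            raise KeyError(f"unknown parent: {parent}")
        # The new node inherits every ancestor of its parent, plus itself.
        chain_up = {node} | (self._up[parent] if parent is not None else set())
        self._up[node] = chain_up
        self._down[node] = set()
        for ancestor in chain_up:
            self._down[ancestor].add(node)

    def ancestors(self, node: str) -> Set[str]:
        return self._up[node] - {node}

    def descendants(self, node: str) -> Set[str]:
        return self._down[node] - {node}

    def row_count(self) -> int:
        # Each node contributes depth + 1 rows (its ancestors plus itself).
        return sum(len(rows) for rows in self._up.values())


def chain(n: int) -> ClosureTableIndex:
    """A maximally deep tree: n nodes, each the child of the previous one."""
    index = ClosureTableIndex()
    for i in range(n):
        index.add(f"n{i}", f"n{i - 1}" if i else None)
    return index


def star(n: int) -> ClosureTableIndex:
    """A maximally wide tree: one root with n - 1 direct children."""
    index = ClosureTableIndex()
    index.add("root", None)
    for i in range(n - 1):
        index.add(f"leaf{i}", "root")
    return index
```

For the same node count the two shapes differ sharply: a chain of n nodes needs n(n+1)/2 rows, while a star of n nodes needs 2n - 1. With n = 1000 that is 500500 rows against 1999, which is one reason a baseline benchmark on a representative repository, rather than a synthetic deep tree, matters for the space/time question.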
9c868d0 to
a4d7c32
Compare
AceHack added a commit that referenced this pull request Apr 25, 2026
…indexes
Maintainer 2026-04-24 directive — every first-class interface on
Zeta's substrate (git, SQL, operator algebra, LINQ, future
GraphQL / blockchain query / WASM-RPC) must compose with every
other interface. Mixed-DSL queries must:
(1) parse + bind through unified type system
(2) plan through cost-based optimizer (full mixed AST)
(3) hit indexes for each constituent DSL
(4) preserve retraction semantics end-to-end
Architectural primitive captured: this is a direct application
of the 2026-04-22 semiring-parameterized Zeta substrate research
("one algebra to map the others"). With operator algebra
parameterized by a semiring, every other DSL's semantics maps
into the same one algebra by semiring-swap, and cross-DSL
composability falls out for free.
Phased: Phase 0 design proposal → pairwise adapters → unified
planner/binder → index-utilization audit → retraction-preservation
proof.
Composes with closure-table hardening (#396 — the hierarchical
index this layer hits), native F# git impl (#395), Ouroboros
bootstrap meta-thesis (cross-DSL composability IS an Ouroboros
closure), semiring-parameterized substrate, blockchain ingest
(#394 — chain queries compose via same substrate).
Otto-275 log-don't-implement: research scope captured; does NOT
authorize implementation start.
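The semiring-swap idea in the commit message above can be illustrated with a small, generic Python sketch. Nothing here is Zeta's API: `Semiring`, `from_rows`, `union`, and `join` are hypothetical names, and relations are plain dicts from row to weight. With integer weights a relation behaves like a Z-set, where a weight of -1 retracts a row; swapping in the boolean semiring gives ordinary set semantics from the same operators.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Tuple

Row = Tuple[Any, ...]
Relation = Dict[Row, Any]  # row -> weight drawn from the chosen semiring


@dataclass(frozen=True)
class Semiring:
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# Integer weights: Z-set style, negative weights act as retractions.
INTEGERS = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
# Boolean weights: plain set membership.
BOOLEANS = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)


def from_rows(sr: Semiring, rows: Iterable[Row]) -> Relation:
    """Lift plain rows into a weighted relation, one unit of weight each."""
    out: Relation = {}
    for row in rows:
        out[row] = sr.add(out.get(row, sr.zero), sr.one)
    return out


def union(sr: Semiring, left: Relation, right: Relation) -> Relation:
    """Add weights row by row; rows whose weight becomes zero disappear."""
    out = dict(left)
    for row, weight in right.items():
        out[row] = sr.add(out.get(row, sr.zero), weight)
    return {row: w for row, w in out.items() if w != sr.zero}


def join(sr: Semiring, left: Relation, right: Relation) -> Relation:
    """Join (key, a) with (key, b) on key; weights multiply."""
    out: Relation = {}
    for (k1, a), w1 in left.items():
        for (k2, b), w2 in right.items():
            if k1 == k2:
                row = (k1, a, b)
                out[row] = sr.add(out.get(row, sr.zero), sr.mul(w1, w2))
    return {row: w for row, w in out.items() if w != sr.zero}
```

Because `join` distributes over `union`, joining a retraction delta such as `{("d1", "a.txt"): -1}` against an unchanged relation yields a weight of -1 on the joined row, and adding that to the earlier join result cancels it. That is a toy version of requirement (4), retraction semantics preserved end-to-end, under the assumption that every operator in the plan is linear in this sense.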
AceHack added a commit that referenced this pull request Apr 25, 2026
…indexes (#397)
* backlog: cross-DSL composability — git/SQL/operator-algebra/LINQ hit indexes
Maintainer 2026-04-24 directive — every first-class interface on
Zeta's substrate (git, SQL, operator algebra, LINQ, future
GraphQL / blockchain query / WASM-RPC) must compose with every
other interface. Mixed-DSL queries must:
(1) parse + bind through unified type system
(2) plan through cost-based optimizer (full mixed AST)
(3) hit indexes for each constituent DSL
(4) preserve retraction semantics end-to-end
Architectural primitive captured: this is a direct application
of the 2026-04-22 semiring-parameterized Zeta substrate research
("one algebra to map the others"). With operator algebra
parameterized by a semiring, every other DSL's semantics maps
into the same one algebra by semiring-swap, and cross-DSL
composability falls out for free.
Phased: Phase 0 design proposal → pairwise adapters → unified
planner/binder → index-utilization audit → retraction-preservation
proof.
Composes with closure-table hardening (#396 — the hierarchical
index this layer hits), native F# git impl (#395), Ouroboros
bootstrap meta-thesis (cross-DSL composability IS an Ouroboros
closure), semiring-parameterized substrate, blockchain ingest
(#394 — chain queries compose via same substrate).
Otto-275 log-don't-implement: research scope captured; does NOT
authorize implementation start.
* drain(#397): fix 5 Copilot threads on cross-DSL composability row
P0/P1/P1/P1/P2 from late Copilot re-review on the freshly-opened
PR. All five fixes land as in-place edits to the new BACKLOG row
(the row itself was added by this PR, so this is not an
append-only-file violation).
- title: rewrap so `operator-algebra` stays contiguous (P1).
- body: rewrap `closure-table-hardening` contiguous (P1).
- body: rewrap inline-code `query-optimizer-expert` contiguous
  (P0 — inline-code split breaks rendering and grep).
- composes-with: closure-table dependency pointer made concrete —
  names `src/Core/Hierarchy.fs` and the "Closure-table over DBSP"
  research row under `## Research projects` instead of a
  non-existent "same section" hardening row (P2).
- semiring memory pointer: add `memory/` prefix to match the
  convention used at the existing semiring rows (P1).
Drain log at `docs/pr-preservation/397-drain-log.md` per Otto-250.
This was referenced Apr 25, 2026
Merged
AceHack added a commit that referenced this pull request Apr 25, 2026
…le + safe-ROM substrate (#400)
* artifact-c: tools/alignment/audit_archive_headers.sh — archive-header lint v0 (detect-only)
Amara's 5th-ferry Artifact C landing (PR #235 absorb).
Detect-only lint for the four archive-header fields proposed in
§33 (PR #235 exemplar; not yet governance-landed):
- Scope:
- Attribution:
- Operational status:
- Non-fusion disclaimer:
Defaults to checking docs/aurora/*.md; --path DIR overrides.
--enforce flips exit 2 on any gap; CI does not currently call it
(Aminata Otto-80 pass classified §33 as
IMPORTANT-pending-Aaron-signoff +
lint-required-to-prevent-3-5-round-decay).
First-run baseline: 2/2 existing aurora absorbs missing all four
headers (predate the proposal). Detect-only first prevents CI
block on baseline; enforcement flips when Aaron signs off on §33
+ baseline is green (either backfill the 2 absorbs or explicit
grandfather clause in §33).
v0 limitations documented in script:
- Partial-header adversary (label anywhere in first 20 lines
  passes; no syntactic check).
- Fake-header adversary (values not content-audited).
- In-memory-import adversary (memory/ not covered; different
  surface).
Harden in follow-up after §33 lands.
Bash 3.2 compatible (while-read loop, not mapfile) for macOS
default shell. Same --json / --out DIR / exit code shape as
existing audit_commit.sh / audit_personas.sh / audit_skills.sh.
FACTORY-HYGIENE row #60 added:
- Detect-only cadence landed.
- Enforcement deferred until Aaron §33 signoff + baseline green.
- Same detect-only → triage → enforce pattern as rows #51
  (cross-platform parity) and #55 (machine-specific scrubber).
tools/alignment/README.md table updated with new row.
Composes with:
- Aminata threat-model pass (PR #241; names the decay risk this
  lint prevents).
- Amara's 5th-ferry absorb (PR #235; exemplar self-applies the
  format).
- Memory-index hygiene trio (rows #58 / #59 + this row's
  archive-header hygiene trio).
Otto-81 tick deliverable.
* drain(#243): seven Copilot/Codex threads — recursive scan + name-attribution + exit-code alignment
- Switch audit_archive_headers.sh from -maxdepth 1 to recursive
  find matching documented `docs/aurora/**/*.md` scope; exclude
  `references/` as bibliographic substrate.
- Encode subdirectory in --out per-file JSON basename to avoid
  collisions under recursive scan.
- Replace 'Aaron' with 'human-maintainer' role ref in script and
  FACTORY-HYGIENE row 60 (FACTORY-DISCIPLINE name-attribution
  rule).
- Drop persona names (Aminata, Amara) from script comments and
  row 60 in favour of role references (threat-model reviewer,
  absorbing agent), per Otto-220 code-comments-explain-code rule.
- Realign exit codes to sibling audit_*.sh convention: 1 =
  content-level signal under --enforce; 2 = script error /
  missing dependency / bad arg. Update header doc-block + row 60
  wording to match.
- Remove dead cross-reference to non-existent
  `docs/aurora/2026-04-23-amara-zeta-ksk-aurora-validation-5th-ferry.md`
  in row 60. Verified the aminata-threat-model-5th-ferry citation
  does exist on origin/main; kept that one.
- Append docs/pr-preservation/243-drain-log.md per Otto-250.
Smoke-tested: clean run exit 0 (16 files scanned), --enforce
exit 1, bad --path exit 2, --json exit 0, --out has no basename
collisions.
* drain(#243): quote target_path inside parameter expansion (SC2295)
Local shellcheck install only flagged this on the lint runner
with --severity=style. Quote $target_path inside the ${file#...}
prefix-strip so the prefix is not interpreted as a glob pattern.
* drain: PR #243 round 2 — address 6 late-review threads
Round 2 drain after round 1 closed all 7 threads. Copilot
re-reviewed and opened 6 new P2 suggestion-shape threads; all 6
are FIX outcomes:
- r2-1 (line 128): normalise --path to strip trailing slash so
  `docs/aurora/` matches the references/ exclusion.
- r2-2 (line 172): make --out filename encoding injective by
  percent-encoding literal `_` to `_5F` before the `/` -> `__`
  swap. Was non-injective: `a/b__c.md` and `a__b/c.md` both
  became `a__b__c.json`.
- r2-3 (line 26): fix stale Usage wording — `--enforce` exits 1
  on gap (matches the dedicated Exit-codes section and round-1
  Thread-7 realignment).
- r2-4 (line 61): correct factual error about memory surface —
  in-repo `memory/` is canonical per GOVERNANCE.md §18 and
  `memory/README.md`; per-user path is staging.
- r2-5 (line 128): force C-locale sort with `LC_ALL=C` for
  deterministic byte-order output regardless of caller env.
- r2-6 (line 7): drop persona name "Amara" from header banner in
  favour of role/artifact references ("5th-ferry Artifact C" /
  "the 5th-ferry external-research absorb"). Round 1 caught
  "Aaron" but missed "Amara".
Append-only drain-log update per Otto-229: prior round-1 sections
untouched; new "Drain pass: 2026-04-24 (round 2 — 6 threads)"
section appended.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* backlog+memory+roms: emulators on OS-interface + rewindable/retractable controls + safe-ROM substrate
Maintainer 2026-04-24 directive — emulators are the canonical
proof-out workload for the OS-interface (#399). Two related
directives captured:
(1) "emulators should run very nicely on this, let me know when
    you want some roms of any kind that are safe."
(2) "rewindable/retractable os/emulator controls"
Plus: maintainer requested a `roms/` folder with a
gitignored-except-sentinels pattern (same as `drop/`) so binaries
never enter git history but the directory exists on every clone.
Why emulators compose perfectly with the OS-interface:
- Emulator event loop = durable-async runtime workload
- Save states FREE (every yield-point = checkpoint)
- Cross-node migration FREE (state follows the function)
- Multiplayer FREE (shared durable substrate)
- DST guarantees speedrun/TAS bit-equal replay
Rewindable/retractable controls — the killer generalization:
- Z-set retraction-native semantics extend UP to OS surface
- "Rewind 5 seconds" is a first-class OS primitive
- rr / Pernosco architectural class, generalized
- Otto-238 trust-vector: rewindable controls grant agency
Activates 2026-04-22 ARC-3 adversarial-self-play
absorption-scoring research (level-creator / adversary / player
loop on durable-async + rewindable substrate).
Phased: Phase 0 research (Game Boy / NES / SNES / Genesis;
libretro; rr/Pernosco) → Phase 1 single emulator on durable-async
→ Phase 2 rewindable controls promoted to OS primitive → Phase 3
ARC-3 loop → Phase 4 cross-emulator composition.
Safe-ROM offer captured durably; ask gated on Phase 1 landing
first. Allowed classes enumerated in roms/README.md
(public-domain / homebrew / official test suites /
commercially-released-as-free / explicit-license).
Otto-275 log-don't-implement applies.
Composes with #399 OS-interface, Otto-73/238/272, Z-set
retraction-native, #396/#397 closure-table+cross-DSL,
request-play skill.
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
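The r2-2 fix in the round 2 drain above (escape literal `_` as `_5F` before swapping `/` for `__`) can be checked with a short Python sketch. The audited script itself is Bash; this is only an illustration of the encoding described in the commit message, and the function names `naive_encode`, `encode`, and `decode` are made up for the example. It leaves file extensions alone rather than rewriting `.md` to `.json`.

```python
def naive_encode(rel_path: str) -> str:
    """Pre-fix behaviour as described: only the '/' -> '__' swap."""
    return rel_path.replace("/", "__")


def encode(rel_path: str) -> str:
    """Escape literal underscores first, then flatten path separators."""
    return rel_path.replace("_", "_5F").replace("/", "__")


def decode(name: str) -> str:
    """Invert encode(): every '_' starts either '__' ('/') or '_5F' ('_')."""
    out = []
    i = 0
    while i < len(name):
        if name.startswith("__", i):
            out.append("/")
            i += 2
        elif name.startswith("_5F", i):
            out.append("_")
            i += 3
        else:
            out.append(name[i])
            i += 1
    return "".join(out)


# The two paths named in the commit message collide without the escape step
# and stay distinct with it.
collides = naive_encode("a/b__c.md") == naive_encode("a__b/c.md")
distinct = encode("a/b__c.md") != encode("a__b/c.md")
```

The encoding is injective because a decoder exists: after the escape step every underscore in the output begins either `_5F` or `__`, so a left-to-right scan recovers the original path without ambiguity.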
Summary
Maintainer 2026-04-24 directive — closure-table needs hardening for filesystem-class workloads to support the native F# git implementation (#395 cluster). Make pluggable so faster substrate can swap in if profiling shows bottleneck.
Phase 0 research scope captured
- `IHierarchicalIndex` contract definition.
Composes with
Test plan
🤖 Generated with Claude Code