202 changes: 202 additions & 0 deletions docs/BACKLOG.md
@@ -5599,6 +5599,208 @@ systems. This track claims the space.

## P2 — research-grade

- [ ] **Blockchain block ingestion — first-class BTC /
ETH / SOL streaming into Zeta's distributed database;
bi-directional protocol participation; cross-chain
stream bridge.** Maintainer 2026-04-24 directive
(verbatim):

> *"i would love to test our database by having first
> class support for bitcoin, eth, and solana blocks
> into our database in the order of priority unless you
> tell me there are other ones worth exploring for two
> reason, 1 to help us understand blockchain for Aurora
> we don't want to just jump in and we will be starting
> from scriatch so making sure we completely understand
> everysing thing about the blocks are important so we
> get ours right. can you make a post install script
> that will streaing ingest these block chains into our
> database and make them querable will all our entry
> points/intefaces backlog. this is not a full node
> implimentation or anyting yet that will come leter
> layed on top of our multinode database so we can have
> distributed node support from the start cause we are
> on top of our distributed db. we can stick a ui in
> front of that too lol. Also you need to do a lot of
> research here cause some nodes will try to call you a
> bad node if you don't hame some amount of the full
> protocol, they give extra tests exactly to try to
> stop this freeloader scenaro where you download but
> dont upload, you can look at their source code to
> figure it out. Also if you have to do full nodes of
> those types to be able to download we have to upload
> too go ahead and to that, i want those interfaces too
> just like our SQL interfaces and i also want deep
> integration into those networks so we can 'bridge'
> them in streams and maybe further. backlog"*

**Two load-bearing motivations:**
1. **Aurora preparation** — Zeta's own blockchain-ish
substrate (Aurora / Lucent-KSK lineage per the
existing memory cluster) wants concrete grounding
before we design the Aurora chain shape. Ingesting
real BTC / ETH / SOL blocks into our database gives
us deep understanding of the actual data model
before we specify ours.
2. **Database stress-test** — BTC / ETH / SOL are
three of the most battle-tested streaming workloads
on the planet (continuous append, chain
reorganizations, finality semantics, adversarial
environment). If Zeta's distributed DB can absorb
them live and serve queries through the existing
interfaces, that's a load-bearing proof of the
substrate.

**Priority order (maintainer-specified):** BTC → ETH →
SOL. Priority is authoritative; additional chains
(Cosmos Hub / Polkadot / Cardano / Avalanche / L2
rollups like Base / Optimism / Arbitrum) should be
evaluated in a later phase, not reordered.

**Phased plan (scope decomposition — each phase a
future dedicated PR or PR cluster; this row is the
umbrella):**

**Phase 0 — Research pass (no code; starts the work):**
- Read the actual client source for each chain:
`bitcoin/bitcoin` (C++), `ethereum/go-ethereum`
+ `paradigmxyz/reth` (Go + Rust), `solana-labs/solana`
(Rust). Map the block shape verbatim; capture field
semantics (timestamps / merkle roots / witness data /
slot-vs-block distinctions / SOL's entries-within-slot
model).
- Identify **misbehavior / freeloader detection** per
chain — specifically what each client does to detect
a download-only peer and how it penalizes / bans.
Key sources: BTC's `net_processing.cpp` DoS scoring,
ETH's devp2p / Snap-sync reciprocity tests, SOL's
turbine-shred forwarding requirements. This
determines whether **Phase 2 full-node participation
is REQUIRED or OPTIONAL** per chain.
- Identify what's pullable WITHOUT running a full node:
BTC block explorer APIs + Electrum + public RPC;
ETH public RPC + Alchemy/Infura snapshot archives;
SOL public RPC + warehouse archive (Google BigQuery
has a public SOL blocks dataset). This bounds the
Phase 1 scope.
- Produce `docs/research/blockchain-ingestion-phase-0-bitcoin.md`,
`docs/research/blockchain-ingestion-phase-0-ethereum.md`,
`docs/research/blockchain-ingestion-phase-0-solana.md`.
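
To ground the field-semantics capture, here is a minimal sketch of the kind of per-chain record shapes the Phase 0 docs would pin down. Field names follow the BTC 80-byte header layout and SOL's slot model; they are illustrative stand-ins, not Zeta's schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class BtcHeader:
    """Bitcoin's 80-byte block header as serialized by bitcoin/bitcoin."""
    version: int
    prev_blockhash: str  # parent header hash; defines chain linkage and reorg points
    merkle_root: str     # root of the txid merkle tree committed by this header
    timestamp: int       # miner-set, only loosely monotonic (median-time-past rule)
    bits: int            # compact-encoded difficulty target
    nonce: int

@dataclass(frozen=True)
class SolSlotRecord:
    """Solana separates slots (leader time windows) from blocks; a slot can be empty."""
    slot: int                 # monotonically increasing slot number
    blockhash: Optional[str]  # None when the slot produced no block (skipped slot)
    parent_slot: int          # may lag by more than 1 after skipped slots
    entry_count: int          # SOL's entries-within-slot model
```

The slot-vs-block distinction is exactly the kind of semantic the research docs need to capture before any ontology lands: a naive height-keyed schema would break on skipped SOL slots.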

**Phase 1 — Post-install block-ingestion script
(NOT a full node):**
- Post-install script under `tools/setup/blockchain-ingest/`
(composes with GOVERNANCE §24 three-way-parity
install script). Per-chain: `bitcoin.sh`,
`ethereum.sh`, `solana.sh`.
- Each script streams blocks via public RPC / explorer
APIs into Zeta's distributed DB as Z-set entries —
retraction-native (chain reorgs are first-class
retractions; our substrate was designed for this).
- Schema design: use the paced-ontology-landing
discipline; each chain gets a dedicated ontology
(block / transaction / log / witness / slot /
entry-vs-block-vs-shred) and the cross-chain
umbrella ontology comes later (Phase 3, alongside
the bridge).
- Queryable through all existing entry points: SQL
binder, operator algebra, LINQ, any future
GraphQL / REST surface. NO new interface class
unique to blockchain; re-use what Zeta already has.
- `dotnet run -- --chain bitcoin --from-height N
--to-height latest --follow` shape.
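
The reorg-as-retraction discipline can be sketched with a toy in-memory Z-set. `ZSet`, `apply_reorg`, and the tuple record shape are all illustrative stand-ins, not Zeta's actual operator API:

```python
from collections import defaultdict

class ZSet:
    """Toy weighted multiset: +1 insertion, -1 retraction; weight 0 means absent."""
    def __init__(self):
        self.weights = defaultdict(int)

    def insert(self, record):
        self.weights[record] += 1

    def retract(self, record):
        self.weights[record] -= 1

    def members(self):
        return {r for r, w in self.weights.items() if w > 0}

def apply_reorg(zset, orphaned, replacement):
    """A chain reorg is just a batch: retract the orphaned branch, insert the new one."""
    for blk in orphaned:
        zset.retract(blk)
    for blk in replacement:
        zset.insert(blk)

blocks = ZSet()
blocks.insert(("btc", 800000, "hashA"))
blocks.insert(("btc", 800001, "hashB"))  # later orphaned by a reorg
apply_reorg(
    blocks,
    orphaned=[("btc", 800001, "hashB")],
    replacement=[("btc", 800001, "hashB2"), ("btc", 800002, "hashC")],
)
```

Downstream queries over `members()` never see the orphaned block, which is why reorgs need no special casing in any of the query surfaces.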

**Phase 2 — Full-node protocol participation
(CONDITIONAL on Phase 0 finding):**
- If Phase 0 research shows that the target chain's
client BANS download-only peers after a window
(true for BTC's DoS scoring, likely for ETH's Snap
sync, and definitely for SOL's turbine),
implement the minimum UPLOAD side of the protocol
to stay a good network citizen.
- Maintainer directive is explicit: *"if you have to
do full nodes of those types to be able to download
we have to upload too go ahead and to that, i want
those interfaces too just like our SQL interfaces"*.
Upload-side interfaces expose as first-class Zeta
interfaces on par with SQL — not private internals.
- Architecturally this is **full-node-layered-on-top
of Zeta's distributed DB** (maintainer's explicit
frame), not a standalone fork of bitcoind / geth /
solana-labs. We use Zeta as the storage / consensus
/ query substrate and implement the chain protocol
ON TOP of it. Distributed-node support falls out of
Zeta's multi-node primitives for free.
- This is where Zeta's distributed-consensus substrate
(`distributed-consensus-expert` / `raft-expert` /
`paxos-expert` / `calm-theorem-expert` /
`replication-expert`) becomes load-bearing.
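
A minimal sketch of what the upload side buys: answering peer block requests out of the ingest store. The `block` / `notfound` response pair mirrors Bitcoin's `getdata` flow, but the handler, the store interface, and the tuple encoding are illustrative assumptions, not any real wire format:

```python
def handle_getdata(requested_hashes, store):
    """Serve peer block requests from the ingest store.

    Freeloader detectors score peers partly on whether they answer
    requests like this, so even a minimal responder changes our
    standing from download-only leech to participating node.
    """
    responses = []
    for block_hash in requested_hashes:
        raw = store.get(block_hash)  # store: any mapping from hash to raw block bytes
        if raw is not None:
            responses.append(("block", raw))
        else:
            # BTC peers expect an explicit notfound rather than silence
            responses.append(("notfound", block_hash))
    return responses
```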

**Phase 3 — Cross-chain stream bridge:**
- Deep integration per maintainer: *"deep integration
into those networks so we can 'bridge' them in
streams and maybe further"*.
- Bridge = Z-set operator composition across chain
streams. Each chain is a ZSet; cross-chain joins
produce derived ZSets (e.g. Bitcoin timestamp vs
Ethereum block for time alignment; SOL finality vs
ETH finality for comparative-consensus research).
- "Maybe further" = likely cross-chain atomic ops,
value-transfer bridges, or unified-view layers;
scope intentionally open at this phase.
- Composes with `distributed-coordination-expert` +
`crdt-expert` + `gossip-protocols-expert`.
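
The cross-chain join can be sketched as a derived set over two per-chain streams, assuming `(block_id, unix_timestamp)` records. The nested-loop join here is purely illustrative; a real bridge would use Zeta's incremental join operator:

```python
def time_align(btc_blocks, eth_blocks, window_s=60):
    """Derived cross-chain view: (btc, eth) block pairs whose miner/proposer
    timestamps fall within window_s seconds of each other."""
    return {
        (b, e)
        for b in btc_blocks
        for e in eth_blocks
        if abs(b[1] - e[1]) <= window_s
    }

btc = {("btc-A", 1000), ("btc-B", 2000)}
eth = {("eth-X", 1030), ("eth-Y", 5000)}
aligned = time_align(btc, eth)  # only btc-A / eth-X fall within the window
```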

**Phase 4 — UI:**
- Per maintainer: *"stick a ui in front of that too
lol"*. Lands on the Frontier-UX surface (formerly
Starboard, now a rename target: kernel-A
farm-related + kernel-B carpentry-related per the
2026-04-24 rename directive): cross-chain block
explorer + streaming dashboard + cross-chain bridge
visualizer as initial surfaces.

**Additional chains worth evaluating in a later phase**
(do NOT reorder the primary BTC/ETH/SOL priority):
- **Cosmos Hub** — IBC is a canonical cross-chain
bridging primitive; directly relevant to Phase 3.
- **Polkadot** — substrate chain + parachain
composition = close architectural cousin to Zeta's
multi-node + cross-chain design.
- **Cardano** — Ouroboros PoS pedagogy (Ouroboros is
among the most formally verified consensus
protocols deployed at scale).
- **Avalanche** — sub-net architecture is a real
distributed-systems primitive worth studying.
- **L2 rollups** (Base / Optimism / Arbitrum / zkSync
Era / StarkNet) — bridge-to-ETH substrate; good
study material for Phase 3 bridging.

**Priority / effort:** P2 research-grade; umbrella
effort is L (phased across many rounds). Phase 0 is
M (three research docs, deep source reading). Phase 1
per-chain is M-L each (ingest script + schema +
retraction-native integration). Phase 2 per-chain is
L each (full-node protocol on top of Zeta). Phase 3
is L+ (cross-chain bridge). Phase 4 is S (UI on top
of existing query surface).

**Composes with:** Aurora substrate (all Lucent-KSK +
Aurora ferry absorbs), paced-ontology-landing (one
ontology per chain), `distributed-consensus-expert` +
sibling consensus hats (Phase 2), GOVERNANCE §24
install-script discipline (Phase 1 post-install),
Otto-175c rename directive (the Frontier-UI surface
for Phase 4), Otto-275 log-don't-implement (this row
is the capture, not the kickoff).

**Does NOT authorize:** starting implementation yet —
Phase 0 research is the gate. Does NOT authorize
expanding scope to additional chains before BTC / ETH
/ SOL are understood. Does NOT authorize running a
live Zeta instance on mainnet without Aminata
threat-model sign-off on the network-exposure surface
(Phase 2 only).

- [ ] **Land per-maintainer CURRENT-memory ADR + companion
feedback memory.** PR #153 landed the CLAUDE.md fast-path
pointer at the per-user `CURRENT-<maintainer>.md`
1 change: 1 addition & 0 deletions memory/MEMORY.md
@@ -2,6 +2,7 @@

**📌 Fast path: read `CURRENT-aaron.md` and `CURRENT-amara.md` first.** These per-maintainer distillations show what's currently in force. Raw memories below are the history; CURRENT files are the projection.

- [**BLOCKCHAIN INGEST — first-class BTC/ETH/SOL streaming into Zeta's distributed DB; two motivations (Aurora prep + DB stress test); BTC→ETH→SOL priority; NOT fork of bitcoind/geth/solana-labs — on top of Zeta distributed DB; freeloader-detection research required (BTC net_processing.cpp / ETH devp2p+Snap / SOL turbine-shred); upload-side interfaces first-class on par with SQL; Phase 0 research gate + Phase 1 post-install ingest + Phase 2 conditional full-node + Phase 3 cross-chain bridge + Phase 4 UI; additional chains (Cosmos/Polkadot/Cardano/Avalanche/L2s) evaluated later; Otto-275 log-don't-implement; Aaron 2026-04-24**](feedback_blockchain_ingest_btc_eth_sol_first_class_db_support_aurora_prep_2026_04_24.md) — Verbatim directive captured. Phase 0 research gate = read actual client source per chain to map freeloader detection (determines whether Phase 2 upload-side is required to stay in-network). Architecturally on top of Zeta's multi-node primitives (distributed-node support from start). Composes with Aurora substrate + paced-ontology-landing + distributed-consensus-expert + GOVERNANCE §24 + Otto-175c rename (Frontier-UI → kernel-A/B).
- [**RENAME Starboard → two seed-extension kernels (farm + carpentry) shrink-over-time; KEEP all nautical/Elron research (Otto-237 mention vs adoption); "big bangs at every layer" metaphor liked; 2 Google AI slates received (batch 1 general farm, batch 2 Q/Z algebraic); Siliqua-Core + Zeta-ic Yield + Zanja flagged as notable resonances; naming-expert triage before any rename PR; Otto-275 log-don't-implement; reverses Otto-175c Starboard adoption; Aaron 2026-04-24**](feedback_rename_starboard_to_farm_carpentry_seed_extension_kernels_2026_04_24.md) — Directive verbatim: *"Instead of Starboard lets go with someting farm related and carperntry related since those will be our two seed extenion kernels we can shrink over time..."*. Two kernels, shrink-over-time property, substrate preserved, iterate don't auto-adopt. Carpentry-side slate not yet proposed; future work scope. Composes with Otto-168/170/175/237/244/275.
- [**Otto-276 NEVER PRAY AUTO-MERGE COMPLETES — when polling a BLOCKED PR, ALWAYS inspect statusCheckRollup + reviewThreads + reviewDecision; "summary says BLOCKED, must be CI" is prayer not diagnosis; RECURRING class (#190 #385 #388); Aaron 2026-04-24**](feedback_never_pray_auto_merge_completes_inspect_actual_blockers_otto_276_2026_04_24.md) — DST "observable state" = check-level detail not summary. Inspect before concluding either success or failure.
- [**Otto-275 RAPID-FIRE BACKLOG INPUT DRIFT — when handed many backlog items in rapid succession, LOG durably (memory) but DO NOT pivot to immediate per-item implementation; PATTERN RECURS across sessions; composes with Otto-257/259/262 balance-stack for recovery work; Aaron 2026-04-24**](feedback_rapid_backlog_input_context_switch_drift_counterweight_log_dont_implement_otto_275_2026_04_24.md) — Real learning lesson: I dropped #147 drain focus to capture Otto-270/272/273/274 as a "storm of PRs." Fix: log durable + draft BACKLOG row + continue primary drain; batch BACKLOG rows later.