feat: event downloader + balance replay debug script #342

andreabadesso merged 15 commits into master
Conversation
Force-pushed from 89b8a7c to 42b3937
- Fix completedBatches double-counting in orchestrator stats
- Fix pendingBatches going negative by tracking activeWorkers separately
- Guard onError from firing after onComplete with isCompleted flag
- Guard worker onmessage after batch completion with isDone flag
- Remove dead exports: getPendingBatches, getLastEventId, getConfig
- Add *.tsbuildinfo to .gitignore and untrack build artifact
- Relax @hathor/wallet-lib peer dep to >=2.8.3 in event-downloader
- Use yarn dlx ts-node for replay-balance script instead of full build
- Add better-sqlite3 as dep to daemon for replay-balance script

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be configured in CodeRabbit's review settings.
📝 Walkthrough

Adds a new event-downloader package with WS workers, batching, SQLite persistence, types, an orchestrator, and a CLI UI; adds a daemon replay-balance script; updates root workspaces and gitignore entries; adds daemon deps and scripts. No changes to existing runtime logic outside the new package and script.
Sequence Diagram

```mermaid
sequenceDiagram
    participant CLI as CLI Entry
    participant Orchestrator
    participant DB as SQLite DB
    participant FullNode as Full Node (WS)
    participant Worker as Event Worker
    participant UI as Progress UI
    CLI->>Orchestrator: downloadAllEvents(callbacks)
    Orchestrator->>DB: initDatabase()
    Orchestrator->>FullNode: getLatestEventId() (WS)
    FullNode-->>Orchestrator: latestEventId
    Orchestrator->>DB: getAllBatchProgress()
    DB-->>Orchestrator: existing progress
    Orchestrator->>Orchestrator: calculate & merge batches
    loop For each batch (concurrent)
        Orchestrator->>Worker: start(batch)
        Worker->>FullNode: WebSocket connect & START_STREAM
        FullNode-->>Worker: FullNodeEvent*
        loop Streaming events
            Worker->>Worker: process event
            Worker->>DB: insertEvents()/insertTxEvents()
            Worker->>Orchestrator: report progress
            Orchestrator->>UI: onStatsUpdate
            UI->>CLI: renderProgress()
        end
        Worker->>FullNode: ACK(lastReceivedEventId)
        Worker->>Orchestrator: onComplete
        Orchestrator->>DB: updateBatchProgress(status=complete)
    end
    Orchestrator->>UI: onComplete()
    UI->>CLI: Display final summary
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 12
🧹 Nitpick comments (5)

packages/daemon/package.json (1)

16-16: Consider adding `ts-node` as a devDependency instead of using `yarn dlx`.

Using `yarn dlx ts-node` downloads ts-node on each invocation, which adds latency and requires network access. Since other scripts in this package already follow the `yarn dlx ts-node` pattern (lines 22-24), this is consistent, but adding `ts-node` to `devDependencies` would improve reliability and execution speed for all these scripts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/daemon/package.json` at line 16, the package.json script "replay-balance" (and other scripts using "yarn dlx ts-node") relies on downloading ts-node at runtime; add "ts-node" to devDependencies in packages/daemon/package.json and run yarn install so scripts can invoke ts-node locally (e.g., change scripts to "yarn ts-node src/scripts/replay-balance.ts", or continue using "yarn dlx" but prefer local ts-node). Update package.json's devDependencies to include an appropriate ts-node version and run the package manager to lock it in.

packages/event-downloader/src/event-parser.ts (1)

17-25: Consider improving type safety for event data access.

Casting `event.event.data` to `any` (line 19) loses type safety. While this is pragmatic for handling varied event structures, consider using a type guard or more specific type assertions for the known event types to maintain some level of type checking.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/event-downloader/src/event-parser.ts` around lines 17-25, extractTxHash currently casts event.event.data to any, losing type safety; instead add a type guard (e.g., isVertexEventData) that checks the event.event.type against FullNodeEventTypes.NEW_VERTEX_ACCEPTED | VERTEX_METADATA_CHANGED | VERTEX_REMOVED and narrows event.event.data to a specific interface with a hash property, then use that guard inside extractTxHash to safely access data.hash (only assert the narrower type after the guard). Update references to FullNodeEventTypes and the data access in extractTxHash to rely on the guard so you can remove the any cast and retain compile-time checks.

packages/event-downloader/package.json (2)

1-5: Consider adding a `version` field.

The package is missing a `version` field, which is typically required for proper package management, even for internal packages. This can cause issues with some tooling and makes it harder to track releases.

Proposed fix

```diff
 {
   "name": "event-downloader",
+  "version": "0.1.0",
   "license": "MIT",
   "main": "dist/index.js",
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/event-downloader/package.json` around lines 1-5, the package.json for event-downloader is missing a version field; add a semantically meaningful "version" property (e.g., "version": "0.1.0" or appropriate semver) into the top-level JSON alongside "name", "license", "main", and "typings" so the package has a proper version for tooling and release tracking.

18-20: Peer dependency range may be too permissive.

Using `>=2.8.3` allows any future major version of `@hathor/wallet-lib`, which could introduce breaking changes. The root `package.json` pins version `2.12.0`. Consider constraining to the same major version.

Proposed fix

```diff
 "peerDependencies": {
-  "@hathor/wallet-lib": ">=2.8.3"
+  "@hathor/wallet-lib": "^2.8.3"
 },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/event-downloader/package.json` around lines 18-20, the peerDependency declaration currently allows any future major of `@hathor/wallet-lib` ("@hathor/wallet-lib": ">=2.8.3"); update the peerDependencies entry for "@hathor/wallet-lib" to constrain to the same major/version range as the root (e.g. "^2.12.0" or ">=2.8.3 <3.0.0") so it won't accept breaking v3+ releases; modify the "peerDependencies" object accordingly.

packages/event-downloader/src/db.ts (1)

17-23: Constrain batch status to a finite set in both TS and SQL.

`status` is currently free-form. A typo can silently break resume/completion logic.

Suggested refactor

```diff
+export type BatchStatus = 'pending' | 'in_progress' | 'completed' | 'failed';
+
 export interface BatchProgress {
   batch_start: number;
   batch_end: number;
   last_downloaded: number | null;
-  status: string;
+  status: BatchStatus;
   updated_at: string;
 }
@@
 const CREATE_DOWNLOAD_PROGRESS_TABLE = `
   CREATE TABLE IF NOT EXISTS download_progress (
@@
-    status TEXT DEFAULT 'pending',
+    status TEXT NOT NULL DEFAULT 'pending' CHECK (status IN ('pending', 'in_progress', 'completed', 'failed')),
     updated_at DATETIME DEFAULT CURRENT_TIMESTAMP
   )
 `;
@@
 export function updateBatchProgress(
   db: DatabaseType,
   batchStart: number,
   batchEnd: number,
   lastDownloaded: number | null,
-  status: string
+  status: BatchStatus
 ): void {
```

Also applies to: 45-53, 164-170

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/event-downloader/src/db.ts` around lines 17-23, BatchProgress.status is free-form and should be constrained to a finite set to avoid typos breaking resume/completion logic: define a TypeScript string-union or enum (e.g., BatchStatus = 'pending' | 'in_progress' | 'completed' | 'failed') and change the BatchProgress interface to use that type; add a corresponding SQL constraint for the status column (use an ENUM type or a CHECK constraint listing the same values) in the table/migration that defines the status column, and update any code that reads/writes the status (places referenced around the other occurrences) to use the enum constants rather than raw strings so runtime and DB are aligned.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/daemon/src/scripts/replay-balance.ts`:
- Around line 64-68: The loadAddresses function currently always drops the first
CSV row via lines.slice(1), which removes a real address when the file has no
header; change loadAddresses to detect whether the first line is a header (e.g.,
compare it to expected header tokens or validate it against the expected address
format) and only skip it when it looks like a header, otherwise include the
first line; update the logic in loadAddresses (the CSV parsing path that
currently uses lines.slice(1).map(...)) to perform a header check using a simple
address regex/validation and then build the Set accordingly.
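A minimal sketch of that header check, assuming a standalone `loadAddressSet` helper (the actual function is `loadAddresses`) and a loose base58 regex rather than Hathor's real address validation:

```typescript
// Illustrative heuristic: skip the first CSV row only when it does not look
// like an address. The regex is a loose base58 shape check (no 0, O, I, l),
// not Hathor's actual address validation.
const ADDRESS_RE = /^[1-9A-HJ-NP-Za-km-z]{25,39}$/;

function loadAddressSet(csv: string): Set<string> {
  const lines = csv
    .split('\n')
    .map((l) => l.trim())
    .filter((l) => l.length > 0);
  if (lines.length === 0) return new Set();

  const firstField = lines[0].split(',')[0];
  // Keep the first row when it already looks like an address (no header case).
  const body = ADDRESS_RE.test(firstField) ? lines : lines.slice(1);
  return new Set(body.map((l) => l.split(',')[0]));
}
```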
- Around line 87-103: The WHERE clause building in replay-balance.ts is
vulnerable and breaks when the CSV is empty because conditions becomes empty
(producing AND ()), and it interpolates addresses directly into SQL; fix by
guarding walletAddresses (returned from loadAddresses) early and either exit
with a message if size === 0 or construct a safe parameterized query: build an
array of placeholders like "data LIKE ?" for each address, bind values as
`%${addr}%` when calling sqlite.prepare(...).all(bindValues), and replace the
current raw `conditions`/interpolation usage so the SELECT uses parameterized
LIKE placeholders instead of string interpolation (refer to walletAddresses,
conditions, sqlite.prepare, and the events query).
- Around line 49-55: In the argument parsing loop in replay-balance.ts (the for
loop over args that assigns into opts), validate that a value exists before
consuming args[++i]: for each flag that expects a value (--db, --addresses,
--expected) check that args[i+1] is defined and does not start with '-' (or
otherwise indicate another flag); if the check fails, throw or log a clear
error/usage message. Also wrap the BigInt conversion for --expected in a
try/catch or validate the string is numeric before calling BigInt to produce a
clear error rather than a later crash. Apply these checks where opts.db,
opts.addresses, and opts.expected are set.
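A hedged sketch of the validated parsing, using a hypothetical `parseArgs` helper rather than the script's actual loop; the flag names match the script, everything else is illustrative:

```typescript
// Illustrative validated flag parsing: each value-taking flag checks that a
// value exists and is not another flag, and --expected is verified to be a
// plain integer string before any conversion.
interface Opts {
  db?: string;
  addresses?: string;
  expected?: string; // the real script would convert with BigInt after the check
}

function parseArgs(args: string[]): Opts {
  const opts: Opts = {};
  const expectValue = (flag: string, i: number): string => {
    const value = args[i + 1];
    if (value === undefined || value.startsWith('-')) {
      throw new Error(`Missing value for ${flag}`);
    }
    return value;
  };
  for (let i = 0; i < args.length; i++) {
    switch (args[i]) {
      case '--db':
        opts.db = expectValue('--db', i); i++; break;
      case '--addresses':
        opts.addresses = expectValue('--addresses', i); i++; break;
      case '--expected': {
        const raw = expectValue('--expected', i); i++;
        // Validate before conversion so BigInt(raw) cannot throw later.
        if (!/^\d+$/.test(raw)) {
          throw new Error(`--expected must be an integer, got "${raw}"`);
        }
        opts.expected = raw;
        break;
      }
    }
  }
  return opts;
}
```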
In `@packages/event-downloader/src/config.ts`:
- Around line 14-30: checkEnvVariables currently treats empty strings as present
and numeric envs (BATCH_SIZE, PARALLEL_CONNECTIONS, WINDOW_SIZE,
CONNECTION_TIMEOUT_MS) aren’t validated, which can allow invalid values (e.g.,
<=0 or non-numeric) to stall downloads; update checkEnvVariables to treat empty
string values as missing for required string vars like FULLNODE_HOST/USE_SSL,
then validate and parse numeric envs used by BATCH_SIZE, PARALLEL_CONNECTIONS,
WINDOW_SIZE, and CONNECTION_TIMEOUT_MS (ensure parseInt succeeds and values > 0
where applicable) and throw a clear Error if any validation fails so the process
fails fast.
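The fail-fast validation could look like this sketch; `requireEnv` and `requirePositiveInt` are illustrative helper names, not the actual `checkEnvVariables` code:

```typescript
// Hypothetical helpers for the validation described above: empty strings
// count as missing, and numeric envs must parse to a positive integer.
function requireEnv(env: Record<string, string | undefined>, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required env var: ${name}`);
  }
  return value;
}

function requirePositiveInt(
  env: Record<string, string | undefined>,
  name: string,
  fallback: number,
): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === '') return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return parsed;
}
```

Calling these for `FULLNODE_HOST`, `BATCH_SIZE`, `PARALLEL_CONNECTIONS`, `WINDOW_SIZE`, and `CONNECTION_TIMEOUT_MS` at startup makes the process fail fast instead of stalling on a zero-sized batch.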
In `@packages/event-downloader/src/index.ts`:
- Around line 166-174: The signal handler's cleanup function exits immediately
preventing graceful shutdown; update cleanup (used in process.on('SIGINT') and
process.on('SIGTERM') and referencing SHOW_CURSOR) to instead call the
orchestrator's stop/abort method (e.g., orchestrator.stop() or
orchestrator.abort()) if available, await completion or poll for in-flight tasks
to finish, flush/await any DB writes and close WebSocket connections, restore
the cursor and print the message, then use a short timeout (e.g., few seconds)
before forcing process.exit(0) only if graceful shutdown didn't complete; ensure
cleanup is async-aware so it doesn't exit before awaiting these operations.
- Around line 127-133: The percentage calculation for each item in sortedWorkers
can divide by zero when status.batchEnd === status.batchStart; compute a safe
denominator (const denom = status.batchEnd - status.batchStart) and only perform
(status.lastEventId - status.batchStart) / denom when denom > 0, otherwise set
workerPercent to a sensible fallback (e.g., '100' if status.lastEventId >=
status.batchEnd, or '0' otherwise), then call toFixed(0) on that value; update
the logic around the workerPercent variable in the loop that references
sortedWorkers, status.lastEventId, status.batchStart, and status.batchEnd to
guard against denom <= 0 before formatting.
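A small sketch of the guarded computation, with `workerPercent` as a hypothetical extraction of the inline logic:

```typescript
// Guarded percentage: when the batch is a single event (denom <= 0), fall
// back to 100 or 0 instead of dividing by zero and rendering "NaN".
function workerPercent(lastEventId: number, batchStart: number, batchEnd: number): string {
  const denom = batchEnd - batchStart;
  if (denom <= 0) {
    return lastEventId >= batchEnd ? '100' : '0';
  }
  const pct = ((lastEventId - batchStart) / denom) * 100;
  return Math.min(100, Math.max(0, pct)).toFixed(0);
}
```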
In `@packages/event-downloader/src/orchestrator.ts`:
- Around line 279-284: The try/catch in the worker loop (calling
fn(items[index], workerSlot)) currently swallows errors and still allows the
orchestrator to emit onComplete; change this so failures are recorded and
prevent global completion: add an errors array or a boolean flag (e.g.,
workerFailed) scoped to the orchestrator, push or mark any caught error inside
the catch block instead of just console.error, and after all workers finish
check that collection — only emit onComplete when no failures, otherwise
propagate or emit an error/completion-with-failure path so callers know the run
failed; update the same pattern where used again (the other occurrence around
index 365).
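One way to sketch the failure-aware pool; `runPool` is an illustrative stand-in for the orchestrator's worker loop, not the actual code:

```typescript
// Errors are collected instead of swallowed; the caller can emit onComplete
// only when the returned array is empty, otherwise surface a failure path.
async function runPool<T>(
  items: T[],
  concurrency: number,
  fn: (item: T, slot: number) => Promise<void>,
): Promise<Error[]> {
  const errors: Error[] = [];
  let index = 0;
  const workers = Array.from(
    { length: Math.min(concurrency, items.length) },
    async (_unused, slot) => {
      while (true) {
        const i = index++; // safe: single-threaded between awaits
        if (i >= items.length) break;
        try {
          await fn(items[i], slot);
        } catch (err) {
          errors.push(err instanceof Error ? err : new Error(String(err)));
        }
      }
    },
  );
  await Promise.all(workers);
  return errors;
}
```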
- Around line 105-109: The batch-generation loop treats event IDs as inclusive
but uses a strict `<` check causing the final ID to be skipped and totalEvents
to undercount; change the loop condition from `while (start < latestEventId)` to
`while (start <= latestEventId)` so the `latestEventId` is included when pushing
`{ start, end }` into `batches`, and apply the same fix at the other occurrence
(around line with the second loop). Also audit any `totalEvents` calculation
(where it uses `latestEventId` and a first/last ID) and make it inclusive by
adding 1 (e.g., ensure it computes `last - first + 1`) so the count isn't off by
one.
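The inclusive batch math can be sketched as follows; `computeBatches` and the `{ start, end }` batch shape are assumptions based on the comment, not the actual orchestrator code:

```typescript
// Inclusive batch generation: the loop runs while start <= latestEventId,
// so the final event ID always lands in a batch.
interface Batch { start: number; end: number; }

function computeBatches(firstEventId: number, latestEventId: number, batchSize: number): Batch[] {
  const batches: Batch[] = [];
  let start = firstEventId;
  while (start <= latestEventId) {
    const end = Math.min(start + batchSize - 1, latestEventId);
    batches.push({ start, end });
    start = end + 1;
  }
  return batches;
}

// Inclusive count: last - first + 1, so IDs 0..10 are 11 events, not 10.
function totalEvents(firstEventId: number, latestEventId: number): number {
  return latestEventId - firstEventId + 1;
}
```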
- Around line 232-251: The onComplete handler sets isCompleted before calling
flushBuffers() and updateBatchProgress(), so if either throws the promise can
remain unresolved; change the onComplete implementation in the orchestrator to
perform flushBuffers() and updateBatchProgress(db, batch.start, batch.end,
batch.end, 'completed') inside a try/catch first, and only set isCompleted =
true and call resolve() after those succeed; in the catch ensure you still call
reject(error) (or at minimum set isCompleted appropriately and resolve/reject to
avoid dangling promises) so that flushBuffers() failures do not leave the
promise unresolved.
- Around line 78-85: The handler for socket.onmessage currently parses the WS
payload and resolves with data.latest_event_id without validation; update the
logic in the socket.onmessage callback (the block using
bigIntUtils.JSONBigInt.parse, latestEventId, socket.close(), resolve) to
validate that parsed.data.latest_event_id exists and is of an expected numeric
shape (e.g., a string/number that can be safely converted to a BigInt or a
numeric type), only then close the socket and call resolve with the
validated/normalized value; if the field is missing or cannot be parsed to a
valid integer, do not resolve — instead close the socket and call reject with a
clear error (or ignore and wait for another message depending on intended flow)
so invalid payloads cannot cascade into downstream batch math errors.
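An illustrative validator for that payload; the field names follow the comment, and `Number` is used on the assumption that event IDs fit in a safe integer (a BigInt path would be needed otherwise):

```typescript
// Validate latest_event_id before resolving: accept a non-negative safe
// integer or a digits-only string, and throw (i.e., reject) on anything else.
function parseLatestEventId(payload: unknown): number {
  if (typeof payload !== 'object' || payload === null) {
    throw new Error('Invalid payload: not an object');
  }
  const data = (payload as { data?: { latest_event_id?: unknown } }).data;
  const id = data?.latest_event_id;
  if (typeof id === 'number' && Number.isSafeInteger(id) && id >= 0) return id;
  if (typeof id === 'string' && /^\d+$/.test(id)) return Number(id);
  throw new Error(`Invalid latest_event_id: ${String(id)}`);
}
```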
- Around line 350-353: The totalEventsDownloaded drops when a worker starts a
new batch because you recompute from workerStatuses which holds only the current
batch's eventsDownloaded; fix by tracking per-worker lastReported count and
accumulating deltas: in onWorkerUpdate (and alongside workerStatuses) introduce
a Map like lastEventsPerWorker to store the last eventsDownloaded for each
worker, compute delta = status.eventsDownloaded -
(lastEventsPerWorker.get(workerId) || 0), if delta < 0 treat it as a reset and
use delta = status.eventsDownloaded, then add delta to totalEventsDownloaded,
update lastEventsPerWorker.set(workerId, status.eventsDownloaded) and still
update workerStatuses.set(workerId, status) so totalEventsDownloaded remains
cumulative across sequential batches.
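A compact sketch of the delta accumulation; `ProgressAccumulator` is a hypothetical wrapper around the state the comment describes:

```typescript
// Track per-worker last counts and accumulate deltas, so the total keeps
// climbing even when a worker's counter resets on a new batch.
interface WorkerStatus { eventsDownloaded: number; }

class ProgressAccumulator {
  private lastEventsPerWorker = new Map<number, number>();
  totalEventsDownloaded = 0;

  onWorkerUpdate(workerId: number, status: WorkerStatus): void {
    const previous = this.lastEventsPerWorker.get(workerId) ?? 0;
    let delta = status.eventsDownloaded - previous;
    if (delta < 0) delta = status.eventsDownloaded; // counter reset: new batch
    this.totalEventsDownloaded += delta;
    this.lastEventsPerWorker.set(workerId, status.eventsDownloaded);
  }
}
```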
In `@packages/event-downloader/src/worker.ts`:
- Around line 151-153: The fatal error handling currently calls onError multiple
times and leaves sockets/timers running; update the worker to make fatal
handling single-shot by introducing a guard flag (e.g. fatalHandled or
isStopped) and use it in the catch blocks and socket 'close'/'error' handlers to
return early if already handled; when a fatal occurs, immediately set the flag,
clear any timers, and properly destroy/close the socket before invoking onError
(or ensure onError does not re-enter shutdown), and apply this change to the
catch block shown plus the other fatal paths referenced (lines ~156-158 and
~160-167) so shutdown/cleanup is performed exactly once.
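The single-shot guard can be sketched like this; `WorkerShutdown` is illustrative, mirroring the `fail()` helper the follow-up commit message mentions:

```typescript
// Single-shot fatal handling: every fatal path funnels into fail(), which
// runs cleanup and onError at most once regardless of how many socket
// error/close events fire afterwards.
class WorkerShutdown {
  private fatalHandled = false;

  constructor(
    private readonly onError: (err: Error) => void,
    private readonly cleanup: () => void, // clear timers, close socket, etc.
  ) {}

  fail(err: Error): void {
    if (this.fatalHandled) return; // later fatal paths become no-ops
    this.fatalHandled = true;
    this.cleanup();
    this.onError(err);
  }
}
```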
---
Nitpick comments:
In `@packages/daemon/package.json`:
- Line 16: The package.json script "replay-balance" (and other scripts using
"yarn dlx ts-node") relies on downloading ts-node at runtime; add "ts-node" to
devDependencies in packages/daemon/package.json and run yarn install so scripts
can invoke ts-node locally (e.g., change scripts to keep "yarn ts-node
src/scripts/replay-balance.ts" or continue using "yarn dlx" but prefer local
ts-node). Update package.json's devDependencies to include an appropriate
ts-node version and run the package manager to lock it in.
In `@packages/event-downloader/package.json`:
- Around line 1-5: The package.json for event-downloader is missing a version
field; add a semantically meaningful "version" property (e.g., "version":
"0.1.0" or appropriate semver) into the top-level JSON alongside "name",
"license", "main", and "typings" so the package has a proper version for tooling
and release tracking.
- Around line 18-20: The peerDependency declaration in package.json currently
allows any future major of `@hathor/wallet-lib` ("@hathor/wallet-lib": ">=2.8.3");
update the peerDependencies entry for "@hathor/wallet-lib" to constrain to the
same major/version range as the root (e.g. "^2.12.0" or ">=2.8.3 <3.0.0") so it
won't accept breaking v3+ releases; modify the "peerDependencies" object
accordingly.
In `@packages/event-downloader/src/db.ts`:
- Around line 17-23: BatchProgress.status is free-form and should be constrained
to a finite set to avoid typos breaking resume/completion logic: define a
TypeScript string-union or enum (e.g., BatchStatus = 'pending' | 'in_progress' |
'completed' | 'failed') and change the BatchProgress interface to use that type;
add a corresponding SQL constraint for the status column (use an ENUM type or a
CHECK constraint listing the same values) in the table/migration that defines
the status column, and update any code that reads/writes the status (places
referenced around the other occurrences) to use the enum constants rather than
raw strings so runtime and DB are aligned.
In `@packages/event-downloader/src/event-parser.ts`:
- Around line 17-25: extractTxHash currently casts event.event.data to any,
losing type safety; instead add a type guard (e.g., isVertexEventData) that
checks the event.event.type against FullNodeEventTypes.NEW_VERTEX_ACCEPTED |
VERTEX_METADATA_CHANGED | VERTEX_REMOVED and narrows event.event.data to a
specific interface with a hash property, then use that guard inside
extractTxHash to safely access data.hash (only assert the narrower type after
the guard). Update references to FullNodeEventTypes and the data access in
extractTxHash to rely on the guard so you can remove the any cast and retain
compile-time checks.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`yarn.lock` is excluded by `!**/yarn.lock`, `!**/*.lock`
📒 Files selected for processing (14)
- .gitignore
- package.json
- packages/daemon/package.json
- packages/daemon/src/scripts/replay-balance.ts
- packages/event-downloader/.gitignore
- packages/event-downloader/package.json
- packages/event-downloader/src/config.ts
- packages/event-downloader/src/db.ts
- packages/event-downloader/src/event-parser.ts
- packages/event-downloader/src/index.ts
- packages/event-downloader/src/orchestrator.ts
- packages/event-downloader/src/types.ts
- packages/event-downloader/src/worker.ts
- packages/event-downloader/tsconfig.json
- replay-balance: validate CLI args, detect CSV headers, parameterize SQL
- config: validate numeric env vars, treat empty strings as missing
- index: guard division by zero in progress, improve signal handler
- orchestrator: validate latest_event_id, fix off-by-one in batches, settle promises exactly once, propagate worker errors, fix cumulative event count
- worker: single-shot error handling via fail() helper

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
🧹 Nitpick comments (2)
packages/event-downloader/src/orchestrator.ts (1)
58-102: Consider adding an `onclose` handler for unexpected disconnections.

If the WebSocket connection closes before receiving the initial message (e.g., the server rejects the connection), the promise may hang until the 30-second timeout. An `onclose` handler could provide faster feedback.

Suggested improvement

```diff
   socket.onerror = (error) => {
     clearTimeout(timeout);
     reject(new Error(`WebSocket error: ${error.message}`));
   };
+
+  socket.onclose = (event) => {
+    clearTimeout(timeout);
+    if (!event.wasClean) {
+      reject(new Error(`WebSocket closed unexpectedly: code ${event.code}`));
+    }
+  };
 });
```

Note: This is a minor improvement for edge cases where the connection closes without an error event firing first. The existing 30-second timeout provides adequate fallback protection.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/event-downloader/src/orchestrator.ts` around lines 58-102, the getLatestEventId promise can hang if the WebSocket closes before the initial message arrives; add a socket.onclose handler in the getLatestEventId function to clear the timeout and reject the promise immediately with a descriptive error (e.g., "WebSocket closed before initial event") unless the promise has already been resolved, ensuring you still close/cleanup the socket and not double-reject; reference socket.onclose, timeout, and the resolve/reject logic when implementing.

packages/daemon/src/scripts/replay-balance.ts (1)
114-129: Ensure DB closure on all paths (try/finally).

If query preparation/execution throws before line 129, the handle is not explicitly closed.

♻️ Suggested change

```diff
   const sqlite = new Database(opts.db, { readonly: true });
+  let rows: Array<{ id: number; type: string; data: string }>;

-  // Build the WHERE clause — one parameterized LIKE condition per address
-  const addresses = Array.from(walletAddresses);
-  const conditions = addresses.map(() => 'data LIKE ?').join(' OR ');
-  const likeParams = addresses.map(addr => `%${addr}%`);
-
-  const rows = sqlite.prepare(`
-    SELECT id, type, data
-    FROM events
-    WHERE type IN ('NEW_VERTEX_ACCEPTED', 'VERTEX_METADATA_CHANGED')
-      AND (${conditions})
-    ORDER BY id ASC
-  `).all(...likeParams) as Array<{ id: number; type: string; data: string }>;
-
-  sqlite.close();
+  try {
+    // Build the WHERE clause — one parameterized LIKE condition per address
+    const addresses = Array.from(walletAddresses);
+    const conditions = addresses.map(() => 'data LIKE ?').join(' OR ');
+    const likeParams = addresses.map(addr => `%${addr}%`);
+
+    rows = sqlite.prepare(`
+      SELECT id, type, data
+      FROM events
+      WHERE type IN ('NEW_VERTEX_ACCEPTED', 'VERTEX_METADATA_CHANGED')
+        AND (${conditions})
+      ORDER BY id ASC
+    `).all(...likeParams) as Array<{ id: number; type: string; data: string }>;
+  } finally {
+    sqlite.close();
+  }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/daemon/src/scripts/replay-balance.ts` around lines 114-129, the SQLite Database instance created as const sqlite = new Database(opts.db, { readonly: true }) may not be closed if prepare(...).all(...) throws; wrap the creation and query in a try/finally so sqlite.close() is always invoked (move sqlite.close() into the finally block), keeping the query logic that builds conditions/likeParams and the call to sqlite.prepare(...).all(...) intact and referencing sqlite, prepare, and rows as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/daemon/src/scripts/replay-balance.ts`:
- Around line 117-127: The query builds a variable number of "data LIKE ?"
placeholders from walletAddresses (addresses, conditions, likeParams) and calls
sqlite.prepare(...).all(...likeParams), which can exceed SQLite's bind-parameter
limit for large address sets; change replay logic to chunk addresses into
batches under the SQLite parameter limit (e.g., 999), run the SELECT in
sqlite.prepare(...).all(...) per chunk using the chunk's likeParams,
collect/merge the rows (deduplicating by id) and preserve ORDER BY id ASC, and
ensure this batching is implemented where the current query is constructed and
executed (see variables addresses, conditions, likeParams and the
sqlite.prepare(...).all call).
---
Nitpick comments:
In `@packages/daemon/src/scripts/replay-balance.ts`:
- Around line 114-129: The SQLite Database instance created as const sqlite =
new Database(opts.db, { readonly: true }) may not be closed if
prepare(...).all(...) throws; wrap the creation and query in a try/finally so
sqlite.close() is always invoked (move sqlite.close() into the finally block),
keeping the query logic that builds conditions/likeParams and the call to
sqlite.prepare(...).all(...) intact and referencing sqlite, prepare, and rows as
before.
In `@packages/event-downloader/src/orchestrator.ts`:
- Around line 58-102: The getLatestEventId promise can hang if the WebSocket
closes before the initial message arrives; add a socket.onclose handler in the
getLatestEventId function to clear the timeout and reject the promise
immediately with a descriptive error (e.g., "WebSocket closed before initial
event") unless the promise has already been resolved, ensuring you still
close/cleanup the socket and not double-reject; reference socket.onclose,
timeout, and the resolve/reject logic when implementing.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- packages/daemon/src/scripts/replay-balance.ts
- packages/event-downloader/src/config.ts
- packages/event-downloader/src/index.ts
- packages/event-downloader/src/orchestrator.ts
- packages/event-downloader/src/worker.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/event-downloader/src/worker.ts
- packages/event-downloader/src/config.ts
```ts
const addresses = Array.from(walletAddresses);
const conditions = addresses.map(() => 'data LIKE ?').join(' OR ');
const likeParams = addresses.map(addr => `%${addr}%`);

const rows = sqlite.prepare(`
  SELECT id, type, data
  FROM events
  WHERE type IN ('NEW_VERTEX_ACCEPTED', 'VERTEX_METADATA_CHANGED')
    AND (${conditions})
  ORDER BY id ASC
`).all(...likeParams) as Array<{ id: number; type: string; data: string }>;
```
Potential failure on large address sets due to SQLite bind-parameter caps.
Line 118/Line 127 scale placeholders with address count; large CSVs can exceed SQLite variable limits and abort the script.
💡 Proposed fix (chunk address filters)

```diff
-  const addresses = Array.from(walletAddresses);
-  const conditions = addresses.map(() => 'data LIKE ?').join(' OR ');
-  const likeParams = addresses.map(addr => `%${addr}%`);
-
-  const rows = sqlite.prepare(`
-    SELECT id, type, data
-    FROM events
-    WHERE type IN ('NEW_VERTEX_ACCEPTED', 'VERTEX_METADATA_CHANGED')
-      AND (${conditions})
-    ORDER BY id ASC
-  `).all(...likeParams) as Array<{ id: number; type: string; data: string }>;
+  const addresses = Array.from(walletAddresses);
+  const MAX_SQL_PARAMS = 900;
+  const rowById = new Map<number, { id: number; type: string; data: string }>();
+
+  for (let i = 0; i < addresses.length; i += MAX_SQL_PARAMS) {
+    const chunk = addresses.slice(i, i + MAX_SQL_PARAMS);
+    const conditions = chunk.map(() => 'data LIKE ?').join(' OR ');
+    const likeParams = chunk.map(addr => `%${addr}%`);
+
+    const chunkRows = sqlite.prepare(`
+      SELECT id, type, data
+      FROM events
+      WHERE type IN ('NEW_VERTEX_ACCEPTED', 'VERTEX_METADATA_CHANGED')
+        AND (${conditions})
+      ORDER BY id ASC
+    `).all(...likeParams) as Array<{ id: number; type: string; data: string }>;
+
+    for (const row of chunkRows) rowById.set(row.id, row);
+  }
+
+  const rows = Array.from(rowById.values()).sort((a, b) => a.id - b.id);
```

```shell
#!/bin/bash
set -euo pipefail
SCRIPT_FILE="$(fd 'replay-balance.ts$' | head -n1)"
echo "Checking placeholder-per-address pattern in ${SCRIPT_FILE}:"
rg -n "data LIKE \?|all\(\.\.\.likeParams\)" "${SCRIPT_FILE}"
CSV_FILE="$(fd 'addresses.csv$' | head -n1 || true)"
if [[ -n "${CSV_FILE}" ]]; then
  python - "${CSV_FILE}" <<'PY'
import sys
p = sys.argv[1]
lines = [l.strip() for l in open(p, encoding="utf-8").read().splitlines() if l.strip()]
if lines and lines[0].lower().startswith("address"):
    lines = lines[1:]
count = len(lines)
print(f"address_count={count}")
print("sqlite_param_risk=HIGH" if count > 999 else "sqlite_param_risk=LOW")
PY
else
  echo "No addresses.csv found in repository to estimate runtime parameter count."
fi
```
Verify each finding against the current code and only fix it if needed.
In `@packages/daemon/src/scripts/replay-balance.ts` around lines 117 - 127, The
query builds a variable number of "data LIKE ?" placeholders from
walletAddresses (addresses, conditions, likeParams) and calls
sqlite.prepare(...).all(...likeParams), which can exceed SQLite's bind-parameter
limit for large address sets; change replay logic to chunk addresses into
batches under the SQLite parameter limit (e.g., 999), run the SELECT in
sqlite.prepare(...).all(...) per chunk using the chunk's likeParams,
collect/merge the rows (deduplicating by id) and preserve ORDER BY id ASC, and
ensure this batching is implemented where the current query is constructed and
executed (see variables addresses, conditions, likeParams and the
sqlite.prepare(...).all call).
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/daemon/package.json`:
- Line 16: Add ts-node to devDependencies and update the package.json script
entries so they call the installed ts-node binary instead of using yarn dlx;
specifically, add "ts-node" to devDependencies and change the scripts
"replay-balance", "test_images_wait_for_db", "test_images_wait_for_ws", and
"test_images_setup_database" to invoke ts-node directly (e.g., replace "yarn dlx
ts-node src/scripts/replay-balance.ts" with "ts-node
src/scripts/replay-balance.ts") so the lockfile determines the ts-node version
and prevents dlx-driven version drift.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- packages/daemon/package.json
- packages/daemon/src/scripts/replay-balance.ts
- packages/event-downloader/package.json
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/event-downloader/package.json
- packages/daemon/src/scripts/replay-balance.ts
    "lint": "eslint .",
    "build": "tsc -b",
    "start": "node dist/index.js",
    "replay-balance": "yarn dlx ts-node src/scripts/replay-balance.ts",
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Checking for 'replay-balance' script entries ==="
rg -n '"replay-balance"\s*:' -g '**/package.json' || echo "No matches found"
echo ""
echo "=== Checking for 'ts-node' in package.json files ==="
rg -n '"ts-node"\s*:' -g '**/package.json' || echo "No matches found"
echo ""
echo "=== Checking for 'yarn dlx ts-node' usage ==="
rg -n 'yarn dlx ts-node' -g '**/package.json' || echo "No matches found"
echo ""
echo "=== Checking root package.json for ts-node ==="
if [ -f "package.json" ]; then
grep -n "ts-node" package.json || echo "ts-node not found in root package.json"
fi
echo ""
echo "=== Verifying replay-balance.ts script exists ==="
if [ -f "packages/daemon/src/scripts/replay-balance.ts" ]; then
echo "✓ Script file exists: packages/daemon/src/scripts/replay-balance.ts"
else
echo "✗ Script file NOT found at packages/daemon/src/scripts/replay-balance.ts"
fi
echo ""
echo "=== Checking if ts-node is in daemon package devDependencies ==="
if [ -f "packages/daemon/package.json" ]; then
grep -A 50 '"devDependencies"' packages/daemon/package.json | grep -n '"ts-node"' || echo "ts-node not in daemon devDependencies"
fi

Repository: HathorNetwork/hathor-wallet-service
Length of output: 1169
Add ts-node to devDependencies to avoid version drift across multiple scripts.
Using yarn dlx ts-node bypasses lockfile reproducibility and introduces version drift over time. This pattern affects multiple scripts in the package (lines 16, 22–24):
- replay-balance
- test_images_wait_for_db
- test_images_wait_for_ws
- test_images_setup_database
Add ts-node to devDependencies and update all four scripts to invoke it directly (e.g., ts-node src/scripts/replay-balance.ts instead of yarn dlx ts-node src/scripts/replay-balance.ts).
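A minimal sketch of the suggested package.json shape; the ts-node version shown is illustrative, not taken from the repo:

```json
{
  "devDependencies": {
    "ts-node": "10.9.2"
  },
  "scripts": {
    "replay-balance": "ts-node src/scripts/replay-balance.ts"
  }
}
```

With ts-node in devDependencies, the lockfile pins its version and `ts-node` resolves from `node_modules/.bin` when the script runs.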
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/daemon/package.json` at line 16, Add ts-node to devDependencies and
update the package.json script entries so they call the installed ts-node binary
instead of using yarn dlx; specifically, add "ts-node" to devDependencies and
change the scripts "replay-balance", "test_images_wait_for_db",
"test_images_wait_for_ws", and "test_images_setup_database" to invoke ts-node
directly (e.g., replace "yarn dlx ts-node src/scripts/replay-balance.ts" with
"ts-node src/scripts/replay-balance.ts") so the lockfile determines the ts-node
version and prevents dlx-driven version drift.
- Move better-sqlite3 to devDependency in daemon package
- Fix all dependency versions (remove ^ prefixes)
- Add --token parameter to replay-balance script
c8cb100 to 568292a
Actionable comments posted: 1
♻️ Duplicate comments (1)
packages/daemon/src/scripts/replay-balance.ts (1)
122-132: ⚠️ Potential issue | 🟠 Major: Handle SQLite bind-variable limits by chunking address filters.
Line 123/Line 132 scale placeholders with address count; large CSVs can exceed SQLite variable limits and fail the query. This was previously flagged and is still present.
💡 Proposed fix (chunk + dedupe by event id)
```diff
-  const addresses = Array.from(walletAddresses);
-  const conditions = addresses.map(() => 'data LIKE ?').join(' OR ');
-  const likeParams = addresses.map(addr => `%${addr}%`);
-
-  const rows = sqlite.prepare(`
-    SELECT id, type, data
-    FROM events
-    WHERE type IN ('NEW_VERTEX_ACCEPTED', 'VERTEX_METADATA_CHANGED')
-      AND (${conditions})
-    ORDER BY id ASC
-  `).all(...likeParams) as Array<{ id: number; type: string; data: string }>;
+  const addresses = Array.from(walletAddresses);
+  const MAX_SQL_PARAMS = 900;
+  const rowById = new Map<number, { id: number; type: string; data: string }>();
+
+  for (let i = 0; i < addresses.length; i += MAX_SQL_PARAMS) {
+    const chunk = addresses.slice(i, i + MAX_SQL_PARAMS);
+    const conditions = chunk.map(() => 'data LIKE ?').join(' OR ');
+    const likeParams = chunk.map(addr => `%${addr}%`);
+
+    const chunkRows = sqlite.prepare(`
+      SELECT id, type, data
+      FROM events
+      WHERE type IN ('NEW_VERTEX_ACCEPTED', 'VERTEX_METADATA_CHANGED')
+        AND (${conditions})
+      ORDER BY id ASC
+    `).all(...likeParams) as Array<{ id: number; type: string; data: string }>;
+
+    for (const row of chunkRows) rowById.set(row.id, row);
+  }
+
+  const rows = Array.from(rowById.values()).sort((a, b) => a.id - b.id);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/daemon/src/scripts/replay-balance.ts` around lines 122 - 132, The current query builds a single clause using addresses/conditions/likeParams and calls sqlite.prepare(...).all(...), which can exceed SQLite bind-variable limits for large address sets; modify replay-balance.ts to chunk the addresses array into batches (e.g., <= SQLITE_MAX_VARS like 900), for each chunk build its own conditions and likeParams and run sqlite.prepare(...).all, collect results into a Map keyed by event id to dedupe across chunks, then combine and return the deduped rows ordered by id (preserving the existing fields id, type, data); ensure you reference the existing variables/expressions addresses, conditions, likeParams and the sqlite.prepare(...).all call when implementing the chunking and dedupe logic.
🧹 Nitpick comments (1)
packages/daemon/src/scripts/replay-balance.ts (1)
119-135: Ensure SQLite handle is always closed with try/finally. If the query throws, Line 134 is skipped and the DB handle may stay open longer than intended.
♻️ Proposed cleanup pattern
```diff
-  const sqlite = new Database(opts.db, { readonly: true });
-
-  // Build the WHERE clause — one parameterized LIKE condition per address
-  const addresses = Array.from(walletAddresses);
-  const conditions = addresses.map(() => 'data LIKE ?').join(' OR ');
-  const likeParams = addresses.map(addr => `%${addr}%`);
-
-  const rows = sqlite.prepare(`
-    SELECT id, type, data
-    FROM events
-    WHERE type IN ('NEW_VERTEX_ACCEPTED', 'VERTEX_METADATA_CHANGED')
-      AND (${conditions})
-    ORDER BY id ASC
-  `).all(...likeParams) as Array<{ id: number; type: string; data: string }>;
-
-  sqlite.close();
+  const sqlite = new Database(opts.db, { readonly: true });
+  let rows: Array<{ id: number; type: string; data: string }>;
+  try {
+    // Build the WHERE clause — one parameterized LIKE condition per address
+    const addresses = Array.from(walletAddresses);
+    const conditions = addresses.map(() => 'data LIKE ?').join(' OR ');
+    const likeParams = addresses.map(addr => `%${addr}%`);
+
+    rows = sqlite.prepare(`
+      SELECT id, type, data
+      FROM events
+      WHERE type IN ('NEW_VERTEX_ACCEPTED', 'VERTEX_METADATA_CHANGED')
+        AND (${conditions})
+      ORDER BY id ASC
+    `).all(...likeParams) as Array<{ id: number; type: string; data: string }>;
+  } finally {
+    sqlite.close();
+  }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/daemon/src/scripts/replay-balance.ts` around lines 119 - 135, Wrap creation and use of the SQLite Database handle in a try/finally so the handle is always closed: move the new Database(opts.db, { readonly: true }) into a scoped block (where variable sqlite is created), execute the prepare(...).all(...) query inside the try, and call sqlite.close() in the finally block; update the code around the sqlite variable and the query that builds conditions/likeParams in replay-balance.ts so any thrown error still triggers sqlite.close() for the Database instance created with new Database(...).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/event-downloader/package.json`:
- Around line 10-12: The package.json currently declares "engines": {"node":
">=18"} but depends on `@types/node`@22.10.2; update package.json so runtime and
typings align by either pinning the devDependency "@types/node" to a Node
18-compatible major (e.g., ^18.x) or bumping the "engines.node" field to ">=22"
if you require Node 22 APIs—modify the "@types/node" entry or the "engines"
field accordingly and run a reinstall/type-check to verify no new type errors.
---
Duplicate comments:
In `@packages/daemon/src/scripts/replay-balance.ts`:
- Around line 122-132: The current query builds a single clause using
addresses/conditions/likeParams and calls sqlite.prepare(...).all(...), which
can exceed SQLite bind-variable limits for large address sets; modify
replay-balance.ts to chunk the addresses array into batches (e.g., <=
SQLITE_MAX_VARS like 900), for each chunk build its own conditions and
likeParams and run sqlite.prepare(...).all, collect results into a Map keyed by
event id to dedupe across chunks, then combine and return the deduped rows
ordered by id (preserving the existing fields id, type, data); ensure you
reference the existing variables/expressions addresses, conditions, likeParams
and the sqlite.prepare(...).all call when implementing the chunking and dedupe
logic.
---
Nitpick comments:
In `@packages/daemon/src/scripts/replay-balance.ts`:
- Around line 119-135: Wrap creation and use of the SQLite Database handle in a
try/finally so the handle is always closed: move the new Database(opts.db, {
readonly: true }) into a scoped block (where variable sqlite is created),
execute the prepare(...).all(...) query inside the try, and call sqlite.close()
in the finally block; update the code around the sqlite variable and the query
that builds conditions/likeParams in replay-balance.ts so any thrown error still
triggers sqlite.close() for the Database instance created with new
Database(...).
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
yarn.lock is excluded by !**/yarn.lock, !**/*.lock
📒 Files selected for processing (3)
- packages/daemon/package.json
- packages/daemon/src/scripts/replay-balance.ts
- packages/event-downloader/package.json
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/daemon/package.json
- Bump node engine to >=22 to align with @types/node@22.10.2
- Remove stale comment in event-parser.ts
46ffc9e
🧹 Nitpick comments (1)
packages/event-downloader/package.json (1)
18-20: Use caret constraint for peer dependency to prevent future major version breaking upgrades.
"@hathor/wallet-lib": ">=2.8.3" will accept any future major version without an upper bound. Change to "^2.8.3" to restrict to the major version 2 line, following semver conventions for peer dependencies.

Suggested change

```diff
 "peerDependencies": {
-  "@hathor/wallet-lib": ">=2.8.3"
+  "@hathor/wallet-lib": "^2.8.3"
 },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/event-downloader/package.json` around lines 18 - 20, Update the peerDependencies entry for "@hathor/wallet-lib" in package.json to use a caret range instead of an open-ended greater-or-equal constraint: replace the current "@hathor/wallet-lib": ">=2.8.3" value with "@hathor/wallet-lib": "^2.8.3" so the peer dependency is constrained to the v2 semver line while allowing patch/minor updates; ensure you update the peerDependencies object accordingly and run any dependency checks after changing the version string.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@packages/event-downloader/package.json`:
- Around line 18-20: Update the peerDependencies entry for "@hathor/wallet-lib"
in package.json to use a caret range instead of an open-ended greater-or-equal
constraint: replace the current "@hathor/wallet-lib": ">=2.8.3" value with
"@hathor/wallet-lib": "^2.8.3" so the peer dependency is constrained to the v2
semver line while allowing patch/minor updates; ensure you update the
peerDependencies object accordingly and run any dependency checks after changing
the version string.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 942b390b-16c0-43c0-8769-1a98f1d378b0
📒 Files selected for processing (2)
- packages/event-downloader/package.json
- packages/event-downloader/src/event-parser.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/event-downloader/src/event-parser.ts
Motivation
Debugging wallet balance discrepancies requires replaying all historical fullnode events for a given wallet to verify what the balance should be. This PR adds two tools to support that:
- packages/event-downloader — a standalone CLI that connects to a fullnode's WebSocket event stream (/v1a/event_ws) and downloads all events into a local SQLite database. Supports resumable downloads, parallel workers, and a live progress UI.
- packages/daemon/src/scripts/replay-balance.ts — a debug script that reads the SQLite database produced by the event downloader, replays all events touching a given set of wallet addresses using the daemon's own prepareOutputs/prepareInputs utilities, and computes the expected HTR balance. Useful for comparing against wallet_balance values stored in the service DB.

Acceptance Criteria
- event-downloader connects to a fullnode via WebSocket and downloads all events to SQLite
- replay-balance script correctly computes the HTR balance for a wallet from the downloaded events
- replay-balance accepts --expected flag to compare computed vs stored balance and highlights mismatches

Checklist
master, confirm this code is production-ready and can be included in future releases as soon as it gets mergedSummary by CodeRabbit
New Features
Chores
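As a rough illustration of the replay logic described in the Motivation section, here is a hypothetical, heavily simplified sketch. The real script uses the daemon's prepareOutputs/prepareInputs utilities and also has to account for voided transactions via VERTEX_METADATA_CHANGED events, which this sketch ignores; the TxIO/TxEvent shapes and the replayHtrBalance name are invented for illustration. '00' is the HTR token uid on Hathor.

```typescript
type TxIO = { address: string; value: bigint; token: string };
type TxEvent = { inputs: TxIO[]; outputs: TxIO[] };

const HTR_TOKEN = '00'; // HTR token uid

// Replays events in order: outputs to wallet addresses add to the
// balance, inputs spent from wallet addresses subtract from it.
function replayHtrBalance(events: TxEvent[], walletAddresses: Set<string>): bigint {
  let balance = 0n;
  for (const ev of events) {
    for (const out of ev.outputs) {
      if (out.token === HTR_TOKEN && walletAddresses.has(out.address)) {
        balance += out.value;
      }
    }
    for (const inp of ev.inputs) {
      if (inp.token === HTR_TOKEN && walletAddresses.has(inp.address)) {
        balance -= inp.value;
      }
    }
  }
  return balance;
}
```

Comparing this replayed figure against the wallet_balance row in the service DB is what the --expected flag automates.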