feat(daemon): add scheduled balance validation to monitoring layer#382
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 Walkthrough📝 Walkthrough🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
packages/daemon/__tests__/actors/MonitoringActor.test.ts (1)
385-424: Tests verify basic mismatch detection, but consider adding an edge case test.The current test uses a single address/token which doesn't exercise the index alignment issue flagged in the implementation review. Once the comparison logic is fixed to use map-based lookup, consider adding a test where:
fetchAddressBalancereturns entries with varyingtransactionscounts (some > 0, some = 0)- Verify that filtering doesn't break the address/token matching
This would protect against regression of the index-alignment bug.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/daemon/__tests__/actors/MonitoringActor.test.ts` around lines 385 - 424, Add a new edge-case unit test for MonitoringActor that exercises the index-alignment bug: call MonitoringActor(...) and send CONNECTED, then mock db.fetchAllDistinctAddresses, db.fetchAddressBalance and db.fetchAddressTxHistorySum to return multiple entries (multiple addresses and tokenIds) in different orders and include some entries with transactions === 0 and some > 0; after advancing timers and flushing promises assert that the comparison uses map-based lookup (not positional alignment) by verifying mockAddAlert is called with the correct totalMismatches count, that entries with transactions === 0 are ignored, and that fetchAddressBalance/fetchAddressTxHistorySum were called with the expected address list and that mockMysql.release is invoked.packages/daemon/src/actors/MonitoringActor.ts (1)
208-209: Consider comparing BigInt values directly to avoid precision loss.Converting
BigInttoNumbercan lose precision for values exceedingNumber.MAX_SAFE_INTEGER(2^53 - 1). While unlikely in typical usage, balance validation is a safety mechanism where correctness matters.♻️ Proposed fix to compare BigInt directly
- const balanceFromTable = Number(addressBalance.unlockedBalance + addressBalance.lockedBalance); - const balanceFromHistory = Number(addressTxHistorySum.balance); - - if (balanceFromTable !== balanceFromHistory) { + const balanceFromTable = addressBalance.unlockedBalance + addressBalance.lockedBalance; + const balanceFromHistory = addressTxHistorySum.balance; + + if (balanceFromTable !== balanceFromHistory) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/daemon/src/actors/MonitoringActor.ts` around lines 208 - 209, The current comparison converts BigInt values to Number (balanceFromTable and balanceFromHistory) losing precision; change the logic in MonitoringActor to compare BigInt directly by ensuring addressBalance.unlockedBalance, addressBalance.lockedBalance, and addressTxHistorySum.balance are treated as BigInt (or converted using BigInt(...) from strings) and compute balanceFromTable as addressBalance.unlockedBalance + addressBalance.lockedBalance (BigInt) and compare it to addressTxHistorySum.balance (BigInt) without using Number(), updating any type annotations or parsing where needed so the equality check uses BigInt arithmetic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/daemon/src/actors/MonitoringActor.ts`:
- Around line 182-218: The index-based comparison between
filteredAddressBalances and addressTxHistorySums is incorrect; update the logic
in MonitoringActor to build a lookup map keyed by address+tokenId from
addressTxHistorySums (e.g., key =
`${addressTxHistorySum.address}:${addressTxHistorySum.tokenId}`), then iterate
filteredAddressBalances and for each entry compute the same key, find the
matching addressTxHistorySum via the map, log and count a mismatch if no match
is found, validate tokenId and then compare balanceFromTable
(Number(addressBalance.unlockedBalance + addressBalance.lockedBalance)) against
Number(addressTxHistorySum.balance); remove reliance on array indices and ensure
all error logs include identifying fields like address and tokenId (use symbols
filteredAddressBalances, addressTxHistorySums, addressBalance,
addressTxHistorySum, totalMismatches).
---
Nitpick comments:
In `@packages/daemon/__tests__/actors/MonitoringActor.test.ts`:
- Around line 385-424: Add a new edge-case unit test for MonitoringActor that
exercises the index-alignment bug: call MonitoringActor(...) and send CONNECTED,
then mock db.fetchAllDistinctAddresses, db.fetchAddressBalance and
db.fetchAddressTxHistorySum to return multiple entries (multiple addresses and
tokenIds) in different orders and include some entries with transactions === 0
and some > 0; after advancing timers and flushing promises assert that the
comparison uses map-based lookup (not positional alignment) by verifying
mockAddAlert is called with the correct totalMismatches count, that entries with
transactions === 0 are ignored, and that
fetchAddressBalance/fetchAddressTxHistorySum were called with the expected
address list and that mockMysql.release is invoked.
In `@packages/daemon/src/actors/MonitoringActor.ts`:
- Around line 208-209: The current comparison converts BigInt values to Number
(balanceFromTable and balanceFromHistory) losing precision; change the logic in
MonitoringActor to compare BigInt directly by ensuring
addressBalance.unlockedBalance, addressBalance.lockedBalance, and
addressTxHistorySum.balance are treated as BigInt (or converted using
BigInt(...) from strings) and compute balanceFromTable as
addressBalance.unlockedBalance + addressBalance.lockedBalance (BigInt) and
compare it to addressTxHistorySum.balance (BigInt) without using Number(),
updating any type annotations or parsing where needed so the equality check uses
BigInt arithmetic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d6406e37-04d9-4b92-9b8a-ca95327eaf6f
📒 Files selected for processing (4)
packages/daemon/__tests__/actors/MonitoringActor.test.tspackages/daemon/src/actors/MonitoringActor.tspackages/daemon/src/config.tspackages/daemon/src/db/index.ts
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/daemon/__tests__/actors/MonitoringActor.test.ts`:
- Around line 406-415: The test over-specifies mockAddAlert arity:
MonitoringActor calls addAlert with 4 args but the assertions use
toHaveBeenCalledWith(..., expect.anything()) as a 5th arg; update the assertions
in MonitoringActor.test.ts (the expect(mockAddAlert).toHaveBeenCalledWith checks
around the "Balance validation found mismatches" case and the similar block at
lines ~453-459) to assert only the four actual arguments (title, message
matcher, Severity.MAJOR, and the expect.objectContaining payload) and remove the
extraneous expect.anything() so the arity matches the actual addAlert calls.
- Around line 474-476: The test assertion for mockLoggerError only checks the
first argument but the actual call logs both a message and an error object;
update the assertion in MonitoringActor.test.ts (the
expect(mockLoggerError).toHaveBeenCalledWith(...) line) to assert both
arguments, e.g. use expect.stringContaining('Balance validation error') as the
first arg and expect.any(Error) or expect.anything() as the second, so the call
signature (message, error) is matched correctly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 38733e00-1165-48db-af84-dd77c0417be3
📒 Files selected for processing (3)
packages/daemon/__tests__/actors/MonitoringActor.test.tspackages/daemon/src/actors/MonitoringActor.tspackages/daemon/src/config.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/daemon/src/config.ts
- packages/daemon/src/actors/MonitoringActor.ts
| expect(mockAddAlert).toHaveBeenCalledWith( | ||
| 'Balance validation found mismatches', | ||
| expect.stringContaining('1 balance mismatch'), | ||
| Severity.MAJOR, | ||
| expect.objectContaining({ | ||
| truncated: false, | ||
| samples: [mismatchRow], | ||
| }), | ||
| expect.anything(), | ||
| ); |
There was a problem hiding this comment.
Fix addAlert assertion arity (currently over-specified).
Line 414 and Line 458 require a 5th argument (expect.anything()), but MonitoringActor calls addAlert with 4 arguments in this path. This will fail even when behavior is correct.
Proposed fix
expect(mockAddAlert).toHaveBeenCalledWith(
'Balance validation found mismatches',
expect.stringContaining('1 balance mismatch'),
Severity.MAJOR,
expect.objectContaining({
truncated: false,
samples: [mismatchRow],
}),
- expect.anything(),
);
@@
expect(mockAddAlert).toHaveBeenCalledWith(
'Balance validation found mismatches',
expect.stringContaining('100+'),
Severity.MAJOR,
- expect.objectContaining({ truncated: true }),
- expect.anything(),
+ expect.objectContaining({ truncated: true }),
);Also applies to: 453-459
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/daemon/__tests__/actors/MonitoringActor.test.ts` around lines 406 -
415, The test over-specifies mockAddAlert arity: MonitoringActor calls addAlert
with 4 args but the assertions use toHaveBeenCalledWith(..., expect.anything())
as a 5th arg; update the assertions in MonitoringActor.test.ts (the
expect(mockAddAlert).toHaveBeenCalledWith checks around the "Balance validation
found mismatches" case and the similar block at lines ~453-459) to assert only
the four actual arguments (title, message matcher, Severity.MAJOR, and the
expect.objectContaining payload) and remove the extraneous expect.anything() so
the arity matches the actual addAlert calls.
There was a problem hiding this comment.
False positive. addAlert is a 5-argument function:
// packages/common/src/utils/alerting.utils.ts:18
export const addAlert = async (
title: string,
message: string,
severity: Severity,
metadata: unknown,
logger: Logger, // <-- 5th arg
): Promise<void> => { ... }MonitoringActor.ts:208-214 passes all five including logger at the end, so the test's expect.anything() as the 5th matcher is correctly matching the logger argument. Both tests pass with the current assertions.
There was a problem hiding this comment.
@andreabadesso, you're correct — I apologize for the false positive. The addAlert signature from @wallet-service/common does take a 5th logger argument, and MonitoringActor correctly passes it. The expect.anything() in both test assertions is a valid matcher for that logger parameter. The tests are properly specified.
✏️ Learnings added
Learnt from: andreabadesso
Repo: HathorNetwork/hathor-wallet-service PR: 382
File: packages/daemon/__tests__/actors/MonitoringActor.test.ts:406-415
Timestamp: 2026-04-14T14:36:00.500Z
Learning: In `packages/common/src/utils/alerting.utils.ts`, the `addAlert` function is a 5-argument async function with signature: `addAlert(title: string, message: string, severity: Severity, metadata: unknown, logger: Logger): Promise<void>`. Callers in the daemon (e.g., `MonitoringActor.ts`) always pass all 5 arguments including the `logger` instance as the last parameter. Test assertions using `toHaveBeenCalledWith(..., expect.anything())` as the 5th matcher are correctly matching the `logger` argument and should not be flagged as over-specified.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: andreabadesso
Repo: HathorNetwork/hathor-wallet-service PR: 0
File: :0-0
Timestamp: 2026-04-09T20:12:33.128Z
Learning: In packages/daemon/src/tracing.ts, the OTel SDK shutdown handler intentionally uses `catch {` (no binding) and logs only a fixed-text `console.warn` message. Raw error objects (including `err.message` and `err.name`) must NOT be logged because `AggregateError`'s `.name` contains the substring "Error", which would match the log-based alert regex. The accepted trade-off is losing shutdown-flush diagnostics in exchange for clean, alert-free termination.
| expect(mockLoggerError).toHaveBeenCalledWith( | ||
| expect.stringContaining('Balance validation error'), | ||
| ); |
There was a problem hiding this comment.
Error-log assertion misses the second argument.
Line 474 expects a single-argument logger.error call, but this code path logs message + error object. Match both args to avoid false failures.
Proposed fix
expect(mockLoggerError).toHaveBeenCalledWith(
- expect.stringContaining('Balance validation error'),
+ expect.stringContaining('Balance validation error'),
+ expect.any(Error),
);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/daemon/__tests__/actors/MonitoringActor.test.ts` around lines 474 -
476, The test assertion for mockLoggerError only checks the first argument but
the actual call logs both a message and an error object; update the assertion in
MonitoringActor.test.ts (the expect(mockLoggerError).toHaveBeenCalledWith(...)
line) to assert both arguments, e.g. use expect.stringContaining('Balance
validation error') as the first arg and expect.any(Error) or expect.anything()
as the second, so the call signature (message, error) is matched correctly.
There was a problem hiding this comment.
False positive. The error path logs a single interpolated string, not (message, error):
// MonitoringActor.ts:217
} catch (err) {
logger.error(`[monitoring] Balance validation error: ${err}`);
}There's no second argument, so the test's single expect.stringContaining(...) is correct. Test passes as-is.
There was a problem hiding this comment.
@andreabadesso, you're right — my apologies for the noise. Since the catch block uses a single template-literal string (\[monitoring] Balance validation error: ${err}``), there is no second argument to assert, and the existing single-matcher assertion is correct. I'll note this pattern to avoid flagging it again.
✏️ Learnings added
Learnt from: andreabadesso
Repo: HathorNetwork/hathor-wallet-service PR: 382
File: packages/daemon/__tests__/actors/MonitoringActor.test.ts:474-476
Timestamp: 2026-04-14T14:36:04.345Z
Learning: In `packages/daemon/src/actors/MonitoringActor.ts`, the balance validation catch block logs errors as a single interpolated string: `logger.error(\`[monitoring] Balance validation error: ${err}\`)`. There is no second argument passed to `logger.error`, so test assertions should use only one `expect.stringContaining(...)` matcher — not a second `expect.any(Error)` argument.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: andreabadesso
Repo: HathorNetwork/hathor-wallet-service PR: 0
File: :0-0
Timestamp: 2026-04-09T20:12:33.128Z
Learning: In packages/daemon/src/tracing.ts, the OTel SDK shutdown handler intentionally uses `catch {` (no binding) and logs only a fixed-text `console.warn` message. Raw error objects (including `err.message` and `err.name`) must NOT be logged because `AggregateError`'s `.name` contains the substring "Error", which would match the log-based alert regex. The accepted trade-off is losing shutdown-flush diagnostics in exchange for clean, alert-free termination.
Adds an optional periodic safety-net check that compares every row in address_balance against the sum of its non-voided address_tx_history rows, firing a MAJOR alert on any mismatch. This complements #376, which made per-void validation skippable for performance — the scheduled check closes the loop by guaranteeing full-database consistency is still verified on an operator-tunable cadence. The check is a single SQL LEFT JOIN query bounded by LIMIT 100, running on a configurable interval while CONNECTED: BALANCE_VALIDATION_ENABLED default: false BALANCE_VALIDATION_INTERVAL_MS default: 600000 (10 min) Letting the database do the pairing (JOIN), the math (native BIGINT, no JS Number coercion), and the consistency snapshot (single statement = one InnoDB read view) keeps the actor logic to ~30 lines and avoids a family of correctness traps that come with paginated pairing in application code — index desync when filters diverge, precision loss on large token supplies, and races between separate balance/history queries. An in-flight guard skips overlapping ticks so a slow run never stacks connections. Errors log and move on — the daemon never crashes on validation failure. DISCONNECTED clears the interval; any in-flight SELECT completes and releases its connection harmlessly. If mismatches are found, the alert payload carries up to 100 sample rows and a `truncated` flag so operators know whether to run their own unbounded query to assess full extent. Tests: 7 new cases covering enable/disable, start/stop lifecycle, mismatch detection, no-mismatch path, truncation, DB error handling, and idle/stuck/storm regression (22 existing cases still pass). Depends on #369. Relates to #380. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
52a07ce to
b2a88d9
Compare
There was a problem hiding this comment.
Pull request overview
Adds a scheduled (timer-driven) address balance consistency validator to the daemon’s monitoring layer so balance checks can be decoupled from the void hot path while still surfacing inconsistencies via logs + MAJOR alerts.
Changes:
- Added new monitoring config flags for enabling scheduled balance validation and setting its interval.
- Extended
MonitoringActorto start/stop a periodic DB validation job on CONNECTED/DISCONNECTED lifecycle events. - Added unit tests covering the new validation job behavior and alerting paths.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
packages/daemon/src/config.ts |
Introduces env-driven config for scheduled balance validation (enabled + interval). |
packages/daemon/src/actors/MonitoringActor.ts |
Implements periodic DB query for detecting balance/history mismatches, wired to monitoring lifecycle, with alerting + error handling. |
packages/daemon/__tests__/actors/MonitoringActor.test.ts |
Adds tests for the new scheduled validation behavior, including alerting and error resilience. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/daemon/src/actors/MonitoringActor.ts (1)
195-229: Improve type safety by usingMysqlConnectioninstead ofanyfor the pooled connection.
getDbConnection()returnsPromise<MysqlConnection>(frompackages/daemon/src/db/index.ts), so you can drop both eslint-disables and the(mysql as any).release()cast:♻️ Suggested typing
-import { getDbConnection } from '../db'; +import { getDbConnection, MysqlConnection } from '../db'; @@ - const runBalanceValidation = async () => { - // eslint-disable-next-line `@typescript-eslint/no-explicit-any` - let mysql: any; + const runBalanceValidation = async () => { + let mysql: MysqlConnection | undefined; try { mysql = await getDbConnection(); const [rows] = await mysql.query(BALANCE_VALIDATION_SQL); - // eslint-disable-next-line `@typescript-eslint/no-explicit-any` - const samples = rows as any[]; + const samples = rows as Array<{ + address: string; + tokenId: string; + balanceSum: string | number; + historySum: string | number; + }>; @@ - if (mysql) { - try { - // eslint-disable-next-line `@typescript-eslint/no-explicit-any` - (mysql as any).release(); - } catch (releaseErr) { + if (mysql) { + try { + mysql.release(); + } catch (releaseErr) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/daemon/src/actors/MonitoringActor.ts` around lines 195 - 229, The current runBalanceValidation function uses untyped "any" for the pooled connection and casts on release; change the mysql variable to type MysqlConnection (import the type if needed) instead of any, remove the two eslint-disable comments and the (mysql as any).release() cast, and call mysql.release() directly in the finally block; keep all existing logic around BALANCE_VALIDATION_SQL, BALANCE_VALIDATION_SAMPLE_LIMIT, getDbConnection, addAlert, Severity and logger unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/daemon/src/actors/MonitoringActor.ts`:
- Around line 175-214: The BALANCE_VALIDATION_SQL returns BIGINTs that get
decoded as JS Numbers (losing precision); modify BALANCE_VALIDATION_SQL so the
SELECTed balanceSum and historySum are CAST to CHAR (string) for
transport/logging while keeping the HAVING comparison on the numeric SUM/SIGNED
expressions (e.g. leave the HAVING clause using the numeric expressions like
CAST(... AS SIGNED) or the raw SUM(...) but change the aliases used in the
SELECT to CAST(... AS CHAR)); update references to samples (used by logger.error
and addAlert) will then contain string values preserving full precision. Target
symbols: BALANCE_VALIDATION_SQL, balanceSum, historySum, logger.error, addAlert.
---
Nitpick comments:
In `@packages/daemon/src/actors/MonitoringActor.ts`:
- Around line 195-229: The current runBalanceValidation function uses untyped
"any" for the pooled connection and casts on release; change the mysql variable
to type MysqlConnection (import the type if needed) instead of any, remove the
two eslint-disable comments and the (mysql as any).release() cast, and call
mysql.release() directly in the finally block; keep all existing logic around
BALANCE_VALIDATION_SQL, BALANCE_VALIDATION_SAMPLE_LIMIT, getDbConnection,
addAlert, Severity and logger unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a851c070-4a7c-4871-bf82-41cd4b901e07
📒 Files selected for processing (3)
packages/daemon/__tests__/actors/MonitoringActor.test.tspackages/daemon/src/actors/MonitoringActor.tspackages/daemon/src/config.ts
✅ Files skipped from review due to trivial changes (1)
- packages/daemon/src/config.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/daemon/tests/actors/MonitoringActor.test.ts
Four related changes addressing @luislhl + Copilot comments: - Guard BALANCE_VALIDATION_INTERVAL_MS against NaN / below-minimum values. `parseInt('abc', 10)` yields NaN, and `setInterval(fn, NaN)` behaves like delay=0 — a tight loop hammering the DB. On invalid input, log a loud error and leave the scheduler disabled for the session rather than silently substituting a default (operators should see and fix the misconfig). - Log errors with `err instanceof Error ? (err.stack ?? err.message) : String(err)` instead of template-interpolating `${err}`. The old form collapsed non-Error throws to `[object Object]` and dropped stack traces entirely. - Expose the sample cap as `BALANCE_VALIDATION_SAMPLE_LIMIT` (default 100). Naming reflects what it actually is — a result-row cap for the alert payload, not a batch size. The query is still a full-table scan per tick; a follow-up for cursor-based batching is out of scope. - Drop `WHERE ab.transactions > 0` from the validation SQL. The LEFT JOIN + COALESCE(SUM, 0) handles empty history correctly, and the filter was hiding a real bug class: rows with `transactions = 0` AND non-zero balance (void cleanup should have deleted them). Genuinely empty rows still match `historySum = 0` and get filtered by HAVING, so no false positives. Code comment above the SQL also warns that LIMIT bounds the result payload, not execution cost — operators should EXPLAIN this against production-sized data before enabling BALANCE_VALIDATION_ENABLED=true. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/daemon/src/actors/MonitoringActor.ts`:
- Around line 187-202: The BALANCE_VALIDATION_SAMPLE_LIMIT value parsed into
config can be NaN, zero, or negative and is directly interpolated into the
BALANCE_VALIDATION_SQL LIMIT clause via sampleLimit in startBalanceValidation,
which can cause SQL errors or silent no-op checks; add a runtime guard in
startBalanceValidation (or where sampleLimit is used) to coerce/validate the
value (e.g., ensure Number.isFinite(sampleLimit) and sampleLimit > 0, otherwise
fall back to 100 or config default) before embedding it into the SQL, and
reference the config symbol BALANCE_VALIDATION_SAMPLE_LIMIT and the function
startBalanceValidation/BALANCE_VALIDATION_SQL when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e7f0f52c-65b5-4a4f-9ea9-cfcdeecf4132
📒 Files selected for processing (3)
packages/daemon/__tests__/actors/MonitoringActor.test.tspackages/daemon/src/actors/MonitoringActor.tspackages/daemon/src/config.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/daemon/src/config.ts
Benchmarked the full-table query on production data (≈1.5M address_balance rows, ≈8.3M address_tx_history rows) and confirmed what @luislhl suspected: LIMIT bounds the result payload, not the execution cost. A full-table tick takes tens of seconds because the LEFT JOIN + GROUP BY touches every row. Scoping the outer set by `ab.updated_at > NOW() - INTERVAL :window SECOND` uses the existing `address_balance_updated_at_idx` to restrict the tick to recently changed rows. updated_at is `ON UPDATE CURRENT_TIMESTAMP` at the schema level, so every row write bumps it — correctness of the scope is structural, not app-managed. Added `BALANCE_VALIDATION_WINDOW_MS` config (default 15 min) with 50% slack over the default 10 min tick so a late tick can't miss a row. Trade-off this does NOT solve — hot addresses: Scoping limits WHICH addresses we check per tick; it does NOT limit HOW MUCH history we sum per address. address_tx_history has no covering index that includes `balance`, so MySQL fetches every non-voided history row for each recently-changed address via a PK scan on `address`. For whale addresses with hundreds of thousands of history rows, a single tick where that address shows up pays a multi-second per-address cost. EXPLAIN ANALYZE confirmed: optimizer estimated 5 rows per address, reality was ~800K rows per address (the long tail). Mitigation is a covering index (address, voided, token_id, balance) on address_tx_history — tracked as follow-up #404 (retargeted from cursor-batching to covering-index). Until that index ships, keep BALANCE_VALIDATION_ENABLED=false in prod. Long-tail drift on cold rows (never updated since the bug introduced it) is out of scope for the scheduled tick — a separate full-table sweep at a lower cadence is the right mechanism for that. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (2)
packages/daemon/__tests__/actors/MonitoringActor.test.ts (1)
343-513: LGTM — balance validation tests cover happy path, truncation, no-mismatch, DB error, and config guards.Coverage is thorough: conditional scheduling, interval count on
CONNECTED/DISCONNECTED, SQL shape (bothLEFT JOINand the load-bearingupdated_atfilter), alert payload w/truncatedflag, release on success, error branch, and both invalid-interval cases (NaN and sub-minimum). TheaddAlert5-arg assertion (includingexpect.anything()for thelogger) and the single-arglogger.errormatcher are correct for the current implementation.Nit:
jest.spyOn(logger, 'error')at lines 469, 487, and 503 attaches fresh spies per test without explicitmockRestore().jest.clearAllMocks()inbeforeEachclears call history but doesn't unwrap the spies. Not a functional bug since the spies wrap the sameloggerinstance, but a singleconst mockLoggerError = jest.spyOn(logger, 'error')hoisted into thedescribe'sbeforeEach(withmockRestoreinafterEach) would be cleaner.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/daemon/__tests__/actors/MonitoringActor.test.ts` around lines 343 - 513, The tests create fresh spies on logger.error inside multiple it blocks which aren't restored; to clean this up hoist a single jest.spyOn(logger, 'error') into the describe-level setup (e.g., create the spy in the describe's beforeEach) and call mockRestore in an afterEach to unwrap it (or restore the spy explicitly), while keeping jest.clearAllMocks() for call-history cleanup; reference the existing jest.spyOn(logger, 'error') calls and the tests around beforeEach/afterEach so you replace per-test spy creation with a single managed spy lifecycle.packages/daemon/src/actors/MonitoringActor.ts (1)
205-222: SQL template is computed once at actor init — acceptable but worth confirming.
BALANCE_VALIDATION_SQL,windowSeconds, andsampleLimitare all captured fromconfigat actor construction time. If operators ever mutateconfig.BALANCE_VALIDATION_WINDOW_MSorBALANCE_VALIDATION_SAMPLE_LIMITat runtime (some tests do this to seed values before instantiation), changes won't be picked up for the life of the actor. This matches the pre-existing pattern foridleTimeoutMs/stuckTimeoutMs/stormThresholdon lines 47–50, so it's internally consistent.Also note: if
BALANCE_VALIDATION_WINDOW_MSis misconfigured (NaN, or < 1000),windowSecondsbecomesNaN/0, which embeds into the SQL asINTERVAL NaN SECOND(errors every tick, caught by try/catch but noisy) orINTERVAL 0 SECOND(silently returns zero rows — false "all clear"). The existingBALANCE_VALIDATION_INTERVAL_MSruntime guard at lines 280–287 would be a natural place to extend withNumber.isFinite(windowSeconds) && windowSeconds > 0if you want symmetric fail-loud behavior. Low priority givenENABLED=falsedefault in prod.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/daemon/src/actors/MonitoringActor.ts` around lines 205 - 222, The SQL template BALANCE_VALIDATION_SQL is built once using windowSeconds and sampleLimit captured from config at actor construction, so runtime mutations or bad config values (NaN, <=0) won't be reflected and can produce invalid or silent SQL behavior; fix by validating and/or computing these values at runtime before each use: ensure windowSeconds = Math.floor(Number(config.BALANCE_VALIDATION_WINDOW_MS) / 1000) is finite and >0 and sampleLimit is a positive finite integer (throw or log+disable if invalid), and move the SQL string construction or interpolation into the periodic validation path (the same place that enforces BALANCE_VALIDATION_INTERVAL_MS) so that changes to config are picked up and invalid values are rejected early to avoid embedding INTERVAL NaN/0 in queries.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@packages/daemon/__tests__/actors/MonitoringActor.test.ts`:
- Around line 343-513: The tests create fresh spies on logger.error inside
multiple it blocks which aren't restored; to clean this up hoist a single
jest.spyOn(logger, 'error') into the describe-level setup (e.g., create the spy
in the describe's beforeEach) and call mockRestore in an afterEach to unwrap it
(or restore the spy explicitly), while keeping jest.clearAllMocks() for
call-history cleanup; reference the existing jest.spyOn(logger, 'error') calls
and the tests around beforeEach/afterEach so you replace per-test spy creation
with a single managed spy lifecycle.
In `@packages/daemon/src/actors/MonitoringActor.ts`:
- Around line 205-222: The SQL template BALANCE_VALIDATION_SQL is built once
using windowSeconds and sampleLimit captured from config at actor construction,
so runtime mutations or bad config values (NaN, <=0) won't be reflected and can
produce invalid or silent SQL behavior; fix by validating and/or computing these
values at runtime before each use: ensure windowSeconds =
Math.floor(Number(config.BALANCE_VALIDATION_WINDOW_MS) / 1000) is finite and >0
and sampleLimit is a positive finite integer (throw or log+disable if invalid),
and move the SQL string construction or interpolation into the periodic
validation path (the same place that enforces BALANCE_VALIDATION_INTERVAL_MS) so
that changes to config are picked up and invalid values are rejected early to
avoid embedding INTERVAL NaN/0 in queries.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4efa4483-9848-42e0-b41b-471840869b97
📒 Files selected for processing (3)
packages/daemon/__tests__/actors/MonitoringActor.test.tspackages/daemon/src/actors/MonitoringActor.tspackages/daemon/src/config.ts
… in prod The existing comment implied that flipping BALANCE_VALIDATION_ENABLED=true was a goal (just pending #404). That's not the plan — the validator is intended for ad-hoc / on-demand use (local, testnet, manual) and the feature flag is designed to stay false in production. Rephrase the comment so the next reader doesn't assume there's a path to "turn this on" once #404 lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two small correctness items from the PR review:
- Wrap the BIGINT sums in CAST(... AS CHAR) so `balanceSum` and
`historySum` transport to the client as decimal strings instead of
JS Numbers. mysql2 returns BIGINT as Number by default, which loses
precision above 2^53. HTR values are well below that, but the alert
payload is read by humans and tooling — keeping the decimal string
form avoids any silent rounding if the validator is ever run
against a non-HTR token. HAVING still compares the same CHAR
expressions, which for canonical CAST(SIGNED AS CHAR) output is
equivalent to numeric equality.
- Add the same "fail loud, stay disabled" guard on
BALANCE_VALIDATION_SAMPLE_LIMIT that INTERVAL_MS already has.
sampleLimit is interpolated directly into `LIMIT ${sampleLimit}`:
NaN produces a SQL syntax error, 0/negative returns no rows and
silently claims "no mismatches" without actually checking. The
guard logs a loud error and leaves the scheduler disabled rather
than pretending everything is fine.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…tion-monitoring # Conflicts: # packages/daemon/src/config.ts
5318965

Motivation
PR #376 adds
VALIDATE_ADDRESS_BALANCESenv var to skip per-void balance validation, saving ~2s per void (~39% of void time). Disabling it removes a safety net. This PR adds a scheduled validation job to the monitoring layer (#369) so we get fast voids + periodic consistency checks across the entire database.Acceptance Criteria
MonitoringActorwith periodic address balance validation (new responsibility alongside idle/stuck/storm detection)address_balancerows againstaddress_tx_historysums in configurable batchesBALANCE_VALIDATION_ENABLED(defaultfalse),BALANCE_VALIDATION_INTERVAL_MS(default 10min),BALANCE_VALIDATION_BATCH_SIZE(default 5000)MAJORseverity alerts; never crashes the daemon on failureMonitoringActor.test.tsDepends on #369. Relates to #380.
Checklist
master, confirm this code is production-ready and can be included in future releases as soon as it gets mergedSummary by CodeRabbit
New Features
Behavior & Reliability
Tests