Skip to content

fix: Auth token race conditions in start()#1058

Open
tuliomir wants to merge 9 commits intomasterfrom
fix/auth-token-race-conditions
Open

fix: Auth token race conditions in start()#1058
tuliomir wants to merge 9 commits intomasterfrom
fix/auth-token-race-conditions

Conversation

@tuliomir
Copy link
Copy Markdown
Contributor

@tuliomir tuliomir commented Apr 7, 2026

Summary

Eliminates race conditions in the wallet-service start() auth token flow that caused flaky 403 errors in CI integration tests. Replaces the dual fire-and-forget promise pattern with interceptor-driven on-demand auth, adds timeout bounds and transient error tolerance to polling, and fixes a logical contradiction in startReadOnly().

Acceptance Criteria

  • Wallet-service integration tests (start.test.ts) pass consistently without 403 flakes
  • pollForWalletStatus times out after 60s instead of hanging indefinitely

Why we're refactoring the auth token flow in start()

The problem in one sentence

The wallet service start() method launches auth token renewal as background promises that race with wallet status polling, causing flaky 403 errors in CI.

What's happening today

When HathorWalletServiceWallet.start() runs, it needs to do two things:

  1. Create the wallet on the wallet-service backend (POST /wallet/init)
  2. Get an auth token so subsequent API calls are authenticated

These two operations have an ordering constraint: the token endpoint requires the wallet to already exist. For a brand-new wallet, the first token request will always fail because the wallet hasn't been created yet.

The current code tries to be clever about this by launching token renewal in the background while the wallet is being created, hoping it will finish by the time we need it. Here's the simplified flow:

                          start()
                            |
            +---------------+---------------+
            |                               |
  renewPromise = renewToken()     await createWallet()
  (fire-and-forget, may fail)        |
            |                        |
            +-------+----------------+
                    |
              await renewPromise
              (if it failed...)
                    |
            renewPromise2 = renewToken()   <-- another fire-and-forget
                    |
              handleCreate(status, renewPromise2)
                    |
              if status == 'creating':
                  await pollForWalletStatus()  <-- needs auth token!
                  |                               but renewPromise2
                  |                               hasn't been awaited yet
              await renewPromise2              <-- too late, polling already needed it
                  |
              onWalletReady()

The race condition

pollForWalletStatus() fires a setInterval that calls getWalletStatus() every second. Each call needs an auth token. But renewPromise2 (the token renewal) is running concurrently in the background. There are three possible outcomes:

  1. renewPromise2 finishes first -- polling finds a valid token. Things work.
  2. Polling fires before renewPromise2 completes -- validateAndRenewAuthToken() is called again from inside the polling tick, racing with renewPromise2. Both try to set this.authToken. Usually one wins and things work.
  3. renewPromise2 fails silently -- renewAuthToken() has a catch block that sets this.authToken = null without throwing. If this happens between another operation reading the token and using it, the request goes out with no token. The wallet-service returns 403 Forbidden.

Outcome 3 is the flaky failure we see in CI. It depends on exact timing between the polling interval, the token endpoint response time, and JavaScript's microtask scheduling. It's not reproducible on demand, but it happens often enough to block CI.

Additional problems

pollForWalletStatus can run forever. It uses setInterval with no maximum attempt count. If the wallet-service never returns 'ready', the test (or app) hangs indefinitely.

setInterval callbacks can stack. If getWalletStatus() takes longer than the 1-second interval, multiple callbacks run concurrently, each trying to renew the auth token simultaneously.

setInterval async callbacks swallow errors. When getWalletStatus() throws inside the setInterval callback, the rejection is unhandled — the wrapping Promise never settles, and in Node.js 15+ this can terminate the process.

startReadOnly() has a logical contradiction. When the read-only token request fails, the catch block calls getWalletStatus() -- which requires an auth token we just failed to obtain.

walletId is overwritten 4+ times during startup (lines 455, 476, 526, 487), creating a window where concurrent token renewals might sign with a stale value.

What we're changing

The new start() flow

Replace the concurrent fire-and-forget pattern with interceptor-driven on-demand auth:

                    start()
                      |
              generate keys (unchanged)
                      |
              await createWallet()          <-- no auth needed
                      |
              this.walletId = response.walletId   (set ONCE)
              this.authPrivKey = derived key       (set for interceptor)
                      |
              if (!waitReady) return        (early exit, unchanged)
                      |
              if status == 'creating':
                  await pollForWalletStatus()
                  |    └─ getWalletStatus() goes through axios interceptor
                  |       └─ interceptor calls validateAndRenewAuthToken()
                  |          └─ finds authPrivKey in memory, obtains token
                  |       └─ sets Authorization header, fires request
              else if status != 'ready':
                  throw error
                      |
              await onWalletReady()         <-- also goes through interceptor
                      |
              clearSensitiveData()          <-- NOW clears authPrivKey

Why this is better:

  • No race conditions. Auth token is obtained on-demand by the axios interceptor (walletServiceAxios.ts) before each authenticated request. By the time pollForWalletStatus runs, both walletId and authPrivKey are in memory, so the interceptor can obtain a token without the PIN.
  • No fire-and-forget promises. Every async operation is awaited.
  • No closure-captured error variables. The flow is linear; errors propagate naturally through try/catch.
  • walletId set once. No redundant overwrites.
  • Token renewal is self-healing. Each poll call goes through the interceptor, which checks validateJWTExpireDate(). If the token is near expiry, it gets renewed before the request fires — no special handling needed for long polling windows.

"But won't this be slower?" The preemptive renewal (the first fire-and-forget) always failed for new wallets because the wallet didn't exist yet. We were paying the cost of a failing HTTP request for zero benefit. For existing wallets (the WALLET_ALREADY_LOADED path), the interceptor obtains the token on the first API call, which adds one sequential round-trip. This is a small, acceptable cost for correctness.

The new pollForWalletStatus()

Replace setInterval with a bounded for loop with transient error tolerance:

async pollForWalletStatus(): Promise<void> {
  let lastError: Error | null = null;
  for (let attempt = 0; attempt < MAX_POLL_ATTEMPTS; attempt++) {
    let data;
    try {
      data = await walletApi.getWalletStatus(this);
    } catch (err) {
      if (!(err instanceof WalletRequestError)) throw err;
      // Transient server/auth error — retry
      lastError = err;
      await delay(POLLING_INTERVAL);
      continue;
    }

    lastError = null;
    if (data.status.status === 'ready') return;
    if (data.status.status !== 'creating') throw new WalletRequestError(...);

    await delay(POLLING_INTERVAL);
  }
  throw new WalletRequestError('Wallet status polling timed out', { cause: lastError });
}
  • Sequential: only one getWalletStatus in-flight at a time.
  • Bounded: times out after 60 seconds (configurable via constant).
  • No leaked intervals: no setInterval to clean up.
  • Transient error tolerance: WalletRequestError (server responded but said no) is retried; non-WalletRequestError (programming errors) fails fast. Permanent wallet error states ('error') also fail fast — the status check is outside the try-catch.

The new startReadOnly()

Replace the catch block's getWalletStatus() (which needs auth) with a retry loop on getReadOnlyAuthToken() (which doesn't need auth), with error classification:

async startReadOnly() {
  // ...
  let lastError: Error | null = null;
  for (let attempt = 0; attempt < MAX_POLL_ATTEMPTS; attempt++) {
    try {
      await this.getReadOnlyAuthToken();
      lastError = null;
      break;
    } catch (err) {
      if (!(err instanceof WalletRequestError)) throw err;
      // Only retry on HTTP 400 (wallet still creating).
      // Other status codes (401, 403, etc.) are permanent failures.
      const cause = err.cause as { status?: number } | undefined;
      if (cause?.status && cause.status !== 400) throw err;
      lastError = err;
      await delay(POLLING_INTERVAL);
    }
  }
  if (lastError) throw new WalletRequestError('Read-only wallet startup timed out', { cause: lastError });
  await this.onWalletReady(skipAddressFetch);
}

The RO token endpoint returns 400 when the wallet is still 'creating'. We retry only that case. Non-400 HTTP errors (401, 403, 404) fail fast as permanent failures. Non-WalletRequestError (network/transport) also fails fast.

renewAuthToken now throws on failure

The old renewAuthToken() silently caught all errors and set this.authToken = null. This was the root cause of the 403 race condition — a fire-and-forget renewal could null the token between another operation reading it and using it. The new code removes the try-catch entirely, letting errors propagate to the caller. All callers now await the result, so the non-throwing contract is no longer needed.

Await fire-and-forget in validateAndRenewAuthToken

The else if (usePassword) branch at line 1191 calls renewAuthToken() without await. This is a fire-and-forget that can silently null a valid token if the renewal fails. We add await to make it blocking. The latency cost is negligible since this branch only fires when the token is already valid (opportunistic refresh).

What we're NOT changing

  • Public API is unchanged. start(), startReadOnly(), pollForWalletStatus(), validateAndRenewAuthToken() keep the same signatures and return types.
  • axiosInstance interceptor stays. The interceptor in walletServiceAxios.ts that calls validateAndRenewAuthToken() before authenticated requests is the mechanism we now rely on for startup auth — it was already doing this for all other API calls.

Test coverage

We're adding unit tests for every edge case in the refactored flow:

Scenario What we verify
New wallet, status 'ready' Token obtained before onWalletReady
New wallet, status 'creating' Polling completes, wallet reaches ready
Status 'error' WalletRequestError thrown immediately (not retried)
Token renewal fails Error propagates (not swallowed)
waitReady=false No token renewal, no polling
start() does not call validateAndRenewAuthToken Auth delegated to interceptor
Poll returns 'ready' first try Resolves immediately
Poll returns 'creating' then 'ready' Resolves after retry
Poll exceeds max attempts Timeout error
Poll callbacks don't stack Only one getWalletStatus in-flight
Poll transient error then recovery WalletRequestError retried, then succeeds
Poll non-WalletRequestError Propagates immediately (fail-fast)
Poll persistent transient errors Timeout with last error as cause
startReadOnly with ready wallet Token on first try
startReadOnly with creating wallet Token after retries
startReadOnly timeout WalletRequestError with cause chain
startReadOnly permanent error (non-400) Fails fast, no retry
startReadOnly never calls getWalletStatus Authenticated endpoint not used

Summary by CodeRabbit

  • Tests

    • Added an integration test verifying wallet initialization, readiness, auth token, and address availability.
    • Expanded wallet tests for status polling, creation, error, timeout, retry semantics, and sequential polling behavior.
    • Revised read-only startup tests to cover auth-token retry logic, timeout behavior, failure gating, and address-fetch variations.
    • Added tests for auth-token renewal and validation ordering.
  • Refactor

    • Made startup and read-only flows deterministic with bounded retries and explicit timeouts; adjusted auth-token renewal and validation sequencing.

Replace the dual fire-and-forget promise pattern with a
sequential flow: createWallet → validateAndRenewAuthToken
→ pollForWalletStatus. This eliminates the race between
renewPromise2 and polling that caused flaky 403 errors.

Changes:
- start(): remove renewPromise/renewPromise2, handleCreate
  closure, and closure-captured error variables. Token
  renewal is now a single awaited call after createWallet.
- pollForWalletStatus(): replace setInterval with bounded
  while loop (max 60 attempts). Prevents callback stacking
  and adds timeout.
- startReadOnly(): replace getWalletStatus fallback (needed
  auth) with retry loop on getReadOnlyAuthToken (no auth).
- validateAndRenewAuthToken(): await the previously
  fire-and-forget renewAuthToken in else-if(usePassword).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tuliomir tuliomir self-assigned this Apr 7, 2026
@tuliomir tuliomir added the bug Something isn't working label Apr 7, 2026
@tuliomir tuliomir moved this from Todo to In Progress (WIP) in Hathor Network Apr 7, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 7, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 906b964a-4a1b-41b7-bf64-8a95a56e7478

📥 Commits

Reviewing files that changed from the base of the PR and between c9ceeca and 85e369e.

📒 Files selected for processing (5)
  • __tests__/integration/service-specific/start.test.ts
  • __tests__/wallet/readOnlyWallet.test.ts
  • __tests__/wallet/wallet.test.ts
  • src/wallet/api/walletApi.ts
  • src/wallet/wallet.ts
✅ Files skipped from review due to trivial changes (1)
  • src/wallet/wallet.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/integration/service-specific/start.test.ts

📝 Walkthrough

Walkthrough

Start/read-only startup flows refactored: startup now uses bounded retry loops with explicit timeouts for wallet status and read-only auth-token acquisition; auth-token renewal now propagates errors and is performed on-demand; tests added/updated to cover retries, timeouts, and polling behavior.

Changes

Cohort / File(s) Summary
Integration Test
__tests__/integration/service-specific/start.test.ts
Added integration test that builds a wallet, calls wallet.start({ pinCode, password }), and asserts isReady(), a non-null auth token, and a current address with address and index.
Read-Only Wallet Tests
__tests__/wallet/readOnlyWallet.test.ts
Reworked tests to drive startup via retrying createReadOnlyAuthToken (instead of status polling); added negative-path tests for non-retryable errors and timeout behavior using fake timers; adjusted call-count and skipAddressFetch expectations.
Wallet Tests (polling & auth)
__tests__/wallet/wallet.test.ts
Introduced pollForWalletStatus and validateAndRenewAuthToken suites; normalized renewAuthToken mock to set authToken; added tests for polling retry, sequential calls, timeout propagation, and start error propagation adjustments.
Wallet Implementation
src/wallet/wallet.ts
Rewrote start() to conditionally poll only when create returns creating; replaced interval polling with bounded retry loop and explicit timeout errors; made renewAuthToken() throw on failure and changed renewal to awaited behavior when invoked; rewrote startReadOnly() to retry getReadOnlyAuthToken() only for specific WalletRequestError causes and throw a read-only startup timeout after attempts exhausted.
Wallet API Error Context
src/wallet/api/walletApi.ts
Enhanced error construction for getWalletStatus, createAuthToken, and createReadOnlyAuthToken to include { status, data } in WalletRequestError.cause for failed HTTP responses.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Wallet
    participant WalletAPI
    participant Storage

    Client->>Wallet: start({pinCode, password})
    Wallet->>WalletAPI: createWallet(payload)
    alt create returns creating
        Wallet->>Wallet: pollForWalletStatus() (bounded retry loop)
        Wallet->>WalletAPI: getWalletStatus() (retries until ready)
    else create returns ready
        Wallet->>Wallet: (skip polling)
    end
    Wallet->>Wallet: validateAndRenewAuthToken(pinCode) (awaited when invoked)
    Wallet->>Storage: onWalletReady() (persist state)
    Wallet-->>Client: resolve (ready, auth token, address)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Possibly related PRs

Suggested labels

tests

Suggested reviewers

  • pedroferreira1
  • r4mmer

Poem

🐰 I hopped through tests and timeout threads,
I counted polls and chased old creds,
Retries now tidy, tokens reborn,
Start wakes the wallet at the crack of morn,
Hooray — code hops light as a feather!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly addresses the main change: fixing auth token race conditions in the start() method by refactoring the startup flow to eliminate concurrent token renewal patterns and use interceptor-driven on-demand auth.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/auth-token-race-conditions

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 98.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 87.97%. Comparing base (0ccc910) to head (85e369e).

Files with missing lines Patch % Lines
src/wallet/api/walletApi.ts 66.66% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #1058       +/-   ##
===========================================
+ Coverage   71.03%   87.97%   +16.93%     
===========================================
  Files         114      114               
  Lines        8910     8895       -15     
  Branches     2030     2019       -11     
===========================================
+ Hits         6329     7825     +1496     
+ Misses       2551     1042     -1509     
+ Partials       30       28        -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

tuliomir and others added 2 commits April 7, 2026 18:42
Exercises the creating→ready polling path with a brand-new
wallet that the wallet-service has never seen, which is the
exact path where the old fire-and-forget pattern raced.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tuliomir tuliomir marked this pull request as ready for review April 7, 2026 21:49
@tuliomir tuliomir requested a review from pedroferreira1 as a code owner April 7, 2026 21:49
@tuliomir tuliomir requested review from andreabadesso and r4mmer April 7, 2026 21:50
@tuliomir tuliomir moved this from In Progress (WIP) to In Progress (Done) in Hathor Network Apr 7, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
__tests__/wallet/readOnlyWallet.test.ts (1)

181-211: ⚠️ Potential issue | 🟡 Minor

Restore fake timers even when the test fails.

jest.useRealTimers() only runs on the happy path here. If any assertion before Line 211 fails, later tests in this file inherit fake timers and can start failing for unrelated reasons.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@__tests__/wallet/readOnlyWallet.test.ts` around lines 181 - 211, The test
'should time out if getReadOnlyAuthToken never succeeds' currently calls
jest.useRealTimers() only on the happy path; wrap the body that uses
jest.useFakeTimers() (the setup, timer advances and assertions referencing
wallet.startReadOnly, mockCreateReadOnlyAuthToken, mockGetWalletStatus) in a
try/finally so jest.useRealTimers() is always executed, or move
jest.useRealTimers() to a file-level afterEach to restore real timers for every
test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@__tests__/integration/service-specific/start.test.ts`:
- Around line 125-130: The test claims to exercise the creating→ready path but
calls buildWalletInstance() with no seed, which triggers the
precalculated-wallet fallback in service-facade.helper and can skip first-time
creation; update the test to force a true new-wallet creation by calling
buildWalletInstance with an explicit unique seed or the helper's "force create"
parameter (e.g., buildWalletInstance({ seed: '<unique-seed>' }) or
buildWalletInstance({ forceCreate: true })) so the flow goes through
createWallet → validateAndRenewAuthToken → pollForWalletStatus and reliably
covers the race condition.

In `@__tests__/wallet/wallet.test.ts`:
- Around line 3601-3628: The test uses jest.useFakeTimers() but only calls
jest.useRealTimers() on the success path, so failures leave fake timers active;
wrap the timer restoration in a guaranteed cleanup (e.g., use a try/finally
inside the test or add an afterEach that calls jest.useRealTimers()) to ensure
jest.useRealTimers() always runs regardless of assertions—update the test
containing wallet.pollForWalletStatus() to restore real timers in finally or
rely on a test-level cleanup to avoid leaking fake timers.

In `@src/wallet/wallet.ts`:
- Around line 722-739: The loop in (method containing) walletApi.getWalletStatus
is bounded by attempts (MAX_WALLET_STATUS_POLL_ATTEMPTS) but each axios request
can itself block up to the HTTP timeout, so the total wait can exceed the
intended ~60s; change the polling to be time-bounded instead of attempt-bounded
by tracking elapsed time (e.g., start = Date.now()) and loop while (Date.now() -
start < MAX_WALLET_STATUS_POLL_DURATION_MS), calling walletApi.getWalletStatus
and breaking/throwing as you already do, or alternatively modify
walletApi.getWalletStatus to accept/forward a per-request timeout/AbortSignal so
each request respects a short timeout and keep the existing interval logic;
update references: walletApi.getWalletStatus, MAX_WALLET_STATUS_POLL_ATTEMPTS,
WALLET_STATUS_POLLING_INTERVAL, and WalletRequestError when implementing the
duration-based stop.
- Around line 1224-1234: The current loop catches errors from both
getReadOnlyAuthToken() and onWalletReady(), which hides permanent failures;
change the logic so the retry loop only wraps getReadOnlyAuthToken() (using
MAX_WALLET_STATUS_POLL_ATTEMPTS and WALLET_STATUS_POLLING_INTERVAL), and if
getReadOnlyAuthToken() succeeds break out and then call
onWalletReady(skipAddressFetch) once outside the retry loop so any errors from
onWalletReady (including getNewAddresses() or storage/auth setup) propagate
immediately; do not swallow or rethrow those errors as transient—only retry
failures from the auth token acquisition.
- Around line 483-485: The call to validateAndRenewAuthToken(pinCode) can
complete without actually setting this.authToken (renewAuthToken swallows
failures), so ensure startup fails fast if no token was obtained: either make
renewAuthToken propagate errors instead of swallowing them, or immediately after
await this.validateAndRenewAuthToken(pinCode) add a null-check for
this.authToken and throw a clear error (e.g., "missing auth token after
renewal") so subsequent calls (which use wallet.getAuthToken() and
walletServiceAxios building Authorization: Bearer ${...}) never send Bearer
null; update validateAndRenewAuthToken/renewAuthToken behavior accordingly to
guarantee a token is present or an exception is raised.

---

Outside diff comments:
In `@__tests__/wallet/readOnlyWallet.test.ts`:
- Around line 181-211: The test 'should time out if getReadOnlyAuthToken never
succeeds' currently calls jest.useRealTimers() only on the happy path; wrap the
body that uses jest.useFakeTimers() (the setup, timer advances and assertions
referencing wallet.startReadOnly, mockCreateReadOnlyAuthToken,
mockGetWalletStatus) in a try/finally so jest.useRealTimers() is always
executed, or move jest.useRealTimers() to a file-level afterEach to restore real
timers for every test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 542ed17a-15f7-4b0e-9920-f603c30cb2e9

📥 Commits

Reviewing files that changed from the base of the PR and between 0ccc910 and 1443741.

📒 Files selected for processing (4)
  • __tests__/integration/service-specific/start.test.ts
  • __tests__/wallet/readOnlyWallet.test.ts
  • __tests__/wallet/wallet.test.ts
  • src/wallet/wallet.ts

Comment thread __tests__/integration/service-specific/start.test.ts
Comment thread __tests__/wallet/wallet.test.ts Outdated
Comment thread src/wallet/wallet.ts Outdated
Comment thread src/wallet/wallet.ts
Comment on lines +722 to 739
for (let attempt = 0; attempt < MAX_WALLET_STATUS_POLL_ATTEMPTS; attempt++) {
const data = await walletApi.getWalletStatus(this);

if (data.status.status === WS_STATUS_READY) {
return;
}
// Only possible states are 'ready', 'creating' and 'error'. If status
// is not ready or creating, we should throw an error.
if (data.status.status !== WS_STATUS_CREATING) {
throw new WalletRequestError('Error getting wallet status.', { cause: data.status });
}

await new Promise(resolve => {
setTimeout(resolve, WALLET_STATUS_POLLING_INTERVAL);
});
}
throw new WalletRequestError('Wallet status polling timed out.');
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

The new poll cap is attempt-bounded, not time-bounded.

walletApi.getWalletStatus() (src/wallet/api/walletApi.ts:64-72) uses axiosInstance() with the default HTTP timeout (src/wallet/api/walletServiceAxios.ts:26-61) on every iteration. Under a slow or lossy connection, each attempt can block for the full request timeout before the 1s sleep, so this loop can still hang for minutes instead of roughly 60 seconds.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/wallet/wallet.ts` around lines 722 - 739, The loop in (method containing)
walletApi.getWalletStatus is bounded by attempts
(MAX_WALLET_STATUS_POLL_ATTEMPTS) but each axios request can itself block up to
the HTTP timeout, so the total wait can exceed the intended ~60s; change the
polling to be time-bounded instead of attempt-bounded by tracking elapsed time
(e.g., start = Date.now()) and loop while (Date.now() - start <
MAX_WALLET_STATUS_POLL_DURATION_MS), calling walletApi.getWalletStatus and
breaking/throwing as you already do, or alternatively modify
walletApi.getWalletStatus to accept/forward a per-request timeout/AbortSignal so
each request respects a short timeout and keep the existing interval logic;
update references: walletApi.getWalletStatus, MAX_WALLET_STATUS_POLL_ATTEMPTS,
WALLET_STATUS_POLLING_INTERVAL, and WalletRequestError when implementing the
duration-based stop.

Comment thread src/wallet/wallet.ts
@tuliomir tuliomir moved this from In Progress (Done) to In Progress (WIP) in Hathor Network Apr 8, 2026
- Fail fast if authToken is null after renewal in start()
- Move onWalletReady outside retry loop in startReadOnly
  so its errors propagate instead of being retried
- Wrap fake timer tests in try/finally for cleanup
- Add test for validateAndRenewAuthToken proactive renewal
- Fix renewAuthToken mocks to set authToken (null-check)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/wallet/wallet.ts (1)

477-506: ⚠️ Potential issue | 🟠 Major

waitReady: false now aborts the only remaining startup path.

After removing the old fire-and-forget continuation, this early return means the non-blocking branch never renews the token, never polls, and never calls onWalletReady(). start({ waitReady: false }) now leaves the wallet stuck in Loading with no path to READY. Please keep the post-createWallet() flow in a single startup promise and only await it conditionally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/wallet/wallet.ts` around lines 477 - 506, The early return when waitReady
is false aborts the remaining startup flow; instead, compose the post-create
flow (validateAndRenewAuthToken(pinCode), the authToken check, the status
handling with pollForWalletStatus(), and the final await onWalletReady() plus
clearSensitiveData()) into a single startup promise (e.g., startupPromise) and
then await that promise only if waitReady is true; if waitReady is false, start
startupPromise without awaiting it but attach a catch to log or handle errors to
avoid unhandled rejections. Ensure you update the code paths around
validateAndRenewAuthToken, pollForWalletStatus, onWalletReady, and
clearSensitiveData so they execute in that single promise rather than being
skipped by the early return.
♻️ Duplicate comments (1)
src/wallet/wallet.ts (1)

728-745: ⚠️ Potential issue | 🟠 Major

These retries still exceed the advertised 60s timeout.

Both loops are attempt-bounded, but each iteration awaits the HTTP call before the 1s sleep. Even normal request latency pushes the wall-clock timeout past 60 seconds, and slow responses can stretch it much further, so this still misses the acceptance criterion. Track an absolute deadline, or enforce a shorter per-request timeout, instead of only counting attempts.

Also applies to: 1227-1249

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/wallet/wallet.ts` around lines 728 - 745, The polling loop using
MAX_WALLET_STATUS_POLL_ATTEMPTS and WALLET_STATUS_POLLING_INTERVAL can exceed
the advertised 60s because each iteration awaits walletApi.getWalletStatus
(which can be slow) plus the 1s sleep; change it to enforce an absolute deadline
(e.g., compute const deadline = Date.now() + 60000) and before each iteration
check Date.now() >= deadline and throw WalletRequestError('Wallet status polling
timed out.') if expired, and/or wrap the walletApi.getWalletStatus(this) call in
a per-request timeout (Promise.race with a timeout set to the remaining time
until deadline) so slow HTTP responses can’t push the total wall-clock time
beyond 60s; update both the loop around walletApi.getWalletStatus and the
similar block at the other occurrence to use the same
deadline+per-request-timeout approach and keep existing
WS_STATUS_READY/WS_STATUS_CREATING checks and WalletRequestError usage.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/wallet/wallet.ts`:
- Around line 477-506: The early return when waitReady is false aborts the
remaining startup flow; instead, compose the post-create flow
(validateAndRenewAuthToken(pinCode), the authToken check, the status handling
with pollForWalletStatus(), and the final await onWalletReady() plus
clearSensitiveData()) into a single startup promise (e.g., startupPromise) and
then await that promise only if waitReady is true; if waitReady is false, start
startupPromise without awaiting it but attach a catch to log or handle errors to
avoid unhandled rejections. Ensure you update the code paths around
validateAndRenewAuthToken, pollForWalletStatus, onWalletReady, and
clearSensitiveData so they execute in that single promise rather than being
skipped by the early return.

---

Duplicate comments:
In `@src/wallet/wallet.ts`:
- Around line 728-745: The polling loop using MAX_WALLET_STATUS_POLL_ATTEMPTS
and WALLET_STATUS_POLLING_INTERVAL can exceed the advertised 60s because each
iteration awaits walletApi.getWalletStatus (which can be slow) plus the 1s
sleep; change it to enforce an absolute deadline (e.g., compute const deadline =
Date.now() + 60000) and before each iteration check Date.now() >= deadline and
throw WalletRequestError('Wallet status polling timed out.') if expired, and/or
wrap the walletApi.getWalletStatus(this) call in a per-request timeout
(Promise.race with a timeout set to the remaining time until deadline) so slow
HTTP responses can’t push the total wall-clock time beyond 60s; update both the
loop around walletApi.getWalletStatus and the similar block at the other
occurrence to use the same deadline+per-request-timeout approach and keep
existing WS_STATUS_READY/WS_STATUS_CREATING checks and WalletRequestError usage.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fe22a3b4-9bf4-4578-93e6-203623081459

📥 Commits

Reviewing files that changed from the base of the PR and between 1443741 and a256031.

📒 Files selected for processing (3)
  • __tests__/wallet/readOnlyWallet.test.ts
  • __tests__/wallet/wallet.test.ts
  • src/wallet/wallet.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/wallet/readOnlyWallet.test.ts
  • tests/wallet/wallet.test.ts

@tuliomir tuliomir moved this from In Progress (WIP) to In Progress (Done) in Hathor Network Apr 8, 2026
- renewAuthToken now throws on failure instead of silently
  setting authToken=null. All callers already await it, so
  the non-throwing contract was no longer needed.
- start() propagates the original createAuthToken error
  (e.g., "Auth invalid signature") instead of a generic
  "Auth token missing" message.
- startReadOnly retries only WalletRequestError (server
  responded with non-200). Network/transport errors
  propagate immediately instead of being retried for 60s.
- Timeout error preserves the last retry error as cause.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/wallet/wallet.ts (1)

113-114: ⚠️ Potential issue | 🟠 Major

The new 60s cap is still attempt-bounded, not wall-clock bounded.

Both loops call wallet-service APIs that inherit axiosInstance()'s request timeout before the 1s sleep. Under slow responses, each iteration can burn the full HTTP timeout, so startup can still run well past 60 seconds.

⏱️ Suggested direction
-const MAX_WALLET_STATUS_POLL_ATTEMPTS = 60;
+const MAX_WALLET_STATUS_POLL_TIMEOUT_MS = 60_000;

- for (let attempt = 0; attempt < MAX_WALLET_STATUS_POLL_ATTEMPTS; attempt++) {
+ const deadline = Date.now() + MAX_WALLET_STATUS_POLL_TIMEOUT_MS;
+ while (Date.now() < deadline) {
    // request...

-   await new Promise(resolve => {
-     setTimeout(resolve, WALLET_STATUS_POLLING_INTERVAL);
-   });
+   const remaining = deadline - Date.now();
+   if (remaining <= 0) {
+     break;
+   }
+   await new Promise(resolve => {
+     setTimeout(resolve, Math.min(WALLET_STATUS_POLLING_INTERVAL, remaining));
+   });
  }

Apply the same deadline-based guard in startReadOnly(), or pass a short per-request timeout into the wallet API calls.

Also applies to: 724-740, 1219-1239

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/wallet/wallet.ts` around lines 113 - 114, The
MAX_WALLET_STATUS_POLL_ATTEMPTS constant creates an attempt-bound loop that can
exceed a real-time cap if each wallet-service call hits the HTTP timeout; update
startReadOnly() and the other polling loops (the ones using
MAX_WALLET_STATUS_POLL_ATTEMPTS) to enforce a wall-clock deadline instead of
counting attempts by: compute a deadline (Date.now()+desiredMs) before the loop,
check Date.now() against the deadline each iteration and break with a timeout
error when exceeded, and/or pass an explicit short per-request timeout into the
wallet service calls (via axios request config) so individual requests cannot
block the total startup time; make these changes in the methods referencing
MAX_WALLET_STATUS_POLL_ATTEMPTS and the startReadOnly() polling logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/wallet/wallet.ts`:
- Around line 113-114: The MAX_WALLET_STATUS_POLL_ATTEMPTS constant creates an
attempt-bound loop that can exceed a real-time cap if each wallet-service call
hits the HTTP timeout; update startReadOnly() and the other polling loops (the
ones using MAX_WALLET_STATUS_POLL_ATTEMPTS) to enforce a wall-clock deadline
instead of counting attempts by: compute a deadline (Date.now()+desiredMs)
before the loop, check Date.now() against the deadline each iteration and break
with a timeout error when exceeded, and/or pass an explicit short per-request
timeout into the wallet service calls (via axios request config) so individual
requests cannot block the total startup time; make these changes in the methods
referencing MAX_WALLET_STATUS_POLL_ATTEMPTS and the startReadOnly() polling
logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f4cc2c11-4ad1-4c67-abfc-687cc40b55d3

📥 Commits

Reviewing files that changed from the base of the PR and between a256031 and c9ceeca.

📒 Files selected for processing (3)
  • __tests__/wallet/readOnlyWallet.test.ts
  • __tests__/wallet/wallet.test.ts
  • src/wallet/wallet.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/wallet/readOnlyWallet.test.ts
  • tests/wallet/wallet.test.ts

@tuliomir tuliomir moved this from In Progress (Done) to In Progress (WIP) in Hathor Network Apr 8, 2026
tuliomir and others added 4 commits April 8, 2026 13:56
The auth token endpoint requires the wallet to be in 'ready'
state. Requesting it right after createWallet (while status
may still be 'creating') caused start() to fail with
"Error requesting auth token".

Move validateAndRenewAuthToken to after pollForWalletStatus
so the wallet is guaranteed ready before token renewal.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Remove explicit validateAndRenewAuthToken(pinCode) from start().
   The axios interceptor already handles auth renewal on-demand
   when any authenticated API call is made (including during
   polling). authPrivKey and walletId are both set by that point.

2. Enrich createReadOnlyAuthToken error with HTTP status and
   response data. startReadOnly() now only retries on HTTP 400
   (wallet still creating); other status codes (401, 403, 404)
   fail immediately instead of retrying for 60 seconds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
pollForWalletStatus now retries WalletRequestError (transient
server/auth errors) instead of crashing on the first failure.
Non-WalletRequestError propagates immediately. Timeout error
preserves the last transient error as cause.

This mirrors the error classification already used by
startReadOnly and closes a resilience gap introduced by
the setInterval → for-loop migration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
createAuthToken and getWalletStatus threw generic messages
discarding the response status and body. Enrich both with
{ status, data } in the cause — matching what was already
done for createReadOnlyAuthToken. This gives callers (and
users) diagnostic info for 4xx failures instead of opaque
"Error requesting auth token" messages.

Also fix stale comment in start integration test that
still referenced the removed explicit auth call.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tuliomir tuliomir moved this from In Progress (WIP) to In Progress (Done) in Hathor Network Apr 9, 2026
@tuliomir
Copy link
Copy Markdown
Contributor Author

tuliomir commented Apr 9, 2026

Performance Impact: Wallet Startup Time Benchmark

TL;DR

No meaningful performance regression. Total wallet startup time is statistically equivalent between this branch and master for both tested wallets. The auth token work shifts from the API phase to the Ready phase, but the total time is unchanged.


Methodology

A standalone benchmark script measures end-to-end wallet.start() time (from call to state === 'Ready') against the public testnet wallet service, with sub-phase instrumentation via API-level monkey-patching (no source code changes needed).

  • 10 measured iterations per wallet, per branch (+ 1 warmup iteration excluded from stats)
  • 2-second cooldown between iterations to avoid rate-limiting
  • WebSocket disabled to isolate the startup path
  • Sub-phases measured: Setup (crypto key derivation), API (POST /wallet/init), Poll (GET /wallet/status loop), Ready (address fetch + auth token on-demand)

Tested against: https://wallet-service.testnet.hathor.network/

Results

Wallet 1 (moderate history)

Branch Total (median) Setup API Poll Ready Stddev
fix/auth-token-race-conditions 2570ms 1460ms 238ms 0ms 818ms 141ms
master 2558ms 1624ms 714ms 0ms 254ms 176ms
Delta +12ms (+0.5%) -164ms -476ms 0ms +564ms

Wallet 2 (heavy history -- triggers polling on cold start)

Branch Total (median) Setup API Poll Ready Stddev
fix/auth-token-race-conditions 2620ms 1455ms 247ms 0ms 818ms 171ms
master 2451ms 1564ms 703ms 0ms 279ms 127ms
Delta +169ms (+6.9%) -109ms -456ms 0ms +539ms

Interpretation

Both wallets show the same pattern: the branch is faster at API but slower at Ready, with totals within noise.

The reason is the auth token timing shift:

  • master: Auth token renewal fires in parallel (fire-and-forget) during createWallet(). The token is usually resolved by the time getNewAddresses() runs. Hence: slow API (token renewal overlaps), fast Ready (token already cached).

  • This branch: Auth token renewal is deferred to on-demand via the axios interceptor. createWallet() completes without auth overhead. The first authenticated call (getNewAddresses() in the Ready phase) triggers token renewal. Hence: fast API (no auth overhead), slow Ready (token obtained here).

The total work is the same -- it just moved. The +169ms delta on the heavy wallet is within 1 standard deviation (stddev=127-171ms) and is not statistically significant.

Polling path

The heavy wallet triggered the polling path during warmup (cold start): 9022ms total, 4 polling attempts over ~5 seconds. This exercises the new bounded for-loop polling introduced by this branch. On subsequent iterations (warm start), the wallet service returns status: 'ready' immediately (0 polling), which is the common case for production wallets.

The master warmup for the same wallet showed no polling (2765ms) -- because the wallet was already known to the Wallet Service from the branch's prior run.

Conclusion

Metric Result
Total startup time regression None (within noise)
Auth token overhead Redistributed, not added
Polling path Exercised successfully on cold start
Bounded timeout Working (60s max, vs. unbounded on master)

The refactored auth flow trades a negligible, non-statistically-significant timing shift for deterministic, race-condition-free startup behavior.


throw new WalletRequestError('Error requesting read-only auth token.');
throw new WalletRequestError('Error requesting read-only auth token.', {
cause: { status: response.status, data: response.data },
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

Comment thread src/wallet/wallet.ts
@tuliomir tuliomir moved this from In Progress (Done) to In Review (WIP) in Hathor Network Apr 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

Status: In Review (WIP)

Development

Successfully merging this pull request may close these issues.

2 participants