test: give container-awaiting beforeAll hooks a 120s timeout #32211
Updated 9:21 PM PT - Jun 12th, 2026
✅ @alii, your commit 780f4e4e439bfbb995c94700e0342835a92abf1d passed in
🧪 To try this PR locally: `bunx bun-pr 32211`
That installs a local version of the PR into your bun-32211 --bun
Walkthrough
Test hook timeout bounds and documentation are adjusted across test fixtures to accommodate cold Docker container startup. The harness layer documents startup timing expectations globally, and test files apply matching 120-second timeouts with clarifying comments about compose wait-timeout and build latency.
Changes
Cold Container Startup Documentation and Timeouts
🚥 Pre-merge checks: ✅ Passed checks (4 passed)
LGTM — straightforward test-infra deflake (longer beforeAll timeout + one-shot compose-up retry).
Overview
This PR touches two test-infrastructure files only:
- `test/harness.ts`: passes `150_000` as the timeout to the `beforeAll` hook inside `describeWithContainer`, replacing the default 5s hook timeout that was firing before `compose up --wait-timeout 60` could finish on cold starts.
- `test/docker/index.ts`: in `doUp()`, retries `docker compose up -d --wait --wait-timeout 60 <service>` exactly once on non-zero exit, logging the first failure's stderr. The `const` destructure becomes `let` and is reassigned via parenthesized destructuring on retry; the existing failure path (ps + logs + throw) still runs if the retry also fails.
Both changes are small, mechanical, and well-commented inline. The PR description includes a clear diagnosis (the "Creating" lines were captured progress output at the moment the 5s hook gave up, not a docker race) and cold-start verification (218/218 passing across two runs).
Security risks
None. No production/runtime code is touched — only the test harness and the docker-compose helper used by tests. No auth, crypto, network exposure, or input handling changes.
Level of scrutiny
Low. This is a flaky-test fix confined to test scaffolding. The worst-case downside of either change is that a genuinely broken container takes a bit longer to fail (one extra 60s wait, or up to 150s in the hook) — which is still bounded and surfaces the real ensure() error instead of an opaque (unnamed) hook timeout. No CODEOWNERS apply to these paths.
Other factors
The bug-hunting system found no issues. There are no outstanding reviewer comments. The retry is idempotent (up -d --wait on an already-started container just waits for health), and the change interacts correctly with the existing upPromises dedup / coordinator paths since it's contained inside doUp().
describeWithContainer (and the valkey/autobahn tests' direct ensure() callers) await `compose up --wait-timeout 60` plus a `compose build` step inside beforeAll, but the hook had bun:test's default 5s timeout. On a cold or busy machine the hook fired first and the runner's auto-killer SIGTERMed the in-flight compose subprocess; compose then exited non-zero with stderr at whatever progress line it was on, which doUp() reported as `Failed to start service X: Container ... Creating`.
There is no docker race: 4 services brought up concurrently from cold all return exit=0. The coordinator (#32033) solved a cross-process race (warmup-ci.ts vs the first test process); within one `bun bd test` process the per-service in-flight dedup at test/docker/index.ts:200-217 already covers same-service calls, and cross-service concurrency is fine in practice.
Matches the existing pattern at websocket-proxy.test.ts:602.
Verified by 2x `docker compose down` then `bun bd test` of the SQL container suites from cold: 0 hook timeouts, 0 'Failed to start service' across 5 services starting concurrently.
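To make the diagnosis above concrete, here is a toy model of a hook budget expiring partway through a cold start. It is only a sketch on a virtual clock: `simulateHook`, the step list, and every duration are hypothetical illustration values, not measurements and not bun:test or harness internals. It shows why the reported error carries whatever progress line was current when the hook gave up.

```typescript
// Toy model of the failure mode, not the real harness or bun:test internals.
// All step names and durations below are made-up illustration values.

type Step = { progressLine: string; durationMs: number };

type HookResult =
  | { outcome: "ok"; elapsedMs: number }
  | { outcome: "killed"; elapsedMs: number; lastProgressLine: string };

// Walk through the startup steps on a virtual clock. If the hook budget runs
// out partway through a step, report the progress line that was current.
function simulateHook(steps: Step[], hookTimeoutMs: number): HookResult {
  let elapsedMs = 0;
  for (const step of steps) {
    if (elapsedMs + step.durationMs > hookTimeoutMs) {
      return {
        outcome: "killed",
        elapsedMs: hookTimeoutMs,
        lastProgressLine: step.progressLine,
      };
    }
    elapsedMs += step.durationMs;
  }
  return { outcome: "ok", elapsedMs };
}

// Hypothetical cold start: container creation, start, then the health wait.
const coldStart: Step[] = [
  { progressLine: "Container ... Creating", durationMs: 12_000 },
  { progressLine: "Container ... Starting", durationMs: 3_000 },
  { progressLine: "Container ... Waiting", durationMs: 25_000 },
];

const withDefaultTimeout = simulateHook(coldStart, 5_000);
const withBumpedTimeout = simulateHook(coldStart, 120_000);

console.log(withDefaultTimeout);
console.log(withBumpedTimeout);
```

Under these assumed durations the 5s budget ends during the first step, so the model reports "Creating" even though nothing in the startup itself failed, while the 120s budget lets all three steps finish.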
5c10fb8 to 780f4e4
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/harness.ts`:
- Around line 1049-1053: The beforeAll timeout in test harness is set to
120_000ms which is too tight and duplicated in test/js/valkey/test-utils.ts;
define a single shared constant like COLD_START_TIMEOUT_MS = 150_000 and replace
the numeric literals used as beforeAll timeouts in both the harness (the
beforeAll block that currently ends with }, 120_000)) and the corresponding
beforeAll/timeout in valkey test-utils with that constant so both tests use the
same extended timeout and avoid drift/flakiness.
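A rough sketch of what this suggestion could look like follows. Only the name `COLD_START_TIMEOUT_MS` and the value `150_000` come from the review comment; `registerBeforeAll` and `startContainers` are hypothetical stand-ins with the same (callback, timeout) shape as a test hook, used so the snippet runs without a test runner.

```typescript
// Sketch of the reviewer's suggestion. COLD_START_TIMEOUT_MS and its value come
// from the review comment; everything else here is a hypothetical stand-in.

// In the real suite this would live in one shared module and be exported.
const COLD_START_TIMEOUT_MS = 150_000;

// Local stand-in with the same (callback, timeout) shape as a test hook.
// It only records what was registered; it does not run anything.
type Hook = { owner: string; timeoutMs: number };
const registeredHooks: Hook[] = [];

function registerBeforeAll(
  owner: string,
  _setup: () => Promise<void>,
  timeoutMs: number,
): void {
  registeredHooks.push({ owner, timeoutMs });
}

// Placeholder for the container startup that the real hooks await.
async function startContainers(): Promise<void> {}

registerBeforeAll("test/harness.ts", startContainers, COLD_START_TIMEOUT_MS);
registerBeforeAll("test/js/valkey/test-utils.ts", startContainers, COLD_START_TIMEOUT_MS);

// Both call sites read the same constant, so the two values cannot drift apart.
const distinctTimeouts = new Set(registeredHooks.map(hook => hook.timeoutMs));
console.log(distinctTimeouts.size); // prints 1
```

Whether the shared value should be 120_000 (as in this PR) or 150_000 (as the comment proposes) is a separate question; the sketch only illustrates replacing the repeated literal with one definition.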
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 62521346-6faa-49b6-94bd-93619aaedada
📒 Files selected for processing (4)
- test/harness.ts
- test/js/valkey/test-utils.ts
- test/js/valkey/valkey.test.ts
- test/js/web/websocket/autobahn.test.ts
LGTM — straightforward test-only timeout bump matching the existing pattern in websocket-proxy.test.ts.
Overview
This PR adds a 120_000 ms timeout argument to four beforeAll hooks that await docker-compose container startup (ensure()), along with explanatory comments. Files touched are all test infrastructure: test/harness.ts (describeWithContainer), test/js/valkey/test-utils.ts, test/js/valkey/valkey.test.ts, and test/js/web/websocket/autobahn.test.ts. No production/runtime code is modified.
Security risks
None. The change only adjusts test hook timeout values and adds comments. No auth, crypto, network, or data-handling logic is touched.
Level of scrutiny
Low. This is a mechanical test-harness fix: each hunk changes }); → }, 120_000); on a beforeAll callback plus an inline comment explaining why. It exactly mirrors the pre-existing pattern at test/js/web/websocket/websocket-proxy.test.ts:603, so it's following established convention rather than introducing a new approach. The worst-case failure mode is a test taking longer to time out, which is benign.
Other factors
The PR description gives a clear root-cause analysis (bun:test's 5s default hook timeout SIGTERMs the in-flight compose up subprocess on cold starts) and explains why the earlier retry-based approach was dropped. The bug-hunting system found no issues. The diff is small, self-contained, and the intent is unambiguous.
What this does
`describeWithContainer` (and the valkey/autobahn tests' direct `ensure()` callers) await `compose up --wait-timeout 60` plus a `compose build` step inside `beforeAll`, but the hook had bun:test's default 5s timeout. On a cold or busy machine the hook fired first and the runner's auto-killer SIGTERMed the in-flight compose subprocess; compose then exited non-zero with stderr at whatever progress line it was on, which `doUp()` reported as `Failed to start service X: Container ... Creating`.
This bumps the four `beforeAll` hooks that await `ensure()` to 120s, matching the existing pattern at `websocket-proxy.test.ts:602`. `test/docker/index.ts` is unchanged.
Why not the alternatives
- Retry `compose up` once (earlier version of this PR): wrong layer. The failure is bun:test killing compose, not compose actually failing. Retrying papers over the symptom.
- Serialize `doUp()` across services: not needed. 4 services brought up concurrently from cold all return `exit=0`. The coordinator (ci: serialize docker service startup through a per-shard coordinator #32033) solved a cross-process race (warmup-ci.ts vs the first test process); within one `bun bd test` process the per-service in-flight dedup at `test/docker/index.ts:200-217` already covers same-service calls, and cross-service concurrency works.
Verification
2× `docker compose down` then `bun bd test sql-mysql.test.ts sql-onconnect-onclose-throw.test.ts sql-mysql.auth.test.ts` from cold: 0 hook timeouts, 0 'Failed to start service' across 5 services starting concurrently.
Separate pre-existing flake (out of scope)
`sql-mysql.test.ts:325` 'should not timeout in long results' (10K inserts + 3 selects, 10s test timeout) can fail on a freshly-cold MySQL and pass warm. Unchanged by this PR; different timeout class (per-test 10s, not the 5s hook timeout fixed here).
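For readers unfamiliar with the per-service in-flight dedup that the description relies on, the idea can be sketched roughly as below. This is not the code in `test/docker/index.ts`: `ensureService`, `StartFn`, and `fakeStart` are hypothetical names, and dropping failed startups from the map is an assumption made for the sketch.

```typescript
// Minimal sketch of per-service in-flight dedup; not the code in test/docker/index.ts.
// `ensureService`, `StartFn`, and `fakeStart` are hypothetical names.

type StartFn = (service: string) => Promise<string>;

const inFlight = new Map<string, Promise<string>>();

function ensureService(service: string, start: StartFn): Promise<string> {
  const existing = inFlight.get(service);
  if (existing) return existing; // same service: share the pending startup

  const pending = start(service);
  inFlight.set(service, pending);
  // Assumption: forget failed startups so a later call can try again.
  pending.catch(() => {
    inFlight.delete(service);
  });
  return pending;
}

// Fake starter that counts how many real startups were triggered.
let startCount = 0;
const fakeStart: StartFn = service => {
  startCount++;
  return Promise.resolve(`${service} ready`);
};

const first = ensureService("mysql", fakeStart);
const second = ensureService("mysql", fakeStart); // deduped onto `first`
const other = ensureService("postgres", fakeStart); // different service, starts on its own

console.log(first === second, first === other, startCount); // prints true false 2
```

Same-service callers share one promise, while different services each get their own startup and proceed concurrently, which is the behavior the description says needs no extra serialization.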