Skip to content

Deflake streams-leak pull-buffer test by pacing writes on a per-chunk ack#32192

Open
robobun wants to merge 2 commits into
mainfrom
farm/f5b6944b/deflake-streams-leak-pull-buffer
Open

Deflake streams-leak pull-buffer test by pacing writes on a per-chunk ack#32192
robobun wants to merge 2 commits into
mainfrom
farm/f5b6944b/deflake-streams-leak-pull-buffer

Conversation

@robobun

@robobun robobun commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Fixes #32190

Problem

test/js/web/streams/streams-leak.test.ts > "native ReadableStream reuses the pull buffer across small reads" flakes on loaded CI runners, failing all 4 retry attempts when it triggers (build 61700 debian x64, build 62104 alpine x64):

error: expect(received).toBeGreaterThan(expected)
Expected: > 20
Received: 14   (also 10, 11, 13, 19 across attempts)
    at streams-leak.test.ts:39:25

The fixture had the server write "x\n" 1000 times with only flush() + sleep(0) between writes, and relied on the client keeping up so each write became its own read. On a busy runner the consumer lags, flushed writes pile up in the TCP buffer and coalesce, and the 2000 bytes arrive in as few as 10 reads, starving the chunks.length > 20 sample-size precondition. The buffer-reuse assertions themselves (the point of the test, from #30593) were never the ones failing.

Simply lowering the threshold would gut the test: at ~10 chunks the pre-fix behavior (one fresh 256KB buffer per read) only barely trips distinctBuffers.size < 8 and no longer trips the 4MB backing-bytes bound at all.

Fix

Strengthen the fixture instead of weakening the assertions. The server and the consumer share one event loop, so the pull callback now paces itself on an ack per chunk: it writes chunk i+1 only after the consumer has read chunk i and resolved the ack promise. At most one chunk is ever in flight, so writes cannot coalesce regardless of machine load, and the read count is deterministic (64 paced writes, observed as exactly 64 reads in every run, idle and under 8 busy-loop CPU load). The assertions are unchanged; the sample they run on is now guaranteed by construction, and the pre-fix signal is much stronger (64 distinct buffers vs the previous marginal ~10).

The test also got faster: 64 acked round trips instead of 1000 sleep(0) turns (~9ms release, ~500ms debug ASAN).

Verification

  • Fails on the pre-fix #getInternalBuffer rotation (reverted chunk.buffer.byteLength back to chunk.length locally): Expected: < 8, Received: 64, with the sample-size precondition passing.
  • Passes 30/30 with release bun under 8 busy-loop CPU load, 10/10 with the debug ASAN build; chunk count is exactly 64 in every observation, including with an artificially lagging consumer.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Warning

Review limit reached

@robobun, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 9 minutes and 40 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 57e1b9eb-70ab-4727-a1e5-c795a9336835

📥 Commits

Reviewing files that changed from the base of the PR and between 5d8b292 and 021049a.

📒 Files selected for processing (1)
  • test/js/web/streams/streams-leak.test.ts

Comment @coderabbitai help to get the list of available commands and usage tips.

@robobun

robobun commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator Author
Updated 6:32 PM PT - Jun 12th, 2026

@robobun, your commit 021049a has 2 failures in Build #62148 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 32192

That installs a local version of the PR into your bun-32192 executable, so you can run:

bun-32192 --bun

@github-actions

Copy link
Copy Markdown
Contributor

This PR may be a duplicate of:

  1. test: relax the timing-dependent chunk-count assertion in streams-leak.test.ts #31355 - Also deflakes the same timing-dependent chunk-count assertion in streams-leak.test.ts (relaxes the threshold instead of pacing writes with acks)

🤖 Generated with Claude Code

@robobun

robobun commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator Author

On the duplicate flag: #31355 targets the same flake but keeps the unpaced fixture and lowers the chunk-count floor to > 7, which holds only while coalescing stays above 7 reads. The failures tracked in #32190 already got down to 10, and nothing bounds how far the 2000 bytes can coalesce on a loaded runner, so the relaxed floor can still flake (and below 8 reads the distinct-buffer assertion stops discriminating pre-fix behavior at all).

This PR removes the load dependence instead: the server writes chunk i+1 only after the consumer has read chunk i, so the read count is deterministically 64 on any machine and the original assertions stay at full strength. If maintainers prefer the smaller diff in #31355, happy to close this one.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — test-only deflake that makes the fixture deterministic without weakening the assertions.

Extended reasoning...

Overview

This PR touches a single test file, test/js/web/streams/streams-leak.test.ts, modifying only the "native ReadableStream reuses the pull buffer across small reads" test. It replaces the previous 1000-iteration flush() + sleep(0) write loop with a 64-iteration loop paced by a per-chunk ack promise shared between the in-process server and consumer. The actual assertions (chunks.length > 20, distinctBuffers.size < 8, backingBytes < 4MB) are unchanged; only the fixture that produces the sample is made deterministic. Comments are updated to reflect the new byte counts.

Security risks

None. This is a test fixture change with no production code, no new dependencies, no network/file/auth surface beyond the existing in-process Bun.serve on port 0.

Level of scrutiny

Low. Test-only deflake that strengthens (rather than relaxes) the regression signal. The mechanism is sound: server and client share one event loop, ack is assigned before controller.write() and before any yield point, so it's always defined when the consumer calls ack!(). Resolving an already-settled promise is a no-op, so even pathological splitting/coalescing wouldn't crash — and the assertion is still > 20 rather than === 64, leaving headroom. The PR description documents 30/30 passes under CPU load and 10/10 under debug ASAN, plus confirmation that the test still fails when the original #getInternalBuffer bug is reintroduced.

Other factors

No prior reviewer comments to address. The bug-hunting system found nothing. The change also makes the test ~100x faster, which is a nice side benefit. This is a textbook safe deflake.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — test-only deflake with a sound same-event-loop ack mechanism; assertions preserved.

Extended reasoning...

Overview

This PR touches a single test file, test/js/web/streams/streams-leak.test.ts, and only the first test case ("native ReadableStream reuses the pull buffer across small reads"). It replaces a timing-dependent sleep(0) pacing scheme with an explicit per-chunk ack: the server's pull loop stores a Promise.withResolvers resolver in a shared closure variable, writes one chunk, and awaits the promise; the consumer's for await loop resolves it after pushing each chunk. The iteration count drops from 1000 to 64, and comments are updated to match. No assertions are weakened.

Security risks

None. This is a test-only change with no production code, no new dependencies, and no externally reachable surface.

Level of scrutiny

Low. The change is confined to a test fixture and is purely about making the chunk count deterministic rather than load-dependent. The synchronization is correct: producer and consumer share one event loop, ack is assigned before each write/flush so the consumer's ack!() is never undefined, and after the 64th ack the loop exits cleanly via controller.close(). The preserved assertions (> 20 chunks, < 8 distinct buffers, < 4MB backing bytes) remain meaningful at 64 deterministic reads — in fact stronger than before, since pre-fix behavior would now produce 64 distinct buffers.

Other factors

The bug-hunting system found no issues. The PR description includes thorough verification (30/30 under load, 10/10 debug ASAN, confirmed regression detection against the pre-fix code). The one CI failure (test/cli/init/init.test.ts on macOS aarch64) is unrelated to the changed file. There is a competing PR #31355 that relaxes the threshold instead; the author has already explained the tradeoff in the thread, and choosing between the two is a maintainer call that doesn't affect this PR's correctness.

@robobun

robobun commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator Author

CI status for the record: the diff is one test file and streams-leak.test.ts passed on every lane in both runs. The red lanes are unrelated flakes:

  • build 62112: test/cli/init/init.test.ts snapshot on macOS aarch64 (bun init emits CLAUDE.md on runners that have Claude Code installed, init_command.rs:1571), plus two CI-tagged flaky tests on Windows (hot.test.ts temp-dir ENOENT, spawn-pipe-leak RSS 0.815 vs 0.8)
  • build 62148 (re-run): fetch-abort-slow-connect.test.ts (AbortSignal.timeout produced Error instead of TimeoutError) and fetch-leak.test.ts (RSS measurement) on macOS aarch64, plus six CI-tagged flaky tests on other lanes (complex-workspace migration, security-scanner-workspaces, update_interactive_install, serve-protocols, spawn-stdin-readable-stream, regression/30205)

None of these execute the changed file. Not pushing further retriggers; ready for review/merge as is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky CI test: streams-leak.test.ts "native ReadableStream reuses the pull buffer across small reads"

1 participant