fix(etl): prevent Worker OOM on large CSV files via stream backpressure by andrew-bierman · Pull Request #2418 · PackRat-AI/PackRat

andrew-bierman · 2026-05-13T18:38:52Z

Summary

Root cause: R2 delivers large files (329–607 MB) faster than the csv-parse loop can consume rows. Without backpressure, parser.write() buffers the entire file before a single row is processed → Worker hits 128 MB memory limit → killed externally → outer catch never runs → job stuck running forever.
Fix: Check parser.write() return value and await drain before pushing more chunks — standard Node.js streams backpressure contract.
Secondary fix (previous commit): yield to event loop every 100 rows instead of every row, which was causing wall-clock limit issues on large files.

Root Cause Detail

R2 stream → streamToText → parser.write() [no backpressure]
                             ↓
                    Parser buffer grows to 329-607 MB
                             ↓
                    Worker OOM (128 MB limit)
                             ↓
                    Worker killed externally
                             ↓
                    catch{} block never runs
                             ↓
                    Job stays in 'running' state forever

Fix

const ok = parser.write(chunk);
if (!ok) await new Promise<void>((resolve) => parser.once('drain', resolve));

Affected Jobs

Seven jobs were stuck/failed due to this bug and have been re-queued:

backcountry (329 MB file, failed 5×)
campsaver, rei, campmor, omcgear, geartrade, evo

Post-Deploy Monitoring & Validation

What to monitor: wrangler tail packrat-api --format pretty — watch for ✅ Done processing log lines for the 7 re-queued jobs
Expected healthy signal: Each job transitions running → completed, totalProcessed matches CSV row count
Failure signal: Job stays running > 10 min → still OOM (unlikely after fix)
Validation window: 30 min after deploy; owner: @andrew-bierman

Queries:

SELECT id, source, status, "totalProcessed", "totalValid" 
FROM etl_jobs 
WHERE status != 'completed' 
ORDER BY "startedAt" DESC LIMIT 20;

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

Summary by CodeRabbit

Bug Fixes
- Improved stability and reliability of catalog data processing for large files, reducing potential performance issues during data import operations.

…files Without drain-wait, R2 delivers the entire file into the csv-parse buffer before the main processing loop can drain any rows. For files >50 MB this fills the 128 MB Worker memory limit → Worker killed externally → outer catch never runs → job stays stuck in 'running'. Fix: check the return value of parser.write() and await 'drain' before pushing more chunks, exactly as the Node.js streams backpressure contract requires.

coderabbitai · 2026-05-13T18:39:08Z

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

The ETL catalog processor now implements backpressure handling when feeding CSV chunks to the parser. The parser's write() method returns a boolean indicating buffer state; when false, the processor waits for a drain event before continuing, preventing memory exhaustion on large files.

Changes

CSV Parser Backpressure

Layer / File(s)	Summary
CSV parser backpressure mechanism `packages/api/src/services/etl/processCatalogEtl.ts`	Parser backpressure is now honored by checking the return value of `parser.write(chunk)` and awaiting the `drain` event when the buffer is full, preventing the reader from outpacing the parser on large file ingestion.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

PackRat-AI/PackRat#2395: Modifies the same processCatalogETL function with a separate change to job completion status handling.
PackRat-AI/PackRat#2394: Also modifies processCatalogETL to change job completion behavior on the success path.

Suggested labels

api

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: implementing stream backpressure handling in the ETL process to prevent Worker out-of-memory errors on large CSV files.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/etl-completion-ordering

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-05-13T18:39:18Z

Coverage Report for API Unit Tests Coverage (./packages/api)

Status	Category	Percentage	Covered / Total
🔵	Lines	76.17%	502 / 659
🔵	Statements	76.17% (🎯 65%)	502 / 659
🔵	Functions	95%	38 / 40
🔵	Branches	88.67%	227 / 256

File Coverage

No changed files found.

Generated in workflow #1188 for commit 7654123 by the Vitest Coverage Report Action

github-actions · 2026-05-13T18:39:24Z

Coverage Report for Expo Unit Tests Coverage (./apps/expo)

Status	Category	Percentage	Covered / Total
🔵	Lines	82.61%	480 / 581
🔵	Statements	82.61% (🎯 75%)	480 / 581
🔵	Functions	92.59%	50 / 54
🔵	Branches	90.9%	170 / 187

File Coverage

No changed files found.

Generated in workflow #1188 for commit 7654123 by the Vitest Coverage Report Action

Copilot

Pull request overview

Adds Node stream backpressure handling to the catalog ETL CSV ingestion loop so the csv-parse parser doesn't buffer entire multi-hundred-MB R2 files in memory and crash the Worker with OOM.

Changes:

Capture parser.write() return value and await the drain event when it returns false.
Adds an inline comment explaining the OOM root cause and rationale.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI review requested due to automatic review settings May 13, 2026 18:38

github-actions Bot added the api label May 13, 2026

Copilot started reviewing on behalf of andrew-bierman May 13, 2026 18:39 View session

Copilot AI reviewed May 13, 2026

View reviewed changes

andrew-bierman merged commit b875cd2 into main May 13, 2026
13 of 14 checks passed

andrew-bierman deleted the fix/etl-completion-ordering branch May 13, 2026 18:41

andrew-bierman mentioned this pull request May 13, 2026

fix(etl): raise cpu_ms limit to 400k (CF max) for large-file queue processing #2419

Merged

This was referenced May 21, 2026

fix(etl): catalog sprint fixes — lenient CSV parsing, chunk timeout, instanceId test #2465

Merged

feat(etl): add JSONL/NDJSON support alongside CSV #2471

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(etl): prevent Worker OOM on large CSV files via stream backpressure#2418

fix(etl): prevent Worker OOM on large CSV files via stream backpressure#2418
andrew-bierman merged 1 commit into
mainfrom
fix/etl-completion-ordering

andrew-bierman commented May 13, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 13, 2026 •

edited

Loading

Review failed

Uh oh!

github-actions Bot commented May 13, 2026

Uh oh!

github-actions Bot commented May 13, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

andrew-bierman commented May 13, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root Cause Detail

Fix

Affected Jobs

Post-Deploy Monitoring & Validation

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Uh oh!

github-actions Bot commented May 13, 2026

Coverage Report for API Unit Tests Coverage (./packages/api)

Uh oh!

github-actions Bot commented May 13, 2026

Coverage Report for Expo Unit Tests Coverage (./apps/expo)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

andrew-bierman commented May 13, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 13, 2026 •

edited

Loading