Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added a new CLI arg --merge-async to asynchronously and incrementally merge process coverage files to avoid OOM due to heap exhaustion #469

Merged
merged 5 commits into from
May 26, 2023

Conversation

bizob2828
Copy link
Contributor

@bizob2828 bizob2828 commented May 3, 2023

This PR changes the merging strategy of process coverage files to be async and incremental. Our team runs a lot of tests against a versioned matrix of dependencies in subprocesses. This in turns creates a lot of temp v8 coverage files. We were finding we were running out of heap on GitHub actions as you can see in this job

screenshot 2023-05-03 at 5 11 08 PM

Fixes #338

Checklist
  • npm test, tests passing
  • npm run test:snap (to update the snapshot)
  • tests and/or benchmarks are included
  • documentation is changed or added

…process coverage files to avoid

OOM due to heap exhaustion.
lib/report.js Outdated
Comment on lines 240 to 246
/**
* Returns the list of V8 process coverages generated by Node.
*
* @return {ProcessCov[]} Process coverages generated by Node.
* @private
*/
_loadReports () {
Copy link
Contributor

@AriPerkkio AriPerkkio May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance this _loadReports could be kept here, and its API as is? This is something Vitest uses internally. https://github.com/vitest-dev/vitest/blob/708b10fe227c04b4219a1e5d4b539def009729b4/packages/coverage-c8/src/provider.ts#L94-L95

I've been meaning to open a PR here and propose this method as part of public API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm ok I can keep loadReports but this PR fundamentally changes this in that it does not load all reports at once because of the OOM issues.

Copy link
Contributor Author

@bizob2828 bizob2828 May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think maybe I can provide this async merging as a CLI arg and keep existing functionality but that means for libraries that overrode "private" methods you'd have another one to override. @bcoe what are your thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to prevent these OOMs would be to make _loadReports a bit more efficient:

  1. Load a single V8 report from file system and convert it to JSON
  2. Filter its result based on testExclude.shouldInstrument(result[].url) and filter out all unrelated coverage reports
  3. Return these filtered ScriptCoverage[] from _loadReport - no other changes required

The V8 reports collected using process.env.NODE_V8_COVERAGE contain so much noise as they include all node_modules and NodeJS internal node:<module-name> modules. When all V8 reports are loaded into memory at once, there is a lot unnecessary memory occupied by these irrelevant coverage reports.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm. I could look into that. I realized I requested a review from you but I really would like @bcoe to take a look at this. Is this project still alive? there's quite a lot of PRs in the queue

@bcoe
Copy link
Owner

bcoe commented May 23, 2023

@bizob2828 one thought, what if we just had both options, with a flag like --merge-async. We could then keep the API the same, but add the new option for your use case.

@bizob2828
Copy link
Contributor Author

@bizob2828 one thought, what if we just had both options, with a flag like --merge-async. We could then keep the API the same, but add the new option for your use case.

Yea I was thinking of doing this. It also looks like I have to fix 10 and 12 because fs doesn't have promise based methods. I can work on all of this.

…mentally merge process coverage files to avoid OOM due to heap exhaustion
@bizob2828 bizob2828 changed the title fix: updated getMergedProcessCov to be async and incrementally merge process coverage files feat: added a new CLI arg --merge-async to asynchronously and incrementally merge process coverage files to avoid OOM due to heap exhaustion May 23, 2023
@bizob2828 bizob2828 force-pushed the async-merge-reports branch 2 times, most recently from eb08f1b to 8d9aeb2 Compare May 23, 2023 21:42
@bizob2828
Copy link
Contributor Author

@bcoe do you mind taking another look? I updated to default to same behavior but added a new --merge-async flag. Also removed some legacy tests that no longer worked which I think should be ok because Node <10 is no longer supported.

This was breaking the lib/report.js since it now relies on either `fs/promises` or `fs.promises` depending on the version of Node.js.
lib/report.js Show resolved Hide resolved
Copy link
Owner

@bcoe bcoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, but tests are failing (you may need to retake the snapshot... if the snapshot is actually different, I'm at a loss).

…ning across different test iterations within the same process
@bizob2828
Copy link
Contributor Author

bizob2828 commented May 24, 2023

This LGTM, but tests are failing (you may need to retake the snapshot... if the snapshot is actually different, I'm at a loss).

@bcoe thanks for reviewing/approving. I see what's up. These tests atomic via chance because they were only run once. Now that they are run twice, merging sync and async it caused issues. I fixed it by providing --clean=true to the one test to delete all v8 coverage files. I made the fixes, can you take one last look please 🙏🏻

@bcoe
Copy link
Owner

bcoe commented May 25, 2023

@bizob2828 looks like we're most of the way there, just the Node 10 tests failing (perhaps just didn't run the snapshot for 10).

aside, we should really drop Node 10 support, no good reason to still support it.

@bizob2828
Copy link
Contributor Author

bizob2828 commented May 25, 2023

@bcoe weird, I did re-run and update snapshots and it works locally. I'll dig more, sorry for the back and forth. Looks like it's a timestamp difference which I don't get how that'd ever worked before

@bizob2828
Copy link
Contributor Author

bizob2828 commented May 25, 2023

@bcoe I think I fixed. The report wasn't correct for the same reason another test was failing. Now that the tests are being run more than once we need to clean up beforehand

@bcoe bcoe merged commit 45f2f84 into bcoe:main May 26, 2023
@bcoe
Copy link
Owner

bcoe commented May 26, 2023

@bizob2828 thanks for this great contribution, please let me know if it helps you avoid OOM errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Potential OOM error
3 participants