sf data query --bulk throws Fatal Error Javascript Heap out of memory #1995

Closed
AllanOricil opened this issue Mar 12, 2023 · 8 comments
Labels
feature Issue or pull request for a new feature

Comments

@AllanOricil

AllanOricil commented Mar 12, 2023

Is your feature request related to a problem? Please describe.

The bulk query stores all records in memory instead of writing them to disk.

image
src: https://github.com/jsforce/jsforce/blob/95932f288a59a0b68ff9fe8cc1a94523a4d0fb85/src/api/bulk.ts#L1077-L1099

image
src: https://github.com/jsforce/jsforce/blob/95932f288a59a0b68ff9fe8cc1a94523a4d0fb85/src/api/bulk.ts#L1244-L1264

This can crash the process with an out-of-memory error if someone tries to export a few million records, because all records are held in memory in a record[] array. For instance:

sf data query --query "SELECT Id FROM Contact" --bulk

image
src: https://github.com/salesforcecli/plugin-data/blob/b56e778f6c9ef8affbc0719cafb629b07d19ce2c/src/commands/data/query.ts#L135-L153

This type of error can't be caught with a try/catch. The Node.js process terminates straight away.

This issue does not happen with the jsforce bulk v1 implementation, because it was developed with streams in mind.

image
src: https://github.com/jsforce/jsforce/blob/95932f288a59a0b68ff9fe8cc1a94523a4d0fb85/src/api/bulk.ts#L959-L983

What are you trying to do
Export a few million records using sfdx force:data:query --bulk

Describe the solution you'd like

1. jsforce must accept a write stream so that memory can be released after its data is written to disk. Additionally, memory could be released whenever this.records[] reaches N KB of data.
2. sfdx should create a write stream when the -b flag is set.
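
The idea in the two points above can be sketched with plain Node.js streams. This is only an illustration, not jsforce or plugin-data code: `fetchResultLines` is a hypothetical stand-in for a bulk v2 result set, and `exportToFile` is an assumed helper name. The point it shows is that `pipeline()` applies backpressure, so only a small buffer of records is in memory at any time instead of the whole result set.

```typescript
import { createWriteStream, mkdtempSync, statSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { Readable } from "node:stream";
import { pipeline } from "node:stream/promises";

// Hypothetical stand-in for a bulk query result: yields one CSV line at a time
// instead of collecting every record into an array first.
async function* fetchResultLines(total: number): AsyncGenerator<string> {
  yield "Id\n";
  for (let i = 0; i < total; i++) {
    // Fake 15-character record Id, for illustration only.
    yield `003${String(i).padStart(12, "0")}\n`;
  }
}

// Assumed helper: streams the lines to a file. pipeline() waits for the
// write stream to drain before pulling more lines from the source.
async function exportToFile(lines: AsyncIterable<string>, path: string): Promise<void> {
  await pipeline(Readable.from(lines), createWriteStream(path));
}

async function main(): Promise<void> {
  const out = join(mkdtempSync(join(tmpdir(), "bulk-export-")), "contacts.csv");
  await exportToFile(fetchResultLines(1000), out);
  console.log(`wrote ${statSync(out).size} bytes to ${out}`);
}

main().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});
```

With this shape, the CLI side would only need to decide where the write stream points (a file, or stdout) when the bulk flag is set.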

Describe alternatives you've considered
N/a

Additional context
N/a

@AllanOricil AllanOricil added the feature Issue or pull request for a new feature label Mar 12, 2023
@github-actions

Thank you for filing this feature request. We appreciate your feedback and will review the feature at our next grooming or sprint planning session. We prioritize feature requests with more upvotes and comments.

@git2gus

git2gus bot commented Mar 12, 2023

This issue has been linked to a new work item: W-12677563

@AllanOricil
Author

AllanOricil commented Mar 12, 2023

@mshanemc @RodEsp
I think you could take a look at this since you are maintaining jsforce 2.0

@RodEsp
Contributor

RodEsp commented Mar 12, 2023

Heya @AllanOricil, I'm not involved with JSForce anymore but I'm sure Shane can take a look.

AllanOricil changed the title from "sf data query could throw Process out of Memory Exception ?" to "sf data query could throw JavaScript heap out of memory because jsforce bulk v2 allocates all record results in memory" on Mar 16, 2023
@AllanOricil
Author

AllanOricil commented Mar 16, 2023

Experiment with 103k Product2 records

image

image

This is the result when using the default heap size
image

This is the result after configuring the heap size to 128 MB
image

image
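
For anyone repeating this experiment, a small sketch like the one below can confirm which heap limit is actually in effect. It assumes the limit was set with Node's `--max-old-space-size` option (for example through `NODE_OPTIONS`); the issue does not say how the heap size was configured. The helper names are illustrative. Note that the reported limit is usually somewhat higher than the configured old-space size, because it covers the other heap spaces as well.

```typescript
import { getHeapStatistics } from "node:v8";

// Returns the V8 heap size limit of the current process, in MiB.
function heapLimitMiB(): number {
  return getHeapStatistics().heap_size_limit / (1024 * 1024);
}

// Returns the fraction of the heap limit that is currently in use.
function heapUsageRatio(): number {
  const { used_heap_size, heap_size_limit } = getHeapStatistics();
  return used_heap_size / heap_size_limit;
}

console.log(`heap limit: ${heapLimitMiB().toFixed(0)} MiB`);
console.log(`in use: ${(heapUsageRatio() * 100).toFixed(1)}%`);
```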

@iowillhoit
Contributor

Thanks for this also @AllanOricil, I'll add this to a JSForce category in our backlog.

AllanOricil changed the title from "sf data query could throw JavaScript heap out of memory because jsforce bulk v2 allocates all record results in memory" to "sf data query --bulk throws Fatal Error Javascript Heap out of memory" on Mar 27, 2023
@AllanOricil
Author

These PRs fix this bug

@cristiand391
Member

Closing this. For big queries we added new data export bulk commands:
https://github.com/forcedotcom/cli/tree/main/releasenotes/#2626-october-16-2024

Bulk mode for data query will be deprecated in a future release.
