perf: don't load all shopify nodes into memory at once and avoid creating many temp objects #39138
Description
Handling of incremental data updates in Shopify (the `updateCache` function) has quite bad memory characteristics.
This PR avoids loading all of the nodes and holding them strongly in memory. Instead, it handles them in batches (per node type) and allows an event loop turn between batches.
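To illustrate the batching idea, here is a minimal sketch and not the code from this PR. The names `SketchNode`, `processBatch`, `processAllTypes`, and `yieldToEventLoop` are hypothetical, and `getNodesByType` and `touch` are passed in as plain callbacks standing in for whatever the plugin actually calls. It assumes nodes can be fetched one type at a time.

```typescript
// Hypothetical minimal node shape; real Gatsby nodes carry more fields.
interface SketchNode {
  id: string
  type: string
}

// Let pending timers and I/O callbacks run before the next batch starts.
function yieldToEventLoop(): Promise<void> {
  return new Promise<void>(resolve => setTimeout(resolve, 0))
}

// Handle a single node type. The array of nodes is only referenced inside
// this call, so it can be garbage collected once the call returns.
function processBatch(
  type: string,
  getNodesByType: (type: string) => SketchNode[],
  touch: (node: SketchNode) => void
): number {
  let processed = 0
  for (const node of getNodesByType(type)) {
    touch(node)
    processed++
  }
  return processed
}

// Walk the node types one by one instead of collecting every node up front,
// yielding to the event loop between batches.
async function processAllTypes(
  nodeTypes: string[],
  getNodesByType: (type: string) => SketchNode[],
  touch: (node: SketchNode) => void
): Promise<number> {
  let total = 0
  for (const type of nodeTypes) {
    total += processBatch(type, getNodesByType, touch)
    await yieldToEventLoop()
  }
  return total
}
```

At most one type's worth of nodes is referenced at a time, and the `await` between batches gives other queued work a chance to run.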
When testing with 30k Shopify products, here is how memory usage compares:
On the left is `latest` (with just some extra logs/activities, but no functional changes otherwise); on the right is the build with the changes in this PR. Note the time difference: 152s vs 9s for the build. On the left, peak memory usage was much higher, and garbage collection was triggered a lot (because temporary objects were being created and discarded over and over again).
Those changes are available in canary publish:
Tests
Tested manually, and https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby-source-shopify/__tests__/update-cache.ts continues to pass.
Related Issues