[v2] Hulksmash build slowdowns on larger sites #6226
Deploy preview for using-glamor failed. Built with commit fa32e5d https://app.netlify.com/sites/using-glamor/deploys/5b457017c6aed64e9461ef06
Hmm and a 25,000 page site builds in ~7.5 minutes. The file for pages metadata is now enormous (~4mb) so we'll need to fix that but this is all very promising.
Hmmm, gatsbyjs.org isn't building since StaticQueries aren't being run during builds. Is this a known issue? Been heads down last few days.
}

// Delete internal data from pageContext
delete result.pageContext.jsonName
Refs: #5096
This would break current implementation of sitemap plugin I think - but we can get that info for redux store instead of querying for it. This also would allow us to skip updating schema part (which is needed to add Switching nodes reducer to map and mutating it instead of creating new state for every
I'm not seeing this here
Me neither. Seems like nice speedups for .org. Unscientific tests running
Edit: oh yeah,
Oh, great point. That would speed up creating the SitePage nodes a ton. I'll try that too in this PR before declaring SitePage nodes dead.
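The suggestion above about switching a reducer to a Map and mutating it, rather than creating new state on every action, can be sketched roughly as follows. This is only an illustration under assumptions: the action type `ADD_NODE`, the node shape, and both reducer names are hypothetical and are not taken from Gatsby's source.

```javascript
// Hypothetical action type and node shape, for illustration only.

// Immutable style: copies every existing entry on each action,
// so each insert costs time proportional to the current state size.
function nodesAsObject(state = {}, action) {
  switch (action.type) {
    case "ADD_NODE":
      return { ...state, [action.payload.id]: action.payload }
    default:
      return state
  }
}

// Map style: mutates and returns the same Map, so each insert is cheap.
// Trade-off: reference-equality checks on this slice no longer detect changes.
function nodesAsMap(state = new Map(), action) {
  switch (action.type) {
    case "ADD_NODE":
      state.set(action.payload.id, action.payload)
      return state
    default:
      return state
  }
}

let objState
let mapState
for (let i = 0; i < 1000; i++) {
  const action = { type: "ADD_NODE", payload: { id: `node-${i}` } }
  objState = nodesAsObject(objState, action)
  mapState = nodesAsMap(mapState, action)
}
console.log(Object.keys(objState).length, mapState.size)
```

Both reducers end up holding the same entries; the difference is that the object version rebuilds the whole state on every action, which is where the cost grows on sites with many nodes.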
Hrrmmm... weird. I deleted node_modules and yarn.lock. Will keep poking at this.
This would be nice still. Though it seems your changes have been making the schema generation a lot faster.
@@ -13,7 +13,7 @@ module.exports = async (program: any) => {

   debug(`generating static HTML`)
   // Reduce pages objects to an array of paths.
-  const pages = store.getState().pages.map(page => page.path)
+  const pages = [...store.getState().pages.values()].map(page => page.path)
Array.from(store.getState().pages.values(), page => page.path) maybe?
Oh haha, didn't know this was part of Array.from. Yeah, totally makes sense.
i just learned it myself!
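For readers who have not used the second argument of Array.from, here is a minimal sketch of the two forms discussed in this thread. The `pages` Map below is a hypothetical stand-in for the redux state, not the real store.

```javascript
// Hypothetical stand-in for the `pages` state once it is a Map keyed by path.
const pages = new Map([
  ["/", { path: "/", component: "index.js" }],
  ["/about/", { path: "/about/", component: "about.js" }],
])

// Spread first, then map: builds an intermediate array of page objects.
const viaSpread = [...pages.values()].map(page => page.path)

// Array.from with a map function: maps each value while iterating,
// so no intermediate array of page objects is created.
const viaArrayFrom = Array.from(pages.values(), page => page.path)

console.log(viaSpread) // [ '/', '/about/' ]
console.log(viaArrayFrom) // [ '/', '/about/' ]
```

Both forms return the same array; the Array.from form simply skips one allocation, which may matter when the Map holds tens of thousands of pages.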
@@ -81,7 +90,7 @@ const findIdsWithoutDataDependencies = () => {
   // paths.
   const notTrackedIds = _.difference(
     [
-      ...state.pages.map(p => p.path),
+      ...[...state.pages.values()].map(p => p.path),
       ...[...state.staticQueryComponents.values()].map(c => c.jsonName),
we should do Array.from(set, mapFn) here probably as well
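A rough sketch of how the hunk above might look with Array.from applied to both collections. The state shape and the `trackedIds` set are assumptions made for illustration, and the lodash `_.difference` call is swapped for a plain filter against a Set so the example runs without dependencies.

```javascript
// Hypothetical state shape, loosely modeled on the diff above.
const state = {
  pages: new Map([
    ["/", { path: "/" }],
    ["/blog/", { path: "/blog/" }],
  ]),
  staticQueryComponents: new Map([["sq1", { jsonName: "sq--header" }]]),
}

// IDs that already have recorded data dependencies (assumed input).
const trackedIds = new Set(["/"])

// Collect every page path and static query name in one pass per collection.
const allIds = [
  ...Array.from(state.pages.values(), p => p.path),
  ...Array.from(state.staticQueryComponents.values(), c => c.jsonName),
]

// Stand-in for _.difference(allIds, trackedIds): keep IDs not yet tracked.
const notTrackedIds = allIds.filter(id => !trackedIds.has(id))

console.log(notTrackedIds) // [ '/blog/', 'sq--header' ]
```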
Finally got gatsbyjs.org to work: had deleted public/static/d without also deleting the .cache directory 🤦♂️ Builds are taking anywhere between ~55-80 seconds on a warm cache. Not bad!
Deploy preview for using-drupal ready! Built with commit b647bb1
Deploy preview for gatsbygram ready! Built with commit b647bb1
This is coming along! Just built a 25k page site in a bit over 2 minutes
Creating pages can still probably be made a lot faster. Writing out page data is getting weirdly slow w/ large number of pages but progress.
For building a 25k page site, it reduced the time spent writing out page data from 32 seconds to 0.32 seconds 😱
Aaaandddd dropped a 25k build site another 75% to 32 seconds :-D
/p/t/my-hello-world gatsby build
success open and validate gatsby-config — 0.007 s
success onPreBootstrap — 0.027 s
success delete html and css files from previous builds — 0.005 s
success copy gatsby files — 0.036 s
success source and transform nodes — 0.015 s
success building schema — 0.088 s
success createPages — 4.843 s
success createPagesStatefully — 0.111 s
success onPreExtractQueries — 0.003 s
success update schema — 0.055 s
success extract queries from components — 0.076 s
success run graphql queries — 8.749 s
success write out page data — 0.328 s
success write out redirect data — 0.002 s
success onPostBootstrap — 0.001 s
info bootstrap finished - 16.599 s
success Building production JavaScript and CSS bundles — 3.022 s
success Building static HTML for pages — 12.919 s
info Done building in 32.619 sec
A 10k page site builds in ~18 seconds and a 100k page site builds in 175 seconds.
Not really loving all the nested caching I'm adding... but it's necessary. Will revisit this Monday to see if there's a cleaner way to cache things.
Is this still the case, or will this be a viable solution? I’m looking at using Gatsby for a 400k page site, but not sure if things would just explode trying to do this.
@zachgibson not sure I understand what you mean?
@KyleAMathews It seemed your comment was saying this PR was an experiment. I was wondering if the techniques you ended up finding in this work would end up getting merged into core Gatsby.
@zachgibson the plan is to merge this into Gatsby core as soon as we can!
@KyleAMathews I don't know how this works but is there a need to keep all those empty directories ( gatsby/packages/gatsby/src/bootstrap/index.js Line 185 in d863bcb
@tradziej yeah :-) I'm going to put up a PR in a sec to not pre-create the folders.
@KyleAMathews Interesting to see your benchmarks. At the moment I'm trying out building ~12k pages and it takes around 22 minutes. Our bottleneck is the GraphQL queries, which take around 1000s-1500s to complete. What type of website did you build with these timings? As I see in your output, the GraphQL queries don't take that long to run. How many queries do you run?
The benchmark sites https://github.com/gatsbyjs/gatsby/tree/master/benchmarks Would love to hear more details about your site in another issue to see if we can figure out how to optimize it!
@KyleAMathews thanks, will check that out. Will post another issue if I can't find out any other ways to do it. I've created an issue for this now #7373
Spent the afternoon and evening going through critical path for creating pages.
Currently a 5000 page site can build in ~37 seconds. A very significant
improvement over the current.
This does necessitate one breaking change, namely no longer creating nodes for each page. This frankly
was more of academic interest and for debugging purposes, and given that it adds a
large amount of slowdown and should rarely if ever be used in sites, we should
hopefully be fine letting it go. In any case, if people are using it, there are
far better ways of querying the same data.
TODO