-
Notifications
You must be signed in to change notification settings - Fork 4.3k
fix: large amounts of stacks slows down synthesis #34480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
A large amount of stacks leads to a large body of metadata into the `manifest.json` file. The file can grow too large to be written to disk (>512MB), but even if it doesn't fail outright, serializing and writing and loading a large JSON file takes a lot of CPU time. This PR splits the metadata for stacks off into a separate file which is written separately, moving it out of the hot path of working with the `manifest.json` file. On a tree of 256 stacks and ~40,000 constructs, this shaves off 5 seconds off of the total synthesis time.
27b76dd to
046eb38
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This review is outdated)
✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.
| * calls and accumulates into an array, both of which are much slower | ||
| * than this solution. | ||
| */ | ||
| export function* iterateDfsPreorder(root: IConstruct) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[q]: preorder would be visiting left children before right one, which is not what this method does. Was that intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preorder means parent before children, which this method does do.
| import { IConstruct } from 'constructs'; | ||
|
|
||
| /** | ||
| * Breadth-first iterator over the construct tree |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean Depth-first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, copy/paste-o. Thanks.
| // Use a specialized queue data structure. Using `Array.shift()` | ||
| // has a huge performance penalty (difference on the order of | ||
| // ~50ms vs ~1s to iterate a large construct tree) | ||
| const queue: IConstruct[] = [root]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be called stack (filo), a queue would be (fifo)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
Comments were outdated
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
When CDK apps grow extremely large (think 10-100 stacks, 1000-10000 constructs), all metadata together begins to exceed 512MB, the maximum string size in NodeJS. People usually deal with this by disabling metadata, but they shouldn't have to. In addition, even for manifests that don't exceed 512MB the extremely large size of the single JSON object slows down its writing and reading every time, even if the metadata doesn't need to be accessed. An effective solution is to write the metadata of an artifact to a separate file. This PR introduces the ability for that into the Cloud Assembly schema, and updates the CLI to read from both sources if available. Relates to aws/aws-cdk#34480.
A large amount of stacks leads to a large body of metadata into the
manifest.jsonfile. The file can grow too large to be written to disk (>512MB), but even if it doesn't fail outright, serializing and writing and loading a large JSON file takes a lot of CPU time.This PR splits the metadata for stacks off into a separate file which is written separately, moving it out of the hot path of working with the
manifest.jsonfile. On a tree of 256 stacks and ~40,000constructs, this shaves off 5 seconds off of the total synthesis time.
Also in this PR: construct tree iteration was being done in a lot of places, with function
recursion and returning arrays. Moving this all into some functions that use generators. This improves performance only slightly, but it should improve memory usage by a whole lot for large construct trees.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license