-
Notifications
You must be signed in to change notification settings - Fork 4.3k
fix: synthesis fails if tree.json exceeds 512MB
#34478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This review is outdated)
30603a7 to
35ebdaa
Compare
✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.
| @@ -0,0 +1,45 @@ | |||
| /** | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few improvements
- Length tracking
private _length = 0;
public get length() {
return this._length;
}- isEmpty check
public isEmpty(): boolean {
return this.head === undefined;
}- clear method
public clear() {
this.head = undefined;
this.last = undefined;
this._length = 0;
}- Iterator support
public *[Symbol.iterator](): IterableIterator<A> {
let current = this.head;
while (current) {
yield current.value;
current = current.next;
}
}for use as
const q = new LinkedQueue<number>([1, 2, 3]);
for (const item of q) {
console.log(item); // 1, 2, 3
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite comfortable adding these methods that we wouldn't be using.
length/isEmpty: sure, I guess. Though if possible I prefer coding style of the form "do, then check" rather than "will next call succeed / do next call". This is a style that will avoid TOCTOU's without having to constantly mentally think about whether you are in a situation where it does or does not apply.clear: not using it, so why add it?iterator: I'm not comfortable writing aforloop over a data structure that gets mutated. I understand we could code the list to make that work, but I would get uneasy seeing the client side of that code, since it relies on too many unknowns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment was to make it a more general purpose ds. This is great for the current ask.
| * we will convert the prospective parent to the root of a new tree and replace it | ||
| * with a reference in the original tree. | ||
| * - Choosing this method instead of making the child a new root because we have to | ||
| * assume that all leaf nodes of a "full" tree will still get children added to them, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since balancing the tree is not a consideration here, does replacing bfs with dfs make sense for readability of templates and keeping related referenced nodes together?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My goal here is to show clients that a aren't aware of subtree references "as much as possible". That means they get to see "a bit of Stack1-Stack100" rather than "everything from Stack1 and nothing from Stack2-Stack99"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if there is a metadata (not same as CDK metadata or maybe same) section/file that should be included to summarize stats like number of nodes in the template, how many refs were generated etc. might be good for traceability and troubleshooting if things go wrong.
| /** | ||
| * Whether the given tree is full | ||
| */ | ||
| private isTreeFull(t: Tree) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is an assumptions stated above where number of nodes as opposed to size of the template determines when we trigger the splitting algo. since each node varies in size, would it make sense to keep track of the size as well. For example a node with many many props may cause the template to become fuller faster without hitting the number of nodes limitation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Estimating the size of a node is an expensive operation. I used to have a solution that sampled node sizes by stringifying the entire JSON string, but it is slower and you need to think about sampling frequencies and size bias: leaf nodes (resources) are typically larger than intermediate nodes (grouping constructs).
I did a conservative, adjustable estimate here which will be cheap to compute. I think this will suffice for now, we can always come back it and do something more sophisticated if we find this to be causing problems.
|
Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork). |
|
Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork). |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
|
Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork). |
|
Comments on closed issues and PRs are hard for our team to see. |
In large applications with a lot of constructs,
tree.jsoncan grow to exceed 512MB, which means it can't be serialized anymore.In this PR, introduce the concept of a "subtree reference". Once a construct tree grows to exceed a fixed number of nodes we write subtrees to individual other files, and put a reference to those files into the original tree.
The number of nodes can be configured with the context value
@aws-cdk/core.TreeMetadata:maxNodes, and defaults to 500,000 (that assumes an average size of 1kB per node, which is an overestimate for safety. If we find that this number is too high in practice we may still lower it in the future).Fixes #27261.
For unupdated consumers, there is graceful degradation here: the parts of the tree they will be able to see are cut off at certain tree depths, but only very large applications would be affected by this. Tree data consumers can be updated gradually to deal with these references.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license