fix: synthesis fails if `tree.json` exceeds 512MB #34478

rix0rrr · 2025-05-16T08:25:57Z

In large applications with a lot of constructs, tree.json can grow to exceed 512MB, which means it can't be serialized anymore.

This PR introduces the concept of a "subtree reference". Once a construct tree grows beyond a fixed number of nodes, we write subtrees to separate files and put a reference to those files into the original tree.
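To make the idea concrete, here is a minimal sketch of splitting a tree into several files once a node budget is reached. It is not the code in this PR: the names `SourceNode`, `OutNode`, `OutTree`, `splitTree`, the `subtreeFile` field and the `tree-N.json` file names are all hypothetical, and the fullness check is simplified to run once per parent, so a file can overshoot the budget by that parent's child count.

```typescript
// Hypothetical input shape: a construct with an id and optional children.
interface SourceNode { id: string; children?: SourceNode[] }

// Hypothetical output shape. `subtreeFile` stands in for a "subtree reference";
// the real field name and file layout are not shown in this conversation.
interface OutNode {
  id: string;
  path: string;
  children?: Record<string, OutNode>;
  subtreeFile?: string;
}

interface OutTree { file: string; root: OutNode; nodeCount: number }

function splitTree(source: SourceNode, maxNodes: number): OutTree[] {
  const first: OutTree = { file: 'tree.json', root: { id: source.id, path: source.id }, nodeCount: 1 };
  const trees: OutTree[] = [first];
  const queue: Array<{ src: SourceNode; out: OutNode; tree: OutTree }> = [
    { src: source, out: first.root, tree: first },
  ];

  // Breadth-first walk; the index acts as the head of the queue.
  for (let i = 0; i < queue.length; i++) {
    const item = queue[i]!;
    const kids = item.src.children ?? [];
    if (kids.length === 0) { continue; }

    let out = item.out;
    let tree = item.tree;
    if (tree.nodeCount >= maxNodes) {
      // The tree is full: the prospective parent becomes the root of a new
      // tree, and only a reference to it stays behind in the original tree.
      const newRoot: OutNode = { id: out.id, path: out.path };
      const newTree: OutTree = { file: `tree-${trees.length}.json`, root: newRoot, nodeCount: 1 };
      trees.push(newTree);
      out.subtreeFile = newTree.file;
      out = newRoot;
      tree = newTree;
    }

    const children: Record<string, OutNode> = {};
    out.children = children;
    for (const kid of kids) {
      const kidOut: OutNode = { id: kid.id, path: `${out.path}/${kid.id}` };
      children[kid.id] = kidOut;
      tree.nodeCount++;
      queue.push({ src: kid, out: kidOut, tree });
    }
  }
  return trees;
}
```

With an app of three stacks holding two resources each and a budget of 4, this sketch keeps the app and the three stack nodes in `tree.json` and moves each stack's children into its own file.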

The number of nodes can be configured with the context value `@aws-cdk/core.TreeMetadata:maxNodes` and defaults to 500,000. That default assumes an average size of 1kB per node, which is an overestimate for safety; if we find that this number is too high in practice, we may still lower it in the future.
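The arithmetic behind the default can be checked in a few lines. The figures come from the description above; the constant names are illustrative, and reading "1kB" as 1024 bytes and "512MB" as 512 MiB is an assumption made here.

```typescript
// Figures from the PR description; names are illustrative only.
const MAX_NODES_DEFAULT = 500000;               // default for @aws-cdk/core.TreeMetadata:maxNodes
const ASSUMED_BYTES_PER_NODE = 1024;            // "1kB per node", read as 1024 bytes (assumption)
const SERIALIZATION_LIMIT = 512 * 1024 * 1024;  // "512MB", read as 512 MiB (assumption)

// Worst case for one file if every node really were 1kB.
const worstCaseBytes = MAX_NODES_DEFAULT * ASSUMED_BYTES_PER_NODE; // 512,000,000

// Remaining headroom below the limit.
const headroomBytes = SERIALIZATION_LIMIT - worstCaseBytes;        // 24,870,912
```

Under these assumptions a full file stays roughly 24 MB under the limit, which is why the per-node estimate has to be an overestimate rather than an average.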

Fixes #27261.

For consumers that have not been updated, there is graceful degradation: the parts of the tree they can see are cut off at certain tree depths, but only very large applications are affected by this. Tree data consumers can be updated gradually to deal with these references.
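The difference between the two kinds of consumer can be sketched as follows. The `subtreeFile` field, the `TreeFileNode` shape and the `load` callback are hypothetical stand-ins; the real reference format is not shown in this conversation.

```typescript
// Hypothetical node shape as it might appear in a tree file.
interface TreeFileNode {
  id: string;
  children?: Record<string, TreeFileNode>;
  subtreeFile?: string; // hypothetical reference to another file
}

// A consumer that was not updated: it only follows `children`, so a
// referenced subtree simply looks like a node without children.
function countLegacy(node: TreeFileNode): number {
  let total = 1;
  for (const child of Object.values(node.children ?? {})) {
    total += countLegacy(child);
  }
  return total;
}

// An updated consumer: when it meets a reference, it loads the other file
// (through a caller-supplied loader) and continues from that file's root.
function countResolved(node: TreeFileNode, load: (file: string) => TreeFileNode): number {
  const actual = node.subtreeFile !== undefined ? load(node.subtreeFile) : node;
  let total = 1;
  for (const child of Object.values(actual.children ?? {})) {
    total += countResolved(child, load);
  }
  return total;
}
```

The legacy walk never fails on a reference; it just reports fewer nodes, which is the degradation described above.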

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

aws-cdk-automation

(This review is outdated)

✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.

QuantumNeuralCoder · 2025-05-16T20:03:57Z

packages/aws-cdk-lib/core/lib/private/linked-queue.ts

@@ -0,0 +1,45 @@
+/**


A few suggested improvements:

Length tracking

    private _length = 0;
    public get length() { return this._length; }

isEmpty check

    public isEmpty(): boolean { return this.head === undefined; }

clear method

    public clear() { this.head = undefined; this.last = undefined; this._length = 0; }

Iterator support

    public *[Symbol.iterator](): IterableIterator<A> {
      let current = this.head;
      while (current) {
        yield current.value;
        current = current.next;
      }
    }

for use as

    const q = new LinkedQueue<number>([1, 2, 3]);
    for (const item of q) {
      console.log(item); // 1, 2, 3
    }

I'm not quite comfortable adding these methods that we wouldn't be using.

length/isEmpty: sure, I guess. Though if possible I prefer a coding style of the form "do, then check" rather than "check whether the next call will succeed, then do the next call". This style avoids TOCTOU bugs without having to constantly think about whether you are in a situation where they do or do not apply.

clear: not using it, so why add it?

iterator: I'm not comfortable writing a for loop over a data structure that gets mutated. I understand we could code the list to make that work, but I would get uneasy seeing the client side of that code, since it relies on too many unknowns.

The comment was meant to make it a more general-purpose data structure. This is great for the current ask.
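For readers following this thread, a minimal sketch of the "do, then check" style on a linked queue. This is not the contents of linked-queue.ts: `SketchQueue`, `QNode` and `drain` are hypothetical names, and the sketch assumes `shift()` returns `undefined` when the queue is empty.

```typescript
interface QNode<A> { value: A; next?: QNode<A> }

// Hypothetical singly linked FIFO queue with O(1) push and shift.
class SketchQueue<A> {
  private head?: QNode<A>;
  private last?: QNode<A>;

  constructor(items?: Iterable<A>) {
    for (const item of items ?? []) { this.push(item); }
  }

  public push(value: A): void {
    const node: QNode<A> = { value };
    if (this.last) { this.last.next = node; } else { this.head = node; }
    this.last = node;
  }

  // Returns undefined when the queue is empty instead of requiring a prior check.
  public shift(): A | undefined {
    const node = this.head;
    if (!node) { return undefined; }
    this.head = node.next;
    if (!this.head) { this.last = undefined; }
    return node.value;
  }
}

// "Do, then check": take the element first, then look at what came back.
// No isEmpty()/length call is needed, and no iterator runs over a structure
// that may be mutated while it is being consumed.
// Caveat: this cannot tell an empty queue from a stored `undefined` value.
function drain<A>(queue: SketchQueue<A>): A[] {
  const out: A[] = [];
  let next = queue.shift();
  while (next !== undefined) {
    out.push(next);
    next = queue.shift();
  }
  return out;
}
```

Because the loop only ever asks the queue for its next element, it stays correct if items are pushed while draining.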

QuantumNeuralCoder · 2025-05-16T20:16:20Z

packages/aws-cdk-lib/core/lib/private/tree-metadata.ts

+ *   we will convert the prospective parent to the root of a new tree and replace it
+ *   with a reference in the original tree.
+ *   - Choosing this method instead of making the child a new root because we have to
+ *     assume that all leaf nodes of a "full" tree will still get children added to them,


Since balancing the tree is not a consideration here, would replacing BFS with DFS make sense, both for readability of the templates and for keeping related referenced nodes together?

My goal here is to show clients that aren't aware of subtree references "as much as possible". That means they get to see "a bit of Stack1-Stack100" rather than "everything from Stack1 and nothing from Stack2-Stack100".
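A small sketch of why traversal order matters for such clients. It lists the first `budget` nodes reached in either order; `DemoNode` and `firstNodes` are hypothetical names, and this illustrates the trade-off only, not the splitting code in this PR.

```typescript
interface DemoNode { id: string; children?: DemoNode[] }

// Return the paths of the first `budget` nodes visited in the given order.
function firstNodes(root: DemoNode, budget: number, order: 'bfs' | 'dfs'): string[] {
  const seen: string[] = [];
  const pending: Array<{ node: DemoNode; path: string }> = [{ node: root, path: root.id }];

  while (pending.length > 0 && seen.length < budget) {
    // BFS takes from the front (queue), DFS from the back (stack).
    const cur = order === 'bfs' ? pending.shift()! : pending.pop()!;
    seen.push(cur.path);

    const kids = (cur.node.children ?? []).map((c) => ({ node: c, path: `${cur.path}/${c.id}` }));
    // For DFS, push children reversed so the first child is popped first (preorder).
    pending.push(...(order === 'bfs' ? kids : kids.reverse()));
  }
  return seen;
}
```

For an app with stacks S1 to S3 holding resources R1 to R3 each and a budget of 5, breadth-first order reaches all three stacks plus one resource, while depth-first order reaches S1 and its three resources and never sees S2 or S3.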

I'm wondering whether a metadata section or file (possibly the same as the CDK metadata, possibly not) should be included to summarize stats such as the number of nodes in the template and how many references were generated. That might be good for traceability and troubleshooting if things go wrong.

QuantumNeuralCoder · 2025-05-16T20:18:47Z

packages/aws-cdk-lib/core/lib/private/tree-metadata.ts

+  /**
+   * Whether the given tree is full
+   */
+  private isTreeFull(t: Tree) {


There is an assumption stated above that the number of nodes, as opposed to the size of the template, determines when we trigger the splitting algorithm. Since each node varies in size, would it make sense to keep track of the size as well? For example, a node with very many props may cause the template to fill up faster without hitting the node-count limit.

Estimating the size of a node is an expensive operation. I used to have a solution that sampled node sizes by stringifying the entire JSON string, but it is slower and you need to think about sampling frequencies and size bias: leaf nodes (resources) are typically larger than intermediate nodes (grouping constructs).

I did a conservative, adjustable estimate here which will be cheap to compute. I think this will suffice for now; we can always come back to it and do something more sophisticated if we find this to be causing problems.
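The cost difference between the two approaches can be sketched like this. `CountedTree`, `isFullByCount` and `measuredSize` are hypothetical names; the actual `isTreeFull` body is not shown in this conversation.

```typescript
// Hypothetical tree bookkeeping: only a running node counter is kept.
interface CountedTree { nodeCount: number }

// Count-based check: O(1), independent of how large each node is.
function isFullByCount(tree: CountedTree, maxNodes: number): boolean {
  return tree.nodeCount >= maxNodes;
}

// Size-based measurement: has to serialize the node, so the cost grows with
// the node's size. A sampling scheme would call this on a subset of nodes
// and would have to correct for leaf nodes typically being larger.
function measuredSize(node: unknown): number {
  return JSON.stringify(node).length;
}
```

The count-based check trades accuracy for speed, which is why it is paired with a deliberately pessimistic per-node size assumption.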

mergify · 2025-05-22T18:07:03Z

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

mergify · 2025-05-22T18:09:54Z

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

aws-cdk-automation · 2025-05-22T19:07:26Z

AWS CodeBuild CI Report

CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
Commit ID: 6f37d38
Result: SUCCEEDED
Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

mergify · 2025-05-22T19:07:47Z

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

github-actions · 2025-05-22T19:08:02Z

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.

rix0rrr requested a review from a team as a code owner May 16, 2025 08:25

rix0rrr self-assigned this May 16, 2025

aws-cdk-automation requested a review from a team May 16, 2025 08:26

github-actions bot added bug This issue is a bug. p1 labels May 16, 2025

aws-cdk-automation previously requested changes May 16, 2025

fix: synthesis fails if tree.json exceeds 512MB

35ebdaa

rix0rrr force-pushed the huijbers/split-tree branch from 30603a7 to 35ebdaa Compare May 16, 2025 11:44

mergify bot added the contribution/core This is a PR that came from AWS. label May 16, 2025

rix0rrr added the pr-linter/exempt-integ-test The PR linter will not require integ test changes label May 16, 2025

rix0rrr added 2 commits May 16, 2025 13:49

Undo all the changes that have to do with performance improvements of synthesis

92e180a

DFS preorder is not used in this PR

79fdace

aws-cdk-automation added the pr/needs-maintainer-review This PR needs a review from a Core Team Member label May 16, 2025

QuantumNeuralCoder reviewed May 16, 2025

QuantumNeuralCoder approved these changes May 22, 2025

aws-cdk-automation removed the pr/needs-maintainer-review This PR needs a review from a Core Team Member label May 22, 2025

Merge branch 'main' into huijbers/split-tree

6f37d38

mergify bot merged commit ff2f4af into main May 22, 2025
16 checks passed

mergify bot deleted the huijbers/split-tree branch May 22, 2025 19:07

github-actions bot locked as resolved and limited conversation to collaborators May 22, 2025
