perf(estree/tokens): preallocate sufficient space for tokens JSON#19851
Merging this PR will improve performance by 23.74%
Pull request overview
This PR improves performance of oxc_estree_tokens JSON serialization by precomputing a more accurate output-size estimate and reserving serializer buffer capacity up front to avoid buffer growth/copying during token emission.
Changes:
- Add `estimate_json_len(tokens_len, source_text_len, is_compact)` to estimate token JSON size.
- Use the new estimator to preallocate capacity for both compact and pretty token JSON serializers.
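
As a rough illustration of the estimator described above, here is a minimal sketch. Only the function name and its three parameters come from the PR overview; the per-token constants, the formula, and the use of `String` in place of the real serializer buffer are assumptions made for this example, not the values or types used in `oxc_estree_tokens`.

```rust
/// Hypothetical per-token overhead in bytes (field names, punctuation,
/// offsets). The real constants used by the PR are not shown in this page.
const COMPACT_BYTES_PER_TOKEN: usize = 64;
const PRETTY_BYTES_PER_TOKEN: usize = 128;

/// Estimate the size of the tokens JSON in bytes.
/// Assumed formula: fixed scaffolding per token, plus the source text length
/// as a rough bound on the token values, plus the two array brackets.
fn estimate_json_len(tokens_len: usize, source_text_len: usize, is_compact: bool) -> usize {
    let per_token = if is_compact {
        COMPACT_BYTES_PER_TOKEN
    } else {
        PRETTY_BYTES_PER_TOKEN
    };
    tokens_len
        .saturating_mul(per_token)
        .saturating_add(source_text_len)
        .saturating_add(2)
}

fn main() {
    // Reserve the estimated capacity once, before any token is written.
    let capacity = estimate_json_len(1_000, 5_000, true);
    let buffer = String::with_capacity(capacity);
    assert!(buffer.capacity() >= capacity);
    println!("reserved {} bytes up front", buffer.capacity());
}
```

Overestimating costs some unused memory, while underestimating brings back the buffer growth the PR is trying to avoid, so an estimator like this would normally err on the high side.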
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| crates/oxc_estree_tokens/src/serialize.rs | Introduces a JSON size estimation helper used to preallocate serializer buffer capacity. |
| crates/oxc_estree_tokens/src/lib.rs | Switches serializer initialization to use the new estimated capacity for compact/pretty outputs. |
…9851) Make a more accurate estimate of total size the JSON will be, and reserve sufficient capacity in the `CodeBuffer` to contain it before starting serializing tokens. This ensures the buffer will not need to grow during serialization, reducing memory copying.
### 🚀 Features

- 733d6dc parser: Report error on `infer` outside conditional type (#19879) (camc314)
- c2a42f6 allocator: Add `Vec::into_bump_slice_mut` (#19895) (overlookmotel)
- ee4982b parser: Add `VARIANTS` const to `Kind` via `fieldless_enum!` macro (#19877) (overlookmotel)
- b3dceae data_structures: Add `fieldless_enum!` macro (#19876) (overlookmotel)
- 12b841e parser: Make all `Kind::is_*` methods `const` (#19874) (overlookmotel)
- 25c2e25 estree/tokens: Add function to update tokens in place (#19856) (overlookmotel)
- f78e6df parser: Add `mutate_tokens` Cargo feature (#19853) (overlookmotel)
- 5036bb6 parser: Report error on `for await` in static blocks (#19844) (camc314)
- 42bd431 parser: Report error for missing initializer in using decl (#19824) (camc314)
- a2f58e5 parser: Report error for `implements` clause in non-ts files (#19820) (Cameron)
- b25228a estree: Add `IS_COMPACT` const to `Formatter` trait (#19787) (overlookmotel)
- e2a1b79 estree: Expose buffer and formatter of serializers (#19773) (overlookmotel)
- 4699498 data_structures: Add `CodeBuffer::print_strs_array` (#19760) (overlookmotel)
- 233f947 estree: `oxc_estree` crate export config and formatter types (#19724) (overlookmotel)
- 5937a32 semantic: Introduce `symbol_declarations` method (#19609) (camc314)
- ea6b796 parser: Add `LexerConfig::TOKENS_METHOD_IS_STATIC` const (#19683) (overlookmotel)
- 655c38f semantic: Add "did you mean?" suggestions to undefined name errors (#19102) (copilot-swe-agent)
- 9e11dc6 parser,estree,coverage: Collect tokens in parser and convert to ESTree format (#19497) (camc314)
- c4a3677 parser: Report error for initializer in ambient context (#19187) (camc314)

### 🐛 Bug Fixes

- abc7e19 codegen: Improve parenthesised checks when printing types (#19880) (camc314)
- 017de5d parser: Update error code for type annotation in `for...in` statement (#19882) (camc314)
- 7682e5a linter/plugins: Decode escapes in identifier tokens (#19838) (overlookmotel)
- 06767ed estree/tokens: Convert `this` tokens in `TSTypeName` (#19815) (overlookmotel)
- ef798af parser: Use TS8037 for satisfies expression in JS files diagnostic (#19819) (camc314)
- 98ea5c5 parser: Use TS8016 for type assertions in JS files diagnostic (#19818) (camc314)
- 1710f56 codegen: Remove double indentation for enum inside namespace (#19775) (Dunqing)
- 9e4995c codegen: Print type annotation on `CatchParameter` (#19790) (camc314)
- 297b2bb codegen: Wrap `TSConditionalType` in parens when necessary (#19788) (camc314)
- cec7878 codegen: Print `definite` property on AccessorProperty (#19786) (camc314)
- 6f395cf codegen: Print `definite` property on PropertyDefinition (#19785) (camc314)
- b749373 codegen: Correctly parenthesise TSArrayType (#19784) (camc314)
- 876dc1b codegen: Print object property `this` param (#19783) (camc314)
- 93bb861 formatter: Trim trailing whitespace before breaking line (#19740) (leaysgur)
- ed17bbf codegen: Print `override` keyword for method and property definitions (#19753) (Dunqing)
- 6a59a76 parser: Improve error recovery for private identifiers in property names (#19710) (Boshen)
- 3b96f41 codegen: Print comments in JSX expression containers and spread attributes (#19701) (Boshen)
- f5694ce estree/tokens: Reverse field order of `regex` object in tokens (#19679) (overlookmotel)
- b2b7a55 estree/tokens: Generate tokens for files with BOM (#19535) (overlookmotel)
- 50a7514 estree: Fix tokens for JSX (#19524) (overlookmotel)
- a35063e minifier: Preserve side effects for meta property url reads (#19668) (Boshen)
- 8ad3430 semantic/jsdoc: Handle even-numbered backtick sequences in JSDoc parsing (#19664) (Boshen)

### ⚡ Performance

- 05ccf9f linter/plugins: Transfer tokens via raw transfer (#19893) (overlookmotel)
- c1bfdcf estree/tokens: Preallocate sufficient space for tokens JSON (#19851) (overlookmotel)
- 4b0611a estree/tokens: Introduce `ESTreeTokenConfig` trait (#19842) (overlookmotel)
- 81bab90 estree/tokens: Do not JSON-encode keyword, punctuator, etc tokens (#19814) (overlookmotel)
- 6260ddd estree/tokens: Do not JSON-encode `this` identifiers (#19813) (overlookmotel)
- b378f4a estree/tokens: Do not JSON-encode JSX identifiers (#19812) (overlookmotel)
- 5016d92 estree/tokens: Handle regex tokens separately (#19796) (overlookmotel)
- 780a68e estree/tokens: Use strings from AST for identifier tokens (#19744) (overlookmotel)
- dc9c2e3 estree: Use `CodeBuffer::print_strs_array` to reduce bounds checks (#19766) (overlookmotel)
- 845da35 estree: Use `CodeBuffer::print_indent` (#19727) (overlookmotel)
- ec88f6a estree/tokens: Serialize tokens while visiting AST (#19726) (overlookmotel)
- bc6507f estree/tokens: Serialize with `ESTree` not `serde` (#19725) (overlookmotel)
- ec24859 estree/tokens: Do not branch on presence of override twice (#19721) (overlookmotel)
- dac14be estree/tokens: Replace hash map with `Vec` (#19718) (overlookmotel)
- b9d2443 estree/tokens: Replace multiple hash sets into a single hash map (#19716) (overlookmotel)
- 7233548 parser: Remove branches from `finish_next_inner` (#19695) (overlookmotel)
- b5d9845 parser: Remove const generic param from `finish_next_inner` (#19684) (overlookmotel)
- 8940f66 estree/tokens: Serialize tokens to compact JSON (#19572) (overlookmotel)
- 136e39b parser/tokens: Pre-allocate capacity for tokens (#19543) (overlookmotel)
- 6a6513c linter/plugins: Use Oxc tokens in plugins (#19498) (camc314)
- b3b2d30 parser: Introduce `ParserConfig` (#19637) (overlookmotel)

### 📚 Documentation

- b2b7a64 estree/tokens: Correct comment (#19873) (overlookmotel)
- 0399311 estree/tokens: Improve comments (#19836) (overlookmotel)
- 1b392de minifier: Add `Function.prototype.toString` assumption (#19758) (sapphi-red)
- 75c9cd8 parser: Improve doc comments for `ParserConfig` and `LexerConfig` (#19682) (overlookmotel)
- 2fa936f README.md: Map npm package links to npmx.dev (#19666) (Boshen)

Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>

Make a more accurate estimate of total size the JSON will be, and reserve sufficient capacity in the `CodeBuffer` to contain it before starting serializing tokens. This ensures the buffer will not need to grow during serialization, reducing memory copying.
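
To make the effect of reserving capacity concrete, the following sketch counts how often a buffer has to grow while a small JSON array is written. It uses a plain `String` wrapped in a hypothetical `CountingBuffer` as a stand-in for `CodeBuffer`, and a simplified `serialize` helper that does not escape values; neither is part of the PR.

```rust
/// Hypothetical stand-in for `CodeBuffer`: a `String` that counts how many
/// times its capacity changed, i.e. how many times it had to reallocate.
struct CountingBuffer {
    bytes: String,
    growths: usize,
}

impl CountingBuffer {
    fn with_capacity(capacity: usize) -> Self {
        Self { bytes: String::with_capacity(capacity), growths: 0 }
    }

    fn push_str(&mut self, s: &str) {
        let before = self.bytes.capacity();
        self.bytes.push_str(s);
        if self.bytes.capacity() != before {
            self.growths += 1;
        }
    }
}

/// Write `values` as a JSON array of strings (no escaping, illustration only)
/// and return the output together with the number of buffer growths.
fn serialize(values: &[&str], capacity: usize) -> (String, usize) {
    let mut buffer = CountingBuffer::with_capacity(capacity);
    buffer.push_str("[");
    for (index, value) in values.iter().enumerate() {
        if index > 0 {
            buffer.push_str(",");
        }
        buffer.push_str("\"");
        buffer.push_str(value);
        buffer.push_str("\"");
    }
    buffer.push_str("]");
    (buffer.bytes, buffer.growths)
}

fn main() {
    let values = ["const", "x", "=", "1"];

    // No reservation: the buffer has to grow at least once.
    let (json, growths_unreserved) = serialize(&values, 0);
    assert!(growths_unreserved > 0);

    // Sufficient reservation: no growth, so nothing is copied mid-serialization.
    let (same_json, growths_reserved) = serialize(&values, json.len());
    assert_eq!(json, same_json);
    assert_eq!(growths_reserved, 0);
}
```

The output is identical in both runs; only the number of reallocations differs, which is the cost this change targets.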