feat(estree/tokens): add function to update tokens in place #19856
Merging this PR will improve performance by ×5.3
Pull request overview
Adds an in-place ESTree token conversion path intended for raw transfer to the JS side, while keeping the existing JSON token serialization. The core change is a refactor that shares one AST visitor across two “modes” via separate context implementations.
Changes:
- Add `update_tokens` API to mutate `Vec<Token>` / `&mut [Token]` in place (kinds + UTF-16 span conversion).
- Refactor token JSON serialization to use a shared `Visitor<Context>` abstraction alongside the new update mode.
- Extend lexer `Kind` with `JSXIdentifier` and update benchmarks to use the raw in-place update path.
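To make the "UTF-16 span conversion" part concrete, here is a minimal sketch of rewriting token spans from UTF-8 byte offsets to UTF-16 code unit offsets in a single pass. It is not the code from this PR: `SketchToken` and `utf8_to_utf16_spans` are hypothetical names, and the sketch assumes tokens are sorted by `start`, do not overlap, and have offsets on `char` boundaries.

```rust
/// Hypothetical stand-in for a token span. The real `Token` type in
/// `oxc_parser` is different; this only models `start` and `end`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SketchToken {
    pub start: u32,
    pub end: u32,
}

/// Rewrite UTF-8 byte offsets into UTF-16 code unit offsets, in place.
///
/// Assumptions (not taken from the PR): tokens are sorted by `start`,
/// do not overlap, and every offset lies on a `char` boundary.
pub fn utf8_to_utf16_spans(source: &str, tokens: &mut [SketchToken]) {
    let mut chars = source.char_indices().peekable();
    let mut utf16_offset: u32 = 0;

    // Consume chars until the byte offset reaches `target`, counting the
    // UTF-16 code units passed so far. Offsets only move forwards, so the
    // whole conversion is a single pass over the source text.
    let mut advance_to = |target: u32| -> u32 {
        while let Some(&(byte_index, ch)) = chars.peek() {
            if (byte_index as u32) >= target {
                break;
            }
            utf16_offset += ch.len_utf16() as u32;
            chars.next();
        }
        utf16_offset
    };

    for token in tokens.iter_mut() {
        token.start = advance_to(token.start);
        token.end = advance_to(token.end);
    }
}
```

For ASCII-only source the offsets are unchanged, so a real implementation would likely skip the walk entirely in that case; the sketch leaves that optimisation out.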
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tasks/benchmark/benches/parser.rs | Switch raw token benchmark to call update_tokens instead of building JSON. |
| crates/oxc_parser/src/lexer/kind.rs | Add Kind::JSXIdentifier variant for ESTree conversion output. |
| crates/oxc_estree_tokens/src/serialize.rs | Major refactor: introduce Context + Visitor, add update_tokens, split JSON vs update behavior. |
| crates/oxc_estree_tokens/src/lib.rs | Re-export update_tokens from the crate root. |
| crates/oxc_estree_tokens/Cargo.toml | Enable oxc_parser’s mutate_tokens feature for in-place token mutation. |
Add a function `update_tokens` which converts tokens to ESTree format in place, mutating the `Token`s in the `Vec<Token>` directly. This is what will be used to send tokens to the JS side via raw transfer. Support for serializing tokens to JSON is retained. The two implementations share the same AST visitor and hook into it via separate `Context` implementations.
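The "one visitor, two `Context` implementations" design can be sketched as follows. This is an illustration under stated assumptions, not the code in `serialize.rs`: every name here (`SketchKind`, `SketchToken`, `JsonContext`, `UpdateContext`, the methods on `Context`) is hypothetical, and the visitor is driven by token indexes rather than by a real AST walk.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchKind {
    Ident,
    JsxIdentifier,
    Punctuator,
}

impl SketchKind {
    fn estree_name(self) -> &'static str {
        match self {
            SketchKind::Ident => "Identifier",
            SketchKind::JsxIdentifier => "JSXIdentifier",
            SketchKind::Punctuator => "Punctuator",
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SketchToken {
    pub kind: SketchKind,
    pub start: u32,
    pub end: u32,
}

/// The hooks the shared visitor calls. Each mode implements them differently.
pub trait Context {
    /// Handle the token at `index`, keeping its lexer kind.
    fn emit_unchanged(&mut self, index: usize);
    /// Handle the token at `index`, which the AST says is a JSX identifier.
    fn emit_jsx_identifier(&mut self, index: usize);
}

/// JSON mode: append one JSON object per token to a string.
pub struct JsonContext<'t> {
    pub tokens: &'t [SketchToken],
    pub json: String,
}

impl JsonContext<'_> {
    fn push(&mut self, index: usize, kind: SketchKind) {
        let token = self.tokens[index];
        if !self.json.is_empty() {
            self.json.push(',');
        }
        self.json.push_str(&format!(
            r#"{{"type":"{}","start":{},"end":{}}}"#,
            kind.estree_name(),
            token.start,
            token.end
        ));
    }
}

impl Context for JsonContext<'_> {
    fn emit_unchanged(&mut self, index: usize) {
        let kind = self.tokens[index].kind;
        self.push(index, kind);
    }
    fn emit_jsx_identifier(&mut self, index: usize) {
        self.push(index, SketchKind::JsxIdentifier);
    }
}

/// Update mode: mutate the tokens in place, nothing is serialized.
pub struct UpdateContext<'t> {
    pub tokens: &'t mut [SketchToken],
}

impl Context for UpdateContext<'_> {
    fn emit_unchanged(&mut self, _index: usize) {}
    fn emit_jsx_identifier(&mut self, index: usize) {
        self.tokens[index].kind = SketchKind::JsxIdentifier;
    }
}

/// Shared traversal logic, generic over the mode.
pub struct Visitor<C: Context> {
    ctx: C,
    next: usize,
    len: usize,
}

impl<C: Context> Visitor<C> {
    pub fn new(ctx: C, token_count: usize) -> Self {
        Self { ctx, next: 0, len: token_count }
    }

    /// Called in source order for each JSX identifier found in the AST.
    pub fn visit_jsx_identifier(&mut self, token_index: usize) {
        while self.next < token_index {
            self.ctx.emit_unchanged(self.next);
            self.next += 1;
        }
        self.ctx.emit_jsx_identifier(token_index);
        self.next = token_index + 1;
    }

    /// Flush the remaining tokens and hand the context back.
    pub fn finish(mut self) -> C {
        while self.next < self.len {
            self.ctx.emit_unchanged(self.next);
            self.next += 1;
        }
        self.ctx
    }
}
```

Because `Visitor` is generic over `C`, each mode is monomorphized separately, so the update path can compile its no-op `emit_unchanged` away while the JSON path keeps the full serialization logic.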
#19856 added an `update_tokens` function to update tokens, ready to be sent to the JS side via raw transfer. Except for tokens whose `Kind` needs specific modification, this function left each token's `Kind` alone. Unfortunately, this means we have to handle all 169 variants of `Kind` on the JS side. Instead, convert them to `ESTreeKind`, which has only the 11 variants that ESTree tokens have. This change has a negative perf impact on the Rust side (-9%), but it improves perf on the JS side in return.
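The idea of collapsing a large lexer enum into a small ESTree-facing one can be sketched like this. The variants below are a small hypothetical subset chosen for illustration; they are not the real 169 `Kind` variants or the real 11 `ESTreeKind` variants, and `to_estree_kind` is an assumed name.

```rust
/// Hypothetical subset of a lexer's token kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LexerKind {
    Ident,
    Let,
    Const,
    True,
    False,
    Null,
    LParen,
    RParen,
    Semicolon,
    Decimal,
    Str,
}

/// Hypothetical subset of ESTree token types. `repr(u8)` gives each
/// variant a stable one-byte discriminant that a reader on the other
/// side of a raw memory transfer could switch on.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum SketchEstreeKind {
    Identifier,
    Keyword,
    Boolean,
    Null,
    Punctuator,
    Numeric,
    String,
}

/// Many-to-one mapping from lexer kinds to ESTree token types.
/// The `match` is exhaustive, so adding a lexer kind without deciding
/// its ESTree type is a compile error.
pub const fn to_estree_kind(kind: LexerKind) -> SketchEstreeKind {
    match kind {
        LexerKind::Ident => SketchEstreeKind::Identifier,
        LexerKind::Let | LexerKind::Const => SketchEstreeKind::Keyword,
        LexerKind::True | LexerKind::False => SketchEstreeKind::Boolean,
        LexerKind::Null => SketchEstreeKind::Null,
        LexerKind::LParen | LexerKind::RParen | LexerKind::Semicolon => {
            SketchEstreeKind::Punctuator
        }
        LexerKind::Decimal => SketchEstreeKind::Numeric,
        LexerKind::Str => SketchEstreeKind::String,
    }
}
```

The trade-off matches the one described above: the Rust side does extra work per token to perform the mapping, and in exchange the consumer only has to distinguish a handful of values.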
### 🚀 Features

- 733d6dc parser: Report error on `infer` outside conditional type (#19879) (camc314)
- c2a42f6 allocator: Add `Vec::into_bump_slice_mut` (#19895) (overlookmotel)
- ee4982b parser: Add `VARIANTS` const to `Kind` via `fieldless_enum!` macro (#19877) (overlookmotel)
- b3dceae data_structures: Add `fieldless_enum!` macro (#19876) (overlookmotel)
- 12b841e parser: Make all `Kind::is_*` methods `const` (#19874) (overlookmotel)
- 25c2e25 estree/tokens: Add function to update tokens in place (#19856) (overlookmotel)
- f78e6df parser: Add `mutate_tokens` Cargo feature (#19853) (overlookmotel)
- 5036bb6 parser: Report error on `for await` in static blocks (#19844) (camc314)
- 42bd431 parser: Report error for missing initializer in using decl (#19824) (camc314)
- a2f58e5 parser: Report error for `implements` clause in non-ts files (#19820) (Cameron)
- b25228a estree: Add `IS_COMPACT` const to `Formatter` trait (#19787) (overlookmotel)
- e2a1b79 estree: Expose buffer and formatter of serializers (#19773) (overlookmotel)
- 4699498 data_structures: Add `CodeBuffer::print_strs_array` (#19760) (overlookmotel)
- 233f947 estree: `oxc_estree` crate export config and formatter types (#19724) (overlookmotel)
- 5937a32 semantic: Introduce `symbol_declarations` method (#19609) (camc314)
- ea6b796 parser: Add `LexerConfig::TOKENS_METHOD_IS_STATIC` const (#19683) (overlookmotel)
- 655c38f semantic: Add "did you mean?" suggestions to undefined name errors (#19102) (copilot-swe-agent)
- 9e11dc6 parser,estree,coverage: Collect tokens in parser and convert to ESTree format (#19497) (camc314)
- c4a3677 parser: Report error for initializer in ambient context (#19187) (camc314)

### 🐛 Bug Fixes

- abc7e19 codegen: Improve parenthesised checks when printing types (#19880) (camc314)
- 017de5d parser: Update error code for type annotation in `for...in` statement (#19882) (camc314)
- 7682e5a linter/plugins: Decode escapes in identifier tokens (#19838) (overlookmotel)
- 06767ed estree/tokens: Convert `this` tokens in `TSTypeName` (#19815) (overlookmotel)
- ef798af parser: Use TS8037 for satisfies expression in JS files diagnostic (#19819) (camc314)
- 98ea5c5 parser: Use TS8016 for type assertions in JS files diagnostic (#19818) (camc314)
- 1710f56 codegen: Remove double indentation for enum inside namespace (#19775) (Dunqing)
- 9e4995c codegen: Print type annotation on `CatchParameter` (#19790) (camc314)
- 297b2bb codegen: Wrap `TSConditionalType` in parens when necessary (#19788) (camc314)
- cec7878 codegen: Print `definite` property on AccessorProperty (#19786) (camc314)
- 6f395cf codegen: Print `definite` property on PropertyDefinition (#19785) (camc314)
- b749373 codegen: Correctly parenthesise TSArrayType (#19784) (camc314)
- 876dc1b codegen: Print object property `this` param (#19783) (camc314)
- 93bb861 formatter: Trim trailing whitespace before breaking line (#19740) (leaysgur)
- ed17bbf codegen: Print `override` keyword for method and property definitions (#19753) (Dunqing)
- 6a59a76 parser: Improve error recovery for private identifiers in property names (#19710) (Boshen)
- 3b96f41 codegen: Print comments in JSX expression containers and spread attributes (#19701) (Boshen)
- f5694ce estree/tokens: Reverse field order of `regex` object in tokens (#19679) (overlookmotel)
- b2b7a55 estree/tokens: Generate tokens for files with BOM (#19535) (overlookmotel)
- 50a7514 estree: Fix tokens for JSX (#19524) (overlookmotel)
- a35063e minifier: Preserve side effects for meta property url reads (#19668) (Boshen)
- 8ad3430 semantic/jsdoc: Handle even-numbered backtick sequences in JSDoc parsing (#19664) (Boshen)

### ⚡ Performance

- 05ccf9f linter/plugins: Transfer tokens via raw transfer (#19893) (overlookmotel)
- c1bfdcf estree/tokens: Preallocate sufficient space for tokens JSON (#19851) (overlookmotel)
- 4b0611a estree/tokens: Introduce `ESTreeTokenConfig` trait (#19842) (overlookmotel)
- 81bab90 estree/tokens: Do not JSON-encode keyword, punctuator, etc tokens (#19814) (overlookmotel)
- 6260ddd estree/tokens: Do not JSON-encode `this` identifiers (#19813) (overlookmotel)
- b378f4a estree/tokens: Do not JSON-encode JSX identifiers (#19812) (overlookmotel)
- 5016d92 estree/tokens: Handle regex tokens separately (#19796) (overlookmotel)
- 780a68e estree/tokens: Use strings from AST for identifier tokens (#19744) (overlookmotel)
- dc9c2e3 estree: Use `CodeBuffer::print_strs_array` to reduce bounds checks (#19766) (overlookmotel)
- 845da35 estree: Use `CodeBuffer::print_indent` (#19727) (overlookmotel)
- ec88f6a estree/tokens: Serialize tokens while visiting AST (#19726) (overlookmotel)
- bc6507f estree/tokens: Serialize with `ESTree` not `serde` (#19725) (overlookmotel)
- ec24859 estree/tokens: Do not branch on presence of override twice (#19721) (overlookmotel)
- dac14be estree/tokens: Replace hash map with `Vec` (#19718) (overlookmotel)
- b9d2443 estree/tokens: Replace multiple hash sets into a single hash map (#19716) (overlookmotel)
- 7233548 parser: Remove branches from `finish_next_inner` (#19695) (overlookmotel)
- b5d9845 parser: Remove const generic param from `finish_next_inner` (#19684) (overlookmotel)
- 8940f66 estree/tokens: Serialize tokens to compact JSON (#19572) (overlookmotel)
- 136e39b parser/tokens: Pre-allocate capacity for tokens (#19543) (overlookmotel)
- 6a6513c linter/plugins: Use Oxc tokens in plugins (#19498) (camc314)
- b3b2d30 parser: Introduce `ParserConfig` (#19637) (overlookmotel)

### 📚 Documentation

- b2b7a64 estree/tokens: Correct comment (#19873) (overlookmotel)
- 0399311 estree/tokens: Improve comments (#19836) (overlookmotel)
- 1b392de minifier: Add `Function.prototype.toString` assumption (#19758) (sapphi-red)
- 75c9cd8 parser: Improve doc comments for `ParserConfig` and `LexerConfig` (#19682) (overlookmotel)
- 2fa936f README.md: Map npm package links to npmx.dev (#19666) (Boshen)

Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>
