feat(parser,estree,coverage): collect tokens in parser and convert to ESTree format#19497
Conversation
Merging this PR will degrade performance by 15.38%
I know this is in a rough state but I couldn't resist having a look.
What would be really useful is if you could ask Codex to add some comments to the code explaining why things are as they are.
The biggest mystery to me is why the extra AST visitation. I'm unclear if this is purely to work around the few edge case differences between ESLint and TS-ESLint (escaped identifiers etc), or whether it'd still be required if we ignored those differences.
There are other parts which have me puzzled too (e.g. comments below).
Some of this code is wildly inefficient, but we can fix that in follow-ups. We'd also need to find a way to avoid the perf penalty on the compiler pipeline (not trivial - see #16785 for one attempt, but even that had a negative perf impact). Again, that could be a follow-up PR (and we don't merge any of this stack until we have a solution).
Passing all the tests is a huge feat in itself.
Merge activity
… ESTree format (#19497)

Implement a `collect_tokens` option for the parser. When enabled, the parser stores all tokens produced during parsing in a `Vec`. Also implement serializing the tokens to ESTree format.

Currently this is not hooked up to the NAPI parser or linter. That happens in a later PR in this stack (#19498).

The token serialization implementation lives in a new `oxc_estree_tokens` crate. Unfortunately it can't go in the existing `oxc_estree` crate, because `oxc_ast` depends on `oxc_estree` and the token serialization code requires AST types - which would create a circular dependency. We may be able to move it into another crate (parser?) later on.
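As a rough illustration of what "collect the tokens, then serialize to ESTree format" involves (a hypothetical sketch - these are not oxc's actual types): each collected token records its kind and source span, and ESTree-style serialization emits `type`/`value`/`start`/`end` fields, with `value` sliced from the source text.

```rust
// Hypothetical sketch, not oxc's real implementation: tokens collected
// into a `Vec` during parsing, then serialized ESTree-style.

struct Token {
    kind: &'static str, // e.g. "Keyword", "Identifier"
    start: u32,         // byte offset of token start
    end: u32,           // byte offset of token end
}

fn to_estree_json(tokens: &[Token], source: &str) -> String {
    let entries: Vec<String> = tokens
        .iter()
        .map(|t| {
            // `value` is the token's slice of the source text.
            let value = &source[t.start as usize..t.end as usize];
            format!(
                r#"{{"type":"{}","value":"{}","start":{},"end":{}}}"#,
                t.kind, value, t.start, t.end
            )
        })
        .collect();
    format!("[{}]", entries.join(","))
}
```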
### What this PR does

Introduce a `ParserConfig` trait (another try at #16785). The aim is to remove the large performance regression in the parser that #19497 created, by making whether the parser generates tokens a compile-time option.

The `ParserConfig::tokens` method replaces the `ParseOptions::collect_tokens` property. The former can be const-folded at compile time, whereas the latter couldn't.

### 3 options

This PR also introduces 3 different config types that users can pass to the parser:

* `NoTokensParserConfig` (default)
* `TokensParserConfig`
* `RuntimeParserConfig`

The first 2 set whether tokens are collected at compile time. The last sets it at runtime. All 3 implement `ParserConfig`.

`NoTokensParserConfig` is the default, and is what's used in the compiler pipeline. It switches tokens off in the parser, making all the token-related code dead code, which the compiler eliminates. This makes the parser's ability to generate tokens zero-cost when it's not used (i.e. in the compiler pipeline).

`TokensParserConfig` is the one to use where you *always* want tokens. This is probably the config the linter will use.

`RuntimeParserConfig` is the one to use when an application decides whether to generate tokens at runtime. This config avoids compiling the parser twice, at the cost of runtime checks. This is what the NAPI parser package will use.

### Future extension

#### Supporting additional features

In future we intend to build the UTF-8 to UTF-16 offsets conversion table in the parser. This will be more performant than searching through the source text for Unicode characters in a 2nd pass later on. But this feature is only required for uses of the parser where we're interacting with the JS side (NAPI parser package, linter with JS plugins). `ParserConfig` can be extended to toggle this feature on/off at compile time or runtime, in the same way as tokens are toggled.
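The pattern described above can be sketched in a self-contained form (names mirror the PR's, but this is an illustration, not oxc's actual code): the parser is generic over the config type, so for `NoTokensParserConfig` the `tokens()` check is statically `false` and the token-collection branch is eliminated from that monomorphized copy.

```rust
// Illustrative sketch of compile-time vs runtime parser config.

trait ParserConfig {
    /// Whether the parser should collect tokens.
    fn tokens(&self) -> bool;
}

/// Compile-time "off": `tokens()` is statically `false`, so all
/// token-collection code becomes dead code the compiler eliminates.
struct NoTokensParserConfig;
impl ParserConfig for NoTokensParserConfig {
    #[inline(always)]
    fn tokens(&self) -> bool { false }
}

/// Compile-time "on".
struct TokensParserConfig;
impl ParserConfig for TokensParserConfig {
    #[inline(always)]
    fn tokens(&self) -> bool { true }
}

/// Runtime toggle: one compiled parser, a branch at each check.
struct RuntimeParserConfig { collect_tokens: bool }
impl ParserConfig for RuntimeParserConfig {
    fn tokens(&self) -> bool { self.collect_tokens }
}

/// Generic over the config, so each config type gets its own
/// monomorphized copy with the checks const-folded away.
struct Parser<C: ParserConfig> {
    config: C,
    tokens: Vec<&'static str>, // stand-in for a real token type
}

impl<C: ParserConfig> Parser<C> {
    fn new(config: C) -> Self {
        Parser { config, tokens: Vec::new() }
    }

    fn on_token(&mut self, tok: &'static str) {
        // Statically false for `NoTokensParserConfig`.
        if self.config.tokens() {
            self.tokens.push(tok);
        }
    }
}
```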
#### Options and configs

This PR introduces `ParserConfig` but leaves `ParseOptions` as it is. So we now have 2 sets of options, passed to `Parser` with `with_options(...)` and `with_config(...)`. This is confusing.

We could merge the 2 by making `ParseOptions` implement `ParserConfig`, so you'd define all options with one `with_options` call. This would have the side effect of making all other parser options (e.g. `preserve_parens`) settable at either runtime or compile time, depending on the use case.

Users consuming `oxc_parser` as a library, with specific needs, could also configure `Parser` to suit - e.g. create a parser which only handles plain JS code, with all the code paths for JSX and TS shaken out as dead code. This would likely improve parsing speed significantly for those use cases.

### Implementation details

Why a trait instead of a Cargo feature? IMO a trait is preferable for the following reasons:

1. We have 3 different use cases to support (the 3 provided configs). 3 different Cargo features would be unwieldy.
2. The situation would become far worse once we introduce more features, e.g. UTF-8 -> UTF-16 conversion.
3. No problems around feature unification. We found Cargo features caused headaches when we used them in the linter for toggling JS plugins support on/off.
4. No Clippy errors which only appear when the feature is/isn't enabled, requiring complex `#[cfg_attr(feature = "whatever", expect(clippy::unused_async))]` etc.
The introduction of a trait does not seem to significantly affect compile times:

```
# Before this PR
cargo build -p oxc_parser                             15.33s user 3.27s system 254% cpu  7.316 total
cargo build -p oxc_parser --release                   17.36s user 2.26s system 231% cpu  8.477 total
cargo build -p oxc_parser --example parser            18.43s user 3.75s system 271% cpu  8.156 total
cargo build -p oxc_parser --example parser --release  32.52s user 2.59s system 180% cpu 19.454 total

# After this PR
cargo build -p oxc_parser                             15.00s user 3.24s system 272% cpu  6.692 total
cargo build -p oxc_parser --release                   16.71s user 2.12s system 287% cpu  6.539 total
cargo build -p oxc_parser --example parser            18.50s user 3.91s system 285% cpu  7.845 total
cargo build -p oxc_parser --example parser --release  33.48s user 2.63s system 169% cpu 21.263 total
```

Measured on a Mac Mini M4 Pro, with `cargo clean` run before each build. The differences appear to be mostly within the noise threshold.
Builds on #19497. Use the tokens generated by Oxc's parser in linter plugins, instead of running the TS parser to tokenize the source. This is a sizeable perf gain, and it also allows us to remove the TS parser from `oxlint`'s bundle (#19531). Additionally, our parser is more accurate than TypeScript's - we handle HTML comments and a space before the slash in closing JSX elements (`< /div>`) - so this fixes a few conformance tests.
