perf(parser): introduce ParserConfig #19637

Merged
graphite-app[bot] merged 1 commit into main from om/02-23-perf_parser_introduce_parserconfig_ on Feb 24, 2026
overlookmotel (Member) commented Feb 23, 2026

What this PR does

Introduce ParserConfig trait (another try at #16785).

The aim is to remove the large performance regression in the parser that #19497 created, by making whether the parser generates tokens a compile-time option.

The ParserConfig::tokens method replaces the ParseOptions::collect_tokens property. The former can be const-folded at compile time, whereas the latter couldn't.
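To illustrate why a trait method can be const-folded where a struct field cannot, here is a minimal sketch. It is not the code from this PR: the trait body, the `NoTokens` type, and both toy `lex_*` functions are assumptions made up for illustration.

```rust
/// Runtime option, in the style of the old `ParseOptions::collect_tokens`.
/// (Simplified stand-in, not the real struct.)
pub struct ParseOptions {
    pub collect_tokens: bool,
}

/// Hypothetical shape of the trait; the real definition may differ.
pub trait ParserConfig {
    fn tokens(&self) -> bool;
}

/// Hypothetical config whose answer is fixed at compile time.
pub struct NoTokens;

impl ParserConfig for NoTokens {
    #[inline(always)]
    fn tokens(&self) -> bool {
        false
    }
}

/// The flag is only known at runtime, so the check stays in the loop.
pub fn lex_with_options<'a>(source: &'a str, options: &ParseOptions) -> Vec<&'a str> {
    let mut tokens = Vec::new();
    for word in source.split_whitespace() {
        if options.collect_tokens {
            tokens.push(word);
        }
    }
    tokens
}

/// Monomorphized per config type. For `NoTokens`, `config.tokens()` is the
/// constant `false`, so the optimizer is free to delete the branch.
pub fn lex_with_config<'a, C: ParserConfig>(source: &'a str, config: &C) -> Vec<&'a str> {
    let mut tokens = Vec::new();
    for word in source.split_whitespace() {
        if config.tokens() {
            tokens.push(word);
        }
    }
    tokens
}
```

Both functions behave identically; the difference is only in what the compiler knows when it generates code for each call site.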

3 options

This PR also introduces 3 different config types that users can pass to the parser:

  • NoTokensParserConfig (default)
  • TokensParserConfig
  • RuntimeParserConfig

The first 2 set whether tokens are collected or not at compile time. The last sets it at runtime.

All 3 implement ParserConfig.

NoTokensParserConfig is the default, and is what's used in the compiler pipeline. It switches tokens off in the parser, which turns all the tokens-related code into dead code that the compiler eliminates. This makes the parser's ability to generate tokens zero cost when it's not used (as in the compiler pipeline).

TokensParserConfig is the one to use where you always want tokens. This is probably the config that the linter will use.

RuntimeParserConfig is the one to use when an application decides whether to generate tokens or not at runtime. This config avoids compiling the parser twice, at the cost of runtime checks. This is what the NAPI parser package will use.
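The three configs could look roughly like the sketch below. The type names come from this PR, but the trait body, the `tokens` field, the `new` constructor, and the `token_count` helper are assumptions for illustration only.

```rust
/// Hypothetical shape of the trait.
pub trait ParserConfig {
    fn tokens(&self) -> bool;
}

/// Tokens off, decided at compile time (zero-sized type).
#[derive(Clone, Copy, Default)]
pub struct NoTokensParserConfig;

/// Tokens on, decided at compile time (zero-sized type).
#[derive(Clone, Copy, Default)]
pub struct TokensParserConfig;

/// Tokens on or off, decided at runtime via a stored flag.
#[derive(Clone, Copy)]
pub struct RuntimeParserConfig {
    tokens: bool,
}

impl RuntimeParserConfig {
    /// Assumed constructor; the real API may differ.
    pub fn new(tokens: bool) -> Self {
        Self { tokens }
    }
}

impl ParserConfig for NoTokensParserConfig {
    #[inline(always)]
    fn tokens(&self) -> bool {
        false
    }
}

impl ParserConfig for TokensParserConfig {
    #[inline(always)]
    fn tokens(&self) -> bool {
        true
    }
}

impl ParserConfig for RuntimeParserConfig {
    #[inline(always)]
    fn tokens(&self) -> bool {
        self.tokens
    }
}

/// Toy consumer: counts whitespace-separated "tokens" only when enabled.
pub fn token_count<C: ParserConfig>(source: &str, config: &C) -> Option<usize> {
    if config.tokens() {
        Some(source.split_whitespace().count())
    } else {
        None
    }
}
```

A generic function such as `token_count` is compiled once per config type it is used with, which is why a single `RuntimeParserConfig` instantiation can stand in for two compile-time ones.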

Future extension

Supporting additional features

In future we intend to build the UTF-8 to UTF-16 offsets conversion table in the parser. This will be more performant than searching through the source text for Unicode characters in a 2nd pass later on. But this feature is only required for uses of the parser where we're interacting with the JS side (NAPI parser package, linter with JS plugins).

ParserConfig can be extended to toggle this feature on/off at compile time or runtime, in the same way as you toggle on/off tokens.
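As a rough sketch of how such an extension might look: the `utf16_offsets` method, both config types, and the table layout below are all hypothetical, since this PR does not implement the feature.

```rust
/// Hypothetical trait with a second toggle added next to `tokens`.
pub trait ParserConfig {
    fn tokens(&self) -> bool;

    /// Hypothetical: whether to build the UTF-8 to UTF-16 offset table.
    /// Defaults to off so existing configs need no changes.
    fn utf16_offsets(&self) -> bool {
        false
    }
}

/// Hypothetical config for JS-facing consumers.
pub struct JsInteropConfig;

impl ParserConfig for JsInteropConfig {
    fn tokens(&self) -> bool {
        true
    }
    fn utf16_offsets(&self) -> bool {
        true
    }
}

/// Hypothetical config for the compiler pipeline: everything off.
pub struct CompilerConfig;

impl ParserConfig for CompilerConfig {
    fn tokens(&self) -> bool {
        false
    }
}

/// Records, after each non-ASCII character, the running
/// (UTF-8 byte offset, UTF-16 code unit offset) pair.
/// Returns `None` when the feature is switched off.
pub fn build_offset_table<C: ParserConfig>(source: &str, config: &C) -> Option<Vec<(u32, u32)>> {
    if !config.utf16_offsets() {
        return None;
    }
    let mut table = Vec::new();
    let (mut utf8, mut utf16) = (0u32, 0u32);
    for ch in source.chars() {
        utf8 += ch.len_utf8() as u32;
        utf16 += ch.len_utf16() as u32;
        if !ch.is_ascii() {
            table.push((utf8, utf16));
        }
    }
    Some(table)
}
```

In a real parser the table would be built during lexing rather than in a separate loop; the separate function here only keeps the example short.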

Options and configs

This PR introduces ParserConfig but leaves ParseOptions as it is. So we now have 2 sets of options, passed to Parser with with_options(...) and with_config(...). This is confusing.

We could merge the 2 by making ParseOptions implement ParserConfig, so then you'd define all options with one with_options call.

This would have the side effect of making all other parser options (e.g. preserve_parens) able to be set at either runtime or compile time, depending on the use case.
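A sketch of what that merge could look like follows. Everything here is hypothetical: the simplified `ParseOptions` fields, the `FixedConfig` type and its chosen values, and the `Parser` wrapper are stand-ins, not the real oxc_parser API.

```rust
/// Hypothetical trait covering more than one option.
pub trait ParserConfig {
    fn tokens(&self) -> bool;
    fn preserve_parens(&self) -> bool;
}

/// Simplified stand-in for `ParseOptions`: every option is a runtime value.
#[derive(Clone, Copy, Debug)]
pub struct ParseOptions {
    pub tokens: bool,
    pub preserve_parens: bool,
}

impl ParserConfig for ParseOptions {
    fn tokens(&self) -> bool {
        self.tokens
    }
    fn preserve_parens(&self) -> bool {
        self.preserve_parens
    }
}

/// Hypothetical config with every option fixed at compile time.
/// The values are arbitrary, chosen only for the example.
#[derive(Clone, Copy, Debug, Default)]
pub struct FixedConfig;

impl ParserConfig for FixedConfig {
    #[inline(always)]
    fn tokens(&self) -> bool {
        false
    }
    #[inline(always)]
    fn preserve_parens(&self) -> bool {
        true
    }
}

/// Toy parser that takes all of its options through one call.
pub struct Parser<C: ParserConfig> {
    config: C,
}

impl<C: ParserConfig> Parser<C> {
    pub fn with_options(config: C) -> Self {
        Self { config }
    }

    /// Reports the effective (tokens, preserve_parens) settings.
    pub fn settings(&self) -> (bool, bool) {
        (self.config.tokens(), self.config.preserve_parens())
    }
}
```

The caller picks runtime or compile-time behaviour purely by the type passed to `with_options`.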

Users consuming oxc_parser as a library could also tailor Parser to their specific needs, e.g. create a parser which only handles plain JS code, with all the code paths for JSX and TS eliminated as dead code. This would likely improve parsing speed significantly for these use cases.
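A toy illustration of the idea, assuming hypothetical `jsx` and `typescript` methods on the trait (neither exists in this PR) and a made-up `classify` function standing in for the parser's dispatch:

```rust
/// Hypothetical language toggles on the config trait.
pub trait ParserConfig {
    fn jsx(&self) -> bool;
    fn typescript(&self) -> bool;
}

/// Hypothetical config for plain JS: both toggles are constant `false`.
pub struct PlainJsConfig;

impl ParserConfig for PlainJsConfig {
    #[inline(always)]
    fn jsx(&self) -> bool {
        false
    }
    #[inline(always)]
    fn typescript(&self) -> bool {
        false
    }
}

/// Hypothetical config with everything enabled.
pub struct FullConfig;

impl ParserConfig for FullConfig {
    #[inline(always)]
    fn jsx(&self) -> bool {
        true
    }
    #[inline(always)]
    fn typescript(&self) -> bool {
        true
    }
}

#[derive(Debug, PartialEq)]
pub enum Construct {
    JsxElement,
    TypeAnnotation,
    Other,
}

/// Toy dispatcher. For `PlainJsConfig` both guards are constant `false`,
/// so the JSX and TS branches can be removed from the compiled code.
pub fn classify<C: ParserConfig>(snippet: &str, config: &C) -> Construct {
    if config.jsx() && snippet.starts_with('<') {
        return Construct::JsxElement;
    }
    if config.typescript() && snippet.starts_with(':') {
        return Construct::TypeAnnotation;
    }
    Construct::Other
}
```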

Implementation details

Why a trait instead of a cargo feature?

IMO a trait is preferable for the following reasons:

  1. We have 3 different use cases we need to support (the 3 provided configs). 3 different Cargo features would be unwieldy.
  2. This situation would become far worse once we introduce more features e.g. UTF-8 -> UTF-16 conversion.
  3. No problems around feature unification. We found Cargo features caused headaches when we used them in linter for toggling on/off JS plugins support.
  4. No clippy errors which only appear when the feature is/isn't enabled, requiring complex #[cfg_attr(feature = "whatever", expect(clippy::unused_async))] etc.

The introduction of a trait does not seem to significantly affect compile time:

# Before this PR
cargo build -p oxc_parser                             15.33s user 3.27s system 254% cpu 7.316 total
cargo build -p oxc_parser --release                   17.36s user 2.26s system 231% cpu 8.477 total
cargo build -p oxc_parser --example parser            18.43s user 3.75s system 271% cpu 8.156 total
cargo build -p oxc_parser --example parser --release  32.52s user 2.59s system 180% cpu 19.454 total

# After this PR
cargo build -p oxc_parser                             15.00s user 3.24s system 272% cpu 6.692 total
cargo build -p oxc_parser --release                   16.71s user 2.12s system 287% cpu 6.539 total
cargo build -p oxc_parser --example parser            18.50s user 3.91s system 285% cpu 7.845 total
cargo build -p oxc_parser --example parser --release  33.48s user 2.63s system 169% cpu 21.263 total

Measured on Mac Mini M4 Pro, cargo clean run before each. The difference appears to be mostly within the noise threshold.

@github-actions github-actions bot added A-parser Area - Parser A-formatter Area - Formatter labels Feb 23, 2026
@github-actions github-actions bot added the C-performance Category - Solution not expected to change functional behavior, only performance label Feb 23, 2026
codspeed-hq bot commented Feb 23, 2026

Merging this PR will improve performance by 18.18%

⚡ 8 improved benchmarks
✅ 39 untouched benchmarks
⏩ 3 skipped benchmarks [1]

Performance Changes

| Mode       | Benchmark                          | BASE       | HEAD     | Efficiency |
|------------|------------------------------------|------------|----------|------------|
| Simulation | parser[RadixUIAdoptionSection.jsx] | 87.9 µs    | 82.9 µs  | +6.06%     |
| Simulation | parser[binder.ts]                  | 3.5 ms     | 3.3 ms   | +8.43%     |
| Simulation | parser[cal.com.tsx]                | 28.1 ms    | 25.9 ms  | +8.79%     |
| Simulation | parser[react.development.js]       | 1.4 ms     | 1.3 ms   | +8.83%     |
| Simulation | lexer[cal.com.tsx]                 | 6.5 ms     | 5.5 ms   | +18.18%    |
| Simulation | lexer[RadixUIAdoptionSection.jsx]  | 23.9 µs    | 21.2 µs  | +12.6%     |
| Simulation | lexer[react.development.js]        | 410.2 µs   | 358.4 µs | +14.44%    |
| Simulation | lexer[binder.ts]                   | 1,013.2 µs | 885.2 µs | +14.46%    |

Comparing om/02-23-perf_parser_introduce_parserconfig_ (79be32d) with c/02-17-feat_parser_estree_coverage_add_parser_tokens_and_shared_estree_conversion (7eb388c)

Footnotes

  1. 3 benchmarks were skipped, so the baseline results were used instead.

@overlookmotel overlookmotel marked this pull request as ready for review February 23, 2026 16:20
Copilot AI review requested due to automatic review settings February 23, 2026 16:20
@overlookmotel overlookmotel requested review from camc314 and removed request for Dunqing February 23, 2026 16:22
Copilot AI left a comment

Pull request overview

This PR introduces a ParserConfig trait to control whether the parser collects tokens at compile-time or runtime, addressing a performance regression from #19497. The change enables zero-cost abstractions for token collection by making it a compile-time decision.

Changes:

  • Introduced ParserConfig trait with three implementations: NoTokensParserConfig (default), TokensParserConfig, and RuntimeParserConfig
  • Removed collect_tokens field from ParseOptions and replaced it with the config system
  • Updated all parser and lexer implementations to be generic over the config type
  • Migrated byte handler dispatch from a static array to per-config static arrays to enable better optimization
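One way to get a separate handler table per config in Rust is an associated const on the generic lexer, as sketched below. This is only an illustration of the technique: the PR's actual `byte_handler_tables` module may be built differently, and the `Lexer` fields, handler functions, and config types here are made up.

```rust
/// Hypothetical lexer-side config trait.
pub trait LexerConfig {
    fn tokens(&self) -> bool;
}

pub struct NoTokens;
pub struct Tokens;

impl LexerConfig for NoTokens {
    #[inline(always)]
    fn tokens(&self) -> bool {
        false
    }
}

impl LexerConfig for Tokens {
    #[inline(always)]
    fn tokens(&self) -> bool {
        true
    }
}

/// Toy lexer that only tallies bytes and optionally records them.
pub struct Lexer<C: LexerConfig> {
    config: C,
    pub ident_bytes: usize,
    pub other_bytes: usize,
    pub tokens: Vec<u8>,
}

/// Handlers are function pointers specialized for one config type.
type ByteHandler<C> = fn(&mut Lexer<C>, u8);

fn handle_ident<C: LexerConfig>(lexer: &mut Lexer<C>, byte: u8) {
    lexer.ident_bytes += 1;
    // Constant-folded inside each monomorphized copy of this handler.
    if lexer.config.tokens() {
        lexer.tokens.push(byte);
    }
}

fn handle_other<C: LexerConfig>(lexer: &mut Lexer<C>, _byte: u8) {
    lexer.other_bytes += 1;
}

impl<C: LexerConfig> Lexer<C> {
    /// One table per config type, built at compile time.
    const BYTE_HANDLERS: [ByteHandler<C>; 256] = {
        let mut table = [handle_other::<C> as ByteHandler<C>; 256];
        let mut b = b'a';
        while b <= b'z' {
            table[b as usize] = handle_ident::<C> as ByteHandler<C>;
            b += 1;
        }
        table
    };

    pub fn new(config: C) -> Self {
        Self { config, ident_bytes: 0, other_bytes: 0, tokens: Vec::new() }
    }

    /// Dispatches every source byte through this config's table.
    pub fn run(&mut self, source: &str) {
        let table = &Self::BYTE_HANDLERS;
        for &byte in source.as_bytes() {
            table[byte as usize](self, byte);
        }
    }
}
```

Because each table entry points at a handler already specialized for `C`, the token check inside a handler does not need to be evaluated at runtime for the compile-time configs.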

Reviewed changes

Copilot reviewed 34 out of 35 changed files in this pull request and generated 1 comment.

| File | Description |
|------|-------------|
| crates/oxc_parser/src/config.rs | New module defining ParserConfig and LexerConfig traits with three concrete implementations |
| crates/oxc_parser/src/lib.rs | Updated Parser struct to be generic over ParserConfig, added with_config method, removed collect_tokens from ParseOptions |
| crates/oxc_parser/src/lexer/mod.rs | Updated Lexer to be generic over LexerConfig, changed config field from bool to generic type |
| crates/oxc_parser/src/lexer/byte_handlers.rs | Converted static BYTE_HANDLERS array to per-config static arrays in byte_handler_tables module |
| crates/oxc_parser/src/js/*.rs | Added generic Config parameter to all ParserImpl implementations in JS parsing modules |
| crates/oxc_parser/src/ts/*.rs | Added generic Config parameter to all ParserImpl implementations in TS parsing modules |
| crates/oxc_parser/src/jsx/mod.rs | Added generic Config parameter to ParserImpl implementation for JSX |
| crates/oxc_parser/src/lexer/*.rs | Added generic Config parameter to all Lexer implementations in lexer submodules |
| tasks/coverage/src/tools.rs | Updated to use RuntimeParserConfig for token collection in coverage tests |
| tasks/benchmark/benches/lexer.rs | Updated to use NoTokensLexerConfig for benchmarks |
| napi/playground/src/lib.rs | Removed collect_tokens field from ParseOptions struct initialization |
| crates/oxc_formatter/src/service/mod.rs | Removed collect_tokens field from ParseOptions struct initialization |

camc314 commented Feb 23, 2026

@overlookmotel I think we should move this below #19497 so we can monitor the perf change more clearly?

overlookmotel (Member Author) replied:

> @overlookmotel I think we should move this below #19497 so we can monitor the perf change more clearly?

Yes, I agree that'd be preferable. I tried, but it was a bit of a nightmare because the 2 PRs touch all the same code.

I've checked the numbers on CodSpeed and they're exactly back to where they were before the preceding PR.

graphite-app bot pushed a commit that referenced this pull request Feb 24, 2026
…g` (#19682)

Improve documentation for the config types added in #19637.
This was referenced Feb 24, 2026
Labels

  • A-formatter: Area - Formatter
  • A-linter-plugins: Area - Linter JS plugins
  • A-parser: Area - Parser
  • C-performance: Category - Solution not expected to change functional behavior, only performance
