refactor(estree/tokens): replace stored token when re-lexing #19698

Merged
graphite-app[bot] merged 1 commit into main from om/02-21-refactor_estree_tokens_replace_stored_token_when_re-lexing
Feb 25, 2026

@overlookmotel (Member) commented Feb 25, 2026

Previously, the ESTree tokens serializer had a special case to skip a `<<` token which has the same start as the preceding `<` token. Instead, solve this problem at the source: modify the parser so that this duplicate token never ends up in the `Vec<Token>` in the first place.

This is important because we want to move to sending tokens to JS via raw transfer. For that, the `Vec<Token>` needs to contain exactly the right number of tokens to start with, so we can do lazy deserialization and just get a token at a specific index. This breaks down if there are extra tokens that need to be skipped.
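As a rough illustration of why extra tokens defeat index-based access, here is a minimal sketch. The `Token` struct and both helper functions are hypothetical and are not taken from the oxc codebase; the duplicate is modeled as a token whose `start` equals the previous token's `start`, following the description above.

```rust
/// Hypothetical, simplified token: just a source span.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Token {
    start: u32,
    end: u32,
}

/// Access when the vector may contain a duplicate: a token sharing its
/// `start` with the previous token is skipped, so finding the n-th real
/// token requires scanning from the beginning.
fn nth_skipping_duplicates(tokens: &[Token], n: usize) -> Option<Token> {
    let mut prev_start: Option<u32> = None;
    let mut seen = 0;
    for &token in tokens {
        let is_duplicate = prev_start == Some(token.start);
        prev_start = Some(token.start);
        if is_duplicate {
            continue;
        }
        if seen == n {
            return Some(token);
        }
        seen += 1;
    }
    None
}

/// Access when the vector holds exactly one entry per token:
/// a plain index lookup, with no scan.
fn nth_direct(tokens: &[Token], n: usize) -> Option<Token> {
    tokens.get(n).copied()
}
```

With a duplicate present, `nth_direct` can return the stale entry for a given index, while `nth_skipping_duplicates` has to walk the vector. Once the vector is clean, the direct lookup is enough, which is what lazy deserialization by index relies on.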

The logic around this in the parser is quite labyrinthine, so add lengthy comments explaining it.

This change also simplifies the main loop in the ESTree serializer, so it has the side effect of +1% on the `estree_tokens` benchmark. Conversely, it doesn't affect the performance of the `parser_tokens` benchmark, because the parser path which got more complex is not commonly taken.

@github-actions bot added the labels A-parser (Area - Parser) and C-cleanup (Category - technical debt or refactoring. Solution not expected to change behavior) Feb 25, 2026

@graphite-app graphite-app bot changed the base branch from om/02-24-perf_parser_remove_branches_from_finish_next_inner_ to graphite-base/19698 February 25, 2026 00:08
@codspeed-hq bot commented Feb 25, 2026

Merging this PR will not alter performance

✅ 52 untouched benchmarks
⏩ 3 skipped benchmarks [1]


Comparing om/02-21-refactor_estree_tokens_replace_stored_token_when_re-lexing (79a366f) with main (14b0fb7)

Footnotes

  1. 3 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

@graphite-app graphite-app bot force-pushed the graphite-base/19698 branch from 1eac4b3 to 7233548 Compare February 25, 2026 00:15
@overlookmotel overlookmotel force-pushed the om/02-21-refactor_estree_tokens_replace_stored_token_when_re-lexing branch from a05a0b1 to 79a366f Compare February 25, 2026 16:39
@overlookmotel overlookmotel marked this pull request as ready for review February 25, 2026 16:46
Copilot AI review requested due to automatic review settings February 25, 2026 16:46
Copilot AI (Contributor) left a comment

Pull request overview

This PR refactors the TypeScript angle bracket re-lexing mechanism to fix token collection issues by preventing duplicate tokens rather than skipping them during serialization. Previously, when the parser speculatively tried parsing << as type arguments and failed, it would leave a duplicate < token in the stream that the ESTree serializer had to skip. Now, the parser pops the compound token before re-lexing and restores it if the speculative parse fails.

Changes:

  • Refactored parser to pop compound tokens (e.g., <<) during TypeScript angle bracket re-lexing and restore them via rewrite_last_collected_token when speculative parsing fails
  • Removed special-case logic from ESTree tokens serializer that was skipping duplicate tokens
  • Added comprehensive comments explaining the labyrinthine re-lexing logic
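The pop-and-restore idea in the list above can be sketched roughly as follows. This is a toy model, not the parser's actual code: `Kind`, `Token`, `Collector`, and `re_lex_as_left_angle` are hypothetical, and only the name `rewrite_last_collected_token` comes from the PR (its signature here is an assumption). Rewinding the lexer and discarding tokens collected during the speculative parse are not modeled.

```rust
/// Hypothetical token kinds, just enough for the example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Ident,
    LAngle,    // `<`
    ShiftLeft, // `<<`
}

/// Hypothetical, simplified token.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Token {
    kind: Kind,
    start: u32,
    end: u32,
}

/// Toy stand-in for the lexer's token collection.
#[derive(Default)]
struct Collector {
    tokens: Vec<Token>,
}

impl Collector {
    fn push(&mut self, token: Token) {
        self.tokens.push(token);
    }

    /// Speculatively re-lex the last collected compound token as a single `<`.
    /// The compound token is popped first, so the vector never holds both.
    /// Returns the popped token so the caller can restore it on failure.
    fn re_lex_as_left_angle(&mut self) -> Option<Token> {
        let compound = self.tokens.pop()?;
        self.tokens.push(Token {
            kind: Kind::LAngle,
            start: compound.start,
            end: compound.start + 1,
        });
        Some(compound)
    }

    /// Assumed shape of the restore step: overwrite the last collected
    /// token with the original compound token when speculation fails.
    fn rewrite_last_collected_token(&mut self, original: Token) {
        if let Some(last) = self.tokens.last_mut() {
            *last = original;
        }
    }
}
```

In this model, whichever way the speculative parse ends, the collection holds exactly one token for that source position: `<` if type arguments parsed, or the restored `<<` if they did not.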

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| oxfmtrc.jsonc | Added new tokens test fixtures directory to formatting exclusion list |
| crates/oxc_parser/src/lexer/typescript.rs | Added detailed comments explaining re-lexing behavior and token popping logic for left angle brackets |
| crates/oxc_parser/src/lexer/mod.rs | Added rewrite_last_collected_token method to restore popped tokens after failed speculative parses |
| crates/oxc_parser/src/js/expression.rs | Restructured conditional logic to call rewrite_last_collected_token when type argument parsing fails |
| crates/oxc_estree_tokens/src/lib.rs | Removed special-case logic for skipping duplicate << tokens and related ts_type_parameter_starts tracking |
| apps/oxlint/test/fixtures/tokens/files/ts_angle_relex.ts | Added test cases for TypeScript angle bracket disambiguation scenarios |
| apps/oxlint/test/fixtures/tokens/output.snap.md | Updated test snapshots with expected token output for new test cases |

@graphite-app bot (Contributor) commented Feb 25, 2026

Merge activity

Labels

  • A-cli: Area - CLI
  • A-linter: Area - Linter
  • A-linter-plugins: Area - Linter JS plugins
  • A-parser: Area - Parser
  • C-cleanup: Category - technical debt or refactoring. Solution not expected to change behavior
