refactor(ast)!: simplify RegExpPattern #10834

Merged
graphite-app[bot] merged 1 commit into main from 05-05-refactor_ast_simplify_regexppattern_
May 7, 2025
@overlookmotel overlookmotel commented May 6, 2025

`RegExpPattern` was an enum with `Raw`, `Invalid`, and `Pattern` variants, i.e. only the raw string *or* the `Pattern` could be stored, but not both.

This required complications in various places, because if the regexp has been parsed, in order to examine or print it, you have to either convert the `Pattern` back to a `String`, or rely on the `Span` and slice the source text.

The `Invalid` variant is not very useful. It's only created if the `parse_regular_expression` option is enabled, in which case an error is reported from the parser anyway.

Instead, always store the pattern string as an `Atom`, and store the parsed pattern as `Option<Box<Pattern>>`.

This simplifies code in various places, and makes printing `RegExpLiteral`s faster in `oxc_codegen` when `parse_regular_expression` is enabled, because it can just print the pattern string, rather than converting a `Pattern` back to a `String`.
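
To make the shape of the change concrete, here is a minimal sketch of the before and after layouts. It is not the real oxc code: `String` stands in for `Atom`, `Pattern` is reduced to a placeholder struct, and the field names `text` and `pattern`, the `OldRegExpPattern`/`NewRegExpPattern` type names, and the `parse`, `print` and `to_source` helpers are all assumptions made for illustration.

```rust
// Simplified stand-ins, NOT the real oxc types.
// `Pattern` here is a placeholder for the parsed regexp AST.
#[derive(Debug, Clone, PartialEq)]
pub struct Pattern {
    pub alternatives: Vec<String>, // hypothetical field
}

impl Pattern {
    // Hypothetical helper: rebuilds source text from the parsed form.
    // This allocates a new `String` every time it is called.
    pub fn to_source(&self) -> String {
        self.alternatives.join("|")
    }
}

// Before: an enum, so either the raw text or the parsed pattern is stored.
#[derive(Debug)]
pub enum OldRegExpPattern {
    Raw(String),
    Invalid(String),
    Pattern(Box<Pattern>),
}

impl OldRegExpPattern {
    // Printing has to branch, and the parsed case must be converted back to text.
    pub fn print(&self) -> String {
        match self {
            Self::Raw(s) | Self::Invalid(s) => s.clone(),
            Self::Pattern(p) => p.to_source(),
        }
    }
}

// After: the text is always present, the parsed pattern is optional.
#[derive(Debug)]
pub struct NewRegExpPattern {
    pub text: String,                  // `Atom` in the PR description
    pub pattern: Option<Box<Pattern>>, // `None` if not parsed or parsing failed
}

impl NewRegExpPattern {
    // Printing is a plain borrow of the stored text, with no conversion.
    pub fn print(&self) -> &str {
        &self.text
    }
}

// Hypothetical toy parser, only here so the sketch can build a `Pattern`.
pub fn parse(text: &str) -> Option<Box<Pattern>> {
    if text.is_empty() {
        return None;
    }
    let alternatives = text.split('|').map(str::to_owned).collect();
    Some(Box::new(Pattern { alternatives }))
}
```

In this sketch, code that only needs the source text reads `text` directly, while code that wants the parsed form checks `pattern` and handles `None`, which also covers the case the old `Invalid` variant represented.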

@github-actions github-actions bot added A-linter Area - Linter A-parser Area - Parser A-ast Area - AST A-transformer Area - Transformer / Transpiler A-codegen Area - Code Generation C-cleanup Category - technical debt or refactoring. Solution not expected to change behavior labels May 6, 2025
codspeed-hq bot commented May 6, 2025

CodSpeed Instrumentation Performance Report

Merging #10834 will not alter performance

Comparing 05-05-refactor_ast_simplify_regexppattern_ (ad4fbf4) with main (4c62348)

Summary

✅ 36 untouched benchmarks

@overlookmotel overlookmotel marked this pull request as ready for review May 6, 2025 16:02
@overlookmotel overlookmotel requested a review from Dunqing as a code owner May 6, 2025 16:02
@overlookmotel overlookmotel marked this pull request as draft May 6, 2025 16:04
@overlookmotel overlookmotel force-pushed the 05-05-refactor_ast_simplify_regexppattern_ branch 3 times, most recently from edd6453 to 11f56a6 Compare May 7, 2025 09:49
@overlookmotel overlookmotel marked this pull request as ready for review May 7, 2025 09:49
@overlookmotel overlookmotel requested a review from camc314 as a code owner May 7, 2025 09:49
@overlookmotel
Member Author

This PR produces a 1% perf regression on 4 out of 5 transformer benchmarks. I've tried to fix that, but with no success. The transformer code barely changes in this PR, so it's a mystery where the regression comes from. I suspect something incidental, such as a change in the compiler's inlining decisions, which some other cosmetic change could just as easily shift in the positive direction.

The core team discussed this at today's meeting and agreed to accept the 1% regression, as the simplicity benefits of this change outweigh the small perf cost.

@Boshen Boshen added the 0-merge Merge with Graphite Merge Queue label May 7, 2025
Boshen commented May 7, 2025

Merge activity

@graphite-app graphite-app bot force-pushed the 05-05-refactor_ast_simplify_regexppattern_ branch from 11f56a6 to ad4fbf4 Compare May 7, 2025 10:05
@graphite-app graphite-app bot merged commit ad4fbf4 into main May 7, 2025
26 checks passed
@graphite-app graphite-app bot deleted the 05-05-refactor_ast_simplify_regexppattern_ branch May 7, 2025 10:12
@graphite-app graphite-app bot removed the 0-merge Merge with Graphite Merge Queue label May 7, 2025
graphite-app bot pushed a commit that referenced this pull request May 7, 2025
…0855)

`Pattern` and other types in `oxc_regular_expression` are not part of the ESTree AST. Don't implement `ESTree`, and don't generate raw transfer deserializers for these types. None of that code is actually used.

#10834 moved `Pattern` to an optional field of `RegExpPattern` that is skipped in ESTree serialization, which makes it possible to remove it from the ESTree AST entirely.

This reduces the scope of ESTree serialization.
This was referenced May 9, 2025