refactor(ast)!: simplify RegExpPattern#10834
Conversation
How to use the Graphite Merge QueueAdd either label to this PR to merge it via the merge queue:
You must have a Graphite account in order to use the merge queue. Sign up using this link. An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue. This stack of pull requests is managed by Graphite. Learn more about stacking. |
CodSpeed Instrumentation Performance ReportMerging #10834 will not alter performanceComparing Summary
|
edd6453 to
11f56a6
Compare
|
This PR produces a -1% perf regression on 4 out of 5 transformer benchmarks. I've tried to fix that, but with no success. The code in transformer barely changes in this PR, so it's a mystery where the regression is coming from - I suspect something random like it changing the compiler's decisions about inlining, which some other cosmetic change could also affect positively. Core team discussed at meet today and agreed to accept the -1%, as the benefits of this change in terms of simplicity outweigh the small perf cost. |
Merge activity
|
`RegExpPattern` was an enum with `Raw`, `Invalid`, and `Pattern` variants. i.e. only the raw string *or* the `Pattern` could be stored, but not both. This required complications in various places, because if the regexp has been parsed, in order to examine or print it, you have to either convert the `Pattern` back to a `String`, or rely on the `Span` and slice the source text. The `Invalid` variant is not very useful. It's only created if `parse_regular_expression` option is enabled, in which case an error is reported from the parser anyway. Instead, always store the pattern string as an `Atom`, and store the parsed pattern as `Option<Box<Pattern>>`. This simplifies code in various places, and makes printing `RegExpLiteral`s faster in `oxc_codegen` when `parse_regular_expression` is enabled, because it can just print the pattern string, rather than converting a `Pattern` back to a `String`.
11f56a6 to
ad4fbf4
Compare
…0855) `Pattern` and other types in `oxc_regular_expression` are not part of ESTree AST. Don't implement `ESTree`, and don't generate raw transfer deserializers for these types. None of that code is actually used. #10834 moved `Pattern` to an optional field of `RegExpPattern`, which is skipped in ESTree serialization, which enables now removing it from ESTree AST entirely. This reduces the scope of ESTree serialization.

RegExpPatternwas an enum withRaw,Invalid, andPatternvariants. i.e. only the raw string or thePatterncould be stored, but not both.This required complications in various places, because if the regexp has been parsed, in order to examine or print it, you have to either convert the
Patternback to aString, or rely on theSpanand slice the source text.The
Invalidvariant is not very useful. It's only created ifparse_regular_expressionoption is enabled, in which case an error is reported from the parser anyway.Instead, always store the pattern string as an
Atom, and store the parsed pattern asOption<Box<Pattern>>.This simplifies code in various places, and makes printing
RegExpLiterals faster inoxc_codegenwhenparse_regular_expressionis enabled, because it can just print the pattern string, rather than converting aPatternback to aString.