perf(rome_js_formatter): Reduce the String allocations for Tokens #2462
@@ -1125,6 +1127,13 @@ pub enum Token {
// The position of the dynamic token in the unformatted source code
source_position: TextSize,
},
// A token that is taken 1:1 from the source code
- // A token that is taken 1:1 from the source code
+ /// A token that is taken 1:1 from the source code
What does "taken" mean exactly here? Computed? Extracted?
I can try to improve the existing documentation. It is neither computed nor extracted. It just means it's text that we keep as is; the formatting doesn't change it.
Maybe "borrowed" could be a more familiar wording, although in Rust it implies some lifetime semantics that are not present here
This PR reduces the number of `String` allocations needed for `FormatElement::Token`s by making use of the observation that most tokens match the text of a `SyntaxToken`. For example, identifiers and punctuation tokens are kept by the formatter as is. This is even true for string literal tokens if they already use the right quotes.

This is implemented by introducing a new `SyntaxTokenText` that is `Send + Sync` and allows referencing a slice of a `SyntaxToken` without worrying about the `&str`'s lifetime. The PR further extends `FormatElement::Token` to make use of the newly introduced `SyntaxTokenText`.

This change reduces overall memory consumption and improves performance:

```
group                                   format-element                           token
-----                                   --------------                           -----
formatter/checker.ts                    1.04  250.2±5.03ms    10.4 MB/sec        1.00  239.5±1.76ms    10.9 MB/sec
formatter/compiler.js                   1.07  145.6±1.30ms    7.2 MB/sec         1.00  136.5±1.43ms    7.7 MB/sec
formatter/d3.min.js                     1.07  117.4±3.70ms    2.2 MB/sec         1.00  109.6±1.24ms    2.4 MB/sec
formatter/dojo.js                       1.03  7.4±0.15ms      9.2 MB/sec         1.00  7.2±0.03ms      9.5 MB/sec
formatter/ios.d.ts                      1.05  181.2±1.95ms    10.3 MB/sec        1.00  172.8±2.23ms    10.8 MB/sec
formatter/jquery.min.js                 1.02  29.1±0.55ms     2.8 MB/sec         1.00  28.5±0.07ms     2.9 MB/sec
formatter/math.js                       1.05  233.1±4.69ms    2.8 MB/sec         1.00  222.8±1.79ms    2.9 MB/sec
formatter/parser.ts                     1.03  5.3±0.15ms      9.2 MB/sec         1.00  5.1±0.01ms      9.5 MB/sec
formatter/pixi.min.js                   1.10  131.0±7.11ms    3.3 MB/sec         1.00  119.3±2.12ms    3.7 MB/sec
formatter/react-dom.production.min.js   1.07  37.0±0.82ms     3.1 MB/sec         1.00  34.5±0.21ms     3.3 MB/sec
formatter/react.production.min.js       1.08  1825.1±57.85µs  3.4 MB/sec         1.00  1683.8±30.49µs  3.7 MB/sec
formatter/router.ts                     1.02  3.7±0.09ms      16.2 MB/sec        1.00  3.6±0.01ms      16.6 MB/sec
formatter/tex-chtml-full.js             1.05  288.3±5.19ms    3.2 MB/sec         1.00  273.4±1.29ms    3.3 MB/sec
formatter/three.min.js                  1.11  155.7±3.79ms    3.8 MB/sec         1.00  139.7±1.76ms    4.2 MB/sec
formatter/typescript.js                 1.04  945.2±6.64ms    10.1 MB/sec        1.00  909.3±7.16ms    10.4 MB/sec
formatter/vue.global.prod.js            1.07  49.1±1.49ms     2.5 MB/sec         1.00  45.8±0.20ms     2.6 MB/sec
```
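To illustrate the idea behind `SyntaxTokenText`, here is a minimal sketch, not the code from this PR. It assumes the source text is held in an `Arc<str>` as a stand-in for the syntax tree's token storage; `TokenText`, the `SourceSlice` variant, and `assert_send_sync` are hypothetical names chosen for the example. The point it shows is that a shared owner plus a byte range gives a `Send + Sync` value with no lifetime parameter, so a token can reuse the source text instead of allocating a new `String`.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical stand-in for `SyntaxTokenText`: it shares ownership of the
/// source text and remembers which byte range of it the token covers.
#[derive(Clone, Debug)]
pub struct TokenText {
    source: Arc<str>,
    range: Range<usize>,
}

impl TokenText {
    /// Panics if `range` is out of bounds or not on char boundaries.
    pub fn new(source: Arc<str>, range: Range<usize>) -> Self {
        assert!(
            source.get(range.clone()).is_some(),
            "range must be a valid slice of the source text"
        );
        Self { source, range }
    }

    /// Returns the referenced text without copying it.
    pub fn as_str(&self) -> &str {
        &self.source[self.range.clone()]
    }
}

/// Simplified token enum (assumed shape, not the formatter's actual definition).
#[derive(Clone, Debug)]
pub enum Token {
    /// Text known at compile time, for example punctuation inserted by the formatter.
    Static { text: &'static str },
    /// Text computed by the formatter; this variant owns a heap allocation.
    Dynamic { text: Box<str> },
    /// Text kept as is from the source; cloning only bumps a reference count.
    SourceSlice { text: TokenText },
}

impl Token {
    pub fn as_str(&self) -> &str {
        match self {
            Token::Static { text } => text,
            Token::Dynamic { text } => text,
            Token::SourceSlice { text } => text.as_str(),
        }
    }
}

/// Compiles only if `T` can be sent to and shared between threads.
pub fn assert_send_sync<T: Send + Sync>() {}

fn main() {
    let source: Arc<str> = Arc::from("let answer = 42;");

    let tokens = vec![
        Token::SourceSlice { text: TokenText::new(Arc::clone(&source), 0..3) },
        Token::Static { text: " " },
        Token::SourceSlice { text: TokenText::new(Arc::clone(&source), 4..10) },
        Token::Dynamic { text: " = 42;".into() },
    ];

    let printed: String = tokens.iter().map(Token::as_str).collect();
    println!("{printed}");

    assert_send_sync::<Token>();
}
```

In this sketch only the `Dynamic` variant allocates per token. The trade-off is that every `SourceSlice` keeps the whole source buffer alive for as long as the token exists.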
245c0ea to a93522b