Skip to content
This repository has been archived by the owner on Aug 31, 2023. It is now read-only.

perf(rome_js_formatter): Reduce FormatElement size #2456

Merged
merged 4 commits into from
Apr 20, 2022

Conversation

MichaReiser
Copy link
Contributor

@MichaReiser MichaReiser commented Apr 15, 2022

This PR reduces the size of a FormatElement from 56 to 32 bytes by:

  • Using a Box<str> to store a dynamic token's string as it doesn't need to be mutable (safes 8 bytes for the capacity)
  • Only storing the start position for a DynamicToken because the length isn't used by the printer
  • Change the semantics of verbatim_ranges to store the ranges in the formatted string. The ranges in the formatted output should be sufficient for debugging and it allows to resolve the String rather than having to store it on the FormatElement (range can be computed in the printer).

It further improves the get_lines_between implementation to only check the leading trivia because the trailing trivia should never contain new lines except for the EOF token which isn't relevant for that function.

This improves the formatter's performance by about 10%

group                                    format-element                         main
-----                                    --------------                         ----
formatter/checker.ts                     1.00    250.2±5.03ms    10.4 MB/sec    1.12    280.3±3.07ms     9.3 MB/sec
formatter/compiler.js                    1.00    145.6±1.30ms     7.2 MB/sec    1.09    158.7±1.82ms     6.6 MB/sec
formatter/d3.min.js                      1.00    117.4±3.70ms     2.2 MB/sec    1.11    129.9±1.57ms     2.0 MB/sec
formatter/dojo.js                        1.00      7.4±0.15ms     9.2 MB/sec    1.09      8.1±0.02ms     8.4 MB/sec
formatter/ios.d.ts                       1.00    181.2±1.95ms    10.3 MB/sec    1.18   214.0±18.85ms     8.7 MB/sec
formatter/jquery.min.js                  1.00     29.1±0.55ms     2.8 MB/sec    1.10     32.1±0.23ms     2.6 MB/sec
formatter/math.js                        1.00    233.1±4.69ms     2.8 MB/sec    1.12    261.7±1.85ms     2.5 MB/sec
formatter/parser.ts                      1.00      5.3±0.15ms     9.2 MB/sec    1.08      5.7±0.05ms     8.5 MB/sec
formatter/pixi.min.js                    1.00    131.0±7.11ms     3.3 MB/sec    1.09    142.6±2.03ms     3.1 MB/sec
formatter/react-dom.production.min.js    1.00     37.0±0.82ms     3.1 MB/sec    1.07     39.5±0.59ms     2.9 MB/sec
formatter/react.production.min.js        1.00  1825.1±57.85µs     3.4 MB/sec    1.08   1970.9±3.55µs     3.1 MB/sec
formatter/router.ts                      1.00      3.7±0.09ms    16.2 MB/sec    1.09      4.0±0.11ms    14.8 MB/sec
formatter/tex-chtml-full.js              1.00    288.3±5.19ms     3.2 MB/sec    1.10    316.9±3.00ms     2.9 MB/sec
formatter/three.min.js                   1.00    155.7±3.79ms     3.8 MB/sec    1.08    167.6±3.41ms     3.5 MB/sec
formatter/typescript.js                  1.00    945.2±6.64ms    10.1 MB/sec    1.10   1037.9±8.11ms     9.2 MB/sec
formatter/vue.global.prod.js             1.00     49.1±1.49ms     2.5 MB/sec    1.05     51.8±1.26ms     2.3 MB/sec

Test Plan

cargo test.

I changed a node implementation to call format_verbatim and verified the printed text & ranges.

Future improvements

It should be possible to bring the FormatElement size down to 24 bytes by

  • Introducing a new Marker Element that creates a source map entry from source position to formatted position
  • Extract the source position from the Token variants
  • Replace the List's Vec with a Vec that uses u32 for indices. This should be sufficient, considering that Rome only supports files with less than u32::max characters

@MichaReiser MichaReiser temporarily deployed to aws April 15, 2022 21:39 Inactive
@github-actions
Copy link

github-actions bot commented Apr 15, 2022

Parser conformance results on ubuntu-latest

js/262

Test result main count This PR count Difference
Total 45323 45323 0
Passed 44383 44383 0
Failed 940 940 0
Panics 0 0 0
Coverage 97.93% 97.93% 0.00%

jsx/babel

Test result main count This PR count Difference
Total 39 39 0
Passed 36 36 0
Failed 3 3 0
Panics 0 0 0
Coverage 92.31% 92.31% 0.00%

ts/babel

Test result main count This PR count Difference
Total 588 588 0
Passed 519 519 0
Failed 69 69 0
Panics 0 0 0
Coverage 88.27% 88.27% 0.00%

ts/microsoft

Test result main count This PR count Difference
Total 16171 16171 0
Passed 12315 12315 0
Failed 3856 3856 0
Panics 0 0 0
Coverage 76.15% 76.15% 0.00%

@cloudflare-workers-and-pages
Copy link

cloudflare-workers-and-pages bot commented Apr 15, 2022

Deploying with  Cloudflare Pages  Cloudflare Pages

Latest commit: e3be30f
Status:⚡️  Build in progress...

View logs

@MichaReiser MichaReiser temporarily deployed to aws April 15, 2022 21:42 Inactive
@github-actions
Copy link

github-actions bot commented Apr 15, 2022

This PR reduces the size of a `FormatElement` from 56 to 32 bytes by:

* Using a `Box<str>` to store a dynamic token's string as it doesn't need to be mutable (safes 8 bytes for the `capacity`)
* Only storing the start position for a `DynamicToken` because the length isn't used by the printer
* Change the semantics of `verbatim_ranges` to store the ranges in the **formatted** string. The ranges in the formatted output should be sufficient for debugging and it allows to resolve the `String` rather than having to store it on the `FormatElement` (range can be computed in the printer).

This improves the formatter's performance by about 10%
@MichaReiser MichaReiser temporarily deployed to aws April 16, 2022 07:11 Inactive
@MichaReiser MichaReiser marked this pull request as ready for review April 19, 2022 05:48
crates/rome_formatter/src/format_element.rs Show resolved Hide resolved
crates/rome_formatter/src/format_element.rs Outdated Show resolved Hide resolved
Comment on lines -266 to -267
pub fn into_verbatim(self) -> Vec<(String, TextRange)> {
self.verbatim_source
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason why we removed this function? It was there mainly to be in line with into_sourcemap and into_code. If it's been removed because not used, then we should do the same with into_sourcemap too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed it because it wasn't used and the struct no longer stores the Vec that it could just take and return. Having the iterator specific method makes it clear that taking the verbatim sources requires a vector allocation.

@MichaReiser MichaReiser temporarily deployed to aws April 19, 2022 12:22 Inactive
@MichaReiser MichaReiser temporarily deployed to aws April 20, 2022 13:58 Inactive
@MichaReiser MichaReiser merged commit f52bb37 into main Apr 20, 2022
@MichaReiser MichaReiser deleted the perf/concat-elements branch April 20, 2022 14:01
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants