perf(napi/parser, linter/plugins): speed up decoding strings in raw transfer #21021
Merged
graphite-app[bot] merged 1 commit into main on Apr 3, 2026
Conversation
Pull request overview
This PR speeds up raw-transfer string deserialization (a known hotspot) by introducing a Latin-1 decoded source string for fast ASCII slicing on non-ASCII sources, plus a faster short-string fallback path that avoids per-byte concatenation and reduces native TextDecoder.decode calls.
Changes:
- Decode source bytes once into a Latin-1 "byte-addressable" string (`sourceTextLatin`) and prefer slicing it for ASCII substrings even when the original `sourceText` is non-ASCII.
- Replace the byte-by-byte concatenation loop for short non-source strings with a reused temp-array + `String.fromCharCode.apply` approach.
- Raise the native `TextDecoder.decode` crossover threshold to 64 bytes.
Reviewed changes
Copilot reviewed 1 out of 10 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tasks/ast_tools/src/generators/raw_transfer.rs | Generator updates implementing Latin-1 source decoding, temp-array short-string decoding, and the new crossover threshold. |
| napi/parser/src-js/generated/deserialize/ts.js | Regenerated TS deserializer using sourceTextLatin, array-based short-string decode, and 64-byte crossover. |
| napi/parser/src-js/generated/deserialize/ts_range.js | Same regeneration for the range variant. |
| napi/parser/src-js/generated/deserialize/ts_range_parent.js | Same regeneration for the range + parent variant. |
| napi/parser/src-js/generated/deserialize/ts_parent.js | Same regeneration for the parent variant. |
| napi/parser/src-js/generated/deserialize/js.js | Regenerated JS deserializer with the new decoding strategy. |
| napi/parser/src-js/generated/deserialize/js_range.js | Same regeneration for the range variant. |
| napi/parser/src-js/generated/deserialize/js_range_parent.js | Same regeneration for the range + parent variant. |
| napi/parser/src-js/generated/deserialize/js_parent.js | Same regeneration for the parent variant. |
| apps/oxlint/src-js/generated/deserialize.js | Regenerated oxlint-side deserializer with the new Latin-1 and short-string decoding paths. |
graphite-app bot pushed a commit that referenced this pull request on Apr 3, 2026:
…lier (#21025) There's no need to defer clearing vars containing buffers and source text in raw transfer deserializer module until end of linting. It was a left-over from when this module contained code which lazily deserialized comments, but that's now done elsewhere. Clear these vars as soon as deserialization is complete. Buffer and `sourceText` are still held in vars in other modules until linting completes, but `sourceTextLatin` (introduced in #21021) can be freed immediately.
Base automatically changed from `om/04-03-perf_napi_parser_linter_plugins_initialize_vars_as_0` to `main` on April 3, 2026 22:14.
leaysgur pushed a commit that referenced this pull request on Apr 7, 2026:
### 🐛 Bug Fixes
- fc7f60c allocator: Revert changes to `get_current_chunk_footer_field_offset` (#20964) (overlookmotel)
- 31316c8 semantic: Rebind class expressions before identifier checks (#20916) (camc314)

### ⚡ Performance
- fb52383 napi/parser, linter/plugins: Clear buffers and source texts earlier (#21025) (overlookmotel)
- 3b7dec4 napi/parser, linter/plugins: Use `utf8Slice` for decoding UTF-8 strings (#21022) (overlookmotel)
- 012c924 napi/parser, linter/plugins: Speed up decoding strings in raw transfer (#21021) (overlookmotel)
- 55e1e9b napi/parser, linter/plugins: Initialize vars as 0 (#21020) (overlookmotel)
- c25ef02 napi/parser, linter/plugins: Simplify branch condition in `deserializeStr` (#21019) (overlookmotel)
- 9f494c3 napi/parser, linter/plugins: Raw transfer use `String.fromCharCode` in string decoding (#21018) (overlookmotel)
- 91cf105 allocator: Increase initial chunk size from 512B to 16KB (#20968) (overlookmotel)
- cbc0c21 allocator: Add `#[cold]` to error handling functions (#20967) (overlookmotel)
- 0503a78 napi/parser, linter/plugins: Faster deserialization of `raw` fields (#20923) (overlookmotel)
- a24f75e napi/parser: Optimize string deserialization for non-ASCII sources (#20834) (Joshua Tuddenham)

### 📚 Documentation
- c78a57a syntax: Fix typo (#21044) (camc314)
- f5e228f allocator: Fix typo in comment (#20972) (overlookmotel)
- 7159d51 allocator: Improve doc comment examples for `vec2::Vec` (#20969) (overlookmotel)
- b1da750 allocator, data_structures: Correct comments (#20966) (overlookmotel)

Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>
leaysgur pushed a commit that referenced this pull request on Apr 7, 2026:
# Oxlint

### 💥 BREAKING CHANGES
- 22ce6af oxlint/lsp: [**BREAKING**] Show/fix safe suggestions by default (#19816) (Sysix)

### 🚀 Features
- 7a7b7b8 oxlint/lsp: Add source.fixAllDangerous.oxc code action kind (#20526) (bab)
- 9cfe57e linter/unicorn: Implement prefer-import-meta-properties rule (#20662) (Irfan - ئىرفان)
- 1edb391 linter/eslint: Implement `no-restricted-exports` rule (#20592) (Nicolas Le Cam)
- 0f12bcd linter/react: Implement `hook-use-state` rule (#20986) (Khaled Labeb)
- 1513a9f oxlint/lsp: Show note field for lsp diagnostic (#20983) (Sysix)
- 7fdf722 linter/unicorn: Implement `no-useless-iterator-to-array` rule (#20945) (Mikhail Baev)
- 39c8f2c linter/jest: Implement padding-around-after-all-blocks (#21034) (Sapphire)
- ac39e51 linter/eslint-vitest-plugin: Prefer importing vitest globals (#20960) (Said Atrahouch)
- 0b84de1 oxlint: Support allow option for prefer-promise-reject-errors (#20934) (camc314)
- 23db851 linter/consistent-return: Move rule from nursery to suspicious (#20920) (camc314)
- 9a27e32 linter/no-unnecessary-type-conversion: Move rule from nursery to suspicious (#20919) (camc314)
- 1ca7b58 linter/dot-notation: Move rule from nursery to style (#20918) (camc314)
- 73ba81a linter/consistent-type-exports: Move rule from nursery to style (#20917) (camc314)
- b9199b1 linter/unicorn: Implement switch-case-break-position (#20872) (Mikhail Baev)
- 3435ff8 linter: Implements `prefer-snapshot-hint` rule in Jest and Vitest (#20870) (Said Atrahouch)
- 98510d2 linter: Implement react/prefer-function-component (#19652) (Connor Shea)
- 871f9d9 linter: Implement no-useless-assignment (#15466) (Zhaoting Zhou)
- 0f01fbd linter: Implement eslint/object-shorthand (#17688) (yue)

### 🐛 Bug Fixes
- dd2df87 npm: Export package.json for oxlint and oxfmt (#20784) (kazuya kawaguchi)
- 9bc77dd linter/no-unused-private-class-members: False positive with await expr (#21067) (camc314)
- 60a57cd linter/const-comparisons: Detect equality contradictions (#21065) (camc314)
- 2bb2be2 linter/no-array-index-key: False positive when index is passed as function argument (#21012) (bab)
- 6492953 linter/no-this-in-sfc: Only flag `this` used as member expression object (#20961) (bab)
- 9446dcc oxlint/lsp: Skip `node_modules` in oxlint config walker (#21004) (copilot-swe-agent)
- af89923 linter/no-namespace: Support glob pattern matching against basename (#21031) (bab)
- 64a1a7e oxlint: Don't search for nested config outside base config (#21051) (Sysix)
- 3b953bc linter/button-has-type: Ignore `document.createElement` calls (#21008) (Said Atrahouch)
- 8c36070 linter/unicorn: Add support for `Array.from()` for `prefer-set-size` rule (#21016) (Mikhail Baev)
- c1a48f0 linter: Detect vitest import from vite-plus/test (#20976) (Said Atrahouch)
- 5c32fd1 lsp: Prevent corrupted autofix output from overlapping text edits (#19793) (Peter Wagenet)
- ca79960 linter/no-array-index-key: Move span to `key` property (#20947) (camc314)
- 2098274 linter: Add suggestion for `jest/prefer-equality-matcher` (#20925) (eryue0220)
- 6eb77ec linter: Allow default-import barrels in import/named (#20757) (Bazyli Brzóska)
- 9c218ef linter/eslint-vitest-plugin: Remove pending fix status for require-local-test-context-for-concurrent-snapshot (#20890) (Said Atrahouch)

### ⚡ Performance
- fb52383 napi/parser, linter/plugins: Clear buffers and source texts earlier (#21025) (overlookmotel)
- 3b7dec4 napi/parser, linter/plugins: Use `utf8Slice` for decoding UTF-8 strings (#21022) (overlookmotel)
- 012c924 napi/parser, linter/plugins: Speed up decoding strings in raw transfer (#21021) (overlookmotel)
- 55e1e9b napi/parser, linter/plugins: Initialize vars as 0 (#21020) (overlookmotel)
- c25ef02 napi/parser, linter/plugins: Simplify branch condition in `deserializeStr` (#21019) (overlookmotel)
- 9f494c3 napi/parser, linter/plugins: Raw transfer use `String.fromCharCode` in string decoding (#21018) (overlookmotel)
- 0503a78 napi/parser, linter/plugins: Faster deserialization of `raw` fields (#20923) (overlookmotel)
- a24f75e napi/parser: Optimize string deserialization for non-ASCII sources (#20834) (Joshua Tuddenham)

### 📚 Documentation
- af72b80 oxlint: Fix typo for --tsconfig (#20889) (leaysgur)
- 70c53b1 linter: Highlight that tsconfig is not respected in type aware linting (#20884) (camc314)

# Oxfmt

### 🚀 Features
- 35cf6e8 oxfmt: Add node version hint for ts config import failures (#21046) (camc314)

### 🐛 Bug Fixes
- dd2df87 npm: Export package.json for oxlint and oxfmt (#20784) (kazuya kawaguchi)
- 9d45511 oxfmt: Propagate file write errors instead of panicking (#20997) (leaysgur)
- 139ddd9 formatter: Handle leading comment after array elision (#20987) (leaysgur)
- 4216380 oxfmt: Support `.editorconfig` `tab_width` fallback (#20988) (leaysgur)
- d10df39 formatter: Resolve pending space in fits measurer before expanded-mode early exit (#20954) (Dunqing)
- f9ef1bd formatter: Avoid breaking after `=>` when arrow body has JSDoc type cast (#20857) (bab)

Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>

Improve perf of deserializing strings in raw transfer. This PR combines several optimizations, which have been tested and benchmarked in https://github.com/overlookmotel/oxc-raw-str-bench. This PR implements the version "latin-slice-onebyte64" from that repo, which is the current winner.
String deserialization is the main bottleneck in raw transfer, so speeding it up will likely make a large impact on deserialization overall.
This work follows on from #20834, which produced a major speed-up on many files by making files which contain some non-ASCII characters take the fast path of slicing `sourceText` more often. This PR tackles the remainder: speeding up the fallback path where the fast path can't be taken.
## Optimizations
The optimizations in this PR are:
### Latin1
When source is not 100% ASCII, decode source text from buffer as Latin1.
A Latin1-decoded string represents each UTF-8 byte as a single Latin1 character, so it can be indexed into using UTF-8 offsets.
So when we can't slice the string from `sourceText` because the UTF-8 and UTF-16 offsets differ (after any non-ASCII character), loop through the string's bytes and check if they're all ASCII. If they are, the string can be sliced from `sourceTextLatin` instead, using the original UTF-8 offsets.

This is far faster than calling `textDecoder.decode`, as it avoids a call into C++. [Benchmarks show](https://github.com/overlookmotel/oxc-raw-str-bench/blob/4f96275efa9a35d5d27615abb27f21a137149cc0/README.md#apply30-vs-latin-vs-latin-source64) a speed-up of 55% on average, and up to 70% on some files.
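A minimal Node sketch (illustrative, not oxc's generated deserializer) of why a Latin1-decoded copy of the source can be sliced with UTF-8 byte offsets:

```javascript
// 'é' is 2 bytes in UTF-8 but 1 UTF-16 code unit, so after it,
// byte offsets and string indexes diverge.
const buf = Buffer.from('const é = 1; const name = 2;', 'utf8');

const sourceText = buf.toString('utf8'); // UTF-16 string: indexes shifted after 'é'
const sourceTextLatin = buf.latin1Slice(0, buf.length); // 1 char per byte

// UTF-8 byte offsets of `name` within the buffer
const start = buf.indexOf('name');
const end = start + 4;

// Slicing the UTF-16 string with byte offsets lands in the wrong place...
console.log(sourceText.slice(start, end)); // 'ame '
// ...but the Latin1 string has exactly one char per byte, so byte
// offsets index it directly:
console.log(sourceTextLatin.slice(start, end)); // 'name'
```

This fast slice is only valid when every byte of the substring is ASCII, which is why the deserializer scans the string's bytes first.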
### Latin1 decoding method

It turns out that `new TextDecoder("latin1").decode(arr)` doesn't actually decode to Latin1! Per the WHATWG Encoding Standard, "latin1" is mapped to "windows-1252". The result is that with `TextDecoder("latin1")`:

1. `decode` is quite complicated, requiring two passes over the bytes: a first scan to determine whether they're all ASCII, followed by a 2nd pass to do the actual `windows-1252` decoding. If the string *does* contain any non-ASCII characters (which it always does in our use case), NodeJS implements the decoding in JS, not native code. Slow.
2. `decode` produces a 2-byte-per-char string (`TWO_BYTE` in V8), which takes more memory and is slower for all operations on it, e.g. string comparison, hashing for use as an object key, etc.

Instead, use `Buffer.prototype.latin1Slice`, which:

1. Does a pure Latin1 decode, which is just a single `memcpy` call.
2. Produces a 1-byte-per-char string (`ONE_BYTE` in V8).

`latin1Slice` involves a call into C++, but we only do it once per file, so this cost is tiny in the context of deserializing the whole AST.
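A small Node sketch demonstrating the mislabelling (variable names are illustrative). Byte 0x80 is U+0080 in true Latin1, but windows-1252 maps it to the euro sign:

```javascript
const bytes = Buffer.from([0x68, 0x69, 0x80]); // 'h', 'i', 0x80

// WHATWG maps the "latin1" label to windows-1252, so 0x80 becomes '€' (U+20AC)
const viaTextDecoder = new TextDecoder('latin1').decode(bytes);
// latin1Slice does a true Latin1 decode: each byte becomes the same code point
const viaLatin1Slice = bytes.latin1Slice(0, bytes.length);

console.log(viaTextDecoder); // 'hi€'
console.log(viaLatin1Slice === 'hi\u0080'); // true
```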
### Latin1 string slicing

In the fast path, slice from the Latin1-decoded string instead of `sourceText`. In the fast path, we know that all bytes of source comprising the string are ASCII, so no further checks are required.

This makes no difference on benchmarks for `deserializeStr` itself, but it may have beneficial effects downstream for code (e.g. lint rules) which accesses strings in the AST, e.g. `Identifier` names.

Because Latin1-decoded source text is `ONE_BYTE`-encoded, slices of it are too. In comparison, slices of `sourceText` may be `ONE_BYTE` or `TWO_BYTE`: if a file's source is pure ASCII, it'll be `ONE_BYTE`; if the source contains any non-ASCII characters, it'll be `TWO_BYTE`. Files in a repo will likely be a mix of both, which makes the strings returned from `deserializeStr` and placed in the AST a mix too. This in turn makes functions (e.g. lint rule visitors) polymorphic, and V8 cannot optimize them as aggressively as if they saw only `ONE_BYTE` strings.

We cannot ensure that all strings returned by `deserializeStr` are `ONE_BYTE`: some strings contain non-ASCII characters, and they *have* to be represented in `TWO_BYTE` form. But we can minimize it. Now only strings which *themselves* contain non-ASCII characters are `TWO_BYTE`, whereas before, a single non-ASCII byte anywhere in the source text was enough. Code which accesses `Identifier` names, for example, will exclusively see `ONE_BYTE` strings and will be more heavily optimized, because Unicode `Identifier`s are rarer than hen's teeth in real-world code.
### Remove string-concatenation loop

Previously, strings which lie outside of source text were assembled byte-by-byte in a loop via concatenation. Instead, check that all the bytes are ASCII first, copy them into an array, and pass that array to `String.fromCharCode` via `fromCharCode.apply(null, array)`. To avoid allocating a fresh array every time, hold a stock of arrays for all string lengths that this path can require, and reuse them.

This is a variation on the approach that #20883 took, but without the massive switch. It produces much tighter assembly, and avoids regressing the fast path by making `deserializeStr` a very large function. Despite the complexity and multiple operations, [this is up to 3x faster](https://github.com/overlookmotel/oxc-raw-str-bench/blob/4f96275efa9a35d5d27615abb27f21a137149cc0/README.md#apply30-vs-switch30) than the switch approach, and gives an average 30% speed-up.
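A hedged sketch of the temp-array approach (names like `tempArrays` and `decodeShortAscii` are illustrative, not oxc's generated code). One array per length is created lazily and reused, so the hot path allocates nothing:

```javascript
const MAX_LEN = 64; // lengths beyond this take the native TextDecoder path
const tempArrays = new Array(MAX_LEN + 1).fill(null);

function decodeShortAscii(buf, start, len) {
  // Reuse one array per length; `fromCharCode.apply` needs exactly `len` args
  const arr = tempArrays[len] || (tempArrays[len] = new Array(len));
  for (let i = 0; i < len; i++) {
    const byte = buf[start + i];
    if (byte >= 128) return null; // non-ASCII: caller falls back to a slower path
    arr[i] = byte;
  }
  // Build the whole string in one call - no per-byte concatenation
  return String.fromCharCode.apply(null, arr);
}

const buf = Buffer.from('hello, world', 'utf8');
console.log(decodeShortAscii(buf, 0, 5)); // 'hello'
```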
### Increase native call threshold
The above optimizations make the slow path much faster. This shifts the tipping point at which it's faster to make a native call to `TextDecoder.decode` from 9 bytes up to 64 bytes. Most strings now avoid the native call and stay in JS code, which is heavily optimized by Turbofan.

The tipping point of 64 is something of a guesstimate. Benchmarking shows it's in the right ballpark, but we could finesse it, and probably squeeze out another couple of %.
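A rough micro-benchmark sketch of why a length crossover exists (`jsDecode` is an illustrative stand-in for the generated fast path, and the numbers are entirely machine-dependent):

```javascript
const td = new TextDecoder();

// Pure-JS decode: per-byte loop + one fromCharCode.apply call
function jsDecode(buf, start, len) {
  const arr = new Array(len);
  for (let i = 0; i < len; i++) arr[i] = buf[start + i];
  return String.fromCharCode.apply(null, arr);
}

// Time both paths at a few lengths: JS wins for short strings (no C++
// call overhead); native wins once strings get long enough.
for (const len of [8, 64, 512]) {
  const buf = Buffer.alloc(len, 0x61); // ASCII 'a' repeated
  const t0 = process.hrtime.bigint();
  for (let i = 0; i < 20000; i++) jsDecode(buf, 0, len);
  const t1 = process.hrtime.bigint();
  for (let i = 0; i < 20000; i++) td.decode(buf.subarray(0, len));
  const t2 = process.hrtime.bigint();
  console.log(`len ${len}: js ${Number(t1 - t0) / 1e6}ms, native ${Number(t2 - t1) / 1e6}ms`);
}
```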
## Credit
The Latin1 string technique was cooked up by @joshuaisaact in overlookmotel/oxc-raw-str-bench#1. All credit to him for this masterstroke which cracks the whole problem!