perf(parser): use visitor instead of JSON.parse reviver#10791
perf(parser): use visitor instead of JSON.parse reviver#10791overlookmotel merged 10 commits intooxc-project:mainfrom ArnaudBarre:drop-json-reviver
Conversation
How to use the Graphite Merge QueueAdd either label to this PR to merge it via the merge queue:
You must have a Graphite account in order to use the merge queue. Sign up using this link. An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue. |
WalkthroughA new script, 📜 Recent review detailsConfiguration used: CodeRabbit UI ⛔ Files ignored due to path filters (2)
📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (3)
⏰ Context from checks skipped due to timeout of 90000ms (6)
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (5)
napi/parser/generate-visitor-keys.mjs (2)
1-3: Export style mismatch causes fragile inter-op
wrap.mjsconsumes the generated file as a CommonJS default export (module.exports = {...}) from ESM usingimport visitorKeys from .... This works in Node, but bundlers (Rollup/ESBuild) may place the object under.default.Simplest fix: emit an ESM module when
.mjsis imported from ESM:-module.exports = {$ +export default {$and change
wrap.cjstorequire('./generated/visitor-keys.cjs'), or addconst visitorKeys = (imported.default ?? imported);in the consumer code.
4-17: Minor robustness nits
- Use
path.resolve(import.meta.url, '..')to ensure the script works when run outside the package root.- Dropping the trailing comma after the last property avoids unnecessary diff noise across Node versions.
- Append a final newline to keep tooling happy.
These are cosmetic but improve portability.
napi/parser/wrap.mjs (1)
4-4: Defensive import for CJS/ESM bridgeWhen importing a CJS file from ESM, some bundlers expose exports under
.default. Safeguard:-import visitorKeys from './generated/visitor-keys.js'; +import _visitorKeysImport from './generated/visitor-keys.js'; +const visitorKeys = _visitorKeysImport.default ?? _visitorKeysImport;This prevents runtime failures in browser bundles.
napi/parser/wrap.cjs (2)
34-49: Clean up transient fields after hydration & guard against edge-cases
node.bigintmay be the string"0", which is falsy when coerced. Use an explicit!== undefinedcheck so zero-value bigints are not skipped.- Once
valueis created, the rawbigint/regexpayload is redundant and can be removed to keep the AST minimal and avoid double-sources-of-truth.-if (node.type === 'Literal') { - if (node.bigint) { - node.value = BigInt(node.bigint); - } - if (node.regex) { +if (node.type === 'Literal') { + if (node.bigint !== undefined) { + node.value = BigInt(node.bigint); + delete node.bigint; + } + if (node.regex) { try { node.value = RegExp(node.regex.pattern, node.regex.flags); + delete node.regex; } catch (_err) { // Invalid regexp, or valid regexp using syntax not supported by this version of NodeJS } } }This keeps the resulting AST closer to what ESTree-consumers expect.
4-4: Minor: rely on Node’s default.jsresolutionThe explicit
.jsextension is redundant and slightly hinders tooling that rewrites paths (e.g. bundlers converting to.cjs). Dropping it keeps the import resilient:-const visitorKeys = require('./generated/visitor-keys.js'); +const visitorKeys = require('./generated/visitor-keys');
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
napi/.gitignore(1 hunks)napi/parser/generate-visitor-keys.mjs(1 hunks)napi/parser/package.json(2 hunks)napi/parser/wrap.cjs(2 hunks)napi/parser/wrap.mjs(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Test NAPI
- GitHub Check: Clippy
- GitHub Check: Test Linux
🔇 Additional comments (3)
napi/.gitignore (1)
4-4: Generated artefact correctly excludedIgnoring
parser/generated/visitor-keys.jskeeps the repository clean while still allowing the file to be published via thefilesfield. No issues spotted here.napi/parser/package.json (1)
46-47: Verify.gitignorevsfilesinteractionThe generated file is ignored by Git but included in
"files". That is fine for npm―as long as the file is present on disk whennpm packruns. After applying the previous suggestion, please double-check that a fresh clone →pnpm install→pnpm packincludesgenerated/visitor-keys.js.napi/parser/wrap.cjs (1)
29-31: Gracefully handle already-parsed inputIf
result.programis accidentally passed as an object (e.g. coming from another API level),JSON.parsewill throw. A small guard prevents the crash without impacting performance:-const parsed = JSON.parse(program); +const parsed = + typeof program === 'string' ? JSON.parse(program) : program;Consider adding a unit-test for this scenario.
|
I've squashed to 1 commit and rebased on main, just to fix merge conflicts and make it easier to review. I'll try and figure out cause of the test failures. |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
napi/parser/wrap.cjs (1)
51-63: Fixed traversal order to properly handle leaf nodes.The implementation correctly calls
fn(node)before checking for visitor keys, which addresses the previous issue with leaf nodes being skipped. This ensures thatLiteralnodes withbigintandregexvalues are properly processed.However, note that this implements a hybrid traversal approach - parent nodes are processed before child nodes for non-array types, but array nodes are not processed directly.
For consistency, you might want to add a comment explaining this traversal strategy, especially since the code includes a note about it being duplicated in
wrap.mjs:function visitNode(node, fn) { if (!node) return; if (Array.isArray(node)) { for (const el of node) visitNode(el, fn); return; } + // Process node before its children (pre-order traversal) fn(node); const keys = visitorKeys[node.type]; if (!keys) return; for (const key of keys) visitNode(node[key], fn); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (5)
napi/.gitignore(1 hunks)napi/parser/generate-visitor-keys.mjs(1 hunks)napi/parser/package.json(3 hunks)napi/parser/wrap.cjs(2 hunks)napi/parser/wrap.mjs(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- napi/.gitignore
- napi/parser/package.json
- napi/parser/generate-visitor-keys.mjs
- napi/parser/wrap.mjs
⏰ Context from checks skipped due to timeout of 90000ms (6)
- GitHub Check: Test wasm32-wasip1-threads
- GitHub Check: Test Linux
- GitHub Check: Test NAPI
- GitHub Check: Clippy
- GitHub Check: Conformance
- GitHub Check: Test VSCode
🔇 Additional comments (3)
napi/parser/wrap.cjs (3)
4-5: Appropriate addition of visitor keys import.The import of the generated visitor keys aligns with the PR objective of using a visitor pattern instead of a JSON reviver function.
28-32: Good refactoring to two-step parsing approach.The function now correctly parses the JSON first and then applies transformations through the visitor pattern, which is cleaner than using a reviver function.
34-49: Simplified transformation logic.The transform function has been refactored to take a single node and mutate it directly rather than returning a transformed value, which is more efficient and aligns well with the visitor pattern approach.
We import `@typescript-eslint/visitor-keys` as a dependency, rather than copying it's code. So no need for license attribution.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
napi/parser/generate-visitor-keys.mjs (1)
5-6: Consider usingfileURLToPathfor more robust path resolution.The script uses
import.meta.dirnamewhich might not be available in all Node.js versions. For better compatibility, consider usingimport.meta.urlwithfileURLToPath.- import { join as pathJoin } from 'node:path'; + import { join as pathJoin, dirname } from 'node:path'; + import { fileURLToPath } from 'node:url'; - const PATH = pathJoin(import.meta.dirname, 'generated/visitor-keys.js'); + const PATH = pathJoin(dirname(fileURLToPath(import.meta.url)), 'generated/visitor-keys.js');
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
napi/parser/generated/visitor-keys.jsis excluded by!**/generated/**
📒 Files selected for processing (3)
napi/parser/generate-visitor-keys.mjs(1 hunks)napi/parser/wrap.cjs(2 hunks)napi/parser/wrap.mjs(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- napi/parser/wrap.mjs
⏰ Context from checks skipped due to timeout of 90000ms (6)
- GitHub Check: Test wasm32-wasip1-threads
- GitHub Check: Test Linux
- GitHub Check: Test NAPI
- GitHub Check: Clippy
- GitHub Check: Test VSCode
- GitHub Check: Conformance
🔇 Additional comments (10)
napi/parser/generate-visitor-keys.mjs (5)
1-4: LGTM: Proper imports for the script.The script correctly imports the necessary dependencies: visitor keys from TypeScript ESLint, and Node.js built-in file system and path modules.
7-12: LGTM: Properly extending the visitor keys.The code correctly extends the original visitor keys with additional entries for
ParenthesizedExpressionandTSParenthesizedTypewhich are missing in the original TypeScript ESLint keys.
14-18: LGTM: Proper filtering and formatting of keys.The code efficiently filters out empty visitor keys and formats the remaining ones into a string representation suitable for a JS module.
19-26: LGTM: Clear warning comment and proper module export format.The warning comment clearly indicates that the file is auto-generated and should not be edited directly.
27-28: LGTM: File is properly written to the specified path.The code correctly uses
writeFileSyncto write the generated code to the specified path.napi/parser/wrap.cjs (5)
4-5: LGTM: Properly importing the generated visitor keys.The code correctly requires the generated visitor keys module.
28-32: LGTM: Improved JSON parsing with visitor pattern.The function has been refactored to use a two-step process: first parse the JSON normally, then visit each node in the AST to apply transformations. This approach is more flexible and potentially more performant than using a JSON reviver function.
34-49: LGTM: Simplified transform function.The transform function has been updated to directly mutate the node instead of returning transformed values. This is cleaner and more intuitive when used with the visitor pattern.
51-67: LGTM: Fixed visitor implementation.The visitor implementation correctly processes the AST by:
- Handling null nodes and arrays
- Applying the transform function to each node (including leaf nodes)
- Recursively visiting child nodes according to the visitor keys
This implementation addresses the previously identified issue where leaf nodes like
Literalwere being skipped, which brokeBigIntandRegExphydration.
60-66:Details
❓ Verification inconclusive
Verify the order of operations in the visitor.
The current implementation applies the transform function before visiting child nodes (pre-order traversal). Make sure this is the intended behavior, as the opposite order (post-order traversal) might be needed in some cases.
🏁 Script executed:
#!/bin/bash # Check for any test cases that might rely on specific traversal order rg -A 5 -B 5 "BigInt|RegExp" napi/parser/wrap.test.js || echo "No test file found"Length of output: 167
Confirm traversal order in
visitNode(pre-order vs. post-order)I didn’t find any existing tests that verify whether the transform function (
fn(node)) should run before or after visiting child nodes. If you intended a pre-order traversal, no change is needed. Otherwise:
- Consider swapping the order so children are visited first (post-order).
- Add a targeted unit test to lock in the expected behavior for future refactors.
|
I found the problem. Our AST has 2 node types which TS-ESLint doesn't:
Have pushed a commit to fix, and also done a little refactoring to (in my opinion) make the code a bit clearer. |
|
Thanks for finding that! Changes looks good to me! |
…up visitor (#10813) Follow-on after #10791, part of #10783. In the fixup visitor, replace `visitNode` with a more specialized `transformNode` function. The difference is that `transformNode` doesn't need to be passed `fn` as 2nd param. Also, move the check for whether a node is a `Literal` into `transformNode` to avoid the cost of a function call for every node (most of which are not `Literal`s). Running benchmarks locally (`pnpm run bench` in `napi/parser`), I'm seeing a 2%-3% speed-up.
#10791 added 2 files `generated/visitor-keys.cjs` and `visitor-keys.mjs` to the NAPI package. When I amended the original PR, I forgot to update `package.json` to include them both in NPM package.
#10791 fixed the performance of `oxc-parser` NPM package by removing the reviver from `JSON.parse` call. However, it replaced it with a complete traversal of the AST on JS side to locate `Literal` nodes and update them. Instead, generate a list of paths to `Literal` nodes needing fixing on Rust side in `ESTree` serializer. e.g. for this program: ```js 123n; foo(/xyz/); ``` the fix paths are: ```json [ ["body", 0, "expression"], ["body", 1, "expression", "arguments", 2] ] ``` Having the location of nodes which need updating reduces work on JS side, as no need to search the entire AST. Running `pnpm run bench` (in `napi/parser`) locally shows between 8% and 15% speed-up from this change.
Fixes #10783
This requires having a list of ~135 nodes with their non primitive keys.
Technically this list could be generated alongside the raw deserializer but I don't know this code enough to do it.
I've chosen to inline visitor keys to avoid adding 3 dependencies for that.
I didn't find how the main LICENCE was injected during the publish so I couldn't append the bundled LICENSES and choose to use
@licencecomments with links