perf(parser): use visitor instead of JSON.parse reviver#10791

Merged
overlookmotel merged 10 commits into oxc-project:main from ArnaudBarre:drop-json-reviver
May 5, 2025

@ArnaudBarre
Contributor

Fixes #10783

This requires having a list of ~135 nodes with their non-primitive keys.
Technically, this list could be generated alongside the raw deserializer, but I don't know this code well enough to do it.
I've chosen to inline the visitor keys to avoid adding 3 dependencies for that.
I couldn't find how the main LICENCE is injected during publish, so I couldn't append the bundled LICENSES and chose to use @licence comments with links instead.

@github-actions github-actions bot added the C-performance Category - Solution not expected to change functional behavior, only performance label May 4, 2025
@coderabbitai

coderabbitai bot commented May 4, 2025

Walkthrough

A new script, generate-visitor-keys.mjs, was added to generate visitor-keys.cjs and visitor-keys.mjs files that extend the original visitor keys from the @typescript-eslint/visitor-keys package with additional entries for certain AST node types. These generated files are included in the package and the script is run as part of the development build process. The JSON AST parsing in both wrap.cjs and wrap.mjs was refactored: the previous approach using JSON.parse with a reviver function was replaced by a two-step process where the JSON is parsed normally, then the entire AST is recursively traversed with a new visitNode function that applies a node-transforming function. The transform function was changed to accept and mutate AST nodes directly rather than using key-value pairs. The traversal uses the generated visitor keys to determine child nodes, enabling consistent and explicit AST processing. No changes were made to the signatures of exported or public APIs.
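The difference between the two approaches described above can be sketched roughly as follows. This is a minimal illustration, not the code from wrap.cjs or wrap.mjs: the `visitorKeys` object is a hypothetical two-entry subset, the function names `parseWithReviver` and `parseWithVisitor` are made up for the comparison, and only BigInt hydration is shown.

```javascript
// Hypothetical, trimmed-down visitor keys: node type -> keys that hold child nodes.
const visitorKeys = {
  Program: ['body'],
  ExpressionStatement: ['expression'],
};

// Reviver approach: the callback runs for every key/value pair in the JSON,
// including strings, numbers, and arrays that can never need fixing.
function parseWithReviver(json) {
  return JSON.parse(json, (_key, value) => {
    if (value !== null && typeof value === 'object' && value.type === 'Literal' && value.bigint) {
      value.value = BigInt(value.bigint);
    }
    return value;
  });
}

// Visitor approach: plain JSON.parse, then one explicit walk over AST nodes only.
function parseWithVisitor(json) {
  const program = JSON.parse(json);
  visitNode(program, transform);
  return program;
}

// Mutates the node in place instead of returning a replacement value.
function transform(node) {
  if (node.type === 'Literal' && node.bigint) {
    node.value = BigInt(node.bigint);
  }
}

function visitNode(node, fn) {
  if (!node) return;
  if (Array.isArray(node)) {
    for (const el of node) visitNode(el, fn);
    return;
  }
  fn(node);
  const keys = visitorKeys[node.type];
  if (!keys) return;
  for (const key of keys) visitNode(node[key], fn);
}

// A hand-written AST for `123n;`, serialized the way a parser binding might send it.
const json = JSON.stringify({
  type: 'Program',
  body: [
    {
      type: 'ExpressionStatement',
      expression: { type: 'Literal', value: null, bigint: '123', raw: '123n' },
    },
  ],
});

const viaReviver = parseWithReviver(json);
const viaVisitor = parseWithVisitor(json);
```

Both functions end up with the same hydrated `value`; the visitor version simply skips the per-property callback that the reviver imposes on every primitive in the tree.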


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1efb6f4 and a760ebc.

⛔ Files ignored due to path filters (2)
  • napi/parser/generated/visitor-keys.cjs is excluded by !**/generated/**
  • napi/parser/generated/visitor-keys.mjs is excluded by !**/generated/**
📒 Files selected for processing (3)
  • napi/parser/generate-visitor-keys.mjs (1 hunks)
  • napi/parser/wrap.cjs (2 hunks)
  • napi/parser/wrap.mjs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • napi/parser/generate-visitor-keys.mjs
  • napi/parser/wrap.mjs
  • napi/parser/wrap.cjs
⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: Conformance
  • GitHub Check: Test wasm32-wasip1-threads
  • GitHub Check: Test NAPI
  • GitHub Check: Clippy
  • GitHub Check: Test VSCode
  • GitHub Check: Test Linux

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (5)
napi/parser/generate-visitor-keys.mjs (2)

1-3: Export style mismatch causes fragile inter-op

wrap.mjs consumes the generated file as a CommonJS default export (module.exports = {...}) from ESM using import visitorKeys from .... This works in Node, but bundlers (Rollup/ESBuild) may place the object under .default.

Simplest fix: emit an ESM module when .mjs is imported from ESM:

-module.exports = {$
+export default {$

and change wrap.cjs to require('./generated/visitor-keys.cjs'), or add

const visitorKeys = (imported.default ?? imported);

in the consumer code.


4-17: Minor robustness nits

  1. Resolve the output path relative to the script's own location (for example via fileURLToPath(import.meta.url)) rather than the working directory, so the script works when run outside the package root.
  2. Dropping the trailing comma after the last property avoids unnecessary diff noise across Node versions.
  3. Append a final newline to keep tooling happy.

These are cosmetic but improve portability.

napi/parser/wrap.mjs (1)

4-4: Defensive import for CJS/ESM bridge

When importing a CJS file from ESM, some bundlers expose exports under .default. Safeguard:

-import visitorKeys from './generated/visitor-keys.js';
+import _visitorKeysImport from './generated/visitor-keys.js';
+const visitorKeys = _visitorKeysImport.default ?? _visitorKeysImport;

This prevents runtime failures in browser bundles.

napi/parser/wrap.cjs (2)

34-49: Clean up transient fields after hydration & guard against edge-cases

  1. node.bigint may be the string "0", which is falsy when coerced. Use an explicit !== undefined check so zero-value bigints are not skipped.
  2. Once value is created, the raw bigint / regex payload is redundant and can be removed to keep the AST minimal and avoid double-sources-of-truth.
-if (node.type === 'Literal') {
-  if (node.bigint) {
-    node.value = BigInt(node.bigint);
-  }
-  if (node.regex) {
+if (node.type === 'Literal') {
+  if (node.bigint !== undefined) {
+    node.value = BigInt(node.bigint);
+    delete node.bigint;
+  }
+  if (node.regex) {
     try {
       node.value = RegExp(node.regex.pattern, node.regex.flags);
+      delete node.regex;
     } catch (_err) {
       // Invalid regexp, or valid regexp using syntax not supported by this version of NodeJS
     }
   }
 }

This keeps the resulting AST closer to what ESTree-consumers expect.


4-4: Minor: rely on Node’s default .js resolution

The explicit .js extension is redundant and slightly hinders tooling that rewrites paths (e.g. bundlers converting to .cjs). Dropping it keeps the import resilient:

-const visitorKeys = require('./generated/visitor-keys.js');
+const visitorKeys = require('./generated/visitor-keys');
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between c33eb9c and f64a922.

📒 Files selected for processing (5)
  • napi/.gitignore (1 hunks)
  • napi/parser/generate-visitor-keys.mjs (1 hunks)
  • napi/parser/package.json (2 hunks)
  • napi/parser/wrap.cjs (2 hunks)
  • napi/parser/wrap.mjs (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Test NAPI
  • GitHub Check: Clippy
  • GitHub Check: Test Linux
🔇 Additional comments (3)
napi/.gitignore (1)

4-4: Generated artefact correctly excluded

Ignoring parser/generated/visitor-keys.js keeps the repository clean while still allowing the file to be published via the files field. No issues spotted here.

napi/parser/package.json (1)

46-47: Verify .gitignore vs files interaction

The generated file is ignored by Git but included in "files". That is fine for npm, as long as the file is present on disk when npm pack runs. After applying the previous suggestion, please double-check that a fresh clone → pnpm install → pnpm pack includes generated/visitor-keys.js.

napi/parser/wrap.cjs (1)

29-31: Gracefully handle already-parsed input

If result.program is accidentally passed as an object (e.g. coming from another API level), JSON.parse will throw. A small guard prevents the crash without impacting performance:

-const parsed = JSON.parse(program);
+const parsed =
+  typeof program === 'string' ? JSON.parse(program) : program;

Consider adding a unit-test for this scenario.

@overlookmotel
Member

I've squashed to 1 commit and rebased on main, just to fix merge conflicts and make it easier to review.

I'll try and figure out cause of the test failures.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
napi/parser/wrap.cjs (1)

51-63: Fixed traversal order to properly handle leaf nodes.

The implementation correctly calls fn(node) before checking for visitor keys, which addresses the previous issue with leaf nodes being skipped. This ensures that Literal nodes with bigint and regex values are properly processed.

However, note that this implements a hybrid traversal approach - parent nodes are processed before child nodes for non-array types, but array nodes are not processed directly.

For consistency, you might want to add a comment explaining this traversal strategy, especially since the code includes a note about it being duplicated in wrap.mjs:

function visitNode(node, fn) {
  if (!node) return;
  if (Array.isArray(node)) {
    for (const el of node) visitNode(el, fn);
    return;
  }
+  // Process node before its children (pre-order traversal)
  fn(node);

  const keys = visitorKeys[node.type];
  if (!keys) return;
  for (const key of keys) visitNode(node[key], fn);
}
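
To make the pre-order behaviour concrete, here is a small sketch that records the order in which node types are visited. The `visitorKeys` object and the hand-built AST are assumptions made for the illustration; only `visitNode` mirrors the function quoted above.

```javascript
// Hypothetical visitor keys covering only the node types used below.
// Types without child nodes (Identifier, Literal) have no entry at all.
const visitorKeys = {
  Program: ['body'],
  ExpressionStatement: ['expression'],
  CallExpression: ['callee', 'arguments'],
};

function visitNode(node, fn) {
  if (!node) return;
  if (Array.isArray(node)) {
    // Arrays are containers, not nodes: recurse into them without calling fn.
    for (const el of node) visitNode(el, fn);
    return;
  }
  // Process the node before its children (pre-order traversal).
  fn(node);

  const keys = visitorKeys[node.type];
  if (!keys) return; // leaf node: fn has already run, nothing more to visit
  for (const key of keys) visitNode(node[key], fn);
}

// Hand-built AST for `foo(/xyz/);`
const ast = {
  type: 'Program',
  body: [
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        callee: { type: 'Identifier', name: 'foo' },
        arguments: [{ type: 'Literal', value: null, regex: { pattern: 'xyz', flags: '' } }],
      },
    },
  ],
};

const visited = [];
visitNode(ast, (node) => visited.push(node.type));
// visited: Program, ExpressionStatement, CallExpression, Identifier, Literal
```

Leaf nodes such as `Literal` still reach `fn` even though they have no visitor-keys entry, because `fn(node)` runs before the keys lookup.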
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 794be76 and 0c9aaaf.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (5)
  • napi/.gitignore (1 hunks)
  • napi/parser/generate-visitor-keys.mjs (1 hunks)
  • napi/parser/package.json (3 hunks)
  • napi/parser/wrap.cjs (2 hunks)
  • napi/parser/wrap.mjs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • napi/.gitignore
  • napi/parser/package.json
  • napi/parser/generate-visitor-keys.mjs
  • napi/parser/wrap.mjs
⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: Test wasm32-wasip1-threads
  • GitHub Check: Test Linux
  • GitHub Check: Test NAPI
  • GitHub Check: Clippy
  • GitHub Check: Conformance
  • GitHub Check: Test VSCode
🔇 Additional comments (3)
napi/parser/wrap.cjs (3)

4-5: Appropriate addition of visitor keys import.

The import of the generated visitor keys aligns with the PR objective of using a visitor pattern instead of a JSON reviver function.


28-32: Good refactoring to two-step parsing approach.

The function now correctly parses the JSON first and then applies transformations through the visitor pattern, which is cleaner than using a reviver function.


34-49: Simplified transformation logic.

The transform function has been refactored to take a single node and mutate it directly rather than returning a transformed value, which is more efficient and aligns well with the visitor pattern approach.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
napi/parser/generate-visitor-keys.mjs (1)

5-6: Consider using fileURLToPath for more robust path resolution.

The script uses import.meta.dirname which might not be available in all Node.js versions. For better compatibility, consider using import.meta.url with fileURLToPath.

- import { join as pathJoin } from 'node:path';
+ import { join as pathJoin, dirname } from 'node:path';
+ import { fileURLToPath } from 'node:url';

- const PATH = pathJoin(import.meta.dirname, 'generated/visitor-keys.js');
+ const PATH = pathJoin(dirname(fileURLToPath(import.meta.url)), 'generated/visitor-keys.js');
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 0c9aaaf and 0b1cf2d.

⛔ Files ignored due to path filters (1)
  • napi/parser/generated/visitor-keys.js is excluded by !**/generated/**
📒 Files selected for processing (3)
  • napi/parser/generate-visitor-keys.mjs (1 hunks)
  • napi/parser/wrap.cjs (2 hunks)
  • napi/parser/wrap.mjs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • napi/parser/wrap.mjs
⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: Test wasm32-wasip1-threads
  • GitHub Check: Test Linux
  • GitHub Check: Test NAPI
  • GitHub Check: Clippy
  • GitHub Check: Test VSCode
  • GitHub Check: Conformance
🔇 Additional comments (10)
napi/parser/generate-visitor-keys.mjs (5)

1-4: LGTM: Proper imports for the script.

The script correctly imports the necessary dependencies: visitor keys from TypeScript ESLint, and Node.js built-in file system and path modules.


7-12: LGTM: Properly extending the visitor keys.

The code correctly extends the original visitor keys with additional entries for ParenthesizedExpression and TSParenthesizedType which are missing in the original TypeScript ESLint keys.


14-18: LGTM: Proper filtering and formatting of keys.

The code efficiently filters out empty visitor keys and formats the remaining ones into a string representation suitable for a JS module.


19-26: LGTM: Clear warning comment and proper module export format.

The warning comment clearly indicates that the file is auto-generated and should not be edited directly.


27-28: LGTM: File is properly written to the specified path.

The code correctly uses writeFileSync to write the generated code to the specified path.
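
The steps the review walks through for the generator script (extend, filter, format) can be sketched as below. This is an approximation under stated assumptions: `originalKeys` is a tiny stand-in for the keys exported by @typescript-eslint/visitor-keys, the child keys given for the two extra node types are assumed, the header comment wording is made up, and the sketch builds the module source as a string rather than writing it to disk.

```javascript
// Stand-in for the imported visitor keys (assumption: tiny illustrative subset).
const originalKeys = {
  Program: ['body'],
  ExpressionStatement: ['expression'],
  Literal: [], // no child nodes, so it should be filtered out below
};

// Add the two node types the thread says are missing from the TS-ESLint keys.
// The child key names here are assumptions for the sketch.
const extendedKeys = {
  ...originalKeys,
  ParenthesizedExpression: ['expression'],
  TSParenthesizedType: ['typeAnnotation'],
};

// Drop node types with no child keys, then format one object property per line.
const lines = Object.entries(extendedKeys)
  .filter(([, keys]) => keys.length > 0)
  .map(([type, keys]) => `  ${type}: [${keys.map((key) => JSON.stringify(key)).join(', ')}],`);

// Assemble the module source with a "do not edit" header (hypothetical wording).
const code = [
  '// Auto-generated file. Do not edit directly.',
  '',
  'export default {',
  ...lines,
  '};',
  '',
].join('\n');
```

Filtering out empty entries keeps the generated file small and lets the visitor treat a missing entry as "no children to visit".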

napi/parser/wrap.cjs (5)

4-5: LGTM: Properly importing the generated visitor keys.

The code correctly requires the generated visitor keys module.


28-32: LGTM: Improved JSON parsing with visitor pattern.

The function has been refactored to use a two-step process: first parse the JSON normally, then visit each node in the AST to apply transformations. This approach is more flexible and potentially more performant than using a JSON reviver function.


34-49: LGTM: Simplified transform function.

The transform function has been updated to directly mutate the node instead of returning transformed values. This is cleaner and more intuitive when used with the visitor pattern.


51-67: LGTM: Fixed visitor implementation.

The visitor implementation correctly processes the AST by:

  1. Handling null nodes and arrays
  2. Applying the transform function to each node (including leaf nodes)
  3. Recursively visiting child nodes according to the visitor keys

This implementation addresses the previously identified issue where leaf nodes like Literal were being skipped, which broke BigInt and RegExp hydration.


60-66:

❓ Verification inconclusive

Verify the order of operations in the visitor.

The current implementation applies the transform function before visiting child nodes (pre-order traversal). Make sure this is the intended behavior, as the opposite order (post-order traversal) might be needed in some cases.


🏁 Script executed:

#!/bin/bash
# Check for any test cases that might rely on specific traversal order
rg -A 5 -B 5 "BigInt|RegExp" napi/parser/wrap.test.js || echo "No test file found"

Length of output: 167


Confirm traversal order in visitNode (pre-order vs. post-order)

I didn’t find any existing tests that verify whether the transform function (fn(node)) should run before or after visiting child nodes. If you intended a pre-order traversal, no change is needed. Otherwise:

  • Consider swapping the order so children are visited first (post-order).
  • Add a targeted unit test to lock in the expected behavior for future refactors.

@overlookmotel
Member

I found the problem. Our AST has 2 node types which TS-ESLint doesn't:

  • ParenthesizedExpression
  • TSParenthesizedType

Have pushed a commit to fix, and also done a little refactoring to (in my opinion) make the code a bit clearer.

@ArnaudBarre
Contributor Author

Thanks for finding that! The changes look good to me!

@overlookmotel overlookmotel merged commit f85bda4 into oxc-project:main May 5, 2025
16 checks passed
Boshen pushed a commit that referenced this pull request May 5, 2025
…up visitor (#10813)

Follow-on after #10791, part of #10783.

In the fixup visitor, replace `visitNode` with a more specialized
`transformNode` function. The difference is that `transformNode` doesn't
need to be passed `fn` as 2nd param.

Also, move the check for whether a node is a `Literal` into
`transformNode` to avoid the cost of a function call for every node
(most of which are not `Literal`s).

Running benchmarks locally (`pnpm run bench` in `napi/parser`), I'm
seeing a 2%-3% speed-up.
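
A rough sketch of what such a specialized function might look like, assuming a hypothetical two-entry `visitorKeys` object; this is an illustration of the idea in the commit message, not the code from #10813:

```javascript
// Hypothetical visitor keys covering only the node types used below.
const visitorKeys = {
  Program: ['body'],
  ExpressionStatement: ['expression'],
};

// Specialized walker: no `fn` parameter, and the `Literal` check is inlined,
// so non-Literal nodes never pay for an extra function call.
function transformNode(node) {
  if (!node) return;
  if (Array.isArray(node)) {
    for (const el of node) transformNode(el);
    return;
  }

  if (node.type === 'Literal') {
    if (node.bigint) {
      node.value = BigInt(node.bigint);
    }
    if (node.regex) {
      try {
        node.value = new RegExp(node.regex.pattern, node.regex.flags);
      } catch (_err) {
        // Invalid regexp, or syntax not supported by this JS engine: leave value as-is.
      }
    }
    return; // Literals have no child nodes
  }

  const keys = visitorKeys[node.type];
  if (!keys) return;
  for (const key of keys) transformNode(node[key]);
}

// Hand-built AST for `123n; /xyz/;`
const program = {
  type: 'Program',
  body: [
    { type: 'ExpressionStatement', expression: { type: 'Literal', value: null, bigint: '123' } },
    {
      type: 'ExpressionStatement',
      expression: { type: 'Literal', value: null, regex: { pattern: 'xyz', flags: '' } },
    },
  ],
};

transformNode(program);
```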
graphite-app bot pushed a commit that referenced this pull request May 6, 2025
#10791 added 2 files `generated/visitor-keys.cjs` and `visitor-keys.mjs` to the NAPI package. When I amended the original PR, I forgot to update `package.json` to include them both in NPM package.
graphite-app bot pushed a commit that referenced this pull request May 6, 2025
#10791 fixed the performance of `oxc-parser` NPM package by removing the reviver from `JSON.parse` call. However, it replaced it with a complete traversal of the AST on JS side to locate `Literal` nodes and update them.

Instead, generate a list of paths to `Literal` nodes needing fixing on Rust side in `ESTree` serializer.

e.g. for this program:

```js
123n;
foo(/xyz/);
```

the fix paths are:

```json
[
    ["body", 0, "expression"],
    ["body", 1, "expression", "arguments", 0]
]
```

Having the location of nodes which need updating reduces work on JS side, as no need to search the entire AST.

Running `pnpm run bench` (in `napi/parser`) locally shows between 8% and 15% speed-up from this change.
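
On the JS side, consuming such fix paths might look roughly like the sketch below. The function name `applyFixes`, the hand-built AST, and the way the paths are passed in are all assumptions for illustration; the commit message does not show the actual implementation.

```javascript
// Follow each path from the root to a `Literal` node and hydrate it in place.
// `applyFixes` is a hypothetical name, not taken from the oxc-parser source.
function applyFixes(program, fixPaths) {
  for (const path of fixPaths) {
    let node = program;
    for (const key of path) node = node[key];

    if (node.bigint) {
      node.value = BigInt(node.bigint);
    } else if (node.regex) {
      try {
        node.value = new RegExp(node.regex.pattern, node.regex.flags);
      } catch (_err) {
        // Invalid regexp, or syntax not supported by this JS engine: leave value as-is.
      }
    }
  }
}

// Hand-built AST for `123n; foo(/xyz/);`
const program = {
  type: 'Program',
  body: [
    { type: 'ExpressionStatement', expression: { type: 'Literal', value: null, bigint: '123' } },
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        callee: { type: 'Identifier', name: 'foo' },
        arguments: [{ type: 'Literal', value: null, regex: { pattern: 'xyz', flags: '' } }],
      },
    },
  ],
};

// Paths to the two literals in the hand-built AST above.
const fixPaths = [
  ['body', 0, 'expression'],
  ['body', 1, 'expression', 'arguments', 0],
];

applyFixes(program, fixPaths);
```

The work done here is proportional to the number of literals needing a fix times their depth, rather than to the size of the whole AST.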
Labels

C-performance Category - Solution not expected to change functional behavior, only performance

Development

Successfully merging this pull request may close these issues.

perf(napi/parser): json reviver is slow

2 participants