perf(linter/plugins): deserialize comments without AST #20364
Pull request overview
This PR optimizes comment deserialization by removing the dependency on the full AST for accessing the hashbang. Instead, the Rust side now inserts the hashbang as a Line comment at position 0 in the comments vector, and the JS side detects and re-labels it as a Shebang comment by checking if the first comment starts at offset 0 and the source begins with #!.
Changes:
- Rust: Insert hashbang as a `Line` comment at the start of the comments `Vec` in both the linter and NAPI parser paths
- JS: Remove AST dependency from `initComments()`, detect shebang by inspecting the first comment instead
- Add `oxc_ast` dependency to `apps/oxlint` crate
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| crates/oxc_linter/src/lib.rs | Insert hashbang as Line comment at index 0 in both single-alloc and dual-alloc paths |
| apps/oxlint/src/js_plugins/parse.rs | Insert hashbang as Line comment at index 0 in NAPI parse path |
| apps/oxlint/src-js/plugins/comments.ts | Remove AST dependency; detect shebang from first comment + source text |
| apps/oxlint/Cargo.toml | Add oxc_ast workspace dependency |
| Cargo.lock | Lock file update |
Just for the record, this is incorrect. This change only affects Oxlint.
Previously the entire AST had to be deserialized when calling any comments method. This was purely to get the `hashbang` property of `Program`, which has to be added to the start of the comments array as a `Shebang` comment.

Usually the AST has to be deserialized anyway in order to visit it, but some rules might exit early and return an empty visitor after checking whether a certain comment is present. In those cases, it was extremely wasteful to deserialize the whole AST just to check for `hashbang`.

Instead, add a `Line` comment to the start of `Vec<Comment>` on the Rust side if a hashbang is present. If there are a lot of comments, this involves shifting them all up, which could be expensive, but files with hashbangs are rare, and this change enables other optimizations coming in future PRs that rely on all comments being present in the buffer.
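As a rough illustration of the scheme described above, here is a minimal TypeScript sketch. It is not the code from this PR: the `Comment` shape and the function names `withHashbangComment` and `relabelShebang` are hypothetical, and the first function only stands in for the Rust side (which takes the hashbang from the parser rather than re-scanning the source text).

```typescript
type CommentType = "Line" | "Block" | "Shebang";

// Hypothetical comment shape, for illustration only.
interface Comment {
  type: CommentType;
  value: string;
  start: number;
  end: number;
}

// Stand-in for the Rust side (assumption: the hashbang is found here by
// scanning the source; the real code gets it from the parser).
// If the file starts with a hashbang, prepend it as a `Line` comment.
// Prepending shifts every existing comment up by one slot, which is O(n).
function withHashbangComment(sourceText: string, comments: Comment[]): Comment[] {
  if (!sourceText.startsWith("#!")) return comments;
  const newline = sourceText.search(/[\r\n\u2028\u2029]/);
  const end = newline === -1 ? sourceText.length : newline;
  return [{ type: "Line", value: sourceText.slice(2, end), start: 0, end }, ...comments];
}

// JS side: no AST needed. A first comment starting at offset 0 in a source
// that begins with `#!` can only be the hashbang, so re-label it.
function relabelShebang(sourceText: string, comments: Comment[]): void {
  const first = comments[0];
  if (first !== undefined && first.start === 0 && sourceText.startsWith("#!")) {
    first.type = "Shebang";
  }
}

const source = "#!/usr/bin/env node\n// hi\n";
const comments = withHashbangComment(source, [
  { type: "Line", value: " hi", start: 20, end: 25 },
]);
relabelShebang(source, comments);
console.log(comments.map((comment) => comment.type));
```

The check on the source text matters because an ordinary `//` comment can also start at offset 0; only a source beginning with `#!` has a hashbang there.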
#20364 removed the internal reliance on the `hashbang` property being present on `Program`. Now we can remove this field. It's not part of the ESTree standard. We include it in `oxc-parser` as a non-standard extension as it can be useful, but in the case of Oxlint, rules written with the alternative `createOnce` API are meant to also be compatible with ESLint, so we can't provide APIs which ESLint doesn't support.
# Oxlint

### 🚀 Features

- c95951f linter/plugins: Implement `sourceCode.markVariableAsUsed` (#20357) (overlookmotel)
- 7a2a7d0 linter: Implement `n/handle-callback-err` rule (#19616) (Mikhail Baev)

### 🐛 Bug Fixes

- f8fbd6e linter/plugins: Remove `hashbang` property from AST (#20365) (overlookmotel)
- 6eb5b01 linter/prefer-await-to-then: Ignore Promise static methods (#20347) (camc314)
- a4b61f7 linter: Remove `defineConfig` check (#20308) (camc314)
- 3ad7f53 linter/explicit-module-boundary-types: False positive with satisfies expr (#20309) (camc314)
- f547401 linter/no-unused-private-class-members: Treat switch discriminants as read (#20307) (camc314)
- 1c07b3b diagnostics: Handle `WouldBlock` in stdout writes to prevent panic (#20295) (Boshen)

### ⚡ Performance

- e4f7248 linter: Remove unnecessary clone of owned String in drain loop (#20388) (Boshen)
- 4a67f1d linter: Eliminate Vec allocation in disable directive matching (#20387) (Boshen)
- 618a598 linter/plugins: Add fast path for files with no comments (#20366) (overlookmotel)
- b0125c5 linter/plugins: Deserialize comments without AST (#20364) (overlookmotel)
- 9cd612f linter/plugins: Recycle comment objects (#20362) (overlookmotel)
- bf442f8 linter/plugins: Cheaper `Token` creation (#20360) (overlookmotel)
- 5474d0a semantic: V8-style walk-up reference resolution (#20292) (Boshen)
- 7946eba linter/plugins: Avoid arguments spread and temp array when merging (#20318) (overlookmotel)
- fc7cf8a linter/plugins: Pre-define less CFG merger functions (#20317) (overlookmotel)
- 3b9eb28 linter/plugins: Streamline getting/creating visit fn mergers (#20319) (overlookmotel)
- f04e850 linter/plugins: Inline binary search functions into call sites (#20312) (overlookmotel)
- fe24afe linter/plugins: Apply replace globals TSDown plugin to JS files (#20305) (overlookmotel)
- 77cdacc linter/plugins: Use array buffer views for tokens (#20301) (overlookmotel)
- 910c941 linter/plugins: Reorder branches in `getTokenByRangeStart` (#20296) (overlookmotel)
- af7674c linter/tokens: Avoid extra token value allocation (#20013) (camc314)

### 📚 Documentation

- 24490b5 linter: Improve formatting for 80ish rules' docs. (#20411) (connorshea)
- 3383523 linter: Improve `--tsconfig` flag docs (#20342) (camc314)

# Oxfmt

### 🚀 Features

- d22c443 oxfmt: Export `OxfmtConfig` type (#20275) (leaysgur)
- a11ecff oxfmt/lsp: Respect `angular` language id as `.component.html` file (#20242) (Sysix)

### 🐛 Bug Fixes

- ce65099 formatter: Preserve parentheses around as expression before private field access (#20419) (bab)
- f908742 oxfmt: Revert #20326 partially (#20413) (leaysgur)
- 4ef93ea formatter: Honor trailing ignore comments after list separators (#19925) (Andreas Lubbe)
- 68fb0d0 oxfmt: Skip vite.config.ts which fails to import (#20326) (leaysgur)
- 88ee826 oxfmt: Handle literalline for script-in-vue (#20130) (leaysgur)
- 1c07b3b diagnostics: Handle `WouldBlock` in stdout writes to prevent panic (#20295) (Boshen)

Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>
