
perf(linter): batch plugins into single visitor with anchor-kind dispatch #9184

Merged
dyc3 merged 15 commits into biomejs:main from chocky335:perf/batch-visitor-anchor on Feb 23, 2026

Conversation

Contributor

@chocky335 chocky335 commented Feb 21, 2026

AI assistance was used during development.

Summary

Improve GritQL plugin performance with two optimizations:

1. BatchPluginVisitor — Instead of registering N separate PluginVisitor instances (one per plugin), a single BatchPluginVisitor holds all plugins and dispatches using FxHashMap<Kind, Vec<usize>> for O(1) per-node lookup.

2. Anchor-kind dispatch — GritQuery::execute_optimized() extracts the target syntax kinds from the compiled pattern tree and executes the inner Bubble pattern only at matching nodes, replacing the full Contains tree walk.
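The kind-indexed dispatch in (1) can be sketched as follows. This is a minimal illustration, not Biome's actual code: BatchDispatch and the numeric Kind alias are hypothetical names, and std's HashMap stands in for the FxHashMap the PR uses.

```rust
use std::collections::HashMap;

// Stand-in for a language's SyntaxKind discriminant.
type Kind = u16;

struct BatchDispatch {
    // kind -> indices of the plugins interested in that kind
    by_kind: HashMap<Kind, Vec<usize>>,
}

impl BatchDispatch {
    fn new(plugin_kinds: &[Vec<Kind>]) -> Self {
        let mut by_kind: HashMap<Kind, Vec<usize>> = HashMap::new();
        for (idx, kinds) in plugin_kinds.iter().enumerate() {
            for &kind in kinds {
                let entry = by_kind.entry(kind).or_default();
                // Deduplicate so a plugin never runs twice on one node.
                if entry.last() != Some(&idx) {
                    entry.push(idx);
                }
            }
        }
        Self { by_kind }
    }

    // One hash lookup per node instead of iterating all N plugins.
    fn plugins_for(&self, kind: Kind) -> &[usize] {
        self.by_kind.get(&kind).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

With this shape, a node whose kind no plugin cares about costs a single failed hash lookup, which is what turns the per-node cost from O(N plugins) into O(1).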

Changes

  • BatchPluginVisitor in biome_analyze — batched visitor with kind-to-plugin index map. Deduplicates plugin indices per kind to prevent double evaluation.
  • applies_to_file() on AnalyzerPlugin trait (default true) — enables future file-scoping support.
  • execute_optimized() on GritQuery — anchor-kind filtered execution with safe fallback to execute() when pattern structure can't be statically analyzed.
  • extract_anchor_kinds() — navigates the auto-wrapped pattern tree to extract target syntax kinds. Handles Not (returns []), Or/Any with universal branches (returns []), and Where/Predicate::Match chains.
  • All three analyzer crates switched from per-plugin PluginVisitor loop to single BatchPluginVisitor.
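The anchor-kind filtering performed by execute_optimized() can be illustrated with a small sketch. The names and the Option<&[Kind]> encoding are hypothetical simplifications: None stands for "extraction failed, fall back to the full walk that execute() does", and an empty kind list stands for patterns (Not, Or/Any with universal branches) that can match any kind.

```rust
type Kind = u16;

// Decide which node indices the inner Bubble pattern must run at.
// None: anchor extraction failed, so every node is a candidate.
// Some([]): the pattern can match any kind, so again every node.
// Some(kinds): only nodes whose kind is one of the anchors.
fn candidate_nodes(anchor_kinds: Option<&[Kind]>, node_kinds: &[Kind]) -> Vec<usize> {
    match anchor_kinds {
        None | Some([]) => (0..node_kinds.len()).collect(),
        Some(kinds) => node_kinds
            .iter()
            .enumerate()
            .filter(|&(_, k)| kinds.contains(k))
            .map(|(i, _)| i)
            .collect(),
    }
}
```

The safety property the PR's test plan relies on is visible here: whenever the anchors cannot be determined, the candidate set degrades to "all nodes", which is exactly the behaviour of the unoptimized path.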

Benchmarks

Tested on a real project (1524 files, 84 GritQL plugins, biome lint ., release builds, best of 3 runs):

Configuration                 Time    Plugin overhead
No plugins (baseline)         200ms   -
84 plugins, before this PR    26.4s   26.2s
84 plugins, after this PR     3.5s    3.3s

~7.9x faster plugin execution, reducing plugin overhead from 26.2s to 3.3s (total run time: 26.4s to 3.5s, ~7.5x).

Related

Test Plan

  • All biome_grit_patterns tests pass (52 total: 40 existing + 12 new)
  • All biome_plugin_loader tests pass (9/9)
  • cargo clippy clean on all affected crates
  • cargo fmt clean
  • New tests cover: extract_anchor_kinds (6 tests), extract_contains_inner (2 tests), execute_optimized equivalence (4 tests)
  • execute_optimized falls back to execute() when anchor extraction fails — safe by default

Docs

Internal performance improvement — no user-facing API changes.


changeset-bot bot commented Feb 21, 2026

🦋 Changeset detected

Latest commit: dd8c139

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 13 packages
Name                            Type
@biomejs/biome                  Patch
@biomejs/cli-win32-x64          Patch
@biomejs/cli-win32-arm64        Patch
@biomejs/cli-darwin-x64         Patch
@biomejs/cli-darwin-arm64       Patch
@biomejs/cli-linux-x64          Patch
@biomejs/cli-linux-arm64        Patch
@biomejs/cli-linux-x64-musl     Patch
@biomejs/cli-linux-arm64-musl   Patch
@biomejs/wasm-web               Patch
@biomejs/wasm-bundler           Patch
@biomejs/wasm-nodejs            Patch
@biomejs/backend-jsonrpc        Patch


@github-actions github-actions bot added labels on Feb 21, 2026: A-Linter (Area: linter), L-JavaScript (Language: JavaScript and super languages), L-CSS (Language: CSS), L-JSON (Language: JSON and super languages)

coderabbitai bot commented Feb 21, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

This PR introduces a public BatchPluginVisitor that batches analyzer plugins, building a kind-to-plugins index to reduce per-node dispatch from O(N) to O(1) and enforces per-plugin file applicability. Multiple analyze crates (CSS, JS, JSON, etc.) are updated to register a single batch visitor per language instead of per-plugin visitors. Separately, GritQL gains anchor-kind extraction and an execute_optimized path with tests and Criterion benches; a GitHub Actions workflow for running the Grit benchmarks was also added.

Suggested reviewers

  • ematipico
🚥 Pre-merge checks: 2 passed

Check name          Status     Explanation
Title check         ✅ Passed   The title accurately summarises the main change: batching plugins into a single visitor with anchor-kind dispatch for performance optimisation.
Description check   ✅ Passed   The description comprehensively explains both optimisations, architectural changes across analyzer crates, testing coverage, and benchmarks demonstrating ~7.6× speedup.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
crates/biome_analyze/src/analyzer_plugin.rs (2)

227-255: Consider caching applies_to_file results per analysis run.

applies_to_file is invoked on every matching node for every relevant plugin, but file_path is constant for the entire analysis run. Once real implementations return false, this becomes redundant work on a hot path.

You could pre-compute an applicable bitmap (or filtered index list) on the first visit call, or — even better — accept the file path in new_unchecked and filter out non-applicable plugins at construction time.

Not urgent since the default is true today, but worth noting for when file-scoping ships.
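The reviewer's pre-filtering suggestion can be sketched like this. It is an illustration under assumptions, not Biome's code: the Plugin trait mirrors the PR's applies_to_file hook (default true), while CssOnly, Always, and applicable_indices are hypothetical names.

```rust
use std::path::Path;

// Mirrors the shape of AnalyzerPlugin::applies_to_file in the PR:
// plugins apply to every file unless they override the default.
trait Plugin {
    fn applies_to_file(&self, _path: &Path) -> bool {
        true
    }
}

// Illustrative plugin that only applies to CSS files.
struct CssOnly;
impl Plugin for CssOnly {
    fn applies_to_file(&self, path: &Path) -> bool {
        path.extension().is_some_and(|ext| ext == "css")
    }
}

// Illustrative plugin relying on the default (applies everywhere).
struct Always;
impl Plugin for Always {}

// Evaluate applicability once per file, then reuse the filtered index
// list for every visited node instead of re-checking in the hot loop.
fn applicable_indices(plugins: &[Box<dyn Plugin>], path: &Path) -> Vec<usize> {
    plugins
        .iter()
        .enumerate()
        .filter(|(_, p)| p.applies_to_file(path))
        .map(|(i, _)| i)
        .collect()
}
```

Since the file path is constant for an entire analysis run, computing this list once per file keeps the per-node dispatch loop free of redundant applies_to_file calls.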


234-237: Profiling label "plugin" is identical for all plugins — consider using the plugin name.

Every plugin in the batch is timed under the same "plugin" label, making it impossible to distinguish per-plugin costs in profiling output. Passing the actual plugin name (if available) would make performance investigations much easier.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.changeset/batch-plugin-visitor.md:
- Line 2: Update the changeset entry that currently reads '"@biomejs/biome":
patch' to use the correct change type for a next-targeted PR by replacing patch
with minor; edit the .changeset/batch-plugin-visitor.md content so the line
reads '"@biomejs/biome": minor' to reflect that this is a non-breaking feature
targeting the next branch.


@chocky335 chocky335 changed the base branch from next to main February 21, 2026 20:43
@chocky335 chocky335 force-pushed the perf/batch-visitor-anchor branch from 4a82ad3 to b6d6bce on February 21, 2026 20:44
@chocky335 chocky335 changed the base branch from main to next February 21, 2026 20:44
@chocky335 chocky335 force-pushed the perf/batch-visitor-anchor branch 2 times, most recently from 896420d to 56158f6 on February 21, 2026 20:47
@chocky335 chocky335 changed the base branch from next to main February 21, 2026 20:47
@github-actions github-actions bot added labels on Feb 21, 2026: A-Parser (Area: parser), A-Formatter (Area: formatter), A-Tooling (Area: internal tools), L-HTML (Language: HTML and super languages), L-Grit (Language: GritQL), L-Tailwind (Language: Tailwind CSS)
@chocky335 chocky335 force-pushed the perf/batch-visitor-anchor branch from 56158f6 to 23489db on February 21, 2026 20:48
@github-actions github-actions bot removed labels on Feb 21, 2026: A-Parser (Area: parser), A-Formatter (Area: formatter), A-Tooling (Area: internal tools), L-HTML (Language: HTML and super languages), L-Grit (Language: GritQL), L-Tailwind (Language: Tailwind CSS)

@coderabbitai coderabbitai bot left a comment


🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In @.changeset/batch-plugin-visitor.md:
- Line 2: The changeset currently uses the entry string '"@biomejs/biome":
patch' but this PR targets the next branch so the change type must be 'minor';
update the changeset line in .changeset/batch-plugin-visitor.md to use
'"@biomejs/biome": minor' (preserve surrounding formatting) so the release
tooling treats this as a minor change for the next branch.

@chocky335 chocky335 force-pushed the perf/batch-visitor-anchor branch 2 times, most recently from 9e8b18f to dbf45e8 on February 21, 2026 21:03

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
crates/biome_grit_patterns/src/lib.rs (1)

24-24: Consider hiding the testing module from public docs.

This module is pub so benches can access it (they compile as external crates), which is fine. But it leaks into the crate's public API documentation. A #[doc(hidden)] would keep it accessible to benches while signalling it's not for downstream consumers.

Suggested change
-pub mod testing;
+#[doc(hidden)]
+pub mod testing;


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
crates/biome_grit_patterns/src/lib.rs (1)

24-25: Feature gate the testing module to reduce production build artefacts.

Benchmarks correctly import biome_grit_patterns::testing from benches/grit_query.rs, which is why #[cfg(test)] alone won't work (benches run outside #[cfg(test)]). However, exposing test helpers unconditionally in all builds is unnecessary.

A testing Cargo feature cleanly gates this:

+#[cfg(any(test, feature = "testing"))]
 #[doc(hidden)]
 pub mod testing;

Then in Cargo.toml:

[features]
testing = []

[[bench]]
harness = false
name = "grit_query"
required-features = ["testing"]

This keeps the intent clear whilst avoiding dead code in production builds.


@chocky335 chocky335 force-pushed the perf/batch-visitor-anchor branch from 3848262 to 8a896f9 on February 22, 2026 17:23
@chocky335
Contributor Author

@coderabbitai resume


coderabbitai bot commented Feb 22, 2026

✅ Actions performed

Reviews resumed.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/biome_grit_patterns/Cargo.toml`:
- Line 47: Replace the wildcard dependency for codspeed-criterion-compat with a
pinned version to ensure reproducible builds: change the criterion dependency
line that currently reads `criterion = { package = "codspeed-criterion-compat",
version = "*" }` to use a concrete version (e.g. `version = "=3.0.5"`) matching
the other crate (`biome_html_analyze`) so the package field
(`"codspeed-criterion-compat"`) remains the same but the version is pinned.

@chocky335 chocky335 requested a review from dyc3 February 22, 2026 17:59

@coderabbitai coderabbitai bot left a comment


🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@crates/biome_grit_patterns/Cargo.toml`:
- Line 47: No change required: keep the dev-dependency line criterion = {
package = "codspeed-criterion-compat", version = "*" } as-is (this matches the
repo-wide convention), so do not modify the version or replace the wildcard in
the Cargo.toml entry or related Cargo config.

Member

ematipico commented Feb 23, 2026

@chocky335 In fact, we broke the benchmarks. I believe it's best to create a new workflow like we did for other files. Grit plugins don't depend on changes to the JavaScript language. It should be very easy to create a new file. And we should include all Grit-related crates in the paths that trigger the workflow.

@chocky335
Contributor Author

Re: Cargo.toml feature gate — Removed the testing feature and required-features. Not needed, other benchmark crates don't use feature gates.

Re: broken benchmarks / new workflow — Created a separate benchmark_grit.yml with all 5 grit crates in the trigger paths. Reverted benchmark_js.yml to its original state (only change from main is the CodSpeed version bump).


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
.github/workflows/benchmark_grit.yml (1)

11-30: Path filters include crates that aren't benchmarked, triggering expensive 70-minute CI runs unnecessarily.

The matrix only benchmarks biome_grit_patterns, yet biome_grit_factory, biome_grit_formatter, biome_grit_parser, and biome_grit_syntax are all in the trigger paths. A change to the formatter/parser would spin up the ARM runner, compile for 20 minutes, and run benchmarks for up to 50 minutes — all for no useful signal.

✂️ Proposed trim
     paths:
       - 'Cargo.lock'
-      - 'crates/biome_grit_factory/**/*.rs'
-      - 'crates/biome_grit_formatter/**/*.rs'
-      - 'crates/biome_grit_parser/**/*.rs'
       - 'crates/biome_grit_patterns/**/*.rs'
-      - 'crates/biome_grit_syntax/**/*.rs'
       - 'crates/biome_rowan/**/*.rs'

Apply the same trim to the push paths block (lines 24–30).


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fc4ce0c and 12c20a3.

📒 Files selected for processing (3)
  • .github/workflows/benchmark_grit.yml
  • crates/biome_grit_patterns/Cargo.toml
  • crates/biome_grit_patterns/src/lib.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/biome_grit_patterns/Cargo.toml


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (4)
.github/workflows/benchmark_grit.yml (1)

60-72: cargo codspeed run isn't scoped to ${{ matrix.package }}.

The compile step correctly pins to the matrix package, but the run step doesn't. Today this is harmless (single-entry matrix), but if the matrix ever grows, each job would re-run all previously compiled benchmarks rather than only its own. A tiny consistency fix for future-proofing:

🔧 Proposed fix
       - name: Run the benchmarks
         uses: CodSpeedHQ/action@4deb3275dd364fb96fb074c953133d29ec96f80f # v4.10.6
         timeout-minutes: 50
         with:
           mode: simulation
-          run: cargo codspeed run
+          run: cargo codspeed run -p ${{ matrix.package }}
           token: ${{ secrets.CODSPEED_TOKEN }}
crates/biome_grit_patterns/benches/grit_query.rs (3)

2-2: criterion::black_box is deprecated — use std::hint::black_box instead.

criterion::black_box carries #[deprecated(note = "use std::hint::black_box() instead")], which will produce a compiler warning at build time.

🔧 Proposed fix
-use criterion::{BenchmarkId, Criterion, black_box, criterion_group, criterion_main};
+use criterion::{BenchmarkId, Criterion, criterion_group, criterion_main};
+use std::hint::black_box;

8-17: Incomplete coverage for musl Linux on non-aarch64 architectures.

The Jemalloc guard (Line 10) excludes all musl environments, yet the System allocator fallback at Line 15 only covers musl + linux + aarch64. A musl Linux x86_64 runner ends up with no #[global_allocator] attribute — which is harmless (Rust defaults to the system allocator), but the gap is presumably unintentional. One option is to cover all musl Linux:

🔧 Proposed fix
-#[cfg(all(target_env = "musl", target_os = "linux", target_arch = "aarch64"))]
+#[cfg(all(target_env = "musl", target_os = "linux"))]
 #[global_allocator]
 static GLOBAL: std::alloc::System = std::alloc::System;

56-74: base_file.clone() inside b.iter() inflates measured time.

If the routine requires some per-iteration setup that shouldn't be timed, iter_batched or iter_batched_ref should be used. Both execute and execute_optimized carry the same clone overhead so their relative comparison is unaffected, but absolute numbers are overstated. Worth fixing so the reported µs accurately reflects query execution alone.

🔧 Proposed fix using iter_batched
-        group.bench_with_input(BenchmarkId::new("execute", name), pattern_src, |b, _| {
-            b.iter(|| {
-                black_box(query.execute(base_file.clone()).unwrap());
-            });
-        });
-
-        group.bench_with_input(
-            BenchmarkId::new("execute_optimized", name),
-            pattern_src,
-            |b, _| {
-                b.iter(|| {
-                    black_box(query.execute_optimized(base_file.clone()).unwrap());
-                });
-            },
-        );
+        group.bench_with_input(BenchmarkId::new("execute", name), pattern_src, |b, _| {
+            b.iter_batched(
+                || base_file.clone(),
+                |f| black_box(query.execute(f).unwrap()),
+                criterion::BatchSize::SmallInput,
+            );
+        });
+
+        group.bench_with_input(
+            BenchmarkId::new("execute_optimized", name),
+            pattern_src,
+            |b, _| {
+                b.iter_batched(
+                    || base_file.clone(),
+                    |f| black_box(query.execute_optimized(f).unwrap()),
+                    criterion::BatchSize::SmallInput,
+                );
+            },
+        );

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 12c20a3 and e62b371.

📒 Files selected for processing (2)
  • .github/workflows/benchmark_grit.yml
  • crates/biome_grit_patterns/benches/grit_query.rs


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
crates/biome_grit_patterns/benches/grit_query.rs (2)

63-77: base_file.clone() is inside the timed loop — use iter_batched instead.

Because execute and execute_optimized consume the file, you're forced to clone it per-iteration. Putting that clone inside b.iter() includes the AST-clone cost in every measurement. When a routine requires per-iteration setup that shouldn't be timed, iter_batched (or iter_batched_ref) is the right loop. The absolute numbers reported in the PR description (87 µs vs 350–418 µs) could be inflated by the clone.

♻️ Proposed fix
-        group.bench_with_input(BenchmarkId::new("execute", name), pattern_src, |b, _| {
-            b.iter(|| {
-                black_box(query.execute(base_file.clone()).unwrap());
-            });
-        });
-
-        group.bench_with_input(
-            BenchmarkId::new("execute_optimized", name),
-            pattern_src,
-            |b, _| {
-                b.iter(|| {
-                    black_box(query.execute_optimized(base_file.clone()).unwrap());
-                });
-            },
-        );
+        group.bench_with_input(BenchmarkId::new("execute", name), pattern_src, |b, _| {
+            b.iter_batched(
+                || base_file.clone(),
+                |file| black_box(query.execute(file).unwrap()),
+                criterion::BatchSize::SmallInput,
+            );
+        });
+
+        group.bench_with_input(
+            BenchmarkId::new("execute_optimized", name),
+            pattern_src,
+            |b, _| {
+                b.iter_batched(
+                    || base_file.clone(),
+                    |file| black_box(query.execute_optimized(file).unwrap()),
+                    criterion::BatchSize::SmallInput,
+                );
+            },
+        );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_grit_patterns/benches/grit_query.rs` around lines 63-77, the
benchmark currently clones base_file inside the timed closure (in
group.bench_with_input calling b.iter) so the AST clone is measured; change both
benchmarks (the BenchmarkId "execute" and "execute_optimized" blocks that call
query.execute and query.execute_optimized) to use b.iter_batched (or
iter_batched_ref) with a setup closure that clones base_file and returns the
cloned file, and a measurement closure that calls
black_box(query.execute(cloned_file).unwrap()) /
black_box(query.execute_optimized(cloned_file).unwrap()); this moves
base_file.clone() into the setup phase and excludes the clone cost from timing.
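For reference, the separation the prompt asks for can be sketched without Criterion at all. This is not Criterion's API, just the same idea: the setup closure runs outside the timed region and only the routine is measured. `time_batched` and the stand-in closures are hypothetical names.

```rust
use std::time::{Duration, Instant};

// Hypothetical helper mimicking what iter_batched does: `setup`
// (e.g. base_file.clone()) is excluded from timing, while `routine`
// (e.g. query.execute(file)) consumes the setup output and is timed.
fn time_batched<T>(
    iterations: usize,
    mut setup: impl FnMut() -> T,
    mut routine: impl FnMut(T),
) -> Duration {
    let mut timed = Duration::ZERO;
    for _ in 0..iterations {
        let input = setup(); // runs outside the timed region
        let start = Instant::now();
        routine(input); // only this is measured
        timed += start.elapsed();
    }
    timed
}

fn main() {
    let mut setups = 0;
    let mut runs = 0;
    let elapsed = time_batched(
        10,
        || {
            setups += 1;
            vec![0u8; 1024] // stand-in for base_file.clone()
        },
        |buf| {
            runs += 1;
            std::hint::black_box(buf); // stand-in for query.execute(file)
        },
    );
    println!("setups={setups} runs={runs} timed={}ns", elapsed.as_nanos());
}
```

Criterion's real `iter_batched` additionally amortizes setup across batches (controlled by `BatchSize`) and keeps drops of the setup values out of the measurement, but the split between untimed setup and timed routine is the same.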

15-17: Optional: widen the musl System allocator guard to cover all architectures.

The condition musl AND linux AND aarch64 silently falls through to Rust's default system allocator on x86_64-unknown-linux-musl (and other musl non-aarch64 targets) anyway, so functionally it's fine. But being explicit here avoids anyone scratching their head later wondering whether a musl x86 build gets jemalloc:

✨ Proposed tidy-up
-#[cfg(all(target_env = "musl", target_os = "linux", target_arch = "aarch64"))]
+#[cfg(all(target_env = "musl", target_os = "linux"))]
 #[global_allocator]
 static GLOBAL: std::alloc::System = std::alloc::System;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_grit_patterns/benches/grit_query.rs` around lines 15-17, the
cfg guard on the global allocator is too narrow (only aarch64) causing musl
builds on other arches to fall through; broaden the condition by removing the
architecture check so the attribute applies to all musl Linux targets: update
the #[cfg(all(target_env = "musl", target_os = "linux", target_arch =
"aarch64"))] to #[cfg(all(target_env = "musl", target_os = "linux"))] for the
#[global_allocator] static GLOBAL: std::alloc::System declaration so musl x86_64
and other musl targets use the intended System allocator.
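As a sanity check on the widened guard: `cfg!(...)` evaluates the same predicates as `#[cfg(...)]` at compile time, so a standalone sketch (runnable on any target) can assert that the narrow musl+linux+aarch64 condition implies the widened musl+linux one, i.e. widening only ever adds targets.

```rust
fn main() {
    // cfg!(...) evaluates the same predicates as #[cfg(...)], at compile time.
    let wide = cfg!(all(target_env = "musl", target_os = "linux"));
    let narrow = cfg!(all(
        target_env = "musl",
        target_os = "linux",
        target_arch = "aarch64"
    ));
    // Every target matched by the narrow guard is matched by the wide one,
    // so the proposed change can only add targets such as
    // x86_64-unknown-linux-musl; it can never drop one.
    assert!(!narrow || wide);
    println!("wide={wide} narrow={narrow}");
}
```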

ℹ️ Review info


📥 Commits

Reviewing files that changed from the base of the PR and between e62b371 and dd8c139.

📒 Files selected for processing (1)
  • crates/biome_grit_patterns/benches/grit_query.rs

@chocky335
Contributor Author

chocky335 commented Feb 23, 2026

Dismissing nitpicks from reviews #3841092930 and
#3841581928 — all match established biome patterns, no changes needed:

  1. cargo codspeed run without -p: all 10 Biome benchmark workflows (js, css, json, graphql, html, etc.) use this pattern. codspeed run only executes what was compiled with -p in the build step.

  2. criterion::black_box: Biome uses codspeed-criterion-compat v3.0.5, not the real criterion crate. It aliases codspeed-criterion-compat-walltime, which has no deprecation on black_box. All 19 benchmark files use criterion::black_box; none use std::hint::black_box.

  3. Musl allocator cfg with target_arch = "aarch64": all 7 benchmarks use the same cfg. Jemalloc only fails on musl+aarch64 and works fine on musl+x86_64.

  4. b.iter() with clone: GritTargetFile clone is lightweight. Both execute and execute_optimized incur the same clone overhead, so the relative comparison is unaffected. Other Biome benchmarks use b.iter() with cheap clones; iter_batched is reserved for expensive setup like cache creation + parsing.
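The shared-overhead argument in point 4 can be sketched numerically. Using the 87 µs vs 350 µs figures quoted earlier in the review and a hypothetical 5 µs per-iteration clone cost, the ordering of the two variants survives, but the reported speedup shrinks:

```rust
fn main() {
    // Routine-only times from the review (µs); clone_cost is a hypothetical
    // constant added equally to both variants when cloning inside b.iter().
    let (execute, execute_optimized, clone_cost) = (350.0_f64, 87.0_f64, 5.0_f64);

    let measured = (execute + clone_cost, execute_optimized + clone_cost);

    // Ordering is unaffected: the faster variant stays faster...
    assert!(measured.1 < measured.0);
    // ...but the reported speedup shrinks relative to routine-only timing.
    let routine_ratio = execute / execute_optimized; // ~4.02x
    let measured_ratio = measured.0 / measured.1; // ~3.86x
    assert!(measured_ratio < routine_ratio);
    println!("routine {routine_ratio:.2}x, measured {measured_ratio:.2}x");
}
```

So a cheap clone distorts absolute numbers and ratios slightly, but cannot flip which variant wins, which is the point being made.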

Benchmark automation confirmed working:

  • All 20 Bench (*) checks pass, including Bench (biome_grit_patterns)
  • CodSpeed: 58 untouched benchmarks, 156 skipped; no regressions
  • benchmark_grit.yml triggers correctly on grit crate paths

@coderabbitai resume

@chocky335 chocky335 requested a review from ematipico February 23, 2026 15:36
@ematipico
Member

@dyc3 I leave review and merging to you

Contributor

@dyc3 dyc3 left a comment


Nice work!

@dyc3 dyc3 merged commit 49c8fde into biomejs:main Feb 23, 2026
34 checks passed
@github-actions github-actions bot mentioned this pull request Feb 23, 2026

Labels

A-Linter (Area: linter), A-Project (Area: project), L-CSS (Language: CSS), L-Grit (Language: GritQL), L-JavaScript (Language: JavaScript and super languages), L-JSON (Language: JSON and super languages)


3 participants