chore(bench/html): add real test fixtures to biome_html_analyze benches#9578
chore(bench/html): add real test fixtures to biome_html_analyze benches#9578
Conversation
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
WalkthroughBenchmarks in crates/biome_html_analyze now load filesystem fixtures and run fixture-driven benchmarks alongside existing library-based benches. A fixture entry was added at crates/biome_html_analyze/benches/fixtures/real referencing ../../../biome_html_parser/benches/fixtures/real. The bench harness now recursively collects non-.md fixtures, derives HtmlFileSource from file path/extension, pre-parses inputs once per case, reports throughput in bytes, and registers dynamic benchmark IDs per fixture. Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment Tip Rust Clippy can be used to improve the quality of Rust code reviews.Clippy is the official Rust linter. It provides lints to catch common mistakes and improve your Rust code. To configure Clippy, add a See Clippy Documentation for more details. |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
crates/biome_html_analyze/benches/html_analyzer.rs (1)
45-56: Keep the full relative path in the benchmark ID.
grouponly keeps the first path segment andnameonly the basename, so nested fixtures likereal/foo/index.htmlandreal/bar/index.htmlboth collapse toreal/index.html. Usingrelas the parameter would keep Codspeed labels distinct and less cryptic.Also applies to: 131-135
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_html_analyze/benches/html_analyzer.rs` around lines 45 - 56, The benchmark ID currently collapses nested fixtures by building `group` from only the first path segment and `name` from the basename; instead use the full relative path `rel` as the label to keep IDs distinct. Replace the `group`/`name` construction with a single string created from `rel` (e.g. `let rel_label = rel.to_string_lossy().to_string()` and use `rel_label` where the benchmark ID/label is passed), falling back to `"root"` if `rel` is empty; apply the same replacement for the other occurrence around the `rel` handling at the second location (the block currently at lines 131-135).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/biome_html_analyze/benches/html_analyzer.rs`:
- Around line 85-86: The code currently silently falls back to a default
HtmlFileSource when HtmlFileSource::try_from(test_case.path()) fails (uses
unwrap_or_default), which masks bad paths/extensions; change this to propagate
or handle the error explicitly by removing unwrap_or_default and returning or
panicking with a clear message (e.g., use expect or map_err+? to surface the
failure) so failures in HtmlFileSource::try_from are loud; apply the same change
to the similar usage around parse_html and HtmlParserOptions::from (the block at
lines ~128-130) so both lookup failures fail fast instead of downgrading to
default HTML.
- Around line 29-30: The fixture discovery is currently swallowing IO errors;
change load_fixtures to propagate failures instead of returning an empty result:
modify the signature of load_fixtures to return
Result<Vec<(String,String,String)>, std::io::Error>, use the ? operator on
fs::read_dir and fs::read_to_string (and any iterator item.file_name()/metadata
calls) to bubble up errors, and update the bench entry point that calls
load_fixtures to .expect("failed to load fixtures") (or otherwise handle the
Result) so the run aborts with a clear error message; apply the same pattern to
other similar blocks that use read_dir/read_to_string in this file.
---
Nitpick comments:
In `@crates/biome_html_analyze/benches/html_analyzer.rs`:
- Around line 45-56: The benchmark ID currently collapses nested fixtures by
building `group` from only the first path segment and `name` from the basename;
instead use the full relative path `rel` as the label to keep IDs distinct.
Replace the `group`/`name` construction with a single string created from `rel`
(e.g. `let rel_label = rel.to_string_lossy().to_string()` and use `rel_label`
where the benchmark ID/label is passed), falling back to `"root"` if `rel` is
empty; apply the same replacement for the other occurrence around the `rel`
handling at the second location (the block currently at lines 131-135).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a46cdc5f-3f32-4814-85ed-19b63026fd85
📒 Files selected for processing (2)
crates/biome_html_analyze/benches/fixtures/realcrates/biome_html_analyze/benches/html_analyzer.rs
Merging this PR will not alter performance
Performance Changes
Comparing Footnotes
|
f4f2a0a to
0b64642
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (2)
crates/biome_html_analyze/benches/html_analyzer.rs (2)
133-133:⚠️ Potential issue | 🟠 MajorSame silent downgrade concern here.
try_from_extension(ext).unwrap_or_default()will silently treat unknown extensions as plain HTML. Worth failing fast to catch fixture misconfigurations.Suggested fix
- let file_source = HtmlFileSource::try_from_extension(ext).unwrap_or_default(); + let file_source = HtmlFileSource::try_from_extension(ext) + .unwrap_or_else(|_| panic!("unsupported HTML benchmark fixture extension: {ext}"));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_html_analyze/benches/html_analyzer.rs` at line 133, The code silently downgrades unknown extensions by using HtmlFileSource::try_from_extension(ext).unwrap_or_default(); change this to fail fast so fixture misconfigurations are caught: replace the unwrap_or_default call on HtmlFileSource::try_from_extension(ext) (used when assigning file_source) with an explicit failure path (e.g., use .expect or .unwrap_or_else to panic with a clear message including ext) or propagate the Result so unknown extensions return Err instead of defaulting to plain HTML.
89-90:⚠️ Potential issue | 🟠 MajorDon't silently downgrade unknown sources to default HTML.
This still uses
unwrap_or_default(), which masks bad paths or unexpected extensions. A failing benchmark is better than a quietly misleading one.Suggested fix
- let file_source = HtmlFileSource::try_from(test_case.path()).unwrap_or_default(); + let file_source = HtmlFileSource::try_from(test_case.path()) + .unwrap_or_else(|_| panic!("unsupported HTML bench case: {:?}", test_case.path()));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_html_analyze/benches/html_analyzer.rs` around lines 89 - 90, The code silently downgrades unknown sources by calling HtmlFileSource::try_from(test_case.path()).unwrap_or_default() which masks invalid paths; change this to handle the Result explicitly — either propagate the error or fail fast with a clear message (e.g., replace unwrap_or_default() with .expect("Invalid HtmlFileSource for test_case.path(): <path>") or match the Result and panic/log), then pass the valid HtmlFileSource into HtmlParserOptions::from and parse_html(code, HtmlParserOptions::from(&file_source)); ensure you reference HtmlFileSource::try_from, HtmlFileSource (the variable file_source), HtmlParserOptions::from and parse_html when making the change so unknown sources are not silently defaulted.
🧹 Nitpick comments (1)
crates/biome_html_analyze/benches/html_analyzer.rs (1)
139-163: Consider extracting shared benchmark logic.The filter/options construction and
analyzecall are duplicated between the library and fixture loops. A small helper would reduce this repetition.That said, for benchmark code, explicitness has its merits—so entirely optional.
Example helper
fn run_analysis(parse: &HtmlParse, file_source: HtmlFileSource) { let filter = AnalysisFilter { categories: RuleCategoriesBuilder::default() .with_syntax() .with_lint() .with_assist() .build(), ..AnalysisFilter::default() }; let options = AnalyzerOptions::default().with_configuration( AnalyzerConfiguration::default().with_jsx_runtime(JsxRuntime::default()), ); biome_html_analyze::analyze(&parse.tree(), filter, &options, file_source, |event| { black_box(event.diagnostic()); black_box(event.actions()); ControlFlow::<Never>::Continue(()) }); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_html_analyze/benches/html_analyzer.rs` around lines 139 - 163, The benchmark duplicates AnalysisFilter/AnalyzerOptions construction and the biome_html_analyze::analyze call; extract that shared logic into a helper function (e.g., fn run_analysis(parse: &HtmlParse, file_source: HtmlFileSource)) that builds the AnalysisFilter with RuleCategoriesBuilder, creates AnalyzerOptions with AnalyzerConfiguration::with_jsx_runtime(JsxRuntime::default()), and invokes biome_html_analyze::analyze(&parse.tree(), filter, &options, file_source, |event| { black_box(event.diagnostic()); black_box(event.actions()); ControlFlow::<Never>::Continue(()) }); then replace the duplicated blocks in both the library and fixture benchmark loops with a call to run_analysis.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@crates/biome_html_analyze/benches/html_analyzer.rs`:
- Line 133: The code silently downgrades unknown extensions by using
HtmlFileSource::try_from_extension(ext).unwrap_or_default(); change this to fail
fast so fixture misconfigurations are caught: replace the unwrap_or_default call
on HtmlFileSource::try_from_extension(ext) (used when assigning file_source)
with an explicit failure path (e.g., use .expect or .unwrap_or_else to panic
with a clear message including ext) or propagate the Result so unknown
extensions return Err instead of defaulting to plain HTML.
- Around line 89-90: The code silently downgrades unknown sources by calling
HtmlFileSource::try_from(test_case.path()).unwrap_or_default() which masks
invalid paths; change this to handle the Result explicitly — either propagate
the error or fail fast with a clear message (e.g., replace unwrap_or_default()
with .expect("Invalid HtmlFileSource for test_case.path(): <path>") or match the
Result and panic/log), then pass the valid HtmlFileSource into
HtmlParserOptions::from and parse_html(code,
HtmlParserOptions::from(&file_source)); ensure you reference
HtmlFileSource::try_from, HtmlFileSource (the variable file_source),
HtmlParserOptions::from and parse_html when making the change so unknown sources
are not silently defaulted.
---
Nitpick comments:
In `@crates/biome_html_analyze/benches/html_analyzer.rs`:
- Around line 139-163: The benchmark duplicates AnalysisFilter/AnalyzerOptions
construction and the biome_html_analyze::analyze call; extract that shared logic
into a helper function (e.g., fn run_analysis(parse: &HtmlParse, file_source:
HtmlFileSource)) that builds the AnalysisFilter with RuleCategoriesBuilder,
creates AnalyzerOptions with
AnalyzerConfiguration::with_jsx_runtime(JsxRuntime::default()), and invokes
biome_html_analyze::analyze(&parse.tree(), filter, &options, file_source,
|event| { black_box(event.diagnostic()); black_box(event.actions());
ControlFlow::<Never>::Continue(()) }); then replace the duplicated blocks in
both the library and fixture benchmark loops with a call to run_analysis.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 61360687-aad1-414b-b776-3cbcaef699ef
📒 Files selected for processing (2)
crates/biome_html_analyze/benches/fixtures/realcrates/biome_html_analyze/benches/html_analyzer.rs
✅ Files skipped from review due to trivial changes (1)
- crates/biome_html_analyze/benches/fixtures/real
0b64642 to
72584be
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/biome_html_analyze/benches/html_analyzer.rs (1)
99-110: Consider extracting the common filter/options setup.The
AnalysisFilterandAnalyzerOptionsconstruction is duplicated between the library benchmarks and fixture benchmarks. A small helper would reduce this repetition.♻️ Possible refactor
+fn bench_filter() -> AnalysisFilter<'static> { + AnalysisFilter { + categories: RuleCategoriesBuilder::default() + .with_syntax() + .with_lint() + .with_assist() + .build(), + ..AnalysisFilter::default() + } +} + +fn bench_options() -> AnalyzerOptions { + AnalyzerOptions::default() + .with_configuration(AnalyzerConfiguration::default().with_jsx_runtime(JsxRuntime::default())) +}Then use
let filter = bench_filter();andlet options = bench_options();in both benchmark loops.Also applies to: 143-153
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_html_analyze/benches/html_analyzer.rs` around lines 99 - 110, Extract the duplicated construction into two helpers (e.g., bench_filter() -> AnalysisFilter and bench_options() -> AnalyzerOptions): implement bench_filter() to return AnalysisFilter built with RuleCategoriesBuilder::default().with_syntax().with_lint().with_assist().build() and default for the rest, and implement bench_options() to return AnalyzerOptions::default().with_configuration(AnalyzerConfiguration::default().with_jsx_runtime(JsxRuntime::default())); then replace the duplicated inline constructions in the library and fixture benchmarks with let filter = bench_filter(); and let options = bench_options(); respectively, updating references to AnalysisFilter, RuleCategoriesBuilder, AnalyzerOptions, AnalyzerConfiguration, and JsxRuntime as needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@crates/biome_html_analyze/benches/html_analyzer.rs`:
- Around line 99-110: Extract the duplicated construction into two helpers
(e.g., bench_filter() -> AnalysisFilter and bench_options() -> AnalyzerOptions):
implement bench_filter() to return AnalysisFilter built with
RuleCategoriesBuilder::default().with_syntax().with_lint().with_assist().build()
and default for the rest, and implement bench_options() to return
AnalyzerOptions::default().with_configuration(AnalyzerConfiguration::default().with_jsx_runtime(JsxRuntime::default()));
then replace the duplicated inline constructions in the library and fixture
benchmarks with let filter = bench_filter(); and let options = bench_options();
respectively, updating references to AnalysisFilter, RuleCategoriesBuilder,
AnalyzerOptions, AnalyzerConfiguration, and JsxRuntime as needed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c7e6608e-f9c4-42aa-852a-21dbb379f6ad
📒 Files selected for processing (2)
crates/biome_html_analyze/benches/fixtures/realcrates/biome_html_analyze/benches/html_analyzer.rs
✅ Files skipped from review due to trivial changes (1)
- crates/biome_html_analyze/benches/fixtures/real
72584be to
20826ae
Compare
Summary
I chose not to include the synthetic html benchmark fixtures because those are likely to be low signal for the analyzer benchmarks
generated by gpt 5.4
Test Plan
new benchmarks should show up in codspeed
Docs