
perf: optimize gobbleParens, specificity, and regex pattern caching #85

Open
TrevorBurnham wants to merge 1 commit into LeaVerou:master from TrevorBurnham:optimizations


Summary

This CR adds a benchmark suite to parsel-js and applies three optimizations.

For context, I'm trying to get happy-dom to adopt parsel-js in place of its own regex-based CSS selector parser, which has known bugs that parsel-js fixes (capricorn86/happy-dom#2010). However, the lead maintainer is concerned about the performance hit. I'd love to see a world where JavaScript has a CSS selector parser that's both fast and accurate! ⚡🎯

Changes

1. gobbleParens — slice instead of string concatenation

The original built a result string character-by-character (result += char), which creates a new string allocation on every iteration — O(n²) for long parenthesized expressions. Replaced with index tracking and a single text.slice() call at the end.
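To make the difference concrete, here is a minimal sketch of the two approaches side by side. It is not the parsel-js source: the function names `gobbleParensConcat` and `gobbleParensSlice` are made up for illustration, and the real implementation may handle edge cases differently.

```javascript
// Illustrative only: hypothetical names, not the actual parsel-js code.

// Concatenation style: appends one character per iteration.
function gobbleParensConcat(text, offset) {
  let nesting = 0;
  let result = "";
  for (; offset < text.length; offset++) {
    const char = text[offset];
    if (char === "(") nesting++;
    else if (char === ")") nesting--;
    result += char;
    if (nesting === 0) return result;
  }
  return result;
}

// Index-tracking style: remembers where the span starts and slices once.
function gobbleParensSlice(text, offset) {
  const start = offset;
  let nesting = 0;
  for (; offset < text.length; offset++) {
    const char = text[offset];
    if (char === "(") nesting++;
    else if (char === ")") nesting--;
    // The span ends as soon as the parentheses balance out.
    if (nesting === 0) return text.slice(start, offset + 1);
  }
  // Unbalanced input: return everything from the starting offset.
  return text.slice(start);
}
```

Both functions should return the same substring for the same input; only the way the result is assembled differs.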

2. specificity — remove duplicate computation for selector lists

When computing specificity for a list selector (e.g. div, #foo), specificity() was called twice per child selector:

// Before — calls specificity(ast) twice
const sp = specificity(ast);
base = Math.max(base, ...specificity(ast));

// After — reuses the result
const sp = specificity(ast);
base = Math.max(base, ...sp);

3. getArgumentPatternByType — cache compiled RegExp

Each call for pseudo-class or pseudo-element types was constructing a new RegExp via string replacement. Since the token grammar is stable between calls, the compiled patterns are now cached in a Map.
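The caching idea can be sketched roughly as follows. The `TOKENS` grammar, the `patternCache` map, and the `getArgumentPattern` name are assumptions for illustration; the actual parsel-js patterns and flags differ.

```javascript
// Hypothetical, simplified token grammar; the real parsel-js patterns differ.
const TOKENS = {
  "pseudo-class": /:(?<name>[-\w]+)(?:\((?<argument>¶*)\))?/u,
};

// One compiled RegExp per token type, built lazily on first use.
const patternCache = new Map();

function getArgumentPattern(type) {
  let pattern = patternCache.get(type);
  if (pattern === undefined) {
    // Swap the placeholder class for one that matches real argument text.
    const source = TOKENS[type].source.replace(
      "(?<argument>¶*)",
      "(?<argument>.*)"
    );
    pattern = new RegExp(source, "u");
    patternCache.set(type, pattern);
  }
  return pattern;
}

// Repeated calls return the same compiled object.
const first = getArgumentPattern("pseudo-class");
const second = getArgumentPattern("pseudo-class");
console.log(first === second); // true
```

One caveat worth keeping in mind with this approach: a cached RegExp that carries the `g` or `y` flag keeps its `lastIndex` between calls, so callers sharing it need to reset `lastIndex` or use methods that do not depend on it. The sketch sidesteps this by omitting those flags.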

Benchmark results

Measured with bench.mjs (included), which exercises tokenize, parse, stringify, specificity, walk, and end-to-end across 23 selectors of varying complexity. 5000 iterations per benchmark, 500 warmup, median reported.

| Benchmark | Before | After | Change |
|---|---|---|---|
| tokenize/ALL | 165 µs | 154 µs | -7% |
| parse/ALL | 232 µs | 222 µs | -4% |
| specificity/ALL | 270 µs | 239 µs | -11% |
| specificity-ast/ALL | 31.5 µs | 16.3 µs | -48% |
| e2e/ALL | 268 µs | 239 µs | -11% |
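For readers who want to reproduce the methodology described above (warmup runs, a fixed iteration count, median reported), a harness along these lines would do it. This is a sketch, not the contents of bench.mjs: the `bench` and `median` helpers are hypothetical, and it assumes a runtime with a global `performance.now()` such as recent Node.js.

```javascript
// Hypothetical helpers; bench.mjs in the PR may be structured differently.

// Median of a list of numbers (average of the two middle values if even).
function median(samples) {
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Runs fn `warmup` times untimed, then `iterations` times timed,
// and returns the median duration in microseconds.
function bench(fn, { iterations = 5000, warmup = 500 } = {}) {
  for (let i = 0; i < warmup; i++) fn();
  const samples = new Array(iterations);
  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    fn();
    samples[i] = (performance.now() - start) * 1000; // ms to µs
  }
  return median(samples);
}

// Example: time a trivial workload with the same settings as the PR.
const medianMicros = bench(() => "div > .a:not(#b)".split(" "));
console.log(`median: ${medianMicros.toFixed(2)} µs`);
```

Reporting the median rather than the mean keeps occasional GC pauses or scheduler hiccups from skewing the figure.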

Optimizations evaluated but not included

  • nestTokens single-pass comma detection: Merging the find() + iteration into one pass showed no measurable improvement — token arrays are small enough that the extra scan is negligible.
  • tokenizeBy splice avoidance: Replacing the in-place splice with a two-array swap gave a ~3% improvement on tokenize, but the added complexity was not worth such a small gain.

Testing

All existing tests pass.

- gobbleParens: replace O(n²) string concatenation with a single slice
- specificity: fix duplicate specificity() call in list selector handling
- getArgumentPatternByType: cache compiled RegExp for pseudo-class/element patterns

Benchmarked with bench.mjs across simple, moderate, heavy, and stress selectors.
Combined effect: ~11% faster specificity/e2e, ~7% faster tokenize, ~2x faster
specificity-from-AST. All 34 existing tests pass.

netlify Bot commented Feb 9, 2026

Deploy Preview for parsel ready!

Name Link
🔨 Latest commit 8cd637f
🔍 Latest deploy log https://app.netlify.com/projects/parsel/deploys/6989efb5d0711a0008bd9be7
😎 Deploy Preview https://deploy-preview-85--parsel.netlify.app