perf(parse/tailwind): use `lookup_byte` for slightly better throughput by dyc3 · Pull Request #9183 · biomejs/biome

dyc3 · 2026-02-21T19:42:54Z

Summary

This refactors the Tailwind parser to do character byte lookups with biome_unicode_table::lookup_byte.

On my machine, this results in about a 33% speedup and a throughput increase of anywhere from 40% to 60%.
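
For readers unfamiliar with the technique, here is a minimal sketch of table-driven byte classification. It is not the code from biome_unicode_table: `ByteClass`, `classify`, `build_table`, `TABLE`, and `lookup` are hypothetical names, and the set of classes is an assumption chosen for illustration. The idea is that each byte indexes a precomputed 256-entry table once, rather than being tested against several literals at every call site.

```rust
// Hypothetical stand-in for a byte-class enum; NOT the real `Dispatch` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ByteClass {
    Whitespace,
    Colon,
    Minus,
    Ident,
    Other,
}

// Classify one byte with ordinary comparisons; used only to fill the table.
const fn classify(b: u8) -> ByteClass {
    match b {
        b' ' | b'\t' | b'\n' | b'\r' => ByteClass::Whitespace,
        b':' => ByteClass::Colon,
        b'-' => ByteClass::Minus,
        b'a'..=b'z' | b'A'..=b'Z' | b'0'..=b'9' | b'_' => ByteClass::Ident,
        _ => ByteClass::Other,
    }
}

// Build the 256-entry table at compile time, one entry per possible byte.
const fn build_table() -> [ByteClass; 256] {
    let mut table = [ByteClass::Other; 256];
    let mut i = 0usize;
    while i < 256 {
        table[i] = classify(i as u8);
        i += 1;
    }
    table
}

static TABLE: [ByteClass; 256] = build_table();

// Classifying a byte at runtime is a single indexed load.
#[inline]
pub fn lookup(b: u8) -> ByteClass {
    TABLE[b as usize]
}
```

A caller would write `lookup(b'-')` and branch on the returned class. Whether this beats direct comparisons depends on the workload, so the numbers above should be read as measurements from one machine.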

Test Plan

no snapshot changes

Docs

changeset-bot · 2026-02-21T19:42:59Z

⚠️ No Changeset found

Latest commit: db1ed69

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

codspeed-hq · 2026-02-21T19:48:51Z

Merging this PR will not alter performance

✅ 10 untouched benchmarks
⏩ 206 skipped benchmarks¹

Comparing dyc3/tw-lexer-perf (db1ed69) with main (b834078)

¹ 206 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

coderabbitai · 2026-02-21T19:50:33Z

No actionable comments were generated in the recent review. 🎉

Walkthrough

The PR refactors the Tailwind lexer to use the Dispatch enum and lookup_byte for character classification instead of raw u8 comparisons. base_name_store.rs and lexer/mod.rs now obtain a Dispatch for bytes and pass it to updated helpers (is_delimiter, is_boundary_byte) and token-consumption paths. Several internal functions and branching points were changed to accept or operate on Dispatch values; public APIs remain unchanged.
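
The shape of that change can be sketched as follows. This is an illustration under assumptions, not the lexer's code: `Class` and `classify` are hypothetical stand-ins for the real enum and lookup, and the byte sets inside the helpers are guesses. Only the helper names `is_delimiter` and `is_boundary_byte` come from the walkthrough. The point is that a byte is classified once and the resulting class is passed to every helper.

```rust
// Hypothetical class enum; the real lexer uses `Dispatch` from biome_unicode_table.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Class {
    Whitespace,
    Bang,
    Colon,
    Slash,
    Other,
}

// Hypothetical classifier standing in for `lookup_byte`.
fn classify(b: u8) -> Class {
    match b {
        b' ' | b'\t' | b'\n' | b'\r' => Class::Whitespace,
        b'!' => Class::Bang,
        b':' => Class::Colon,
        b'/' => Class::Slash,
        _ => Class::Other,
    }
}

// Before: the helper compares raw bytes (the byte set is an assumption).
fn is_delimiter_raw(b: u8) -> bool {
    matches!(b, b' ' | b'\t' | b'\n' | b'\r' | b'!')
}

// After: the helper receives an already-classified value.
fn is_delimiter(c: Class) -> bool {
    matches!(c, Class::Whitespace | Class::Bang)
}

// A second helper reuses the same classified value (the set is an assumption).
fn is_boundary_byte(c: Class) -> bool {
    is_delimiter(c) || matches!(c, Class::Colon | Class::Slash)
}

// Classify once, then ask several questions about the same byte.
fn describe(b: u8) -> (bool, bool) {
    let c = classify(b);
    (is_delimiter(c), is_boundary_byte(c))
}
```

For example, `describe(b':')` reports a boundary that is not a delimiter, with only one classification of the byte.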

Possibly related PRs

perf(parse/tailwind): use compact trie for lexing base names instead of linear search #7977: Introduced the original BaseNameMatcher and boundary-check logic that this PR converts to Dispatch-based checks.
perf(parse/tw): avoid going into basename store trie when not needed #8528: Touches the same delimiter logic and updates is_delimiter/call sites in mod.rs, closely related at the code level.

Suggested reviewers

ematipico

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: refactoring the Tailwind parser to use `lookup_byte` for performance improvements. |
| Description check | ✅ Passed | The description clearly explains the motivation (performance improvement via character byte lookups), provides measured results, and confirms no snapshot changes were made. |

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/biome_tailwind_parser/src/lexer/mod.rs (1)

135-172: ⚠️ Potential issue | 🟠 Major

Fast path now ignores `-` and can skip dashed-basename matching.
The loop never stops on `-`, so inputs like `border-t-red-300` will take the fast path and consume the whole string as `TW_BASE`, bypassing the dashed-basename trie. That looks like a tokenisation regression.

💡 Suggested fix

-        let mut end = 0usize;
+        let mut end = 0usize;
+        let mut saw_dash = false;
         while end < slice.len() {
             let b = slice[end];
             let dispatched = lookup_byte(b);
-            if dispatched == COL || is_delimiter(dispatched) {
+            if dispatched == MIN {
+                saw_dash = true;
+                break;
+            }
+            if dispatched == COL || is_delimiter(dispatched) {
                 break;
             }
             end += 1;
         }
@@
-        if end > 0 && (end == slice.len() || is_delimiter(lookup_byte(slice[end]))) {
+        if !saw_dash
+            && end > 0
+            && (end == slice.len() || is_delimiter(lookup_byte(slice[end])))
+        {
             self.advance(end);
             return TW_BASE;
         }
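
To make the reviewer's concern concrete, here is a self-contained sketch of the scan with and without the dash check. It is a toy model, not the lexer: `Class`, `classify`, and `fast_path_len` are hypothetical, and treating only whitespace as a delimiter is an assumption. It only shows what the loop in the suggestion does to `border-t-red-300`. Whether the real lexer regresses depends on code outside this diff.

```rust
// Hypothetical byte classes; the real code compares against `Dispatch` values.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Class {
    Whitespace,
    Colon,
    Minus,
    Other,
}

// Hypothetical classifier standing in for `lookup_byte`.
fn classify(b: u8) -> Class {
    match b {
        b' ' | b'\t' | b'\n' | b'\r' => Class::Whitespace,
        b':' => Class::Colon,
        b'-' => Class::Minus,
        _ => Class::Other,
    }
}

// Assumption: only whitespace counts as a delimiter in this toy model.
fn is_delimiter(c: Class) -> bool {
    c == Class::Whitespace
}

// Returns Some(len) when the fast path would consume `len` bytes as one base
// token, or None when the caller should fall back to the slower matcher.
fn fast_path_len(slice: &[u8], stop_at_dash: bool) -> Option<usize> {
    let mut end = 0usize;
    let mut saw_dash = false;
    while end < slice.len() {
        let c = classify(slice[end]);
        if stop_at_dash && c == Class::Minus {
            saw_dash = true;
            break;
        }
        if c == Class::Colon || is_delimiter(c) {
            break;
        }
        end += 1;
    }
    let at_boundary = end == slice.len() || is_delimiter(classify(slice[end]));
    if !saw_dash && end > 0 && at_boundary {
        Some(end)
    } else {
        None
    }
}
```

In this model, `fast_path_len(b"border-t-red-300", false)` consumes all 16 bytes, while passing `true` returns `None` and defers to the fallback. Inputs without a dash, such as `flex`, take the fast path either way.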

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@crates/biome_tailwind_parser/src/lexer/mod.rs` around lines 135 - 172, The
fast-path loop in consume_base is incorrectly treating '-' as non-delim so
inputs like "border-t-red-300" are consumed as TW_BASE; modify the loop that
computes end to break when encountering a dash so dashed basenames go through
the trie: in consume_base check for b == b'-' (or dispatched == DASH if you have
a DASH kind) alongside the existing COL/is_delimiter check, so the loop stops at
'-' and the subsequent logic falls back to
BASENAME_STORE.matcher(slice).base_end() to return DATA_KW or TW_BASE as
appropriate; keep the DATA_KW special-case checks unchanged.

github-actions bot added A-Parser Area: parser L-Tailwind Language: Tailwind CSS labels Feb 21, 2026

coderabbitai bot reviewed Feb 21, 2026

dyc3 requested review from a team February 21, 2026 22:19

ematipico approved these changes Feb 21, 2026

perf(parse/tailwind): use lookup_byte for slightly better throughput

db1ed69

dyc3 force-pushed the dyc3/tw-lexer-perf branch from bda3ca0 to db1ed69 Compare February 22, 2026 12:21

dyc3 merged commit b76c42b into main Feb 22, 2026
14 checks passed

dyc3 deleted the dyc3/tw-lexer-perf branch February 22, 2026 12:42
