feat(oxc_ast): add RegExpFlags bitflag for storing regex flags#32
Merged
This reduces `TokenValue` from 56 to 40 bytes, `Token` from 72 to 56 bytes.
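As a rough illustration of why a bitflag shrinks the token payload, here is a minimal hand-rolled sketch. It is not the definition added in this PR: the flag names, the `parse` helper, and the choice of a `u8` backing integer are all assumptions made for the example. The point is only that a set of regex flags fits in a single byte.

```rust
/// Hypothetical bitflag set: one bit per regex flag, packed into one byte.
/// (Illustrative only; the real `RegExpFlags` in oxc may differ.)
#[derive(Clone, Copy, PartialEq, Eq, Debug, Default)]
pub struct RegExpFlags(u8);

impl RegExpFlags {
    pub const G: Self = Self(1 << 0);
    pub const I: Self = Self(1 << 1);
    pub const M: Self = Self(1 << 2);
    pub const S: Self = Self(1 << 3);
    pub const U: Self = Self(1 << 4);
    pub const Y: Self = Self(1 << 5);

    /// True if every bit set in `other` is also set in `self`.
    pub fn contains(self, other: Self) -> bool {
        self.0 & other.0 == other.0
    }

    /// Parse a flag string such as "gi".
    /// Returns `None` on an unknown or repeated flag character.
    pub fn parse(s: &str) -> Option<Self> {
        let mut flags = Self::default();
        for c in s.chars() {
            let flag = match c {
                'g' => Self::G,
                'i' => Self::I,
                'm' => Self::M,
                's' => Self::S,
                'u' => Self::U,
                'y' => Self::Y,
                _ => return None,
            };
            if flags.contains(flag) {
                return None; // the same flag appeared twice
            }
            flags.0 |= flag.0;
        }
        Some(flags)
    }
}

/// Size of the sketch type in bytes, for comparison with wider representations.
pub fn flags_size() -> usize {
    std::mem::size_of::<RegExpFlags>()
}
```

With this sketch, `RegExpFlags::parse("gi")` yields a value for which `contains(RegExpFlags::G)` holds, and the whole set occupies one byte.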
Closed
Boshen added a commit that referenced this pull request on Oct 7, 2025
Replace multi-function calls and multiple enum variant checks with simple range checks, reducing assembly instructions in hot paths.

## Changes

**`is_any_keyword()`**:
- Before: Called 4 separate functions (`is_reserved_keyword()`, `is_contextual_keyword()`, `is_strict_mode_contextual_keyword()`, `is_future_reserved_keyword()`) checking 70+ enum variants with early returns
- After: Single range check `Await..=Yield` since all keywords are contiguous in the enum

**`is_number()`**:
- Before: Matched 11 separate enum variants
- After: Single range check `Decimal..=HexBigInt` since all numeric literals are contiguous

## Assembly Impact

Multi-function approach generated **5 instructions** with complex bitmask setup:

```asm
mov x8, #992
movk x8, #992, lsl #16
movk x8, #240, lsl #32
lsr x8, x8, x0
and w0, w8, #0x1
```

Range check generates **4 instructions** with simple arithmetic:

```asm
and w8, w0, #0xff
sub w8, w8, #5
cmp w8, #39
cset w0, lo
```

## Performance

- `is_any_keyword()` is called from `advance()` on **every single token**
- 20% fewer instructions (5 → 4)
- Simpler logic enables better branch prediction
- Eliminates complex constant loading

Added tests to verify enum layout assumptions remain valid.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
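The range-check idea can be sketched as follows. This is a toy version under stated assumptions: the `Kind` enum here is hypothetical and lists only a handful of variants, whereas the real enum is far larger. Only the names `Await`, `Yield`, `Decimal`, and `HexBigInt` come from the commit message; everything else is made up for the example.

```rust
/// Hypothetical, heavily shortened token-kind enum.
/// Keywords and numeric literals are each declared as one contiguous run.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
#[repr(u8)]
pub enum Kind {
    Eof,
    Ident,
    // Keywords: must stay contiguous from `Await` to `Yield`.
    Await,
    Break,
    Let,
    Yield,
    // Numeric literals: must stay contiguous from `Decimal` to `HexBigInt`.
    Decimal,
    Float,
    Hex,
    HexBigInt,
    Str,
}

impl Kind {
    /// "After" shape: one range check on the discriminant.
    pub fn is_any_keyword(self) -> bool {
        let (lo, hi) = (Kind::Await as u8, Kind::Yield as u8);
        (lo..=hi).contains(&(self as u8))
    }

    /// Same trick for numeric literals.
    pub fn is_number(self) -> bool {
        let (lo, hi) = (Kind::Decimal as u8, Kind::HexBigInt as u8);
        (lo..=hi).contains(&(self as u8))
    }

    /// "Before" shape: an explicit list of variants, kept for comparison.
    pub fn is_any_keyword_by_match(self) -> bool {
        matches!(self, Kind::Await | Kind::Break | Kind::Let | Kind::Yield)
    }
}
```

The range form is only correct while the variants stay contiguous, which is why the commit pairs it with layout tests.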
graphite-app bot pushed a commit that referenced this pull request on Oct 7, 2025
…14410)

## Summary

Replace multi-function calls and multiple enum variant checks with simple range checks, reducing assembly instructions in hot paths.

## Changes

### `is_any_keyword()`

**Before**: Called 4 separate functions checking 70+ enum variants:
- `is_reserved_keyword()` - 38 variants
- `is_contextual_keyword()` - 39 variants
- `is_strict_mode_contextual_keyword()` - 8 variants
- `is_future_reserved_keyword()` - 7 variants

**After**: Single range check `Await..=Yield` since all keywords are contiguous in the enum

### `is_number()`

**Before**: Matched 11 separate enum variants

**After**: Single range check `Decimal..=HexBigInt` since numeric literals are contiguous

## Assembly Analysis

### Before (with scattered checks)

```asm
mov x8, #992            ; Load bitmask constant
movk x8, #992, lsl #16  ; More bitmask setup
movk x8, #240, lsl #32  ; Even more bitmask setup
lsr x8, x8, x0          ; Shift by kind value
and w0, w8, #0x1        ; Extract result bit
```

**5 instructions** with complex constant loading

### After (with range check)

```asm
and w8, w0, #0xff  ; Extract byte
sub w8, w8, #5     ; Subtract range start
cmp w8, #39        ; Compare to range size
cset w0, lo        ; Set result
```

**4 instructions** with simple arithmetic

## Performance Impact

- **20% fewer instructions** (5 → 4)
- **Simpler logic** = better CPU pipeline utilization
- **No complex constants** = smaller code size
- **Better branch prediction** with single comparison

This is particularly important because:
- `is_any_keyword()` is called from `advance()` on **every single token**
- This is one of the hottest code paths in the entire parser

## Testing

Added unit tests to verify that:
- All keywords remain contiguous in the enum (`Await..=Yield`)
- All numeric literals remain contiguous (`Decimal..=HexBigInt`)

These tests will catch any future enum reordering that would break the optimization.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
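A layout guard of the kind described under Testing might look roughly like this. It is a sketch, not the tests from the commit: the shortened `Kind` enum, the `ALL`, `KEYWORDS`, and `NUMBERS` tables, and both helper functions are hypothetical names introduced for illustration.

```rust
/// Hypothetical, shortened token-kind enum (the real one is much larger).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
#[repr(u8)]
pub enum Kind {
    Eof,
    Ident,
    Await,
    Async,
    Yield,
    Decimal,
    Binary,
    HexBigInt,
    Str,
}

/// Every variant, so the checks can sweep the whole enum.
pub const ALL: [Kind; 9] = [
    Kind::Eof, Kind::Ident, Kind::Await, Kind::Async, Kind::Yield,
    Kind::Decimal, Kind::Binary, Kind::HexBigInt, Kind::Str,
];

/// Explicit membership lists that the range checks are meant to replace.
pub const KEYWORDS: [Kind; 3] = [Kind::Await, Kind::Async, Kind::Yield];
pub const NUMBERS: [Kind; 3] = [Kind::Decimal, Kind::Binary, Kind::HexBigInt];

/// True if the group's discriminants form one unbroken run.
pub fn is_contiguous(group: &[Kind]) -> bool {
    let mut d: Vec<u8> = group.iter().map(|k| *k as u8).collect();
    d.sort_unstable();
    d.windows(2).all(|w| w[1] == w[0] + 1)
}

/// True if `first..=last` selects exactly the members of `group`
/// when swept over every variant of the enum.
pub fn range_matches_group(first: Kind, last: Kind, group: &[Kind]) -> bool {
    let (lo, hi) = (first as u8, last as u8);
    ALL.iter().all(|k| {
        let in_range = (lo..=hi).contains(&(*k as u8));
        in_range == group.contains(k)
    })
}
```

If someone later inserts a non-keyword variant between `Await` and `Yield`, `range_matches_group` returns `false` for the keyword list, so a unit test asserting it would fail before the range check silently misclassifies tokens.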
Boshen added a commit that referenced this pull request on Jan 18, 2026
Cache `ptr` and `chunk_start` fields directly in `Bump` struct to eliminate pointer indirection through `ChunkFooter` in the allocation fast path.

Before (2 dependent loads):

```asm
ldr x9, [x0, #16]  ; Load footer ptr from Bump
ldr x8, [x9, #32]  ; Load ptr from footer (WAITS for x9!)
```

After (2 independent loads):

```asm
ldr x8, [x0]      ; Load ptr directly (offset 0)
ldr x9, [x0, #8]  ; Load chunk_start directly - PARALLEL!
```

This removes the data dependency between loads, allowing ARM to issue both loads in parallel via out-of-order execution.

Changes:
- Add `ptr` and `chunk_start` cached fields to `Bump` struct
- Add `#[repr(C)]` to ensure field ordering for optimal cache access
- Update `try_alloc_layout_fast` to use direct field access
- Sync cached fields on slow path (new chunk allocation) and iteration
- Update helper methods to use cached ptr

Size impact: `Bump` grows from 24 to 40 bytes - acceptable tradeoff for the hot path optimization.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
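The caching scheme can be modelled in a few lines. This is a simplified simulation rather than the allocator's code: it tracks plain `usize` addresses instead of real memory, and `try_alloc_fast`, `sync_footer`, `footer_ptr`, and the constructor are hypothetical names. The struct is also not the same size as the real `Bump`. It is only meant to show the fast path reading the two cached fields without touching the footer.

```rust
/// Stand-in for the per-chunk footer, reached through a pointer.
pub struct ChunkFooter {
    /// Lowest usable address in the chunk.
    pub start: usize,
    /// Bump pointer as last synced; allocation grows downward from the end.
    pub ptr: usize,
}

/// Simplified model: cached fields come first so they sit at offsets 0 and 8
/// on a 64-bit target.
#[repr(C)]
pub struct Bump {
    ptr: usize,         // cached copy of the bump pointer
    chunk_start: usize, // cached copy of the chunk's lower bound
    footer: Box<ChunkFooter>,
}

impl Bump {
    /// Model a chunk covering the address range `start..end`.
    pub fn new(start: usize, end: usize) -> Self {
        let footer = Box::new(ChunkFooter { start, ptr: end });
        Bump { ptr: footer.ptr, chunk_start: footer.start, footer }
    }

    /// Fast path: uses only the cached fields, never dereferences `footer`.
    /// Returns the new (aligned) address, or `None` if the chunk is full.
    pub fn try_alloc_fast(&mut self, size: usize, align: usize) -> Option<usize> {
        debug_assert!(align.is_power_of_two());
        // Move down by `size`, then round down to the requested alignment.
        let new_ptr = self.ptr.checked_sub(size)? & !(align - 1);
        if new_ptr < self.chunk_start {
            return None; // a real allocator would take the slow path here
        }
        self.ptr = new_ptr;
        Some(new_ptr)
    }

    /// Write the cached pointer back, as needed before the footer is read
    /// (for example on the slow path or when iterating chunks).
    pub fn sync_footer(&mut self) {
        self.footer.ptr = self.ptr;
    }

    /// Bump pointer as currently recorded in the footer.
    pub fn footer_ptr(&self) -> usize {
        self.footer.ptr
    }
}
```

The cost of this design is that the footer's copy goes stale between syncs, which is why the commit lists explicit sync points on the slow path and during iteration.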