@muellerj2
Contributor

This implements simplified backtracking for the case when the pattern of a lookahead assertion matches. It's kind of the equivalent of #5828 for lookahead assertions, though it's more complicated while being much less practically relevant. But I need this for the next PRs that will greatly reduce the number of allocations the matcher performs.

When the pattern in a lookahead assertion matches, we already know the outcome of the assertion as a whole: a positive lookahead succeeded and a negative lookahead failed. We can then mostly skip the stack unwinding up until the stack frame that was pushed at the start of the lookahead assertion, except for the effects these stack frames have on the stack counter, because no stack unwinding opcode translates does any other work when a pattern matched in ECMAScript mode (and ECMAScript is the only regex grammar that supports lookahead assertions). Much of the work at the end of a lookahead assertion is now handled when processing the _N_end_assert node rather than when processing the unwinding opcodes _After_assert and _After_neg_assert.
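To illustrate the idea, here is a minimal sketch. It is not the actual matcher code: `LoopMatcher`, `Frame`, `stack_cost`, `push` and the `Other` opcode are hypothetical names invented for this example, and only the two assertion opcodes are borrowed (without the leading underscore) from the description above. The sketch assumes each frame contributes some amount to a stack counter, and that popping a frame after a successful match only needs to undo that contribution.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical frame kinds; only the two assertion kinds mirror names used in this PR.
enum class Opcode { After_assert, After_neg_assert, Other };

struct Frame {
    Opcode code;
    std::size_t stack_cost; // assumption: amount this frame added to the stack counter
};

struct LoopMatcher {
    std::vector<Frame> frames;
    std::size_t stack_counter = 0;

    void push(Opcode code, std::size_t cost) {
        frames.push_back({code, cost});
        stack_counter += cost;
    }

    // Called when the pattern inside a lookahead assertion matched.
    // Discards frames down to and including the frame pushed at the start of
    // the assertion, undoing only their effect on the stack counter.
    // Returns true if the assertion as a whole succeeded.
    bool end_assert() {
        while (!frames.empty()) {
            const Frame f = frames.back();
            frames.pop_back();
            stack_counter -= f.stack_cost;
            if (f.code == Opcode::After_assert) {
                return true; // positive lookahead: inner match means success
            }
            if (f.code == Opcode::After_neg_assert) {
                return false; // negative lookahead: inner match means failure
            }
        }
        assert(false && "no assertion frame on the stack");
        return false;
    }
};
```

In this sketch, the loop never dispatches on the other opcodes. It only restores the counter and stops at the first assertion frame it meets, which is the frame of the innermost assertion still in progress.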

You might notice that we could actually avoid the new loop in _N_end_assert if we kept track of the stack usage counts and the positions of the _After_assert and _After_neg_assert stack frames. But I will have to add a variant of this loop in the PR after the next one anyway, so it doesn't seem worth it to spend much effort on avoiding this loop.
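For reference, the loop-free alternative could look roughly like the sketch below. Again, this is only an illustration under assumptions: `MarkMatcher`, `AssertMark`, `begin_assert` and `push_frame` are hypothetical names, and frames are reduced to their assumed contribution to the stack counter. The matcher records the frame position and the counter value when an assertion starts, so it can truncate the stack in one step when the inner pattern matches.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bookkeeping recorded at the start of each lookahead assertion.
struct AssertMark {
    std::size_t frame_index;    // number of frames before the assertion frame was pushed
    std::size_t counter_before; // stack counter value at that moment
    bool negated;               // true for a negative lookahead
};

struct MarkMatcher {
    std::vector<std::size_t> frames; // each entry: assumed stack-counter cost of one frame
    std::vector<AssertMark> marks;
    std::size_t stack_counter = 0;

    void push_frame(std::size_t cost) {
        frames.push_back(cost);
        stack_counter += cost;
    }

    void begin_assert(bool negated) {
        marks.push_back({frames.size(), stack_counter, negated});
        push_frame(1); // stands in for the assertion's own stack frame
    }

    // Called when the pattern inside the assertion matched: no loop over frames.
    bool end_assert() {
        assert(!marks.empty());
        const AssertMark mark = marks.back();
        marks.pop_back();
        frames.erase(frames.begin() + static_cast<std::ptrdiff_t>(mark.frame_index), frames.end());
        stack_counter = mark.counter_before;
        return !mark.negated;
    }
};
```

The cost of this variant is the extra bookkeeping: the marks also have to be kept consistent on the path where the inner pattern fails and the frames are unwound normally, which this sketch does not show.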

@muellerj2 muellerj2 requested a review from a team as a code owner November 9, 2025 00:10
@github-project-automation github-project-automation bot moved this to Initial Review in STL Code Reviews Nov 9, 2025
@StephanTLavavej StephanTLavavej added the enhancement (Something can be improved) and regex (meow is a substring of homeowner) labels Nov 9, 2025
@StephanTLavavej StephanTLavavej self-assigned this Nov 9, 2025
@StephanTLavavej
Member

because no stack unwinding opcode translates does any other work when a pattern matched in ECMAScript mode

I can't quite parse this - was "translates" a spurious word introduced during editing?

@StephanTLavavej StephanTLavavej removed their assignment Nov 10, 2025
@StephanTLavavej StephanTLavavej moved this from Initial Review to Ready To Merge in STL Code Reviews Nov 10, 2025
@StephanTLavavej StephanTLavavej moved this from Ready To Merge to Merging in STL Code Reviews Nov 11, 2025
@StephanTLavavej
Member

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej StephanTLavavej merged commit d806de4 into microsoft:main Nov 12, 2025
41 checks passed
@github-project-automation github-project-automation bot moved this from Merging to Done in STL Code Reviews Nov 12, 2025
@StephanTLavavej
Member

💚 😻 🎉
