This repository has been archived by the owner on Feb 16, 2024. It is now read-only.

Summarize outcome of match order discussion w.r.t. same-length strings #58

Merged 1 commit on Mar 31, 2022

2 changes: 2 additions & 0 deletions README.md
@@ -262,6 +262,8 @@ Matching the longest strings first is key to the integration with properties of

For more details on the rationale for matching longest strings first, see [issue #25](https://github.com/tc39/proposal-regexp-set-notation/issues/25).

A character class may contain multiple strings of the same length: e.g. `[xyz]` contains three strings consisting of a single character, and `[\q{xx|yy|zz}]` (using the new string literal syntax) contains three strings consisting of two characters. There is no inherent or observable match order for those same-length strings. The committee [discussed](https://github.com/tc39/proposal-regexp-set-notation/issues/55) and decided that character classes are mathematical sets with no inherent order. Similar to how there is no observable match order difference between `[xyz]` and `[zyx]`, there is no match order difference between `[\q{xx|yy|zz}]` and `[\q{zz|yy|xx}]`. This nuance enables implementers to use sets (i.e. implementations of mathematical sets) and tries (retrieval trees) for runtime optimizations.
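
The reason same-length order cannot be observed is that two distinct strings of the same length can never both match at the same input position. As a rough illustration only, the sketch below emulates the longest-first behavior with an ordinary (non-`v`) regular expression. The helpers `desugarClassStrings` and `firstMatch` are hypothetical names introduced for this example; they are not part of the proposal and are not how an engine has to implement it.

```javascript
// Hypothetical helper (not part of the proposal): turn a set of strings into
// a plain disjunction that tries longer strings before shorter ones.
function desugarClassStrings(strings) {
  // Escape regex metacharacters so each string is matched literally.
  const escape = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // A class is a set, so drop duplicates, then sort longest first.
  // The relative order of same-length strings is whatever the sort leaves,
  // because it cannot change which string matches at a given position.
  const ordered = [...new Set(strings)].sort((a, b) => b.length - a.length);
  return `(?:${ordered.map(escape).join("|")})`;
}

// Hypothetical helper: text of the first match in `input`, or null.
function firstMatch(strings, input) {
  const m = new RegExp(desugarClassStrings(strings)).exec(input);
  return m ? m[0] : null;
}

// Same set written in two orders, as in [\q{xx|yy|zz}] vs [\q{zz|yy|xx}].
const inputs = ["xx", "ayyb", "zzxx", "xyz", ""];
const sameResults = inputs.every(
  (s) => firstMatch(["xx", "yy", "zz"], s) === firstMatch(["zz", "yy", "xx"], s)
);
console.log(sameResults); // true
```

Length is the only ordering that matters here: `firstMatch(["a", "ab"], "ab")` returns `"ab"` regardless of the order in which the two strings are listed.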

### Are properties of strings eager / atomic?

No. As shown in the previous FAQ entry, `\p{PropertyOfStrings}` desugars into a plain disjunction, rather than an [atomic group](https://www.regular-expressions.info/atomic.html) containing a disjunction. We believe this behavior is the most future-proof, for the following reasons.