Implement workarounds for regex parsing known issues #1603
Nice, thanks for helping out here! It seems that the following test cases can also be removed from the Test262Harness.settings.json
exclude list now, as they pass after the fixes:
"built-ins/RegExp/match-indices/indices-array-unicode-property-names.js",
"built-ins/RegExp/named-groups/non-unicode-property-names.js",
"built-ins/RegExp/named-groups/unicode-property-names.js",
"built-ins/RegExp/prototype/Symbol.replace/named-groups.js",
PR updated. There's a failing macOS test, but it doesn't seem to have anything to do with my changes:
Is this really something unrelated?
Unrelated. The engine is not thread-safe, and for some reason that test case is racy on macOS.
Looks good. I guess we can merge?
Yep, I think this one is ready. However, I just discovered another issue:
It seems that capturing group numbering is messed up in .NET, or at least it doesn't work the same way as in JS (which assigns strictly increasing numbers to capturing groups, regardless of whether a group is named or not). This is a pretty sad situation and I'm not sure we can do anything about it. I'll try to find a workaround, but will do that in a separate PR (if one exists at all - but that will probably need to be fixed in Esprima...)
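To make the mismatch concrete, here is a minimal sketch that lists the capturing groups of a pattern and numbers them both ways. It assumes the default .NET behavior (no `RegexOptions.ExplicitCapture`, no numerically named groups), where unnamed groups are numbered first and named groups afterwards. `numberGroups` and `Group` are hypothetical names used only for this illustration; the scanner is deliberately simplified and is not the parser used by Esprima or Jint.

```typescript
interface Group {
  name: string | null; // null for an unnamed capturing group
  js: number;          // number assigned by JS (source order)
  dotnet: number;      // number assumed for .NET (unnamed first, then named)
}

// Hypothetical helper: lists capturing groups in source order.
function numberGroups(pattern: string): Group[] {
  const groups: Group[] = [];
  let inClass = false;
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === "\\") { i++; continue; }                  // skip the escaped character
    if (inClass) { if (ch === "]") inClass = false; continue; }
    if (ch === "[") { inClass = true; continue; }
    if (ch !== "(") continue;
    if (pattern.charAt(i + 1) !== "?") {
      groups.push({ name: null, js: 0, dotnet: 0 });     // plain "(...)"
    } else {
      // "(?<name>...)" is a named group; "(?:", "(?=", "(?<=" etc. do not capture.
      const m = /^\(\?<([A-Za-z_$][\w$]*)>/.exec(pattern.slice(i));
      if (m) groups.push({ name: m[1], js: 0, dotnet: 0 });
    }
  }
  // JS: strictly increasing, named or not.
  groups.forEach((g, idx) => { g.js = idx + 1; });
  // .NET (assumed default): unnamed groups first, then named groups.
  let n = 0;
  for (const g of groups) if (g.name === null) g.dotnet = ++n;
  for (const g of groups) if (g.name !== null) g.dotnet = ++n;
  return groups;
}

// "year" is group 1 in JS but group 2 under the assumed .NET rule.
console.log(numberGroups("(?<year>\\d{4})-(\\d{2})"));
```

Under these assumptions, a numbered backreference such as `\1` would point at different groups in the two engines whenever a named group precedes an unnamed one.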
It's almost like that
For some reason I can neither update nor rebase this branch via the GitHub UI. Could you please do the rebase locally and force push?
Thanks once again!
Glad to help! I did some investigation on capturing group numbering. According to MSDN, this "anomaly" is by design:
Crazy design decision. I never thought I would say this, but in this regard JS is the sane one... If only MS had at least got the ECMAScript compatibility mode right... But that ship has sailed for good. I can see only one viable workaround: in the regex parser, we rewrite all named capturing groups to numbered ones (and, of course, named backrefs as well), like
I think we can't get away without addressing this issue, because it's a big one. Without fixing this, we can't even call legacy regex handling okayish... So it looks like we're in for another RC. I'll get back to you soon.
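The rewrite idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation that ended up in Esprima: `rewriteNamedGroups` and `Rewritten` are hypothetical names, the scanner ignores many edge cases (duplicate names, numerically named groups, unicode escapes in names), and it treats every `\k<name>` with a known name as a named backreference.

```typescript
interface Rewritten {
  pattern: string;                 // pattern with only numbered groups
  names: Record<string, number>;   // group name -> JS-style group number
}

// Hypothetical helper: turns "(?<name>...)" into "(...)" and "\k<name>" into "(?:\N)".
function rewriteNamedGroups(pattern: string): Rewritten {
  const names: Record<string, number> = Object.create(null);
  const nameRe = /^<([A-Za-z_$][\w$]*)>/;

  // One scanner, run twice: pass 1 only numbers the groups (so that
  // backreferences to later groups resolve), pass 2 emits the output.
  const scan = (emit: boolean): string => {
    let out = "";
    let count = 0;
    let inClass = false;
    for (let i = 0; i < pattern.length; i++) {
      const ch = pattern.charAt(i);
      if (ch === "\\") {
        const m = !inClass && pattern.charAt(i + 1) === "k"
          ? nameRe.exec(pattern.slice(i + 2))
          : null;
        if (m && emit && names[m[1]] !== undefined) {
          // Wrap in (?:...) so a following digit cannot extend the number.
          out += "(?:\\" + names[m[1]] + ")";
          i += 1 + m[0].length;
        } else {
          out += pattern.slice(i, i + 2);   // copy the escape unchanged
          i++;
        }
        continue;
      }
      if (inClass) { if (ch === "]") inClass = false; out += ch; continue; }
      if (ch === "[") { inClass = true; out += ch; continue; }
      if (ch === "(") {
        if (pattern.charAt(i + 1) !== "?") {
          count++;                          // unnamed capturing group
        } else {
          const m = nameRe.exec(pattern.slice(i + 2));
          if (m) {
            names[m[1]] = ++count;          // named group gets the next JS number
            out += "(";
            i += 1 + m[0].length;           // skip "?<name>"
            continue;
          }
        }
      }
      out += ch;
    }
    return out;
  };

  scan(false);
  return { pattern: scan(true), names };
}

const rewritten = rewriteNamedGroups("(a)(?<n>b)(c)\\k<n>");
console.log(rewritten.pattern); // (a)(b)(c)(?:\2)
```

Because the rewritten pattern contains only unnamed groups, the assumed .NET numbering and the JS numbering coincide, and the returned `names` map can be used to rebuild the `groups` object on the JS side.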
Implements workarounds for regex parsing known issues.
Also fixes a bug in unicode mode regex handling (code point -> code unit index translation is not needed, as .NET regexes return code unit indices).
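As a small illustration of why no translation is needed: JS reports match positions in UTF-16 code units even in unicode mode, which is the same unit .NET strings are indexed in. The sketch below only demonstrates the JS side; `codePointsBefore` is a hypothetical helper written for this example.

```typescript
// U+1F600 is one code point but two UTF-16 code units, followed by "a".
const text = "\uD83D\uDE00a";

// Even with the "u" flag, the reported index counts code units.
const match = new RegExp("a", "u").exec(text);
const codeUnitIndex = match ? match.index : -1;

// Hypothetical helper: number of code points before a given code unit index.
function codePointsBefore(s: string, unitIndex: number): number {
  let n = 0;
  for (let i = 0; i < unitIndex; i++) {
    const c = s.charCodeAt(i);
    // A high surrogate followed by a low surrogate forms a single code point.
    if (c >= 0xd800 && c <= 0xdbff && i + 1 < s.length) {
      const d = s.charCodeAt(i + 1);
      if (d >= 0xdc00 && d <= 0xdfff) i++;
    }
    n++;
  }
  return n;
}

console.log(codeUnitIndex);                         // 2 (code units)
console.log(codePointsBefore(text, codeUnitIndex)); // 1 (code points)
```

Treating an index that is already in code units as a code point index and translating it again would shift positions after any astral character.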