Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OS#14763260: Correctly update RegExp.$1 after RegExp.prototype.test matches and used a cached value #4607

Merged
merged 4 commits into from
Jan 30, 2018

Conversation

boingoing
Copy link
Contributor

Regression from #3802 - if we hit the cache, we don't update the Regex constructor object with the last match and so retrieving properties about the match from the regex constructor object fails.

Fixed by adding an invalidation mechanism to the JavascriptRegExpConstructor. If we hit the cache, we'll mark the Regex constructor object last match properties as invalidated and we will then compute them on-demand the first time they're needed.

Fixes:
https://microsoft.visualstudio.com/web/wi.aspx?id=14763260

…atches and used a cached value

Regression from chakra-core#3802 - if we hit the cache, we don't update the Regex constructor object with the last match and so retrieving properties about the match from the regex constructor object fails.

Fixed by adding an invalidation mechanism to the `JavascriptRegExpConstructor`. If we hit the cache, we'll mark the Regex constructor object last match properties as invalidated and we will then compute them on-demand the first time they're needed.

Fixes:
https://microsoft.visualstudio.com/web/wi.aspx?id=14763260
ScriptContext* scriptContext = this->GetScriptContext();
const CharCount lastInputLen = lastInput->GetLength();
const wchar_t* lastInputStr = lastInput->GetString();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use WCHAR instead. It refers to char16 on xplat. (why? wchar_t is 4 bytes on some platforms while 2 bytes on others)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced it with char16 since all the APIs here are using char16. Seems this must be fine for xplat.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is also fine! Thanks

Copy link
Contributor

@dilijev dilijev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM but @akroshg should have a look too.

{
AssertMsg(lastPattern != nullptr, "lastPattern should not be null");
AssertMsg(lastInput != nullptr, "lastInput should not be null");
AssertMsg(JavascriptOperators::GetTypeId(lastInput) != TypeIds_Null, "lastInput should not be JavaScript null");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these particular checks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We check similar things in JavascriptRegExpConstructor::SetLastMatch which is close to what we're doing here. Just kept it consistent.

this->lastPattern = lastPattern;
this->lastInput = lastInput;
this->lastMatch.Reset();
this->reset = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "reset" in the two lines above have the same meaning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. We want to reset the last match to undefined and let the regex constructor know it has to reset it's cache via JavascriptRegExpConstructor::EnsureValues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between reset and invalidateLastMatch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing flag reset is used to indicate we should reset the cache backed-by JavascriptRegExpConstructor::EnsureValues. I added invalidateLastMatch which is used to indicate that there is no value stored in the last match field. This means we need to first calculate the last match before we use it. Ordinarily we assert that the last match field in JavascriptRegExpConstructor::EnsureValues is not undefined.

Copy link
Contributor

@dilijev dilijev Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reset flag is used in the logic for materializing all of the RegExp['$blah'] properties as actual JavascriptStrings (when needed) rather than input + offset which is faster to compute, and the logic ensures this only happens once. I'm not sure how the term "reset" applies to this logic but this code is apparently very old.

@dilijev dilijev requested a review from akroshg January 27, 2018 03:23
r2.test(s2);
r1.test(s1);

assert.areEqual("abc", RegExp.$1, "Stale last match should be invalidated by second r1.test(s1)");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there other side-effects that regex matching can have? It might be good to add a few more tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll add a few more tests.


void EnsureValues();

Field(UnifiedRegex::RegexPattern*) lastPattern;
Field(JavascriptString*) lastInput;
Field(bool) invalidatedLastMatch; // true if last match must be recalculated before use
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move it down with other bool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point

// - There was a match (test returned true)
if (invalidatedLastMatch)
{
this->lastMatch = RegexHelper::SimpleMatch(scriptContext, pattern, lastInputStr, lastInputLen, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SimpleMatch [](start = 47, length = 11)

could this function be re-entrant and change lastInput by any chance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope not. We don't pass any Var here, just the const char16* so I don't think user can apply getter/setters or use Symbol overrides. Seems as safe as the other operations here.


r1.test(s1);

assert.areEqual("abc", RegExp.input, "RegExp.input property caclculated correctly");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: spelling of calculated in this file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good catch

@chakrabot chakrabot merged commit b98be0e into chakra-core:release/1.8 Jan 30, 2018
chakrabot pushed a commit that referenced this pull request Jan 30, 2018
…r RegExp.prototype.test matches and used a cached value

Merge pull request #4607 from boingoing:FixRegExpTestCache

Regression from #3802 - if we hit the cache, we don't update the Regex constructor object with the last match and so retrieving properties about the match from the regex constructor object fails.

Fixed by adding an invalidation mechanism to the `JavascriptRegExpConstructor`. If we hit the cache, we'll mark the Regex constructor object last match properties as invalidated and we will then compute them on-demand the first time they're needed.

Fixes:
https://microsoft.visualstudio.com/web/wi.aspx?id=14763260
@boingoing
Copy link
Contributor Author

Thanks, everyone

chakrabot pushed a commit that referenced this pull request Jan 30, 2018
…xp.$1 after RegExp.prototype.test matches and used a cached value

Merge pull request #4607 from boingoing:FixRegExpTestCache

Regression from #3802 - if we hit the cache, we don't update the Regex constructor object with the last match and so retrieving properties about the match from the regex constructor object fails.

Fixed by adding an invalidation mechanism to the `JavascriptRegExpConstructor`. If we hit the cache, we'll mark the Regex constructor object last match properties as invalidated and we will then compute them on-demand the first time they're needed.

Fixes:
https://microsoft.visualstudio.com/web/wi.aspx?id=14763260
chakrabot pushed a commit that referenced this pull request Jan 30, 2018
…y update RegExp.$1 after RegExp.prototype.test matches and used a cached value

Merge pull request #4607 from boingoing:FixRegExpTestCache

Regression from #3802 - if we hit the cache, we don't update the Regex constructor object with the last match and so retrieving properties about the match from the regex constructor object fails.

Fixed by adding an invalidation mechanism to the `JavascriptRegExpConstructor`. If we hit the cache, we'll mark the Regex constructor object last match properties as invalidated and we will then compute them on-demand the first time they're needed.

Fixes:
https://microsoft.visualstudio.com/web/wi.aspx?id=14763260
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants