forked from dotnet/runtime
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Streamline Regex path to matching, and improve Replace/Split (dotnet#…
…1950)

* Add ThrowHelper, and clean up some style

Trying to streamline main path to the engine, ensuring helpers can be inlined, reducing boilerplate, etc. And as long as ThrowHelper was being used in some places, used it in others where it didn't require adding additional methods. Also cleaned style along the way.

* Streamline the Scan loop

The main costs remaining are the virtual calls to FindFirstChar/Go.

* Enumerate matches with a reusable match object in Regex.Replace/Split

The cost of re-entering the scan implementation and creating a new Match object for each NextMatch is measurable, but in these cases, we can use an iterator to quickly get back to where we were and reuse the match object. It adds a couple interface calls per iteration, as well as an allocation for the enumerator, but it saves more than that in the common case.

* Use SegmentStringBuilder instead of ValueStringBuilder in Replace

A previous .NET Core release saw StringBuilder in Regex.Replace replaced by ValueStringBuilder. This was done to avoid the allocations from the StringBuilder. However, in some ways, for large input strings, it made things worse. StringBuilder is implemented as a linked list of builders, whereas ValueStringBuilder is contiguous memory taken from the ArrayPool. For large input strings, we start requesting buffers too large for the ArrayPool, and thus when we grow we generate large array allocations that become garbage. We're better off using a simple ArrayPool-backed struct to store the segments that need to be concatenated, and then just creating the final string from those segments. For the common case where there are many fewer replacements than the length of the string, this saves a lot of memory as well as avoiding a layer of copying.

* Replace previously added enumerator with a callback mechanism

The delegate invocation per match is faster than the two interface calls per match, plus we can avoid allocating the enumerator and just pass the relevant state through by ref.

* Remove unnecessary fields from MatchCollection

* More exception streamlining and style cleanup

* Address PR feedback, and fix merge with latest changes
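The segment-based approach described for Replace can be sketched roughly as follows. This is a minimal illustration, not the commit's code: `SegmentList` and `SegmentReplace` are hypothetical names, the real `SegmentStringBuilder` API is not shown in the message, the replacement is treated as literal text (no `$1` substitutions), and left-to-right matching is assumed.

```csharp
using System;
using System.Buffers;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the commit's SegmentStringBuilder.
// It records slices to concatenate instead of copying characters as it goes.
public struct SegmentList : IDisposable
{
    private ReadOnlyMemory<char>[] _segments;
    private int _count;
    private int _totalLength;

    public static SegmentList Create() => new SegmentList
    {
        _segments = ArrayPool<ReadOnlyMemory<char>>.Shared.Rent(16)
    };

    public void Add(ReadOnlyMemory<char> segment)
    {
        if (segment.IsEmpty) return;
        if (_count == _segments.Length)
        {
            // Only the small array of segment descriptors grows, never a buffer of characters.
            var bigger = ArrayPool<ReadOnlyMemory<char>>.Shared.Rent(_segments.Length * 2);
            Array.Copy(_segments, bigger, _count);
            ArrayPool<ReadOnlyMemory<char>>.Shared.Return(_segments, clearArray: true);
            _segments = bigger;
        }
        _segments[_count++] = segment;
        _totalLength += segment.Length;
    }

    // Allocates the final string once and copies each segment into it exactly once.
    public override string ToString() =>
        string.Create(_totalLength, (Segments: _segments, Count: _count), (dest, state) =>
        {
            for (int i = 0; i < state.Count; i++)
            {
                ReadOnlySpan<char> src = state.Segments[i].Span;
                src.CopyTo(dest);
                dest = dest.Slice(src.Length);
            }
        });

    public void Dispose()
    {
        if (_segments != null)
        {
            ArrayPool<ReadOnlyMemory<char>>.Shared.Return(_segments, clearArray: true);
            _segments = null;
        }
    }
}

public static class SegmentReplace
{
    // Literal replacement only; assumes a left-to-right regex.
    public static string Replace(Regex regex, string input, string replacement)
    {
        var segments = SegmentList.Create();
        try
        {
            ReadOnlyMemory<char> text = input.AsMemory();
            ReadOnlyMemory<char> repl = replacement.AsMemory();
            int prev = 0;
            bool matched = false;
            for (Match m = regex.Match(input); m.Success; m = m.NextMatch())
            {
                matched = true;
                segments.Add(text.Slice(prev, m.Index - prev)); // unchanged text before the match
                segments.Add(repl);                             // the replacement itself
                prev = m.Index + m.Length;
            }
            if (!matched) return input;                         // nothing replaced: reuse the input
            segments.Add(text.Slice(prev));                     // unchanged tail
            return segments.ToString();
        }
        finally
        {
            segments.Dispose();
        }
    }
}
```

For example, `SegmentReplace.Replace(new Regex(@"\d+"), "a1b22c", "#")` yields `"a#b#c"`. The pooled array scales with the number of segments rather than with the length of the input, which is the property the message relies on for large strings with few replacements.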
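The callback mechanism from the list above might look something like the sketch below. All names here (`MatchCallbacks`, `MatchCallback<TState>`, `ForEachMatch`, `CountState`) are hypothetical, and the sketch drives the loop with the public `Match.NextMatch()` API, so it shows only the calling shape (one delegate invocation per match, state passed by ref, no enumerator allocation) and not the internal reuse of a single Match object that the commit describes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchCallbacks
{
    // Hypothetical delegate shape: state travels by ref; return false to stop early.
    public delegate bool MatchCallback<TState>(ref TState state, Match match);

    public static void ForEachMatch<TState>(
        Regex regex, string input, ref TState state, MatchCallback<TState> callback)
    {
        for (Match m = regex.Match(input); m.Success; m = m.NextMatch())
        {
            // One delegate invocation per match.
            if (!callback(ref state, m)) break;
        }
    }

    // Example state: a mutable struct, so nothing is allocated to carry it.
    public struct CountState
    {
        public int Count;
        public int Limit;
    }

    // Counts matches, stopping once the limit is reached.
    public static int CountUpTo(Regex regex, string input, int limit)
    {
        if (limit <= 0) return 0;
        var state = new CountState { Limit = limit };
        ForEachMatch(regex, input, ref state,
            (ref CountState s, Match m) => ++s.Count < s.Limit);
        return state.Count;
    }
}
```

Because the lambda captures nothing, the compiler can cache the delegate instance, so repeated calls to `CountUpTo` should not allocate a new delegate each time.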
1 parent e5444a1, commit c469f2d
Showing 25 changed files with 1,100 additions and 956 deletions.