-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace string creation with static strings for Regex options conversion to string #1504
base: main
Are you sure you want to change the base?
Replace string creation with static strings for Regex options conversion to string #1504
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@paulomorgado Thank you for this creative optimization PR! While the switch-based lookup achieves impressive performance improvements (20ns to 0.9ns), We have concerns about the long-term maintainability of this approach. After reviewing your implementation, we're proposing an alternative approach that maintains the performance benefits while offering better long-term maintainability.
Here's the proposed solution:
private static readonly string[] RegexOptionStrings = new[]
{
"", // 0000
"i", // 0001
"m", // 0010
"im", // 0011
"s", // 0100
"is", // 0101
"ms", // 0110
"ims", // 0111
"x", // 1000
"ix", // 1001
"mx", // 1010
"imx", // 1011
"sx", // 1100
"isx", // 1101
"msx", // 1110
"imsx" // 1111
};
public string GetRegexOptions(RegexOptions options)
{
int index = ((options & RegexOptions.IgnoreCase) != 0 ? 1 : 0) |
((options & RegexOptions.Multiline) != 0 ? 1 : 0) << 1 |
((options & RegexOptions.Singleline) != 0 ? 1 : 0) << 2 |
((options & RegexOptions.IgnorePatternWhitespace) != 0 ? 1 : 0) << 3;
return RegexOptionStrings[index];
}
This approach:
- Should maintain the near-zero performance overhead of your original PR
- Provides much easier maintainability
- Uses a clear, readable bit manipulation strategy
Would you be willing to adapt your PR to this implementation?
@adelinowona, I'll have a look into it as soon as I have the time. Why do you find what you are provides much easier maintainability? |
@paulomorgado To be fair our proposed approach doesn't necessarily provide better functional maintainability but it looks better visually. We will rather have that than the switch-based lookup approach. Visually, it transforms what could be a verbose switch or repeated conditionals into a concise, mathematical-looking operation. |
Sure. No problem. Can you create a review with a suggestion? |
@paulomorgado Isn't this comment requesting changes enough? |
3a03958
to
c22b878
Compare
Sorry for taking so long, @adelinowona. I was waiting for you to submit a suggestion through a review. 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM @paulomorgado, thanks again for your contribution! Happy New Year!
I'll merge your PR after some CI checks.
The
BsonRegularExpression(Regex)
constructor uses string concatenation to create the options string.If there's none only one option, it will result on a static interned string. Otherwise, it will create new strings. One less than the number of options.
Using a buffer to create the string is only better when there are 4 options.
However, using precomputed strings (because they aren't that much) beat every option in time and memory consumption.
Benchmarks
BenchmarkDotNet v0.13.8, Windows 11 (10.0.26100.2152)
13th Gen Intel Core i9-13900K, 1 CPU, 32 logical and 24 physical cores
.NET SDK 9.0.100-rc.2.24474.11
[Host] : .NET 8.0.10 (8.0.1024.46610), X64 RyuJIT AVX2