Microsoft.DotNet.ApiCompat failing with RegexMatchTimeoutException in #81543

Closed
jkotas opened this issue Feb 2, 2023 · 6 comments · Fixed by dotnet/sdk#30347
Labels: area-Infrastructure-libraries, blocking-clean-ci, Known Build Error

jkotas (Member) commented Feb 2, 2023

.packages/microsoft.dotnet.apicompat.task/8.0.100-alpha.1.22571.3/build/Microsoft.DotNet.ApiCompat.ValidateAssemblies.Common.targets(16,5): error MSB4018: (NETCORE_ENGINEERING_TELEMETRY=Build) The "Microsoft.DotNet.ApiCompat.Task.ValidateAssembliesTask" task failed unexpectedly.
System.Text.RegularExpressions.RegexMatchTimeoutException: The Regex engine has timed out while trying to match a pattern to an input string. This can occur for many reasons, including very large inputs or excessive backtracking caused by nested quantifiers, back-references and other factors.
   at System.Text.RegularExpressions.RegexRunner.<CheckTimeout>g__ThrowRegexTimeout|25_0()
   at Regex165_TryMatchAtCurrentPosition(RegexRunner, ReadOnlySpan`1)
   at Regex165_Scan(RegexRunner, ReadOnlySpan`1)
   at System.Text.RegularExpressions.Regex.RunAllMatchesWithCallback[TState](String inputString, ReadOnlySpan`1 inputSpan, Int32 startat, TState& state, MatchCallback`1 callback, RegexRunnerMode mode, Boolean reuseMatchObject)
   at System.Text.RegularExpressions.RegexReplacement.Replace(Regex regex, String input, Int32 count, Int32 startat)
   at Microsoft.DotNet.ApiCompat.RegexStringTransformer.Transform(String input) in /_/src/ApiCompat/Microsoft.DotNet.ApiCompat.Shared/RegexStringTransformer.cs:line 51

Build Information

Build: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=157087
Build error leg or test failing: Build / Libraries Build osx arm64 Debug / Restore and Build Product
Pull request: #81510

Error Message

Fill the error message using known issues guidance.

{
  "ErrorMessage": "Microsoft.DotNet.ApiCompat.RegexStringTransformer.Transform",
  "BuildRetry": true,
  "ErrorPattern": "",
  "ExcludeConsoleLog": false
}

Report

| Build | Definition | Step Name | Console log | Pull Request |
|---|---|---|---|---|
| 157151 | dotnet/runtime | Restore and Build Product | Log | #81535 |
| 157087 | dotnet/runtime | Restore and Build Product | Log | #81510 |

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
|---|---|---|
| 0 | 2 | 2 |

jkotas added the blocking-clean-ci and Known Build Error labels Feb 2, 2023
ghost added the untriaged label Feb 2, 2023

ghost commented Feb 2, 2023

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

ghost commented Feb 2, 2023

Tagging subscribers to this area: @dotnet/area-infrastructure-libraries
See info in area-owners.md if you want to be subscribed.

ViktorHofer (Member) commented

That's interesting. I would never have assumed that super simple patterns like the ones that APICompat uses could result in a timeout exception. The timeout is hardcoded to two seconds: https://github.com/dotnet/sdk/blob/80c0e6412ce0b5d56298c020e500f5a267fab1a1/src/ApiCompat/Microsoft.DotNet.ApiCompat.Shared/RegexStringTransformer.cs#L16

The regex is just a replace that transforms an assembly path to a relative path with the help of capture groups:

<!-- Transform the API Compat assemblies passed in to log-able strings. -->
<ApiCompatLeftAssembliesTransformationPattern Include="$(_ApiCompatCaptureGroupPattern)" ReplacementString="ref/$1/$2" />
<ApiCompatLeftAssembliesTransformationPattern Include="$(_ApiCompatRuntimePrefixPattern)" ReplacementString="runtimes/$3/$1/$2/$4" />
<ApiCompatLeftAssembliesTransformationPattern Include="runtimes/windows/" ReplacementString="runtimes/win/" />
<ApiCompatRightAssembliesTransformationPattern Include="$(_ApiCompatCaptureGroupPattern)" ReplacementString="$(_ApiCompatLibReplacementString)" />
<ApiCompatRightAssembliesTransformationPattern Include="$(_ApiCompatRuntimePrefixPattern)" ReplacementString="runtimes/$3/$1/$2/$4" />
<ApiCompatRightAssembliesTransformationPattern Include="runtimes/windows/" ReplacementString="runtimes/win/" />

I see that this happened on an osx-arm64 machine. Maybe that machine was overloaded and hence the operation timed out? We can definitely increase the timeout to 5 seconds, but the overall APICompat tool invocation shouldn't take that long.
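To make the transformation described above concrete, here is a minimal sketch of a capture-group replace with a two-second timeout. It is not the tool's actual code: the class name, the sample path, and the assumption that the pattern expands with a backslash as the directory separator (as it would on Windows) are all illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathTransformSketch
{
    // Assumption: the capture-group pattern quoted later in this thread,
    // expanded for a backslash directory separator.
    private const string CaptureGroupPattern = @".+\\(.+)\\(.+)";

    public static string Transform(string input)
    {
        // Two-second match timeout, mirroring the value mentioned above.
        var regex = new Regex(CaptureGroupPattern, RegexOptions.None, TimeSpan.FromSeconds(2));

        // $1 and $2 refer to the last two path segments captured by the pattern.
        return regex.Replace(input, "ref/$1/$2");
    }

    public static void Main()
    {
        // Hypothetical assembly path, for illustration only.
        Console.WriteLine(Transform(@"C:\repo\artifacts\bin\net8.0\System.Foo.dll"));
        // prints "ref/net8.0/System.Foo.dll"
    }
}
```

If the match exceeds the timeout, `Replace` throws `RegexMatchTimeoutException`, which is the failure shown in the stack trace at the top of this issue.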

jkotas (Member, Author) commented Feb 2, 2023

> Maybe that machine was overloaded and hence the operation timed out?

Probably.

The API compat tool is not running in a security-sensitive environment. I think it would be fine to run the regex without any timeout in the API compat tool. We do not have similar timeouts in the other tools and compilers either. For example, you can make Roslyn take forever on bad input. We do not protect against that.

smasher164 (Member) commented

I imagine the backtracking here is just referring to NFA backtracking, so removing the timeout shouldn't result in pathological behavior or anything.

stephentoub (Member) commented

> I would never have assumed that super simple patterns like the ones that APICompat uses could result in a timeout exception.

Are these the patterns it's using?

<_ApiCompatCaptureGroupPattern>.+%5C$([System.IO.Path]::DirectorySeparatorChar)(.+)%5C$([System.IO.Path]::DirectorySeparatorChar)(.+)</_ApiCompatCaptureGroupPattern>
<_ApiCompatRuntimePrefixPattern>(.+)/(net%5Cd.%5Cd)-(.+)/(.+)</_ApiCompatRuntimePrefixPattern>

While they might be "super simple", they also contain multiple loops written in a way where they can't be automatically made atomic by the implementation and thus potentially have a non-trivial amount of backtracking and could be e.g. O(N^3) in the length of the input. Some simple changes would likely eliminate that backtracking. For example, the .+%5C part of the pattern says to greedily match anything other than a newline and then match a backslash. That's going to end up consuming everything until a newline or the end of the input, and then back off until it finds a slash, try the rest of the comparison there, then back off until the next slash, try the rest of the comparison there, etc. If you instead wrote it as [^\n%5C]+%5C (or whatever syntax would be needed in the file), replacing the . with a negated character class that doesn't include backslash, then the optimizer will be able to see that nothing the loop could match could give back a backslash, which will allow it to eliminate backtracking caused by that loop.
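As a rough illustration of the negated-character-class idea, the sketch below compares the original capture-group pattern with one possible rewrite. The rewrite is hypothetical and is not the change that was made in the tool. It assumes a backslash separator and a made-up sample path. Note that simply swapping the leading `.+` for `[^\n\\]+` in an unanchored pattern would change which segments are captured, so this version anchors the pattern and consumes the leading directories segment by segment.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NegatedClassSketch
{
    // The capture-group pattern, expanded for a backslash separator (assumption).
    public const string Original = @".+\\(.+)\\(.+)";

    // Hypothetical rewrite: every character-class loop excludes the backslash
    // (or end of input) that follows it, so the loop cannot give a backslash back.
    public const string Rewritten = @"^(?:[^\n\\]*\\)*([^\n\\]+)\\([^\n\\]+)$";

    public static string Apply(string pattern, string input) =>
        Regex.Replace(input, pattern, "ref/$1/$2");

    public static void Main()
    {
        // Hypothetical assembly path, for illustration only.
        string path = @"C:\repo\artifacts\bin\net8.0\System.Foo.dll";

        Console.WriteLine(Apply(Original, path));
        Console.WriteLine(Apply(Rewritten, path));
    }
}
```

On this sample input both patterns yield the same replacement. The two are not equivalent for every input (for example, paths with consecutive separators), so a real change would need to be checked against the paths the tool actually sees.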

You could also change the tool to specify the RegexOptions.NonBacktracking option introduced in .NET 7. That guarantees linear time in the length of the input. You're restricted from doing certain advanced things in the pattern, but these patterns don't use those things (e.g. backreferences).

And as Jan says, regardless you can raise or entirely remove the timeout (which will also be much less relevant with NonBacktracking).
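The two options above could be combined roughly as follows. This is a sketch only, assuming a project that targets .NET 7 or later and a backslash separator; the class name and sample path are made up, and it is not the tool's implementation.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NonBacktrackingSketch
{
    public static string Transform(string input)
    {
        // RegexOptions.NonBacktracking requires .NET 7 or later.
        // Regex.InfiniteMatchTimeout disables the match timeout entirely.
        var regex = new Regex(
            @".+\\(.+)\\(.+)",
            RegexOptions.NonBacktracking,
            Regex.InfiniteMatchTimeout);

        return regex.Replace(input, "ref/$1/$2");
    }

    public static void Main()
    {
        // Hypothetical assembly path, for illustration only.
        Console.WriteLine(Transform(@"C:\repo\artifacts\bin\net8.0\System.Foo.dll"));
    }
}
```

The pattern itself is unchanged here because it uses only constructs that the non-backtracking engine accepts (no backreferences or lookarounds).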

ViktorHofer self-assigned this Feb 2, 2023
ViktorHofer added a commit to dotnet/sdk that referenced this issue Feb 5, 2023
Fixes dotnet/runtime#81543

As discussed in dotnet/runtime, a regex timeout isn't necessary as the tool currently doesn't run in a security sensitive environment in our core stack repositories and customers aren't expected to be using this functionality in such an environment either.

When ApiCompat upgrades and targets .NET 7+, we should also leverage the `RegexOptions.NonBacktracking` mode.
ghost removed the untriaged label Feb 5, 2023
ghost locked as resolved and limited conversation to collaborators Mar 8, 2023