Skip to content

<regex>: Nonlinear slowdown with increasing string length #5452

@StephanTLavavej

Description

@StephanTLavavej

Here, this regex is searching for UUIDs optionally wrapped in braces with \{? and \}? (no attempt is made to ensure pairing). There are no UUIDs in the remaining string (2.5 million characters in the original user repro DevCom-10895059, internal VSO-2458266 / AB#2458266 ). The optional braces cause matching to nonlinearly slow down with increasing length of the non-matching string.

Repro with VS 2022 17.14 Preview 5 x64 and microsoft/STL main as of 8e9f4ee.

C:\Temp>type meow.cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <regex>
using namespace std;
using namespace std::chrono;

int main() {
#ifdef AVOID_OUTER_BRACES
    const regex UuidRegEx{R"([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})"};
#else
    const regex UuidRegEx{R"(\{?[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\}?)"};
#endif

    for (size_t n = 1'000; n < 3'000'000; n *= 2) {
        const string allContent(n, 'x');
        const auto start  = steady_clock::now();
        const bool result = regex_search(allContent, UuidRegEx);
        const auto finish = steady_clock::now();
        const auto ms     = duration_cast<milliseconds>(finish - start).count();
        assert(!result);
        cout << "n=" << n << "; ms=" << ms << "; speed n/ms=";
        if (ms == 0) {
            cout << "instant\n";
        } else {
            cout << n * 1.0 / ms << "\n";
        }
    }
}
C:\Temp>cl /EHsc /nologo /W4 /std:c++latest /MT /O2 meow.cpp && meow
meow.cpp
n=1000; ms=1; speed n/ms=1000
n=2000; ms=8; speed n/ms=250
n=4000; ms=28; speed n/ms=142.857
n=8000; ms=107; speed n/ms=74.7664
n=16000; ms=415; speed n/ms=38.5542
n=32000; ms=1629; speed n/ms=19.644
n=64000; ms=6543; speed n/ms=9.78145
n=128000; ms=25819; speed n/ms=4.95759
^C
C:\Temp>cl /EHsc /nologo /W4 /std:c++latest /MT /O2 /DAVOID_OUTER_BRACES meow.cpp && meow
meow.cpp
n=1000; ms=0; speed n/ms=instant
n=2000; ms=0; speed n/ms=instant
n=4000; ms=1; speed n/ms=4000
n=8000; ms=1; speed n/ms=8000
n=16000; ms=3; speed n/ms=5333.33
n=32000; ms=7; speed n/ms=4571.43
n=64000; ms=14; speed n/ms=4571.43
n=128000; ms=30; speed n/ms=4266.67
n=256000; ms=60; speed n/ms=4266.67
n=512000; ms=122; speed n/ms=4196.72
n=1024000; ms=244; speed n/ms=4196.72
n=2048000; ms=472; speed n/ms=4338.98

Metadata

Metadata

Assignees

No one assigned

    Labels

    fixedSomething works now, yay!performanceMust go fasterregexmeow is a substring of homeowner

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions