<regex>: std::regex uses a nonstandard regex constant "error_syntax" #438

Open
BillyONeal opened this issue Jan 22, 2020 · 2 comments
Labels: bug (Something isn't working), LWG issue needed (A wording defect that should be submitted to LWG as a new issue)

Comments

@BillyONeal
Member

BillyONeal commented Jan 22, 2020

Describe the bug
The standard defines a set of std::regex_constants values for regex errors, but it doesn't define a value for every possible kind of error; for example, an otherwise empty pattern that contains only a lookbehind, such as (?<=abc). We probably need to get the missing constant added to the standard somehow.

Boost uses error_bad_pattern for this case.

Command-line test case
STL version (git commit or Visual Studio version): Visual Studio 2019 version 16.4

C:\Users\bion\Desktop>type test.cpp
#include <regex>
#include <stdio.h>

int main() {
  try {
    std::regex r(R"((?<=abc))");
    puts("Parsing regex succeeded.");
  } catch (const std::regex_error &r) {
    switch (r.code()) {
    case std::regex_constants::error_collate:
    case std::regex_constants::error_ctype:
    case std::regex_constants::error_escape:
    case std::regex_constants::error_backref:
    case std::regex_constants::error_brack:
    case std::regex_constants::error_paren:
    case std::regex_constants::error_brace:
    case std::regex_constants::error_badbrace:
    case std::regex_constants::error_range:
    case std::regex_constants::error_space:
    case std::regex_constants::error_badrepeat:
    case std::regex_constants::error_complexity:
    case std::regex_constants::error_stack:
      puts("Got a standard regex error code.");
      break;
    default:
      puts("Got a nonstandard regex error code.");
      break;
    }

    puts(r.what());
  }
}

C:\Users\bion\Desktop>cl /EHsc /W4 /WX /std:c++latest /nologo .\test.cpp
test.cpp

C:\Users\bion\Desktop>.\test.exe
Got a nonstandard regex error code.
regex_error(error_syntax)

C:\Users\bion\Desktop>

Expected behavior
The standard should describe every possible value regex_error::code() can return.

This is a dual of Microsoft-internal VSO-173840 / AB#173840.

vNext note: Resolving this issue will require breaking binary compatibility. We won't be able to accept pull requests for this issue until the vNext branch is available. See #169 for more information.

BillyONeal added the labels bug (Something isn't working), vNext (Breaks binary compatibility), LWG issue needed (A wording defect that should be submitted to LWG as a new issue), and decision needed (We need to choose something before working on this) on Jan 22, 2020
StephanTLavavej removed the label decision needed (We need to choose something before working on this) on Jan 18, 2023
@StephanTLavavej
Member

We checked libstdc++ and libc++'s behavior, and it's as Tolkien said: "go not to the STLs for advice, for they will tell you error_syntax and error_paren and error_badrepeat" 🧝 : https://godbolt.org/z/nvPz5rnaf

There should definitely be an LWG issue for this, ideally saying exactly which error code should be thrown (or [re] should be dropped entirely 😼 🦄 😹).
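For comparing implementations locally, a minimal sketch along these lines may help. It compiles a pattern with the default ECMAScript grammar and, if construction throws, hands back the code so it can be named or printed as a raw number. The names `ProbeResult`, `probe`, and `standard_name` are hypothetical and not part of any library; only the thirteen constants listed in the test case above are treated as standard.

```cpp
#include <regex>
#include <string>
#include <utility>

namespace rc = std::regex_constants;

// Hypothetical helper: returns the name of a standard error_type constant,
// or an empty string when the value is not one of the thirteen standard ones.
inline std::string standard_name(rc::error_type code) {
    static const std::pair<rc::error_type, const char*> table[] = {
        {rc::error_collate, "error_collate"},
        {rc::error_ctype, "error_ctype"},
        {rc::error_escape, "error_escape"},
        {rc::error_backref, "error_backref"},
        {rc::error_brack, "error_brack"},
        {rc::error_paren, "error_paren"},
        {rc::error_brace, "error_brace"},
        {rc::error_badbrace, "error_badbrace"},
        {rc::error_range, "error_range"},
        {rc::error_space, "error_space"},
        {rc::error_badrepeat, "error_badrepeat"},
        {rc::error_complexity, "error_complexity"},
        {rc::error_stack, "error_stack"},
    };
    for (const auto& entry : table) {
        if (entry.first == code) {
            return entry.second;
        }
    }
    return std::string();
}

// Hypothetical result type: `code` is meaningful only when `ok` is false.
struct ProbeResult {
    bool ok;
    rc::error_type code;
};

// Hypothetical helper: tries to construct a std::regex from `pattern`
// and reports whether it succeeded and, if not, which code was thrown.
inline ProbeResult probe(const std::string& pattern) {
    try {
        std::regex re(pattern);
        (void)re;
        return ProbeResult{true, rc::error_type()};
    } catch (const std::regex_error& e) {
        return ProbeResult{false, e.code()};
    }
}
```

Calling `probe(R"((?<=abc))")` and passing the resulting `code` to `standard_name` gives either a standard constant's name or an empty string; in the latter case, printing `static_cast<int>(code)` shows the implementation-specific value.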

StephanTLavavej removed the label vNext (Breaks binary compatibility) on Jan 18, 2023
@StephanTLavavej
Member

Removing vNext because we believe that although this would be a behavioral change, it wouldn't break ABI (it's worth noting that we've gotten more aggressive about behavioral changes since the beginning of the GitHub era). Of course, we would need to preserve the values of all existing error code constants.

Changing the error code of the exception being thrown, and/or removing the identifier error_syntax, could be done without realistically disrupting any code. It is unlikely that any existing code depends on the exact error code thrown for bogus syntax, or mentions our non-Standard, undocumented error constant.
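As an illustration of why such a change would likely go unnoticed, here is a sketch of how calling code typically handles a bad pattern: it catches std::regex_error and treats every code the same way. The helper name `try_compile` is hypothetical, and this is an assumption about common usage rather than a survey of real code.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: tries to compile `pattern` into `out`.
// Every regex_error is handled identically and the specific code is never
// inspected, so it doesn't matter which error_type the implementation throws.
inline bool try_compile(const std::string& pattern, std::regex& out,
                        std::string& message) {
    try {
        out.assign(pattern);
        message.clear();
        return true;
    } catch (const std::regex_error& e) {
        // what() text is implementation-defined; use it for diagnostics only.
        message = e.what();
        return false;
    }
}
```

Code written this way behaves the same whether the lookbehind pattern reports error_syntax, error_paren, or error_badrepeat.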
