string literals: disallow backslash before non-escapes #21284
Comments
Sounds good to me.
Sounds ok to me too.
Disallowing unrecognized escapes seems pretty common: C, Java, C#, R, ...?
I was under the impression that C allowed unrecognized escapes, but perhaps I'm misremembering.
I was misremembering:
Clang warns about unknown escapes in C string literals and then ignores the backslash. We should definitely change this.
@StefanKarpinski, according to C99 (ISO/IEC 9899:1999), section 6.4.4.4, "If any other character follows a backslash, the result is not a token and a diagnostic is required." In 6.11.4, it says "Lowercase letters as escape sequences are reserved for future standardization."
Yeah, I saw that but I figured I should also try it since compilers don't always follow specs :)
If I understand the parser code correctly, the processing of escape sequences is actually done in the flisp parser by this. I think it should be sufficient to change this line to something like:

    char esc = read_escape_control_char((char)c);
    if (esc == (char)c && esc != '\\' && esc != '"' && esc != '\'') {
        free(buf);
        lerror(fl_ctx, fl_ctx->ParseError, "read: invalid escape sequence");
    }
    buf[i++] = esc;

(utf8.c also includes a redundant
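To make the idea above concrete outside the parser, here is a minimal standalone sketch in C of strict escape handling: a backslash followed by anything other than a recognized escape character is rejected instead of being passed through. The names `decode_simple_escape` and `unescape_strict` are hypothetical and are not part of flisp or Julia; only single-character escapes are covered, and octal, hex, and Unicode escapes are deliberately left out.

```c
#include <stddef.h>

/* Hypothetical helper: map the character after a backslash to the byte it
 * denotes, or return -1 if it is not a recognized single-character escape. */
static int decode_simple_escape(char c)
{
    switch (c) {
    case 'n':  return '\n';
    case 't':  return '\t';
    case 'r':  return '\r';
    case 'b':  return '\b';
    case 'f':  return '\f';
    case 'v':  return '\v';
    case 'a':  return '\a';
    case '\\': return '\\';
    case '"':  return '"';
    case '\'': return '\'';
    default:   return -1;   /* e.g. \c: not an escape, so report an error */
    }
}

/* Decode the body of a string literal from src into dst (capacity cap,
 * including the terminating NUL). Returns the number of bytes written
 * (excluding the NUL), or -1 on an invalid escape, a trailing backslash,
 * or insufficient space in dst. */
int unescape_strict(const char *src, char *dst, size_t cap)
{
    size_t i = 0;
    while (*src != '\0') {
        int ch = (unsigned char)*src++;
        if (ch == '\\') {
            if (*src == '\0')
                return -1;                    /* backslash at end of input */
            ch = decode_simple_escape(*src++);
            if (ch < 0)
                return -1;                    /* backslash before a non-escape */
        }
        if (i + 1 >= cap)
            return -1;                        /* no room for this byte plus NUL */
        dst[i++] = (char)ch;
    }
    if (cap == 0)
        return -1;
    dst[i] = '\0';
    return (int)i;
}
```

Under these assumptions, the literal body `ab\\c` decodes to `ab\c`, while `ab\c` is reported as an error rather than silently becoming `abc`.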
@stevengj, would you be willing to make this change? Everyone on the triage call is in favor.
Looking into it now.
See: http://stackoverflow.com/questions/43114125/how-to-replace-string-literals-back-front-slashes-in-julia. There are a couple of issues here:
Since Python behaves differently here for backslash followed by non-escapes, allowing these sequences can cause confusion for Python programmers. It would be better to give an error when a user writes something like "ab\c", telling them to write "ab\\c" instead if they wanted a literal backslash, or to write "abc" if they don't want a literal backslash.

If "ab\c" is an error, that leaves the syntax open in the future for new meanings, giving us room for extensibility. This does not sacrifice any expressiveness since writing \c for non-control characters is otherwise completely useless. For example, we could then introduce something like the syntax "foo\(bar)baz" that has previously been proposed for interpolation (this is Swift's interpolation syntax) without fear of breaking anyone's code.