Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn about strings that skip multiple newlines with one backslash #87319

Closed
joshtriplett opened this issue Jul 20, 2021 · 2 comments · Fixed by #87671
Closed

Warn about strings that skip multiple newlines with one backslash #87319

joshtriplett opened this issue Jul 20, 2021 · 2 comments · Fixed by #87671
Labels
A-diagnostics Area: Messages for errors, warnings, and lints T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@joshtriplett
Copy link
Member

joshtriplett commented Jul 20, 2021

(This came up in today's @rust-lang/lang meeting.)

Given the following code:

fn main() {
    let s1 = "a\

b";
    let s2 = "a\

    b";
    println!("{:?} {:/}", s1, s2);
}

Playground link

The compiler interprets both of the string literals as "ab", omitting multiple newlines after the \.

This seems confusing, and different from the behavior of other languages. For instance, C and Python both produce parse errors for such string literals. Or, for a Python multi-line string literal, the backslash escapes the immediately following newline but not the subsequent newline.

Rust does allow unescaped newlines in string literals, which is useful. The ability to escape a newline can also be useful sometimes. But the combination of both escaped and unescaped newlines seems sufficiently confusing and surprising to merit a warning.

We'd ideally like to work towards this being an error. Depending on the presence of existing usage in the ecosystem, that might or might not require an edition.

@joshtriplett joshtriplett added A-diagnostics Area: Messages for errors, warnings, and lints T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 20, 2021
@pierwill
Copy link
Member

@rustbot claim

@pierwill
Copy link
Member

pierwill commented Jul 23, 2021

Discussed in Zulip:

The escape sequence handling for string literals is handled here:

fn unescape_str_or_byte_str<F>(src: &str, mode: Mode, callback: &mut F)

It looks like the issue lies in

fn skip_ascii_whitespace(chars: &mut Chars<'_>) {
let str = chars.as_str();
let first_non_space = str
.bytes()
.position(|b| b != b' ' && b != b'\t' && b != b'\n' && b != b'\r')
.unwrap_or(str.len());
*chars = str[first_non_space..].chars()
}
- we consider \n to be 'ascii whitespace' and will therefore skip over it after we encounter an escaped newline

JohnTitor added a commit to JohnTitor/rust that referenced this issue Jul 30, 2021
…ce, r=estebank

Add warning when whitespace is not skipped after an escaped newline

Fixes issue rust-lang#87318, also simplifies issue rust-lang#87319.

* Add support to the lexer to emit warnings as well as errors.
* Emit a warning when a string literal contains an escaped newline, but when (some of) the whitespace on the next line is not skipped due to it being non-ASCII.
JohnTitor added a commit to JohnTitor/rust that referenced this issue Jul 30, 2021
…ce, r=estebank

Add warning when whitespace is not skipped after an escaped newline

Fixes issue rust-lang#87318, also simplifies issue rust-lang#87319.

* Add support to the lexer to emit warnings as well as errors.
* Emit a warning when a string literal contains an escaped newline, but when (some of) the whitespace on the next line is not skipped due to it being non-ASCII.
JohnTitor added a commit to JohnTitor/rust that referenced this issue Aug 11, 2021
…es, r=estebank

Warn when an escaped newline skips multiple lines

Resolves rust-lang#87319
JohnTitor added a commit to JohnTitor/rust that referenced this issue Aug 11, 2021
…es, r=estebank

Warn when an escaped newline skips multiple lines

Resolves rust-lang#87319
@bors bors closed this as completed in 53a66ac Aug 12, 2021
@pierwill pierwill removed their assignment Oct 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants