Consider handling of non-printable characters when showing code/diffs #10841

Closed
carljm opened this issue Apr 8, 2024 · 0 comments · Fixed by #11687
Labels
bug Something isn't working


carljm commented Apr 8, 2024

Search keyword used: "non-printable"

Python source code containing non-printable characters (e.g. ^H (backspace), ^Z (substitute), or ^[ (escape); see https://github.com/astral-sh/ruff/blob/main/crates/ruff_linter/resources/test/fixtures/pylint/invalid_characters.py for examples) can produce unpredictable and confusing results if we output it as-is.

Here's a minimal example of the current behavior:

➜ python3 -c "print('b = \'\x08\'')" > bad.py

➜ cat bad.py
b = '

➜ cat -v bad.py
b = '^H'

➜ xxd bad.py
00000000: 6220 3d20 2708 270a                      b = '.'.

➜ cargo run -- check --diff --no-cache --preview --select PLE2510 bad.py
    Finished dev [unoptimized + debuginfo] target(s) in 0.09s
     Running `target/debug/ruff check --diff --no-cache --preview --select PLE2510 bad.py`
--- bad.py
+++ bad.py
@@ -1 +1 @@
-b = '
+b = '\b'

Would fix 1 error.

Here the diff is confusing because the original string appears to contain only a single quote character, with no matching closing quote. In fact it has two quotes, but the intervening backspace character makes the terminal overwrite one of them. (Note that this is the same behavior shown by cat without -v; the terminal simply lets the control characters take effect.)

Interestingly, this is also the default behavior of both diff and git diff, so we may want to be cautious about diverging from it. Still, we could consider escaping non-printable characters before outputting code frames or diffs.
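To make the escaping idea concrete, here is a minimal sketch in Python of one possible approach: rendering control characters in caret notation, the same convention cat -v uses above. The function name `show_control_chars` is hypothetical, and this is not a description of how ruff does or should implement it; it only illustrates the kind of transformation that could be applied to a line before it is printed in a code frame or diff.

```python
def show_control_chars(line: str) -> str:
    """Render C0 control characters and DEL in caret notation (like `cat -v`).

    Hypothetical helper for illustration only. Newlines and tabs are left
    untouched so that the line structure of a diff is preserved.
    """
    out = []
    for ch in line:
        code = ord(ch)
        if ch in "\n\t":
            out.append(ch)                      # keep layout characters as-is
        elif code < 0x20:
            out.append("^" + chr(code + 0x40))  # e.g. 0x08 -> ^H, 0x1B -> ^[
        elif code == 0x7F:
            out.append("^?")                    # DEL
        else:
            out.append(ch)                      # printable text is unchanged
    return "".join(out)


# The contents of bad.py from the example above (without the trailing newline).
print(show_control_chars("b = '\x08'"))  # prints: b = '^H'
```

With a transformation like this, the removed line in the diff would show both quotes (b = '^H') instead of appearing to have an unterminated string. Whether caret notation, backslash escapes, or Unicode control pictures is the better display form is a separate design question that this sketch does not settle.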
