Unicode escape sequences are not preserved #196

Open
linuxdaemon opened this issue May 6, 2024 · 1 comment
Labels
bug (Something isn't working), enhancement (New feature or request)

Comments

@linuxdaemon

This seems related to #55 and #104, but both of those are closed as completed, and the issue is still present in v1.0.1.

Example

flynt -tj -s "'\u2122'.join(('a', 'b'))"

Results in:
"a™b"
Instead of:
"a\u2122b"

This seems to also occur with octal values:

flynt -tj -s "'\40'.join(('a', 'b'))"

returns:
"a b"
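The same loss of spelling can be seen with the standard library's `ast` module alone. This is a minimal sketch, not flynt's code; `parsed_value` is a hypothetical helper introduced only for illustration. It shows that an escaped literal and its literal-character spelling parse to identical values:

```python
import ast


def parsed_value(source: str) -> str:
    """Return the string value the parser stores for a string-literal expression.

    Hypothetical helper for illustration; not part of flynt.
    """
    node = ast.parse(source, mode="eval").body
    if not (isinstance(node, ast.Constant) and isinstance(node.value, str)):
        raise ValueError("expected a single string literal")
    return node.value


# Raw strings are used here so the backslashes reach the parser unchanged.
escaped = parsed_value(r"'\u2122'")  # spelled with a Unicode escape
literal = parsed_value("'™'")        # spelled with the character itself
octal = parsed_value(r"'\40'")       # spelled with an octal escape

# Both spellings yield the same value, so the AST cannot tell them apart.
print(escaped == literal)
print(octal == " ")
```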

@ikamensh added the bug and enhancement labels on Aug 21, 2024
@ikamensh
Owner

This is a known and unfortunate limitation. Once Python parses your code, which I need to do to get the abstract syntax tree, it's no longer possible to determine whether a character was written as an escape sequence or as a literal special character. It might be possible to read the file as bytes, find the location of each expression in the file, and from that see whether it's a literal Unicode character (I think), so there could be two fixes:

  1. Just detect the usage of escape sequences and raise ConversionRefused, to make flynt skip this expression (easier)
  2. Actually preserve Unicode escape sequences where present.
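A rough sketch of what the first option could look like, under some assumptions: `has_escape_sequences` is a hypothetical helper (not part of flynt), and instead of reading the file as bytes it uses the standard library's `ast.get_source_segment` to recover the original spelling of each string literal. It is deliberately conservative: any backslash in a literal counts, so raw strings such as `r'\d'` would also be flagged.

```python
import ast


def has_escape_sequences(source: str) -> bool:
    """Return True if any string or bytes literal in `source` contains a backslash.

    Hypothetical helper for illustration; not part of flynt.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Constant) and isinstance(node.value, (str, bytes)):
            # The node only carries the parsed value; the original spelling
            # has to be looked up in the source text via the node's position.
            segment = ast.get_source_segment(source, node)
            if segment is not None and "\\" in segment:
                return True
    return False


print(has_escape_sequences(r"'\u2122'.join(('a', 'b'))"))
print(has_escape_sequences("'™'.join(('a', 'b'))"))
```

A caller could check this before converting an expression and raise `ConversionRefused` when it returns `True`, so that the expression is skipped rather than rewritten with its escapes lost.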
