Unicode Encoding Error in Python #701
Comments
That is indeed a tricky bit of Python 2 to wrap our heads around, and common enough that lots of people run into it. Basically, it comes down to the fact that a plain string literal in Python 2 is a byte string, and combining it with a unicode string forces an implicit ASCII decode:

    In [1]: x = "😱"

    In [2]: u'test' + x
    ---------------------------------------------------------------------------
    UnicodeDecodeError                        Traceback (most recent call last)
    <ipython-input-2-b42a30d7afc1> in <module>()
    ----> 1 u'test' + x

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 0: ordinal not in range(128)

What happens here is that Python implicitly converts x to unicode using the default ASCII codec, which is the same as calling unicode(x). Calling x.encode('utf-8') on the byte string fails the same way, because Python has to decode it to unicode before it can encode it:

    In [3]: unicode(x)
    ---------------------------------------------------------------------------
    UnicodeDecodeError                        Traceback (most recent call last)
    <ipython-input-3-c268b90adfa4> in <module>()
    ----> 1 unicode(x)

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 0: ordinal not in range(128)

    In [4]: x.encode('utf-8')
    ---------------------------------------------------------------------------
    UnicodeDecodeError                        Traceback (most recent call last)
    <ipython-input-4-562856162c51> in <module>()
    ----> 1 x.encode('utf-8')

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 0: ordinal not in range(128)

If instead we start with a unicode string and end with a unicode string, it just works.
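A minimal sketch of that unicode-in, unicode-out pattern, assuming the incoming bytes are UTF-8. The byte values are the UTF-8 encoding of the 😱 emoji, and the helper name to_text is made up for illustration. The idea is to decode once at the boundary, work with unicode in the middle, and encode only when bytes are needed again:

```python
# -*- coding: utf-8 -*-
# Sketch only: decode at the edges, keep unicode in the middle.
# Written so it behaves the same on Python 2.7 and Python 3.


def to_text(value, encoding='utf-8'):
    """Return a unicode string, decoding byte strings explicitly.

    Hypothetical helper; the UTF-8 default is an assumption about the input.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


raw = b'\xf0\x9f\x98\xb1'           # UTF-8 bytes of the emoji, as a byte string
text = to_text(raw)                 # explicit decode instead of the implicit ASCII one
combined = u'test' + text           # unicode + unicode: no UnicodeDecodeError
encoded = combined.encode('utf-8')  # encode only when bytes are needed again
```

The explicit decode is what replaces the implicit ASCII decode that raised the errors in the session above.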
As pointed out in unicode frustrations, this also depends on the terminal encoding reported to Python, which in the case of Atom script is based on your environment variables at the time it runs.

Long story short, you'll want to figure out how to handle unicode directly in your code, because this will bite you in production far worse than it will in your editor.
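To see what Python believes about its output encoding, something like the following can be run both in a terminal and from the editor, and the results compared. This is a diagnostic sketch, not part of atom-script, and the function name describe_io_encoding is made up for illustration:

```python
import locale
import os
import sys


def describe_io_encoding():
    """Collect the settings that influence how print encodes unicode text."""
    return {
        # Encoding Python picked for each stream; may be None when output is piped.
        'stdout': getattr(sys.stdout, 'encoding', None),
        'stderr': getattr(sys.stderr, 'encoding', None),
        # Encoding suggested by the user's locale settings.
        'locale': locale.getpreferredencoding(),
        # Explicit override and locale variable, if the environment provides them.
        'PYTHONIOENCODING': os.environ.get('PYTHONIOENCODING'),
        'LANG': os.environ.get('LANG'),
    }


if __name__ == '__main__':
    for name, value in sorted(describe_io_encoding().items()):
        print('%s = %r' % (name, value))
```

If the stdout value differs between the two runs, the difference in environment is the likely reason the same script prints fine in one place and fails in the other.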
CODE:

    # -*- coding: utf-8 -*-
    print u'中国'

PS: 中国 is China.

OUTPUT when pressing Ctrl + Alt + B:
|
It looks like atom-script on Windows uses cp1252 (Windows-1252) encoding by default instead of utf-8. As a workaround, you can specify utf-8 as the encoding for the standard output and error streams in your code.
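One way such a workaround could look, as a sketch rather than the exact snippet from this comment: wrap the streams with a UTF-8 writer from the standard codecs module. The helper names utf8_writer and use_utf8_streams are made up for illustration, and the demonstration writes to an in-memory stream so that it does not touch the real stdout:

```python
# -*- coding: utf-8 -*-
import codecs
import io
import sys


def utf8_writer(stream):
    """Wrap a byte stream so that unicode text written to it is encoded as UTF-8."""
    # Python 3 text streams expose the underlying byte stream as .buffer;
    # Python 2's sys.stdout is already a byte stream, so it is used directly.
    raw = getattr(stream, 'buffer', stream)
    return codecs.getwriter('utf-8')(raw)


def use_utf8_streams():
    """Call once at program start to route stdout and stderr through UTF-8."""
    sys.stdout = utf8_writer(sys.stdout)
    sys.stderr = utf8_writer(sys.stderr)


# Demonstration against an in-memory byte stream instead of the real stdout.
demo = io.BytesIO()
utf8_writer(demo).write(u'中国\n')
demo_bytes = demo.getvalue()
```

With the wrapper installed, print u'中国' no longer depends on the encoding the editor reports. Note that on Python 2 the wrapped stream expects unicode strings; writing a non-ASCII byte string to it triggers the same implicit ASCII decode described earlier in the thread.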
|
I was able to fix these issues in my Atom script output by simply adding the environment variable PYTHONIOENCODING=utf8
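A quick way to confirm the variable has an effect is to launch a child interpreter with it set and ask which encoding it chose for stdout. This is a sketch that assumes the variable is visible to the process that starts Python; child_stdout_encoding is a hypothetical helper name:

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Start a child Python with PYTHONIOENCODING set and report its stdout encoding."""
    env = dict(os.environ)                 # copy, so the parent environment is untouched
    env['PYTHONIOENCODING'] = io_encoding
    out = subprocess.check_output(
        [sys.executable, '-c', 'import sys; print(sys.stdout.encoding)'],
        env=env,
    )
    name = out.decode('ascii').strip()
    # Normalize spelling differences such as 'utf8' versus 'UTF-8'.
    return codecs.lookup(name).name


if __name__ == '__main__':
    print(child_stdout_encoding('utf8'))
```

The child's output is piped here, which is the situation where Python 2 would otherwise not pick a usable stdout encoding on its own.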
Hello,
Thank you for the help! I appreciate it a lot, because I sometimes had to run my code outside of Atom with the Python interpreter. Thank you so much.
On Fri, 16 Nov 2018 at 20:50, D. Starr wrote:
> I was able to fix these issues in my Atom script output by simply adding the environment variable PYTHONIOENCODING=utf8
|
When I was trying to print some unicode strings, an error occurred:
So it seems that strings are always encoded in ASCII, even though I have declared them as unicode by prefixing them with u. However, the code ran fine in the Python 2.7.10 Shell. So is there a convention that I should follow in my program, or is it a bug? Thanks 😄