Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix exception formatting while importing test modules #2337

Merged
merged 2 commits into from
Mar 30, 2017

Conversation

nicoddemus
Copy link
Member

Fix #2336

cc @fabioz

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.02%) to 92.69% when pulling 58ac4fa on nicoddemus:2336-unicode-tb into 6cfe087 on pytest-dev:master.

@@ -237,5 +237,7 @@ def safe_str(v):
try:
return str(v)
except UnicodeError:
if not isinstance(v, unicode):
v = unicode(v)
errors = 'replace'
return v.encode('ascii', errors)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good for me, the only thing is that I find it a bit weird that safe_str looses information if it has non-ascii chars (i.e.: usually the v.encode() would be something as v.encode('utf-8', 'replace') or v.encode(sys.getdefaultencoding(), 'replace'))

Anyways, this isn't related to this patch at all, so, +1 from the current patch from me ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You got a point, I think it is safe to convert to a utf-8 byte-stream, because it is ascii compatible anyway.

Updated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a bad idea - things like the Window console will throw exceptions when trying to print UTF-8 with Python < 3.6.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm perhaps it's something on my machine, but I did this test:

# encoding: utf-8
from _pytest.compat import safe_str
print(safe_str(u'oh noes ☺'))
(.env27) C:\Users\bruno\pytest>python foo.py
oh noes Γÿ║
(.env27) C:\Users\bruno\pytest>python --version
Python 2.7.9 :: Continuum Analytics, Inc.

@fabioz
Copy link
Contributor

fabioz commented Mar 29, 2017

p.s.: Thanks for the speedy fix ;)

This way we don't lose information and the returned string is
ascii-compatible anyway
@fabioz
Copy link
Contributor

fabioz commented Mar 29, 2017

Actually, I thought that safe_str would be used for things internally, not really printing...

For printing you shouldn't use utf-8, you should use the value defined in the PYTHOIOENCODING env var -- a default of ascii could be used for being "safe" (there should probably be a special function dedicated to printing which makes sure that whatever is being printed can be encoded to PYTHONIOENCODING -- i.e.: it's always possible that printing a non-ascii value can give an error).

This value should also be acessible through sys.stdout.encoding. ie.:

λ set PYTHONIOENCODING=cp1252
                                                                                        
λ python                                                                                
Python 2.7.12 (default, Sep 20 2016, 14:51:55) [MSC v.1600 64 bit (AMD64)] on win32     
Type "help", "copyright", "credits" or "license" for more information.                  
>>> import sys                                                                          
>>> sys.stdout.encoding                                                                 
'cp1252'
                                                                                        
λ set PYTHONIOENCODING=                                                                 
                                                                                        
λ python                                                                                
Python 2.7.12 (default, Sep 20 2016, 14:51:55) [MSC v.1600 64 bit (AMD64)] on win32     
Type "help", "copyright", "credits" or "license" for more information.                  
>>> import sys                                                                          
>>> sys.stdout.encoding                                                                 
'cp437'                                                                                 

So, to be on the safe side, you can first try to print directly, but if it fails, try to encode to the related encoding first -- and if that still fails, go for ascii, as I believe ascii shouldn't ever fail (if that still fails, I think there's nothing more you can do)... if you expect those strings to end up in the console stderr, maybe using sys.stderr.encoding would be better (and the user/IDE should control PYTHONIOENCODING as needed).

@nicoddemus
Copy link
Member Author

Yeah I hear you. Ideally we should be using unicode for everything internally, but unfortunately that's not the case for pytest at the moment.

For example, the usage for safe_repr in the line which generated this error was creating bytes at that point and raising an exception, which is not something you should do (text internally, bytes at boundaries). Also, we have to consider backward compatibility: we have to keep the same API, which almost always returns text as bytes.

So the current support for pytest regarding unicode filenames or unicode system/user exceptions is very fragile. 😞

In order to actually implement proper and complete unicode support for Python 2, we would have to revisit the majority of the code base (and that includes py), and it is hard to find time for a major overhaul like this.

That's my opinion on the current state of things at least, don't know about the other contributors and would love to hear their opinions on the subject.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.02%) to 92.69% when pulling a542ed4 on nicoddemus:2336-unicode-tb into 6cfe087 on pytest-dev:master.

@RonnyPfannschmidt RonnyPfannschmidt merged commit 144d909 into pytest-dev:master Mar 30, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants