-
Notifications
You must be signed in to change notification settings - Fork 6.8k
[MXNET-360]auto convert str to bytes in img.imdecode when py3 #10697
Conversation
a872b87
to
8f8e104
Compare
python/mxnet/image/image.py
Outdated
@@ -132,6 +134,9 @@ def imdecode(buf, *args, **kwargs): | |||
<NDArray 224x224x3 @cpu(0)> | |||
""" | |||
if not isinstance(buf, nd.NDArray): | |||
if sys.version_info[0] == 3: | |||
if isinstance(buf, str): | |||
buf = bytes(buf, "ascii") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you add a test?
also we have mx.base.py_str. Would that work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mx.base.py_str is 'utf8' encoding,I'm not sure it can work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about using buffer
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case really shouldn't happen, because in this case the API is underspecified.
String is always tied to an encoding, and if the string comes from the user, it could have been decoded with any encoding/charset, so the 'ascii' assumption can easily be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@szha or I change it into a friendly error message. prompt user convet it himself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
Is this going to cause error if no converting is done? |
8f8e104
to
5234b10
Compare
@piiswrong np.frombuffer receive buffer like param,but in python3 str not is buffer like.will trigger TypeError. |
I can merge this work around first. But I think we should have a mx.nd.frombuffer that uses CAPI directly. BTW why would buf be str in python3? open('xxx', 'b').read() produces bytes in py3 right? |
This conversion is used to be compatible with python2,If someone saves data with python2 and reads it with python3, it will go wrong. |
It's a bug of numpy -- numpy/numpy#8933 |
@piiswrong are you still planning to merge this? |
@piiswrong @zhreshold bouncing for a review. thanks! |
It is still relevant now? |
@zhreshold Numpy has not been fixed. This examination should be useful. |
Adding @sandeep-krishnamurthy @Roshrini @vandanavk for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor wording nit.
python/mxnet/image/image.py
Outdated
@@ -133,6 +135,9 @@ def imdecode(buf, *args, **kwargs): | |||
<NDArray 224x224x3 @cpu(0)> | |||
""" | |||
if not isinstance(buf, nd.NDArray): | |||
if sys.version_info[0] == 3 and not isinstance(buf, (bytes, np.ndarray)): | |||
raise ValueError('buf must bytes or numpy.ndarray,' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: suggest rephrasing a bit to "buf must be of type bytes or numpy.ndarray"
python/mxnet/image/image.py
Outdated
@@ -133,6 +135,9 @@ def imdecode(buf, *args, **kwargs): | |||
<NDArray 224x224x3 @cpu(0)> | |||
""" | |||
if not isinstance(buf, nd.NDArray): | |||
if sys.version_info[0] == 3 and not isinstance(buf, (bytes, np.ndarray)): | |||
raise ValueError('buf must bytes or numpy.ndarray,' | |||
'if you input str,please convert it to bytes') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: suggest rephrase: "if you would like to input type str, please convert to bytes"
@lupesko done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
This is still relevant edge case numpy/numpy#8933 as pointed out by @chinakook
@szha @piiswrong - ping for any other concerns. |
@zhreshold - This is good to go, if it looks fine for you. |
Description
auto convert str to bytes in img.imdecode when py3
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments