Add TAsyncProtocolBase and TAsyncTransportBase #108
Signed-off-by: aiudirog <[email protected]>
thriftpy2/transport/__init__.py
Outdated
def _read(self, sz):
    """
    Internal read method which can read up to `sz` bytes but doesn't
    need to return them all.
    """
    raise NotImplementedError

def read(self, sz):
    """Get exactly `sz` bytes from the underlying connection."""
    return readall(self._read, sz)
The original `TTransportBase` seems to declare that `read()` will always return `sz` bytes. However, not every transport does this on its own. `TFramedTransport` is a good example because it always assumes it will be wrapped by `TBufferedTransport` and would therefore time out if it tried to read everything it was asked to. Might it be best to have the protocol call `readall()` instead? Then we would remove `_read()` and make `read()` an abstract method.
If we decide not to make that change, I'll update this docstring to say that if `read()` isn't going to return all the requested bytes, the transport must be wrapped by one that will.
I think `read` must return `sz` bytes; that is the reason I modified the `TAsyncBufferedTransport`.
If a transport cannot find a way to return enough data, it should raise an `END_OF_FILE` exception.
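The contract described in these comments can be sketched as a loop that keeps calling a partial-read function until `sz` bytes have arrived and fails as soon as a call returns nothing. This is only a minimal, self-contained illustration, not thriftpy2's actual code: `read_exactly` and `ShortReadSource` are hypothetical names, and the built-in `EOFError` stands in for the library's `END_OF_FILE` transport exception.

```python
def read_exactly(read_fn, sz):
    """Call read_fn until exactly sz bytes are collected (hypothetical helper)."""
    chunks = []
    have = 0
    while have < sz:
        # Ask only for what is still missing; read_fn may return fewer bytes.
        chunk = read_fn(sz - have)
        if not chunk:
            # An empty chunk means the source is exhausted before sz bytes arrived.
            raise EOFError("End of file reading from transport")
        chunks.append(chunk)
        have += len(chunk)
    return b"".join(chunks)


class ShortReadSource:
    """Hypothetical source that hands out at most 3 bytes per call."""

    def __init__(self, data):
        self._data = data

    def read_some(self, sz):
        n = min(sz, 3)
        chunk, self._data = self._data[:n], self._data[n:]
        return chunk
```

With a source holding `b"abcdefgh"`, `read_exactly(src.read_some, 5)` needs two underlying calls to assemble `b"abcde"`, and a request made after the data runs out raises `EOFError` instead of returning a short result.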
I have no idea about the `readall` method. Could you please tell me in which cases this method is used?
The socket should always be able to furnish the data requested by the protocol; otherwise EOF would get raised. However, transports are designed to wrap each other to form layers and, internally, calls to `read()` between wrapped transports can't always return `sz` bytes.
As we discussed previously, `TSocket` has to be wrapped by `TBufferedTransport` in order to be used as a transport, despite sporting the same API. However, because `TBufferedTransport` requests as much data as possible from the sub-transport to fill its buffer, the `read()` method of the sub-transport can't always return `sz` bytes. This is the case with `TFramedTransport`, which, despite inheriting the `readall` version of `read()` from `TTransportBase`, does not implement `_read()` (to the dismay of most linters :P) and instead overrides `read()` so that it doesn't have to return all `sz` bytes. This allows it to be wrapped by `TBufferedTransport` without running into timeout errors when it can't furnish enough data to fill the buffer, but it also makes the wrapping required to ensure all the data is returned.
This leads to a weird dynamic where a transport has to make one of two choices:
- Have its `read()` return exactly `sz` bytes and not support being wrapped by `TBufferedTransport`
- Have its `read()` return whatever is in the underlying transport and require being wrapped by `TBufferedTransport`
Basically, the current implementation requires that the outermost transport return exactly `sz` bytes from its `read()` and that an inner transport just return what it can get its hands on, to be called again if it didn't get enough. This means that transports like `TFramedTransport` can't be put on the outer layer and anyone developing a custom transport needs to know the rules above, which aren't obvious from looking at the base class.
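The inner/outer rule can be illustrated with a small sketch. Both classes below are hypothetical stand-ins, not thriftpy2's `TFramedTransport` or `TBufferedTransport`: the inner layer returns whatever one already-received frame can supply, and the outer layer keeps asking until it has exactly `sz` bytes, with `EOFError` assumed in place of the library's transport exception.

```python
class InnerTransport:
    """Hypothetical inner layer: read() returns whatever is available, up to sz."""

    def __init__(self, frames):
        # Each item models one frame that has already been received.
        self._frames = list(frames)

    def read(self, sz):
        if not self._frames:
            return b""
        frame = self._frames.pop(0)
        head, rest = frame[:sz], frame[sz:]
        if rest:
            # Keep the unread tail of the frame for the next call.
            self._frames.insert(0, rest)
        return head


class OuterTransport:
    """Hypothetical outermost layer: read() returns exactly sz bytes or raises."""

    def __init__(self, inner):
        self._inner = inner

    def read(self, sz):
        buf = b""
        while len(buf) < sz:
            chunk = self._inner.read(sz - len(buf))
            if not chunk:
                raise EOFError("inner transport ran out of data")
            buf += chunk
        return buf
```

Calling `read(4)` directly on an inner transport holding the frames `b"ab"` and `b"cdef"` returns only `b"ab"`, while the same request through the outer layer returns `b"abcd"`. That difference is why the inner kind cannot safely sit on the outside under the current rules.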
I propose we could make this easier in one of two ways:
- Have the protocol be in charge of calling `readall()` instead of `TBufferedTransport`.
  - This would require quite a few changes in places and the more I look at it, the more I don't like it.
- Have two public methods for reading: `read()` which returns exactly `sz` bytes and `read1()` which returns up to `sz` bytes.
  - This matches the standard Python Buffer API from io.IOBase
    - I don't particularly like the name `read1`, but I chose it since it matches the standard API. We could always change it to something like `read_up_to`.
  - `read1()` would be used for intra-transport reads like `TBufferedTransport` to `TFramedTransport` and `read()` would be used by the protocol to guarantee all of the data gets read.
  - Implementation would be quite easy: the existing `_read()` method would become `read1()` and serve the same purpose, only publicly. `TTransportBase.read()` would stay exactly as is and be inherited by every transport so they would only have to define a `read1()`.
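A rough sketch of the second option might look like the following. It is an illustration under assumptions rather than the code from the PR: `TransportSketch` and `BytesTransport` are hypothetical names, `EOFError` replaces the library's transport exception, and the `max_chunk` parameter exists only to force short reads. In the standard library, the comparable `read1()` method is defined on `io.BufferedIOBase`.

```python
import io


class TransportSketch:
    """Hypothetical base class: subclasses define read1(), read() is inherited."""

    def read1(self, sz):
        """Return up to sz bytes; may return fewer."""
        raise NotImplementedError

    def read(self, sz):
        """Return exactly sz bytes, built from repeated read1() calls."""
        buf = b""
        while len(buf) < sz:
            chunk = self.read1(sz - len(buf))
            if not chunk:
                raise EOFError("End of file reading from transport")
            buf += chunk
        return buf


class BytesTransport(TransportSketch):
    """Hypothetical transport over in-memory bytes that returns short reads."""

    def __init__(self, data, max_chunk=2):
        self._stream = io.BytesIO(data)
        self._max_chunk = max_chunk

    def read1(self, sz):
        # Deliberately cap each read to show that read() still assembles sz bytes.
        return self._stream.read(min(sz, self._max_chunk))
```

A wrapping transport would call `read1()` on the layer beneath it, and the protocol would call `read()` on whichever layer is outermost, so the layers could in principle be stacked in any order.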
Either change would allow transports to wrap each other arbitrarily, making it easier to develop new, more complex transports in the future.
I'll mock up a commit of the second one and send it over. We can always drop it.
It's not 100% perfect (I don't like the sockets breaking the read1 vs read idea, but they have to for compatibility with Cython code) but it passes the test cases and makes transport implementation more consistent.
Edit: This change took me like 10 minutes to make, so if you don't like it it's not a lot of time lost.
I think I have got your point. `TFramedTransport` does not need to return data exactly equal to the required size, but `TBufferedTransport` does. However, I think it is better to keep this behavior as before; leaving some comments on the `read` method to describe this behavior is fine, I think.
I do not like the design of `read` and `read1` or `read_up_to`, but it seems like replacing `read` with `readall` would break compatibility.
> leaving some comments to the `read` method to describe this behavior is fine I think.
Sounds good. As long as it's clearly documented it shouldn't be a problem.
I wanted to see how Java did it and they actually have a `read()` and `readAll()`: https://github.com/apache/thrift/blob/master/lib/java/src/org/apache/thrift/transport/TTransport.java#L60
I'll update the documentation for now though.
Coverage tests seem to cause the loop itself to close before running the tear down. Signed-off-by: aiudirog <[email protected]>
Codecov Report
@@ Coverage Diff @@
## master #108 +/- ##
==========================================
- Coverage 80.21% 79.95% -0.26%
==========================================
Files 39 43 +4
Lines 3811 3887 +76
==========================================
+ Hits 3057 3108 +51
- Misses 754 779 +25
…rcular imports in the first place instead of working around them Signed-off-by: aiudirog <[email protected]>
@ethe I'm all set for now. Let me know what you think and if there is anything else you want to add.
I updated the docstring and included a link to our discussion. Since you approved the previous changes, I'll go ahead and merge it in. Thanks!
As promised, I'm implementing proper interfaces and cleaning up some parts of the code. I'll let you know when this is ready. :)