Support Cython in HTTP and fix TCyBufferedTransport early flush issue #129
Signed-off-by: Roger Aiudi <[email protected]>
Fix issues related to cython support in http Signed-off-by: Roger Aiudi <[email protected]>
I tested 3.6->3.8 before pushing; looks like I need to be more thorough. I'll figure out what's up with 2.7 and 3.5 before tomorrow.
I find that you modify the behavior of
I changed it to just not flush the underlying transport, which is how the regular TBufferedTransport already works. I went through every transport, and only TCyBufferedTransport calls flush on the wrapped transport inside its write, so I think it's safe to say that it is in the wrong and that nothing should have been relying on that behavior, especially with flush being a top-level call. The only tests that are failing are the HTTP ones, and that is for a different reason altogether, which I'm debugging now. Edit: IMO, the Cython implementations should only improve speed and should do their absolute best not to affect functionality.
I agree.
I checked the pure Python version of TBufferedTransport; you are right.
Okay, I found the bug: I see this as two bugs:
Thoughts?
I think we should correct the
Awesome, I'll push my commit and let's see if it works. I tested on 2.7 this time, so fingers crossed.
Ignore empty flush in THTTPClient Signed-off-by: Roger Aiudi <[email protected]>
CI passed, it looks good to me.
FYI, I noticed that the coverage reports aren't running, even in master. Might be something to check out.
Codecov Report
@@ Coverage Diff @@
## master #129 +/- ##
==========================================
- Coverage 83.11% 83.09% -0.02%
==========================================
Files 43 43
Lines 3908 3911 +3
==========================================
+ Hits 3248 3250 +2
- Misses 660 661 +1
:) I'll merge it in then, thanks for being available so quickly on a Monday morning.
There is a duplicate job in CI; code coverage works fine in https://travis-ci.com/github/Thriftpy/thriftpy2/jobs/298409113. I will try to remove the other one.
Ah got it, no problem then.
@ethe Do you think you can do a release within the next week or so? By the way, here is the client I'm developing: https://github.com/aiudirog/ThriftPy2-HTTPX-Client. I decided not to send it as a PR to contrib since it's 3.6+ only and adds additional dependencies.
OK, I will launch a release this week.
TL;DR
This PR solves two issues:
While developing an alternative HTTP client to the one in thriftpy2.http using HTTPX (so I can use the same GSSAPI authentication on both sync and async) I ran into the issue described here:
After some debugging, I discovered that this is because TCyBufferedTransport flushes its underlying transport when a write would exceed its buffer size. For streaming transports, like TSocket, this isn't an issue. However, for HTTP (and other message-based transports), the flush method creates a request and sends it to the server. The request is supposed to be a complete Thrift message, but the early flushing behavior of TCyBufferedTransport causes an incomplete message to be sent.
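
To make the failure mode concrete, here is a toy pure-Python model. It is not thriftpy2 code: MessageTransport, ToyBufferedTransport, the flush_on_overflow flag, and the 8-byte buffer size are all made up for illustration. The only behavior it borrows from the discussion is that the Cython transport flushed the wrapped transport on overflow, while the pure-Python TBufferedTransport does not.

```python
class MessageTransport:
    """Hypothetical message-based transport: every flush() sends one request."""

    def __init__(self):
        self._pending = bytearray()
        self.requests = []  # bodies that were "sent", one entry per flush()

    def write(self, data):
        self._pending += data

    def flush(self):
        # Stand-in for an HTTP POST of whatever has been written so far.
        self.requests.append(bytes(self._pending))
        self._pending.clear()


class ToyBufferedTransport:
    """Toy write buffer; does not handle single writes larger than buf_size."""

    def __init__(self, trans, buf_size, flush_on_overflow):
        self.trans = trans
        self.buf_size = buf_size
        self.flush_on_overflow = flush_on_overflow
        self._wbuf = bytearray()

    def _dump_wbuf(self):
        # Hand buffered bytes to the wrapped transport without flushing it.
        if self._wbuf:
            self.trans.write(bytes(self._wbuf))
            self._wbuf.clear()

    def write(self, data):
        if len(self._wbuf) + len(data) > self.buf_size:
            self._dump_wbuf()
            if self.flush_on_overflow:
                # The early flush: a partial message goes out as a request.
                self.trans.flush()
        self._wbuf += data

    def flush(self):
        self._dump_wbuf()
        self.trans.flush()


def send(flush_on_overflow):
    inner = MessageTransport()
    outer = ToyBufferedTransport(inner, buf_size=8,
                                 flush_on_overflow=flush_on_overflow)
    for chunk in (b"head", b"er-", b"body", b"-end"):
        outer.write(chunk)
    outer.flush()
    return inner.requests


print(send(True))   # two requests, neither a complete message
print(send(False))  # one request carrying the whole message
```

With a streaming transport the two runs would put the same bytes on the wire in the same order, which is why the early flush goes unnoticed there.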
My proposed solution is to not flush the underlying transport until TCyBufferedTransport's flush() or c_flush() methods are called directly. I did this by adding c_dump_wbuf(), which checks whether there is data to write and, if so, writes it into the underlying transport and then clears the write buffer. Now, when the write buffer is too full, this method is used to empty the buffer into the underlying transport without flushing it. c_flush() can then be simplified to dump the write buffer and then flush the underlying transport.
For anyone else who comes across the other issues listed in the comments of http.py:
"Cannot convert TBufferedTransport to thriftpy2.transport.cybase.CyTransportBase" is caused by TCyBinaryProtocol declaring the transport as a CyTransportBase for performance. Wrapping your custom transport in TBufferedTransport (which would end up being TCyBufferedTransport) solves this issue because it doesn't care about the type of the wrapped transport.
"too small buffer allocated by TCyBufferedTransport" is caused by passing a buffer size less than 1024 to TCyBufferedTransport. The HTTP client is passing this value explicitly to make sure only one call was ever made to self.rfile.read(size), because an over-read like the buffered transports normally do can cause the read socket to time out. I'm not sure why this occurs, since normal sockets typically just return what they have at the moment up to size, but I fought with it for about an hour before deciding to work around it.
I was able to fix these two issues and enable arbitrary server transport support by simply pre-reading the full content into a BytesIO (which is what was happening internally anyway with the buf_size hack) and passing that to whatever transport factory the user chooses.
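
The pre-reading idea can be sketched with the standard library alone. This is an illustration under assumptions, not the code from the PR: read_full_body and OverReadingTransport are hypothetical names, and the body length is assumed to be known up front (for example from a Content-Length header).

```python
import io


def read_full_body(rfile, content_length):
    """Read up to content_length bytes from rfile into an in-memory buffer."""
    chunks = []
    remaining = content_length
    while remaining > 0:
        chunk = rfile.read(remaining)
        if not chunk:  # stream ended early
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return io.BytesIO(b"".join(chunks))


class OverReadingTransport:
    """Hypothetical buffered reader that asks its source for more than it needs."""

    def __init__(self, source, buf_size=4096):
        self._source = source
        self._buf_size = buf_size
        self._rbuf = b""

    def read(self, size):
        while len(self._rbuf) < size:
            more = self._source.read(max(size, self._buf_size))
            if not more:
                break
            self._rbuf += more
        out, self._rbuf = self._rbuf[:size], self._rbuf[size:]
        return out


# Stand-in for the request socket file: one message followed by unrelated bytes.
rfile = io.BytesIO(b"thrift-message" + b"next request bytes")
body = read_full_body(rfile, content_length=14)

itrans = OverReadingTransport(body)  # any transport factory could go here
first = itrans.read(6)
rest = itrans.read(100)       # over-read is satisfied from memory, not the socket
leftover = rfile.read()       # bytes after the message were never touched
```

Because the wrapped transport only ever sees the BytesIO, an over-read hits the end of the in-memory message and returns instead of waiting on the socket.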