Regression with chunked transfer encoding? #340

Closed · pb- opened this issue Mar 20, 2017 · 58 comments

Comments
@pb- commented Mar 20, 2017

I'm using https://httpbin.org and it looks like chunked transfer encoding no longer works as expected:

$ curl -X POST https://httpbin.org/post \
 -d '{"message":"BLA"}' \
 -H 'Content-Type: application/json' \
 -H 'Transfer-Encoding: chunked'
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Connect-Time": "1", 
    "Connection": "close", 
    "Content-Type": "application/json", 
    "Host": "httpbin.org", 
    "Total-Route-Time": "0", 
    "Transfer-Encoding": "chunked", 
    "User-Agent": "curl/7.50.1", 
    "Via": "1.1 vegur", 
    "X-Request-Id": "2a06bd5b-06dc-4ced-ac89-fe83cbb6771c"
  }, 
  "json": null, 
  "origin": "<removed>", 
  "url": "https://httpbin.org/post"
}

Observe that data and json do not contain the posted data. The same request works as expected without chunked transfer encoding:

$ curl -X POST https://httpbin.org/post \
 -d '{"message":"BLA"}' \
 -H 'Content-Type: application/json'
{
  "args": {}, 
  "data": "{\"message\":\"BLA\"}", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Connect-Time": "0", 
    "Connection": "close", 
    "Content-Length": "17", 
    "Content-Type": "application/json", 
    "Host": "httpbin.org", 
    "Total-Route-Time": "0", 
    "User-Agent": "curl/7.50.1", 
    "Via": "1.1 vegur", 
    "X-Request-Id": "dfbaa440-9871-4a3c-b482-35a68459722a"
  }, 
  "json": {
    "message": "BLA"
  }, 
  "origin": "<removed>", 
  "url": "https://httpbin.org/post"
}

Am I missing something?

@sigmavirus24 (Contributor)

You're not missing anything. This does look like a regression.

For @kennethreitz:

>>> import requests, pprint
>>> r = requests.post('https://httpbin.org/post', data=iter(('foo', 'bar', 'bogus')))
>>> r.request.headers['Transfer-Encoding']
'chunked'
>>> r.json()['data']
u''
>>> pprint.pprint(r.json())
{u'args': {},
 u'data': u'',
 u'files': {},
 u'form': {},
 u'headers': {u'Accept': u'*/*',
              u'Accept-Encoding': u'gzip, deflate',
              u'Connect-Time': u'0',
              u'Connection': u'close',
              u'Host': u'httpbin.org',
              u'Total-Route-Time': u'0',
              u'Transfer-Encoding': u'chunked',
              u'User-Agent': u'python-requests/2.13.0',
              u'Via': u'1.1 vegur',
              u'X-Request-Id': u'0c60c510-80e2-4d97-a02a-11b006f38376'},
 u'json': None,
 u'origin': u'...',
 u'url': u'https://httpbin.org/post'}
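
(Side note on why requests chose chunked here: an iterator body has no determinable length, so requests omits Content-Length and falls back to Transfer-Encoding: chunked. A quick check with the prepare API, assuming requests 2.x behaviour:)

>>> import requests
>>> req = requests.Request('POST', 'https://httpbin.org/post',
...                        data=iter(('foo', 'bar', 'bogus'))).prepare()
>>> req.headers.get('Transfer-Encoding')
'chunked'
>>> req.headers.get('Content-Length') is None
True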

@kennethreitz (Contributor)

I don't remember this previously being a problem when hosting on Heroku.

@sigmavirus24 (Contributor)

> I don't remember this previously being a problem when hosting on Heroku.

Sadly, it's reproducible.

@kennethreitz (Contributor)

@sigmavirus24 you reproduced it on another app on Heroku?

@kennethreitz (Contributor)

I'm down with adding an "on Heroku" mode to httpbin that removes that endpoint from the HTML but keeps it in the distributed version.

@sigmavirus24 (Contributor)

@kennethreitz I didn't try reproducing with a different app; I was just able to reproduce that httpbin seems broken. I can try spinning up a different app over the weekend to see if I can reproduce this there too.

@kennethreitz (Contributor)

@sigmavirus24 sounds great

@graingert (Contributor) commented Apr 4, 2017

Python 3.6.1 (default, Mar 22 2017, 06:17:05) 
[GCC 6.3.0 20170321] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from httpbin import app as httpbin_app
>>> from pytest_httpbin import serve, certs
>>> server = serve.Server(application=httpbin_app)
>>> server.start()
>>> server
<Server(<class 'pytest_httpbin.serve.Server'>, started 140641123542784)>
>>> server.port
35117
$ curl -X POST http://localhost:35117/post \
 -d '{"message":"BLA"}' \
 -H 'Content-Type: application/json'
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Content-Length": "", 
    "Content-Type": "application/json", 
    "Host": "localhost:35117", 
    "Transfer-Encoding": "chunked", 
    "User-Agent": "curl/7.51.0"
  }, 
  "json": null, 
  "origin": "127.0.0.1", 
  "url": "http://localhost:35117/post"
}

@Lukasa (Contributor) commented Apr 4, 2017

OK, so that seems to be a Werkzeug issue. Using Twisted as the WSGI server (twist web --wsgi httpbin.app) I get this:

{
  "args": {}, 
  "data": "{\"message\": \"bla\"}", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Content-Length": "18", 
    "Content-Type": "application/json", 
    "Host": "localhost:8080", 
    "User-Agent": "curl/7.51.0"
  }, 
  "json": {
    "message": "bla"
  }, 
  "origin": "127.0.0.1", 
  "url": "http://localhost:8080/post"
}

So my guess would be that this is an interaction between the Werkzeug server and httpbin.

@Lukasa (Contributor) commented Apr 4, 2017

However, the chunked problem is alive and well. Again, testing with Twisted:

>>> import requests
>>> from pprint import pprint as p
>>> def gen():
...     yield b"hello"
...     yield b"world"
... 
>>> r = requests.post('http://localhost:8080/post', data=gen())
>>> p(r.json())
{'args': {},
 'data': '',
 'files': {},
 'form': {},
 'headers': {'Accept': '*/*',
             'Accept-Encoding': 'gzip, deflate',
             'Connection': 'keep-alive',
             'Content-Length': '',
             'Content-Type': '',
             'Host': 'localhost:8080',
             'Transfer-Encoding': 'chunked',
             'User-Agent': 'python-requests/2.13.0'},
 'json': None,
 'origin': '127.0.0.1',
 'url': 'http://localhost:8080/post'}

Wireshark shows the chunked data being sent, so that issue is somewhere in the httpbin codebase.
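
(For reference, the standard RFC 7230 framing for that generator body looks like this on the wire: a hex chunk-size line, the chunk data, then a zero-size terminating chunk, all CRLF-terminated. This is the general format, not a literal dump from the capture:)

5\r\n
hello\r\n
5\r\n
world\r\n
0\r\n
\r\n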

@Lukasa (Contributor) commented Apr 4, 2017

So I just double-checked, and Twisted has definitely received the chunked data and de-chunked it before passing it to Flask. So this rather suggests that Flask might be dropping the ball here: by the time the request gets to httpbin the data is missing, so presumably somewhere in Flask the data is getting lost.
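
(To make this concrete, here is a minimal sketch of the mechanism, assuming Werkzeug's get_input_stream behaviour at the time; the exact code path is my inference, not something confirmed in this thread:)

from io import BytesIO
from werkzeug.wsgi import get_input_stream

# A chunked request carries no Content-Length, so CONTENT_LENGTH is absent.
# The server (Twisted here) has already de-chunked the body into wsgi.input.
environ = {'wsgi.input': BytesIO(b'helloworld')}

# Without wsgi.input_terminated, Werkzeug cannot know where the body ends,
# so it falls back to an empty stream rather than risk blocking on the socket.
print(get_input_stream(environ).read())   # b''

# When the server promises EOF at the end of the body, the real stream is used.
environ['wsgi.input_terminated'] = True
print(get_input_stream(environ).read())   # b'helloworld'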

@graingert (Contributor)

@Lukasa pallets/flask#367 ?

@Lukasa (Contributor) commented Apr 4, 2017

No. Flask isn't seeing the chunked data here: Twisted has parsed it before Flask ever gets a chance to see it.

@graingert (Contributor)

And another pytest-httpbin issue: kevin1024/pytest-httpbin#33 (comment)

@Lukasa (Contributor) commented Apr 4, 2017

OK, so I wrote a really simple repro of this with Flask. Run the following Flask application:

from flask import Flask, request

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def hello_world():
    return request.data

With the following test script:

import requests

def gen():
    yield b"hello"
    yield b"world"

print("Making first request")
r = requests.post('http://localhost:8000/', data=b'helloworld')
print("Got: %s" % r.content.decode('utf-8'))
print("Making second request")
r = requests.post('http://localhost:8000/', data=gen())
print("Got: %s" % r.content.decode('utf-8'))

Gives the output:

Making first request
Got: helloworld
Making second request
Got: 

I ran this test with Gunicorn. This suggests that the Flask data is going... somewhere. I don't know where yet.

@Lukasa (Contributor) commented Apr 4, 2017

Reported as pallets/flask#2229.

@kennethreitz (Contributor)

✨🍰✨

@graingert (Contributor) commented May 8, 2017

@tmenier this issue has never been closed...?

@davidism

The way to support this is to wrap your app in the following middleware when running with Gunicorn (or another server that supports chunked encoding).

class InputTerminated:
    def __init__(self, wsgi_app):
        self.wsgi_app = wsgi_app

    def __call__(self, environ, start_response):
        # The server has already de-chunked the body and guarantees EOF,
        # so tell Werkzeug it is safe to read wsgi.input to the end.
        environ['wsgi.input_terminated'] = True
        return self.wsgi_app(environ, start_response)

app.wsgi_app = InputTerminated(app.wsgi_app)
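
(For example, assuming the Flask repro app from earlier in this thread is saved as app.py with this wrapper applied, and Gunicorn serving on port 8000; the names and port are illustrative:)

$ gunicorn -b 127.0.0.1:8000 app:app
$ curl http://127.0.0.1:8000/ -H 'Transfer-Encoding: chunked' -d helloworld
helloworld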

@skyline75489 commented Jul 31, 2017

curl doesn't seem to be working well with httpbin right now:

$ curl -F [email protected] http://localhost:5000/post --header "Transfer-Encoding: chunked" --header "Expect:"
curl: (56) Recv failure: Connection was reset

I'm using curl 7.51.0 and httpbin 0.5.0 running with python -m httpbin.core.

With curl 7.54.0 the response seems OK but there's no file in it:

{
  "args": {},
  "data": "",
  "files": {},
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Content-Type": "multipart/form-data; boundary=------------------------5f6ff6e1cc372527",
    "Host": "localhost:5000",
    "Transfer-Encoding": "chunked",
    "User-Agent": "curl/7.54.0"
  },
  "json": null,
  "origin": "127.0.0.1",
  "url": "http://localhost:5000/post"
}

@kennethreitz (Contributor)

That did not seem to fix this problem.

@kevin1024 (Contributor)

OK, so 2e94212 broke pytest-httpbin (and caused some issues with requests as well), since pytest-httpbin uses Werkzeug.

But since it didn't fix anything, I think we will remove it for the next httpbin release.

kevin1024 added a commit that referenced this issue Aug 27, 2017
This reverts commit 2e94212.

This didn't fix the problem, and caused other issues when using werkzeug
to execute httpbin, so let's remove it.  See also: #340
kennethreitz added a commit that referenced this issue Dec 18, 2017
Added support for chunked encoded requests when running gunicorn. #340
quinnj referenced this issue in JuliaWeb/HTTP.jl Jan 16, 2018
Replace FIFOBuffer implementation with BufferStream/IOBuffer wrapper.
@tmenier commented Feb 17, 2018

Can I ask why this issue was closed? From my tests it doesn't appear to be fixed. Just me? Or hasn't the fix been deployed yet?

@davidism

@tmenier the fix is not part of a released version yet.

@javabrett (Contributor) commented Feb 18, 2018

@tmenier if you build the project's Dockerfile you should be able to:

docker build --tag httpbin .
docker run -d --name httpbin -p 8080:8080 httpbin
curl -X POST http://localhost:8080/post  -d '{"message":"BLA"}'  -H 'Content-Type: application/json'  -H 'Transfer-Encoding: chunked'
{
  "args": {}, 
  "data": "{\"message\":\"BLA\"}", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Content-Type": "application/json", 
    "Host": "localhost:8080", 
    "Transfer-Encoding": "chunked", 
    "User-Agent": "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2"
  }, 
  "json": {
    "message": "BLA"
  }, 
  "origin": "172.17.0.1", 
  "url": "http://localhost:8080/post"
}

Edit: Of course you don't need to build the container yourself, as the current kennethreitz/httpbin image also has the fix.

A couple of follow-up questions of my own:

  • Should https://httpbin.org or some sub-page return the exact version number of the project, either in the response or as a header? It might help with this sort of enquiry.
  • Above I wrote:

> I have a general question about writing a test to demonstrate this bug and any candidate fix.

... asking whether there are any current project integration-style tests that run via the Gunicorn server, versus just running in Flask. If there aren't, is it worthwhile adding something? That would help with tests that require Gunicorn to assert a fix.

@kennethreitz (Contributor)

deployed latest to production

@davidism

@kennethreitz I think this exact fix is in Werkzeug now, so you could technically remove this patch if you're using the latest version of Werkzeug.

@javabrett (Contributor)

@davidism nice, we should try that.

Should we spin up a framework to run some integration tests? I'm still a little uneasy that all existing tests might only run through the internals of Flask.

@davidism

I may have been misremembering; it's been a while. I think Werkzeug's dev server now sets wsgi.input_terminated correctly, and I reached out to other WSGI servers, but not all of them set the flag yet.

@javabrett (Contributor)

pallets/werkzeug@8eb665a, since Werkzeug 0.13.

httpbin now pins werkzeug = ">=0.14.1", but it was locked to 0.12.1 at the time the patch was merged, so yes, it's probably time to re-test without it.

@javabrett (Contributor)

I reverted my patch commits and tested (manually; there is no automated test available for this) in Docker (so running Gunicorn), and httpbin did not return the sent data - it returned "json": null. So something is still not right.

@samoconnor

Still not working for POST with Transfer-Encoding: chunked:

using HTTP
f = Base.BufferStream() ; write(f, "hey"); close(f)
HTTP.post("http://httpbin.org/post"; body = f)

wireshark:

POST /post HTTP/1.1
Host: httpbin.org
Transfer-Encoding: chunked

3
hey
0

HTTP/1.1 411 Length Required
Date: Mon, 19 Feb 2018 01:14:20 GMT
Connection: close
Content-Type: text/html
Server: meinheld/0.6.1
Via: 1.1 vegur

<html><head><title>Length Required</title></head><body><p>Length Required.</p></body></html>

@javabrett (Contributor)

@samoconnor yes I see the same, and also note that httpbin.org is not hosted using Gunicorn, at least not directly. I don't have the technical details of that deployment. The fix as it stands only works for Gunicorn-hosted httpbin.

Response headers from the localhost Docker-hosted instance:

HTTP/1.1 200 OK
Server: gunicorn/19.7.1
...
X-Powered-By: Flask

Response headers from httpbin.org:

HTTP/1.1 411 Length Required
Server: meinheld/0.6.1
Via: 1.1 vegur

So clearly the hosting is different, and likely there is some reverse-proxying. Might have to wait for a response from someone familiar with the hosting set-up.

In the interim I can assure you the Docker version works ...

@javabrett (Contributor)

I can further confirm that if the currently published Docker image is run with a non-standard launch command, instead running the following (borrowing from https://github.com/kennethreitz/httpbin/blob/master/Procfile#L1):

gunicorn httpbin:app -b 0.0.0.0:8080 --log-file - --worker-class="egg:meinheld#gunicorn_worker"

... also yields:

curl -v -X POST http://localhost:8080/post -d '{"message":"BLA"}'  -H 'Content-Type: application/json'  -H 'Transfer-Encoding: chunked' 
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 8080 (#0)
> POST /post HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.54.0
> Accept: */*
> Content-Type: application/json
> Transfer-Encoding: chunked
> 
> 11
* upload completely sent off: 24 out of 17 bytes
* HTTP 1.0, assume close after body
< HTTP/1.0 411 Length Required
< Content-Type: text/html
< Server: meinheld/0.6.1
< 
* Closing connection 0
<html><head><title>Length Required</title></head><body><p>Length Required.</p></body></html>

This problem seems to occur well before Gunicorn or the middleware itself processes the request - a problem in the Meinheld Gunicorn worker, perhaps?

@samoconnor

Should this issue be re-opened?

@javabrett (Contributor)

I can reproduce the 411 coming from meinheld for a chunked request, but I don't know how to fix that yet.

@kennethreitz (Contributor)

We don't have to use meinheld.

@kennethreitz (Contributor)

Is the problem reproduced when meinheld is dropped?

@javabrett (Contributor)

This problem cannot be reproduced (e.g. in a Docker container build running Gunicorn) without --worker-class="egg:meinheld#gunicorn_worker". This is in meinheld; I found the logic where it returns 411 and wrote a failing test. I will raise an issue there and reference this one.

@javabrett (Contributor)

My code change to meinheld was merged, so when it is next released it will no longer generate the 411 error response when Transfer-Encoding: chunked is present and Content-Length is missing or zero, which is as it should be. I don't think there are nightly builds other than on Travis.

@kennethreitz (Contributor)

How long do you think that will be? We can drop meinheld in the meantime, if needed. It's not necessary. Just helps handle bursts of traffic.

@kennethreitz (Contributor)

Sometimes this service receives over 8k requests per second.

@javabrett (Contributor)

There are no recent meinheld releases, but that could be needs-driven, as there aren't a lot of unreleased commits: mopemope/meinheld@v0.6.1...master. I guess we could request a release.

If httpbin.org is very busy, perhaps it could retain meinheld and the chunked-request problem until the release, but other less-busy servers could have meinheld disabled for now. I don't know if that's a good idea or not. I guess it depends on whether stability or perf issues are expected on certain servers without meinheld.

virgiliojr94 pushed a commit to virgiliojr94/httpbin-chat2desk that referenced this issue Jul 1, 2023
This reverts commit 2e94212.

This didn't fix the problem, and caused other issues when using werkzeug
to execute httpbin, so let's remove it.  See also: postmanlabs#340
virgiliojr94 pushed a commit to virgiliojr94/httpbin-chat2desk that referenced this issue Jul 1, 2023
postmanlabs#340.

In this commit:

- when we see a Transfer-Encoding: chunked request, and the server is gunicorn,
  we set environ['wsgi.input_terminated'], which is required by Werkzeug in the
  absence of Content-Length, or it will empty the data stream.
- for chunked requests to non-gunicorn servers, return 501 Not Implemented.
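
(A rough sketch of the middleware that commit message describes; the class name and details are illustrative, not lifted from the actual commit:)

class GunicornChunkedMiddleware:
    def __init__(self, wsgi_app):
        self.wsgi_app = wsgi_app

    def __call__(self, environ, start_response):
        if environ.get('HTTP_TRANSFER_ENCODING', '').lower() == 'chunked':
            if environ.get('SERVER_SOFTWARE', '').startswith('gunicorn'):
                # gunicorn de-chunks the body and signals EOF, so Werkzeug
                # may safely read wsgi.input without a Content-Length.
                environ['wsgi.input_terminated'] = True
            else:
                # Other servers may not de-chunk; refuse explicitly rather
                # than silently return an empty body.
                start_response('501 Not Implemented',
                               [('Content-Type', 'text/plain')])
                return [b'Chunked requests are not supported by this server.']
        return self.wsgi_app(environ, start_response)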
virgiliojr94 pushed a commit to virgiliojr94/httpbin-chat2desk that referenced this issue Jul 1, 2023
Added support for chunked encoded requests when running gunicorn. postmanlabs#340