Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes bug in rfc2616 #3.6.1 implementation. #236

Merged
merged 2 commits into from
Sep 18, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions library/Requests.php
Original file line number Diff line number Diff line change
Expand Up @@ -749,15 +749,17 @@ public static function parse_multiple(&$response, $request) {
* @return string Decoded body
*/
protected static function decode_chunked($data) {
if (!preg_match('/^([0-9a-f]+)[^\r\n]*\r\n/i', trim($data))) {
if (!preg_match('/^([0-9a-f]+)(?:;(?:[\w-]*)(?:=(?:(?:[\w-]*)*|"(?:[^\r\n])*"))?)*\r\n/i', trim($data))) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that [\w-]* isn't too strict for a token. I would be tempted to err on the side of caution and stick with the original patch and have a slightly lower bar for what constitutes a valid encoding.

return $data;
}



$decoded = '';
$encoded = $data;

while (true) {
$is_chunked = (bool) preg_match('/^([0-9a-f]+)[^\r\n]*\r\n/i', $encoded, $matches);
$is_chunked = (bool) preg_match('/^([0-9a-f]+)(?:;(?:[\w-]*)(?:=(?:(?:[\w-]*)*|"(?:[^\r\n])*"))?)*\r\n/i', $encoded, $matches);
if (!$is_chunked) {
// Looks like it's not chunked after all
return $data;
Expand Down
29 changes: 27 additions & 2 deletions tests/ChunkedEncoding.php
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,18 @@ public static function chunkedProvider() {
"02\r\nab\r\n04\r\nra\nc\r\n06\r\nadabra\r\n0c\r\n\nall we got\n",
"abra\ncadabra\nall we got\n"
),
array(
"02;foo=bar;hello=world\r\nab\r\n04;foo=baz\r\nra\nc\r\n06;justfoo\r\nadabra\r\n0c\r\n\nall we got\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is OWS (optional whitespace) allowed here too? Should add a test for that if so. Ditto if params can take quoted values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFC2616 states:

chunk-ext-name = token
chunk-ext-val = token | quoted-string

Token cannot contain any white space, or any of the following: (}|<>@,;:\"/[]?={}. Quoted string can contain white space.

What we have at the moment is still a relatively low-bar for what is interpreted of correctly encoded. I'm wary about making it too strict as a misunderstanding here, when it's too strict, is more likely to lead to issues in production.

Happy to add the above restrictions with some additional tests

"abra\ncadabra\nall we got\n"
),
array(
"02;foo=\"quoted value\"\r\nab\r\n04\r\nra\nc\r\n06\r\nadabra\r\n0c\r\n\nall we got\n",
"abra\ncadabra\nall we got\n"
),
array(
"02;foo-bar=baz\r\nab\r\n04\r\nra\nc\r\n06\r\nadabra\r\n0c\r\n\nall we got\n",
"abra\ncadabra\nall we got\n"
),
);
}

Expand All @@ -34,12 +46,24 @@ public function testChunked($body, $expected){
$this->assertEquals($expected, $response->body);
}

public static function notChunkedProvider() {
return array(
'invalid chunk size' => array( 'Hello! This is a non-chunked response!' ),
'invalid chunk extension' => array( '1BNot chunked\r\nLooks chunked but it is not\r\n' ),
'unquoted chunk-ext-val with space' => array( "02;foo=unquoted with space\r\nab\r\n04\r\nra\nc\r\n06\r\nadabra\r\n0c\r\n\nall we got\n" ),
'unquoted chunk-ext-val with forbidden character' => array( "02;foo={unquoted}\r\nab\r\n04\r\nra\nc\r\n06\r\nadabra\r\n0c\r\n\nall we got\n" ),
'invalid chunk-ext-name' => array( "02;{foo}=bar\r\nab\r\n04\r\nra\nc\r\n06\r\nadabra\r\n0c\r\n\nall we got\n" ),
'incomplete quote for chunk-ext-value' => array( "02;foo=\"no end quote\r\nab\r\n04\r\nra\nc\r\n06\r\nadabra\r\n0c\r\n\nall we got\n" ),
);
}

/**
* Response says it's chunked, but actually isn't
* @dataProvider notChunkedProvider
*/
public function testNotActuallyChunked() {
public function testNotActuallyChunked($body) {
$transport = new MockTransport();
$transport->body = 'Hello! This is a non-chunked response!';
$transport->body = $body;
$transport->chunked = true;

$options = array(
Expand All @@ -50,6 +74,7 @@ public function testNotActuallyChunked() {
$this->assertEquals($transport->body, $response->body);
}


/**
* Response says it's chunked and starts looking like it is, but turns out
* that they're lying to us
Expand Down