Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove hyper::body::to_bytes or adds a params for size limit? #3111

Closed
NobodyXu opened this issue Jan 8, 2023 · 12 comments
Closed

Remove hyper::body::to_bytes or adds a params for size limit? #3111

NobodyXu opened this issue Jan 8, 2023 · 12 comments
Labels
C-feature Category: feature. This is adding a new feature.

Comments

@NobodyXu
Copy link

NobodyXu commented Jan 8, 2023

As pointed out by this article, using hyper::body::to_bytes blindly might cause DoS attack since the function reserve the Vec based on body.size_hint(), which can be way larger than the actual http response data.

This cannot be fixed by using falliable allocation even it's there since overcommit could be enabled.

IMHO the only way to migrate this is to either remove this function or add a size limit to this function.

@NobodyXu NobodyXu added the C-feature Category: feature. This is adding a new feature. label Jan 8, 2023
@NobodyXu
Copy link
Author

NobodyXu commented Jan 8, 2023

Perhaps hyper::body::to_body can even be moved to utils?

@seanmonstar
Copy link
Member

I'm aware of the article, they reached out privately a few months ago and I tried to explain it is essentially the same as Read::read_to_end, but they felt a big headline claiming hyper is vulnerable was useful to them. 🤷

@NobodyXu
Copy link
Author

NobodyXu commented Jan 8, 2023

I tried to explain it is essentially the same as Read::read_to_end

That's true, but perhaps this function can be put into hyper-utils since read_to_end without length check is not very useful in real world where every client/server is treated as untrusted?

but they felt a big headline claiming hyper is vulnerable was useful to them. 🤷

They even go as far as claiming that function is unsafe (yes, with a code block) while in reality, nothing about this function can trigger UB.

@Marcono1234
Copy link

it is essentially the same as Read::read_to_end

If I understand the blog post correctly the main issue is that hyper's to_bytes blindly trusts the attacker-provided Content-Length header. That would be much severe than the Read::read_to_end case. For read_to_end an attacker would actually have to provide MBs or GBs of data to perform a denial of service attack, whereas for hyper's to_bytes an attacker would merely have to claim that the body is multiple MBs or GBs large and to_bytes will happily try to allocate that memory, even if the attacker in the end only sends a few KBs in total.

They even go as far as claiming that function is unsafe (yes, with a code block) while in reality, nothing about this function can trigger UB.

It looks like (even in the first archived version of the blog post) "unsafe" is only colored red but not formatted as code. Though in the context of Rust, using the term "unsafe" is nonetheless quite misleading. @srmish-jfrog, you seem to be one of the authors; would it be possible to reword the blog post to use for example the term "insecure" instead of "unsafe" (2 occurrences) to prevent any misunderstandings in the context of Rust?

@seanmonstar, if the values of SizeHint can indeed be influenced by a potential attacker, would it be possible to add warnings in the documentation to clarify this and to prevent users from blindly trusting the value?

Though it appears this discussion about to_bytes might be a bit obsolete (at least for the upcoming 1.0 version) because that function does not exist anymore since commit 0888623 (I am not sure though if a similar vulnerability still exists in the new code).
But maybe it would be useful nonetheless to create an advisory to warn any users who might not be aware of this issue and who cannot upgrade to the upcoming 1.0 version directly?

Just as side note, the examples shown in https://github.com/hyperium/hyperium.github.io are currently also using to_bytes without warning the user about the consequences.

@srmish-jfrog
Copy link

srmish-jfrog commented Jan 8, 2023 via email

@ConsoleTVs
Copy link

Given how simple the solution is on user-land code, I would recommend to add a size_limit parameter on the function.

@seanmonstar
Copy link
Member

For read_to_end an attacker would actually have to provide MBs or GBs of data to perform a denial of service attack, whereas for hyper's to_bytes an attacker would merely have to claim that the body is multiple MBs or GBs large and to_bytes will happily try to allocate that memory, even if the attacker in the end only sends a few KBs in total.

I see what you're getting at now, that specific part is indeed closer to Iterator::collect::<Vec<_>>, which will preallocate based on the iterator's size_hint, which could come from anywhere (even network things). I've pushed #3115 to allocate less. But the greater issue would still exist if the source did send a large body and no limit check was done.

SizeHint [..] to prevent users from blindly trusting the value?

I don't think it's an issue of blindly trusting it. hyper does enforce that no more than content-length bytes are read inside it's protocol state machine. The issue seems to be that allocating all of it immediately allows a slow client to waste memory.

it appears this discussion about to_bytes might be a bit obsolete (at least for the upcoming 1.0 version) because that function does not exist anymore

Correct, we removed it from hyper core. In http_body_util::BodyExt, this is two new methods, limit(max), and collect(). It does seem cleaner to do body.limit(SOME_MAX).collect(), but I can also get behind the argument that people will forget to do so, and so perhaps a max limit is added to the collect method itself.

create an advisory to warn any users

I don't think an advisory is warranted. There is a lot of prior art doing the same thing (otherwise read_to_end, Iterator::collect, etc should have the same), and documentation to be careful (I realize people don't read). And it's only an issue if you've done so without a limit check, and on an untrusted source.

I am considering adding a to_bytes_with_max(body, max) function, and possibly marking to_bytes deprecated.

@fundon
Copy link
Contributor

fundon commented Jan 12, 2023

I think the Limited struct has handle that in 1.0.

@Nugine
Copy link

Nugine commented Jan 13, 2023

Hi! I'm trying to replace one usage of hyper::body::to_bytes in my project. But I cannot find the limit method mentioned above via search.
Is there any working example? Do I need to implement a limited alternative by myself?

@seanmonstar
Copy link
Member

For 0.14, Limited is here: https://docs.rs/http-body/0.4.*/http_body/struct.Limited.html

@hyperium hyperium deleted a comment from krishnaTORQUE Jan 13, 2023
@Marcono1234
Copy link

which will preallocate based on the iterator's size_hint, which could come from anywhere (even network things)

Out of curiosity, do you actually know of a specific library where the size hint of an iterator is indeed based on untrusted data from the network (without any default upper limit)? If so, then that is bound to lead to vulnerable code too.

It is not necessarily that people don't read, but rather that code is spread around for example on Stack Overflow or in examples (as it is the case for hyper, as mentioned above) and the warning from the documentation is not repeated (or maybe it was only added to the documentation at a later point). So unless the naming of the API makes it clear that it must not be used for untrusted data or prevents insecure usage in the first place it is likely that it will be used in insecure ways.

@seanmonstar
Copy link
Member

This function was removed in v1, and exists a helper in http-body-util next to the limit() helper.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature Category: feature. This is adding a new feature.
Projects
None yet
Development

No branches or pull requests

7 participants