Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement io::Entropy and refactor random data generation #108874

Closed
wants to merge 1 commit into from

Conversation

joboet
Copy link
Member

@joboet joboet commented Mar 7, 2023

Implements the ACP rust-lang/libs-team#159.

Because std has special needs (its OK to use less secure random data for HashMap keys if that means we don't have to block) and the Read API diverges from getrandom (not the whole buffer must be read, but errors should be immediately returned), I decided not to branch out to getrandom, but to refactor the current std implementation, merging elements from getrandom where required. In particular:

  • On Linux, /dev/random needs to be polled before reading from /dev/urandom when generating secure data
  • The default source for secure data on UNIXes without special syscalls should be /dev/random/, unless it is known that it is identical to /dev/urandom

As a consequence, this PR is rather large, so please let me know if there is anything I can do to simplify the review process.

Checked on all platforms (that currently are not broken), but tested only on macOS (and the Linux CI).

@rustbot label +S-waiting-on-ACP +T-libs-api

@rustbot
Copy link
Collaborator

rustbot commented Mar 7, 2023

r? @joshtriplett

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 7, 2023
@rustbot
Copy link
Collaborator

rustbot commented Mar 7, 2023

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rustbot rustbot added S-waiting-on-ACP Status: PR has an ACP and is waiting for the ACP to complete. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Mar 7, 2023
@workingjubilee
Copy link
Member

I assume you know that the random number devices have been united in recent Linux kernels and they will not have substantial differences, namely that /dev/random no longer blocks and /dev/urandom will use the same CSPRNG.

@joboet
Copy link
Member Author

joboet commented Mar 8, 2023

I assume you know that the random number devices have been united in recent Linux kernels and they will not have substantial differences, namely that /dev/random no longer blocks and /dev/urandom will use the same CSPRNG.

Yes, but polling is (unfortunately) still necessary for compatibility with older versions. The new versions use getrandom anyway, if it's not blocked, so the fast path doesn't go through the file system at all.

@workingjubilee
Copy link
Member

Hmm, I don't know if it's actually worth it to ever hit /dev/random and not /dev/urandom, seeing as how even on earlier versions the differences were very marginal. But a version note on which kernel we can fully drop the fallback code on is also fine.

@joboet
Copy link
Member Author

joboet commented Mar 8, 2023

Hmm, I don't know if it's actually worth it to ever hit /dev/random and not /dev/urandom, seeing as how even on earlier versions the differences were very marginal. But a version note on which kernel we can fully drop the fallback code on is also fine.

I don't want to create security vulnerabilities by not considering the edge cases (/dev/urandom is still used when getrandom is blocked, and if /dev/random is not checked, the effect is the same as getrandom(GRND_INSECURE) which can return bad randomness on embedded devices, even on the newest kennels). Also, getrandom does this, so keeping the source the same means users don't have to reanalyse their security if switching to io::Entropy.

I'll add the version note, that's a good idea.

unsafe {
let mut ret: u64 = 0;
for _ in 0..10 {
if crate::arch::x86_64::_rdrand64_step(&mut ret) == 1 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a maintainer of sgx but this does not offer high quality entropy. See the "Generating seeds with rdrand" section in https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html for how to do it (which we probably don't want in std...)

Copy link
Member Author

@joboet joboet Mar 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I... don't quite get that section, unfortunately.

IIUC (please feel free to correct me), RDRAND is like /dev/urandom in that it uses a CSPRNG to generate more random data from a single, truly random seed. Because of the size of the seed, constant reseeding and the strength of the PRNG, the output should not be in any practical way less secure that RDSEED. But that doesn't explain why anyone would use RDSEED...

Edit: Ah, I missed the part about forward and backward prediction resistance... Still, I don't know if it's really, truly necessary here.

I didn't change this, because getrandom uses RDRAND on SGX. Note that it does not consider values 0 or u64::MAX valid because of hardware bugs on AMD devices. Since SGX is Intel-only, I guess it isn't a concern here?

If RDRAND is truly the wrong option here, I guess the alternative is doing what I did for Hermit here and use RDSEED to seed our own PRNG.

CC @raoulstrackx

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...the DRNG using the RDRAND instruction is useful for generating high-quality keys for cryptographic protocols, and the RSEED instruction is provided for seeding software-based pseudorandom number generators (PRNGs)

I think that sums it up nicely. If someone needs stronger guarantees for a PRNG, they should take a look at what exactly happens in getrandom. The documentation explicitly mentions the randomness comes from rdrand. To me that looks fine, but I'm not a cryptographic. @zugzwang can you pitch in?

Indeed, AMD hardware bugs have no impact on SGX and the values 0 and u64::MAX should not be handled differently.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me. Intel documentation looks clear, and there is useful discussion on security levels here.

@thomcc
Copy link
Member

thomcc commented Mar 11, 2023

I decided not to branch out to getrandom, but to refactor the current std implementation, merging elements from getrandom where required

std also cannot use getrandom for this because std supports platforms that getrandom does not.

See #104658 where I went through the work of removing use of getrandom from the stdlib test suite, allowing running those tests on several tier3 targets and moving more of the stdlib tests that live in uitests only so they can run on tier3 targets back into the normal test suite.

I would prefer we not undo that improvement.

@bors
Copy link
Contributor

bors commented Mar 30, 2023

☔ The latest upstream changes (presumably #109734) made this pull request unmergeable. Please resolve the merge conflicts.

@rustbot
Copy link
Collaborator

rustbot commented Mar 30, 2023

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@joboet
Copy link
Member Author

joboet commented Mar 30, 2023

Rebased to include #107221 and #107387.

@bors
Copy link
Contributor

bors commented May 8, 2023

☔ The latest upstream changes (presumably #111346) made this pull request unmergeable. Please resolve the merge conflicts.

@Dylan-DPC Dylan-DPC removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 20, 2023
@joshtriplett
Copy link
Member

r? libs

@@ -164,7 +164,8 @@ pub fn wait(event_mask: u64, mut timeout: u64) -> IoResult<u64> {
// trusted to ensure accurate timeouts.
if let Ok(timeout_signed) = i64::try_from(timeout) {
let tenth = timeout_signed / 10;
let deviation = (rdrand64() as i64).checked_rem(tenth).unwrap_or(0);
let rand = rdrand64().unwrap_or_else(|| rtabort!("Failed to obtain random data"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: There's no need to abort here when rdrand64() returns None. This code is just there to avoid developers from relying on the accuracy of timeouts. Using a default (as in the original code) is fine here.
There's even a small availability issue with the new code. The rdrand instruction can be executed by any userspace program. So an attacker may be able to execute this instruction over and over again until the pool of random values is exhausted. This will cause the enclave to abort here. SGX does not provide any security guarantees related to availability, but this attack may be executed to make enclaves abort even when the attacker would not be able to do so through other means (e.g., they don't have the access rights to stop the process running the enclave). On the other hand, when the enclave needs to do a cryptographic operation, there's no other way than to abort. Hence, it's a bit of a nitpick :)

@@ -3122,7 +3122,17 @@ impl RandomState {
// increment one of the seeds on every RandomState creation, giving
// every corresponding HashMap a different iteration order.
thread_local!(static KEYS: Cell<(u64, u64)> = {
Cell::new(sys::hashmap_random_keys())
if crate::sys::entropy::INSECURE_HASHMAP {
Cell::new((1, 2))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not blocking, but one thing I was thinking of proposing as a path to make RandomState still work without libstd is allowing users to provide their own entropy function, and it would be nice if this were factored in a way that always took that path, rather than hard-coding in an insecure constant for the state.

Since no one is going to rely explicitly on the (1, 2) state for these cases by design, I think it'd be better to just provide a random eight bytes that get sent through the path instead. That way everything goes through the "entropy" path of decoding the bytes, even if the bytes are static.

///
/// Be aware that, because the data is of very high quality, reading high amounts
/// of data can be very slow, and potentially slow down other processes requiring
/// random data. Use a pseudo-random number generator if speed is important.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we do this elsewhere in libstd with other crates, perhaps we could just straight-up recommend the rand crate for this purpose? Especially since this is an org-maintained crate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have been trying to not advertise external crates from std docs. And there are other potential options beside rand even if they're more niche.

} else {
let mut v = [0u8; 16];
let mut entropy = entropy();
entropy.set_insecure(true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the purpose is to prevent hash DOS, attacks, wouldn't setting insecure be counter to that? Or am I missing the purpose of this flag?

@Dylan-DPC
Copy link
Member

Dylan-DPC commented Mar 25, 2024

@joboet any updates on this?

Edit: nevermind, just realised this is waiting on the ACP being merged

#[derive(Debug)]
#[unstable(feature = "io_entropy", issue = "none")]
pub struct Entropy {
insecure: bool,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could probably use sys::Entropy directly here (and define a set_insecure method for it)

@joboet
Copy link
Member Author

joboet commented Aug 15, 2024

Closing in favour of #129120. No matter how the API discussion turns out, that PR is a much better starting point than this one. The UNIX refactoring has been done in #128655 already.

@joboet joboet closed this Aug 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-ACP Status: PR has an ACP and is waiting for the ACP to complete. T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.