Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RAND_DRBG uses uninitialized urandom read on linux #8215

Closed
bernd-edlinger opened this issue Feb 12, 2019 · 55 comments
Closed

RAND_DRBG uses uninitialized urandom read on linux #8215

bernd-edlinger opened this issue Feb 12, 2019 · 55 comments

Comments

@bernd-edlinger
Copy link
Member

This message can be seen when openssl s_server is started early after reboot:

./openssl s_server
random: openssl: uninitialized urandom read (32 bytes read)
Using default temp DH parameters
ACCEPT

I have seen this with linux v4.18, and v4.14-stable.
Previous linux versions are also affected, but do not report the error.

@t8m
Copy link
Member

t8m commented Feb 13, 2019

This is not an error of openssl. Basically the kernel signals that the in-kernel CRNG was not yet fully initialized with sufficient entropy.

@bernd-edlinger
Copy link
Member Author

The point is, if we use /dev/urandom to initialize the DRBG the kernel tells us that the
returned random data is not good enough.
I think we should probably use /dev/random, and wait for enough entropy.

@levitte
Copy link
Member

levitte commented Feb 13, 2019

I assume this is done when we call the getrandom syscall. According to the documentation I'm reading, the call we make should block until there's enough entropy, since we don't set the GRND_NONBLOCK flag:

       GRND_NONBLOCK
              By default, when reading from  the  random  source,  getrandom()
              blocks  if  no random bytes are available, and when reading from
              the urandom source, it blocks if the entropy pool  has  not  yet
              been  initialized.   If  the GRND_NONBLOCK flag is set, then ge‐
              trandom() does not block in these cases, but instead immediately
              returns -1 with errno set to EAGAIN.

@bernd-edlinger
Copy link
Member Author

Ahm, well....
Actually the used cross-compiler comes with a linux headers from 3.2.14, and glibc headers 2.15.
So if it does not define GRND_NONBLOCK and no getrandom syscall, what happens then?

@mspncp
Copy link
Contributor

mspncp commented Feb 13, 2019

My understanding is that the getrandom() system call is more or less equivalent to reading from /dev/urandom, whereas getentropy() corresponds to /dev/random.

However, there seems to be an important difference between getrandom() and /dev/urandom during early boot time: While getrandom() blocks until the initial seeding of the pool is complete, /dev/urandom according to RANDOM(4) never blocks:

   Usage
       The  /dev/random interface is considered a legacy interface, and /dev/urandom is preferred and sufficient in all use cases, with the exception
       of applications which require randomness during early boot time; for these applications, getrandom(2) must be used instead,  because  it  will
       block until the entropy pool is initialized.

       If  a seed file is saved across reboots as recommended below (all major Linux distributions have done this since 2000 at least), the output is
       cryptographically secure against attackers without local root access as soon as it is reloaded in the boot sequence,  and  perfectly  adequate
       for  network encryption session keys.  Since reads from /dev/random may block, users will usually want to open it in nonblocking mode (or per‐
       form a read with timeout), and provide some sort of user notification if the desired entropy is not immediately available.

So I would guess that OpenSSL was seeding from /dev/urandom, not from getrandom().

@bernd-edlinger
Copy link
Member Author

Okay, if the getrandom syscall is not available we seed from /dev/urandom and only if that
is not available use /dev/random, but it should be other way round.

@bernd-edlinger
Copy link
Member Author

So, how about that?

diff --git a/e_os.h b/e_os.h
index 9c0888e..461015a 100644
--- a/e_os.h
+++ b/e_os.h
@@ -28,9 +28,9 @@
  * default, we will try to read at least one of these files
  */
 #  if defined(__s390__)
-#   define DEVRANDOM "/dev/prandom","/dev/urandom","/dev/hwrng","/dev/random"
+#   define DEVRANDOM "/dev/random","/dev/prandom","/dev/urandom","/dev/hwrng"
 #  else
-#   define DEVRANDOM "/dev/urandom","/dev/random","/dev/srandom"
+#   define DEVRANDOM "/dev/random","/dev/urandom","/dev/srandom"
 #  endif
 # endif
 # if !defined(OPENSSL_NO_EGD) && !defined(DEVRANDOM_EGD)

@mspncp
Copy link
Contributor

mspncp commented Feb 13, 2019

Okay, if the getrandom syscall is not available we seed from /dev/urandom and only if that is not available use /dev/random, but it should be other way round.

I still don't think we should do this, we would hog the entropy pool by doing it. Just think of the OpenSSL testsuite where a lot of openssl instances are launched. If they all would be seeding from /dev/random (or getentropy()), you would be in trouble.

For the current Linux kernels the fact that getrandom() blocks until the initial seeding has completed should be sufficient. After that, the kernel CSPRNG is cryptographically secure, which is good enough, unless we are running in FIPS 3.0 mode (t.b.d.) and it is either the first seeding of the <master> DRBG, or prediction resistance was requested. (It is clear that the entropy gathering needs to be improved for FIPS and that we need distinction between blocking sources with entropy guarantee and nonblocking sources).

As for older kernels without getrandom(): IMHO it is the job of the init system to save and restore a seed file across reboots, as recommended in the citation from RANDOM(4). So I wouldn't consider it an OpenSSL bug if this message occurs.

@mspncp
Copy link
Contributor

mspncp commented Feb 13, 2019

@kroeckx might know more about kernel CSPRNG details than I do.

@bernd-edlinger
Copy link
Member Author

How does this affect the answer I gave
to #8023 (comment)
I said: "If we let it fail here, the chances are that the caller of RAND_bytes does not check the
return code, or it does at least not have no good strategy to handle the error.
I would prefer to wait, unless it is a permanent error."
but I think VxWorks Randlib is working like /dev/random, just that it fails when no entropy
is immediatly available.

@mspncp
Copy link
Contributor

mspncp commented Feb 13, 2019

How does this affect the answer I gave to #8023 (comment)

It would be good to have more detailed information about the VxWorks randlib in general and the randBytes() call in particular, before answering this question. Is there any official documentation available online or offline?

cc @klotzt-draeger

@mspncp
Copy link
Contributor

mspncp commented Feb 13, 2019

.. the chances are that the caller of RAND_bytes does not check the return code, or it does at least not have no good strategy to handle the error.

This is a really good question and I remember having discussed it with @kroeckx (IIRC) at some time:

Currently the DRBG refuses to generate random bytes if it is in the error state(*), so it is crucial to check the return value of RAND_bytes(). I asked Kurt whether it wouldn't be better to return an error and generate the requested random bytes regardless of the error state, just to be safe in the case where the application wouldn't check the error. But the conclusion was that it would be even worse, because we would provide fake security and the chance of detecting the failure would be less than if the RNG would'nt refuse to generate the output.

(*) The DRBG attempts an automatic error recovery by reseeding, but if that fails, it will remain in the error state.

@mspncp
Copy link
Contributor

mspncp commented Feb 13, 2019

Citing myself from RAND(7):

       As a normal application developer, you do not have to worry about any details,
       just use RAND_bytes(3) to obtain random data.  Having said that, there is one
       important rule to obey: Always check the error return value of RAND_bytes(3)
       and do not take randomness for granted.

@kroeckx
Copy link
Member

kroeckx commented Feb 13, 2019 via email

@kroeckx
Copy link
Member

kroeckx commented Feb 13, 2019 via email

@paulidale
Copy link
Contributor

For the current Linux kernels the fact that getrandom() blocks until the initial seeding has completed should be sufficient. After that, the kernel CSPRNG is cryptographically secure, which is good enough, unless we are running in FIPS 3.0 mode (t.b.d.) and it is either the first seeding of the <master> DRBG, or prediction resistance was requested. (It is clear that the entropy gathering needs to be improved for FIPS and that we need distinction between blocking sources with entropy guarantee and nonblocking sources).

According to the discussions I've had with the FIPS lab about entropy sources, this isn't correct. Calling a blocking getrandom() system call is good enough for FIPS and is straightforward to justify.

If we need to use /dev/urandom, the only method I can think of to guarantee that it is properly seeded is via waiting for something like dd if=/dev/random of=/dev/urandom BS=32 count=1 iflag=fullblock I.e. read bytes from the blocking source and feed them back into the RNG.

We should avoid /dev/random is we can, but we might not be able to do so.

@paulidale
Copy link
Contributor

+# define DEVRANDOM "/dev/random","/dev/prandom","/dev/urandom","/dev/hwrng"

I'd put /dev/hwrng before /dev/prandom. /dev/prandom is a pseudo random generator like /dev/urandom and I don't believe claims about made about it being properly seeded.

+# define DEVRANDOM "/dev/random","/dev/urandom","/dev/srandom"

/dev/srandom should be before /dev/urandom and perhaps even before /dev/random. It is a good source.

@mspncp
Copy link
Contributor

mspncp commented Feb 14, 2019

According to the discussions I've had with the FIPS lab about entropy sources, this isn't correct. Calling a blocking getrandom() system call is good enough for FIPS and is straightforward to justify.

That's good news. It seems that I was too pessimistic. :-)

@mspncp
Copy link
Contributor

mspncp commented Feb 14, 2019

Calling a blocking getrandom() system call is good enough for FIPS and is straightforward to justify.

What about prediction resistance? It was always Kurt's opinion (not mine) that a prediction resistance request needs to be propagated upwards. This would imply that in this case, getentropy() would be required to reseed.

@t8m
Copy link
Member

t8m commented Feb 14, 2019

Please do not ever use /dev/random in openssl by default. That would be simply not usable in any normal scenario where openssl is used throughout the system.

@t8m
Copy link
Member

t8m commented Feb 14, 2019

It is OS business to ensure that the kernel RNG is properly seeded with entropy on system boot, openssl has no way to influence/enforce that (apart from using getrandom). To summarize - use getrandom/getentropy (without GRND_RANDOM flag of course) and if it is not available, use /dev/urandom. That's it.

@mspncp
Copy link
Contributor

mspncp commented Feb 14, 2019

@t8m I quite agree.

@paulidale
Copy link
Contributor

getrandom() should stir in more entropy if any is available, i.e. it isn't a pure PRNG. Of course, it would always be better to guarantee new entropy has been added.

The problem with /dev/urandom, in the context of the FIPS project, is the justification that it has been adequately seeded. Once it has been seeded, it's fine but until you can prove that, it isn't acceptable. In FIPS mode we might be forced to used /dev/random in the absence of getrandom() and getentropy() system calls. Outside of FIPS mode, I can't see this ever being mandated.

@bernd-edlinger
Copy link
Member Author

I think this does probably not do what was intended:

openssl-1.1.1a/crypto/rand/rand_unix.c:#  if defined(__linux) && defined(SYS_getrandom)
openssl-1.1.1a/crypto/rand/rand_unix.c:    return syscall(SYS_getrandom, buf, buflen, 0);

... because linux defines __NR_getrandom and glibc defines SYS_getrandom.
So glibc before 2.16 will not define SYS_getrandom regardless of the linux version.

@richsalz
Copy link
Contributor

Do not ever use /dev/random ... that is unacceptable

@t8m please explain this?

@bernd-edlinger
Copy link
Member Author

... because linux defines __NR_getrandom and glibc defines SYS_getrandom.
So glibc before 2.16 will not define SYS_getrandom regardless of the linux version.

Update:
The glibc 2.25 was the first to implement the getrandom function, but I tried to build
glibc 2.24 against the current linux headers, and it did define the SYS_getrandom.
But this define will not be created when the linux headers are updated afterwards.
It would still make sense to use the linux define __NR_getrandom directly, IMHO.

@levitte
Copy link
Member

levitte commented Feb 14, 2019

Do not ever use /dev/random ... that is unacceptable

@t8m please explain this?

Apart from the obvious blocking?

@kroeckx
Copy link
Member

kroeckx commented Feb 14, 2019 via email

@kroeckx
Copy link
Member

kroeckx commented Feb 14, 2019 via email

@richsalz
Copy link
Contributor

Apart from the obvious blocking?

Why is that a problem? You either wait until there's enough entropy, or you decide to go ahead and be insecure. The default configuration, since most users are not knowledgeable, should be to wait.

@richsalz
Copy link
Contributor

And again I ask, what is the alternative? Competition for a scarce resource, in a land of naive programmers.

@mspncp
Copy link
Contributor

mspncp commented Feb 14, 2019

Just wait until the initial seeding has completed and then simply use it's output without blocking any further (that's what getrandom() does).

In this context you might find the following blog post interesting:

https://www.2uo.de/myths-about-urandom

The author's opinion might not be authoritative, but he it seems to be supported by D. J. Bernstein himself. And if I understand Pauli's comment correctly, even the FIPS lab members agree.

@bernd-edlinger
Copy link
Member Author

bernd-edlinger commented Feb 14, 2019

Yes, but and that is only after the getentropy edit: getrandom syscall which should have been used,
but maybe the fallback after the mis-configuration is wrong, and should use /dev/random
even if it waits.

@bernd-edlinger
Copy link
Member Author

It would be good to have more detailed information about the VxWorks randlib in general and the randBytes() call in particular, before answering this question. Is there any official documentation available online or offline?

According to #8023 (comment)
You may assume that randBytes is returning error when not enough entropy, and when you read from
the randBytes the pool starts to get empty again, it depends on the hardware how long
this takes, can be seconds or minutes.
Is it better to wait or to fail if this happens?

@bernd-edlinger
Copy link
Member Author

use /dev/random even if it waits.

MAYBE we should only read 1 byte from there and after that continue with /dev/urandom.
I observe that once one byte is available several (~8) one byte reads succeed immediately
untit the /dev/random starts to block again. As a compromise ?

@levitte
Copy link
Member

levitte commented Feb 14, 2019

Guys, some of you seem a bit confused about what getentropy() and getrandom() are and how they relate.

  • getrandom() is a syscall
  • getentropy() is a library routine that uses getrandom(), you can see it for yourself in the source

@paulidale
Copy link
Contributor

FIPS says that using the OS provided entropy is fine, it does not talk about which interface you talk to the OS.

FIPS 140-2 Implementation Guidance 7.15 Entropy Assessment makes a clear distinction:

In case of the /dev/urandom request, the call to this OS entropy generator is non-blocking. The data obtained from the non-blocking call is not guaranteed to possess the desired amount of entropy. How can the vendor provide the assurance that the requirements of the FIPS 140-2 AS.07.13 assertion are satisfied?
To meet these requirements, the vendor must first demonstrate that the initial call (that is, the first call after the module has been powered up or instantiated) to /dev/urandom returns the claimed amount of entropy.

There is quite a bit more wordage around this but it is the non-blocking nature before initial seeding that is pointed to as the problem.

@paulidale
Copy link
Contributor

MAYBE we should only read 1 byte from there and after that continue with /dev/urandom.
I observe that once one byte is available several (~8) one byte reads succeed immediately
untit the /dev/random starts to block again.

I'm pretty sure that this won't guarantee that /dev/urandom has been properly seeded.

NIST doesn't say anything about getrandom() yet. However, after discussions with our FIPS lab, it appears that it would be a justifiable source and it is guaranteed to be seeded properly.

@bernd-edlinger
Copy link
Member Author

@paulidale

I'd put /dev/hwrng before /dev/prandom. /dev/prandom is a pseudo random generator like /dev/urandom and I don't believe claims about made about it being properly seeded.

I agree about /dev/prandom, and would actually remove it, it is seeded from /dev/urandom and
uses triple-DES, it only exists because of the hardware support, which might offer some speed-up.
However the /dev/hwrng is not limited to s390, any hardware device can register to be a TRNG device.
I guess if it is available it will be the seed /dev/random and /dev/urandom, so I would keep it
after /dev/urandom.

/dev/srandom should be before /dev/urandom and perhaps even before /dev/random. It is a good source.

I don't know anything about srandom, is this a BSD thing? Is it similar to /dev/hwrng, the seed
of /dev/urandom ?

@mspncp
Copy link
Contributor

mspncp commented Feb 14, 2019

  • getentropy() is a library routine that uses getrandom(), you can see it for yourself in the source

I think you meant to provide a link to getentropy.c, not getrandom.c, but thanks for the link. Indeed, I was not aware that getentropy() is implemented in glibc only, based on the getrandom system call.

(But note that the two links above also show that you are you are just almost right: both functions are library routines: while getrandom() is a thin wrapper around the getrandom system call, getentropy() adds more logic. ;-) )

@bernd-edlinger
Copy link
Member Author

I'm pretty sure that this won't guarantee that /dev/urandom has been properly seeded.

At least with recent linux kernel that is the case.
The getrandom syscall unblocks exactly at the same time where /dev/random is able
to reading the first byte after system boot.

@mspncp
Copy link
Contributor

mspncp commented Feb 14, 2019

I'm pretty sure that this won't guarantee that /dev/urandom has been properly seeded.

At least with recent linux kernel that is the case.
The getrandom syscall unblocks exactly at the same time where /dev/random is able
to reading the first byte after system boot.

Theodore Ts'o explains in the commit message of commit c6e9d6f388947986 that the behaviour of /dev/urandom was not changed to block initially (like getrandom) for compatibility reasons:

   This is because changing /dev/urandom reads to block represents an
   interface change that could potentially break userspace which is not
   acceptable. 

@bernd-edlinger
Copy link
Member Author

Idea:
instead of actually reading 1 byte from /dev/random,
we could select /dev/random for readable, and once it is readable, read /dev/urandom.
That would work more or less around the issues with concurrent access.

@mspncp
Copy link
Contributor

mspncp commented Feb 14, 2019

A promising idea, indeed. I'd be interested to learn whether this trick helps to get rid of the "uninitialized urandom read" warning.

@kroeckx
Copy link
Member

kroeckx commented Feb 14, 2019 via email

@bernd-edlinger
Copy link
Member Author

Sure, this is just a fallback, when that is not available, but from the emotional reaction
on my proposal to move /dev/random before /dev/urandom, I concluded that there
must be a real problem somewhere.
Anyway, select for readable ought to guarantee that at least one byte can be
read without blocking.

bernd-edlinger added a commit to bernd-edlinger/openssl that referenced this issue Feb 14, 2019
Use select to wait for /dev/random in readable state,
but do not actually read anything from /dev/random,
use /dev/urandom first.

Use linux define __NR_getrandom instead of the
glibc define SYS_getrandom, in case the kernel headers
are more current than the glibc headers.

Fixes openssl#8215
@bernd-edlinger
Copy link
Member Author

bernd-edlinger commented Feb 14, 2019

Hmm, this does not really work as expected, (using v4.14-stable with RT patch):
the select waits a few seconds, and then the /dev/urandom is still complaining
also reading 1 byte with dd is not waiting long enough,
acutally /dev/random seems to be returning bytes before it is supposed to be initialized:

root@socfpga_cyclone5:~# dd if=/dev/random bs=1 count=1 | hexdump -C ; dd if=/dev/urandom bs=1 count=1 | hexdump -C
1+0 records in
1+0 records out
00000000  f7                                                |.|
00000001
random: dd: uninitialized urandom read (1 bytes read)
1+0 records in
1+0 records out
00000000  96                                                |.|
00000001
root@socfpga_cyclone5:~# dd if=/dev/random bs=1 count=1 | hexdump -C ; dd if=/dev/urandom bs=1 count=1 | hexdump -C
1+0 records in
1+0 records out
00000000  76                                                |v|
00000001
random: dd: uninitialized urandom read (1 bytes read)
1+0 records in
1+0 records out
00000000  a3                                                |.|
00000001
root@socfpga_cyclone5:~# dd if=/dev/random bs=1 count=1 | hexdump -C ; dd if=/dev/urandom bs=1 count=1 | hexdump -C
1+0 records in
1+0 records out
00000000  d2                                                |.|
00000001
random: dd: uninitialized urandom read (1 bytes read)
1+0 records in
1+0 records out
00000000  5b                                                |[|
00000001
root@socfpga_cyclone5:~#

@paulidale
Copy link
Contributor

I don't know anything about srandom, is this a BSD thing?

Yes, it is a BSD device. From the man page:

Strong random data. This device returns reliable random data. If sufficient entropy is not currently available (i.e., the entropy pool quality starts to run low), the driver pauses while more of such data is collected. The entropy pool data is converted into output data using MD5.

@bernd-edlinger
Copy link
Member Author

MD5 ?

@paulidale
Copy link
Contributor

That's what the man page I found said. I wouldn't be surprised if the algorithm had been changed since. I wouldn't consider MD5 as being broken as an entropy whitening function -- collisions are very remote from reversibility. I'd still advocate moving to something more recent of course.

@t8m
Copy link
Member

t8m commented Feb 19, 2019

Do not ever use /dev/random ... that is unacceptable

@t8m please explain this?

Apart from the obvious blocking?

Yes, that is the main reason. And for the reason that always-blocking /dev/random fighting for entropy is a broken concept on multi-process multi-user system it is often also blocked by security mechanisms such as SELinux. Basically /dev/random could be accepted as a special choice for user but it should never be used by default in general purpose crypto library.

@t8m
Copy link
Member

t8m commented Feb 19, 2019

Hmm, this does not really work as expected, (using v4.14-stable with RT patch):
the select waits a few seconds, and then the /dev/urandom is still complaining
also reading 1 byte with dd is not waiting long enough,
acutally /dev/random seems to be returning bytes before it is supposed to be initialized:

Because the entropy gathered is directed to various pools and never re-used. Roughly said, first the CRNG (which feeds /dev/urandom and getrandom()) is half-initialized with 64 bits of entropy (or so), then the entropy gathered goes to initialize the /dev/random entropy pool and only after that one is initialized, it goes to complete the CRNG initialization. It might not be exactly like that and it changed a little over various kernel versions but you can get the picture.
Basically: Please leave it on the OS to ensure /dev/urandom is properly initialized, do not try to outsmart it in the library code. It would have to be patched out anyway for general use in Linux OSes.

@kroeckx
Copy link
Member

kroeckx commented Feb 19, 2019 via email

@paulidale
Copy link
Contributor

If my understanding of the process is current, this will have a race condition, whereby something else could consume the collection pool's entropy before it is sent to urandom pool. Still, it is a better approach than doing nothing.

levitte pushed a commit that referenced this issue Mar 1, 2019
Use select to wait for /dev/random in readable state,
but do not actually read anything from /dev/random,
use /dev/urandom first.

Use linux define __NR_getrandom instead of the
glibc define SYS_getrandom, in case the kernel headers
are more current than the glibc headers.

Fixes #8215

Reviewed-by: Kurt Roeckx <[email protected]>
(Merged from #8251)

(cherry picked from commit 38023b8)
@levitte levitte closed this as completed in 38023b8 Mar 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants