Skip to content
This repository has been archived by the owner on Jan 20, 2022. It is now read-only.

[Pal/Linux-SGX] Calling send() from multiple threads causes error in mbedTLS #2058

Open
lejunzhu opened this issue Jan 5, 2021 · 3 comments · May be fixed by #2059
Open

[Pal/Linux-SGX] Calling send() from multiple threads causes error in mbedTLS #2058

lejunzhu opened this issue Jan 5, 2021 · 3 comments · May be fixed by #2059

Comments

@lejunzhu
Copy link
Contributor

lejunzhu commented Jan 5, 2021

Description of the problem

When the application creates a socket pair and calls send() from more than one thread simultaneously, sometimes recv() at the other end of the pipe will receive invalid data and returns -13 (permission denied).

Tracing the problem leads to:

https://github.com/oscarlab/graphene/blob/345d271e663725d58ce941fc49f08ebc02c32fa2/Pal/lib/crypto/adapters/mbedtls_adapter.c#L504-L509

Here mbedtls_ssl_write() is called without any lock protection. Because writing to the same mbedTLS context is not thread-safe, it triggers a race condition inside mbedTLS and causes encrypted data to be sent prematurely.

Adding a spinlock (perhaps should be a futex?) around mbedtls_ssl_write() makes the problem go away.

This is found when testing Python asyncio, which internally uses one-byte write to signal the waiting thread. Errors like this appear in the log, which is caused by this same problem:

Exception in callback BaseSelectorEventLoop._read_from_self()
handle: <Handle BaseSelectorEventLoop._read_from_self()>
Traceback (most recent call last):
  File "/usr/lib/python3.7/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/lib/python3.7/asyncio/selector_events.py", line 119, in _read_from_self
    data = self._ssock.recv(4096)
PermissionError: [Errno 13] Permission denied

Steps to reproduce

Run the C program attached, with and without Graphene SGX

test.txt

Expected results

When running without Graphene, the program will report:

c1=100000, c2=100000, e13=0

Actual results

Running with Graphene SGX, the result shows a lot of error -13 and few successful recv():

c1=186, c2=64, e13=199750

@dimakuv
Copy link

dimakuv commented Jan 5, 2021

Thanks @lejunzhu for the awesome analysis, as always!

I was vaguely aware of this issue when adding mbedTLS send/recv to pipes/UNIX domain sockets, but thought that applications don't really do this (call send() from more than one thread simultaneously).

Adding a spinlock (perhaps should be a futex?) around mbedtls_ssl_write() makes the problem go away.

We need to add mutexes at a higher level than around mbedtls_ssl_write(). The best level seems to be in the Linux-SGX PAL, in functions of db_pipes.c. I will cook a PR soon (unless @lejunzhu wants to do it?)

@dimakuv
Copy link

dimakuv commented Jan 5, 2021

@lejunzhu Check out #2059.

@dimakuv
Copy link

dimakuv commented Jul 22, 2021

This one is annoying hard to fix...

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants