[PAL/Linux-SGX] AEX-Notify 5/5: Add AEX-Notify flows in exception handling #2037
base: dimakuv/aex-notify-part4
5a8651c to 4ea9dcb
Reviewable status: 0 of 6 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv)
a discussion (no related file):
Debug failure with GDB support (try LibOS regression tests that use GDB).
Reviewable status: 0 of 6 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel)
a discussion (no related file):
Debug some failures with EDMM (try LibOS regression tests).
Reviewable status: 0 of 11 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Debug failure with GDB support (try LibOS regression tests that use GDB).
Done. AEX-Notify is not really compatible with GDB, see e.g. the official whitepaper: https://cdrdv2-public.intel.com/736463/aex-notify-white-paper-public.pdf, Section 8.
libos/test/regression/manifest.template
line 27 at r2 (raw file):
sgx.edmm_enable = {{ 'true' if env.get('EDMM', '0') == '1' else 'false' }}
sgx.use_exinfo = {{ 'true' if env.get('EDMM', '0') == '1' else 'false' }}
sgx.experimental_enable_aex_notify = {{ 'true' if env.get('AEXNOTIFY', '0') == '1' else 'false' }}
For now I added this only to this manifest file, but technically it should be added to all manifest files (similar to `sgx.edmm_enable`), and at least one CI pipeline should run with the `AEXNOTIFY=1` envvar.
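To make the toggle semantics explicit, here is a minimal Python sketch that mirrors the template expression outside of the manifest. The helper names `manifest_flag` and `render_toggles` are hypothetical and not part of Gramine; the only point is that the option becomes `true` solely when the envvar is exactly `1`.

```python
def manifest_flag(env, name):
    """Mirror the template expression: 'true' only when the envvar equals '1'."""
    return 'true' if env.get(name, '0') == '1' else 'false'


def render_toggles(env):
    """Render the three reviewed manifest lines for a dict-like environment."""
    return "\n".join([
        f"sgx.edmm_enable = {manifest_flag(env, 'EDMM')}",
        f"sgx.use_exinfo = {manifest_flag(env, 'EDMM')}",
        f"sgx.experimental_enable_aex_notify = {manifest_flag(env, 'AEXNOTIFY')}",
    ])


# Example: only AEX-Notify is requested, EDMM stays off.
print(render_toggles({'AEXNOTIFY': '1'}))
```

Note that values such as `AEXNOTIFY=true` or `AEXNOTIFY=yes` would leave the feature disabled under this expression.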
pal/src/host/linux-sgx/host_exception.c
line 397 at r2 (raw file):
noreturn void fail_on_morphed_eresume(void) {
    log_error("Bug in AEX-Notify flows: ERESUME morphed into EENTER but then the enclave performed "
              "EEXIT instead of EDECCSSA. Please debug.");
This particular bug (data race) made my brain boil for two days. I definitely want to keep this diagnostic for the future, in case we ever have more bugs like this.
Explaining this data race is hard, but basically:
- AEX-Notify now allows ERESUME to morph into EENTER.
- EENTER may be exited via EDECCSSA (assumed in AEX-Notify) or via EEXIT (legacy non-AEX-Notify flows).
- The above implies that ERESUME can end up in EEXIT, in which case the enclave jumps out to the "exit target" that, by Gramine's convention for EEXIT, is specified in the RDX register.
- But Gramine also assumed that ERESUME never returns, and this assumption is now broken -> data race! Note that previously the RDX register held random garbage upon ERESUME (which made sense then, as ERESUME would never use RDX and would never return).
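The bullets above can be condensed into a toy Python model. This is only a sketch of the control-flow outcomes described in this comment, not Gramine code: the function `after_eresume` and its string results are hypothetical, and using `fail_on_morphed_eresume` as the exit target is my assumption about how the diagnostic is reached.

```python
def after_eresume(morphed_into_eenter, enclave_leaves_via, rdx_exit_target):
    """Toy model of where control ends up after the untrusted host issues ERESUME."""
    if not morphed_into_eenter:
        # Legacy behavior: the interrupted enclave code simply continues.
        return "enclave resumed; ERESUME never returns to the host"
    if enclave_leaves_via == "EDECCSSA":
        # AEX-Notify flow: the handler drops from SSA[1] to SSA[0] inside the enclave.
        return "enclave keeps running; no exit to the host"
    if enclave_leaves_via == "EEXIT":
        # Legacy flow: the enclave exits to the target that Gramine passes in RDX.
        return f"host continues at {rdx_exit_target}"
    raise ValueError(f"unknown exit path: {enclave_leaves_via!r}")


# Before the fix: RDX was never set up for ERESUME, so an EEXIT lands anywhere.
print(after_eresume(True, "EEXIT", "<random garbage>"))
# Assumed fix: a known diagnostic target, so the bug is at least reported.
print(after_eresume(True, "EEXIT", "fail_on_morphed_eresume"))
```

The third branch is the one that did not exist before AEX-Notify, which is why leaving RDX uninitialized used to be harmless.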
pal/src/host/linux-sgx/pal_exception.c
line 61 at r2 (raw file):
MB();
SET_ENCLAVE_TCB(ready_for_aex_notify, 0UL);
MB();
I feel like this could be implemented in a cleaner way, maybe re-using some of the existing variables... I am definitely not proud of this `stopping_aex_notify` helper variable, but this seemed easy.
Reviewable status: 0 of 11 files reviewed, 3 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Debug some failures with EDMM (try LibOS regression tests).
Done
Reviewable status: 0 of 61 files reviewed, 3 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners
a discussion (no related file):
In my local tests, everything seems to work. I stress-tested all our Gramine LibOS and PAL tests, with and without EDMM.
To merge this PR, our CI should have AEX-Notify-supporting workers, and the `AEXNOTIFY=1` envvar must be set in at least one Jenkins pipeline. This enablement must be done similarly to the `EDMM=1` one.
I currently put a blocking comment on this, so as not to forget about updating the CI.
libos/test/regression/manifest.template
line 27 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
For now added only in this manifest file, but technically should add in all files (similar to `sgx.edmm_enable`) and have at least one CI pipeline with `AEXNOTIFY=1` envvar.
Done, now added everywhere. The enablement is similar to EDMM (with its `EDMM=1` envvar).
Reviewable status: 0 of 61 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners
a discussion (no related file):
Quick perf numbers for Gramine built in Release mode on Ubuntu 24.04 with Linux v6.11.
Not sure if they are useful, just wanted to post here. They show that current AEX-Notify (with dummy mitigation) has small overhead.
make clean; AEXNOTIFY=0 EDMM=0 SGX=1 gramine-test pytest -k 'not TC_04_Attestation'
-- done in 200.36s
make clean; AEXNOTIFY=0 EDMM=1 SGX=1 gramine-test pytest -k 'not TC_04_Attestation'
-- done in 138.00s
make clean; AEXNOTIFY=1 EDMM=0 SGX=1 gramine-test pytest -k 'not TC_04_Attestation'
-- done in 208.25s
make clean; AEXNOTIFY=1 EDMM=1 SGX=1 gramine-test pytest -k 'not TC_04_Attestation'
-- done in 141.50s
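For reference, the relative overhead implied by these four timings can be worked out as below. This is just arithmetic on the numbers posted above; each configuration was timed once, so the percentages (roughly 3.9% without EDMM and 2.5% with EDMM) carry no statistical weight.

```python
# Wall-clock seconds from the four runs above, keyed by (AEXNOTIFY, EDMM).
timings = {
    (0, 0): 200.36,
    (0, 1): 138.00,
    (1, 0): 208.25,
    (1, 1): 141.50,
}


def overhead_percent(edmm):
    """Relative slowdown of AEXNOTIFY=1 over AEXNOTIFY=0 for a given EDMM setting."""
    base = timings[(0, edmm)]
    with_aex = timings[(1, edmm)]
    return (with_aex - base) / base * 100.0


for edmm in (0, 1):
    print(f"EDMM={edmm}: AEX-Notify overhead {overhead_percent(edmm):.1f}%")
```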
This commit adds the AEX-Notify flows inside the enclave.
The stage-1 signal handler is augmented as follows when AEX-Notify is enabled: manually restore SSA[0] context, invoke the EDECCSSA instruction instead of EEXIT (to go from SSA[1] to SSA[0] without exiting the enclave) and finally jump to SSA[0].GPRSGX.RIP to resume enclave execution (it will resume in stage-2 signal handler).
The stage-2 signal handler is augmented as follows: set bit 0 of SSA[0].GPRSGX.AEXNOTIFY (so that AEX-Notify starts working again for this thread), then apply AEX-Notify mitigations and finally restore regular enclave execution.
This commit does not add any real AEX-Notify mitigations. Instead, we count the number of AEX events reported inside the SGX enclave and print this number on enclave termination (if log level is at least "warning").
Note that the current implementation of AEX-Notify does not use the checkpoint mechanism described in the official AEX-Notify whitepaper. That checkpoint mechanism allows coalescing multiple AEX events that occur during the execution of mitigations. This saves some CPU cycles and some signal-handling stack space, but we leave implementing this optimization as future work.
Signed-off-by: Dmitrii Kuvaiskii
Fixed GDB issue. Fixed a SIGSEGV data race on thread termination (ERESUME morphs into EENTER but then performs EEXIT). Added AEXNOTIFY envvar to LibOS regression tests (but only to a subset from `manifest.template`, simply because changing all manifest template files would be a huge git diff).
Signed-off-by: Dmitrii Kuvaiskii
Fixed EDMM issue. Turned out to be a case of too many nested signal handlers inside Gramine's SGX PAL, which overflowed the SGX enclave signal stack.
Signed-off-by: Dmitrii Kuvaiskii
This commit adds conditional AEX-Notify enablement to all Gramine tests. Run tests e.g. like this (on a machine that supports AEX-Notify both in hardware and in Linux kernel):
$ EDMM=1 AEXNOTIFY=1 SGX=1 gramine-test pytest
Signed-off-by: Dmitrii Kuvaiskii
45f12b3 to 6b3950c
3e518fc to 6504586
Reviewable status: 0 of 61 files reviewed, 5 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners
a discussion (no related file):
Must be applied on top of #2036. Blocking.
Do you have any intuition as to why AEX-Notify introduces overhead for these tests? In the Intel SGX SDK implementation, we never observed overheads introduced by AEX-Notify, except in the microbenchmarks that repeatedly enter and exit the enclave and do nothing else.
IMO the checkpoint mechanism is not strictly an optimization. In very specific circumstances, if the enclave repeatedly takes interrupts in the AEX-Notify handler that prevent the stack from unwinding, the enclave thread could run out of stack. This is a different user-observable behavior that would be introduced by this PR, and therefore the checkpoint mechanism--which prevents stack overflow--is not merely an optimization.
Reviewable status: 0 of 61 files reviewed, 6 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @scottconstable)
a discussion (no related file):
Previously, scottconstable (Scott Constable) wrote…
Do you have any intuition as to why AEX-Notify introduces overhead for these tests? In the Intel SGX SDK implementation, we never observed overheads introduced by AEX-Notify, except in the microbenchmarks that repeatedly enter and exit the enclave and do nothing else.
To be honest, no, I don't know. I just assumed that this tiny overhead is normal for AEX-Notify. Also note that this is the complete start-to-end time from starting the first regression test to finishing the last regression test, so maybe some "enclave startup" AEX-Notify-specific logic explains this overhead?
a discussion (no related file):
@scottconstable wrote:
IMO the checkpoint mechanism is not strictly an optimization. In very specific circumstances, if the enclave repeatedly takes interrupts in the AEX-Notify handler that prevent the stack from unwinding, the enclave thread could run out of stack. This is a different user-observable behavior that would be introduced by this PR, and therefore the checkpoint mechanism--which prevents stack overflow--is not merely an optimization.
Yes, it's correct, the enclave thread could run out of stack. In this case, Gramine chooses a fail-fast approach and hangs such a thread:
gramine/pal/src/host/linux-sgx/enclave_entry.S
Lines 431 to 438 in 6be2b4a
# Disallow too many nested exceptions. In normal Gramine flow, this should never happen. Since
# addresses need to be canonical, this addition does not overflow.
movq %gs:SGX_SIG_STACK_HIGH, %rax
addq %gs:SGX_SIG_STACK_LOW, %rax
shrq $1, %rax
cmp %rax, %rsi
jae 1f
FAIL_LOOP
I hope Gramine never hits these "very specific circumstances" in benign executions. If it does, then yes, the checkpoint mechanism would need to be implemented on top of the current code.
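For readers less used to AT&T syntax, the check in the quoted assembly can be restated in Python. This is a sketch under the assumption that `%rsi` holds the candidate stack pointer for the next signal frame at this point (the snippet itself does not show that); the function name and the example addresses are made up for illustration.

```python
def nested_exception_allowed(sig_stack_low, sig_stack_high, candidate_sp):
    """Model of the quoted check: continue only while the candidate stack pointer
    is at or above the midpoint of the signal stack (the stack grows downwards)."""
    midpoint = (sig_stack_high + sig_stack_low) >> 1  # movq + addq + shrq $1
    return candidate_sp >= midpoint                   # cmp %rax, %rsi ; jae 1f


# Hypothetical signal stack spanning [0x10000, 0x20000): the midpoint is 0x18000.
print(nested_exception_allowed(0x10000, 0x20000, 0x1f000))  # shallow nesting, allowed
print(nested_exception_allowed(0x10000, 0x20000, 0x17ff0))  # too deep, FAIL_LOOP
```

In other words, nested handlers may consume at most the upper half of the signal stack before the thread is parked in `FAIL_LOOP`.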
Reviewable status: 0 of 61 files reviewed, 6 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
To be honest, no, I don't know. I just assumed that this tiny overhead is normal for AEX-Notify. Also note that this is the complete start-to-end time from starting the first regression test to finishing the last regression test, so maybe some "enclave startup" AEX-Notify-specific logic explains this overhead?
I don't see any reason that AEX-Notify would introduce additional startup overhead. These performance results look highly suspicious to me. In our benchmarks on the Intel SGX SDK we observed a performance impact in only two scenarios, and this impact is explainable by the changes to the software architecture:
1. Resuming after an AEX triggered by an exception that the enclave is expected to handle is faster because EDECCSSA saves an additional round-trip out of the enclave.
2. Resuming after an AEX triggered by any other interrupt/exception is slower because software is responsible for doing the work (restoring state from the SSA, etc.) that would otherwise be done by ERESUME when AEX-Notify is not enabled.
When the exception handling flows in Gramine were changed to support AEX-Notify, I wonder if these changes introduced some new, unintended overhead. That is one possible explanation. Another is that Gramine takes many more exceptions that are not expected to be handled within the enclave, and therefore flow (2) is occurring much more frequently than it does with enclaves built using the Intel SGX SDK.
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
@scottconstable wrote:
IMO the checkpoint mechanism is not strictly an optimization. In very specific circumstances, if the enclave repeatedly takes interrupts in the AEX-Notify handler that prevent the stack from unwinding, the enclave thread could run out of stack. This is a different user-observable behavior that would be introduced by this PR, and therefore the checkpoint mechanism--which prevents stack overflow--is not merely an optimization.
Yes, it's correct, the enclave thread could run out of stack. In this case, Gramine chooses a fail-fast approach and hangs such a thread:
gramine/pal/src/host/linux-sgx/enclave_entry.S
Lines 431 to 438 in 6be2b4a
# Disallow too many nested exceptions. In normal Gramine flow, this should never happen. Since
# addresses need to be canonical, this addition does not overflow.
movq %gs:SGX_SIG_STACK_HIGH, %rax
addq %gs:SGX_SIG_STACK_LOW, %rax
shrq $1, %rax
cmp %rax, %rsi
jae 1f
FAIL_LOOP
I hope Gramine never hits these "very specific circumstances" in benign executions. If it does, then yes, the checkpoint mechanism would need to be implemented on top of the current code.
I agree with your summary. The checkpoint mechanism is required for the AEX-Notify single-/zero-step mitigation that is used by the SGX SDK, because the mitigation can be trivially bypassed without the checkpoint. The only other benefit of the checkpoint is that it avoids this theoretical stack overflow, and I agree that this should never happen in benign circumstances (under malicious circumstances, it would be a DoS).
Description of the changes
Part 5 in AEX-Notify series.
This PR adds the AEX-Notify flows inside the enclave.
The stage-1 signal handler is augmented as follows when AEX-Notify is enabled: manually restore SSA[0] context, invoke the EDECCSSA instruction instead of EEXIT (to go from SSA[1] to SSA[0] without exiting the enclave) and finally jump to SSA[0].GPRSGX.RIP to resume enclave execution (it will resume in stage-2 signal handler).
The stage-2 signal handler is augmented as follows: set bit 0 of SSA[0].GPRSGX.AEXNOTIFY (so that AEX-Notify starts working again for this thread), then apply AEX-Notify mitigations and finally restore regular enclave execution.
This PR does not add any real AEX-Notify mitigations. Instead, we count the number of AEX events reported inside the SGX enclave and print this number on enclave termination (if log level is at least "warning").
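The two handler stages described above can be pictured with a small Python state model. It is only an illustration of the sequence in this description: the `EnclaveThread` class and both functions are hypothetical, the SSA context restore is elided, and incrementing the counter in stage 2 is an assumption about where the "dummy mitigation" runs.

```python
from dataclasses import dataclass


@dataclass
class EnclaveThread:
    cssa: int = 1            # SSA frame index while the stage-1 handler runs
    ssa0_aexnotify: int = 0  # models SSA[0].GPRSGX.AEXNOTIFY
    aex_events: int = 0      # counter reported on enclave termination


def stage1_handler(thread):
    """Restore SSA[0] context (elided), then EDECCSSA instead of EEXIT."""
    assert thread.cssa == 1, "stage 1 is expected to run on SSA[1]"
    thread.cssa -= 1  # EDECCSSA: SSA[1] -> SSA[0] without leaving the enclave
    return "jump to SSA[0].GPRSGX.RIP"  # execution continues in stage 2


def stage2_handler(thread):
    """Re-arm AEX-Notify, then run the mitigation (here only a counter)."""
    thread.ssa0_aexnotify |= 1  # set bit 0 so AEX-Notify works again for this thread
    thread.aex_events += 1      # stand-in for real mitigations


thread = EnclaveThread()
stage1_handler(thread)
stage2_handler(thread)
print(thread)
```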
Note that the current implementation of AEX-Notify does not use the checkpoint mechanism described in the official AEX-Notify whitepaper. That checkpoint mechanism allows coalescing multiple AEX events that occur during the execution of mitigations. This saves some CPU cycles and some signal-handling stack space, but we leave implementing this optimization as future work.
See also related PRs and discussions:
Related documentation:
Closes #1530
Closes #1531
How to test this PR?
AEX-Notify is enabled in all LibOS/PAL test manifests if the `AEXNOTIFY=1` environment variable is set.