Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PAL/Linux-SGX] AEX-Notify 5/5: Add AEX-Notify flows in exception handling #2037

Open
wants to merge 4 commits into
base: dimakuv/aex-notify-part4
Choose a base branch
from

Conversation

dimakuv
Copy link

@dimakuv dimakuv commented Oct 16, 2024

Description of the changes

Part 5 in AEX-Notify series.

This PR adds the AEX-Notify flows inside the enclave.

The stage-1 signal handler is augmented as follows when AEX-Notify is enabled: manually restore SSA[0] context, invoke the EDECCSSA instruction instead of EEXIT (to go from SSA[1] to SSA[0] without exiting the enclave) and finally jump to SSA[0].GPRSGX.RIP to resume enclave execution (it will resume in stage-2 signal handler).

The stage-2 signal handler is augmented as follows: set bit 0 of SSA[0].GPRSGX.AEXNOTIFY (so that AEX-Notify starts working again for this thread), then apply AEX-Notify mitigations and finally restore regular enclave execution.

This PR does not add any real AEX-Notify mitigations. Instead, we count the number of AEX events reported inside the SGX enclave and print this number on enclave termination (if log level is at least "warning").

Note that current implementation of AEX-Notify does not use the checkpoint mechanism described in the official AEX-Notify whitepaper. That checkpoint mechanism allows to coalesce multiple AEX events that occur during the execution of mitigations. This saves some CPU cycles and some signal-handling stack space, but we leave implementing this optimization as future work.

See also related PRs and discussions:

Related documentation:

Closes #1530
Closes #1531

How to test this PR?

AEX-Notify is enabled in all LibOS/PAL test manifests if AEXNOTIFY=1 environment variable is set.


This change is Reviewable

@dimakuv dimakuv force-pushed the dimakuv/aex-notify-part5 branch 2 times, most recently from 5a8651c to 4ea9dcb Compare October 16, 2024 14:37
Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 6 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv)

a discussion (no related file):
Debug failure with GDB support (try LibOS regression tests that use GDB).


Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 6 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel)

a discussion (no related file):
Debug some failures with EDMM (try LibOS regression tests).


Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 11 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners

a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Debug failure with GDB support (try LibOS regression tests that use GDB).

Done. AEX-Notify is not really compatible with GDB, see e.g. the official whitepaper: https://cdrdv2-public.intel.com/736463/aex-notify-white-paper-public.pdf, Section 8.



libos/test/regression/manifest.template line 27 at r2 (raw file):

sgx.edmm_enable = {{ 'true' if env.get('EDMM', '0') == '1' else 'false' }}
sgx.use_exinfo = {{ 'true' if env.get('EDMM', '0') == '1' else 'false' }}
sgx.experimental_enable_aex_notify = {{ 'true' if env.get('AEXNOTIFY', '0') == '1' else 'false' }}

For now added only in this manifest file, but technically should add in all files (similar to sgx.edmm_enable) and have at least one CI pipeline with AEXNOTIFY=1 envvar.


pal/src/host/linux-sgx/host_exception.c line 397 at r2 (raw file):

noreturn void fail_on_morphed_eresume(void) {
    log_error("Bug in AEX-Notify flows: ERESUME morphed into EENTER but then the enclave performed "
              "EEXIT instead of EDECCSSA. Please debug.");

This particular bug (data race) made my brain boil for two days. I definitely want to keep this diagnostics for future, if we ever have more bugs like this.

Explaining this data race is hard, but basically:

  • AEX-Notify now allows ERESUME to morph into EENTER.
  • EENTER may be exited via EDECCSSA (assumed in AEX-Notify) or via EEXIT (legacy non-AEX-Notify flows).
  • The above implies that ERESUME can end up in EEXIT, and the enclave should jump out to the "exit target" that by our Gramine convention is specified in RDX reg (I mean the Gramine convention for EEXIT).
  • But Gramine also assumes that ERESUME never returns, and this is now broken -> data race! Note that before, RDX reg was random garbage upon ERESUME (which made sense before, as ERESUME would never use RDX and would never return).

pal/src/host/linux-sgx/pal_exception.c line 61 at r2 (raw file):

    MB();
    SET_ENCLAVE_TCB(ready_for_aex_notify, 0UL);
    MB();

I feel like this could be implemented in a cleaner way, maybe re-using some of the existing variables... I am definitely not proud of this stopping_aex_notify helper variable, but this seemed easy.

Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 11 files reviewed, 3 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners

a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Debug some failures with EDMM (try LibOS regression tests).

Done


Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 61 files reviewed, 3 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners

a discussion (no related file):
In my local tests, everything seems to work. I stress-tested all our Gramine LibOS and PAL tests, with and without EDMM.

To merge this PR, our CI should have AEX-Notify-supporting workers, and AEXNOTIFY=1 envvar must be set in at least one Jenkins pipeline. This enablement must be done similarly to the EDMM=1 one.

I currently put a blocking comment on this, not to forget about updating the CI.



libos/test/regression/manifest.template line 27 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

For now added only in this manifest file, but technically should add in all files (similar to sgx.edmm_enable) and have at least one CI pipeline with AEXNOTIFY=1 envvar.

Done, now added everywhere. The enablement is similar to EDMM (with its EDMM=1 envvar).

Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 61 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners

a discussion (no related file):
Quick perf numbers for Gramine built in Release mode on Ubuntu 24.04 with Linux v6.11.

Not sure if they are useful, just wanted to post here. They show that current AEX-Notify (with dummy mitigation) has small overhead.

  • make clean; AEXNOTIFY=0 EDMM=0 SGX=1 gramine-test pytest -k 'not TC_04_Attestation' -- done in 200.36s
  • make clean; AEXNOTIFY=0 EDMM=1 SGX=1 gramine-test pytest -k 'not TC_04_Attestation' -- done in 138.00s
  • make clean; AEXNOTIFY=1 EDMM=0 SGX=1 gramine-test pytest -k 'not TC_04_Attestation' -- done in 208.25s
  • make clean; AEXNOTIFY=1 EDMM=1 SGX=1 gramine-test pytest -k 'not TC_04_Attestation' -- done in 141.50s

Dmitrii Kuvaiskii added 4 commits October 22, 2024 00:27
This commit adds the AEX-Notify flows inside the enclave.

The stage-1 signal handler is augmented as follows when AEX-Notify is
enabled: manually restore SSA[0] context, invoke the EDECCSSA
instruction instead of EEXIT (to go from SSA[1] to SSA[0] without
exiting the enclave) and finally jump to SSA[0].GPRSGX.RIP to resume
enclave execution (it will resume in stage-2 signal handler).

The stage-2 signal handler is augmented as follows: set bit 0 of
SSA[0].GPRSGX.AEXNOTIFY (so that AEX-Notify starts working again for
this thread), then apply AEX-Notify mitigations and finally restore
regular enclave execution.

This commit does not add any real AEX-Notify mitigations. Instead, we
count the number of AEX events reported inside the SGX enclave and print
this number on enclave termination (if log level is at least "warning").

Note that current implementation of AEX-Notify does not use the
checkpoint mechanism described in the official AEX-Notify whitepaper.
That checkpoint mechanism allows to coalesce multiple AEX events
that occur during the execution of mitigations. This saves some CPU
cycles and some signal-handling stack space, but we leave implementing
this optimization as future work.

Signed-off-by: Dmitrii Kuvaiskii <[email protected]>
Fixed GDB issue. Fixed a SIGSEGV data race on thread termination
(ERESUME morphs into EENTER but then performs EEXIT). Added AEXNOTIFY
envvar to LibOS regression tests (but only to a subset from
`manifest.template`, simply because changing all manifest template files
would be a huge git diff).

Signed-off-by: Dmitrii Kuvaiskii <[email protected]>
Fixed EDMM issue. Turned out to be a case of too many nested signal
handlers inside Gramine's SGX PAL, which overflowed the SGX enclave
signal stack.

Signed-off-by: Dmitrii Kuvaiskii <[email protected]>
This commit adds conditional AEX-Notify enablement to all Gramine tests.

Run tests e.g. like this (on a machine that supports AEX-Notify both in
hardware and in Linux kernel):

    $ EDMM=1 AEXNOTIFY=1 SGX=1 gramine-test pytest

Signed-off-by: Dmitrii Kuvaiskii <[email protected]>
@dimakuv dimakuv changed the title [PAL/Linux-SGX] Add AEX-Notify flows in exception handling [PAL/Linux-SGX] AEX-Notify 5/5: Add AEX-Notify flows in exception handling Oct 22, 2024
Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 61 files reviewed, 5 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners

a discussion (no related file):
Must be applied on top of #2036. Blocking.


@scottconstable
Copy link

Reviewable status: 0 of 61 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners

a discussion (no related file): Quick perf numbers for Gramine built in Release mode on Ubuntu 24.04 with Linux v6.11.

Not sure if they are useful, just wanted to post here. They show that current AEX-Notify (with dummy mitigation) has small overhead.

  • make clean; AEXNOTIFY=0 EDMM=0 SGX=1 gramine-test pytest -k 'not TC_04_Attestation' -- done in 200.36s
  • make clean; AEXNOTIFY=0 EDMM=1 SGX=1 gramine-test pytest -k 'not TC_04_Attestation' -- done in 138.00s
  • make clean; AEXNOTIFY=1 EDMM=0 SGX=1 gramine-test pytest -k 'not TC_04_Attestation' -- done in 208.25s
  • make clean; AEXNOTIFY=1 EDMM=1 SGX=1 gramine-test pytest -k 'not TC_04_Attestation' -- done in 141.50s

Do you have any intuition as to why AEX-Notify introduces overhead for these tests? In the Intel SGX SDK implementation, we never observed overheads introduced by AEX-Notify, except in the microbenchmarks that repeatedly enter and exit the enclave and do nothing else.

@scottconstable
Copy link

Note that current implementation of AEX-Notify does not use the checkpoint mechanism described in the official AEX-Notify whitepaper. That checkpoint mechanism allows to coalesce multiple AEX events that occur during the execution of mitigations. This saves some CPU cycles and some signal-handling stack space, but we leave implementing this optimization as future work.

IMO the checkpoint mechanism is not strictly an optimization. In very specific circumstances, if the enclave repeatedly takes interrupts in the AEX-Notify handler that prevent the stack from unwinding, the enclave thread could run out of stack. This is a different user-observable behavior that would be introduced by this PR, and therefore the checkpoint mechanism--which prevents stack overflow--is not merely an optimization.

Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 61 files reviewed, 6 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @scottconstable)

a discussion (no related file):

Previously, scottconstable (Scott Constable) wrote…

Do you have any intuition as to why AEX-Notify introduces overhead for these tests? In the Intel SGX SDK implementation, we never observed overheads introduced by AEX-Notify, except in the microbenchmarks that repeatedly enter and exit the enclave and do nothing else.

To be honest, no, I don't know. I just assumed that this tiny overhead is normal for AEX-Notify. Also note that this is the complete start-to-end time from starting the first regression test to finishing the last regression test, so maybe some "enclave startup" AEX-Notify-specific logic explains this overhead?


a discussion (no related file):
@scottconstable wrote:

IMO the checkpoint mechanism is not strictly an optimization. In very specific circumstances, if the enclave repeatedly takes interrupts in the AEX-Notify handler that prevent the stack from unwinding, the enclave thread could run out of stack. This is a different user-observable behavior that would be introduced by this PR, and therefore the checkpoint mechanism--which prevents stack overflow--is not merely an optimization.

Yes, it's correct, the enclave thread could run out of stack. In this case, Gramine chooses a fail-fast approach and hangs such a thread:

# Disallow too many nested exceptions. In normal Gramine flow, this should never happen. Since
# addresses need to be canonical, this addition does not overflow.
movq %gs:SGX_SIG_STACK_HIGH, %rax
addq %gs:SGX_SIG_STACK_LOW, %rax
shrq $1, %rax
cmp %rax, %rsi
jae 1f
FAIL_LOOP

I hope Gramine never hits these "very specific circumstances" in benign executions. If it does, then yes, the checkpoint mechanism would need to be implemented on top of the current code.


Copy link

@scottconstable scottconstable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 61 files reviewed, 6 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)

a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

To be honest, no, I don't know. I just assumed that this tiny overhead is normal for AEX-Notify. Also note that this is the complete start-to-end time from starting the first regression test to finishing the last regression test, so maybe some "enclave startup" AEX-Notify-specific logic explains this overhead?

I don't see any reason that AEX-Notify would introduce additional startup overhead. These performance results look highly suspicious to me. In our benchmarks on the Intel SGX SDK we observed a performance impact in only two scenarios, and this impact is explainable by the changes to the software architecture:

  1. Resuming after an AEX triggered by an exception that the enclave is expected to handle is faster because EDECCSSA saves an additional round-trip out of the enclave.
  2. Resuming after an AEX triggered by any other interrupt/exception is slower because software is responsible for doing the work (restoring state from the SSA, etc.) that would otherwise be done by ERESUME when AEX-Notify is not enabled.

When the exception handling flows in Gramine were changed to support AEX-Notify, I wonder if these changes introduced some new, unintended overhead. That is one possible explanation. Another is that Gramine takes many more exceptions that are not expected to be handled within the enclave, and therefore flow (2) is occurring much more frequently than it does with enclaves built using the Intel SGX SDK.


a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

@scottconstable wrote:

IMO the checkpoint mechanism is not strictly an optimization. In very specific circumstances, if the enclave repeatedly takes interrupts in the AEX-Notify handler that prevent the stack from unwinding, the enclave thread could run out of stack. This is a different user-observable behavior that would be introduced by this PR, and therefore the checkpoint mechanism--which prevents stack overflow--is not merely an optimization.

Yes, it's correct, the enclave thread could run out of stack. In this case, Gramine chooses a fail-fast approach and hangs such a thread:

# Disallow too many nested exceptions. In normal Gramine flow, this should never happen. Since
# addresses need to be canonical, this addition does not overflow.
movq %gs:SGX_SIG_STACK_HIGH, %rax
addq %gs:SGX_SIG_STACK_LOW, %rax
shrq $1, %rax
cmp %rax, %rsi
jae 1f
FAIL_LOOP

I hope Gramine never hits these "very specific circumstances" in benign executions. If it does, then yes, the checkpoint mechanism would need to be implemented on top of the current code.

I agree with your summary. The checkpoint mechanism is required for the AEX-Notify single-/zero-step mitigation that is used by the SGX SDK, because the mitigation can be trivially bypassed without the checkpoint. The only other benefit of the checkpoint is that it avoids this theoretical stack overflow, and I agree that this should never happen in benign circumstances (under malicious circumstances, it would be a DoS).


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants