-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kernel: use TPR for interrupt reentrancy protection #568
Conversation
fa35f13
to
0f3880c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the changes look good to me, I left two comments to improve the readability.
Also, on my KVM setup, the test-in-svsm
run failed with this message and stack-trace:
[SVSM] test cpu::irq_state::tests::tpr_guard_test ...
[SVSM] ERROR: Panic: CPU[0] panicked at kernel/src/cpu/irq_state.rs:445:9:
assertion `left == right` failed
left: 0
right: 9
[SVSM] ---BACKTRACE---:
[SVSM] [0xffffff8000144e73]
[SVSM] [0xffffff800013eb3a]
[SVSM] [0xffffff80000480a9]
[SVSM] [0xffffff80000713c1]
[SVSM] [0xffffff800010dd01]
[SVSM] [0xffffff80000da236]
[SVSM] [0xffffff80000175d5]
[SVSM] [0xffffff800009e93e]
[SVSM] [0xffffff80000cdeea]
[SVSM] [0xffffff80000cdef0]
[SVSM] ---END---
Decoded stack-trace is:
core::panicking::assert_failed_inner
??:?
core::panicking::assert_failed
??:?
svsm::cpu::irq_state::tests::tpr_guard_test
/home/joro/src/svsm/kernel/src/cpu/irq_state.rs:?
svsm::cpu::irq_state::tests::tpr_guard_test::{{closure}}
/home/joro/src/svsm/kernel/src/cpu/irq_state.rs:428
core::ops::function::FnOnce::call_once
/rustc/bc5cf994db9fb46712cefd89f78ad7fc51f184a2/library/core/src/ops/function.rs:250
svsm::testing::svsm_test_runner
/home/joro/src/svsm/kernel/src/testing.rs:80
svsm::test_main
/home/joro/src/svsm/kernel/src/lib.rs:?
svsm_main
/home/joro/src/svsm/kernel/src/svsm.rs:335
svsm::task::tasks::task_exit
/home/joro/src/svsm/kernel/src/task/tasks.rs:787
That suggests a fatal problem with TPR use in an environment that doesn't support the use of interrupts. I cannot think of any reason that the SVSM would observe a TPR of 9, particularly if the firmware has not yet started executing. But regardless, if vTPR is shared between the SVSM and the guest OS, then it is not usable for any reason in the SVSM, and is certainly not usable for lock management. For the sake of making progress on TPR and making it usable for IPIs (another scenario that depends on the use of SVSM interrupts), I will plan to split out the scheduler lock changes into a separate PR which we will postpone until SVSM interrupts are fully supported. I will also change the unit test so it only executes on systems that support SVSM interrupts. I would be grateful if you could try to figure out where the raised TPR is coming from, since that only happens on KVM systems. We should at least understand why it appears random in the test environment, since that is unexpected. |
It turns out that the |
I just ran into a fundamental problem with KVM and these patches. As it turns out, with the 6.8 kernel the TPR is still shared across VMPL levels, which makes the SVSM crash when Linux changes the TPR. The road forward here is to update the SVSM development branches for kernel and QEMU. |
In what way does that cause this change to break? I've removed the changes to consume TPR-safe locks, so the only reference to TPR should be in interrupt dispatch, and since the SVSM does not support interrupts when running under KVM, there should be no issue. I would expect that everything now functions correctly under KVM. Are you still seeing issues? |
The assert for |
The priority scheme inherently associated with interrupts permits masking interrupts of lower priority classes while handling higher-priority interrupts. This makes it possible to enable interrupts while in an interrupt handler, allowing high-priority interrupts to be handled while masking lower-priorty interrupts. By raising TPR during interrupt handling, it becomes possible to be safely preempted by higher-priority interrupts during interrupt handling. This change also defines a mechanism where code unrelated to interrupt handlers can temporarily raise TPR to mask low-priority interrupts while permitting the delivery of high-priority interrupts. In the future, it will be possible to enable spin locks or other locks to be associated with TPR so that a lock can safely be acquired and prevent reentracncy by low-priority interrupts while still permitting the handilng of high-priority interrupts. Signed-off-by: Jon Lange <[email protected]>
Using TPR for reentrancy protection means that higher priority interrupts can continue to be delivered while locks are held. Each TPR-protected lock defines the TPR with which it is associated, so any attempt to acquire the lock from a higher priority interrupt will panic due to TPR inversion. Signed-off-by: Jon Lange <[email protected]>
Thanks for identifying the issue. I've changed the assert so it will only fire when interrupts are usable (since the assert only matters if interrupts are usable). I believe this PR will now work without requiring a kernel change. |
Thanks for updating the PR. Testing looks good now. |
Using TPR for interrupt reentrancy protection allows high-priority interrupts to be serviced while low-priority interrupts are masked. This PR introduces TPR management logic and integrates TPR management into locking mechanisms.