Skip to content

Conversation

janvorli
Copy link
Member

There is a problematic case when return address is hijacked while in a managed method that tail calls a GC write barrier and when CET is enabled. The write barrier code can change while the handler for the hijacked address is executed from the vectored exception handler. When the vectored exception handler then returns to the write barrier to re-execute the ret instruction that has triggered the vectored exception handler due to the main stack containing a different address than the shadow stack (now with the main stack fixed), the instruction may no longer be ret due to the change of the write barrier change.

This change fixes it by setting the context to return to from the vectored exception handler to point to the caller and setting the Rsp and SSP to match that. That way, the write barrier code no longer matters.

There is a problematic case when return address is hijacked while in a
managed method that tail calls a GC write barrier and when CET is
enabled. The write barrier code can change while the handler for the
hijacked address is executed from the vectored exception handler.
When the vectored exception handler then returns to the write barrier to
re-execute the `ret` instruction that has triggered the vectored
exception handler due to the main stack containing a different address
than the shadow stack (now with the main stack fixed), the instruction
may no longer be `ret` due to the change of the write barrier change.

This change fixes it by setting the context to return to from the
vectored exception handler to point to the caller and setting the Rsp
and SSP to match that. That way, the write barrier code no longer
matters.
@janvorli janvorli added this to the 9.0.0 milestone Oct 21, 2024
@janvorli janvorli requested a review from VSadov October 21, 2024 13:21
@janvorli janvorli self-assigned this Oct 21, 2024
@jkotas
Copy link
Member

jkotas commented Oct 21, 2024

Should the same change be made in naot to keep the two implementations in sync?

if (areShadowStacksEnabled)
{
// Undo the "pop", so that the ret could now succeed.
interruptedContext->SetSp(interruptedContext->GetSp() - 8);
interruptedContext->SetIp(origIp);
}

@VSadov
Copy link
Member

VSadov commented Oct 21, 2024

NativeAOT does not require the fix, but it would be better if implementations match.

LGTM, otherwise. Thanks!!

@janvorli
Copy link
Member Author

I've just added commit with the nativeaot change. But I would like to know your opinion on where should the newly added GetSSP and SetSSP be.and whether they should be made a PAL API or not. Their implementation needs the windows headers to be included or the new types / API it uses redefined.

@VSadov
Copy link
Member

VSadov commented Oct 22, 2024

I would like to know your opinion on where should the newly added GetSSP and SetSSP be.and whether they should be made a PAL API or not.

That seems fine to me. That is where I'd put the helpers as well.

@janvorli
Copy link
Member Author

janvorli commented Nov 5, 2024

/backport to release/9.0-staging

Copy link
Contributor

github-actions bot commented Nov 5, 2024

Started backporting to release/9.0-staging: https://github.com/dotnet/runtime/actions/runs/11686309680

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants