
Do not use AllocHGlobal in Pkcs12Reader #97447

Closed · wanted to merge 6 commits

Conversation

@krwq (Member) commented Jan 24, 2024:

Make a more targeted fix for #93647

@ghost commented Jan 24, 2024:

Tagging subscribers to this area: @dotnet/area-system-security, @bartonjs, @vcsjones
See info in area-owners.md if you want to be subscribed.

Issue Details

Make a more targeted fix for #93647

Author: krwq
Assignees: krwq
Labels: area-System.Security
Milestone: -

@@ -39,19 +39,13 @@ protected void ParsePkcs12(ReadOnlySpan<byte> data)

// Windows compatibility: Ignore trailing data.
ReadOnlySpan<byte> encodedData = reader.PeekEncodedValue();
byte[] dataWithoutTrailing = GC.AllocateUninitializedArray<byte>(encodedData.Length, pinned: true);
Member commented:
FWIW, I have the same concerns here as I did with the other I reviewed. The pinned object heap is not designed for short-lived objects.

Member commented:

+1. This is replacing a memory leak with a perf bug that can be worse than the memory leak.

Can you wrap the unmanaged memory in a safe handle and AddRef/Release the safe handle around accesses?

@krwq (Member, Author) commented Jan 25, 2024:

I'm not a perf expert, but this is not a hot path by any means. I suppose all operations done later with the cert are far heavier than the cost of this specific allocation/freeing; note there is also a lot of parsing going on even within the constructor, and that's without doing any actual crypto-related work.

I honestly think we should either change the code so that pinning is not needed at all, or invest in changes that make this construct fast (I'd imagine passing bytes to native code isn't that uncommon).

I personally greatly prefer simplicity to small perf gains, but if there is evidence this is that bad then maybe it is worth it.

When I ran this in a tight loop (that was just the ctor, not E2E) with different cert sizes, the OOM was very visible while this potential perf issue wasn't particularly noticeable (though I didn't measure throughput). I'm having a really hard time seeing how it can be worse.

Member commented:

This isn't about the perf of this code. It's about the effect it has on the rest of the application.

@stephentoub (Member) commented Jan 25, 2024:

It's about the effect it has on the rest of the application.

Just to demonstrate what I'm talking about: here it's not the Mean that I care most about (though it is significantly slower, and reflective of the impact this has not only on this code but on other code in the process as well). More important is the fact that in the non-pinned version there are 0 gen1 and 0 gen2 GCs, and that's no longer true in the pinned version. The POH gets collected as part of gen2, so short-lived allocations that would otherwise be gen0 become gen2, while also leading to excessive fragmentation of the POH, and that affects everything else in the app.

Method    | Length | Mean         | Error      | Ratio | Gen0   | Gen1   | Gen2
NotPinned | 10     | 7.054 ns     | 0.0653 ns  | 1.00  | 0.0004 | -      | -
Pinned    | 10     | 234.552 ns   | 4.7442 ns  | 33.26 | 0.0014 | 0.0014 | 0.0014
NotPinned | 1000   | 78.747 ns    | 0.7053 ns  | 1.00  | 0.0111 | -      | -
Pinned    | 1000   | 1,175.699 ns | 23.2682 ns | 15.03 | 0.0267 | 0.0267 | 0.0267
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Tests).Assembly).Run(args);

[MemoryDiagnoser]
public class Tests
{
    [Params(10, 1000)]
    public int Length { get; set; }

    [Benchmark(Baseline = true)]
    public byte[] NotPinned() => GC.AllocateUninitializedArray<byte>(Length, pinned: false);

    [Benchmark]
    public byte[] Pinned() => GC.AllocateUninitializedArray<byte>(Length, pinned: true);
}

@krwq (Member, Author) commented Jan 25, 2024:

Out of curiosity - does allocating unpinned and then using scoped fixed statement make any difference?

@jkotas (Member) commented Jan 25, 2024:

Out of curiosity - does allocating unpinned and then using scoped fixed statement make any difference?

Yes, that would work just fine in this case. However, I am not sure whether it is the right choice for this code. I assume that it allocates pinned memory to prevent the GC from creating copies of the certificate in process memory when it compacts the heap. A comment that explains why this allocates pinned memory would be nice.

Why aren't pinned objects (or at least arrays) implemented in terms of AllocHGlobal?

It would not be possible. AllocHGlobal allocations are not managed by the GC; you have to free the allocation explicitly. Pinned object heap allocations are managed by the GC: they are collected when no longer referenced, and there is no explicit free operation for you to do.
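For reference, the alternative raised above (allocating an unpinned array and pinning it only for the duration of the native call) might look like the following sketch. `ProcessData` is a hypothetical native entry point, not an API from this PR:

```csharp
using System;
using System.Runtime.InteropServices;

internal static class ScopedPinExample
{
    // Hypothetical P/Invoke, shown only to illustrate where the pointer would go:
    // [DllImport("nativecrypto")] private static extern unsafe int ProcessData(byte* data, int length);

    internal static unsafe void ParseWithScopedPin(ReadOnlySpan<byte> encodedData)
    {
        // Regular gen0 allocation: the array is pinned only while 'fixed' is in
        // scope, so the GC is free to compact the heap once the call returns.
        byte[] buffer = GC.AllocateUninitializedArray<byte>(encodedData.Length, pinned: false);
        encodedData.CopyTo(buffer);

        fixed (byte* p = buffer)
        {
            // Call into native code here; the pointer is valid only inside this block.
            // ProcessData(p, buffer.Length);
        }

        // Caveat noted in the thread: this does not prevent the GC from copying
        // the buffer during compaction before or after the fixed region, so stray
        // copies of key material can still appear in process memory.
    }
}
```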

Member commented:

Ideally, the pinned object heap would not have these perf characteristics; its performance would be more like AllocHGlobal. There is nothing fundamental preventing that, but it is not how it behaves today.

Member commented:

I assume that it allocates pinned memory to prevent GC from creating copies of the certificate in process memory when it compacts the heap.

Yep. Well, the concern is with the private key material instead of the certificates per se, but the point is to avoid GC compaction clones.

@krwq (Member, Author) commented:

@stephentoub @jkotas @bartonjs I've updated the code back to the old version and I'm using a SafeHandle now; please review.

@krwq force-pushed the do-not-use-allochglobal-pkcs12reader branch from eabfcf2 to 4a4cfe2 on February 7, 2024 12:36
@krwq force-pushed the do-not-use-allochglobal-pkcs12reader branch from 4a4cfe2 to aa43fe8 on February 7, 2024 12:37
/// <summary>
/// SafeHandle for LocalAlloc'd memory which calls ZeroMemory when releasing handle.
/// </summary>
internal sealed class SafeLocalAllocWithClearOnDisposeHandle : SafeCrypt32Handle<SafeLocalAllocWithClearOnDisposeHandle>

}

public unsafe PointerMemoryManager<byte> GetMemoryManager()
=> new PointerMemoryManager<byte>((byte*)handle, _length);
Member commented:

It is bug-prone to hand out a pointer to the resource managed by the handle like this. What is going to guarantee that the handle is not Disposed while the caller uses the memory manager?

The proper pattern to use with safe handles looks like this:

bool success = false;
try
{
    handle.DangerousAddRef(ref success);

    // Get the span out of the handle and do the operation on the memory
    // owned by the handle. Do not pass any pointers to the memory outside
    // of this scope.
}
finally
{
    if (success)
        handle.DangerousRelease();
}

@krwq (Member, Author) commented:

Unfortunately, that approach would require a bit of refactoring, since the try..finally would need to live outside of UnixPkcs12Reader, and the size is unknown there. I'll move the method out of the handle class so it isn't accidentally misused in the future.

@krwq (Member, Author) commented Feb 8, 2024:

I'll also add a Dangerous prefix and a disclaimer in the XML doc.

Member commented:

If you do not use this pattern, you are not getting any additional safety from using the safe handle, and the safe handle is just unnecessary overhead. You can store the pointer to the memory directly and free it in the finalizer, like what was implemented in #93647.

@krwq (Member, Author) commented Feb 9, 2024:

I've ended up adding UnmanagedCryptBufferMemoryManager to isolate the pattern correctly. I'm still having a hard time seeing why I should use DangerousAddRef (which honestly seems worse than just using a native handle, as you need to count references by hand); I ended up using the pattern described here: https://learn.microsoft.com/en-us/dotnet/standard/garbage-collection/implementing-dispose. I'd prefer we do both solutions: mine focuses on making this buffer less error-prone, but I think we should still go ahead with the changes proposed in #93647. I just don't think we should be touching AllocHGlobal directly anywhere in this (Pkcs12Reader) code, as it seems very fragile to refactoring.

Member commented:

SafeHandle wraps an IntPtr handle of the unmanaged resource together with a thread-safe counter. DangerousAddRef/DangerousRelease, which are expected to be called around the code that uses the unmanaged resource, increment and decrement the counter, and the SafeHandle Dispose method checks whether the counter is zero. If it is non-zero, some other code is still using the unmanaged resource, and the safe handle delays releasing it until the counter drops to zero.

So if you are not calling DangerousAddRef/DangerousRelease when using the unmanaged resource, SafeHandle is just an expensive no-op wrapper for an IntPtr.

I understand that the manual AddRef/Release around the uses may look painful, but it is what it takes to deal with unmanaged resources without risk of race conditions or leaks in .NET.
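The delayed-release behavior described above can be observed directly: if Dispose runs while a DangerousAddRef is outstanding, ReleaseHandle does not fire until the matching DangerousRelease. A minimal sketch with a trivial tracing handle (invented for illustration):

```csharp
using System;
using System.Runtime.InteropServices;

// Dummy handle that records when ReleaseHandle runs; the IntPtr it wraps
// is a placeholder, not a real OS resource.
internal sealed class TracingHandle : SafeHandle
{
    public bool Released { get; private set; }

    public TracingHandle() : base(IntPtr.Zero, ownsHandle: true)
    {
        SetHandle(new IntPtr(1)); // non-zero so the handle counts as valid
    }

    public override bool IsInvalid => handle == IntPtr.Zero;

    protected override bool ReleaseHandle()
    {
        Released = true;
        return true;
    }
}

internal static class Demo
{
    internal static void Main()
    {
        var h = new TracingHandle();
        bool success = false;
        h.DangerousAddRef(ref success);

        h.Dispose();                    // a ref is still held...
        Console.WriteLine(h.Released);  // False: release is deferred

        if (success)
            h.DangerousRelease();       // ...last ref dropped
        Console.WriteLine(h.Released);  // True: ReleaseHandle ran here
    }
}
```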

Member commented:

So if you are not calling DangerousAddRef/DangerousRelease when using the unmanaged resource, SafeHandle is just an expensive no-op wrapper for an IntPtr.

And worse than a no-op: it makes it more likely that you'll have use-after-free bugs, because without the ref counting from the add/release, if the SafeHandle is dropped while you're using the pointer, its finalizer may release memory that is still in use.

@bartonjs (Member) commented:

#98331 went with the approach of just fixing the data flow to ensure Dispose was always called; the strategy of using native memory is unchanged.

@bartonjs bartonjs closed this Feb 13, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Mar 14, 2024